
Chunk columnar retarget #769

Draft
frankmcsherry wants to merge 7 commits into TimelyDataflow:master-next from frankmcsherry:chunk-columnar-retarget

@frankmcsherry
Member

Demonstration of retargeting the `columnar/` container/batch/trace stack onto the `chunk/` framework.

frankmcsherry and others added 7 commits June 23, 2026 11:21
Reimplement the columnar trace on the `Chunk` trait (`trace/chunk/`)
instead of the bespoke batcher + `OrdValBatch`-backed spine.

`ColChunk<U>` backs a chunk with the `UpdatesTyped` trie, resident or
paged. Its four transducers delegate to the reused trie-native survey
merge (`trie_merger`); the harness supplies the batch, straddle cursor,
batcher, builder, and spine. `ValSpine`/`ValBatcher`/`ValBuilder` now
re-export that harness over `ColChunk`, and trace merges are trie-native
(they previously ran through `ord_neu`'s row-oriented merger).

Deletes the machinery the harness replaces: `batcher.rs` (the bespoke
`MergeBatcher`), `ValMirror`, and `trie_merger`'s driver shell
(`merge_batches`/`ChainBuilder`/`form_chunks`/`merge_iterator`), keeping
the survey/merge core that `ColChunk` drives.

Spill moves onto `Chunk::settle`: `ColChunk` is `Resident | Paged`, where
a paged chunk keeps resident bounds + len and a byte handle, materializing
through a `OnceCell` cache on read. `settle` pages committed chunks out via
the new per-worker `spill` controller once over a record budget; the
backend (`BytesStore`/`BytesSource`) is pluggable and lives with the
caller. Adds an `Updates` byte codec and rewires `columnar_spill` onto it.
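
The resident/paged split described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `SketchChunk`, `BytesHandle`, the `u64` record type, and the little-endian encoding are hypothetical stand-ins, not the PR's `ColChunk` or its `Updates` codec. The sketch only shows the shape of the idea: bounds and length stay resident, and the records are decoded at most once through a `OnceCell`.

```rust
use std::cell::OnceCell;

/// Hypothetical stand-in for the byte handle of a paged-out chunk.
pub struct BytesHandle(pub Vec<u8>);

/// A chunk of sorted `u64` records that is either in memory or paged out.
pub enum SketchChunk {
    Resident(Vec<u64>),
    Paged {
        /// Bounds and length stay resident, so the chunk can be placed
        /// in a chain without touching its bytes.
        lower: u64,
        upper: u64,
        len: usize,
        bytes: BytesHandle,
        /// Filled on first read; later reads reuse the decoded records.
        cache: OnceCell<Vec<u64>>,
    },
}

impl SketchChunk {
    /// Page a non-empty resident chunk out (plain little-endian encoding).
    pub fn page_out(self) -> SketchChunk {
        match self {
            SketchChunk::Resident(records) if !records.is_empty() => {
                let bytes: Vec<u8> =
                    records.iter().flat_map(|r| r.to_le_bytes()).collect();
                SketchChunk::Paged {
                    lower: records[0],
                    upper: records[records.len() - 1],
                    len: records.len(),
                    bytes: BytesHandle(bytes),
                    cache: OnceCell::new(),
                }
            }
            other => other,
        }
    }

    /// Length is answered from resident metadata in both states.
    pub fn len(&self) -> usize {
        match self {
            SketchChunk::Resident(records) => records.len(),
            SketchChunk::Paged { len, .. } => *len,
        }
    }

    /// Reading a paged chunk materializes it once through the cache.
    pub fn records(&self) -> &[u64] {
        match self {
            SketchChunk::Resident(records) => records,
            SketchChunk::Paged { bytes, cache, .. } => cache.get_or_init(|| {
                bytes.0.chunks_exact(8).map(|b| {
                    let mut buf = [0u8; 8];
                    buf.copy_from_slice(b);
                    u64::from_le_bytes(buf)
                }).collect()
            }),
        }
    }
}
```

In this sketch `len()` never forces a decode, which is the property that lets a paged chunk take part in chain bookkeeping cheaply; only `records()` pays for materialization, and only on the first call.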

Known limitation: `settle` sees only its local output, not timely's whole
batcher queue, so eviction is an approximate per-worker budget rather than
the old exact head-reserve. It still bounds memory and round-trips correctly.

Adds `chunk_bench` (layout microbenchmark) and a `col` mode to `chunks`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Both `Chunk` impls ran the same maximal-packing `settle` (carry, coalesce
adjacent sub-TARGET chunks, peel over-TARGET ones); only three operations
differed. Lift the algorithm into a free `pack` helper on the harness that
`settle` can delegate to, parameterized by closures for those operations:

  - coalesce two adjacent chain chunks,
  - split a chunk at `n` updates,
  - seal a committed chunk (compress / spill; identity to keep).

`settle` stays a required trait method, so nothing is forced on
implementors — they opt in by calling `pack`. `vec`'s `settle` passes
make_mut-extend / split_off / identity; `col`'s passes meld / split_at /
a `seal_chunk` that pages via the spiller. The packing lives once; any
impl gets it by delegating, or writes its own `settle` and ignores `pack`.
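
As a rough illustration of the closure-parameterized helper described above, here is a minimal sketch. The signature, the explicit `target` and `len` parameters, and the `(committed, carry)` return shape are assumptions for illustration; the PR's actual `pack` may differ on all of them.

```rust
/// Repack `chunks` so that every committed chunk holds exactly `target`
/// updates. Returns the sealed, committed chunks plus an unsealed carry
/// holding whatever did not fill a whole chunk.
pub fn pack<C>(
    chunks: Vec<C>,
    target: usize,
    len: impl Fn(&C) -> usize,
    mut coalesce: impl FnMut(C, C) -> C,
    mut split_at: impl FnMut(C, usize) -> (C, C),
    mut seal: impl FnMut(C) -> C,
) -> (Vec<C>, Option<C>) {
    assert!(target > 0, "a zero target would never make progress");
    let mut committed = Vec::new();
    let mut carry: Option<C> = None;
    for chunk in chunks {
        // Coalesce the incoming chunk onto the carried sub-target chunk.
        let mut current = match carry.take() {
            Some(previous) => coalesce(previous, chunk),
            None => chunk,
        };
        // Peel full chunks off the front while over target.
        while len(&current) > target {
            let (head, tail) = split_at(current, target);
            committed.push(seal(head));
            current = tail;
        }
        if len(&current) == target {
            committed.push(seal(current));
        } else if len(&current) > 0 {
            carry = Some(current);
        }
    }
    (committed, carry)
}
```

A `Vec`-backed caller could pass `extend`, `split_off`, and the identity as the three operations, mirroring the `vec` case in the commit message; a spilling caller would swap only the `seal` closure.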

Also simplify the spill policy to a high-water mark (keep the first
`budget` records resident, page the rest). The old running inc/dec
counter double-counted once coalescing re-counted the growing carry, so
the example stopped paging. The high-water mark is monotonic and robust;
the default example run spills again (~35M records, 2x lz4) and round-trips.
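
A high-water-mark policy of this kind might look like the sketch below. `HighWaterSpill` and `should_page` are hypothetical names, and making the decision per sealed chunk is an assumption; the point is only that the counter never decreases, so there is no decrement for coalescing to throw off.

```rust
/// Keeps the first `budget` records resident and pages everything after.
pub struct HighWaterSpill {
    budget: usize,
    /// Records admitted as resident so far; this only ever grows.
    resident: usize,
}

impl HighWaterSpill {
    pub fn new(budget: usize) -> Self {
        Self { budget, resident: 0 }
    }

    /// Decide whether a sealed chunk of `len` records should be paged out.
    pub fn should_page(&mut self, len: usize) -> bool {
        if self.resident + len <= self.budget {
            self.resident += len;
            false
        } else {
            // Saturate the mark so that every later chunk pages as well.
            self.resident = self.budget;
            true
        }
    }
}
```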

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

`ColumnarUpdate` required each field's columnar `Container: Debug`, but
nothing needs it: `BatchContainer`'s only supertrait is `'static`,
`Layout` puts no `Debug` on its containers, neither `Coltainer`/`ChunkBatch`
nor the old `OrdValBatch` derives it, and the lone `Debug for UpdatesTyped`
impl prints just the type name. The bound was the sole reason
`isize`/`usize`-keyed columnar arrangements didn't compile: their
`Isizes`/`Usizes` containers (which re-encode pointer-width ints as `i64`
for portable serialization) don't implement `Debug`, though the values
themselves are `Debug + Ord`.

Drop `+ Debug` from the container bounds (keeping the value-level `Debug`
on K/V/T/R), and restore the `chunks` example's `col` mode to `isize` +
`InputSession::insert` — it had been switched to `i64` + `update(_, 1i64)`
only to satisfy the bound.
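
The shape of the relaxed bound can be sketched as follows. `SketchColumnar`, `SketchIsizes`, and `describe` are hypothetical names, not the crate's traits; the sketch only demonstrates that a value type can stay `Debug` while its container type has no `Debug` impl at all.

```rust
use std::fmt::Debug;

/// Hypothetical stand-in for a columnar value trait. The value must be
/// `Debug`, but its container carries no `Debug` bound.
pub trait SketchColumnar: Debug + Ord + Clone {
    type Container: Default + 'static;
    fn push(container: &mut Self::Container, value: &Self);
    fn get(container: &Self::Container, index: usize) -> Option<Self>;
}

/// Stores pointer-width ints as `i64`; note there is no `#[derive(Debug)]`.
#[derive(Default)]
pub struct SketchIsizes {
    values: Vec<i64>,
}

impl SketchColumnar for isize {
    type Container = SketchIsizes;

    fn push(container: &mut SketchIsizes, value: &isize) {
        container.values.push(*value as i64);
    }

    fn get(container: &SketchIsizes, index: usize) -> Option<isize> {
        container.values.get(index).map(|v| *v as isize)
    }
}

/// Generic code formats values only, never containers, so the
/// value-level `Debug` bound is all it needs.
pub fn describe<T: SketchColumnar>(container: &T::Container, index: usize) -> String {
    format!("{:?}", T::get(container, index))
}
```

Adding `+ Debug` to `type Container` in this sketch would reject the `isize` impl, which is the same failure mode the commit message describes.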

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

`spines.rs` and `chunks.rs` were near-duplicate end-to-end `arrange`+`join`
benchmarks. Fold `chunks` into `spines` by adding a `vec` mode (the
`Vec`-backed `Chunk` trace, via `ContainerChunker<VecChunk>`, reusing the
row workload). `spines` now covers all four backends — `key`, `val`, `vec`,
`col` — on `String` keys; `chunks.rs` is removed. (`col` keeps the native
columnar input path; the `ContainerChunker<ColChunk>` path it had is still
exercised by the `chunk_bench` benchmark.)

`chunk_bench` is a microbenchmark, not an example — it isolates per-`Chunk`
build/merge/scan and resident memory (with its own counting allocator),
which is a different instrument from an end-to-end spine benchmark and
can't share a binary (one global allocator). Move it to
`benches/chunk_bench.rs` with `harness = false`; run via
`cargo bench --bench chunk_bench -- [updates]`.
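
For context on the "one global allocator" constraint: a Rust binary can register only a single `#[global_allocator]`, so a benchmark that counts bytes has to own its binary. The sketch below shows a generic counting allocator of that kind. `Counting`, `LIVE_BYTES`, and `live_bytes` are hypothetical names, and this is not the allocator `chunk_bench` actually ships.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Wraps the system allocator and tracks currently live heap bytes.
pub struct Counting;

static LIVE_BYTES: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for Counting {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            LIVE_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        LIVE_BYTES.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

// Only one of these may exist per binary, which is why a counting
// benchmark cannot share a binary with code that wants another allocator.
#[global_allocator]
static GLOBAL: Counting = Counting;

/// Bytes currently allocated through the global allocator.
pub fn live_bytes() -> usize {
    LIVE_BYTES.load(Ordering::Relaxed)
}
```

Sampling `live_bytes()` before and after building a structure gives a resident-memory estimate for that structure alone, as long as nothing else allocates in between.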

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

`col` reached the (now chunk-backed) columnar trace via `columnar/`'s input
plumbing — `ValColBuilder` / `ValPact` / `ValChunker` and a bespoke
`ColWorkload` that formatted into a reusable `String` buffer. Switch it to the
same generic `Chunk` path `vec` uses: `ContainerChunker<ColChunk>` +
`trace::chunk::col`, fed by the shared input harness. `vec` and `col` are now
symmetric (identical input path, differing only in chunk layout), so the
comparison is apples-to-apples, and `spines` no longer depends on `columnar/`'s
input stack (still exercised by the `columnar` and `columnar_spill` examples).

With every mode now sharing one input type, the `Workload` trait + `Box<dyn>`
were vestigial; collapse them into a single concrete `Workload` struct.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

`columnar/arrangement` held a confusing mix: substrate (`Coltainer`,
`trie_merger`), container-glue (`TrieChunker`), and trace-name aliases
(`Val*`) that re-exported `chunk::col` types — so naming a columnar trace
meant reaching "around" into `arrangement`, even though the trace lives in
`trace/chunk/col`. Dissolve it, sorting each piece to where it belongs:

- `Coltainer` → `columnar/layout.rs` (it's the `BatchContainer` half of
  `ColumnarLayout`; pure substrate).
- `trie_merger` → `columnar/trie_merger.rs` (the trie-native merge core, beside
  the `updates` trie it operates on).
- `TrieChunker` → `columnar/chunker.rs` (the `RecordedUpdates → ColChunk`
  container-glue chunker).
- The `Val*` aliases move to `columnar/mod.rs` as thin re-exports of the
  canonical `trace::chunk::col` harness types.

`chunk/col.rs` now imports substrate directly (`columnar::{layout, trie_merger,
updates, spill}`) with no lateral hop through `arrangement`. Pure motion: no
behavior change, build/tests clean, spill smoke unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ace faces

Rework the columnar work into a self-contained storage subsystem rooted at
`columnar/`, organized by the function DD needs rather than by implementation:

  columnar/
    layout · updates · trie_merger   DATA — the shared columnar core
    collection/                      COLLECTION face: RecordedUpdates (+ Negate/
                                     Enter/Leave/ResultsIn), Builder, Pact, and
                                     the columnar operators (join_function, ...)
    trace/                           TRACE face: ColChunk (a `Chunk` impl) with
                                     Spine / Batcher / Builder / Chunker, + spill

Both faces derive from the same data; they are linked only by two bridge
operators (the trace `Chunker`, collection->batch; and `as_recorded_updates`,
trace->collection). The generic `Chunk` trait + harness stay in `trace/chunk/`
(layout-agnostic, shared with `vec`); `columnar::trace::ColChunk` implements it.

This supersedes the interim placement under `trace/chunk/col/`: the collection
surface (RecordedUpdates, join_function, ...) was mis-homed inside the
trace/chunk module when it is collection — not chunk — machinery. Rooting
everything under `columnar/` makes it a storage variant that borrows DD's
abstractions but can be sheared off without changing DD's own structure — a
template for future storage variants.

Exports use BATCH/TRACE vocabulary (Spine/Batcher/Builder/Chunker), dropping the
historical `Val*` prefix; `ColChunk` keeps its name as the honest `Chunk` impl.
Also folds in the wire/spill codec dedup (`RecordedUpdates` delegates its body to
`Updates::write_to`/`read_from`) and its round-trip test.
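
The codec dedup and its round-trip test can be pictured with the sketch below. Only the method names `write_to` and `read_from` come from the commit message; their signatures, the `SketchUpdates` and `SketchRecorded` types, and the length-prefixed little-endian layout are assumptions for illustration.

```rust
use std::io::{self, Read, Write};

/// Hypothetical three-column stand-in for the updates container.
#[derive(Debug, PartialEq)]
pub struct SketchUpdates {
    pub keys: Vec<u64>,
    pub times: Vec<u64>,
    pub diffs: Vec<i64>,
}

/// Write one column as a length prefix followed by little-endian words.
fn write_words<W: Write>(w: &mut W, words: &[u64]) -> io::Result<()> {
    w.write_all(&(words.len() as u64).to_le_bytes())?;
    for word in words {
        w.write_all(&word.to_le_bytes())?;
    }
    Ok(())
}

/// Read one column written by `write_words`.
fn read_words<R: Read>(r: &mut R) -> io::Result<Vec<u64>> {
    let mut buf = [0u8; 8];
    r.read_exact(&mut buf)?;
    let len = u64::from_le_bytes(buf);
    let mut words = Vec::new();
    for _ in 0..len {
        r.read_exact(&mut buf)?;
        words.push(u64::from_le_bytes(buf));
    }
    Ok(words)
}

impl SketchUpdates {
    pub fn write_to<W: Write>(&self, w: &mut W) -> io::Result<()> {
        write_words(w, &self.keys)?;
        write_words(w, &self.times)?;
        // Diffs are signed; reinterpret the bits as u64 for the wire.
        let diffs: Vec<u64> = self.diffs.iter().map(|d| *d as u64).collect();
        write_words(w, &diffs)
    }

    pub fn read_from<R: Read>(r: &mut R) -> io::Result<Self> {
        let keys = read_words(r)?;
        let times = read_words(r)?;
        let diffs = read_words(r)?.into_iter().map(|d| d as i64).collect();
        Ok(Self { keys, times, diffs })
    }
}

/// The wire-facing wrapper delegates its body to the one shared codec.
pub struct SketchRecorded(pub SketchUpdates);

impl SketchRecorded {
    pub fn write_to<W: Write>(&self, w: &mut W) -> io::Result<()> {
        self.0.write_to(w)
    }
}
```

With a single codec behind both the wire and spill paths, one round-trip test covers both, which is presumably the motivation for folding the dedup and its test into the same commit.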

Examples (`columnar`, `columnar_spill`, `spines`) and `chunk_bench` updated to
the new paths. Pure motion otherwise: build/tests clean, runtime smokes unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@frankmcsherry force-pushed the chunk-columnar-retarget branch from 944a74c to fef2ac3 on June 23, 2026 18:05