Replace BTreeMap in PortConnectivity with a sorted Vec; semi-naive iteration replacing in-place updates#802
Merged
Merged
Conversation
…mmarize_outputs PortConnectivity now stores a sorted Vec<(usize, Antichain<TS>)> with lazy consolidation (appends set a dirty bit; reads expect consolidated data, with freeze points at Builder::add_node and PerOperatorState::new). The summarize_outputs fixed point is restructured into semi-naive rounds over log-structured sorted runs, removing its HashMap accumulator; the changed-bit insert methods it required are removed as dead. This excises all BTreeMap (and the one HashMap) instantiations from the progress subsystem: generated IR for the bfs example drops 9.1% (254,560 -> 231,310 llvm-lines; btree machinery 8,723 -> 0). Construction remains O(n log n) under any insertion order; shape_scaling timings at 10k/100k operators are unchanged or slightly improved. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The cycle check already observed every location, so the map was dense in content; index in-degree counts by per-node location offsets instead. This removes the last std collection map from the progress subsystem. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
add_port now merges summaries at a port by antichain insertion, matching insert's semantics; it is the builder's responsibility to introduce multiple summaries for a port only when multiple paths exist. Operate::initialize is now documented to return connectivity in canonical (consolidated) form, and the in-tree implementations (builder_raw, Subgraph) establish that form before returning. The consolidation at the consumer in PerOperatorState::new remains as defense in depth against foreign Operate implementations. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
PortConnectivity gains is_consolidated(), and a Consolidate extension trait gives the Connectivity alias consolidate()/is_consolidated() verbs, so boundaries state the invariant as a predicate rather than prose. Operate::initialize's contract now reads 'must satisfy is_consolidated'; builder_raw consolidates its summary in place before cloning (leaving the retained copy canonical as well); the defense-in-depth site in PerOperatorState::new debug_asserts the contract before repairing. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
insert now literally delegates to add_port (the in-place fast path was an unproven optimization); type and method docs state requirements without explaining the implementation; builder_raw's initialize takes mut self rather than rebinding. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Split PortConnectivity into an append-only PortConnectivityBuilder (insert/add_port/FromIterator, freeze) and a frozen, always-canonical PortConnectivity (get/iter_ports only). Deletes the dirty bit, the read-side debug_asserts, the Consolidate extension trait, the defense-in-depth consolidation in PerOperatorState::new and Builder::add_node, and the documented temporal contract on Operate::initialize, whose return type now carries the invariant. builder_rc's shared Rc<RefCell<PortConnectivity>> (also held by input handles and capabilities) stays frozen; new_output_connection re-freezes it in place via mem::take + into_builder + freeze. handles.rs and capability.rs are unchanged. Sizing notes in typed-split-sizing.md. cargo test -p timely passes. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The summaries shared with input handles are created at new_input, mutated only during construction, and read only at runtime through capabilities. The builder now accumulates them in PortConnectivityBuilders and freezes each into a shared OnceCell when the operator is built, so the lifecycle (write once at build, immutable thereafter) is expressed in the type and the re-freeze-in-place dance disappears along with RefCell borrows on the capability read path. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…sizing doc into_builder's only caller was the re-freeze dance that the OnceCell sharing removed; freeze is one-way. The summaries field comment now matches the tuple order, set() failure uses expect, and the exploratory sizing document leaves the repository. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
merge_disjoint uses the while-let-both-Some then extend pattern, and summarize_outputs's per-round worklist avoids the name frontier, which means something else throughout this crate. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Replaces the Vec<Vec<_>> levels in summarize_outputs with BinaryRuns: a single vector whose length's binary representation reveals its sorted runs, largest first. Batches of novel keys merge trailing runs as in binary addition (the smallest run is always the lowest set bit of the current length), keeping introduction amortized logarithmic and lookups to at most logarithmically many runs, in one allocation. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Introducing a batch now extends the vector and sort_unstables the suffix past the runs shared between the old and new lengths' decompositions. This removes merge_disjoint and the carry cascade entirely, costs a log factor more in comparisons (n log^2 n total), and runs in place, once per dataflow construction, on modest n. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The lengths' xor locates the highest bit on which they disagree; runs at the agreeing bits above it stay put, replacing the bit-scanning loop. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Same-key proposals within a round first collapse into an antichain, so only elements that survive both the round's own batch and insertion into the accumulated antichain are shipped as next-round work; previously every improving step shipped, including ones dominated within the same round. Also inlines the final into_sorted call. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Antichain::drain yields owned elements while leaving the allocation for reuse, mirroring MutableAntichain::update_iter's smallvec::Drain. The proposal-collapsing batch in summarize_outputs is hoisted out of the key loop: existing keys drain it (no clones, allocation retained), and fresh keys surrender it via mem::take, which fresh must own anyway. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
PortConnectivity now stores a sorted
Vec<(usize, Antichain<TS>)>with lazy consolidation (appends set a dirty bit; reads expect consolidated data, with freeze points atBuilder::add_nodeandPerOperatorState::new). Thesummarize_outputsfixed point is restructured into semi-naive rounds over log-structured sorted runs, removing itsHashMapaccumulator; the changed-bit insert methods it required are removed as dead.This excises all
BTreeMap(and the oneHashMap) instantiations from the progress subsystem: generated IR for the bfs example drops 9.1% (254,560 -> 231,310 llvm-lines; btree machinery 8,723 -> 0). Construction remains O(n log n) under any insertion order;shape_scalingtimings at 10k/100k operators are unchanged or slightly improved.(ed: there are some O(n log^2 n) datastructures now, to avoid a one-off LSM implementation)
Measurements
All on
master@0b23c249vs this PR @1024c32d, release builds,Apple Silicon (darwin 25.1.0). Scaling tests timed over 3 runs each;
RSS sampled via
psafter construction completes, 2 runs each.Generated code (
bfsexample):cargo llvm-linestotalConstruction scaling (
tests/shape_scaling.rs, with temporary 1M variants):operator_scaling1Msubgraph_scaling1Moperator_scaling100ksubgraph_scaling100kBoth sides scale ≈11–12× from 100k → 1M (log factors, no quadratic
behavior on either side).
event_driven1000 × 1000: