Devnet4#181
Open
TomWambsgans wants to merge 160 commits into
* poseidon avx2 avx512 (#192)

Co-authored-by: Tom Wambsgans <TomWambsgans@users.noreply.github.com>

* fix avx512 (panicked on small instances) (#193)

* fix avx512 (panicked on small instances)

* add `test_aggregation`

---------

Co-authored-by: Tom Wambsgans <TomWambsgans@users.noreply.github.com>

* fmt

---------

Co-authored-by: Tom Wambsgans <TomWambsgans@users.noreply.github.com>
This reverts commit 89e0320.
c6f9fd4 to c09c85a
… on avx512 (hetzner ax42-u)
…d of log_packing_width)
…ON dot product regression tests Co-authored-by: mo-melvin77 <momelvinmome@gmail.com> Co-authored-by: Thomas Coratger <thomas.coratger@gmail.com>
…vnet4` has been merged into `leanSig:main`)
#221)

The benchmark recursion already builds a NodeStats per node for the live-tree display, but only the root's wall-clock is returned to callers.

Promote NodeStats / NodeReport / BenchmarkReport to pub and add `run_aggregation_benchmark_report` returning the full per-node breakdown (time, proof_kib, cycles, memory, poseidons, dots, n_xmss).

The existing `run_aggregation_benchmark` is preserved as a thin wrapper returning the root node's `time_secs`, so the `test_aggregation_throughput_per_num_xmss` test in the same file continues to compile unchanged.

Matches the API already shipped on devnet5, letting downstream consumers (e.g. lean-bench) collect identical per-node telemetry across both branches.
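The report/wrapper split described in this commit could look roughly like the sketch below. Only the names `NodeStats`, `NodeReport`, `BenchmarkReport`, `run_aggregation_benchmark_report`, `run_aggregation_benchmark` and the stat field names come from the commit message; the field types, the `children` / `root` layout, the `leaves` parameter and the stand-in workload are assumptions made for illustration, not the crate's actual code.

```rust
use std::time::Instant;

// Field names follow the commit message; the types are assumed.
#[derive(Debug, Clone, Default)]
pub struct NodeStats {
    pub time_secs: f64,
    pub proof_kib: f64,
    pub cycles: u64,
    pub memory: u64,
    pub poseidons: u64,
    pub dots: u64,
    pub n_xmss: usize,
}

// Hypothetical layout: one report per node, with its children nested.
#[derive(Debug, Clone)]
pub struct NodeReport {
    pub stats: NodeStats,
    pub children: Vec<NodeReport>,
}

#[derive(Debug, Clone)]
pub struct BenchmarkReport {
    pub root: NodeReport,
}

// Stand-in for the real recursion: one leaf per entry of `leaves`
// (hypothetical parameter), all aggregated under a single root.
// Only `time_secs` and `n_xmss` are filled in; the rest stay at default.
pub fn run_aggregation_benchmark_report(leaves: &[usize]) -> BenchmarkReport {
    let root_start = Instant::now();
    let children: Vec<NodeReport> = leaves
        .iter()
        .map(|&n_xmss| {
            let start = Instant::now();
            // The real code would aggregate `n_xmss` signatures here.
            NodeReport {
                stats: NodeStats {
                    time_secs: start.elapsed().as_secs_f64(),
                    n_xmss,
                    ..Default::default()
                },
                children: Vec::new(),
            }
        })
        .collect();
    let n_total: usize = children.iter().map(|c| c.stats.n_xmss).sum();
    let stats = NodeStats {
        time_secs: root_start.elapsed().as_secs_f64(),
        n_xmss: n_total,
        ..Default::default()
    };
    BenchmarkReport { root: NodeReport { stats, children } }
}

// Thin wrapper: same call, but only the root's wall-clock is returned,
// so callers written against the old signature keep compiling.
pub fn run_aggregation_benchmark(leaves: &[usize]) -> f64 {
    run_aggregation_benchmark_report(leaves).root.stats.time_secs
}
```

Under these assumptions, a downstream consumer would walk `report.root.children` to collect per-node telemetry, while existing tests keep calling the wrapper.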
eacd019 to 9b2f632
c5a3050 to 9dc5d68
The merge of main into devnet4 (8eec56c) changed MleOwned to hold an ArenaVec, but combine_statement still returned a heap Vec, which was then bridged with ArenaVec::from_slice. At n_vars=24 that is a single-threaded memcpy of ~256 MiB of extension elements per proof while all worker threads sit idle, and the data crosses the memory hierarchy twice.

Build the weights directly in an ArenaVec so it is moved, never copied. All writers (compute_eval_eq_packed*, split_at_mut_many) take &mut [T] and work unchanged via deref. The ArenaVec is created inside the same proving phase where it was previously copied into one, so arena-phase semantics are unchanged.

On a Zen5 box (Ryzen 9700X) this turns a -5.3% XMSS-aggregation regression vs pre-merge into a +2.2% improvement (215.9 -> 220.7 XMSS/s); run_initial_sumcheck_rounds self time drops from 7.94% back to 0.00%. Proof size is unchanged.
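A minimal sketch of the copy-versus-build-in-place difference described above. The `ArenaVec` here is a hypothetical stand-in backed by a plain `Vec` (the real type is arena-allocated), `filled` is an assumed constructor, and `compute_eval_eq_toy` is a toy `f64` version of an eq-weights writer, not the crate's `compute_eval_eq_packed*`. Only `ArenaVec::from_slice` and the "writers take `&mut [T]`" property are taken from the commit message.

```rust
use std::ops::{Deref, DerefMut};

// Hypothetical stand-in for the arena-backed vector.
pub struct ArenaVec<T>(Vec<T>);

impl<T: Clone> ArenaVec<T> {
    // Copies the whole slice: this is the bridge the commit removes.
    pub fn from_slice(s: &[T]) -> Self {
        ArenaVec(s.to_vec())
    }
    // Assumed constructor, for illustration only.
    pub fn filled(len: usize, value: T) -> Self {
        ArenaVec(vec![value; len])
    }
}

impl<T> Deref for ArenaVec<T> {
    type Target = [T];
    fn deref(&self) -> &[T] {
        &self.0
    }
}

impl<T> DerefMut for ArenaVec<T> {
    fn deref_mut(&mut self) -> &mut [T] {
        &mut self.0
    }
}

// Toy writer: fills `out` with eq(point, b) for every boolean vector b,
// with the first variable as the most significant bit of the index.
pub fn compute_eval_eq_toy(point: &[f64], out: &mut [f64]) {
    assert_eq!(out.len(), 1 << point.len());
    out[0] = 1.0;
    let mut size = 1;
    for &r in point {
        // Go backwards so each entry is read before it is overwritten.
        for j in (0..size).rev() {
            let v = out[j];
            out[2 * j + 1] = v * r;
            out[2 * j] = v - v * r;
        }
        size *= 2;
    }
}

// Before: build on the heap, then copy everything into the ArenaVec.
pub fn weights_via_copy(point: &[f64]) -> ArenaVec<f64> {
    let mut heap = vec![0.0; 1 << point.len()];
    compute_eval_eq_toy(point, &mut heap);
    ArenaVec::from_slice(&heap)
}

// After: allocate the ArenaVec first; the writer sees it as &mut [f64]
// through deref coercion, and the result is moved out without a copy.
pub fn weights_in_place(point: &[f64]) -> ArenaVec<f64> {
    let mut weights = ArenaVec::filled(1 << point.len(), 0.0);
    compute_eval_eq_toy(point, &mut weights);
    weights
}
```

Both functions return the same values; the second simply skips the intermediate buffer, which is the part that costs a full pass over memory at large `n_vars`.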
No description provided.