E1+E3: reduce relay ingest/fan-out DB round trips; ack p99 −7–16%, fd p99 −6–28%, p999 tails −29–53% vs PR #1453 tip #1454
Merged
E1 (correctness ruling §4.8, GREEN): ingest re-SELECTed the full channel row up to three times per accepted event. The membership open-visibility fallback, the archived-channel gate, and the join-request visibility check each issued their own community-scoped get_channel.

Fetch the row once after channel_id resolution and thread it through all three gates. Within-request threading only: there is no cross-request cache, so there is no invalidation surface. The row is community-scoped (Inv_LabelPropagation: channel UUIDs collide across communities).

Missing-row behavior per gate is unchanged: the membership fallback treats it as not-open, the archived gate skips (kind:9007 creates the channel later in the same request), and join requests still reject with 'channel not found'.

check_channel_membership takes the row as Option; the ephemeral-event path (handlers/event.rs) has no fetched row and passes None, keeping its existing lookup.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
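The fetch-once threading described in this commit can be sketched roughly as below. This is a minimal illustration, not the relay's code: `Db`, `ChannelRow`, `GateOutcomes`, `run_gates`, and the three gate functions are hypothetical stand-ins, the ids are plain integers, and the gates only model the missing-row behavior stated in the commit message (not-open, skip, reject).

```rust
use std::cell::Cell;
use std::collections::HashMap;

// Hypothetical stand-ins for the relay's channel types.
#[derive(Clone, Debug, PartialEq)]
enum Visibility {
    Open,
    Private,
}

#[derive(Clone, Debug)]
struct ChannelRow {
    visibility: Visibility,
    archived: bool,
}

// Toy store keyed by (community_id, channel_id); counts SELECTs issued.
struct Db {
    rows: HashMap<(u32, u32), ChannelRow>,
    selects: Cell<u32>,
}

impl Db {
    fn new(rows: HashMap<(u32, u32), ChannelRow>) -> Self {
        Db { rows, selects: Cell::new(0) }
    }

    // Community-scoped lookup: the same channel id in another community misses.
    fn get_channel(&self, community_id: u32, channel_id: u32) -> Option<ChannelRow> {
        self.selects.set(self.selects.get() + 1);
        self.rows.get(&(community_id, channel_id)).cloned()
    }
}

// Gate 1: membership fallback. A missing row is treated as not-open.
fn open_fallback(row: Option<&ChannelRow>) -> bool {
    matches!(row, Some(r) if r.visibility == Visibility::Open)
}

// Gate 2: archived gate. A missing row skips the gate.
fn archived_gate(row: Option<&ChannelRow>) -> Result<(), &'static str> {
    match row {
        Some(r) if r.archived => Err("channel archived"),
        _ => Ok(()),
    }
}

// Gate 3: join request. A missing row rejects.
fn join_gate(row: Option<&ChannelRow>) -> Result<(), &'static str> {
    row.map(|_| ()).ok_or("channel not found")
}

struct GateOutcomes {
    open_fallback: bool,
    archived: Result<(), &'static str>,
    join: Result<(), &'static str>,
}

// One fetch per request; all three gates borrow the same Option<&ChannelRow>.
fn run_gates(db: &Db, community_id: u32, channel_id: u32) -> GateOutcomes {
    let fetched = db.get_channel(community_id, channel_id);
    let row = fetched.as_ref();
    GateOutcomes {
        open_fallback: open_fallback(row),
        archived: archived_gate(row),
        join: join_gate(row),
    }
}
```

Because the row lives only in a local variable for the duration of one call, nothing outlives the request, which is why the sketch needs no invalidation hook.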
E1 phase-2 (correctness ruling §4.8 phase-2 addendum, GREEN with three fences): fan-out re-resolved channel visibility even though ingest just fetched the channel row for the same event. Resolve visibility once at ingest, through the same channel_visibility_cached gate fan-out uses (seeded with the phase-1 once-per-request row), and thread it into dispatch_persistent_event as a ThreadedChannelVisibility bundle. Saves one visibility SELECT per accepted channel event on the fan-out path.

Quinn's three fences, verbatim from the ruling:

1. Fail-closed on error is preserved. Ingest-side lookup failure (or a missing row: global events, kind:9007 pre-create) threads None, and fan-out performs its own fresh fail-closed lookup exactly as before. 'No threaded visibility' is never interpreted as 'assume open'.
2. The threaded read goes through channel_visibility_cached, not raw get_channel. The prefetched row only replaces the DB read inside the gate: a cached 'private' still wins over the row, and a 'private' read from the row still populates the cache.
3. The threaded value stays community-scoped through fan-out. It travels bundled with the (community_id, channel_id) it was resolved under, and filter_fanout_by_access consults it only on exact equality with the fan-out's own (community_id, channel_id). Anything else falls through to the fresh lookup (channel UUIDs collide across communities, Inv_LabelPropagation).

Pubsub (cross-node) and ephemeral fan-out paths pass None: threading is same-request, same-node only. Membership checks are unchanged and stay fresh; the threaded value only replaces the visibility SELECT.

Three fence tests added in fanout_access: mismatched bundle falls back to the fresh fail-closed lookup; matching 'private' gates to members only; matching 'open' passes through without a DB read.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
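Fences 1 and 3 reduce to a small decision at the fan-out side, sketched below. The names `ThreadedVisibility`, `VisibilityStore`, and `visibility_for_fanout` are hypothetical, the real bundle's fields are not shown in this PR text, and "fail-closed" is modeled here as an assumption: an unknown or failed lookup is treated as private. The cache interaction from fence 2 is left out.

```rust
use std::cell::Cell;
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq)]
enum Visibility {
    Open,
    Private,
}

// Hypothetical bundle: the visibility plus the ids it was resolved under.
#[derive(Clone, Copy, Debug)]
struct ThreadedVisibility {
    community_id: u32,
    channel_id: u32,
    visibility: Visibility,
}

// Toy backing store that counts fresh lookups.
struct VisibilityStore {
    rows: HashMap<(u32, u32), Visibility>,
    lookups: Cell<u32>,
}

impl VisibilityStore {
    fn new(rows: HashMap<(u32, u32), Visibility>) -> Self {
        VisibilityStore { rows, lookups: Cell::new(0) }
    }

    // None models both a missing row and a failed lookup.
    fn fresh_lookup(&self, community_id: u32, channel_id: u32) -> Option<Visibility> {
        self.lookups.set(self.lookups.get() + 1);
        self.rows.get(&(community_id, channel_id)).copied()
    }
}

// Use the threaded value only on exact (community_id, channel_id) equality;
// otherwise do a fresh lookup and fail closed (assumed: unknown => Private).
fn visibility_for_fanout(
    store: &VisibilityStore,
    threaded: Option<ThreadedVisibility>,
    community_id: u32,
    channel_id: u32,
) -> Visibility {
    match threaded {
        Some(t) if t.community_id == community_id && t.channel_id == channel_id => t.visibility,
        _ => store
            .fresh_lookup(community_id, channel_id)
            .unwrap_or(Visibility::Private),
    }
}
```

Note that `None` and a mismatched bundle take the same branch, so an absent threaded value can never be read as open.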
…invalidation

E3 (correctness ruling §4.10): every ingested channel event ran a list_enabled_channel_workflows SELECT to find trigger candidates, and most channels have no workflows at all. Add a moka look-aside cache in WorkflowEngine keyed (community_id, channel_id) so the trigger path hits the DB once per channel per TTL instead of once per event.

§4.10 fences:

- Key is community-scoped (community_id, channel_id): channel UUIDs collide across communities (Inv_LabelPropagation).
- TTL is 10s (ruling allows ≤30s), matching the relay's other moka caches.
- Negative results are cached: an empty list is inserted like any other, which is where most of the win is.
- Synchronous invalidation at every live mutation site.

Audit (per Wren's broad definition: ingest upsert/delete, HTTP/API toggles, admin/test helpers, soft-delete/reactivation): the only paths that write trigger eligibility or channel binding are the kind:30620 command upsert and NIP-09 a-tag deletion; both invalidate immediately after the DB write. delete_workflow_for_owner now RETURNING channel_id so the deletion path invalidates without a second lookup. The unused buzz-db mutators (create_workflow, update_workflow, update_workflow_status, set_workflow_enabled, delete_workflow) have no callers anywhere in the workspace (CLI and desktop route through event submission; webhook and approval-resume only write workflow_runs), and each carries a doc note requiring shared invalidation from any future caller.

Consistency: no cross-pod invalidation, deliberately. Workflow triggering is not an access-control fence; the worst case on another pod is a just-deleted workflow firing (or a just-created one missing events) for up to 10s. The same TTL bounds the look-aside fill race.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
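The look-aside pattern with negative caching and synchronous invalidation can be sketched without moka, using only the standard library. Everything here is an assumption for illustration: `WorkflowCache`, `enabled_workflows`, and `invalidate` are hypothetical names, the hand-rolled `HashMap` plus `Instant` check stands in for moka's TTL handling, and workflow ids are plain strings.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Community-scoped key: (community_id, channel_id).
type Key = (u32, u32);

// Std-only stand-in for a TTL look-aside cache (the PR uses moka instead).
struct WorkflowCache {
    ttl: Duration,
    entries: HashMap<Key, (Instant, Vec<String>)>,
    db_reads: u32, // counts loader calls, i.e. simulated SELECTs
}

impl WorkflowCache {
    fn new(ttl: Duration) -> Self {
        WorkflowCache { ttl, entries: HashMap::new(), db_reads: 0 }
    }

    // Return the cached list if it is younger than the TTL; otherwise call
    // `load` and cache whatever it returns, including an empty list.
    fn enabled_workflows<F>(&mut self, key: Key, load: F) -> Vec<String>
    where
        F: FnOnce() -> Vec<String>,
    {
        let now = Instant::now();
        if let Some((inserted_at, list)) = self.entries.get(&key) {
            if now.duration_since(*inserted_at) < self.ttl {
                return list.clone();
            }
        }
        self.db_reads += 1;
        let list = load();
        self.entries.insert(key, (now, list.clone()));
        list
    }

    // Called right after a write that changes trigger eligibility or binding.
    fn invalidate(&mut self, key: Key) {
        self.entries.remove(&key);
    }
}
```

In this sketch the empty list is stored like any other value, so a channel with no workflows costs one loader call per TTL window rather than one per event, and a mutation followed by `invalidate` forces the next read back to the loader.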
30d2941 to
eb38833
…usted-input only) Two new quick-xml advisories published 2026-07-02 broke the Security CI gate on every branch. Both are DoS-class and require attacker-controlled XML; our locked versions parse only trusted input (rust-s3 responses from our own S3/MinIO endpoint; plist reads of local macOS system files). The patched release (>= 0.41.0) is unreachable until rust-s3 and plist/netdev bump their requirements. Documented ignores, matching the existing pattern for RUSTSEC-2024-0384/0436. Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
tlongwell-block pushed a commit that referenced this pull request on Jul 2, 2026
Brings the branch current with main (~20 commits, incl. relay perf #1453/#1454 and mention ranking #1431). One conflict resolved in useMentions.ts: kept main's rankMentionCandidates pipeline, re-applied this branch's suggestion slice change (Math.max(MENTION_SUGGESTION_LIMIT, mentionCandidates.length)). Verified post-merge: tsc --noEmit clean, biome check clean, desktop unit tests 1475/1475, cargo check --workspace clean. Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
wpfleger96 added a commit that referenced this pull request on Jul 2, 2026
…into HEAD * origin/paul/nip-am-agent-turn-metrics: fix(profile): consolidate agent profile runtime metadata (#1451) fix(desktop): simplify workspace rail badges (#1462) perf(desktop): instant channel switching — non-blocking first paint, persisted snapshots (#1452) perf(relay): bounded-concurrency multi-filter query execution (S2) (#1457) fix(desktop): classify timeline prepends so history loads don't bump unread (#1416) fix(desktop): quiet gate for workspace switches instead of boot splash (#1449) fix(read-path): reach complete threads, dense-second timelines, and all people in the GUI (#1418) E1+E3: reduce relay ingest/fan-out DB round trips; ack p99 −7–16%, fd p99 −6–28%, p999 tails −29–53% vs PR #1453 tip (#1454) perf(relay): defer post-commit dispatch and avoid verify clone (#1453) fix(relay): include git hook tools in runtime image (#1326) feat(chart): per-pod emptyDir git scratch when persistence disabled (multi-replica HA) (#1450) fix(relay): remove media bearer-token auth (#1444) fix(desktop): stop search shortcut from hijacking the sidebar (#1447) Co-authored-by: Will Pfleger <pfleger.will@gmail.com> Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
Stacked on #1453 (base: `eva/relay-perf-w8-arc-verify`; retargets to `main` when #1453 merges). Three commits, each independently reviewed GREEN by Wren against Quinn's correctness rulings (`RESEARCH/RELAY_PERF_CORRECTNESS.md` §4.8 + phase-2 addendum, §4.10):

## What

- `d33b4636`, E1-phase-1: fetch channel row once per ingest request. Ingest re-SELECTed the full channel row up to three times per accepted event (membership open-visibility fallback, archived gate, join-visibility check). One community-scoped fetch after `channel_id` resolution now feeds all three. Missing-row semantics preserved per gate (incl. kind:9007 pre-create). No cross-request cache, no invalidation surface.
- `42dd950d`, E1-phase-2: thread channel visibility from ingest into fan-out. Visibility resolved once at ingest through the same `channel_visibility_cached` gate fan-out uses (seeded with the phase-1 row) and threaded into `dispatch_persistent_event` as a `ThreadedChannelVisibility` bundle. Ruling fences, verified in review:
  1. Fail-closed on error is preserved: an ingest-side lookup failure or missing row threads `None`; fan-out does its own fresh fail-closed lookup. `None` is never "assume open".
  2. The threaded read goes through `channel_visibility_cached`: cached `private` wins over the prefetched row; row-derived `private` still populates the cache.
  3. The threaded value stays community-scoped: it travels bundled with the `(community_id, channel_id)` it was resolved under and is consulted only on exact id equality at fan-out; mismatch falls back fresh (channel UUIDs collide across communities, `Inv_LabelPropagation`). Pubsub/cross-node and ephemeral paths pass `None`: threading is same-request/same-node only.
- `30d29414`, E3: per-channel enabled-workflow cache with sync invalidation. moka cache in `WorkflowEngine` keyed `(community_id, channel_id)`, TTL 10s (ruling allows ≤30s), negatives cached, so the common no-workflow channel skips a per-event SELECT. Full mutation-site audit: the only live writers of trigger eligibility/channel binding are kind:30620 command upsert and NIP-09 a-tag delete; both invalidate synchronously (`delete_workflow_for_owner` now `RETURNING channel_id`). Unused buzz-db mutators carry doc fences requiring shared invalidation from future callers. No cross-pod invalidation, deliberately: triggering is not an access-control fence; the worst cross-pod case is a just-mutated workflow mis-firing/missing for ≤10s.

## Bench (Sami, three-protocol A/B vs `c61b4c14` = #1453 tip; all runs 0 timeouts, warmup discarded)

Attribution by mechanism: E1-phase-1/2 remove channel-row/visibility SELECTs from accepted channel-event paths (sender-side ack win); E3 removes the per-event no-workflow SELECT via negative cache (large p999 tail win). Receiver-side first_delivery p99 improved on every protocol: same-pod fd is now below raw baseline, so W1's local spawn-hop trade (recorded on #1453) is paid back by this stack. No regressions: ack p50 deltas are noise, zero timeouts. Run files: `RESEARCH/RELAY_PERF_BENCH_RUNS/e1e3-30d29414-*`.

## Validation

- `cargo test -p buzz-relay -p buzz-workflow -p buzz-db`: relay 438 lib (incl. 3 new phase-2 fence tests) + 1 main, workflow 148, db 79; 0 failed
- `cargo clippy --all-targets` clean, `cargo fmt --check` clean, pre-push hooks green