Skip to content

Scoped feature flags: resolution and gating#36959

Open
antiguru wants to merge 2 commits into
MaterializeInc:mainfrom
antiguru:claude/great-brahmagupta-ybi2yf
Open

Scoped feature flags: resolution and gating#36959
antiguru wants to merge 2 commits into
MaterializeInc:mainfrom
antiguru:claude/great-brahmagupta-ybi2yf

Conversation

@antiguru

@antiguru antiguru commented Jun 10, 2026

Copy link
Copy Markdown
Member

Motivation

The resolution layer for scoped feature flags. The foundation (#37079), the
durable persistence and introspection (#37080), and the catalog-implications
foundation (#37151, which derives the replica-scoped controller push from the
committed catalog diff) have all merged. Create-time render-frozen correctness
is completed in the follow-up #37158. Design:
doc/developer/design/20260609_scoped_feature_flags.md (#36947).

Behind the enable_scoped_system_parameters dyncfg (off by default), this PR
consumes the durable scoped overrides and resolves them at the per-scope
boundaries.

Description

  • The sync loop evaluates the cluster (replica-free) and replica (size, size
    family, owning cluster) LaunchDarkly contexts and records an override only when
    the scoped value differs from the environment-wide value, compared in
    canonical encoding so bool spellings (on/off vs true/false) match. LD
    multi-context evaluation cannot reveal which context produced a value, so the
    differs-from-env test, not the variation_detail reason, is the recording
    rule. The evaluation is skipped while the gate is off, and the gate is read
    from the sync working copy so a disabled environment takes no catalog snapshot
    per tick.
  • Resolution boundaries: cluster-coherent overrides feed plan-time
    OptimizerFeatureOverrides (with a lenient bool decode so on/off cannot
    panic the optimizer). Replica-local overrides reach the compute controller
    through the catalog implication from adapter: derive replica-scoped override push from catalog implications #37151.
  • Create-time: a post-transact resolve folds a freshly created object's overrides
    into the working copy so it need not wait for the next sync tick. The follow-up
    Scoped feature flags: create-time evaluation #37158 moves this into the create transaction so render-frozen replica flags are
    set before create_replica.

Verification

Canonical bool bridging, lenient optimizer decode, and frontend context unit
tests, plus the test/launchdarkly mzcompose workflow extended with
cluster/replica/durability/gating/create-time cases, validated end-to-end
against a live LaunchDarkly project.

🤖 Generated with Claude Code

https://claude.ai/code/session_01FUorxx5m6B8TT2VAek2STo

@antiguru antiguru force-pushed the claude/great-brahmagupta-ybi2yf branch 2 times, most recently from e2c7200 to 59a43c8 Compare June 15, 2026 11:08
@antiguru antiguru marked this pull request as ready for review June 15, 2026 14:22
@antiguru antiguru requested review from a team as code owners June 15, 2026 14:22
Comment thread src/adapter/src/catalog/apply.rs Outdated
Comment thread src/adapter/src/config/frontend.rs Outdated
@antiguru antiguru force-pushed the claude/great-brahmagupta-ybi2yf branch from 877ff6a to cc7f88c Compare June 15, 2026 15:26

@def- def- left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per-cluster LD feature flags in combination with a cluster-targeted optimizer flag crash:
Running cargo test -p mz-repr --lib bool_override_decodes_lenient_spellings with this diff:

diff --git a/src/repr/src/optimize.rs b/src/repr/src/optimize.rs
index 75d4a56c71..d0d3fcef05 100644
--- a/src/repr/src/optimize.rs
+++ b/src/repr/src/optimize.rs
@@ -189,3 +189,21 @@ macro_rules! impl_optimizer_feature_type {
 // Implement `OptimizerFeatureType` for all types used in the
 // `optimizer_feature_flags!(...)`  call above.
 impl_optimizer_feature_type![bool, usize];
+
+#[cfg(test)]
+mod tests {
+    use std::collections::BTreeMap;
+
+    use super::OptimizerFeatureOverrides;
+
+    #[mz_ore::test]
+    fn bool_override_decodes_lenient_spellings() {
+        for (stored, want) in [("true", true), ("false", false), ("on", true), ("off", false)] {
+            let map = BTreeMap::from([("enable_eager_delta_joins".to_string(), stored.to_string())]);
+            assert_eq!(
+                OptimizerFeatureOverrides::from(map).enable_eager_delta_joins,
+                Some(want),
+            );
+        }
+    }
+}

panics:

thread 'optimize::tests::bool_override_decodes_lenient_spellings' (4146015) panicked at src/repr/src/optimize.rs:191:1:
called `Result::unwrap()` on an `Err` value: ParseBoolError
stack backtrace:
   0: __rustc::rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::result::unwrap_failed
   3: <bool as mz_repr::optimize::OptimizerFeatureType>::decode
   4: <mz_repr::optimize::OptimizerFeatureOverrides as core::convert::From<alloc::collections::btree::map::BTreeMap<alloc::string::String, alloc::string::String>>>::from
   5: mz_repr::optimize::tests::bool_override_decodes_lenient_spellings::test_impl
   6: mz_repr::optimize::tests::bool_override_decodes_lenient_spellings
   7: mz_repr::optimize::tests::bool_override_decodes_lenient_spellings::{closure#0}
   8: <mz_repr::optimize::tests::bool_override_decodes_lenient_spellings::{closure#0} as core::ops::function::FnOnce<()>>::call_once

@antiguru

Copy link
Copy Markdown
Member Author

Good catch — fixed in 8ea690964b (folded into the catalog/adapter commit). <bool as OptimizerFeatureType>::decode now accepts the canonical on/off spellings in addition to true/false, so a cluster-coherent override served from LaunchDarkly as on/off no longer panics the optimizer at plan time. Added your bool_override_decodes_lenient_spellings test (passes).

@antiguru antiguru force-pushed the claude/great-brahmagupta-ybi2yf branch from cc7f88c to 75f521c Compare June 15, 2026 17:26
@antiguru antiguru force-pushed the claude/great-brahmagupta-ybi2yf branch from 75f521c to 43c4d79 Compare June 16, 2026 17:19
@antiguru antiguru changed the title Scoped feature flags: per-cluster and per-replica LaunchDarkly overrides Scoped feature flags 3/3: resolution, gating, and create-time evaluation Jun 16, 2026

@DAlperin DAlperin left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, modulo one fix

Comment thread src/adapter/src/config/frontend.rs Outdated
let differs = match params.canonicalize(param_name, &value) {
Some(canonical) => canonical != base,
// Unparseable for this var: fall back to a raw comparison.
None => value != base,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a risk here. If LD gives us a badly typed response for a bool value (lets say maybe) canconicalize will return false maybe != off so we store maybe but then bool::decode will panic for every query plan I think. Probably better to just do None here.

@antiguru antiguru force-pushed the claude/great-brahmagupta-ybi2yf branch from 43c4d79 to 7296095 Compare June 17, 2026 09:59
@antiguru antiguru requested a review from def- June 17, 2026 11:25
@antiguru antiguru force-pushed the claude/great-brahmagupta-ybi2yf branch 2 times, most recently from ea2b4e1 to 44cd10a Compare June 17, 2026 12:00
antiguru added a commit that referenced this pull request Jun 17, 2026
…plumbing (#37079)

### Motivation

First of a three-PR stack splitting #36959 (scoped feature flags) for
review:
**1/3 (this) → 2/3 #37080 → 3/3 #36959**. Design:
doc/developer/design/20260609_scoped_feature_flags.md (#36947).

This PR is the additive foundation: it introduces the vocabulary the
later PRs
build on, with no behavior change on its own.

### Description

* `ParameterScope` (`Environment` / `Cluster` / `Replica`), declared on
system
vars and dyncfg `Config`s and carried through to the synced system vars.
The
declaration is the single source of truth for which contexts get
evaluated.
* The size-family taxonomy: `ReplicaAllocation::family` plus a size-map
`family`
  field, with a `cc` / `legacy` fallback for sizes that don't set one.
* The compute controller's per-replica dyncfg override layer
(`update_replica_dyncfg_overrides` + per-replica command
specialization),
  inert until the adapter wires it in 3/3.

Nothing consumes the scope or the override layer yet, so this is a
no-op.

### Verification

`test_replica_allocation_family` covers the size→family fallback; the
rest is
exercised by the later PRs in the stack.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
@antiguru antiguru force-pushed the claude/great-brahmagupta-ybi2yf branch 4 times, most recently from 7561342 to d77eb1b Compare June 18, 2026 09:22

@aljoscha aljoscha left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the top commit only (d77eb1bc, the resolution layer), per the stack note.

Overall this is a solid, cohesive change and the correctness reasoning holds up well: the differs-from-env recording rule, the canonical bool bridging so on/off vs true/false don't register as spurious overrides, the lenient optimizer decode guarded by canonicalize so a stored value can't panic the optimizer, the create-time resolution ordered before render-freeze, and the revert-via-full-compute_config-re-push are all sound. The end-to-end LD test covers the interesting paths (differs-from-env guard, recreate-by-name, bidirectional reconcile, gating, durability, create-time). No blocking correctness issues found.

A few things to clean up, mostly around our writing/comment conventions:

Strong suggestions

  1. Em-dashes and semicolons in comments. Our style is to avoid em-dashes and sentence-structuring semicolons everywhere, including code comments. This commit adds ~22 lines with em-dashes and a similar number of structuring semicolons (heaviest in coord.rs and config/frontend.rs). Please do a pass replacing them with full stops and commas. Inline suggestions below mark representative spots.

  2. Doc comments that narrate / duplicate the body. evaluate_scoped_overrides is the clearest case: the differs-from-env-vs-variation_detail rationale is spelled out almost verbatim in both the doc comment and the inline comment at the decision point. Our convention is for the doc comment to state the contract and for the reasoning to live inline at the decision point. Trim the doc comment (suggestion inline).

  3. scoped_ld_ctx is a dead public wrapper. It only forwards to the private ld_ctx, and the real callers (pull_cluster_overrides / pull_replica_overrides) already call ld_ctx directly. Its only users are the unit tests, which live in the same module and can call ld_ctx too. Suggest dropping both the wrapper and its pub use re-export.

Nits

  • push_replica_dyncfg_overrides: the "compute push reaches storage via the shared persist ConfigSet" subtlety is genuinely worth recording, so keep it, but the doc comment is long and could be tightened (and is part of the em-dash pass).
  • Consider a unit-level test for the evaluate_scoped_overrides differs-from-env decision itself. The canonicalize and decode units are good, but the core recording decision is currently only exercised by the live-LD mzcompose workflow. Not blocking given the LD client dependency.

Comment thread src/adapter/src/config/frontend.rs Outdated
Comment on lines +258 to +271
/// Evaluates each of `param_names` against `ctx`, returning only the values
/// that *differ* from the environment-wide value held in `params`. Shared by
/// the cluster and replica passes; the returned map is therefore sparse.
///
/// The difference — not the `variation_detail` reason — is the signal that
/// the scope context produced a genuine opinion. The reason cannot say which
/// context kind's clause matched (an env-level rollout rule and a
/// cluster-specific rule both report `RULE_MATCH`), and `FALLTHROUGH` serves
/// the env-wide value to every object; either would record a row for every
/// object (dense) and let a coarse env-wide value override a finer manual
/// `CREATE CLUSTER ... FEATURES` pin. Comparing against the env-wide baseline
/// captures exactly "this scope context changed the answer". A silent LD
/// (flag absent / off / error / failed prerequisite) resolves to the `base`
/// default and is dropped. See the scoped feature flags design, §Resolution.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The differs-from-env rationale here is duplicated almost verbatim by the inline // Record iff ... comment at the recording decision below. Per our convention the doc comment states the contract and the reasoning lives inline at the decision point, so this paragraph can go. (It also carries several em-dashes/semicolons.) Suggested trim:

Suggested change
/// Evaluates each of `param_names` against `ctx`, returning only the values
/// that *differ* from the environment-wide value held in `params`. Shared by
/// the cluster and replica passes; the returned map is therefore sparse.
///
/// The difference — not the `variation_detail` reason — is the signal that
/// the scope context produced a genuine opinion. The reason cannot say which
/// context kind's clause matched (an env-level rollout rule and a
/// cluster-specific rule both report `RULE_MATCH`), and `FALLTHROUGH` serves
/// the env-wide value to every object; either would record a row for every
/// object (dense) and let a coarse env-wide value override a finer manual
/// `CREATE CLUSTER ... FEATURES` pin. Comparing against the env-wide baseline
/// captures exactly "this scope context changed the answer". A silent LD
/// (flag absent / off / error / failed prerequisite) resolves to the `base`
/// default and is dropped. See the scoped feature flags design, §Resolution.
/// Evaluates each of `param_names` against `ctx`, returning only the values
/// that differ from the environment-wide value held in `params`. Shared by
/// the cluster and replica passes, so the returned map is sparse.
///
/// We record on the differs-from-env test, not the `variation_detail`
/// reason. The inline comment at the recording decision explains why.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied your suggested trim — the doc now states the contract and points at the inline comment for the differs-from-env reasoning.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed. The doc now states the contract and defers to the inline comment for the reasoning. Thanks.

Comment thread src/adapter/src/config/frontend.rs Outdated
///
/// The environment-wide pass passes `None` for both. This is the single entry
/// point the sync loop uses to evaluate each scoped pass.
pub fn scoped_ld_ctx(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scoped_ld_ctx only forwards to the private ld_ctx, and the real callers (pull_cluster_overrides / pull_replica_overrides) already call ld_ctx directly. Its only users are the unit tests in this module, which can call ld_ctx directly too. Suggest dropping this wrapper and the pub use frontend::{... scoped_ld_ctx} re-export in config.rs.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropped the scoped_ld_ctx wrapper and its config.rs re-export; the unit tests call the private ld_ctx directly (same module), and its doc moved onto ld_ctx. Real callers were already on ld_ctx.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed. The wrapper and the re-export are gone, the tests call ld_ctx directly, and the stray [scoped_ld_ctx] doc reference was updated too. Thanks.

Comment thread src/adapter/src/coord.rs Outdated
Comment on lines +2114 to +2131
/// Resolves the replica-local scoped overrides from the catalog working copy
/// into the compute controller's per-replica dyncfg layer, then re-pushes
/// the environment-wide compute configuration so existing replicas observe
/// the new values. Called after reconcile and on bootstrap.
///
/// Only the *compute* controller's per-replica dyncfg layer is pushed here,
/// but on `clusterd` that also reaches storage for the configs that matter.
/// Compute and storage share one process, and the compute worker's
/// `handle_update_configuration` applies the pushed dyncfg updates both to
/// compute's own worker `ConfigSet` *and* to the shared persist client
/// `ConfigSet` (`persist_clients.cfg()`), which the co-located storage server
/// reads from the same `Arc`. So persist-backed and process-global
/// replica-local configs — the design's examples: the persist pager, LZ4,
/// persist client tuning, `lgalloc` — take effect on storage as well, even
/// though the push goes through the compute controller. The only thing this
/// would miss is a future `Replica`-scoped config realized *solely* in the
/// storage worker's own `ConfigSet` (applied only via the storage
/// controller's `UpdateConfiguration`); none exists today.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the "compute push reaches storage via the shared persist ConfigSet" subtlety is worth recording, so keep it, but this doc comment is long and stacks several em-dashes/semicolons. Consider tightening it and folding the example list into prose as part of the style pass.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tightened: folded the example list into prose, cut the em-dashes/semicolons, kept the shared-persist-ConfigSet storage-reach subtlety.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed. Reads well now and keeps the shared-persist-ConfigSet storage-reach subtlety. Thanks.

@antiguru antiguru force-pushed the claude/great-brahmagupta-ybi2yf branch from d77eb1b to c34249e Compare June 18, 2026 09:56
@aljoscha

Copy link
Copy Markdown
Contributor

Re-reviewed the responses. The three inline fixes all look good, including the follow-on cleanup of the stray scoped_ld_ctx doc reference.

One item from the review summary is still open: the broader em-dash and structuring-semicolon pass across the new comments. The coord.rs doc comment I flagged inline is clean now, but the rest of the new comments still carry them. By a quick count there are about 19 added lines with em-dashes and about 17 comment lines with structuring semicolons. Examples: the // Record iff ... block in config/frontend.rs, the scoped_frontend field doc and the resolve_scoped_for_new_objects / push_new_replica_scoped_override docs in coord.rs, and the ReplicaScopeContext::size_family and canonicalize docs. It is a mechanical pass, flagging only because it is a project-wide convention. Happy to list the exact lines if that helps.

@antiguru antiguru force-pushed the claude/great-brahmagupta-ybi2yf branch from c34249e to 6f6a849 Compare June 18, 2026 11:44
@antiguru

Copy link
Copy Markdown
Member Author

Done the em-dash and structuring-semicolon pass over the new comments. Removed the em-dashes (reworded into separate sentences or commas) and the structuring semicolons (split into sentences) across config/frontend.rs, config/params.rs, config/sync.rs, command.rs, coord.rs, and test/launchdarkly/mzcompose.py — including the // Record iff ... block, the scoped_frontend field doc, the resolve_scoped_for_new_objects / push_new_replica_scoped_override docs, and the ReplicaScopeContext::size_family / canonicalize docs you called out. The remaining em-dashes in client.rs / coord/introspection.rs / frontend_peek.rs are pre-existing lines this PR does not touch, so I left them.

On the non-blocking suggestion: I agree a direct unit test of the differs-from-env decision would be nice, but evaluate_scoped_overrides takes a live ld::Client with no seam to inject canned variations, so it would need a fake-client shim that does not exist yet. The decision's two halves are unit-covered (canonicalize bridging and the lenient decode), and the end-to-end behavior is exercised by the test/launchdarkly workflow. Happy to add the shim as a follow-up if you think it's worth the surface.

@aljoscha

Copy link
Copy Markdown
Contributor

Did a focused performance pass on the hot paths and continual triggers. Summary: the read/plan path is fine, the costs are all on the write side and only when the feature is enabled.

Plan / peek path: no concern. cluster_scoped_optimizer_overrides() is added to every optimizer-config build (peek, MV, index, subscribe, introspection, fast-path peek). With no override for the target cluster (always the case while gated off) it is one miss on an empty BTreeMap plus a default 19-field OptimizerFeatureOverrides and 19 failed override_from branches. No allocation, no locking. Good.

push_new_replica_scoped_override evaluates and discards all cluster overrides (every CREATE ... REPLICA, when on). It calls evaluate_scoped_parameters(.., None, Some(&replicas)) with cluster_filter = None, so it runs an LD pass over every cluster's cluster-coherent flags and then uses only evaluated.replica. The cluster-create path passes an empty set for the mirror case. Suggest passing Some(&BTreeSet::new()) as the cluster filter here, so a replica creation does not run and throw away an O(#clusters x #cluster-params) LD evaluation.

push_new_replica_scoped_override re-pushes the full env-wide compute config unconditionally (every CREATE ... REPLICA, when on). push_replica_dyncfg_overrides_from always calls update_replica_dyncfg_overrides then update_configuration(compute_config). update_configuration clones the full dyncfg ComputeParameters per instance and broadcasts to every replica (controller.rs:603), the same cost as an env-wide ALTER SYSTEM push. It fires even when the new replica has no override (the common case), and it runs synchronously on the coordinator thread. Creating a managed cluster with N replicas triggers N such re-pushes. Suggest an early-out when the new replica contributes nothing, e.g. if evaluated.replica.is_empty() { return; }.

That early-out also removes the common case of a subtle transient: replica create-time overrides are not written back to the catalog working copy (in-memory/controller only, by design), so merged is rebuilt from the stale working copy. If replica A was given a create-time override and replica B is then created before the next reconcile, the push for B rebuilds from the working copy that does not contain A, momentarily clearing A's override at the controller until the next tick. Bounded by the 1s reconcile window and harmless for render-frozen flags, but worth being aware of for replica-local configs that take effect on update.

Continual trigger: per-tick O(fleet) evaluation when enabled. With the feature on, sync_scoped_params runs every tick (1s in prod) and rebuilds eval contexts for every cluster and replica (several String allocations each) and runs an LD variation per object per scoped param, regardless of whether anything changed. This is on the dedicated sync-loop task so it does not block the coordinator, but it is continual O(#clusters + #replicas) CPU and allocation churn every second. Fine for small fleets, worth keeping in mind for large ones. The coordinator-side per-tick cost is just the CatalogSnapshot Arc-clone plus the UpdateScopedSystemParameters equality early-out, both negligible.

Minor: SynchronizedParameters::new(catalog.system_config().clone()) on the create-cluster and create-replica paths deep-clones the whole SystemVars (~450 vars) and builds a ~450-element BTreeSet. Only on those DDL paths and only when enabled, so acceptable, just noting it is not a cheap call.

Net: nothing on the read path, and the write-path costs are gated off by default. The two worth acting on are the discarded cluster evaluation and the unconditional full compute re-push on replica creation.

@antiguru antiguru force-pushed the claude/great-brahmagupta-ybi2yf branch from 6f6a849 to fc13ba4 Compare June 18, 2026 11:51
@antiguru

Copy link
Copy Markdown
Member Author

Thanks for the perf pass. Acted on the two write-path items in push_new_replica_scoped_override:

  • Discarded cluster evaluation: now passes Some(&BTreeSet::new()) as the cluster filter, so a CREATE ... REPLICA runs only the replica-local pass and no longer evaluates+throws away the O(#clusters x #cluster-params) cluster-coherent LD pass.
  • Unconditional full compute re-push: added an early-out — if evaluated.replica.is_empty() { return; } — before the merge and push. The new replica having no override is the common case, so we skip the per-instance update_configuration broadcast entirely. As you noted, this also removes the common-case transient where rebuilding the pushed map from the working copy (which does not carry other replicas not-yet-reconciled create-time overrides) momentarily clears them at the controller. The rare has-override case keeps the current behavior, with that transient bounded by the 1s reconcile window.

On the rest, agreed and no change:

  • Plan/peek path: as you found, gated-off is an empty-map miss + a default-overrides no-op. Nothing to do.
  • Per-tick O(fleet) evaluation when enabled: real, but on the dedicated sync-loop task and only when the feature is on. Noted for large fleets; if it bites we can diff-and-skip unchanged objects or widen the tick.
  • SynchronizedParameters::new(... .clone()) on the DDL paths: only on create-cluster/create-replica and only when enabled. Acceptable, noted.

@aljoscha

Copy link
Copy Markdown
Contributor

Follow-up on the per-tick note, on whether this is new work or just more of what the sync loop already did. It is net-new, and it runs unconditionally, so worth calling out explicitly.

Before this feature, the sync loop did two things per tick: backend.pull to fetch current values from the coordinator, and frontend.pull to evaluate each synced param against the single env-wide LD context. That env-wide pass is O(#synced params) against one context, and it only does the downstream backend.push when a value actually changed. No catalog snapshot, no per-object work.

This PR adds, on every tick:

  • an unconditional catalog_snapshot().await, and
  • when enabled, evaluate_scoped_parameters which builds eval contexts for every cluster and replica (O(#clusters + #replicas), several String allocations each) and runs an LD variation per object per scoped param (O(#objects x #scoped-params)), followed by an update_scoped_system_parameters().await.

So this is not the env-wide pass getting a bit bigger. The object factor is entirely additional. And unlike the env-wide pass, which gates its expensive work on "did anything change", the scoped pass runs the full fleet evaluation every tick regardless. The coordinator early-outs on the equality check in reconcile_scoped_system_parameters, but the sync-task-side evaluation does not, so we redo the whole O(#objects x #scoped-params) LD pass and the per-object context allocations once per second whether or not anything changed.

It is on the dedicated sync-loop task so it does not block the coordinator, and it is gated off by default. But it is continual extra work proportional to fleet size, so worth a deliberate look before this goes on by default, especially for large environments. A cheap guard would be to skip the pass when neither the set of live objects nor the relevant LD state has changed since the last tick, though detecting the latter is the hard part.

@antiguru antiguru force-pushed the claude/great-brahmagupta-ybi2yf branch from fc13ba4 to 210c04f Compare June 18, 2026 13:20
@antiguru

Copy link
Copy Markdown
Member Author

Agreed, the object factor is net-new and the scoped pass runs every tick regardless of change. Two parts here:

The off-path should not regress, since a patch release to undo it would be painful. Fixed in 210c04f: the scoped reconcile now reads the gate from the sync working copy (params) instead of taking a catalog_snapshot().await every tick, so a disabled environment (the default) does no per-tick coordinator round-trip. A small maybe_dirty flag still does the one-time clear of any leftover overrides on startup and on an enable->disable transition, then goes quiet.

The on-path full-fleet evaluation is the part worth optimizing, and since this will be on by default we will want it. I would rather do that in a follow-up than widen this PR: a cheap object-set/env-change guard is unsafe on its own (a scoped LD rule can change with neither the live-object set nor the env-wide context changing, which would silently freeze scoped overrides), so the real fix is either a dedicated slower scoped tick or hooking LD streaming changes. Tracking that separately.

@antiguru antiguru force-pushed the claude/great-brahmagupta-ybi2yf branch from 210c04f to d3df8f0 Compare June 19, 2026 09:17
@antiguru antiguru marked this pull request as draft June 19, 2026 09:53
@antiguru antiguru force-pushed the claude/great-brahmagupta-ybi2yf branch 2 times, most recently from edb3bb6 to d75b222 Compare June 19, 2026 12:51
@antiguru antiguru changed the title Scoped feature flags 3/3: resolution, gating, and create-time evaluation Scoped feature flags: resolution and gating Jun 19, 2026
@antiguru antiguru marked this pull request as ready for review June 19, 2026 12:53
@antiguru antiguru force-pushed the claude/great-brahmagupta-ybi2yf branch from d75b222 to db631b9 Compare June 19, 2026 13:28
antiguru added a commit that referenced this pull request Jun 19, 2026
#37151)

### Motivation

Scoped feature flags need the per-replica controller push to be *derived
from
the committed catalog diff* rather than issued as a side effect in the
sequencer. Deriving it removes the working-copy/controller divergence
that lets
a create-time replica override be wiped by a reconcile in the same DDL,
and it
makes the push fire for a follower `environmentd` that only replays the
catalog
diff. This is the pattern the catalog-implications framework (#29673,
`TODO(aljoscha)`) exists to provide, which until now ignored config-type
changes.

### Description

`parse_state_update` no longer drops `ReplicaSystemConfiguration`
updates; it
emits a `ParsedStateUpdateKind::ReplicaSystemConfiguration`. When such
an update
is present in a transaction, `apply_catalog_implications` rebuilds the
complete
per-replica compute dyncfg layer from the catalog working copy and
pushes it
after clusters are created and before replicas are created, so a new
replica's
first configuration replays with its override. The same push runs once
on
bootstrap from the restored working copy.

Because the push reads the working copy that the same transaction's
catalog
apply has already updated, the diff drives the push. The push mechanism
(`push_replica_dyncfg_overrides`) lives here. Cluster-scoped overrides
are read
at plan time and have no controller effect, so they remain unparsed.

This is the foundation under #36959, which adds the writers that drive
the
implication; the implication is dormant until that PR lands on top.

### Verification

Adds a unit test asserting a `ReplicaSystemConfiguration` update is
parsed
rather than dropped. End-to-end controller behavior is covered by the
LaunchDarkly mzcompose test in #36959.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

https://claude.ai/code/session_01FUorxx5m6B8TT2VAek2STo

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@antiguru antiguru force-pushed the claude/great-brahmagupta-ybi2yf branch 2 times, most recently from 96a390d to d790e90 Compare June 21, 2026 19:30

@aljoscha aljoscha left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the resolution and gating layer locally against main. Overall the design is solid and the comments are unusually good. One blocking correctness item from the new prune_scope rework, plus a couple of suggestions. Main item is inline on catalog/transact.rs, the rest is below.

Blocking

See the inline comment on src/adapter/src/catalog/transact.rs: dropped clusters and replicas leak their scoped-config rows, and as a knock-on the periodic sync stops converging. Details and a suggested fix are there.

Strong suggestions

  • No Rust unit test covers the UpdateScopedSystemParameters diff and prune semantics. The only coverage is the test/launchdarkly mzcompose workflow, which needs live credentials and is not in standard PR CI. A catalog-level unit test exercising upsert, remove, prune_scope bounding, and the dropped-object case would have caught the blocking item and guards this boundary going forward. The other new logic (canonicalize, lenient decode, context building) is unit tested, which is great.
  • The cluster-scoped optimizer override is wired in ad hoc at around a dozen plan-path call sites via .override_from(&self.cluster_scoped_optimizer_overrides(cluster_id)) (peek, index, MV, subscribe, introspection, bootstrap, frontend_peek). It looks complete today, but a future plan path that builds a per-cluster OptimizerConfig will silently miss the scoped layer. Consider a single helper that assembles the per-cluster features (system config plus config.features() plus scoped overrides) so the precedence lives in one place. Partly pre-existing, since config.features() is already repeated, but this PR roughly doubles the override points.

Nits

  • bool::decode in src/repr/src/optimize.rs still panics on truly invalid input, even though the doc says decoding must not crash the optimizer. It is effectively unreachable, since canonicalize gates what gets stored and the FEATURES clause is bool typed, so it is a defensive assertion rather than a real hazard. The wording just oversells the guarantee.
  • Replica create-time resolution is not done here. Only the cluster paths call resolve_scoped_for_new_objects, always with an empty replica set, so a freshly created replica's render-frozen flags use env-wide values until the next sync tick. The description documents this as deferred to #37158. Fine given the gate, just calling it out.

What is good

  • The differs-from-env recording rule with canonical bool bridging is the right call and is well explained and tested.
  • The gating works as intended: the loop reads the gate from params, and backend.pull seeds params from the catalog system vars every tick, so ALTER SYSTEM SET and additional_system_parameter_defaults are observed even though the gate is not an LD flag.
  • The prune_scope protection for concurrently created objects is sound, and the create-time merge correctly preserves existing overrides.
  • Create-time resolution does a local in-memory LD evaluation only, with no network round-trip on the coordinator loop, so it is cheap.

if prune_cluster(&cluster_id)
&& !desired_cluster.contains(&(cluster_id, name.clone()))
{
tx.remove_cluster_system_config(cluster_id, &name);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking: this prune_scope change drops the lazy GC of orphaned rows for dropped objects, with a few knock-on effects.

The sync loop builds prune_scope from catalog.clusters(), so it only ever contains live objects. A dropped cluster or replica is never in prune_scope, so prune_cluster / prune_replica returns false for it and its rows are never removed. Nothing else GCs these rows either: remove_clusters / remove_cluster_replicas cascade only to cluster_replicas and introspection_sources, and the only callers of remove_*_system_config are right here. The previous code pruned any row not in the desired state, which lazily covered dropped objects (the old comment said exactly that).

Consequences:

  1. Durable and introspection leak. mz_cluster_system_parameters / mz_replica_system_parameters are unfiltered projections of these collections, so orphan rows with a dangling cluster_id / replica_id accumulate until the feature is toggled off (the only full wipe).
  2. The per-tick early return is defeated. reconcile_scoped_system_parameters skips the durable write when working_copy == scoped, but the sync loop's scoped is live-only while the working copy retains orphans, so they can never be equal once an orphan exists. The coordinator then runs a catalog_transact every tick that can never converge.
  3. The recreate-then-clear case in test/launchdarkly/mzcompose.py should fail: after DROP plus CREATE ld_role, clearing the rules and asserting count(*) ... = 0 will see the orphan row from the dropped incarnation. I could not run it locally since it needs a live LD project.

Suggested fix, local to this handler: also prune rows whose owning object is not live. live_clusters / live_replicas are already computed just above, so something like (!live_clusters.contains(&cluster_id) || prune_cluster(&cluster_id)) && !desired_cluster.contains(...) (and the replica analogue). That restores dropped-object GC while keeping the concurrent-create protection that prune_scope was added for. Orphan pruning is always safe since ids are never reused.

c.testdrive(
"\n".join(
[
f"> SELECT count(*) FROM mz_internal.mz_cluster_system_parameters WHERE name = '{OPTIMIZER_PARAM}'",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This count(*) = 0 assertion is where the orphan-GC regression surfaces. After the DROP plus CREATE of ld_role above, the dropped incarnation's row is not pruned by the reconcile, since its id is no longer in prune_scope, so this count should be 1, not 0. See the comment on src/adapter/src/catalog/transact.rs. Worth re-running against the live LD project once the prune fix is in.

Replica ids were allocated in-apply by insert_cluster_replica, reading
the IdAlloc counter from the snapshot the coordinator opened the
transaction with. Cluster and item ids are instead allocated out-of-band
by the durable allocator before the transaction, and the op carries the
explicit id. That asymmetry breaks any caller that pre-allocates a
replica id out-of-band: the coordinator does not apply IdAlloc updates to
its in-memory catalog, so a later in-apply allocation reuses the id and a
DuplicateReplica panic surfaces on the next bootstrap.

Make replica ids durable-allocated and single-source like cluster ids.
Op::CreateClusterReplica now carries a required replica_id; the apply
path always uses insert_cluster_replica_with_id and never allocates. The
runtime construction sites pre-allocate via the durable allocator,
choosing user or system ids from the owning cluster. The catalog-open
builtin-replica migration keeps a transaction-level system allocation,
which is single-source inside the open transaction. The in-apply
insert_cluster_replica and the unused transaction-level user allocator
are removed.

DB-147

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FUorxx5m6B8TT2VAek2STo
…tion

The resolution layer that consumes the durable scoped overrides, behind the
`enable_scoped_system_parameters` dyncfg (off by default).

* The sync loop evaluates the `cluster` and `replica` LaunchDarkly contexts and
  records an override only when the scoped value differs from the environment-
  wide value, comparing in canonical encoding so bool spellings match. The whole
  evaluation is skipped while the gate is off, and the working copy cleared.
* Resolution boundaries: cluster-coherent overrides feed plan-time
  `OptimizerFeatureOverrides` (with a lenient bool decode so `on`/`off` cannot
  panic the optimizer); replica-local overrides feed the compute controller's
  per-replica dyncfg push.
* Create-time evaluation resolves a freshly created cluster or replica through a
  shared `SystemParameterFrontend` so new objects get their overrides without
  waiting for the next sync tick.
* Tests: canonical bool bridging, lenient optimizer decode, frontend context
  construction, the LaunchDarkly end-to-end cases, and workload-harness
  registration of the new gate.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@antiguru antiguru force-pushed the claude/great-brahmagupta-ybi2yf branch from d790e90 to 926f501 Compare June 22, 2026 10:09
@aljoscha

Copy link
Copy Markdown
Contributor

Thanks for the quick turnaround. The rebased revision (926f501) resolves the blocking item: the prune loop now reclaims orphaned rows for dropped objects (!live_clusters.contains(..) || prune_cluster(..)), and the new test_transact_update_scoped_system_parameters_prune covers upsert, bounded prune, stale in-scope prune, and the orphan case. That also clears my note on the recreate-then-clear assertion in the launchdarkly test. (My earlier inline comments are now outdated against this push.)

Remaining items, all non-blocking:

  • The cluster-scoped optimizer override is still applied ad hoc at roughly a dozen plan-path call sites. Not a problem today, but a future plan path that builds a per-cluster OptimizerConfig can silently miss the scoped layer. A single per-cluster feature-assembly helper would keep the precedence in one place. Fine as a follow-up.
  • bool::decode in repr/optimize.rs still panics on truly invalid input even though the doc says it must not crash the optimizer. It is effectively unreachable since canonicalize gates what gets stored, so this is just a wording nit.
  • Replica create-time resolution remains deferred to Scoped feature flags: create-time evaluation #37158, as documented.

LGTM once you are happy with those. Nice work on the orphan-prune test in particular.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants