Scoped feature flags 2/3: durable persistence and introspection (inert)#37080
Conversation
ba7c8fb to
37dc4cd
Compare
8916620 to
98a035d
Compare
aljoscha
left a comment
There was a problem hiding this comment.
see the one inline comment about builtins please
…plumbing (#37079) ### Motivation First of a three-PR stack splitting #36959 (scoped feature flags) for review: **1/3 (this) → 2/3 #37080 → 3/3 #36959**. Design: doc/developer/design/20260609_scoped_feature_flags.md (#36947). This PR is the additive foundation: it introduces the vocabulary the later PRs build on, with no behavior change on its own. ### Description * `ParameterScope` (`Environment` / `Cluster` / `Replica`), declared on system vars and dyncfg `Config`s and carried through to the synced system vars. The declaration is the single source of truth for which contexts get evaluated. * The size-family taxonomy: `ReplicaAllocation::family` plus a size-map `family` field, with a `cc` / `legacy` fallback for sizes that don't set one. * The compute controller's per-replica dyncfg override layer (`update_replica_dyncfg_overrides` + per-replica command specialization), inert until the adapter wires it in 3/3. Nothing consumes the scope or the override layer yet, so this is a no-op. ### Verification `test_replica_allocation_family` covers the size→family fallback; the rest is exercised by the later PRs in the stack. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
98a035d to
32fe560
Compare
aljoscha
left a comment
There was a problem hiding this comment.
Reviewed the second commit only (catalog: persist scoped system parameters and surface in introspection).
Overall this is clean and closely mirrors the existing durable-collection plumbing; the inert-foundation framing is clear and the commit message is excellent. A few things, grouped by severity (details inline).
Blocking
- CI is red (
slt-1):test/sqllogictest/catalog_server_explain.sltis out of date — it's missing theEXPLAIN MATERIALIZED VIEWentries for the two new MVs (the commit bumped the row counts but didn't add the per-MV blocks). Regenerate withbin/mzcompose --find sqllogictest run catalog-server-explain --rewrite.
Strong suggestions
- Don't name the vendor ("LaunchDarkly") in user-facing surfaces — the
valuecolumn comments and the newmz_internal.mdprose. It leaks an implementation detail and goes stale if the sync source ever changes; describe it generically. - The two introspection MVs wrap the
kindfilter in a subquery citing database-issues/8495, but that issue is join-specific and these MVs have no join. The join-less catalog MVs (mz_databases,mz_schemas) just use a plainWHERE. Suggest matching them.
Nits (inline)
- A couple of rustdoc comments narrate design/abstractions that belong in the design doc (consistent with the earlier note on
config.rs). - The two
apply_*_system_configuration_updatefns are byte-identical save the id type — a small generic helper would remove the duplication. - Redundant liveness check in the reconcile loops; inconsistent visibility on the inert accessors.
- The reconcile diff/prune and the value-matched retraction are the only non-trivial logic here and are currently untested (inert). A focused unit test would land cleanly now rather than waiting for 3/3.
32fe560 to
ef3cfe1
Compare
aljoscha
left a comment
There was a problem hiding this comment.
Re-reviewed the updated second commit (ef3cfe1632). Thanks for the quick turnaround — all six points are addressed cleanly:
- ✅ Vendor name removed from the
valuecolumn comments and themz_internal.mdprose (and "table" → "view" there too). - ✅ Both MVs dropped the subquery + the database-issues/8495 comment and now use a plain
WHERE data->>'kind' = '...', matchingmz_databases/mz_schemas. - ✅ The two
apply_*fns collapsed into one genericapply_scoped_system_configuration_update<Id: Ord>helper — nice, and the value-matched-retraction rationale moved into its doc. - ✅
objects.rsdocs: dropped the fragile "step 2/3" references and corrected "on startup" → "maintained from it on every catalog update". - ✅ Redundant
|| !live_*.contains(...)checks removed, with a clarifying comment on why a missing entry already covers dropped objects. - ✅ Accessor visibility made consistent (
scoped_system_parameters()is nowpub, no stray#[allow(dead_code)]).
Still blocking (unchanged): CI slt-1 is still red on the same error — test/sqllogictest/catalog_server_explain.slt is missing the EXPLAIN MATERIALIZED VIEW entries for the two new MVs:
Missing EXPLAIN queries (2):
+ EXPLAIN MATERIALIZED VIEW "mz_internal"."mz_cluster_system_parameters";
+ EXPLAIN MATERIALIZED VIEW "mz_internal"."mz_replica_system_parameters";
The commit regenerated mz_internal.slt / the index-accounting slt and bumped the row counts in catalog_server_explain.slt, but the per-MV EXPLAIN blocks still need to be added. Regenerate with:
bin/mzcompose --find sqllogictest run catalog-server-explain --rewrite
Once that lands and slt-1 is green, this LGTM from my side.
ef3cfe1 to
e012691
Compare
aljoscha
left a comment
There was a problem hiding this comment.
Two doc-comment nits from a closer re-read (the introspection views themselves look correct — verified parse_catalog_id handles both id types, the kind discriminator matches, and the absence of .with_key is required since the optimizer can't prove the composite key). Both nits are docs that narrate the 3/3 resolution layer rather than the local contract.
| /// Returns a copy of `self` with the cluster and replica overrides from | ||
| /// `other` merged in, replacing any existing entry for the same object. | ||
| /// | ||
| /// Create-time resolution uses this to add freshly-evaluated new objects to |
There was a problem hiding this comment.
nit: this second paragraph narrates the 3/3 resolution layer ("create-time resolution", "periodic full-state reconcile") that doesn't exist yet in this commit — the same kind of leaking-abstraction the comment thread above flagged. merge itself is just a two-map extend. I'd trim to the local contract, e.g.: "Returns a copy of self with other's entries merged in, replacing any existing entry for the same object. Expresses no removals." The caller-side rationale can live where the caller lands.
There was a problem hiding this comment.
Trimmed to the local contract: "Returns a copy of self with other's entries merged in, replacing any existing entry for the same object. Expresses no removals." Caller-side rationale will live with the caller in 3/3.
| /// Returns the cluster-coherent scoped optimizer-feature overrides for | ||
| /// `cluster_id` from the in-memory scoped-parameter working copy. | ||
| /// | ||
| /// These are layered on top of the cluster's manual `CREATE CLUSTER ... |
There was a problem hiding this comment.
nit: this precedence model (cluster-scoped LD rule beats a manual FEATURES pin beats env-wide) is implemented in the 3/3 resolution layer, not here — and it re-introduces the vendor name we just took out of the user-facing surfaces. This accessor is just a lookup + From. Suggest trimming the doc to the contract (returns the cluster's scoped optimizer-feature overrides, empty if none) and keeping the precedence/layering reasoning where the layering actually happens (or in the design doc).
There was a problem hiding this comment.
Trimmed to the contract (returns the cluster's scoped optimizer-feature overrides, empty if none) and dropped the precedence/vendor prose — that layering lives in 3/3 / the design doc.
aljoscha
left a comment
There was a problem hiding this comment.
Left a couple more comments before, and would be good to resolve these. But otherwise good to merge!
Durable storage and introspection for scoped (per-cluster and per-replica) feature overrides, inert until the resolution layer wires it. * Two durable catalog collections, `cluster_system_configurations` and `replica_system_configurations`, keyed by object id, plus their builtin introspection tables `mz_internal.mz_cluster_system_parameters` / `mz_replica_system_parameters`. CATALOG_VERSION 86 with a no-op migration. * `CatalogState::scoped_system_parameters` is the in-memory mirror, with a value-matched retraction in `apply` so updates are order-independent, and a durable `Op::UpdateScopedSystemParameters` that reconciles the persisted cache to a complete desired state. * The working copy is never populated here, so introspection stays empty and resolution falls back to the environment-wide value everywhere — no behavior change until the resolution layer lands. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
e012691 to
8d17f87
Compare
|
Thanks for the reviews! |
Second of a three-PR stack splitting #36959: 1/3 #37079 (merged) → 2/3 (this)
→ 3/3 #36959. Design: doc/developer/design/20260609_scoped_feature_flags.md
(#36947).
Adds the durable storage and introspection for scoped overrides, inert until the
resolution layer (3/3) wires it.
Description
cluster_system_configurationsandreplica_system_configurations, keyed by object id, surfaced as builtinmaterialized views
mz_internal.mz_cluster_system_parameters/mz_replica_system_parametersovermz_catalog_raw(no new builtin tables,per the ongoing MV migration). CATALOG_VERSION 86 with a no-op migration.
CatalogState::scoped_system_parametersis the in-memory working copy used forresolution, maintained from the durable state with a value-matched retraction
in
apply(order-independent updates); a durableOp::UpdateScopedSystemParametersreconciles the persisted cache to a completedesired state.
introspection views stay empty and resolution falls back to the
environment-wide value everywhere — no behavior change until 3/3.
Verification
Catalog migration round-trip and golden-encoding tests (full
mz-catalogsuite),regenerated sqllogictest catalog enumerations, and
ScopedParametersunit tests.