feat(mcp): support multiple active index bindings (RAAE-1604) by vishal-bala · Pull Request #629 · redis/redis-vl-python

vishal-bala · 2026-06-16T08:44:31Z

Motivation

The RedisVL MCP server currently binds to exactly one Redis index per process. That single-binding assumption is enforced by a config validator and baked throughout the codebase — single-resource server state, single-binding convenience accessors on MCPConfig, and the search/upsert tools. Before the server can expose multiple logical indexes from a single endpoint (RAAE-1603), that assumption has to be removed and replaced with a real multi-binding model.

This PR (RAAE-1604) does exactly that, and nothing more: it reshapes the configuration and runtime model so the server can start, inspect, validate, and serve one or many bindings, while keeping existing single-index configs and callers behaving identically. It is the foundation the rest of the epic (discovery via list-indexes, index routing on search-records/upsert-records, docs) builds on, so it intentionally does not yet add any new request parameters or tools.

Implementation

The core of the change is a new immutable BindingRuntime (in redisvl/mcp/runtime.py) that bundles everything a tool call needs for one logical index: the binding config, the connected AsyncSearchIndex, its effective (inspected + overridden) schema, an optional vectorizer, the resolved native-hybrid-search capability, and the effective read-only flag. The server now holds a dict[str, BindingRuntime] keyed by logical id instead of a single set of _index/_vectorizer fields. Startup iterates every configured binding and inspects, validates, and initializes each one independently — each binding owns its own Redis client — with all-or-nothing teardown so a single bad binding fails startup cleanly without leaking connections.

On the config side, the "exactly one configured index binding" validator is gone (we now simply require at least one binding with non-blank ids), and the schema-inspection, runtime-mapping, and search-validation methods move from MCPConfig onto MCPIndexBindingConfig where they naturally belong per binding. The single-binding convenience accessors on MCPConfig are removed. Each binding gains optional description and read_only fields, and a binding's effective write availability is computed as global --read-only OR the per-index read_only. Tool resolution goes through a new server.resolve_binding(index_id) helper that defaults to the sole binding when one is configured (preserving backward compatibility) and returns an invalid_request error when an index is omitted with multiple bindings configured or when an unknown id is given. The search and upsert tools were re-threaded to operate on a resolved BindingRuntime rather than reaching into single-binding server accessors.

Additional notes:

Native-hybrid-search support is now probed eagerly per binding at startup and stored on the BindingRuntime, replacing the previous lazy single-index cache.
The concurrency semaphore is a single process-wide ceiling sized from the maximum max_concurrency across bindings; the request timeout is sourced per-binding and passed explicitly into run_guarded.
get_index() / get_vectorizer() are retained as thin convenience wrappers over resolve_binding(None).
Implemented test-first: new coverage for multi-binding config loading, description/read_only defaults, resolve_binding routing semantics, semaphore sizing, per-binding teardown, and three integration tests (multi-binding startup, global read-only override, and a single invalid binding failing startup), alongside the updated single-index tests that confirm backward compatibility.

Verification

mypy clean across all source files; black/isort formatted.
182 MCP unit tests pass.
44 MCP integration tests pass (2 skipped on Redis-version gates) against Redis 8.

🤖 Generated with Claude Code

Note

Medium Risk
Touches MCP startup, connection lifecycle, and tool execution paths; multi-binding partial-failure teardown is new, but single-index configs and default resolution behavior are preserved and heavily tested.

Overview
The MCP server moves from a single enforced index binding to a multi-binding model: YAML indexes may define one or many logical ids, each with its own Redis index, schema inspection, vectorizer, search validation, and connection.

Config drops the “exactly one binding” rule (requires at least one non-blank id), removes MCPConfig-level single-index helpers, and moves schema override / search validation onto MCPIndexBindingConfig, which gains optional description and per-binding read_only.

Runtime introduces immutable BindingRuntime per binding. Startup initializes every binding independently (with fail-closed teardown on partial failure), probes native hybrid search per binding, sets effective_read_only from global settings OR binding read_only, sizes the shared concurrency semaphore from the max max_concurrency across bindings, and passes per-binding timeouts into run_guarded.

resolve_binding(index_id) defaults to the sole binding when only one is configured; with multiple bindings, a missing or unknown id returns invalid_request. Search and upsert tools are wired through resolved binding runtime (still defaulting via resolve_binding(None) in this PR—no new index parameter on tools yet). Multi-binding setups omit schema-specific hints from the search tool description.

^{Reviewed by Cursor Bugbot for commit f83ff32. Bugbot is set up for automated code reviews on this repo. Configure here.}

Treat the MCP `indexes` config as a real multi-binding map instead of enforcing exactly one logical binding. This is the foundation for multi-index MCP support (RAAE-1603); single-index configs and callers behave identically. Config: - Drop the "exactly one configured index binding" validator; require at least one binding with non-blank ids. - Add per-binding `description` and `read_only` fields. - Move schema inspection / runtime-mapping / search validation methods onto MCPIndexBindingConfig and remove the single-binding convenience accessors from MCPConfig. Runtime/server: - Introduce an immutable BindingRuntime bundling a binding's config, index, effective schema, vectorizer, native-hybrid support, and effective read-only flag. - Replace single-resource server state with a dict of BindingRuntime; startup inspects/validates/initializes every binding independently (one client per binding) with all-or-nothing teardown. - Resolve native-hybrid support eagerly per binding. - Size the concurrency semaphore from the max binding limit and pass a per-binding request timeout into run_guarded. - Add resolve_binding(index_id): defaults to the sole binding, raises invalid_request for omitted-on-multi and unknown ids. - Compute effective_read_only as global read-only OR per-binding read_only. Tools: - Re-thread search/upsert onto a resolved BindingRuntime instead of single-binding server accessors. Tests follow TDD: multi-binding config/startup coverage, resolve_binding semantics, semaphore sizing, per-binding teardown, and backward-compatible single-index behavior. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

jit-ci · 2026-06-16T08:46:33Z

🛡️ Jit Security Scan Results

✅ No security findings were detected in this PR

^{Security scan by Jit}

Both fixtures used `user_index_{worker_id}` / `v1_{worker_id}` as index name and prefix, causing a collision when the two test modules run on the same xdist worker. Switch to `redis_test_name` (which incorporates the node hash) so every test gets a unique namespace, and add `drop=True` to `create(overwrite=True)` so stale documents from an interrupted run are removed rather than reused. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.

^{❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.}

^{Reviewed by Cursor Bugbot for commit f83ff32. Configure here.}

vishal-bala · 2026-06-25T09:57:52Z

CI is failing because of flaky tests that will be fixed by redis-developer/sql-redis#39

nkanu17 · 2026-06-26T15:34:20Z

+        self.config = None
+        self._semaphore = None
+
+        for runtime in bindings:


Teardown loop leaks connections on partial failure

_close_resources has an internal try/finally that ensures index.disconnect() runs even if vectorizer.aclose() throws. But _close_resources itself can still raise (e.g. if index.disconnect() throws a network error).

If that happens on binding 1, the exception propagates out of this loop immediately. Bindings 2 through N never reach _close_resources. Their Redis connections are never closed, causing a connection leak on every failed teardown. This matters most in the partial-startup-failure path where _teardown_runtime is called from the except block in _initialize_runtime_resources.

Suggested fix:

for runtime in bindings: try: await self._close_resources( index=runtime.index, vectorizer=runtime.vectorizer ) except Exception: logger.warning( "error closing binding %s during teardown", runtime.binding_id, exc_info=True, )

Best-effort teardown: log the failure, keep going, clean up everything possible.

nkanu17 · 2026-06-26T15:34:20Z

+        bindings = list(self._bindings.values())
+        self._bindings = {}
+        self.config = None
+        self._semaphore = None


_tools_registered is never reset in teardown

_teardown_runtime clears _bindings, config, and _semaphore but leaves _tools_registered = True. If startup ever fails after _register_tools() succeeds (unlikely with the current call order but possible if it shifts), a subsequent startup() call will hit the guard at the top of _register_tools and skip registration entirely, leaving the server with no tools.

Add self._tools_registered = False here alongside the other state resets.

nkanu17 · 2026-06-26T15:34:20Z

+            )
+        return runtime
+
+    async def get_index(self, index_id: str | None = None) -> AsyncSearchIndex:


nit: get_index() and get_vectorizer() are footguns on multi-binding servers

Both methods call resolve_binding(None) which throws INVALID_REQUEST on any multi-binding server. A future tool author who finds these convenience methods and uses them:

index = await server.get_index() # breaks silently on multi-binding

gets a runtime error only when deployed with a multi-binding config. No TypeError at import time, no static analysis warning unless mypy is run against the caller.

Consider adding a deprecation warning or removing them entirely and directing all tool authors to use resolve_binding(index_id) directly.

nkanu17 · 2026-06-26T15:34:20Z

+            raise RuntimeError("MCP server vectorizer is not configured")
+        return runtime.vectorizer
+
+    async def run_guarded(


nit: breaking api change

Adding timeout_seconds as a required keyword-only argument is the right Python pattern. But this is a breaking change to a previously 2-argument public method. Any code subclassing RedisVLMCPServer or calling run_guarded directly (custom tools, plugins) will get:

TypeError: run_guarded() missing 1 required keyword-only argument: 'timeout_seconds'

This only surfaces at runtime when the codepath is hit, not at import time. Please add a CHANGELOG entry and a note in the PR description so downstream authors know to update their call sites.

nkanu17

I really like the design!

Binding runtime dataclass is nice and like how resolve binding works as the single routing entry point shared across all tools keeps the behavior consistent and how it's backward compatible for single binding servers works

One blocking issue before merge:

the teardown loop in _teardown_runtime does not protect subsequent bindings if _close_resources raises on an earlier one.

A failed disconnect on binding 1 will propagate out of the loop immediately, leaving bindings 2 through N with open Redis connections that are never closed.
In a partial startup failure this is exactly the path that gets exercised, so the leak is not hypothetical.
Wrapping each iteration in a try/except that logs and continues is a small fix and should be done before this lands.

nkanu17 · 2026-06-26T15:43:09Z

+        # The semaphore is a single process-wide concurrency ceiling shared by
+        # all bindings; take the max configured limit across bindings.
+        self._semaphore = asyncio.Semaphore(


nit: Semaphore semantics need a clarifying comment

Suggested change

# The semaphore is a single process-wide concurrency ceiling shared by

# all bindings; take the max configured limit across bindings.

self._semaphore = asyncio.Semaphore(

# max_concurrency is configured per-binding but applied here as a single

# process-wide ceiling shared across all bindings. Taking max() means the

# most permissive binding sets the global cap: 5 bindings each with

# max_concurrency=2 yields Semaphore(2), not Semaphope(10).

# Consider renaming to global_max_concurrency in a future config version

# to avoid misleading operators who expect per-index slot allocation.

self._semaphore = asyncio.Semaphore(

vishal-bala marked this pull request as ready for review June 25, 2026 09:20

cursor Bot reviewed Jun 25, 2026

View reviewed changes

Comment thread redisvl/mcp/tools/upsert.py

vishal-bala requested a review from nkanu17 June 25, 2026 09:58

nkanu17 reviewed Jun 26, 2026

View reviewed changes

nkanu17 requested changes Jun 26, 2026

View reviewed changes

nkanu17 reviewed Jun 26, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(mcp): support multiple active index bindings (RAAE-1604)#629

feat(mcp): support multiple active index bindings (RAAE-1604)#629
vishal-bala wants to merge 2 commits into
feature/raae-1603-mcp-multi-indexfrom
feature/raae-1604-config-runtime-model

vishal-bala commented Jun 16, 2026 •

edited by cursor Bot

Loading

Uh oh!

jit-ci Bot commented Jun 16, 2026

Uh oh!

cursor Bot left a comment

Uh oh!

Uh oh!

vishal-bala commented Jun 25, 2026

Uh oh!

nkanu17 Jun 26, 2026 •

edited

Loading

Uh oh!

nkanu17 Jun 26, 2026 •

edited

Loading

Uh oh!

nkanu17 Jun 26, 2026 •

edited

Loading

Uh oh!

nkanu17 Jun 26, 2026 •

edited

Loading

Uh oh!

nkanu17 left a comment •

edited

Loading

Uh oh!

nkanu17 Jun 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

-        # The semaphore is a single process-wide concurrency ceiling shared by
-        # all bindings; take the max configured limit across bindings.
-        self._semaphore = asyncio.Semaphore(
+        # max_concurrency is configured per-binding but applied here as a single
+        # process-wide ceiling shared across all bindings. Taking max() means the
+        # most permissive binding sets the global cap: 5 bindings each with
+        # max_concurrency=2 yields Semaphore(2), not Semaphope(10).
+        # Consider renaming to global_max_concurrency in a future config version
+        # to avoid misleading operators who expect per-index slot allocation.
+        self._semaphore = asyncio.Semaphore(

Uh oh!

Conversation

vishal-bala commented Jun 16, 2026 • edited by cursor Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation

Implementation

Verification

Uh oh!

jit-ci Bot commented Jun 16, 2026

🛡️ Jit Security Scan Results

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

vishal-bala commented Jun 25, 2026

Uh oh!

nkanu17 Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nkanu17 Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nkanu17 Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nkanu17 Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nkanu17 left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nkanu17 Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

vishal-bala commented Jun 16, 2026 •

edited by cursor Bot

Loading

nkanu17 Jun 26, 2026 •

edited

Loading

nkanu17 Jun 26, 2026 •

edited

Loading

nkanu17 Jun 26, 2026 •

edited

Loading

nkanu17 Jun 26, 2026 •

edited

Loading

nkanu17 left a comment •

edited

Loading