Skip to content

[BUG] Bedrock streaming is buffered - users wait for full response before seeing any text #1986

@paerts

Description

@paerts

📋 Prerequisites

  • I have searched the existing issues to avoid creating a duplicate
  • By submitting this issue, you agree to follow our Code of Conduct
  • I am using the latest version of the software
  • I have tried to clear cache/cookies or used incognito mode (if ui-related)
  • I can consistently reproduce this issue

🎯 Affected Service(s)

App Service

🚦 Impact/Severity

Minor inconvenience

🐛 Bug Description

We're using a Bedrock-backed agent with stream: true and noticed our users have to wait 4-5 seconds before seeing
anything, even for a simple "Hi" message. After some digging through the code, I found the culprit in
kagent/adk/models/_bedrock.py around line 320:

def _run_converse_stream(**kw):
    resp = client.converse_stream(**kw)
    return list(resp.get("stream", []))

That list() call collects the entire Bedrock response into memory before anything happens downstream. The rest of the
streaming chain actually works great (partial LlmResponse events, A2A SSE, the whole thing) but it's all waiting on this
one function to finish collecting everything first.

🔄 Steps To Reproduce

  1. Deploy an agent with stream: true and a Bedrock model config (we use eu.anthropic.claude-sonnet-4-6 in eu-central-1)
  2. Send a simple message like "Hi" through the A2A protocol
  3. Notice nothing appears for 4-5 seconds, then the full answer shows up all at once
  4. For comparison, calling Bedrock's converse_stream API directly gives you the first token in about 500ms

🤔 Expected Behavior

Text should start appearing within roughly 500ms (which is Bedrock's actual time-to-first-token), with the rest streaming
in progressively.

📱 Actual Behavior

The UI is blank for 4-5 seconds, then the complete response appears in one shot. Looking at the A2A events, the
kagent_adk_partial: true events do get emitted but they all arrive as a burst after buffering, not incrementally as
Bedrock produces them.

💻 Environment

  • kagent-adk: 0.3.0
  • google-adk: 1.31.1
  • a2a-sdk: 0.3.23
  • Model: eu.anthropic.claude-sonnet-4-6 via Bedrock (eu-central-1)
  • Running on EKS, agent configured with stream: true

🔧 CLI Bug Report

No response

🔍 Additional Context

I think the fix could be fairly contained. Replace the list() buffering with something that bridges boto3's synchronous
iterator to the async world incrementally. Something along these lines:

async def _iter_converse_stream(client, **kw):
    queue = asyncio.Queue()

    def _produce():
        resp = client.converse_stream(**kw)
        for event in resp.get("stream", []):
            queue.put_nowait(event)
        queue.put_nowait(None)

    loop = asyncio.get_event_loop()
    loop.run_in_executor(None, _produce)
    while (event := await queue.get()) is not None:
        yield event

Or whatever pattern fits the project's conventions better. The important thing is just not materializing the whole stream
upfront.

All the downstream plumbing (event converter, A2A event queue, SSE to client) already handles partial events correctly.
It's really just this one spot that's holding things up.

Happy to test a fix on our cluster if you want to point me at a branch.

📋 Logs

📷 Screenshots

No response

🙋 Are you willing to contribute?

  • I am willing to submit a PR to fix this issue

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No fields configured for Bug.

    Projects

    Status
    Backlog

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions