
# [copilot-token-optimizer] Optimize Repository Quality Improvement Agent — reduce 30-turn read-only run (~500K tokens/run) #29337


## Overview

The Repository Quality Improvement Agent (`repository-quality-improver.md`) consumed 1,784,366 tokens in 30 turns on 2026-04-30, despite performing zero write actions. The run was flagged `resource_heavy_for_domain` (high severity) and `partially_reducible` (50% data-gathering turns). With 27 of 30 turns triggering blocked requests, the agent spent most of its budget on exploratory reads that could be replaced with deterministic pre-agent steps.

Analysis period: 2026-04-24 to 2026-04-30 · Runs audited: 1 · Run: 25167483584

## Token Profile

| Metric | Value |
| --- | --- |
| Total tokens | 1,784,366 |
| Effective tokens | 1,987,258 |
| Input tokens | 1,773,062 |
| Output tokens | 11,304 |
| Cache read tokens | 1,689,800 |
| Cache efficiency | 48.8% |
| Turns | 30 |
| Avg tokens/turn | ~59,100 |
| Write actions | 0 |
| Blocked requests | 27 / 30 turns |
| Duration | 9.1 min |
| Conclusion | success |
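
The reported per-turn average appears to be derived from input tokens rather than total tokens (an assumption about how the optimizer computes it, but the arithmetic matches). A quick sanity check using the figures from the table:

```shell
# Reproduce the "Avg tokens/turn" figure from the Token Profile table.
# 1,773,062 input tokens over 30 turns floors to 59,102, reported as ~59,100.
input=1773062
turns=30
avg=$(( input / turns ))
echo "avg input tokens/turn: $avg"
```

The total-token figure (1,784,366 / 30 ≈ 59,479) rounds to the same ~59K order of magnitude, so either reading supports the per-turn estimates used below.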

## Ranked Recommendations

### 1. 🔴 Move data-gathering phases to deterministic pre-agent bash steps (~450K tokens saved)

**Estimated savings:** ~450,000 tokens/run
**Evidence:** `agentic_fraction=0.50`, assessment `partially_reducible` — "About 50% of this run's turns appear to be data-gathering". At 30 turns × ~59K tokens/turn, roughly 15 turns are pure shell reads (cache check, metrics collection, directory stats) that produce no LLM-only value.

**Action:** Extract the Phase 0 (cache history check) and Phase 1 analysis commands into a pre-agent `steps:` block in the frontmatter. Write the collected metrics to `/tmp/gh-aw/agent/analysis-context.md`; the agent then reads the pre-computed file in turn 1 instead of spending multiple turns running the shell commands itself.

```yaml
steps:
  - name: Collect quality metrics
    run: |
      mkdir -p /tmp/gh-aw/agent
      {
        echo "## Cache History"
        cat /tmp/gh-aw/cache-memory-focus-areas/history.json 2>/dev/null || echo "{}"
        echo "## Code Metrics"
        find . -type f -name "*.go" ! -name "*_test.go" ! -path "./.git/*" | xargs wc -l 2>/dev/null | sort -rn | head -20
        echo "## Test Ratio"
        ...
      } > /tmp/gh-aw/agent/analysis-context.md
```

### 2. 🔴 Trim the oversized prompt — extract report template to a shared import (~300K tokens saved)

**Estimated savings:** ~300,000 tokens/run
**Evidence:** The workflow prompt is ~850 lines. It embeds a full Markdown report template (~150 lines), multiple bash code blocks for every analysis category, and an exhaustive task-generation template. The entire prompt is re-sent as context on every turn, at ~59K tokens/turn across 30 turns.

**Action:** Extract the report template into `shared/repository-quality-report-template.md` and import it. Collapse the per-category bash examples into a single compact reference table instead of full shell snippets. Target: reduce the prompt to ≤300 lines (~65% reduction in repeated context).

```yaml
imports:
  - shared/repository-quality-report-template.md
```

### 3. 🟡 Cap turns with `max-turns` to enforce a ceiling (~100K tokens saved)

**Estimated savings:** ~100,000 tokens/run (guards against regressions)
**Evidence:** 30 turns is high for a read-only reporting task. After applying recommendations 1–2, the expected turn count drops to ~12–15. An explicit ceiling prevents future prompt changes from silently ballooning costs.

**Action:** Add `max-turns: 18` to the frontmatter (generous enough not to truncate a normal run, tight enough to catch regressions).
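
The ceiling is a one-line frontmatter change; a sketch combining it with the shared import from recommendation 2 (only these lines change — the rest of the workflow's frontmatter is unchanged and omitted here):

```yaml
# Frontmatter excerpt; surrounding keys (engine, on, permissions, …) unchanged.
imports:
  - shared/repository-quality-report-template.md
max-turns: 18   # ~12–15 expected turns post-optimization, plus headroom
```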


### 4. 🟡 Remove unused Serena MCP from cold-path invocations (~50K tokens saved)

**Estimated savings:** ~50,000 tokens/run
**Evidence:** `tool_breadth: narrow`, `tool_types: 0`. The workflow imports `shared/mcp/serena-go.md` and conditionally uses Serena for "deeper analysis". The single audited run performed no write actions and its tool breadth was narrow, suggesting Serena was never invoked — yet its ambient toolset description still inflates every turn's context.

**Action:** Gate the Serena import behind a `workflow_dispatch` input flag (`serena: false` by default). Scheduled runs skip Serena unless opted in.
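
One possible shape for the opt-in flag, using a standard GitHub Actions boolean `workflow_dispatch` input. Whether gh-aw supports gating an import on this input — and the exact trigger syntax the workflow already uses — are assumptions here, not verified against gh-aw's feature set:

```yaml
# Hypothetical trigger sketch: Serena stays off unless a manual run opts in.
on:
  schedule:
    - cron: "0 6 * * *"        # placeholder; keep the workflow's real schedule
  workflow_dispatch:
    inputs:
      serena:
        description: "Import shared/mcp/serena-go.md for deeper analysis"
        type: boolean
        default: false
```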


## Summary of Expected Savings

| Recommendation | Tokens saved/run | Confidence |
| --- | --- | --- |
| Pre-agent data-gathering steps | ~450,000 | High |
| Prompt trimming / template extraction | ~300,000 | High |
| `max-turns` ceiling | ~100,000 | Medium (regression guard) |
| Conditional Serena import | ~50,000 | Medium |
| **Total** | **~900,000** | |
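
Summing the four estimates confirms the ~900K total and implies a projected per-run cost of roughly 0.9M tokens against the 1.78M-token baseline (a straight subtraction, ignoring the cache-ratio effects noted in the caveats below):

```shell
# Sum the per-recommendation estimates and compare against the audited baseline.
total=$(( 450000 + 300000 + 100000 + 50000 ))
baseline=1784366
echo "projected savings/run: $total"
echo "projected remaining cost/run: $(( baseline - total ))"
```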

## Caveats

  • Only 1 run was available for analysis; savings estimates are proportional projections based on token-per-turn averages.
  • The 48.8% cache efficiency indicates the large prompt is partially cached; prompt reduction will also improve cache hit ratio for remaining turns.
  • Serena recommendation is based on absence of tool use in the single audited run — verify over 3+ runs before removing.

## Run detail

| Field | Value |
| --- | --- |
| Run ID | 25167483584 |
| Date | 2026-04-30 |
| Engine | GitHub Copilot CLI (claude-sonnet-4.6) |
| Event | schedule |
| Execution style | exploratory |
| Tool breadth | narrow |
| Actuation style | read_only |
| Resource profile | heavy |
| Agentic fraction | 0.50 |
| Blocked requests | 27 |

Generated by Copilot Token Usage Optimizer
