Context Budget Management & Upstream Reduction

Core

Manage context effectively in large codebase exploration · Difficulty 3/5

Tags: context-budget · data-reduction · upstream · optimization

When combined outputs from multiple agents or exploration passes exceed the downstream agent's optimal processing range, the most effective solution is reducing volume at the source and managing context budgets strategically.

Context Degradation

In extended sessions, models start giving inconsistent answers and referencing "typical patterns" rather than specific classes discovered earlier. This happens because verbose exploration output fills the context window, pushing earlier findings out of reliable attention range.

Solutions

  • Scratchpad files: Persist key findings to files that can be re-read when needed, rather than relying on context alone
  • Subagent delegation: Isolate verbose exploration in subagents; main agent sees only structured summaries
  • Phase summaries: Summarize findings from one exploration phase before starting the next
  • /compact: Reduce context usage during extended sessions when context fills with verbose output
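The scratchpad pattern above can be sketched as a small helper. This is a minimal illustration, not a fixed API; the file path and record fields are assumptions for the example:

```python
import json
from pathlib import Path

SCRATCHPAD = Path("findings.jsonl")  # hypothetical location for persisted findings

def record_finding(topic: str, fact: str, source: str) -> None:
    """Append a key finding so it survives context-window eviction."""
    with SCRATCHPAD.open("a") as f:
        f.write(json.dumps({"topic": topic, "fact": fact, "source": source}) + "\n")

def recall(topic: str) -> list[dict]:
    """Re-read only the findings relevant to the current phase."""
    if not SCRATCHPAD.exists():
        return []
    return [
        rec for line in SCRATCHPAD.read_text().splitlines()
        if (rec := json.loads(line))["topic"] == topic
    ]
```

Because findings live on disk, they can be re-read at any point in the session instead of depending on earlier turns staying within reliable attention range.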
Upstream Data Reduction

Modify upstream agents to return structured data instead of verbose content:

  • Key facts (not full page content)
  • Citations with source URLs
  • Relevance scores
  • Summary findings (not reasoning chains)
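A minimal sketch of this reduction step, assuming a hypothetical raw search-result shape (the field names here are illustrative, not a real agent's output schema):

```python
def reduce_search_result(raw: dict, max_facts: int = 5) -> dict:
    """Strip a verbose search result down to structured essentials.

    Drops full page text and reasoning chains; keeps key facts,
    a citation with the source URL, a relevance score, and a capped summary.
    """
    return {
        "key_facts": raw.get("extracted_facts", [])[:max_facts],
        "citation": {"title": raw.get("title", ""), "url": raw.get("url", "")},
        "relevance": round(float(raw.get("score", 0.0)), 2),
        "summary": raw.get("summary", "")[:500],  # cap summary length
    }
```

Applying this at the source means the synthesis agent never sees the 85K tokens of page text; it receives only the fields it actually consumes.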
Crash Recovery

Design structured state persistence: each agent exports its state to a known location. On resume, the coordinator loads a manifest and injects the saved state into agent prompts, avoiding full re-exploration.
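The manifest pattern can be sketched as follows; the directory layout and state shapes are assumptions for illustration:

```python
import json
from pathlib import Path

STATE_DIR = Path("agent_state")          # known location; name is an assumption
MANIFEST = STATE_DIR / "manifest.json"   # registry of per-agent state files

def export_state(agent_id: str, state: dict) -> None:
    """Each agent writes its state to disk and registers it in the manifest."""
    STATE_DIR.mkdir(exist_ok=True)
    state_file = STATE_DIR / f"{agent_id}.json"
    state_file.write_text(json.dumps(state))
    manifest = json.loads(MANIFEST.read_text()) if MANIFEST.exists() else {}
    manifest[agent_id] = state_file.name
    MANIFEST.write_text(json.dumps(manifest))

def resume() -> dict:
    """On restart, the coordinator loads the manifest and each saved state."""
    if not MANIFEST.exists():
        return {}
    manifest = json.loads(MANIFEST.read_text())
    return {
        agent_id: json.loads((STATE_DIR / fname).read_text())
        for agent_id, fname in manifest.items()
    }
```

After a crash, `resume()` gives the coordinator everything needed to inject prior findings into agent prompts rather than re-running the exploration from scratch.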

Key Takeaways

  • Reduce data volume at the source rather than trying to handle large inputs downstream
  • Use scratchpad files to persist findings across context boundaries
  • Subagent delegation isolates verbose exploration from the main agent's context

Test Yourself (1 of 1)

Your web search agent returns ~85K tokens of content (including full page text) and the document analysis agent returns ~70K tokens (including reasoning chains). The synthesis agent works best under 50K tokens. What's the most effective way to handle this volume mismatch?