Context Budget Management & Upstream Reduction

Core

Manage context effectively in large codebase exploration · Difficulty 3/5

Tags: context-budget · data-reduction · upstream · optimization

When combined outputs from multiple agents or exploration passes exceed the downstream agent's optimal processing range, the most effective solution is reducing volume at the source and managing context budgets strategically.

Context Degradation

In extended sessions, models start giving inconsistent answers and referencing "typical patterns" rather than specific classes discovered earlier. This happens because verbose exploration output fills the context window, pushing earlier findings out of reliable attention range.

Solutions

  • Scratchpad files: Persist key findings to files that can be re-read when needed, rather than relying on context alone
  • Subagent delegation: Isolate verbose exploration in subagents; main agent sees only structured summaries
  • Phase summaries: Summarize findings from one exploration phase before starting the next
  • /compact: Reduce context usage during extended sessions when context fills with verbose output
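The scratchpad pattern above can be sketched as a small helper. This is a minimal illustration, not a fixed API; the file path and record fields are assumptions for the example:

```python
import json
from pathlib import Path

SCRATCHPAD = Path("findings.jsonl")  # hypothetical location for persisted findings

def record_finding(topic: str, fact: str, source: str) -> None:
    """Append a key finding so it survives context-window eviction."""
    with SCRATCHPAD.open("a") as f:
        f.write(json.dumps({"topic": topic, "fact": fact, "source": source}) + "\n")

def recall(topic: str) -> list[dict]:
    """Re-read only the findings relevant to the current phase."""
    if not SCRATCHPAD.exists():
        return []
    return [
        rec for line in SCRATCHPAD.read_text().splitlines()
        if (rec := json.loads(line))["topic"] == topic
    ]
```

Because findings live on disk, they can be re-read at any point in the session instead of depending on earlier turns staying within reliable attention range.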
Upstream Data Reduction

Modify upstream agents to return structured data instead of verbose content:

  • Key facts (not full page content)
  • Citations with source URLs
  • Relevance scores
  • Summary findings (not reasoning chains)
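A minimal sketch of this reduction step, assuming a hypothetical raw search-result shape (the field names here are illustrative, not a real agent's output schema):

```python
def reduce_search_result(raw: dict, max_facts: int = 5) -> dict:
    """Strip a verbose search result down to structured essentials.

    Drops full page text and reasoning chains; keeps key facts,
    a citation with the source URL, a relevance score, and a capped summary.
    """
    return {
        "key_facts": raw.get("extracted_facts", [])[:max_facts],
        "citation": {"title": raw.get("title", ""), "url": raw.get("url", "")},
        "relevance": round(float(raw.get("score", 0.0)), 2),
        "summary": raw.get("summary", "")[:500],  # cap summary length
    }
```

Applying this at the source means the synthesis agent never sees the 85K tokens of page text; it receives only the fields it actually consumes.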
Crash Recovery

Design structured state persistence: each agent exports its state to a known location. On resume, the coordinator loads a manifest and injects the saved state into agent prompts, avoiding full re-exploration.
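The manifest pattern can be sketched as follows; the directory layout and state shapes are assumptions for illustration:

```python
import json
from pathlib import Path

STATE_DIR = Path("agent_state")          # known location; name is an assumption
MANIFEST = STATE_DIR / "manifest.json"   # registry of per-agent state files

def export_state(agent_id: str, state: dict) -> None:
    """Each agent writes its state to disk and registers it in the manifest."""
    STATE_DIR.mkdir(exist_ok=True)
    state_file = STATE_DIR / f"{agent_id}.json"
    state_file.write_text(json.dumps(state))
    manifest = json.loads(MANIFEST.read_text()) if MANIFEST.exists() else {}
    manifest[agent_id] = state_file.name
    MANIFEST.write_text(json.dumps(manifest))

def resume() -> dict:
    """On restart, the coordinator loads the manifest and each saved state."""
    if not MANIFEST.exists():
        return {}
    manifest = json.loads(MANIFEST.read_text())
    return {
        agent_id: json.loads((STATE_DIR / fname).read_text())
        for agent_id, fname in manifest.items()
    }
```

After a crash, `resume()` gives the coordinator everything needed to inject prior findings into agent prompts rather than re-running the exploration from scratch.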

Key Takeaways

  • Reduce data volume at the source rather than trying to handle large inputs downstream
  • Use scratchpad files to persist findings across context boundaries
  • Subagent delegation isolates verbose exploration from the main agent's context

Test Yourself (1 of 1)

Your web search agent returns ~85K tokens of content (including full page text) and the document analysis agent returns ~70K tokens (including reasoning chains). The synthesis agent works best under 50K tokens. What's the most effective way to handle this volume mismatch?