Context Windows & Provision Strategies

Manage conversation context to preserve critical information across long interactions · Difficulty 2/5


The Context Window defines the maximum amount of text (measured in tokens) that Claude can process in a single request. Claude can only reason about information it can see, so providing the right context is essential for accurate responses.

Key Concepts

  • Input tokens: The system prompt, user messages, and any provided context
  • Output tokens: Claude's generated response (controlled by `max_tokens`)
  • Total context: Input + output tokens combined must fit within the model's Context Window (see the usage sketch below)
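To make the split concrete, here is a minimal sketch using the Anthropic Python SDK (the model name is an assumption; substitute whichever model you target). The `usage` field on the response reports both sides of the budget:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # assumed model name; use the model you actually target
    max_tokens=1024,                   # caps the output side of the budget
    system="You are a concise code reviewer.",
    messages=[{"role": "user", "content": "Summarize the risks of unbounded context growth."}],
)

# Input and output counts are reported per request; their sum must fit the Context Window.
print("input tokens: ", response.usage.input_tokens)
print("output tokens:", response.usage.output_tokens)
```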

Context Provision Methods

  • Direct inclusion: Include relevant files/data directly in the prompt
  • Tool-based retrieval: Give Claude tools to fetch information on demand (see the tool-use sketch after this list)
  • RAG pipeline: Pre-retrieve relevant documents using embeddings/search
  • Prior findings: Include results from previous analysis passes
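As a sketch of the tool-based approach, assuming the Anthropic Python SDK (the `read_file` tool, its schema, and the file path are hypothetical), you declare a tool and let Claude request files on demand instead of pasting everything up front:

```python
import anthropic

client = anthropic.Anthropic()

# Hypothetical file-reading tool: Claude supplies a path, our code would return the contents.
tools = [{
    "name": "read_file",
    "description": "Return the contents of a file from the repository.",
    "input_schema": {
        "type": "object",
        "properties": {"path": {"type": "string", "description": "Repository-relative path"}},
        "required": ["path"],
    },
}]

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # assumed model name
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "Review tests/test_auth.py for missing edge cases."}],
)

# If Claude needs the file, it emits a tool_use block instead of answering directly;
# the application then runs the tool and sends the result back in a follow-up message.
for block in response.content:
    if block.type == "tool_use" and block.name == "read_file":
        print("Claude requested:", block.input["path"])
```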

Key Principle: Include What Claude Needs to See

Claude has no implicit knowledge of your codebase, test suite, or prior reviews. If you want it to account for existing work, you must explicitly provide that context. If Claude suggests test cases that duplicate existing tests, the solution is simple: include the existing test file in the context.
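A minimal sketch of that direct inclusion (the file paths and the XML-style tags are illustrative, not required by the API):

```python
from pathlib import Path

# Hypothetical paths: the module under review and the tests that already exist for it.
module_source = Path("src/payments.py").read_text()
existing_tests = Path("tests/test_payments.py").read_text()

# Embedding the existing tests is what lets Claude avoid proposing duplicates.
prompt = f"""Here is the module under test:

<source>
{module_source}
</source>

These test cases already exist -- do not repeat them:

<existing_tests>
{existing_tests}
</existing_tests>

Suggest only new test cases that cover gaps in the existing suite."""
```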

Best Practices

  • Monitor token usage: Track input/output tokens to avoid hitting limits
  • Prioritize context: Place the most important information at the beginning and end
  • Summarize when needed: For long conversations, progressively summarize older turns
  • Use prompt caching: Cache static system prompts to reduce costs on repeated calls (see the sketch below)
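A sketch of the caching practice, assuming the Anthropic Python SDK (model name and system text are placeholders): marking a large, static system block with `cache_control` asks the API to reuse that prefix across calls.

```python
import anthropic

client = anthropic.Anthropic()

# A large, unchanging system prompt (style guide, schema, policies) is the natural caching
# candidate; very short prompts fall below the minimum cacheable length.
static_system = [{
    "type": "text",
    "text": "You are a code reviewer. Follow the team style guide: ...",  # imagine thousands of tokens here
    "cache_control": {"type": "ephemeral"},  # marks this block as a cacheable prefix
}]

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # assumed model name
    max_tokens=512,
    system=static_system,
    messages=[{"role": "user", "content": "Review this diff: ..."}],
)

# On repeat calls with the same prefix, usage reports cache reads, which are billed at a lower rate.
print(response.usage)
```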

Key Takeaways

  • Context window = input tokens + output tokens combined
  • Claude can only reason about information explicitly provided in context
  • Include existing code/tests/reviews to prevent duplicate suggestions
  • Use progressive summarization for long conversations

Test Yourself

Your synthesis agent processes results from subagents and produces a report. When a second research query on a related topic runs, the synthesis agent suggests findings that duplicate the first report. What's the most effective fix?