Context Management & Reliability

15% of exam

Manage conversation context across long interactions, design escalation and ambiguity resolution, implement error propagation, handle codebase exploration, design human review workflows, and preserve information provenance.

6

task statements

15

concepts

57

practice questions

Domain Mastery

0%
See full Exam Guide
ts-5.1

Manage conversation context to preserve critical information across long interactions

Techniques for managing context windows, mitigating position effects, and preserving critical details across long conversations.

Knowledge of

  • Progressive summarization risks: condensing numerical values, percentages, dates, and customer-stated expectations into vague summaries
  • The "lost in the middle" effect: models reliably process information at the beginning and end of long inputs but may omit findings from middle sections
  • How tool results accumulate in context and consume tokens disproportionately to their relevance (e.g., 40+ fields per order lookup when only 5 are relevant)
  • The importance of passing complete conversation history in subsequent API requests to maintain conversational coherence

Skills in

  • Extracting transactional facts (amounts, dates, order numbers, statuses) into a persistent "case facts" block included in each prompt, outside summarized history
  • Extracting and persisting structured issue data (order IDs, amounts, statuses) into a separate context layer for multi-issue sessions
  • Trimming verbose tool outputs to only relevant fields before they accumulate in context (e.g., keeping only return-relevant fields from order lookups)
  • Placing key findings summaries at the beginning of aggregated inputs and organizing detailed results with explicit section headers to mitigate position effects
  • Requiring subagents to include metadata (dates, source locations, methodological context) in structured outputs to support accurate downstream synthesis
  • Modifying upstream agents to return structured data (key facts, citations, relevance scores) instead of verbose content and reasoning chains when downstream agents have limited context budgets

Concepts

ts-5.2

Design effective escalation and ambiguity resolution patterns

Defining when to escalate to humans and how to handle ambiguous tool results safely.

Knowledge of

  • Appropriate escalation triggers: customer requests for a human, policy exceptions/gaps (not just complex cases), and inability to make meaningful progress
  • The distinction between escalating immediately when a customer explicitly demands it versus offering to resolve when the issue is straightforward
  • Why sentiment-based escalation and self-reported confidence scores are unreliable proxies for actual case complexity
  • How multiple customer matches require clarification (requesting additional identifiers) rather than heuristic selection

Skills in

  • Adding explicit escalation criteria with few-shot examples to the system prompt demonstrating when to escalate versus resolve autonomously
  • Honoring explicit customer requests for human agents immediately without first attempting investigation
  • Acknowledging frustration while offering resolution when the issue is within the agent's capability, escalating only if the customer reiterates their preference
  • Escalating when policy is ambiguous or silent on the customer's specific request (e.g., competitor price matching when policy only addresses own-site adjustments)
  • Instructing the agent to ask for additional identifiers when tool results return multiple matches, rather than selecting based on heuristics

Concepts

ts-5.3

Implement error propagation strategies across multi-agent systems

Structured error handling, failure type distinction, and graceful degradation in multi-agent architectures.

Knowledge of

  • Structured error context (failure type, attempted query, partial results, alternative approaches) as enabling intelligent coordinator recovery decisions
  • The distinction between access failures (timeouts needing retry decisions) and valid empty results (successful queries with no matches)
  • Why generic error statuses ("search unavailable") hide valuable context from the coordinator
  • Why silently suppressing errors (returning empty results as success) or terminating entire workflows on single failures are both anti-patterns

Skills in

  • Returning structured error context including failure type, what was attempted, partial results, and potential alternatives to enable coordinator recovery
  • Distinguishing access failures from valid empty results in error reporting so the coordinator can make appropriate decisions
  • Having subagents implement local recovery for transient failures and only propagate errors they cannot resolve, including what was attempted and partial results
  • Structuring synthesis output with coverage annotations indicating which findings are well-supported versus which topic areas have gaps due to unavailable sources

Concepts

ts-5.4

Manage context effectively in large codebase exploration

Strategies for managing context degradation, scratchpad persistence, and crash recovery during extended codebase exploration.

Knowledge of

  • Context degradation in extended sessions: models start giving inconsistent answers and referencing "typical patterns" rather than specific classes discovered earlier
  • The role of scratchpad files for persisting key findings across context boundaries
  • Subagent delegation for isolating verbose exploration output while the main agent coordinates high-level understanding
  • Structured state persistence for crash recovery: each agent exports state to a known location, and the coordinator loads a manifest on resume

Skills in

  • Spawning subagents to investigate specific questions (e.g., "find all test files," "trace refund flow dependencies") while the main agent preserves high-level coordination
  • Having agents maintain scratchpad files recording key findings, referencing them for subsequent questions to counteract context degradation
  • Summarizing key findings from one exploration phase before spawning sub-agents for the next phase, injecting summaries into initial context
  • Designing crash recovery using structured agent state exports (manifests) that the coordinator loads on resume and injects into agent prompts
  • Using /compact to reduce context usage during extended exploration sessions when context fills with verbose discovery output

Concepts

ts-5.5

Design human review workflows and confidence calibration

Stratified sampling, confidence calibration, and routing strategies for human review of AI outputs.

Knowledge of

  • The risk that aggregate accuracy metrics (e.g., 97% overall) may mask poor performance on specific document types or fields
  • Stratified random sampling for measuring error rates in high-confidence extractions and detecting novel error patterns
  • Field-level confidence scores calibrated using labeled validation sets for routing review attention
  • The importance of validating accuracy by document type and field segment before automating high-confidence extractions

Skills in

  • Implementing stratified random sampling of high-confidence extractions for ongoing error rate measurement and novel pattern detection
  • Analyzing accuracy by document type and field to verify consistent performance across all segments before reducing human review
  • Having models output field-level confidence scores, then calibrating review thresholds using labeled validation sets
  • Routing extractions with low model confidence or ambiguous/contradictory source documents to human review, prioritizing limited reviewer capacity

Concepts

ts-5.6

Preserve information provenance and handle uncertainty in multi-source synthesis

Maintaining source attribution, handling conflicting data, and preserving temporal context through synthesis pipelines.

Knowledge of

  • How source attribution is lost during summarization steps when findings are compressed without preserving claim-source mappings
  • The importance of structured claim-source mappings that the synthesis agent must preserve and merge when combining findings
  • How to handle conflicting statistics from credible sources: annotating conflicts with source attribution rather than arbitrarily selecting one value
  • Temporal data: requiring publication/collection dates in structured outputs to prevent temporal differences from being misinterpreted as contradictions

Skills in

  • Requiring subagents to output structured claim-source mappings (source URLs, document names, relevant excerpts) that downstream agents preserve through synthesis
  • Structuring reports with explicit sections distinguishing well-established findings from contested ones, preserving original source characterizations and methodological context
  • Completing document analysis with conflicting values included and explicitly annotated, letting the coordinator decide how to reconcile before passing to synthesis
  • Requiring subagents to include publication or data collection dates in structured outputs to enable correct temporal interpretation
  • Rendering different content types appropriately in synthesis outputs -- financial data as tables, news as prose, technical findings as structured lists -- rather than converting everything to a uniform format

Concepts