Three Claude API Features That Cut Agent Token Costs by 85% — And Improve Accuracy
Production Claude agents fail at tool use in three distinct ways: they exhaust their context window before finishing a task, they make sequential inference passes that could run in parallel, or they call tools with subtly incorrect parameters. Anthropic has released three beta API features that address each failure mode independently. Understanding when to apply each — and in which combination — is a core skill for the Claude SA exam.
Feature 1: Tool Search — Deferred Definition Loading
In a multi-server MCP environment, loading every tool definition at session start can consume tens of thousands of tokens. A representative five-server setup combining a code repository, messaging platform, observability stack, infrastructure tooling, and log analysis can load around 55,000 tokens before any conversation content.
Tool search addresses this by allowing tools to be marked with defer_loading: true. These definitions are omitted from the initial context. When the agent needs a tool, it issues a search query and receives only the definitions of matching tools on demand. Published results show this approach reducing definition overhead from approximately 77,000 tokens to 8,700 — a reduction of more than 85% — while improving task accuracy on Opus 4.5 from 79.5% to 88.1%, likely because the freed context window leaves more room for task-relevant information.
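A deferred tools array might look like the following sketch. The defer_loading flag comes from the feature description above; the tool names, descriptions, and the resident search entry point are illustrative assumptions, not Anthropic's documented schema.

```python
# One small tool stays resident so the agent can discover the rest;
# everything else is deferred out of the initial context.
always_loaded = {
    "name": "search_tools",
    "description": "Find tools matching a query",
    "input_schema": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}

deferred = [
    {
        "name": name,
        "description": desc,
        "input_schema": {"type": "object", "properties": {}},
        "defer_loading": True,  # definition omitted until searched for
    }
    for name, desc in [
        ("search_logs", "Query the log analysis backend"),
        ("get_metrics", "Fetch observability metrics"),
        ("create_ticket", "Open an issue in the tracker"),
    ]
]

tools = [always_loaded, *deferred]
```

Only the resident entry point and any matched definitions occupy context; the deferred majority costs nothing until the agent actually asks for it.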
Feature 2: Programmatic Tool Calling — Code-Executed Orchestration
Traditional tool use requires one full model inference pass per tool call. Intermediate results accumulate in context regardless of whether they are still needed. On a workflow that requires 20 tool invocations, this creates 20 sequential inference passes and a context window progressively filled with stale data.
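The traditional loop can be sketched as follows, with call_model and run_tool as stand-ins for the real inference API and tool backend (both are illustrative assumptions, not SDK functions):

```python
# Stand-ins for the model and a tool backend. Each tool result is a
# raw ~10 KB payload, mimicking unfiltered expense data.
def call_model(history):
    return {"tool": "get_expenses", "args": {"call": len(history)}}

def run_tool(call):
    return "x" * 10_000

messages = []
for _ in range(20):                      # 20 invocations -> 20 sequential passes
    tool_call = call_model(messages)     # one full inference pass per call
    result = run_tool(tool_call)
    messages.append({"role": "user",
                     "content": [{"type": "tool_result",
                                  "content": result}]})

total = sum(len(m["content"][0]["content"]) for m in messages)
# total is now 200,000 characters of raw data held in context,
# whether or not later steps still need it
```

Under these assumptions the conversation history carries roughly 200 KB of intermediate data, which is the accumulation problem programmatic tool calling is designed to remove.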
Programmatic tool calling inverts this: Claude writes Python code that runs in a sandboxed execution environment. Tools marked with allowed_callers: ["code_execution_20250825"] become callable as Python functions inside the sandbox. Results are processed within the execution environment — only the final filtered output enters Claude's context. For a budget compliance task involving 2,000+ expense line items across an engineering team, this approach reduces context consumption from roughly 200KB of raw data to approximately 1KB of actionable results, while eliminating 19 sequential inference passes.
# Claude writes orchestration code like this
expenses = await asyncio.gather(
    *[get_expenses(member["id"], "Q3") for member in team]
)
exceeded = [
    {"member": m["name"], "spent": sum(e["amount"] for e in exps)}
    for m, exps in zip(team, expenses)
    if sum(e["amount"] for e in exps) > budgets[m["level"]]
]
return exceeded  # Only this enters Claude's context

Feature 3: Tool Use Examples — Conveying Convention Beyond Schema
JSON Schema is good at defining what a tool accepts structurally, but poor at expressing usage conventions: when optional parameters apply, which field combinations are valid together, how domain-specific values should be formatted (ISO dates, proprietary ID patterns, enum edge cases). Without examples, Claude must infer these conventions from schema descriptions alone — which works for simple tools but breaks down on complex ones.
The input_examples field on tool definitions provides concrete invocation examples that demonstrate format and convention. Published accuracy results show improvement from 72% to 90% on complex parameter handling tasks. This feature is most valuable for tools with many optional parameters, domain-specific formatting requirements, or cases where multiple valid combinations produce different outcomes.
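A tool definition using input_examples might look like the sketch below. The field name comes from the feature description above; the tool, its parameters, and the example values are hypothetical, chosen to show the kind of convention a schema alone cannot express.

```python
# Hypothetical ticketing tool. The input_examples demonstrate two
# conventions the schema cannot state: p0 tickets carry a due_date
# and assignee, while minimal calls omit optional fields entirely.
create_ticket = {
    "name": "create_ticket",
    "description": "Open a ticket in the issue tracker",
    "input_schema": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "priority": {"type": "string", "enum": ["p0", "p1", "p2"]},
            "due_date": {"type": "string", "description": "ISO 8601 date"},
            "assignee": {"type": "string", "description": "ID like ENG-1042"},
        },
        "required": ["title"],
    },
    "input_examples": [
        {"title": "API 5xx spike", "priority": "p0",
         "due_date": "2025-11-21", "assignee": "ENG-1042"},
        {"title": "Update onboarding docs"},
    ],
}
```

Each example is a complete, valid invocation, so the model sees both the formatting conventions and which field combinations belong together.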
Choosing the Right Feature
- Primary bottleneck is context window size → start with Tool Search
- Primary bottleneck is latency or intermediate data accumulation → start with Programmatic Calling
- Primary issue is incorrect or malformed tool parameters → start with Usage Examples
- Complex production agents typically benefit from all three in combination
Enabling the Beta
client.beta.messages.create(
    model="claude-opus-4-6",
    betas=["advanced-tool-use-2025-11-20"],
    tools=[
        {"name": "search_logs", "defer_loading": True, ...},
        {"name": "run_query",
         "allowed_callers": ["code_execution_20250825"], ...},
        {"name": "create_ticket", "input_examples": [...], ...},
    ],
    ...
)

Preparing for the Claude SA Exam?
Explore 150+ exam concepts, 91 glossary terms, and full mock exams — all free.