Multi-Agent Orchestration: The Orchestrator-Worker Pattern

The Orchestrator-Worker Pattern

At the largest scale, subagents power true multi-agent systems. Anthropic's own multi-agent research system uses the orchestrator-worker pattern: a lead agent analyzes the query, develops a strategy, and spawns specialized subagents that explore different aspects in parallel — then synthesizes their findings into a final answer. This is the architecture behind Claude's advanced research capabilities.

Orchestrator-worker: an Opus lead agent fans out parallel Sonnet subagents over different aspects, then synthesizes.
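
The plan, fan-out, and synthesize flow described above can be sketched in a few lines. This is a minimal illustration, not Anthropic's implementation: `plan_aspects`, `run_subagent`, and `synthesize` are hypothetical stand-ins for model calls, and the short sleep is a placeholder for real model and tool work.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Finding:
    aspect: str
    summary: str


def plan_aspects(query: str) -> list[str]:
    """Hypothetical planner: a real lead agent would ask a model to decompose the query."""
    return [f"{query}: background", f"{query}: current state", f"{query}: open problems"]


async def run_subagent(aspect: str) -> Finding:
    """Hypothetical worker: stands in for a subagent with its own context window."""
    await asyncio.sleep(0.01)  # placeholder for model and tool calls
    return Finding(aspect=aspect, summary=f"notes on {aspect}")


def synthesize(query: str, findings: list[Finding]) -> str:
    """Hypothetical synthesis step: the lead agent merges the workers' results."""
    lines = [f"Answer to: {query}"]
    lines += [f"- {f.aspect}: {f.summary}" for f in findings]
    return "\n".join(lines)


async def orchestrate(query: str) -> str:
    aspects = plan_aspects(query)  # 1. lead agent plans
    # 2. workers run concurrently; gather returns results in input order
    findings = await asyncio.gather(*(run_subagent(a) for a in aspects))
    return synthesize(query, list(findings))  # 3. lead agent synthesizes


if __name__ == "__main__":
    print(asyncio.run(orchestrate("solid-state batteries")))
```

Note that each worker only sees its own aspect string and returns a compact `Finding`; the lead agent never sees the worker's intermediate steps, only the result it hands back.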

The Numbers

This isn't a marginal improvement. In Anthropic's published results, a multi-agent system with Claude Opus 4 as the lead agent and Claude Sonnet 4 subagents outperformed single-agent Claude Opus 4 by 90.2% on their internal research eval. The advantage was strongest for breadth-first queries that benefit from pursuing many directions at once.

Metric	Finding
Multi-agent vs single-agent	+90.2% on Anthropic's internal research eval
Token usage (multi-agent)	~15× more tokens than a chat interaction
Token usage (standard agent)	~4× more tokens than chat
What explains performance	Token usage alone explains ~80% of the variance (in Anthropic's BrowseComp analysis)
Parallel tool calling	Running subagents and 3+ tool calls in parallel cut research time by up to 90% on complex queries

The headline figures from Anthropic's multi-agent research system — big gains, but at a real token cost.
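
The parallel tool calling row in the table is easy to see in miniature. The sketch below is an assumption-laden toy: `call_tool` is a hypothetical tool that just sleeps for a fixed delay, so the timings show why concurrent calls finish sooner, not how much time a real research system saves.

```python
import asyncio
import time


async def call_tool(name: str, delay: float = 0.05) -> str:
    """Hypothetical tool call; the delay simulates network or compute latency."""
    await asyncio.sleep(delay)
    return f"{name}: done"


async def run_sequential(tools: list[str]) -> list[str]:
    # Each call waits for the previous one to finish.
    return [await call_tool(t) for t in tools]


async def run_parallel(tools: list[str]) -> list[str]:
    # All calls are in flight at once; total time is roughly the slowest call.
    return list(await asyncio.gather(*(call_tool(t) for t in tools)))


def timed(coro):
    """Run a coroutine to completion and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    tools = ["search", "fetch", "calculate"]
    _, t_seq = timed(run_sequential(tools))
    _, t_par = timed(run_parallel(tools))
    print(f"sequential: {t_seq:.2f}s, parallel: {t_par:.2f}s")
```

With three simulated tools, the sequential run takes roughly the sum of the delays while the parallel run takes roughly one delay. Real savings depend on how independent the calls are.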

⚠️

Performance comes from spending tokens

That ~80% figure is the crucial nuance: most of the variation in performance tracks how many tokens the system spends, and multi-agent systems win largely because they do more work (more tokens, more parallel exploration). This is why multi-agent isn't a free win. It is a deliberate trade of cost for capability on tasks where that trade pays off.
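
To make the multipliers concrete, here is a small worked calculation. The ~4× and ~15× figures come from the table above; the 2,000-token chat baseline is purely an assumption chosen for illustration, so treat the absolute numbers as a sketch rather than a measurement.

```python
# Multipliers relative to a chat interaction, taken from the table above.
SINGLE_AGENT_MULTIPLIER = 4   # ~4x chat
MULTI_AGENT_MULTIPLIER = 15   # ~15x chat


def estimated_tokens(chat_baseline_tokens: int) -> dict[str, int]:
    """Scale an assumed chat baseline by the published rough multipliers."""
    return {
        "chat": chat_baseline_tokens,
        "single_agent": chat_baseline_tokens * SINGLE_AGENT_MULTIPLIER,
        "multi_agent": chat_baseline_tokens * MULTI_AGENT_MULTIPLIER,
    }


def multi_vs_single_ratio() -> float:
    """How many times more tokens multi-agent uses than a single agent."""
    return MULTI_AGENT_MULTIPLIER / SINGLE_AGENT_MULTIPLIER


if __name__ == "__main__":
    # Assumed baseline: a 2,000-token chat interaction (hypothetical figure).
    print(estimated_tokens(2_000))
    # {'chat': 2000, 'single_agent': 8000, 'multi_agent': 30000}
    print(multi_vs_single_ratio())  # 3.75
```

The second number is the one to keep in mind when choosing between architectures: moving from a single agent to multi-agent costs roughly 3.75 times the tokens, so the task has to be worth that.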

When Multi-Agent Is Worth It

The 15× token cost means multi-agent orchestration only makes sense for the right tasks. Anthropic's guidance maps cleanly onto everything you've learned about delegation and the decision rule:

Multi-agent is worth it for	Multi-agent is a poor fit for
High-value tasks that justify the token cost	Tasks needing shared context across all agents
Heavy parallelization (independent directions)	Heavy interdependencies between steps
Information exceeding a single context window	Real-time coordination between agents
Work that interfaces with many complex tools	Most coding tasks

Multi-agent shines on parallel, high-value research; it struggles where agents must share context or coordinate tightly — including most coding.

⭐

The same decision rule, scaled up

Notice this is the 'does the intermediate work matter?' rule at system scale. Parallel research over independent directions (journey doesn't matter) → multi-agent wins. Tightly coupled work where steps depend on each other (journey matters) → keep it coordinated in one place. Most coding is the latter, which is why multi-agent is a poor fit there.
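
One way to internalize the two columns of the table is to write them down as a checklist. The function below is an illustrative encoding only: the field names and the way the conditions are combined are assumptions made for this sketch, not a rule published by Anthropic.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    high_value: bool                   # does the outcome justify the token cost?
    independent_directions: bool       # can subtasks run without seeing each other's work?
    exceeds_one_context: bool          # is there more information than one context window holds?
    needs_shared_context: bool         # must every agent see the same evolving state?
    tightly_coupled_steps: bool        # does each step depend on the previous one?
    needs_realtime_coordination: bool  # must agents coordinate while they work?


def multi_agent_recommended(task: TaskProfile) -> bool:
    """Assumed heuristic: any poor-fit trait rules multi-agent out."""
    poor_fit = (
        task.needs_shared_context
        or task.tightly_coupled_steps
        or task.needs_realtime_coordination
    )
    if poor_fit:
        return False
    # Otherwise require high value plus at least one good-fit trait.
    return task.high_value and (task.independent_directions or task.exceeds_one_context)
```

A broad literature survey (high value, independent directions, no shared state) passes this check; a refactor where each change depends on the last does not, which matches the "most coding tasks" entry in the table.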

Prompt-Engineering an Orchestrator

Building a good orchestrator is mostly about teaching it to delegate well — which is exactly the description-and-output-format discipline from earlier, applied to the lead agent:

• Give each subagent a detailed task: objective, output format, tool guidance, and clear boundaries (so it knows exactly what to return).
• Embed scaling rules: a simple query might need one agent making 3-10 tool calls; complex research warrants 10+ subagents.
• Use parallel tool calling: agents using 3+ tools simultaneously cut research time by up to 90%.
• Start broad, then progressively narrow; give heuristics rather than rigid rules.
• Keep humans in the loop for evaluation: automation misses edge cases like source-selection bias and hallucinations on unusual queries.
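
The first two points in the list above lend themselves to a small template. The sketch below assumes a hypothetical `SubagentBrief` structure and a `scaling_rule` helper; the names and prompt wording are illustrative, and only the two effort levels stated in the list are encoded.

```python
from dataclasses import dataclass


@dataclass
class SubagentBrief:
    """Hypothetical container for the four parts of a delegated task."""
    objective: str
    output_format: str
    tool_guidance: str
    boundaries: str

    def to_prompt(self) -> str:
        # Render the brief as the text handed to one subagent.
        return (
            f"Objective: {self.objective}\n"
            f"Output format: {self.output_format}\n"
            f"Tools: {self.tool_guidance}\n"
            f"Boundaries: {self.boundaries}"
        )


def scaling_rule(complexity: str) -> str:
    """Return the effort guidance to embed in the orchestrator's prompt."""
    rules = {
        "simple": "Use one agent making 3-10 tool calls.",
        "complex": "Use 10+ subagents.",
    }
    if complexity not in rules:
        raise ValueError(f"no scaling rule defined for {complexity!r}")
    return rules[complexity]


if __name__ == "__main__":
    brief = SubagentBrief(
        objective="Summarize recent work on solid-state battery electrolytes",
        output_format="Five bullet points, each with a source",
        tool_guidance="Prefer web search; fetch primary sources before summarizing",
        boundaries="Electrolytes only; do not cover manufacturing or pricing",
    )
    print(scaling_rule("complex"))
    print(brief.to_prompt())
```

Making all four fields required is a deliberate choice in this sketch: a brief without boundaries or an output format cannot be constructed, which mirrors the point that vague delegation is the main failure mode.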

✨

You've now seen subagents from a single helper all the way to a research system that scored 90.2% higher than a single agent on Anthropic's internal eval. The final lesson consolidates everything and gives you exam-focused pointers.

Key Takeaways

✓ The orchestrator-worker pattern: a lead agent plans, spawns specialized subagents to explore aspects in parallel, then synthesizes. This is the architecture behind Anthropic's multi-agent research system.
✓ Anthropic's result: a multi-agent system (Opus 4 lead + Sonnet 4 subagents) outperformed single-agent Opus 4 by 90.2% on their internal research eval, especially for breadth-first queries.
✓ Token cost is real: multi-agent uses ~15× the tokens of a chat (standard agents ~4×), and token usage alone explains ~80% of performance variance.
✓ Multi-agent is worth it for high-value, heavily parallel tasks with info exceeding one context window; it's a poor fit for shared-context, interdependent, or real-time work, including most coding.
✓ This is the 'does the intermediate work matter?' decision rule applied at system scale: independent parallel work wins, tightly coupled work doesn't.
✓ Orchestrator prompt-engineering = detailed per-subagent tasks (objective/format/tools/boundaries), scaling rules, parallel tool calling, broad→narrow, and humans in the loop.

Check Your Understanding

Test what you learned in this lesson.

Q1. What is the orchestrator-worker pattern?

Q2. By how much did Anthropic's multi-agent system outperform single-agent Claude Opus 4 on their internal research eval?

Q3. What is the key cost nuance of multi-agent systems?

Q4. For which kind of task is multi-agent orchestration a POOR fit?
