4.3 Structured Output with Tool Use

4.3.1 Getting Output a Program Can Trust

Often you don't want Claude to write prose — you want STRUCTURED DATA your program can use: extract an invoice into fields, pull entities from a document, return a JSON object your code parses. Task Statement 4.3 is about getting that structure RELIABLY. The naive approach — 'please respond in JSON' in the prompt — works most of the time, which in production means it FAILS some of the time: a stray comma, an unescaped quote, a missing bracket, and your parser crashes.

There's a much more reliable mechanism, and it reuses something you already know: tool use. Recall from Domain 2 that a tool is defined with a JSON SCHEMA describing its inputs. When Claude calls a tool, the API returns the call's inputs as an already-parsed JSON object shaped by that schema, with the field names and types you declared; adding strict:true (see 4.3.2) makes conformance to the schema a guarantee rather than a strong tendency. So you can define a 'tool' whose only purpose is to RECEIVE your structured data (an extract_invoice tool with fields for date, total, line items), have Claude 'call' it, and read the structured result straight out of the tool call. It's like the difference between asking someone to write their answer freehand versus handing them a form with labelled boxes: the form gets you the fields you need in the shape you expect.
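As a minimal sketch of that idea, the block below defines an extract_invoice tool and reads the structured data out of a tool_use content block. The tool definition follows the name / description / input_schema shape of a Messages API tool. The response content is a hand-written stand-in, not real model output, and `read_tool_input` is a hypothetical helper. Plain dicts mirror the JSON response body; an SDK may expose the same blocks as objects with attributes instead.

```python
# Tool definition: the input_schema IS the structure we want back.
EXTRACT_INVOICE_TOOL = {
    "name": "extract_invoice",
    "description": "Record the fields extracted from an invoice.",
    "input_schema": {
        "type": "object",
        "properties": {
            "date": {"type": "string", "description": "Invoice date"},
            "total": {"type": "number"},
            "line_items": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "description": {"type": "string"},
                        "amount": {"type": "number"},
                    },
                    "required": ["description", "amount"],
                },
            },
        },
        "required": ["date", "total", "line_items"],
    },
}


def read_tool_input(content_blocks, tool_name):
    """Return the input dict of the first tool_use block for tool_name, or None."""
    for block in content_blocks:
        if block.get("type") == "tool_use" and block.get("name") == tool_name:
            return block["input"]
    return None


# Hand-written stand-in for the `content` list of a response (assumption).
sample_content = [
    {"type": "text", "text": "Recording the invoice."},
    {
        "type": "tool_use",
        "id": "toolu_example",
        "name": "extract_invoice",
        "input": {
            "date": "2024-03-01",
            "total": 120.0,
            "line_items": [{"description": "Widget", "amount": 120.0}],
        },
    },
]

invoice = read_tool_input(sample_content, "extract_invoice")
print(invoice["total"])  # 120.0
```

No string parsing happens on your side: the data arrives as a dict, which is what removes the stray-comma class of failure.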

The headline: tool_use with a JSON schema is the most reliable way to get schema-compliant structured output, and it ELIMINATES JSON syntax errors. But — and this is the lesson's most important nuance — it does NOT eliminate every kind of error. Let's see exactly what it guarantees and what it doesn't.

Asking for JSON in prose gives freehand output that occasionally breaks parsers; tool_use with a JSON schema is a labelled form that guarantees valid, well-shaped JSON — eliminating syntax errors.

ℹ️

The one idea to hold onto

tool_use with a JSON schema is the most reliable way to get schema-compliant structured output — it eliminates JSON SYNTAX errors. Define a tool whose schema IS the structure you want, and read the data out of the tool call.

4.3.2 Controlling Whether and Which Tool Is Called

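For structured output you usually need to GUARANTEE Claude calls your extraction tool rather than replying in prose. That's the tool_choice setting from Domain 2.3, applied here for output structure. There are exactly four values, and the exam tests that you know them precisely. In the API each value is passed as an object (for example {type:'any'}); this lesson abbreviates the first, second, and fourth as 'auto', 'any', and 'none'.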

tool_choice	Behavior	For structured output
auto (default)	Model may call a tool OR return text	Risky — might reply in prose
any	Must call SOME tool (model picks)	Guarantees a tool call when multiple schemas exist
{type:'tool',name:..}	Must call THAT specific tool	Force one extraction schema to run
none	No tools allowed	Not for extraction

The four tool_choice values. For guaranteed structured output use 'any' (some tool must be called) or a forced specific tool — not 'auto', which may return prose.

Two patterns matter for extraction. Use 'any' when you have multiple possible extraction schemas and the document type is unknown — it guarantees Claude calls ONE of them rather than chatting. Use a forced specific tool ({type:'tool',name:'extract_metadata'}) when a particular extraction MUST run first — e.g. extract metadata before any enrichment step — then handle later steps in follow-up turns. And for an ironclad guarantee, combine tool_choice:'any' with strict:true on the tool, which forces both that a tool IS called AND that its inputs strictly match your schema.
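The two patterns above can be sketched as a small helper that builds the tool_choice object, plus a helper that turns on strict mode. Both function names and the tiny tool schemas are hypothetical. The sketch also assumes strict tool use is available for your model and API version, and that it requires additionalProperties:false on the schema. Only the request fields relevant here are shown; a real request also needs a model, max_tokens, and messages.

```python
def tool_choice_for_extraction(tool_names, must_run=None):
    """Return a tool_choice object that guarantees a tool call.

    must_run: name of a tool that has to be called on this turn, or None to
    let the model pick among the available tools.
    """
    if must_run is not None:
        if must_run not in tool_names:
            raise ValueError(f"unknown tool: {must_run}")
        return {"type": "tool", "name": must_run}  # force one specific tool
    return {"type": "any"}  # some tool must be called; the model picks which


def make_strict(tool):
    """Return a copy of a tool definition with strict mode switched on.

    Only the top-level object gets additionalProperties=False here; nested
    objects in a larger schema would need the same treatment (assumption).
    """
    schema = dict(tool["input_schema"], additionalProperties=False)
    return dict(tool, strict=True, input_schema=schema)


# Hypothetical tools for illustration; the schemas are deliberately tiny.
TOOLS = [
    {
        "name": "extract_metadata",
        "description": "Record document metadata.",
        "input_schema": {
            "type": "object",
            "properties": {"title": {"type": "string"}},
            "required": ["title"],
        },
    },
    {
        "name": "extract_invoice",
        "description": "Record invoice fields.",
        "input_schema": {
            "type": "object",
            "properties": {"total": {"type": "number"}},
            "required": ["total"],
        },
    },
]

names = [tool["name"] for tool in TOOLS]

# Unknown document type: any one of the extraction tools must be called.
request_fields = {
    "tools": [make_strict(tool) for tool in TOOLS],
    "tool_choice": tool_choice_for_extraction(names),
}

# Metadata must be extracted first: force that specific tool.
forced = tool_choice_for_extraction(names, must_run="extract_metadata")

print(request_fields["tool_choice"])  # {'type': 'any'}
print(forced)                         # {'type': 'tool', 'name': 'extract_metadata'}
```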

⭐

4.3.2 — Key Concept

tool_choice has exactly four values: auto (may return text), any (must call some tool), {type:'tool',name} (forced specific tool), none. For guaranteed structured output use 'any' or a forced tool — not 'auto'; add strict:true to also guarantee schema-valid inputs.

4.3.3 The Critical Limit: Structure Isn't Correctness

Now the single most important — and most tested — idea in this lesson. tool_use with a strict schema guarantees the SHAPE of the output: valid JSON, all required fields present, correct types. It does NOT guarantee the output is CORRECT. The schema is a form; it checks that every box is filled with the right TYPE of thing, but it can't check that the VALUES are right.

Concretely, a schema-valid extraction can still contain SEMANTIC errors: line items that don't add up to the stated total, a value placed in the wrong field (the invoice number where the PO number belongs), or outright fabricated data that happens to fit the schema. Every one of those passes schema validation — the JSON is perfectly well-formed — yet the data is wrong. The exam tests this directly: don't assume tool_use means the answer is correct. Schema = structure; correctness needs separate checking (which is the whole subject of the next lesson, 4.4).
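To make the gap concrete, here is a small sketch with a hand-written extraction result (not real model output) that passes a structural check yet fails a simple arithmetic check. Both functions are hypothetical illustrations. The structural check is a hand-rolled stand-in for what schema validation covers, and the 0.01 tolerance is an arbitrary assumption.

```python
from decimal import Decimal


def has_expected_shape(invoice):
    """Structural check only: field presence and types, as a schema would enforce."""
    if not isinstance(invoice.get("total"), (int, float)):
        return False
    items = invoice.get("line_items")
    if not isinstance(items, list):
        return False
    return all(
        isinstance(item, dict) and isinstance(item.get("amount"), (int, float))
        for item in items
    )


def line_items_match_total(invoice, tolerance=Decimal("0.01")):
    """Semantic check: do the line-item amounts add up to the stated total?"""
    items_sum = sum(Decimal(str(item["amount"])) for item in invoice["line_items"])
    return abs(items_sum - Decimal(str(invoice["total"]))) <= tolerance


# Well-formed, correctly typed, and wrong: 100 + 40 is not 150.
extracted = {
    "date": "2024-03-01",
    "total": 150.0,
    "line_items": [
        {"description": "Widget", "amount": 100.0},
        {"description": "Shipping", "amount": 40.0},
    ],
}

print(has_expected_shape(extracted))      # True
print(line_items_match_total(extracted))  # False
```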

⭐

4.3.3 — Key Concept

tool_use with a JSON schema eliminates SYNTAX errors but NOT SEMANTIC errors — a schema-valid result can still have line items that don't sum, values in the wrong field, or fabricated data. Schema guarantees structure, not correctness.

4.3.4 Schema Design to Prevent Fabrication

Schema design itself can prevent one common error: fabrication. Here's the trap. If you mark every field REQUIRED, you're telling the model 'this field must have a value' — so when the source document doesn't contain that information, the model FABRICATES one to satisfy the requirement. You've literally pressured it into making things up. The fix is to make fields that may be absent OPTIONAL or NULLABLE, which permits an honest null when the data isn't there. A nullable field says 'it's fine to say you don't know' — and that's exactly what you want for extraction.
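A minimal sketch of the nullable variant: the fields that may be absent accept null in their type, so the model has a legitimate way to say "not in the document". The field names and both helper functions are hypothetical. This version keeps the nullable fields in `required` so the key always appears and a null is explicit; leaving them out of `required` is the optional variant of the same idea.

```python
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        # May be missing from the source document, so null is allowed.
        "po_number": {"type": ["string", "null"]},
        "due_date": {"type": ["string", "null"]},
    },
    "required": ["invoice_number", "po_number", "due_date"],
}


def nullable_fields(schema):
    """Names of properties whose type list includes 'null'."""
    names = []
    for name, spec in schema["properties"].items():
        types = spec.get("type")
        if isinstance(types, list) and "null" in types:
            names.append(name)
    return names


def reported_nulls(record, schema):
    """Nullable fields the extraction reported as null (or left out)."""
    return [name for name in nullable_fields(schema) if record.get(name) is None]


# Hand-written stand-in for a tool call's input on a document with no PO number.
record = {"invoice_number": "INV-1001", "po_number": None, "due_date": "2024-04-01"}

print(nullable_fields(INVOICE_SCHEMA))         # ['po_number', 'due_date']
print(reported_nulls(record, INVOICE_SCHEMA))  # ['po_number']
```

Logging which fields came back null is a cheap way to see how often the source documents really lack them.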

Two more schema design moves help with ambiguity and varied formats. For categorization, add an 'unclear' enum value for genuinely ambiguous cases (so the model isn't forced to pick a wrong category), and an 'other' value paired with a free-text detail field for items outside your fixed categories (extensible without fabrication). Also put format-NORMALIZATION rules in the prompt alongside the schema, e.g. 'express all dates as ISO-8601', so inconsistent source formatting gets cleaned up as it's extracted.

Design choice	Prevents
Nullable/optional fields for maybe-absent data	Fabrication to satisfy a required field
'unclear' enum for ambiguous cases	Forced wrong-category guesses
'other' + detail string	Cramming novel items into fixed categories
Format-normalization rules in the prompt	Inconsistent source formatting leaking through

Schema design as a fabrication guard: nullable fields permit honest nulls, 'unclear'/'other' handle ambiguity and novelty, and prompt-level normalization standardizes formats.
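The design choices in the table can be sketched together as one schema plus a prompt fragment. The category names, field names, rule text, and `check_record` helper are all hypothetical. The check is a lightweight illustration of confirming the rules were followed, not a full validator.

```python
from datetime import date

CATEGORY_SCHEMA = {
    "type": "object",
    "properties": {
        "category": {
            "type": "string",
            # 'unclear' for ambiguous cases, 'other' for items outside the fixed set.
            "enum": ["hardware", "software", "services", "unclear", "other"],
        },
        "category_detail": {
            "type": ["string", "null"],
            "description": "Free-text label, used when category is 'other'.",
        },
        "invoice_date": {
            "type": ["string", "null"],
            "description": "ISO-8601 date (YYYY-MM-DD).",
        },
    },
    "required": ["category", "category_detail", "invoice_date"],
}

# Normalization rules travel in the prompt, next to the schema.
NORMALIZATION_RULES = (
    "Express all dates as ISO-8601 (YYYY-MM-DD). "
    "If the category is 'other', describe it in category_detail."
)


def is_iso_date(value):
    """True if date.fromisoformat accepts the value."""
    try:
        date.fromisoformat(value)
    except (TypeError, ValueError):
        return False
    return True


def check_record(record):
    """Return a list of rule violations for one extracted record."""
    problems = []
    if record["category"] not in CATEGORY_SCHEMA["properties"]["category"]["enum"]:
        problems.append("unknown category")
    if record["category"] == "other" and not record.get("category_detail"):
        problems.append("'other' needs category_detail")
    invoice_date = record.get("invoice_date")
    if invoice_date is not None and not is_iso_date(invoice_date):
        problems.append("invoice_date is not ISO-8601")
    return problems


print(check_record({"category": "unclear", "category_detail": None, "invoice_date": None}))
# []
print(check_record({"category": "other", "category_detail": None, "invoice_date": "03/01/2024"}))
# ["'other' needs category_detail", 'invoice_date is not ISO-8601']
```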

⭐

4.3.4 — Key Concept

Make maybe-absent fields OPTIONAL/NULLABLE so the model can return an honest null instead of fabricating a value to satisfy a required field. Add an 'unclear' enum for ambiguity and 'other'+detail for novel cases, and put format-normalization rules in the prompt.

4.3.5 The Exam Traps

The 4.3 traps cluster around three ideas: tool_use ≠ correctness, the right tool_choice for structure, and required-fields-cause-fabrication.

• Thinking tool_use guarantees correctness. ✗ Assuming a schema-valid extraction is right. ✓ It's structurally valid only; semantic errors (sums, misplaced values, fabrication) still need validation (4.4).
• Wrong tool_choice for structure. ✗ Leaving tool_choice on 'auto' when you must get structured output. ✓ Use 'any' (or a forced tool); add strict:true.
• All fields required. ✗ Marking every field required, pressuring fabrication of absent data. ✓ Make maybe-absent fields nullable for honest nulls.
• Confusing auto/any/tool. ✗ Mixing up the four tool_choice values. ✓ auto may return text; any forces some tool; tool forces a specific one; none disables tools.

⚠️

4.3.5 — Exam Trap

✗ Believing tool_use prevents ALL errors (it stops syntax, not semantics). ✗ 'auto' when structured output is required (use 'any' / forced + strict:true). ✗ Making every field required (pressures fabrication — use nullable). Schema guarantees structure; correctness needs validation-retry (4.4).

4.3.6 Put It Together: Extract Reliably

You now know how tool_use guarantees structure, the four tool_choice values, the critical structure-vs-correctness limit, and schema design that prevents fabrication. The exercise builds a real extraction tool and exposes both what the schema guarantees and what it doesn't.

✨

4.3.6 — Build Exercise (30 min)

(1) Define an extraction tool with a JSON schema (date, total, line items) and read structured data from the tool_use response.
(2) Set tool_choice:'any' so the model can't reply in prose; add strict:true and confirm inputs match the schema.
(3) Make optional document fields NULLABLE and feed a document missing some fields. Verify the model returns null instead of fabricating values; then make them required and watch fabrication appear.
(4) Feed a document where line items don't sum to the stated total. Confirm the schema-valid output still contains the semantic error, motivating the validation-retry loop of Lesson 4.4.

tool_use guarantees structure; it can't guarantee correctness. The next lesson, 4.4, closes that gap — validation and retry loops that catch and fix the semantic errors the schema lets through.

ℹ️

Where this shows up on the exam

4.3 questions test tool_use for structured output, the four tool_choice values, that schema ≠ correctness (semantic errors remain), and nullable fields to prevent fabrication. The 'tool_use prevents all errors' misconception is a favorite distractor.

Key Takeaways

✓ tool_use with a JSON schema is the most reliable way to get schema-compliant structured output and eliminates JSON SYNTAX errors. Define a tool whose schema is the structure you want and read data from the tool call.
✓ tool_choice has exactly four values: auto (may return text), any (must call some tool), {type:'tool',name} (forced specific tool), none (no tools).
✓ For guaranteed structured output use 'any' or a forced tool (not 'auto'); combine with strict:true to also guarantee inputs match the schema.
✓ CRITICAL: tool_use eliminates syntax errors but NOT semantic errors: line items that don't sum, values in the wrong field, and fabrication all pass schema validation. Schema = structure, not correctness.
✓ Making every field required pressures the model to FABRICATE values for absent data; make maybe-absent fields optional/nullable so it can return an honest null.
✓ Add an 'unclear' enum for ambiguous cases and 'other' + a detail string for novel ones; put format-normalization rules in the prompt to standardize varied source formats.
✓ Correctness of schema-valid output must be checked separately; that is the job of validation-retry loops (Lesson 4.4).

Check Your Understanding

Test what you learned in this lesson.

Q1. You need reliably parseable structured data from Claude and keep getting occasional malformed JSON with the 'please respond in JSON' approach. What's the most reliable fix?

Q2. An extraction using tool_use with a strict schema returns well-formed JSON, but the line items don't add up to the stated total. What does this illustrate?

Q3. To prevent the model from fabricating values for fields that may be absent from the source document, you should…

Q4. You have several possible extraction schemas and the document type is unknown, but you MUST get a tool call (not a prose reply). Which tool_choice fits?
