Domain 4: Prompt Engineering & Structured Output (20%) | Lesson 22 of 30

4.4 Validation & Retry Loops

4.4.1 Catching the Errors the Schema Lets Through

Lesson 4.3 ended on a crucial gap: tool_use guarantees your output is well-FORMED, but not that it's CORRECT. A schema-valid extraction can still have line items that don't sum, values in the wrong field, or fabricated data. Task Statement 4.4 is about closing that gap — validating the output and, when it's wrong, retrying in a way that actually fixes it.

The naive approach is a plain retry: it failed, so run it again and hope. But hope isn't a strategy — a plain retry usually reproduces the SAME mistake, because the model has no idea what went wrong. Picture a student who got a math problem wrong; handing the identical problem back unchanged just gets the same wrong answer. But hand it back with 'your total of £450 doesn't match the line items, which sum to £500' and now they know exactly what to fix. That targeted feedback is the heart of this lesson.

The technique is retry-WITH-error-feedback: when validation catches a problem, you send the model back (1) the original document, (2) its failed extraction, and (3) the SPECIFIC validation error. With that, the model can self-correct precisely instead of guessing again. But there's an important limit — some errors retries can never fix, no matter how good the feedback — and knowing which is which is the most-tested idea here.

Figure: Naive retry vs. retry-with-feedback. Naive retry (✗): same prompt again; no idea what was wrong; reproduces the same mistake. Retry + feedback (✓): doc + failed output + the SPECIFIC validation error; self-corrects precisely.

A naive retry reproduces the same mistake; retry-with-error-feedback sends the doc, the failed output, and the specific error so the model can self-correct.
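As a minimal sketch of the three-part feedback message, the helper below assembles the original document, the failed extraction, and the specific error into one retry prompt. The function name, the prompt wording, and the sample invoice are illustrative assumptions, not a prescribed format.

```python
import json


def build_feedback_prompt(document: str, failed_extraction: dict, error: str) -> str:
    """Assemble a retry message that carries all three pieces of context.

    Hypothetical helper: the section only requires that the retry include
    the document, the failed extraction, and the specific validation error.
    """
    return (
        "Your previous extraction failed validation.\n\n"
        f"ORIGINAL DOCUMENT:\n{document}\n\n"
        f"YOUR PREVIOUS EXTRACTION:\n{json.dumps(failed_extraction, indent=2)}\n\n"
        f"VALIDATION ERROR:\n{error}\n\n"
        "Return a corrected extraction that resolves this error. "
        "Leave fields that were already correct unchanged."
    )


# Sample data invented for illustration.
prompt = build_feedback_prompt(
    document="Widget x2 £200.00\nGadget x1 £300.00\nTotal: £500.00",
    failed_extraction={"line_items": [200.0, 300.0], "total": 450.0},
    error="total is 450.00 but line items sum to 500.00",
)
print(prompt)
```

A naive retry would resend only the first extraction request; here the model also sees what it produced and exactly which check failed.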

ℹ️

The one idea to hold onto

Validate the output, and when it's wrong, retry WITH the specific error: send the original document, the failed extraction, and the exact validation error so the model self-corrects — rather than a naive retry that just repeats the mistake.

4.4.2 What Retries Can and Can't Fix

This is the heart of 4.4. Retry-with-feedback is powerful, but it has a hard boundary, and the exam tests whether you know it. The deciding question: is the information actually THERE to get right?

Retries CAN fix problems where the right answer is derivable from what's in front of the model: format mismatches (a date in the wrong format), structural errors (fields nested wrong), misplaced values (the right number in the wrong field), and math errors (a sum that's off). In all of these, the correct answer is recoverable; the feedback just points the model at it.

Retries CANNOT fix a fundamentally different problem: information that is simply ABSENT from the source. If the document never states the customer's VAT number, no amount of retrying or feedback will conjure it, because it isn't there. The same applies to anything requiring EXTERNAL knowledge or documents the model wasn't given.

So when validation fails, first ask: is this a recoverable error (retry with feedback) or missing information (don't retry — flag for a human, or return null)? Retrying for absent data is a waste that, worse, can pressure the model into fabricating the missing value. Match the response to the cause.

Validation failure | Retry? | Right response
Wrong date format | Yes | Retry with the format error
Line items don't sum | Yes | Retry with the discrepancy noted
Value in the wrong field | Yes | Retry pointing at the misplacement
Field's info absent from the document | No | Flag for human / return null; retrying can't help

Retries fix recoverable errors (format, structure, misplacement, math). They cannot supply information that isn't in the source — flag those for a human instead.
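The table's decision rule can be sketched as a small routing function. The failure labels and the Action enum are hypothetical names chosen for illustration; a real validator would define its own categories and its own way of detecting them.

```python
from enum import Enum


class Action(Enum):
    RETRY_WITH_FEEDBACK = "retry_with_feedback"
    FLAG_FOR_HUMAN = "flag_for_human"


# Hypothetical failure labels mirroring the categories in this lesson.
RECOVERABLE = {"format_mismatch", "structural_error", "misplaced_value", "math_error"}


def decide(failure_kind: str) -> Action:
    """Match the response to the cause of the validation failure."""
    if failure_kind in RECOVERABLE:
        # The right answer is derivable from the input, so feedback can help.
        return Action.RETRY_WITH_FEEDBACK
    # Absent information, needs for external knowledge, and (as an assumed
    # default) any unrecognized failure go to a human rather than a retry.
    return Action.FLAG_FOR_HUMAN


print(decide("math_error"))
print(decide("absent_from_source"))
```

Sending unrecognized failures to a human is a conservative default assumed here; it avoids retrying in cases where the retry might pressure the model into fabricating a value.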

4.4.2 — Key Concept

Retries fix recoverable errors — format mismatches, structural errors, misplaced values, math errors — where the answer is derivable from the input. They CANNOT fix information ABSENT from the source (or needing external knowledge); flag those for a human or return null instead of retrying.

4.4.3 Building Self-Correction Into the Schema

You can make validation easier by designing the schema to SURFACE its own errors. Instead of computing the total separately and comparing, have the extraction return BOTH a calculated_total (summed from the line items) AND the stated_total (the figure printed on the document). Now a discrepancy is visible right in the output — your validator just compares two fields, and the model itself is prompted to notice when they disagree.
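A minimal sketch of the two-field comparison follows. It assumes the extraction arrives as a dict with calculated_total and stated_total given as decimal strings; the function name and error wording are illustrative.

```python
from decimal import Decimal
from typing import Optional


def check_totals(extraction: dict) -> Optional[str]:
    """Return a specific error message if the two totals disagree, else None."""
    calculated = Decimal(extraction["calculated_total"])
    stated = Decimal(extraction["stated_total"])
    if calculated != stated:
        return (
            f"calculated_total ({calculated}) does not match "
            f"stated_total ({stated}); re-check the line items and both totals"
        )
    return None


print(check_totals({"calculated_total": "450.00", "stated_total": "500.00"}))
print(check_totals({"calculated_total": "500.00", "stated_total": "500.00"}))
```

The returned string is written to be pasted directly into a retry message as the specific validation error. Decimal is used here so that currency amounts compare exactly.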

The same self-correction idea extends further. Add a conflict_detected boolean for sources that contradict each other, so inconsistency is flagged rather than silently resolved. Add a detected_pattern field that records WHICH code construct triggered a finding — so when developers dismiss certain findings, you can analyze the dismissal patterns and refine your criteria (tying back to the false-positive work in 4.1). Schema fields can do more than hold data; they can build in the checks that catch semantic errors.
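To illustrate how a detected_pattern field enables dismissal analysis, the sketch below computes a per-pattern dismissal rate. The findings list, the pattern names, and the dismissed flag are invented sample data, not part of any real tool's output.

```python
from collections import Counter

# Hypothetical review findings; "dismissed" records whether a developer
# rejected the finding.
findings = [
    {"detected_pattern": "bare_except", "dismissed": True},
    {"detected_pattern": "bare_except", "dismissed": True},
    {"detected_pattern": "sql_string_concat", "dismissed": False},
    {"detected_pattern": "bare_except", "dismissed": False},
    {"detected_pattern": "unused_import", "dismissed": True},
]


def dismissal_rates(findings: list) -> dict:
    """Fraction of findings dismissed, grouped by detected_pattern."""
    total = Counter(f["detected_pattern"] for f in findings)
    dismissed = Counter(f["detected_pattern"] for f in findings if f["dismissed"])
    return {pattern: dismissed[pattern] / total[pattern] for pattern in total}


rates = dismissal_rates(findings)
for pattern, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print(f"{pattern}: {rate:.0%} dismissed")
```

Patterns with a high dismissal rate are candidates for tightening the review criteria; without the detected_pattern field there would be nothing to group by.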

And recall the distinction from 4.3: schema SYNTAX errors are already eliminated by tool_use, so they're not what validation-retry is for. Validation-retry exists to catch SEMANTIC errors — the sums, misplacements, and conflicts that well-formed JSON can still contain. The two lessons are partners: 4.3 guarantees the form, 4.4 verifies the substance.

4.4.3 — Key Concept

Design schemas for self-correction: extract calculated_total alongside stated_total so discrepancies are visible; add conflict_detected booleans and detected_pattern fields. Validation-retry targets SEMANTIC errors — syntax errors are already eliminated by tool_use (4.3).

4.4.4 The Exam Traps

The 4.4 traps test the retry-able-vs-not distinction and the difference between naive and feedback-driven retries. The signature question gives two documents — one with a recoverable error, one with missing info — and asks how to handle each.

  • Naive retry. ✗ Re-running the same prompt and hoping. ✓ Retry WITH the document, the failed output, and the specific error.
  • Retrying for absent info. ✗ Retrying when the needed value isn't in the source. ✓ Flag for a human or return null — retrying can't supply missing data and may cause fabrication.
  • Treating syntax and semantic errors the same. ✗ Using retry loops for JSON syntax. ✓ tool_use already eliminates syntax errors (4.3); validation-retry is for semantic errors.
  • Not surfacing the error. ✗ Computing totals entirely outside the model. ✓ Have the schema return calculated vs stated values so discrepancies are visible.
⚠️

4.4.4 — Exam Trap

Match the response to the cause: recoverable error (format/structure/misplacement/math) → retry WITH specific feedback; information ABSENT from the source → flag for human / return null (retrying can't help and risks fabrication). Don't use retry loops for syntax (that's tool_use, 4.3).

4.4.5 Put It Together: Validate and Retry Well

You now know retry-with-feedback, the recoverable-vs-absent boundary, and schema-based self-correction. The exercise has you build a validation-retry loop and prove the boundary by feeding it both kinds of failure.

4.4.5 — Build Exercise (30 min)

(1) Build a validation-retry loop: when schema/Pydantic validation fails, send a follow-up with the original document, the failed extraction, and the specific error. (2) Test on Document A where the calculated total (£450) disagrees with the stated total (£500) — confirm the feedback retry corrects it. (3) Test on Document B missing a required field — confirm you flag it for a human rather than retrying (and watch a naive retry either loop or fabricate). (4) Add calculated_total alongside stated_total and a conflict_detected boolean to your schema so discrepancies surface automatically.
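One possible shape for the exercise, as a sketch under stated assumptions: call_model is a hypothetical callable standing in for a real model call, the field names are illustrative, and a null required field is taken to mean the value is absent from the document. The two fake models simulate Document A (recoverable) and Document B (absent info).

```python
import json
from typing import Callable, Optional, Tuple


def validate(extraction: dict) -> Optional[Tuple[str, str]]:
    """Return None if valid, else (kind, error); kind is 'absent' or 'recoverable'."""
    if extraction.get("invoice_number") is None:
        return ("absent", "invoice_number is null: the document does not state it")
    if extraction["calculated_total"] != extraction["stated_total"]:
        return (
            "recoverable",
            f"calculated_total ({extraction['calculated_total']}) does not match "
            f"stated_total ({extraction['stated_total']})",
        )
    return None


def extract_with_retry(document: str, call_model: Callable[[str], dict],
                       max_retries: int = 2) -> dict:
    """Retry recoverable failures with feedback; flag absent info immediately."""
    prompt = f"Extract the invoice fields from this document:\n{document}"
    for attempt in range(1, max_retries + 2):
        extraction = call_model(prompt)
        problem = validate(extraction)
        if problem is None:
            return {"status": "ok", "extraction": extraction, "attempts": attempt}
        kind, error = problem
        if kind == "absent":
            # Retrying cannot supply missing data and risks fabrication.
            return {"status": "needs_human", "error": error, "attempts": attempt}
        # Recoverable: resend the document, the failed output, and the error.
        prompt = (
            f"ORIGINAL DOCUMENT:\n{document}\n\n"
            f"YOUR PREVIOUS EXTRACTION:\n{json.dumps(extraction)}\n\n"
            f"VALIDATION ERROR:\n{error}\n\nReturn a corrected extraction."
        )
    # Retry budget exhausted on a recoverable error: escalate as well.
    return {"status": "needs_human", "error": error, "attempts": max_retries + 1}


def fake_model_a(prompt: str) -> dict:
    """Simulated model: sums wrong at first, corrects once it sees the error."""
    fixed = "VALIDATION ERROR" in prompt
    return {"invoice_number": "INV-1",
            "calculated_total": 500 if fixed else 450,
            "stated_total": 500}


def fake_model_b(prompt: str) -> dict:
    """Simulated model for a document that never states the invoice number."""
    return {"invoice_number": None, "calculated_total": 500, "stated_total": 500}


result_a = extract_with_retry("Document A text", fake_model_a)
result_b = extract_with_retry("Document B text", fake_model_b)
print(result_a)
print(result_b)
```

The max_retries cap is an assumed safeguard so that a recoverable error the model keeps repeating cannot loop forever; when the cap is hit, the sketch escalates to a human just as it does for absent data.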

Validation-retry verifies one document at a time. The next lesson, 4.5, scales up to VOLUME — the Message Batches API for processing thousands of documents cost-effectively, and when batch is (and isn't) the right choice.

ℹ️

Where this shows up on the exam

4.4 questions give a validation failure and ask how to handle it. Decide: recoverable (retry with specific feedback) or absent info (flag for human / null). Naive retry and retrying-for-missing-data are the distractors.

Key Takeaways

  • tool_use guarantees well-formed output but not correct output; validation-retry loops close that gap by catching and fixing SEMANTIC errors.
  • Retry-WITH-error-feedback sends the original document, the failed extraction, and the SPECIFIC validation error so the model self-corrects — a naive retry just reproduces the same mistake.
  • Retries CAN fix recoverable errors: format mismatches, structural errors, misplaced values, and math errors — where the right answer is derivable from the input.
  • Retries CANNOT fix information that is ABSENT from the source (or needs external knowledge) — flag those for a human or return null; retrying wastes effort and risks fabrication.
  • Design schemas for self-correction: extract calculated_total alongside stated_total to surface discrepancies; add conflict_detected booleans and detected_pattern fields.
  • Validation-retry targets SEMANTIC errors; JSON SYNTAX errors are already eliminated by tool_use (4.3) — the two lessons are partners (form vs substance).
  • When validation fails, first classify the cause (recoverable vs absent) and match the response accordingly.

Check Your Understanding

Test what you learned in this lesson.

Q1. Your extraction sometimes produces a total that doesn't match the line items. What's the most effective way to fix it on retry?

Q2. Validation fails because a required field's information simply isn't present in the source document. What should you do?

Q3. Which type of error can a validation-retry loop NOT fix?

Q4. How can you design an extraction schema to make a total discrepancy easy to catch?
