Self-Critique Limitations & Independent Review

Core

Design multi-instance and multi-pass review architectures · Difficulty 3/5

self-critique · review · confirmation-bias

The Evaluator-Optimizer Pattern uses a second pass to evaluate and improve the first pass's output. However, self-review in the same context has fundamental limitations.
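For concreteness, a minimal sketch of that same-context loop is shown below, using the Anthropic Python SDK; the model id, prompts, and function name are illustrative assumptions rather than a prescribed implementation.

    # A minimal sketch of the Evaluator-Optimizer pattern in a single conversation:
    # the same context generates, then critiques and revises its own output.
    # Model id, prompts, and the function name are assumptions for illustration.
    import anthropic

    client = anthropic.Anthropic()          # reads ANTHROPIC_API_KEY from the environment
    MODEL = "claude-sonnet-4-20250514"      # assumed model id; substitute your own

    def evaluator_optimizer(task: str) -> str:
        # Pass 1: generate a first draft
        history = [{"role": "user", "content": task}]
        draft = client.messages.create(model=MODEL, max_tokens=2048, messages=history)
        history.append({"role": "assistant", "content": draft.content[0].text})

        # Pass 2: critique and improve in the SAME conversation.
        # The model still sees its own generation turn, which is the source of the
        # confirmation bias discussed in the next section.
        history.append({
            "role": "user",
            "content": "Critique the answer above, then produce an improved version.",
        })
        revised = client.messages.create(model=MODEL, max_tokens=2048, messages=history)
        return revised.content[0].text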

Self-Review Limitations

When Claude generates code or analysis, it retains reasoning context from generation. This makes it less likely to question its own decisions in the same session -- a form of confirmation bias. The model may rationalize edge cases it considered and dismissed during generation.

Independent Review Instances

A second Claude instance evaluating WITHOUT the generator's reasoning context provides a fresh perspective:

  • Catches issues the first instance rationalized away
  • More effective than self-review instructions or extended thinking
  • Mirrors the benefits of human peer review

Pattern

    Instance 1 (Generator): Produce code/analysis
    Instance 2 (Reviewer): Review output WITHOUT seeing Instance 1's reasoning
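By contrast, a minimal sketch of the two-instance version is shown below, again using the Anthropic Python SDK; the model id and prompts are illustrative assumptions. The key point is that review() starts a fresh conversation that sees only the artifact, never the generator's reasoning or history.

    # A minimal sketch of independent review: a second instance with a fresh context
    # reviews only the artifact. Model id and prompts are illustrative assumptions.
    import anthropic

    client = anthropic.Anthropic()
    MODEL = "claude-sonnet-4-20250514"  # assumed model id

    def generate(task: str) -> str:
        """Instance 1 (Generator): produce the code or analysis."""
        response = client.messages.create(
            model=MODEL,
            max_tokens=2048,
            messages=[{"role": "user", "content": task}],
        )
        return response.content[0].text

    def review(artifact: str) -> str:
        """Instance 2 (Reviewer): a fresh conversation that never sees the
        generator's reasoning or conversation history -- only its output."""
        prompt = (
            "Review the following output produced by another model. "
            "List concrete defects, risky assumptions, and missed edge cases.\n\n"
            + artifact
        )
        response = client.messages.create(
            model=MODEL,
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.content[0].text

    draft = generate("Write a Python function that merges two sorted lists.")
    findings = review(draft)  # independent pass: no shared context with generate()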

Verification Passes with Confidence

Run verification passes where the model self-reports confidence alongside each finding. This enables calibrated review routing, as sketched below:

  • High-confidence findings go directly to the developer
  • Low-confidence findings get additional review passes
  • Confidence calibration improves over time with feedback data
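One way to wire up that routing is sketched below; the Finding shape, the 0.8 threshold, and the example findings are assumptions for illustration, and the threshold would be tuned against the feedback data mentioned above.

    # A minimal sketch of confidence-based routing for review findings.
    # The Finding shape, threshold, and example data are illustrative assumptions.
    from dataclasses import dataclass

    @dataclass
    class Finding:
        description: str
        confidence: float  # self-reported by the reviewing model, in [0.0, 1.0]

    CONFIDENCE_THRESHOLD = 0.8  # assumed cutoff; tune against feedback data

    def route(findings: list[Finding]) -> tuple[list[Finding], list[Finding]]:
        """Split findings into those sent straight to the developer and
        those queued for an additional review pass."""
        to_developer = [f for f in findings if f.confidence >= CONFIDENCE_THRESHOLD]
        needs_review = [f for f in findings if f.confidence < CONFIDENCE_THRESHOLD]
        return to_developer, needs_review

    findings = [
        Finding("SQL query is vulnerable to injection", confidence=0.95),
        Finding("Retry loop may not back off correctly", confidence=0.55),
    ]
    direct, extra_pass = route(findings)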
Key Takeaways

  • Self-review in the same context suffers from confirmation bias -- the model retains its generation reasoning
  • Independent review instances (without the prior reasoning) catch issues that self-review misses
  • This pattern is more effective than self-review instructions or extended thinking
  • Add confidence self-reporting to findings for calibrated review routing

Test Yourself (1 of 1)

Production metrics show that when your agent resolves complex cases involving billing disputes or multi-order returns, customer satisfaction scores are 15% lower than for simple cases -- even when the resolution is technically correct. Root cause analysis reveals the agent provides accurate resolutions but inconsistently explains the reasoning: sometimes omitting relevant policy details, other times missing timeline information or next steps. The specific context gaps vary by case. You want to improve resolution quality without adding human review overhead. Which approach is most effective?