Tool Selection Reliability & Debugging

Core

Design effective tool interfaces with clear descriptions and boundaries · Difficulty 3/5

Tags: tool-selection · reliability · debugging · diagnostics

When an agent consistently selects the wrong tool, systematic diagnostic steps can identify and fix the root cause.

Diagnostic Order

  • Check tool descriptions first: Are they clear and differentiated?
  • Check system prompt: Are there keyword-sensitive instructions causing unintended routing?
  • Add few-shot examples: Target ambiguous scenarios with reasoning
  • Add programmatic guardrails: For critical sequences, enforce ordering in code
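The last step, enforcing a critical sequence in code rather than relying on the model, could look roughly like the sketch below. The `SequenceGuard` class and the tool names `verify_identity` and `issue_refund` are hypothetical, chosen only for illustration; the idea is that the dispatcher refuses a call whose prerequisites have not yet run.

```python
class ToolOrderError(Exception):
    """Raised when a tool is called before its required predecessor."""


class SequenceGuard:
    """Tracks completed tools and rejects out-of-order calls (illustrative only)."""

    def __init__(self, prerequisites):
        # Maps a tool name to the list of tools that must have run before it.
        self.prerequisites = prerequisites
        self.completed = set()

    def check(self, tool_name):
        # Collect any prerequisite that has not been recorded as completed.
        missing = [
            t for t in self.prerequisites.get(tool_name, [])
            if t not in self.completed
        ]
        if missing:
            raise ToolOrderError(
                f"{tool_name} requires {', '.join(missing)} first"
            )

    def record(self, tool_name):
        # Call this after the tool has actually executed successfully.
        self.completed.add(tool_name)


# Hypothetical rule: a refund may only be issued after identity verification.
guard = SequenceGuard({"issue_refund": ["verify_identity"]})
```

In a dispatcher, `guard.check(name)` would run before executing the tool and `guard.record(name)` after it succeeds, so the ordering holds regardless of what the model selects.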
System Prompt Keyword Bias

Keyword-sensitive instructions in the system prompt can override well-written tool descriptions. A large gap in selection rates is the telltale sign: if tool A is called 78% of the time when keyword X appears but only 7% of the time without it, look in the system prompt for an instruction keyed on X before rewriting the tool descriptions.

Example: an instruction such as "When the user mentions 'account'..." can create an unintended tool association that overrides the tool's own description.
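One way to measure this kind of gap is to split logged interactions by whether the keyword appears and compare how often the tool was selected in each group. The sketch below assumes a hypothetical log format (a list of dicts with `message` and `tool` keys) and uses made-up entries; it is not tied to any particular agent framework.

```python
def selection_rate_by_keyword(log, tool, keyword):
    """Return (rate_with_keyword, rate_without_keyword) for `tool`.

    A rate is None when its group has no entries.
    """
    kw = keyword.lower()
    with_kw = [e for e in log if kw in e["message"].lower()]
    without_kw = [e for e in log if kw not in e["message"].lower()]

    def rate(entries):
        if not entries:
            return None
        # Fraction of entries in this group where `tool` was selected.
        return sum(e["tool"] == tool for e in entries) / len(entries)

    return rate(with_kw), rate(without_kw)


# Illustrative log entries (assumed data, not real measurements).
log = [
    {"message": "I can't log in to my account", "tool": "get_customer"},
    {"message": "Where is the order on my account?", "tool": "get_customer"},
    {"message": "Where is my order?", "tool": "lookup_order"},
    {"message": "Has my package shipped?", "tool": "lookup_order"},
]

with_rate, without_rate = selection_rate_by_keyword(log, "get_customer", "account")
```

A wide difference between the two rates suggests the keyword itself is driving routing, which points toward the system prompt rather than the tool descriptions.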

Few-Shot for Ambiguity

Provide 4-6 examples targeting the specific ambiguous cases, with reasoning about why one tool was chosen over plausible alternatives.
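A minimal sketch of what such examples might look like as data, with a helper that renders them into a prompt section. The example messages, the `User:`/`Tool:`/`Why:` layout, and the `render_examples` helper are assumptions made for illustration; the tool names are borrowed from the quiz scenario in this lesson.

```python
# Each example pairs an ambiguous request with the chosen tool and the reason
# a plausible alternative was rejected. All content here is illustrative.
FEW_SHOT_EXAMPLES = [
    {
        "user": "What's the status of the order on my account?",
        "tool": "lookup_order",
        "reasoning": "Mentions 'account', but the question is about an order, "
                     "so get_customer would not answer it.",
    },
    {
        "user": "Can you update the email on my account?",
        "tool": "get_customer",
        "reasoning": "The request concerns customer details, not any order.",
    },
    {
        "user": "Did my last purchase ship yet?",
        "tool": "lookup_order",
        "reasoning": "Shipping status belongs to the order record, even though "
                     "no order number is given.",
    },
    {
        "user": "Which address do you have on file for me?",
        "tool": "get_customer",
        "reasoning": "The address on file is customer data; no specific order "
                     "is referenced.",
    },
]


def render_examples(examples):
    """Format the examples as plain text for inclusion in a prompt."""
    lines = []
    for ex in examples:
        lines.append(f'User: {ex["user"]}')
        lines.append(f'Tool: {ex["tool"]}')
        lines.append(f'Why: {ex["reasoning"]}')
        lines.append("")  # blank line between examples
    return "\n".join(lines).strip()


prompt_section = render_examples(FEW_SHOT_EXAMPLES)
```

Keeping the reasoning next to each example makes it explicit why the competing tool was rejected, which is the part that addresses the ambiguity.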

Review Checklist

  • Are any two tools' descriptions similar enough to confuse the model?
  • Does the system prompt mention keywords that might bias tool selection?
  • Are there examples covering the ambiguous edge cases?
  • Do critical sequences have programmatic enforcement?
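The first checklist item can be partly automated. The sketch below flags pairs of tool descriptions with high word overlap (Jaccard similarity). This is a crude lexical proxy, not a measure of how a model actually interprets the descriptions; the sample descriptions, the `process_refund` tool, and the 0.4 threshold are assumptions for illustration.

```python
import re
from itertools import combinations


def tokens(text):
    """Lowercased set of alphanumeric words in a description."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similar_pairs(descriptions, threshold=0.4):
    """Return (tool_a, tool_b, score) for pairs at or above the threshold."""
    flagged = []
    for (a, desc_a), (b, desc_b) in combinations(descriptions.items(), 2):
        ta, tb = tokens(desc_a), tokens(desc_b)
        union = ta | tb
        # Jaccard similarity: shared words divided by all distinct words.
        score = len(ta & tb) / len(union) if union else 0.0
        if score >= threshold:
            flagged.append((a, b, round(score, 2)))
    return flagged


# Illustrative descriptions (assumed, deliberately terse).
descriptions = {
    "get_customer": "Retrieves customer information",
    "lookup_order": "Retrieves order information",
    "process_refund": "Issues a refund to the original payment method "
                      "for a verified order",
}

flagged = similar_pairs(descriptions)
```

Flagged pairs are candidates for rewriting so that each description states what the tool returns and when it should be preferred over its neighbor.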
Key Takeaways

  • Diagnose tool selection issues in order: descriptions first, then the system prompt, then few-shot examples, then programmatic guardrails
  • Keyword-sensitive instructions in the system prompt can override tool descriptions
  • Use few-shot examples targeting specific ambiguous scenarios
  • Review system prompts for keyword bias that creates unintended tool associations

Test Yourself

In testing, you notice the agent frequently calls get_customer when users ask about order status, even though lookup_order would be more appropriate. What should you examine first to address this issue?