Glossary

A Claude Code CLI flag that controls the response format. Values: 'text' (default, plain text), 'json' (structured JSON), 'stream-json' (streaming JSON events). Use with --json-schema to guarantee output matches a specific schema. Critical for CI/CD pipeline integration.

-p / --print Flag

Claude Code CLI flag for non-interactive (headless) execution. Processes the prompt, outputs to stdout, and exits immediately without entering an interactive session. Essential for CI/CD pipeline integration, scripting, and automation workflows.

.

.claude/settings.json

The Claude Code settings file that configures tool permissions, hook scripts, environment variables, and behavioral settings. Project-scoped (.claude/settings.json) checked into version control, or user-scoped (~/.claude/settings.json) for personal preferences. Hooks are defined here.

.mcp.json

The project-scoped MCP configuration file placed in the repository root. Defines which MCP servers are available for the project, their commands, arguments, and environment variable bindings. Checked into version control to share server configuration with the team. Supports ${ENV_VAR} expansion for credentials.

@

@import Directive

A CLAUDE.md directive that includes the contents of another markdown file. Enables modular CLAUDE.md organization where large configuration is split into focused files (e.g., @import docs/api-conventions.md). Reduces duplication when multiple CLAUDE.md files share common instructions.

A

AgentDefinition

An Agent SDK configuration object that specifies a subagent's identity, capabilities, available tools, and behavioral constraints. Used to define reusable specialized agents that can be invoked consistently across different parts of a system.

Agentic Loop

The core pattern for AI agents: call Claude, check stop_reason, if 'tool_use' execute the requested tools and append results, then call Claude again. Repeat until stop_reason is 'end_turn'. The number of loop iterations is determined dynamically by task complexity.

allowed-tools

A skill frontmatter field that whitelists which tools a skill can use during its execution. Enforces the principle of least privilege for skills. Use specific tool names or MCP server patterns (e.g., 'mcp__github__*' allows all GitHub MCP tools).

B

budget_tokens

A parameter within the extended thinking configuration that sets the maximum tokens Claude can use for its internal reasoning process. Higher budgets allow more thorough reasoning but increase cost and latency. Must be at least 1024.

C

Cache TTL

The time-to-live for cached prompt content in Claude's prompt caching system. The cache entry is refreshed (TTL resets) each time the cached content is used. If unused beyond the TTL window, the cache entry expires and the content must be re-processed and billed at full rate.

Chain-of-Thought Prompting

A prompting strategy that instructs Claude to reason step-by-step before providing a final answer. Improves accuracy on complex reasoning tasks by making intermediate steps explicit. Can be triggered by instructions like 'think step by step' or via extended thinking.

Claude Agent SDK

Anthropic's Python SDK for building agentic applications with Claude. Provides primitives for sessions, subagent orchestration, tool integration, and lifecycle hooks. Imported as 'claude_agent_sdk'. Simplifies building multi-agent systems compared to raw API orchestration.

Claude Haiku

Models

The fastest and most cost-effective Claude model tier, optimized for high-throughput, low-latency tasks like classification, extraction, and simple Q&A. Ideal as a routing model to assess complexity before delegating to a more capable tier.

Claude Opus

Models

The most capable Claude model tier, excelling at complex reasoning, nuanced analysis, and creative tasks. Highest accuracy but most expensive and slowest. Best suited for tasks where quality outweighs cost and latency concerns.

Claude Sonnet

Models

The balanced Claude model tier offering a good trade-off between capability, speed, and cost. Recommended default for most production applications. Handles the majority of complex tasks effectively without Opus-level expense.

CLAUDE.md

A markdown configuration file read by Claude Code at startup that provides persistent context, instructions, project conventions, and tool configurations. Supports a hierarchy: ~/.claude/CLAUDE.md (global), project root CLAUDE.md, and directory-specific CLAUDE.md files. Uses @import for modular organization.

Confidence Calibration

The practice of having Claude estimate and report its confidence in its output, then using that estimate to determine whether to proceed autonomously or escalate. Requires explicit confidence scoring in prompts and defined thresholds for escalation vs. auto-approval.

Content Block

The structured units that make up Claude's response. Types include: 'text' (plain text response), 'tool_use' (a request to call a tool with specific inputs), and 'thinking' (internal reasoning when extended thinking is enabled). A single response can contain multiple content blocks.

Context Compression

Techniques for reducing context window usage while preserving essential information. Strategies include: progressive summarization (replace old turns with summaries), selective retention (keep only relevant messages), and external storage (move history to a database with retrieval on demand).

Context Window

The maximum amount of text (measured in tokens) that Claude can process in a single request. Includes both input tokens (prompt, history, tool results) and output tokens. Exceeding the context window causes an error or requires context management strategies.

context: fork

A skill frontmatter option in Claude Code that runs the skill in an isolated sub-agent context with its own conversation history. Prevents verbose analysis or exploration output from polluting the main conversation context. The forked context is discarded when the skill completes.

Coordinator/Orchestrator Pattern

A multi-agent architecture where a central coordinator manages specialized subagents using hub-and-spoke communication. The coordinator handles task decomposition, routing, and result aggregation. Subagents never communicate directly with each other, keeping the system auditable.

custom_id

A field in the Message Batches API request body that lets you correlate each batch request with its response. Must be unique within a batch. Essential for matching asynchronous results back to originating requests when processing thousands of items.

E

end_turn

A stop_reason value indicating Claude finished its response naturally without hitting a limit or requesting a tool. In agentic loops, this signals the loop should stop and the final response should be presented to the user.

Error Propagation

The risk in multi-agent systems where an error or incorrect output from one subagent contaminates downstream agents, amplifying the mistake. Mitigated by validating subagent outputs before passing them forward and designing subagents to return structured error types rather than silent failures.

Escalation Pattern

A reliability pattern where the agent recognizes conditions it cannot handle autonomously and escalates to a human or higher-capability system. Escalation triggers include: conflicting data sources, low confidence scores, ambiguous requirements, or irreversible high-stakes actions.

Evaluator-Optimizer Pattern

A two-pass architecture where a generator produces output and a separate evaluator assesses it against explicit criteria. For true quality assurance, the evaluator must be a separate Claude instance with independent context — using the same instance creates confirmation bias.

Extended Thinking

A Claude capability that allows the model to reason through complex problems step-by-step in a dedicated thinking block before producing its final response. Controlled via the 'thinking' parameter with a 'budget_tokens' limit. Thinking tokens are billed but improve accuracy on hard reasoning tasks.

F

Few-Shot Prompting

A prompting technique that provides concrete examples of desired input-output behavior to guide the model. More reliable than instructions alone for consistent formatting and nuanced decisions. Examples should target ambiguous edge cases, not just obvious ones.

fork_session

An Agent SDK operation that creates a copy of the current session state, allowing parallel exploration of different solution paths without affecting the original session. Each fork can proceed independently; results can be compared and the best chosen.

FSRS (Free Spaced Repetition Scheduler)

Platform

A modern spaced repetition algorithm used in Claude Architect Lab to schedule concept reviews. Tracks four parameters per card: stability (how long memory persists), difficulty (how hard the card is), retrievability (current recall probability), and due date. More accurate than older algorithms like SM-2.

G

Graceful Degradation

A system design principle where partial failures result in reduced functionality rather than complete failure. In multi-agent systems, if one subagent fails, the coordinator produces a partial result with a clear explanation of what is missing rather than returning an error to the user.

H

Hooks (Claude Code)

Shell scripts or commands configured in .claude/settings.json that run at defined lifecycle points: PreToolUse (before tool execution), PostToolUse (after tool execution), Stop (before ending), SubagentStop (when subagent finishes). Used for code quality gates, notifications, logging, and safety checks.

Human-in-the-Loop

A design pattern that interrupts the agentic loop at defined checkpoints to request human review or approval before proceeding. Used for high-stakes decisions, irreversible actions, or cases where confidence is below threshold. Balances automation with oversight.

I

Information Provenance

The tracking of which source each piece of information in an agent's output came from. Critical for multi-source agents where conflicting data must be attributed. Preserved via claim-source mappings that travel with data through the pipeline.

Iterative Refinement

A Claude Code workflow pattern that builds solutions incrementally through small, verifiable steps rather than attempting complete implementation in one pass. Each step produces testable output; failures are caught early. Pair with test-driven iteration for maximum reliability.

L

Least Privilege (Tool Access)

A security principle applied to agent tool design: give each agent and subagent only the minimum tools required to complete its specific task. Reduces blast radius if an agent is compromised or makes an error. Implemented via AgentDefinition tool lists and skill allowed-tools.

Lost in the Middle Effect

The observed phenomenon where Claude (and other LLMs) give less attention to content in the middle of a long context window compared to content at the beginning and end. Critical information should be placed at the start (system prompt) or end (most recent user turn) of the context.

M

max_tokens

API parameter that sets the maximum number of tokens Claude will generate in its response. If the response would exceed this limit, it is truncated and stop_reason is set to 'max_tokens'. Always check stop_reason to detect truncation.

MCP Client

An application that connects to MCP servers to access their tools, resources, and prompts. Claude Code and the Anthropic API act as MCP clients. Responsible for discovering available capabilities, formatting requests, and handling MCP protocol messages.

MCP Inspector

An official Anthropic debugging tool for MCP server development. Provides a UI to connect to any MCP server, list its tools/resources/prompts, execute calls, and inspect responses. Essential for testing MCP server implementations during development.

MCP Primitives

The three fundamental building blocks of the MCP protocol: Tools (model-controlled actions Claude can invoke), Resources (application-controlled data Claude can read), and Prompts (user-controlled templates for common interactions). Each serves a distinct control model.

MCP Prompts

The user-controlled MCP primitive. Pre-defined prompt templates that users explicitly trigger via the application UI (e.g., a '/summarize' command). The user chooses when to apply them. Prompts can be parameterized and support autocomplete via the MCP completion endpoint.

MCP Resources

The application-controlled MCP primitive. The host application determines what data to provide to Claude by reading resources. Resources are identified by URIs and can be text, JSON, binary data, or template-generated content. Claude reads but does not autonomously request resources.

MCP Roots

An MCP security mechanism where clients define allowed filesystem paths (roots) that servers can access. Servers request which roots they need; clients enforce boundaries. Prevents MCP servers from accessing files outside designated directories.

MCP Sampling

An advanced MCP capability that allows MCP servers to request LLM completions through the client, enabling servers to use AI without direct API access. The client controls model selection, permissions, and billing. Used for AI-powered tool implementations that need their own Claude calls.

MCP Server

A process that implements the MCP protocol and exposes tools, resources, and prompts to MCP clients. Can be built with official SDKs (Python, TypeScript). Deployed locally (stdio transport) or remotely (StreamableHTTP). Claude Code auto-discovers servers configured in .mcp.json.

MCP Tools (Primitive)

The model-controlled MCP primitive. Claude autonomously decides when to invoke MCP tools based on task requirements. Tools are model-controlled because the AI determines when they are appropriate — the application does not schedule them. Examples: run_query, create_issue, send_email.

Message Batches API

An asynchronous Claude API for processing multiple requests in a batch with 50% cost savings versus synchronous requests. Processing takes up to 24 hours with no guaranteed latency SLA. Does not support iterative tool use, streaming, or prompt caching. Best for scheduled, non-blocking analysis.

Model Context Protocol (MCP)

An open standard protocol for connecting Claude to external tools and data sources. Defines a client-server architecture where MCP servers expose tools, resources, and prompts that Claude clients can discover and use. Supports project-scoped (.mcp.json) and user-scoped configurations.

Model Routing

The practice of directing requests to different Claude model tiers based on assessed complexity and requirements. A common pattern uses a fast, cheap model (Haiku) to classify task complexity, then routes to Sonnet or Opus accordingly.

P

Parallel Tool Use

Claude's ability to request multiple tool calls in a single response by returning multiple tool_use blocks. All requested tools can be executed concurrently, then all tool_result blocks are returned together. Significantly reduces the number of API round-trips for independent operations.

Path-Specific Rules

CLAUDE.md files placed in subdirectories (.claude/rules/ or directly in subdirectories) that override or extend root-level CLAUDE.md for that directory subtree. Enables context-sensitive behavior: different rules for tests/, docs/, and src/ without one massive root configuration.

Plan Mode

A Claude Code execution mode for exploration and analysis before making changes. Claude reads and analyzes but does not write files or execute commands. Use when requirements are ambiguous, multiple valid approaches exist, or decisions have significant architectural implications.

PostToolUse Hook

An Agent SDK lifecycle hook that intercepts tool results before the agent processes them. Can normalize, enrich, or transform results from multiple tools into a consistent format. Works with both custom and third-party MCP tools without modifying their source code.

PreToolUse Hook

An Agent SDK lifecycle hook that intercepts tool calls before execution. Can inspect, modify, or block the call. Used for access control, parameter sanitization, rate limiting, and audit logging. Runs synchronously before the tool executes.

Prompt Caching

A Claude API feature that caches frequently-used prompt content (system prompts, large documents, tool definitions) to reduce cost and latency on repeated API calls. Cached tokens are billed at a discounted rate. Cache has a TTL that resets on each use. Must be enabled by marking content with cache_control.

Prompt Injection

An attack where malicious content in external data (web pages, documents, user input) attempts to override the system prompt or hijack Claude's behavior. Mitigation: use XML tags to separate untrusted content from instructions, validate outputs, apply least-privilege tool access.

R

Response Prefill

A technique where you begin Claude's response by adding a partial assistant turn before the API call. Claude continues from that starting point, allowing precise control over response format, structure, and starting content. Useful for forcing JSON or specific syntax.

Retrieval-Augmented Generation (RAG)

A pattern that dynamically retrieves relevant information from an external knowledge base and injects it into the context window based on the current query. Allows Claude to reason over large document sets without fitting everything in context at once.

Role Assignment

The practice of defining Claude's identity and expertise in the system prompt to anchor its behavior. Roles like 'You are an expert security auditor' improve response quality for domain-specific tasks by activating relevant knowledge and behavioral patterns.

Rolling Window

A context management strategy that maintains only the N most recent conversation turns in the active context, discarding older turns entirely. Simple to implement but risks losing important earlier context. Best combined with summarization of discarded content.

S

Self-Critique

A pattern where Claude reviews its own output before finalizing it. Useful for catching obvious errors but limited: Claude tends to confirm its own reasoning due to anchoring bias. For high-stakes verification, use a separate Claude instance with independent context.

Session

A stateful context in the Agent SDK that maintains conversation history, tool results, and agent state across multiple turns. Sessions can be persisted, resumed, and forked. Session state management is critical for long-running tasks that may span interruptions.

Skill Frontmatter

YAML configuration at the top of a Claude Code skill file (SKILL.md) that controls how the skill is triggered and executed. Key fields: 'description' (natural language trigger for automatic activation), 'allowed-tools' (whitelist of permitted tools), 'context' (fork/current/none).

Skills (Claude Code)

Reusable markdown instruction files with YAML frontmatter that define custom slash commands in Claude Code. Invoked with /skill-name. Frontmatter configures: description (for trigger matching), allowed-tools (tool restrictions), and context (fork for isolation). Stored in .claude/skills/.

Spaced Repetition

Platform

A learning technique that schedules review of concepts at increasing intervals based on recall performance. Highly effective for long-term retention. Claude Architect Lab implements spaced repetition via the FSRS algorithm across all 150+ exam concepts in the review queue.

stdio Transport

An MCP transport mechanism where the client launches the server as a subprocess and communicates via stdin/stdout pipes. Ideal for local development and trusted single-user environments. Simple to set up but limited to same-machine deployment.

Stop Hook

An Agent SDK lifecycle hook that runs when the agent reaches an end_turn stop condition. Can inspect the final response and decide whether to allow the stop or inject additional instructions to continue the loop. Used for output validation and quality gates.

stop_reason

A field in the Claude API response indicating why the model stopped generating. Values: 'end_turn' (natural completion), 'max_tokens' (hit limit), 'stop_sequence' (hit custom stop), 'tool_use' (wants to call a tool). The primary signal for controlling agentic loops.

stop_sequences

An API parameter that provides a list of strings that, when encountered in the response, cause Claude to stop generating immediately. The matched string is not included in the response. Useful for enforcing structured output boundaries or workflow step delimiters.

StreamableHTTP Transport

An MCP transport mechanism that communicates over HTTP, supporting both request-response and server-sent events for streaming. Suitable for remote deployment, multi-user environments, and production services. Preferred for enterprise MCP servers that need to serve multiple clients.

Streaming

An API mode where Claude sends partial response tokens as they are generated, rather than waiting for the full response. Reduces perceived latency for users. Use server-sent events (SSE) to consume the stream. Not available with the Message Batches API.

Structured Error Response Design

The practice of returning structured, actionable error information from tools rather than generic error strings. Well-designed error responses include: error type, what went wrong, what Claude should try next. Prevents Claude from retrying the same failing approach repeatedly.

Structured Output

Guaranteed formatted output (typically JSON) from Claude. The most reliable method is to define a schema as a tool and set tool_choice to force its use — Claude's tool_use blocks are always valid JSON. Alternatively, use --output-format json with --json-schema in Claude Code CLI.

Subagent

A specialized Claude instance invoked by an orchestrator to handle a specific subtask. Runs in its own isolated context window, preventing context contamination. Returns a structured result to the orchestrator. Invoked via the Task tool in the Agent SDK.

SubagentStop Hook

An Agent SDK lifecycle hook that fires when a subagent finishes its task. Allows the orchestrator to inspect and post-process subagent results before they are returned to the parent session. Used for result validation and format normalization.

Summarization Strategy

A context management approach where older conversation turns or documents are condensed into shorter summaries. Risk: summarization is lossy and may discard details that become relevant later. Best practice: summarize completed sub-tasks, not ongoing work. Never summarize tool results that may need exact values.

System Prompt

The initial instruction set provided to Claude that defines its behavior, role, constraints, and operational context for an entire conversation. Set via the 'system' parameter in the API. Processed before the user turn and shapes all subsequent responses.

T

Task Decomposition

The process of breaking a complex task into smaller, independently executable subtasks that can be assigned to specialized subagents or processed sequentially. Good decomposition creates subtasks with clear boundaries, independent execution, and verifiable outputs.

Task Tool

The built-in Agent SDK tool used to invoke a subagent. Takes a task description, available tools, and optional context. The subagent runs its own agentic loop in an isolated context and returns its final result. The primary mechanism for multi-agent delegation.

Temperature

API parameter (0.0-1.0) that controls the randomness of Claude's output. Lower values (0.0) produce deterministic, consistent responses ideal for classification and extraction. Higher values increase creativity and variety for generative tasks.

Test-Driven Iteration

A Claude Code workflow where tests are written or identified before implementation, and each iteration is verified by running the test suite. Claude uses test failures as feedback to correct its approach. The 'interview pattern' involves asking clarifying questions before writing any code.

Token

The fundamental unit of text processing for Claude. Roughly 3-4 characters or about 0.75 words in English. Used for measuring input, output, and context window size. Costs are calculated per input and output token.

Tool Interface Design

The practice of writing tool definitions (name, description, input schema) that enable Claude to reliably select and use tools correctly. Key principles: precise descriptions that distinguish similar tools, explicit input format requirements, clear boundary examples, and documented error return formats.

tool_choice

API parameter controlling how Claude selects tools. 'auto' (default): Claude decides whether to use tools. 'any': Claude must use at least one tool. 'none': Claude cannot use tools. '{type: tool, name: X}': Claude must use the specific named tool. Used to force structured output via a schema tool.

tool_result

A content block type in the user message that returns the output of a tool execution back to Claude. Must include the 'tool_use_id' matching the original tool_use block. Can be text, images, or error messages. Claude processes the result and continues reasoning.

tool_use

A content block type in Claude's response indicating the model wants to call a specific tool. Contains 'id', 'name', and 'input' fields. The agent must execute the tool and return results in a tool_result content block for the conversation to continue.

V

Validation Loop (Retry-with-Feedback)

A pattern where structured output that fails validation is sent back to Claude with the specific error, requesting a correction. More effective than silent retries because Claude uses the error feedback to understand and fix the problem. Typically capped at 2-3 retries.

X

XML Tags in Prompts