The Agent SDK embeds Claude Code's abilities into your application
The Claude Agent SDK is best understood as Claude Code packaged as a programmable library. With the regular Claude API, your application usually sends a message and receives text; if tools are involved, you implement the tool loop yourself. The Agent SDK packages Claude Code's agent loop, built-in tools, context handling, permission checks, and event stream behind Python and TypeScript APIs.
That lets your program launch an agent that can inspect files, search a codebase, edit files, run commands, call MCP tools, observe the results, and continue working. You still define the goal, tool boundaries, permission model, and runtime environment, but you no longer need to build a durable agent loop from scratch.
A simple decision rule: if your app needs a single model response, use the Anthropic Client SDK. If your app needs Claude to keep taking actions inside your environment, use the Agent SDK.
It is not the regular Client SDK, and it is not a hosted agent service
The Claude ecosystem has several entry points that sound similar. Separating them makes the choice much clearer.
| Entry point | What it is and when to use it |
|---|---|
| Anthropic Client SDK | A general client for the Messages API. You construct each turn, manage conversation state, run tools, retry requests, enforce permissions, and audit behavior. Best for chat, structured output, and lightweight tool use. |
| Claude Code CLI | A terminal tool for human developers. Best for everyday coding, bug fixing, project explanation, and one-off automation. |
| Claude Agent SDK | A way to embed Claude Code's tools and loop inside your application. Best for CI bots, internal platforms, code agents, research agents, and automation workflows. |
| Managed Agents | Anthropic-hosted agent infrastructure. The Agent SDK runs in your own process and environment; Managed Agents move more runtime, sandboxing, and event history into the platform. |
So the central design question for the Agent SDK is not "how do I send one message?" It is "what am I willing to let this agent do, in which environment, to what depth, and with what audit trail?"
The agent loop: from goal, to tools, back to Claude
The Agent SDK executes a loop, not a single request. A typical task moves through these stages:
Initialization
The SDK starts the Claude Code agent, loads the system prompt, tool definitions, working directory, setting sources, and session information, then emits an init event.
Model evaluation
Claude decides the next step from the current context: answer directly, or request one or more tool calls.
Tool execution
The SDK checks permissions, runs tools such as Read, Edit, Bash, or MCP tools, and returns the tool results to Claude as new context.
Loop continuation
Claude reads the result, keeps analyzing, editing, testing, or searching, and continues until it has no new tool calls.
Result collection
The SDK emits the final answer and a ResultMessage that includes the success or interruption reason, output text, session id, token usage, and estimated cost.
This is why production integrations should set maxTurns / max_turns, maxBudgetUsd / max_budget_usd, permission mode, timeouts, and audit logging. The more capable the agent is, the clearer its boundaries need to be.
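The stages above can be sketched as a small loop. This is a simplified model for intuition only, not the SDK's implementation: `run_loop`, `Step`, `scripted_model`, and `fake_tools` are hypothetical names, the "model" is a scripted function, and the turn cap stands in for max_turns.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Step:
    """One model decision: tool calls to run, or a final answer."""
    tool_calls: list = field(default_factory=list)  # list of (tool name, argument)
    answer: Optional[str] = None


def run_loop(decide: Callable[[list], Step], tools: dict, max_turns: int) -> dict:
    """Drive a decide -> execute -> feed-back loop until an answer or the turn cap."""
    context: list = []
    for turn in range(1, max_turns + 1):
        step = decide(context)              # model evaluation
        if not step.tool_calls:             # no new tool calls: the loop ends
            return {"status": "success", "turns": turn, "result": step.answer}
        for name, arg in step.tool_calls:   # tool execution
            output = tools[name](arg)
            context.append(f"{name}({arg}) -> {output}")  # result becomes new context
    return {"status": "max_turns_reached", "turns": max_turns, "result": None}


def scripted_model(context: list) -> Step:
    """Scripted stand-in for the model: read one file, then answer."""
    if not context:
        return Step(tool_calls=[("Read", "utils.py")])
    return Step(answer=f"Reviewed {len(context)} tool result(s).")


fake_tools = {"Read": lambda path: f"<contents of {path}>"}
outcome = run_loop(scripted_model, fake_tools, max_turns=8)
print(outcome)
```

A model that never stops requesting tools would exit through the second return, which is the case the turn and budget limits exist for.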
Install the SDK and run the first agent
Prepare the runtime first: TypeScript needs Node.js 18+; Python needs Python 3.10+. Then set an API key. The official path is to use an ANTHROPIC_API_KEY from the Anthropic Console or the authentication mechanism described in the current Anthropic documentation.
TypeScript install
mkdir my-agent
cd my-agent
npm init -y
npm install @anthropic-ai/claude-agent-sdk tsx
export ANTHROPIC_API_KEY=your-api-key
Python install
mkdir my-agent
cd my-agent
python3 -m venv .venv
source .venv/bin/activate
pip install claude-agent-sdk
export ANTHROPIC_API_KEY=your-api-key
Python: let the agent read and edit files
The example below asks Claude to inspect utils.py and fix crash-prone edge cases. allowed_tools pre-approves reading, locating, and editing files; permission_mode="acceptEdits" lets ordinary edits proceed automatically.
import asyncio

from claude_agent_sdk import (
    query,
    ClaudeAgentOptions,
    AssistantMessage,
    ResultMessage,
)


async def main():
    async for message in query(
        prompt="Review utils.py for crash bugs. Fix the issues you find.",
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Edit", "Glob"],  # pre-approved tools
            permission_mode="acceptEdits",  # ordinary edits proceed automatically
            max_turns=8,  # cap on agent-loop turns
            max_budget_usd=0.75,  # cap on estimated cost
        ),
    ):
        # Assistant turns carry text blocks, tool-use blocks, or both.
        if isinstance(message, AssistantMessage):
            for block in message.content:
                if hasattr(block, "text"):
                    print(block.text)
                elif hasattr(block, "name"):
                    print(f"Tool: {block.name}")
        # The result message marks the end of the loop.
        if isinstance(message, ResultMessage):
            print("session:", message.session_id)
            print("status:", message.subtype)
            print("result:", message.result)


asyncio.run(main())
TypeScript: the same pattern
import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({
  prompt: "Review utils.ts for crash bugs. Fix the issues you find.",
  options: {
    allowedTools: ["Read", "Edit", "Glob"],
    permissionMode: "acceptEdits",
    maxTurns: 8,
    maxBudgetUsd: 0.75,
  },
})) {
  if (message.type === "assistant") {
    for (const block of message.message.content) {
      if (block.type === "text") console.log(block.text);
      if (block.type === "tool_use") console.log("Tool:", block.name);
    }
  }
  if (message.type === "result") {
    console.log("session:", message.session_id);
    console.log("status:", message.subtype);
    console.log("result:", message.result);
  }
}
At runtime, the SDK continuously yields events. Your code can focus only on the final result, or surface every tool call in a UI, log stream, or job queue.
Message types determine how you show progress
The Agent SDK returns an asynchronous iterator. Each iteration can be a different message type:
| Message type | What it carries |
|---|---|
| SystemMessage | Session lifecycle events. The common init event includes session metadata. |
| AssistantMessage | Claude's output for the current turn. It may contain text blocks, tool-use blocks, or both. |
| UserMessage | Tool-result messages produced after execution. The SDK feeds them back to Claude. |
| StreamEvent | Lower-level streaming fragments when partial messages are enabled. Useful for realtime interfaces. |
| ResultMessage | The loop completion marker. It includes the final result, stop reason, usage, estimated cost, and session id. |
For CI jobs or background tasks, you may only need to persist ResultMessage. For a web console or internal platform, it is usually better to show tool calls, text fragments, and the final answer from AssistantMessage so users can see what the agent is reading, running, and changing.
A useful habit: do not force-exit the iterator immediately after receiving ResultMessage. Let the async iteration finish naturally, because a small number of trailing system events may still arrive.
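One way to picture that habit is a consumer that records progress and the final result but never breaks out of the iteration. The sketch below does not import the SDK: `Event`, `fake_stream`, and `consume` are hypothetical stand-ins, and the `kind` field only imitates the typed message classes described above.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class Event:
    """Hypothetical stand-in for an SDK message."""
    kind: str  # "system", "assistant", "user", or "result"
    text: str = ""


async def fake_stream() -> AsyncIterator[Event]:
    """Imitates one agent run, including an event that arrives after the result."""
    yield Event("system", "init")
    yield Event("assistant", "Reading utils.py")
    yield Event("user", "tool result")
    yield Event("result", "Fixed 2 bugs")
    yield Event("system", "trailing event")


async def consume(stream: AsyncIterator[Event]) -> dict:
    progress, final, trailing = [], None, 0
    async for event in stream:  # no break: let the iterator finish naturally
        if final is not None:
            trailing += 1       # events that arrived after the result
        if event.kind == "assistant":
            progress.append(event.text)  # what a UI would show as progress
        elif event.kind == "result":
            final = event.text           # what a CI job would persist
    return {"progress": progress, "final": final, "trailing": trailing}


summary = asyncio.run(consume(fake_stream()))
print(summary)
```

A background job would keep only `final`, while an interactive console would render `progress` as it grows.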
Core options: set boundaries for the agent
query() is the most common entry point. Python uses ClaudeAgentOptions; TypeScript uses an options object. The options below determine what the agent can do, where it can do it, and how long it can continue.
| Option | What it controls |
|---|---|
| cwd | Sets the working directory. File tools are scoped around that directory by default. In production, give each task an isolated workspace. |
| allowedTools | Pre-approves tool calls. It is not the only constraint; tools outside the list may still be handled by permission mode or callbacks. Python name: allowed_tools. |
| disallowedTools | Removes or blocks tools. For example, blocking Bash(rm *) can prevent destructive commands even when the broader permission mode is permissive. |
| permissionMode | Controls the approval strategy. Common values include acceptEdits, dontAsk, bypassPermissions, and default. |
| maxTurns | Limits the number of agent-loop turns so open-ended tasks cannot run indefinitely. Python name: max_turns. |
| maxBudgetUsd | Stops a task based on client-side estimated cost. Useful for production automation and batch jobs. Python name: max_budget_usd. |
| systemPrompt | Sets the system prompt. You can use a Claude Code preset or provide a fully custom prompt. |
| mcpServers | Connects external tools, databases, browsers, internal APIs, or in-process MCP servers. |
| settingSources | Controls whether user, project, or local Claude Code settings are loaded. For SDK-only applications, an empty array can reduce environmental drift. |
Tool permissions are product boundaries, not decoration
The Agent SDK is powerful because it can use tools. That also means you must design permissions deliberately. A simple approach is to divide tasks by risk:
Read-only analysis
Good for code review, documentation summaries, and risk scans. Allow only Read, Glob, and Grep, paired with dontAsk so unapproved tools are rejected.
const options = {
  allowedTools: ["Read", "Glob", "Grep"],
  permissionMode: "dontAsk",
};
Trusted editing
Good for a developer machine or isolated branch. Allow reads, writes, and test commands, but still limit turns, budget, and dangerous shell patterns.
options = ClaudeAgentOptions(
    allowed_tools=["Read", "Edit", "Write", "Glob", "Grep", "Bash"],
    disallowed_tools=["Bash(rm *)", "Bash(sudo *)"],
    permission_mode="acceptEdits",
    max_turns=12,
)
A common trap: allowedTools is a pre-approval list, not necessarily an exhaustive allow-list. If you also use bypassPermissions, tools outside that list may still be allowed. For a locked-down agent, combine allowedTools with permissionMode: "dontAsk". To block specific tools in a broader mode, use disallowedTools.
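The interaction between the three settings is easier to reason about as a small decision function. The sketch below is a mental model only: the evaluation order (explicit blocks first, then pre-approvals, then the mode), the glob matching of `Tool(pattern)` rules, and the `EDIT_TOOLS` set are all assumptions made for illustration, and the real SDK also involves hooks and callbacks that are not modeled here.

```python
from fnmatch import fnmatch

EDIT_TOOLS = {"Edit", "Write"}  # assumption: what "ordinary edits" covers


def matches(rule: str, tool: str, command: str = "") -> bool:
    """Match a bare 'Tool' rule, or a 'Tool(glob)' rule against the command text."""
    if "(" in rule and rule.endswith(")"):
        name, pattern = rule[:-1].split("(", 1)
        return name == tool and fnmatch(command, pattern)
    return rule == tool


def decide(tool, command="", *, allowed=(), disallowed=(), mode="default") -> str:
    """Return 'allow', 'deny', or 'ask' under the assumed evaluation order."""
    if any(matches(rule, tool, command) for rule in disallowed):
        return "deny"   # explicit blocks win
    if any(matches(rule, tool, command) for rule in allowed):
        return "allow"  # pre-approved
    if mode == "bypassPermissions":
        return "allow"  # tools outside the list still run
    if mode == "acceptEdits" and tool in EDIT_TOOLS:
        return "allow"
    if mode == "dontAsk":
        return "deny"   # anything not pre-approved is rejected
    return "ask"        # fall back to a prompt or callback


print(decide("Bash", "pytest", allowed=["Read"], mode="bypassPermissions"))
print(decide("Bash", "pytest", allowed=["Read"], mode="dontAsk"))
print(decide("Bash", "rm -rf build", allowed=["Bash"], disallowed=["Bash(rm *)"]))
```

The first two calls differ only in mode, which is the trap described above: the same allowed list yields opposite outcomes for a tool that is not on it.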
Custom tools and MCP: give Claude access to your systems
Built-in tools cover files, search, shell commands, and web access. Production agents often need business systems too: CRM, tickets, databases, browsers, internal APIs, asset libraries, or deployment platforms. The recommended way to expose these is through MCP servers, and the Agent SDK can also create in-process MCP servers when a separate process would be unnecessary.
A tool usually has four parts: a name, a description, an input schema, and a handler. The clearer the description, the easier it is for Claude to call the tool at the right time.
import asyncio
from typing import Any

from claude_agent_sdk import tool, create_sdk_mcp_server, ClaudeAgentOptions, query


@tool(
    "lookup_order",
    "Look up an order by order_id and return shipment status.",
    {"order_id": str},
)
async def lookup_order(args: dict[str, Any]) -> dict[str, Any]:
    order_id = args["order_id"]
    # Replace this with your real database or API call.
    return {
        "content": [
            {"type": "text", "text": f"Order {order_id}: shipped, tracking ready."}
        ],
        "structuredContent": {
            "order_id": order_id,
            "status": "shipped",
        },
    }


# Bundle the tool into an in-process MCP server named "orders".
order_server = create_sdk_mcp_server(
    name="orders",
    version="1.0.0",
    tools=[lookup_order],
)


async def main():
    # "async for" must run inside a coroutine, so the query lives in main().
    # Some SDK versions may require an async-iterable prompt (streaming input)
    # for in-process MCP servers; check the current docs if the tool is never called.
    async for message in query(
        prompt="Check order A1024 and draft a short customer update.",
        options=ClaudeAgentOptions(
            mcp_servers={"orders": order_server},
            # MCP tool names follow the mcp__<server>__<tool> pattern.
            allowed_tools=["mcp__orders__lookup_order"],
        ),
    ):
        print(message)


asyncio.run(main())
Tool design is mostly about reliability: keep schemas narrow, permissions small, and handler errors explainable. Do not let unhandled exceptions take down the whole agent loop. For tools without side effects, you can mark them read-only so Claude can plan more freely, but that label is guidance, not a security boundary. Real enforcement belongs in your handler and permission layer.
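One way to keep handler errors explainable is to wrap each handler so that an exception becomes a readable text result instead of propagating. The decorator below is a sketch under assumptions: `explain_errors` is a hypothetical helper, the handler is a plain coroutine rather than one registered with the SDK, and any error flag your SDK version expects on the result is deliberately left out.

```python
import asyncio
import functools
from typing import Any, Awaitable, Callable

Handler = Callable[[dict], Awaitable[dict]]


def explain_errors(handler: Handler) -> Handler:
    """Wrap a tool handler so exceptions become a readable text result."""
    @functools.wraps(handler)
    async def wrapper(args: dict) -> dict:
        try:
            return await handler(args)
        except Exception as exc:  # deliberate catch-all at the tool boundary
            message = f"{handler.__name__} failed: {type(exc).__name__}: {exc}"
            return {"content": [{"type": "text", "text": message}]}
    return wrapper


@explain_errors
async def lookup_order(args: dict) -> dict:
    orders = {"A1024": "shipped"}        # stand-in for a real data source
    status = orders[args["order_id"]]    # raises KeyError for unknown ids
    text = f"Order {args['order_id']}: {status}"
    return {"content": [{"type": "text", "text": text}]}


ok = asyncio.run(lookup_order({"order_id": "A1024"}))
failed = asyncio.run(lookup_order({"order_id": "ZZZ"}))
print(ok["content"][0]["text"])
print(failed["content"][0]["text"])
```

In a real integration the wrapper would presumably sit beneath the SDK's tool registration, so the model sees the failure text and can decide whether to retry or report it.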
Sessions, memory, and context management
The Agent SDK supports sessions. On the first run, you can obtain a session id from the init message or ResultMessage, then use resume to continue with context. This is useful for multi-step tasks: inspect the project, ask a human to confirm direction, then continue the implementation.
import asyncio

from claude_agent_sdk import query, ClaudeAgentOptions, SystemMessage, ResultMessage


async def main():
    session_id = None

    # First run: capture the session id from the init event.
    async for message in query(
        prompt="Read the authentication module and summarize its design.",
        options=ClaudeAgentOptions(allowed_tools=["Read", "Glob", "Grep"]),
    ):
        if isinstance(message, SystemMessage) and message.subtype == "init":
            session_id = message.data["session_id"]

    # Second run: resume the same session so earlier context carries over.
    async for message in query(
        prompt="Now find all call sites and identify risky assumptions.",
        options=ClaudeAgentOptions(
            resume=session_id,
            allowed_tools=["Read", "Glob", "Grep"],
        ),
    ):
        if isinstance(message, ResultMessage):
            print(message.result)


asyncio.run(main())
The context window is not infinite. Long tasks consume context through files, tool results, conversation history, system prompts, and tool definitions. In practice, keep goals specific, make tool outputs concise, and use search tools to locate relevant files before reading them deeply.
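Making tool outputs concise can be as simple as clipping long text before a custom tool returns it. The helper below is a sketch: `clip_output` is a hypothetical name, and the 2000-character default and the two-thirds head to one-third tail split are arbitrary choices for illustration.

```python
def clip_output(text: str, max_chars: int = 2000) -> str:
    """Keep the head and tail of long output and note how much was dropped."""
    if max_chars < 1:
        raise ValueError("max_chars must be at least 1")
    if len(text) <= max_chars:
        return text
    head = max_chars * 2 // 3          # most of the budget goes to the start
    tail = max_chars - head            # the rest keeps the final lines
    dropped = len(text) - max_chars
    return f"{text[:head]}\n[... {dropped} characters omitted ...]\n{text[-tail:]}"


log = "x" * 5000
clipped = clip_output(log, max_chars=300)
print(len(clipped))
```

Keeping the tail matters for command output, where the error summary usually appears at the end.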
System prompts: the default is light; Claude Code presets are richer
When you do not set a system prompt, the Agent SDK uses a minimal default prompt that covers tool use but does not necessarily mirror the full coding behavior, response style, and project context of the Claude Code CLI. If you want an SDK agent to behave more like claude -p, use the claude_code preset and append your product-specific rules.
const options = {
  systemPrompt: {
    type: "preset",
    preset: "claude_code",
    append: "You are running inside our CI bot. Keep summaries concise and never commit changes.",
  },
  settingSources: ["project"],
  allowedTools: ["Read", "Glob", "Grep", "Bash"],
  permissionMode: "dontAsk",
};
When should you use a fully custom system prompt? When the agent is not primarily a coding assistant: support, research, data processing, operations, or product-specific workflows. A custom prompt can replace many default behaviors, so spell out tool boundaries, output format, safety requirements, failure handling, and human approval points.
Integration patterns for real applications
Once the first script works, the next question is where the agent belongs in your product. Most Agent SDK deployments fall into a few patterns. The right choice depends on whether the user needs live feedback, whether tool use has side effects, and how much isolation the workload needs.
| Pattern | How it works and when it fits |
|---|---|
| CLI automation | A script runs the agent against a local repository or generated workspace. This is ideal for developer tools, codemods, release chores, and one-off operational tasks. |
| Background jobs | A queue worker launches agents for discrete tasks, persists events, and reports completion. This works well for CI review, documentation generation, repository maintenance, and research jobs. |
| Interactive console | A web or desktop UI streams assistant messages and tool calls as they happen. This pattern is best when humans need to supervise, approve, or steer long-running work. |
| Service integration | An internal service exposes business-specific tools through MCP or an in-process server, then uses the SDK to reason over those tools. This is the common shape for support, operations, and data workflows. |
For any pattern, keep the agent runtime close to the resources it must inspect, but keep secrets and irreversible actions behind explicit tool handlers. The SDK should orchestrate work; your application should enforce policy.
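For the background-job pattern, a wall-clock timeout around each task is one policy the application can enforce itself. The sketch below assumes nothing about the SDK: `fake_agent` is a hypothetical stand-in for an agent run, and `run_job` and `worker` are illustrative names.

```python
import asyncio
from typing import Awaitable, Callable


async def run_job(job_id: str, agent: Callable[[str], Awaitable[str]], timeout_s: float) -> dict:
    """Run one agent task with a wall-clock timeout and return a status record."""
    try:
        result = await asyncio.wait_for(agent(job_id), timeout=timeout_s)
        return {"job": job_id, "status": "done", "result": result}
    except asyncio.TimeoutError:
        return {"job": job_id, "status": "timeout", "result": None}


async def fake_agent(job_id: str) -> str:
    # Hypothetical stand-in for an agent run; "slow" jobs never finish in time.
    await asyncio.sleep(5 if job_id.startswith("slow") else 0)
    return f"{job_id}: review complete"


async def worker(job_ids: list) -> list:
    # Process jobs one at a time, as a single queue worker would.
    return [await run_job(job_id, fake_agent, timeout_s=0.05) for job_id in job_ids]


reports = asyncio.run(worker(["pr-101", "slow-202"]))
print(reports)
```

The status records are what a queue would persist and report, independent of whatever the agent itself printed.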
Production checklist: do not stop at a working demo
Agents can take actions, so production work starts with control, not cleverness.
- Isolate the runtime. Give each task its own workspace, temporary credentials, and minimal file permissions. Do not let an agent operate directly in a production home directory.
- Limit the tool set. Start read-only by default, then gradually open Edit, Write, Bash, and MCP tools. Block dangerous commands explicitly with disallowedTools.
- Set budgets and turn limits. Bound maxTurns, maxBudgetUsd, task timeout, and queue timeout.
- Record audit logs. Persist prompt summaries, tool calls, file changes, command output, final results, tokens, and estimated cost.
- Productize approval points. Add human confirmation or policy checks for file writes, emails, deployments, deletes, payments, and external API mutations.
- Test failure paths. Simulate permission denial, tool timeout, API rate limiting, long context, budget exhaustion, and malformed model responses.
- Protect secrets. Keep production keys out of the working directory. Tool handlers should receive only the tokens they need, and logs should never print full secrets.
- Separate draft from execution. For high-risk workflows, let the agent produce a plan or diff first, then require explicit confirmation before executing.
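The audit-log and secret-protection items can be combined in a small record builder that redacts before it writes. This is a sketch under assumptions: `audit_record` and `redact` are hypothetical helpers, and the `sk-` prefix pattern is an assumed key shape that you would replace with patterns for your own credentials.

```python
import json
import re
import time

# Assumed key shape: keep the first few characters, drop the rest.
SECRET_PATTERN = re.compile(r"(sk-[A-Za-z0-9_-]{4})[A-Za-z0-9_-]+")


def redact(text: str) -> str:
    """Replace anything that looks like a key with a short, non-reversible stub."""
    return SECRET_PATTERN.sub(r"\1...[redacted]", text)


def audit_record(task_id: str, event: str, detail: str, *, now=None) -> str:
    """Build one JSON line for an append-only audit log."""
    record = {
        "ts": time.time() if now is None else now,
        "task_id": task_id,
        "event": event,
        "detail": redact(detail),
    }
    return json.dumps(record, sort_keys=True)


line = audit_record(
    "job-42",
    "tool_call",
    "Bash: curl -H 'x-api-key: sk-ant-abc123SECRETVALUE' https://example.test",
    now=1700000000.0,
)
print(line)
```

One JSON object per line keeps the log easy to append to and to query later by task or event type.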
Common issues and how to investigate them
| Symptom | What to check |
|---|---|
| Missing API key | Confirm that ANTHROPIC_API_KEY is actually present in the shell, process manager, Docker container, or CI job that starts your script. A local .env file may not be loaded by your runtime. |
| Python install fails | Check that Python is 3.10 or newer. Older interpreters may report No matching distribution found for claude-agent-sdk. |
| Tools are not called | Confirm that the tool is visible in context, has a clear description, and is not blocked by permission mode. For built-in tools, check allowedTools and disallowedTools spelling. |
| The agent runs too long | Narrow the prompt, restrict the working directory, set maxTurns and a budget, and avoid open-ended goals like "optimize the entire codebase." |
| Behavior differs across environments | Compare runtime settings, working directory, permission mode, installed MCP servers, loaded Claude Code settings, environment variables, and SDK versions. Differences in any of these can change agent behavior. |
| Permissions do not match expectations | Remember that allowedTools is a pre-approval mechanism, not always the only allow-list. Use dontAsk for locked-down agents and disallowedTools for explicit blocks. |
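The first two rows of the table can be checked automatically before an agent starts. The function below is a minimal sketch: `preflight` is a hypothetical helper that covers only the API key and the Python version, and it takes the environment and version as parameters so it can be exercised without touching the real process.

```python
import os
import sys


def preflight(env=None, version=None) -> list:
    """Return human-readable problems found before an agent run starts."""
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else tuple(version)
    problems = []
    if not env.get("ANTHROPIC_API_KEY"):
        problems.append("ANTHROPIC_API_KEY is not set in this process")
    if version < (3, 10):
        problems.append(f"Python {version[0]}.{version[1]} is older than 3.10")
    return problems


print(preflight({}, (3, 9)))
print(preflight({"ANTHROPIC_API_KEY": "placeholder"}, (3, 12)))
```

Running it inside the same process manager, container, or CI job that launches the agent is what makes the check meaningful, since that is where a missing variable shows up.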