How to Build AI Agents with Claude Agent SDK: Complete 2026 Guide (Step-by-Step Tutorial)
2026-03-17T00:04:42.297Z
The Age of Autonomous AI Agents Is Here
Sending a prompt and getting a response back is no longer the frontier of AI development. In March 2026, the real action is in AI agents — systems that autonomously read files, run terminal commands, search the web, analyze codebases, and fix bugs without human intervention. And the most mature framework for building these agents is Anthropic's Claude Agent SDK.
Formerly known as the Claude Code SDK, the Agent SDK gives you the same tools, agent loop, and context management that power Claude Code — now programmable in both Python and TypeScript. Unlike the traditional Anthropic Client SDK, where you manually implement tool execution loops, the Agent SDK handles all orchestration automatically. You tell Claude what to do; it figures out which tools to use and executes them. This guide walks you through everything: setup, first agent, subagents, production deployment, and cost optimization.
What Is the Claude Agent SDK?
At its core, the Claude Agent SDK follows a structured loop: gather context → take action → verify results → repeat. The main entry point is the query function, which creates an agentic loop and returns an async iterator. You consume messages as Claude thinks, calls tools, observes results, and decides what to do next.
The key difference from the Client SDK is that you no longer write the tool loop yourself:
# Client SDK: You implement the tool loop
response = client.messages.create(...)
while response.stop_reason == "tool_use":
    result = your_tool_executor(response.tool_use)
    response = client.messages.create(tool_result=result, **params)

# Agent SDK: Claude handles tools autonomously
async for message in query(prompt="Fix the bug in auth.py"):
    print(message)
The SDK ships with a rich set of built-in tools: Read, Write, and Edit for file operations; Bash for terminal commands; Glob and Grep for file and content search; WebSearch and WebFetch for internet access; and AskUserQuestion for interactive clarification.
Getting Started in 5 Minutes
Prerequisites
You need Python 3.10+ or Node.js 18+, and an API key from the Claude Console.
Python Setup
mkdir my-agent && cd my-agent
python3 -m venv .venv && source .venv/bin/activate
pip3 install claude-agent-sdk
Or with the faster uv package manager:
uv init && uv add claude-agent-sdk
TypeScript Setup
mkdir my-agent && cd my-agent
npm init -y
npm install @anthropic-ai/claude-agent-sdk
npm install -D typescript @types/node tsx
API Key
export ANTHROPIC_API_KEY=your-api-key
The SDK also supports Amazon Bedrock, Google Vertex AI, and Microsoft Azure through environment variables (CLAUDE_CODE_USE_BEDROCK=1, CLAUDE_CODE_USE_VERTEX=1, CLAUDE_CODE_USE_FOUNDRY=1 respectively).
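For example, routing traffic through Amazon Bedrock is a matter of setting environment variables before launching your agent. This is a minimal sketch; the region value is a placeholder, and Bedrock model access must already be enabled in your AWS account:

```shell
# Use Amazon Bedrock instead of the Anthropic API directly
export CLAUDE_CODE_USE_BEDROCK=1
# Standard AWS credential configuration applies (placeholder region)
export AWS_REGION=us-east-1
```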
Your First Agent: An Autonomous Bug Fixer
Let's build something real. First, create a utils.py file with intentional bugs:
def calculate_average(numbers):
    total = 0
    for num in numbers:
        total += num
    return total / len(numbers)  # ZeroDivisionError on empty list

def get_user_name(user):
    return user["name"].upper()  # TypeError when user is None
Now create agent.py:
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, AssistantMessage, ResultMessage

async def main():
    async for message in query(
        prompt="Review utils.py for bugs that would cause crashes. Fix any issues you find.",
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Edit", "Glob"],
            permission_mode="acceptEdits",
        ),
    ):
        if isinstance(message, AssistantMessage):
            for block in message.content:
                if hasattr(block, "text"):
                    print(block.text)
                elif hasattr(block, "name"):
                    print(f"Tool: {block.name}")
        elif isinstance(message, ResultMessage):
            print(f"Done: {message.subtype}")

asyncio.run(main())
Run python3 agent.py, and Claude will autonomously read the file, identify both edge cases, and add proper error handling. No manual tool execution code required.
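For reference, the hardened utils.py might end up looking like this. The exact edits Claude makes will vary between runs; returning 0.0 for an empty list and None for a missing user are choices assumed here purely for illustration:

```python
def calculate_average(numbers):
    # Guard against the empty-list case that caused ZeroDivisionError
    if not numbers:
        return 0.0
    total = 0
    for num in numbers:
        total += num
    return total / len(numbers)

def get_user_name(user):
    # Guard against user being None or lacking a name (previously raised TypeError)
    if user is None or "name" not in user:
        return None
    return user["name"].upper()
```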
Three components make this work: query creates the agentic loop, allowed_tools specifies which tools are available, and permission_mode="acceptEdits" auto-approves file modifications so the agent runs without interactive prompts.
Customization: System Prompts, MCP, and More
You can shape agent behavior with system prompts:
options = ClaudeAgentOptions(
    allowed_tools=["Read", "Edit", "Glob", "Bash"],
    permission_mode="acceptEdits",
    system_prompt="You are a senior Python developer. Always follow PEP 8 style guidelines.",
)
With Bash enabled, prompts like "Write unit tests for utils.py, run them, and fix any failures" become possible — the agent writes tests, executes them, reads the output, and iterates until they pass.
For external integrations, MCP (Model Context Protocol) servers connect your agent to databases, browsers, and APIs. Adding browser automation is just a few lines:
options = ClaudeAgentOptions(
    mcp_servers={
        "playwright": {"command": "npx", "args": ["@playwright/mcp@latest"]}
    }
)
Hundreds of MCP servers are available in the open-source ecosystem, covering everything from PostgreSQL to Slack.
Subagents and Multi-Agent Systems
For complex tasks, a single agent often isn't enough. The SDK supports a subagent pattern where a main orchestrator delegates work to specialized agents, each running in its own context window:
from claude_agent_sdk import query, ClaudeAgentOptions, AgentDefinition

options = ClaudeAgentOptions(
    allowed_tools=["Read", "Glob", "Grep", "Agent"],
    agents={
        "code-reviewer": AgentDefinition(
            description="Expert code reviewer for quality and security reviews.",
            prompt="Analyze code quality and suggest improvements.",
            tools=["Read", "Glob", "Grep"],
        )
    },
)
Include "Agent" in allowed_tools to let the main agent spawn subagents. Each subagent gets its own context window, preventing context pollution — a real problem when a single agent tries to juggle too many concerns.
Anthropic's newer Agent Teams feature takes this further: team members receive role-specific instructions, can message each other (not just the lead), and the lead synthesizes results. According to Anthropic's own research, multi-agent systems consistently outperform single agents when: (1) context pollution degrades performance, (2) tasks can run in parallel, and (3) specialization improves tool selection.
The golden rule for production: give each subagent one job, and let the orchestrator coordinate.
Session Management: Maintaining Context Across Interactions
Real workflows span multiple exchanges. The SDK's session system lets you capture a session ID and resume later with full context:
# First query: capture session ID
session_id = None
async for message in query(
    prompt="Read the authentication module",
    options=ClaudeAgentOptions(allowed_tools=["Read", "Glob"]),
):
    if hasattr(message, "subtype") and message.subtype == "init":
        session_id = message.session_id

# Second query: resume with full context
async for message in query(
    prompt="Now find all places that call it",
    options=ClaudeAgentOptions(resume=session_id),
):
    if hasattr(message, "result"):
        print(message.result)
The pronoun "it" in the second query correctly resolves to the authentication module from the first query. You can also fork sessions to explore different approaches from the same starting point.
Hooks: Lifecycle Control
Hooks let you inject custom logic at key points in the agent lifecycle. Available hooks include PreToolUse, PostToolUse, Stop, SessionStart, and SessionEnd. A common production pattern is audit logging:
from datetime import datetime
from claude_agent_sdk import ClaudeAgentOptions, HookMatcher

async def log_file_change(input_data, tool_use_id, context):
    file_path = input_data.get("tool_input", {}).get("file_path", "unknown")
    with open("./audit.log", "a") as f:
        f.write(f"{datetime.now()}: modified {file_path}\n")
    return {}

options = ClaudeAgentOptions(
    hooks={
        "PostToolUse": [
            HookMatcher(matcher="Edit|Write", hooks=[log_file_change])
        ]
    },
)
The TypeScript SDK supports up to 12 lifecycle events, while Python supports 6. These hooks are essential for security auditing, cost monitoring, and automated rollback in production.
Production Deployment Best Practices
Containerization is non-negotiable. Run agents inside sandboxed environments (Docker, gVisor, Firecracker). For multi-tenant applications, spin up a new ephemeral container per user task and destroy it on completion.
Manage credentials externally. Use ANTHROPIC_BASE_URL to route all API traffic through a trusted proxy (Envoy or custom) that injects credentials, logs requests, and enforces organizational policies outside the agent's security boundary.
Invest in observability. Capture OpenTelemetry traces for prompts, tool invocations, and token usage with correlation IDs across subagents. Gate deployments behind feature flags, stage rollouts, and set rollback triggers on anomaly detection.
Choose the right permission mode: acceptEdits for development, bypassPermissions for sandboxed CI/CD, and default with a canUseTool callback for user-facing applications.
Pricing and Cost Optimization
The Agent SDK itself is free — you pay for API token usage. As of March 2026:
- Claude Opus 4.6: $5 input / $25 output per million tokens
- Claude Sonnet 4.6: $3 input / $15 output per million tokens
- Claude Haiku 4.5: $1 input / $5 output per million tokens
Batch processing offers a 50% discount on token prices. Since agents consume significant tokens through multiple tool calls and context accumulation, cost management matters. Key strategies: use subagents to isolate context (preventing unnecessary context growth), choose the cheapest model that handles each subtask adequately (Haiku for simple file reads, Sonnet for complex analysis), and leverage prompt caching for repeated patterns.
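Back-of-the-envelope math helps when choosing a model per subtask. A small helper using the list prices above (the model keys are shorthand invented for this sketch):

```python
# $ per million tokens: (input, output), March 2026 list prices
PRICES = {
    "opus-4.6": (5.00, 25.00),
    "sonnet-4.6": (3.00, 15.00),
    "haiku-4.5": (1.00, 5.00),
}

def run_cost(model, input_tokens, output_tokens, batch=False):
    """Estimate the dollar cost of one agent run."""
    in_price, out_price = PRICES[model]
    cost = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
    # Batch processing halves token prices
    return cost * 0.5 if batch else cost
```

An agentic run that accumulates 1M input tokens and emits 100K output tokens costs about $4.50 on Sonnet interactively, $2.25 via batch, and $1.50 on Haiku — which is why routing simple subtasks to the cheapest adequate model pays off quickly.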
Claude Agent SDK vs OpenAI's Agent Stack
As of 2026, the flagship models from Anthropic and OpenAI have reached rough performance parity, but their agent development philosophies diverge significantly.
Claude Agent SDK emphasizes a developer-in-the-loop, local-first workflow. Its strengths are extended context windows, sustained reasoning through long agentic tasks, and deep codebase comprehension. The terminal-native approach appeals to developers who want fine-grained control.
OpenAI's agent stack (Codex-branded) leans toward autonomous, cloud-based task delegation. GPT-5.4's native computer use capabilities and rapid iteration speed are its differentiators.
The choice comes down to workflow preferences. For coding automation and local development, Claude has the edge. For cloud-native autonomous execution, OpenAI's approach may suit better. Many teams use both.
Practical Takeaways
Start simple. The built-in tools — Read, Edit, Glob — are battle-tested and cover most code analysis and modification needs. Add custom tools only when you have specific requirements that built-in tools can't address.
Let Claude choose the tools. Just describe what you want done. You don't need to specify which tools to use — Claude selects the right ones based on the task.
Think in subagents early. Even if your initial use case is simple, designing with the orchestrator-subagent pattern from the start makes scaling dramatically easier.
Looking Forward
AI agent development is one of the fastest-growing areas in software engineering in 2026. The Claude Agent SDK has established itself as one of the most production-ready frameworks available, with subagents, Agent Teams, MCP integrations, and lifecycle hooks providing the building blocks for sophisticated automation. The best time to start building is now — begin with a simple file analysis agent, progressively add tools and subagents, and before long you'll have a custom AI automation pipeline tailored to your exact needs.