AI Agents

AI agents are systems that use LLMs to reason, plan, and take actions autonomously. They represent the evolution from single-turn chatbots to multi-step, tool-using systems that can accomplish complex tasks.

Understanding agent architectures is essential for modern prompt and context engineering — agents are where prompting meets system design.

What Makes an Agent
Agent Patterns
Orchestration Frameworks
Tool Integration
Memory & State
Multi-Agent Systems
Evaluation & Debugging
Example Projects

What Makes an Agent

An AI agent combines four core capabilities:

┌─────────────────────────────────────────────────────────┐
│                      AI AGENT                           │
├─────────────────────────────────────────────────────────┤
│  🧠 REASONING    │  Plan steps, analyze results        │
│  🔧 TOOL USE     │  Call APIs, search, execute code    │
│  💾 MEMORY       │  Remember context across steps      │
│  🔄 ITERATION    │  Loop until task complete           │
└─────────────────────────────────────────────────────────┘
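
The four capabilities can be sketched as a single loop. This is a minimal illustration under stated assumptions, not a reference implementation: `TOOLS`, `fake_model`, and `run_agent` are hypothetical names, and the scripted `fake_model` stands in for a real LLM call.

```python
from typing import Callable

# Hypothetical tool registry: tool name -> plain Python function.
TOOLS: dict[str, Callable[[str], str]] = {
    "add": lambda expr: str(sum(int(x) for x in expr.split("+"))),
}

def fake_model(memory: list[str]) -> dict:
    """Stand-in for an LLM call (REASONING): pick the next step from memory."""
    if not any(m.startswith("observation:") for m in memory):
        return {"type": "tool", "name": "add", "input": "2+3"}
    return {"type": "final", "answer": memory[-1].removeprefix("observation: ")}

def run_agent(task: str, max_steps: int = 5) -> str:
    memory = [f"task: {task}"]                  # MEMORY: context carried across steps
    for _ in range(max_steps):                  # ITERATION: loop until done or out of budget
        decision = fake_model(memory)           # REASONING: decide what to do next
        if decision["type"] == "final":
            return decision["answer"]
        result = TOOLS[decision["name"]](decision["input"])  # TOOL USE
        memory.append(f"observation: {result}")
    return "stopped: step budget exhausted"

print(run_agent("What is 2 + 3?"))  # prints 5
```

Swapping `fake_model` for a real model call leaves the loop structure unchanged, which is why the step budget belongs in the loop rather than in the prompt.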

Chatbot vs Agent:

Capability	Chatbot	Agent
Turns	Single response	Multiple steps
Tools	None or limited	Dynamic tool selection
Memory	Session only	Persistent state
Planning	None	Explicit reasoning
Autonomy	Reactive	Proactive

Agent Patterns

Core architectural patterns for building agents.

ReAct (Reasoning + Acting)

The foundational pattern: interleave thinking with action.

Thought: I need to find the current weather in Tokyo
Action: weather_api(location="Tokyo")
Observation: 15°C, partly cloudy
Thought: Now I can answer the user's question
Answer: It's currently 15°C and partly cloudy in Tokyo.

Use when: Tasks require dynamic tool selection based on intermediate results.
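
The trace above can be driven by a small loop that parses each `Action:` line, runs the tool, and feeds the `Observation:` back. A minimal sketch, assuming a scripted `scripted_llm` in place of a real model and a canned `weather_api`; both, along with `ACTION_RE` and `react`, are hypothetical names for illustration.

```python
import re

def weather_api(location: str) -> str:
    """Hypothetical tool: returns canned data instead of calling a real service."""
    return {"Tokyo": "15°C, partly cloudy"}.get(location, "unknown")

TOOLS = {"weather_api": weather_api}

# Matches lines such as: Action: weather_api(location="Tokyo")
ACTION_RE = re.compile(r'Action: (\w+)\((\w+)="([^"]*)"\)')

def scripted_llm(transcript: str) -> str:
    """Stand-in for a model call: emits a Thought plus an Action or an Answer."""
    if "Observation:" not in transcript:
        return ("Thought: I need to find the current weather in Tokyo\n"
                'Action: weather_api(location="Tokyo")')
    observation = transcript.rsplit("Observation: ", 1)[1].strip()
    return ("Thought: Now I can answer the user's question\n"
            f"Answer: It's currently {observation} in Tokyo.")

def react(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = scripted_llm(transcript)
        transcript += step + "\n"
        if "Answer:" in step:                       # the model decided it is done
            return step.split("Answer: ", 1)[1]
        match = ACTION_RE.search(step)
        if match is None:
            return "stopped: model produced neither an Action nor an Answer"
        tool_name, arg_name, arg_value = match.groups()
        result = TOOLS[tool_name](**{arg_name: arg_value})
        transcript += f"Observation: {result}\n"    # feed the result back to the model
    return "stopped: step budget exhausted"

print(react("What is the weather in Tokyo?"))
```

Text parsing like this is fragile; structured tool calls (see Tool Integration) are usually the sturdier choice when the model API supports them.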

Plan-and-Execute

Separate planning from execution for complex tasks.

PLAN:
1. Search for recent AI safety papers
2. Summarize top 3 findings
3. Compare to last year's research
4. Write synthesis report

EXECUTE:
[Step 1] Searching... found 47 papers
[Step 2] Summarizing top 3...
[Step 3] Comparing...
[Step 4] Writing report...

Use when: Tasks have clear phases and benefit from upfront planning.
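
A minimal sketch of the split between the two phases, assuming a fixed `make_plan` in place of an LLM planner and canned executors. All names here (`make_plan`, `EXECUTORS`, `plan_and_execute`) and the paper titles are hypothetical.

```python
def make_plan(goal: str) -> list[str]:
    """Stand-in planner; a real one would ask an LLM to decompose the goal."""
    return ["search", "summarize", "compare", "report"]

# Hypothetical executors: each reads the shared context and returns one artifact.
EXECUTORS = {
    "search": lambda ctx: ["paper A", "paper B", "paper C", "paper D"],
    "summarize": lambda ctx: ctx["search"][:3],
    "compare": lambda ctx: f"{len(ctx['summarize'])} findings compared to last year",
    "report": lambda ctx: f"Report on {ctx['goal']}: {ctx['compare']}",
}

def plan_and_execute(goal: str) -> dict:
    plan = make_plan(goal)              # planning happens once, up front
    ctx = {"goal": goal}
    for step in plan:                   # execution follows the plan in order
        ctx[step] = EXECUTORS[step](ctx)
    return ctx

print(plan_and_execute("AI safety")["report"])
# prints: Report on AI safety: 3 findings compared to last year
```

Unlike ReAct, the plan here is not revised after each observation; a common variation re-plans when a step fails.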

Reflection / Self-Critique

The agent critiques its own output and revises it before returning a result.

INITIAL OUTPUT: [first attempt]
CRITIQUE: The code doesn't handle edge cases for empty input
REVISION: [improved version with edge case handling]
VERIFY: Now passes all test cases

Use when: Quality is critical and errors are costly.
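
The draft, critique, revise cycle can be sketched as below. This is an illustration only: the two drafts stand in for successive model revisions, and the critique step is a deterministic test run, whereas in practice it is often a second model call. `DRAFTS`, `TEST_CASES`, `critique`, and `reflect` are hypothetical names.

```python
def draft_v1(items):
    return sum(items) / len(items)                    # fails on empty input

def draft_v2(items):
    return sum(items) / len(items) if items else 0.0  # handles the empty case

DRAFTS = [draft_v1, draft_v2]           # stand-in for successive LLM revisions
TEST_CASES = [([2, 4], 3.0), ([], 0.0)]

def critique(candidate) -> list[str]:
    """Return a list of problems; an empty list means the draft passes."""
    problems = []
    for args, expected in TEST_CASES:
        try:
            got = candidate(args)
        except Exception as exc:
            problems.append(f"input {args!r} raised {type(exc).__name__}")
            continue
        if got != expected:
            problems.append(f"input {args!r} gave {got!r}, expected {expected!r}")
    return problems

def reflect(max_rounds: int = 3):
    history = []
    for candidate in DRAFTS[:max_rounds]:
        problems = critique(candidate)
        history.append((candidate.__name__, problems))
        if not problems:                # VERIFY: stop once the critique is clean
            return candidate, history
    return None, history                # no draft passed within the round budget

best, history = reflect()
print(best.__name__, history[0][1])
```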

Multi-Agent Collaboration

Multiple specialized agents working together.

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Researcher │ ──► │   Writer    │ ──► │   Editor    │
│  (gathers)  │     │  (drafts)   │     │  (refines)  │
└─────────────┘     └─────────────┘     └─────────────┘

Use when: Tasks benefit from specialized roles or perspectives.
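
The Researcher, Writer, Editor chain above can be sketched as agents that each transform the previous agent's output. A minimal sketch: the `Agent` dataclass and `run_pipeline` are hypothetical, and each `act` lambda stands in for a role-specific model call.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    name: str
    role: str
    act: Callable[[str], str]   # stand-in for a role-specific LLM call

researcher = Agent("Researcher", "gathers", lambda topic: f"notes on {topic}")
writer = Agent("Writer", "drafts", lambda notes: f"draft based on {notes}")
editor = Agent("Editor", "refines", lambda draft: draft.capitalize() + ".")

def run_pipeline(agents: list[Agent], task: str) -> str:
    payload = task
    for agent in agents:
        payload = agent.act(payload)    # each output becomes the next agent's input
    return payload

print(run_pipeline([researcher, writer, editor], "agent memory"))
# prints: Draft based on notes on agent memory.
```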

Orchestration Frameworks

Frameworks, protocols, and libraries for building agent systems.

Framework	Best For	Key Features	Link
LangGraph	Complex workflows with cycles	State management, conditional edges, persistence	langchain-ai.github.io/langgraph
CrewAI	Role-based multi-agent teams	Agent personas, task delegation, collaboration	crewai.com
AutoGen	Conversational multi-agent	Microsoft-backed, group chat patterns	microsoft.github.io/autogen
OpenAI Agents SDK	OpenAI-native agents	Handoffs, guardrails, tracing	github.com/openai/openai-agents-python
Anthropic MCP	Standardized tool integration	Model Context Protocol, universal tool format	modelcontextprotocol.io
Letta (MemGPT)	Long-term memory	Persistent memory, self-editing context	letta.com
DSPy	Optimized prompts	Compile prompts from examples, auto-optimization	github.com/stanfordnlp/dspy

Tool Integration

Agents need tools to interact with the world.

Common Tool Categories

Category	Examples	Use Case
Search	Web search, document search, code search	Information retrieval
Code	Python REPL, shell, sandboxed execution	Computation, automation
APIs	Weather, stocks, databases, SaaS	External data and actions
Files	Read, write, parse documents	Document processing
Communication	Email, Slack, calendar	User-facing actions

Tool Definition Example

# OpenAI function-calling format (the `tools` parameter of Chat Completions)
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and country, e.g., 'Tokyo, Japan'"
                },
                "units": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "default": "celsius"
                }
            },
            "required": ["location"]
        }
    }
}]
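
A definition only describes the tool; the application still has to validate and run each call the model requests. The sketch below assumes the model returns a tool name plus a JSON-encoded argument string, and checks it against a schema mirroring the definition above. `get_weather`, `REGISTRY`, `SCHEMAS`, and `dispatch` are hypothetical, and the weather data is canned.

```python
import json

def get_weather(location: str, units: str = "celsius") -> str:
    """Hypothetical implementation: canned data, no real service call."""
    return f"15 degrees {units} in {location}"

REGISTRY = {"get_weather": get_weather}

# Parameter schema mirroring the tool definition.
SCHEMAS = {
    "get_weather": {
        "properties": {
            "location": {"type": "string"},
            "units": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location"],
    }
}

def dispatch(name: str, arguments_json: str) -> str:
    """Validate a requested tool call, then run it; errors go back as text."""
    if name not in REGISTRY:
        return f"error: unknown tool {name!r}"
    try:
        args = json.loads(arguments_json)
    except json.JSONDecodeError:
        return "error: arguments are not valid JSON"
    if not isinstance(args, dict):
        return "error: arguments must be a JSON object"
    schema = SCHEMAS[name]
    missing = [p for p in schema["required"] if p not in args]
    if missing:
        return f"error: missing required parameters {missing}"
    unknown = [p for p in args if p not in schema["properties"]]
    if unknown:
        return f"error: unexpected parameters {unknown}"
    for param, value in args.items():
        allowed = schema["properties"][param].get("enum")
        if allowed and value not in allowed:
            return f"error: {param} must be one of {allowed}"
    return REGISTRY[name](**args)

print(dispatch("get_weather", '{"location": "Tokyo, Japan"}'))
print(dispatch("get_stock", "{}"))
```

Returning errors as strings rather than raising lets the agent loop pass the message back to the model so it can retry with corrected arguments.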

Tool Platforms

Platform	What It Provides	Link
Composio	150+ pre-built integrations (GitHub, Slack, etc.)	composio.dev
Toolhouse	Managed tool infrastructure	toolhouse.ai
Browserbase	Browser automation for agents	browserbase.com

Memory & State

Agents need memory to work across multiple steps and sessions.

Memory Types

Type	Scope	Use Case
Working Memory	Current task	Intermediate results, scratchpad
Short-Term	Current session	Conversation history
Long-Term	Across sessions	User preferences, learned facts
Episodic	Past interactions	Similar past tasks, outcomes
Semantic	Domain knowledge	Facts, relationships, embeddings

State Management Patterns

Explicit State Object:

state = {
    "task": "Research AI safety",
    "steps_completed": ["search", "summarize"],
    "current_step": "compare",
    "artifacts": {"papers": [...], "summary": "..."},
    "errors": []
}
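
One way to keep such a state object consistent is to derive `current_step` from the plan instead of storing it separately. A minimal sketch, assuming a hypothetical `AgentState` dataclass with a `plan` field that the dictionary above does not have; `save` shows that the state serializes to JSON for persistence.

```python
import json
from dataclasses import asdict, dataclass, field

@dataclass
class AgentState:
    task: str
    plan: list[str]
    steps_completed: list[str] = field(default_factory=list)
    artifacts: dict = field(default_factory=dict)
    errors: list[str] = field(default_factory=list)

    @property
    def current_step(self):
        """First planned step not yet completed, or None when the plan is done."""
        remaining = [s for s in self.plan if s not in self.steps_completed]
        return remaining[0] if remaining else None

    def complete(self, artifact) -> None:
        """Record the artifact for the current step and mark the step done."""
        step = self.current_step
        if step is None:
            raise ValueError("no steps left to complete")
        self.artifacts[step] = artifact
        self.steps_completed.append(step)

def save(state: AgentState) -> str:
    """Serialize the state so a run can be resumed later."""
    return json.dumps(asdict(state))

state = AgentState(task="Research AI safety", plan=["search", "summarize", "compare"])
state.complete(["paper A", "paper B"])
state.complete("summary text")
print(state.current_step)  # prints compare
```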

Conversation History:

messages = [
    {"role": "system", "content": "You are a research assistant..."},
    {"role": "user", "content": "Find recent AI safety papers"},
    {"role": "assistant", "content": "I'll search for...", "tool_calls": [...]},
    {"role": "tool", "tool_call_id": "...", "content": "Found 47 papers..."},  # tool_call_id echoes the id of the matching entry in tool_calls
    {"role": "assistant", "content": "I found 47 papers. The top 3 are..."}
]
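
A history list grows with every step, so long runs need pruning. The sketch below keeps system messages plus the newest messages that fit a budget. It is an illustration under assumptions: `prune_history` is a hypothetical helper, and character count is used as a rough proxy for tokens where a real system would use the model's tokenizer.

```python
def prune_history(messages: list[dict], max_chars: int) -> list[dict]:
    """Keep system messages plus the newest messages that fit in max_chars."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    budget = max_chars - sum(len(m["content"]) for m in system)

    kept = []
    for message in reversed(rest):          # walk from newest to oldest
        cost = len(message["content"])
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    kept.reverse()                          # restore chronological order

    # A tool result is only meaningful after the assistant turn that requested it,
    # so drop any tool message left dangling at the start.
    while kept and kept[0]["role"] == "tool":
        kept.pop(0)
    return system + kept

history = [
    {"role": "system", "content": "s" * 10},
    {"role": "user", "content": "u" * 50},
    {"role": "assistant", "content": "a" * 10},
    {"role": "tool", "content": "t" * 10},
    {"role": "assistant", "content": "b" * 10},
]
print([m["role"] for m in prune_history(history, max_chars=35)])
# prints: ['system', 'assistant']
```

Summarizing the dropped messages into a single note is a common alternative when older context still matters.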

Multi-Agent Systems

Patterns for agents working together.

Hierarchical

         ┌──────────────┐
         │  Supervisor  │
         └──────┬───────┘
        ┌───────┼───────┐
        ▼       ▼       ▼
    ┌──────┐ ┌──────┐ ┌──────┐
    │Agent1│ │Agent2│ │Agent3│
    └──────┘ └──────┘ └──────┘

Supervisor delegates and coordinates.
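
A minimal sketch of supervisor delegation, assuming keyword-based routing in place of the LLM decision a real supervisor would typically make. `WORKERS`, `route`, and `supervise` are hypothetical names, and each worker lambda stands in for a full agent.

```python
# Hypothetical workers: each lambda stands in for a specialized agent.
WORKERS = {
    "search": lambda task: f"search results for {task!r}",
    "code": lambda task: f"code written for {task!r}",
    "write": lambda task: f"text written for {task!r}",
}

def route(subtask: str) -> str:
    """Stand-in routing rule; a real supervisor would usually ask an LLM."""
    lowered = subtask.lower()
    if "find" in lowered or "search" in lowered:
        return "search"
    if "implement" in lowered or "script" in lowered:
        return "code"
    return "write"

def supervise(subtasks: list[str]) -> list[tuple[str, str]]:
    results = []
    for subtask in subtasks:
        worker = route(subtask)                          # delegate
        results.append((worker, WORKERS[worker](subtask)))  # collect
    return results

for worker, output in supervise(["Find papers on RLHF", "Implement a parser", "Summarize results"]):
    print(worker, "->", output)
```

Only the supervisor sees every result here; workers never talk to each other, which is the main difference from the collaborative layout.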

Collaborative

    ┌──────┐     ┌──────┐
    │Agent1│◄───►│Agent2│
    └──┬───┘     └───┬──┘
       │             │
       └──────┬──────┘
              ▼
         ┌──────┐
         │Agent3│
         └──────┘

Agents communicate peer-to-peer.

Pipeline

Agent1 ──► Agent2 ──► Agent3 ──► Output

Sequential handoffs with specialization.

Evaluation & Debugging

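Agents are harder to evaluate than single-turn models: a run spans many steps and tool calls, so a correct final answer can still hide wasted steps, wrong tool calls, or excessive cost.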

What to Measure

Metric	What It Tells You
Task completion rate	Does it finish the job?
Step efficiency	How many steps to complete?
Tool accuracy	Right tool, right parameters?
Error recovery	Handles failures gracefully?
Cost per task	Token usage, API calls
Latency	Time to completion
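
Most of these metrics can be computed from per-run logs. A minimal sketch, assuming a hypothetical `Run` record whose fields your tracing setup would need to supply; the sample numbers are made up for illustration, and error recovery is left out because it needs step-level data.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Run:
    completed: bool     # did the agent finish the task?
    steps: int          # loop iterations used
    tool_calls: int     # tool calls attempted
    tool_errors: int    # calls with the wrong tool or bad parameters
    tokens: int         # total tokens consumed
    seconds: float      # wall-clock time

def summarize(runs: list[Run]) -> dict:
    done = [r for r in runs if r.completed]
    total_calls = sum(r.tool_calls for r in runs)
    total_errors = sum(r.tool_errors for r in runs)
    return {
        "task_completion_rate": len(done) / len(runs),
        "mean_steps_when_completed": mean(r.steps for r in done) if done else None,
        "tool_accuracy": (total_calls - total_errors) / total_calls if total_calls else None,
        "mean_tokens_per_task": mean(r.tokens for r in runs),
        "mean_latency_seconds": mean(r.seconds for r in runs),
    }

runs = [
    Run(True, 4, 4, 0, 1000, 8.0),
    Run(True, 6, 4, 1, 2000, 12.0),
    Run(False, 10, 8, 4, 6000, 40.0),
    Run(True, 5, 4, 0, 3000, 20.0),
]
print(summarize(runs))
```

Reporting steps only for completed runs keeps failed runs that hit the step cap from inflating the efficiency figure.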

Debugging Tools

Tool	Purpose	Link
LangSmith	Tracing, debugging LangChain agents	smith.langchain.com
AgentOps	Agent-specific observability	agentops.ai
Langfuse	Open-source LLM tracing	langfuse.com
Braintrust	Evaluation and logging	braintrust.dev

Common Failure Modes

Failure	Cause	Mitigation
Infinite loops	No termination condition	Max steps, explicit exit
Tool hallucination	Inventing non-existent tools	Strict tool validation
Context overflow	Too much history	Summarization, pruning
Goal drift	Losing track of objective	Explicit goal in state
Premature termination	Stopping before complete	Completion verification
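
The infinite-loop mitigation can be sketched as a guard that the agent loop consults before each action. `LoopGuard` is a hypothetical helper, and the thresholds are arbitrary defaults rather than recommended values.

```python
from collections import Counter

class LoopGuard:
    """Stops a run that exceeds a step budget or keeps repeating one action."""

    def __init__(self, max_steps: int = 20, max_repeats: int = 3):
        self.max_steps = max_steps
        self.max_repeats = max_repeats
        self.steps = 0
        self.seen = Counter()

    def check(self, tool_name: str, arguments: str):
        """Return a stop reason, or None if the action may proceed."""
        self.steps += 1
        if self.steps > self.max_steps:
            return "max steps exceeded"
        self.seen[(tool_name, arguments)] += 1
        if self.seen[(tool_name, arguments)] > self.max_repeats:
            return "same action repeated too often"
        return None

guard = LoopGuard(max_steps=10, max_repeats=2)
for _ in range(3):
    reason = guard.check("search", '{"query": "AI safety"}')
print(reason)  # prints: same action repeated too often
```

Counting identical tool-and-argument pairs catches an agent that retries a failing call verbatim well before the overall step budget runs out.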

Example Projects

Open-source agent implementations to learn from.

Project	Description	Link
GPT-Researcher	Autonomous research agent	github.com/assafelovic/gpt-researcher
AutoGPT	General-purpose autonomous agent	github.com/Significant-Gravitas/AutoGPT
BabyAGI	Minimal task-driven agent	github.com/yoheinakajima/babyagi
Voyager	Minecraft agent with lifelong learning	github.com/MineDojo/Voyager
Open Interpreter	Code execution agent	github.com/OpenInterpreter/open-interpreter
SWE-agent	Software engineering agent	github.com/princeton-nlp/SWE-agent
Devon	Open-source AI software engineer	github.com/entropy-research/Devon

Key Resources

Essential Reading

Building Effective Agents — Anthropic's official guide
LLM Powered Autonomous Agents — Lilian Weng's deep dive
Cognitive Architectures for Language Agents — Academic framework

Courses

AI Agents in LangGraph — DeepLearning.AI
Multi AI Agent Systems with CrewAI — DeepLearning.AI

Notes

Feedback and suggestions are welcome!

Last updated: January 2026

Awesome Prompt Engineering

The ultimate guide to prompt engineering, context engineering, and AI agents.