AI Agents

AI agents are systems that use LLMs to reason, plan, and take actions autonomously. They represent the evolution from single-turn chatbots to multi-step, tool-using systems that can accomplish complex tasks.

Understanding agent architectures is essential for modern prompt and context engineering — agents are where prompting meets system design.


What Makes an Agent

An AI agent combines four core capabilities:

┌─────────────────────────────────────────────────────────┐
│                      AI AGENT                           │
├─────────────────────────────────────────────────────────┤
│  🧠 REASONING    │  Plan steps, analyze results        │
│  🔧 TOOL USE     │  Call APIs, search, execute code    │
│  💾 MEMORY       │  Remember context across steps      │
│  🔄 ITERATION    │  Loop until task complete           │
└─────────────────────────────────────────────────────────┘

Chatbot vs Agent:

| Capability | Chatbot | Agent |
| --- | --- | --- |
| Turns | Single response | Multiple steps |
| Tools | None or limited | Dynamic tool selection |
| Memory | Session only | Persistent state |
| Planning | None | Explicit reasoning |
| Autonomy | Reactive | Proactive |

Agent Patterns

Core architectural patterns for building agents.

ReAct (Reasoning + Acting)

The foundational pattern: interleave thinking with action.

Thought: I need to find the current weather in Tokyo
Action: weather_api(location="Tokyo")
Observation: 15°C, partly cloudy
Thought: Now I can answer the user's question
Answer: It's currently 15°C and partly cloudy in Tokyo.

Use when: Tasks require dynamic tool selection based on intermediate results.
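
The trace above can be produced by a short loop. The following is a minimal sketch, not a reference implementation: `fake_model` is a scripted stand-in for an LLM call, and `weather_api`, `TOOLS`, and `ACTION_RE` are hypothetical names introduced only for illustration.

```python
import re

def weather_api(location: str) -> str:
    """Hypothetical tool: returns canned weather data."""
    return "15°C, partly cloudy"

TOOLS = {"weather_api": weather_api}

def fake_model(transcript: str) -> str:
    """Stand-in for an LLM call: acts first, answers once an observation exists."""
    if "Observation:" not in transcript:
        return ('Thought: I need to find the current weather in Tokyo\n'
                'Action: weather_api(location="Tokyo")')
    return ("Thought: Now I can answer the user's question\n"
            "Answer: It's currently 15°C and partly cloudy in Tokyo.")

# Assumed action syntax: tool_name(location="...")
ACTION_RE = re.compile(r'Action: (\w+)\(location="([^"]*)"\)')

def react(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_steps):              # hard cap so the loop always ends
        output = fake_model(transcript)
        transcript += "\n" + output
        if "Answer:" in output:             # the model decided it is done
            return output.split("Answer:", 1)[1].strip()
        match = ACTION_RE.search(output)
        if match is None:                   # neither an action nor an answer
            break
        tool_name, location = match.groups()
        observation = TOOLS[tool_name](location=location)
        transcript += f"\nObservation: {observation}"   # feed the result back
    return "No answer within step limit"

answer = react("What's the weather in Tokyo?")
print(answer)
```

The key design point is that each observation is appended to the transcript before the next model call, so the next thought can depend on what the tool returned.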

Plan-and-Execute

Separate planning from execution for complex tasks.

PLAN:
1. Search for recent AI safety papers
2. Summarize top 3 findings
3. Compare to last year's research
4. Write synthesis report

EXECUTE:
[Step 1] Searching... found 47 papers
[Step 2] Summarizing top 3...
[Step 3] Comparing...
[Step 4] Writing report...

Use when: Tasks have clear phases and benefit from upfront planning.
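
A minimal sketch of the same split in code, assuming a fixed plan and stub executors. `make_plan` and `EXECUTORS` are hypothetical; in a real system the planner and each step would typically call an LLM or a tool.

```python
from typing import Callable

def make_plan(task: str) -> list[str]:
    """Stand-in planner; a real agent would ask an LLM for this list."""
    return ["search", "summarize", "compare", "report"]

# Hypothetical executors: each receives the results so far and returns its own.
EXECUTORS: dict[str, Callable[[dict], str]] = {
    "search": lambda results: "found 47 papers",
    "summarize": lambda results: f"summary of top 3 from: {results['search']}",
    "compare": lambda results: "comparison with last year's research",
    "report": lambda results: f"report built from {len(results)} earlier steps",
}

def plan_and_execute(task: str) -> dict[str, str]:
    plan = make_plan(task)        # planning phase: fixed before any work starts
    results: dict[str, str] = {}
    for step in plan:             # execution phase: run the steps in order
        results[step] = EXECUTORS[step](results)
    return results

results = plan_and_execute("Research AI safety")
print(results["report"])
```

Because the plan is fixed up front, this sketch cannot react to surprising intermediate results; replanning after a failed step is a common extension.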

Reflection / Self-Critique

Agent evaluates and improves its own output.

INITIAL OUTPUT: [first attempt]
CRITIQUE: The code doesn't handle edge cases for empty input
REVISION: [improved version with edge case handling]
VERIFY: Now passes all test cases

Use when: Quality is critical and errors are costly.
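
The critique-and-revise cycle can be sketched as a bounded loop. This is an illustration under stated assumptions: `generate` and `critique` are scripted stand-ins for model calls, and the critic here simply runs the draft on an empty list. Executing model-written code with `exec` is only acceptable in a toy like this; real agents should run it in a sandbox.

```python
from typing import Optional

def generate(task: str, feedback: Optional[str] = None) -> str:
    """Stand-in generator: the first draft omits the empty-input guard."""
    if feedback is None:
        return "def mean(xs): return sum(xs) / len(xs)"
    return "def mean(xs): return sum(xs) / len(xs) if xs else 0.0"

def critique(code: str) -> Optional[str]:
    """Stand-in critic: runs the draft on an edge case and reports any failure."""
    namespace: dict = {}
    exec(code, namespace)          # toy only; sandbox this in practice
    try:
        namespace["mean"]([])
    except ZeroDivisionError:
        return "The code doesn't handle edge cases for empty input"
    return None                    # no complaints

def reflect(task: str, max_rounds: int = 3) -> tuple[str, int]:
    draft = generate(task)
    for round_number in range(max_rounds):
        feedback = critique(draft)
        if feedback is None:       # verified: stop revising
            return draft, round_number
        draft = generate(task, feedback)
    return draft, max_rounds       # budget exhausted; last draft is unverified

final_code, revisions = reflect("Write a mean function")
print(revisions, final_code)
```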

Multi-Agent Collaboration

Multiple specialized agents working together.

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Researcher │ ──► │   Writer    │ ──► │   Editor    │
│  (gathers)  │     │  (drafts)   │     │  (refines)  │
└─────────────┘     └─────────────┘     └─────────────┘

Use when: Tasks benefit from specialized roles or perspectives.
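
As a rough sketch, the researcher, writer, and editor roles above can be modeled as functions that each transform the previous agent's output. The three functions are hypothetical stubs; real agents would each wrap a model call with a role-specific prompt.

```python
from functools import reduce
from typing import Callable

Agent = Callable[[str], str]   # assumed interface: text in, text out

def researcher(topic: str) -> str:
    return f"notes on {topic}"

def writer(notes: str) -> str:
    return f"draft based on {notes}"

def editor(draft: str) -> str:
    return draft.capitalize() + "."

def run_pipeline(agents: list[Agent], task: str) -> str:
    # Hand each agent's output to the next agent in the list.
    return reduce(lambda payload, agent: agent(payload), agents, task)

article = run_pipeline([researcher, writer, editor], "agent memory")
print(article)
```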


Orchestration Frameworks

Tools for building agent systems.

| Framework | Best For | Key Features | Link |
| --- | --- | --- | --- |
| LangGraph | Complex workflows with cycles | State management, conditional edges, persistence | langchain-ai.github.io/langgraph |
| CrewAI | Role-based multi-agent teams | Agent personas, task delegation, collaboration | crewai.com |
| AutoGen | Conversational multi-agent | Microsoft-backed, group chat patterns | microsoft.github.io/autogen |
| OpenAI Agents SDK | OpenAI-native agents | Handoffs, guardrails, tracing | github.com/openai/openai-agents-python |
| Anthropic MCP | Standardized tool integration | Model Context Protocol, universal tool format | modelcontextprotocol.io |
| Letta (MemGPT) | Long-term memory | Persistent memory, self-editing context | letta.com |
| DSPy | Optimized prompts | Compile prompts from examples, auto-optimization | github.com/stanfordnlp/dspy |

Tool Integration

Agents need tools to interact with the world.

Common Tool Categories

| Category | Examples | Use Case |
| --- | --- | --- |
| Search | Web search, document search, code search | Information retrieval |
| Code | Python REPL, shell, sandboxed execution | Computation, automation |
| APIs | Weather, stocks, databases, SaaS | External data and actions |
| Files | Read, write, parse documents | Document processing |
| Communication | Email, Slack, calendar | User-facing actions |

Tool Definition Example

# OpenAI Function Calling format
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and country, e.g., 'Tokyo, Japan'"
                },
                "units": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "default": "celsius"
                }
            },
            "required": ["location"]
        }
    }
}]
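
A definition like this only describes the tool; the application still has to run it when the model asks. Below is a minimal dispatcher sketch, assuming the model returns a tool name plus a JSON string of arguments. `get_weather`, `REGISTRY`, `REQUIRED`, and `dispatch` are hypothetical names, and the weather data is canned.

```python
import json

def get_weather(location: str, units: str = "celsius") -> str:
    """Hypothetical implementation returning canned data."""
    return f"15 degrees {units} in {location}"

REGISTRY = {"get_weather": get_weather}     # tool name -> callable
REQUIRED = {"get_weather": ["location"]}    # mirrors "required" in the schema

def dispatch(name: str, arguments_json: str) -> str:
    """Validate a requested tool call, then run it or return an error string."""
    if name not in REGISTRY:                # the model invented a tool
        return f"error: unknown tool '{name}'"
    try:
        args = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        return f"error: arguments are not valid JSON ({exc.msg})"
    if not isinstance(args, dict):
        return "error: arguments must be a JSON object"
    missing = [key for key in REQUIRED[name] if key not in args]
    if missing:
        return f"error: missing required arguments {missing}"
    return REGISTRY[name](**args)

print(dispatch("get_weather", '{"location": "Tokyo, Japan"}'))
print(dispatch("get_stock", "{}"))
```

Returning errors as strings, rather than raising, lets the agent loop pass the message back to the model as an observation so it can correct the call.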

Tool Platforms

| Platform | What It Provides | Link |
| --- | --- | --- |
| Composio | 150+ pre-built integrations (GitHub, Slack, etc.) | composio.dev |
| Toolhouse | Managed tool infrastructure | toolhouse.ai |
| Browserbase | Browser automation for agents | browserbase.com |

Memory & State

Agents need memory to work across multiple steps and sessions.

Memory Types

| Type | Scope | Use Case |
| --- | --- | --- |
| Working Memory | Current task | Intermediate results, scratchpad |
| Short-Term | Current session | Conversation history |
| Long-Term | Across sessions | User preferences, learned facts |
| Episodic | Past interactions | Similar past tasks, outcomes |
| Semantic | Domain knowledge | Facts, relationships, embeddings |

State Management Patterns

Explicit State Object:

state = {
    "task": "Research AI safety",
    "steps_completed": ["search", "summarize"],
    "current_step": "compare",
    "artifacts": {"papers": [...], "summary": "..."},
    "errors": []
}
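
One way to keep such a state object consistent is to wrap it in a small class, so that completing a step always updates the step lists and the artifacts together. This is a sketch assuming a simple linear task; `AgentState` and its method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class AgentState:
    task: str
    pending_steps: list[str]
    steps_completed: list[str] = field(default_factory=list)
    artifacts: dict[str, object] = field(default_factory=dict)
    errors: list[str] = field(default_factory=list)

    @property
    def current_step(self) -> Optional[str]:
        """The next step to run, or None when the task is finished."""
        return self.pending_steps[0] if self.pending_steps else None

    def complete_step(self, artifact_key: str, artifact: object) -> None:
        """Move the current step to the completed list and store its output."""
        step = self.pending_steps.pop(0)
        self.steps_completed.append(step)
        self.artifacts[artifact_key] = artifact

state = AgentState(task="Research AI safety",
                   pending_steps=["search", "summarize", "compare"])
state.complete_step("papers", ["paper A", "paper B"])   # placeholder artifacts
state.complete_step("summary", "two papers summarized")
print(state.current_step, state.steps_completed)
```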

Conversation History:

messages = [
    {"role": "system", "content": "You are a research assistant..."},
    {"role": "user", "content": "Find recent AI safety papers"},
    {"role": "assistant", "content": "I'll search for...", "tool_calls": [...]},
    # In the OpenAI chat format, a tool message carries the id of the call it answers
    {"role": "tool", "tool_call_id": "...", "content": "Found 47 papers..."},
    {"role": "assistant", "content": "I found 47 papers. The top 3 are..."}
]
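
A history list like this grows with every step, so long-running agents usually trim it. The sketch below keeps system messages plus the newest messages that fit a budget. It assumes character count as a crude proxy for tokens, and `prune_history` is a hypothetical helper. It also ignores the pairing between a tool call and its result, which a real implementation would need to keep together.

```python
def prune_history(messages: list[dict], max_chars: int) -> list[dict]:
    """Keep system messages plus the newest messages that fit in max_chars."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    budget = max_chars - sum(len(m["content"]) for m in system)
    kept: list[dict] = []
    for message in reversed(rest):       # walk from newest to oldest
        cost = len(message["content"])
        if cost > budget:                # the next-oldest message does not fit
            break
        kept.append(message)
        budget -= cost
    return system + kept[::-1]           # restore chronological order

history = [
    {"role": "system", "content": "You are a research assistant."},
    {"role": "user", "content": "Find recent AI safety papers"},
    {"role": "assistant", "content": "I found 47 papers."},
    {"role": "user", "content": "Summarize the top 3"},
]
pruned = prune_history(history, max_chars=70)
print([m["role"] for m in pruned])
```

Summarizing the dropped messages into a single short message, instead of discarding them, is a common alternative when older context still matters.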

Multi-Agent Systems

Patterns for agents working together.

Hierarchical

         ┌──────────────┐
         │  Supervisor  │
         └──────┬───────┘
        ┌───────┼───────┐
        ▼       ▼       ▼
    ┌──────┐ ┌──────┐ ┌──────┐
    │Agent1│ │Agent2│ │Agent3│
    └──────┘ └──────┘ └──────┘

Supervisor delegates and coordinates.
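
A minimal sketch of the hierarchical pattern, assuming the supervisor's routing decision can be reduced to keyword rules. In practice the supervisor would usually be an LLM choosing among workers; `WORKERS`, `route`, and `supervise` are hypothetical names.

```python
from typing import Callable

# Hypothetical worker agents, each stubbed as a function from task to result.
WORKERS: dict[str, Callable[[str], str]] = {
    "research": lambda task: f"[research] sources for: {task}",
    "code": lambda task: f"[code] patch for: {task}",
    "write": lambda task: f"[write] draft for: {task}",
}

def route(subtask: str) -> str:
    """Stand-in for the supervisor's decision about which worker to use."""
    lowered = subtask.lower()
    if "bug" in lowered or "implement" in lowered:
        return "code"
    if "find" in lowered or "search" in lowered:
        return "research"
    return "write"                      # default worker

def supervise(subtasks: list[str]) -> list[str]:
    # Delegate each subtask, then collect the results in order.
    return [WORKERS[route(subtask)](subtask) for subtask in subtasks]

outputs = supervise([
    "Find papers on tool use",
    "Fix the parser bug",
    "Summarize results",
])
for line in outputs:
    print(line)
```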

Collaborative

    ┌──────┐     ┌──────┐
    │Agent1│◄───►│Agent2│
    └──┬───┘     └───┬──┘
       │             │
       └──────┬──────┘
              ▼
         ┌──────┐
         │Agent3│
         └──────┘

Agents communicate peer-to-peer.

Pipeline

Agent1 ──► Agent2 ──► Agent3 ──► Output

Sequential handoffs with specialization.


Evaluation & Debugging

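Agents are harder to evaluate than single-turn models: a run spans many steps, and a wrong tool call or bad intermediate result at any step can change the outcome.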

What to Measure

| Metric | What It Tells You |
| --- | --- |
| Task completion rate | Does it finish the job? |
| Step efficiency | How many steps to complete? |
| Tool accuracy | Right tool, right parameters? |
| Error recovery | Handles failures gracefully? |
| Cost per task | Token usage, API calls |
| Latency | Time to completion |
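
Most of these metrics can be computed from per-run logs. The following sketch assumes each run is logged as a `RunRecord` with the fields shown; the record layout and the two sample runs are made up for illustration and are not measurements.

```python
from dataclasses import dataclass

@dataclass
class RunRecord:
    completed: bool          # did the agent finish the task?
    steps: int               # steps taken
    tool_calls: int          # tool calls made
    correct_tool_calls: int  # calls with the right tool and parameters
    tokens: int              # total tokens used
    seconds: float           # wall-clock time

def summarize_runs(runs: list[RunRecord]) -> dict[str, float]:
    n = len(runs)
    total_calls = sum(r.tool_calls for r in runs)
    correct_calls = sum(r.correct_tool_calls for r in runs)
    return {
        "task_completion_rate": sum(r.completed for r in runs) / n,
        "avg_steps": sum(r.steps for r in runs) / n,
        "tool_accuracy": correct_calls / total_calls if total_calls else 0.0,
        "avg_tokens": sum(r.tokens for r in runs) / n,
        "avg_seconds": sum(r.seconds for r in runs) / n,
    }

# Illustrative records only.
runs = [
    RunRecord(True, 4, 3, 3, 2000, 8.0),
    RunRecord(False, 10, 5, 3, 6000, 20.0),
]
metrics = summarize_runs(runs)
print(metrics)
```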

Debugging Tools

| Tool | Purpose | Link |
| --- | --- | --- |
| LangSmith | Tracing, debugging LangChain agents | smith.langchain.com |
| AgentOps | Agent-specific observability | agentops.ai |
| Langfuse | Open-source LLM tracing | langfuse.com |
| Braintrust | Evaluation and logging | braintrust.dev |

Common Failure Modes

| Failure | Cause | Mitigation |
| --- | --- | --- |
| Infinite loops | No termination condition | Max steps, explicit exit |
| Tool hallucination | Inventing non-existent tools | Strict tool validation |
| Context overflow | Too much history | Summarization, pruning |
| Goal drift | Losing track of objective | Explicit goal in state |
| Premature termination | Stopping before complete | Completion verification |
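
The infinite-loop mitigation can be sketched as a guard around the agent's step function. This example assumes actions are represented as strings and that returning `None` is the agent's explicit exit; `run_guarded` and its parameters are hypothetical.

```python
from typing import Callable, Optional

def run_guarded(
    next_action: Callable[[list[str]], Optional[str]],
    max_steps: int = 10,
    max_repeats: int = 2,
) -> tuple[list[str], str]:
    """Run until the agent exits, repeats an action too often, or hits the cap."""
    history: list[str] = []
    for _ in range(max_steps):
        action = next_action(history)
        if action is None:                        # explicit exit
            return history, "done"
        if history.count(action) >= max_repeats:  # same action again: likely stuck
            return history, "stopped: repeated action"
        history.append(action)
    return history, "stopped: max steps"

# A stuck agent that keeps issuing the same search.
stuck_history, stuck_status = run_guarded(lambda history: "search(query='x')")

# A well-behaved agent that exits after three distinct steps.
ok_history, ok_status = run_guarded(
    lambda history: None if len(history) >= 3 else f"step{len(history)}"
)
print(stuck_status, ok_status)
```

Exact-match repeat detection is deliberately simple; an agent that loops with slightly varied arguments would only be caught by the step cap.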

Example Projects

Open-source agent implementations to learn from.

| Project | Description | Link |
| --- | --- | --- |
| GPT-Researcher | Autonomous research agent | github.com/assafelovic/gpt-researcher |
| AutoGPT | General-purpose autonomous agent | github.com/Significant-Gravitas/AutoGPT |
| BabyAGI | Minimal task-driven agent | github.com/yoheinakajima/babyagi |
| Voyager | Minecraft agent with lifelong learning | github.com/MineDojo/Voyager |
| Open Interpreter | Code execution agent | github.com/OpenInterpreter/open-interpreter |
| SWE-agent | Software engineering agent | github.com/princeton-nlp/SWE-agent |
| Devon | Open-source AI software engineer | github.com/entropy-research/Devon |


Notes

Feedback and suggestions are welcome!

Last updated: January 2026