Articles

Curated reading on prompt engineering, context engineering, and building with LLMs. Organized by topic for practitioners at all levels.

Contribute: Found an essential article? Submit a PR or open an issue.


Context Engineering & Prompting

The craft of communicating effectively with LLMs.

| Article | Author/Source | Why Read It |
| --- | --- | --- |
| Prompt Engineering | Lilian Weng | Comprehensive technical overview of prompting techniques |
| What We Learned from a Year of Building with LLMs (Part I) | O'Reilly | Production lessons from practitioners |
| What We Learned from a Year of Building with LLMs (Part II) | O'Reilly | Operations and organizational insights |
| Prompting Fundamentals | Anthropic | Official Claude prompting best practices |
| The Art of Prompt Design | OpenAI | Official GPT prompting guide |

Building with LLMs

Architecture patterns and system design for LLM applications.

| Article | Author/Source | Why Read It |
| --- | --- | --- |
| Patterns for Building LLM-based Systems & Products | Eugene Yan | Essential architectural patterns |
| The Shift from Models to Compound AI Systems | Berkeley AI Research | Why pipelines beat single models |
| Building LLM Applications for Production | Chip Huyen | Production engineering considerations |
| LLM App Stack | a16z | Reference architecture for LLM apps |

Agents & Orchestration

Building autonomous AI systems that reason and act.

| Article | Author/Source | Why Read It |
| --- | --- | --- |
| Building Effective Agents | Anthropic | Official guide to agent development |
| LLM Powered Autonomous Agents | Lilian Weng | Deep dive into agent architectures |
| Cognitive Architectures for Language Agents | CoALA Paper | Academic framework for agent design |
| The Agent Reasoning Loop | LangChain | ReAct, Plan-and-Execute, and more |

RAG & Knowledge Systems

Grounding LLMs with external knowledge.

| Article | Author/Source | Why Read It |
| --- | --- | --- |
| A Survey on RAG for LLMs | arXiv | Comprehensive academic overview |
| Chunking Strategies for LLM Applications | Pinecone | Practical document splitting guide |
| Advanced RAG Techniques | LlamaIndex | Production RAG optimization |
| Retrieval Augmented Generation: A Practical Guide | Cohere | End-to-end RAG implementation |

Evaluation & Testing

Measuring what matters in LLM systems.

| Article | Author/Source | Why Read It |
| --- | --- | --- |
| Your AI Product Needs Evals | Hamel Husain | Why and how to evaluate LLM apps |
| How to Evaluate LLMs: A Complete Metric Framework | O'Reilly | Comprehensive evaluation metrics |
| LLM-as-Judge | arXiv | Using LLMs to evaluate LLMs |
| A Practical Guide to LLM Evaluation | Confident AI | Metrics and methodologies |

Safety & Security

Building robust, secure, and aligned LLM systems.

| Article | Author/Source | Why Read It |
| --- | --- | --- |
| Prompt Injection: What's the Worst That Can Happen? | Simon Willison | Understanding prompt injection risks |
| OWASP Top 10 for LLM Applications | OWASP | Security risks and mitigations |
| Constitutional AI | Anthropic | Self-correction against principles |
| Red Teaming Language Models | arXiv | Adversarial testing methods |
| Many-Shot Jailbreaking | Anthropic | Long-context vulnerabilities |

Production & Operations

Running LLM systems at scale.

| Article | Author/Source | Why Read It |
| --- | --- | --- |
| LLMOps: Everything You Need to Know | Lakera | Operations overview |
| Monitoring LLMs in Production | Arize AI | Observability practices |
| Cost Optimization for LLM Applications | Helicone | Managing API costs |
| Caching Strategies for LLM Apps | GPTCache | Reducing latency and cost |

Research & Theory

Understanding the foundations.

| Article | Author/Source | Why Read It |
| --- | --- | --- |
| The Illustrated Transformer | Jay Alammar | Visual guide to transformer architecture |
| Attention Is All You Need (Annotated) | Harvard NLP | The foundational paper, explained |
| Scaling Laws for Neural Language Models | OpenAI | Why scale matters |
| A Survey of Large Language Models | arXiv | Comprehensive LLM overview |
| Towards Monosemanticity | Anthropic | Understanding what's inside LLMs |

Contributing

To suggest an article:

  1. Open an issue with the article URL and a brief description
  2. Or submit a PR adding it to the appropriate section

Criteria for inclusion:


Notes

Feedback and suggestions are welcome!

Last updated: January 2026