Articles
Curated reading on prompt engineering, context engineering, and building with LLMs. Organized by topic for practitioners at all levels.
Contribute: Found an essential article? Submit a PR or open an issue.
Contents
- Context Engineering & Prompting
- Building with LLMs
- Agents & Orchestration
- RAG & Knowledge Systems
- Evaluation & Testing
- Safety & Security
- Production & Operations
- Research & Theory
Context Engineering & Prompting
The craft of communicating effectively with LLMs.
| Article | Author/Source | Why Read It |
|---|---|---|
| Prompt Engineering | Lilian Weng | Comprehensive technical overview of prompting techniques |
| What We Learned from a Year of Building with LLMs (Part I) | O'Reilly | Production lessons from practitioners |
| What We Learned from a Year of Building with LLMs (Part II) | O'Reilly | Operations and organizational insights |
| Prompting Fundamentals | Anthropic | Official Claude prompting best practices |
| The Art of Prompt Design | OpenAI | Official GPT prompting guide |
Building with LLMs
Architecture patterns and system design for LLM applications.
| Article | Author/Source | Why Read It |
|---|---|---|
| Patterns for Building LLM-based Systems & Products | Eugene Yan | Essential architectural patterns |
| The Shift from Models to Compound AI Systems | Berkeley AI Research | Why state-of-the-art results increasingly come from multi-component systems rather than single models |
| Building LLM Applications for Production | Chip Huyen | Production engineering considerations |
| LLM App Stack | a16z | Reference architecture for LLM apps |
Agents & Orchestration
Building autonomous AI systems that reason and act.
| Article | Author/Source | Why Read It |
|---|---|---|
| Building Effective Agents | Anthropic | Official guide to agent development |
| LLM Powered Autonomous Agents | Lilian Weng | Deep dive into agent architectures |
| Cognitive Architectures for Language Agents | CoALA Paper | Academic framework for agent design |
| The Agent Reasoning Loop | LangChain | ReAct, Plan-and-Execute, and more |
RAG & Knowledge Systems
Grounding LLMs with external knowledge.
| Article | Author/Source | Why Read It |
|---|---|---|
| A Survey on RAG for LLMs | arXiv | Comprehensive academic overview |
| Chunking Strategies for LLM Applications | Pinecone | Practical document splitting guide |
| Advanced RAG Techniques | LlamaIndex | Production RAG optimization |
| Retrieval Augmented Generation: A Practical Guide | Cohere | End-to-end RAG implementation |
Evaluation & Testing
Measuring what matters in LLM systems.
| Article | Author/Source | Why Read It |
|---|---|---|
| Your AI Product Needs Evals | Hamel Husain | Why and how to evaluate LLM apps |
| How to Evaluate LLMs: A Complete Metric Framework | O'Reilly | Comprehensive evaluation metrics |
| LLM-as-Judge | arXiv | Using LLMs to evaluate LLMs |
| A Practical Guide to LLM Evaluation | Confident AI | Metrics and methodologies |
Safety & Security
Building robust, secure, and aligned LLM systems.
| Article | Author/Source | Why Read It |
|---|---|---|
| Prompt Injection: What's the Worst That Can Happen? | Simon Willison | Understanding prompt injection risks |
| OWASP Top 10 for LLM Applications | OWASP | Security risks and mitigations |
| Constitutional AI | Anthropic | Training models to critique and revise their own outputs against written principles |
| Red Teaming Language Models | arXiv | Adversarial testing methods |
| Many-Shot Jailbreaking | Anthropic | Long-context vulnerabilities |
Production & Operations
Running LLM systems at scale.
| Article | Author/Source | Why Read It |
|---|---|---|
| LLMOps: Everything You Need to Know | Lakera | Operations overview |
| Monitoring LLMs in Production | Arize AI | Observability practices |
| Cost Optimization for LLM Applications | Helicone | Managing API costs |
| Caching Strategies for LLM Apps | GPTCache | Reducing latency and cost |
Research & Theory
Understanding the foundations.
| Article | Author/Source | Why Read It |
|---|---|---|
| The Illustrated Transformer | Jay Alammar | Visual guide to transformer architecture |
| Attention Is All You Need (Annotated) | Harvard NLP | The foundational paper, explained |
| Scaling Laws for Neural Language Models | OpenAI | How loss scales with model size, data, and compute |
| A Survey of Large Language Models | arXiv | Comprehensive LLM overview |
| Towards Monosemanticity | Anthropic | Understanding what's inside LLMs |
Contributing
To suggest an article:
- Open an issue with the article URL and a brief description
- Or submit a PR adding it to the appropriate section
Criteria for inclusion:
- Answers a real question practitioners have
- From a credible source (researcher, practitioner, or organization)
- Provides actionable insights, not just an overview
- Stands the test of time (not purely news)
Notes
Feedback and suggestions are welcome!
Last updated: January 2026