AI Tools

A practitioner's guide to tools for building, deploying, evaluating, monitoring, and governing AI systems. Organized by what problem each tool solves and who uses it.

This guide answers three questions for every tool:

What real problem does this solve?
Who uses it in a serious organization?
How does it connect to frontier AI systems?

Foundation Models & APIs
Development Frameworks
Agent Orchestration
Prompt Management & Versioning
RAG & Knowledge Infrastructure
Evaluation & Testing
Observability & Monitoring
Safety & Guardrails
Deployment & MLOps
Governance & Compliance
No-Code & Business Platforms
Data & Compute Infrastructure
Research & Learning

Foundation Models & APIs

The AI systems themselves. These are what you're integrating, not building.

Tool	Problem Solved	Primary Users	URL
OpenAI API	Access to GPT-4, GPT-4o, o1, o3 models for text, vision, and reasoning	Developers, product teams	platform.openai.com
Anthropic API	Access to Claude models with strong instruction-following and safety	Developers, enterprise teams	anthropic.com
Google Vertex AI	Unified access to Gemini models with enterprise security	Enterprise ML teams, GCP users	cloud.google.com/vertex-ai
Amazon Bedrock	Single API to multiple foundation models (Claude, Llama, Titan)	AWS enterprise customers	aws.amazon.com/bedrock
Azure OpenAI Service	OpenAI models with enterprise compliance and data residency	Enterprise teams on Azure	azure.microsoft.com/products/ai-services/openai-service
Mistral AI	Open-weight and commercial models, EU-based	Teams needing EU data sovereignty	mistral.ai
Cohere	Enterprise LLMs optimized for RAG and search	Enterprise search teams	cohere.com
Groq	Ultra-fast inference for open models (Llama, Mixtral)	Latency-sensitive applications	groq.com
Together AI	Inference and fine-tuning for 100+ open models	Teams using open-source models	together.ai
Replicate	Run open-source models via API without infrastructure	Prototypers, small teams	replicate.com
Fireworks AI	Fast, cost-efficient inference for open models	Production teams optimizing cost	fireworks.ai

Development Frameworks

Libraries and SDKs for building AI-powered applications.

Tool	Problem Solved	Primary Users	URL
LangChain	Composable framework for LLM applications (chains, agents, RAG)	AI engineers, backend developers	langchain.com
LlamaIndex	Data framework for connecting LLMs to external data sources	Developers building RAG systems	llamaindex.ai
Haystack	End-to-end framework for search and RAG pipelines	Search/NLP engineers	haystack.deepset.ai
Semantic Kernel	Microsoft's SDK for AI orchestration (.NET, Python, Java)	Enterprise .NET developers	github.com/microsoft/semantic-kernel
DSPy	Programming framework that compiles prompts from examples	ML researchers, prompt optimizers	github.com/stanfordnlp/dspy
Instructor	Structured outputs from LLMs with Pydantic validation	Developers needing reliable JSON	github.com/jxnl/instructor
Marvin	Lightweight AI functions for Python applications	Python developers	askmarvin.ai
Guidance	Constrained generation with templates and grammars	Developers needing precise control	github.com/guidance-ai/guidance
Outlines	Structured text generation with guaranteed JSON/regex output	Production ML engineers	github.com/outlines-dev/outlines
Vercel AI SDK	React/Next.js SDK for streaming AI chat interfaces	Frontend developers	sdk.vercel.ai

Agent Orchestration

Frameworks for building autonomous AI agents that reason, plan, and use tools.

Tool	Problem Solved	Primary Users	URL
LangGraph	Build stateful, multi-step agent workflows with cycles	AI engineers building complex agents	langchain-ai.github.io/langgraph
CrewAI	Multi-agent collaboration with role-based agents	Teams building agent teams	crewai.com
AutoGen	Microsoft's framework for multi-agent conversations	Researchers, enterprise teams	microsoft.github.io/autogen
OpenAI Agents SDK	Build agents with OpenAI's native tooling	OpenAI API users	github.com/openai/openai-agents-python
Anthropic MCP	Model Context Protocol for standardized tool integration	Developers building tool-using agents	modelcontextprotocol.io
Letta (MemGPT)	Agents with persistent, self-editing memory	Long-running agent applications	letta.com
Agency Swarm	Framework for creating collaborative agent swarms	Agent developers	github.com/VRSEN/agency-swarm
Composio	150+ tool integrations for AI agents (GitHub, Slack, etc.)	Agent builders needing integrations	composio.dev
AgentOps	Agent observability and debugging	Teams debugging agent behavior	agentops.ai

Prompt Management & Versioning

Track, version, test, and optimize prompts as engineering artifacts.

Tool	Problem Solved	Primary Users	URL
Langfuse	Open-source LLM observability, prompt management, evals	ML teams wanting self-hosted option	langfuse.com
PromptLayer	Prompt versioning, A/B testing, and analytics	Product teams iterating on prompts	promptlayer.com
Humanloop	Prompt management with evaluation and fine-tuning	Enterprise AI product teams	humanloop.com
Agenta	Open-source prompt engineering and LLMOps platform	Teams wanting prompt CI/CD	agenta.ai
Helicone	LLM observability with cost tracking and caching	Teams monitoring API spend	helicone.ai
Pezzo	Open-source AI development toolkit	DevOps teams managing prompts	pezzo.ai
Portkey	AI gateway with prompt management and fallbacks	Production teams needing reliability	portkey.ai
Keywords AI	Unified LLM API with built-in prompt management	Startups, small teams	keywordsai.co
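
Treating a prompt as an engineering artifact can start with deriving a version identifier from its content, so that any change to the template yields a new version. The sketch below is a minimal in-memory illustration using only the Python standard library. `PromptRegistry`, `register`, and `get` are hypothetical names and do not correspond to the API of any tool listed above.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PromptRegistry:
    """Hypothetical in-memory store mapping (name, version) to a template."""
    _store: dict = field(default_factory=dict)
    _latest: dict = field(default_factory=dict)

    def register(self, name: str, template: str) -> str:
        # The version is the first 12 hex characters of the template's SHA-256,
        # so identical text always maps to the same version.
        version = hashlib.sha256(template.encode("utf-8")).hexdigest()[:12]
        self._store[(name, version)] = template
        self._latest[name] = version
        return version

    def get(self, name: str, version: Optional[str] = None) -> str:
        # Fall back to the most recently registered version.
        return self._store[(name, version or self._latest[name])]


registry = PromptRegistry()
v1 = registry.register("summarize", "Summarize the text:\n{text}")
v2 = registry.register("summarize", "Summarize the text in one sentence:\n{text}")

# Pin an exact version when rendering, so a later edit cannot change behavior silently.
prompt = registry.get("summarize", v1).format(text="Vector databases store embeddings.")
```

Pinning a version at the call site is what makes A/B tests and rollbacks possible; the hosted tools above add storage, access control, and analytics on top of the same idea.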

RAG & Knowledge Infrastructure

Connect AI to your organization's data. Vector databases, embeddings, and retrieval.

Interactive Code: 🚀 Learn how RAG works in our RAG Tutorial Notebook

Tool	Problem Solved	Primary Users	URL
Pinecone	Managed vector database for production RAG	Teams needing managed vector search	pinecone.io
Weaviate	Open-source vector database with hybrid search	Teams wanting self-hosted vectors	weaviate.io
Chroma	Lightweight, open-source embedding database	Prototypers, small projects	trychroma.com
Qdrant	High-performance vector database (Rust-based)	Performance-critical applications	qdrant.tech
Milvus	Scalable open-source vector database	Large-scale enterprise deployments	milvus.io
pgvector	Vector similarity search in PostgreSQL	Teams already using PostgreSQL	github.com/pgvector/pgvector
LanceDB	Serverless vector database for multimodal data	Edge/embedded applications	lancedb.com
Voyage AI	High-quality embeddings for enterprise RAG	Teams needing better retrieval	voyageai.com
Cohere Embed	Multilingual embeddings optimized for search	Global enterprise search	cohere.com/embed
Unstructured	ETL for documents (PDF, DOCX, HTML) into LLM-ready chunks	Data engineers building RAG	unstructured.io
Docling	IBM's document understanding for RAG pipelines	Enterprise document processing	github.com/DS4SD/docling
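
At its core, retrieval ranks stored chunks by similarity to a query and passes the top matches to the model as context. The sketch below illustrates that step with word-count vectors and cosine similarity, using only the standard library. A real system would use learned embeddings and one of the vector databases above; the `embed`, `cosine`, and `retrieve` helpers and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Stand-in for an embedding model: lowercase word counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank every document against the query and keep the top k.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


docs = [
    "A refund is processed within five business days.",
    "Our office is closed on public holidays.",
    "To request a refund, open a ticket with your order number.",
]
context = retrieve("how do I request a refund", docs, k=2)
prompt = "Answer using only this context:\n" + "\n".join(context)
```

The brute-force scan over `docs` is the part a vector database replaces with an approximate nearest-neighbor index.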

Evaluation & Testing

Measure AI quality, catch regressions, and ensure reliability before deployment.

Tool	Problem Solved	Primary Users	URL
Promptfoo	Open-source prompt testing and red-teaming	Developers testing prompt changes	promptfoo.dev
Inspect AI	UK AISI's framework for rigorous AI evaluations	Safety researchers, evaluators	inspect.ai-safety-institute.org.uk
Braintrust	End-to-end evaluation platform with datasets and scoring	ML teams building eval pipelines	braintrust.dev
Ragas	Evaluation framework specifically for RAG systems	RAG developers	ragas.io
DeepEval	Unit testing framework for LLM outputs	Developers wanting pytest-style evals	github.com/confident-ai/deepeval
TruLens	Evaluation and tracking for LLM applications	Teams debugging RAG quality	trulens.org
Weave	Weights & Biases tool for LLM evaluation and tracing	W&B users, ML teams	wandb.ai/site/weave
Patronus AI	Automated LLM testing for hallucination and safety	Enterprise compliance teams	patronus.ai
Maxim AI	Evaluation platform for production LLM quality	Product teams tracking quality	getmaxim.ai
Galileo	LLM debugging, evaluation, and fine-tuning	ML engineers diagnosing issues	rungalileo.io
Arize Phoenix	Open-source LLM observability and evaluation	Teams wanting free tracing	phoenix.arize.com
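
Catching regressions usually means running a fixed set of cases through the system and comparing a score against a threshold before each release. The sketch below shows that loop with a stubbed model. `fake_model`, `EvalCase`, `run_eval`, and the threshold value are hypothetical, and the substring check is a deliberately simple stand-in for the scorers that the frameworks above provide.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    prompt: str
    must_contain: str  # substring expected in a passing answer


def fake_model(prompt: str) -> str:
    # Stub standing in for a real model call.
    answers = {
        "capital of France": "The capital of France is Paris.",
        "2 + 2": "2 + 2 equals 4.",
    }
    return next((a for q, a in answers.items() if q in prompt), "I don't know.")


def run_eval(model: Callable[[str], str], cases: list[EvalCase]) -> float:
    # Fraction of cases whose output contains the expected substring.
    passed = sum(c.must_contain.lower() in model(c.prompt).lower() for c in cases)
    return passed / len(cases)


cases = [
    EvalCase("What is the capital of France?", "Paris"),
    EvalCase("What is 2 + 2?", "4"),
    EvalCase("Who wrote Hamlet?", "Shakespeare"),
]
score = run_eval(fake_model, cases)
THRESHOLD = 0.6  # assumed release gate, chosen only for illustration
release_ok = score >= THRESHOLD
```

Running the same cases on every prompt or model change turns a subjective "looks fine" into a number that can block a deploy.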

Observability & Monitoring

See what your AI is doing in production. Trace requests, debug failures, track costs.

Tool	Problem Solved	Primary Users	URL
LangSmith	Tracing, debugging, and monitoring for LangChain apps	LangChain users	smith.langchain.com
Langfuse	Open-source tracing and analytics for LLM apps	Teams wanting self-hosted observability	langfuse.com
Helicone	Request logging, cost tracking, caching	Teams monitoring API costs	helicone.ai
Arize AI	ML observability for production models	MLOps teams	arize.com
Weights & Biases	Experiment tracking and model monitoring	ML researchers and engineers	wandb.ai
Datadog LLM Observability	Enterprise APM with LLM-specific tracing	Enterprise DevOps teams	datadoghq.com/product/llm-observability
New Relic AI Monitoring	LLM monitoring integrated with existing APM	Teams using New Relic	newrelic.com/platform/ai-monitoring
Honeycomb	High-cardinality observability for AI traces	SRE teams debugging production	honeycomb.io
OpenLLMetry	Open-source OpenTelemetry instrumentation for LLM applications	Teams standardizing on OTel	github.com/traceloop/openllmetry

Safety & Guardrails

Protect against jailbreaks, harmful outputs, data leakage, and policy violations.

Tool	Problem Solved	Primary Users	URL
Guardrails AI	Input/output validation with programmable rules	Developers adding safety checks	guardrailsai.com
NeMo Guardrails	NVIDIA's toolkit for conversational safety rails	Enterprise chatbot teams	github.com/NVIDIA/NeMo-Guardrails
Lakera Guard	Real-time protection against prompt injection	Security-conscious teams	lakera.ai
Rebuff	Self-hardening prompt injection detector	Developers building public-facing AI	rebuff.ai
Llama Guard	Meta's safety classifier for LLM inputs/outputs	Teams using Llama models	ai.meta.com/llama
Arthur Shield	Enterprise AI firewall with policy enforcement	Enterprise security teams	arthur.ai
Robust Intelligence	AI security and validation platform	Enterprise ML security	robustintelligence.com
Protect AI	ML security scanning and vulnerability detection	MLOps security teams	protectai.com
Garak	LLM vulnerability scanner (open-source)	Red teamers, security researchers	github.com/leondz/garak
LLM Guard	Open-source input/output sanitization	Developers needing free guardrails	llm-guard.com
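
Output validation against data leakage can begin with pattern-based redaction before a response leaves the system. The sketch below masks email addresses and long digit runs with regular expressions. The patterns and the `redact` helper are hypothetical and far less thorough than the dedicated tools above; they do nothing against prompt injection or jailbreaks.

```python
import re

# Illustrative patterns only; real detectors cover many more formats.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
LONG_DIGITS = re.compile(r"\b\d{9,}\b")  # e.g. account-like numbers


def redact(text: str) -> tuple[str, int]:
    # Replace each match with a placeholder and count how many were found.
    text, n_email = EMAIL.subn("[EMAIL]", text)
    text, n_digits = LONG_DIGITS.subn("[NUMBER]", text)
    return text, n_email + n_digits


clean, hits = redact("Contact jane.doe@example.com about account 1234567890.")
```

Returning the hit count alongside the cleaned text lets the caller log or block responses that triggered a rule, rather than silently rewriting them.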

Deployment & MLOps

Get AI into production: serving, scaling, versioning, and infrastructure.

Tool	Problem Solved	Primary Users	URL
vLLM	High-throughput LLM inference engine	Teams self-hosting models	vllm.ai
TensorRT-LLM	NVIDIA's optimized LLM inference	Teams with NVIDIA GPUs	github.com/NVIDIA/TensorRT-LLM
Ollama	Run LLMs locally with simple CLI	Developers, local experimentation	ollama.com
LM Studio	Desktop app for running local LLMs	Non-technical users, prototypers	lmstudio.ai
Text Generation Inference	Hugging Face's production inference server	HF model deployers	github.com/huggingface/text-generation-inference
BentoML	Build and deploy ML services as APIs	ML engineers productionizing	bentoml.com
Modal	Serverless infrastructure for ML workloads	ML engineers avoiding DevOps	modal.com
Anyscale	Managed Ray for scalable AI applications	Teams needing distributed compute	anyscale.com
Baseten	Deploy and scale custom models	ML teams needing fast deployment	baseten.co
MLflow	Open-source MLOps lifecycle management	ML teams tracking experiments	mlflow.org
Kubeflow	ML workflows on Kubernetes	Enterprise Kubernetes teams	kubeflow.org
SageMaker	End-to-end ML platform on AWS	AWS enterprise customers	aws.amazon.com/sagemaker

Governance & Compliance

Manage AI at the organizational level: policies, access control, audit trails, risk.

Tool	Problem Solved	Primary Users	URL
Credo AI	AI governance, risk assessment, and compliance	AI governance teams, legal	credo.ai
Holistic AI	AI risk management and auditing platform	Compliance officers, auditors	holisticai.com
IBM AI Governance	Enterprise AI lifecycle governance	Large enterprise IT	ibm.com/products/ai-governance
Fiddler AI	Model performance monitoring and explainability	ML teams needing explainability	fiddler.ai
Arthur AI	AI monitoring with bias detection and explainability	Enterprise compliance teams	arthur.ai
Truera	AI quality management and monitoring	Regulated industry ML teams	truera.com
DataRobot MLOps	Enterprise model deployment and monitoring	Enterprise data science teams	datarobot.com
Domino Data Lab	Enterprise MLOps with governance built-in	Large enterprise ML teams	dominodatalab.com
Cleanlab	Data-centric AI for finding label errors	ML teams improving data quality	cleanlab.ai

No-Code & Business Platforms

AI tools for non-developers: analysts, operators, knowledge workers, executives.

Tool	Problem Solved	Primary Users	URL
ChatGPT	General-purpose AI assistant with web access	Everyone	chat.openai.com
Claude	AI assistant with document analysis and coding	Knowledge workers, analysts	claude.ai
Gemini	Google's AI assistant integrated with Workspace	Google Workspace users	gemini.google.com
Microsoft Copilot	AI assistant across Microsoft 365	Enterprise Microsoft users	copilot.microsoft.com
Notion AI	AI writing and summarization in Notion	Notion users, PMs, writers	notion.so/product/ai
Jasper	AI content generation for marketing	Marketing teams	jasper.ai
Copy.ai	AI copywriting and workflow automation	Marketing, sales teams	copy.ai
Glean	Enterprise AI search across all company data	Enterprise knowledge workers	glean.com
Dust	Build AI assistants with company knowledge	Operations teams, analysts	dust.tt
Zapier AI	AI automation in workflow pipelines	Business operations, no-code builders	zapier.com/ai
Dify	Open-source platform for building AI apps	Low-code developers	dify.ai
Flowise	Drag-and-drop LLM flow builder	Non-developers building AI flows	flowiseai.com
n8n	Workflow automation with AI nodes	Technical operations teams	n8n.io
Relevance AI	No-code AI agent builder	Business users building agents	relevanceai.com
Voiceflow	Build conversational AI without code	Product teams building chatbots	voiceflow.com

Data & Compute Infrastructure

The foundation: data processing, compute, and ML infrastructure.

Tool	Problem Solved	Primary Users	URL
Hugging Face Hub	Model and dataset repository	All ML practitioners	huggingface.co
PyTorch	Deep learning framework	ML researchers and engineers	pytorch.org
TensorFlow	End-to-end ML platform	ML engineers, production teams	tensorflow.org
JAX	High-performance ML research framework	ML researchers	github.com/google/jax
Keras	High-level neural network API	ML practitioners wanting simplicity	keras.io
scikit-learn	Classical ML algorithms	Data scientists, analysts	scikit-learn.org
Pandas	Data manipulation and analysis	Data analysts, scientists	pandas.pydata.org
Polars	Fast DataFrame library (Rust-based)	Performance-critical data work	pola.rs
NumPy	Numerical computing foundation	All Python ML practitioners	numpy.org
Databricks	Unified data and AI platform	Enterprise data teams	databricks.com
Snowflake Cortex	AI/ML on Snowflake data	Snowflake users	snowflake.com/en/data-cloud/cortex
BigQuery ML	ML directly in BigQuery SQL	GCP data analysts	cloud.google.com/bigquery-ml
Lambda Labs	GPU cloud for ML training	Teams needing GPU compute	lambdalabs.com
RunPod	GPU cloud with serverless options	Cost-conscious ML teams	runpod.io
Vast.ai	GPU marketplace	Budget ML experimentation	vast.ai

Research & Learning

Stay current: papers, courses, communities, and documentation.

Resource	Purpose	URL
arXiv (cs.AI, cs.LG, cs.CL)	Latest research papers	arxiv.org/list/cs.AI/recent
Papers With Code	Papers with implementation code	paperswithcode.com
Hugging Face Papers	Curated ML paper discussions	huggingface.co/papers
Anthropic Research	Claude and AI safety research	anthropic.com/research
OpenAI Research	GPT and reasoning research	openai.com/research
Google DeepMind	Frontier AI research	deepmind.google/research
Prompt Engineering Guide	Comprehensive prompting documentation	promptingguide.ai
LangChain Documentation	Building LLM applications	docs.langchain.com
Anthropic Docs	Claude best practices	docs.anthropic.com
OpenAI Cookbook	Practical OpenAI examples	cookbook.openai.com
AI Engineer World's Fair	Conference recordings and resources	ai.engineer
Latent Space Podcast	AI engineering discussions	latent.space

If you are a...	Start with these categories
Developer building AI apps	Development Frameworks → Agent Orchestration → Evaluation
ML Engineer in production	Deployment & MLOps → Observability → Safety
Data Scientist exploring AI	Foundation Models → RAG Infrastructure → Evaluation
Product Manager	No-Code Platforms → Prompt Management → Observability
Security/Compliance Lead	Safety & Guardrails → Governance → Observability
Executive/Decision Maker	No-Code Platforms → Governance → Research

The Integration Challenge

Companies will not struggle to access AI.
They will struggle to integrate, trust, measure, and govern it under pressure.

This is why the tools in Evaluation, Observability, Safety, and Governance matter as much as the models themselves. The organizations that succeed with AI will be those that:

Measure what their AI systems actually do (not just what they're supposed to do)
Trace decisions back to inputs, prompts, and context
Protect against adversarial inputs and harmful outputs
Govern AI use with clear policies and audit trails
Iterate based on real production data, not assumptions
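
The first two practices, measuring and tracing, can be sketched as a wrapper that records the prompt version, inputs, output, and latency of every call. This is a minimal illustration with a stubbed model and an in-memory log. `traced_call`, `TRACE_LOG`, `echo_model`, and the record fields are hypothetical and are not the schema of any tool in this guide.

```python
import time
import uuid
from typing import Callable

TRACE_LOG: list[dict] = []  # in-memory stand-in for a tracing backend


def traced_call(model: Callable[[str], str], prompt_version: str,
                template: str, **inputs: str) -> str:
    # Render the prompt, call the model, and record everything needed
    # to trace the output back to its inputs later.
    prompt = template.format(**inputs)
    start = time.perf_counter()
    output = model(prompt)
    TRACE_LOG.append({
        "trace_id": uuid.uuid4().hex,
        "prompt_version": prompt_version,
        "inputs": inputs,
        "prompt": prompt,
        "output": output,
        "latency_s": time.perf_counter() - start,
    })
    return output


def echo_model(prompt: str) -> str:
    # Stub standing in for a real model call.
    return prompt.upper()


answer = traced_call(echo_model, "v1", "Classify: {text}", text="late delivery")
record = TRACE_LOG[-1]
```

Records like these are the raw material for the remaining practices: audit trails for governance and real production data for iteration.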

Notes

Feedback and suggestions are welcome!

This list is maintained as part of the Awesome Prompt Engineering collection. For contributions, please see the repository guidelines.

Last updated: January 2026
