Awesome-Agentic-Engineering

🌐 Browser and Desktop Agents

Audience: practitioners · Evidence class: mixed

Last reviewed: April 2026.

Computer-use and browser agents operate on GUI surfaces (DOM, pixels, or accessibility trees) rather than pure APIs. Evidence tags follow the Benchmark and Evidence Policy. Entries that are marketing-only or have no active development were removed in this phase.
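As a rough illustration of the distinction above, the sketch below models the three GUI surfaces and a bare observe-decide-act loop. Every name in it (`Surface`, `Observation`, `Action`, `run_episode`) is hypothetical and not taken from any tool listed here; real agents differ in how they encode observations and actions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Surface(Enum):
    """The three GUI surfaces an agent may observe (illustrative labels)."""
    DOM = "dom"                  # serialized element tree of a web page
    PIXELS = "pixels"            # raw screenshot
    ACCESSIBILITY = "a11y_tree"  # roles, names, and states exposed by the OS or browser


@dataclass
class Observation:
    surface: Surface
    payload: str  # placeholder: real payloads would be HTML, image bytes, or a tree


@dataclass
class Action:
    kind: str         # assumed vocabulary: "click", "type", "done"
    target: str = ""  # selector, coordinates, or node id, depending on the surface
    text: str = ""


def run_episode(
    observe: Callable[[], Observation],
    policy: Callable[[Observation], Action],
    act: Callable[[Action], None],
    max_steps: int = 10,
) -> int:
    """Loop observe -> decide -> act until the policy emits "done".

    Returns the number of policy decisions made (capped at max_steps).
    """
    for step in range(1, max_steps + 1):
        action = policy(observe())
        if action.kind == "done":
            return step
        act(action)
    return max_steps
```

In this sketch the policy stands in for the model call, and `act` stands in for whatever executes clicks and keystrokes; swapping `Surface.DOM` for `Surface.PIXELS` changes only what `observe` returns, not the loop.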

Consumer Products

| Agent | Description | Evidence |
| --- | --- | --- |
| OpenAI Operator | ChatGPT autonomous web agent; human checkpoints; built on the CUA (Computer-Using Agent) model. | [official] |
| Claude Computer Use | Anthropic desktop/browser control via screenshots and tool loop. | [official] |
| Claude for Chrome | Anthropic browsing agent running inside Chrome. | [official] |
| Google Project Mariner | Gemini browser agent with multi-task execution in the user’s browser context. | [official] |
| ChatGPT Atlas | OpenAI’s AI-native browser with Agent Mode. | [official] |
| Dia Browser | AI-native browser from The Browser Company (acquired by Atlassian). | [official] |

Developer Infrastructure

| Tool | Description | Evidence |
| --- | --- | --- |
| Browser Use | OSS browser agent library with DOM + vision hybrid; widely embedded in other agent stacks. | [official] |
| Skyvern | Vision-driven browser automation using multimodal LLMs for navigation without coded selectors. | [official] |
| UI-TARS | ByteDance open-source native GUI agent model + desktop app for end-to-end computer use. | [official] · [benchmark] paper |
| Agent S2 (Simular) | OSS compositional GUI automation framework with generalist + specialist models. | [official] · [benchmark] |
| Browserbase | Cloud browser infrastructure for agents; headless Chrome at scale with session persistence. | [official] |
| Amazon Nova Act | AWS browser automation research preview aimed at enterprise reliability. | [official] |
| Playwright MCP | Official MCP server wrapping Playwright for agent-driven browser automation. | [official] |

Benchmarks & Evaluation

| Benchmark | Description | Evidence |
| --- | --- | --- |
| OSWorld | Benchmark of real computer environments for multimodal agents on open-ended tasks. | [official] · [benchmark] |
| WebArena / VisualWebArena | Reproducible web agent benchmark on real-website snapshots. | [official] · [benchmark] |
| WindowsAgentArena | Benchmark for Windows desktop agents across real applications. | [official] · [benchmark] |