Awesome-Agentic-Engineering

🌐 Browser and Desktop Agents

Audience: practitioners · Evidence class: mixed

Last reviewed: April 2026.

Computer-use and browser agents operate on GUI surfaces (DOM, pixels, or accessibility trees) rather than pure APIs. Evidence tags follow the Benchmark and Evidence Policy. Entries that are marketing-only or have no active development were removed in this phase.
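
To make the three GUI surfaces concrete, here is a minimal sketch of how one intent ("click the Submit button") could be grounded differently on each. Everything in it is hypothetical and for illustration only: the `Surface` enum, the `Action` type, the `ground_click` helper, and the target string formats are not taken from any tool listed on this page.

```python
from dataclasses import dataclass
from enum import Enum


class Surface(Enum):
    """The three GUI surfaces named above (illustrative labels)."""
    DOM = "dom"
    PIXELS = "pixels"
    ACCESSIBILITY_TREE = "accessibility_tree"


@dataclass(frozen=True)
class Action:
    kind: str    # e.g. "click"
    target: str  # surface-specific description of where to act


def ground_click(surface: Surface, label: str, locator: dict) -> Action:
    """Turn the intent 'click <label>' into a surface-specific target.

    `locator` is a hypothetical dict carrying whatever the perception
    step produced for that surface.
    """
    if surface is Surface.DOM:
        # DOM agents typically address elements through selectors.
        return Action("click", f"css={locator['selector']}")
    if surface is Surface.PIXELS:
        # Pixel (screenshot) agents act on screen coordinates.
        x, y = locator["center"]
        return Action("click", f"xy={x},{y}")
    if surface is Surface.ACCESSIBILITY_TREE:
        # Accessibility-tree agents address nodes by role and name.
        return Action("click", f"role={locator['role']}[name={label!r}]")
    raise ValueError(f"unsupported surface: {surface}")


if __name__ == "__main__":
    print(ground_click(Surface.DOM, "Submit", {"selector": "button#submit"}))
    print(ground_click(Surface.PIXELS, "Submit", {"center": (412, 300)}))
    print(ground_click(Surface.ACCESSIBILITY_TREE, "Submit", {"role": "button"}))
```

The intent stays the same while the target changes shape, which is one reason agents built for one surface do not transfer directly to another.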

Consumer Products

Agent	Description	Evidence
OpenAI Operator	Autonomous web agent in ChatGPT that pauses at human checkpoints; built on the Computer-Using Agent (CUA) model.	`[official]`
Claude Computer Use	Anthropic's desktop and browser control via screenshots and a tool-use loop.	`[official]`
Claude for Chrome	Anthropic browsing agent running inside Chrome.	`[official]`
Google Project Mariner	Gemini browser agent with multi-task execution in the user’s browser context.	`[official]`
ChatGPT Atlas	OpenAI’s AI-native browser with Agent Mode.	`[official]`
Dia Browser	AI-native browser from The Browser Company (acquired by Atlassian).	`[official]`
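
Several entries above describe a screenshot-driven tool loop with human checkpoints. The sketch below shows the general shape of such a loop under stated assumptions: `Step`, `run_loop`, and the `needs_approval` flag are hypothetical names, the model and the browser are passed in as plain callables, and none of this reflects any vendor's actual API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    action: str                   # e.g. "click", "type", or "done"
    argument: str = ""
    needs_approval: bool = False  # hypothetical flag for sensitive steps


def run_loop(
    take_screenshot: Callable[[], bytes],
    choose_step: Callable[[bytes, List[Step]], Step],
    execute: Callable[[Step], None],
    approve: Callable[[Step], bool],
    max_steps: int = 10,
) -> List[Step]:
    """Observe, decide, act; stop on 'done', a declined checkpoint, or the step budget."""
    history: List[Step] = []
    for _ in range(max_steps):
        # Observe the current screen and let the policy pick the next step.
        step = choose_step(take_screenshot(), history)
        if step.action == "done":
            break
        # Human checkpoint: sensitive steps run only if approved.
        if step.needs_approval and not approve(step):
            break
        execute(step)
        history.append(step)
    return history


if __name__ == "__main__":
    # A scripted stand-in for the model, so the loop runs without any service.
    script = iter([
        Step("click", "Search box"),
        Step("type", "flights to Oslo"),
        Step("click", "Buy", needs_approval=True),
        Step("done"),
    ])
    done = run_loop(
        take_screenshot=lambda: b"",
        choose_step=lambda shot, hist: next(script),
        execute=lambda step: None,
        approve=lambda step: False,  # the human declines the purchase
    )
    print([s.action for s in done])
```

Passing the screenshot, policy, executor, and approval functions as arguments keeps the control flow testable without a real browser or model; the step budget guards against loops that never reach "done".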

Developer Infrastructure

Tool	Description	Evidence
Browser Use	OSS browser agent library with DOM + vision hybrid; widely embedded in other agent stacks.	`[official]`
Skyvern	Vision-driven browser automation using multimodal LLMs for navigation without coded selectors.	`[official]`
UI-TARS	ByteDance open-source native GUI agent model + desktop app for end-to-end computer use.	`[official]` · `[benchmark]` paper
Agent S2 (Simular)	OSS compositional GUI automation framework with generalist + specialist models.	`[official]` · `[benchmark]`
Browserbase	Cloud browser infrastructure for agents; headless Chrome at scale with session persistence.	`[official]`
Amazon Nova Act	AWS browser automation research preview aimed at enterprise reliability.	`[official]`
Playwright MCP	Official MCP server wrapping Playwright for agent-driven browser automation.	`[official]`

Benchmarks & Evaluation

Benchmark	Description	Evidence
OSWorld	Benchmark for multimodal agents on open-ended tasks in real computer environments.	`[official]` · `[benchmark]`
WebArena / VisualWebArena	Reproducible web agent benchmark on real-website snapshots.	`[official]` · `[benchmark]`
WindowsAgentArena	Benchmark for Windows desktop agents across real applications.	`[official]` · `[benchmark]`
