(FULLY OPEN SOURCE) open-computer-use: Computer agents working on their own VMs (github.com)

🤖 AI Summary
Open Computer Use is a fully open-source platform that gives AI agents real computer control — web browsing (search, click, form-fill, scrape), terminal command execution, and desktop UI automation — and orchestrates those capabilities across specialized agents in ephemeral, sandboxed Docker VMs. It aims to replicate “computer use” features (à la Anthropic’s Claude Computer Use) but as a self-hostable, extensible stack: a Next.js frontend, FastAPI backend with a multi-agent executor (planner → browser agent → terminal agent), and Ubuntu 22.04 XFCE Docker VMs running a WebSocket agent server and VNC. The system streams execution with live screenshots, tool-call visualizations, logs, and supports model-key switching across OpenAI, Anthropic, Google Gemini, Mistral, xAI and more, while encrypting API keys and using Supabase for auth/state. For practitioners this lowers the bar to building autonomous, reproducible workflows and multi-agent automation without vendor lock-in, with safety controls like isolated ephemeral containers, network isolation options, and resource limits. Technical prerequisites are Node 20+, Python 3.10+, Docker, a Supabase project and search API keys; benchmarked performance shows ~45s average task completion, ~15s VM startup, ~2GB memory per session, and 50+ concurrent sessions on a modest test machine. The repo includes deployment guides, a plugin roadmap and responsible-use guidance — making it a practical base for research, automation tooling, and community-driven agent development, while also raising clear security and compliance considerations.
Loading comments...
loading comments...