Show HN: WyseOS – An AgentOS for Web Automation (medium.com)

0 points | 2 hours ago

🤖 AI Summary

WyseOS is an "AgentOS" for web automation that combines a native multi-agent orchestration layer, a high-performance multi-modal action model, and a persistent experience knowledge base. Announced as a modular, extensible PaaS for building and deploying web automation agents, it centers on a Task Planning Agent that decomposes natural-language goals into sub-tasks and dispatches them to specialist agents (more than ten are built in), each with its own context and toolset. WyseOS exposes APIs, a browser plugin, and an OpenAPI/SDK, and supports major LLMs (GPT, Claude, Gemini, Qwen, etc.). To handle real-world constraints, it pairs a cloud sandboxed browser with a local plugin for complex authentication, and it supports up to 40 iteration steps without a graph DB. The most technically notable piece is its hybrid element-detection pipeline: a fine-tuned YOLOv12 visual detector (with Area Attention and FlashAttention optimizations) fused with DOM-semantic parsing. When a visual detection and a DOM detection overlap by more than 15%, the DOM result is preferred; otherwise the visual detection is kept to improve robustness. Memory and continual learning rely on a hybrid retrieval strategy (TF-IDF plus vector recall, weighted similarity, multi-level filtering) and self-supervised offline learning from task traces. Reported metrics include an average GAIA score of about 59.2% (competitive for open-source), Level-1 success above 75%, and a gain of about 19.5% at Level-2 after training, suggesting practical gains from multi-agent orchestration and experience-based learning for complex, cross-site automation.
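The DOM-over-visual preference rule in the summary can be illustrated with a small sketch. This is not WyseOS code: the summary does not say how "overlap" is measured, so intersection-over-union (IoU) is assumed here, and the `Box` type, the `iou` and `fuse` functions, and the sample coordinates are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical element box in page pixel coordinates."""
    x1: float
    y1: float
    x2: float
    y2: float
    source: str  # "dom" or "visual"
    label: str = ""


def area(b: Box) -> float:
    return max(0.0, b.x2 - b.x1) * max(0.0, b.y2 - b.y1)


def iou(a: Box, b: Box) -> float:
    """Intersection over union; ASSUMED to be the overlap measure."""
    iw = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    ih = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = iw * ih
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def fuse(dom_boxes, visual_boxes, threshold=0.15):
    """Keep every DOM box; keep a visual box only if it overlaps
    no DOM box by more than `threshold` (15% per the summary)."""
    fused = list(dom_boxes)
    for v in visual_boxes:
        if all(iou(v, d) <= threshold for d in dom_boxes):
            fused.append(v)
    return fused


# Illustrative data: one DOM button, one visual duplicate of it,
# and one visual-only icon that the DOM parse did not report.
dom = [Box(0, 0, 100, 40, "dom", "button")]
visual = [
    Box(5, 5, 105, 45, "visual", "button"),
    Box(200, 200, 240, 240, "visual", "icon"),
]
fused = fuse(dom, visual)
```

With these sample boxes, the visual "button" overlaps the DOM button well above the threshold and is dropped in favor of the DOM result, while the visual-only "icon" is kept. A production pipeline would likely also compare labels and confidence scores, which the summary does not describe.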
