DeepResearch: Tongyi DeepResearch, the Leading Open-Source DeepResearch Agent (github.com)

🤖 AI Summary
Tongyi Lab announced Tongyi DeepResearch, a 30.5B-parameter open-source agentic LLM (released as Tongyi-DeepResearch-30B-A3B) specialized for long-horizon, deep information-seeking tasks. The model uses a mixture-of-experts design in which only 3.3B parameters are activated per token, giving it large capacity while keeping per-step compute low, and it supports a 128K-token context for extended multi-step web traversal and document synthesis. Tongyi DeepResearch claims state-of-the-art results on several agentic search benchmarks (Humanity's Last Exam, BrowseComp/BrowseComp-ZH, WebWalkerQA, xbench-DeepSearch, FRAMES, SimpleQA) and continues the team's WebAgent lineage, positioning it as a research-grade tool for complex retrieval, planning, and long-horizon reasoning.

Technically, the release emphasizes an end-to-end training stack: a fully automated synthetic-data pipeline for agentic pretraining, supervised fine-tuning, and on-policy reinforcement learning, plus large-scale continual pretraining on agentic interaction data to keep capabilities current. The RL method uses a customized Group Relative Policy Optimization (GRPO) with token-level policy gradients, leave-one-out advantage estimation, and selective filtering of negative samples to stabilize learning in non-stationary environments.

At inference the model supports both a ReAct mode for evaluating core abilities and an IterResearch "Heavy" test-time-scaling mode that pushes performance further. The model, evaluation scripts, and inference guides (Python 3.10 recommended, via conda or virtualenv) are available on Hugging Face and ModelScope, making it immediately usable for researchers building or benchmarking web-capable agents.
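The leave-one-out advantage estimation and token-level gradient aggregation mentioned above can be sketched in a few lines. This is an illustrative reconstruction, not the released training code: the function names, the clipping-free surrogate loss, and the toy inputs are all assumptions.

```python
def leave_one_out_advantages(rewards):
    """Leave-one-out advantage: each rollout's reward minus the mean
    reward of the other G-1 rollouts in its group, so a rollout is
    scored only against its peers, not against itself."""
    g, total = len(rewards), sum(rewards)
    return [r - (total - r) / (g - 1) for r in rewards]

def token_level_pg_loss(token_logprobs, advantages):
    """Token-level policy-gradient objective (sketch): every token in
    the group contributes equally, normalized by the total token count,
    rather than averaging per sequence first -- so long rollouts are
    not down-weighted relative to short ones."""
    total_tokens = sum(len(seq) for seq in token_logprobs)
    loss = 0.0
    for seq, adv in zip(token_logprobs, advantages):
        loss -= adv * sum(seq)  # REINFORCE-style surrogate, no clipping
    return loss / total_tokens
```

Selective negative-sample filtering would then drop some rollouts with negative advantage before computing this loss; the summary does not specify the exact filtering rule.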
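The ReAct mode alternates model-generated thought/action steps with tool observations until the model emits an answer. A minimal sketch of such a loop, assuming a hypothetical `Action: tool[argument]` text format and plain callable tools (Tongyi DeepResearch's actual prompt format and tool interface may differ):

```python
import re

def parse_action(step):
    """Extract (tool_name, argument) from an 'Action: tool[arg]' line.
    The bracket syntax is an assumed convention for this sketch."""
    m = re.search(r"Action:\s*(\w+)\[(.*)\]", step)
    if not m:
        raise ValueError(f"no action found in: {step!r}")
    return m.group(1), m.group(2)

def react_loop(model, tools, question, max_steps=10):
    """ReAct loop: feed the running transcript to the model, execute the
    tool it requests, append the observation, repeat until an answer."""
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        step = model("\n".join(history))  # model emits Thought/Action or Answer
        history.append(step)
        if "Answer:" in step:
            return step.split("Answer:", 1)[1].strip()
        name, arg = parse_action(step)
        history.append(f"Observation: {tools[name](arg)}")
    return None  # step budget exhausted without an answer
```

The 128K context matters precisely because `history` grows with every thought, action, and observation over a long research session.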