🤖 AI Summary
Google Cloud researchers introduced Test-Time Diffusion Deep Researcher (TTD-DR), a deep research agent that frames long-form report writing as a diffusion-style denoising process: an initial “noisy” draft guides iterative searches, and retrieved facts are used to progressively refine the draft until a final report is produced. The system combines a three-stage backbone (research-plan generation, iterative search composed of search-question generation + answer searching, and final report generation) with two novel algorithms: component-wise self-evolution (multiple answer variants, LLM-as-judge auto-raters for helpfulness/comprehensiveness, iterative revision, and crossover merging) and report-level denoising with retrieval (feeding drafts into search to generate better queries and using synthesized answers to revise the draft). TTD-DR uses Gemini-2.5-pro as its base LLM.
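To make the report-level denoising loop concrete, here is a minimal Python sketch under stated assumptions: an initial draft guides search-question generation, and each retrieved answer is used to revise the draft before the final report is produced. The `llm()` and `web_search()` helpers, the prompts, and the step count are hypothetical placeholders for illustration (the paper's base model is Gemini-2.5-pro), not the authors' implementation.

```python
def llm(prompt: str) -> str:
    """Placeholder for a call to the base LLM (assumed: Gemini-2.5-pro)."""
    raise NotImplementedError

def web_search(query: str) -> str:
    """Placeholder for retrieval that returns a synthesized answer."""
    raise NotImplementedError

def ttd_dr(user_query: str, num_denoise_steps: int = 5) -> str:
    # Stage 1: research-plan generation.
    plan = llm(f"Write a structured research plan for: {user_query}")

    # Initial 'noisy' draft, produced before any retrieval.
    draft = llm(f"Write a preliminary draft report for: {user_query}\nPlan:\n{plan}")

    # Stage 2: iterative search, with the current draft steering each step
    # (the report-level denoising-with-retrieval loop).
    for _ in range(num_denoise_steps):
        # Search-question generation conditioned on the plan and current draft.
        question = llm(
            "Given the plan and the current draft, propose the single most "
            f"useful search question to improve the draft.\nPlan:\n{plan}\n"
            f"Draft:\n{draft}"
        )
        # Answer searching: retrieve and synthesize evidence for the question.
        answer = web_search(question)
        # Denoising step: revise the draft using the retrieved evidence.
        draft = llm(
            "Revise the draft so it incorporates the new evidence and stays "
            f"consistent with the plan.\nPlan:\n{plan}\nEvidence:\n{answer}\n"
            f"Draft:\n{draft}"
        )

    # Stage 3: final report generation from the denoised draft.
    return llm(f"Polish the draft into a final report for: {user_query}\n{draft}")
```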
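Component-wise self-evolution can be sketched in the same spirit: sample several variants of a component (for example, a search answer), score them with an LLM-as-judge auto-rater for helpfulness and comprehensiveness, revise against that feedback, and merge the results with a crossover step. Again, the `llm()` stub, prompts, and loop counts are assumptions for illustration only.

```python
def llm(prompt: str) -> str:
    """Same hypothetical base-LLM placeholder as in the previous sketch."""
    raise NotImplementedError

def self_evolve(task: str, num_variants: int = 3, num_revisions: int = 2) -> str:
    # Sample multiple initial variants of the component.
    variants = [llm(f"(variant {i}) {task}") for i in range(num_variants)]

    for _ in range(num_revisions):
        revised = []
        for v in variants:
            # LLM-as-judge auto-rater for helpfulness and comprehensiveness.
            feedback = llm(
                "Rate this response for helpfulness and comprehensiveness, "
                f"then give concrete revision suggestions.\nTask: {task}\n"
                f"Response:\n{v}"
            )
            # Iterative revision guided by the auto-rater's feedback.
            revised.append(llm(
                f"Revise the response using the feedback.\nTask: {task}\n"
                f"Response:\n{v}\nFeedback:\n{feedback}"
            ))
        variants = revised

    # Crossover: merge the strengths of all revised variants into one output.
    return llm(
        "Merge the following candidate responses into a single response that "
        f"keeps the best content from each.\nTask: {task}\n"
        + "\n---\n".join(variants)
    )
```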
TTD-DR sets new state-of-the-art results on long-form research report generation, with a 74.5% win rate against OpenAI Deep Research, and improves multi-hop reasoning benchmarks by 7.7% and 1.7% on two datasets; ablations show that self-evolution alone raises the DeepConsult win rate to 59.8% and lifts correctness on HLE-Search and GAIA by 4.4% and 1.2%. The key implication is that aligning agent workflows with the human research cycle of planning, drafting, searching, and iterating improves coherence and factual grounding on complex, multi-hop tasks. Practical caveats include dependence on retrieval quality, the compute cost of repeated search-and-revision loops, and sensitivity to the base LLM, but the framework offers a promising template for more robust, research-focused AI assistants.