Lost in the Maze: Overcoming Context Limitations in Long-Horizon Agentic Search (arxiv.org)

🤖 AI Summary
Researchers identify context accumulation as the main bottleneck in long-horizon agentic search: agents that must explore the web over long trajectories for tasks like deep research often clog their context windows with noisy content, hit tool-call budgets, or terminate prematurely. To address this, the paper introduces SLIM (Simple Lightweight Information Management), a minimalist framework that splits retrieval into distinct search and browse tools and periodically compresses the agent's trajectory into concise summaries. This keeps the working context small and focused, enabling longer, more purposeful search chains without brute-forcing context size or inflating tool usage.

Technically, SLIM achieves comparable or better task performance at much lower cost across multiple open-source base models. Using the o3 model, SLIM scores 56% on BrowseComp and 31% on HLE—improving over other open-source frameworks by 8 and 4 absolute points respectively—while making 4–6× fewer tool calls. The authors also release an automated trajectory-analysis pipeline and an error taxonomy, showing SLIM produces fewer hallucinations than prior systems.

The paper's key takeaway for the AI/ML community is practical: simple architectural choices—separating search/browse and periodic summarization—can substantially improve scalability, efficiency, and reliability for long-horizon agentic applications.
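The two design choices above can be sketched in a few lines. This is a minimal illustration, not the paper's actual implementation: the tool functions, the `SUMMARIZE_EVERY` threshold, and the `summarize` stand-in (which in practice would be an LLM call) are all assumptions for the sake of the example.

```python
# Illustrative sketch of SLIM-style context management (names and API are
# hypothetical, not from the paper). Two separate retrieval tools -- search()
# returns short snippets, browse() returns full page text -- and the agent's
# trajectory is periodically compressed into a running summary so the
# working context stays small.

SUMMARIZE_EVERY = 4  # assumed knob: compress after this many tool calls


def summarize(summary: str, recent: list[str]) -> str:
    """Stand-in for an LLM call that folds recent steps into the summary."""
    folded = "; ".join(step[:40] for step in recent)
    return (summary + " | " + folded) if summary else folded


class SlimAgent:
    def __init__(self) -> None:
        self.summary = ""    # compressed history of the trajectory
        self.recent = []     # raw, not-yet-compressed tool results
        self.tool_calls = 0

    def record(self, result: str) -> None:
        self.recent.append(result)
        self.tool_calls += 1
        if len(self.recent) >= SUMMARIZE_EVERY:
            # Periodic compression: fold raw results into the summary,
            # then drop them from the working context.
            self.summary = summarize(self.summary, self.recent)
            self.recent = []

    def context(self) -> str:
        # Working context = compact summary + a few recent raw results,
        # instead of the full accumulated trajectory.
        return self.summary + "\n" + "\n".join(self.recent)


def search(query: str) -> str:
    return f"snippets for {query!r}"   # placeholder search tool


def browse(url: str) -> str:
    return f"page text of {url}"       # placeholder browse tool


agent = SlimAgent()
for topic in ["topic-A", "topic-B"]:
    agent.record(search(topic))
    agent.record(browse(f"https://example.com/{topic}"))

print(agent.tool_calls)    # 4 tool calls made
print(len(agent.recent))   # 0 -- raw buffer was compressed into the summary
```

The point of the sketch is the shape of the loop, not the details: keeping search and browse as separate, narrow tools and bounding the raw-result buffer is what prevents the context window from filling with noisy page content over long trajectories.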