🤖 AI Summary
Google researchers introduced ReasoningBank, a memory framework that lets LLM-based agents distill broad, reusable reasoning strategies from their own self-judged successes and failures instead of merely storing raw action traces. At test time the agent retrieves relevant distilled memories to guide decisions, then integrates new outcomes back into the bank so it incrementally improves over continued deployments. Compared to prior approaches that store trajectories or only successful routines, ReasoningBank produces more generalizable guidance and boosts both effectiveness and efficiency on web-browsing and software-engineering benchmarks.
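To make the retrieve–act–judge–distill loop concrete, here is a minimal sketch of a ReasoningBank-style memory cycle. All names (`ReasoningBank`, `MemoryItem`, `solve_task`, and the `llm`/`embed` callables) are illustrative placeholders, not the paper's actual code; the point is only to show how distilled strategies are retrieved, applied, self-judged, and folded back into the bank.

```python
# Hypothetical sketch of a ReasoningBank-style memory loop.
# `llm` and `embed` are caller-supplied placeholders (e.g. wrappers around any LLM / embedding API).
from dataclasses import dataclass, field

@dataclass
class MemoryItem:
    strategy: str        # distilled, reusable reasoning guidance
    embedding: list      # vector used for retrieval

@dataclass
class ReasoningBank:
    items: list = field(default_factory=list)

    def retrieve(self, query_vec, k=3):
        # Rank stored strategies by cosine similarity to the task embedding.
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = sum(x * x for x in a) ** 0.5
            nb = sum(y * y for y in b) ** 0.5
            return dot / (na * nb + 1e-9)
        ranked = sorted(self.items, key=lambda m: cos(query_vec, m.embedding), reverse=True)
        return ranked[:k]

    def update(self, strategy, embedding):
        self.items.append(MemoryItem(strategy, embedding))

def solve_task(task, bank, llm, embed):
    # 1. Retrieve distilled strategies relevant to the new task.
    memories = bank.retrieve(embed(task))
    guidance = "\n".join(m.strategy for m in memories)

    # 2. Act with the retrieved guidance in context.
    trajectory = llm(f"Task: {task}\nUseful strategies:\n{guidance}\nSolve step by step.")

    # 3. Self-judge the outcome (no ground-truth labels assumed).
    verdict = llm(f"Did this trajectory solve the task? Answer success or failure.\n{trajectory}")

    # 4. Distill a general lesson from the self-labeled success or failure
    #    and fold it back into the bank for future tasks.
    lesson = llm(f"Trajectory ({verdict}):\n{trajectory}\nExtract one reusable, task-general strategy.")
    bank.update(lesson, embed(lesson))
    return trajectory, verdict
```

The key design choice captured here is that the bank stores distilled strategies rather than raw trajectories, so what gets retrieved later is guidance that can transfer across tasks.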
The team also introduced memory-aware test-time scaling (MaTTS): by allocating extra compute per task, agents generate more abundant and diverse interaction traces, which provide richer contrastive signals for synthesizing higher-quality memories. The improved memory in turn steers more effective scaling, producing a positive feedback loop that yields emergent capabilities. Technically, the approach centers on distilling reasoning patterns from self-judged outcomes, retrieval-augmented decision making, and iterative memory updates, establishing "memory-driven experience scaling" as a new scaling dimension for persistent agents. This work suggests a practical path for agents to self-evolve in long-lived, real-world roles by learning from their history rather than repeatedly relearning the same mistakes.
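Building on the sketch above, the following hedged example illustrates the MaTTS idea: the rollout count `n` stands in for the extra per-task compute, and contrasting self-judged successes against failures supplies the contrastive signal the summary describes. The `matts_step` function and its prompts are assumptions for illustration, not the authors' implementation.

```python
# Illustrative MaTTS step, reusing the hypothetical ReasoningBank, llm, and embed from the sketch above.
def matts_step(task, bank, llm, embed, n=5):
    memories = bank.retrieve(embed(task))
    guidance = "\n".join(m.strategy for m in memories)

    # Scale test-time compute: sample n diverse attempts at the same task.
    rollouts = []
    for _ in range(n):
        traj = llm(f"Task: {task}\nStrategies:\n{guidance}\nSolve step by step.")
        verdict = llm(f"Did this trajectory solve the task? Answer success or failure.\n{traj}")
        rollouts.append((traj, verdict))

    successes = [t for t, v in rollouts if "success" in v.lower()]
    failures = [t for t, v in rollouts if "success" not in v.lower()]

    # Contrast what worked against what did not to distill a higher-quality memory;
    # that memory then guides the next round of scaled rollouts (the feedback loop).
    lesson = llm(
        "Compare these attempts and extract the strategy that separates the successes "
        f"from the failures.\nSuccesses:\n{successes}\nFailures:\n{failures}"
    )
    bank.update(lesson, embed(lesson))

    # Return the best self-judged rollout as the task's answer.
    return max(rollouts, key=lambda r: "success" in r[1].lower())
```

Larger `n` yields more diverse traces and therefore sharper contrasts, which is why extra test-time compute translates into better memories rather than just better single answers.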