Tiny Recursive Model – 7M parameter NN that outperforms LLMs (github.com)

🤖 AI Summary
Researchers released the Tiny Recursion Model (TRM), a 7M-parameter neural network that uses a recursive reasoning loop to solve hard tasks, reporting 45% on ARC-AGI-1 and 8% on ARC-AGI-2. TRM challenges the prevailing assumption that only very large, expensive foundation models can handle complex reasoning: by repeatedly refining a candidate answer with a compact network, the authors claim competitive or superior performance to prior small-model baselines (e.g., HRM) on certain benchmarks while keeping compute and parameter counts tiny. This makes a strong case for parameter-efficient research directions that reduce reliance on scaling alone.

Technically, TRM embeds the input question x, an initial answer y, and a latent state z, then performs up to K improvement steps in which it (i) iteratively updates z n times conditioned on x and y (the inner recursive reasoning) and (ii) updates y from the current z. This cheap, iterative refinement lets the same small network correct prior mistakes over multiple reasoning cycles, improving sample efficiency and reducing overfitting.

The open-source code is PyTorch-based (requires CUDA), with example configs using 2-layer models and multiple H/L cycles; ARC pretraining runs take on the order of days on multi-GPU setups. TRM's release provides a practical blueprint for low-cost, iterative reasoning models and invites follow-up on scaling, integration with LLMs, and broader benchmark comparisons.
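To make the x/y/z refinement loop concrete, here is a minimal PyTorch sketch of the structure described above. It is an illustration under stated assumptions, not the repo's actual code: the module names (TinyRecursiveSketch, update_z, update_y), the embedding dimension, and the residual-MLP update rule are all placeholders chosen for readability; only the overall loop shape (K outer answer-improvement steps, each with n inner latent updates conditioned on x and y) comes from the summary.

```python
# Hedged sketch of TRM-style recursive refinement. Names, dimensions, and the
# MLP update rules are illustrative assumptions, not the repository's API.
import torch
import torch.nn as nn


class TinyRecursiveSketch(nn.Module):
    def __init__(self, dim: int = 256, n_inner: int = 6, k_steps: int = 3):
        super().__init__()
        self.n_inner = n_inner   # n: inner recursive updates of the latent z
        self.k_steps = k_steps   # K: outer answer-improvement steps
        # The same small network is reused at every step (2-layer MLPs here).
        self.update_z = nn.Sequential(
            nn.Linear(3 * dim, dim), nn.GELU(), nn.Linear(dim, dim)
        )
        self.update_y = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.GELU(), nn.Linear(dim, dim)
        )

    def forward(self, x: torch.Tensor, y: torch.Tensor, z: torch.Tensor):
        # x: embedded question, y: current answer embedding, z: latent state
        for _ in range(self.k_steps):
            # (i) refine the latent reasoning state n times, conditioned on x and y
            for _ in range(self.n_inner):
                z = z + self.update_z(torch.cat([x, y, z], dim=-1))
            # (ii) update the candidate answer from the current latent state
            y = y + self.update_y(torch.cat([y, z], dim=-1))
        return y, z


if __name__ == "__main__":
    dim, batch = 256, 4
    model = TinyRecursiveSketch(dim)
    x = torch.randn(batch, dim)   # embedded question
    y = torch.zeros(batch, dim)   # initial answer guess
    z = torch.zeros(batch, dim)   # initial latent state
    y, z = model(x, y, z)
    print(y.shape)                # torch.Size([4, 256])
```

The point of the sketch is that the same tiny network is applied repeatedly, so capacity comes from iteration depth rather than parameter count; for the authors' actual architecture and training setup, see the linked repository.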