🤖 AI Summary
Researchers propose "Metacognitive Reuse," a method that identifies recurring chain-of-thought fragments produced by LLMs, compresses them via the model's own metacognitive analysis into concise "behaviors" (a name paired with a short instruction), and stores them in a behavior handbook for reuse. Behaviors are supplied in-context at inference time or distilled into model parameters through supervised fine-tuning. Across experiments the approach reduces redundant reasoning, speeds up inference, and improves accuracy: behavior-conditioned inference cuts reasoning tokens by up to 46% while matching or improving baseline accuracy; behavior-guided self-improvement (no parameter updates) raises accuracy by up to 10% over a simple critique-and-revise baseline; and behavior-conditioned SFT teaches non-reasoning models to reason more effectively than vanilla SFT.
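The behavior-conditioned inference step can be sketched roughly as follows. This is an illustrative sketch only, not the paper's implementation: the handbook contents, the behavior names, and the `build_behavior_prompt` helper are all assumptions, standing in for whatever retrieval and prompting machinery the method actually uses.

```python
# Hypothetical behavior handbook: name -> short procedural instruction.
# In the paper this is distilled from the model's own chains of thought;
# here the entries are hand-written placeholders.
BEHAVIOR_HANDBOOK = {
    "behavior_eliminate_variable": (
        "When two unknowns appear, set up simultaneous equations and "
        "eliminate one variable before solving."
    ),
    "behavior_check_units": (
        "Before reporting a quantity, verify its units match what the "
        "question asks for."
    ),
}

def build_behavior_prompt(question: str, behavior_names: list[str]) -> str:
    """Prepend selected behaviors to the question so the model can reuse
    the named procedure instead of re-deriving it in its chain of thought."""
    lines = ["You may use these reasoning behaviors:"]
    for name in behavior_names:
        lines.append(f"- {name}: {BEHAVIOR_HANDBOOK[name]}")
    lines.append("")
    lines.append(f"Question: {question}")
    return "\n".join(lines)

prompt = build_behavior_prompt(
    "A train travels 120 km in 1.5 h; what is its speed in m/s?",
    ["behavior_check_units"],
)
print(prompt)
```

The resulting prompt is then passed to the LLM as usual; the token savings come from the model invoking a named behavior rather than reproducing the full multi-step derivation each time.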
The key technical insight is procedural compression—turning repeated multi-step derivations into short procedural hints so models remember how to reason, not just what answers to give. That reduces context-window saturation and latency, freeing capacity for deeper exploration and enabling both immediate in-context gains and more sample-efficient fine-tuning. Implications include cheaper multi-step reasoning, improved lifelong/self-improving behavior without retraining, and a practical path to hybrid in-context + parameterized memory of reasoning procedures for more scalable, efficient LLM reasoning.