🤖 AI Summary
Goodfire.ai’s new preprint presents strong evidence that modern language models physically separate memorization and reasoning into different neural pathways. Analyzing OLMo-7B (from the Allen Institute), the researchers identified a clean split at layer 22: the bottom 50% of weight components showed 23% higher activation on memorized text, while the top 10% activated 26% more on non-memorized inputs. When the researchers surgically ablated the memorization circuits, the model lost about 97% of its ability to recite training data verbatim but preserved nearly all evaluated “logical” abilities (true/false judgments, if-then rule following).
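To make “ablating weight components” concrete, here is a minimal sketch of the general idea: decompose a weight matrix into ranked components and zero out a chosen band of them. Note the assumptions: the preprint ranks components by a loss-curvature-based criterion, whereas this sketch uses a plain SVD as a simplified stand-in, and the matrix shapes, layer choice, and `keep_top_frac` parameter are illustrative rather than taken from the paper.

```python
# Hedged sketch of component-level weight ablation. The actual paper ranks
# components by loss curvature; plain SVD is used here only as a stand-in to
# show mechanically what "zeroing out a band of weight components" means.
import torch


def ablate_components(weight: torch.Tensor, keep_top_frac: float = 0.5) -> torch.Tensor:
    """Zero out the bottom (1 - keep_top_frac) fraction of singular components."""
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    k = int(keep_top_frac * S.numel())   # number of top-ranked components to keep
    S_edited = S.clone()
    S_edited[k:] = 0.0                   # ablate the lower-ranked components
    return U @ torch.diag(S_edited) @ Vh # reassemble the edited weight matrix


# Toy usage on a random matrix (shapes are illustrative, not OLMo-7B's).
W = torch.randn(256, 1024)
W_edited = ablate_components(W, keep_top_frac=0.5)
print(torch.linalg.matrix_rank(W_edited))  # roughly half the original rank
```

In the study, this kind of targeted edit is what removes verbatim recall while leaving most evaluated logical behavior intact.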
The study’s striking implication is that arithmetic seems to ride with memorization, not reasoning: removing the memorization pathways caused math performance to fall to 66% even as other logical tasks stayed intact. That helps explain why LLMs often flub arithmetic unless augmented with tools: they may be recalling learned facts rather than computing. For AI/ML practice, this suggests paths toward more modular, interpretable models (targeted edits to reduce data leakage or tune capabilities) and clarifies the limits of current “reasoning” benchmarks. The authors caveat that “reasoning” spans many definitions; deeper mathematical, proof-style reasoning remains out of reach even when pattern-based logical skills survive memory removal.