🤖 AI Summary
Richard Sutton’s “Bitter Lesson” critique—that intelligence should be learned from first principles rather than by imitating humans—was reframed here as itself incomplete: learning “from scratch” ignores the enormous computational savings encoded in evolution and culture. Re-running evolutionary and cultural optimization would demand astronomical compute (an argued lower bound exceeding ~10^50 operations when accounting for neural activity across contributing organisms over ~4.5 billion years and ~10^30 living experiments), whereas modern models train with roughly 10^26 operations. Babies aren’t blank slates: they inherit neural priors and the compressed product of millennia of communication and problem-solving. Training LLMs on human text is therefore inheriting a highly optimized shortcut, not merely naive imitation.
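The ~10^50 lower bound above can be sanity-checked with back-of-envelope arithmetic. Only the ~4.5 billion years and ~10^30 organisms come from the summary; the per-organism activity rate below is a hypothetical placeholder chosen for illustration, not a figure from the source:

```python
# Back-of-envelope check of the ">~10^50 operations" claim.
# Assumed: average neural-activity rate per organism (hypothetical).
years = 4.5e9
seconds = years * 3.156e7            # ~1.4e17 seconds of evolutionary time
organisms = 1e30                     # concurrent "living experiments"
ops_per_organism_per_s = 1e3         # hypothetical average rate

total_ops = organisms * seconds * ops_per_organism_per_s
modern_training = 1e26               # rough op count for a frontier model

print(f"evolutionary ops: {total_ops:.1e}")          # ~1.4e50
print(f"vs. modern training: {total_ops / modern_training:.1e}x")
```

Even with a deliberately modest rate assumption, the total clears 10^50 and exceeds a modern training run by over twenty orders of magnitude, which is the point of the argument.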
The practical implication is that the next frontier isn’t abandoning inherited knowledge but massively expanding responsible access to it. Contemporary LLMs are trained on ~100–200 TB of data versus an estimated 180 zettabytes of digitized human knowledge—a gap of over a billionfold—much of which is higher-quality, operationally validated institutional data. Real gains will come from architectures and policy stacks that enable “broad listening” at civilizational scale while preserving privacy, attribution, and ownership: privacy-enhancing technologies, attribution-aware models, and new governance frameworks. The bitter lesson’s bitter lesson: combine inherited optimization with continued learning rather than trying to re-learn everything from first principles.
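The “over a billionfold” gap follows directly from the two figures the summary quotes (~100–200 TB of training data versus ~180 ZB of digitized knowledge); a quick calculation makes the range explicit:

```python
# Data-gap arithmetic using the summary's own estimates.
ZB = 10**21  # zettabyte, bytes
TB = 10**12  # terabyte, bytes

digitized_knowledge = 180 * ZB
training_low, training_high = 100 * TB, 200 * TB

ratio_min = digitized_knowledge / training_high  # ~9.0e8
ratio_max = digitized_knowledge / training_low   # ~1.8e9

print(f"gap: {ratio_min:.1e}x to {ratio_max:.1e}x")  # roughly a billionfold
```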