🤖 AI Summary
In a podcast recap, Andrej Karpathy summarizes Richard Sutton’s critique that contemporary LLMs are “ghosts” (engineered statistical replicas of human text) rather than “animals” that learn by interacting with the world. Sutton, author of “The Bitter Lesson,” argues that LLMs lean heavily on massive human-generated pretraining data, supervised finetuning, and curated RL mixtures, which bake human biases into the models and make them nothing like the tabula‑rasa, interaction-driven systems he envisions. He contrasts this with systems like AlphaZero, which learned from self‑play without human priors, and with biological learning shaped by evolution and continual online adaptation, advocating instead for child‑machine‑style agents that learn at test time via reinforcement signals and intrinsic motivations such as curiosity and empowerment.
Karpathy largely agrees this is a healthy corrective: pretraining serves as a pragmatic “crappy evolution” that solves the cold‑start problem for models with huge parameter counts, but it is not the pure bitter‑lesson ideal. The debate highlights concrete research directions and tradeoffs: moving from static pretraining toward continual, online learning, richer intrinsic rewards, multi‑agent self‑play, and memory mechanisms (contextual/test‑time adaptation instead of weight updates). Practically, labs will likely keep building these human‑tainted “ghosts” while exploring animal‑inspired algorithms; both paths could yield powerful, but qualitatively different, intelligences.
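To make “richer intrinsic rewards” concrete, here is a minimal sketch of one standard formulation: a curiosity bonus equal to a learned forward model’s prediction error, in the spirit of prediction-error curiosity methods like Pathak et al.’s Intrinsic Curiosity Module, not anything specified in the podcast. The toy environment, the tabular `forward_model`, and the random policy are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D "world": states 0..9 on a ring, actions move left (-1) or right (+1).
# The agent learns a forward model predicting the next state from (state, action);
# its squared prediction error serves as the intrinsic "curiosity" reward, so
# transitions the model predicts poorly are rewarded. All names are illustrative.

n_states, n_actions = 10, 2
forward_model = rng.normal(size=(n_states, n_actions))  # predicted next state
lr = 0.1

def step(s, a):
    """True environment dynamics (hidden from the agent)."""
    return (s + (1 if a == 1 else -1)) % n_states

def intrinsic_reward(s, a, s_next):
    """Curiosity bonus: squared error of the forward model's prediction."""
    return (forward_model[s, a] - s_next) ** 2

s = 0
for t in range(1000):
    a = rng.integers(n_actions)        # random policy, kept trivial for brevity;
    s_next = step(s, a)                # a full agent would feed r_int into its
    r_int = intrinsic_reward(s, a, s_next)  # policy update as a reward bonus
    # Online model update: the error (and hence the bonus) shrinks with
    # experience, the "continual, online learning" flavor of the proposal.
    forward_model[s, a] += lr * (s_next - forward_model[s, a])
    s = s_next

true_next = np.array([[step(s, a) for a in range(n_actions)]
                      for s in range(n_states)])
print("mean residual prediction error:",
      np.mean((forward_model - true_next) ** 2))
```

As the forward model improves, the bonus decays toward zero on familiar transitions, which is exactly what pushes a curious agent toward unfamiliar parts of the environment.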