Dwarkesh Patel's Podcast with Andrej Karpathy (thezvi.substack.com)

🤖 AI Summary
On Dwarkesh Patel’s podcast, Andrej Karpathy argued that we are entering “a decade of agents,” not an instant leap to AGI (he estimates roughly a decade to truly general systems), and pushed back against hype like “2025 is the year of agents.” The conversation parsed what current large language models (LLMs) can and cannot do: strong pattern completion and in‑context learning, but clear cognitive deficits in long‑term memory, multimodality, continual learning, and context handling. Karpathy emphasized that current models compress massive corpora into a hazy recollection stored in their parameters (e.g., a 70B‑parameter model trained on trillions of tokens), and that you still need explicit context (full texts in the window) for precision.

Technically, the debate boiled down to mechanisms and roadmaps: does intelligence emerge from in‑context behavior, or do we need explicit continual (lifetime) learning, sparse attention, and other algorithmic changes? Karpathy sees biological processes (evolution, sleep) as imperfect analogs and argues that many improvements must come “across the board”: architecture, optimizers, memory systems. As practical evidence, he found LLMs weak when assembling a bespoke repository (nanochat): autocomplete helped with boilerplate, but the models “remember wrong,” falling back on internet‑typical patterns when confronted with unusual code structure.

For practitioners, the takeaway is concrete: agents will gain value incrementally as memory, multimodal grounding, and sample efficiency are solved, but meaningful, safe agent deployment will require focused algorithmic work and careful calibration of expectations about timelines.
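To make the sparse-attention roadmap item mentioned above concrete: one common variant restricts each token to attending only to a recent local window rather than the full sequence. The sketch below (NumPy; the function name and parameters are illustrative, not anything Karpathy specifically proposed) builds such a causal local-window mask.

```python
import numpy as np

def local_attention_mask(seq_len: int, window: int) -> np.ndarray:
    """Boolean mask for causal local-window (sparse) attention.

    Position i may attend to positions j satisfying
    i - window < j <= i, i.e. itself and the (window - 1)
    tokens immediately before it. Full causal attention would
    instead allow all j <= i, which costs O(seq_len^2).
    """
    i = np.arange(seq_len)[:, None]  # query positions (column vector)
    j = np.arange(seq_len)[None, :]  # key positions (row vector)
    return (j <= i) & (j > i - window)

mask = local_attention_mask(6, 3)
```

In a transformer this mask would be applied to the attention logits (disallowed positions set to negative infinity before the softmax), reducing per-token attention cost from linear in sequence length to constant in the window size.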