🤖 AI Summary
DeepMind unveiled Dreamer 4, a transformer-based agent that learns complex, long-horizon behaviors entirely inside a scalable world model trained from offline human gameplay videos. Using a fixed dataset of recorded Minecraft play (only a few hundred hours of action-labelled data, plus additional video-only footage), Dreamer 4 learns to predict future observations, actions and rewards, then practices via reinforcement learning "in imagination," that is, on rollouts generated by the world model rather than by the game. Notably, it became the first agent trained purely offline to obtain diamonds in Minecraft. That task requires thousands of sequential steps (crafting, mining, tool use), and the agent learned it without ever interacting with the real game during training.
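To make the "fit a world model on fixed data, then improve behavior only inside that model" idea concrete, here is a deliberately tiny sketch. It is not Dreamer 4's method: the five-state chain environment, the lookup-table model, and the Q-value sweeps are all assumptions chosen for illustration, standing in for Minecraft, the transformer world model, and the actual reinforcement learning procedure.

```python
from collections import Counter, defaultdict

# Hypothetical toy setup: a 5-state chain; action 0 = left, 1 = right.
N_STATES, ACTIONS, GOAL = 5, (0, 1), 4


def real_step(s, a):
    """Ground-truth dynamics, used ONLY to record the offline dataset."""
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    reward = 1.0 if (s2 == GOAL and s != GOAL) else 0.0
    return s2, reward


# 1) A fixed offline dataset of (state, action, next_state, reward) tuples.
dataset = [(s, a, *real_step(s, a)) for s in range(N_STATES) for a in ACTIONS]

# 2) Fit a "world model": the most frequent outcome seen for each (state, action).
counts = defaultdict(Counter)
for s, a, s2, r in dataset:
    counts[(s, a)][(s2, r)] += 1
model = {key: c.most_common(1)[0][0] for key, c in counts.items()}

# 3) Improve behavior purely "in imagination": every update below queries
#    the learned model, never real_step.
gamma = 0.9
q = defaultdict(float)
for _ in range(50):  # repeated sweeps over imagined one-step transitions
    for s in range(N_STATES):
        for a in ACTIONS:
            s2, r = model[(s, a)]
            q[(s, a)] = r + gamma * max(q[(s2, b)] for b in ACTIONS)

# Greedy policy for the non-goal states.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]
print(policy)
```

In this toy case the greedy policy moves right toward the goal from every non-goal state, even though step 3 never touches the real dynamics. The real system replaces each piece with a far larger learned component, but the separation between data collection, model fitting, and imagined practice is the same one described above.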
Technically, Dreamer 4 advances generative world models with an efficient transformer architecture and a new training objective called "shortcut forcing," which improves predictive accuracy and speeds up video generation by over 25× versus typical video models, enabling real-time interaction on a single GPU. The model learns object interactions mainly from visual data, generalizes from minimal action labels, and outperforms prior world models on dynamics such as block placement, crafting and tool use. The result points to a practical path for training robots and other agents safely in simulation from abundant internet video. Planned extensions include long-term memory and language grounding, intended to support more consistent, collaborative real-world behaviors.