Big AI firms pump money into world models as LLM advances slow (arstechnica.com)

🤖 AI Summary
Top AI labs including Google DeepMind, Meta, and Nvidia are shifting resources toward "world models" (multimodal systems trained on video, sensor, and robotics data to understand and act in physical environments) as progress in large language models (LLMs) shows signs of slowing. Firms are positioning world models as the next frontier for building more general, embodied intelligence that can navigate real-world dynamics rather than just predict text. Recent announcements from several groups signal growing investment and competition around problems LLMs can't address, such as grounding language in perception and control.

Technically, world models are trained on streams of real or simulated environment data, combining vision, proprioception, and control signals to learn causal dynamics and affordances. By enabling planning and manipulation in the physical world, they promise advances in self-driving, robotics, AI agents, manufacturing, and healthcare, a market Nvidia's Rev Lebaredian estimates could approach $100 trillion if fully realized. But scaling world models demands vast, diverse datasets, heavy compute, robust simulation, and solutions for generalization and safety, so they remain an unsolved challenge even as big AI firms pour money into development.
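The "learn causal dynamics" part of that description maps onto a fairly standard recipe: encode each observation into a latent state, then train a transition model to predict the next latent state given the current latent and the action taken. Below is a minimal sketch of that recipe in PyTorch; the class names, layer sizes, and synthetic batch are hypothetical illustrations, not details from the article.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps a raw observation (e.g., image features + proprioception) to a latent state."""
    def __init__(self, obs_dim: int, latent_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )

    def forward(self, obs):
        return self.net(obs)

class TransitionModel(nn.Module):
    """Predicts the next latent state from (current latent, action): the learned dynamics."""
    def __init__(self, latent_dim: int, action_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )

    def forward(self, z, action):
        return self.net(torch.cat([z, action], dim=-1))

# Hypothetical dimensions, chosen only for illustration.
obs_dim, action_dim, latent_dim = 64, 4, 32
encoder = Encoder(obs_dim, latent_dim)
dynamics = TransitionModel(latent_dim, action_dim)
optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(dynamics.parameters()), lr=1e-3
)

# One training step on a synthetic batch of (obs, action, next_obs) transitions.
obs = torch.randn(128, obs_dim)
action = torch.randn(128, action_dim)
next_obs = torch.randn(128, obs_dim)

z = encoder(obs)
z_next_pred = dynamics(z, action)
with torch.no_grad():  # stop-gradient target, common in latent-dynamics setups
    z_next_target = encoder(next_obs)
loss = nn.functional.mse_loss(z_next_pred, z_next_target)

optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"dynamics loss: {loss.item():.4f}")
```

Production systems layer much more on top of this: typically reconstruction or contrastive objectives to keep the latent space from collapsing, plus reward and continuation heads so an agent can plan by rolling the dynamics model forward instead of querying the real environment.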