🤖 AI Summary
AI development appears to be shifting away from ever-larger pre-training runs toward vastly increased inference compute, either at deployment or as part of richer post-training loops. Reports from leading labs and OpenAI's o1 scaling charts suggest that gains from pre-training are slowing (possibly due to limited high-quality data), while adding inference compute, or devoting more compute to post-training RL, can unlock capabilities that pre-training alone no longer delivers. A practical heuristic offered is effective capability ≈ OOMs(pre-training) + 0.7 × OOMs(inference), and lab anecdotes imply that spending 2 extra orders of magnitude of inference compute per copy would cut the number of simultaneously deployable copies on a fixed budget by ~100×.
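As a rough illustration of that heuristic's trade-off (a minimal sketch only: the 0.7 coefficient and the fixed-budget logic come from the summary above, while the function names, the 4-OOM baseline, and the budget figure are hypothetical):

```python
def effective_capability(pretrain_ooms: float, inference_ooms: float) -> float:
    """Heuristic from the summary: capability grows with pre-training OOMs
    plus 0.7x the orders of magnitude of extra inference compute."""
    return pretrain_ooms + 0.7 * inference_ooms

def simultaneous_copies(total_compute: float, compute_per_copy: float) -> float:
    """On a fixed serving budget, deployable copies scale inversely
    with the per-copy inference cost."""
    return total_compute / compute_per_copy

# Baseline model vs. the same model given 2 extra OOMs of inference compute.
base = effective_capability(pretrain_ooms=4, inference_ooms=0)
boosted = effective_capability(pretrain_ooms=4, inference_ooms=2)
print(f"capability gain: +{boosted - base:.1f} effective OOMs")  # +1.4

# ...but on the same total budget, per-copy cost rises 100x,
# so 100x fewer copies can run at once.
budget = 1e6  # arbitrary compute units
print(simultaneous_copies(budget, 1.0) / simultaneous_copies(budget, 100.0))  # 100.0
```

The same arithmetic underlies the deployment claim: 2 OOMs of extra per-task inference is exactly a 100× increase in per-copy cost.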
This shift matters for governance, economics, and safety. If inference-at-deployment dominates, it raises the marginal cost of running the "first human-level" systems (blunting rapid deployment and proliferation), lessens the strategic value of stolen or leaked model weights (because the compute to run them becomes the bottleneck), and breaks governance proposals keyed to training-compute thresholds. Inference scaling is task-sensitive: it benefits verifiable, multi-step "System 2" tasks most and enables pay-for-quality tiers (OpenAI already charges ~10× for higher-inference access). If instead inference is used during training (iterated distillation and amplification), the effects are ambiguous: it could revive pre-training scaling or enable recursive self-improvement. Overall, the shift reshapes incentives, industry structure, and which regulatory levers remain effective.