The Alberta plan for AI research [pdf] (www.incompleteideas.net)

🤖 AI Summary
Rich Sutton and colleagues (DeepMind Alberta / University of Alberta) released "The Alberta Plan for AI Research," a compact roadmap that frames the grand goal of AI as understanding intelligence in terms of the continual acquisition of knowledge. Rather than adding domain-specific priors (vision, language, geometry), the plan advocates a retreat to core algorithms redesigned for online, temporally uniform, non-episodic interaction: continual learning, meta-learning, and model-based reinforcement learning. Its motivating blueprint, the Oak architecture (Options And Knowledge), organizes agents around learned features, subtasks and options, multiple policies and value functions, and a cycle of discovery that ties abstractions to reward. The authors argue this approach better fits the "Big World" setting, in which agents never fully know the state or dynamics and must continually adapt.

Technically, the plan lays out 12 incremental steps: from continual supervised learning with per-feature normalization and meta-tuned step sizes (IDBD-style), through supervised feature discovery, General Value Function (GVF) prediction, average-reward learning, and actor-critic control, to model-based Proto-AI instantiations (STOMP → Oak). Key emphases include off-policy GVFs, average-reward formulations with no discounting, online normalization, feature generation for continual settings, and new "continuing" benchmarks (a C-suite).

For the AI/ML community, this pushes toward scalable, compute-aware agents that learn continuously in multiagent, non-stationary environments, reframing evaluation and tooling around lifelong learning and meta-learning.
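To make the "meta-tuned step sizes (IDBD-style)" idea from the plan's early continual-supervised-learning steps concrete, here is a minimal sketch of Sutton's Incremental Delta-Bar-Delta (IDBD) for a linear predictor. The toy drifting-target stream, function name, and hyperparameter values are illustrative choices, not code from the plan.

```python
import numpy as np

def idbd_update(w, h, beta, x, y, theta=0.01):
    """One IDBD step for a linear predictor y_hat = w @ x.

    Each weight has its own step size alpha_i = exp(beta_i), and the
    beta_i are themselves adapted online from the data -- the
    "meta-tuned step sizes" the summary refers to.
    """
    delta = y - w @ x                          # prediction error
    beta = beta + theta * delta * x * h        # meta-step on log step sizes
    alpha = np.exp(beta)                       # per-feature step sizes
    w = w + alpha * delta * x                  # delta-rule weight update
    h = h * np.clip(1.0 - alpha * x * x, 0.0, None) + alpha * delta * x
    return w, h, beta

# Toy non-stationary stream: the target weights drift occasionally, so a
# single fixed step size would either track too slowly or be too noisy.
rng = np.random.default_rng(0)
d = 10
w, h = np.zeros(d), np.zeros(d)
beta = np.full(d, np.log(0.05))
w_true = rng.normal(size=d)
for t in range(10_000):
    if t % 2_000 == 0:
        w_true[rng.integers(d)] = rng.normal()  # occasional drift
    x = rng.normal(size=d)
    y = float(w_true @ x + 0.1 * rng.normal())
    w, h, beta = idbd_update(w, h, beta, x, y)
```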
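The "average-reward formulations (no discounting)" emphasis corresponds to differential value functions. Below is a minimal sketch of a differential TD(0) update with linear function approximation in the standard textbook form; the function name, the scalar average-reward estimate `r_bar`, and the step sizes are assumptions for illustration, not the plan's own code.

```python
import numpy as np

def differential_td0(w, r_bar, x, r, x_next, alpha=0.05, eta=0.01):
    """One differential TD(0) update with a linear value function v(s) = w @ x(s).

    There is no discount factor: the TD error is taken relative to an
    online estimate r_bar of the average reward, matching the continuing,
    non-episodic setting the plan emphasizes.
    """
    delta = r - r_bar + w @ x_next - w @ x    # differential TD error
    r_bar = r_bar + eta * delta               # track the average reward
    w = w + alpha * delta * x                 # update value weights
    return w, r_bar
```

A GVF generalizes the same kind of update by replacing the reward with an arbitrary cumulant and a state-dependent continuation function, typically learned off-policy, which is how the plan's prediction steps cast knowledge as forecasts.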