The Path to a Superhuman AI Mathematician (cacm.acm.org)

🤖 AI Summary
At the Heidelberg Laureate Forum, Princeton’s Sanjeev Arora sketched a concrete roadmap to a “superhuman AI mathematician”: couple modern proof assistants (notably Lean) with iterated AI self-improvement, so that models write proofs directly in a formal language and have them mechanically verified. The key shift is replacing human labels with formal verification: the AI proposes a proof in Lean, the checker accepts or rejects it, and verified solutions feed back as training signal. Reinforcement-learning-style self-play and AI-generated problem sets (rather than wholly human-curated banks) close the loop, letting systems scale their creativity while Lean weeds out hallucinations.

This is already moving from theory to practice. DeepMind’s AlphaGeometry and AlphaProof, along with recent OpenAI and Google models, reached IMO-level performance; Arora’s open Goedel-Prover-V2 reached gold-level performance, solving five of six IMO problems after 10–20 rounds of self-correction. Morph Labs’ Gauss dramatically sped up the translation of informal proofs into Lean (converting the Strong Prime Number Theorem took weeks, not years).

Significance: mathematics is a plausible first domain for superintelligent systems because correctness can be formally verified, which enables reliable, repeatable self-improvement. Caveats remain: models still err, and creative generation can hallucinate. Formal verification nonetheless provides a robust safety valve that makes scalable, verifiable mathematical automation realistic.
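The propose, verify, and feed-back loop described in the summary can be illustrated with a small Python sketch. This is a toy under loud assumptions, not Arora's method or any real prover's API: the "problem" is a composite integer, a "proof" is a claimed factor pair, `check` stands in for the Lean checker, and `propose` stands in for the model. All names here (`check`, `propose`, `self_improve`) are hypothetical.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical stand-ins: in the setting described above, a candidate would be
# a Lean proof and `check` would be the Lean checker. Here a "problem" is a
# composite integer and a "proof" is a claimed nontrivial factor pair.
Problem = int
Candidate = Tuple[int, int]


def check(problem: Problem, candidate: Candidate) -> bool:
    """Exact verifier: accept only a genuine nontrivial factorization."""
    a, b = candidate
    return a > 1 and b > 1 and a * b == problem


def propose(problem: Problem, rng: random.Random, hints: List[int]) -> Candidate:
    """Toy 'model': guess a factor, sometimes reusing factors from verified data."""
    use_hints = bool(hints) and rng.random() < 0.5
    pool = hints if use_hints else list(range(2, problem))
    a = rng.choice(pool)
    return (a, problem // a)  # may well be wrong; only the checker decides


def self_improve(
    problems: List[Problem], rounds: int, attempts: int, seed: int = 0
) -> Dict[Problem, Candidate]:
    """Run several rounds; only checker-approved candidates are kept and reused."""
    rng = random.Random(seed)
    verified: Dict[Problem, Candidate] = {}
    hints: List[int] = []  # stands in for the training signal from verified proofs
    for _ in range(rounds):
        for p in problems:
            if p in verified:
                continue
            for _ in range(attempts):
                cand = propose(p, rng, hints)
                if check(p, cand):
                    verified[p] = cand
                    hints.extend(cand)  # verified output feeds the next round
                    break
    return verified


if __name__ == "__main__":
    print(self_improve([15, 21, 35, 77], rounds=5, attempts=20))
```

The design point the sketch tries to capture is that wrong guesses cost nothing beyond compute: a candidate enters the reusable pool only after the exact checker accepts it, so unverified output never becomes training signal.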