We Solved Poker: From Academic Bots to Superhuman AI (1998-2025) (gist.github.com)

🤖 AI Summary
From simple hand-strength heuristics in 1998 to the superhuman engines of the 2010s and the polished GTO solvers of the mid-2020s, poker AI evolved from handcrafted rules and opponent weight tables into mathematically grounded programs that beat top professionals. Early systems like Loki introduced effective hand strength (EHS) and dynamic opponent models; the 2000s saw an arms race of open equity libraries, log-file and Windows-message bots, and commercial cheating scandals (WinHoldEm).

The real shift came with counterfactual regret minimization (CFR, 2007), which reframed poker as iterative self-play regret minimization converging to a Nash equilibrium. That reframing enabled the later breakthroughs: Cepheus "essentially solved" heads-up limit hold'em (near-zero exploitability), DeepStack beat professionals using continual re-solving plus neural networks for leaf evaluation, and Libratus combined massive offline abstractions, nested subgame solving, and nightly self-improvement to dominate top players in 2017.

Technically, the trajectory shows three key moves: accurate opponent and hand evaluation (EHS, sampling, fast equity engines); scalable equilibrium computation (CFR and Monte Carlo CFR, which samples the game tree instead of expanding it in full); and online refinement (continual re-solving with learned value networks). A sketch of the core CFR update appears below. The implications for AI/ML are broad: imperfect-information games drove new regret-based RL methods, hybrid search-plus-learning architectures, and practical robustness against adversarial play, while also raising ethics and security concerns as bots moved from research labs into real-money environments.
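For readers unfamiliar with the EHS term: the Loki papers combine the current hand strength HS (probability of being ahead right now) with the hand potentials PPot and NPot (the probabilities of pulling ahead when behind, and of falling behind when ahead, by the river). In the usual formulation:

EHS = HS × (1 − NPot) + (1 − HS) × PPot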
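To make the regret-minimization idea concrete, here is a minimal sketch of regret matching, the per-information-set update at the heart of CFR, applied to rock-paper-scissors so the entire game fits in a few lines. All names and structure here are illustrative, not taken from Loki, Cepheus, DeepStack, or Libratus. Two players repeatedly play their regret-matching strategies against each other, and the average strategy converges to the Nash equilibrium (uniform 1/3 for RPS).

# Regret matching on rock-paper-scissors: the per-node update that full
# CFR applies at every information set of a poker game tree.
# Illustrative sketch only.

NUM_ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors

def payoff(a, b):
    """Payoff to the player choosing action a against action b."""
    if a == b:
        return 0
    return 1 if (a - b) % 3 == 1 else -1  # paper beats rock, etc.

def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / NUM_ACTIONS] * NUM_ACTIONS  # no positive regret: uniform

def train(iterations=200_000):
    regrets = [[0.0] * NUM_ACTIONS for _ in range(2)]
    strategy_sums = [[0.0] * NUM_ACTIONS for _ in range(2)]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(regrets[p]) for p in range(2)]
        for p in range(2):
            opponent = strategies[1 - p]
            # Expected value of each pure action against the opponent's mix.
            values = [sum(opponent[b] * payoff(a, b) for b in range(NUM_ACTIONS))
                      for a in range(NUM_ACTIONS)]
            expected = sum(strategies[p][a] * values[a] for a in range(NUM_ACTIONS))
            for a in range(NUM_ACTIONS):
                regrets[p][a] += values[a] - expected      # accumulate regret
                strategy_sums[p][a] += strategies[p][a]    # accumulate strategy
    # The *average* strategy, not the last iterate, converges to equilibrium.
    return [[s / sum(strategy_sums[p]) for s in strategy_sums[p]] for p in range(2)]

print(train()[0])  # -> approximately [0.333, 0.333, 0.333]

Full CFR runs this same update at every decision point, weighting regrets by the probability of reaching that point; Monte Carlo CFR samples chance and opponent actions rather than enumerating them, which is what made poker-scale trees tractable.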