🤖 AI Summary
Aleksandrs Slivkins’ "Introduction to Multi-Armed Bandits" (updated on arXiv through 2024) is a textbook-style, self-contained treatment that synthesizes the extensive literature on sequential decision-making under uncertainty. The book organizes material into focused chapters that can serve as standalone tutorials or surveys: basic IID reward models (including impossibility results, Bayesian priors, and Lipschitz rewards), adversarial settings (full feedback, adversarial bandits, and linear and combinatorial extensions), contextual bandits as a bridge between the stochastic and adversarial regimes, and economic aspects (learning in repeated games, budget- and supply-constrained bandits, and exploration with strategic agents). Many chapters include exercises, making the book suitable for coursework and for practitioners building up their skills.
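To make the flavor of the IID chapters concrete, here is a minimal sketch (not from the book) of Thompson sampling for Bernoulli bandits with Beta priors, one of the Bayesian-prior algorithms that material analyzes. The `true_means` argument and the simulation loop are illustrative assumptions for testing the sketch, not part of the algorithm itself.

```python
import random

def thompson_bernoulli(true_means, horizon, seed=0):
    """Thompson sampling for Bernoulli bandits with Beta(1, 1) priors.

    `true_means` (the unknown arm success probabilities) exists only to
    simulate rewards; a real deployment would observe rewards instead.
    """
    rng = random.Random(seed)
    k = len(true_means)
    alpha = [1.0] * k  # posterior Beta parameters: alpha = successes + 1
    beta = [1.0] * k   # beta = failures + 1
    total = 0
    for _ in range(horizon):
        # Draw one plausible mean per arm from its posterior and act greedily
        # on the draws; randomness in the samples drives exploration.
        draws = [rng.betavariate(alpha[a], beta[a]) for a in range(k)]
        arm = max(range(k), key=draws.__getitem__)
        reward = 1 if rng.random() < true_means[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
        total += reward
    return total

print(thompson_bernoulli([0.3, 0.5, 0.7], horizon=10_000))  # approaches 0.7 * horizon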
Technically rigorous but accessible, the text provides foundations (concentration inequalities and KL-divergence in the appendix) alongside practical algorithmic themes: regret bounds, model assumptions that enable exploration-exploitation trade-offs, and structural extensions that handle similarity information, resource constraints (bandits with knapsacks), and incentive-aware learning. For AI/ML researchers and engineers building online recommenders, adaptive experimentation, or resource-constrained decision systems, the book is a compact reference that clarifies when standard bandit algorithms apply, how to extend them to structured actions or adversarial opponents, and where open problems remain.
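For a sense of what "standard bandit algorithms" and their regret bounds look like in practice, below is a minimal, self-contained sketch of UCB1 (optimism in the face of uncertainty), the canonical algorithm for the IID reward model; as above, the `true_means` simulation harness is an illustrative assumption, not from the text.

```python
import math
import random

def ucb1(true_means, horizon, seed=0):
    """UCB1 for stochastic (IID) bandits.

    Plays each arm once, then repeatedly picks the arm maximizing
    empirical_mean + sqrt(2 * ln(t) / pulls), which balances exploitation
    (the mean) against exploration (the confidence width) and achieves
    O(sqrt(K * T * log T)) worst-case regret.
    """
    rng = random.Random(seed)
    k = len(true_means)
    pulls = [0] * k
    sums = [0.0] * k
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # initialization: try every arm once
        else:
            arm = max(
                range(k),
                key=lambda a: sums[a] / pulls[a]
                + math.sqrt(2 * math.log(t) / pulls[a]),
            )
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        pulls[arm] += 1
        sums[arm] += reward
        total += reward
    return total

print(ucb1([0.3, 0.5, 0.7], horizon=10_000))
```

The confidence term shrinks as an arm accumulates pulls, so under-sampled arms keep getting revisited until the data rules them out; the book's later chapters extend exactly this template to similarity structure, knapsack constraints, and adversarial rewards.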