AI models are terrible at betting on soccer—especially xAI Grok (arstechnica.com)

🤖 AI Summary
A recent study by AI start-up General Reasoning found that even leading AI models from Google, OpenAI, Anthropic, and xAI struggled badly when asked to bet on soccer matches across a simulated Premier League season. The research, published as the "KellyBench" report, highlights the gap between AI's strong performance in domains like software development and its weakness on messy, unpredictable human problems such as sports betting.

The study tested eight prominent AI systems, giving each detailed historical data and tasking it with maximizing returns while managing risk. The agents placed bets on match outcomes and expected goals, adapting to new player information as the season progressed. Results were poor across the board: Anthropic's Claude Opus 4.6 performed best with an average loss of 11 percent; xAI's Grok 4.20 went bankrupt in one run and failed to finish others; and Google's Gemini 3.1 Pro posted a 34 percent profit in one run but went bankrupt in another. The findings underscore how difficult it is for AI to reason about real-world dynamics over long horizons and raise questions about the reliability of its long-term predictions.
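The benchmark's name presumably nods to the Kelly criterion, the classic formula for sizing bets to balance growth against risk of ruin. As a rough sketch of the kind of bankroll management the agents were expected to perform (this is an illustration of the standard Kelly formula, not the benchmark's actual scoring method):

```python
def kelly_fraction(p: float, b: float) -> float:
    """Fraction of bankroll to stake on a bet.

    p: estimated probability of winning
    b: net decimal odds (profit per unit staked if the bet wins)

    Kelly formula: f* = p - (1 - p) / b
    A negative result means no edge, so stake nothing.
    """
    f = p - (1 - p) / b
    return max(f, 0.0)


# Example: 60% win probability at even odds (b = 1.0)
# -> stake 20% of the bankroll
stake = kelly_fraction(0.6, 1.0)

# Example: 40% win probability at even odds -> no edge, stake 0
no_bet = kelly_fraction(0.4, 1.0)
```

An agent that overestimates its win probabilities will systematically overbet and, as several models in the study did, eventually go bankrupt; practitioners often stake only a fraction of the Kelly amount for exactly this reason.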