OpenAI’s latest research paper demonstrates that falsehoods are inevitable (theconversation.com)

🤖 AI Summary
OpenAI published a rigorous mathematical analysis of why ChatGPT-style models hallucinate: autoregressive, token-by-token generation compounds classification error, so sentence-level error is provably higher than the error on a single yes/no validity judgment (the paper shows at least a twofold increase). Hallucinations are therefore not merely an artifact of noisy training data, though rarely seen facts are more likely to be invented (the paper notes that if 20% of people's birthdays appear only once in training, models will get at least 20% of birthday queries wrong); they are an inherent consequence of probabilistic next-token prediction and limited exposure during training.

The researchers also identify a feedback loop in evaluation: most benchmarks use binary scoring that gives zero credit for honest uncertainty, so the optimal test-taking strategy is always to guess. OpenAI's proposed fix is confidence-aware responses and benchmark scoring (e.g., answer only when more than 75% confident under a penalty-reward scheme, sketched below), which would reduce hallucinations but dramatically change the user experience (many queries would return "I don't know") and require far more compute. Producing reliable confidence estimates means evaluating multiple candidate responses or using active learning, costs that make uncertainty-aware systems feasible for high-stakes domains (medicine, finance, infrastructure) but economically impractical for consumer services. The paper highlights a systemic misalignment: current user expectations, benchmarks, and economics all favor confident guessing, so hallucinations will persist until the incentives change.
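A minimal sketch of the incentive argument, not taken from the paper itself: assume a correct answer earns +1, a wrong answer costs 3 points, and "I don't know" earns 0, which makes 75% the break-even confidence; under plain binary scoring (no penalty for wrong answers), guessing always dominates abstaining.

```python
# Sketch: why binary benchmark scoring rewards guessing, and how a
# confidence-aware penalty scheme changes the optimal policy.
# Assumed numbers (not from the paper): +1 for a correct answer,
# -3 for a wrong one, 0 for abstaining, so guessing only pays off
# when the model is more than 75% confident.

def expected_score(confidence: float, *, wrong_penalty: float = 3.0) -> float:
    """Expected score of answering when the model is `confidence` sure."""
    return confidence * 1.0 - (1.0 - confidence) * wrong_penalty

def should_answer(confidence: float, *, wrong_penalty: float = 3.0) -> bool:
    """Answer only if guessing beats the 0 points earned by saying 'I don't know'."""
    return expected_score(confidence, wrong_penalty=wrong_penalty) > 0.0

if __name__ == "__main__":
    for c in (0.5, 0.7, 0.75, 0.8, 0.95):
        print(f"confidence={c:.2f}  "
              f"E[score if answering]={expected_score(c):+.2f}  "
              f"answer? {should_answer(c)}")
    # Under binary scoring (wrong_penalty=0), guessing is always worthwhile:
    print(should_answer(0.01, wrong_penalty=0.0))  # True
```

With these assumed numbers the expected score of answering is exactly zero at 75% confidence, which is why the threshold in the proposal falls where it does; a harsher penalty would push the threshold higher.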
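And a hedged sketch of one way to produce the confidence signal the summary mentions, by sampling several candidate answers and measuring agreement. `sample_answer` is a hypothetical stand-in for a real model call (stubbed here with a fixed distribution), and the repeated sampling is exactly the extra compute that the summary says makes uncertainty-aware systems costly.

```python
# Sketch: estimating confidence by sampling multiple candidate answers
# and answering only when they agree often enough. Every call to
# sample_answer would be a separate model invocation in practice.

import random
from collections import Counter

def sample_answer(prompt: str) -> str:
    """Hypothetical stand-in for drawing one answer from a model."""
    # Stub: an unreliable "model" whose answers disagree with each other.
    return random.choice(["March 3", "March 3", "March 3", "July 9", "Dec 1"])

def answer_with_confidence(prompt: str, n_samples: int = 20,
                           threshold: float = 0.75) -> str:
    """Sample n candidates; answer only if the majority clears the threshold."""
    counts = Counter(sample_answer(prompt) for _ in range(n_samples))
    best, hits = counts.most_common(1)[0]
    confidence = hits / n_samples
    return best if confidence >= threshold else "I don't know"

if __name__ == "__main__":
    random.seed(0)
    print(answer_with_confidence("When is this person's birthday?"))
```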