🤖 AI Summary
A new paper formalizes five intrinsic ceilings on LLM scaling (hallucination, context compression, reasoning degradation, retrieval fragility, and multimodal misalignment) by connecting empirical failure modes to fundamental limits from computability, information theory, geometry, and learning theory. The authors prove, via diagonalization, that any computably enumerable model family must fail on some inputs, and that undecidable queries yield infinite failure sets for all computable predictors. They also show that finite description length and information-theoretic/statistical bounds impose irreducible compression error and prohibitively large sample complexity for long-tail factual knowledge. Geometric and optimization effects (positional under-training, encoding attenuation, softmax crowding) effectively shrink usable context length, and likelihood-based training biases models toward pattern completion rather than true inference. Retrieval under token and coupling limits produces semantic drift and noise, while multimodal scaling tends to preserve only shallow cross-modal alignment.
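As rough intuition for the diagonalization claim (a generic reconstruction assuming 0/1 outputs for simplicity, not necessarily the paper's exact construction): given a computable enumeration M_1, M_2, … of the model family and distinct inputs x_1, x_2, …, define a target function that disagrees with each model on its own index:

```latex
% Illustrative diagonal construction (assumed form, binary outputs).
f(x_i) \;:=\; 1 - M_i(x_i), \qquad i = 1, 2, \ldots
```

Each M_i then errs on x_i by construction, so no member of the enumerable family computes f everywhere, which is the sense in which some failure is unavoidable for the whole family.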
The significance is twofold: the paper explains why scale alone cannot eliminate core LLM failure modes, and it gives a rigorous taxonomy of where scaling helps versus where it saturates or cannot help at all. For practitioners, it suggests concrete mitigations (bounded-oracle retrieval, positional curricula, sparse or hierarchical attention; one is sketched below) and reframes future work toward algorithmic, architectural, and data-efficiency solutions rather than blind parameter growth.
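The summary does not say how the paper realizes these mitigations; as one concrete illustration, here is a minimal NumPy sketch of sparse (sliding-window) attention, where each token attends only to a small neighborhood instead of the full quadratic pattern. The function name, shapes, and window size are hypothetical choices for the example, not details from the paper.

```python
import numpy as np

def sliding_window_attention(q, k, v, window: int):
    """Sparse (sliding-window) attention sketch: each query position
    attends only to keys within `window` positions of itself instead
    of the full quadratic pattern. q, k, v have shape (seq_len, d)."""
    n, d = q.shape
    scores = q @ k.T / np.sqrt(d)                          # (n, n) attention logits
    idx = np.arange(n)
    banned = np.abs(idx[:, None] - idx[None, :]) > window  # True outside the local window
    scores = np.where(banned, -np.inf, scores)             # mask out long-range pairs
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)         # row-wise softmax over allowed keys
    return weights @ v

# Toy usage: 16 tokens, 8-dim embeddings, attend to 2 neighbors on each side.
rng = np.random.default_rng(0)
x = rng.normal(size=(16, 8))
print(sliding_window_attention(x, x, x, window=2).shape)   # (16, 8)
```

Restricting each row of the attention matrix this way is one standard response to the softmax-crowding problem mentioned above: attention mass is spread over O(window) positions rather than the entire sequence.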