LLMs are strangely-shaped tools (near.blog)

🤖 AI Summary
The piece argues that large language models are “strangely-shaped” tools: they weren’t engineered like screwdrivers for a specific task but emerged from next-token prediction and reinforcement learning, in other words, whatever shape the loss happened to optimize. That accidental shape gives LLMs broad, surprising capabilities but also a host of practical traps when you try to turn them into products. Common failure modes include unreliable AI agents, brittle long-term memory when implemented solely with RAG (retrieval-augmented generation), poor search behavior, hallucinations, and high inference costs, plus many subtle weaknesses that casual tinkerers miss. The significance for the AI/ML community is twofold: first, much of current post-training research amounts to reshaping these models to be fit for purpose, which is hard even for major labs; second, the difficulty encourages homogeneity, with companies copying successful consumer apps rather than exploring new product shapes because their teams lack the right research–product alignment or vision. As a result, many “AI” features end up superficial. The author expects more substantive experimentation and progress later in 2024 and into 2025, and notes that this oddly-shaped nature of current LLMs is one reason the author has grown more cautious about AGI timelines, since an AGI would need a very different, purpose-built shape.