Actual Intelligence (jwmason.org)

🤖 AI Summary
This essay reframes LLMs not as independent thinkers but as probabilistic language models—essentially windows onto the vast body of human text already online. They predict the next token given an input, so their abilities (answering factual questions, generating code, producing images with captions) come from the sheer volume and variety of human-produced material rather than any abstract world model. The author stresses that the “magic” is mostly in the free, decentralized corpus people have posted over decades, and that calling these systems “stochastic parrots” captures their fundamental dependence on existing human output. The piece argues this has urgent technical, economic and ethical implications for the AI/ML community: model performance is tightly coupled to dataset provenance and breadth, and current business models rely on effectively free labor (text, code, images) while shouldering huge compute and energy costs. If companies enclose and monetize those commons, they risk destroying the incentives that produced the data in the first place. The takeaway: focus policy and engineering on responsible dataset practices, fair compensation/licensing, transparency about training sources and energy use, and preserving an information commons that enables continued cooperative production of the raw material LLMs need.