Vintage Large Language Models (owainevans.github.io)

🤖 AI Summary
Researchers propose "vintage" LLMs: models intentionally trained only on data up to a chosen cutoff (e.g., 2019, 1989, 1600) to simulate how an agent with period‑limited knowledge would reason, forecast, or invent. Uses include backtesting forecasting (LLM‑2019 predicting 2019–2024 events like the pandemic), probing the novelty of past discoveries (could a pre‑1900 model infer Newtonian ideas?), and interactive "time‑travel" dialogues with historical personae. Technically this requires careful decontamination (avoid leakage of future facts), multimodal curation (period‑appropriate images), and huge datasets — estimates suggest orders of magnitude of text (the author mentions ~50 trillion words) and training costs on the scale of hundreds of millions of dollars for state‑of‑the‑art models. Key implementation and research implications: vintage LLMs demand synthetic‑data bootstrapping (train a smaller clean model on historical corpora, use it to generate paraphrases/remixes to expand the dataset), and systems engineering like retrieval, chain‑of‑thought scaffolding, RL/agent layers, and tool access to turn a base model into a reliable forecaster or inventor. Cost‑saving training strategies include chronological forking (train once to a date then branch) and compartmentalized models with date‑tagged documents to condition outputs on an era. Beyond curiosity, this approach offers a controlled testbed for epistemic AI — studying calibration, gold‑standard supervision sources (human labels, algorithmic judges, and historical data), and what kinds of scientific breakthroughs are predictable given specific prior knowledge.
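The decontamination step amounts to rejecting any document that is dated after the cutoff or that mentions post‑cutoff facts. A minimal sketch of such a filter, assuming a hand‑curated marker list (the markers and function names below are illustrative, not from the post):

```python
from datetime import date

# Hypothetical post-cutoff markers; a real pipeline would need a much larger
# curated list plus classifier-based checks.
POST_2019_MARKERS = ["covid-19", "sars-cov-2", "chatgpt", "gpt-4"]

def keep_document(doc_text: str, doc_date: date, cutoff: date = date(2019, 1, 1)) -> bool:
    """Return True if a document is safe to include in a vintage-2019 corpus.

    Two cheap checks: the document's timestamp must precede the cutoff, and its
    text must not mention entities that only exist after the cutoff (which would
    indicate a mislabeled date or embedded future leakage).
    """
    if doc_date >= cutoff:
        return False
    lowered = doc_text.lower()
    return not any(marker in lowered for marker in POST_2019_MARKERS)

# A correctly dated 2018 article passes; a mislabeled one mentioning COVID-19
# is rejected despite its pre-cutoff timestamp.
assert keep_document("Results of the 2018 midterm elections...", date(2018, 11, 7))
assert not keep_document("Hospitals brace for COVID-19 surge...", date(2018, 3, 1))
```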
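Synthetic‑data bootstrapping could look roughly like this: a small model trained only on the clean historical corpus paraphrases each document several times to stretch a scarce dataset. The `generate` callable is a hypothetical wrapper, not an API named in the post:

```python
from typing import Callable, Iterable, Iterator

def bootstrap_corpus(
    documents: Iterable[str],
    generate: Callable[[str], str],  # hypothetical wrapper around the small clean model
    variants_per_doc: int = 3,
) -> Iterator[str]:
    """Expand a scarce historical corpus with paraphrases/remixes.

    `generate` is assumed to wrap a model trained only on pre-cutoff text, so
    its outputs cannot leak post-cutoff facts. Each original document is kept
    and supplemented with several prompted rewrites.
    """
    for doc in documents:
        yield doc  # always keep the original
        for _ in range(variants_per_doc):
            yield generate(
                "Rewrite the following passage in different words, "
                f"preserving its meaning and period style:\n\n{doc}"
            )
```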
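Date‑tagged conditioning might be implemented by prefixing every training document with its year and reusing the same tag at inference time to pin the model to an era; the tag format below is an assumption for illustration:

```python
def tag_document(doc_text: str, doc_year: int) -> str:
    """Prefix a training document with its year so the model learns to
    associate content with an era and can later be conditioned on one."""
    return f"<date:{doc_year}>\n{doc_text}"

def era_prompt(year: int, question: str) -> str:
    """Condition generation on an era by reusing the same tag at inference time."""
    return f"<date:{year}>\n{question}"

# Training example tagged with its year, and a prompt asking the shared,
# compartmentalized model to answer as if it only knew the world of 1989.
example = tag_document("The Berlin Wall still divides the city...", 1989)
prompt = era_prompt(1989, "What political changes in Europe are most likely next year?")
```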
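For the backtesting and calibration use case, one standard way to score an LLM‑2019's probabilistic forecasts against realized 2019–2024 outcomes is the Brier score. The forecast probabilities below are hypothetical; only the outcomes are historical:

```python
def brier_score(forecasts: list[tuple[float, bool]]) -> float:
    """Mean squared error between forecast probabilities and realized outcomes
    (0.0 = perfect; 0.25 = uninformative p=0.5 guesses)."""
    return sum((p - float(outcome)) ** 2 for p, outcome in forecasts) / len(forecasts)

# Hypothetical LLM-2019 forecasts, scored against what had happened by 2024.
forecasts = [
    (0.05, True),   # "a global pandemic begins within two years" -- happened
    (0.60, True),   # "language models exceed 100B parameters by 2021" -- happened
    (0.30, False),  # "self-driving taxis deployed in most US cities by 2023" -- did not
]
print(brier_score(forecasts))  # lower is better calibrated
```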