🤖 AI Summary
Ai2 (the Allen Institute for AI) released Olmo 3, a fully open LLM suite that uniquely publishes the model weights, intermediate checkpoints, training process, and training corpus. The lineup includes four 7B models (Base, Instruct, Think, RL Zero) plus 32B variants of Olmo 3-Base and Olmo 3-Think. Olmo 3 is pretrained on Dolma 3 (~9.3T tokens), with a curated Dolma 3 Mix (~5.9T tokens) that emphasizes code and math, applies stronger deduplication and decontamination, and follows explicit web-collection policies (no paywalled scraping). Ai2 positions Olmo 3-Think (32B) as “the best fully open 32B-scale thinking model”: it was trained on roughly 6x fewer tokens than comparable closed or open-weight 32B models (e.g., Qwen 3 32B), and it ships with tools to inspect intermediate reasoning traces and to link outputs back to specific training documents via OlmoTrace.
The release is significant because it advances reproducibility and auditability: open training data plus traceability lets researchers probe how specific data and training choices produce behaviors, and makes it easier to detect the poisoning/backdoor attacks highlighted in recent research. Practical notes from early testing: the 7B download is ~4.16GB and the 32B ~18.14GB; the 32B Think model produces long, inspectable “thinking” traces (one SVG-generation task emitted ~8.4k tokens over ~15 minutes). OlmoTrace currently surfaces phrase matches that aren’t always relevant, and the corpus still relies on web-crawl data rather than exclusively licensed sources, so openness improves but does not eliminate contamination and provenance challenges.
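Inspecting those long reasoning traces programmatically is straightforward if the model wraps them in delimiter tags. A minimal sketch, assuming a `<think>…</think>` convention (a common pattern for open thinking models; Olmo 3's exact delimiters may differ):

```python
import re

def split_thinking(output: str,
                   open_tag: str = "<think>",
                   close_tag: str = "</think>"):
    """Separate a model's reasoning trace from its final answer.

    Assumes the reasoning is wrapped in <think>...</think> tags;
    returns (trace, answer), with trace=None if no tags are found.
    """
    pattern = re.escape(open_tag) + r"(.*?)" + re.escape(close_tag)
    match = re.search(pattern, output, re.DOTALL)
    if not match:
        # No delimited trace: treat the whole output as the answer.
        return None, output.strip()
    trace = match.group(1).strip()
    # The answer is everything outside the matched trace span.
    answer = (output[:match.start()] + output[match.end():]).strip()
    return trace, answer
```

With a trace in hand, one could count its tokens or feed individual phrases to OlmoTrace to look for matching training documents.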