Kosmos: An AI Scientist for Autonomous Discovery (arxiv.org)

🤖 AI Summary
Kosmos is an autonomous "AI scientist" that runs multi-hour discovery campaigns combining parallel data analysis, literature search, and hypothesis generation. Key technical innovations include a structured world model that lets a data-analysis agent and a literature-search agent share and update knowledge coherently across long runs — Kosmos typically performs up to 200 agent rollouts in a single session (up to 12 hours), executing an average of 42,000 lines of code and reading roughly 1,500 papers per run. It produces end-to-end scientific reports where every claim is traced to code or primary literature, enabling auditability. The system materially accelerates research: independent reviewers judged 79.4% of report statements accurate, collaborators estimated a 20-cycle run equals about six months of their lab work, and the yield of valuable findings scaled roughly linearly with cycles (tested to 20). Kosmos produced seven case-study discoveries across metabolomics, materials science, neuroscience, and statistical genetics — three reproduced unpublished/preprint results Kosmos never saw at runtime, and four were novel contributions. The work demonstrates that long-horizon, multi-agent architectures with explicit world models can materially extend autonomous scientific reasoning, while traceability and sub-100% accuracy underscore the continued need for human oversight and experimental validation.
Loading comments...
loading comments...