🤖 AI Summary
Researchers introduced Centaur, a foundation model of human cognition built by fine-tuning Llama 3.1 70B on Psych-101, a newly curated corpus of trial-by-trial behavioral data: more than 10 million choices from 60,000+ participants across 160 canonical experiments (bandits, decision-making, memory, MDPs, etc.). Training used QLoRA (a 4-bit quantized base model plus low-rank adapters of rank r=8, adding ~0.15% trainable parameters), masked the cross-entropy loss so that it applied only to human response tokens, and ran for a single epoch (~5 days on an A100 80GB). The approach reframes diverse experimental paradigms in natural language so that a single model can learn cross-task behavioral patterns.
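A minimal sketch of that training recipe, assuming the standard Hugging Face `transformers`/`peft` QLoRA stack: a 4-bit NF4-quantized base with rank-8 adapters, and labels masked so the loss covers only the participant's response tokens. The dataset field names (`prompt`, `response`), the target modules, and all hyperparameters other than r=8 and 4-bit quantization are assumptions, not details from the paper.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

MODEL = "meta-llama/Meta-Llama-3.1-70B"

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,                       # 4-bit quantized base weights
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Low-rank adapters of rank r=8; only the adapter weights
# (on the order of 0.15% of all parameters) are trainable.
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,       # alpha/dropout are assumed values
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
))

def build_example(prompt: str, response: str) -> dict:
    """Tokenize one trial, masking everything except the human response.

    Tokens labeled -100 are ignored by the cross-entropy loss, so the
    model is trained only to predict what the participant actually chose.
    """
    prompt_ids = tokenizer(prompt, add_special_tokens=False).input_ids
    response_ids = tokenizer(response, add_special_tokens=False).input_ids
    return {
        "input_ids": prompt_ids + response_ids,
        "labels": [-100] * len(prompt_ids) + response_ids,
    }
```

Feeding such examples through one epoch of a standard causal-LM trainer reproduces the setup the summary describes: the base model stays frozen in 4-bit, and gradients flow only through the adapters.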
Centaur consistently outperforms both the base Llama and a suite of domain-specific cognitive models at predicting held-out participants (negative log-likelihood 0.44 vs. 0.58 for Llama), and it generalizes to held-out experiments, altered cover stories, structural task changes, and wholly new domains. It produces realistic open-loop simulations (matching human exploration levels in bandit tasks and reproducing the bimodal distribution of model-free vs. model-based learners), it selectively predicts human but not artificial-agent behavior, and its internal representations become more aligned with human neural activity despite no explicit neural supervision. For AI/ML and cognitive science, Centaur demonstrates that a single language-model-based architecture can serve as a predictive, generalizable computational theory of behavior, offering a practical tool for theory testing, behavioral simulation, and bridging models with neural data.
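For intuition on the headline metric, a sketch of the per-choice negative log-likelihood (lower is better; 0.44 for Centaur vs. 0.58 for Llama): score the log-probability the model assigns to the participant's actual choice tokens given the task prompt, then average over held-out trials. The trial format here is our assumption, not the paper's exact evaluation harness.

```python
import torch

@torch.no_grad()
def choice_nll(model, tokenizer, prompt: str, choice: str) -> float:
    """Negative log-likelihood of a participant's choice given the prompt."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    choice_ids = tokenizer(choice, add_special_tokens=False,
                           return_tensors="pt").input_ids
    input_ids = torch.cat([prompt_ids, choice_ids], dim=1).to(model.device)

    logits = model(input_ids).logits
    # Each choice token is predicted from the position just before it:
    # positions prompt_len-1 .. L-2 predict tokens prompt_len .. L-1.
    log_probs = torch.log_softmax(
        logits[0, prompt_ids.shape[1] - 1:-1], dim=-1
    )
    token_lls = log_probs.gather(
        1, choice_ids[0].to(model.device).unsqueeze(1)
    )
    return -token_lls.sum().item()  # summed over the choice tokens

# Averaging choice_nll over all held-out participants' trials yields
# the kind of per-choice NLL the summary reports.
```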