🤖 AI Summary
This piece argues that the common “people as single consequentialist agents” abstraction is a leaky simplification and proposes a multi-agent model of mind as a better descriptive and predictive framework. Rather than claiming humans literally house multiple personalities, the author suggests decomposing cognition into semi-independent subagents, each with its own values, world-model, and search/decision process, so we can better explain behavior that the single-agent view mispredicts. On this view, many supposed “cognitive biases” are not mysterious failings of a unitary reasoner but interactions and coordination failures among internal subagents.
Technically, the author defines an agent by three components: values (utility-like preferences, possibly framing-dependent), a world-model (which may be as simple as raw sensing), and a search/decision process (ranging from explicit decision theory, such as Causal Decision Theory, down to a simple lookup table). The definition is intentionally broad enough to cover simple systems (e.g., thermostats), which helps explain why humans are so naturally modeled as agents. For the AI/ML community, this suggests new priors for modeling, interpretability, and alignment: treat learned systems and human behavior as compositions of submodules with their own objectives and planners, examine their interactions, and design architectures and training regimes that mitigate coordination failures. The multi-agent lens promises more granular predictions and engineering strategies for complex, goal-driven systems.
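To make the three-part definition concrete, here is a minimal Python sketch. All names, the toy state, and the voting rule are illustrative assumptions rather than anything from the original piece: each subagent bundles its own values, world-model, and decision procedure, and a deliberately crude coordination step hints at where disagreements between subagents would surface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical sketch, not from the source. An "agent" here is just
# values + world-model + decision process, broad enough to cover a
# thermostat or a human subagent alike.

@dataclass
class Subagent:
    name: str
    values: Callable[[Dict], float]           # scores how well a state satisfies this subagent
    world_model: Callable[[Dict, str], Dict]  # predicts the next state given an action (may be trivial)
    candidate_actions: List[str]

    def decide(self, state: Dict) -> str:
        # "Search/decision process": one-step lookahead here; a lookup table would also qualify.
        return max(self.candidate_actions,
                   key=lambda a: self.values(self.world_model(state, a)))

def coordinate(subagents: List[Subagent], state: Dict) -> str:
    # Naive coordination: each subagent votes for its preferred action.
    # Ties and disagreements are exactly where "coordination failures"
    # (the reframed biases) would show up in a richer model.
    votes: Dict[str, int] = {}
    for agent in subagents:
        choice = agent.decide(state)
        votes[choice] = votes.get(choice, 0) + 1
    return max(votes, key=votes.get)

if __name__ == "__main__":
    # Toy example: a comfort-seeking subagent vs. a deadline-driven subagent.
    state = {"temperature": 17, "deadline_hours": 3}
    comfort = Subagent(
        name="comfort",
        values=lambda s: -abs(s["temperature"] - 21),
        world_model=lambda s, a: {**s, "temperature": s["temperature"] + (2 if a == "heat" else 0)},
        candidate_actions=["heat", "work"],
    )
    deadline = Subagent(
        name="deadline",
        values=lambda s: -s["deadline_hours"],
        world_model=lambda s, a: {**s, "deadline_hours": s["deadline_hours"] - (1 if a == "work" else 0)},
        candidate_actions=["heat", "work"],
    )
    print(coordinate([comfort, deadline], state))
```

The voting rule is intentionally simplistic; the point of the sketch is only that overall behavior emerges from the interaction of subagents rather than from a single utility maximizer, which is where the multi-agent framing gets its extra predictive grain.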