MiniMax-M2 (github.com)

🤖 AI Summary
MiniMax-M2 is an open-source sparse Mixture-of-Experts (MoE) model released today, engineered to deliver "frontier-style" coding and agentic workflows at much lower deployment cost. Architecturally it exposes 230B total parameters with only ~10B activated per token, a design that targets faster plan→act→verify loops and lower latency for interactive agents.

Artificial Analysis ranks MiniMax-M2 #1 among open-source models on its composite intelligence metric, and it posts strong, practical results on coding and agent benchmarks (LiveCodeBench 83, SWE-bench Verified 69.4, Terminal-Bench 46.3, BrowseComp 44), showing real-world competence in multi-file edits, compile-run-fix loops, CI/IDE/terminal automation, and long-horizon toolchains spanning shell, browser, retrieval, and code runners.

For practitioners, the release includes model weights on HuggingFace, a public API, and a MiniMax Agent (free for a limited time), plus deployment guides recommending SGLang and vLLM. The operational upshot of 10B active parameters: lower per-request compute and tail latency, more concurrent runs per compute budget, and simpler capacity planning for regression suites and batched sampling. Note that MiniMax-M2 is an "interleaved thinking" model: its <think>…</think> blocks must be preserved in conversation history for optimal performance, and the team suggests sampling with temperature 1.0, top_p 0.95, top_k 40. Overall, MiniMax-M2 targets developers and researchers who need strong, tool-enabled intelligence without frontier-scale inference costs.
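To make the active-parameter point concrete, here is a back-of-envelope sketch using the standard ~2·N FLOPs-per-token estimate for a forward pass; the parameter counts are the headline numbers from the release, not measured throughput.

```python
# Rough decode-cost comparison: per-token forward FLOPs scale with *active*
# parameters (~2 * N), while weight memory still scales with *total* parameters.
TOTAL_PARAMS = 230e9   # all experts must be resident in memory
ACTIVE_PARAMS = 10e9   # experts actually activated per token

flops_if_dense = 2 * TOTAL_PARAMS   # hypothetical dense model of the same size
flops_moe = 2 * ACTIVE_PARAMS       # sparse MoE activation

print(f"dense: {flops_if_dense:.1e} FLOPs/token")
print(f"MoE:   {flops_moe:.1e} FLOPs/token ({flops_if_dense / flops_moe:.0f}x fewer)")
```

That ~23x reduction in per-token compute is what drives the claimed gains in concurrency and tail latency, even though all 230B weights still have to fit in (multi-GPU) memory.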
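For local serving, a minimal vLLM sketch might look like the following. The HuggingFace model ID, parallelism degree, and trust_remote_code flag are assumptions (check the repo's deployment guide for the exact supported launch configuration); the sampling values are the team's suggested defaults.

```python
# Minimal vLLM sketch (assumed model ID and parallelism; verify against the
# official deployment guide, as a 230B MoE needs a multi-GPU setup).
from vllm import LLM, SamplingParams

llm = LLM(
    model="MiniMaxAI/MiniMax-M2",  # assumed HuggingFace repo ID
    tensor_parallel_size=8,        # illustrative; size to your hardware
    trust_remote_code=True,        # MoE releases often ship custom model code
)

# Suggested sampling settings from the release notes.
params = SamplingParams(temperature=1.0, top_p=0.95, top_k=40)

outputs = llm.generate(["Write a Python function that reverses a string."], params)
print(outputs[0].outputs[0].text)
```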
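Because the model expects its interleaved reasoning to stay in context, clients should append the assistant's reply verbatim, <think>…</think> blocks included, rather than stripping them before the next turn. A hedged sketch against an OpenAI-compatible endpoint (as served by vLLM or SGLang; the URL and served model name are placeholders, and top_k rides along via extra_body since it is not a standard OpenAI parameter):

```python
# Keep <think>...</think> blocks in the running history; stripping them can
# degrade an interleaved-thinking model's multi-turn performance.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # placeholder endpoint

messages = [{"role": "user", "content": "Plan, then fix the failing test in utils.py."}]

resp = client.chat.completions.create(
    model="MiniMaxAI/MiniMax-M2",   # assumed served model name
    messages=messages,
    temperature=1.0,
    top_p=0.95,
    extra_body={"top_k": 40},       # vLLM/SGLang accept extra sampling params here
)

reply = resp.choices[0].message.content    # may include <think>...</think> inline
messages.append({"role": "assistant", "content": reply})  # preserve reasoning verbatim
messages.append({"role": "user", "content": "Now run the tests again."})
```

Note that some servers return reasoning in a separate field rather than inline; in that case the reasoning text still needs to be reinserted into the history the model sees.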