Anthropic says its new AI model “maintained focus” for 30 hours on multistep tasks (arstechnica.com)

0 points | 307 days ago

🤖 AI Summary

Anthropic today launched Claude Sonnet 4.5, billed as its "most capable model to date," alongside Claude Code 2.0, a command-line AI agent for developers, and a Claude Agent SDK that lets teams build custom coding agents. Anthropic says Sonnet 4.5 stayed focused on a single complex, multistep project for "more than 30 hours," a notable claim because agentic models typically lose coherence as errors accumulate and context windows (the model's short-term memory) fill. The company also emphasizes major gains in coding, computer use, reasoning, and math, and markets Sonnet 4.5 as the best coding model in the world. The news matters because long-horizon coherence is a practical bottleneck for autonomous agents, continuous code refactoring, and multistep pipelines. Sonnet occupies Anthropic’s mid-range slot between the smaller Haiku and larger Opus models (Haiku 3.5 updated Nov 2024; Sonnet 4.0 in May; Opus 4.1 in Aug), offering a performance/cost sweet spot that many developers favor. Anthropic didn’t detail the 30-hour tasks or the specific architectural or system changes behind the result; longer context windows, improved memory, or stepwise error correction could explain the behavior, so independent validation will be key. If reproducible, Sonnet 4.5 could make building robust, long-running developer agents and automation workflows far more practical.
