🤖 AI Summary
Anthropic announced Claude Opus 4.5 and said the model "scored higher than any human candidate ever" on its two-hour engineering take-home exam, a timed test the company uses to evaluate prospective engineers' technical ability and judgement. Anthropic's methodology gave the model multiple attempts per problem and selected the best answer; the company notes the exam doesn't capture every engineering skill and has released few public details about the test content. Glassdoor reviews suggest the exam can involve implementing a multi-level system and adding features to it, but it is unclear whether the exact same format was used for Claude Opus 4.5.
The result is significant because it shows an LLM matching or exceeding human-level performance on time-boxed coding tasks, reinforcing the trend toward AI-augmented software development. Claude Opus 4.5 also improves at generating professional documents (Excel, PowerPoint) and arrives three months after its predecessor. Anthropic keeps its training methods largely secret, though insiders suggest iterative code-writing and human/AI review loops were used. Practically, this implies engineers may shift toward supervising, editing, and integrating AI-generated code rather than manually producing every feature, a change that reshapes hiring, evaluation, and team workflows rather than signaling wholesale job replacement.