Researchers say AI just broke every benchmark for autonomous cyber capability (cyberscoop.com)

0 points 48 days ago | visit original

🤖 AI Summary

Researchers from the UK’s AI Security Institute (AISI) and Palo Alto Networks report that Anthropic’s Claude Mythos Preview and OpenAI’s GPT-5.5 have surpassed previous benchmarks for autonomous cybersecurity tasks. The models have accelerated the pace at which complex cyber operations can be completed, and AISI has revised its estimate of the doubling time for autonomous capabilities from eight months to four to five months. In AISI’s simulations, Claude Mythos Preview completed a sophisticated 32-step corporate attack scenario and tackled previously unsolved challenges; GPT-5.5 also showed impressive results. Palo Alto Networks corroborated these findings with extensive testing, which showed the models identifying vulnerabilities at an unprecedented scale. For enterprises, the findings add urgency to proactively fixing vulnerabilities and strengthening security measures against fast, capable AI-driven attacks.
