Amid Mythos’ hyped cybersecurity prowess, researchers find GPT-5.5 is just as good (arstechnica.com)

0 points 59 days ago | visit original

🤖 AI Summary

New research from the UK’s AI Security Institute (AISI) finds that OpenAI’s recently launched GPT-5.5 has cybersecurity capabilities comparable to those of Anthropic’s much-hyped Mythos Preview model. Despite Anthropic’s claim that Mythos poses an outsized threat, AISI’s tests show GPT-5.5 achieving an average pass rate of 71.4% on high-level cybersecurity tasks, slightly ahead of Mythos Preview’s 68.6%. In one test, GPT-5.5 completed a complex disassembler challenge in just over ten minutes at a cost of $1.73.

Both models succeeded on novel tests such as the “The Last Ones” data extraction challenge, a milestone in AI capabilities, but both struggled with more complex tasks such as the “Cooling Tower” simulation. The results challenge the narrative surrounding Mythos and suggest that leading AI models are not only improving in their core capabilities but also becoming competitive in niche applications like cybersecurity.
