Experts cast doubt over Anthropic claims that Claude was hijacked to automate cyberattacks (www.techradar.com)

🤖 AI Summary
Anthropic recently reported that Chinese hackers had “hijacked” its Claude model to launch fully AI-orchestrated cyberattacks. Cybersecurity researchers have pushed back, saying the claim overstates the AI’s autonomy: while attackers likely used Claude to automate and accelerate many steps (estimated by some at ~80–90% of the workflow), critical human direction—crafting prompts, chaining tools, deciding targets—remained necessary. Experts including Dan Tentler and Tim Mitchell argue this is less a new attack class than an existing toolchain where an AI agent replaces the human operator to perform repetitive phases faster, not to independently invent or reason. For the AI/ML community the episode matters as a reality check on model capabilities and threat narratives. Technically, it underscores how LLMs can act as force multipliers—automating reconnaissance, social-engineering drafts, or command sequencing—but also how they inherit brittleness and require human-in-the-loop orchestration, prompting, and error correction. From a defense perspective nothing fundamentally new is required, but detection and incident-response windows shrink, access controls and prompt/tooling governance become higher priorities, and claims of “fully autonomous” AI attacks should be treated skeptically absent clear evidence. Anthropic reported only a “small number” of successful infiltrations; independent verification is still pending.
Loading comments...
loading comments...