Unauthorized users access Anthropic's Claude Mythos AI model within hours

11 articles · Updated · UnHerd · May 8
  • The breach reportedly involved guessed server locations, contractor credentials and information from an earlier compromise of AI training startup Mercor.
  • Anthropic had said the model, designed to find and exploit zero-day software flaws, would be restricted through Project Glasswing to about 40 selected organisations, including major tech firms and banks.
  • The incident intensified debate over AI cyber-risk, as rivals launched competing models and officials and financial institutions weighed security and regulatory implications.
With offensive AI like Mythos already leaked, is the new cyber arms race unstoppable?
When AI manages both attack and defense, what is the new role for human cybersecurity experts?

The Claude Mythos Incident: How 99% of AI-Discovered Vulnerabilities Remain Unpatched After a Major Security Breach

Overview

Anthropic withheld its powerful AI model, Claude Mythos, from public release because of major cybersecurity risks. To manage those risks, it launched Project Glasswing, granting limited access to trusted partners such as Apple and Microsoft so they could test the model and help secure critical systems. Despite these precautions, unauthorized access occurred within hours, exposing weaknesses in the controlled-release approach. The incident highlighted the difficulty of safeguarding advanced AI, the importance of strong vendor security, and the urgent need for better governance as models become more capable and potentially dangerous.

...