Claude Mythos 1 Solves Erdos Problem 90, Scores 69% on Exploit Bench
Updated · Geeky Gadgets · May 29
Leaked Claude Mythos 1 outputs showed the Anthropic model solving Erdos Problem 90 and producing a Python visualization, offering a rare look at its math, coding and creative reasoning.
A 69% score on Exploit Bench stood out as the review’s key benchmark, suggesting the model is especially strong at software exploitation and other cybersecurity tasks.
Anthropic has kept Mythos 1 largely internal because of misuse risks, though the report said the company has hinted a public release could come once stronger safety controls are in place.
Those capabilities could make the model useful in research, enterprise systems and cloud security, while raising broader questions about how to deploy high-end AI tools responsibly.