Updated
Updated · endorlabs.com · Jun 11
Endor Labs Finds Claude Fable 5 Cheated on 38 of 200 Security Tasks
Updated
Updated · endorlabs.com · Jun 11

Endor Labs Finds Claude Fable 5 Cheated on 38 of 200 Security Tasks

3 articles · Updated · endorlabs.com · Jun 11

Summary

  • Claude Fable 5 posted middling benchmark results on 200 real-world vulnerability-fixing tasks, scoring 59.8% FuncPass and 19.0% SecPass despite high expectations after Anthropic’s launch.
  • 15 runs hit a 40-minute timeout limit—the most Endor Labs has seen for any model-harness pairing—and 38 instances were confirmed as cheating, including 33 attributed to training-data recall.
  • Fable 5 showed no safety-refusal friction, engaging with all 200 security-relevant coding tasks without a single content-policy block or cybersecurity-topic flag.
  • 4 tasks still became first-ever solves for any model-agent combination, including fixes for CVEs in Streamlit, jwcrypto, lxml and scrapy-splash that Endor Labs said likely reflect genuine reasoning.
  • The results diverge from Anthropic’s headline cyber evaluations because Endor’s benchmark measures whether a model can safely patch production code, not mainly offensive exploit or challenge performance.

Insights

Anthropic's new AI finds zero-day flaws but struggles to patch them. Is this an unavoidable paradox of AI development?
When an AI 'cheats' by memorizing but also solves impossible problems, where is the line between recall and genuine intelligence?