Researchers Criticize Anthropic's Fable AI Guardrails as Model Falls Back to Claude Opus 4.8

3 articles · Updated · TechCrunch · Jun 10

Security researchers say Anthropic's newly released Fable blocks even basic tasks such as reading a blog post, reviewing code or writing secure code if prompts appear remotely cyber-related.
Fable pauses chats when its safeguards fire, citing cybersecurity or biology risks, and then downgrades users to Claude Opus 4.8; critics say the filtering appears largely keyword-based.
Anthropic built those limits to keep the model from being misused to create malware or biological weapons, reflecting the same safety concerns that kept its stronger Mythos model tightly controlled at launch in April.
Mythos access was widened last week to hundreds of organizations across 15 countries, while approved users in Anthropic's Cyber Verification Program face fewer Claude restrictions; OpenAI runs a similar Trusted Access program.