Anthropic ships three undetected Claude Code quality regressions

10 articles · InfoWorld · May 4
  • Over six weeks, changes shipped on 4 March, 26 March, and 16 April cut coding quality, including a 3% drop from a single prompt tweak; users reported problems quickly, though Anthropic's own checks did not.
  • The regressions stemmed from lowering default reasoning effort, a caching bug that cleared context every turn, and concision instructions that standard release-gate evals missed.
  • Anthropic’s postmortem said agent evaluation must use real production failures, repeated trials, and near-100% regression gates (sketched in code below), underscoring a wider industry concern that narrow dashboards can miss user-visible declines.
Why do smarter AIs fail in ways that even their creators' most sophisticated tests cannot detect?
Is the entire AI industry just 'shipping vibes,' and what is the true cost of this silent crisis?
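
The postmortem's prescription of repeated trials and near-100% regression gates is concrete enough to sketch. Below is a minimal Python gate under stated assumptions: every name and threshold here (gate_release, trials_required, the 0.99 bar) is illustrative, not Anthropic's actual evaluation harness.

```python
from dataclasses import dataclass

@dataclass
class TaskResult:
    task_id: str        # one eval task, ideally drawn from a real production failure
    passes: list[bool]  # outcome of each repeated trial of that task

def gate_release(results: list[TaskResult],
                 trials_required: int = 10,
                 min_task_pass_rate: float = 0.9,
                 min_gate_rate: float = 0.99) -> bool:
    """Block a release unless nearly every task passes reliably.

    Agent behaviour is stochastic, so each task runs many times; a
    single-trial eval can hide a regression that only appears
    intermittently, which is how a small quality drop slips past a gate.
    """
    passing = 0
    for r in results:
        if len(r.passes) < trials_required:
            raise ValueError(f"{r.task_id}: needs >= {trials_required} trials")
        if sum(r.passes) / len(r.passes) >= min_task_pass_rate:
            passing += 1
    return passing / len(results) >= min_gate_rate

if __name__ == "__main__":
    results = [TaskResult("prod-failure-001", [True] * 10),
               TaskResult("prod-failure-002", [True] * 8 + [False] * 2)]
    print(gate_release(results))  # False: the flaky task fails the reliability bar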

The 46% Time Savings Undermined: Inside Anthropic’s Claude Code Regression and User Backlash

Overview

In April 2026, Anthropic faced a major crisis as three separate product-layer changes (a reduced reasoning-effort setting, a session-state caching bug, and a misconfigured verbosity prompt) interacted to degrade Claude Code's output quality. Users saw shallow code, lost conversation context, and brief, unexplained answers, sparking widespread frustration and an 'AI shrinkflation' narrative that recent session-limit adjustments had already fueled. Anthropic responded by shipping a fix on April 20, offering usage-limit resets as compensation, and launching new transparency efforts through a dedicated developer account. Despite these steps, rebuilding trust remains a challenge: the incident exposed how fragile an AI product ecosystem can be beyond the core model.
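
Of the three changes, the session-state caching bug is the easiest to see in miniature. The toy Python class below is one guess at how that failure class arises (a cache key that accidentally includes a per-turn value, so every lookup misses and prior context is silently dropped); the names and mechanism are hypothetical, not Anthropic's code.

```python
import time

class SessionCache:
    """Toy session store illustrating the failure class described above."""

    def __init__(self) -> None:
        self._store: dict[str, list[str]] = {}

    def _key(self, session_id: str) -> str:
        # BUG: mixing a per-turn timestamp into the key guarantees a cache
        # miss on every turn, so conversation context resets each time.
        return f"{session_id}:{time.time_ns()}"
        # FIX: key on the stable session id alone: return session_id

    def append_turn(self, session_id: str, message: str) -> None:
        self._store.setdefault(self._key(session_id), []).append(message)

    def load_context(self, session_id: str) -> list[str]:
        return self._store.get(self._key(session_id), [])

if __name__ == "__main__":
    cache = SessionCache()
    cache.append_turn("s1", "user: refactor this module")
    print(cache.load_context("s1"))  # [] -- the previous turn is gone
```

A bug of this shape is invisible to single-turn evals, since the first turn never needs cached context; only repeated, multi-turn tests of the kind the postmortem calls for would catch it.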

...