AI Providers Raise Per-Token Costs as Usage Jumps by Hundreds With Agents
Updated · O'Reilly Media · Jun 30
Summary
GitHub Copilot has shifted from unlimited monthly access to credit-based billing, charging $0.01 per credit and signaling a broader retreat from cheap, all-you-can-use AI.
Reasoning models and agents are driving that change because a single request can trigger many model calls, with internal reasoning and accumulated context pushing token consumption up by factors of hundreds.
Anthropic and OpenAI have already reinforced the trend with steeper pricing for stronger models: Fable costs about 2x Opus 4.8, while GPT 5.5 costs 2x GPT 5.4 per million tokens.
Capacity is tightening at the same time: new data centers and power infrastructure are lagging demand, outages have been tied to capacity constraints, and providers can use higher prices to ration scarce compute.
Developers and managers are responding by emphasizing token governance, observability, and routing work to cheaper or local models, making token optimization the new norm.
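
To make the "factors of hundreds" claim concrete, here is a rough back-of-the-envelope sketch in Python. It assumes an agent that resends its entire accumulated context on every model call, which is one common pattern but not the only one. The function name and every number in it (a 500-token prompt, 20 steps, 1,000 tokens of tool output and reasoning added per step) are hypothetical illustrations, not figures from the article. The sketch counts input tokens only and ignores output tokens and prompt caching.

```python
def agent_input_tokens(steps: int, prompt_tokens: int, tokens_added_per_step: int) -> int:
    """Total input tokens billed when every step resends the full context.

    Hypothetical model: the context starts at `prompt_tokens` and grows by
    `tokens_added_per_step` (tool output, reasoning, etc.) after each call.
    """
    total = 0
    context = prompt_tokens
    for _ in range(steps):
        total += context                  # this call is billed for the whole context
        context += tokens_added_per_step  # context grows before the next call
    return total


# Illustrative numbers only (assumptions, not data from the article).
single_call = agent_input_tokens(steps=1, prompt_tokens=500, tokens_added_per_step=1000)
agent_run = agent_input_tokens(steps=20, prompt_tokens=500, tokens_added_per_step=1000)

print(f"single call: {single_call} input tokens")
print(f"20-step agent run: {agent_run} input tokens")
print(f"multiplier: {agent_run / single_call:.0f}x")
```

Because each step is billed for everything that came before it, total input tokens grow roughly with the square of the number of steps. Under these assumptions, that is why a modest agent loop can cost hundreds of times more than a single request, and why trimming or summarizing context is a common token-governance lever.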