Updated · 404 Media · Jun 30
Companies Adopt AI 'Caveman' Tool, Cutting Output Tokens by Up to 75%

2 articles · Updated · 404 Media · Jun 30

Summary

  • Caveman, a plugin that strips pleasantries and verbose phrasing from LLM replies, is being used inside companies to curb fast-rising and unpredictable AI token bills.
  • The tool claims to cut output tokens by 65% to 75%; in 404 Media’s test with Claude Code, it reported saving about 5,800 tokens, or 65%.
  • Legrand told employees to use the tool after billing changes and new quotas, alongside shifting tasks to cheaper models and avoiding high-reasoning settings unless needed.
  • OpenAI, Nvidia and GitHub developers are among reported users, and GitHub records show an OpenAI engineering director contributed Codex support to the project.
  • The push reflects a wider cost squeeze: GitHub moved to per-token pricing in April, while Uber and Walmart capped AI usage after burning through budgets faster than expected.

Insights

With companies desperate to make AI terse, is the token-based pricing model for AI fundamentally broken?
Could making AI less conversational to save money ultimately lead to more expensive and critical errors down the line?
Is the AI revolution creating a new inequality through a 'hidden language tax' on non-English users?