Updated · IO Fund · May 31
Google Starts Selling TPU v8 Chips to Data Centers as Inference Demand Nears 50% of AI Load

4 articles · Updated · IO Fund · May 31
  • Google said it will sell TPU v8 accelerators to select third-party data center operators, turning its in-house AI chips into a merchant product for the first time.
  • About 31.2 gigawatts of 2026 data center demand is expected to go to inference, matching training, before inference pulls ahead in 2027. That shift makes lower cost per token a bigger buying criterion.
  • TPU 8i is aimed at that shift with 1,152-chip pods, 331.8 TB of coherent shared HBM, 3x more SRAM per chip and up to 80% better performance per dollar than Ironwood.
  • Google is pitching those economics alongside fast-growing cloud demand: Google Cloud revenue reached $20 billion in Q1 2026, backlog hit $462 billion and operating margin rose to 32.9%.
  • The move also exploits a possible opening against Nvidia, whose Rubin ramp may slip, though Nvidia is countering with new inference-focused system designs built around specialized racks and chips.
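
The pod and pricing figures in the bullets above can be sanity-checked with a little arithmetic. The sketch below is illustrative only. It assumes decimal units (1 TB = 1000 GB), that the pod's HBM is split evenly across chips, and that a performance-per-dollar gain translates directly into a lower cost for the same work. None of these assumptions is stated in the source, and the function names are hypothetical.

```python
# Figures quoted in the summary above.
POD_CHIPS = 1152              # chips per TPU 8i pod
POD_HBM_TB = 331.8            # coherent shared HBM per pod, in TB
PERF_PER_DOLLAR_GAIN = 0.80   # "up to 80% better" than Ironwood


def hbm_per_chip_gb(pod_hbm_tb: float, chips: int) -> float:
    """Average HBM per chip in GB (assumes 1 TB = 1000 GB, even split)."""
    return pod_hbm_tb * 1000 / chips


def relative_cost(perf_per_dollar_gain: float) -> float:
    """Cost of the same work relative to the baseline chip.

    Assumes cost scales inversely with performance per dollar.
    """
    return 1 / (1 + perf_per_dollar_gain)


if __name__ == "__main__":
    per_chip = hbm_per_chip_gb(POD_HBM_TB, POD_CHIPS)
    saving = 1 - relative_cost(PERF_PER_DOLLAR_GAIN)
    print(f"HBM per chip: about {per_chip:.0f} GB")
    print(f"Best-case cost reduction for the same work: about {saving:.0%}")
```

Under these assumptions, the pod figure works out to roughly 288 GB of HBM per chip, and "up to 80% better performance per dollar" corresponds to a best-case cost reduction of about 44% for the same work, not 80%.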

Google Enters Merchant AI Chip Market: TPU v8’s Technical Edge and the Fight to Disrupt Nvidia’s 70% Share

Overview

In 2026, Google formally entered the merchant AI accelerator market with its eighth-generation Tensor Processing Unit (TPU), the TPU v8. By selling TPU v8 to select third-party data center operators, Google is competing directly with Nvidia, the dominant supplier of AI hardware. The family comes in two variants: the TPU 8t is built for large-scale model training, while the TPU 8i targets inference workloads such as sampling and reasoning. Both chips are paired with Google's Arm-based Axion CPU as the host processor, reflecting Google's emphasis on performance and efficiency across the whole system. The shift opens a new phase of competition in AI hardware.

...