Updated · Forbes · Jun 19
Physical AI Lacks Internet-Scale Data as 1 Million Robot Trajectories Fall Short

3 articles · Updated · Forbes · Jun 19

Summary

  • Physical AI remains stuck on basic real-world tasks because robots lack the vast sensorimotor training data that let generative AI scale quickly from trillions of web words.
  • Open X-Embodiment contains more than 1 million real robot trajectories spanning 22 robot types and 527 skills, but every trajectory had to be physically performed in a lab, leaving the corpus tiny by internet standards.
  • Teleoperation farms and simulation are filling the gap, with humans repeatedly piloting robots and virtual practice running millions of trials, yet clutter, deformable objects and bad lighting still break systems.
  • That leaves the biggest weakness in edge cases: a robot can ace a polished demo and still fail when a box shifts, a label tears or a mug sits slightly off position.
  • For buyers, the key metric is not funding or slick videos but real-world data hours, human intervention rates and performance outside controlled environments.

Insights

Can AI learn movement like a child, through curiosity, instead of needing billions of hours of expensive, pre-collected data?
As Chinese firms accelerate deployment, are Western companies losing the crucial race for real-world robotics data and market dominance?

The Data Bottleneck in Physical AI: Why Robotics Lags Behind Digital Intelligence and What It Will Take to Scale

Overview

Physical AI, especially robotics, is held back by a severe shortage of large-scale training data. Digital AI models train on trillions of words from the web, while robots have no comparable corpus of sensorimotor experience. Current robot learning improves by scaling up demonstrations and environments, but the infrastructure to collect real-world data at that scale does not yet exist. Even the largest open robot datasets are tiny next to those used in digital AI. Closing this bottleneck will require investment in new data collection methods and infrastructure.

...