Physical AI Lacks Internet-Scale Data as 1 Million Robot Trajectories Fall Short
Updated · Forbes · Jun 19
3 articles · Updated · Forbes · Jun 19
Summary
Physical AI remains stuck on basic real-world tasks because robots lack sensorimotor training data on anything like the scale of the trillions of web words that let generative AI advance so quickly.
More than 1 million real robot trajectories in Open X-Embodiment span 22 robot types and 527 skills, but each had to be performed in labs, leaving the corpus tiny by internet standards.
Teleoperation farms and simulation are filling the gap, with humans repeatedly piloting robots and virtual practice running millions of trials, yet clutter, deformable objects and bad lighting still break systems.
That leaves the biggest weakness in edge cases: a robot can ace a polished demo and still fail when a box shifts, a label tears or a mug sits slightly off position.
For buyers, the key metrics are not funding or slick videos but real-world data hours, human intervention rates and performance outside controlled environments.
Can AI learn movement like a child, through curiosity, instead of needing billions of hours of expensive, pre-collected data?
As Chinese firms accelerate deployment, are Western companies losing the crucial race for real-world robotics data and market dominance?
The Data Bottleneck in Physical AI: Why Robotics Lags Behind Digital Intelligence and What It Will Take to Scale
Overview
Physical AI, and robotics in particular, is held back by a severe shortage of large-scale training data, whereas digital AI models train on billions of data points. Robot learning currently advances by scaling up demonstrations and environments, but the infrastructure to collect real-world data at that scale does not yet exist. Even the largest open robot datasets are tiny compared with those used in digital AI. Closing this bottleneck will require urgent investment in new data collection methods and infrastructure.