Robots do not lack intelligence so much as they lack data. The training sets that taught language models to write never existed for machines that pick, walk, and grasp. Nvidia's answer, unveiled on June 1 at Computex, is a model that lets any robotics company manufacture that data in simulation.
What Actually Happened
At the Computex keynote in Taipei, Nvidia introduced Cosmos 3, which it calls the world's first open Physical AI omnimodel. It ships in two sizes available now: a Nano variant at 8 billion parameters and a Super variant at 32 billion parameters. The architecture is a Mixture-of-Towers design that Nvidia abbreviates as MoT, splitting the model into separate reasoning and generation towers so that planning and world simulation do not compete for the same weights. The single model handles vision reasoning, world simulation, action generation, and multimodal input and output in one stack.
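The weight-separation idea can be illustrated with a toy sketch. This is not Nvidia's architecture: the class names, the single linear layer per tower, and the task-based dispatch are all hypothetical, chosen only to show what it means for two towers not to compete for the same weights. How the real towers exchange information is not described here, and the sketch does not model it.

```python
import random


class Tower:
    """Stand-in for one tower: a single linear layer that owns its weights."""

    def __init__(self, dim, seed):
        rng = random.Random(seed)
        self.weights = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                        for _ in range(dim)]

    def forward(self, x):
        # Plain matrix-vector product.
        return [sum(w * v for w, v in zip(row, x)) for row in self.weights]

    def nudge(self, step):
        # Toy stand-in for a training update: shift every weight by `step`.
        self.weights = [[w + step for w in row] for row in self.weights]


class TwoTowerModel:
    """Hypothetical two-tower model: each task type has its own parameters."""

    def __init__(self, dim=4):
        self.towers = {
            "reasoning": Tower(dim, seed=0),
            "generation": Tower(dim, seed=1),
        }

    def forward(self, x, task):
        # Dispatch to the tower for this task; no weights are shared.
        return self.towers[task].forward(x)


model = TwoTowerModel(dim=4)
x = [1.0, 0.5, -0.5, 2.0]

reasoning_before = model.forward(x, "reasoning")
generation_before = model.forward(x, "generation")

# Update only the generation tower, as if training on a generation task.
model.towers["generation"].nudge(0.1)

reasoning_after = model.forward(x, "reasoning")
generation_after = model.forward(x, "generation")

print("reasoning unchanged:", reasoning_before == reasoning_after)
print("generation changed:", generation_before != generation_after)
```

In a single shared network, the same update would have moved the outputs for both tasks. Keeping the parameters apart is what lets one capability be trained without disturbing the other, at the cost of carrying two sets of weights.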
The benchmark claim is blunt: Cosmos 3 ranks number one across more than seven robotics evaluations, including Physics-IQ, PAI-Bench, R-Bench, RoboLab, RoboArena, VANTAGE-Bench, and the TAR Leaderboard. The pitch to industry is the data problem itself: every robot manufacturer can use Cosmos 3 to generate synthetic training data, test control policies, and accelerate development without the slow, expensive grind of collecting millions of real-world demonstrations. Because the model is open, those manufacturers can run it on their own infrastructure rather than calling a closed API for every simulation step.
Why This Matters More Than People Think
The bottleneck in robotics has never been the motors or the cameras. It is the data gap. A language model can train on the entire public internet. A humanoid has no equivalent corpus of physical experience, and gathering it by teleoperating real robots costs millions of dollars and years of wall-clock time. Cosmos 3 attacks that gap directly by turning simulation into a data factory, which means a startup with a good robot but no fleet can suddenly generate the experience it could never afford to record.
Making it open changes who gets to play. Until now the best world models lived inside Google DeepMind, Tesla, and a handful of well-funded labs. An 8B model that a small team can run on a single workstation, plus a 32B model for serious training pipelines, lowers the entry price for physical AI the way open language models lowered it for chatbots. The companies that benefit most are the dozens of humanoid and industrial-robot startups that have hardware but lack the data engine to make it useful.
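A rough way to sanity-check the single-workstation claim is weight-memory arithmetic: parameters multiplied by bytes per parameter. The sketch below uses the two parameter counts quoted above. The precision formats are assumptions, since the release precision is not stated here, and the estimate covers weights only, ignoring activations, caches, and any training state.

```python
# Parameter counts as quoted in the article.
MODELS = {"Nano": 8e9, "Super": 32e9}

# Bytes per parameter for common numeric formats (assumed, not confirmed
# for Cosmos 3).
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params, precision):
    """Memory needed to hold the weights alone, in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


for name, params in MODELS.items():
    for precision in BYTES_PER_PARAM:
        gb = weight_memory_gb(params, precision)
        print(f"{name:5s} {precision:4s} {gb:6.1f} GB")
```

At 16-bit precision the 8B weights come to about 16 GB, which fits on a single high-end workstation GPU, while the 32B weights come to about 64 GB and generally call for a data-center card or several GPUs. Real memory use will be higher once activations and generation buffers are included.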
The Competitive Landscape
The obvious rival is Google DeepMind, whose Gemini Robotics work and world-model research target the same problem from inside a closed ecosystem. World Labs, the startup founded by Fei-Fei Li, has raised substantial funding to build spatial-intelligence models, and Tesla guards its Optimus simulation stack as a competitive secret. Cosmos 3 undercuts all of them on access: it is free, open, and ranked first on the public benchmarks those teams also chase.
The more interesting dynamic is on the robot-builder side. Figure AI, Physical Intelligence, Skild AI, Apptronik, and the wave of Chinese humanoid makers all need a data and simulation layer. Many were building it in-house. Cosmos 3 hands them a ready-made foundation, which is generous until you remember that every simulation run and every fine-tune is most efficient on Nvidia GPUs. The model is open, but the compute beneath it is not, and that is precisely the point. Nvidia is doing for robotics what it just did for language models with Nemotron 3 Ultra: give away the model, sell the silicon it runs on.
It also lands in a week when robotics suddenly got crowded. OpenAI announced a dedicated robotics division over the same weekend, Meta acquired the humanoid startup Assured Robot Intelligence in May, and Tesla continues to scale Optimus. By shipping the shared substrate everyone else needs, Nvidia positions itself as the arms dealer in a war it does not have to win directly.
Hidden Insight: Whoever Owns the Simulator Owns the Robots
The race in humanoid robotics is usually framed as a hardware contest: who builds the best hands, the most efficient actuators, the longest-lasting battery. That framing misses where the leverage actually sits. The team that controls the simulator controls the data distribution every robot trains on, and the data distribution shapes what robots can and cannot do. By making Cosmos 3 the default open simulator, Nvidia is angling to sit underneath the entire industry the way it already sits underneath model training.
This is a platform play disguised as a research release. If Cosmos 3 becomes the standard environment where robot policies are born and tested, then Nvidia influences the development roadmap of every company that adopts it, and it does so while selling the DGX systems and GPUs that run the simulations. The model is the on-ramp. The compute is the toll road. The same structure made CUDA indispensable, and Nvidia is deliberately repeating it one layer up.
The risk is that simulation is not reality, and physical AI lives or dies on that gap. A model can rank first on Physics-IQ and RoboArena and still produce policies that fail when a real gripper meets a real, slightly greasy bolt. Critics argue that synthetic data trains robots to be excellent at the simulator and mediocre at the world, the so-called sim-to-real gap that has humbled robotics labs for a decade. Benchmarks like PAI-Bench and VANTAGE-Bench measure progress against curated tasks, not the long tail of messy factory floors and cluttered homes where robots actually have to earn their keep. An open model that is number one on paper can still leave the hardest 20% of real deployment unsolved, and that last stretch is where most robotics companies quietly die.
However, even a leaky simulator changes the economics in Nvidia's favor. A robot team that can pre-train policies in Cosmos 3 and then fine-tune on a smaller real dataset still spends far less than one collecting everything by hand. The sim-to-real gap does not have to close completely for the model to be worth adopting. It only has to shrink the volume of expensive real-world data required, and on that measure a number-one-ranked open simulator is a serious gift, flaws and all.
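The economic argument can be made concrete with a back-of-the-envelope model. Every number below is hypothetical, chosen only to show the shape of the trade-off; none comes from Nvidia or from any robotics company. The sketch assumes a fixed cost per real-world demonstration and a one-off simulation cost, and treats sim pre-training as cutting the real data requirement to some fraction of the original.

```python
def real_only_cost(num_demos, cost_per_demo):
    """Cost of collecting every demonstration on real hardware."""
    return num_demos * cost_per_demo


def sim_plus_finetune_cost(num_demos, cost_per_demo, real_fraction, sim_cost):
    """Simulation cost plus the reduced share of real demonstrations."""
    return sim_cost + real_fraction * num_demos * cost_per_demo


def break_even_fraction(num_demos, cost_per_demo, sim_cost):
    """Largest real-data fraction at which simulation still pays for itself."""
    return 1.0 - sim_cost / (num_demos * cost_per_demo)


# Hypothetical inputs, for illustration only.
NUM_DEMOS = 1_000_000     # demonstrations needed without simulation
COST_PER_DEMO = 5.0       # dollars per teleoperated demonstration
SIM_COST = 500_000.0      # dollars of compute for synthetic pre-training
REAL_FRACTION = 0.25      # share of real demos still needed afterwards

baseline = real_only_cost(NUM_DEMOS, COST_PER_DEMO)
with_sim = sim_plus_finetune_cost(NUM_DEMOS, COST_PER_DEMO,
                                  REAL_FRACTION, SIM_COST)
threshold = break_even_fraction(NUM_DEMOS, COST_PER_DEMO, SIM_COST)

print(f"real-only cost:     ${baseline:,.0f}")
print(f"sim + fine-tune:    ${with_sim:,.0f}")
print(f"break-even fraction: {threshold:.2f}")
```

Under these made-up inputs, simulation pays off as long as it removes more than a tenth of the real data requirement. That is why the sim-to-real gap only has to shrink, not vanish, for the approach to be worth adopting.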
What to Watch Next
In the next 30 to 90 days, watch the download and integration numbers. If Figure, Apptronik, Physical Intelligence, and the Chinese humanoid makers start publicly building on Cosmos 3 rather than their own stacks, the standard-setting play is working. Watch too for academic adoption, because robotics labs that lacked compute will grab an open 8B model fast, and their published results will either validate or puncture the benchmark claims within a couple of conference cycles.
Over the 180-day horizon, the signal that matters is real-world deployment, not leaderboard position. Track whether policies trained primarily in Cosmos 3 graduate to actual factory shifts and warehouse runs without a wall of safety incidents. If Nvidia pairs Cosmos 3 with its GR00T humanoid foundation models and the Jetson edge hardware into a single pipeline, it will have assembled an end-to-end physical-AI platform that competitors will struggle to match piece by piece. The day a robot company credits Cosmos-trained policies for a paying deployment is the day this release stops being a benchmark story and becomes an industry foundation.
In robotics the hardware gets the headlines, but whoever owns the simulator quietly owns the data every robot is born from.
Key Takeaways
- First open Physical AI omnimodel: Cosmos 3 ships in 8B Nano and 32B Super sizes, both available now.
- Number one across 7+ robotics benchmarks, including Physics-IQ, PAI-Bench, RoboArena, and the TAR Leaderboard.
- Mixture-of-Towers architecture: separate reasoning and generation towers handle planning and world simulation.
- Synthetic data factory: robot makers can generate training data and test policies without massive real-world collection.
- Open model, Nvidia compute: free to use, but most efficient on Nvidia GPUs, mirroring the Nemotron strategy.
Questions Worth Asking
- If one company supplies the simulator the whole robotics industry trains on, who really sets the direction of physical AI?
- How much of the sim-to-real gap can synthetic data close before a robot still needs expensive real-world demonstrations?
- If your business depends on robots, does an open foundation model lower your costs or quietly deepen your dependence on a single chip vendor?