Unlike With LLMs, It’ll Take 2-5 Years To Figure Out Robotics’ Scaling Law: NVIDIA’s Jim Fan

The LLM scaling law is now well established: increase the compute, the training data, and the number of model parameters, and LLM performance improves in a predictable way. But it could be a while before we discover a similar law for robotics.

Nvidia Senior Research Scientist Jim Fan has offered his perspective on the future of robotics and the search for a “scaling law” comparable to what we’ve observed with Large Language Models (LLMs). His insights highlight a crucial difference between the relatively straightforward scaling of LLMs and the significantly more complex landscape of robotics. He predicts it will take two to five years before we fully understand how to scale robotics effectively.

“I think in the next two to five years, from a technical perspective, we will be able to fully study the embodied scaling law,” Fan said. He draws a comparison to the groundbreaking work on LLMs: “So, I think the biggest moment in large language models is the original Chinchilla scaling law. Basically, that exponential curve where you’re putting more compute, you scale the amount of data, you scale the number of parameters, and you would just see intelligence just going up exponentially.”
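For context, the Chinchilla work Fan references (Hoffmann et al., 2022) modelled LLM loss as a simple function of parameter count N and training tokens D, roughly of the form

L(N, D) = E + A·N^(−α) + B·D^(−β)

where E is an irreducible loss floor and the fitted exponents came out at roughly α ≈ 0.34 and β ≈ 0.28, implying that a compute-optimal model should grow its parameters and its training data in roughly equal proportion: one curve, a handful of constants.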

This straightforward relationship, however, doesn’t translate neatly to the world of robots. Fan explains, “I don’t think we have anything like that for robotics yet, because the scaling law is so complicated for robotics. Right? You can scale across model, you can scale across the hardware fleet — the real robot data. And how about the simulation data scaling law? How about the internet data scaling law? How about the neural simulation scaling law as you are generating lots of videos?”

The multitude of factors influencing robotic performance makes identifying a single, unifying scaling law a considerable challenge. Fan believes this complexity is precisely why focused research in the coming years is crucial. He expresses optimism that this research will yield significant results: “So we will be able to study all of these things so that perhaps, you know, five years from now, or sooner from then, we’ll have that plot on the screen that, you know, exactly how many GPUs you buy and how much better your robot world will be. So we [will] answer that question quantitatively very soon.”
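To make that concrete, the “plot on the screen” Fan describes would amount to fitting an empirical curve of resources against capability. The sketch below is purely illustrative; the data points, the choice of GPU-hours as the resource axis, and the fitted constants are all made up for the example rather than drawn from any real robotics results.

```python
# Illustrative sketch only: fitting a hypothetical power-law scaling curve,
# error = a * compute^(-b) + c, to made-up (compute, error) measurements.
import numpy as np
from scipy.optimize import curve_fit

def power_law(compute, a, b, c):
    # Error falls off as a power of compute, flattening toward an irreducible floor c.
    return a * compute ** (-b) + c

# Hypothetical measurements: GPU-hours of training vs. task error rate.
compute = np.array([1e2, 1e3, 1e4, 1e5, 1e6])
error = np.array([0.52, 0.31, 0.19, 0.12, 0.08])

params, _ = curve_fit(power_law, compute, error, p0=[1.0, 0.2, 0.05])
a, b, c = params
print(f"fitted: error ~ {a:.2f} * compute^(-{b:.2f}) + {c:.2f}")

# Extrapolate: predicted error if the compute budget were increased tenfold.
print("predicted error at 1e7 GPU-hours:", power_law(1e7, *params))
```

The curve fitting itself is the easy part; as Fan points out, the open question for robotics is which axes belong on that plot in the first place: real robot data, simulation data, internet video, model size, or some combination of them all.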

Fan’s projection of a two-to-five-year timeframe for unraveling the robotics scaling law suggests a period of intense research and experimentation. Unlike the relatively clear path observed with LLMs, where increasing compute and data generally leads to better performance, robotics presents a multi-dimensional challenge: the interplay between real-world data, simulated environments, cross-modal learning, and the sheer diversity of robotic hardware creates a complex web of variables that must be carefully untangled.

The quest to establish a predictable scaling law across these variables will likely drive innovation in data collection methods, simulation techniques, and hardware design. A clear roadmap of the kind LLMs already benefit from, in which resource investment translates directly into performance gains, would do much to accelerate the development and deployment of more sophisticated and capable robots.
