LLMs have made remarkable progress over the last few years, but there is still plenty of room for improvement in how they learn.
Jeff Dean, Chief Scientist at Google DeepMind and one of the most influential figures in modern AI, has put a sharp number on one of the field’s most persistent frustrations: LLMs need roughly a thousand times more data than a human being to reach comparable capability. Dean made the case that this gap isn’t just a hardware problem — it’s fundamentally an algorithmic one.

“We really need to come up with algorithmic things that just get much more out of every piece of data or example that the model sees, or every token,” Dean said in a conversation with fellow Google AI leaders.
The core of his argument is a comparison. A capable human and a capable LLM may end up in roughly the same place, but the paths to get there are radically different in terms of data consumption:
“If you look at the efficiency of, say, human learning, it’s a thousand times better than what our sort of LLM learning can do. The LLM gets to see a thousand times as much data as a really capable human and then gets to roughly similar capability — maybe slightly better in some things and not quite as good in others — but it needed a thousand times as much data.”
This implies that the current paradigm of scaling up data and compute is not the only path forward, and may not even be the most important one. What matters is how much the model actually learns from each example it sees.
Dean’s proposed solution is as simple to state as it is difficult to achieve:
“So if we could make it so that you could get a thousand times as much information out of every example, it would be amazing.”
This is not just an abstract aspiration. Dean has been developing this line of thinking across multiple recent conversations. In an earlier discussion on pre-training, he argued that future models should move beyond passive data absorption and instead take actions in their environment — choosing which data to learn from next, rather than processing a predetermined stream. The efficiency gains, he suggested, would be significant: rather than measuring progress by raw token count, the question becomes how much learning a model can extract from a given number of tokens.
Separately, Dean has also pushed back against the idea that data scarcity is the primary bottleneck, pointing instead to untapped video, audio, and synthetic data as underutilised resources. Taken together, his recent statements sketch a consistent worldview: the field’s next major leap will come not from feeding models more, but from teaching them to learn better.
This fits a broader trend. Four of Google DeepMind’s most senior researchers, including Dean himself, indicated that within a year, AI models could begin meaningfully improving themselves without requiring a full retraining cycle. Self-improvement and data efficiency are, at their core, the same bet: that the next frontier in AI isn’t a bigger model on more data, but a smarter learner on the same data. If Dean is right about the thousand-times gap, closing even a fraction of it would be transformative. The question is which algorithmic breakthrough gets there first.