Pre-training AI models is widely treated as a settled process with fixed procedures and datasets, but that could change dramatically in the coming years.
Jeff Dean, Chief Scientist at Google DeepMind and Google Research, has laid out a vision for how pre-training — the foundational stage of building large AI models — could be fundamentally rethought. The core idea: instead of passively absorbing data, future models should take actions in the world during training, and actively choose what data they see next.

“The current pre-training regime is sort of like this: you take your model, you throw your random number generator at it to initialize it, you strap it to a board, and then you stream all the internet data past it and it learns what it can from that observation — but it’s not actually taking actions in the world,” Dean said.
The metaphor is deliberately stark — a model bound to a board, absorbing whatever passes before it. Dean’s argument is that this is a fundamental limitation, not merely an implementation detail.
What he proposes instead is an interleaving of passive learning with active engagement:
“It seems like we want to interleave some of that with something where the model gets to take actions in some environment — be it simulated robotics environments, trying to predict answers to questions, or things like that. And then going back to learning, where it’s also a bit more directed in how it chooses what data to see next, as opposed to a predetermined ordering of the data.”
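Dean doesn't specify a mechanism, but the regime he describes can be sketched in toy form: alternate a passive phase (consume data in a fixed order) with an active phase (the model picks which data shard to see next, based on where its own predictions are weakest). Everything here — `ToyModel`, the shard names, the `surprise` heuristic — is an illustrative assumption, not anything from Dean's remarks; a real implementation would involve a neural model and far richer environments.

```python
class ToyModel:
    """Stand-in 'model': learning is just accumulating unigram token counts."""
    def __init__(self):
        self.counts = {}
        self.total = 0

    def update(self, tokens):
        # 'Training step': absorb a batch of tokens.
        for t in tokens:
            self.counts[t] = self.counts.get(t, 0) + 1
            self.total += 1

    def surprise(self, tokens):
        """Fraction of tokens the model has never seen — a crude proxy for
        how much it expects to learn from this data."""
        if not tokens:
            return 0.0
        unseen = sum(1 for t in tokens if t not in self.counts)
        return unseen / len(tokens)


def interleaved_train(model, shards, steps):
    """Alternate passive streaming with model-directed shard selection."""
    order = list(shards)  # fixed, predetermined ordering for passive steps
    for step in range(steps):
        if step % 2 == 0:
            # Passive phase: next shard in the predetermined order.
            name = order[(step // 2) % len(order)]
        else:
            # Active phase: the model chooses the shard it finds most
            # surprising, i.e. where it expects the most learning per token.
            name = max(shards, key=lambda n: model.surprise(shards[n]))
        model.update(shards[name])
    return model


# Illustrative data: two small 'shards' of tokens.
shards = {
    "news": ["the", "a", "the", "of"],
    "code": ["def", "return", "def"],
}
model = interleaved_train(ToyModel(), shards, steps=4)
```

The active phase is the part Dean argues is missing today: the choice of what to see next depends on the model's current state rather than on a predetermined data order.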
The efficiency gains Dean expects from this shift are significant. Rather than measuring progress by raw token count, the question becomes how much learning a model can extract from a given number of tokens:
“That would actually be quite interesting as a way to dramatically improve the learning efficiency — for a given number of tokens, how much can the model get out of it? Taking actions in the world is going to be super useful for that. We do that in post-training, but that’s a very limited form of it.”
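One crude way to make "how much can the model get out of a given number of tokens" concrete is to measure held-out loss reduction per training token consumed. The model, data, and smoothing below are toy assumptions purely for illustration — Dean proposes no specific metric.

```python
import math

def avg_nll(counts, total, tokens, vocab=16):
    """Average negative log-likelihood of `tokens` under an
    add-one-smoothed unigram model defined by (counts, total)."""
    nll = 0.0
    for t in tokens:
        p = (counts.get(t, 0) + 1) / (total + vocab)
        nll -= math.log(p)
    return nll / len(tokens)

def learning_per_token(stream, heldout):
    """Train a unigram model on `stream`; return held-out loss drop
    divided by the number of training tokens consumed."""
    counts, total = {}, 0
    before = avg_nll(counts, total, heldout)
    for t in stream:  # 'training' pass over the stream
        counts[t] = counts.get(t, 0) + 1
        total += 1
    after = avg_nll(counts, total, heldout)
    return (before - after) / len(stream)
```

Under this toy metric, a stream relevant to the held-out data yields more learning per token than an irrelevant one — the kind of gap that model-directed data selection would aim to exploit.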
Crucially, Dean argues this shouldn’t be confined to the post-training phase as it is today. The pre-training/post-training divide itself, he suggests, may be artificial:
“I think interleaving this kind of thing much more, even at the pre-training stage — I mean, there’s no reason this distinction should exist for the long term.”
The implications are significant. The current pipeline — scrape the internet, initialize a model, train — has been the backbone of every major frontier model. Dean is suggesting that this pipeline’s passivity is a bottleneck on learning efficiency, and that models capable of directing their own training data intake could learn substantially more from the same compute budget.
This connects to a broader debate around pre-training: some, like Perplexity CEO Aravind Srinivas, have argued the pre-training era is ending. Google has pushed back — Gemini co-lead Oriol Vinyals credited both pre-training and post-training improvements for Gemini 3’s benchmark-topping performance. Dean’s comments go further: not just that pre-training still has legs, but that it could be restructured entirely.
The “actions in the world” framing is also notable in light of the industry’s current push toward agentic AI. If models are taking actions during pre-training — in simulated robotics environments or question-answering loops — the line between training and deployment starts to blur. Dean is essentially proposing that the behaviour we associate with agents be baked in at the foundational stage, not bolted on afterward.
Google is arguably better positioned than any lab to execute on this vision. It has vast untapped reserves of video, audio, robotics data, and decades of web data — precisely the kinds of environments where active, action-taking pre-training would operate. Dean’s remarks suggest the company is thinking seriously about how to exploit that advantage at the architectural level, not just the data level.