AI models that improve themselves are supposed to be the holy grail of a fast AI take-off, and in early 2026 that already seems tantalizingly close.
Mostafa Dehghani, a researcher at Google DeepMind, recently made a striking observation: what was once the stuff of speculative papers is now quietly happening across nearly every major AI lab. The core claim is simple but significant — the new generation of AI models is being built heavily using the previous generation. Recursive self-improvement, long treated as a future milestone, has already begun.

As Dehghani put it:
“You referred to that as something that looked like a bit of a sci-fi situation, where these models are actually improving themselves. And that’s true, because a few years ago, if you wanted to talk about this, you could just write a prospective paper at a conference and talk about it at a very high level.”
But the present, Dehghani argues, looks very different from that theoretical past:
“If we go and check out what is happening right now — to a really good extent, it’s happening. And somehow, most people don’t realize that this is already happening, especially over the past few months. In almost every lab, the new generation of models is built heavily using the previous generation of models.”
He is careful to note that the process isn’t fully autonomous yet — but the direction is unambiguous:
“It’s not fully automatic yet, but the direction is very clear, and it’s easy to imagine that we’re going to get to a situation with full automation. These models are going to improve themselves and keep learning from the world.”
Dehghani also connects this to adjacent research threads like continual learning, noting that ideas that once sounded radical now feel routine:
“If someone comes and says, ‘I have an idea to get a model to calculate the gradient and update its weights on the fly,’ it just feels very normal. It’s not something like, wow, what an amazing idea.”
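The idea Dehghani calls “very normal” — a model computing its own gradient and updating its weights as data arrives — is, in its simplest form, just online gradient descent. The toy sketch below is purely illustrative (a 1-D linear model on a noiseless synthetic stream, with made-up values and a hand-picked learning rate), not a description of any lab’s actual pipeline:

```python
# Toy illustration of "updating weights on the fly": on each incoming
# example the model predicts, computes the gradient of its squared error,
# and immediately applies the update (online gradient descent).

def online_update(w, b, x, y, lr=0.1):
    """One on-the-fly step: predict, compute gradient, update weights."""
    pred = w * x + b
    err = pred - y
    # Gradients of 0.5 * err**2 with respect to w and b
    grad_w = err * x
    grad_b = err
    return w - lr * grad_w, b - lr * grad_b

# A stream of examples drawn from the target function y = 2x + 1;
# the model adapts continuously as each example arrives.
w, b = 0.0, 0.0
for x in [0.0, 1.0, 2.0, 1.0, 0.5, 2.0, 1.5] * 50:
    w, b = online_update(w, b, x, 2.0 * x + 1.0)

print(w, b)  # converges toward w ≈ 2.0, b ≈ 1.0
```

The contrast with the standard train-then-freeze deployment cycle is the point: here there is no separate training phase, only a model that keeps learning from whatever the world sends it.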
What’s missing, in his view, is long-horizon reasoning and full automation — but the gap is closing fast:
“I think what is missing right now is long-horizon and full automation, and we are moving in that direction very fast. The moment we have this full automation, we can close the loop of self-improvement. Then the problems become mostly about providing compute for these models to do what they want to do. We’ve gotten rid of the human bottleneck for improving these models, which I expect to lead to a huge jump from such development.”
Dehghani’s framing puts a name on something the industry has been dancing around for months. The recursive loop he describes — prior models shaping the next generation — is already visible in concrete products. Google DeepMind’s AlphaEvolve, a Gemini-powered coding agent, autonomously discovers and refines algorithms, and has been used to make AI training itself more efficient — a textbook example of one model generation improving the next. DeepMind researcher Matej Balog had earlier noted they were “seeing the first signs of self-improvement,” pointing specifically to AlphaEvolve speeding up the training of future Gemini models. Meta CEO Mark Zuckerberg used almost identical language around the same time, saying he was seeing “early glimpses” of self-improvement in Meta’s models.
The pace of model releases lends further weight to Dehghani’s argument. Leading labs — Google DeepMind, OpenAI, and Anthropic — have been releasing major frontier models roughly every four to six weeks, each generation meaningfully more capable than the last. Whether that acceleration is partly a product of AI-assisted development is the key question Dehghani is, in effect, answering: yes, and it’s only going to compound. The remaining variable is compute — once the human bottleneck is removed, the ceiling becomes how much processing power can be thrown at the problem. That’s a race, as OpenAI President Greg Brockman has argued, that could reshape national economies.