AI Agents Solve Problems In A Way Similar To Biological Evolution: Stephen Wolfram

As AI systems grow more sophisticated, they have begun to mirror some distinctly biological traits.

Stephen Wolfram — mathematician, computer scientist, and founder of Wolfram Research — has been thinking about the deep structural similarities between biological evolution and how modern AI agents work. His argument is striking: the trial-and-error loops that define agentic AI today are not just superficially similar to natural selection — they may be drawing on the same underlying power of computation.


Wolfram draws a sharp contrast between earlier AI systems and what we have now: “At that point it was next token prediction. Right now it’s kind of next task prediction. It’s kind of gone up a level in abstraction — rather than ‘what token should I emit next?’, it’s ‘what piece of some task should I do next?’”

But the shift in abstraction level is only part of the story. The other, more profound change is the ability to loop and self-correct. As Wolfram puts it: “This ability to kind of loop around and say, did I get there yet? If not, let me retry it. And the thing that’s surprising, that I think one doesn’t understand very well, is how powerful that process actually is.”
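The loop Wolfram describes is easy to sketch. The version below is purely illustrative: `attempt_task` and `goal_reached` are hypothetical placeholders for whatever the agent and its verifier actually do, and the “task” is reduced to guessing a hidden number so the control flow stays visible.

```python
import random

def attempt_task(task):
    # Hypothetical stand-in for an agent taking one shot at a task.
    # Here the "task" is just guessing a hidden number, so the loop
    # structure stays visible without any real model behind it.
    return random.randint(0, 99)

def goal_reached(result, task):
    # Hypothetical verifier answering Wolfram's "did I get there yet?"
    return result == task

def agent_loop(task, max_retries=1000):
    for attempt in range(1, max_retries + 1):
        result = attempt_task(task)
        if goal_reached(result, task):
            return attempt, result  # got there: stop looping
        # not there yet: loop around and retry
    raise RuntimeError("retry budget exhausted before the task was solved")

tries, answer = agent_loop(task=42)
print(f"solved after {tries} attempts: answer = {answer}")
```

Nothing in the loop is clever; the power comes entirely from being allowed to check, fail, and go around again.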

To illustrate just how powerful, he reaches for a sweeping analogy:

“If we look at a good analogy — biological evolution, which just tries random things and sees what happens — 10 to the 40th individual organisms that have lived in the history of life on earth later, there’s us. And the question is, why does that work?”

His answer connects to research he has been doing since the 1980s on cellular automata — tiny programs that produce unexpectedly rich behaviour. “What you know is something similar going on when you see the agents going off and doing random things. They don’t quite work, and then they try again, and then they make something work. My current meta theory of what’s going on is it’s actually something very similar to what happens in biological evolution. I studied back in the eighties lots of very tiny programs, particularly these cellular automaton systems where a tiny, tiny program goes in and very complicated behaviour comes out.”
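Rule 30, one of the elementary cellular automata from that eighties work, makes the point concrete: the whole program is the eight-bit binary expansion of the number 30, yet the pattern it produces from a single black cell never settles into obvious order. A minimal Python rendering (the width, step count, and wrap-around boundary are arbitrary display choices):

```python
# Rule 30, one of the elementary cellular automata Wolfram studied:
# the entire update rule is the 8-bit binary expansion of 30.
RULE = 30
WIDTH, STEPS = 64, 32

row = [0] * WIDTH
row[WIDTH // 2] = 1  # start from a single black cell in the middle

for _ in range(STEPS):
    print("".join("#" if cell else " " for cell in row))
    # Each new cell reads (left, centre, right) as a 3-bit index into
    # the rule number; the grid wraps around at the edges.
    row = [
        (RULE >> (4 * row[(i - 1) % WIDTH] + 2 * row[i] + row[(i + 1) % WIDTH])) & 1
        for i in range(WIDTH)
    ]
```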

The deeper point Wolfram is making is about the nature of computation itself. Normally, when humans write code, they can anticipate what it will do. But the computational universe is far larger than the slice we deliberately engineer:

“I think that is the main secret that gets us complexity in nature, but it’s also the thing that it’s this great power of computation that isn’t what we normally harness. Because normally when we are doing computation, we are writing code where we can foresee what that code is supposed to do. Whereas in the computational universe of all possible programs, even these little tiny programs do all kinds of complicated things — things for which we don’t necessarily have a purpose. But they just do those things. So I think the thing that’s happening both in biological evolution and in machine learning is we’re getting to harness the power of what is possible in the computational universe.”

The implications are significant. If Wolfram is right, the effectiveness of agentic AI isn’t primarily about intelligence in any traditional sense — it’s about iteration at scale, sampling vast possibility spaces until something works. That’s a different kind of power, and one that doesn’t require understanding in the conventional sense any more than evolution “understands” the organisms it produces.

This framing helps explain a trend that has been building rapidly. Andrej Karpathy’s autoresearch project — which lets AI agents autonomously run over a hundred machine learning experiments overnight, checking whether each one improved results and discarding it if not — is almost a textbook demonstration of the evolutionary loop Wolfram describes. The agent doesn’t understand the experiments; it just iterates. Sam Altman has predicted that 2026 will be the year AI begins making meaningful scientific discoveries, and Anthropic CEO Dario Amodei has suggested the first one-person billion-dollar company — built by orchestrating AI agents across departments — could arrive in 2026. None of these predictions require agents to be “smart” in the way humans are. They just require agents to try, fail, and try again fast enough and at sufficient scale. Evolution’s lesson, applied to software.
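Stripped to its skeleton, that overnight loop is greedy random search: propose a variant, keep it if it scores better, discard it if not. The sketch below is a toy stand-in under loose assumptions; `propose_variant` and `evaluate` are hypothetical placeholders, with a simple quadratic playing the role of a real training run.

```python
import random

def propose_variant(config):
    # Hypothetical mutation step: perturb one hyperparameter at random,
    # standing in for an agent proposing its next experiment.
    key = random.choice(list(config))
    return {**config, key: config[key] * random.uniform(0.5, 2.0)}

def evaluate(config):
    # Hypothetical fitness function: a toy quadratic with its minimum at
    # lr = 0.1, wd = 0.01 plays the role of an actual training run.
    return (config["lr"] - 0.1) ** 2 + (config["wd"] - 0.01) ** 2

best = {"lr": 1.0, "wd": 0.1}
best_loss = evaluate(best)

for experiment in range(120):  # "over a hundred experiments overnight"
    candidate = propose_variant(best)
    loss = evaluate(candidate)
    if loss < best_loss:
        best, best_loss = candidate, loss  # the change improved results: keep it
    # otherwise discard it; no understanding required, only iteration

print(f"best config after search: {best}, loss = {best_loss:.6f}")
```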
