Cost Of Inference Will Fall, But It Won’t Go To Zero: Google DeepMind CEO Demis Hassabis

The cost of AI inference is dropping sharply, but it may never be entirely free.

That’s the view of Demis Hassabis, CEO of Google DeepMind, who pushed back on the popular assumption that inference — the compute required to run an AI model — will eventually become a negligible cost. In a conversation with Y Combinator president Garry Tan, Hassabis invoked Jevons paradox to argue that falling costs don’t lead to reduced consumption; they lead to more of it.

“I’m not sure inference will ever be essentially free,” Hassabis said. “There’s Jevons paradox and other things — I think we’ll just end up using, all of us will end up using whatever we can get our hands on.”

His reasoning is straightforward: as inference gets cheaper, demand scales to match. He sketched out two of the more likely scenarios for how that demand gets absorbed. “You could imagine millions of agents, swarms of agents working together on things — that’s one way to use the inference. Or you could imagine single agents, or smaller groups of agents, thinking in multiple directions and then ensembling that.”

Both paths, he noted, are already being explored. “We’re experimenting with all these things. Probably many of you are. All of that will use up any inference, I think, that’s available.”

On the longer horizon, Hassabis allowed for more optimism — but with limits. “One day, maybe it can be almost cost zero. Certainly, the energy — if we solve fusion, or superconductors, or optimal batteries, or some set of those things, which I think we will do with material science — energy costs will be essentially zero.”

But even that scenario has a floor. “There’ll still be the physical creation of the chips and other things. There will be some bottleneck, at least for the next few decades, I think. And so if that’s the case, there’ll still be rationing on the inference side. You’ll still have to use it, I think, efficiently.”


Hassabis’s remarks land at a telling moment. Inference costs have dropped dramatically over the past two years — driven by more efficient model architectures, hardware improvements, and intensifying competition — but the demand for compute has kept pace. Reasoning models, which think through problems step by step before responding, consume significantly more compute at inference time than their predecessors, effectively absorbing the efficiency gains. Hassabis himself has previously noted that compute demand is as high as ever, pointing to reasoning models as a key driver.

His competitor at OpenAI, Sam Altman, has argued that the ultimate floor on AI costs is energy — that chips and networking hardware will eventually be cheap to manufacture, but powering the computation will always cost something. Hassabis is essentially saying the same thing, while adding that chip fabrication itself will remain a constraint for decades. The physical cost of silicon, in other words, isn’t going away.

The implication for builders and enterprises is that efficiency will remain a real variable — not a solved problem. Swarms of agents and ensemble reasoning architectures may unlock powerful new capabilities, but they won’t be free to run. Whoever figures out how to get the most out of each unit of inference will have a durable edge. That pressure, Hassabis suggests, isn’t going away anytime soon.
