DeepSeek Is Marketed As Cheap But Likely Has 50,000 H100s: Anthropic CEO Dario Amodei

Even as DeepSeek has taken the AI world by storm and caused stock markets to tumble, incumbents in the space are looking to temper some of the enthusiasm around the model.

Anthropic CEO Dario Amodei has said that DeepSeek likely has 50,000 NVIDIA H100 GPUs, which is nearly half the number of GPUs Elon Musk has in his giant Colossus cluster. Given that scale of hardware, Amodei implied, DeepSeek’s performance relative to its American peers isn’t unexpected. But he said that as American companies got hold of more GPUs, they would regain their lead in the space.

“There (has been a) switch to reasoning models,” Amodei said at an event yesterday. “At all times in the past, 99.9 percent of the compute went into one kind of training, which is pre-training. We’re now executing the switch over where we figured out how to put small amounts of compute into this second stage called the reinforcement learning stage. And because none of it was being done before, there are big gains to that stage, and the amount of compute in that stage is increasing to the point where it will even become dominant right now,” he explained.

“We’re in the switch over region where with a little bit of RL training, you can kind of catch up with the current situation. Whenever the paradigm switches over, the landscape scrambles a bit. But then it kind of reestablishes itself. So later this year, we and probably others will have hundreds of thousands of chips,” he said.

“There will be millions of chips from various companies in 2026 and more in 2027. DeepSeek, the Chinese company, it’s marketed as cheap, but it’s at least been reported that DeepSeek has 50,000 H100s, which for reference is about half of what Elon’s Colossus cluster has,” Amodei said.

“I won’t say exactly how many of these Anthropic has right now. So I actually think we have an opportunity, and this relates to (the US’s) China policy right now. Both sides are roughly in the mid tens of thousands, low hundreds of thousands of chips. That’s why we’re kind of close to parity,” he said.

“Now we’re at the crossover point. But as we go to hundreds of thousands and millions of chips, there’s two possible futures. In one of those futures, the U.S. and its allies are able to provision that many chips fast enough, and because of the export controls on chips to China and because Chinese Huawei chips are inferior, China cannot get to that scale. There’s another world where both sides get to that scale, (where there will be parity between the US and China),” he said.

Dario Amodei seemed to be admitting that there was little to separate US and Chinese companies on a technical basis, and that it was access to GPUs that could ensure Chinese companies never caught up to American ones. He implied that after OpenAI had demonstrated the gains from chain-of-thought reasoning with its o1 model, DeepSeek had used a similar technique to get strong results in its own models, which meant that much of the early advantage the US labs had built through expensive pre-training had been wiped out.

But he said that as GPU clusters grew larger in the US, and China remained unable to get GPUs because of export controls, the US would regain its lead. He also said that if China did manage to get access to as many GPUs as the US, US and Chinese companies would likely be at parity. It remains to be seen which scenario plays out, but Amodei’s statement is a massive endorsement of the skills of the Chinese AI labs: it seems that all that can stop them from being as good as their US counterparts is access to NVIDIA GPUs.
