DeepSeek has wowed the AI and financial worlds over the last few days, and there now seems to be grudging admiration from the company whose model it has managed to replicate, possibly at a lower cost.
OpenAI Chief Research Officer Mark Chen has said that DeepSeek managed to independently find some of the core ideas OpenAI had used to build its o1 reasoning model. “Congrats to DeepSeek on producing an o1-level reasoning model! Their research paper demonstrates that they’ve independently found some of the core ideas that we did on our way to o1,” Chen wrote on X.
But Chen added that the reaction to DeepSeek — which wiped out $650 billion from NVIDIA’s market cap in a day — might have been a bit unwarranted. “However, I think the external response has been somewhat overblown, especially in narratives around cost. One implication of having two paradigms (pre-training and reasoning) is that we can optimize for a capability over two axes instead of one, which leads to lower costs,” he explained.
“As research in distillation matures, we’re also seeing that pushing on cost and pushing on capabilities are increasingly decoupled. The ability to serve at lower cost (especially at higher latency) doesn’t imply the ability to produce better capabilities,” he added.
“We will continue to improve our ability to serve models at lower cost, but we remain optimistic in our research roadmap, and will remain focused in executing on it. We’re excited to ship better models to you this quarter and over the year!” Chen said.
Chen appeared to be referring to how DeepSeek had scaled computing resources on chain-of-thought reasoning to achieve its results: until now, companies like OpenAI and Anthropic had focused most of their computing resources on pre-training. But with reasoning opening up as a second axis, resources can be spent there as well, potentially leading to even better results. Chen also downplayed DeepSeek undercutting OpenAI on cost by 90 percent, saying that techniques like distillation, in which a model learns from the outputs of another model, meant that serving cheaper models wasn't necessarily a predictor of being able to build more powerful ones.
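Distillation in this sense is the standard technique of training a smaller "student" model to imitate the output distribution of a larger "teacher" model. A minimal sketch of what such a training loss typically looks like, assuming a PyTorch setup (the function name, temperature value, and overall shape here are illustrative, not any lab's actual training code):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label distillation: the student is trained to match the
    teacher's output distribution rather than hard labels.
    Illustrative sketch only; not DeepSeek's or OpenAI's actual code."""
    # The teacher's outputs are targets, so no gradient flows through them.
    # Dividing by the temperature softens both distributions, letting the
    # student also learn the teacher's relative preferences among
    # lower-probability answers.
    teacher_probs = F.softmax(teacher_logits.detach() / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between the softened distributions; the T^2 factor
    # keeps gradient magnitudes comparable across temperature settings.
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2
```

This is the crux of Chen's point: a student distilled this way can be very cheap to serve, but its capabilities are bounded by the teacher it imitates, which is why low serving cost doesn't by itself demonstrate the ability to produce better capabilities.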
Mark Chen isn’t the only OpenAI executive to acknowledge DeepSeek’s results. OpenAI CEO Sam Altman has called DeepSeek’s model “impressive”, and said it was “invigorating” to have a new competitor. “Deepseek’s r1 is an impressive model, particularly around what they’re able to deliver for the price. We will obviously deliver much better models and also it’s legit invigorating to have a new competitor! We will pull up some releases but mostly we are excited to continue to execute on our research roadmap and believe more compute is more important now than ever before to succeed at our mission. The world is going to want to use a LOT of AI, and really be quite amazed by the next gen models coming. Look forward to bringing you all AGI and beyond,” he said.
This is a tremendous acknowledgement for the tiny Chinese lab that has taken the world by storm in recent days. DeepSeek was barely known outside the research community until a month ago, when it released its v3 model; it has since caused AI stocks to crash and is now being referred to as a “competitor” by OpenAI’s CEO. It remains to be seen how DeepSeek will fare in the coming months, but it has certainly made the entire world, including the world’s top AI labs, take notice.