Thus far, cutting-edge progress in AI has been the preserve of the US, with companies like OpenAI, Google, Anthropic and Meta competing amongst themselves to create the best models. But it appears that a Chinese company has now thrown its hat into the ring, and is holding up very well against its American peers.
Chinese AI lab DeepSeek has released its DeepSeek-R1 model, which it says performs comparably to OpenAI’s o1 model at a fraction of the cost. Also, unlike OpenAI’s models, DeepSeek-R1 is open source and has been released under an MIT license. This allows people to modify the model for their own use cases, which isn’t possible with OpenAI’s models.
On popular benchmarks such as AIME, Codeforces, MATH-500 and others, DeepSeek-R1 was on par with OpenAI’s o1, and in some cases slightly ahead of it. In particular, DeepSeek-R1 edged out o1 on AIME 2024, MATH-500 and SWE-bench Verified, and was narrowly behind it on Codeforces, which indicates it could be quite good at math and coding tasks.
DeepSeek also says it can deliver this performance at a fraction of the cost of other models. OpenAI charges $60 per million output tokens for o1, while DeepSeek-R1 costs just $2.19 per million output tokens. For input tokens served from the cache, o1 costs $7.50 per million, while DeepSeek-R1 costs only $0.14 per million.
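To put those prices in perspective, here is a rough back-of-the-envelope sketch in Python using the per-million-token prices quoted above (with DeepSeek-R1’s output price taken as its listed $2.19). The `PRICES` table, the function names and the sample workload are illustrative assumptions, not part of either company’s API, and real bills would also depend on uncached input tokens, which are priced differently.

```python
# Per-million-token API prices in USD, as quoted in the article.
# Only cached-input and output prices are covered; cache-miss input
# pricing is deliberately left out of this sketch.
PRICES = {
    "o1": {"input_cached": 7.50, "output": 60.00},
    "deepseek-r1": {"input_cached": 0.14, "output": 2.19},
}


def cost_usd(model: str, cached_input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for a workload of cached input and output tokens."""
    price = PRICES[model]
    total = (cached_input_tokens * price["input_cached"]
             + output_tokens * price["output"])
    return total / 1_000_000


def price_ratio(kind: str) -> float:
    """How many times more o1 charges than DeepSeek-R1 for one token type."""
    return PRICES["o1"][kind] / PRICES["deepseek-r1"][kind]


if __name__ == "__main__":
    # Hypothetical workload: 1M cached input tokens and 1M output tokens.
    for model in PRICES:
        print(f"{model}: ${cost_usd(model, 1_000_000, 1_000_000):.2f}")
    print(f"output price ratio: {price_ratio('output'):.1f}x")
    print(f"cached input price ratio: {price_ratio('input_cached'):.1f}x")
```

At these prices, o1 works out to roughly 27 times more expensive for output tokens and roughly 54 times more expensive for cached input tokens.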
These are impressive results. DeepSeek is based in Hangzhou, China, and was founded in 2023 as an AI research lab by a hedge fund named High-Flyer. High-Flyer itself was founded in 2015 by three engineers from Zhejiang University who began trading as students during the 2007–2008 financial crisis. The firm used machine learning to trade stocks, and in 2019 it established High-Flyer AI, which was dedicated to research on AI algorithms and their basic applications. By 2021, all of High-Flyer’s strategies were using AI, making it an AI-focused fund. DeepSeek has released a few AI models so far, and R1 is its most capable one yet.
DeepSeek doesn’t compare R1 to OpenAI’s as-yet-unreleased o3 model, which the company announced last month, and which smashed several benchmarks and led many to speculate whether it was AGI. But R1 does seem to be on par with o1, the best model OpenAI has released so far, which is quite the achievement: DeepSeek is not based in the US, is much cheaper than OpenAI, and to top it all, is open source. And DeepSeek doesn’t seem to be shy about taking on its competition. Its release note on X cheekily said “Pushing the boundaries of **open AI**!”, a reference to its biggest American competitor’s practice of not open-sourcing its models. It remains to be seen how widely DeepSeek is adopted in the coming weeks, but it appears that with models like Qwen and DeepSeek, China is emerging as a serious competitor, and peer, to the US in AI research.