No Overnight Success: How DeepSeek Slowly Grew Its Capabilities To Launch Viral AI Model R1

DeepSeek stunned the world last week with its R1 model. The model matched the best models OpenAI had released, its app climbed to the top of the App Store, its launch sent AI stocks tumbling, and US President Donald Trump even called its release a wake-up call for the US tech industry.

But DeepSeek was no overnight success. The company was founded in 2023 and had been regularly releasing models of increasing quality. Researchers first took note of it when it released DeepSeek-V3 in December 2024, which was trained at a reported cost of roughly $5.6 million and matched models like OpenAI’s GPT-4o in ability. Just a month later, DeepSeek launched R1, which posted state-of-the-art results and rivaled OpenAI’s top o1 model.

Here is DeepSeek’s journey through the different models it released. The models are compared with the then state of the art on Chatbot Arena Elo score, which is determined by human raters who are shown blinded responses from two models and asked to pick the one they think is better. DeepSeek’s first few models were well behind the state of the art, but within a few months its performance improved rapidly and became genuinely world-class.

Source: X/Tanishq Mathew Abraham
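
For intuition, here is a minimal sketch of how an Elo-style rating moves with each blinded human vote. This is a simplified, sequential illustration, not Chatbot Arena’s exact methodology (the leaderboard’s published approach fits a Bradley-Terry model over all votes at once); the function names and the K-factor of 32 are illustrative assumptions.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def update(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return new (rating_a, rating_b) after one blinded human vote.

    `k` controls how far a single vote moves a rating; 32 is a
    conventional choice, not Chatbot Arena's actual parameter.
    """
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - exp_a)
    return rating_a + delta, rating_b - delta

# Example: an underdog (1200) that beats a leader (1350) gains more
# points than it would for beating an equal-rated opponent.
underdog, leader = update(1200.0, 1350.0, a_won=True)
print(round(underdog), round(leader))  # -> 1223 1327
```

This is also why a new model can climb the leaderboard quickly: each upset win over a higher-rated model moves its rating substantially, and the rating stabilizes as votes accumulate.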

DeepSeek-LLM-67B (April 2024):
This foundational model served as a proof of concept. Although not groundbreaking, it provided the team with valuable insights into user interaction, model weaknesses, and optimization opportunities.

DeepSeek-Coder-V2 (July 2024):
With its second release, DeepSeek pivoted toward a specialized model tailored for coding tasks. The update not only improved its Chatbot Arena score but also gained traction among developers, demonstrating the company’s potential in niche AI applications.

DeepSeek-V2 (August 2024):
Marking a leap in general-purpose conversational AI capabilities, DeepSeek-V2 brought the company closer to the leaders. Enhanced language understanding and response coherence were evident improvements.

DeepSeek-V2.5 (October 2024):
By integrating feedback and fine-tuning performance, v2.5 narrowed the gap further. At this stage, DeepSeek’s models began competing on nearly equal footing with top-tier counterparts.

DeepSeek-V3 (December 2024):
This release was a breakthrough. With substantial upgrades in knowledge reasoning, contextual understanding, and response diversity, DeepSeek-V3 solidified the company’s position in the arena. It climbed rapidly in the Elo rankings, coming tantalizingly close to dethroning the best model.

DeepSeek-R1 (January 2025):
The latest iteration in DeepSeek’s lineup achieved what once seemed improbable: matching the top model’s Elo score. The R1 model is a testament to the company’s commitment to iterative refinement and technological excellence.

DeepSeek’s steady progress likely rebuts claims that it was an underground CCP project designed to crash the US stock market. On the contrary, the company had been publicly releasing models for over a year, but they got lost among the dozens of models being released at the time. When the company released its breakout model, the world suddenly realized that an open-source AI model from China had all but beaten the top closed-source models from the US. DeepSeek reached the top without fanfare or cryptic X posts, and likely with very limited resources. It shows that simply putting your head down and working hard can sometimes lead to great results.
