Chinese Company Moonshot AI Releases Kimi K2, Becomes Top Open-Source Model On Some Benchmarks

DeepSeek drew the world’s attention to Chinese AI capabilities, but there is a lot more to China’s AI ecosystem than DeepSeek.

Chinese AI startup Moonshot AI has released an open-source model named Kimi K2, which has emerged as the strongest open-source model on many benchmarks. Kimi K2 outperforms several proprietary models, including GPT-4.1, on key benchmarks such as SWE-Bench Verified, AIME 2025 and GPQA Diamond.

Kimi K2 has a total of 1 trillion parameters, but only 32 billion of them are activated for each token it processes. Kimi K2 employs a Mixture-of-Experts architecture, which divides the model into 384 specialized “experts.” For each token, only 8 experts plus a shared expert are activated, enabling massive scale without a proportional increase in computational cost during inference. It has 61 layers, including one dense layer, with an attention hidden dimension of 7168 and a context length of 128K tokens.
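To make the routing idea concrete, here is a minimal sketch of top-k expert selection for a single token. It is not Moonshot AI's implementation: the `route_token` function, the random router scores, and the choice to normalize the selected scores with a softmax are all assumptions made for illustration. Only the expert counts and parameter totals come from the figures above.

```python
import math
import random

NUM_EXPERTS = 384  # routed experts, as reported in the article
TOP_K = 8          # routed experts activated per token, as reported


def route_token(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts.

    Hypothetical helper: the softmax weighting is an assumption, not a
    documented detail of Kimi K2.
    """
    # Indices of the top_k largest router scores.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:top_k]
    # Softmax over the selected scores only, so the weights sum to 1.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Made-up router scores for one token (illustrative only).
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
selected = route_token(logits)

# Share of total parameters engaged per token: 32 billion out of 1 trillion.
active_fraction = 32e9 / 1e12

print(f"experts used for this token: {len(selected)} of {NUM_EXPERTS}")
print(f"active parameter share: {active_fraction:.1%}")
```

The remaining 376 routed experts are simply skipped for that token, which is why compute per token tracks the 32 billion activated parameters rather than the full 1 trillion.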

Kimi K2 uses MuonClip, a novel optimizer developed by Moonshot AI to stabilize training at scale. It addresses the issue of “exploding attention logits” by rescaling the weight matrices of the query and key projections after each optimizer update, which prevents numerical instability. This innovation enabled stable pre-training on 15.5 trillion tokens without loss spikes, a breakthrough that allows the model to scale to trillion-parameter size efficiently.
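The rescaling step can be sketched in a few lines. This is a simplified, single-head illustration and not Moonshot AI's code: the helper names, the toy matrices, the threshold `TAU`, and the rule of rescaling only when the largest logit exceeds that threshold are assumptions. Because attention logits are a product of the query and key projections, scaling each weight matrix by the square root of a factor scales every logit by that factor.

```python
import math

TAU = 100.0  # assumed cap on attention logits (illustrative value)


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def max_attention_logit(x, w_q, w_k):
    """Largest scaled dot-product logit over all query/key pairs."""
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    d = len(q[0])
    return max(sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d)
               for qi in q for kj in k)


def qk_clip(w_q, w_k, max_logit, tau=TAU):
    """Shrink both projections so the largest logit comes back down to tau."""
    if max_logit <= tau:
        return w_q, w_k  # nothing to do
    scale = math.sqrt(tau / max_logit)  # split the correction across W_q and W_k
    return ([[v * scale for v in row] for row in w_q],
            [[v * scale for v in row] for row in w_k])


# Toy inputs: two tokens with two features each (made-up numbers).
x = [[1.0, 2.0], [3.0, -1.0]]
w_q = [[4.0, 0.0], [0.0, 4.0]]
w_k = [[5.0, 0.0], [0.0, 5.0]]

before = max_attention_logit(x, w_q, w_k)
w_q, w_k = qk_clip(w_q, w_k, before)  # would run after an optimizer update
after = max_attention_logit(x, w_q, w_k)
print(f"max logit before: {before:.2f}, after: {after:.2f}")
```

In a real training loop a check like this would run after every optimizer step, so the logits never get the chance to grow without bound.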

Kimi K2 outperforms several OpenAI and Google models on coding benchmarks. On SWE-Bench, it performs better than DeepSeek V3 and GPT-4.1. On LiveCodeBench, it outperforms DeepSeek V3, GPT-4.1, Claude 4 Opus and Gemini 2.5 Flash. These results make Kimi K2 a contender for the best open-source model on many benchmarks.

Kimi K2 was developed by Moonshot AI, a Chinese artificial intelligence startup founded in March 2023. Moonshot AI is headquartered in Beijing, with a mission to develop Artificial General Intelligence (AGI) through advanced large language models (LLMs). The company was co-founded by Yang Zhilin, a Tsinghua University and Carnegie Mellon University alumnus with experience at Google Brain and Meta AI.

Moonshot AI has attracted substantial investment from major players like Alibaba, Tencent, Sequoia China, Meituan, Xiaohongshu, and others. It raised a $1 billion Series B in early 2024 at a valuation of around $2.5 billion, and its valuation reached approximately $3 billion by mid-2024. Alibaba alone invested nearly $800 million, making Moonshot one of the highest-valued unicorns in China’s large-model AI sector.

Moonshot’s success with Kimi K2 shows the breadth of China’s AI ecosystem. Apart from DeepSeek, there are a dozen Chinese companies releasing enormously capable models and competing with one another. Even more impressively, they’re releasing their models as open-source, which could make adoption by the global developer ecosystem more likely. And with several highly funded Chinese players releasing top models with regularity, the battle between the US and China in the AI race is well and truly on.