Kimi K2.5 Becomes Top Open Model On Artificial Analysis Index

Chinese open models are nipping at the heels of the best models from US frontier labs.

Moonshot AI’s Kimi K2.5 has placed itself among the top US frontier models, scoring 47 points on the Artificial Analysis Intelligence Index v4.0 and establishing itself as the top-performing open-source model available today. The achievement is particularly significant given that Kimi K2.5 now outperforms Claude 4.5 Sonnet, one of Anthropic’s flagship models, while remaining fully open-source and dramatically more cost-effective.

Leading the Open-Source Pack

The Artificial Analysis Intelligence Index v4.0, which incorporates 10 evaluations including GDPval-AA, τ²-Bench Telecom, Terminal-Bench Hard, SciCode, AA-LCR, AA-Omniscience, IFBench, Humanity’s Last Exam, GPQA Diamond, and CritPt, provides one of the most rigorous assessments of AI model capabilities available.

On this benchmark, Kimi K2.5 scores 47 points, placing it ahead of every major AI lab except OpenAI, Anthropic, and Google. This positions Kimi K2.5 as the leading alternative to the three US frontier labs, surpassing models from Meta, Microsoft, Amazon, and other competitors.

Most notably, Kimi K2.5 outperforms Claude 4.5 Sonnet (scoring 43), demonstrating that open-source models can now compete directly with—and in some cases exceed—closed proprietary offerings from established AI leaders. The model trails only GPT-5.2 (51), Claude 4.5 Opus (50), GPT-o1 (49), Gemini 3 Pro (48), and Gemini 3 Flash (46).

Kimi K2.5: Solid Value Proposition

Beyond raw performance metrics, Kimi K2.5 presents a compelling economic case. The cost analysis from Artificial Analysis reveals that running all evaluations in the Intelligence Index costs $371 for Kimi K2.5—dramatically lower than competing frontier models.

For comparison, GPT-5.2 costs $3,244 to run the same benchmarks, while Claude 4.5 Opus comes in at $2,304. Even among more affordable options, Kimi K2.5 maintains a significant cost advantage: Claude 4.5 Sonnet costs $820, Gemini 3 Pro costs $733, and the nearest-priced competitor, Claude 4.7, costs $570.

This price-performance ratio makes Kimi K2.5 an attractive choice for enterprises and developers seeking frontier-level capabilities without the premium pricing of closed models. At roughly one-ninth the cost of GPT-5.2 while delivering 92% of its benchmark score, K2.5 represents exceptional value in the AI market.
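The price-performance comparison can be made concrete with a quick calculation over the figures reported above. A minimal sketch; the dollars-per-point metric is our own illustration, not an official Artificial Analysis statistic:

```python
# Index scores and full-suite evaluation costs (USD) as cited in the article.
models = {
    "Kimi K2.5":         {"score": 47, "cost": 371},
    "GPT-5.2":           {"score": 51, "cost": 3244},
    "Claude 4.5 Opus":   {"score": 50, "cost": 2304},
    "Claude 4.5 Sonnet": {"score": 43, "cost": 820},
    "Gemini 3 Pro":      {"score": 48, "cost": 733},
}

# Sort by dollars per index point: lower means better price-performance.
for name, m in sorted(models.items(), key=lambda kv: kv[1]["cost"] / kv[1]["score"]):
    print(f"{name:18s} {m['score']:3d} pts  ${m['cost']:5d}  "
          f"${m['cost'] / m['score']:6.2f}/pt")
```

By this measure Kimi K2.5 comes in at under $8 per index point, versus roughly $64 per point for GPT-5.2, which is where the cost-advantage claim comes from.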

Widening the Open-Source Gap

The trajectory of open-source AI development tells a striking story about the shifting balance of power in artificial intelligence. Data tracking frontier language model intelligence by country reveals a dramatic acceleration in Chinese open-source capabilities over the past year.

As recently as mid-2024, Chinese and US open-source models tracked closely together on the Artificial Analysis Intelligence Index. However, beginning with DeepSeek 3.1 Terminus in late 2025, Chinese models began pulling ahead with releases like GLM 4.7 and DeepSeek 3.2. Kimi K2 Thinking further widened this gap, and now Kimi K2.5 has extended Chinese open-source leadership to new heights.

While US open-source efforts have plateaued around the performance level achieved by models like gpt-oss-120B, Chinese labs have continued pushing boundaries. The gap between the frontier Chinese open model (Kimi K2.5 at 47 points) and the leading US open model now spans approximately 14 points on the index—a substantial and growing divide.

DeepSeek 3.2, another Chinese open model, also demonstrates this trend by scoring 42 points and positioning itself as a strong alternative. The consistent advancement of Chinese open-source models contrasts sharply with the relative stagnation of US open-source alternatives, raising questions about different approaches to AI development and the strategic decisions around open versus closed systems.

Market Implications

The emergence of Kimi K2.5 as the top open-source model has significant implications for the AI industry. For enterprises evaluating AI deployment options, the combination of strong performance, dramatically lower costs, and open-source accessibility creates a compelling alternative to proprietary US models.

Moreover, the widening gap between Chinese and US open-source capabilities suggests a structural advantage in how Chinese AI labs approach development and release strategies. While US frontier labs have increasingly moved toward closed, proprietary systems, citing safety concerns and competitive pressures, Chinese labs have embraced open-source releases without an apparent performance penalty. As this divergence continues, the competitive dynamics of AI development are being fundamentally reshaped. Whether the trend holds, and what it means for the future of AI research and deployment, will be one of the defining questions of the coming years.

Posted in AI