DeepSeek Releases DeepSeek-V3.2-Speciale, Which the Company Says Rivals Gemini 3 Pro

Chinese open-source models are hot on the heels of their US counterparts from frontier labs.

DeepSeek has launched two new reasoning-focused AI models, positioning its flagship DeepSeek-V3.2-Speciale as a direct competitor to Google’s Gemini 3.0 Pro in advanced reasoning tasks. The Chinese AI company announced DeepSeek-V3.2 and DeepSeek-V3.2-Speciale on Monday, describing them as “reasoning-first models built for agents.” While DeepSeek-V3.2 serves as the official successor to V3.2-Exp and is available across the company’s app, web interface, and API, the more powerful V3.2-Speciale is currently restricted to API access only.

Performance and Benchmark Results

DeepSeek positions V3.2-Speciale as achieving world-leading reasoning capabilities that rival Gemini 3.0 Pro. According to the company’s internal testing, V3.2-Speciale achieved gold-medal-level performance across several prestigious competitive programming and mathematics competitions, including the International Mathematical Olympiad (IMO), Chinese Mathematical Olympiad (CMO), ICPC World Finals, and International Olympiad in Informatics (IOI) 2025. Last week, DeepSeek released its Math V2 model, which also delivered gold-medal-level performance on the Math Olympiad.

The benchmark data shows V3.2-Speciale ahead of its competitors on the mathematics benchmarks. On AIME 2025, the model achieved a 96.0% pass rate, compared to Gemini 3.0 Pro’s 95.0% and GPT-5 High’s 94.6%. The gap is wider on the HMMT 2025 benchmark, where V3.2-Speciale scored 99.2%, 1.7 percentage points ahead of Gemini 3.0 Pro’s 97.5%.

In coding challenges, V3.2-Speciale posted a CodeForces rating of 2701, narrowly behind Gemini 3.0 Pro’s 2708 and well ahead of GPT-5 High’s 2537. On the challenging HLE (Humanity’s Last Exam) benchmark, V3.2-Speciale achieved 30.6%, outpacing GPT-5 High (26.3%) but notably trailing the search giant’s Gemini 3.0 Pro (37.7%).

For agentic capabilities, DeepSeek-V3.2-Speciale posted competitive results on the Tool Decathlon benchmark at 35.2%, slightly behind Gemini 3.0 Pro’s 36.4% but ahead of GPT-5 High’s 29.0%.
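The figures quoted above can be laid side by side to check who leads each benchmark and by how much. The following is a minimal Python sketch under stated assumptions: the scores are the company-reported numbers as cited in this article, and the names `SCORES`, `leader`, and `margin_vs` are illustrative helpers, not part of any DeepSeek tooling. GPT-5 High is omitted for HMMT 2025 because no figure is given for it here.

```python
# Company-reported scores as quoted in this article (not independently verified).
# A model is left out of a row when the article gives no figure for it.
SCORES = {
    "AIME 2025 (%)": {"V3.2-Speciale": 96.0, "Gemini 3.0 Pro": 95.0, "GPT-5 High": 94.6},
    "HMMT 2025 (%)": {"V3.2-Speciale": 99.2, "Gemini 3.0 Pro": 97.5},
    "CodeForces (rating)": {"V3.2-Speciale": 2701, "Gemini 3.0 Pro": 2708, "GPT-5 High": 2537},
    "HLE (%)": {"V3.2-Speciale": 30.6, "Gemini 3.0 Pro": 37.7, "GPT-5 High": 26.3},
    "Tool Decathlon (%)": {"V3.2-Speciale": 35.2, "Gemini 3.0 Pro": 36.4, "GPT-5 High": 29.0},
}


def leader(benchmark):
    """Return the model with the highest quoted score on a benchmark."""
    row = SCORES[benchmark]
    return max(row, key=row.get)


def margin_vs(benchmark, rival, model="V3.2-Speciale"):
    """Return model's score minus rival's score, rounded to one decimal place."""
    row = SCORES[benchmark]
    return round(row[model] - row[rival], 1)


if __name__ == "__main__":
    for name in SCORES:
        gap = margin_vs(name, "Gemini 3.0 Pro")
        print(f"{name}: leader = {leader(name)}, Speciale vs Gemini 3.0 Pro = {gap:+}")
```

On these numbers, V3.2-Speciale leads the two mathematics benchmarks, while Gemini 3.0 Pro leads on CodeForces, HLE, and Tool Decathlon.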

Two Models for Different Use Cases

DeepSeek characterizes the standard V3.2 as offering “GPT-5 level performance” with balanced inference versus output length, positioning it as a “daily driver” for general use. The V3.2-Speciale variant, meanwhile, features “maxed-out reasoning capabilities” but requires significantly higher token usage to achieve its enhanced performance.

The company noted that V3.2-Speciale “dominates complex tasks” but comes with the tradeoff of increased computational costs. DeepSeek emphasized that the API-only release is intended to support community evaluation and research before a broader rollout.

Breakthrough in Agent Training

A key innovation in the V3.2 release is what DeepSeek calls a “massive agent training data synthesis method” covering more than 1,800 environments and 85,000 complex instructions. DeepSeek-V3.2 represents the company’s first model to integrate reasoning directly into tool use, supporting both thinking and non-thinking modes during tool interactions.

This “Thinking in Tool-Use” capability is a significant step for agentic AI systems that need to interact with external tools and APIs while maintaining chain-of-thought reasoning.

Availability and Pricing

DeepSeek-V3.2 is now available across the company’s app, web platform, and API with the same usage patterns as the previous V3.2-Exp model. The model supports tool use with integrated reasoning capabilities.

V3.2-Speciale is currently accessible only through a temporary API endpoint that will remain active until December 15, 2025, at 15:59 UTC. The model is priced identically to V3.2 but does not currently support tool calls. DeepSeek is serving the model through a dedicated endpoint to gather feedback from researchers and developers before determining its long-term availability strategy.

The release intensifies competition in the advanced reasoning model space, where companies are racing to develop AI systems capable of handling complex mathematical, coding, and logical reasoning tasks that approach or exceed human expert performance.