Google announced Gemma 4 earlier this month, and China now appears to have an answer.
Alibaba’s Qwen team has released Qwen3.6-35B-A3B, a new open-source multimodal AI model that punches well above its weight class. The model is a sparse Mixture-of-Experts (MoE) architecture with 35 billion total parameters but only 3 billion active at any given time — making it dramatically more compute-efficient than it looks on paper. It is released under the Apache 2.0 license, meaning businesses and developers can use, modify, and build on it freely.
The MoE Advantage
The architecture is the story here. By activating just 3B parameters per inference, Qwen3.6-35B-A3B delivers the economics of a small model while drawing on the learned capacity of a much larger one. The benchmarks bear this out: on agentic coding tasks, it competes with — and in several cases beats — dense models that are 10x its active size.
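To make the economics concrete, here is a minimal, illustrative sketch of top-k sparse MoE routing — not Qwen3.6's actual implementation, and the expert count, top-k value, and hidden size are toy assumptions. The point it demonstrates is the one above: per-token compute scales with the experts actually executed, not with the total parameter count.

```python
# Toy sketch of sparse Mixture-of-Experts routing (hypothetical sizes,
# NOT the real Qwen3.6 architecture). A gating network scores every
# expert for each token, but only the top-k experts are run, so compute
# tracks the *active* parameters rather than the total.
import numpy as np

rng = np.random.default_rng(0)

N_EXPERTS = 64   # total experts in the layer (assumed for illustration)
TOP_K = 4        # experts actually executed per token (assumed)
D_MODEL = 32     # hidden size (toy value)

# Each "expert" is a tiny feed-forward weight matrix.
experts = [rng.standard_normal((D_MODEL, D_MODEL)) * 0.02
           for _ in range(N_EXPERTS)]
gate_w = rng.standard_normal((D_MODEL, N_EXPERTS)) * 0.02

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route one token vector through the top-k experts only."""
    logits = x @ gate_w                    # gating scores, shape (N_EXPERTS,)
    top = np.argsort(logits)[-TOP_K:]      # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over the selected experts only
    # Weighted sum of the chosen experts' outputs; the other experts are skipped.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(D_MODEL)
out = moe_layer(token)
print(out.shape)                              # output keeps the model dimension
print(f"active expert fraction: {TOP_K / N_EXPERTS:.3f}")  # ~6% of expert compute per token
```

The 35B-total / 3B-active split works the same way at scale: the gate selects a small subset of experts per token, so inference cost looks like a ~3B model while the full 35B of learned capacity remains available to the router.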
On Terminal-Bench 2.0 (Agentic Terminal Coding), it scores 51.5, compared to Gemma4-31B’s 42.9. On SWE-bench Pro, it hits 49.5 against Gemma4-31B’s 35.7. The gap is similarly pronounced on SWE-bench Verified (73.4 vs. 52.0) and SWE-bench Multilingual (67.2 vs. 51.7). For developer-facing workloads, these are the numbers that matter most.
It also outperforms its own predecessor, Qwen3.5-35B-A3B, by a wide margin across nearly every benchmark — suggesting that the generational leap here is substantial, not incremental.

Multimodal Capabilities
Qwen3.6 is natively multimodal and supports both “thinking” and “non-thinking” modes, giving developers flexibility depending on whether they need deliberate step-by-step reasoning or fast, direct responses.
On vision-language benchmarks, the model holds its own against significantly larger competition. Across MMMU (Multimodal Reasoning), RealWorldQA (Image Reasoning), and GPQA Diamond (Graduate-level Reasoning), its scores trail Google’s Gemma4 models only marginally, if at all. Alibaba claims that on most vision-language tasks, performance matches Claude Sonnet 4.5, and even surpasses it in spatial intelligence — achieving 92.0 on RefCOCO and 50.8 on ODInW13.

Context: A Qwen Team in Transition
The release comes at an interesting moment internally. Junyang Lin, the public face of the Qwen project, stepped down in early March 2026 — a departure that drew significant attention in the open-source AI community. Under his tenure, Qwen models accumulated over 600 million downloads and spawned more than 170,000 derivative models on Hugging Face, surpassing Meta’s Llama in that metric. The fact that Alibaba has shipped a significant new model so quickly after that leadership change suggests the team’s momentum has not broken stride.
Earlier this year, Alibaba also released the Qwen 3.5 Small Model Series — compact models from 0.8B to 9B parameters aimed at edge deployment and on-device inference. Qwen3.6-35B-A3B is the higher-end complement to that lineup, aimed squarely at developers who need serious agentic and coding capability without the infrastructure bill of a 70B+ dense model.
Why This Matters
The open-source AI race is intensifying. Google’s Gemma4, Meta’s Llama series, and Alibaba’s Qwen family are all competing for developer adoption — and each new release raises the bar on what “free and open” AI can do. For enterprises evaluating self-hosted AI deployment, particularly those with cost, latency, or data sovereignty requirements, a model like Qwen3.6-35B-A3B — efficient, capable, and unencumbered by licensing restrictions — is a serious proposition.
The gap between open-source and closed proprietary models continues to narrow. And if this release is anything to go by, Chinese AI labs remain central to that trend.