"This Changes Things": Open-Source GLM 5.2 Impresses Vercel, Box CEOs

DeepSeek caught the attention of the US AI world in a big way in early 2025, and another Chinese model could be trending in the same direction.

Z.AI, the Beijing-based lab formerly known as Zhipu AI, released GLM 5.2 on June 13, 2026, and the response from the Western tech community was immediate. Guillermo Rauch, CEO of Vercel, posted that he was “genuinely impressed, almost shocked” at how good GLM 5.2 is at coding, adding three words that summed up the broader sentiment: “This changes things.”

Box CEO Aaron Levie framed it in strategic terms, writing that what’s happening with open weights AI right now is “pretty remarkable.” He pointed to models achieving state-of-the-art results on specific tasks and getting close to frontier performance in coding and other domains. His read: as the gap between open and closed models stays narrow rather than widening, more value can be built on top of AI — and that’s good for the whole applied layer, including the frontier labs themselves.

Jeremy Howard, co-founder of Answer.AI and Fast.AI, was even more direct. He called GLM 5.2 “a marvel” and put it at least on par with Claude Opus 4.8 and GPT 5.5, while calling out qualities that benchmark charts don’t always capture — nuance, judgment, and reliable long-context handling. Mat Velloso, former VP at Meta and Google DeepMind, spent an entire day running it and said he “didn’t miss much.” He called it the first open model that passes as a daily driver.

The model earns those reactions on paper. GLM 5.2 scored 62.1 on SWE-bench Pro, ahead of GPT 5.5’s 58.6, and landed within one percentage point of Claude Opus 4.8 on FrontierSWE, a benchmark measuring long-horizon task completion. On the Artificial Analysis Intelligence Index v4.1, it scored 51 — placing it fourth overall globally, behind only Claude Fable 5, Claude Opus 4.8, and GPT 5.5. Among open-weights models, it leads by a seven-point margin over its nearest rivals.

What makes the benchmark numbers land differently this time is that they’re being corroborated by people using the model for real work, not just running it through evaluation suites. Rauch, Howard, Levie, and Velloso are all in the business of building and deploying AI systems. When that group converges on the same reaction unprompted, the signal is harder to dismiss.

The architecture behind the performance is a 744-billion-parameter Mixture-of-Experts model with 40 billion parameters active per token, the same footprint as its predecessor GLM 5.1. The gains over GLM 5.1 came entirely from training improvements, not scale. The context window expanded from 200,000 to one million tokens, and Z.AI specifically targeted long-context training, which may explain Howard’s observation about how well it handles extended input. Reasoning and agentic benchmarks saw the sharpest jumps: Humanity’s Last Exam climbed 12 points to 40%, and Terminal-Bench v2.1 improved 16 points to 78%.
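For a rough sense of what those parameter counts imply, the arithmetic can be sketched as below. The parameter totals come from the figures above; the bytes-per-parameter values are illustrative assumptions about storage precision, not a statement of how Z.AI ships the weights, and the estimate ignores KV cache and activation memory.

```python
# Figures quoted in the article.
TOTAL_PARAMS = 744e9   # total parameters in the Mixture-of-Experts model
ACTIVE_PARAMS = 40e9   # parameters active at a time

# Share of the network that does work on any given step.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Approximate memory needed just to hold the weights, in gigabytes (1e9 bytes)."""
    return params * bytes_per_param / 1e9


print(f"active share: {active_fraction:.1%}")

# Assumed precisions for illustration only: 16-bit and 8-bit storage.
for label, nbytes in [("16-bit", 2), ("8-bit", 1)]:
    print(f"{label} weights: ~{weight_memory_gb(TOTAL_PARAMS, nbytes):,.0f} GB")
```

Roughly one parameter in nineteen is active at a time, which is why compute cost tracks the 40 billion figure while storage still has to cover all 744 billion.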

The model ships under an MIT license: no usage restrictions, no regional limits, weights freely downloadable from Hugging Face. That licensing choice matters for enterprises evaluating sovereign infrastructure options, particularly after recent export control actions pulled certain frontier models offline globally. For teams that can’t afford to have their AI stack disrupted by policy decisions, open weights under a permissive license have stopped being a nice-to-have.

On pricing, GLM 5.2’s API sits at $1.40 per million input tokens and $4.40 per million output tokens. A flat GLM Coding Plan starts at around $18 per month. For context, that’s a fraction of what closed frontier models from Anthropic and OpenAI charge. The cost differential has already been driving real switching decisions — a startup CEO recently described saving millions by moving to Chinese open models for production workloads.
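To make the per-token rates concrete, here is a minimal sketch of how they translate into a bill. The two rates are the ones quoted above; the workload (requests per day and tokens per request) is a hypothetical example chosen for illustration, not a figure from Z.AI or any customer.

```python
# API rates quoted in the article, in USD per million tokens.
INPUT_USD_PER_MTOK = 1.40
OUTPUT_USD_PER_MTOK = 4.40


def api_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the API cost in USD for the given input and output token counts."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical workload: 10,000 requests per day,
# each with 6,000 input tokens and 1,000 output tokens.
requests_per_day = 10_000
daily = api_cost_usd(requests_per_day * 6_000, requests_per_day * 1_000)

print(f"daily: ${daily:,.2f}")
print(f"30 days: ${daily * 30:,.2f}")
```

Under those assumed volumes the bill comes to about $128 per day. Output tokens cost more than three times as much as input tokens, so workloads that generate long responses will see a steeper bill than the input-heavy example here.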

Levie’s point about the applied AI layer is probably the most actionable framing. If open models can handle planning, execution, and agentic workflows at near-frontier quality, companies building products on top of AI can cost-optimize their model stack without sacrificing much on performance. The frontier labs retain their edge for the hardest reasoning tasks and for orchestration layers that benefit from the highest available capability — but the calculus for what you route where has shifted.
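What that routing decision might look like in practice can be sketched in a few lines. Everything here is hypothetical: the tier names are placeholders rather than real model identifiers, and the task categories are one assumed way a team might label its workloads, not a scheme Levie or Z.AI described.

```python
from dataclasses import dataclass

# Hypothetical tier labels; these are not real API model names.
OPEN_TIER = "open-weights"
FRONTIER_TIER = "closed-frontier"

# Hypothetical categories a team might reserve for the most capable model.
FRONTIER_TASKS = {"hard_reasoning", "orchestration"}


@dataclass
class Task:
    kind: str    # e.g. "planning", "execution", "hard_reasoning"
    prompt: str


def route(task: Task) -> str:
    """Send only the reserved categories to the frontier tier; everything else goes to the open tier."""
    return FRONTIER_TIER if task.kind in FRONTIER_TASKS else OPEN_TIER


for t in [Task("planning", "draft a migration plan"),
          Task("hard_reasoning", "prove this invariant holds")]:
    print(f"{t.kind} -> {route(t)}")
```

A static lookup like this is the simplest possible policy. A production router would more likely weigh measured quality and cost per task type, but the shape of the decision is the same.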

Z.AI has been building toward this quietly. GLM 5 was the first open model to cross 50 on the Artificial Analysis Intelligence Index. GLM 5.1 topped SWE-bench Pro in April, ahead of GPT 5.4 and Claude Opus 4.6. GLM 5.2 now leads the entire open-weights category by a comfortable margin and sits within striking distance of the most expensive closed models available. The company completed a Hong Kong IPO in January 2026, has been running on zero Nvidia hardware since landing on the US Entity List, and has maintained a release cadence of roughly one significant model every six weeks.

The DeepSeek moment in early 2025 forced a reckoning with how much the US AI industry had underestimated what Chinese labs were capable of. This time it is a different company, a different model, and a different set of benchmark categories, but the shape of the story is familiar. When the Vercel CEO says “this changes things,” it’s worth taking that seriously.