Both American and Chinese entrepreneurs seem to agree that China’s AI models aren’t far behind even the best US ones.
The conversation started with an X post from AI researcher @teortaxesTex, who put a rough number on the gap: about seven months. The argument was that GLM 5.2 — Z.AI’s latest — performs around the level of Claude Opus 4.7 or 4.8, and that Claude Mythos Preview reached its current standing as a frontier model by early February 2026. Do the math, and China’s equivalent of Mythos — what the researcher calls “Fable” — arrives somewhere in November or December 2026.

Elon Musk weighed in with a more conservative timeline. “Probably Q1,” he wrote, referring to the first quarter of 2027.
Jie Tang, co-founder of Z.AI, the Chinese lab behind the GLM series, pushed back in the other direction: “won’t take that long.”
What’s notable is not just the positions, but the framing Musk offered alongside his estimate. He acknowledged that on benchmarks, China may already look impressively close, but argued that benchmark performance and genuine usefulness are different things. “As measured by true usefulness,” he said, “even Q1 would be very impressive.” He credited Anthropic specifically for prioritizing useful intelligence over benchmark optimization, and tied that directly to revenue outcomes.
That distinction is worth sitting with. Anthropic’s annualized revenue jumped from $9 billion at the end of 2025 to more than $30 billion by April 2026, a surge driven largely by Claude Code’s grip on the developer market. The numbers suggest that whatever Anthropic has been doing — and it hasn’t always been the benchmark leader — the market has been paying attention to something else.
The gap between benchmark performance and real-world capability has become a recurring theme in the China AI conversation. GLM-5.1 topped SWE-Bench Pro while trailing US frontier models on reasoning benchmarks like HLE and GPQA-Diamond — a profile that reflects Z.AI’s deliberate focus on agentic engineering rather than general-purpose reasoning. Chinese models have also struggled on ARC-AGI-2, where scores remain well below 12%, significantly behind US frontier labs. The pattern suggests that the gap is real, even if its size depends entirely on what you’re measuring.
Jie Tang’s response — brief, confident, and pointed — didn’t come with a specific date. But it did come with a follow-up that went beyond the competitive framing: “focus is all we need, in particular focus on what intelligence truly is.” That’s a philosophical position as much as a technical one, and it’s consistent with how Z.AI has positioned itself: not just racing to replicate what US labs have built, but pursuing a distinct approach to the problem.
Z.AI’s stock has risen nearly 10x since its Hong Kong IPO in January 2026, which is its own kind of signal about how markets are reading the lab’s trajectory. The company is burning cash, not yet profitable, and targeting $1 billion in cloud ARR by end-2026. The GLM series — trained entirely on Huawei Ascend chips, with no US silicon involved — has become one of the more striking proof points that the export control regime hasn’t stopped Chinese labs from closing the distance.
Whether the timeline is Q1 2027, late 2026, or something else entirely, the more interesting shift is that both sides of the Pacific seem to be treating convergence as an engineering problem with a solution in sight — not a question of whether, but when.