OpenAI appears to have been leapfrogged, at least on benchmarks, by rivals including Google and Anthropic in the AI race, and the reason may lie in one of the first steps of creating a modern LLM: pretraining.
Semiconductor industry publication SemiAnalysis has claimed that OpenAI hasn’t completed a successful pretraining run since May 2024. SemiAnalysis made the claim in a piece comparing the relative strengths of Google’s TPU chips and NVIDIA’s GPUs. TPUs have drawn plenty of interest in recent days after Google’s Gemini 3 model — trained entirely on TPUs — topped benchmarks, and Google struck a deal with Anthropic to supply 1 million TPUs.

“Pre-training a frontier model remains the hardest and most resource-intensive challenge in AI hardware,” SemiAnalysis wrote in its post. “The TPU platform has passed that test decisively. This stands in sharp contrast to rivals: OpenAI’s leading researchers have not completed a successful full-scale pre-training run that was broadly for a new frontier model since GPT-4o in May 2024, highlighting the significant technical hurdle that Google’s TPU fleet has managed to overcome,” it added.
Pretraining is the initial, foundational phase of training a large language model (LLM) on a massive, diverse dataset of text and code. Its goal is to teach the model a general understanding of language, including its grammar, semantics, and factual knowledge, by having it predict the next word in a sequence. This results in a versatile, but “raw,” base model that can then be customized for specific tasks through a process called fine-tuning.
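The next-word-prediction objective described above can be illustrated with a toy sketch. The following is not how a frontier LLM is trained — real pretraining uses neural networks over trillions of tokens — but a minimal bigram model shows the same underlying signal: given a context, predict what comes next. The corpus string and function names here are purely illustrative.

```python
# Toy illustration of the "predict the next word" objective behind
# pretraining: a bigram model built from raw text counts. Real LLMs
# use neural networks over vast datasets, but the training signal is
# the same -- learn from text which token tends to follow which.
from collections import Counter, defaultdict

def train_bigram(text):
    # Count how often each word follows each other word in the text.
    words = text.split()
    follows = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1
    return follows

def predict_next(follows, word):
    # Return the most frequent successor of `word` seen in training.
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]

corpus = "the model predicts the next word and the model learns"
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "model" follows "the" most often here
```

The resulting model is "raw" in exactly the sense the article describes: it reflects the statistics of its training text and nothing else, and would need further shaping (fine-tuning, in the LLM case) to be useful for a specific task.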
SemiAnalysis’ claims would imply that OpenAI’s pre-training runs for newer models likely failed or were aborted, and that models like GPT-5 were built on the pre-training runs of older models. Interestingly, Google has now come out and said that it’s still seeing improvements from pre-training. “The secret behind Gemini 3? Simple: Improving pre-training & post-training,” Google DeepMind’s Oriol Vinyals said. “Pre-training: Contra the popular belief that scaling is over—which we discussed in our NeurIPS ’25 talk with Ilya Sutskever and Quoc Le—the team delivered a drastic jump. The delta between 2.5 and 3.0 is as big as we’ve ever seen. No walls in sight!” he added.
GPT-5 also largely failed to meet its (admittedly high) expectations, and even though it initially topped benchmarks, it was soon overshadowed by newer releases from Google and Anthropic. If SemiAnalysis’s information is correct, it’s possible that GPT-5 was unable to benefit from the kind of pre-training advances Google saw, leaving the model relatively weak by comparison. While it’s impossible to tell whether the SemiAnalysis report is accurate — information like this is carefully kept under wraps by top labs — it offers a plausible explanation for OpenAI losing the AI benchmark crown for the first time since the release of ChatGPT in late 2022.