The number of human programmers who remain better than the best AI can be counted on the fingers of two hands.
In a striking demonstration of AI’s accelerating prowess in competitive programming, Google’s Gemini 3 Deep Think mode has achieved an Elo rating of 3455 on Codeforces, the premier platform for algorithmic contests, without using any external tools. This places the model in elite territory: it surpasses the vast majority of human competitors, with only seven top-rated active humans ahead of it on the current leaderboard.
According to the latest Codeforces ratings (filtered for users active in the past six months), the top human performers are:
- Benq — 3792
- ecnerwala — 3715
- jiangly — 3664
- VivaciousAubergine — 3646
- Kevin114514 — 3604
- tourist — 3592
- strapple — 3486
The next-ranked users fall below 3455, with dXqwq at 3436 and others lower. Gemini 3 Deep Think’s 3455 Elo would therefore slot in right after strapple, as the short sketch below illustrates, putting its raw rating above every recently active participant except these seven. (Note: historical peaks, such as tourist’s all-time high of 4009, are higher, but current ratings reflect ongoing performance.)
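
To make the placement concrete, here is a small illustrative Python snippet that merges the model’s reported 3455 rating into the ratings quoted above and prints the resulting order. It is a toy sketch using only the numbers in this article, not an official Codeforces query.

```python
# Illustrative only: ratings are the ones quoted above for recently active users.
leaderboard = [
    ("Benq", 3792),
    ("ecnerwala", 3715),
    ("jiangly", 3664),
    ("VivaciousAubergine", 3646),
    ("Kevin114514", 3604),
    ("tourist", 3592),
    ("strapple", 3486),
    ("dXqwq", 3436),
]

model = ("Gemini 3 Deep Think", 3455)

# Insert the model's rating and sort by rating, highest first.
ranked = sorted(leaderboard + [model], key=lambda entry: entry[1], reverse=True)

for rank, (name, rating) in enumerate(ranked, start=1):
    marker = "  <-- model" if name == model[0] else ""
    print(f"{rank}. {name}: {rating}{marker}")
# The model lands at rank 8, behind the seven humans listed above.
```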

This result comes from evaluations conducted by Google DeepMind, where Gemini 3 Deep Think was tested in “no tools” mode, relying purely on its internal reasoning capabilities to solve complex, time-constrained algorithmic problems. The model significantly outpaces other leading AIs on the same benchmark: Gemini 3 Pro Preview has a Codeforces rating of 2512, and Claude Opus 4.6 (Thinking Max) has a rating of 2352.

Gemini 3 Deep Think’s score of 3455 represents a leap into Legendary Grandmaster territory on Codeforces, a rank typically reserved for the world’s most exceptional competitive programmers. Achieving this level means the model can consistently devise optimal solutions to problems involving advanced data structures, dynamic programming, graph algorithms, number theory, and other high-difficulty topics that appear in Div. 1 contests.
The latest result also shows how quickly AI coding ability has improved. GPT-4, released in early 2023, had a Codeforces Elo rating of just 392. That rose to 808 for GPT-4o in mid-2024, and to 2727 for o3 (full) in early 2025. Gemini 3 Deep Think’s Codeforces Elo of 3455 now puts it among truly elite human programmers.
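
For a rough sense of what gaps of that size mean, the standard Elo expected-score formula converts rating differences into win probabilities. Codeforces’ rating system only approximates classical Elo, so treat the outputs of this short sketch as ballpark figures rather than exact contest odds.

```python
# Rough illustration, assuming the standard Elo expected-score formula:
#   E(A beats B) = 1 / (1 + 10 ** ((R_B - R_A) / 400))
# Codeforces' actual rating system differs in details, so these are approximations.

def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that a player rated rating_a outscores one rated rating_b."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

# Ratings quoted in this article.
print(expected_score(3455, 2727))  # Deep Think vs o3 (full): ~0.99
print(expected_score(3455, 2512))  # Deep Think vs Gemini 3 Pro Preview: ~0.996
print(expected_score(3792, 3455))  # Benq vs Deep Think: ~0.87
```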

For the AI and tech industry, this milestone underscores how frontier models are closing the gap on human expertise in one of the most rigorous tests of logical and creative problem-solving. Competitive programming has long served as a proxy for general intelligence in coding domains, and saturating or dominating such benchmarks signals rapid progress toward AI systems that can serve as peer-level collaborators for software engineers, researchers, and developers tackling novel challenges.
Google DeepMind attributes the performance to advanced inference-time techniques in Deep Think mode, including extended reasoning chains, parallel exploration of approaches, and refined search over solution spaces — all executed without external code execution or web access during testing.
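
Google DeepMind has not published implementation details, but the general shape of inference-time parallel exploration can be sketched as a best-of-N pattern: sample several independent reasoning paths, score them, and keep the strongest. The toy Python below is a hypothetical illustration of that pattern, not Gemini’s actual mechanism, and every function and name in it is invented for the example.

```python
# Hypothetical sketch of inference-time "parallel exploration": sample several
# candidate reasoning paths, score each one, and keep the best. This is a
# generic best-of-N pattern, NOT Gemini Deep Think's actual implementation,
# which Google DeepMind has not published in detail.

from dataclasses import dataclass
import random

@dataclass
class Candidate:
    solution: str
    score: float  # e.g. a model's own estimate of correctness

def generate_candidate(problem: str, seed: int) -> Candidate:
    # Stand-in for sampling one full reasoning chain from a model.
    rng = random.Random(seed)
    return Candidate(solution=f"attempt-{seed} for {problem!r}", score=rng.random())

def best_of_n_search(problem: str, n_parallel: int = 8) -> Candidate:
    # Explore several reasoning paths "in parallel", then select the
    # highest-scoring candidate.
    candidates = [generate_candidate(problem, seed) for seed in range(n_parallel)]
    return max(candidates, key=lambda c: c.score)

best = best_of_n_search("Div. 1 problem E")
print(best.solution, best.score)
```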
The rollout of this enhanced mode continues, with access available to Google AI Ultra subscribers via the Gemini app and through an early-access Vertex AI program for API integration. As AI coding capabilities reach this threshold, the focus shifts from whether models can compete to how they can augment human innovation in software development, research, and beyond. For now, seven humans still outrank the best AI at competitive programming, but given how quickly things are progressing, they may not hold that edge for much longer.