An AI Model Will "For Sure" Win A Gold Medal At The Math Olympiad Next Year: OpenAI Exec

OpenAI has released its o3 model, which has posted extremely impressive results on coding and math tests, and the company believes this progress is likely to continue.

An OpenAI executive has said he believes an AI model will “for sure” win a gold medal at the International Mathematics Olympiad next year. “There was a math benchmark, which is high school competitive mathematics, and that quickly got saturated,” Sebastien Bubeck, a Member of Technical Staff at OpenAI, said in an interview. “Now we’re talking about AIME. AIME is a new benchmark that OpenAI is using to benchmark the new o-series models. (It’s a) test to get into the U.S. team for the International Mathematics Olympiad,” he added.

“So GPT-4 (scores) at essentially 0%, maybe, maybe 5%, maybe 10% (on the AIME),” Bubeck said. “o1-preview is already at 50%. You can see (the progress). It’s going to be at 90 percent next month, in two months. And the IMO gold medal is going to be, I don’t know when, but you know, very soon. I would for sure say next year, there’s just, there’s just no question given the current trend,” he added.

Bubeck highlighted how rapidly AI models have gone from scoring close to 0 percent to 50 percent on the AIME, a feeder test for the math Olympiad. Given how fast things are moving, he suggested, AI is very likely to score as much as 90 percent on this test within the next two months. In his view, such high scores on the feeder test make it extremely likely that an AI model will earn a gold medal at the actual International Mathematics Olympiad next year.

This would be a remarkable event. The International Mathematics Olympiad brings together the brightest young mathematicians from around the world, who compete to solve extremely hard math problems. If an AI could win a gold medal at the event, it would indicate that AI has become better than even the best humans at a task that until recently seemed to require a high degree of human intelligence.

It’s not only in math that AI systems have improved tremendously in the recent past. On a coding benchmark, OpenAI’s o3 model received an Elo rating of 2727, which ranked it 175th in the list of the world’s top programmers. This suggests that AI is now likely better at coding than all but 174 humans. On the Competition Maths benchmark, it managed an accuracy of 96.3 percent, not far from a perfect score. And with OpenAI executives now saying that an AI is very likely to win a gold medal at the International Mathematics Olympiad, it looks like the areas where humans are still better than computers are continuing to shrink rapidly.