OpenAI announced on Saturday that its experimental large language model (LLM) reached gold medal-level performance at the International Mathematical Olympiad (IMO), putting it on par with the competition’s human gold medallists.
The company presents this as the first time an AI has solved these tough problems with human-like reasoning, a significant step in artificial intelligence’s ability to handle creative and complex tasks.
AI tackles world’s most challenging math problems with human-level skill
The IMO is famed for its challenging problems in algebra, combinatorics, geometry and number theory, which demand creative problem-solving rather than memorisation. Only 67 of the 630 contestants earned gold medals this year, roughly 11 per cent.
OpenAI’s model solved five of the six problems, each worth seven points, for a score of 35 out of 42. Three former IMO gold medallists vetted the AI’s detailed proofs and approved them unanimously.
Unlike previous AI models that rely on simple computation, this model crafted “intricate, watertight arguments at the level of human mathematicians,” according to OpenAI researcher Alexander Wei.
OpenAI sets expectations: New powers coming, but not yet public
While this AI achievement signals exciting progress, OpenAI executives, including CEO Sam Altman, stress that this advanced model won’t be publicly available anytime soon.
Altman confirmed that GPT-5, the next major release, will improve on its predecessors but will not include this IMO-level math capability.
The experimental model used novel research methods and operated without internet access or external tools, highlighting its internal reasoning power.
Wei clarified, “It will take many months before something with this level of math ability can be shared widely.”
This milestone reflects OpenAI’s broader goal of developing a general-purpose reasoning AI capable of complex decision-making beyond formulaic tasks.
For now, the gold-level model remains an internal project, but it places OpenAI at the forefront of AI’s rapid advance on problems that demand originality and depth, qualities traditionally seen as exclusive to human intellect.