New cases where ChatGPT and Claude helped solve mathematical problems that had been unsolved for years illustrate how language models are evolving from an auxiliary tool to an active research tool, but also highlight the need for careful human oversight.

Liam Price is, well, a nobody. A mere 23-year-old with no advanced training in mathematics. And yet he recently solved a problem that no mathematician had been able to crack in sixty years.
And he did it without even understanding the problem he was solving.
Price's story begins a year ago, when the young man began testing whether ChatGPT could solve the Erdős problems. This is a collection of more than a thousand mathematical problems left behind by the great mathematician Paul Erdős, which mathematicians all over the world try to solve in their free time, or during their working hours. Price's attempts seemed ridiculous a year ago, but at the time he was using the free version of ChatGPT. Since then, he has received a two-hundred-dollar-a-month subscription to the artificial intelligence engine, and suddenly breakthroughs began to appear.
One of those breakthroughs came in recent months, when Price entered Erdős problem number 1,196 into ChatGPT. The problem concerns “primitive sets” of numbers: sets of integers in which no member is divisible by any other member. Such a primitive set might be, for example, {5, 7, 17} or {2, 5, 9, 11}. Naturally, the simplest primitive sets are those consisting only of prime numbers (3, 5, 7, and so on), which are not divisible by any integer other than 1 and themselves.
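For readers who prefer code to definitions, here is a minimal sketch of the primitive-set test. The function name `is_primitive` is our own illustration and does not come from the Erdős problem itself; the sketch assumes all inputs are positive integers.

```python
from itertools import combinations

def is_primitive(numbers):
    """Return True if no element of `numbers` divides a different element.

    Assumes every element is a positive integer (illustrative helper only).
    """
    values = sorted(set(numbers))
    # After sorting, only the smaller number in a pair can divide the larger.
    return all(b % a != 0 for a, b in combinations(values, 2))

print(is_primitive({5, 7, 17}))      # True
print(is_primitive({2, 5, 9, 11}))   # True
print(is_primitive({3, 5, 15}))      # False, because 3 and 5 both divide 15
```

Checking every pair is fine for small examples like these; it says nothing about how the problem itself was solved.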
Erdős problem 1,196 asks how a certain transformation of the sum of the numbers in a primitive set changes as the numbers get bigger or smaller. ChatGPT received the problem from Price, thought about it for eighty minutes, and returned an answer.
"I didn't know what the problem was. I just… gave them [Erdős problems] to the AI and saw what it could produce," Price said. "And it gave what seemed like a correct solution."[1]
Price shared the answer he received with the global mathematical community. One of the greatest mathematicians alive today – Professor Terence Tao, who won the Fields Medal for his mathematical research – examined the solution together with other experts, cleaned, refined and formally tested it, and came to the conclusion that the artificial intelligence was right.
But it's even more interesting to understand how it was right, and how it wasn't.
Contrary to all expectations, ChatGPT did not find or formulate new mathematical principles to solve the age-old problem. It simply thought differently from all the human mathematicians.
“The people who looked at the problem all took a little bit of a wrong turn at the very first step,” Tao explained to Scientific American. “We’re starting to realize that the problem was perhaps easier than we expected, and it was like there was some kind of mental barrier. … There was a kind of standard sequence of actions that everyone who had worked on the problem before started with.”
The artificial intelligence abandoned the conventional sequence of actions and went in a completely different direction. It recognized that a well-known formula from other mathematical fields, one that no one had ever thought of applying to this particular Erdős problem, could be used here. Why had no one thought of it? Because humans are limited in the scope of their knowledge, and to tell the truth, most of us also have difficulty leaving the familiar and safe framework that others have outlined around us in the past. Today's conventional artificial intelligences also prefer to turn to the familiar and known, it is true, but they have access to such vast amounts of information that they can make connections between different and very distant fields.
Detractors will now say that this is just further proof that AI lacks "creativity" or "human thinking," or any other belittling phrase intended to affirm human superiority over the machine. One can certainly agree that this particular incident primarily testifies to AI's impressive power to connect different topics that were already familiar to it. On the other hand, if a person were to demonstrate such an ability, we would probably say that they are creative.
Either way, in another place and time in the early months of 2026, another mathematician came to the realization that what was is no longer what will be, and that he needed to change his views about the capabilities of artificial intelligence.
Shock, shock!
In early 2026, a scientific article was published that opened with the words – "Shock! Shock!".
To put it gently, serious scientists cannot afford to write this way. But this is an extraordinary scientist. No, not an artificial intelligence, but Professor Donald Knuth. The 88-year-old has won some of the most prestigious awards in science, including the Turing Award (considered the Nobel Prize of computer science), and has also been nicknamed the "father of the analysis of algorithms."
In academia, as in prison, when your colleagues call you "father" and not because of a direct genetic connection, you are allowed to open articles in any way you like.
But what put Knuth in such shock?
"Yesterday I discovered that an open problem I've been working on for several weeks has just been solved by Claude Opus 4.6," Knuth wrote. "It seems I'll have to change my views on 'creative AI' one of these days. What a joy to discover that not only does my mathematical proposal have a nice solution, but also to celebrate this dramatic advance in automatic deduction and creative problem solving."
If it wasn't clear, Knuth wasn't a big fan of ChatGPT, Claude, and the other large language models. He tested them in 2023 and correctly stated that the output they provide is full of hallucinations, inventions, and outright lies. But that was then, three long years ago, in the prehistory of artificial intelligence.
And now? Knuth is starting to be convinced.
For the past year, Knuth has been investigating a mathematical-geometric problem, the kind that only mathematicians are interested in. Knuth found a solution for small, specific instances of it, but he was looking for a general solution, the kind that could fit every instance of the problem. At this point, he got stuck for weeks. It was clear to him that he needed a clever and elegant solution, the kind that only a mathematical genius would recognize. But there are very few such human geniuses.
Then one of the professor's friends decided to feed the problem to Claude.
That colleague gave the problem to Claude Opus 4.6, an artificial intelligence model. He didn’t just ask the engine to solve the problem, but defined a unique way of working for it: in each run, Claude had to examine all the ideas it had come up with so far, and then provide a new hypothesis, a new idea for solving the problem. After the new direction was suggested, the artificial intelligence examined it and decided whether it was really the right solution. If not, it kept a record of the attempt and moved on to the next run.
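The working method described above (review the record, propose, test, record, repeat) can be sketched in a few lines. This is only a toy illustration under our own assumptions, not the actual setup Knuth's colleague used: the names `explore`, `propose_next` and `check` are hypothetical, "ideas" are plain integers, and the proposer simply tries the smallest number not yet attempted, whereas a language model would read its whole record and reason about why earlier attempts failed.

```python
from typing import Callable, List, Optional, Tuple

# Each record entry pairs an idea with whether it passed the check.
Log = List[Tuple[int, bool]]

def explore(propose: Callable[[Log], int],
            verify: Callable[[int], bool],
            max_runs: int) -> Tuple[Optional[int], Log]:
    """Run up to `max_runs` rounds of propose, verify, record."""
    log: Log = []
    for _ in range(max_runs):
        idea = propose(log)       # the proposer sees every earlier attempt
        ok = verify(idea)
        log.append((idea, ok))    # failed attempts are recorded too
        if ok:
            return idea, log
    return None, log

# Hypothetical toy stand-ins: find the integer whose square is 961.
def propose_next(log: Log) -> int:
    tried = {idea for idea, _ in log}
    k = 1
    while k in tried:
        k += 1
    return k

def check(k: int) -> bool:
    return k * k == 961

solution, history = explore(propose_next, check, max_runs=50)
print(solution, len(history))     # 31 31
```

The point of the design is that the record is passed back in on every run, so nothing learned from a failed attempt is lost between runs.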
This means that Claude created a continuous line of thought from experiment to experiment, which allowed it to examine the problem from dozens of different angles and to learn from each wrong attempt. We see here a superhuman ability to solve problems, since humans have difficulty holding more than a few ideas in their limited minds at the same time. Claude had no such problem: it has a context window a million tokens long, which could contain all of Shakespeare's plays and refer to them all at once.
And yet, is Claude – 'just' a token completion engine, as is commonly disparagingly stated – capable of solving complex and open mathematical problems, for which even the world's greatest researchers have difficulty finding an answer?
It turns out it is, but even for Claude it was difficult.
I mean, it took a whole hour.
For sixty minutes, Claude proposed and tested 31 different and bizarre directions for solving the problem. Each of those solutions was different from the previous ones, sometimes expanding into unusual ideas, and sometimes returning to previous ideas and trying to understand why they failed and how they could be improved. The final result of this whole process, which came on the 31st attempt, was a high-level solution to a problem that the mathematician who invented it had been unable to solve himself for weeks.
[The fine print: Claude independently found a solution only for the odd cases of the problem. The solution for the even cases remained open and was later discovered in joint work by humans and GPT-5.4.]
Did Claude 'pick up' the solution to the problem from another mathematical field, or from other proofs that existed in the scientific corpus? This possibility cannot be completely ruled out, but the slow and gradual process of dealing with the problem strengthens the belief that it arrived at the solution organically, through step-by-step self-improvement that was built gradually and with complete transparency.
It acted, in effect, like a human mathematician.
And so Knuth concluded that he would have to "change my views on creative artificial intelligence."
And the way mathematicians work is changing along with Knuth's views.
The revolution has begun.
Since the beginning of 2026, in just five months, artificial intelligence has found solutions to 16 Erdős problems. If any futurist had suggested a year ago that we would reach such achievements in the first half of 2026, people would have thought he was crazy.
And yet, we move, and we continue to move at an ever-increasing pace. Anyone who thinks that artificial intelligence has already demonstrated its full capabilities has been proven wrong every few months over the past six years.
What does this progress mean?
In the short term – that is, in the very next few years – we will see more and more mathematical problems being solved with the help of human and artificial teams. The human mathematicians will choose the mathematical problems, know how to explain them to the artificial intelligence, and be able to understand the solutions it offers them in order to identify wrong directions – and make it clear to it that it needs to try thinking differently.
At the same time, every scientist will gain the power of an expert mathematician. Every biologist will be able to analyze probability problems as if they had an expert data scientist, mathematician, and statistician by their side. All research will rise a level thanks to this power, and studies that do not will not be published.
And yes, those who don't know how to use artificial intelligence – or who trust it too easily – will also be able to get mathematical justifications for complete nonsense from it. That's what happens when we have a powerful tool in our hands, but one that also surrenders to us and serves our every desire – sometimes at the cost of distorting the truth.
The impact on technology will be slower, but it will surely come. Mathematics is the basis of everything: it is the most precise language for describing the world and the way its various parts affect each other. Every human inventor is going to be armed with this language at the highest level, and will be able to reach impressive achievements much more easily and quickly.
And in the distant future? Five, ten or twenty years ahead?
Then human inventors, along with all the processes of invention, development and experimentation, may also be replaced by artificial intelligence and robots in research laboratories. This will not happen immediately, and even many decades from now, humans will still be involved in the process of research, development and invention. But the share of carbon-based entities in research and development will become increasingly smaller compared to the share of silicon-based entities. I mean artificial intelligence, of course.
This means that future scientists and engineers need to start thinking about social, almost philosophical questions. What is the most appropriate and effective way to work alongside, under, or above artificial intelligence? How will we continue to control it as it develops the next generation of inventions for us – and probably future generations of its own? How will we ensure that we are still able to understand the products of its developments, or the new and advanced science it produces?
"We live in very interesting times, truly," Donald Knuth wrote at the end of his latest article, moments before wishing readers, in the authentic style of a Star Wars fan:
"May the force be with you."
May the force be with us – and remain with us – for many, many more years.
Short FAQ:
Has artificial intelligence really solved open mathematical problems?
According to the cases described, AI systems helped arrive at solutions to open mathematical problems, but the solutions were still tested, refined, and verified by human mathematicians.
What are the Erdős problems?
The Erdős problems are a large collection of mathematical problems left behind by mathematician Paul Erdős. Many of them remained open for many years.
Does this mean that mathematicians are no longer needed?
No. At this stage, humans still choose the problems, formulate them, check the proofs, and decide whether the solution is valid. The change is that AI becomes an active partner in the research process.