no paywall: https://archive.is/ozjPC
heck yeah. pretty exciting.
By the end of that Saturday night, Ono was frustrated with the bot, whose unexpected mathematical prowess was foiling the group’s progress. “I came up with a problem which experts in my field would recognize as an open question in number theory—a good Ph.D.-level problem,” he says. He asked o4-mini to solve the question. Over the next 10 minutes, Ono watched in stunned silence as the bot unfurled a solution in real time, showing its reasoning process along the way. The bot spent the first two minutes finding and mastering the related literature in the field. Then it wrote on the screen that it wanted to try solving a simpler “toy” version of the question first in order to learn. A few minutes later, it wrote that it was finally prepared to solve the more difficult problem. Five minutes after that, o4-mini presented a correct but sassy solution. “It was starting to get really cheeky,” says Ono, who is also a freelance mathematical consultant for Epoch AI. “And at the end, it says, ‘No citation necessary because the mystery number was computed by me!’”
While sparring with o4-mini was thrilling, its progress was also alarming. Ono and He express concern that the o4-mini’s results might be trusted too much. “There’s proof by induction, proof by contradiction, and then proof by intimidation,” He says. “If you say something with enough authority, people just get scared. I think o4-mini has mastered proof by intimidation; it says everything with so much confidence.”