It worked because the AI had a base to start from. It's not like it did all the calculations without guidance.
The PhDs they compare it with also have a base to start from, yet ChatGPT o1 outperforms them. The article adds nuance to this and to the original cited statement, so it's worth reading the actual article.
They are also comparing how the AI does against each previous model. Of course it would be better.