
Cf. what I mentioned here: #1433541

But none of these tests were controlled experiments. Olympiad problems aren’t research questions. And LLMs seem to have a tendency to find existing, forgotten proofs deep in the mathematical literature and to present them as original. One of Axiom Math’s recent proofs, for example, turned out to be a misrepresented literature search result.
And some math results that have come from tech companies have raised eyebrows among academics for other reasons, says Daniel Spielman, a professor at Yale University and one of the experts behind the new challenge. “Almost all of the papers you see about people using LLMs are written by people at the companies that are producing the LLMs,” Spielman says. “It comes across as a bit of an advertisement.”
First Proof is an attempt to clear the smoke. To set the exam, 11 mathematical luminaries—including one Fields Medal winner—contributed math problems that had arisen in their research. The experts also uploaded proofs of the solutions but encrypted them. The answers will decrypt just before midnight on February 13.
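Publishing encrypted proofs up front and releasing the key after the deadline is essentially a commit-and-reveal scheme. Here's a minimal sketch in Python using a salted hash commitment instead of actual encryption (the article doesn't say which mechanism First Proof uses, so this is just an illustration of the idea):

```python
import hashlib
import secrets

def commit(proof: str) -> tuple[str, str]:
    """Publish the digest now; keep the salt and proof secret until reveal time."""
    salt = secrets.token_hex(16)
    digest = hashlib.sha256((salt + proof).encode()).hexdigest()
    return digest, salt

def verify(proof: str, salt: str, digest: str) -> bool:
    """Anyone can check that the revealed proof matches the earlier commitment."""
    return hashlib.sha256((salt + proof).encode()).hexdigest() == digest

# A problem-setter commits to a proof before the challenge starts...
digest, salt = commit("Proof of Lemma 3.1: ...")
# ...and after the deadline reveals (proof, salt). The digest binds them:
# the proof can't be swapped out after the fact without detection.
assert verify("Proof of Lemma 3.1: ...", salt, digest)
assert not verify("A different proof", salt, digest)
```

The point of committing in advance is that the organizers can prove they had correct solutions before any model attempted the problems, without leaking those solutions to the models.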

I haven't seen any articles about the outcome yet, but I expect they'll appear soon enough given the February 13 deadline.

How can they ensure that the same (or similar) problem didn't come up in someone else's work and get posted on the internet somewhere?


We can't.

But I imagine they took some lemmas from recent research that hasn't been publicized as widely as, say, the Erdős problems. Something very niche that only the mathematician who proposed it is likely to be working on.

In any event, it seems OpenAI took on the challenge: https://cdn.openai.com/pdf/a430f16e-08c6-49c7-9ed0-ce5368b71d3c/1stproof_oai.pdf


I always fail at math problems.

"The answers will decrypt"

That's such an odd thing to say. lol
