Isn't this a bit like saying measuring all the various wavelengths bouncing off a painting can help you know whether or not it is art?
Nice analogy!
I'd say to the philosopher it definitely is. It says in the article:
> The “principled reasoner” being compared to here simply does not exist. It’s a Platonic ideal.
I agree with that. But it feels like the "template" being reinforced during reasoning training [1] is maybe good for math and coding, but not so awesome for other domains. I think what's needed are new alignment policies that aren't derived solely from the experts that OpenAI/Goog/Anthropic hire - some diversity in alignment would be nice! [2]
I don't think we have a good handle on what reasoning is in human minds.
I think the problem is trying to emulate it in the first place. I'm personally always amazed by people who tick differently; they help me break through barriers in my thinking. I don't need people around me who think like me. Compatible, but not similar?

Footnotes

  1. I cannot help but feel that the method and scoring are the same, or at least very similar, for all LLMs except Claude and Gemini at the moment, not least because when DeepMind decided to do it wholly differently, the results kind of sucked.
  2. I was fantasizing the other day that I'd totally want a mixture-of-experts LLM where each expert layer has been aligned by a different Bitcoiner... so you'd basically get the luke-jr layer arguing with the petertodd layer during the reasoning phase. That would be so awesome for those of us who quit the bird app... once a month, on demand: "you're wrong and you know it." -> "you lie!" Hah! A toy sketch of the routing idea follows below.
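
A minimal sketch of what that footnote's routing fantasy could look like, assuming numpy and toy linear layers in place of real trained experts. Every name here (moe_forward, the persona labels) is a hypothetical illustration, not any actual model's architecture.

```python
import numpy as np

# Toy mixture-of-experts: each "expert" is a differently-aligned layer,
# and a softmax gate decides how much say each persona gets per input.
# All names are hypothetical illustrations of the footnote's idea.

rng = np.random.default_rng(0)
d_model, n_experts = 8, 2  # two personas: the "luke-jr" and "petertodd" layers

# In this toy each expert is a single linear map; in a real MoE each would
# be a full FFN block, trained (and, per the fantasy, aligned) separately.
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]
gate_w = rng.normal(size=(d_model, n_experts))  # router weights

def softmax(z):
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def moe_forward(x):
    """Route one token vector through all experts, mixing by gate score."""
    gate = softmax(x @ gate_w)                 # how loudly each persona speaks
    outs = np.stack([x @ W for W in experts])  # each persona's opinion
    return gate @ outs, gate

x = rng.normal(size=d_model)
y, gate = moe_forward(x)
print("gate weights (luke-jr vs petertodd):", np.round(gate, 3))
```

The "arguing" part of the fantasy would live in how a reasoning loop alternates between high-gate experts from token to token; the sketch above only shows the routing half of that picture.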