This thing is clearly trained via RL to think and solve tasks for specific reasoning benchmarks, and nothing else.
That aligns perfectly with the perceived villain arc of the CEO. I made a small comment yesterday about how it's apparently okay in AI to do what gave VW massive reputation problems: build inferior products that only perform well on the benchmarks and safety tests.