177 sats \ 3 replies \ @optimism 4h
this thing is clearly trained via RL to think and solve tasks for specific reasoning benchmarks. nothing else.
Perfectly aligns with the perceived villain arc of the CEO. I made a small comment yesterday about how it's apparently okay in AI to do what gave VW massive reputation problems: build inferior products that only perform well on the benchmarks and safety tests.
reply
What's RL?
reply
95 sats \ 1 reply \ @optimism 2h
Reinforcement Learning
There's an advanced free and open source course at HuggingFace: https://huggingface.co/learn/deep-rl-course/unit0/introduction
reply
Oh, got it. Somehow the initials didn't click.
reply
Bizarre in a good way or in an “I can’t sleep tonight” kind of way? 👀 Can’t wait for the charts and weird outliers.
reply