
Damn, I still have a lot to learn. But that blog also confirms what I've been learning: that the Reinforcement Learning part is the real big innovation.
As to the cost, the blog says this:
Some say DeepSeek simply digested OAI’s models and have replicated their intelligence, and the benefit of all that human-in-the-loop reinforcement, at a fraction of the cost.
You could argue OAI scraped the internet to make ChatGPT and now DeepSeek have scraped ChatGPT.
All is fair in love and AI, right?
So the cost figure is definitely a lot more believable if you consider that it was built on top of previously trained models and weights. The way some people were talking about it, they made it seem like it was done entirely from scratch.
30 sats \ 1 reply \ @k00b 29 Jan
I think the ChatGPT scraping is still unverified, afaik. They might've used it to build the RL training data, I guess.
In this blog post, which gets a lot more technical, they note that the optimizations they made were, in retrospect, obviously good targets:
None of these improvements seem like they were found as a result of some brute-force search through possible ideas. Instead, they look like they were carefully devised by researchers who understood how a Transformer works and how its various architectural deficiencies can be addressed.
Fascinating