0 sats \ 0 replies \ @Scoresby 14h \ on: The Era of Exploration - Yiding Jiang Design
I read through the rest of the paper and I have to admit I still don't understand this line. If pretraining is walking LLMs through huge chunks of human-created text and giving the model feedback on its predictions, I don't see why we can't repeatedly reuse the same data, as long as it's huge (say, all the English-language writing on the internet). Maybe it's that improvements require ever-larger datasets, and human-generated data can't grow quickly enough to keep up; in that sense it has been "consumed." But I'm still struggling with how to think about a model "consuming" data.
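
One back-of-the-envelope way I've tried to make sense of it (very rough sketch; the tokens-per-parameter heuristic is the Chinchilla-style rule of thumb, and the size of the usable web corpus is my own guess, not from the paper):

```python
# Rough sketch of why "consuming" data might make sense:
# bigger models want more unique tokens, but the supply of
# unique human-written text is roughly fixed.

TOKENS_PER_PARAM = 20   # Chinchilla-style rule of thumb (assumption)
WEB_TOKENS = 5e13       # my guess at usable unique human-written tokens

for params in [1e9, 1e10, 1e11, 1e12]:
    tokens_wanted = params * TOKENS_PER_PARAM
    print(f"{params:.0e} params wants ~{tokens_wanted:.0e} tokens "
          f"({tokens_wanted / WEB_TOKENS:.2%} of the unique text)")

# Once tokens_wanted exceeds WEB_TOKENS, the options are to repeat
# the same data (with diminishing returns) or find/generate new data --
# in that sense the unique corpus has been "consumed".
```

So maybe "consumed" doesn't mean the data is used up like fuel, just that re-reading the same corpus adds less and less, and the pool of genuinely new human text isn't growing fast enough.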