I think this is super fascinating. I assumed they relied almost entirely on crawling the open web. But the scale of this library/warehouse makes it sound like they have a literal second pillar. It also makes it much more plausible to me that LLMs will develop (or already have) their own unique je ne sais quoi.
So Anthropic bought physical books to scan them, while Zuckerberg pirated books via torrents.