Aww, cute, they're learning how the Internet works!
LOL, the data was intended to be used for AI training!
The companies trained their models in part by using "the Pile," a collection by nonprofit EleutherAI that was put together as a way to offer a useful dataset to individuals or companies that don't have the resources to compete with Big Tech, though it has also since been used by those bigger companies.