That's probably an oversimplification.
Those are the "RAG" (e.g. search) sources that get searched when you enter a query. They are NOT representative of the data that is used in training.
So yes, Reddit, Wikipedia, YouTube, and Facebook are going to score very high as RAG sources, because that's where most of the new content on the internet is being posted.
So when you enter your query, those sources are being referenced for up-to-date info.