In 48 BC the library of Alexandria burned down.
On that day humanity lost one of the largest collections of knowledge of its time. Now, 2000 years later, all this knowledge would probably be insignificant. Or maybe not - it might be interesting for history, archeology and fiction.

This is the first time

This is the first time we have the technology to store vast amounts of data in very little space. Even your parents' generation, which had access to digital media, had to make strict decisions on what to keep and what to throw out. We might also be one of the first few generations that are rich enough to have the leisure time to care about data: We care about media, movies, series, music, art. We care about health data, heart rates, sleep tracking, exercise data, calories, weight and their potential benefits in healthcare decisions. We care about documentation of business processes. We care about all sorts of data.
But our storage still isn't endless. We have to decide what to keep.

What is worth saving?

Knowledge

It pains me to look at the discussion pages of Wikipedia articles. Moderators remove all sorts of information for being irrelevant. That makes sense to some extent, since Wikipedia is supposed to be a dictionary-style introduction to topics and not a collection of all knowledge.
But when we rebuild this on a decentralized internet - shouldn't there be an endless further reading and further reading and further reading? For me there is no such thing as data that is too irrelevant. There is only incorrect sorting that wastes your time.
Especially with hyperefficient parsing methods like scrapers and LLMs, we could read all of this data at whatever level we want, from a broad overview down to a tiny niche.

Entertainment

It's baffling to me how hard it is to watch an Oscar-winning movie from the 30s. The 1930s were very recent in the grand scheme of things. And these movies were important enough to get a literal Academy Award. Netflix or Disney+? Nope. Buying individual films on Apple or Amazon? Nope. Torrent? Good, 5 people are seeding. 5 people, in a world with millions of Americans and billions of people worldwide.
Enter: Marion Stokes.
Marion Stokes was a woman who compulsively recorded television from 1977 until 2012. This is exactly what I'm talking about. It is extraordinary that this stuff was important enough at the time to broadcast it to the whole nation and millions of viewers, yet nobody but a random person cared enough to save it.
Maybe you don't think TV is worth saving. That's okay, but you get the idea.

Personal Stuff

I personally think the following data is worth saving:
  • Pictures. I love looking through my grandma's photo album - my grandchildren should be able to marvel at history even further back as well.
  • I'm beginning to journal because it is good for my mental health. The journal itself will die with me, but I will condense it into memoirs in old age. As should everybody imo, no life is not worth reading about.
  • (Smart)home data should pass on to the next homeowner of this piece of real estate. A homeowner deserves 50 years of cistern water levels and rain patterns. They deserve pictures from the home construction. They deserve to know in which order we laid cables underneath the plaster.

Organize and search

Lastly, I want to recommend some resources

Decentralized Storage

I already mentioned the future of the decentralized web. Using torrents as data storage at the scale of all humanity has only worked moderately well so far. What makes us think it will go better when we try again with the next generation of the decentralized web, or that the old attempt will ever improve?

Obsidian

This is a cool piece of software for taking notes in .md files. You can link them to one another to form a graph, a "second brain".
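To make the "graph" idea a bit more concrete, here is a minimal sketch in Python that reads a folder of .md notes and collects which note links to which. It assumes the notes use Obsidian-style [[wikilinks]]; the function name build_link_graph and the vault path are made up for illustration, and this is not how Obsidian itself builds its graph view.

```python
import re
from pathlib import Path

# Matches [[Target]], [[Target|alias]] and [[Target#Heading]];
# only the target note name is captured.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]")


def build_link_graph(vault_dir):
    """Map each note name to the sorted list of note names it links to.

    vault_dir is a hypothetical path to a folder of .md files.
    """
    graph = {}
    for note in sorted(Path(vault_dir).rglob("*.md")):
        text = note.read_text(encoding="utf-8")
        targets = {match.strip() for match in WIKILINK.findall(text)}
        graph[note.stem] = sorted(targets)
    return graph
```

Calling build_link_graph on your vault folder gives you a plain dictionary, which is easy to hand to a graph library or to dump as JSON next to the notes as an extra backup of the structure.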

privateGPT

A chatbot to parse your data locally. No internet required: https://github.com/imartinez/privateGPT

The r/DataHoarder subreddit

People who are as obsessed with this topic as I am.

TLDR

This was a chaotic ramble about the future of data hoarding and usage. I just want to know:
  1. What data do YOU find worth saving? And what will you take to the grave?
  2. How will we as humanity find the best way to record and use data?
  3. Can you recommend software or a setup to make data immortal?
PrivateGPT is interesting. I have several years of journaling in Obsidian and it’d be cool if I could ask when certain events happened in my life or perhaps give me summaries of years, months, etc. I’ll have to try it out - curious how good a job it can do. I find the idea fascinating. Thanks for sharing!
reply
PrivateGPT is interesting. I have several years of journaling in Obsidian and it’d be cool
Right, these technologies fit together so well conceptually
it’d be cool if I could ask when certain events happened in my life or perhaps give me summaries of years, months, etc
It's a great use case. When I visit my parents they so often reminisce about the past exactly like that. "Was our vacation to xyz before or after we built the house?" "What did we do when 9/11 happened again?" "Did I already work at xyz when that happened?"
Also remember that chatbots are still in their infancy. Imagine what they will be able to extract from your thoughts in 10 years.
reply
Makes me think of a sci-fi episode of a show (maybe Black Mirror, I forget) where a dude downloads his wife’s digital self into a robot. I don’t remember the ending, it was probably bleak. But, point is that something like this could be a way to talk to people who have died. Just tell the AI to act like the person. Kinda crazy, creepy, wild, cool, etc. Future could get weird. :)
reply
I remember this episode. After that the LLM community was pretty fascinated with this concept. I remember somebody on Reddit made a script to extract bulk text out of email, WhatsApp chat backups etc. I wonder what happened to this "branch" of LLMs.
reply
All good intentions, but in reality it'll be poorly angled selfies. Videos of cats doing stupid shit. Videos of people doing stupid shit. Videos of dogs doing stupid shit. Videos of birds doing, or in the case of parrots saying and doing, stupid shit. And pictures of food.
The digital record of humanity will make us all look like total idiots.
I'm sure some historian will get a laugh out of it however.
reply
My precious cat pics ☝🏻
reply
I had this realisation a long time ago. ALL your unencrypted communications and metadata will exist forever. It will be valuable even after you die. Algos will read it, study it, learn from it. An advanced AI will know you better than yourself long after you die.
So say something nice :)
reply
I had this thought before so I wrote down the important things on a piece of paper in my lockbox with instructions on accessing it so my family could recover everything.
reply