I feel like I’ve entered a Google interview
https://engineering.fb.com/2015/05/04/core-data/under-the-hood-facebook-s-cold-storage-system/
This is an article from 2015, when Facebook/Meta was exploring Blu-ray for their DCs. You’re definitely right though. Tape is key as the longest-term storage.
This is such a great example of the potential consequences of making a decision without understanding the landscape/context. In hindsight, it’s obvious this would happen.
Super cool, blew my mind! I would love to see it in operation. The logistics on the machine side + the storage heuristics for when to write to a disc that’s write-once sound like a really cool problem.
There was an article recently about this (too lazy to search for it). It’s already starting to happen. If most of the content they train on is the internet, and more internet content is created by LLMs without being tagged as AI-generated (which can’t be guaranteed for all actors), then it’s inevitable. High-signal training data is out the window.
There are also techniques where data centers do offline storage by writing out to a high-volume storage medium (I heard Blu-ray as an example, especially because it’s cheap) and storing it in racks. All automated, of course. This lets them store huge quantities of infrequently accessed data (most of it) more efficiently. Not everything has to be online and ready to go, as long as it can be made available on demand.
I had not thought about the sunscreen leaching into the water. Thank you for educating!