All my oldest preserved code (early 80s) is on paper, the things it occurred to me at the time to print out. No fancy archival paper either, just listings printed out on my dot matrix printer onto fanfold printer paper.
Anything from that era that I didn't print out is gone.
From the late 80s onward I still have all the files that I've cared to save. The general answer is that there is no persistent medium; you need to commit to migrating that data forward to whatever makes sense every so often.
I copied my late 80s 5.25" floppies to 1.44MB floppies in the early 90s. In the mid 90s I copied anything accumulated to CD-Rs. In the 2000s I started moving everything to DVD-Rs.
From the late 2000s until today I have everything (going back to those late 80s files) on a ZFS pool with 4-way mirroring.
Of course, aside from preserving the bits you also need to be able to read them in the future. Avoid all proprietary formats; those will be hopeless. Prefer text above all else, since that will always be easily readable. For content where text is impossible, only use open formats that have as many independent open source implementations as possible, to maximize your chances of finding (or being able to port) code that can still read the file 30-40 years from now. But mostly just stick with plain text.
I used to work in the data protection industry, doing backup software integration. Customers would ask me stupid questions like "what digital tape will last 99 years?"
They have a valid business need, and the question isn't even entirely stupid, but it's Wrong with a capital W.
The entire point of digital information vs analog is the ability to create lossless copies ad infinitum. This frees you from the need to reduce noise, chase ever-higher fidelity, or rely on "expensive media" such as archival-grade paper, positive transparency slides, or whatever.
You can keep digital data forever using media that last just a few years. All you have to do is embrace its nature, and utilise this benefit.
1. Take a cryptographic hash of the content. This is essential to verify good copies vs corrupt copies later, especially for low bit-error-rates that might accumulate over time. Merkle trees are ideal, as used in BitTorrent. In fact, that is the best approach: create torrent files of your data and keep them as a side-car.
2. Every few years, copy the data to new, fresh media. Verify using the checksums created above. Because of the exponentially increasing storage density of digital media, all of your "old stuff" combined will sit in a corner of your new copy, leaving plenty of space for the "new stuff". This is actually better than accumulating tons of low-density storage such as ancient tape formats. This also ensures that you're keeping your data on media that can be read on "current-gen" gear.
3. Distribute at least three copies to at least three physical locations. This is what S3 and similar blob stores do. Two copies/locations might sound enough, but temporary failures are expected over a long enough time period, leaving you in the expected scenario of "no redundancy".
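Steps 1 and 2 can be sketched in a few lines of Python. This is a minimal side-car manifest using plain SHA-256 (not a full Merkle tree, and the file layout is just an illustration):

```python
import hashlib
import json
import os

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large archives don't need to fit in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def write_manifest(root, manifest_path):
    """Record a checksum for every file under `root` in a side-car manifest."""
    manifest = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            manifest[os.path.relpath(path, root)] = sha256_of(path)
    with open(manifest_path, "w") as f:
        json.dump(manifest, f, indent=2, sort_keys=True)

def verify_manifest(root, manifest_path):
    """Return the files whose current hash no longer matches the manifest."""
    with open(manifest_path) as f:
        manifest = json.load(f)
    return [rel for rel, digest in manifest.items()
            if sha256_of(os.path.join(root, rel)) != digest]
```

Run write_manifest once when you archive, then verify_manifest after every copy to fresh media; an empty list means the copy is good.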
... or just pay Amazon to do it and dump everything into an S3 bucket?
It costs the Internet Archive $2/GB to store content in perpetuity, maybe create an account, upload your code as an item, donate $5 to them, and call it a day. Digitally sign the uploaded objects so you can prove provenance in the future (if you so desire); you could also sign your git commits with GPG and bundle the git repo up as a zip for upload.
EDIT: @JZL003
The Internet Archive has their own storage system. I would assume it works out because they're operating under the Moore's Law assumption that the cost of storage will continue to decrease into the future (and most of their other costs are fixed). Of course, don't abuse the privilege. There are real costs behind the upload requests, and donating is cheap and frictionless.
https://help.archive.org/help/archive-org-information/
> What are your fees?
> At this time we have no fees for uploading and preserving materials. We estimate that permanent storage costs us approximately $2.00US per gigabyte. While there are no fees we always appreciate donations to offset these costs.
> How long will you store files?
> As an archive our intention is to store and make materials accessible in perpetuity.
http://news.bbc.co.uk/2/hi/technology/2534391.stm
"""But the snapshot of in the UK in the mid-1980s was stored on two virtually indestructible interactive video discs which could not be read by today's computers. """
I can't find the back story now, but if they weren't able to source a working laser disk reader from a member of the public (which IIRC took quite a bit of effort to find), then accessing this data - digitized in the early 1980s - would have cost a fortune.
The inspiration for this project, the 900-year-old Domesday Book, is just as readable today as it was in 1980 (and in 1200 or so). The ability to read data with one's eyes should not be underestimated.
The only true solution is a living one, where you make sure you have the ability to move your data from an old format to a new one periodically. More importantly, you should look into the idea of 3-2-1 backups. Anything that you intend to keep indefinitely is subject to random events: fire, flood, tornado, theft, etc. Having multiple archives in separate systems is more important than trying to ensure a single copy will last a long time.
Storing less than a gigabyte in multiple formats is very cheap: USB flash drive, external hard drive, CD, Blu-ray disc, etc. You can hedge against data corruption with PAR2 files. Also, consider storing a copy in the cloud, e.g. Backblaze B2, AWS S3, etc. Again, I suggest creating PAR2 files and/or using an archive format that can resist damage.
Just create calendar events to periodically check the integrity of your archives. Having problems reading a CD? Use the hard drive backup to burn a new one. This is also a good time to consider whether one or more of your formats is no longer viable.
Finally, realize that a program runs within an environment, and environments get replaced over time. You need to not only back up your program, but probably also store the operating system and tools around it.
You might like to read through the site, but if not, I would suggest keeping it safe via storage in multiple formats and locations. If I really wanted to keep something safe and was willing to put effort into it, I would put it on a remote service and on external physical media stored somewhere else safe, and back it up to each new computer I get. This of course puts extra managerial requirements on you (which for me would be difficult because of ADHD problems): you need to keep the remote service going, or make sure you have a plan for moving your stuff if you're getting rid of it.
In my case I have multiple computers, so I would also make sure anything important to preserve was backed up to all of them.
All of which reminds me I should update a bunch of my stuff.
They’re special DVD and Blu-ray discs designed for long-term storage. DVD and Blu-ray are so widely used, it seems likely you’d be able to find some equipment in 30 years that could still read them.
35 mm film is also interesting but probably costs a fortune: https://www.piql.com/services/long-term-data-storage/
[0] https://archiveprogram.github.com/arctic-vault/ [1] https://www.piql.com/
Then come back to it at least once a year to run it again and make sure it still works.
At present we are lucky enough that Windows programs from 1990s still run under Windows 10 to some degree. Thank the folks at Microsoft for maintaining their operating system as a digital museum of archaic bug-compatible APIs.
Something to keep on the shelf would mean you ignore it for too long and it stops working.
Even something like a Python script may stop working due to changes in the language, and old versions of the interpreter no longer being maintained.
But at face value, it's hard not to wonder whether you'd like to preserve the functional program, its source code, or its architectural and design ideas.
Either way, your current perception of the program is likely tied to the current technology or perhaps even whole ecosystem around it.
So to preserve something like that, you'd need more than just storage.
If it's just the source code, as in text, then as the golden rule of backup goes: keep many copies in distributed but known locations. In other words, diversify and distribute. Whatever the storage: digital, analog, or organic, as in human memory (storytelling is a type of storage too).
Though, likely, you mean the functional program. Thus you'd need to preserve the platform too, along with the build tools. So at least some system specs need to be preserved, or a VM image for a more-or-less stable virtualization environment.
The card is write-once. They run around $90-100.
"The material, a nanostructured glass disc, also has an estimated lifetime of 13.8 billion years (roughly the current age of the universe) at elevated temperature of 462 K (190 C), and a capacity of 360 TB. It has been hailed as a particularly significant invention, as no other existing storage medium can so safely ensure that data will be accessible by future generations."
https://www.guinnessworldrecords.com/world-records/412399-mo...
Copy-1. Compress a copy and email it to yourself.
Copy-2. Burn to a DVDR and keep it on your shelf.
Copy-3. On a USB stick and store where you keep your passport.
Copy-4. On your rolling backups (you have backups right?...).
Copy-5&6. An extra DVDR and USB stick kept off-site (family/friend). Feel free to Encrypt it.
Copy-7. Your rolling backups that you keep off-site. Encrypted.
To be honest, since you already should have a good backup strategy, the cost should be like $5 for a couple of USB sticks and DVD-Rs.
For a single copy long term paper or etched metal is probably the most reliable.
Now, what is the highest density you can get on standard paper? That's a more interesting question.
Probably some collection of QR codes, with multiple copies maybe.
Real talk: every 6 mos when checking fire alarm batteries check your storage (and as necessary migrate it to copies on new cloud systems etc).
I wonder if printing microfiche is something you can find easily.
[1]: https://arweave.medium.com/arweave-the-internet-archive-buil... [2]: https://arweave.org
Even if it's plain command line C you're still going to have potential issues with compiler compatibility 30 years from now. C will probably be ok if you code defensively to avoid explicit hardware dependencies, but for all anyone knows C will only be available in museums by then.
If it's something high level like Python, it's impossible to guess what state that ecosystem is going to be in 30 years from now.
Same applies to operating systems and tooling.
Vintage computer museum projects either store the complete hardware and software stack or run old code under emulation.
This was easy when you had (for example...) a VAX or PDP-11 that was essentially self-contained. It's going to get harder as processing and dependencies become more and more distributed.
I wouldn't even want to assume that something like Docker will look much like it does today, or if it does that it will be compatible with thirty year old images.
I wrote some binary decoding patches for ZBar for this exact use case. You can, for example, store video games in a QR code:
On a more serious note, there is lots of good information out there about digital preservation, e.g. from UK national archives[1]
[1] https://www.nationalarchives.gov.uk/information-management/m...
Well, if you're asking literally, then probably cuneiform clay tablets (fired on purpose, of course). However, a higher density medium with a reasonable lifetime would probably be a 2D barcode engraved on a plate of stainless steel or something like that.
The ultimate of course would be 3D storage in synthetic quartz, but as a DIY solution, that is much more difficult to write (you need a short pulse laser for that), or even to read (for 2D barcode, any camera works).
Do not underestimate the resilience of the paper format; however, it's harder to move it back to digital again.
More ideas that you could also use for programs: https://meaningofstuff.blogspot.com/2015/05/backup-your-smar...
My implementations of those programs are all gone. I have many of them on 5.25" disks but even if they're working, I have no way of reading them now.
However, the photocopied books are still intact, with the pages held together by an aging paper tag. Go figure.
https://egyptianmuseum.catalogaccess.com/search?search=contr...
Though some are the only remaining replicas from RAID-1 (RAIT?) groups.
Consider adding LDPC or something while you're at it.
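Even without a full LDPC implementation, plain XOR parity (the RAID-4/5 idea) illustrates the principle: one extra parity block lets you reconstruct any single lost block. A toy sketch, not real LDPC:

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

def make_parity(data_blocks):
    """Parity block = XOR of all data blocks."""
    return xor_blocks(data_blocks)

def recover(surviving_blocks, parity):
    """Reconstruct the single missing data block.

    Since parity is the XOR of all data blocks, XOR-ing the survivors
    with the parity cancels them out and leaves the missing block.
    """
    return xor_blocks(surviving_blocks + [parity])
```

Real LDPC (or PAR2's Reed-Solomon) generalizes this to tolerate many lost blocks, but the copy-and-cancel idea is the same.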
They're used so much by huge businesses[1], those archives will still be on tape in 30 years.
The tar format (tape archive) likewise will still be around.
Probably the biggest worry would be the interface connector, but considering you can buy serial adapters, and RS-232 is over 60 years old, you'll be able to get a USB adapter for whatever ports we have in 30 years.
The standard archive mechanism for the film industry, when committing a film's footage to The Vault, is LTO plus hard drive. Good enough for Disney is good enough for me.
If a cataclysmic change happens in storage media enough to unseat billions upon billions of LTO tapes, there'll be plenty of warning as the whole world changes over. And you'll be able to pick up spare drives for a pittance.
[1]: Shipping over 100 exabytes per year. This is slightly less than what Seagate alone shipped in HDD capacity, but every byte of LTO storage is bought with long-term retention in mind.
My dad's Ph.D. is on PDP-8 magnetic tape. We went to a computer museum to try to recover it, but their PDP-11's Winchester drive (hard drive) had broken (and made dramatic noises), so we weren't able to boot fully in order to mount the tape. Eventually we ran out of time.
Over 30 years, the best way would be to teach someone new. Especially a child.
Educational computing is how Apple built their market, and how the Raspberry Pi is gently gaining market adoption for Linux. There's some LOGO code from 1997 that I wrote on a Mac Plus (and copied forwards repeatedly) at age 7, that still runs on Mini vMac.
The challenge is in finding someone else who believes in your posterity just as much as you. And that's just one challenge of having kids. (thankfully I'm yet to experience that responsibility).
It will last 10k years if stored reasonably. Storing GBs is no problem. Won’t go obsolete - the technology has been around for nearly four billion years.
I have a .tar.gz of my university account that's 25 years old at this point and a .ZIP of my old DOS, Turbo Pascal, etc. stuff that's a few years older. They've been copied so many times over the years and I'm not even sure the path my current copies took. They lived on floppies, on a CD-R for a long time, different PC hard drives, external backup hard drives, flash thumb drives, Dropbox / Google Drive / iCloud, and most recently a microSD card that lives in my MacBook's port. I'm sure it's been on a couple tapes and Zip disks but those media likely long outlived any installed drives I had. Can't remember ever getting a bit error or corruption on a copy. Even the CD-Rs that were well past their alleged lifespan read fine for me.
Tape cartridges are high volume, inexpensive and the drives can be found on eBay or similar for under $200.
They don't do random access in any sort of reasonable time, but can be great for archival work.
Also, isn't there at least thirty years of development roadmap on the books for tape?
https://www.monperrus.net/martin/store-data-paper
I mostly jest. It would require a lot of paper. But it should be stable storage for potentially centuries.
Upload your data to each of their services. AWS Glacier Deep Archive would charge pennies per month to store your data on their cold storage platform. Put a copy in four separate regions there, then again on the other platforms.
Then buy your own M-Disc burner/reader, perhaps two for backup, and burn the data to that. "They" say the media will last decades or more. Who knows about the readers, though.
But in all honesty, I think you are better off uploading your data to GitHub, Google, Dropbox (and whoever else you can find), and relying on at least one of them being around that long.
Anyway, there's a guy who etches data into ceramic discs and stores them in a cave or old mine in, um, I want to say Switzerland? The discs themselves would conceivably last millions of years, and barring cave-in of the tunnels they're stored in, uh, yeah.
It's cheaper and easier than finding media that you can literally forget about for decades and still find intact, and easily recoverable.
The ability (or lack of it) to play/view an original archive is already a major problem. Think about the history of floppy disks, zip media, digital tape cartridges, and numerous others. I recall these media being quite prevalent back in the 90's, and that's only <=30 years ago. (I have my own share of them.) Today these media are ancient history and soon, if not already, as inscrutable as the writing systems of obscure prehistoric civilizations.
As said in several comments, the situation is worse for software. Keeping long superseded equipment running is very difficult. (I have some of that too.)
Preserving source code should be pretty straightforward, printed out with archival ink on 100% cotton fiber, acid-free and buffered paper would work. The only catch is long-term storage of the printed document. Paper itself is subject to environmental degradation. Museum standards specify constant 20°C, 50% humidity, sealed against atmospheric pollutants and no light exposure. That should hold it for at least 100 years. :-)
Technologies are great; it seems a good bet that a brilliant startup will think of new ways to preserve the history of our epoch.
Yeah, you can do archival diamond, or archival ink on archival paper. But will you be able to read it 1,000 years hence?
I doubt there's a market for a printer, except as a novelty device. I wonder what kind of bit density you can achieve, in clay?
The problem is often: how are you going to read it after all those years?
I've got floppy disks that are ~25 years old. But without an old USB floppy drive there's no way to know if they're still readable. And if USB-C becomes the norm, I'm not sure I'll ever be able to read them again.
Keep the stuff in text and move it around to different places periodically, and keep redundant copies.
At the moment I favour git. It's easy to set up multiple repos, and if one goes offline you can set up another.
The CrafsMan has a DIY video on how to do it, he's also just a wonderful treat in general. https://www.youtube.com/watch?v=4tYMUqsVhfc
When I got married (20+ years ago) I burned a CD full of music for the DJ. Songs that were not likely in his/her collection.
I found the CD in a box about a year ago, and it is currently in my car stereo. All the songs are intact.
Oh.. My xbox also reads and plays the music.
The script you downloaded from GitHub, or data format used, may long be gone.
Especially relevant if you store in the cloud (you should probably encrypt in that case).
I find that incredibly hard to believe
The most common form is magnetic storage, which is what you find on your hard drive or a floppy disk. It's really good at storing data and can hold a lot of it. However, it is not very stable. Magnetic storage degrades over time, meaning that the longer you use it, the more unreliable it becomes. This means that you need to frequently back up your data and/or replace your hard drive fairly regularly.
Another form is optical storage, which stores information as a pattern of pits on a surface such as a CD-ROM or DVD. Optical storage is also not very stable, as discs can be scratched easily. But optical storage drives are cheap and easy to use (you probably already have one in your computer.) Optical drives can also read from many different types of discs, which makes them versatile.
Finally there is tape storage, which records data onto magnetic tape like the kind used for analog audio cassettes. Tape storage is great for recording large amounts of data quickly, but accessing information on tape takes longer than accessing something stored on a hard drive or optical disc.
USB flash? HDD? SSD?
> must keep for at least 3 decades
Most storage media won't have problems lasting that long if they're stored well. Flash and SSD storage in particular shouldn't have issues; tape and HDDs have a small chance of mechanical failure even if you don't use them.
> must be easily transportable (for moves between houses and such) and can sit on a shelf
I mean, I think most of our current formats do just fine in that regard.
> Bonus points for suggestions on an equally stable storage type that some computer will still be able to understand in the future.
Almost everyone is still using SATA. There is unlikely to be a world in the future that is unable to "understand" serial ATA storage.