I often see comments on HN which favor self-hosting and personal data management over use of cloud services, so I'm curious what approach people here take.
I've only just started doing the research for this, but my plan so far is just to buy a couple of high-capacity hard disk drives that would be mirrors of each other. Occasionally I'd copy files over from my computer to one of the drives, and occasionally I'd mirror the data on one drive to the other. I'm also wondering if I should just re-use some existing 2.5" drives I have from old laptops or if it's more prudent to purchase a new drive that might be manufactured with long-term durability / stability in mind.
1. Use rsync to back up everything to a Synology NAS every day
2. Back up from the NAS to tarsnap (cloud service with cheap, encrypted backups) every few weeks.
There are only a few important ideas and a whole lot of ways to accomplish them:
1. Backup frequency and retention policies. I forgot the term for this, but you want something that prioritizes more frequent backups for nearer term data. For example, see the Apple Time Machine backup policy:
"hourly backups for the past 24 hours, daily backups for the past month, and weekly backups for everything older than a month until the volume runs out of space"
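A thinning policy like the one quoted can be sketched in a few lines. This is only an illustration, not how Time Machine is implemented: the function name `snapshots_to_keep`, the 30-day reading of "month", and the choice to keep the newest snapshot in each hour/day/week bucket are my assumptions, and the "until the volume runs out of space" part is not modeled.

```python
from datetime import datetime, timedelta


def snapshots_to_keep(snapshots, now):
    """Return the snapshot timestamps a tiered retention policy would keep.

    Tiers (assumed): one per hour for the last 24 hours, one per day for
    the last 30 days, one per ISO week for anything older.
    """
    keep = set()
    seen_buckets = set()
    for ts in sorted(snapshots, reverse=True):  # newest first
        age = now - ts
        if age < timedelta(0):
            continue  # ignore timestamps from the future
        if age <= timedelta(hours=24):
            bucket = ("hour", ts.strftime("%Y-%m-%d %H"))
        elif age <= timedelta(days=30):
            bucket = ("day", ts.strftime("%Y-%m-%d"))
        else:
            iso = ts.isocalendar()
            bucket = ("week", iso[0], iso[1])
        if bucket not in seen_buckets:
            # The first (newest) snapshot seen in a bucket is the one kept.
            seen_buckets.add(bucket)
            keep.add(ts)
    return keep


if __name__ == "__main__":
    now = datetime(2021, 8, 10, 12, 0)
    # Half-hourly snapshots over the last two days.
    snaps = [now - timedelta(minutes=30 * i) for i in range(96)]
    kept = snapshots_to_keep(snaps, now)
    print(f"{len(snaps)} snapshots, {len(kept)} kept")
```

Everything not returned by the function is a candidate for deletion; a real tool would also need a rule for what to drop first when the disk fills up.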
2. Don't micro-optimize decisions like which hard drive is most reliable. It's a waste of time, and ultimately any drive can fail, the filesystem on that drive can fail, etc. Instead, use backups. RAID gives you disk-level redundancy, and an extra backup in the cloud will almost never fail because the provider has professionals making sure of that.
3. A backup you haven't tested restoring from is NOT a backup.
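One way to act on point 3 is to restore into a scratch directory and compare checksums against the live data. The sketch below assumes exactly that setup; the function names are made up for illustration, and it only compares file contents (not permissions or timestamps).

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files don't fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in root.rglob("*")
        if p.is_file()
    }


def compare_restore(source_root, restored_root):
    """Return (missing, changed): files absent from or different in the restore."""
    src = tree_digests(source_root)
    dst = tree_digests(restored_root)
    missing = sorted(set(src) - set(dst))
    changed = sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k])
    return missing, changed
```

If both lists come back empty for a test restore, the backup has at least been proven restorable once. Files edited since the backup ran will show up as "changed", so this is best run right after a backup completes.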
                                     /--Dropbox
Synology NAS -> normalize file names
                                     \--Google Photos
I wrote this tool to normalize folder and file names <https://github.com/jmathai/elodie>. I wrote about the rest of the system in the following posts.
1. https://medium.com/@jmathai/introducing-elodie-your-personal...
2. https://medium.com/@jmathai/understanding-my-need-for-an-aut...
3. https://medium.com/@jmathai/my-automated-photo-workflow-usin...
4. https://medium.com/@jmathai/one-year-of-using-an-automated-p...
I don't think the cost is high in terms of long-term TCO, assuming it really lasts that long. The problem is the time it takes to burn the data, and the data is practically not searchable.
I hope there will be next-gen storage, optical or not, that brings larger capacity and a longer life cycle.
I have a vague idea for a NAS with two drives: one for your local copy, and the other used for storing bits and pieces of data from other users of the same brand of NAS, for recovery purposes.
[1] https://www.amazon.com/Verbatim-98914-M-Disc-100GB-Surface/d...
[2] http://www.microscopy-uk.org.uk/mag/indexmag.html?http://www...
1. Multiple laptops get rsync'd to a WD MyBookLive network drive every 4 hrs (Yes, I know about the recent WD issue - see note at the end)
2. The WD MyBookLive data gets rclone'd to Microsoft OneDrive Business Basic Account every day (used to be Amazon CloudDrive - see note)
3. The WD MyBookLive data gets rclone'd to BackBlaze B2 every week
Notes:
1. Yes, there was a huge security mess with the WD MyBookLive, where people had all their data deleted if they enabled UPnP and allowed the drive to punch a hole in their NAT. But never expose any such "IoT" devices directly on the WAN - always block all these devices on the router WAN interface completely and ssh mount your network drives on a "home server." Have cron jobs on the server do the sync with cloud drives. MyBookLive is much less expensive and, if you keep it off the WAN, can work very well as a NAS.
2(a). Amazon CloudDrive used to be a good service to use, even after they got rid of their "unlimited" plans at $5/mo/1TB. But then they blocked off rclone and now there are close to zero clients that it works with - the only clients that work are their slow web interface and an "odrive" client that almost never syncs.
2(b). Microsoft OneDrive offers a nice "Business Basic" plan for $5/mo/1TB that seems pretty good.
3. B2 storage is $0.005/GB-month - slightly more expensive than Amazon Glacier at $0.004/GB-month, but its download cost at $0.01/GB is better than Glacier's at $0.09/GB.
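To make the trade-off in note 3 concrete, here is the arithmetic for a made-up scenario: 1 TB (taken as 1000 GB) stored for a year with one full restore. Only the four prices come from the note above; the scenario and the function name `yearly_cost` are assumptions, and the prices may well have changed since.

```python
def yearly_cost(gb, storage_per_gb_month, download_per_gb, full_restores=1):
    """Storage for 12 months plus the cost of downloading everything N times."""
    storage = gb * storage_per_gb_month * 12
    download = gb * download_per_gb * full_restores
    return storage + download


# Prices as quoted in the note above (USD).
b2 = yearly_cost(1000, storage_per_gb_month=0.005, download_per_gb=0.01)
glacier = yearly_cost(1000, storage_per_gb_month=0.004, download_per_gb=0.09)

print(f"B2:      ${b2:.2f}")       # $60 storage + $10 download
print(f"Glacier: ${glacier:.2f}")  # $48 storage + $90 download
```

With those numbers Glacier is $12/year cheaper if you never restore, and a single full restore costs $80 more than on B2, so the cheaper storage rate only wins if you expect to almost never download.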
Then at least once a month I bring in an offsite 6TB USB drive and rsync the borg backup to it after manually reviewing that the current borg backups are “sane”.
The main advantage to this approach is that even with complete physical destruction I can access/restore anything within hours. Additionally, going back to old versions or checkpoints locally is extremely fast (seconds to minutes) no matter the file size. I also have occasions where I create/change a lot of data in the same day, to the point where a former overnight cloud backup process would occasionally fail to run because it was still running from the night before!
Syncthing is synchronization rather than backup, but for simple stuff it is more than good enough, especially for pictures that I want to keep forever.
https://redbeardlab.com/2021/08/03/my-syncthing-setup-cheap-...
1. rsync to USB disk into snapshot/ (with global and per-directory filters)
2. on USB disk, borg from snapshot/ to borg/
3. rclone borg/ to multiple object storages (e.g. B2)
"USB disk" refers to LUKS-encrypted ext4 partition on USB attached rotating hard drive. I use this for my PC and some remote servers (rsync can pull over ssh).
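For anyone wanting to script the three steps above, a rough sketch follows. It is not the poster's actual script: the paths, the remote name and the archive naming are placeholders, the rsync filters are left out, and it assumes the borg repository was already created with `borg init` and the rclone remote is already configured. By default it only prints the commands.

```python
import subprocess
from datetime import datetime


def backup_commands(source, usb_root, remotes, archive=None):
    """Build the rsync -> borg -> rclone command lines (nothing is executed)."""
    # Hypothetical archive name; borg needs each archive name to be unique.
    archive = archive or datetime.now().strftime("%Y-%m-%dT%H%M%S")
    snapshot = f"{usb_root}/snapshot/"
    repo = f"{usb_root}/borg"
    cmds = [
        # 1. Mirror the source into snapshot/ (a trailing slash on the
        #    source means "copy the contents of this directory").
        ["rsync", "-a", "--delete", source, snapshot],
        # 2. Deduplicated, versioned archive of the snapshot.
        ["borg", "create", f"{repo}::{archive}", snapshot],
    ]
    # 3. Push the borg repository to each object storage remote.
    for remote in remotes:
        cmds.append(["rclone", "sync", repo, remote])
    return cmds


def run_all(cmds, dry_run=True):
    """Print each command; run it only when dry_run is False."""
    for cmd in cmds:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failure


if __name__ == "__main__":
    # Placeholder source, mount point, and remote.
    run_all(backup_commands("/home/me/", "/mnt/usb", ["b2:example-bucket/borg"]))
```

Stopping at the first failure matters here: if rsync fails, you don't want borg archiving a half-updated snapshot and rclone then uploading it.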
Does anyone have any good tools to pull (hopefully backed up) data off of failing disks?
What about imaging disks that are in filesystems your computer cannot read?
Ideally, these would be crossplatform, but suggestions across OSes would also work.
Time Machine (USB hard drive), NAS (Synology) and the cloud (Backblaze).
I'm struggling: I don't have good solutions. I have a good start, but otherwise only a work in progress.
On my HP laptop Windows 10 computer, I have the D:\ recovery partition.
Otherwise I have 3 Western Digital USB external 3.5" form factor hard disk drives. Two of the drives have 2 TB (trillion bytes) of space, and the third one, new, has 5 TB.
I back up using ROBOCOPY with some carefully selected options. Occasionally I do a full backup of my data on C:\ and frequently I do an incremental backup of that data.
By my data I mean the Windows file system directory tree rooted at directory (for Apple users, folder)
C:\Users\user1\
This procedure does a lot of good but has some flaws: One of these is that some of the directories close to C:\ are special in Windows and don't work in the normal ways with the command line command DIR, etc. I don't understand all the problems but for one it appears that there are circular references in the directory structure that lead some software operations, e.g., part of backup, to infinite loops -- as far as I can tell currently, really big, gigantic bummer.
So, there are problems: Last week an external keyboard held down a key on my laptop, and Windows got confused and deleted several icons from my screen, desktop. Bummer.
Eventually Windows users discover (apparently secret knowledge) that each such icon is from a file of type LNK. So, some of these files are in directory
C:\Users\user1\Desktop\
but others are in directory C:\Users\Public\Desktop\
This directory was hidden until I used attrib -H to unhide it. Well, the LNK file for the icon for my installation of an old version of Google's Web browser Chrome was in that directory and was one of the icons and LNK files that a confused Windows deleted.
And the actual EXE file for Chrome was in a directory that does not play well with DIR, etc.
So, it looked like (will do better next time) I had to go to a directory
C:\Users\user1\prog05\google\chrome\
with program chromesetup.exe
and run that. And that is what I did.
So, I reinstalled Chrome. I don't know if what was reinstalled was the old version of Chrome I had or a newer version. I care: When I like an old version of a program, I'm very reluctant to install a new version that replaces the old version. E.g., for Mozilla Firefox, I used to like it, but the recent version I have changed the user interface (in ways I regard as silly and steps backward) and, really bad, forces me to close a popup window a few dozen times a day asking me to install a new version. [I do NOT want a new version -- Mozilla, I deeply, profoundly, bitterly, hate and despise the whole idea of frequent new versions. Please, please, please STOP pestering me to get new versions.] Further, when I have Firefox save a Web page and then display the saved copy, Firefox pings me asking that I make Firefox my default Web browser which it already is. And, once again, Firefox pesters me, interrupts my work, gives me another popup that demands that I reject installing a new version of Firefox. Further, some of the Web pages saved by Firefox, Firefox won't display but Chrome and Brave will. Silly situation.
Due to such pestering, I may have to junk Firefox: I have 100,000 lines of .NET code with comments with tree names of documentation, 6000+ Web pages, and used to use Firefox and a single keystroke to display such a Web page. Maybe now the pages won't display or I will get my work interrupted by two !@#$%^&*() popup windows I HATE. Looks like I may have to junk Firefox and go for Chrome, Brave, or some such. Maybe there is a way to download and keep a version of Firefox 5-10 years old that I liked JUST FINE the way it was -- no popup hassles.
So, net, Windows getting confused due to a key held down forced me to download a new version of Chrome -- really bad bummer. And the Chrome EXE file is in a directory that does not play well with DIR, some approaches to backup, etc.
For the LNK files in directory
C:\Users\Public\Desktop\
I've copied those to part of the file system that behaves normally and where I can back it up.
It looks like I will have to do some system management mud wrestling to get the EXE files, etc. for programs in misbehaving directories
C:\Program Files
C:\Program Files (x86)
C:\Windows
to a normal behaving part of the file system where I can use ROBOCOPY to back up the files and just COPY or XCOPY to restore selected files/directories from the backup.
To me these misbehaving directories look like a grand design disaster of Windows. Windows has had 30 or so years to get such things right and still has some serious, first grade, problems. I will have to investigate to find ways to work around these Windows disasters.
Near the top of my TODO list is getting around bad or missing Windows documentation to get around some really bad Windows system management disasters.
E.g., those circular references in the file system directory tree -- no longer a tree. WHAT a bummer. Where is the documentation for how the heck to work with such directories, list them, copy them, restore them? Was it really necessary to ruin the file system in this way?
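The circular references are most likely directory junctions (reparse points that point back up the tree), though that is a guess from the symptoms. ROBOCOPY has a documented /XJ switch that excludes junction points, which is worth trying first. The sketch below shows the same idea in a script: an incremental copy that never follows links, so the walk cannot loop. The function name is made up, and the junction check relies on `os.path.isjunction`, which only exists in Python 3.12+; on older versions only symbolic links are detected.

```python
import os
import shutil

# os.path.isjunction exists only on Python 3.12+; fall back to "never a junction".
_is_junction = getattr(os.path, "isjunction", lambda path: False)


def _is_link(path):
    """True for symbolic links and, where detectable, Windows junctions."""
    return os.path.islink(path) or _is_junction(path)


def incremental_copy(src_root, dst_root):
    """Copy files that are new or newer than the backup copy; skip all links.

    Returns the relative paths of the files that were copied.
    """
    copied = []
    for dirpath, dirnames, filenames in os.walk(src_root, followlinks=False):
        # Prune linked directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames
                       if not _is_link(os.path.join(dirpath, d))]
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            if _is_link(src):
                continue
            dst = os.path.join(target_dir, name)
            if (not os.path.exists(dst)
                    or os.path.getmtime(src) > os.path.getmtime(dst)):
                shutil.copy2(src, dst)  # copy2 preserves the modification time
                copied.append(os.path.relpath(src, src_root))
    return copied
```

A second run over an unchanged tree copies nothing, because `copy2` keeps the source's modification time on the backup copy. Files that are locked by running programs or need elevated permissions will still raise errors, so this is a starting point rather than a full answer.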
E.g., I can't do a routine backup and restore and, thus, am pushed into depending on Web sites continuing to provide the downloads I want -- years from now. Bummer.
And part of this work to be done is to understand what I can do with the HP D:\ recovery partition and how to do it.
Uh, in simple terms, I want to know how to boot from a DVD, restore all the HP and Windows stuff from somewhere, DVDs if the size is small enough, from external hard drives otherwise, and then use ROBOCOPY to restore everything else. This goal seems simple and obvious enough. Now a big detour in my work is to figure out how to do that -- knowing that good documentation will be more rare than hen's teeth. System management mud wrestling to get around problems of HP and Windows instead of my real work.
On my other computer, a desktop computer I assembled from parts, e.g., an AMD FX-8350 processor, I have lots of internal hard disk space and two new 4 TB disk drives ready to install. So far, I do a lot of internal backup.
One problem is it appears that Windows has seen some free space on one of my hard disks and decided to use it for a lot of temporary storage to be used by running, installed programs. Since that storage changes frequently, it gets backed up as part of an incremental backup and makes my incremental backups much larger than they should be. Bummer. That is just what I do NOT want. And Windows never asked me about using my disk volumes this way, and so far I can't find any documentation for how to tell Windows to put that temporary junk where I would want it to be and out of the way of my real work. Windows, messing up my real work. Bummer.