How Do You Backup 20TB of Data?
Sean0michael writes "Recently I had a friend lose their entire electronic collection of music and movies by erasing a RAID array on their home server. He had 20TB of data on his rack at home that had survived a dozen hard drive failures over the years. But he didn't have a good way to backup that much data, so he never took one. Now he wishes he had.
Asking around among our tech-savvy friends though, no one has a good answer to the question, 'how would you backup 20TB of data?'. It's not like you could just plug in an external drive, and using any cloud service would be terribly expensive. Blu-Ray discs can hold a lot of data, but that's a lot of time (and money) spent burning discs that you likely will never need. Tape drives are another possibility, but are they right for this kind of problem? I don't know. There might be something else out there, but I still have no feasible solution.
So I ask fellow slashdotters: for a home user, how do you backup 20TB of Data?" Even Amazon Glacier is pretty pricey for that much data.
Crashplan (Score:5, Informative)
Crashplan has unlimited storage. I use their home plan; it's unlimited for up to 10 machines. I think I am backing up about 6TB there now.
Hard drives + Robocopy (Score:5, Informative)
External hard drives in USB cases + Robocopy works great for me.
Re:reduce the amount (Score:4, Informative)
20TB is not out of this world. With a RAID of 4TB disks you can cover that at home, and it doesn't need to be on all the time. Maybe you can reduce the amount of disk usage by reducing duplicate content using bup [github.com] or an appropriate FS.
Good luck. (Score:4, Informative)
A quick check of one service that lists such large amounts shows you would be looking at almost $20k/year to keep a single offsite copy of that. That is the posted price, however; I imagine that is enough volume that you could shop around and find a deal, but even a deal is still going to be prohibitive for most people.
At 20 TB I would start thinking about one of two things: Tape, and/or git-annex.
Unless prices have changed since I last looked and the scales tipped, tape has the advantage of being cheap. Of course, you will need to test your tapes occasionally and will likely want 2 copies just in case, but at that point you are invested in tape, so you may as well.
The other possibility is git-annex and lots of drives, but you can mix types. That way you can keep a catalog of your library and information on where it all is, and how many copies of each thing you have.
Of course, any way you slice it, each physical piece of media is something that can fail so you have to occasionally test to ensure redundancy.
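To make the "occasionally test" part concrete, here is a minimal sketch of a checksum manifest in Python. It is not how git-annex works internally; the function names (sha256_of, build_manifest, verify) are made up for illustration. The idea is only to record a hash per file when the backup is written and re-check it later.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large media files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def verify(root, manifest):
    """Return the relative paths that are missing or whose contents changed."""
    root = Path(root)
    bad = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            bad.append(rel)
    return bad
```

Build the manifest once from the source, store it next to each copy, and run verify against every drive or restored tape on whatever schedule you can stand. An empty list means that copy still matches.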
I agree but... (Score:5, Informative)
Ah, "unlimited"... right. (*cough*) (Score:5, Informative)
These "unlimited" claims always turn out to be lies. When will we learn?
My friend paid for an "unlimited" account from JustCloud for backup. He stored 1.8 TB on it and then they "fair use"'d his ass and canceled his account. They didn't even give him a refund for the rest of the money he prepaid.
Re:Glacier at $20/mn expensive? (Score:4, Informative)
Glacier at $20 per month for 20TB is ridiculously cheap by today's standards. And at those sizes, you'd want to ship those drives to Amazon instead of uploading. We do this all the time and it's not that hard.
The price of TBs of storage of course will come down without question. But by today's standards $20/month for a medium that won't "bit rot" on you is an amazing deal.
You missed a 0: he has 20,000GB and the cost for Glacier is $.01/GB/mo (not including upload charges). So, Glacier would cost him $200 a month or $2400 a year. Not hugely expensive, but if you are OK with a quasi-local copy (offline and stored in a fire safe, perhaps) you could do it for less once you hit the 1 year mark.
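The arithmetic behind that correction, as a quick Python sketch. The $0.01/GB/month rate is the one quoted in the thread; the function name is just for illustration, and upload and retrieval charges are left out.

```python
def monthly_storage_cost(terabytes, usd_per_gb_month, gb_per_tb=1000):
    """Flat per-GB storage pricing; ignores upload and retrieval fees."""
    return terabytes * gb_per_tb * usd_per_gb_month

monthly = monthly_storage_cost(20, 0.01)   # 20TB at the quoted rate
yearly = 12 * monthly
print(f"${monthly:,.2f}/month, ${yearly:,.2f}/year")
# prints: $200.00/month, $2,400.00/year
```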
a used LTO autochanger is what I employ (Score:2, Informative)
No one will ever see this anonymous post, but a cheap robot changer (used) on ebay can be had for anywhere from a few hundred to a few thousand dollars. Most of us are geeks and love technology. I use two such devices, couldn't imagine life without them. LTO4 is still the sweet spot in storage cost (media) and capacity. The tapes hold 800GB and can be purchased for around $22 each.
Re:Hmmm... (Score:5, Informative)
At 10 characters per second, the backup would take 63,419 years(*) and require 659 TJ or 0.2 TWh of power to complete. I have a customer that still uses paper tape. It lasts and lasts, and I have only replaced the reader once. The punch needs a new power supply every 20 years or so.
However, 63,419 years is a long time to wait for a backup to complete.
(*) this assumes that 1 TB = 1,000,000,000,000 bytes. It takes almost 70,000 years if you add the extra 10%.
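For anyone who wants to check the paper tape math, a small Python sketch. It assumes one byte per punched character and a 365-day year; the function name is made up for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def years_to_punch(total_bytes, chars_per_second=10):
    """Years needed to punch total_bytes at one byte per character."""
    return total_bytes / chars_per_second / SECONDS_PER_YEAR

decimal_years = years_to_punch(20 * 10**12)  # 1 TB = 10^12 bytes
binary_years = years_to_punch(20 * 2**40)    # 1 TB = 2^40 bytes, the "extra 10%"
print(int(decimal_years), int(binary_years))
```

The first figure comes out at 63,419 years and the second lands a bit under 70,000, matching the footnote.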
Re:Hmmm... (Score:4, Informative)
20TB = 1.33LoC
Re:Hmmm... (Score:5, Informative)
Punch the hole and you can flip them over to double your capacity.
Re:reduce the amount (Score:4, Informative)
Re:Hmmm... (Score:4, Informative)
Re:Hmmm... (Score:4, Informative)
Re:Hmmm... (Score:5, Informative)
At the very end of the 5 1/4" floppy era, the "High-Density" floppy used the same data rate, tracking, and recording density as the 8" 1.2M floppies. They were, in fact, 1.2M 5 1/4" floppies. Which is why their formatted capacity was different from 3.5" "high-density" equivalent, 1.44M.
Other than electrical needs (as 8" floppies often had their spindle motors directly powered by 120VAC line current), the high-density 5 1/4"s were used as a drop-in replacement for 8" floppies in the hobbyist retrocomputing community. (Not collectors, though; they'd want to keep the gear as cherry as possible.)
Re:If you want to hoard bits... (Score:5, Informative)
The dataset isn't that huge. Tape can write at speeds at least as fast as disk. LTO-5 writes at up to 280MB/sec, far faster than you can read the source, which isn't likely to be fast disk. The seek for a single-file restore will be slower than disk, but after the initial seek the read will be as fast as from a typical archive disk (no, you're not archiving 20TB to SSD, nor are you storing the source data on SSD either).
However, the change rate for this application is likely to be low. That makes it very feasible to do random testing from the new backups where a minute to do the tape mount/seek is not a problem. You won't be writing more than a single tape in any single run (LTO-5 is ~1.5 TB of uncompressed data).
For $2K, you'll have the LTO-5 drive. Add $500 for 20 tapes and you can back up the entire set (once) plus a bunch of incrementals. I haven't done the math with LTO-6 which is faster and holds more data. If you want multiple generations, tape is a lot cheaper per TB than disk. The initial drive cost hurts but after that, the price is good at $15/TB or so.
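A rough Python sketch of that tape budget. The inputs are the figures from this thread ($2K drive, $25 per tape, 1.5TB native per LTO-5 tape); the function name is invented for illustration, and it counts native capacity only, with no compression.

```python
import math

def tape_set_cost(data_tb, tape_native_tb, usd_per_tape, drive_usd=0.0):
    """Return (tapes needed, media cost, total cost) for one full backup."""
    tapes = math.ceil(data_tb / tape_native_tb)
    media = tapes * usd_per_tape
    return tapes, media, drive_usd + media

tapes, media, total = tape_set_cost(20, 1.5, 25, drive_usd=2000)
per_tb = 25 / 1.5  # media cost per native TB
print(tapes, media, total, round(per_tb, 2))
```

One full copy needs 14 tapes, so a 20-tape purchase leaves 6 for incrementals, and the media works out to a little under $17 per native TB, in the same ballpark as the "$15/TB or so" above.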
Re:Go on the internet and find a DLT drive (Score:4, Informative)
You're dating yourself. LTO-5 is 1.5TB native, 3TB compressed at $25 per tape. LTO-6 is 2.5TB native and 6.25TB compressed. Both of those compressed numbers are using the built-in compression in the drive.
A 10-pack of LTO-5 tapes is about $250.
You can easily encrypt the tapes and take them offsite. You can keep a copy onsite and offsite. You're simply not doing that with disk.
Your speed is also off - an LTO-5 can write at 280MB/sec. The limiting factor is not the write time on the media but the read time from disk.
Restore times are typically limited by the write rate on the destination raidset, not the read rate from tape.
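To put numbers on the bottleneck argument, a back-of-the-envelope Python sketch. The 280MB/sec figure is the one quoted above; the 100MB/sec source read rate is purely an assumption for illustration, as is the function name.

```python
def hours_to_transfer(terabytes, mb_per_second):
    """Hours to move the data at a sustained rate (1 TB = 1,000,000 MB)."""
    total_mb = terabytes * 1_000_000
    return total_mb / mb_per_second / 3600

tape_rate = 280    # MB/sec, LTO-5 figure quoted in the thread
source_rate = 100  # MB/sec, assumed read speed of the source array

# The slower side of the pipe sets the pace.
effective = min(tape_rate, source_rate)
print(round(hours_to_transfer(20, tape_rate), 1))   # tape alone
print(round(hours_to_transfer(20, effective), 1))   # limited by the source
```

At the quoted tape rate 20TB takes roughly 20 hours; with the assumed 100MB/sec source it is closer to 56 hours, which is the point being made about the disk side.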
Re:Go on the internet and find a DLT drive (Score:4, Informative)
LTO-6 can hold 2.5TB per tape, a tape costs ~$70, and the drives cost $2000. That's still more expensive than just more HDDs for 20TB, but at >50TB it might be worth it.
Re:Glacier at $20/mn expensive? (Score:2, Informative)
You can find a bunch of SAS LTO4 drives on ebay for ~$50-75, and adding a SAS PCIe HBA doesn't cost much more (if you have 20TB I assume you already have a tower).
Re:Hmmm... (Score:4, Informative)
Tape drives need the full SCSI command set, not the trimmed version that made it into SATA (I'm not sure there's even a "(01) REWIND" supported in SATA).
LTO tapes stored reasonably ("keep it in a cool dry place", as the song goes) should last 15 years from any vendor, as that's in the spec, and there aren't really bottom-feeder vendors for LTO.
Re:reduce the amount (Score:5, Informative)
The problem with RAID-5 is that you are 2 disks away from failure and rebuilds often kill the disks.
No. The problem with RAID-5 is that during a rebuild, there is a reasonable chance you could hit a UBE (unrecoverable bit error) and lose one bit, making perfect recovery of the array impossible. Only a stupid controller would consider a UBE to be a failed drive and trash the entire array. On RAID-6, you still have the same possibility of a UBE, but the chances that two separate drives would experience one on the same exact block during a rebuild are so astronomically slim as to be irrelevant.
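A quick Python sketch of how likely a UBE is during a rebuild. Big assumptions, none of them from the post: errors are independent, the drive's rate is 1 error per 10^14 bits read (a figure commonly printed on consumer drive spec sheets, and usually a worst-case bound), and the array is a hypothetical 5 x 4TB RAID-5, so a rebuild reads the 4 surviving disks in full. The function name is made up.

```python
import math

def p_at_least_one_ube(bytes_read, bit_error_rate=1e-14):
    """P(at least one unrecoverable bit error) = 1 - (1 - rate)^bits."""
    bits = bytes_read * 8
    # log1p/expm1 keep precision when the per-bit rate is tiny.
    return -math.expm1(bits * math.log1p(-bit_error_rate))

surviving_bytes = 4 * 4 * 10**12  # 4 surviving disks of 4TB each (assumed)
print(round(p_at_least_one_ube(surviving_bytes), 2))          # rate 1e-14
print(round(p_at_least_one_ube(surviving_bytes, 1e-15), 2))   # rate 1e-15
```

Under those assumptions the chance of at least one UBE is about 0.72 at 10^-14 and about 0.12 at 10^-15. Real drives often beat the spec sheet, so read this as an upper-end estimate rather than a prediction.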