Best Solutions For Massive Home Hard Drive Storage?
i_ate_god writes "I download a lot of 720/1080p videos, and I also produce a lot of raw uncompressed video. I have run out of slots to put in hard drives across two computers. I need (read: want) access to my files at all times (over a network is fine), especially since I maintain a library of what I've got on the TV computer. I don't want to have swappable USB drives, I want all hard drives available all the time on my network. I'm assuming that, since it's on a network, I won't need 16,000 RPM drives and thus I'm hoping a solution exists that can be moderately quiet and/or hidden away somewhere and still keep somewhat cool. So Slashdot, what have you done?"
Define "massive" (Score:5, Insightful)
How much is a lot? (Score:3, Insightful)
Re:Define "massive" (Score:1, Insightful)
Since he's maxing out two machines' worth of bays, if we assume three bays per machine and 3 TB disks, then massive probably means something more than 10 TB.
Re:Something like this (Score:4, Insightful)
The main caveats with that particular solution are the lack of redundant power, poor serviceability in the rack (which may not apply, like you said), and slow speed.
Their solution achieves the density it does because they are using SATA port multipliers, but that effectively creates bottlenecks and lowers overall speed. It works for Backblaze's application requirements, but YMMV.
Protocase.com makes the enclosure and will sell it to you for a pretty reasonable price. Getting all the parts is not such a big issue. I think we estimated we could build one without drives for less than $3k.
If you don't have it in a rack, then serviceability will be a lot better for sure. Rackmount solutions require cable management and heavy duty slide rails, and wide aisles, in order to gain access to the drives. The backplanes are parallel to the ground, facing up, and require taking the top off to access. Not exactly IT friendly.
Since the person in the article is not using this in a datacenter, cooling is going to be an issue. I suspect Backblaze survives due to hot-cold aisles and plenty of airflow. Sticking one of those enclosures in a closet without ventilation/cooling is a recipe for disaster.
Re:Why do you need them available at all times? (Score:2, Insightful)
Re:This should be modded up (Score:4, Insightful)
I looked at a Drobo - but being on a budget, I kept on looking elsewhere. I don't doubt they deserve those reviews, but they are not cheap. And if the Drobo itself dies... good luck getting the data off those drives without another Drobo handy.
My solution - Raid 6 (Score:2, Insightful)
You don't specify what constitutes lots of data. In my case, 2 years ago I went for six 750GB SATA drives in a Raid6 configuration. There are some very good posts here about some lesser known data reliability options, but personally I wanted to go with a worldwide standard that had been around for a long time and wasn't reliant on a couple guys hacking code in their spare time to make disk redundancy and file access work.
I bought a standard full size tower case, got a very large power supply, and spent a good deal of money on a mid-tier Raid controller. My primary requirement was Raid 6 so I could lose two drives without losing all my data, and my secondary requirement was having true hardware raid support. Most Raid controllers that are not enterprise business class are not true hardware raid - meaning that they use software and the CPU for some of the operations. This slows down file read/write. I did the research and read reviews and got a decent Promise card - if you have the money, go for LSI, Areca, or 3ware. Next, I got a Promise hot swappable 4 drive SATA bay. Not really sure why, it doesn't serve any purpose since in 2 years I haven't had a failure and thus have not had to hot-swap a drive. A very important thing is that I also purchased 7 drives for my 6 drive setup. So I already have a spare if I need it, and I don't have to worry about having the spare cash when a drive fails, or waiting on an RMA if it was still under support, etc. The one thing I wish I had done, and still might, is buy a spare raid controller with the exact same chipset. If your raid controller fries, ALL of your data is gone unless you can get the array up and running on an identical controller. That's a freaky thought!
6 drives in raid 6 at 750GB gives me a little under 3TB of disk. I wanted that in a single partition for ease of use, so I messed around with some 64 bit Linux distributions and did not have any luck. I finally settled on Vista of all things, but only after I got fed up with fighting with Linux - I didn't give it a fair shot, I should have been able to make it work. The only thing I can think is that it didn't like my controller or motherboard.
So, 6 drives of 750GB in Raid6 gives me 3TB. At the time I had less than 1TB of stuff, and wanted to make sure I had room to grow. I didn't grow anywhere near as quick as I expected, and I'm still at less than 2TB today. 2TB drives in a raid6 would give you 8TB, and that's if you only used 6 drives - you could easily add more into that same Raid6 array (depending on how good your Raid controller is). Even if all of your movies are dual layer quality, say 6GB each, that's over 1300 movies. That'd certainly last me a long time!
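For anyone who wants to redo the capacity math above with their own drive counts, here is a minimal Python sketch. The function names are made up for illustration, and it assumes decimal units (1 TB = 1000 GB), which is how drive vendors count. RAID 6 gives up two drives' worth of space to parity, so six 750GB drives come to 3TB decimal, which an OS that counts in binary units shows as roughly 2.7 TiB (hence "a little under 3TB").

```python
def raid6_usable_tb(drive_count: int, drive_size_tb: float) -> float:
    """Usable RAID 6 capacity: two drives' worth of space goes to parity."""
    if drive_count < 4:
        raise ValueError("RAID 6 needs at least 4 drives")
    return (drive_count - 2) * drive_size_tb


def movies_that_fit(capacity_tb: float, movie_size_gb: float) -> int:
    """Whole movies that fit, using decimal units (1 TB = 1000 GB)."""
    return int(capacity_tb * 1000 // movie_size_gb)


print(raid6_usable_tb(6, 0.75))  # 3.0  (six 750GB drives)
print(raid6_usable_tb(6, 2))     # 8    (six 2TB drives)
print(movies_that_fit(8, 6))     # 1333 (6GB per movie on 8TB)
```

This ignores filesystem overhead, so real-world numbers will come out a bit lower.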
Comment removed (Score:5, Insightful)
Re:Define "massive" (Score:2, Insightful)
Certain file systems, even with tons of free space, will fragment files that are in the low megabyte range.
[citation needed]
I suspect fragmentation gets even worse on the large files the OP is asking about.
Sadly, that is speculation on your part.
Re:Bzzzt. Still Wrong. (Score:4, Insightful)
Do you really need -massive- storage? (Score:2, Insightful)
Moore's law applies particularly well to hard drives. Every two years you can buy a new hard drive that is twice as big, for the same price. Or pay half as much for the same storage capacity. If you stock up now, you'll spend a lot of money for something you can buy a lot cheaper in two years' time.
My advice: Buy new hard drives and replace them as you run short on space. If you run out of space inside your rack, move the contents from your oldest disc into the next, and you can sell, discard or get an external enclosure for your oldest disc.
Allocating massive storage without an immediate need for it is going to cost you a lot of money.
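To put rough numbers on that, here is a small Python sketch that takes the "price halves every two years" claim at face value. The $50/TB starting price and the purchase schedule are purely hypothetical, and the function names are invented for this example; plug in your own figures.

```python
def price_per_tb(years_from_now: float, price_today: float,
                 halving_years: float = 2.0) -> float:
    """Projected price per TB, assuming it halves every `halving_years`."""
    return price_today * 0.5 ** (years_from_now / halving_years)


def staged_cost(purchases, price_today: float,
                halving_years: float = 2.0) -> float:
    """Total cost of a list of (years_from_now, terabytes) purchases."""
    return sum(tb * price_per_tb(years, price_today, halving_years)
               for years, tb in purchases)


# Hypothetical: 8 TB bought today versus 4 TB now and 4 TB in two years.
print(staged_cost([(0, 8)], 50))          # 400.0
print(staged_cost([(0, 4), (2, 4)], 50))  # 300.0
```

Under those assumptions, waiting on half the capacity saves a quarter of the bill, and that is before counting the power the idle drives would have burned in the meantime.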
Re:Define "massive" (Score:3, Insightful)
You'd have to have one hell of a bit torrent hobby/debilitating movie watching problem to need more than 2 TB of video on tap on a hard drive for entertainment purposes.
Unless you're doing HD video editing, or you like to keep a copy of every picture ever taken by your 8+ MP DSLR in RAW format, few people actually need that space. You might be able to fill 100GB with installed video games but the average person who is buying a 1TB drive is probably upgrading granny's computer and thinking "well hey, for $30 more I can get ten times the space" and opt for the $100 1TB drive instead of the 100GB drive. I just replaced the primary drive on my file server and said "hey, for $6 more, I can upgrade from an 80GB drive to a 320GB model".
Re:Define "massive" (Score:3, Insightful)
Actually NTFS is pretty good at keeping files unfragmented.
If a program opens a new file and then immediately seeks to the end of it to fix its size, then NTFS will look for a contiguous block of free space to save it in. NTFS caches all writes so it can wait to see what the program actually does with a file before committing it to disk.
It also has a system designed to reduce the fragmenting effects of small files by being able to store their data in the same block as their metadata.
The only major fragmentation problem Windows XP has is when a machine has very little RAM and it allocates a rather small page file. It then ends up needing to expand the page file repeatedly and it gets highly fragmented causing severe slow down. I think they fixed it in Vista/7 by simply specifying a sensible minimum size and expanding it in larger chunks.
Re:Define "massive" (Score:5, Insightful)
Does using RAID controllers actually provide superior price:performance to using software RAID? Last I checked, the processors on most cheap RAID controllers were slower than dogshit and using md under Linux would give you better performance than basically any of them, at the cost of some CPU. But since CPU is cheaper than RAID, it probably makes sense. For example, going from a Phenom II X3 720 to a Phenom II X6 chip of the same clock rate takes the CPU from $100 to $200. How much would it cost to go from four crappy RAID controllers to four good ones? It would probably cost you at least $400.
The answer is probably to just go ahead and install Debian on a machine with as many CPU cores as you want to blow money on, and to use software RAID. Put lots of system RAM in it, which the OS will automatically use for disk buffers. Current versions of GRUB work fine with USB keys, because they can use UUID for the groot, and the UUID never changes. If you want it to boot quickly, find a motherboard with coreboot support. If you want external disks, FireWire can be cheaper than eSATA, provided you get the external disks or just some enclosures at a good price. External disks make maintenance a lot easier, but involve substantial power waste due to all those inefficient wall warts.
P.S. OpenSolaris is circling the drain, please don't suggest it to anyone for anything.
Re:Define "massive" (Score:3, Insightful)
Your analysis completely ignores the cost of the electricity to run a setup like that.
I went from an older similar setup with about 1TB of storage to a dedicated NAS box with 2TB of storage with similar performance characteristics -- and saved $40 a month in electricity.
A 500GB drive draws as much power as a 2TB drive, and server motherboards and power supplies devour power.
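If you want to sanity-check a figure like that against your own bill, the conversion between continuous power draw and monthly cost is simple. A minimal Python sketch follows; the $0.12/kWh rate is an assumption for illustration (not a number from the post above), and the function names are made up.

```python
HOURS_PER_MONTH = 24 * 365 / 12  # 730 hours in an average month


def monthly_cost_usd(watts: float, usd_per_kwh: float) -> float:
    """Cost of running a constant load of `watts` for one average month."""
    return watts / 1000 * HOURS_PER_MONTH * usd_per_kwh


def watts_for_monthly_cost(cost_usd: float, usd_per_kwh: float) -> float:
    """Constant load in watts that would cost `cost_usd` per month."""
    return cost_usd / usd_per_kwh / HOURS_PER_MONTH * 1000


# Assumed rate of $0.12/kWh: what continuous draw corresponds to $40/month?
print(round(watts_for_monthly_cost(40, 0.12)), "W")
```

At that assumed rate, a $40 monthly saving works out to several hundred watts less continuous draw, so the actual number depends heavily on what your utility charges.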
Re:Define "massive" (Score:3, Insightful)
Yeah, in 1992 maybe ... not sure what version of Windows you're comparing to, but that hasn't been true for years.
Re:Define "massive" (Score:3, Insightful)
250 movies that you watch every year, in addition to the ones you rent, or go see with friends, or simply non-movie stuff you watch like sitcoms and/or live events like the news/sports? You must only work 2 hours a day to keep up with your busy viewing schedule and still have time to sleep, shower and spend time with other humans (they exist outside of movies, you know). 10 movies that you re-watch year after year I can understand, but 250 just blows my mind. Do you schedule that a year in advance? What happens if you miss a day?
I mean, you already have it in another (optical) format. If you already have a physical backup, what's the point of archiving it on a hard drive? It's going to take you just as long to find that movie in explorer as it is going to take you to pull it off the shelf and stick it in the drive. I can understand the need for photo storage, since there's no other physical media they come on and memory cards are relatively expensive. But unless you're accessing the same data you already have in optical format at least once a month it seems that you're backing it up to hard drive just to be able to say you've got 250 movies on your hard drive. It's unrealistic to watch all those movies every year just to justify having them on a $100-200 hard drive + the time (how long does it take to rip a movie, 30 minutes?). Maybe I just value my time better.
Re:Define "massive" (Score:3, Insightful)
I've heard one too many sad stories about old on-disk RAID structures not being compatible with the new version of the old failed RAID card. I prefer the md device since it has been consistent for quite a while and the on-disk format is well documented.