Which RAID for a Personal Fileserver?
Dredd2Kad asks: "I'm tired of HD failures. I've suffered through a few of them. Even with backups, they are still a pain to recover from. I've got all fairly inexpensive but reliable hardware picked out, but I'm just not sure which RAID level to implement. My goals are to build a file server that can live through a drive failure with no loss of data, and will be easy to rebuild. Ideally, in the event of a failure, I'd just like to remove the bad hard drive, install a new one, and be done with it. Is this possible? How many drives do I need to get this done: 2, 4, or 5? What size should they be? I know when you implement RAID, your usable drive space is N% of the total drive space depending on the RAID level."
Re:search the fscking google (Score:3, Insightful)
My choice (Score:5, Insightful)
RAID's great, but an rm -rf is still an rm -rf; hence the third drive
Software RAID? (Score:5, Insightful)
1) You don't need drives that are the same size.
I've done hardware RAID, had a drive fail 2 years down the road, and not been able to find an 18GB SCSI drive to re-insert into the array. That has the potential to jack your entire array. With software RAID, you buy a 36GB drive, partition it so that one partition fits your array, and off you go.
2) It's a personal file server, so speed is less important than cost (I'm guessing). With software RAID you can mix all sorts of wondrous things together: IDE drives from the basement, SCSI-320 drives you stole from work, and nearly everything in between. It's far more flexible, and has no associated controller cost.
3) It's easy as heck. You can configure it in Disk Druid/fdisk, and it works quite easily in any major distribution (I've done it in Slack, Debian, RH, Fedora and Mandrake).
The major downside is that you cannot (at least I don't know how to) hot-swap drives. But again, this is a personal file server. Spend your money on pizza and beer; screw the SCA hot-swap drives that are going to cost you an arm and a leg.
That's just my $0.02...flame away
Re:Just remember the RAID song (Score:5, Insightful)
RAID 0, you need a hero,
RAID 1, is equally fun,
but RAID 5 keeps you alive!
RAID 5 - better keep an extra drive
Or you'll be down until the replacement arrives
RAID 10 is better my friend
Work doesn't stop when the drive comes to an end
Re:Hardware (Score:3, Insightful)
"Good, Fast, Cheap: Pick any 2."
Re:search the fscking google (Score:2, Insightful)
Or how about starting with high-quality drives instead of dirt-cheap consumer drives with short lifespans and warranty lengths...
I have had ZERO problems with my server-quality SCSI drives, which still have 2 years left on their 5-year warranty.
I suggest looking at getting reliable drives before looking at a RAID solution.
Re:Software RAID? (Score:2, Insightful)
Re:Raid 1, 0+1, or 5.. (Score:3, Insightful)
humans fail more than drives? (Score:2, Insightful)
Re:RAID 1 (Score:3, Insightful)
Let's see... 2.5 TB ~ 600 DVDs
And you store all of those DVDs where? And you access them quickly how?
Re:Software raid (Score:4, Insightful)
consider other risks too (Score:3, Insightful)
However, what happens if your place has a fire, gets vandalized, or a burglar takes off with your server(s)?
ghost? (Score:2, Insightful)
Re:search the fscking google (Score:3, Insightful)
Because Google turns up 1,400,000 hits of mostly crap in 0.11 seconds. When you need advice, do you ask a librarian, or a group of trusted friends? By your logic, we should trust the company that wants to sell us RAID cards. I'd rather ask people that use RAID products, not sell them.
Re:search the fscking google (Score:5, Insightful)
No offense intended here either, but why is it that every time someone posts an "ask slashdot" question someone else feels compelled to complain (and occasionally get downright rude) about why the user didn't just "google it"?
Google will get you articles and advertisements, true, but most of the time what the questioner is really after is people's OPINIONS and EXPERIENCES.
If I post a question like "what's the best backup program you've used on Linux?" I'm looking for 1.5 million Slashdotters' EXPERIENCES with backup programs... a Google search will get me a list of programs and some reviews if I'm lucky, but that's no substitute for hearing from a bunch of people who've actually DONE or USED something.
Hearing from a few hundred or thousand responders is a better recommendation than a "C-NET" review any day!
RA *I* D (Score:3, Insightful)
Right now, the cheapest HDs per GB are 30GB drives at $3 each, i.e. $0.10/GB. The cheapest RAID controller card is $15 for 4 drives; a PCI PIII/1GHz server stacked with 24 drives gives 720GB (down to about 600GB with RAID redundancy) for about $400. Large-capacity drives (~160GB) run about $0.50/GB, so your $400 server gets you about the same storage, but no RAID. Add the faster seek times from spreading reads across more IDE buses rather than moving fewer drive heads, and the RAID promise delivers.
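To sanity-check the parent's arithmetic, here's a quick back-of-envelope sketch in Python. All the prices are the parent post's era-specific figures, not current ones:

```python
# Parent post's (assumed, era-specific) prices -- not current figures.
small_drive_gb, small_drive_cost = 30, 3.00   # cheapest $/GB drives
big_drive_gb, big_drive_cost = 160, 80.00     # ~$0.50/GB large drives
card_cost, drives_per_card = 15, 4            # cheap 4-port RAID card

n_drives = 24
raw_gb = n_drives * small_drive_gb               # raw capacity
cards = -(-n_drives // drives_per_card)          # ceiling division: cards needed
drive_and_card_cost = n_drives * small_drive_cost + cards * card_cost

print(f"raw capacity: {raw_gb} GB")                                # 720 GB
print(f"drives + controllers: ${drive_and_card_cost:.0f}")         # $162
print(f"big-drive cost/GB: ${big_drive_cost / big_drive_gb:.2f}")  # $0.50
```

The drives and cards come to well under the quoted $400; the remainder is the PC itself.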
Re:search the fscking google (Score:5, Insightful)
There are two kinds of people: those who have had hard drive failures, and those who will have hard drive failures. I don't care if Jesus H. fucking Christ himself blessed your hard drives.
"I suggest looking at getting reliable drives before looking at a RAID solution."
And if the poster is looking for the more-realtime-than-backup-restore reliability he indicated, I suggest he look at RAID BEFORE looking at drive quality.
The name of the game is redundancy. A RAID array of cheap drives (let's remember that it stands for Redundant Array of Inexpensive Disks) *is* more likely to have a single hard drive failure - but that's recoverable. However, it's far less likely to have multiple, simultaneous drive failures on the same day (unrecoverable) than your one, expensive, better-quality hard drive is to have a single failure - which is unrecoverable.
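To put rough numbers on that point, here's a back-of-envelope comparison in Python. The failure rates and rebuild window are made-up illustrative figures, not vendor specs:

```python
# Illustrative (made-up) numbers -- not vendor specs.
p_cheap = 0.05            # chance a cheap drive fails in a given year
p_good = 0.01             # chance a premium drive fails in a given year
rebuild_window = 3 / 365  # ~3 days to replace and rebuild, as a year fraction

# Single premium drive: any failure loses data.
p_single_loss = p_good

# Two cheap drives in RAID-1: data is lost only if the second drive
# also dies during the rebuild window after the first failure.
p_first_failure = 1 - (1 - p_cheap) ** 2
p_mirror_loss = p_first_failure * (p_cheap * rebuild_window)

print(f"single premium drive: {p_single_loss:.4%} yearly data-loss risk")
print(f"cheap RAID-1 mirror:  {p_mirror_loss:.4%} yearly data-loss risk")
```

Even with the cheap drives failing five times as often, the mirror's yearly data-loss risk comes out orders of magnitude lower, because both halves have to die in the same short window.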
Re:RAID 1 (Score:2, Insightful)
a1 - b1
a2 - b2
a3 - b3
aN - bN
In RAID 1+0, aN mirrors bN and we stripe downwards. The array fails if and only if both aX and bX fail, for any given X.
All else being equal, with N drives total, after one failure there is a 1/(N-1) chance that the next failed drive will bring the array down. For 2 drives, that chance is 100%; for 4 drives, 33%; for 6 drives, 20%; for 8 drives, 14%; and for 10 drives, 11%.
For 10 drives total, let's say two drives fail, a1 and b2. Now either b1 or a2 must fail to bring down the array - that's 2 out of 8 drives, 25% chance. If a3 dies next, we're at 3/7 drives that can down the array, 43% chance. If b4 goes next, a failure on any one of 4 out of 6 drives can bring the system down, 66% - now we've crossed your 50% threshold.
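Those percentages are easy to check in Python: after the first failure, exactly one of the remaining drives shares the degraded mirror pair, so the chance that the next failure kills the array is 1 over the number of drives left:

```python
# Chance that the *second* drive failure kills a RAID 1+0 array of
# mirrored pairs: only the surviving half of the degraded pair is fatal.
def second_failure_kill_chance(total_drives):
    # After one failure, 1 of the (total_drives - 1) remaining drives is fatal.
    return 1 / (total_drives - 1)

for n in (2, 4, 6, 8, 10):
    print(f"{n} drives: {second_failure_kill_chance(n):.0%}")
# 2 drives: 100%, 4: 33%, 6: 20%, 8: 14%, 10: 11%
```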
For optimal stability, 'a' and 'b' should be separate channels and a hot spare should be available on both channels.
The other benefit of RAID 1+0 is that once a drive is replaced, only one submirror needs resyncing, not the whole array. In RAID 5, the whole array must be resync'd and in RAID 0+1 half the array gets resync'd. This means that your hotspare becomes effective a lot quicker, and your time in degraded mode is a lot shorter.
Re:search the fscking google (Score:5, Insightful)
I'm glad he asked. I benefit from reading the discussion, including the various tangents. This gives me another opportunity to consider using RAID at home and benefit from some "war stories" folks might offer. My needs aren't exactly the same as his, but fortunately people never stick to the exact question asked, anyway. The free information people give out is invaluable, especially the stories of personal experiences and descriptions of people's personal setups at home.
Good God, you're dense... (Score:4, Insightful)
Therefore, if your data is important you won't just trust that an unlikely event won't happen - you'll assume that it will happen and make sure that it won't affect the integrity of your data.
Therefore you'll be using RAID and preferably regular backups whatever you do. This is what ensures your data integrity, not the reliability or otherwise of your drive.
After that, it's a case of weighing performance, the cost (in money, manpower and downtime) of replacing a broken drive, and the cost of setup against each other, and this is where it starts to make sense to use IDE drives for RAID:
For instance, say you've got a 5-drive IDE RAID array. Over the space of, say, five years you end up having to replace three of the drives - that's eight IDE drives you've had to buy.
You also do the same thing with SCSI drives, and luckily none of them break - that's 5 SCSI drives all in all.
Now, say the IDE drives cost $100 each compared to $500 for the SCSI drives. You've spent $800 in the IDE case compared with $2500 in the SCSI case. There was no difference in the safety of your data, but the SCSI setup cost three times as much.
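The same arithmetic, spelled out in Python (the $100/$500 prices are the parent's hypotheticals, not real quotes):

```python
# Parent's hypothetical scenario: 5-drive array, IDE suffers 3 failures
# over five years, SCSI suffers none.
ide_price, scsi_price = 100, 500
array_size, ide_replacements = 5, 3

ide_total = (array_size + ide_replacements) * ide_price  # 8 IDE drives bought
scsi_total = array_size * scsi_price                     # 5 SCSI drives, no failures

print(f"IDE total:  ${ide_total}")    # $800
print(f"SCSI total: ${scsi_total}")   # $2500
print(f"SCSI/IDE cost ratio: {scsi_total / ide_total:.1f}x")
```

With RAID absorbing the extra failures, the cheaper drives win on total cost by roughly 3x even after buying replacements.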
Therefore to choose SCSI, you'd *really* want to get that extra little bit of speed, which to be honest is more likely to be limited by the network to your server anyway...
So, to recap - assuming your data is valuable to you, the choice between SCSI and IDE has nothing to do with the disk reliability because you'll be relying on some other systems (RAID and backups) for your reliability anyway.
Rsync every night (Score:3, Insightful)
Re:RAID 5 or RAID 10 (Score:3, Insightful)
Give me a break; your argument is worthless when very reasonable hard drives cost around $100 each (200GB drives for $101 at Pricewatch). To even consider RAID-5 you'd have to be looking at a minimum of 3 drives, probably more. 3 drives comes to $300.
If this is getting too rich for you as your post seems to claim, you shouldn't be even talking about RAID-5. Not because of the price of the controller card, but because of the cost of the hard drives alone.
"To store my MP3's, pictures, docs, etc its not exactly cheap"
Get a fucking CD-R burner then!
Re:RAID 1 (Score:1, Insightful)
me thinks the indiscriminate downloading and running of previously-unknown software(s) from "Russian (or some other foreign site[s])" has a lot more chance of being part of the original problem ("drive...totally gone") than "WinXP Pro just decided to wipe it clean."
oh, that's right, I'm reading
Re:Software raid (Score:4, Insightful)
If your HW RAID controller dies, you have to get another one of the same model and hope that you can re-import your config w/o losing all your data. If you're running SW RAID and your SCSI/IDE controller dies, you can replace it w/ whatever is cheap/available at the time. As long as the failure itself didn't bork your data, you shouldn't have to do much, if anything, to see your data again.
If you can afford to get the top of the line SCSI RAID controller from a good vendor it's probably the better option, but if cost is an issue, IDE SW RAID is the only way to go.
Re:Software raid (Score:4, Insightful)
The real advantage of software over hardware RAID is that you don't need to keep a spare RAID card around. With hardware RAID, when your RAID card fails you'll need exactly the same make & model card to read your data.
With Linux software RAID, you can read the drive set on any system with the raid modules.
Re:Software raid (Score:2, Insightful)
Re:Software raid (Score:3, Insightful)
First off, 3Ware cards cannot be used as "dumb" IDE controllers - they only support logical drives; creating single drives is not possible, nor is leaving drives unassigned.
Second, software RAID will always suck for one big reason: a drive fails, your system locks up.
I have not seen any software based controller (promise, Silicon Image, High Point) or complete software based solution (Windows 2000/2k3 server's RAID, or Linux's md raid) on standard IDE controllers stay alive after a drive fails. It always takes the box down with it.
When you buy a hardware based RAID solution, the controller handles the drive failure gracefully, which keeps the machine running. "Dumb" IDE controllers don't know they're raided (they are dumb after all), so when a drive fails, they freak out.
3Ware makes a TRUE hardware-based RAID solution that is intelligent enough to email you when a drive fails. Their 2-channel cards (SATA and PATA) are roughly $100, and their 4-channel cards (RAID-5-capable) are $250 and $350. It's well worth the money.
I've not used the LSI Megaraid SATA controller yet (I plan to); I've had good luck with their cards for SCSI RAID, and they carry a slightly cheaper price tag than the 3Ware cards.
No, I do not work for 3Ware - I just think suggesting software RAID to anyone is a bad idea. I've seen people lose data with Promise controllers, which are nothing more than glorified IDE controllers with software doing the RAID functionality. Software RAID is BAD.
Re:Software raid (Score:3, Insightful)
That page does a good job of explaining why SCSI is "better" in terms of MTBF, seek time, etc. However, I can't help but feel that those numbers are kind of missing the original poster's requirements, which were for a *personal* fileserver. The MTBF for IDE may be lower, but in a RAID-1 or higher setup, this isn't really an issue. Realistically, multiple drives aren't going to fail *at once*.
For a home fileserver, IDE is more than fast enough, unless you're dealing with many gigabytes of data (video editing?). There's also the noise/heat/power issue of having multiple 10,000 or 15,000RPM drives spinning around the clock. A couple of 7,200RPM IDE drives in RAID-1 are quiet enough to run in my bedroom while I sleep; can the same be said for fast SCSI drives?
Don't get me wrong: SCSI is wonderful when screaming-fast performance and/or lots of concurrent users are a requirement, and things like price, noise, and power aren't a factor. Nobody disputes that. But for small office/home use, I don't always think it's the best choice.
Re:Rsync every night (Score:2, Insightful)
Re:Software raid (Score:1, Insightful)