Which RAID for a Personal Fileserver?
Dredd2Kad asks: "I'm tired of HD failures. I've suffered through a few of them. Even with backups, they are still a pain to recover from. I've got all fairly inexpensive but reliable hardware picked out, but I'm just not sure which RAID level to implement. My goals are to build a file server that can live through a drive failure with no loss of data, and will be easy to rebuild. Ideally, in the event of a failure, I'd just like to remove the bad hard drive, install a new one, and be done with it. Is this possible? How many drives do I need to get this done: 2, 4, or 5? What size should they be? I know that when you implement RAID, your usable drive space is N% of the total drive space, depending on the RAID level."
RAID 1.... (Score:2, Interesting)
RAID complexity (Score:2, Interesting)
Raid 5 (Score:3, Interesting)
Re:RAID 1 (Score:5, Interesting)
The benefits are that you get the same protection as with RAID 1, but without the write-speed penalty, and all without needing special hardware or spare CPU power for expensive parity calculations.
With a 4 drive RAID 1+0, you'll get read performance of 2x-4x a single drive, while writes will be from 1x-2x. In theory, that is. In reality, if using a RAID PCI card or motherboard solution hooked to the south bridge, you'll most likely max out the read speed.
Anyhow, it's a very cheap solution that doesn't tax your CPU too much even if done through software (like with a Highpoint controller), and it does give you peace of mind.
The worst downside is that you will have to take the system down to change a drive (correct me if I'm wrong, but I've never seen a hot-swappable RAID 1+0 solution), and the performance before you do that will take a substantial hit.
RAID 4/5 is nice because it doesn't waste a lot of drive space, but that comes at the price of very slow writes and very high CPU use, unless you also get a hardware controller with an onboard CPU.
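For the curious, the "expensive" math RAID 4/5 does is just XOR across the data blocks. A minimal sketch with three made-up byte values:

```shell
# RAID 4/5 parity sketch with three hypothetical data bytes.
# Parity is the XOR of all data blocks; any one lost block can be
# rebuilt by XORing the surviving blocks with the parity.
d1=170 d2=204 d3=240
parity=$(( d1 ^ d2 ^ d3 ))
rebuilt_d2=$(( d1 ^ d3 ^ parity ))   # pretend the d2 disk died
echo "parity=$parity rebuilt=$rebuilt_d2"
```

The cost isn't the XOR itself so much as having to do it (plus a read-modify-write of the parity block) on every write, which is why write speed suffers without a dedicated controller.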
Regards,
--
*Art
RAID 5 (Score:2, Interesting)
One great thing about using Linux on the fileserver is that you can use software RAID. As the name implies, this requires no special controller cards (which is nice, since RAID 5 controllers typically run $200+). You also have the option of setting spare drives, which allows the array to begin rebuilding immediately in the event that one drive fails - the spare takes its place. Setup is easy - create a RAID, select what type you want, and then add drives to it and format.
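With mdadm, the setup described above is roughly the following (the device names are hypothetical; substitute your own partitions, which should be of type "Linux raid autodetect"):

```shell
# Sketch: create a 3-disk software RAID 5 with one hot spare,
# then put an ext3 filesystem on it.
mdadm --create /dev/md0 --level=5 --raid-devices=3 \
      --spare-devices=1 /dev/hde1 /dev/hdf1 /dev/hdg1 /dev/hdh1
mke2fs -j /dev/md0          # -j = ext3 journal
cat /proc/mdstat            # watch the initial resync progress
```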
I'm using a RAID 5 setup with 5 x 250GB drives, giving me 4 x 250GB = approx. 1TB of storage space. As has been mentioned, RAID 5 allows you to recover if one drive fails. The odds of more than one drive failing before you have a chance to rebuild the array are essentially the odds of your box being destroyed (tornado, fire, etc.).
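The usable-space rule the question asked about is simple for RAID 5: one drive's worth of capacity goes to parity. A quick sanity check:

```shell
# RAID 5 usable capacity = (N - 1) * drive_size;
# one drive's worth of space holds the distributed parity.
drives=5 size_gb=250
usable=$(( (drives - 1) * size_gb ))
echo "${usable}GB usable"   # RAID 1 on the same disks would give only (N/2)*size
```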
Also previously mentioned, never attach more than one drive per IDE bus (assuming you're using IDE like I am). Doing so is irresponsible from a bandwidth standpoint as well as from a reliability standpoint, since a drive crash typically brings down the bus, and all drives on the bus with it (and as we all know by now, losing >1 drive is not survivable). Buy some cheap PCI IDE controllers, keeping in mind to ensure that they're dual channel if you plan on connecting >1 drive per controller.
Take some time and read this [tldp.org] - it will tell you everything you need to know.
Re:RAID 1 (Score:2, Interesting)
The solution is pretty much RAID 1 or RAID 5. Besides, on most RAID controllers RAID 1 gives faster read throughput than a single drive, though writing does take a bit of a performance hit. RAID 5 is expensive, while almost any RAID controller can do decent RAID 1.
0+1 (Score:2, Interesting)
In 0+1 it's all just data, baby! Lose a disk, just break the mirror, and you'll still get good speed until you can fix the failed disk.
Linux software mirroring (Score:2, Interesting)
I've found the linux kernel's built-in RAID [ibm.com] capabilities more than adequate for most of my fault tolerance needs. The best part is I can move the drives to pretty much any system - a new motherboard, whatever - without having to worry about kernel support or finding that IDE driver. If a drive fails I can boot its mirror up in any system and be in great shape. I also use the utility mdadm [unsw.edu.au] to email me if one of the drives fails. For some linux firewall systems I've built, I use old crappy 6GB drives, but mirror them so there's no risk if one of them goes out. Looking at my basement firewall now and...
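The email alerting mentioned above is mdadm's monitor mode. A sketch (the mail address and polling interval are just examples):

```shell
# Run mdadm as a daemon that emails when an array degrades,
# checking the arrays every 300 seconds.
mdadm --monitor --daemonise --mail=root@localhost --delay=300 /dev/md0
```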
everything is cool!
My strategy (Score:2, Interesting)
Basically, I'm more worried about keeping what's in
If zero-downtime is a critical factor for you, you probably want to RAID-1 the whole disk (just remember to copy the MBR, too!)
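Copying the MBR over can be done with dd; a sketch assuming hypothetical source disk /dev/hda and mirror disk /dev/hdc:

```shell
# The MBR is the first 512 bytes of the disk: 446 bytes of boot
# code plus the 64-byte partition table and 2-byte signature.
dd if=/dev/hda of=/dev/hdc bs=512 count=1
```

If the two disks are partitioned differently, copy only the boot code (bs=446 count=1) so you don't clobber the mirror's partition table.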
Re:RAID 1 (Score:5, Interesting)
I was even thinking of buying the app until I surfed to the company's site and found it was >$2K US. Screw that. If it happens again, I may not recover my stuff.
I didn't have anything critical on there, but it would have been very time consuming to re-rip my CDs.
jason
Re:RAID 1 (Score:3, Interesting)
Re:RAID 1 (Score:5, Interesting)
The only semi-common RAID I know of that could handle two drives failing at the same time would be RAID 10, a mirrored set of striped drives, and then only if just one side of the mirror died.
For your diligence bit: I've actually worked with a machine that had a drive fail in its RAID 5 set, and then, as the hot spare came online and started rebuilding the data needed to keep the R in RAID, another drive died. The whole set was then completely unusable, and somebody probably would have been fired if there weren't a set of recent backups around. As it was, a couple of people got to work about 12 more hours on top of their 8 for the day to make sure the machine was running again by the next day.
Thus my moral, RAID isn't a replacement for backups, as there still can be failures. RAID will reduce the frequency with which you need said backups, hopefully to never, but it can still fail. Nothing replaces a good backup.
Oh, and another good reason for RAID 5 instead of 1: there should be a bit of a speedup, since there are multiple disks involved - assuming, of course, your RAID card can handle all the XORs.
Re:RAID 1 (Score:5, Interesting)
Amen. I have vivid memories of typing rm -rf * in the wrong directory (and that was WITH pwd in my prompt). It took an entire week to duplicate the work lost.
Combining the rm command and lack of sleep is like combining a loaded gun and your forehead. You can only do it so often before you destroy something valuable.
Don't put your faith in RAID (Score:3, Interesting)
Re:RAID 1 (Score:5, Interesting)
If you're having two drives fail before you can get one replaced you need better hardware or a better failure notification system, or both.
And, speaking from personal experience, and from both theoretical and real-world benchmark tests, I can say quite firmly that the software RAID 1+0 on my dual P3 1GHz fileserver does give a 'speed boost'. Not the theoretical maximum of 4x read / 2x write, obviously, but certainly a noticeable speed boost.
And finally, you complain about a write performance hit under RAID 1? Have you ever even used or benchmarked a RAID 5 system? Computing parity information, unless you have a *very* expensive RAID 5 controller, puts RAID 5 well behind every other type of RAID when it comes to writing speed.
Like seriously man, have you ever even experimented with different RAID setups, or are you just extrapolating these ideas from something you read on the web?
Re:Software raid (Score:3, Interesting)
Performance with an IDE RAID controller is pathetic; you can't get much more than 22MB/s. I can hit 68MB/s reading and 31MB/s writing on one system with four 7200 RPM, 8MB-cache IDE drives. (This system has 2 extra PCI IDE cards in it, so each drive is a master with no slave.)
If you want to go SCSI, then you have software and any IDE RAID card beat by a long shot. But "personal fileserver" usually means SCSI RAID is too expensive.
Re:Software raid (Score:5, Interesting)
Hardware controllers with battery-backed RAM (note: not all controllers have this) will have an edge over software solutions on ALL writes - no matter which RAID level you use.
Don't even bother trying to do RAID 5 in software
SW RAID is usually a lot faster than HW RAID solutions, once you factor out the battery-backed RAM part. Any HW RAID controller without battery-backed memory will lose big-time to SW RAID on even moderately fast CPUs (like 500MHz P-IIIs), especially on RAID-5, which is compute intensive, and even more so on RAID-6, which is also compute intensive but not purely XOR based.
Modern HW RAID controllers have reasonably fast CPUs with XOR accelerators built in - therefore they can do RAID-5 as fast as the pure SW solution. But this is not the case with older controllers.
I know of people who use 3ware cards for large RAID-5 servers, but only use the 3ware cards as "dumb" IDE controllers, and leave the RAID-5 handling to SW-RAID. The reason? Their benchmarks indicate that this is significantly faster.
And when you think about it, it makes sense. Nobody puts a GHz processor on a RAID controller. Even a slow-by-today's-standards P-III is able to XOR more than a gigabyte of data per second - much, much more than anything you put through most file servers out there.
So, the "HW RAID is faster than SW RAID" is true in one scenario only; when you have write-intensive workloads and a HW RAID controller with battery backed cache.
In *all* other cases, SW RAID will be a win, performance wise.
For a personal file server, I wouldn't hesitate to run RAID-5 in plain software. It's as fast or faster than any HW RAID controller in the sub-$3K price range, it's reliable, and the flexibility beats the heck out of any HW based solution out there (mixing IDE/SCSI, allowing a cryptographic layer between the RAID layer and the physical disks, etc. etc...)
Re:RAID 1 (Score:5, Interesting)
You can resolve this issue with high-capacity, portable storage. I keep all my most critical stuff (software, licenses, photos, pr0n, etc.) on my 40GB portable drive. Forget those keychain things. The FireLite SmartDisk [smartdisk.com] is a USB 2.0, aluminum-encased laptop drive. It draws power from USB - it even worked on my old USB 1.1 system. They provide a special power cable, in case your old USB ports aren't pushing enough power. I toss the thing in my backpack every day and lug it all over - it has yet to show signs of weakness.
I totally agree with your configuration. For my Linux server, I've been using Linux (RH7.2) Software RAID-1 mirrored for ~3 years without a single issue.
Building a home fileserver (Score:5, Interesting)
I basically built a box to do nothing other than serve files. I put together a nice, simple old PC (550MHz with 256 megs of RAM) and mounted it in an old rackmount case I had lying around.
It's running debian with 2.4.26.
I'm running software RAID and installed two dual-channel IDE cards.
I threw in 6 Seagate 120 gig drives (the ones with the 8 meg cache) and ran RAID 5 across five of them, with the sixth as a hot spare to rebuild the RAID should a drive fail. Each drive has its own IDE channel, to prevent a channel failure from screwing my RAID.
I'm using ext3 as the filesystem, and I wrote my own little RAID monitoring script that SMSes me should a drive fail, and alarms locally.
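A monitoring script along those lines can be as simple as grepping /proc/mdstat, where the md driver marks failed members with "(F)". A sketch, tested here against a sample line rather than the live file (the real script would read /proc/mdstat directly):

```shell
# Hypothetical degraded-array line as it appears in /proc/mdstat.
mdstat='md0 : active raid5 hdh1[3] hdg1[2](F) hdf1[1] hde1[0]'
if echo "$mdstat" | grep -q '(F)'; then
    status="DRIVE FAILED"   # real script: send the SMS, sound the local alarm
else
    status="ok"
fi
echo "$status"
```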
This setup has been rock steady and gives me 460 (ish) gig of usable space after formatting.
For added peace of mind, the machine is plugged into a UPS that is connected to the machine via serial. If the UPS kicks in, it shuts the machine down properly after sending an alarm SMS (the DSL and switch are also on the UPS). (Yes, I'm a paranoid freak.)
This makes a perfectly good media and file server and I've had no problem with it in the few months I've had it.
I also recommend setting the spin-down time on the drives manually with hdparm. It was getting awfully warm in the box till I turned that on on the Seagates. Modern drives run rather hot.
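The hdparm flag for that is -S; the timeout encoding is odd (values from 1 to 240 are multiples of 5 seconds). The device name below is hypothetical:

```shell
# Spin the drive down after 20 minutes of idle time (240 * 5s = 1200s).
hdparm -S 240 /dev/hde
```

One caveat worth weighing (see the reply further down the thread): frequent spin-up/spin-down cycles are hard on drives, so don't set this too aggressively.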
I have the whole thing mounted via SMB on my other boxes around the house, and it's fast (gig ethernet), reliable, and easy.
Though do remember that no amount of RAIDing will save you if you lose 2 drives through some horrible freak of badness, and no RAID level is going to protect you from a house fire. Hence mine also rsyncs all my absolutely vital files (scanned family photos and docs) offsite to a file storage site every night at 2am, so as not to chew my bandwidth during usable times. Don't forget: the only truly secure data is that which is backed up... and offsite... twice.
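The nightly offsite sync is a one-line cron job; a sketch with a hypothetical remote host and paths:

```shell
# /etc/crontab entry: rsync the vital stuff offsite at 02:00 every night.
# "backup.example.com" and both paths are placeholders.
0 2 * * * root rsync -az --delete /data/vital/ backup.example.com:/backups/vital/
```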
RAID 5 can be appropriate (Score:3, Interesting)
If you are willing to fork out about $1100 for storage, you can create a really nice array. I'd recommend a 3Ware 4-port 9000-series controller like the 9500S4LP (around $330), or a RaidCore card reviewed [tomshardware.com] recently over at tomshardware. Add in 4 $180 250GB SATA drives and you have a nice 750GB array for around $1100. The Promise FastTrack SX6000 is quite economical and supports more drives, if you don't mind its bad performance and crappy Linux support. 8-port cards are also pretty economical, but it's hard to put that many drives in most cases. You have to design a system carefully in order to create arrays much bigger than 4 drives.
Once you have your array, it's a good idea to use Linux or something else with a reliable journaling filesystem on top of it. Once you have a RAID array, your filesystem becomes a much more important point of failure. Using a reliable one will do a lot towards reducing your likelihood of data loss.
I also use a separate drive with a separate filesystem for backup. I have a script that manages it for me (ignoring certain directories), which runs every night. A RAID array is pretty reliable and a big step up from single drives, so it's a good halfway point, but I wasn't comfortable with it, so I went further. How far you go is up to you.
I went with RAID5 (Score:2, Interesting)
Now, when I need to store a few hundred more hours of video, I can just throw 2 more Maxline Plus-II drives at it to get up to ~1.2TB--leaving final cost at under $2/GB, including the computer case, power supply and hotswap bays.
provantage.com has the 4 port 3ware 9000 card for about $320, I think. -se
Re:Software raid (Score:3, Interesting)
Are that many people really in need of huge read throughput, but at the same time happy to accept high seek times? Is this really the best way to get performance out of your system? 3.6 ms seek time seems bad enough to me, but I can't imagine having my root partition on your average IDE drive's 8.5-9.5 ms seek time. I mean, really - you can get a 9 gig SCSI drive for your root partition, brand new, for 30 bucks (inc. shipping) that has a seek time of 5 ms. Why would anyone use IDE for a root partition - but then try and make it RAID for performance?
It's something that really has me baffled. Certainly, seek time isn't important on, say, listening to mp3's or watching videos - your bulk data. But when loading libraries to run programs, compiling, starting X, etc, it makes a *really* big difference. And to think that many people out there have their *swap* on IDE drives also...
Re:RAID 1 (Score:5, Interesting)
Ghost (Score:2, Interesting)
The nice thing about ghosting is that I get to use a cheap 120GB 5400 rpm drive and save many compressed images from the drive I'm backing up... I'm backing up a 36GB(?) 10,000 rpm WD Raptor. It only takes 15 minutes to restart, boot from my Ghost CD, save the image, and reboot into Windows XP.
I didn't like the RAID solutions I was looking at; this not only works just as well for my needs, but I get to keep the removable HDD in a safe (fireproof, of course) at the other end of the house, just in case...
Just a thought...
Re:Software raid (Score:3, Interesting)
What do you need high read throughput (not write - RAID doesn't give you that) for? Are you serving 2 gig files over the web? If you're not doing things like that, such a configuration is borderline pointless.
Take a look at
Do you see what I'm saying here? Using IDE as a root partition is dumb, but making it RAID is dumber.
Now, for slow bulk storage, nothing beats IDE.
Re:Software raid (Score:3, Interesting)
http://bbcr.uwaterloo.ca/~brecht/courses/856/read
Re:Software raid (Score:3, Interesting)
Hm, I'd rather invest those $50 into RAM (you can easily get an additional 512MB for that). It won't speed up boot time (but in my case, I don't care whether booting once a day takes 1 or 2 minutes), but after first use, everything is really fast (well, at least under Linux).
I could even preload some stuff into a RAM disk and prevent seek times this way (via dd), but as I said, first startup isn't that important to me.
I'm also not sure why you're speaking of fast swap access several times. My swap partition hasn't gotten much use for the last 5 years (even when I was still at 386MB)[1]. If you aren't into video editing or such, today's average 512MB or so should be plenty.
Another possibility for fast access times without spending too much, which I have done recently on a database server, is using average disks and putting software RAID on them (I needed a lot of space, and the fast disks of the needed size cost several times the price of the lesser disks).
This worked so well with SCSI disks that I intend to try it with my home system on the next upgrade. Though I expect less performance due to IDE constraints.
[1] It gets used whenever Linux decides that it's a good idea to swap unused parts out in order to increase the memory available for the filesystem cache - which is why I still have a swap.
Re:Building a home fileserver (Score:3, Interesting)
NO NO NO NO NO. Repeat: NO. Don't do that. Really, don't. Drives, particularly the high-RPM types you're likely to find in servers, do not like to be cycled a lot; it's the single most stressful thing you can do to one. You will dramatically shorten your drives' lifespans if you do this.
Re:Software raid (Score:3, Interesting)
It cost quite a bit when I put it together, but it's been well worth it, seeing as how it has taken 5 years for the desktop-level stuff to catch up performance-wise. When I do upgrade, I will probably go with an Escalade driving 74GB Raptors; since they have command queueing, they're beating all but the most high-end SCSI drives out there now.