
Experiences w/ Software RAID 5 Under Linux?

MagnusDredd asks: "I am trying to build a large home drive array on the cheap. I have 8 Maxtor 250GB hard drives that I got at Fry's Electronics for $120 apiece. I have an old 500MHz machine that I can re-purpose to sit in the corner and serve files. I plan on running Slackware on the machine; there will be no X11, or much else other than SMB, NFS, etc. I have worked with hardware arrays, but have no experience with software RAID. Since I am about to trust a bunch of files to this array (not only mine, but I'm storing files for friends as well), I am concerned with reliability. How stable is the current RAID 5 support in Linux? How hard is it to rebuild an array? How well does the hot spare work? Will it rebuild using the spare automatically if it detects a drive has failed?"
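For reference, the software side of such a setup looks roughly like the sketch below, assuming the Linux md tools (mdadm); device names, filesystem, and partition layout are placeholders, not a recommendation:

    # Seven active members plus one hot spare out of the eight drives (device names are placeholders)
    mdadm --create /dev/md0 --level=5 --raid-devices=7 --spare-devices=1 \
        /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1
    mke2fs -j /dev/md0          # ext3 on top of the array
    mdadm --detail /dev/md0     # shows array state, the spare, and any rebuild in progress
    # md rebuilds onto the spare automatically when a member fails; to swap out the dead drive later:
    mdadm /dev/md0 --fail /dev/sdc1 --remove /dev/sdc1
    mdadm /dev/md0 --add /dev/sdi1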
This discussion has been archived. No new comments can be posted.

  • by GigsVT ( 208848 ) on Saturday October 30, 2004 @06:53PM (#10675244) Journal
    Have you tried the 9500 series? It looks much nicer than their older offerings.

    We've run several 7810s and 7850s in the past, totalling quite a few terabytes. All in all they're not too bad, but the cards do have a habit of dropping drives that don't appear to have any real problems (they often recertify with the manufacturer's utility with no errors).

    If you go 3ware though, get the hot swap drive cages from 3ware. They are expensive, but they make life much easier.
  • by Anonymous Coward on Saturday October 30, 2004 @06:56PM (#10675263)
    So true. In many cases HW RAID doesn't offer any advantage over software RAID, and it's just one more part that can break and cost $$$ and time to replace.

    Moderators, mod this up!

  • by mortonda ( 5175 ) on Saturday October 30, 2004 @07:00PM (#10675297)
    The idea is that in order to write data to any sector on one of the drives, the sectors from six of the other drives need to be read, all XOR'd together, and then the result written to the remaining drive.

    Your logic eludes me. The blocks do not need to be read, as we are in the process of writing. We already have the data, because we are writing, so why would we re-read the data?

    Furthermore, block sizes default to 4k, though you could go to 8k or 32k block size. At any rate, you don't need a gig of RAM to handle this.

    Finally, XOR is not that expensive an operation, and a 500MHz CPU is going to be able to handle it faster than any but the most expensive controller cards.

    So unless you are actually a RAID kernel developer, I don't buy your story.
  • by GigsVT ( 208848 ) on Saturday October 30, 2004 @07:03PM (#10675320) Journal
    I just posted in another thread about 3ware and mysterious drops of seemingly good drives. Even with the ultra-paranoid drive dropping, we have never lost data on 3ware.

    Other than that, 3ware has been decent for us. We are about to put into service a new 9500 series 12 port SATA card.

    I wish I could say our ACNC SATA-to-SCSI RAIDs have been as reliable. We have three ACNC units; two of them went weird after we did a firmware upgrade that tech support told us to do, and we lost the arrays.

    We called tech support and they said, "Oh, we forgot to tell you: when you upgrade from the version you are on, you will lose your arrays."
  • by fleabag ( 445654 ) on Saturday October 30, 2004 @07:04PM (#10675322)
    I would support the sentiment.

    Back when I was using a PII-450 as a file server, I tried out software RAID on 3 x 80 GB IDE disks. It mostly worked fine - except when it didn't. Generally problems happened when the box was under heavy load - one of the disks would be marked bad, and a painful rebuild would ensue. Once two disks were marked bad - I followed the terrifying instructions in the "RAID How-To", and got all my data back. That was the last straw for me... I decided that I didn't have time to watch rebuilds all night. Note that this may have been caused by my crummy Promise TX-100 cards; I never bothered to investigate.

    I got an Adaptec 2400 IDE controller, and it hasn't blinked for two years. One drive failure, and the swap in worked fine.

    If the data is important to you - go hardware. If you want to learn something, and have the time to play, then software is OK. Just run frequent backups! If the data is really important to you, buy two identical controllers, and keep one in the box for when the other craps out. Having a perfect raidset, with no controller to read it, would be annoying.

  • Re:Works great (Score:5, Interesting)

    by k.ellsworth ( 692902 ) on Saturday October 30, 2004 @07:08PM (#10675350)
    Normally a drive crash announces itself some time in advance... use the smartctl tool.
    That tool checks the SMART info on the disk for possible failures.

    I do a lot of software RAIDs, and with smartctl no drive crash has ever surprised me. I always had time to get a spare disk and replace it in the array before something unfunny happened.

    Do a smartctl -t short /dev/hda every week and a -t long every month or so...

    Read the smartmontools home page:
    http://smartmontools.sourceforge.net/

    An example of a failing disk:
    http://smartmontools.sourceforge.net/examples/MAXTOR-10.txt
    An example of the same type of disk with no errors:
    http://smartmontools.sourceforge.net/examples/MAXTOR-0.txt

    Software RAID works perfectly on Linux... and combined with LVM, things get even better.
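    A minimal sketch of that schedule as a crontab, assuming smartmontools is installed and the array members are /dev/hda through /dev/hdd (illustrative device names; adjust to your setup):

        # /etc/crontab fragment: short self-test weekly, long self-test monthly
        0 2 * * 0  root  for d in /dev/hda /dev/hdb /dev/hdc /dev/hdd; do smartctl -t short $d; done
        0 3 1 * *  root  for d in /dev/hda /dev/hdb /dev/hdc /dev/hdd; do smartctl -t long $d; done
        # Check the results and overall health afterwards:
        # smartctl -H /dev/hda ; smartctl -l selftest /dev/hda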
  • RAID5 (Score:2, Interesting)

    by mikewelter ( 526625 ) on Saturday October 30, 2004 @07:09PM (#10675361)
    What will you connect eight drives to? Four PCI ATA controllers? I have eight 200GB drives on my data server using a 3Ware RAID controller, and it has worked wonderfully for 18+ months. I have had a drive fail (due to insufficient cooling), and the system didn't even hiccup. I have a software RAID system at a client's location. Whenever there is a power failure, the system comes back up nicely. However, because of the abnormal shutdown, the software RAID tries to recover one of the disks. This absolutely eats the processor for 16 hours. 98-100% utilization. Fiddling the /proc parameters is no help. I think this is a bug--what could it be doing for 16 hours?
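    For what it's worth, the /proc knobs in question are the md resync speed limits; a sketch (values are arbitrary, and as noted above they may not help much):

        # Current resync throttle, in KB/s per device
        cat /proc/sys/dev/raid/speed_limit_min /proc/sys/dev/raid/speed_limit_max
        # Cap the resync so it doesn't eat the box, at the cost of a longer rebuild
        echo 5000 > /proc/sys/dev/raid/speed_limit_max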
  • by GigsVT ( 208848 ) on Saturday October 30, 2004 @07:11PM (#10675373) Journal
    I agree, the parent post is just a troll to see how gullible the moderators are. Apparently he proved his point. :)
  • by Anonymous Coward on Saturday October 30, 2004 @07:17PM (#10675412)

    Your logic eludes me. The blocks do not need to be read, as we are in the process of writing. We already have the data, because we are writing, so why would we re-read the data?

    That would depend on the nature of the write. If you're writing the initial data, it's unlikely that you'll require reading. However, when you go to update the data, you may have to perform reads in order to calculate the parity required for the update.

    Software RAID 5 is very reliable but does suffer a performance hit. Not because of the XOR computations like many here are suggesting. It's because each logical write needs to be translated into physical reads/writes...which consume time.

    The beauty of software RAID, at least software RAID implementations such as Veritas, is that it allows you to spread the RAID across a number of controllers.

    Listen to this guy...he knows more than the others who consider the XOR computation the slow link in software RAID 5. It's not.
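    To make the read-modify-write point concrete, here is the parity arithmetic for a single small write, with made-up byte values (an illustration of the algorithm, not the md driver's code):

        # A small write needs two reads (old data, old parity) and two writes (new data, new parity):
        # new_parity = old_parity XOR old_data XOR new_data
        old_data=0x5A; new_data=0x3C; old_parity=0xA7
        new_parity=$(( old_parity ^ old_data ^ new_data ))
        printf 'new parity: 0x%02X\n' "$new_parity"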
  • by kcbrown ( 7426 ) <slashdot@sysexperts.com> on Saturday October 30, 2004 @07:42PM (#10675522)
    In general (not replying to your otherwise quite correct post, please don't feel browbeaten) I really wonder a) why anyone would need the additional uptime in an in-home setting

    The uptime isn't the reason for using RAID at home. Data integrity is.

    With RAID, I don't lose all my data (or, if I take regular backups, all the data since the last backup) in the event that a drive fails, as long as I replace the failed drive before a second one fails. A good RAID-5 setup will give me better read speeds than a single disk, at the cost of some write speed. Since reads are generally much more common than writes on a home system, this is an overall win.

    However, these days disks are big enough that a RAID 1 configuration is reasonable, and that's what I have now. I get better write speeds and similar read speeds.

    In any case, backups are no substitute for a good RAID setup. In fact, I would argue that the home situation is much more appropriate for RAID, because there simply is no good backup solution for home use -- hard disks are orders of magnitude larger than any reasonably-priced backup medium you can find. Only businesses can afford the kind of backup solutions that are capable of backing up the amount of data that's typical on a home system today without burning through a bunch of backup media.

    and b) what the point of a generic IDE RAID5 is anyway. When one drive dies, the system keeps running with the hot spare. On a commercial array (or using hot-pluggable storage like FireWire) you can pull out the bad drive, put in a new one, and the system rebuilds it as the hot spare, all without any loss of service. But with regular ATA (and I guess SATA, although I'm not so sure) you can't hotswap, so you have to power down the array to swap in the new drive - at which point the reliability you got from RAID5 is gone. Hmm, well, I suppose it's less downtime than you'd have restoring from backups, but it's questionable whether that's worth the ongoing performance hit the RAID5 (even a hardware one) would cause.

    Downtime isn't an issue for home use anyway. But loss of data is. That's why RAID solutions without hotswap capability are perfectly adequate for home use.

  • by Hrunting ( 2191 ) on Saturday October 30, 2004 @07:50PM (#10675553) Homepage
    Software RAID is fine for simple configurations, but if you want to "do it right" - especially considering that you just dropped about a kilobuck on HDDs - go hardware. A good, reasonably priced true hardware RAID controller that will fit the bill for you is the 3Ware Escalade 7506-8. It has 8 IDE ports, one for each drive - you don't want to run two RAID drives in master/slave mode off of a single IDE port; it will play hell with your I/O performance. It's true hardware RAID, so you don't have to worry about big CPU overhead or about being able to boot with a failed drive (a major disadvantage of software RAID if your boot partition is on a RAID volume, certain RAID-1 configurations excepted). You can buy them for under $450. provantage.com price [provantage.com] is $423.48 (I have no relationship with them other than I've noticed that their prices tend to be decent).

    Hardware RAID5 is fine if your sole goal is reliability. If you need even an iota of performance, then go with software RAID5. The 3wares have especially abysmal RAID5 performance, particularly the older series like the 75xx and 85xx cards. 3ware has admitted it, and it's something targeted for fixing in the 95xx series (I haven't gotten my hands on those yet, so I don't know).

    As for software RAID reliability, I find that Linux's software RAID is much more forgiving than even the most resilient of hardware RAIDs. I've lost 4 drives out of a 12-drive system at the same time, and Linux let me piece the RAID back together; I lost nothing. Was the machine down? Yes. Did I lose data? No. Compare that with a 3ware hardware RAID system where I lost 2 drives. Even though I probably could have salvaged 99% of the data off that array, the 3ware just would not let me work with the failed array.

    Also, on any reasonably modern system, the software RAID will be faster. You just have a much faster processor to do the RAID processing for you. The added overhead of the RAID5 processing is nothing compared to a 1-2GHz processor.
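    A rough sketch of the "piece it back together" recovery described above, assuming an md array and illustrative device names (forcing an assembly can lose recent writes, so treat it as a last resort):

        # Stop the broken array, then force-assemble it from the surviving members
        mdadm --stop /dev/md0
        mdadm --assemble --force /dev/md0 /dev/hde1 /dev/hdf1 /dev/hdg1 /dev/hdh1
        # Add a replacement disk and watch the rebuild
        mdadm /dev/md0 --add /dev/hdi1
        cat /proc/mdstat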
  • by pensivepuppy ( 566965 ) on Saturday October 30, 2004 @07:53PM (#10675574)
    I use Linux software RAID5 on seven 200GB disks with LVM on top, and have had good results. It's usually going to be much more flexible than hardware RAID. If I run out of IDE buses, I can use FireWire, or SCSI, or SATA, or whatever I want. You can also use different sizes of disks (within some limits). With hardware RAID, you're stuck with the number and type of ports on that RAID card, and that's it.

    I make lots of smaller RAID5 /dev/md partitions, then concatenate them together with LVM into one large volume. When I go to add disks, this allows me to pull one md partition out (if I resize the fs down far enough), expand that md onto another disk, and then add it back into the volume. Raidreconf still doesn't sound reliable enough, so I avoid it for resizing.

    If you do go with hardware RAID, make *sure* you can resize the RAID without losing your data. A lot of cheap RAID controllers don't allow this - you have to wipe out all your data in order to add another disk to your RAID, which is usually impractical. And you have to assume you're going to expand it.

    Also, make sure to turn off write caching on your drives. It's much slower, but write caching is dangerous, especially in raid configurations.
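    A rough outline of the md-plus-LVM layering and the write-cache toggle described above, with placeholder device names (a sketch, not the poster's actual configuration):

        # Build a couple of smaller RAID5 sets
        mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/hde1 /dev/hdf1 /dev/hdg1 /dev/hdh1
        mdadm --create /dev/md1 --level=5 --raid-devices=4 /dev/hde2 /dev/hdf2 /dev/hdg2 /dev/hdh2
        # Concatenate them into one big logical volume (the %FREE syntax assumes LVM2)
        pvcreate /dev/md0 /dev/md1
        vgcreate bigvg /dev/md0 /dev/md1
        lvcreate -l 100%FREE -n storage bigvg
        mke2fs -j /dev/bigvg/storage
        # Disable the on-drive write cache (slower, but safer)
        hdparm -W 0 /dev/hde /dev/hdf /dev/hdg /dev/hdh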
  • Re:Please! (Score:4, Interesting)

    by Gherald ( 682277 ) on Saturday October 30, 2004 @08:02PM (#10675621) Journal
    > well i was talking about more or less cheap controllers. in the arena of raid controllers 160USD is fucking cheap!

    Not compared to $0.

    You see, the typical budget RAID 5 builder just wants to store his collection of MPEG4s, MP3s, and other downloads or perhaps uncompressed hobbyist video. It's not a database, it's not a 150+ employee corporate file server, it's just personal. Performance is not a concern.

    And if performance is a concern (say he wants / on these disks) then the cheap way to go is software RAID 0, 1 or 1+0 (aka 10) *COMBINED* with a RAID5.

    For instance, I just built myself a new system with four 300gb drives and partitioned each one like so:

    50mb - /boot
    1gb - swap
    20gb - /
    5gb - /tmp and /var
    the rest - /home

    For the 50mb, I made a bootable RAID 1 of four drives (grub can boot this, dunno about lilo)

    For the 1gb swap, I made a RAID 1 with two drives and a RAID 1 with the other 2. Thus I have a net of two 1gb swap partitions, with redundancy so my system will never crash due to drive-induced paging errors. This is essentially a RAID 0+1, though I let the kernel's swap system handle the RAID 0 aspect by giving them equal priorities.

    For the 20gb /, I did the same thing (pair of RAID 1s) and put a RAID 0 on top of that, for a net of 40gb redundant and fairly speedy storage.

    For the 5gb /tmp and /var I made a simple 10gb RAID 0 for each. Not a whole lot of need for redundancy here, I make a point of backing up the important /var stuff.

    With the four equal-sized partitions that were left, I made the RAID 5 for /home

    Don't you see what a great cost-effective approach this is?!?

    Maybe you work for some company with plenty of money lying around for $160 RAID controllers. But I'm in business for myself, and I don't see the sense in spending money where it isn't needed. Besides, the flexibility of software RAIDs (per-partition, not per-drive) would be well worth it to me even if something like the SX4 were cheaper.
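    For reference, a bare-bones sketch of the layered setup described above, with placeholder device names and partition numbers (four identically partitioned drives):

        # /boot: 4-way RAID 1 so the box can boot off any surviving drive
        mdadm --create /dev/md0 --level=1 --raid-devices=4 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
        # /: two RAID 1 pairs with RAID 0 striped on top (RAID 1+0)
        mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda3 /dev/sdb3
        mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sdc3 /dev/sdd3
        mdadm --create /dev/md3 --level=0 --raid-devices=2 /dev/md1 /dev/md2
        # /home: RAID 5 across the large partition left on each of the four drives
        mdadm --create /dev/md4 --level=5 --raid-devices=4 /dev/sda6 /dev/sdb6 /dev/sdc6 /dev/sdd6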
  • by pjrc ( 134994 ) <paul@pjrc.com> on Saturday October 30, 2004 @08:10PM (#10675691) Homepage Journal
    Consider--your ATA RAID controller dies three years down the road. What if the manufacturer no longer makes it?

    This happened to me. The card was sorta still working... it could read, with lots of errors that were usually recoverable, but writing was flaky.

    Luckily, even after about 3 years, 3ware (now AMCC) [3ware.com] was willing to send me a free replacement card. They answered the phone quickly (no long wait on hold), the guy I talked with knew the products well, and he had me email some log files. He looked at them for about a minute, asked some questions about the cables I was using, and then gave me an RMA number.

    The new card came, and my heart sank when I saw it was a newer model. But I plugged the old drives in, and it automatically recognized their format and everything worked as it should.

    This might not work on those cheapo cards like Promise that really are just multiple IDE controllers with a BIOS that does all the RAID in software. Yeah, I know they're cheaper, but the 3ware cards really are very good and worth the money if you can afford them.

  • by Nick Driver ( 238034 ) on Saturday October 30, 2004 @08:44PM (#10675890)
    Vinum on FreeBSD absolutely rocks! Your old 500MHz machine will run FreeBSD beautifully too.

    Anybody here remember Walnut Creek's huge ftp archive at "cdrom.com", which back in its heyday in the late 1990s was the biggest, highest-traffic ftp download site on the planet? They used a combination of Vinum software RAID and Mylex hardware RAID to handle the load. I remember reading a discussion from them once saying that until you hit a totally ridiculous volume of ftp sessions hammering away at the arrays, Vinum was actually a slight bit faster than the hardware array controller.
  • by Futurepower(R) ( 558542 ) on Saturday October 30, 2004 @09:01PM (#10675959) Homepage

    Office Depot had an 18th anniversary sale, and was selling Maxtor 60GB drives for $18 after rebate. I bought three for my personal test machines, and used my friends' addresses for the rebates.

    I often hear bad things about Maxtor drives, but after a whole 40 hours of use, they haven't failed once.
  • by tarball ( 34682 ) on Saturday October 30, 2004 @09:19PM (#10676034) Journal
    BTDT.

    The gentleman is correct. I've used Arco in 2 systems that ran flawlessly for years - except for a drive failure, which made the piezo alarm become annoying and the LEDs change state.

    And thanks so much for bringing it up, because you reminded me I had one of those tucked away, forgotten, brand new in the box. I will be putting it into service soon.

    tom

    I hate sigs, and refuse to have one.
  • PCI bottleneck (Score:3, Interesting)

    by Mike Hicks ( 244 ) * <hick0088@tc.umn.edu> on Saturday October 30, 2004 @09:37PM (#10676126) Homepage Journal
    I haven't read all of the comments in detail, but I think one thing that people are often forgetting is that a standard PCI bus has a theoretical maximum bandwidth of 133 MB/s, a level you'll probably never see in real life, especially when there's a fair amount of chatter on the bus from different devices (and you'd get a lot of that with 8 drives plus networking plus who knows what else). Of course, PCI bus layouts vary considerably between simple motherboards and high-end ones.

    I don't know if anyone makes PCI-X ATA-133 controllers (non-RAID), so in the final analysis it might be best to get a 3ware card with a 64bit connector and plop it in a long slot. Of course, you need a pretty nice motherboard for that. I guess I haven't gone shopping recently, but they weren't that common the last I checked (and everyone is going to head for PCI-Express shortly anyway).

    Of course, it all depends on what you'll use the machine for. If it's just file serving over a 100Mbit network, there's no need to worry that much about speed. It's only a big deal if you're concerned about doing things really fast. I believe good 3ware RAID cards can read data off a big array at 150-200 MB/s (maybe better). My local LUG put a ~1TB array together for an FTP mirror with 12 disks (using 120GB and 160GB drives, if I remember right) about 2 years ago, and testing produced read rates of about 120 MB/s on a regular PCI box (I think.. my memory is a bit flaky on that). Of course, I don't think anything was being done with the data (wasn't going out over the network interface, to my knowledge, just being read in by bonnie++ I suspect).
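    As a sanity check on the figure quoted above, standard PCI is 32 bits wide at 33 MHz:

        # 33.33 MHz x 4 bytes per transfer ~= 133 MB/s theoretical peak, shared by every device on the bus
        echo "33.33 * 4" | bc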
  • by megabeck42 ( 45659 ) on Saturday October 30, 2004 @10:56PM (#10676509)
    The woeful inaccuracy of your post is really, really painful. Allow me to rebut.

    First of all, linux software raid has excellent autodetection. You need to set the partition identifier to 0xFD so that the autodetector can identify it. As many have mentioned, software raid has a huge advantage over hardware raid for recovery - you can disconnect the drives from one computer, hook them to another and the autodetect code will figure it out. I know this works because I've done it.

    Second, for 8 drives and 2 controllers per card, you'd want four ATA133 adapters. Each adapter has, as you said, 2 controllers. You don't want to use the slave channel, because that will definitely kill performance.

    Third, don't install the OS on the RAIDed partition. Don't keep anything fragile or irreplaceable on the OS partition. If you want to back up the configuration, back up the configuration. There's no need to RAID your boot drive, and if your boot drive fails you can trivially reinstall.

    A cheap batch script is not an effective backup solution. What if files are locked, or a file is backed up midway through a transaction? I readily agree that RAID is not a backup solution, but putting any faith in a "cheap batch script" is profoundly naive.

    RAID5 has the advantage that you only lose one drive's worth of space to parity information. With eight 250 gig drives on a P3 500, it's readily obvious that his goal is to inexpensively store a large amount of data with effective mitigation against a single drive failure. Software RAID5 is an excellent solution for him.

    Lastly, I'd recommend one of the Intel gigabit cards, because although the drives will only read 50 or 60 megabytes/s, the whole point is moot if your network connection maxes out at around 10 megabytes a second. The client adapters, like the 1000MT, are more than enough, and not that expensive.
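    A small sketch of the autodetect setup mentioned above; device names are placeholders, and the non-interactive option spelling has varied across util-linux versions (older sfdisk used --change-id), so check your man page:

        # Mark each member partition as "Linux raid autodetect" (type fd); with fdisk, use the 't' command
        sfdisk --part-type /dev/hda 1 fd   # assumes an sfdisk that supports --part-type
        # Verify what the md autodetector will see on each member
        mdadm --examine /dev/hda1
        cat /proc/mdstat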
  • raid5 + debian (Score:3, Interesting)

    by POds ( 241854 ) on Saturday October 30, 2004 @10:56PM (#10676511) Homepage Journal
    I'm running RAID 5 on, I think, 2.6.8 with 3 drives. That is, I'm running it on the root partition and it runs all right, although I have noticed it has gotten sluggish... maybe a defrag is in order?

    When I started out, Firefox was loading in 2 seconds, and it now appears to be taking around 4 seconds to load. At least I think those measurements are OK. If you want real speed, I'd think about using RAID 0+1, as it seems 4 discs in a RAID0 array would be faster than 8 in a RAID5? I'm not too sure about that, but RAID5 is significantly slower than RAID0, apparently. Also, using those other 4 discs to mirror the RAID0 array could be more useful than RAID5's parity redundancy.
  • by realdpk ( 116490 ) on Sunday October 31, 2004 @12:43AM (#10676930) Homepage Journal
    Yep. Maybe even set up a pseudo-"RAID5" with 3 RAID5 servers. :) Costs _WAY_ less than tape, and is far more reliable. (I hate tape, can you tell?)
  • by __david__ ( 45671 ) * on Sunday October 31, 2004 @01:58AM (#10677218) Homepage
    Yes, heat is definitely an issue, and an issue I didn't even think about when setting up my 4 disk linux software raid 5 set.

    After I set it up for the first time, I had a drive die on me really quickly and noticed when I replaced it that it was murderously hot. As in "burning my fingers" hot. So I went and bought these little hd cooling fans that fit in front of a 5 1/4" drive bay (and come with 3.5" drive mounting adapters) and have 3 little fans on them. They cost about $7 each. I put 4 of them in my machine and they kept the drives at room temperature. Ahhh.

    But the noise was a problem as all those fans together sounded like a wind tunnel. Especially 2 years later when all the little fans started dying and making extremely loud noises. Think annoying fan noise multiplied by 12. Ugh. Then I found this neat product:

    Cooler Master 4 in 3 device module [coolermaster.com].

    Instead of 12 little fans I now have one big super-quiet fan, and my drives still stay nice and cool. It was definitely worth the $30 I paid for it.

    Don't forget about heat.

    -David
  • by arivanov ( 12034 ) on Sunday October 31, 2004 @03:04AM (#10677387) Homepage
    I have about 3TB on 3ware and 2TB on Linux software RAID, most of it RAID5. My recommendation is: stay away from 3ware unless you know what you are doing. It is a very nice controller, but it is extremely fussy as far as bus noise levels are concerned. Also, it does not support PCI parity, so you have no real indication of what fails, how, and where. When used with riser cards we had to fit additional 33MHz cards to drop the PCI down to 33 and retrofit a serious amount of extra grounding to get the machines stable. Overall, a chassis with risers for 3ware is a no-no. You are better off with a 3U+ chassis where the cards sit straight on the bus. This is under any OS - just read the threads on BSD-stable.

    Linux software RAID5 has a considerably better chance of working nowadays. There are very few controllers out there that still have unresolved bugs. Off the top of my head, here are a few:

    • Serverworks - any kernel version. Has serious performance problems with slave drives. Kernels before 2.4.23 die at random.
    • CMD646/9. Used to be the best controller out there; unfortunately, no more. It supplies bogus information for the ACPI tables, so it is no longer usable on SMP as of Linux 2.4.23 and later: IRQs do not get initialized.
    • Promise - I personally stay away from them as many are not supported properly.

    As far as the controller duty roster is concerned, we should also mention VIA. From being the worst controller for Linux once upon a time in 1997-99, it has become the best. I have been getting better I/O performance on ITX with a C3 than on Xeons with server controllers for some time now (starting from around 2.4.23).

"More software projects have gone awry for lack of calendar time than for all other causes combined." -- Fred Brooks, Jr., _The Mythical Man Month_

Working...