Data Storage Hardware

Hardware RAID 5 Performance Configurations?

gandy909 asks: "I am facing the need to replace a major server in the next few months due to both EOL status and disk I/O bottleneck issues with the array containing the data. The server is configured with a 2 channel array controller. It is a RAID 5 array and has 4 drives, 2 per channel (2 data, 1 parity, and 1 hot spare). Obvious performance benefits in replacing the server are quadrupling CPU speed and doubling the memory. Other side benefits I will gain with the new drives, that I think should help my performance, are moving from u160 to u320, and going from 7200 RPM to either 10k or 15k RPM." How would you configure a larger array to best increase its performance?
"Having googled around, the consensus is that increasing the number of drives is the preferred way to attack the I/O bottleneck. What I don't find much help on is determining the configuration of the larger array. Assuming I am going to be using a 12 drive array I have come up with the following possible configurations:

1 two-channel controller, 6 drives per channel
1 four-channel controller, 3 drives per channel
2 two-channel controllers, 3 drives per channel
3 two-channel controllers, 2 drives per channel

Would any one of those configurations provide better performance than the others, or would they all even out considering other factors?"
  • by Fished ( 574624 ) <amphigory@gma[ ]com ['il.' in gap]> on Friday November 12, 2004 @10:40PM (#10804622)
    Not enough information - is this data warehousing? Transaction processing? Mostly reads, or a lot of writes as well?

    But generally, I don't recommend RAID 5 for performance-critical situations. It's great for data warehousing, but if you lose one drive, there goes your performance. Also, realize that, often, the place where you can really boost performance is in the database, not the hardware. How's your query optimization? Do you have appropriate indexes? Is the code accessing the database efficient?

    • See my comment with more information [slashdot.org], but basically, it's a legacy app, and I'm stuck with the backend just the way it is, so my only real recourse is hardware improvements.
      • Comment removed based on user account deletion
        • Are you kidding? Have you ever used a hardware array controller? Do you even know the difference between a 0+1 and a RAID 5?

          First of all, any controller made in the last decade has a fast enough processor on it that it is going to be faster than software RAID; second, a 0+1 is WAY slower than a RAID 5. For example, let's say I've got 4 disks. If I set up a 0+1, that would be 2 mirror sets with the data split evenly across the 2 arrays, so I ask for a file and I have just 2 drives seeking the info, and retrieving th

          • by Fweeky ( 41046 )
            Here's some logic: RAID-5 needs to write across all disks to update parity on writes, which slows them no matter how much fancy hardware you've got to improve them. RAID-5 also needs to rebuild data from parity after a drive failure, meaning your high volume server is going to crawl until you can replace and rebuild.

            RAID-10/01 gives you a mirrored stripe; mirrors improve read performance by letting you balance reads across drives either to increase STR or TPS, stripes improve read performance again and a
            • Here's some logic: RAID-5 needs to write across all disks to update parity on writes, which slows them no matter how much fancy hardware you've got to improve them.

              This one statement shows that you do not know much about RAID 5.

              RAID 5 only requires that the parity stripe and the changed data stripe be updated. It's really quite simple: you read the old data and parity stripes, XOR the new data stripe with the old one, XOR that output into the parity stripe, and write out the new parity and data stripes.
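
              To make that read-modify-write concrete, here is a minimal Python sketch of the XOR arithmetic (my own illustration; the strip contents and the three-data-disk layout are made-up assumptions, not taken from any particular controller):

                  # Illustrative sketch of a RAID-5 single-strip update (read-modify-write).
                  # Parity is the XOR of all data strips; updating one strip only needs the
                  # old data strip and the old parity strip (2 reads), then 2 writes.

                  def xor_bytes(a: bytes, b: bytes) -> bytes:
                      """Byte-wise XOR of two equal-length buffers."""
                      return bytes(x ^ y for x, y in zip(a, b))

                  def raid5_new_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
                      # new_parity = old_parity XOR old_data XOR new_data
                      return xor_bytes(xor_bytes(old_parity, old_data), new_data)

                  # Hypothetical 3-data-disk stripe:
                  d0, d1, d2 = b"AAAAAAAA", b"BBBBBBBB", b"CCCCCCCC"
                  parity = xor_bytes(xor_bytes(d0, d1), d2)

                  new_d1 = b"XXXXXXXX"
                  new_parity = raid5_new_parity(d1, parity, new_d1)

                  # Sanity check: recomputing parity from scratch gives the same result,
                  # even though d0 and d2 were never read.
                  assert new_parity == xor_bytes(xor_bytes(d0, new_d1), d2)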
              • Sure; I never claimed RAID-5 wasn't competitive, I was more bashing the guy who said RAID-10 was crap by comparison, when it plainly *isn't*.

                I withdraw my "no matter how much" remark. RAID-10 is still simpler to implement and an excellent performer, which given how crap most hardware controllers/drivers seem to be, can only be a good thing :/
                • FYI, RAID 1+0 and RAID 5 get a lot of testing on the postgresql-performance mailing list, and what we've found there is that with a high-quality RAID controller with battery-backed cache, the difference from one config to another is minimized until one or the other starts to saturate its SCSI busses. Since most database accesses are random, and therefore only use a small percentage of the maximum throughput available from an individual drive, it takes a lot of drives to saturate the SCSI busses, so for mos
          • First of all, any controller made in the last decade has a fast enough processor on it that it is going to be faster than software RAID, [...]

            The performance hit with RAID5 doesn't come from the parity calculation, it comes from the additional I/O operations that RAID5 requires.

            For example, let's say I've got 4 disks. If I set up a 0+1, that would be 2 mirror sets with the data split evenly across the 2 arrays, so I ask for a file and I have just 2 drives seeking the info, and retrieving that info, with a 4 disk R
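
            As a back-of-the-envelope illustration of that extra I/O (my own numbers; the 150 IOPS-per-drive figure is an assumption for a 10k disk, and real arrays with write-back cache will do better), a rough Python model of small random writes:

                def small_write_iops(disks: int, per_disk_iops: float, level: str) -> float:
                    """Very rough aggregate small-random-write IOPS, ignoring caches and queuing."""
                    if level == "raid10":
                        # Each logical write costs 2 physical writes (one per mirror member).
                        return disks * per_disk_iops / 2
                    if level == "raid5":
                        # Read-modify-write: 2 reads + 2 writes per logical write.
                        return disks * per_disk_iops / 4
                    raise ValueError("unknown RAID level")

                for level in ("raid5", "raid10"):
                    # 12 drives, ~150 IOPS each (assumed)
                    print(level, round(small_write_iops(12, 150, level)))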

        • Why would you use RAID 01 over RAID 10?

          (Answer appears to be that your controller sucks [pcguide.com]).
  • I'd put 2GB fibre HBAs in the server(s) and attach an OpenSAN RAID box from Winchester Systems [winsys.com] with dual-homed fibre to the array. That'll take care of your bottleneck and help your future scaling requirements.

  • by blackcoot ( 124938 ) on Friday November 12, 2004 @10:49PM (#10804670)
    ... the keys to bottlenecks, as I've understood it, are: 1) how many physical paths there are to disk, and 2) how big the buffers are.

    #1 can be fixed by adding more controllers /and/ more disks; #2 can be fixed by buying the right controllers. Not /too/ too useful, but a start.

    If you're /really/ concerned about performance (enough to spend cash on it), give someone like StorageTek a call --- they've got this down to a fine art. Quite probably waaay overkill for what you want to do, but it's a start.
  • by cookd ( 72933 ) <.moc.onuj. .ta. .koocsalguod.> on Friday November 12, 2004 @10:52PM (#10804684) Journal
    First, 3 drives in RAID-5 is not very useful. You get a lot of the disadvantages with few of the benefits. Having more drives really helps throughput. So go for more smaller drives over fewer faster drives for RAID-5.

    Second, RAID-5 is great for read speeds, but less great for write speeds. A good caching controller will help hide this, but a small write operation requires a read from each disk in the set before the write can be completed (in order to recompute parity for the stripe). If this is mostly reading, or if most writes are large (not small and random), RAID-5 will work fine (data storage, data mining, etc). If writes are frequent (transactions), RAID-5 is painful. RAID-10 might be better.
    • So go for more smaller drives over fewer faster drives for RAID-5.

      I meant to say larger.
    • by cookd ( 72933 ) <.moc.onuj. .ta. .koocsalguod.> on Friday November 12, 2004 @11:09PM (#10804769) Journal
      And to answer the question you actually ASKED...

      Fast drives can sustain about 60 MB/s nowadays, so don't put more than (max channel rate)/(max drive rate) = (320)/(60) = around 5 drives on a single channel.

      As for multiple controllers, if they aren't present for redundancy purposes, you're probably just as well off using just one.

      But then again, I'm just a Slashdotter who hasn't kept up with the latest specs, so I'm pretty much just making stuff up. Free advice is worth every penny.
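
      That arithmetic in a couple of lines of Python, using the poster's assumed numbers (U320 channel, ~60 MB/s sustained per drive -- rough figures, not measurements):

          bus_mb_s = 320    # Ultra320 SCSI channel, theoretical peak
          drive_mb_s = 60   # assumed sustained rate per drive
          print(bus_mb_s // drive_mb_s, "drives before one channel saturates on streaming reads")  # 5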
      • It is better to stick a controller on each bus. This reduces contention across the controllers. That is, it is better to have two 2-channel controllers than one 4-channel controller, provided they aren't on the same bus.
      • by innosent ( 618233 ) <jmdority.gmail@com> on Saturday November 13, 2004 @01:47AM (#10805358)
        No, internal transfer rates are now up to around 109 MB/s for the top 15k RPM drives. You should be able to do better than 60MB/sec per drive, unless your reads are all over the place. (Lots of small reads instead of a few large, contiguous reads?) If you REALLY want speed, you can't beat RAID 10. For 12 drives, I would go with 2 U320 dual-channel controllers, having 3 drives on each channel (4 would probably be better, but that's 16 drives). Probably would be best to mirror across channels on the same card, and stripe those 6 (or 8) mirrored pairs across the two cards. You could hit a theoretical write speed of 640MB/sec pretty easily with 16 drives, probably only around 550-600 with 12, with read speeds of 1280MB/sec for 16 drives, and just over a GB/sec with 12. If a GB/sec isn't fast enough, it's probably time to write new software.
    • a small write operation requires a read from each disk in the set before the write can be completed (in order to recompute parity for the stripe)

      Not entirely. If your parity is a simple XOR of all the chunks, then you only need to read the chunk to be rewritten and the appropriate parity chunk. XOR the old chunk with the parity, XOR that with the new chunk and rewrite the parity chunk, and finally write the new chunk. However, this still necessitates a read from the drive where the write is to occur and

    • IIRC, during a write you have to read from just two disks: the one where the data goes and the one with parity. Using the difference between the old data and the new data you can recalculate parity without reading the other disks. So for each write on RAID 5 you need two disk reads and two disk writes.
  • by millisa ( 151093 )
    If this is an EOL system and it's using U160 drives... chances are those drives are 36 gig or less... I'd even bet they might be 18s... But let's say they are 36s... four of them in a RAID 5 is giving you ~120GB? Why not just get a pair of 147GB drives and run in a RAID 1? I mean, like others have said, without knowing what you are doing with it, it is hard to say where you are going to get the most benefit, but a lot of times RAID 5 is chosen just due to the increase in space you can get... There a
    • Why not just get a pair of 147GB drives and run in a RAID 1?

      Because in anything where you are worried about performance, a simple RAID-1 is going to suck completely? Sure, you get some advantages on reads with a decent controller, but that's it. For performance-critical systems, a large RAID 5 with a good controller or a RAID-10 across one or more controllers is definitely the way to go. For database-type operations with a heavy write load (the most common load where this level of optimization is necessary) prese
  • Why RAID5? (Score:2, Interesting)

    by TheLink ( 130905 )
    What are you using it for? RAID5 is slow for writes. Plus AFAIK rebuilding a RAID5 is a LOT slower and hurts performance more than rebuilding stripes of mirrored drives.

    What are the applications? Would you be needing more drive space or adding drives in the near future?

    If it's a single application, you know you are not going to need to expand soon or add/change apps, the drives are BIG compared to what the app needs, AND performance is _THE_ issue, then I'd suggest optimizing around the application.

    For exampl
    • We did some benchmarking of the behaviour of a Sun disk array for PostgreSQL; couldn't see a significant difference between having a separate "little array" and folding it into a bigger one.

      What seemed to be of value was to put WAL onto a separate array simply out of the knowledge that those disks would get burned through faster, and we'd know they would have a higher replacement incidence.

      What I'd like to do to fix that is to have a few GB of SSD, where heavy writes wouldn't hurt any disks.

      But on

      • Interesting. Did you do a lot of update+commits or just big updates and fewer commits?

        Maybe someone should make an "HDD" that's battery-backed 1-2GB RAM for WALs or stuff like that. Rechargeable lead-acid gel batteries (2+ year lifespan).

        Would 1-2GB be good enough for such things?
        • It was a case of plenty of little updates; that's what the OLTP applications do, so that's the thing to test.

          The slick part isn't having a huge battery, but rather having a form factor that's similar enough to plain old disks.

          What I'd love to see is the combination of:

          • A bunch of RAM,
          • Enough Compact Flash to cover all of the contents of RAM
          • Enough battery to give enough time to serialize the cache to CF if power dies

          Long battery life isn't necessary; you just need long enough to write it all

  • The more cards, the more buffers, the better the performance (marginally).

    I'll just assume it's an ERP, 'cause I like ERPs. They do a ton of reading and infrequent sustained transfers. If it is an ERP, your top priority should be more main system memory. Then use more smaller, fast drives. The more cards the merrier, because you double or quadruple the buffers. At worst a buffer offers no advantage. Hopefully it's a dynamic optimization supporting what the database is already optimizing in main memory. Littl
  • I've had great luck with Promise's Ultratraks [promise.com]. This is an array of IDE drives that connects to a SCSI interface. I'm using them in RAID5, and yes, it does take some time to rebuild the set when you lose a disk, but that's fine if you can handle a little downtime. Currently using the 8-disk tower, populated with Western Digital 120GB drives. Works great, but when a drive fails, it takes about 6 hours to rebuild the set.
  • Consider using your CPU as the RAID engine. The hardware engines that vendors use can easily end up underpowered when dealing with modern disks. Get enough SATA or SCSI ports and some hot-swap bays and you can build quite a powerful software array. If you dump the money you would have spent on a hardware RAID card into a faster CPU, you can easily end up with a faster overall solution.

    If you are planning to go the hardware route make sure you get a controller that can keep up. There's nothing worse than havin
  • by Photar ( 5491 ) <photar AT photar DOT net> on Saturday November 13, 2004 @12:03AM (#10805006) Homepage
    This sounds exactly like a question I had in Computer Architecture class.

    It looks to me like if you're upgrading from a computer that was 1/4 the speed of today's servers, then a modern server machine with SATA RAID should be fine.
  • Your question should be a simple math problem. How much storage is required? What throughput do you need? Be sure to include expected growth over the depreciation period for both, to keep the accountants happy (write-offs screw up run-rate projections). Now it's just math based on tech specs to come up with several configuration options that meet your requirements. Once you have some options, a comparison matrix is one way to decide between the configurations you came up with based on cost, flexibility, s
  • More Information... (Score:3, Informative)

    by gandy909 ( 222251 ) <gandy909@gmailPOLLOCK.com minus painter> on Saturday November 13, 2004 @01:24AM (#10805282) Homepage Journal
    This is a *nix box running a Court application from a vendor that uses a Synergy DE [synergex.com] ISAM-type 'database' backend. We have about 100 users and a general mix of add/change/query operations, but not a lot of deletes. It is keeping Court data, and that stuff never goes away. The reads are mostly displaying screenfuls of data or small reports to the screen all the time, or larger reports to the printer several times during the day. The writes are usually writing manually entered information, a screen or paragraph at a time, so there tend to be fewer of them, and at a slower pace, but all day long.

    The app is a legacy non-SQL type db that is not, nor ever will be, anywhere near normalized by any stretch of the term. The largest of the data files is just over 1 gig at this point. The OS file size limit is 2GB. Due to this, and for other reasons, we will likely be moving to a completely different system in the 5-year range.

    Hardware-wise, the box as I inherited it is a Dell 6400 rackmount server with 4 700MHz P3 Xeons (only 1 activated...don't ask), 1GB of memory, a PERC2 (AMI MegaRAID) dual-channel controller, and a split (4+4) backplane. It holds 8 9GB drives in 2 arrays. Even with these small drives there is over 50% and 70% free space on the arrays.

    My budget limit is $10k to replace it. One of the options I was looking at was a Dell 2650 with a PERC3-QC controller and one of the Storcase 10-bay Infostations they offer on the Dell site to hold the rest of the drives. The way the app is so 'interconvoluted' together, I don't think I gained anything by separating the data into 2 arrays, and will likely just use a single array on the upgrade.

    I hope this helps... :)
    • First of all, can you provide some memory and/or processor utilization stats? With an ISAM-style database (these things are old as the hills) it is *unlikely* - but not impossible - that processor utilization is hurting you. ISAM-style databases tend to be very simplistic and very lightweight compared to an SQL db. I would load up on memory and not worry as much about the processor (unless processor stats suggest otherwise). More memory will mean better buffering of disk I/O and more efficiency.

      Second, w

      • I'm a tad apprehensive about posting this info, as I enjoy having a slashdot id, and don't want to be cast out as a pariah, or be mentioned in the same sentence as Darl.
        The system in question is one I inherited. The only reason the OS hasn't changed to Linux is the fact that Synergy wants in the neighborhood of $20K for a 'transfer license' and the beancounters won't approve it.

        Yes, I have a SCO system. Whew! There, I said it. I'm not proud of it, but I do have to maintain it. What follows is the
    • It sounds to me like you will do just fine with a dual-Xeon Dell 2650 and 5 15k internal drives in RAID 5. Or, if you're concerned about disk IO (and from the description of your app it doesn't sound like there's that much), do a RAID 10 with a hot spare. I don't think I'd bother with the external drive array -- you don't need the complexity.

      Oh, you should be able to get this machine for half your budget -- don't order from the Dell website -- talk to a sales rep. You can get some very good deals on servers n
    • If you're working with small files (1 GB), have you considered solid state? Or something like this: Rocket Drive [cenatek.com] -- 4 GB of memory on a PCI card, mounted as a file system, for $3000 US.

    • PERC3 cards are all pieces of shit.

      Read the archives at linux.dell.com on the linux-poweredge mailing lists.

      The PERC4 cards are a great step up, but you can still do better with your own cards.

      Also, don't ever buy drives from Dell directly. Buy them from Dell "Parts and Accessories" (the phone number that routes you over VOIP to India...)

      You'll get the SAME, BRAND NEW drives from them, still under your normal warranty, still from Dell at about 1/2 or 1/3 of the price. Buy ten drives and they will really start to dro
    • I wouldn't think you need to spend $10k. A new machine with two SATA disks in software RAID 1 might well handle the same I/O throughput as your old machine, if you have lots of RAM to cache I/O. Maybe a couple of RAID 1 sets, with four or six disks total. If you're only using one CPU now, you'll probably be fine with a single new CPU. The trick is to find a solid, reliable single-CPU board. Tyan makes some good stuff... I like their dual Opteron boards, and I'd guess that their single Opteron boards are of
    • Go for a Dell 2850..
      + Dual P4 Xeons
      + 800MHz FSB
      + up to 6GB RAM
      + six hotswap drive bays (optional split backplane 2+4)

      Go for..
      Dual 3.2GHz (if you have money to throw, 3.6s for $1100 extra)
      4GB RAM (more is of questionable benefit on x86's 32-bit arch; for some apps it may be worth it)
      six 36GB 15K drives
      addon PERC4/DC RAID controller
      All for $9012

      A RAID 10 config would be the best..
      + highest redundancy and
      + best performance.

      RAID 5 is faster for sustained continuous reads/writes but slower for random/small reads/writes.

      HTH
    • I recently calculated the price for a brand new box with tons of storage (3ware SATA controller) and dual Opterons with tons of RAM for under 5K EUR.

  • If you're looking for performance, ditch RAID 5 and go with RAID 0+1 (mirror pairs of disks, then stripe the mirrors). It costs more in terms of dollars per gigabyte, but the performance and reliability increases are substantial over RAID 5. The only justification for RAID 5, really, is "we can't afford to buy enough disks to do something else" (which is a valid argument for many people). The other things you touched on still apply in a stripe/mirror world as far as splitting things over controllers and wha
  • You need at least 5 disks for reasonable performance on a RAID 5 set, even with hardware support (do some research). The more disks in the set, the better the performance.

    Also, in RAID 5 the parity info is spread across all the disks (otherwise it's a RAID 3 set), so loss of a single disk doesn't mean major performance problems.

  • Here are some suggestions

    Things to consider: 1) hard drive sustained transfer speed -- IBM made some badass 10k Ultra160 drives, but they could only push 40 MB/s, so a 4-disk RAID 5 would saturate the channel on the controller. Also, 2) consider the card slot: a 32-bit slot on a 33 MHz bus is about 1056 Mb/s, or 132 MB/s, in theory, so you need a faster bus, like newer chipsets or a 64-bit bus. Most of the last server hardware I worked on only went to 66 MHz bus speed with 64-bit busses, so your mileage will vary, so basically get an U
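
    For reference, the bus math works out like this (theoretical peak PCI bandwidth, ignoring protocol overhead; a small sketch, not a benchmark):

        def pci_mb_s(width_bits: int, clock_mhz: float) -> float:
            """Theoretical peak PCI bandwidth in MB/s."""
            return width_bits / 8 * clock_mhz

        print(pci_mb_s(32, 33))   # ~132 MB/s -- classic 32-bit / 33 MHz PCI
        print(pci_mb_s(64, 66))   # ~528 MB/s -- 64-bit / 66 MHz PCI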

  • Re: (Score:2, Informative)

    Comment removed based on user account deletion
    • I disagree with the rounded cables statement. Round cables (esp. in >4-drive arrays) allow more airflow, which keeps the disks cool. For 10K and 15K RPM disks, this is critical.

      Now, if you'd said "don't try to make round cables yourself," I might agree -- even though I've done exactly that in the past AND once managed to cut one of the wires in the process -- but there is nothing inherently wrong with round IDE cables. You might claim crosstalk is an issue, but that's largely what the extra ground lin

  • Would give you the best write performance.

    *BUT* YMMV depending on how your application actually uses the I/O

    I know of a widely used RDBMS that really sucks on RAID5 due to its underlying data storage.

    You'll need to test with all the combinations, and also with RAID 10 (striped mirrors) and RAID 01 (mirrored stripes), to see which gives the best performance for your application.

  • At work we are replacing a bunch of file servers, and I just had to chime in with my totally unscientific benchmarks.
    All systems are Dell SC1600s: 2 x 2.4GHz Xeons, 1 GB RAM, PERC/3 dual-channel RAID card.
    For the highest-usage server we did a 5 x 36GB RAID 5 array, 2 drives on one channel and 3 on the other, for ~100 gigs of space plus a hot spare.
    For the others we did two 73GB drives mirrored, one on each channel.
    All drives were from the same family from Maxtor, Mark IVs I believe, 10Ks.
    In my rudimentary testing using hdparm -tT the mirrored drives per
  • And learn what reads and writes are happening, and what is getting queued up now to create i/o bottlenecks. I haven't seen anyone comment on it, but understand that your primary LUN is just one part of the equation. At a minimum, you should be looking to separate your system files, your swap files, and your database partition onto separate arrays. For the first two, RAID1 is frequently the best choice.
  • Let the OS do the work. Add RAM. Have an Intel x86 box? Push it to 4G. Need more? Go Opteron and push it to 8G. Trust in the filesystem buffering algorithms, and the speed of your system CPU(s), instead of the limp CPU and dinky cache on your RAID controller. And get a UPS. If your data set is large, dump the RAID 5 and go RAID 0+1.
    • Xeon boxes can go above 4GB of RAM... for example, the HP DL560 does 4 Xeon CPUs and 12 GB of RAM.

      They're really nice boxes [hp.com]
      • Intel's PAE is, by all accounts, a performance-killing kludge. If you need more than 4G, go to a real processor, not some 'paging' scheme which brings us back to the old DOS EMM386 days, only with larger amounts of memory.
  • Since you are already using 4 drives, just use RAID 1+0; optimally, split the 1+0 across 2 controllers
    and you will get much better performance. Even just 1+0 is faster than RAID 5.

    My lowly MS SQL server has 4 RAID 1+0 channels.

    1 for the OS and swap, and 3 for databases.

    --Tim
