No Hassle RAID 5 Implementations?
LambSpam asks: "I had a nightmare week (last week) with two of our servers running Intel's U3-1L RAID controller (RAID 5). Whenever there's a power outage in our building these controllers randomly mark one or more of the drives in the array offline (even with adequate UPS support), which means I have to manually mark them online and/or rebuild. Intel acknowledged the problem, but their solution involves updating the backplane's firmware, the controller firmware (destructive upgrade!), and even the firmware on our IBM drives in the array because they 'draw too much power' in certain conditions. I've only used one other RAID 5 implementation (MegaRAID), and it NEVER had these kinds of problems, whereas if you sneeze too hard around this U3-1L card it will go offline. Is this common with most hardware RAID implementations? What RAID 5 implementations work without hassle? What should I stay away from?"
Re:PERC? (Score:3, Interesting)
Just a note on EMC. When I've had the joy of working with a Symmetrix, EMC has always done a wonderful job of never having any downtime. They would come out at any hour of the day or night to replace a redundant card or a spare disk that wasn't even being utilized. They always evaluate any changes before they are made. I'm sure it's possible for them to make a mistake, but for mass storage they're the ones I would choose.
Re:PERC? (Score:3, Interesting)
Just FYI, Sun doesn't actually make their high-end storage product. I think they call it the StorEdge 9900 or something but it's actually a rebranded Hitachi Data Systems 9960.
Funny thing about HDS. When you buy one of their 9960 systems (a minimum investment of about $250,000), you get a guarantee. If you ever lose any data at all on that storage system due to a hardware or firmware fault, HDS will give you 30% of your purchase price back.
According to one of the senior HDS VPs that I spoke to last month, they've never had to pay out that penalty clause.
Two possibilities... (Score:4, Interesting)
The other one is something I've heard of (I'm not an electrical expert, but I'll try to explain). Larger sites, older installations in particular, were wired for three-phase electricity. Over time, the phases were split out for normal 110 volt usage. If the PC is powered from one phase and the external unit is powered from a different phase, the differential between the two can cause problems, due to the ground connection between the two through the cable shielding. I know, it sounds like something from the BOFH daily calendar, but it does make sense. Try making sure both pieces of equipment are on the same true UPS, or at least on switched UPSes on the same circuit.
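To put a rough number on that differential, here is a minimal sketch, assuming a balanced three-phase supply with a nominal 120 V from each leg to neutral (the function name and the 120 V figure are illustrative assumptions, not measurements from anyone's site). It only computes the voltage between the two hot legs; how much of that, if any, ends up on a cable shield depends on the grounding and is not modeled here.

```python
import cmath
import math


def phase_to_phase_voltage(line_to_neutral_volts, phase_offset_degrees=120.0):
    """Return the RMS voltage between two legs of equal magnitude.

    Each leg is treated as a phasor; the voltage between them is the
    magnitude of the difference of the two phasors.
    """
    leg_a = cmath.rect(line_to_neutral_volts, 0.0)
    leg_b = cmath.rect(line_to_neutral_volts, math.radians(phase_offset_degrees))
    return abs(leg_a - leg_b)


# Both boxes on the same phase: no difference between their supply legs.
same_phase = phase_to_phase_voltage(120.0, phase_offset_degrees=0.0)

# Boxes on two different phases of a three-phase supply (120 degrees apart).
different_phase = phase_to_phase_voltage(120.0)

print(f"same phase:      {same_phase:.1f} V")
print(f"different phase: {different_phase:.1f} V")  # about 207.8 V, i.e. 120 * sqrt(3)
```

That is where the familiar 208 V figure for 120 V three-phase service comes from, and it is one more reason to keep both boxes on the same circuit.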
Re:IBM HDs (Score:1, Interesting)
FWIW, I've found the drivers for the PERC in FreeBSD to be far better than those in Linux.
Compaq is good. (Score:3, Interesting)
I suggest you also fix your power problem. The systems should have no idea power was lost to the building. If you are using a UPS and this is still happening, I'd find a better one.