
Hardware For Bulk IDE Hard Drive Burn-In?
r0gue_ asks: "I work for a mid-size OEM hardware manufacturer. We ship approximately 300 to 500 IDE HDs every month across all our units. Currently we experience about a 4% failure rate (Maxtor and WDs), though in recent months it has been a couple percent higher. The problem is our systems are dedicated boxes with a non end-user friendly form factor. Virtually every physical HD failure results in an RMA. What we are looking for is a hardware based IDE HD burn-in platform. Something that we could drop a dozen or so drives in at once, stress test them for a day or two, then put them into inventory for builds. I know the HD manufacturers and larger OEMs use them but I have not been able to track down anywhere we could purchase one. Right now moving to SCSI or a form factor that supports externally removable drives is not an option. I was hoping that the Slashdot community could point me in the right direction."
Here's a solution: (Score:2, Interesting)
Re:Here's a solution: (Score:3, Funny)
Yep. Use only Maxtor. That way you should be able to get 8%.
-- MarkusQ
Re:Here's a solution: (Score:1)
I'm still using Maxtors myself at this point. Why? I'm not sure what else to use that's better, and the ones that have failed at least made noises and gave warning, something past Western Digitals did not.
So again, what do you recommend that would be better?
Re:Here's a solution: (Score:3, Interesting)
Maxtor: I've been plagued with problems from Maxtor drives over the years. From one original Maxtor I bought (and its RMA replacements), I had 2 whose spindle motors became abnormally loud and one that failed catastrophically (IDE Auto-detect had problems even de
Re:Here's a solution: (Score:4, Informative)
StorageReview [storagereview.com] has a Drive Reliability Survey that lists statistics for many drive families. For example, WD 205Bx drives are near the top of the rankings (99th percentile) while the 600Ax is near the bottom (10th percentile).
Re:Here's a solution: (Score:1)
I've had pretty good luck in the past with the Seagate IDE drives as well.
How-to (Score:4, Informative)
Write a bash script that checks dmesg for how many drives are in the system and invokes the following Perl script for each drive.
Write a Perl script that does the following:
Formats and partitions the drive to max size.
Copies a kernel or some other large file onto the disk until it is full.
Monitors syslog for IDE errors.
md5sums the files to make sure they all match.
Reports an error if the MD5 doesn't match.
Unless you get hot-swap controllers, you will have to reboot every time you want to test another batch of drives.
If you don't wish to write this Perl script, I can be hired to do it for you.
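The copy-and-checksum steps above can be sketched roughly as follows. This is a minimal illustration in Python rather than the Perl the poster proposes, and it leaves out the partitioning and syslog-monitoring steps entirely. The names `md5_of`, `fill_and_verify`, `max_copies` and the `burnin_NNNNNN.bin` file naming are all hypothetical, and it assumes the drive under test is already formatted and mounted at `target_dir`.

```python
import hashlib
import os
import shutil


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fill_and_verify(source_file, target_dir, max_copies=None):
    """Copy source_file into target_dir until a copy fails (normally because
    the disk is full) or max_copies is reached, then return the list of
    copies whose MD5 differs from the source."""
    expected = md5_of(source_file)
    copies = []
    count = 0
    while max_copies is None or count < max_copies:
        dest = os.path.join(target_dir, f"burnin_{count:06d}.bin")
        try:
            shutil.copyfile(source_file, dest)
        except OSError:
            # Most likely "no space left on device": drop the partial copy and stop.
            if os.path.exists(dest):
                os.remove(dest)
            break
        copies.append(dest)
        count += 1
    return [path for path in copies if md5_of(path) != expected]
```

One caveat: reads that follow closely after a write may be served from the operating system's cache rather than from the platters, so a real burn-in would probably want to remount the drive (or otherwise flush caches) between the copy pass and the checksum pass.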
Re:How-to (Score:3, Interesting)
My advice would be to invest in as many Firewire-to-IDE converters as your company can afford, and then use a Firewire-friendly OS to do the burn-in. Something like OS X or Linux would work very well in this case. Actually, a cheap Apple machine would be perfect for this application.
There's no need to start things up in batches with Firewire, either. You can plug in a disk, and your 'stresser' program
Re:How-to (Score:3, Informative)
Now, my motherboard supports 64-bit PCI at 66 MHz, with a bandwidth of 532 MB/s; this would give 8.3 MB/s per disk. Still not a lot, and you'd have to find a PCI 64 Firewire card with a
Re:How-to (Score:2, Informative)
Can your PC really do sustained writes to >8 drives without getting into performance issues?
Actually, speed is a good point. I guess I hadn't thought about that so much as part of the setup, but I guess if you're exercising disk
Re:How-to (Score:1)
You want the dangerous answer?
I used to do this, and never did blow my IDE interface, as some say I should have (try at your own risk).
Buy some IDE removable drive bays (one per drive) -- $20 each. Put the drives into the sleeves, and hook up a bay to your computer. Simply remove the drive sleeve and replace it with another when you want to. Obviously this drive can't have any active data on
Re:How-to (Score:1)
About speed, I'm not really sure, I only have 2 drives at the moment, and nothing in the PCI 64 slots, but at least the available bandwidth wouldn't be a bottleneck. Of course, it also depends on how fast those drives are. I'm pretty sure there are drives that are noticeably faster than mine.
Re:How-to (Score:5, Informative)
For example, if it's just bad spots, then you'll want to do as many reads and writes as possible. For that, the fastest thing would be a little C program that reads and writes different patterns to the raw device linearly.
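A linear pattern test of that kind might look something like the sketch below. It is written in Python rather than C, so it will be slower than what the poster has in mind, and the names `PATTERNS` and `pattern_pass` are made up for illustration. It assumes `path` is an existing raw device node or file that you are free to overwrite, and that `total_bytes` does not exceed its size.

```python
import os

# Single-byte fill patterns: all zeros, all ones, and two alternating-bit patterns.
PATTERNS = (b"\x00", b"\xff", b"\xaa", b"\x55")


def pattern_pass(path, total_bytes, pattern, block_size=1 << 20):
    """Write `pattern` across the first total_bytes of path, then read it
    back linearly. Returns the byte offsets of blocks that did not match.
    WARNING: this overwrites whatever is stored at path."""
    block = pattern * block_size
    bad_offsets = []
    with open(path, "r+b") as dev:
        written = 0
        while written < total_bytes:
            n = min(block_size, total_bytes - written)
            dev.write(block[:n])
            written += n
        # Push the data out of Python's buffer and ask the OS to commit it.
        dev.flush()
        os.fsync(dev.fileno())
        dev.seek(0)
        offset = 0
        while offset < total_bytes:
            n = min(block_size, total_bytes - offset)
            if dev.read(n) != block[:n]:
                bad_offsets.append(offset)
            offset += n
    return bad_offsets
```

Running one pass per entry in `PATTERNS` exercises every bit in both states. Note that this sketch does not bypass the OS page cache, so on a real device the read-back may not always touch the disk; a production tool would likely use direct I/O for that reason.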
On the other hand, if the failures are tied to seeks, you'll want to write to semi-random locations on the device, to force maximum seeks. Or if you see a mix of both, then your best bet might be to follow m0rph3us0's plan, perhaps tweaking it a bit to better simulate normal filesystem efficiency (and you can just do bit compares rather than md5sums if CPU is an issue).
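For the seek-heavy case, a rough sketch is to jump to pseudo-random block-aligned offsets, write a block, and read it straight back. The function name `seek_stress` and its parameters are hypothetical, and the same assumptions apply as for any raw-device test: `path` must already exist, must be at least one block long, and its contents will be destroyed.

```python
import os
import random


def seek_stress(path, total_bytes, rounds, block_size=4096, seed=0):
    """Write and read back one block at `rounds` pseudo-random block-aligned
    offsets within the first total_bytes of path. Returns the offsets whose
    read-back data did not match. WARNING: overwrites data at path."""
    rng = random.Random(seed)          # seeded so a failing run can be repeated
    blocks = total_bytes // block_size
    bad_offsets = []
    with open(path, "r+b") as dev:
        for _ in range(rounds):
            offset = rng.randrange(blocks) * block_size
            payload = os.urandom(block_size)
            dev.seek(offset)
            dev.write(payload)
            dev.flush()
            os.fsync(dev.fileno())
            dev.seek(offset)
            if dev.read(block_size) != payload:
                bad_offsets.append(offset)
    return bad_offsets
```

As written, the read-back will often be satisfied from cache, so the head movement comes mainly from the scattered writes. Forcing a physical seek on every read would need direct I/O or a separate read pass over offsets written much earlier.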
You should also keep an eye on heat issues. The burn-in should happen at temperatures that are like what they will be in the end systems. If you pack 8 seeking drives into some cases, they'll cook. If you leave them in the open air, they might not trigger the failures you are seeing in the field. Try to match measured operating case temperature.
Oh, and don't forget to measure whether this burn-in is really helping. Take stats now, and keep tracking causes of return. It could be that the drives are sensitive to noisy power or vibration or something else that your burn-in won't catch.
Western Digital and Maxtor are equal? (Score:5, Informative)
Slashdotters! If you don't find a story interesting, please don't complain and call Slashdot lame. Just ignore the story. Do you complain to your local newspaper that they should not publish recipes because you don't cook?
Comment about the Slashdot question: The wording of the question seems to imply that you believe that Maxtor and Western Digital hard drives have an equal failure rate. That has not been my experience. My experience has been that Western Digital are the most reliable hard drives. I'm very interested to know the experience of other readers.
Several years ago Western Digital went through a bad stretch with a problem that caused high failure rates, but that was cured.
It's shocking that you are in the computer business and knowingly shipping products with a 4% failure rate. That's very expensive and annoys the customers.
However, you are on the right track. Electronic products have what is called "infant mortality": most failures occur early in life, typically in the first week. During the first 168 hours (one week), the failure rate typically falls by a factor of 100 or even 10,000. At the end of one week most failures have already happened.
It's very easy to write a program that exercises a hard drive. Just copy files back and forth from folder to folder. It is easy to write a program that fills a hard drive with files, then erases them and starts again.
The Promise Ultra133 TX2 [promise.com] supports adding four more hard drives to the 4 already supported by modern motherboards. Eight is enough for one test computer, usually, because the power supply won't support more. Be careful to use delayed start. Maybe you will need more powerful power supplies than you normally use.
Make SURE that you are not having troubles with heat. Are your drives cool when they are installed in your product? High heat will cause high failure rate.
Re:Western Digital and Maxtor are equal? (Score:2, Interesting)
With a true 400 watt power supply, you can easily power 16 drives reliably. For reference, 8 drives pull a total of about 5-6 amps on 12 V at spin-up, for about 1 second, then together use less than an amp on 12 V, and very little 5 V. This is based on testing with Maxtor 5400 rpm drives; 7200 rpm drives probably use a little more, and other brands may vary.
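Scaling those figures to 16 drives is simple arithmetic, sketched here with a hypothetical helper `spinup_budget`. It assumes spin-up current scales linearly with drive count and that all drives start at the same instant; whether a particular 400 W supply can actually deliver that current on its 12 V rail depends on the rail rating printed on its label.

```python
def spinup_budget(drive_count, amps_per_drive_spinup, volts=12.0):
    """Return (total_amps, total_watts) on the 12 V rail if every drive
    spins up at the same moment."""
    total_amps = drive_count * amps_per_drive_spinup
    return total_amps, total_amps * volts


# The comment reports 5-6 A for 8 drives, i.e. roughly 0.625-0.75 A per drive.
low = spinup_budget(16, 5 / 8)
high = spinup_budget(16, 6 / 8)
print(low, high)  # (10.0, 120.0) (12.0, 144.0)
```

So 16 such drives would briefly pull about 10-12 A (120-144 W) on 12 V, before any load from the motherboard or CPU is counted.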
Power specs given in hard disk spec sheets are mostly boilerplate and do not reflect actual power consumption, the actual consumption is usual
Intel motherboards have "Hard Disk Pre-Delay". (Score:2)
Intel motherboards have a BIOS setting called "Hard Disk Pre-Delay". The system waits for the hard drives to spin up before it tries to detect them.
Re:Intel motherboards have "Hard Disk Pre-Delay". (Score:1)
They are talking about actually delaying the spin-up sequentially to save your system from the initial power draw of the drives all spinning up at once.
First one drive starts, drawing about an amp. Then, once it is spun-up, the next one starts. This continues for each of your hard drives.
In this way you do not have a 5-10 amp draw when you turn on your system, as that is a very good way to cook your power supply.
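The effect of staggering can be estimated with a small model. Everything in this sketch is an assumption for illustration: the function name `peak_current`, the idea that a drive draws a fixed spin-up current for a fixed time and a lower fixed current afterwards, and the sample figures, which are borrowed from the measurements quoted elsewhere in this thread (about 0.75 A per drive at spin-up for about 1 second, about 0.125 A per drive once spinning).

```python
def peak_current(drive_count, spinup_amps, idle_amps,
                 spinup_seconds, stagger_seconds):
    """Peak total 12 V current when drive i starts spinning at
    i * stagger_seconds. Assumes spinup_amps >= idle_amps, so the peak
    always falls at the instant some drive starts."""
    peak = 0.0
    for k in range(drive_count):
        now = k * stagger_seconds
        total = 0.0
        for i in range(k + 1):           # drives that have started by `now`
            elapsed = now - i * stagger_seconds
            total += spinup_amps if elapsed < spinup_seconds else idle_amps
        peak = max(peak, total)
    return peak


all_at_once = peak_current(8, 0.75, 0.125, 1.0, 0.0)
staggered = peak_current(8, 0.75, 0.125, 1.0, 2.0)
print(all_at_once, staggered)  # 6.0 1.625
```

Under these assumed numbers, starting eight drives two seconds apart cuts the peak from 6 A to under 2 A, at the cost of a longer power-on sequence.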
All power supplies have overcurrent protection. (Score:2)
Yes, exactly. However, cooking the power supply is not a problem, since all power supplies have overcurrent protection. The problem is that the BIOS begins its detection process before the power supply has stabilized enough to provide the correct voltage, due to the unusual load. When the detection fails, there is an error message. So the BIOS pre-delay can be helpful.
Re:All power supplies have overcurrent protection. (Score:1)
In any case, it is more stressful on the components to surge at startup. It's not really much of an issue for servers that stay on all the time, since they probably go through the spin-up stress less than once a year, if that much.
Antec is a case maker. (Score:2)
Antec is a case maker. I have not been impressed with their power supplies. They are adequate, not wonderful, in my experience.
Re:Western Digital and Maxtor are equal? (Score:1)
Why can't you use multiple power supplies for the drives themselves? As far as I know, there is no requirement for the power supply to supply both disks and system. This would eliminate the need to have a "Spin-up" option in BIOS as the drives to be tested would already be powered up.
For example, you could easily have two additional, external power supplies and plug four drives into each. Simply power the drives up first (count to ten or whatever), then the syste
Re:Western Digital and Maxtor are equal? (Score:2, Informative)
As far as powering them up in a sequence, there is really no need to do that; you can just turn on all the power supplies with the same switch. That's a little trickier to do with ATX, but Cyberguys sells an adapter to make an ATX power supply act like an AT one with an external switch and an AT-style motherboard connector. Or, since you aren't using the motherboard connector, you could just send the power supply the on si
4%? (Score:2)
After that, I'd look at maybe trying some different manufacturers
Re:4%? (Score:2)
You haven't bought a consumer IDE hard drive in the last few years, have you? Quality has gone to the dogs.
OCTET Machine (Score:1)
IIRC, there was a feature to test the disks as they were being mastered, but we never ran the machine in this mode due to the time it took to do it.
You could do 8 disks at a time, hence the name. I did a Google search but couldn't find you a manufacturer.
It looks like an elongated cash register, with an area covered with padding to site the drives, it can be connect
Is 576 drives at once enough for you? (Score:1, Flamebait)
Sheesh -- an Ask Slashdot that's already been answered on Slashdot! Not exactly a duplicate post, but apparently the Editors aren't the only ones who don't read /.
Case solution (Score:3, Informative)
We built some disk arrays using a front-loading IDE case with drive trays. This one is pretty pricey but it's _nice_ hardware:
https://www.rackmountplus.com/spec.asp?ID=RMAC4D-IDE
That, plus a couple RAID cards (like 3ware's new 12-port cards) in a 64/66 PCI slot and bonnie++ would do a pretty good job of burning in your drives. You could flip drives in and out in a few seconds.
Why not (Score:3, Interesting)
Note: I have never implemented RAID and am not an expert, so this idea would need to be independently verified.
use iometer and an adapter (Score:1)
3Ware card + Removable Hard Drive Bays (Score:2, Interesting)
Ask one of the big boys... (Score:2)
If I'm not mistaken, they just upgraded their cabinets, so it is likely that either there are surplus cabinets around from the various manufacturers, or there's somewhere making 'em. They might be a bit expensive, but if you're
Use old AT power supplies (Score:2)
Here are a few ideas for you (Score:1)
USB, also a usable plan.
Sadly, you may need to use Windows, as Solaris just isn't right and Linux has horrible spaghetti code for this stuff. Windows, for all its oh so many faults, will let you get this up quickest.
Hitachi has an OEM tool to do this... (Score:1)
Hitachi DDD-SI [hgst.com]
Looking at the User's Guide, it looks like you could use its basic features on non-IBM/Hitachi drives. You also might want to check out the other manufacturers' sites and see if they've got something similar.