Ask Slashdot: How Do You Test Storage Media?

Posted by timothy on Tuesday April 03, 2012 @01:29PM from the give-her-some-storage-tarot-cards dept.

First time accepted submitter g7a writes "I've been given the task of testing new hardware for use in our servers. For memory, I can run it through things such as memtest for a few days to ascertain whether there are any issues with the new memory. However, I've hit a bit of a brick wall when it comes to testing hard disks; there seems to be no definitive method for doing so. Aside from the obvious S.M.A.R.T. tests (i.e. long offline), are there any systems out there for testing hard disks to a similar level to that of memtest? Or any tried and tested methods for testing storage media?"

This is what I use (Score:4, Interesting)

by Wolfrider ( 856 ) writes: <kingneutron@NOsPAm.gmail.com> on Tuesday April 03, 2012 @01:43PM (#39562363) Homepage Journal

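root ~/bin # cat scandisk
#!/bin/bash
# Non-destructive read-write scan of a hard disk.
# Usage: scandisk sdX   (device name without the /dev/ prefix)
[ -n "$1" ] || { echo "usage: $0 sdX" >&2; exit 1; }
argg="/dev/$1"
# Only for IDE disks on old kernels: 32-bit I/O, DMA on, unmask IRQs
hdparm -c1 -d1 -u1 "$argg"
# Speed up I/O by raising readahead (in 512-byte sectors) - also good for USB disks
blockdev --setra 16384 "$argg"
blockdev --getra "$argg"
# -c is the number of blocks tested at a time; other values I have tried:
#time badblocks -f -c 20480 -n -s -v "$argg"
#time badblocks -f -c 16384 -n -s -v "$argg"
time badblocks -f -c 10240 -n -s -v "$argg"
exit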
---------
Note that this reads the existing content on the drive, writes a randomized pattern, reads it back, and then writes the original content back. With modern high-capacity drives (over 500GB), you should plan on leaving this running overnight. You can do this from pretty much any Linux live CD, AFAIK. If running your own distro, you can monitor the disk I/O with ' iostat -k 5 '.
From ' man badblocks '
-n Use non-destructive read-write mode. By default only a non-destructive read-only test is done. This option must not be combined with the -w option, as they are mutually exclusive.

old timers look here (Score:2, Interesting)

by vlm ( 69642 ) writes: on Tuesday April 03, 2012 @01:48PM (#39562427)

OK so that was the noob version of the question.
I have a question for the old timers. Has anyone ever implemented something like:
1) log the time and temp
2) do a run of bonnie++ or a huge dd command
3) log the time and temp
4) Repeat above about ten times
5) numerically differentiate temp with respect to time, and also flag any "overtemps"
In theory, run from a cold or lukewarm start, that could detect a drive drawing "too much" current or otherwise being F'd up, or a cooling fan malfunction.
I'm specifically looking for rate of temp increase as in watts expended, not just static workload temp.
In practice it might be a complete waste of time.
Another one might be something like SMART-reported temp vs iostat-reported usage plotted on a scatterplot graph.
So the old timer question is has anyone ever bothered to implement this, and if so, did it do anything useful other than pad your billable hours?
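For what it's worth, the logging and differentiation steps above could be sketched in bash roughly as follows. Everything here is an assumption rather than something from the post: the temperature is taken from SMART attribute 194 in the output of smartctl -A (not every drive reports it in that form), the function names log_temp and temp_rate are hypothetical, and so are the 45 C limit and the "epoch_seconds temp_C" log format.

```shell
#!/bin/bash
# Sketch only: log drive temperature around load runs, then differentiate.
# Hypothetical names throughout; the log format is "epoch_seconds temp_C".

# Print "epoch temp" for a device, using the raw value of SMART attribute
# 194 if the drive reports one, or NA otherwise.
log_temp() {
    local dev="$1" temp
    temp=$(smartctl -A "$dev" | awk '$1 == 194 { print $10; exit }')
    echo "$(date +%s) ${temp:-NA}"
}

# Read "epoch temp" lines on stdin. Print degrees C per minute between
# consecutive samples and flag any sample above the limit (default 45).
temp_rate() {
    awk -v limit="${1:-45}" '
        $2 == "NA" { next }
        have_prev && $1 > pt {
            printf "%d %.2f C/min\n", $1, ($2 - ptemp) * 60 / ($1 - pt)
        }
        $2 > limit { printf "%d OVERTEMP %s\n", $1, $2 }
        { pt = $1; ptemp = $2; have_prev = 1 }
    '
}

# Possible use (sdX is a placeholder; dd reports on stderr, so only the
# temperature samples reach the pipe):
#   for i in $(seq 10); do
#       log_temp /dev/sdX
#       dd if=/dev/sdX of=/dev/null bs=1M count=4096
#   done | temp_rate
```

The rate is a plain finite difference between neighbouring samples, so short runs with one-degree sensor resolution will give coarse numbers; longer runs between samples should smooth that out.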

badblocks (Score:5, Interesting)

by Janek Kozicki ( 722688 ) writes: on Tuesday April 03, 2012 @02:56PM (#39563371) Journal

badblocks -c 10240 -s -w -t random -v /dev/sda1
That's my standard test for all HDDs. Note that -w is the destructive write-mode test, so it wipes whatever is on /dev/sda1.

Re:Why? (Score:5, Interesting)

by v1 ( 525388 ) writes: on Tuesday April 03, 2012 @03:18PM (#39563753) Homepage Journal

"The point is to know whether it's faulty now at the time of arrival rather than 2 weeks down the line where it becomes a problem."
I would disagree. I believe it's best to be able to identify the first moment a hard drive is starting to have problems, rather than the condition it's in when you get it.
One reason is that most of your hard drives will eventually develop a problem, and only a small fraction of the drives you buy will arrive defective.
Another reason is that nothing of value is on the new drive, so you are risking only the purchase price. A year from now, it may hold important, possibly irreplaceable or at least inconvenient-to-replace things.
I run a piece of custom software I wrote that does a slow "disk crawl", reading ~100MB every 5 minutes. Over the course of a month it has read every block on the drive, and starts over. I get an email if an I/O error OR slow performance is encountered. I store a lot here; I have somewhere around 25TB of storage under the roof at home. Over the years I've been notified ~8 times of a failing drive. In all cases I was able to replace it before it became inaccessible. One of them failed to spin up ever again the day after I removed it from service. I consider this a very good system, and am surprised not to see a similar commercial offering. (It's a 5,600-line bash script!)
SMART is only useful to possibly confirm that a drive has a problem. Only a fool relies on it to notify them when there's a problem. I've probably replaced somewhere around 750 hard drives here at work, and of those, under a dozen were still accessible and displaying a SMART failure. Many times I've had SMART toggle to failed while I was doing data recovery to a replacement drive, as I was fighting my way through I/O errors. Got some Cpt Obvious going on there I think.
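The "disk crawl" idea above can be illustrated with a much smaller sketch. This is not the poster's script. The function name crawl_step, the state file, and the defaults (100 MiB per step, 30 seconds as the "slow" threshold) are hypothetical, and it assumes a Linux box with GNU dd and stat.

```shell
#!/bin/bash
# Sketch only, not the poster's script: read one chunk of a device per call
# and remember where to resume. All names and defaults are hypothetical.

# crawl_step DEVICE STATE_FILE [CHUNK_MB] [MAX_SECONDS]
# Reads CHUNK_MB MiB starting at the offset (in MiB) saved in STATE_FILE,
# prints "ok", "slow" or "error", then advances the offset, wrapping to 0
# once the end of the device is reached.
crawl_step() {
    local dev="$1" state="$2" chunk="${3:-100}" max="${4:-30}"
    local offset=0 bytes size_mb start elapsed status=ok
    [ -f "$state" ] && offset=$(cat "$state")
    # Size in bytes: blockdev for block devices, stat for plain files
    bytes=$(blockdev --getsize64 "$dev" 2>/dev/null \
            || stat -c %s "$dev" 2>/dev/null || echo 0)
    size_mb=$(( ${bytes:-0} / 1048576 ))
    start=$(date +%s)
    dd if="$dev" of=/dev/null bs=1M skip="$offset" count="$chunk" \
        2>/dev/null || status=error
    elapsed=$(( $(date +%s) - start ))
    [ "$status" = ok ] && [ "$elapsed" -gt "$max" ] && status=slow
    offset=$(( offset + chunk ))
    [ "$offset" -ge "$size_mb" ] && offset=0
    echo "$offset" > "$state"
    echo "$status"
}
```

Something like this could be called from cron every 5 minutes, with a mail sent whenever the result is not "ok". On a real disk you would probably want dd to bypass the page cache (GNU dd has iflag=direct) so that reads actually hit the platters. At 100MB every 5 minutes a crawl covers roughly 29GB a day, or about 860GB in 30 days, so the chunk size or interval would need scaling for larger drives to finish within a month.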
