Hardware

Calculating the Mean Time Between Failures?

Blue Booger asks: "I was looking over some Fibre Channel hard drives and noticed that the Mean Time Between Failures was rated at 1.2 million hours. I thought that was pretty high, and figured it up to be close to 137 YEARS!! I went to check some regular IDE drives just for comparison, and they were rated at 500,000 hours (57 years). Now, as I understand it, this is supposed to be the average time that you can expect the drive to last before failure. I rarely have an IDE drive last more than 4 years, and my record is 10 years, so what is the deal? BTW, that is 57 years running 24 hours a day...the MTBF is rated as power-on time. Here you can find Western Digital's glossary that defines the term MTBF (pdf). Here you can find a spec sheet on one of their 20GB IDE drives. I checked, and Seagate also lists similar MTBFs. How the heck are they coming up with these numbers?"
This discussion has been archived. No new comments can be posted.

  • Duty Cycle (Score:4, Informative)

    by m0rph3us0 ( 549631 ) on Thursday June 19, 2003 @08:59PM (#6249455)
    Usually they have a duty cycle associated with an MTBF which can drastically alter the MTBF at a 100% duty cycle.
    • I NEVER had an IDE drive last more than 14 years!
      ;-)
    • Re:Duty Cycle (Score:3, Informative)

      by Blkdeath ( 530393 )
      Usually they have a duty cycle associated with an MTBF which can drastically alter the MTBF at a 100% duty cycle.
      Not to mention temperature. Read the environmental factors very carefully; if you exceed them by even 1 degree Celsius you can cut your MTBF just as drastically, if not more so.
      • or maybe they forgot to power up the drives and when they turned them on after using their time machine to move into the future they miraculously worked
    • Has the MTBF of major brand hard drives gone down in recent years like their warranties? They once had five year warranties, then three, then all at once, major manufacturers scaled their warranties back to one year. Are they cramming way too much data onto the platters, making the technology unreliable, or are they just cutting their costs at the expense of customers? Their shortening of warranties to one year seemingly all at once smells like collusion to me, which violates anti-trust laws. The FTC should
      • Re:Duty Cycle (Score:3, Informative)

        by itwerx ( 165526 )
        That 5-year warranty almost put Western Digital out of business when they all started failing at the 4-year mark!
        No, I'm not kidding. Some heads rolled over that...
  • by Anonymous Coward
    Just make Some Wild Ass Guess (SWAG).

    Like, my hard drive has an MTBF of 300,000 hours.
  • not just drives... (Score:4, Interesting)

    by ryanmoffett ( 265601 ) on Thursday June 19, 2003 @09:04PM (#6249492)
    Cisco used to sell Catalyst 3548XL switches that were listed as having an MTBF of 120,000+ hours. Their current replacement for that line (the 3550) comes in at 163,000+ hours. We had 7 of 24 3548XL switches fail in the first year we had them. They had poor air flow from a tiny fan, no heatsinks, and tons of hot chips. The newer model has the same issue, though they did stuff a cheap foam baffle in the case to get air to flow closer to the chips, none of which have heatsinks. I have no idea how they tested them and got an MTBF of 13 years.
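
    (A rough back-of-the-envelope check, as a small Python sketch; it assumes all 24 switches ran 24x7 for that first year, which is only an approximation:)

      # Observed-MTBF estimate for the 3548XL fleet described above; illustrative only.
      switches = 24
      hours_per_year = 24 * 365            # ~8,760 hours of continuous operation
      failures = 7

      unit_hours = switches * hours_per_year      # ~210,000 switch-hours
      observed_mtbf = unit_hours / failures       # ~30,000 hours, i.e. ~3.4 years

      print(f"Observed MTBF: {observed_mtbf:,.0f} hours "
            f"({observed_mtbf / hours_per_year:.1f} years) "
            f"vs. the rated 120,000+ hours (~13.7 years)")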
    • My lab has about five each of the 3524, 3548, and 3550-24, average age 2.5 years, and no failures (hardware or crashes). Other offices I know of have had similar experiences with the 3500 series.

      Either you just happened to get a bad batch, or you've got environmental problems. Make sure there is sufficient air flow around the units, and check the power harmonics on the circuit. Most consumer-grade (read: cheap) electronics use crappy power supplies which cause harmonics on the power line. One or two isn't a big
  • by Anonymous Coward on Thursday June 19, 2003 @09:05PM (#6249500)
    Sure, the test engineers sit and rub their chins and write numbers on paper and do stupid tests in the lab, but in the end it comes down to this:
    • WD Guy 1: Hey, what's the MTBF for our new drive?
    • WD Guy 2: Dunno, what's Maxtor saying?
    • WD Guy 1: Sez here "300,000 hours".
    • WD Guy 2: Okay, ours is 500,000 then.
    • WD Guy 1: I smell a NEW VICE PRESIDENT.
  • You are wrong (Score:4, Informative)

    by Mensa Babe ( 675349 ) on Thursday June 19, 2003 @09:06PM (#6249502) Homepage Journal

    I rarely have an IDE drive last more than 4 years, and my record is 10 years, so what is the deal?

    If you have twenty drives with a twenty-year MTBF (Mean Time Between Failures) each, then you have one failure per year on average. Those are basic statistics, always fighting against you.

    • I might add that when I was contracted at a server farm, people there used to celebrate every day when there was no hardware failure, and the record was four such days in one month. But have they complained that the producers of hardware were lying to them, stating years of MTBF? No. And that's because they knew the basics of mathematics and knew how to use the FDIV opcode in their brains. The only solution is redundancy.
      • by NickDngr ( 561211 ) on Thursday June 19, 2003 @09:27PM (#6249628) Journal
        DISCLAIMER: The views expressed hereafter are not necessarily those of MENSA, which I am only a member of.

        Shouldn't that be "The views expressed hereafter are not necessarily those of MENSA, of which I am only a member." I would think proper grammar usage would be a prerequisite for being a MENSA member.
        • by The Clockwork Troll ( 655321 ) on Thursday June 19, 2003 @10:03PM (#6249889) Journal
          I would think proper grammar usage would be a prerequisite for being a MENSA member.
          <input type="radio" name="gift" value="IQ" disabled>
          <input type="radio" name="gift" value="money" disabled>
          <input type="radio" name="gift" value="penis size" disabled>
          <input type="radio" name="gift" value="ability to nitpick trivia" checked>
        • After reading the original poster's (Mensa Babe) Slashdot blurb, I have come to the conclusion that she must be a very intelligent girl with a lot of anger towards all males with their brains in the wrong place.

          As for the language bit, a lot of really intelligent people are totally dyslexic when it comes to grammar and spelling, since it makes no sense anyway IMHO :)
    • No, you are wrong (Score:4, Insightful)

      by anthony_dipierro ( 543308 ) on Thursday June 19, 2003 @09:25PM (#6249619) Journal
      Actually, you are wrong... If you have one drive fail per year for 20 years, then the mean time between failures is 10.5 years.
      • Uh, how do you mean? (literally)

        Is there some special definition for MTBF that changes how "mean time between" is interpreted?

        • A mean is an average. The average of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, and 20 is 10.5.
          • So you are using the average of the times measured between deployment of hardware and individual drive failure, as opposed to the mean time between failures of individual drives, i.e. the mean time between necessary hardware replacements, which is what I would have thought more useful for evaluating hardware.

            I would have called your average "mean time to failure" vs. "mean time between failures."

            But as other posters mentioned, most of these stats are marketing bunk no matter how they're computed!

            • Your interpretation is correct, assuming the exponential distribution (which is the common assumption.)
              • Do drive failures tend to arrive in something resembling a Poisson manner, in practice?
                • Yes. Strictly -- as someone else pointed out -- the failures tend to have a so-called "bathtub" distribution. That is, there's a high failure rate at first ("infant mortality") followed by a long Poisson/Markovian/exponential (you pick your term) stretch, followed by a higher failure rate as it gets old. In general, the "lifetime" of a component is the time to the inflection at the end of the exponential portion.
            • I would have called your average "mean time to failure" vs. "mean time between failures."

              Or better yet, "mean time before failure," thus preserving the acronym.

          • Yeah, but you can have one for 0 years, too. :)
        • Is there some special definition for MTBF that changes how "mean time between" is interpreted?

          If by some special definition you mean a simple linear multiplication (or division, depending on your point of view), then yes. Anthony Dipierro was probably mistakenly thinking about decibels or some other logarithmically scaled unit system.

      • Actually, you are wrong... If you have one drive fail per year for 20 years, then the mean time between failures is 10.5 years.

        If I have twenty drives, each of which is estimated to fail once in a twenty-year period, then in such a twenty-year period every one of those disks is estimated to fail once. That is twenty failures in a twenty-year period on average, id est one failure per year. It is actually a matter of very simple mathematics.

        • If I have twenty drives, each of which is estimated to fail once in a twenty-year period, then in such a twenty-year period every one of those disks is estimated to fail once.

          No, you're wrong. The average time to failure is 20 years, not the maximum.

          The average of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, and 20 is 10.5.

          • Right, she's just making the assumption that the drives were all bought at different times.

            That might be valid when you work in a datacenter that replaces x number of drives every year.

            On average, if the MTBF of those drives is 20 years, one drive in such a group will fail every year.
        • Re:Yes, I am right. (Score:2, Informative)

          by iamroot ( 319400 )
          Mean Time Before Failure is the MEAN time before the disk would fail.

          If they all failed within 20 years, how could the average time to failure be 20 years???

          MTBF (Mean Time Between Failures) - Average time (expressed in hours) that a component works without failure. It is calculated by dividing the total number of operating hours observed by the total number of failures. Also, the length of time a user may reasonably expect a device or system to work before an incapacitating fault occurs.

          20 drives/2

      • Someone mod up the parent; they are correct. Mensa Babe, despite her MENSA membership, is wrong. The MTBF of 20 drives, one failing each year, is (Sigma(n=1 to 20){n})/20, which is 10.5 years.

        A quick thought experiment should make it obvious that for a series of numbers (e.g. the number of years between failures) where the highest number is n, the mean of those numbers can never be n.
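
        (A quick Python check of that arithmetic, purely illustrative:)

          # One of 20 drives fails each year for 20 years:
          times = range(1, 21)                # failures at year 1, 2, ..., 20
          print(sum(times) / len(times))      # 10.5 years, not 20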
        • Urg... with the proviso that at least one of the numbers in the series is not equal to n.
        • Hmm... Actually I just thought of something. If you have 20 hard drives, and one fails each year, but you fix it immediately after failure and then put it into service again, then the MTBF would be 20 years.

          I guess that's why the term is the mean time between failures rather than the mean time before failure. Back when the term was invented it probably made economic sense to fix a hard drive when it breaks. Nowadays we're more likely to just throw it away.

          • nice try, but if something fails and you then fix it, it has still failed. :)
            • Yeah, but the total operating time for each drive is 20 years if you fix them and they last. What I'm saying is, one drive breaks after 1 year, you fix it and it lasts the next 19, one drive breaks after 2 years, you fix it and it lasts the next 18, etc.

              Alternatively, you could have 19 of the drives last 20 years, and the other one break once a year.

              Actually, you wouldn't even have to fix them, if you replace them. If you keep 20 drives for 20 years, and one breaks every year, if you replace the one th

              • if it fails after a year, and you fix it and it lasts another 19 years, the MTBF is (1+19)/2 = 10. It's failed twice on you.
                • if it fails after a year, and you fix it and it lasts another 19 years, the MTBF is (1+19)/2 = 10. It's failed twice on you.

                  No, I never said it failed a second time. I said it failed after a year, then it lasted another 19 years. Then you stopped the experiment.

                  • Anthony,

                    I think you are right in this case, even with that consideration.

                    Since every drive failed once within a 20-year period, then we have all of our data points on the lower half of the normal curve (with the mean of 20 being in the middle), and none above. Thus, the mean for this part of the experiment would actually be around 10 years. However, if we were to continue the experiment out to 40 years or longer, and gain more samples, then it might balance out, but those repaired drives would have to be
                    • Well, here's the thing. The way the numbers work, you don't have to wait to see each drive fail before calculating the MTBF. As soon as one drive fails, you can make a calculation. In fact, that's one of the reasons that hard drives usually have unrealistically high MTBFs. They test 500,000 drives for 1 hour, and only see 1 failure, so they call the MTBF 500,000 hours. But in reality the expected lifetime of a hard drive is not so cut and dry. It might have very low probability of failing after 1 hour
                    • That's the difference between "statistical inference" and what people who have had a probability class tend to do. If you've got 500,000 drives, test for an hour, and get one failure, you've got some evidence that the failure rate is 1 per 500,000 hours (== MTBF of 500,000 hours in the exponential case.) But the confidence in that estimate is very low. If you then replace the failed drive and do another 1 hour test, and get another single failure, then you still can estimate the MTBF as 500,000 hours --
                    • That's the difference between "statistical inference" and what people who have had a probability class tend to do.

                      I don't know if it matters as much whether or not the person has had a probability class as whether or not the person is trying to legally boost MTBF ratings.

                      Sure, you can get better figures using different methods, but those better figures most likely will be lower, so why bother if you want to sell your product.

                    • Anthony, that's not really true. First of all, for a long part of the lifetime, exponential is a very good approximation. I can't cite it offhand, but I read a paper that showed it was better than 95% accurate. Secondly, the Bayesian method will give measures that you can advertise as correct (and people like Telcordia are very stubborn about it) with a relatively short and inexpensive experiment.

        • Whatever your math may say, the Industry's standard is the one you disagree with. If they stick 1000 drives in an array, run them for 1 year, and only a single drive fails during that year, the MTBF for that model of drive is 500 years.
          • *sigh* how did you get modded as informative?

            The original post that started this thread (from Mensa Babe) stated "run 20 drives with an MTBF of 20 years, and one fails each year"; a poster replied to say that was incorrect, since it would mean the drives had an MTBF of 10.5 years. And I replied to say the corrector was right (which they are).

            Your example is for 1000 drives, running for 1 year, with 1 failure. Going by the same math which I used to back up the aforementioned corrector, i.e. MTBF = Sigma(runtime)/failures, which is
    • No, You are wrong (Score:1, Informative)

      by Anonymous Coward
      The discrepancy emerges because you do not operate the drives past their end-of-life when you make the MTBF calculation. To illustrate, assume you have a particular model of hard drive with an MTBF of 57 years, and it reaches end-of-life after 5 years. What you can then conclude is that if you replace the drive every 5 years with a new drive (of the same model and MTBF), then you can expect your first failure at 57 years. Keep in mind that this is really just the most probable time for a failure to occur.
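
      (A minimal Python simulation of that argument; it assumes a constant failure rate, i.e. exponential lifetimes, and the 57-year MTBF and 5-year end-of-life are just the illustrative figures from this post:)

        import random

        # Sketch: drives with a 57-year MTBF are retired and replaced every 5 years.
        # Under a constant-failure-rate assumption, the expected time until the
        # first failure you actually see still lands near the MTBF.
        mtbf_years, eol_years, trials = 57.0, 5.0, 50_000
        random.seed(1)

        total = 0.0
        for _ in range(trials):
            elapsed = 0.0
            while True:
                life = random.expovariate(1.0 / mtbf_years)  # this drive's lifetime
                if life < eol_years:        # it fails before the planned swap-out
                    elapsed += life
                    break
                elapsed += eol_years        # retired healthy; the next drive goes in
            total += elapsed

        print(f"Mean time to first observed failure: {total / trials:.1f} years")  # ~57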
    • First, as already pointed out above, if you have 20 drives with a simple MTBF of 20 years (as you are alluding to), then a single failure of one drive each year for twenty years yields a mean of around 10 years.

      Second, manufacturer-specified MTBF has nothing to do with a simple, observational mean. It is a calculated statistic based on a statistical sample and extrapolated over a normal distribution. In real-life terms, it is close to meaningless.

      Basically, it does not mean "Mean Time Between observed Failur
      • Actually, let me clarify that.

        If you are assuming that a drive is repaired and placed back into service, and then fails again after another random 1-20 year period, then you would be correct.

        I think the problem is that most of the geeks think in terms of practicality, and treat the situation as MTTF ("Mean-Time-To-Failure") instead of MTBF, since we all know that MTBF is BS anyway.

        As a result, I clarify the statement above, myself assuming "simple MTTF", not "simple MTBF".
  • by HotNeedleOfInquiry ( 598897 ) on Thursday June 19, 2003 @09:09PM (#6249527)
    First they specify a sample period, perhaps a year. Then they multiply the number of units shipped during that time by the estimated hours per year that the drives are run, then divide by the number of units returned due to failure.

    For example, they shipped 2 million drives last year, each ran 2,080 hours (8 hours a day, 5 days a week, 52 weeks), roughly 4.2 billion hours total. Out of those 2 million units, they got 3,466 returns. So the average MTBF was 1.2 million hours.
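
    (The same arithmetic as a short Python sketch, using the hypothetical shipment and return figures above:)

      # Field-return MTBF estimate from the made-up numbers in this post.
      units_shipped = 2_000_000
      hours_per_unit = 8 * 5 * 52          # 2,080 hours of estimated use per year
      returns = 3_466

      total_unit_hours = units_shipped * hours_per_unit     # ~4.16 billion hours
      print(f"Estimated MTBF: {total_unit_hours / returns:,.0f} hours")   # ~1.2 million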

  • by aaarrrgggh ( 9205 ) on Thursday June 19, 2003 @09:12PM (#6249552)
    If they run 500 drives for 2,000 hours each and observe only one failure, that is an MTBF of 1,000,000 hours.

    Unfortunately, that equation doesn't take into account the fact that some equipment degrades over time; if a product is very reliable for 1,000 hours, and less reliable after that, just double the sample size (maybe triple for statistics), and see what you get.

    Real reliability calculations are much more involved than what most users think MTBF means...
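
    (A rough Python sketch of that objection: if lifetimes follow a wear-out law, here a Weibull with made-up parameters rather than anything from a data sheet, a short test grossly overstates the MTBF compared to the actual mean lifetime:)

      import random

      # Illustrative only: lifetimes drawn from a Weibull wear-out law
      # (shape > 1 means the failure rate climbs with age); parameters are made up.
      random.seed(2)
      shape, scale_hours, n = 3.0, 30_000, 10_000
      lifetimes = [random.weibullvariate(scale_hours, shape) for _ in range(n)]

      # Naive estimate from a short 2,000-hour test: unit-hours / failures.
      test_hours = 2_000
      early_failures = sum(1 for t in lifetimes if t <= test_hours)
      naive_mtbf = n * test_hours / max(early_failures, 1)

      print(f"Naive MTBF from the short test: {naive_mtbf:,.0f} hours")
      print(f"Actual mean lifetime:           {sum(lifetimes) / n:,.0f} hours")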
    • Yup.

      MTBF might have a little bit of value as a relative measure -- i.e. perhaps drives with an order of magnitude higher MTBF will last longer. It's a lot less useful as an absolute measure.

      Lots of equipment (not just hard drives) has some sort of estimated lifetime. The best the manufacturer can ever do is estimate under semi-realistic conditions and extrapolate.
  • Labs (Score:4, Insightful)

    by MazTaim ( 1376 ) <taim@@@nauticom...net> on Thursday June 19, 2003 @09:15PM (#6249571) Homepage Journal
    That's the key word.

    MTBF is probably determined by taking a bunch of drives and putting them into PERFECT conditions that NEVER exist in the real world. They run them in a way that, although it tests all the functionality, really doesn't provide true conditions for drives (i.e. the head always reading/writing up and down the disk, probably never seeking; disks always spinning; etc.). Something that drives never do in real life. Statistics...statistics...statistics... (speeling too :)
  • Marketing BS... (Score:4, Insightful)

    by Alomex ( 148003 ) on Thursday June 19, 2003 @09:23PM (#6249613) Homepage
    Anybody who has a large number of drives running knows that the figures have become meaningless over time. They used to predict to a T the expected time of failure. They are now a marketing term assuming "a duty cycle" and computed by an absurd "units x time to failure". Using that system, the MTBF of the Honda Civic engine is 100,000 years, as there are 1 million Civics out there and none of them had their engine seize up in the first month.

    Somebody ought to sue them for deceptive advertising.
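
    (A small Python sketch of how that "units x time to failure" arithmetic produces the absurd figure; the counts are the made-up ones from the analogy above:)

      # Illustrative only: the "units x time" trick applied to the Civic analogy.
      cars = 1_000_000
      observation_years = 1 / 12        # engines watched for one month
      failures_seen = 1                 # pretend we saw at most one engine seize up

      car_years = cars * observation_years        # ~83,333 car-years
      print(f"'MTBF' of a Civic engine: ~{car_years / failures_seen:,.0f} years")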

    • I'm not picking. REALLY. It's just too funny that someone complaining about bad mathematics in the thread, and the C compiler, in the same post, would live on a planet with a 10-month year.

      float y = (float) 1/2 * x; // should yield better results.
    • I had a VW Jetta that blew its engine when it was a few weeks old. One of the pistons disintegrated due to a defect in manufacturing. The dealer sent the parts back to the manufacturer for failure analysis. It's rare, but such things do happen.
  • by HaloZero ( 610207 ) <protodeka@@@gmail...com> on Thursday June 19, 2003 @09:25PM (#6249620) Homepage
    Calculating the Mean Time Between Failures?
    I prefer to measure time from the emergence of one integral anomaly to the next.
  • MTBF... (Score:4, Insightful)

    by m0rph3us0 ( 549631 ) on Thursday June 19, 2003 @09:28PM (#6249631)
    The best way to determine the *REAL* MTBF is how long the drives are warrantied for; no one warranties a product for longer than it is supposed to last. When you see a company reduce its warranty, expect quality to drop accordingly.
  • by crmartin ( 98227 ) on Thursday June 19, 2003 @09:40PM (#6249698)
    You know, it's almost a shame to screw up the amusing notions /.ers come up with by adding actual information, but I can't help it, all those years of teaching I guess.

    Okay, first of all: "mean time between failures" is obviously a statistical measure -- it is an average over a large number of individual items. In most electronic components (including light bulbs!) the statistical distribution of the time between failures is the exponential distribution [wolfram.com], which has the odd property that it's "memory-less" -- it doesn't matter how long since the last failure it's been, the mean time to the next failure will still be the same. A consequence of this is that if the MTBF is 10,000 hours, the probability of failure in any particular hour would be 1/10,000th. So, if you set up 10,000 components, all running simultaneously, you'd expect one of them to fail within the first hour; conversely, if you ran them for 1000 hours, and 998 of them failed, you could be fairly certain that the MTBF would be around 10,000 hours.

    Note, by the way, that this is only true when the failure time distribution is exponential -- so it works for electronic components, but not for, say, bicycles and cars and roller skates, which are more likely to fail the older they get.
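
    (A minimal Python simulation of the example above, assuming exponential lifetimes; the 10,000-hour MTBF and fleet size are just the figures from the previous paragraph:)

      import random

      # 10,000 components with a true 10,000-hour MTBF and exponential
      # (memoryless) lifetimes, estimated by the usual unit-hours / failures.
      random.seed(3)
      true_mtbf, n = 10_000.0, 10_000
      lifetimes = [random.expovariate(1.0 / true_mtbf) for _ in range(n)]

      print("Failures in the first hour:", sum(1 for t in lifetimes if t <= 1))  # ~1

      test_hours = 1_000
      failures = sum(1 for t in lifetimes if t <= test_hours)
      unit_hours = sum(min(t, test_hours) for t in lifetimes)
      print(f"Estimated MTBF after 1,000 hours: {unit_hours / failures:,.0f} hours")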

    This has an obvious problem, of course: if the MTBF is high, it can take forever to test. Consider, for example, something I worked on for NASA some years ago: trying to prove that a fly-by-wire system will have a mean time between failures of 1e10 hours. (This is about the same failure rate as the airframe, which is how they came up with the number.) 1e10 hours is about 1.141 million years, by the way.

    (Pop quiz: if the MTBF is a million years, how do you explain the occasional airframe failure, say TWA 800? Hint: It doesn't require any foul play.)

    At that point, you've got a couple of choices: first, you can make a lot of copies and run them simultaneously. Relatively easy for $50 disks, hard for billion dollar 747s.

    Second, you can make the estimate by computation and modeling which is what you do for web systems. Conceptually, it's pretty simple to do this, although it can be a kind of pain in the ass.

    The third way, which is new and cool, is by Bayesian estimation of failure rates. This method lets you make increasingly accurate estimates of the failure rate based on short experiments. I don't have time to go into it, but there are some good sources available on the web. [google.com]
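
    (A very rough Python sketch of the Bayesian idea, using a Gamma prior on the failure rate, which is conjugate to the exponential model; the prior and test numbers are made up for illustration:)

      # Bayesian update of a failure rate from a short test.
      # Model: exponential lifetimes; Gamma(a, b) prior on the rate (per hour).
      a_prior, b_prior = 0.5, 100_000.0     # weakly informative, made-up prior

      # Short experiment: 500 drives run 2,000 hours each, 1 failure observed.
      failures, unit_hours = 1, 500 * 2_000

      # Gamma-exponential conjugacy: posterior is Gamma(a + failures, b + unit-hours).
      a_post = a_prior + failures
      b_post = b_prior + unit_hours

      rate_estimate = a_post / b_post       # posterior mean failure rate per hour
      print(f"Posterior mean MTBF: {1.0 / rate_estimate:,.0f} hours")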
    • Actually, here are some more references: at CiteSeer [google.com], a good (if expensive) book on practical examples [amazon.com], and my favorite textbook [amazon.com]. I'll shut up now.
    • AMEN! I distinctly regret that I don't have mod points right now. I have an ABD in Economics and years of work experience doing econometrics, and this analysis nails how the MTBF calculation is done exactly.

      One quibble I have (more with the HD manufacturers, not crmartin) is that HDs have mechanical components (spindle, actuator, etc.) that are subject to wear. As a result, MTBF calculations that are appropriate for solid state electronic equipment not subject to physical wear are likely inappropriate for H
    • Of course, the real problem is that neither electronics nor mechanical items have an exponential failure curve.

      Mechanical items tend to fail due to wear-out - i.e., they become more likely to fail as time goes on.

      Electronics follow a "bathtub" curve: a high initial rate that rapidly drops to a VERY low rate. It stays at that LOW rate of failure for MANY hours, and then the rate increases rapidly during "wear-out" - sort of like the cross-section of a bathtub - hence the name.

      The whole concept of "Burn In" -
      • Sure, but in a "bathtub" distribution it's approximately exponential over most of the lifetime, so it's a decent approximation.
        • Yes, it's exponential in the "Non-interesting" part of the curve. When you look at total failures (as a percentage), less than 10% of the total parts will fail during the "Bottom" of the curve, and the rest are fairly evenly split between the 2 sides. It's one of the reasons that companies can offer extended warranties on electronics as cheaply as they do, and that those warranties are their greatest profit center.

          Buy the product, use and abuse it during the original warranty period, and it'll break if it's going to
          • I'm not quite sure why you write as if we're arguing, since I don't think we are -- except to the extent that you don't think the lengthy period of Markovian behavior is interesting, while I think that's the most interesting part.

            The point is this: the MTBF is computed for the Markovian part of the total life. The value for MTBF is computed as 1 / failure-rate IF AND ONLY IF the time distribution of failures is Markovian -- otherwise it's a more complicated function. The useful life is the length of time ov
    • Pop quiz: if the MTBF is a million years, how do you explain the occasional airframe failure, say TWA 800? Hint: It doesn't require any foul play.

      Let me try (please reply if I am in the right direction):

      1. This one-in-a-million-years figure does not count cases where the airplane is burning, or some other component has failed.
      2. The airplane has lots of components. Suppose that if a door fails, this could lead to failure of the airframe. If the plane has 3 doors, that works out to 4 failures in a million years. LOTS of comp
      • "There are lots of planes and lots of years" is the right answer: I don't recall the exact figures right now, but at the time of the TWA 800 crash I predicted that it would turn out to be an airframe failure (on the heuristic of preferring failure to malice) because when I worked the numbers it turned out that MTBF of 747s was about that same 1e10.

        By the way, your supposition about answer (1) is correct. It's just a definitional thing: we're really talking about proximate and root causes. You don't count
  • by cookd ( 72933 ) <.moc.onuj. .ta. .koocsalguod.> on Thursday June 19, 2003 @10:37PM (#6250097) Journal
    Whatever happened to *NICE* time between failure?
  • by anthony_dipierro ( 543308 ) on Thursday June 19, 2003 @10:46PM (#6250152) Journal

    I went to check some regular IDE drives just for comparison, and they were rated at 500,000 hours (57 years). Now, as I understand it, this is supposed to be the average time that you can expect the drive to last before failures. I rarely have an IDE drive last more than 4 years, and my record is 10 years, so what is the deal?

    Let's say I have a drive that has a 99% chance of failing after 10 years, and a 1% chance of failing after 4710 years. The MTBF is 57 years.

    In fact, with the proper distribution (think 2^n) you could have an infinite MTBF, but still have a 99% chance of failure within 10 years. See for example the St. Petersburg paradox [stanford.edu].
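
    (The arithmetic behind that example, as a small Python sketch:)

      # A mixture whose mean is 57 years even though 99% of drives die at 10 years:
      print(0.99 * 10 + 0.01 * 4710)   # 57.0

      # A St. Petersburg-style lifetime: P(life = 2**n years) = 2**-n for n >= 1.
      # Every term adds 1 year to the mean, so the mean diverges, yet the chance
      # of lasting more than 8 years is only 1/8.
      print(sum((2 ** -n) * (2 ** n) for n in range(1, 51)))   # 50.0, and still growing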

  • by Compact Dick ( 518888 ) on Friday June 20, 2003 @08:36AM (#6252448) Homepage
    I'm sure our resident expert [slashdot.org] is more than willing to help.
  • Design Lifetime (Score:3, Interesting)

    by Detritus ( 11846 ) on Friday June 20, 2003 @09:00AM (#6252611) Homepage
    MTBF figures are usually associated with a design lifetime. That hard drive may have a 300,000 hour MTBF based on a 100% duty cycle and a 5 year design lifetime. That tells you the expected failure rate for the first 5 years of operation. After that point, the failure rate may increase rapidly.
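
    (A small Python sketch of what such a figure implies within the design lifetime, assuming a constant failure rate over that window:)

      import math

      # A 300,000-hour MTBF at 100% duty cycle implies roughly this chance of a
      # given drive failing inside its 5-year design life (constant-rate assumption):
      mtbf_hours = 300_000
      design_life_hours = 5 * 24 * 365          # ~43,800 hours

      p_fail = 1 - math.exp(-design_life_hours / mtbf_hours)
      print(f"Chance of failure within the design life: {p_fail:.1%}")   # ~13.6%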
  • I think we're looking too deeply into this concept. The Western Digital definition for MTBF [wdc.com] says the MTBF "is calculated by dividing the total number of operating hours observed by the total number of failures. Also, the length of time a user may reasonably expect a device or system to work before an incapacitating fault occurs."

    This means that they hook a whole bunch of drives up and run them for a while, add the total hours of drive operation up, and divide by the number of failures. Their estimate of
  • I guess it depends on what you consider failure, but if it really fails then it can only happen once! The next failure is never coming 'cause there's no way it can fail again so MTBF = infinity
  • MTBF = (drives * hours run) / failures

    Where:
    failures = actual number of drives RETURNED to the manufacturer
    drives = total number of production drives built
    hours run = actual power-on time to the failure point (about 5 years)
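
    (Plugging made-up numbers into that formula, as a Python sketch:)

      # Illustrative numbers only; none of these come from a real data sheet.
      drives_built = 1_000_000          # total production drives
      hours_run = 5 * 24 * 365          # ~5 years of power-on time each
      returned = 30_000                 # drives returned to the manufacturer

      print(f"MTBF: {drives_built * hours_run / returned:,.0f} hours")   # ~1.46 million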
  • For most items, the MTBF is not how long you can expect an item to operate without failures. For most MTBF calculations, half of the units will fail before 33% or so of the MTBF. (I don't have the derivation of this number in front of me, but I can probably dig it up if somebody wants it.)

    As far as the high MTBFs mentioned by the submitter, I can think of at least two perfectly valid methods that accurately determine long MTBFs:

    First, you have the theoretical MTBF. This is where you look at the chan
