How Accurate and Precise is libm.a?
Chad asks: "I am looking into doing some molecular modeling on Linux platforms because they are cost effective. After running some tests, I find errors, albeit small, in the results as compared to similar tests on SGIs or SUNs. I have heard about this in the past when talking to some professors, but I never thought much about it until now. Knowing that errors propagate and grow (especially after weeks of computation), I want to know what I can do to avoid problems with the math library. Is it a problem? Can it be fixed? Am I overreacting?" Can anyone offer up some information on this? Has anyone actually stress-tested libm.a?
paranoia (Score:1)
http://www.netlib.no/netlib/paranoia/paranoia.c
On my Solaris 7 box, it says:
No failures, defects nor flaws have been discovered.
Rounding appears to conform to the proposed IEEE standard P754.
The arithmetic diagnosed appears to be Excellent!
On my Linux box, it says:
The number of DEFECTs discovered = 1.
The number of FLAWs discovered = 1.
The arithmetic diagnosed may be Acceptable despite inconvenient Defects.
Run it yourself!
It's been a while, but... (Score:1)
Scary results / Re:paranoia (Score:1)
No optimization:
1 defect, 1 flaw
-O/-O1:
Everything goes completely insane, program locks
-O2/-O6:
Several things go wrong, program locks up later than with -O/-O1
I knew optimization was bad... but I never expected anything remotely like this... Why doesn't egcs have a -safe- optimization level?
Re:Floating point (Score:1)
When I would do "something", evaluate a cost function, and then undo that "something" (because of an increase in cost), the cost function after the undo would be equal to the original cost function (before the "something") more often on Linux than on Solaris.
I suppose you run Linux on an x86 processor. The x86 FPU uses 80-bit floating point numbers internally. That might be the reason why Linux is more accurate. You can force an x86 program to calculate with 64-bit numbers if you write each intermediate result to memory and load it again. But that's not very efficient.
Take a numerical analysis class! (Score:2)
Let's assume (for discussion) that Solaris is more accurate than Linux. That does not make Solaris accurate enough.
I did poorly in my numerical analysis class, but I do remember a couple of things: rounding errors (which floating-point arithmetic almost always has) compound greatly. Often a calculation can yield a result that, after analysis, has zero significant digits. That is, you could take your result from a random number generator and have just as much confidence in the answer. This, of course, depends on the calculations involved.
Another point the professor made is that adding more data to your real-world sample can increase the error beyond all bounds. (This was about fitting a polynomial to some curve: basically, even if the curve is right, there may be an equation for it, but not a polynomial. In that case it can be proven that no function exists to tell you how far off you are when calculating a data point.)
In summary, differences between what Solaris calculates and what Linux comes up with should not be your first concern. First you should understand all the errors involved, and most of them are fundamental to any computer with a finite number of bits.
x86 FP (Score:2)
Re:paranoia (Score:2)
No failures, defects nor flaws have been discovered.
Rounding appears to conform to the proposed IEEE standard P754,
except for possibly Double Rounding during Gradual Underflow.
The arithmetic diagnosed appears to be Excellent!
Software or hardware? (Score:2)
My suggestion is possible, but likely a bit of work. Try installing Solaris/x86 on your Linux box, or Linux/Sparc on your Solaris machine. See how the math goes there.
For more documentation, see "Cray instability" in the Jargon File [jargon.org].
Re:paranoia (Score:2)
Checking rounding on multiply, divide and add/subtract.
* is neither chopped nor correctly rounded.
/ is neither chopped nor correctly rounded.
Addition/Subtraction neither rounds nor chops.
Sticky bit used incorrectly or not at all.
FLAW: lack(s) of guard digits or failure(s) to correctly round or chop
(noted above) count as one flaw in the final tally below.
and
Testing X^((X + 1) / (X - 1)) vs. exp(2) = 7.38905609893065218e+00 as X -> 1.
DEFECT: Calculated 7.38905609548934539e+00 for
(1 + (-1.11022302462515654e-16) ^ (-1.80143985094819840e+16);
differs from correct value by -3.44130679508225512e-09 .
This much error may spoil financial
calculations involving tiny interest rates.
I can't tell if this is a defect and flaw of my Pentium chip, the x86 FPU in general or of libm.
Was your test on Solaris 7 running on Intel or SPARC?
Anomalous: inconsistent with or deviating from what is usual, normal, or expected
Floating point (Score:3)
At my company we develop some sort of tool, and we build Linux and Solaris versions. The nature of the tool is that it does a lot of floating-point number crunching, and we'd like the Linux and Solaris versions to give the same output for the same input. Unfortunately, there are discrepancies between the floating-point operations, and we have to do rather crude workarounds to try to suppress this tendency. In debugging, I've noticed that the discrepancies between the two systems often amount to multiples of seven times the smallest floating-point increment (for a given exponent). For many purposes this is OK, but if your program is basing decisions on the values of floating-point numbers, you could easily go one way on Linux and the other on Solaris.
What's really interesting is something I noticed... When I would do "something", evaluate a cost function, and then undo that "something" (because of an increase in cost), the cost function after the undo would be equal to the original cost function (before the "something") more often on Linux than on Solaris.
Again, I don't really know what all contributes to the FP operations, so it may be more dependent on hardware than the libs.