## Ask Slashdot: How Reproducible Is Arithmetic In the Cloud?

Posted by timothy

from the irreproducible-results dept.


goodminton writes

*"I'm research the long-term consistency and reproducibility of math results in the cloud and have questions about floating point calculations. For example, say I create a virtual OS instance on a cloud provider (doesn't matter which one) and install Mathematica to run a precise calculation. Mathematica generates the result based on the combination of software version, operating system, hypervisor, firmware and hardware that are running at that time. In the cloud, hardware, firmware and hypervisors are invisible to the users but could still impact the implementation/operation of floating point math. Say I archive the virutal instance and in 5 or 10 years I fire it up on another cloud provider and run the same calculation. What's the likelihood that the results would be the same? What can be done to adjust for this? Currently, I know people who 'archive' hardware just for the purpose of ensuring reproducibility and I'm wondering how this tranlates to the world of cloud and virtualization across multiple hardware types."*
## Fixed-point arithmetic (Score:5, Informative)

Use Fixed-point arithmetic.

In Mathematica make sure to specify your precision.

Look at 'Arbitrary-Precision Numbers' and 'Machine-Precision Numbers' for more information on how Mathematica does this.
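To make the fixed-point suggestion concrete, here is a minimal sketch in Python rather than Mathematica. The `SCALE` constant and the helper names are invented for this illustration only; the idea is simply that values are stored as scaled integers, so addition is exact and does not depend on the FPU.

```python
from decimal import Decimal

# Hypothetical scale for this sketch: six decimal digits after the point.
SCALE = 10**6

def to_fixed(text):
    """Parse a decimal string into a scaled integer (no binary rounding)."""
    return int(Decimal(text) * SCALE)

def fixed_mul(a, b):
    """Multiply two scaled integers and rescale; extra digits are floored."""
    return (a * b) // SCALE

# Binary floating point cannot represent 0.1, 0.2 or 0.3 exactly.
float_sum_matches = (0.1 + 0.2 == 0.3)

# The same sum on scaled integers is plain integer addition.
fixed_sum_matches = (to_fixed("0.1") + to_fixed("0.2") == to_fixed("0.3"))

# 1.5 * 2.25 = 3.375, stored as the integer 3375000.
product = fixed_mul(to_fixed("1.5"), to_fixed("2.25"))
```

The trade-off is that the range and the number of fractional digits have to be chosen up front, which is why this does not suit every calculation.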

## Re:Fixed-point arithmetic (Score:5, Insightful)

Submitter is entirely ignorant of floating point issues in general. Other than the buzzword "cloud" this is no different from any other clueless question about numerical issues in computing. "Help me, I don't know anything about the problem, but I just realized it exists!"

## Re:Fixed-point arithmetic (Score:5, Informative)

> Submitter is entirely ignorant of floating point issues in general. Other than the buzzword "cloud" this is no different from any other clueless question about numerical issues in computing. "Help me, I don't know anything about the problem, but I just realized it exists!"

Wrong.

In IEEE floating point math, "(a+b)+c" might not be the same as "a+(b+c)".

The exact results of a calculation *can* depend on how a compiler optimized the code. Change the compiler and all bets are off. Different versions of the same software *can* produce different results.

If you want the exact same results across all compilers you need to write your own math routines which guarantee the order of evaluation of expressions.
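A quick way to see the non-associativity claim for yourself, sketched in Python (whose floats are IEEE doubles on common platforms). The values are just a convenient example, not anything from the original question.

```python
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c    # 0.6000000000000001
right = a + (b + c)   # 0.6

print(left == right)  # False
```

Both results are correct roundings of what they were asked to compute; the grouping alone changes the last bit, which is exactly what a compiler that reorders expressions can do to you.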

OTOH, operating system, hardware, firmware and hypervisors shouldn't make any difference if they're running the same code. IEEE math *is* deterministic.

## Re:Fixed-point arithmetic (Score:5, Insightful)

## Re: (Score:2, Insightful)

Urgh. Getting math to be deterministic is a major pain in the neck. Most folks are completely ignorant of the fact that sum/avg/stddev in just about all distributed databases will return a *slightly* different result every time you run the code (and are almost always different between vendors/platforms, etc.).

## Re:Fixed-point arithmetic (Score:5, Insightful)

Getting the result to be deterministic is only the start of the problem. How do you know it is _correct_, or more properly, know the error bounds involved? How much does it matter to your problem?

e.g. If I am doing a 48-hour weather forecast, I can compare my results with observations next week; I can treat numerical error as a part of "model" error along with input observational uncertainty, etc.

I might validate part of my solutions by checking that, for example, the total water content of my planet doesn't change. For a 48-hour forecast, I might tolerate methods that slightly lose water over 48 hours in return for a fast solution. For a climate forecast/projection, this would be unacceptable.

Getting the same answer every time is no comfort if I have no way of knowing if it's the right answer.

## Re:Fixed-point arithmetic (Score:5, Insightful)

the question was not about compilers or indeed about software, but about fpu's, about firing up the same instance, with the same compilers and indeed with the same original binary.

it sounds like just fishing for reasons to have a budget to keep old power hw around.

I would think that if the results change enough to matter depending on the fpu, then the whole calculation method is suspect to begin with and exploits some feature/bug to get a tuned result (but assuming that the cpu/vm adheres to the standard, they would be the same; if the old one doesn't and the new one does, then I think that an honest scientist would want to know that too).

## Re:Fixed-point arithmetic (Score:4, Informative)

Whatever calculation you're making, be aware of the dynamic range of the intermediate results. Structure your calculations so that all intermediate results stay well within the dynamic range of the datatype. If you want to compute the standard deviation of 2048x2048 32-bit integers, use a 64-bit integer for the intermediate sum(x) and a 128-bit integer for the intermediate sum(x^2). If you try to use an IEEE double, you'll run past the 53 bits of significand it gives you: sum(x) alone can reach 2^11 * 2^11 * 2^32 = 2^54, and sum(x^2) can reach 2^11 * 2^11 * 2^64 = 2^86.

If you can, reformulate your calculation steps so to minimize the sensitivity to random errors on the order of a machine epsilon.
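The 53-bit limit mentioned above is easy to demonstrate. This is only a sketch in Python, using its built-in arbitrary-precision integers as a stand-in for a wide integer accumulator; the variable names are made up for the example.

```python
# Every integer up to 2**53 is exactly representable in an IEEE double;
# beyond that, some integers are not.
LIMIT = 2**53

as_float = float(LIMIT) + 1.0   # the +1 is rounded away
as_int = LIMIT + 1              # Python ints do not lose it

lost_in_float = (as_float == float(LIMIT))
exact_in_int = (as_int - LIMIT == 1)

print(lost_in_float, exact_in_int)  # True True
```

Once a running sum in a double passes that point, small addends start disappearing silently, which is why an integer accumulator is the safer choice for integer data.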

An electronic computer manual from UNIVAC/Burroughs/IBM written for pure mathematicians in ~1953 will tell you the same thing.

## Re: (Score:2)

When the computation involves a subtraction of numbers that are about the same value.

## Re:Fixed-point arithmetic (Score:5, Informative)

Don't use floating point if you can avoid it.

If you can't, and the results are EXTREMELY important (remember, floating point is an APPROXIMATION of numbers), then you have to read What Every Computer Scientist Should Know About Floating-Point Arithmetic [oracle.com]. (Yes, it's an Oracle link, but if you google it, most of the links are PDFs while the Oracle one is HTML).

If you're worried about your cloud provider screwing with your results, then you're definitely doing it wrong (read that article).

And yes, lots of people, even scientists, do it wrong because the idealized notion of what a floating point type is and how it actually works in hardware is completely different. Floating point numbers are tricky - they're VERY easy to use, but they're also VERY easy to use wrongly, and it's only if you know how the actual hardware is doing the calculations can you structure your programs and algorithms to do it right.

And no actual hardware FPU or VPU (vector unit - some do floating point) implements the full IEEE spec. Many come close, but none implement it exactly - there's always an omission or two. Especially since a lot of FPUs provide extended precision that goes beyond IEEE spec.

## Re:Fixed-point arithmetic (Score:4, Funny)

that link has a lot of words.

## Re:Fixed-point arithmetic (Score:5, Informative)

## Re:Fixed-point arithmetic (Score:4, Informative)

Yup. And if you want to use any kind of parallelism to compute the final result, you're going to have quite a hard time ensuring that the order of operations is always the same.

That said, there are libraries around that make use of IEEE's reproducibility guarantees to ensure reproducible results. That will likely correct any reproducibility issues that would otherwise be introduced by the compiler, but you still have the order of operations issue (which is a fundamental problem).

Personally, I think a better solution is to simply assume that you're never going to get reproducible floating-point results, and design the system to handle small, inconsistent rounding errors. I think that's a much easier problem to deal with than making floating-point reproducible in any modestly-complex system.
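To illustrate the order-of-operations point, here is a small Python sketch. The input values are contrived to make the effect obvious; `math.fsum` is the standard-library routine that returns a correctly rounded sum and therefore does not care about ordering, at the cost of speed.

```python
import math

# The same four numbers in two different orders, as might happen when a
# parallel reduction combines partial results differently on each run.
values = [1e16, 1.0, -1e16, 1.0]
reordered = [1.0, 1.0, 1e16, -1e16]

def naive_sum(xs):
    """Left-to-right accumulation, rounding after every addition."""
    total = 0.0
    for x in xs:
        total += x
    return total

naive_a = naive_sum(values)      # 1.0: one of the small terms is rounded away
naive_b = naive_sum(reordered)   # 2.0
exact_a = math.fsum(values)      # 2.0
exact_b = math.fsum(reordered)   # 2.0
```

Whether paying for an order-independent sum is worth it, or whether the code should simply tolerate the small differences, is the design choice discussed above.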

## Re: (Score:3)

> Submitter is entirely ignorant of floating point issues in general. Other than the buzzword "cloud" this is no different from any other clueless question about numerical issues in computing. "Help me, I don't know anything about the problem, but I just realized it exists!"

Ignorant, perhaps; but likely correct that 'subtle differences in floating point handling exist between ostensibly binary-compatible platforms' + ' "the cloud" reduces your control, and sometimes even your information, about what platform you are running on at any given time' = 'Floating Point Fun Time'.

To actually *solve* such a problem, sophisticated understanding is certainly required (especially since any practical user will probably want the solution to be *fast* as well as correct); but the act of comb

## Re:Fixed-point arithmetic (Score:5, Informative)

Yes, you can do this, but it's not feasible for all calculations. Things like trig functions are implemented on FP numbers, and once you start using FP it's better to just keep using it; converting back and forth is just bad and defeats the whole purpose anyway. So in reality you end up with applications that DO use FP (believe me, as an old FORTH programmer I can attest to the benefits of scaled integer arithmetic!). It's one of those things: we're stuck with FP, and once we assume that, the whole question comes down to small differences in the results of machine-level instructions or minor differences in libraries on different platforms. You will probably find that arbitrary VMs won't produce exactly identical results when you run on different platforms (AWS, KVM, VMware, some new thing).

Is it a huge problem though? The results produced should be similar; the parameters being varied were never controlled for anyway. It's a question of how often the rounding errors between two FPUs are identical. Neither the new nor the old results should be considered 'better' and they should generally be about the same if the result is robust. A climate sim, for example, run on two different systems for an ensemble of runs with similar inputs should produce statistically indistinguishable results. If they don't, then you should know what the differences are by comparison. In reality I doubt very many experiments will be in doubt based on this.

## Re:Fixed-point arithmetic (Score:5, Insightful)

"Is it ia huge problem though?"

If tools like Mathematica are dependent on the floating-point precision of a given processor, They're Doing It Wrong.

## Re:Fixed-point arithmetic (Score:5, Insightful)

I think the problem is that people PERCEIVE it to be a problem. Nothing is any more problematic than it was before; good numerical simulations will be stable over some range of inputs. It shouldn't MATTER if you get slightly different results for one given input. If that's all you tested, well, you did it wrong indeed. Mathematica is fine; people need to A) understand scientific computing and B) understand how to run and interpret models. I think most scientists that are doing a lot of modelling these days DO know these things. It's the occasional users that get it wrong, I suspect.

## Re: (Score:3)

>(Same with the optimization issues we covered in that class - that it can make a real difference in runtime whether you iterate first over the rows and then over the columns of a 2-dimensional array or vice versa, depending on how your software stores arrays in memory, was a huge puzzle for minds far brighter than mine.)

If you are still curious, read the short article at http://en.wikipedia.org/wiki/Instruction_prefetch [wikipedia.org], and when you come to the bit about prefetching texels, think of those texels as dat

## Re: (Score:2)

## Re: (Score:3)

No worries at all; the intent of my post was to encourage the GP to consult documentation specific to the implied case that the Mathematica developers hadn't considered the problem. I believe your submission was a good one, as it isn't always a guarantee that developers will have considered the implications of floating point calculations in any given codebase. Getting people to think about things is never a bad thing.

## Re:Fixed-point arithmetic (Score:4, Funny)

This one is actually a nontrivial challenge. Once the tape starts to get damp, you need to keep track of the probability that executing a given head-moving operation will cause the tape to snap and abruptly leave you with a confused finite state machine...

## Re: (Score:2, Insightful)

protip: When discussing the difference between Fixed Point and Floating Point, the abbreviation "FP" is useless.

## Re:Fixed-point arithmetic (Score:5, Interesting)

*wildly* with even small differences in floating-point precision. I recently had a bug in a machine learning algorithm that produced completely different results because I was off by one trillionth! I was being foolish, of course, because I hadn't used an epsilon for doing FP, but you get the idea.

But it turns out that even if you're a good engineer and you are careful with your floating point numbers, the fact is: floating point is approximate computation. And for many kinds of mathematical problems, like dynamical systems, this approximation changes the result. One of the founders of chaos theory, Edward Lorenz [wikipedia.org], of Lorenz attractor [wikipedia.org] fame, discovered the problem by truncating the precision of FP numbers from a printout when he was re-entering them into a simulation. The simulation behaved *completely differently* despite the difference in precision being in the thousandths. That was a weather simulation. See where I'm going with this?

## Re:Fixed-point arithmetic (Score:5, Informative)

Trust me, it's a subject I've studied. The problem here is that your system is unstable: tiny differences in inputs generate huge differences in output. You cannot simply take one set of inputs that produces what you think is the 'right answer' from that system and ignore all the rest! You have to explore the ensemble behavior of many different sets of inputs, and the overall set of responses of the system is your output, not any one specific run with specific inputs that would produce a totally different result if one was off by a tiny bit.

Of course Lorenz realized this. Simple experiments with an LDE will show you this kind of result. You simply cannot treat these systems the way you would ones which exhibit function-like behavior (at least within some bounds). Lorenz of course also realized THAT, but sadly not everyone has got the memo yet! lol.
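For anyone who wants to try the kind of simple experiment described above, here is a sketch in Python. It uses the logistic map rather than Lorenz's weather model, purely because it fits in a few lines; the parameter `r = 3.9`, the starting point and the size of the nudge are arbitrary choices for this illustration.

```python
def logistic_trajectory(x0, r=3.9, steps=200):
    """Iterate x -> r * x * (1 - x) and return every visited value."""
    xs = []
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
        xs.append(x)
    return xs

# Two runs whose starting points differ by one part in a trillion.
base = logistic_trajectory(0.2)
nudged = logistic_trajectory(0.2 + 1e-12)

gaps = [abs(p - q) for p, q in zip(base, nudged)]
early_gap = max(gaps[:5])   # still tiny after a handful of steps
late_gap = max(gaps)        # comparable to the size of the values themselves

print(early_gap, late_gap)
```

No amount of hardware archiving fixes this: the sensitivity belongs to the system, so the meaningful output is the statistics over many runs, not any single trajectory.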

## Re: (Score:2)

If you are really having a precision problem, even in double precision, then it means you are facing an ill-conditioned problem. And if you are facing an ill-conditioned problem, then there is nothing a technological tool can do for you. Try to reformulate the problem to avoid bad conditioning, and FP will be fine.

## Re: (Score:2)

If small differences in the floating-point precision make your results vary a lot it is a sign that your computation is useless. For in this case your model is producing more random noise than information. Concerns about reproducibility are obviously frivolous in this case.

## yeah, don't be lazy (Score:2)

floats are soft option, only gets us all in trouble.

remember

we are pentium of borg, division is futile

## Re: (Score:2)

floats are soft option, ...

Too many shadows, whispering voices

faces on posters, too many choices

If? When? Why? What?

How much have you got?

Have you got it? Do you get it?

If so, how often?

Which do you choose

a hard or soft option?

## Re:Fixed-point arithmetic (Score:5, Funny)

I have a mechanical calculator that is extremely reliable, so long as you oil it.

## Re:Fixed-point arithmetic (Score:4, Funny)

Or simply don't use the broken "cloud computing" model. If you have some calculations to do, and care the least about the results, how about buying a computer that does those calculations for you?

In other news, many problems become much easier when you assume a suitably large pile of money.

Incidentally, the same is true of explosives, amphetamines, and hookers.

## Re:Fixed-point arithmetic (Score:5, Funny)

> Incidentally, the same is true of explosives, amphetamines, and hookers.

I don't have to be a mathematician to say that sounds like one hell of a party.

## bend reality (Score:5, Funny)

The result is always the same, but the definition of reality is changing. The result of every single calculation is in fact 42 in some units. The hard part is figuring out the units.

## Re: (Score:2)

## Re: (Score:3)

Once you define the unit of truth that is. :P

## Re: (Score:3)

We most certainly need Slashdot VirtualCrypto to gild comments like these. Karma alone is not enough and this comment is too damned funny.

## Good luck (Score:3, Insightful)

## I'm research the long-term consistency and ... (Score:2)

First sentence seems stilted at best.

## Easiest solution (Score:3, Funny)

Just scroll down a couple of posts [slashdot.org]. "Quite soon the Wolfram Language is going to start showing up in lots of places, notably on the web and in the cloud."

Problem solved!

## Numerical instability (Score:5, Insightful)

If the value you're computing is so dependent on the details of the floating point implementation that you're worried about it, you probably have an issue of numerical stability and the results you are computing are likely useless, so this is really a mute point.

## Re:Numerical instability (Score:4, Funny)

This is the only answer so far that makes sense, which is a pity because

A) It's an AC

and

B) The point is *moot*, not mute.

But we all knew that, didn't we.

## Re: (Score:2)

The point was mute, not moot. Everybody was thinking it, but no one could say it.

Oh Anonymous Coward, if only it were socially acceptable you wouldn't have to hide your shame.

## Re: (Score:2)

Reminds me of one of Lloyd Trefethen's maxims about numerical mathematics (http://people.maths.ox.ac.uk/trefethen/maxims.html ):

"If the answer is highly sensitive to perturbations, you have probably asked the wrong question."

## Use infinite precision software packages (Score:5, Informative)

What the title says - e.g. bignum for Python etc. It will be significantly slower, but the result is going to be stable at least for a given library version, and that is far easier to archive.
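As a concrete sketch of the suggestion above: in Python specifically, integers are already arbitrary precision and the standard library's `fractions.Fraction` gives exact rational arithmetic, so no extra package is needed for simple cases. The variable names are just for illustration.

```python
from fractions import Fraction

# Exact rationals: one third really is one third.
third = Fraction(1, 3)
thirds_sum = third + third + third            # exactly 1

# Exact big integers: no overflow, no rounding.
big_product = 2**200 * 3**100

# The same sum in binary floating point and in exact arithmetic.
float_tenths = 0.1 + 0.2                      # 0.30000000000000004
exact_tenths = Fraction(1, 10) + Fraction(2, 10)

print(thirds_sum == 1, exact_tenths == Fraction(3, 10))  # True True
```

Results like these depend only on the library's arithmetic rules, not on the FPU underneath, which is what makes them easier to archive. The cost is speed, and irrational values such as pi or square roots still have to be approximated somewhere.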

## Re: (Score:2)

Well, the original question was about hardware floating point arithmetic, which has the same problem.

## Re: (Score:2)

PI is irrational, 1/3rd isn't. 1/3 could be represented perfectly if the implementation had a "repeating" bit. AFAIK, there isn't any commonly used FP hardware that has such a bit, so yeah; 1/3 is not perfectly represented.

This reminds me of the arguments you get from people when you try to explain that 0.9 repeating is exactly equal to 1.0.

Their minds really get blown when you explain that 0.9 repeating is just 0.3 repeating + 0.3 repeating + 0.3 repeating. All those 3s add up to 9, all the way out into

## Re: (Score:2)

PI is irrational, 1/3rd isn't. 1/3 could be represented perfectly if the implementation had a "repeating" bit. AFAIK,

You'd need more than one extra bit to represent recurring binary fractions because you need to store the point at which the pattern repeats. And you would still only be able to store a subset of rational numbers exactly because you would still have a limited number of bits.

## Re: (Score:2)

22/7... It all goes back to the Babylonian representation of time when there were only 22 hours in a day and thus 154 hours a week. Then some bright spark asked 'Wouldn't it be nice if there were a couple of extra hours in the day', and so the 24/7 paradigm was born. Some thought this change was irrational (c.f. daylight saving), so a formal definition of circumference = pi x diameter was adopted.

## Your chances are pretty darned good (Score:5, Informative)

Mathematica in particular uses adaptive precision; if you ask it to compute some quantity to fifty decimal places, it will do so.

In general, if you want bit-for-bit reproducible calculations to arbitrary precision, the MPFR [mpfr.org] library may be right for you. It computes correctly-rounded special functions to arbitrary accuracy. If you write a program that calls MPFR routines, then even if your own approximations are not correctly-rounded, they will at least be reproducible.

If you want to do your calculations to machine precision, you can probably rely on C to behave reproducibly if you do two things: use a compiler flag like -mpc64 on GCC to force the elementary floating point operations (addition, subtraction, multiplication, division, and square root) to behave predictably, and use a correctly-rounded floating point library like crlibm [ens-lyon.fr] (Sun also released a version of this at one point) to make the transcendental functions behave predictably.

## IEEE 754 (Score:4, Insightful)

Different results on different hardware were a major problem up until CPU designers started to implement the IEEE754 standard for floating point arithmetic [wikipedia.org]. IEEE754 conforming implementations should all return identical results for identical calculations.

However, x86 systems have an 80-bit extended precision format and if the software uses 80-bit floats on x86 hardware and then you run the same code on an architecture that does not support the x86 80-bit format (say, ARM or Sparc or PowerPC) then you are likely to get different answers.

I think newer revisions of IEEE754 have support for extended precision formats up to 16-bytes, but you need to know your hardware (and how your software uses it) to make sure that you are doing equal work on systems with equal capabilities. You may have to sacrifice precision for portability.

## You need to know some numerical analysis (Score:5, Insightful)

If your calculations are processor-dependent, that's a bad sign for your code. If your results really depend on things that can be altered by the specific floating-point implementation, you need to write code that's robust to changes in the way floating-point arithmetic is done, generally by tracking the uncertainty associated with each number in your calculation. (Obviously you don't need real-time performance since you're using cloud computing in the first place.) I'm not an expert on Mathematica, but it probably has such things built in if you go through the documentation, since Mathematica notebooks are supposed to exhibit reproducible behavior on different machines. (Which is not to say that no matter what you write it's automatically going to be reproducible.)

Archiving hardware to get consistent results is mainly used when there are legal issues and some lawyer can jump in and say, "A-ha! This bit here is different, and therefore there's some kind of fraud going on!"

## Re:You need to know some numerical analysis (Score:5, Insightful)

This.

Reproducibility (what we strive for in science) is not the same as repeatability (what the poster is actually trying to achieve). Results that are not robust on different platforms aren't really scientific results.

I wish more scientists understood this.

-Chris

## Re:You need to know some numerical analysis (Score:5, Interesting)

While that's true in many cases, there are some situations in which we need . Read Shewchuk's excellent paper [berkeley.edu] on the subject.

When disaster strikes and a real RAM-correct algorithm implemented in floating-point arithmetic fails to produce a meaningful result, it is often because the algorithm has performed tests whose results are mutually contradictory.

The easiest way to think about it is with a made-up problem about sorting. Let's say that you have a list of mathematical expressions like sin(pi*e^2), sqrt(14*pi*ln(8)), tan(10/13), etc and you want to sort them, but some numbers in the list are so close to each other that they might compare differently on different computers that round differently, (e.g. one computer says that sin(-10) is greater than ln(100)-ln(58) and the other says it's less).

Imagine now that this list has billions of elements and you're trying to sort the items using some sort of distributed algorithm. For the sorting to work properly, you *need* to be sure that a < b implies that b > a. There are situations (often in computational geometry) where it's OK if you get the wrong answer for borderline cases (e.g. it doesn't matter whether you can tell whether sin(-10) is bigger than ln(100)-ln(58) because they're close enough for graphics purposes) as long as you get the wrong answer consistently, so the next algorithm out (sorting in my example, or triangulation in Shewchuk's) doesn't get stuck in infinite loops.

## Re: (Score:3)

notice something interesting in one particular simulation and you'd like to run it again to zoom on it,

If the thing you're zooming in on is dependent on the behaviour of floating point numbers, then it's not interesting from any point of view other than that. It certainly won't represent anything physically meaningful, which since we're talking about galaxy simulations I assume is the point.

## Re: (Score:2)

if the results are different enough to lead to different logical conclusions about what was being calculated then the whole method of using it as basis for decisions/deductions is pretty suspect and one should ask if the scientist in question chose 12bytes vs 16bytes to get the result he wanted.

otoh, having the flags on to behave per standard it should behave per standard.

## Obligatory Comic (Score:5, Funny)

## Library (Score:3)

It depends on what you mean by "cloud", which is sort of a catchall term. As you've pointed out, on SaaS clouds you're going to have no guarantee of consistency, even if no time passes -- you don't know that the cloud environment is homogeneous. For (P/I)aaS clouds, you can hopefully hold constant what software is running. For example, if you have your Ubuntu 12.04 VM that runs your software, when you fire up that VM five years from now, its software hasn't changed one bit. You of course have to worry about whether or not the form you have the VM in is even usable in five years. You would hope that, even with inevitable hardware changes, if none of the software stack changes, then you'll get the same results. I'd guess that if they're running all on hardware that really correctly implements IEEE floating-point numbers, then you will in fact get consistent results. But I wouldn't bet on it.

What you really need, unfortunately, is a library that abstracts away and quantifies the uncertainty induced by hardware limitations. There are a variety of options for these, since they're popular in scientific computing, but the overall point is that using such techniques, you can get consistent results within the stated accuracy of the library.

## You may have bigger issues (Score:2)

If you're worried about your program generating different results on different arch, you have some serious coding issues.

The math should be the same on all systems. If you're worried, try 2 different systems against a known or manually calculated result, that's how the Pentium-type bugs were discovered (if you remember).

Typically major issues in your processing units will be discovered quickly because of the ubiquity in the market. Unless you're using a custom built or compromised chip on eg primes, you sho

## Solved problem (Score:2)

The problem of inconsistent floating point calculations between machines has been solved since 1985 [wikipedia.org]. I'm sure moving your app into the cloud doesn't suddenly undo 28 years of computing history.

## Re: (Score:2)

> The problem of inconsistent floating point calculations between machines has been solved since 1985. I'm sure moving your app into the cloud doesn't suddenly undo 28 years of computing history.

Except it hasn't. On a PowerPC or Haswell processor, a simple calculation like a*b + c*d can legally give three different results because of the use of fused multiply-add. In the 90's to early 2000's, you would get different results because of inconsistent use of extended precision.
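The fused multiply-add effect can be reproduced without special hardware. The sketch below, in Python, uses the simpler expression a*b + c and emulates the fused version with exact rational arithmetic followed by a single rounding; the specific inputs are chosen for this illustration so that the two answers visibly differ.

```python
from fractions import Fraction

a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

# Two roundings: the exact product 1 - 2**-60 is rounded to 1.0 first,
# then the addition gives 0.0.
unfused = a * b + c

# One rounding: compute a*b + c exactly, then round once at the end,
# which is the result a fused multiply-add is defined to return.
fused = float(Fraction(a) * Fraction(b) + Fraction(c))

print(unfused, fused)  # 0.0 -8.673617379884035e-19
```

Both answers conform to the standard; they just come from different instruction sequences, which is why the same source can give different bits depending on whether the compiler chose to fuse.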

## Re: (Score:3)

## Frist 3D pirnter prost (Score:2, Offtopic)

The solution is to use a 3D printer to make your own cloud.

## Rounding error (Score:2)

If you're not allowing for rounding errors, your result is invalid in the first place.

If you don't want rounding errors, use a package based on variable precision mathematics, like a BCD package.

## Ye Old Text (Score:3, Insightful)

This has pretty much been the bible for many, many, many years now: http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html

If you haven't read it, you should - no matter if you're a scientific developer, a games developer, or someone who writes javascript for web apps - it's just something everyone should have read.

Forgive the Oracle-ness (it was originally written by Sun). Covers what you need to know about IEEE754 from a mathematical and programming perspective.

Long story short, determinism across multiple (strictly IEEE754 compliant) architectures while possible is hard - and likely not worth it. But if you're doing scientific computing, perhaps it may be worth it to you. (Just be prepared for a messy ride of maintaining LSB error accumulators, and giving up typically 1-3 more LSB of precision for the determinism - and not only having to worry about the math of your algorithms, but the math of tracking IEEE754 floating point error for every calculation you do).

What you can do, easily, however is understand the amount of error in your calculations and state the calculation error with your findings.

## Is it just a language barrier? (Score:4, Informative)

My first thought on seeing "tranlate" and "I'm research" was that it's only language, but then I read invalid and incorrect statements about how precision is defined in Mathematica. So now I'm not quite sure it's just language.

Archiving a whole virtual machine as opposed to the code being compiled and run is baffling to me.

Now if you are trying to archive the machine to run your old version of Mathematica and see if you get the same result, you may want to check your license agreement with Wolfram first. Second, you should be able to export the code and run the same code on new versions.

I'm really really confused on why you would want this to begin with though. Precision has increased quite a bit with the advent of 64bit hardware. I'd be more interested in taking some theoretical code and changing "double" to "uberlong" and see if I get the same results than what I solved today on today's hardware.

Unless this is some type of Government work which requires you to maintain the whole system, I simply fail to see any benefit.

Having "Cloud" does not change how precision works in Math languages.

## Re: (Score:3)

## First, identify the problem (Score:2)

You ask:

> Say I archive the virutal instance and in 5 or 10 years I fire it up on another cloud provider and run the same calculation. What's the likelihood that the results would be the same?

If the calculation is 2 + 2 I'd say the odds are pretty good you're going to get 4. I assume you're actually doing some difficult calculations that may push some of the edge cases in the floating point system. What I would do is make some test routines that stress the areas that you're interested in and run and check the results of those before doing any serious calculations. For the most part, you're going to have to assume that the basic functions work and there aren't simply specific combos l

## Try it on a single PC first (Score:2)

I currently have a Matlab script that produces slightly different FIR filter design coefficients each time I run it - when run on the same version of Matlab on the same machine. And this is with *Matlab*, whose primary selling point is its industrial-strength mathematical "correctness".

Also, I once used a C compiler that wouldn't produce consistent builds, and not just by a timestamp. The compiler vendor said that a random factor was used to decide between optimization choices that scored equally. We final

## Simulate IEEE754-compliant FPU? (Score:2)

Can't Mathematica be told to stick to an 80-bit precision output? If you can specify that in software, it shouldn't matter what code the underlying platform runs on.

## What if it's not reproducible? (Score:2)

## Associated concern (Score:2)

If you haven't already you may want to have a look at Interval arithmetic [wikipedia.org] since it addresses some associated issues. It is supported in various development environments and libraries.
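To make the idea concrete, here is a minimal Python sketch of interval addition. The `Interval` class and the `enclose` helper are hypothetical names made up for illustration, not part of any library. It assumes Python 3.9 or later (for `math.nextafter`) and simply widens each bound by one ulp, whereas real interval libraries typically control the hardware rounding direction instead.

```python
import math
from fractions import Fraction


class Interval:
    """Closed interval [lo, hi] of doubles (illustrative only)."""

    def __init__(self, lo, hi):
        self.lo = lo
        self.hi = hi

    def __add__(self, other):
        # Push each bound one ulp outward so the rounding error of the
        # floating-point additions can never exclude the true sum.
        return Interval(
            math.nextafter(self.lo + other.lo, -math.inf),
            math.nextafter(self.hi + other.hi, math.inf),
        )

    def __contains__(self, value):
        # Compare exactly, using rationals, to avoid new rounding error.
        return Fraction(self.lo) <= value <= Fraction(self.hi)


def enclose(text):
    """Interval that is guaranteed to contain the decimal literal `text`."""
    nearest = float(text)
    return Interval(
        math.nextafter(nearest, -math.inf),
        math.nextafter(nearest, math.inf),
    )


total = enclose("0.1") + enclose("0.2")
contains_exact = Fraction(3, 10) in total
print(total.lo, total.hi, contains_exact)
```

The point is that the answer is a pair of bounds that provably brackets the exact result, so two platforms can disagree in the last bit and still be checked against each other.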

## This is just toooo technical (Score:2)

I still have trouble with 1+1=10

## IEEE 754-2008 (Score:2)

If the math has been calculated with IEEE 754-2008 [wikipedia.org], it is IEEE 754-2008 (aka ISO/IEC/IEEE 60559:2011). Should not matter what you are running it on...

## Hardware Arb Precision Decimal Processors (Score:2)

I could see one thing happening over time. Right now a lot of software does calculations involving decimal fractions in floating point. The problem with this is that in general you cannot precisely represent a decimal fraction using a binary floating point number. This is why you often see results like a-b = 0.19999999999999.
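A quick Python illustration of that effect (Python floats are IEEE 754 doubles on essentially every platform). The variable names are arbitrary, and the `decimal` and `fractions` modules here are software stand-ins for decimal arithmetic, not the hardware units speculated about in this comment.

```python
from decimal import Decimal
from fractions import Fraction

binary_result = 0.3 - 0.1                          # binary floating point
decimal_result = Decimal("0.3") - Decimal("0.1")   # decimal arithmetic in software

print(binary_result)         # 0.19999999999999998
print(decimal_result)        # 0.2
print(binary_result == 0.2)  # False

# The double nearest to 0.1 is not exactly one tenth:
stored_tenth = Fraction(0.1)
print(stored_tenth == Fraction(1, 10))  # False
```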

Well I think it is possible that we could see development of hardware arithmetic units that would internally use arbitrary precision fixed point calculations to do these sorts of calcul

## Hamming's Motto (Score:2)

You would do well to remember a quotation attributed to Richard W. Hamming: "The purpose of computing is insight, not numbers."

## Perfect reproduction is difficult / undesirable (Score:2)

This issue was described far better than I can in William Kahan's essay, How Java's floating point hurts everyone everywhere [berkeley.edu]

## False assumption (Score:5, Informative)

This assumption by the OP:

> Mathematica generates the result based on the combination of software version, operating system, hypervisor, firmware and hardware that are running at that time.

... is entirely wrong. One of the defining features of Mathematica is symbolic expression rewriting and arbitrary-precision computation to avoid all of those specific issues. For example, the expression:

N[Sin[1], 50]

Will always evaluate to exactly:

0.84147098480789650665250232163029899962256306079837

And, as expected, evaluating to 51 digits yields:

0.841470984807896506652502321630298999622563060798371

Notice how the last digit in the first case remains unchanged, as expected.

This is explained at length in the documentation, and also in numerous Wolfram blog articles that go on about the details of the algorithms used to achieve this on a range of processors and operating systems. The (rare) exceptions are marked as such in the help and usually have (slower) arbitrary-precision or symbolic variants. For research purposes, Mathematica comes with an entire bag of tools that can be used to implement numerical algorithms to any precision reliably.
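The same principle can be sketched outside Mathematica. The following Python snippet is not how Mathematica implements `N`; it is a hypothetical `sin_decimal` helper built on the standard `decimal` module with a plain Taylor series and ten guard digits (an arbitrary choice). Because every step is done in software with an explicit precision, the result does not depend on the FPU underneath.

```python
from decimal import Decimal, localcontext


def sin_decimal(x, digits):
    """sin(x) rounded to `digits` significant digits (Taylor series sketch)."""
    with localcontext() as ctx:
        ctx.prec = digits + 10            # work with guard digits
        x = Decimal(x)
        term = x                          # current series term, starts at x
        total = x
        n = 1
        threshold = Decimal(10) ** -(digits + 10)
        while abs(term) > threshold:
            n += 2
            # next term: multiply by -x^2 / ((n-1) * n)
            term = -term * x * x / ((n - 1) * n)
            total += term
        ctx.prec = digits
        return +total                     # unary plus rounds to ctx.prec


fifty = sin_decimal(1, 50)
fifty_one = sin_decimal(1, 51)
print(fifty)
print(fifty_one)
```

A series this naive is only sensible for small arguments; larger ones need argument reduction first.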

Conclusion: The author of the post didn't even bother to flip through the manual, despite having strict requirements spanning decades. He does however have the spare time to post on Slashdot and waste everybody else's time.

## Re: (Score:3)

> Notice how the last digit in the first case remains unchanged, as expected.

It only remains unchanged because it rounds down.

N[Sin[1], 48] will end with ...60798

N[Sin[1], 47] will end with ...6080

Calculated on Wolfram Alpha.

## The one time you actually use Java. (Score:2)

In x86 based processors we've had BCD (binary coded decimal) instructions for ages. I use those in my assembly project, or emulate unlimited bit length floating points with integer math in my big-num libs. However, modern languages do not rely on the hardware features like BCD.

In Matlab you should use fixed-point math. That's pretty dumb, but it garauntees the precision will be the same on whatever platform.

Lacking a bignum lib with garaunteed behaviors, one could just use Java. Java emulates floating p

## Re: (Score:2)

guarantee - Gua ran tee; Hooked on phonics didn't work for me!

## Floating point is hard. (Score:2)

Kahan, of course, is the authority on this.

Handling of floating point overflow is a big problem. Under Windows on x86, you can get exact (as in at the right instruction location) floating point exceptions, and I've used that to catch overflow in a physics engine. But on some CPUs, there's a speed penalty for enabling exact FPU exceptions. Java and Go don't support floating point exceptions; they return NaN or +INF or -INF or 0 (for underflow). One problem with IEEE floating point is that you don't have t
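For comparison, here is roughly what the silent (non-trapping) behavior looks like in Python, which is neither Java nor Go but uses the same IEEE 754 doubles. This is only an illustration of the default results described above; the variable names are arbitrary. Note that Python is inconsistent here: plain arithmetic returns infinities, while some `math` functions raise an exception instead.

```python
import math

overflowed = 1e308 * 10                  # too large for a double: becomes +inf
not_a_number = overflowed - overflowed   # inf - inf has no value: NaN
underflowed = 5e-324 / 2                 # below the smallest subnormal: 0.0

print(overflowed, not_a_number, underflowed)

# math.exp reports overflow as an exception rather than returning inf.
try:
    math.exp(1000)
    raised = False
except OverflowError:
    raised = True
print(raised)
```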

## No, for many reasons (Score:2)

The short answer is no. The long answer is no ... and a very long list of reasons why.

Start with reading Goldberg's classic paper "What Every Computer Scientist Should Know About Floating-Point Arithmetic". Sun's floating point group made some improvements to the paper and paid for rights to redistribute. Oracle continues to do so. http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html [oracle.com]

If that isn't depressing enough, and you use trig functions, read http://www.scribd.com/doc/64949170/Ng-Argument-Reduction-f [scribd.com]

## Mathematics is more than real numbers (Score:2)

I have seen some of the answers given by other people, and many seem to miss the point of floating point calculations. Floating point is by its very nature imprecise, and when you choose to use it, you have to keep that in mind - the task you want to perform must be one where a certain degree of imprecision does not matter. What you are after is not exact reproducibility, but simply that your results stay within accepted error margins, and depending on the nature of your calculations, these may be very wide
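In code, that means comparing against a tolerance instead of testing for bit-identical results. A minimal Python sketch, where the relative tolerance of 1e-9 is an arbitrary placeholder; the right margin depends entirely on the calculation:

```python
import math

computed = 0.1 + 0.2
expected = 0.3

bit_identical = computed == expected                               # False
within_tolerance = math.isclose(computed, expected, rel_tol=1e-9)  # True

print(bit_identical, within_tolerance)
```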

## Re: (Score:2)

Most of the time arbitrary precision is not necessary and it's easier (and faster) to just use a float. There are times when it matters, but for the most part people aren't doing things where it matters.

The submitter should know better about using integer operations for things that require precision though.

## Re:Arbitrary precision (Score:4, Funny)

## Re: (Score:2)

## Re: (Score:3)

They may be well defined but nobody implements fully standards compliant FP units and they have subtle differences in output. Even with identical hardware, configurable settings like rounding modes may also differ between instances.

## Re:WTF? (Score:4, Informative)

They do not. IEEE754 has no "grey area". The results must match bit-exact or you are not IEEE754.

Of course, there can be implementation bugs. For example, Qemu does co-processor emulation only with 64 bit floats instead of the required 80 bit. Nobody seems to really care however. The other thing is of course that if reproducibility is more important than correctness, I suspect the math is done wrong.

## Re: (Score:3)

## Re: (Score:3, Informative)

Let's say you're using C on an x86.

float (32-bit) and double (64-bit) are well defined. However, the x86 FPU internally uses long double (80-bit).

So if you do some math on a float or a double, the results can vary depending on if it was done as 80-bit or if the intermediaries were spilled and truncated back to 64/32 bit.
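The effect is easy to reproduce by analogy. Python has no portable 80-bit type, so this sketch uses 32-bit storage with 64-bit intermediates instead of 64-bit storage with 80-bit intermediates; the mechanism is the same. The helper `to_f32` is a made-up name that rounds a double to the nearest float via `struct`.

```python
import struct


def to_f32(x):
    """Round a Python double to the nearest 32-bit float and back."""
    return struct.unpack("f", struct.pack("f", x))[0]


a, b, c = 16777216.0, 1.0, 1.0   # a is 2**24, where float spacing becomes 2

# Intermediate "spilled" to 32 bits after every operation.
narrow = to_f32(to_f32(a + b) + c)

# Intermediate kept in the wider format, rounded only once at the end.
wide = to_f32(a + b + c)

print(narrow)  # 16777216.0
print(wide)    # 16777218.0
```

Same inputs, same operations, and the answers differ by a full unit in the last place purely because of where the intermediate was rounded.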

## Re: (Score:3)

> So if you do some math on a float or a double, the results can vary depending on if it was done as 80-bit or if the intermediaries were spilled and truncated back to 64/32 bit.

Google for FP_CONTRACT. Quote from the C Standard:

> A floating expression may be contracted, that is, evaluated as though it were a single operation, thereby omitting rounding errors implied by the source code and the expression evaluation method. The FP_CONTRACT pragma in <math.h> provides a way to disallow contracted expressions. Otherwise, whether and how expressions are contracted is implementation-defined.
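A sketch of what contraction changes, in Python rather than C. This is an emulation, not a compiler test: the "contracted" result is computed by doing `a * b + c` exactly with rationals and rounding once, which is what a fused multiply-add would deliver. The values are chosen (as an assumption for the demo) so that the product rounds to exactly 1.0 when evaluated separately.

```python
from fractions import Fraction

a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

# Two roundings: a * b is rounded to a double (giving 1.0), then c is added.
separate = a * b + c

# One rounding: exact product and sum, rounded to a double only at the end.
contracted = float(Fraction(a) * Fraction(b) + Fraction(c))

print(separate)                        # 0.0
print(contracted == -(2.0 ** -60))     # True
```

So whether the compiler contracts the expression decides between an answer of zero and a small nonzero value.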

## Re: (Score:2)

The C standard is pretty useless here. Have a look at the really bad precision required. What you need to look at is IEEE754.

## Re: (Score:2)

No. The FPU does 80 bits to satisfy the precision requirements for 64 bit IEEE754.

## Re: (Score:2)

## Re: (Score:2)

Intel x87 scalar FP instructions use an 80 bit internal format for higher precision. Intel SSE2 vector FP instructions use 64 bits. You will see last bit variations depending on which instructions the compiler chooses.

And the compiler may choose differently depending on whether it's compiling for 32-bit or 64-bit x86 [github.com].

## Re: (Score:2)

So fix the compiler, or stop compiling for 32 bit. RAM is cheap, especially when you're talking about the cost per GiB of hundreds of gibibytes of it.

## Re: (Score:2)

So fix the compiler

"Fix the compiler" presumably meaning "change the compiler not to support non-SSE x86 processors" or, at least, "change the compiler not to *default* to supporting non-SSE processors". Sounds good to me, these days, but I'm not responsible for making those decisions about GCC, so there's not much I can do about it.

or stop compiling for 32 bit. RAM is cheap, especially when you're talking about the cost per GiB of hundreds of gibibytes of it.

At this point, I don't know how many *desktop/laptop* 32-bit x86 boxes there are out there, but, in any case, somebody got concerned that the tests didn't pass on a 32-bit machine, so.... Person

## Re: (Score:2)

"Fix the compiler" presumably meaning "change the compiler not to support non-SSE x86 processors" or, at least, "change the compiler not to *default* to supporting non-SSE processors".

I think this really is the best option, all things considered.

## Not true in the real world. (Score:2)

> Floating point and integer operations are well defined. Unless someone fucks up

> with implementing the floating point unit the result should be exactly the same.

Not true in the real world. See http://slashdot.org/story/13/07/28/137209/same-programs--different-computers--different-weather-forecasts [slashdot.org]. There was a scientific paper about the same weather model producing different forecast outputs on different machines.
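The linked story is not quoted here, so the following is only one well-known mechanism by which identical code can diverge, not a claim about what happened in that paper: floating-point addition is not associative, so anything that changes evaluation order (compiler, thread count, vector width) can change the result. A small Python sketch:

```python
import math

values = [0.1, 0.2, 0.3]

left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])

print(left_to_right)   # 0.6000000000000001
print(right_to_left)   # 0.6

# math.fsum tracks the exact sum and rounds once, so order no longer matters.
stable_forward = math.fsum(values)
stable_backward = math.fsum(reversed(values))
print(stable_forward == stable_backward)  # True
```

In a long-running chaotic simulation, a last-bit difference like this can be amplified into visibly different output.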

## Re: (Score:2)

If only there was some type of standard adopted that would make it so this wasn't the case...

## Re: (Score:2)

Not to mention, nuclear simulations should be staying on LANL's hardware, not being foisted into the cloud.

Unless somebody fucks up, LANL's nuclear simulations become the cloud, toward the end.