
Ask Slashdot: How Reproducible Is Arithmetic In the Cloud? 226

goodminton writes "I'm researching the long-term consistency and reproducibility of math results in the cloud and have questions about floating point calculations. For example, say I create a virtual OS instance on a cloud provider (doesn't matter which one) and install Mathematica to run a precise calculation. Mathematica generates the result based on the combination of software version, operating system, hypervisor, firmware and hardware that are running at that time. In the cloud, hardware, firmware and hypervisors are invisible to the users but could still impact the implementation/operation of floating point math. Say I archive the virtual instance and in 5 or 10 years I fire it up on another cloud provider and run the same calculation. What's the likelihood that the results would be the same? What can be done to adjust for this? Currently, I know people who 'archive' hardware just for the purpose of ensuring reproducibility, and I'm wondering how this translates to the world of cloud and virtualization across multiple hardware types."
This discussion has been archived. No new comments can be posted.


  • Good luck (Score:3, Insightful)

    by timeOday ( 582209 ) on Thursday November 21, 2013 @08:11PM (#45486435)
    This problem is far broader than arithmetic. Any distributed system based on elements out of your control is bound to be somewhat unstable. For example, an app that uses google maps, or a utility to check your bank account. The tradeoff for having more capability than you could manage yourself, is that you don't get to manage it yourself.
  • by Anonymous Coward on Thursday November 21, 2013 @08:16PM (#45486469)

    If the value you're computing is so dependent on the details of the floating point implementation that you're worried about it, you probably have a numerical stability issue and the results you are computing are likely useless, so this is really a moot point.

  • IEEE 754 (Score:4, Insightful)

    by Jah-Wren Ryel ( 80510 ) on Thursday November 21, 2013 @08:19PM (#45486503)

    Different results on different hardware were a major problem up until CPU designers started to implement the IEEE 754 standard for floating point arithmetic. [wikipedia.org] IEEE 754 conforming implementations should all return identical results for identical calculations.

    However, x86 systems have an 80-bit extended precision format, and if the software uses 80-bit floats on x86 hardware and you then run the same code on an architecture that does not support the x86 80-bit format (say, ARM or SPARC or PowerPC), you are likely to get different answers (see the sketch after this comment).

    I think newer revisions of IEEE 754 support extended precision formats up to 16 bytes, but you need to know your hardware (and how your software uses it) to make sure that you are doing equal work on systems with equal capabilities. You may have to sacrifice precision for portability.
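
    To make the 80-bit point concrete, here is a minimal C sketch (my illustration, not something from the thread). On x86 with the x87 unit, long double is usually the 80-bit extended format; on 32-bit ARM it is a plain 64-bit double, and on AArch64 or SPARC it is typically a 128-bit quad. The second line of output can therefore differ across architectures, while the pure-double line should match on any machine that evaluates double arithmetic strictly in 64 bits.

      #include <stdio.h>

      int main(void) {
          double      d  = 1.0;
          long double ld = 1.0L;

          /* Accumulate a value with no exact binary representation so that
             rounding happens at every step. */
          for (int i = 0; i < 1000000; i++) {
              d  += 1e-10;
              ld += 1e-10L;
          }

          printf("double      : %.20g\n",  d);
          printf("long double : %.20Lg\n", ld);
          return 0;
      }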

  • by Anonymous Coward on Thursday November 21, 2013 @08:21PM (#45486515)

    Submitter is entirely ignorant of floating point issues in general. Other than the buzzword "cloud" this is no different from any other clueless question about numerical issues in computing. "Help me, I don't know anything about the problem, but I just realized it exists!"

  • by daniel_mcl ( 77919 ) on Thursday November 21, 2013 @08:22PM (#45486521)

    If your calculations are processor-dependent, that's a bad sign for your code. If your results really depend on things that can be altered by the specific floating-point implementation, you need to write code that's robust to changes in the way floating-point arithmetic is done, generally by tracking the uncertainty associated with each number in your calculation (interval arithmetic, sketched after this comment, is one way to do that). (Obviously you don't need real-time performance since you're using cloud computing in the first place.) I'm not an expert on Mathematica, but it probably has such things built in if you go through the documentation, since Mathematica notebooks are supposed to exhibit reproducible behavior on different machines. (Which is not to say that whatever you write is automatically going to be reproducible.)

    Archiving hardware to get consistent results is mainly used when there are legal issues and some lawyer can jump in and say, "A-ha! This bit here is different, and therefore there's some kind of fraud going on!"
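
    As an illustration of that uncertainty tracking, here is a minimal interval-arithmetic sketch in C99 (my own example, not how Mathematica does it internally): carry a lower and upper bound for every quantity and round outward, so the true value always stays inside the enclosure. Compile with something like gcc -std=c99 -frounding-math example.c -lm, since not every compiler honours the FENV_ACCESS pragma on its own.

      #include <fenv.h>
      #include <stdio.h>

      #pragma STDC FENV_ACCESS ON

      /* Toy interval type: the true value is guaranteed to lie in [lo, hi]. */
      typedef struct { double lo, hi; } interval;

      /* Add with outward rounding so the enclosure never loses the true sum. */
      static interval iv_add(interval a, interval b) {
          interval r;
          fesetround(FE_DOWNWARD);  r.lo = a.lo + b.lo;
          fesetround(FE_UPWARD);    r.hi = a.hi + b.hi;
          fesetround(FE_TONEAREST);
          return r;
      }

      int main(void) {
          /* 0.1 is not exactly representable; starting both ends at the nearest
             double is a simplification -- a rigorous version would widen it. */
          interval tenth = {0.1, 0.1};
          interval sum   = {0.0, 0.0};
          for (int i = 0; i < 10; i++)
              sum = iv_add(sum, tenth);
          printf("ten additions of 0.1 lie in [%.17g, %.17g]\n", sum.lo, sum.hi);
          return 0;
      }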

  • Ye Old Text (Score:3, Insightful)

    by Anonymous Coward on Thursday November 21, 2013 @08:47PM (#45486709)

    This has pretty much been the bible for many, many, many years now: http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html

    If you haven't read it, you should - whether you're a scientific developer, a games developer, or someone who writes JavaScript for web apps - it's something everyone should have read.

    Forgive the Oracle-ness (it was originally written by Sun). It covers what you need to know about IEEE 754 from a mathematical and programming perspective.

    Long story short, determinism across multiple (strictly IEEE 754 compliant) architectures is possible but hard - and likely not worth it. But if you're doing scientific computing, it may be worth it to you. (Just be prepared for a messy ride: maintaining LSB error accumulators, giving up typically 1-3 LSBs of precision for the determinism, and having to worry not only about the math of your algorithms but also about the math of tracking IEEE 754 floating point error for every calculation you do.)

    What you can do easily, however, is understand the amount of error in your calculations and state that error alongside your findings (compensated summation, sketched below, is one cheap way to track it).
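
    For the "LSB error accumulator" idea, compensated (Kahan) summation is the standard cheap trick. A minimal C sketch, not tied to any particular platform (note that fast-math style optimisation flags will legally delete the compensation and break it):

      #include <stdio.h>

      /* Kahan (compensated) summation: c carries the low-order bits lost at
         each addition -- exactly the kind of error accumulator mentioned above. */
      static double kahan_sum(const double *x, long n) {
          double sum = 0.0, c = 0.0;
          for (long i = 0; i < n; i++) {
              double y = x[i] - c;
              double t = sum + y;
              c = (t - sum) - y;   /* the part of y that did not fit into sum */
              sum = t;
          }
          return sum;
      }

      int main(void) {
          enum { N = 1000000 };
          static double data[N];
          for (long i = 0; i < N; i++) data[i] = 0.1;

          double naive = 0.0;
          for (long i = 0; i < N; i++) naive += data[i];

          printf("naive sum       : %.10f\n", naive);
          printf("compensated sum : %.10f\n", kahan_sum(data, N));
          printf("10^6 x 0.1      : %.10f\n", 100000.0);
          return 0;
      }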

  • by rockmuelle ( 575982 ) on Thursday November 21, 2013 @09:02PM (#45486803)

    This.

    Reproducibility (what we strive for in science) is not the same as repeatability (what the poster is actually trying to achieve). Results that are not robust on different platforms aren't really scientific results.

    I wish more scientists understood this.

    -Chris

  • by Jane Q. Public ( 1010737 ) on Thursday November 21, 2013 @09:04PM (#45486823)

    "Is it ia huge problem though?"

    If tools like Mathematica are dependent on the floating-point precision of a given processor, They're Doing It Wrong.

  • by Anonymous Coward on Thursday November 21, 2013 @09:17PM (#45486893)

    protip: When discussing the difference between Fixed Point and Floating Point, the abbreviation "FP" is useless.

  • by Giant Electronic Bra ( 1229876 ) on Thursday November 21, 2013 @10:30PM (#45487357)

    I think the problem is that people PERCEIVE it to be a problem. Nothing is any more problematic than it was before: good numerical simulations will be stable over some range of inputs (the quick perturbation test sketched below is one way to check). It shouldn't MATTER if you get slightly different results for one given input; if that's all you tested, well, you did it wrong indeed. Mathematica is fine; people need to A) understand scientific computing and B) understand how to run and interpret models. I think most scientists who are doing a lot of modelling these days DO know these things. It's the occasional users that get it wrong, I suspect.
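
    A quick way to check that stability, as a toy sketch only: perturb the input by far more than one ulp and look at the spread of the outputs. model() below is a hypothetical stand-in (a few thousand explicit-Euler steps of a damped oscillator) for whatever you actually run; if the spread is comparable to the effect you are reporting, FP differences between machines are the least of your problems.

      #include <stdio.h>
      #include <math.h>

      /* Hypothetical stand-in "model"; replace with your own simulation. */
      static double model(double x0) {
          double x = x0, v = 0.0, dt = 1e-3;
          for (int i = 0; i < 5000; i++) {
              double a = -4.0 * x - 0.1 * v;   /* spring force plus damping */
              x += dt * v;
              v += dt * a;
          }
          return x;
      }

      int main(void) {
          double x0   = 1.0;
          double base = model(x0);

          /* Perturb the input by a few parts in 1e12 and record the spread. */
          double max_dev = 0.0;
          for (int k = -5; k <= 5; k++) {
              double dev = fabs(model(x0 * (1.0 + k * 1e-12)) - base);
              if (dev > max_dev) max_dev = dev;
          }
          printf("baseline output : %.17g\n", base);
          printf("max deviation   : %g\n", max_dev);
          return 0;
      }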

  • by immaterial ( 1520413 ) on Thursday November 21, 2013 @10:49PM (#45487471)
    For a guy who started off a reply with an emphatic "Wrong" you sure do seem to agree with the guy you quoted.
  • by Anonymous Coward on Thursday November 21, 2013 @10:50PM (#45487483)

    Urgh. Getting math to be deterministic is a major pain in the neck. Most folks are completely ignorant of the fact that sum/avg/stddev in just about all distributed databases will return a *slightly* different result every time you run the code (and the results are almost always different between vendors/platforms, etc.), because floating-point addition is not associative and the order in which partial sums are combined varies from run to run - see the sketch below.
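
    A small C sketch of the effect (a hypothetical four-worker split, not any particular database engine): the same data summed left-to-right versus combined from partial sums gives slightly different low-order digits.

      #include <stdio.h>
      #include <stdlib.h>

      int main(void) {
          enum { N = 1000000 };
          static double x[N];
          srand(42);
          for (int i = 0; i < N; i++)
              x[i] = (double)rand() / RAND_MAX * 1e8;

          /* Sequential left-to-right sum. */
          double seq = 0.0;
          for (int i = 0; i < N; i++) seq += x[i];

          /* "Distributed" sum: four partial sums combined at the end,
             mimicking four worker nodes seeing interleaved rows. */
          double part[4] = {0.0, 0.0, 0.0, 0.0};
          for (int i = 0; i < N; i++) part[i % 4] += x[i];
          double dist = part[0] + part[1] + part[2] + part[3];

          printf("sequential : %.6f\n", seq);
          printf("4-way split: %.6f\n", dist);
          printf("difference : %g\n", seq - dist);
          return 0;
      }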

  • by gl4ss ( 559668 ) on Thursday November 21, 2013 @10:53PM (#45487501) Homepage Journal

    The question was not about compilers, or indeed about software, but about FPUs - about firing up the same instance, with the same compilers and indeed with the same original binary.

    It sounds like fishing for reasons to have a budget to keep old power hardware around.

    I would think that if the results change enough to matter depending on the FPU, then the whole calculation method is suspect to begin with and exploits some feature/bug to get a tuned result (though assuming the CPU/VM adheres to the standard, the results would be the same - and if the old one doesn't and the new one does, then I think an honest scientist would want to know that too).

  • by amck ( 34780 ) on Friday November 22, 2013 @07:56AM (#45489529) Homepage

    Getting the result to be deterministic is only the start of the problem. How do you know it is _correct_ - or, more properly, do you know the error bounds involved? How much does it matter to your problem?

    e.g. If I am doing a 48-hour weather forecast, I can compare my results with observations next week; I can treat numerical error as a part of "model" error along with input observational uncertainty, etc.

    I might validate part of my solutions by checking that, for example, the total water content of my planet doesn't change. For a 48-hour forecast, I might tolerate methods that slightly lose water over 48 hours in return for a fast solution. For a climate forecast/projection, this would be unacceptable.

    Getting the same answer every time is no comfort if I have no way of knowing whether it's the right answer.
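
    As a toy illustration of that kind of validation, here is a hedged C sketch of a conservation check: instead of demanding bit-identical results, verify that a conserved quantity (here, "total water") drifts by less than a tolerance chosen for your own problem. total_water() and step_model() are hypothetical stand-ins, and the 1e-9 threshold is illustrative, not a recommendation.

      #include <stdio.h>
      #include <math.h>

      #define NCELLS 1000
      static double cells[NCELLS];

      static double total_water(void) {
          double t = 0.0;
          for (int i = 0; i < NCELLS; i++) t += cells[i];
          return t;
      }

      /* Stand-in model step: move a little water between neighbouring cells.
         In exact arithmetic this conserves the total; in floating point it
         drifts slightly, which is what the check below measures. */
      static void step_model(void) {
          for (int i = 0; i + 1 < NCELLS; i++) {
              double flux = 1e-3 * (cells[i] - cells[i + 1]);
              cells[i]     -= flux;
              cells[i + 1] += flux;
          }
      }

      int main(void) {
          for (int i = 0; i < NCELLS; i++) cells[i] = 1.0 + 0.001 * i;

          double before = total_water();
          for (int step = 0; step < 48 * 60; step++)   /* "48 hours" of steps */
              step_model();
          double after = total_water();

          double rel_drift = fabs(after - before) / before;
          printf("relative water drift: %g\n", rel_drift);
          printf("%s\n", rel_drift < 1e-9 ? "PASS: within tolerance"
                                          : "FAIL: conservation violated");
          return 0;
      }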

"Gravitation cannot be held responsible for people falling in love." -- Albert Einstein

Working...