Math

Ask Slashdot: Is There a 'Standard' Way of Formatting Numbers? 84

Long-time Slashdot reader Pieroxy is working on a new open source project, a web-based version of the system-monitoring software Conky.

The ultimate goal is to send the data to an HTML interface "to find some use for the old iPads/tablets/laptops we all have lying around. You can put them next to your screen and have your metrics displayed there...!"

There's just one problem: "I had to come up with a way for users to format a number." I needed a small string the user could write to describe exactly what they want to do with their number. Some examples can be: write it as a 3-digit number suffixed by SI prefixes when the numbers are too big or too small, display a timestamp as HH:MM string, or just the day of week, eventually cut to the first three characters, do the same with a timestamp in milliseconds, or nanoseconds, display a nice string out of a number of seconds to express a duration ("3h 12mn 17s"), pad the number with spaces so that all numbers are aligned (left or right), force a fixed number of digits after the decimal point, etc.

In other words, I was looking for a "universal" way of formatting numbers and failed to find any kind of standard online.

Do Slashdot readers know of such a thing or should I create my own?
This discussion has been archived. No new comments can be posted.

  • Betteridge was here (Score:5, Informative)

    by drinkypoo ( 153816 ) <drink@hyperlogos.org> on Saturday August 14, 2021 @11:38AM (#61691777) Homepage Journal

    No, there are lots of standards, as usual. They vary with locale.

    • He's not looking for a standard number format, but a standard for number format specification, i.e. a way of solving the problem that different locales call for different ways of formatting numbers.

      • I thought he was looking for

        some use for the old iPads/tablets/laptops we all have lying around.

        Nonetheless it's good to hear even a small acknowledgement of complicity in the flood of toxic debris left by the digital age as it goes about making our lives better.

      • There are a number of standards but if you're looking for something reasonably ubiquitous, grab the one from the Unix "date" manual page.

        https://man7.org/linux/man-pag... [man7.org]

        • by vivian ( 156520 )

          Since this is likely to be a format specified by users then it would be better to use something that more people come into contact with - like Excel's format specifiers.
          They are a little less arcane looking than the Unix date format specifiers too, with all those % signs.

          • by thogard ( 43403 )

            The top of the list of things you don't want to take from excel is the date/time processing concepts.

            • The most important comments are often also the most unassuming.

              The top of the list of things you don't want to take from excel is the date/time processing concepts.

              I cannot over-emphasise how much I agree with this.

              <rant>
              I am after spending most of the last two full weeks trying to import several hundred .csv files into an Excel spreadsheet. The files have the dates and times in a single field, in US format:

              ,m/d/yy 3:14 AM,

              But my system's regional settings are set to the Irish standard:

              dd/mm/yyyy 03:14:00

              Importing through "Power Query" (a lie if there ever was) was a fecking nightmare. Even when

    • There are different kinds of numbers to boot. Something related to time is going to be formatted differently than something related to precision.

    • If somebody is even asking this question then "creating their own" won't help.

      The best you can hope for is:
      a) Let them type something
      b) Cross your fingers and hope that scanf() will do the right thing based on their locale setting.

      • Yeah what the person with the question probably needs to do is use a locale library. Let the user choose a locale. Pass the chosen locale to the formatting functions provided by the library. That's the right answer 99.9% of the time this question is asked.

        If locales truly don't answer the need, printf() format strings are widely used for specifying formats, and have been for many decades.

        HOWEVER, printf and friends can also introduce vulnerabilities when you let the user specify the format string. It needs
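A minimal sketch of the whitelisting idea, assuming the user may supply exactly one printf-style numeric conversion. The names `SAFE_SPEC` and `format_number` and the accepted subset of conversions are hypothetical. In C the danger is conversions such as `%n` or `%s` reading or writing memory; Python's `%` operator does not have that problem, so this only illustrates validating the spec before use.

```python
import re

# Hypothetical whitelist: one conversion with optional flags, width and
# precision, restricted to numeric types (d, f, e, g). Anything else is refused.
SAFE_SPEC = re.compile(r"%[-+ 0]*\d*(?:\.\d+)?[dfeg]")

def format_number(spec: str, value: float) -> str:
    """Apply a user-supplied printf-style spec after validating it."""
    if not SAFE_SPEC.fullmatch(spec):
        raise ValueError(f"unsupported format spec: {spec!r}")
    if spec.endswith("d"):
        value = int(value)  # %d is meant for integers
    return spec % value

print(format_number("%8.2f", 3.14159))  # fixed number of decimals, padded to 8
print(format_number("%05d", 42))        # zero-padded integer
```

Rejecting everything outside a small known subset is easier to reason about than trying to blacklist the dangerous conversions.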

    • by fermion ( 181285 )
      You can always tell when a newbie is trying to write code. The question is how to specify format. The solution is old. Not like Hidden Figures Euler old, but it is a solved problem. Look at FORTRAN or C. These have specs for defining number formats.
    • by AmiMoJo ( 196126 ) on Saturday August 14, 2021 @12:40PM (#61692021) Homepage Journal

      ISO has universal formats for most things.

      ISO 8601 relates to time. For example, durations are PnYnMnDTnHnMnS. Never really took off... ISO date is an internationally accepted standard, it's obvious to everyone what 2021-08-14 is. At least I hope it is.
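As a rough illustration of the PnYnMnDTnHnMnS shape, here is a sketch that renders a whole number of seconds as an ISO 8601 duration. The function name `iso8601_duration` is hypothetical, and the sketch deliberately stops at days, because years and months have no fixed length in seconds.

```python
def iso8601_duration(total_seconds: int) -> str:
    """Render a non-negative whole number of seconds as PnDTnHnMnS."""
    if total_seconds < 0:
        raise ValueError("negative durations are not handled in this sketch")
    days, rem = divmod(total_seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    date_part = f"{days}D" if days else ""
    time_part = "".join(
        f"{n}{unit}"
        for n, unit in ((hours, "H"), (minutes, "M"), (seconds, "S"))
        if n
    )
    if not date_part and not time_part:
        return "PT0S"  # conventional way to write a zero-length duration
    return "P" + date_part + ("T" + time_part if time_part else "")

print(iso8601_duration(11537))  # 3 hours, 12 minutes, 17 seconds
```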

      • >"it's obvious to everyone what 2021-08-14 is. At least I hope it is."

        It certainly is. And I wish it was used everywhere. Several years ago, I started trying to convert to it, myself. Not only does it sort correctly, it conforms to most significant first, and nobody can be confused. Unlike the worst example, 03-06-11, which could be just about anything. Oh well.

        • It's obvious to everyone what 2021-08-14 is

          It may be obvious to people, but computers may think the "-" is executable.

          I use 2021au14 where this could be an issue.

          I realise it does not sort so well, and the letters are locale specific. For the UK: ja, fe, mr, ap, my, jn, jy, au, se, oc, no, de.

          This is, or was, used by London Transport (Buses, Underground, Overground)

          • I use 2021au14 where this could be an issue. I realise it does not sort so well, and the letters are locale specific.
            For the UK: ja, fe, mr, ap, my, jn, jy, au, se, oc, no, de.

            For months, why not use the more universal three-letter designation?
            Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec ?

            • For months, why not use the more universal three-letter designation?

              Is that universal? How well does it work for a big user base, like ... let's start at the top and say "China"?

              Oh, sorry, you meant "universal amongst those using the Latin writing system". So, how well does it work in, say, Swahili, where the month names are more obviously based on the count of months since a "New Year" (which, obviously, does not correspond to the calendar year).

              Mutter, mumble. Calendars based on lunar months, like thos

              • You're right. I should have said the more standardized three-letter abbreviations for months in English speaking locations like the UK location he mentioned.

                But you could have raised that point in a single sentence. You must also get really incensed when they crown "Miss Universe" without consulting the rest of the universe for permission to grant that title. But perhaps they did. I'll check with an astronomer too.

                • I work from a UK location too. But I have to deal with strange things like American, French, German, Norwegian, Arabic and Swahili terminology. One's physical location is a pretty atrocious guide to the input you're going to get.

                  I thought the thought-criminals behind "Miss Universe" had been strung up on the Gileadean Wall of Canceldom, long since. Or were they just thrown off the boat? One bloated toadish plutocrat looks very much like another after a while.

        • You mean it sorts correctly if you use a string comparison instead of a date comparison… BTW 14 is not a correct number for a month, so your example is wrong.
    • "The great thing about standards is that there are so many to choose from."

    • here
      https://developer.mozilla.org/... [mozilla.org]

      only problem is if your device's browser is too old.

    • There are standards. See other comments below.
  • Binary is most common in digital world. People just don't think about it.
    • Re:binary (Score:5, Insightful)

      by chispito ( 1870390 ) on Saturday August 14, 2021 @11:49AM (#61691813)

      Binary is most common in digital world. People just don't think about it.

      Well... you still have big endian and little endian.

      • by swilver ( 617741 )

        That's just how you store them in memory, it doesn't affect their meaning.
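To make the storage-versus-meaning point concrete, here is a small sketch using Python's standard `struct` module. The value 0x0102 is an arbitrary example; the two byte orders produce different byte sequences but decode to the same number when read back with the matching order.

```python
import struct

value = 0x0102  # arbitrary 16-bit example value (258 decimal)

big = struct.pack(">H", value)     # most significant byte first
little = struct.pack("<H", value)  # least significant byte first

# Decoding with the matching byte order recovers the same number.
decoded_big = struct.unpack(">H", big)[0]
decoded_little = struct.unpack("<H", little)[0]

# Decoding with the wrong byte order silently yields a different number.
misread = struct.unpack("<H", big)[0]

print(big.hex(), little.hex(), decoded_big, decoded_little, misread)
```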

      • by dasunt ( 249686 )

        Well... you still have big endian and little endian.

        You can also have middle-endian. For example, a computer that stores a sixteen bit number as two eight-bit numbers listed in big-endian order, but each eight-bit number is little-endian.

        • Well... you still have big endian and little endian.

          You can also have middle-endian. For example, a computer that stores a sixteen bit number as two eight-bit numbers listed in big-endian order, but each eight-bit number is little-endian.

          You can also have Modbus...

        • Honeywell, wasn't it?

          But I think they used two eighteen-bit words to comprise a 36-bit real, and there was a "short real" form with an eighteen bit index and an eighteen bit mantissa as well.

    • Decimal is also quite popular on some places - especially after lockdown. I havenâ(TM)t been to too many stores - 2 so far - but they both have their prices in ten based system. I havenâ(TM)t figured out what that weird symbol in from of them means but maybe itâ(TM)s the symbol of ten - £.
      • Do BCD as a compromise.

      • I havenâ(TM)t figured out what that weird symbol in from of them means but maybe itâ(TM)s the symbol of ten - £.

        The world can't even agree on basic English punctuation.

        • +1 Funny

        • It's not "English" punctuation, but punctuation in the Latin writing system. And the world does agree on characters for punctuation in the Latin writing system (hint : ASCII, then Unicode), it's just that Apple chose to use non-Latin characters to map to Latin characters on their keyboards. (I think it's Apple, though the last Apple machine I used didn't do this. I got rid of that in about 2010.)

          There is a corollary that a significant proportion of Slashdot's editors don't know their Apples from their Ora

    • He seeks a standard for number format specification, not a standard format.

  • by ScienceMan ( 636648 ) on Saturday August 14, 2021 @12:00PM (#61691849)
    The Data Format Description Language (DFDL) from the Open Grid Forum is the most comprehensive standard for describing the format of arbitrary fixed data presentations. Multiple implementations exist including the open source Apache Daffodil code and other commercial and open implementations. There is an active community of developers of DFDL description schemas on GitHub and implementations in the European Space Agency for example for satellite data. The published full recommendation is available at http://www.ogf.org/documents/G... [ogf.org] and a wide variety of publicly maintained schemas is at https://github.com/DFDLSchemas [github.com] covering applications from point of sale systems to scientific and health care records. You can write a DFDL description from scratch without much effort. You can also convert between formats by using or creating DFDL descriptions of the source and target formats using tools like Apache Daffodil (https://daffodil.apache.org/).
    • That sounds like it addresses the questioner's problem. Unfortunately, I've commented already, so I can't upvote your message.

      I was suspecting that "some XML" would probably cover the bill, but that sounds like a particular range of XML-ish properties.

  • Nope (Score:4, Insightful)

    by Joce640k ( 829181 ) on Saturday August 14, 2021 @12:02PM (#61691861) Homepage

    No, there's no universal standard. Forget it.

    Usually your OS has a locale setting. It makes C functions like scanf() and sprintf() work differently in different places. ...which leads to interesting bugs if you try to read text files with numbers that were created in a different country than yours. You need to know the locale of where the file was created, but nobody ever supplies that information.
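A sketch of why the originating locale matters when reading numbers back. Rather than relying on the process locale, this hypothetical `parse_number` helper takes the separators explicitly, which is one possible way to carry the missing information alongside the file.

```python
def parse_number(text: str, decimal_sep: str = ".", group_sep: str = ",") -> float:
    """Parse a number written with the given decimal and grouping separators."""
    cleaned = text.strip().replace(group_sep, "").replace(decimal_sep, ".")
    return float(cleaned)

# The same characters mean different values under different conventions.
as_us = parse_number("1.234", decimal_sep=".", group_sep=",")
as_german = parse_number("1.234", decimal_sep=",", group_sep=".")
print(as_us, as_german)
```

The text "1.234" is a little over one in the first convention and over a thousand in the second, and nothing in the string itself says which was intended.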

    Even standards like CSV files are different in other places. You'd think the clue to the separator would be in the 'C' part of "CSV", but no, CSV values can be separated by semicolons or glub-knows-what depending on where they get created.

    • by AmiMoJo ( 196126 )

      I've written code to do this and I found the best option is to use SI units and a maximum of 3 digits before the decimal place. At least it was in my application.
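One way the SI-prefix approach could look, as a sketch only: the function name `si_format`, the prefix table and the clamping range are assumptions, not the commenter's code. The exponent is taken from the rounded scientific notation so that a value like 999999 rounds up to the next prefix instead of producing four digits.

```python
# Assumed subset of SI prefixes, keyed by power of ten.
SI_PREFIXES = {-9: "n", -6: "µ", -3: "m", 0: "", 3: "k", 6: "M", 9: "G", 12: "T"}

def si_format(value: float, digits: int = 3) -> str:
    """Format value with `digits` significant digits and an SI prefix."""
    # Round first, in scientific notation, then read the exponent back.
    mantissa_text, exp_text = f"{value:.{digits - 1}e}".split("e")
    exp10 = int(exp_text)
    # Snap to a multiple of three and clamp to the prefixes we know about.
    exp3 = max(min((exp10 // 3) * 3, 12), -9)
    scaled = float(mantissa_text) * 10 ** (exp10 - exp3)
    return f"{scaled:.{digits}g}{SI_PREFIXES[exp3]}"

for sample in (1234, 999999, 0.047, 0):
    print(sample, "->", si_format(sample))
```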

    • Even standards like CSV files are different in other places.

      I had occasion to use pipe-separated-values (PSV), because the separated strings often contain commas. Using '|' as a separator avoided having to quote everything. The data was for circuit board parts lists and a parts inventory, which could be crunched by Python code, and presented as spreadsheets. Python's csv library and LibreOffice Calc are quite happy with this variant, as both give you a choice of separator.

      I chose pipe as a separator, because SQLite uses that as a default, when representing a table r
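A short sketch of the pipe-separated variant using Python's standard `csv` module, which accepts a `delimiter` argument. The parts-list rows are made-up sample data.

```python
import csv
import io

# Made-up parts list; note the commas inside the description field.
rows = [
    ["R1", "Resistor, 10k, 1%", "12"],
    ["C3", "Capacitor, 100nF", "40"],
]

# Write with '|' as the separator, so the commas need no quoting.
buffer = io.StringIO()
csv.writer(buffer, delimiter="|", lineterminator="\n").writerows(rows)
text = buffer.getvalue()
print(text)

# Read it back with the same delimiter.
parsed = list(csv.reader(io.StringIO(text), delimiter="|"))
```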

  • Are you basically looking for something like Java's DecimalFormat [oracle.com] and/or SimpleDateFormat specifications? These use a String to define how you wish to translate a given number or time stamp input. Or am I misunderstanding what you are looking for?
  • There is the old adage that there are three major problems in software development, being:

    1. Time
    2. Names
    3. Off-by-one errors

    As for the first one, which seems relevant to the question, number formatting, as well as time formatting is locale-based and, unfortunately, sometimes decoupled from whatever one might choose as language.

    As for myself, time and number formatting seems most intuitive for me in the Danish form, which is also followed elsewhere, whereas I prefer my UI language to be English (whatever th

  • by Growlley ( 6732614 ) on Saturday August 14, 2021 @12:32PM (#61691985)
    America won't use it.
  • Trying to reinvent the edited numeric formats the mainframes had before they were born. Amazing the amount of "modern developers" that can't handle numbers internally either, not understanding issues of floats versus BCD nor rounding.

  • by UTF-8 ( 680134 ) on Saturday August 14, 2021 @12:37PM (#61692009) Homepage

    I recommend using CLDR (http://cldr.unicode.org/). It has data to format numbers with measurement units, and recent versions include all of the language specific grammatical cases needed to format such a quantity in a sentence. ICU (http://site.icu-project.org/) is one of the common ways to access this localized data.

  • Seems to me you could borrow the format string from printf to describe numeric strings, and borrow the format string from the *NIX time command to describe date/time strings.
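A sketch of the date/time half of that suggestion, using the strftime-style `%` directives that the Unix `date` command also understands. The helper name `format_timestamp` is hypothetical; the timestamp is interpreted as UTC to keep the output independent of the machine's time zone.

```python
from datetime import datetime, timezone

def format_timestamp(ts_seconds: float, pattern: str) -> str:
    """Format a Unix timestamp (seconds, UTC) with a strftime-style pattern."""
    moment = datetime.fromtimestamp(ts_seconds, tz=timezone.utc)
    return moment.strftime(pattern)

ts = 1628941080  # 2021-08-14 11:38:00 UTC
print(format_timestamp(ts, "%H:%M"))
print(format_timestamp(ts, "%Y-%m-%d"))
print(format_timestamp(ts, "%a"))  # abbreviated weekday name, locale-dependent

# A timestamp in milliseconds only needs scaling before formatting.
print(format_timestamp(1628941080000 / 1000, "%H:%M"))
```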
  • ISO 8601 (Score:5, Informative)

    by Flu ( 16236 ) on Saturday August 14, 2021 @12:43PM (#61692031)
    It's the internationally approved standard for time/date/duration. There's no need whatsoever for any user to enter/display such units in any other way.
    • Re:ISO 8601 (Score:4, Informative)

      by nuckfuts ( 690967 ) on Saturday August 14, 2021 @02:06PM (#61692311)
      Great answer! I had not seen that standard before, and I'm pleased to see that the date format (e.g. 2021-08-14) is precisely what I have chosen to use personally for years. In file names, it gives me the sorting that I want when listing files, and it also resolves the aggravating ambiguity of dates like 2021-04-05.
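The sorting property is easy to check: because the fields run from most to least significant and are zero-padded, plain text order matches chronological order. A small sketch with made-up dates:

```python
from datetime import date

stamps = ["2021-08-14", "2020-12-31", "2021-04-05", "2021-01-09"]

by_text = sorted(stamps)                          # plain string comparison
by_date = sorted(stamps, key=date.fromisoformat)  # real date comparison
print(by_text == by_date)

# The same is not true of day-first formats: text order puts
# 14 August 2021 before 31 December 2020.
day_first = sorted(["31/12/2020", "14/08/2021"])
print(day_first)
```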
    • by Nobelium ( 16273 )

      Hello 16,236. You beat me by days (weeks?) to Slashdot. Congrats.

      From 16,273

  • It may help if you think about the task as formatting arbitrary strings with numbers as a part of them. Because that is what you are asking for.

    Most of the use cases you present could be handled by something relatively simple like the printf format string method, which while not quite a standard, is pretty widely known. Or, oh, the horror, the Excel custom number format.

    Is arbitrary formatting something your users want? You may be thinking of this from a programmer-centric perspective. Could you cover

  • Number formatting is one of those utility things that pops up twice a month in web development. Hence PHP pretty much has this covered. It's likely to be a thin inner api function for which an ancient C library exists that is the same foundation for number formatting in Perl, Python and probably a few other FOSS scripting languages. So it's also very likely you'll have a tough time finding something that's more powerful and/or stable.

    Google "number formatting in PHP" and you'll have your standard.

    You're we

  • Only accept IEEE 754 floating point numbers in Octal format.

  • by IcyWolfy ( 514669 ) on Saturday August 14, 2021 @01:16PM (#61692131) Homepage

    (many examples removed due to triggering error: "your comment looks like ascii art")

    It sounds like he wants a method to specify formatting.

    (ICU) Message Format for Java/C/PHP/JS/... is the most robust, but it is a programming language in itself, right down to conditional blocks and offsets, and lots of repetition to format the text portion correctly. But it covers Ordinals, Cardinals, Percentages, Ranges, Date, Time, etc.

    Where each domain to format (number, date, time) has a massive backing library.

    For the use-cases described, one would need to rely on a custom Formatter class to process the numbers for each of the complex handling (character truncation, max-length handling, alignment/padding specifications)

    Adding a CSS engine would simplify formatting by allowing blocks to be sized independently of the data, and display to be nicely aligned/formatted/truncated, but that adds yet another layer of complexity that likely isn't wanted.

    When you say "display as jjmm" this is still ambiguous, as the ICU specification requires "j" (user preferred hour "H"/"h" (24/12)), but also whether it is 0 based or 1 based ("h" or "k") to be defined as 00-23, 01-24, 00-11, 01-12; as different languages have different requirements here.

    DateTime tends to use the ICU specification of "G", "yyy", "MMMdd", "jjmmss", "e", "v" for outlining which components of a time are needed (era, year, month, day-of-month, hour, minute, second, day-of-week, (am/pm,) timezone). Depending on the fields displayed, the month will need to conjugate accordingly: MR-1920 (nominative month), 19-dMR-20 (genitive month).

    Representing a duration comes with added nuances, and requires format-specifiers ( 00:00:00:000 00'00"00.000 00h00m00s 00 h 00 m 00 s (localized, and conjugated 00 mn 01 mny 02 mni 03 mns 04 mns 10 mn 11 mny, etc.))

    Numbers, have different formatting for spacing depending on language #,00,000 or #,000,000 where the groupings change by language. Or, the symbols change ( 100'000.00 100,000.00 100.000,00 100 000,00 )
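To illustrate how both the group sizes and the separator can vary, here is a sketch of a grouping helper. The name `group_digits` and its parameters are hypothetical; `first` is the size of the rightmost group and `rest` the size of every group to its left, which is enough to express both the 3-3-3 and the 3-2-2 patterns.

```python
def group_digits(n: int, first: int = 3, rest: int = 3, sep: str = ",") -> str:
    """Group the digits of an integer from the right."""
    sign = "-" if n < 0 else ""
    digits = str(abs(n))
    if len(digits) <= first:
        return sign + digits
    head, tail = digits[:-first], digits[-first:]
    groups = [tail]
    while head:
        groups.append(head[-rest:])  # peel off the next group from the right
        head = head[:-rest]
    return sign + sep.join(reversed(groups))

print(group_digits(12345678))           # groups of three, comma
print(group_digits(12345678, rest=2))   # three, then twos
print(group_digits(12345678, sep="'"))  # groups of three, apostrophe
print(group_digits(12345678, sep=" "))  # groups of three, space
```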

    Padding generally doesn't have a standard anywhere, and CSS is most commonly used. Otherwise, falling back to an excel like "0" padded "#,000,000" type specifier, may be hacked into something useful.
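For comparison, some languages do fold padding into their format strings. A sketch using Python's own format-spec mini-language, which is language-specific rather than any kind of cross-platform standard; the sample values are arbitrary.

```python
values = [3.14159, 42.0, 1234.5]

# Width 10, two decimals: right-aligned, left-aligned, and zero-padded.
right = [f"{v:>10.2f}" for v in values]
left = [f"{v:<10.2f}" for v in values]
zero = [f"{v:010.2f}" for v in values]

for r, l, z in zip(right, left, zero):
    print(f"[{r}] [{l}] [{z}]")
```

This only pads with characters, so the columns line up in a monospaced font; with proportional fonts the layout engine still has to do the aligning.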

    Alignment is not handled by any system that I am aware of. CSS is the most common, with limited support for decimal alignment. The most common is setting tab-stops with a type of "decimal alignment" or "left-alignment" or "right-alignment" using a combination of tab-stops and tabs to create the effect needed. As this requires interaction with other elements on a page, it is deemed out of scope for any formatter algorithms, and delegated to page/layout engines (e.g. CSS)

    Significant Digits, Truncation, of numbers: equally not well supported. Again CSS truncation is nice here. But, outside of that it becomes complicated due to a standard spread of 3-6 digits before a change-over; special casing surrounding numbers so that "1100" doesn't become "1,1k" in a list of mostly 0-999;

    Formatting units may require conjugations, and have custom spacing rules: 00 unit 00unit 00[half-space]unit

    Special units may need special handling: Currency sign may be $0,00 $ 0,00 0,00$ 0,00 $. Percent/permille sign may have different spacing and placement within a given language.

    Formatting negative numbers have different requirements than positive numbers in many cases and must be handled separately.

    For each domain, there are many conventions available. Properly formatting times, dates, and numbers is a full time, titled job (12 years at it now). Allowing the user to customize is rife with error, as each display format has dozens of options, and configuration in addition to the raw-output specifiers. Not doing so will mean the output is incorrect for the majority of situations and languages.

    Using standards internally is great, but should generally never be exposed to the user.

    Giving a formatting tree is best:
    Number - Curr

  • First you need to get around the long/short scale [https://en.wikipedia.org/wiki/Long_and_short_scale] and the Indian numbering system [https://en.wikipedia.org/wiki/Indian_numbering_system] which is neither.
  • I know of no universal standard for numbers. I am reminded of a comment one of my brothers once made, "The great thing about standards is that there are so many of them."
  • But I believe it's still under discussion. For years. I would just look at a native format that Logstash understands natively and go with that.

  • I think it really depends on what you're trying to do. If it's for humans then of course you use things humans are familiar with; many things are increasingly decimal notation these days, but there are also fractions. For time you use the standard formats for those. As others mention, there are some standards for time formats. So the standard is really what you already see in everyday use, for humans. It's usually best to go with what everyone is already familiar with for human-readable applications.

    For comput

  • Have a look at what Excel does for formatting numbers.
    Select a cell and hit Ctrl 1 or Cmd 1
    Look at the various formatting options. They have been designed (or evolved over many years) to be relatively straightforward for non-technical users.
    They cover a vast range of different types - numbers, dates, times, currency etc.

  • If you let people type in stuff, they'll type in crap.
    Just give them a drop-down list of all the allowable numbers formatted the way you want in one of those infinite scroll boxes.
    Then they can just mouse-wheel down and click the number they want.
    There are many advantages to this, for you, BOFH.

  • I think this method (from Provue's Panorama database programming environment ) covers just about every conceivable option:

    http://www.provue.com/panorama... [provue.com]

    csw

  • Sounds like a nifty project. Good luck with it (seriously). There's a lot of good (and funny) stuff in these comments and I'm sure you'll have a great time looking into the various formatting options available from the different standards, etc that have been listed, and any others you happen to find (e.g. the .NET ones) ... But all I would suggest/ask is that, from the user's perspective, you: - Choose something reasonably sensible as a default format for each type of number (so your system just works witho
  • Why is this even on slashdot? this has been asked so SO many times in the past. I'll bet the one who's asking the question is just beginning out with development or something. Do a google and you'll find there really is no standard way. And if it's about input, YOU decide how people need to input the data, and there are more than enough libraries that deal with this subject.
    This is a question that, these days, belong on stackoverflow, but if you do a search there, you'll find more than enough answers.
    This person

  • It's already there for just this purpose.

  • I'm a fan of the Angular method. Check out https://angular.io/api/common#pipes for examples. The idea is you have a variable with a string in it. You then pass that number to a function of your choice which does whatever the function is designed to do. It also allows custom functions if the user wants to define their own formatting method. So sure, you can say the value is a number and have it formatted by the DecimalPipe. But for me I've added features like a phone number formatter, so I can simply s

  • none of the answers here understand what you're asking for, since they are trying to solve the wrong problem.

    asking the user to input a formatting string is idiotic, even for technical users. the whole point of a phone app is to be easy to use and needing to look up format identifiers is not.

    if your app really needs the flexibility, use a drop-down menu or other custom dialog to let them input the relevant time in one of N ways. maybe even have multiple input boxes which auto-update the compatible fields ba

  • The quickest solution (to at least get a functioning MVP) would be to let users supply a format string with format-specifiers from the standard library (for printf / String.Format / object.ToString) for whichever tech-stack you are using and live with whatever limitations they have (unless you are using JavaScript - which probably has no single 'standard' library for this)

    Then when you've got that working figure out how to make a GUI to guide users through the complexity of creating a valid format string f
