Ask Slashdot: Is There a 'Standard' Way of Formatting Numbers?
Long-time Slashdot reader Pieroxy is working on a new open source project, a web-based version of the system-monitoring software Conky.
The ultimate goal is to send the data to an HTML interface "to find some use for the old iPads/tablets/laptops we all have lying around. You can put them next to your screen and have your metrics displayed there...!"
There's just one problem: "I had to come up with a way for users to format a number." I needed a small string the user could write to describe exactly what they want to do with their number. Some examples can be: write it as a 3-digit number suffixed by SI prefixes when the numbers are too big or too small, display a timestamp as HH:MM string, or just the day of week, eventually cut to the first three characters, do the same with a timestamp in milliseconds, or nanoseconds, display a nice string out of a number of seconds to express a duration ("3h 12mn 17s"), pad the number with spaces so that all numbers are aligned (left or right), force a fixed number of digits after the decimal point, etc.
In other words, I was looking for a "universal" way of formatting numbers and failed to find any kind of standard online.
Do Slashdot readers know of such a thing or should I create my own?
Betteridge was here (Score:5, Informative)
No, there are lots of standards, as usual. They vary with locale.
Re: (Score:3)
He's not looking for a standard number format, but a standard for number format specification, i.e. a way of solving the problem that different locales call for different ways of formatting numbers.
Re: (Score:2)
some use for the old iPads/tablets/laptops we all have lying around.
Nonetheless it's good to hear even a small acknowledgement of complicity in the flood of toxic debris left by the digital age as it goes about making our lives better.
Re: (Score:2)
There are a number of standards but if you're looking for something reasonably ubiquitous, grab the one from the Unix "date" manual page.
https://man7.org/linux/man-pag... [man7.org]
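For a rough idea of what the date(1)-style specifiers look like in practice, here is a small sketch using Python's `time.strftime`, which accepts the same `%H`, `%M`, `%Y` conversions. The helper name `format_timestamp` and the choice of UTC are assumptions made for the example, not something taken from the man page.

```python
import time

def format_timestamp(epoch_seconds, spec):
    """Format a Unix timestamp (in seconds) as UTC using strftime-style specifiers."""
    return time.strftime(spec, time.gmtime(epoch_seconds))

ts = 1628944496  # 2021-08-14 12:34:56 UTC
print(format_timestamp(ts, "%H:%M"))     # hours and minutes only
print(format_timestamp(ts, "%Y-%m-%d"))  # ISO-style date
```

A timestamp in milliseconds or nanoseconds would need dividing down to seconds before being passed in.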
Re: (Score:1)
Since this is likely to be a format specified by users then it would be better to use something that more people come into contact with - like Excel's format specifiers.
They are a little less arcane looking than the Unix date format specifiers too, with all those % signs.
Re: (Score:2)
The top of the list of things you don't want to take from excel is the date/time processing concepts.
Re: (Score:2)
The most important comments are often also the most unassuming.
The top of the list of things you don't want to take from excel is the date/time processing concepts.
I cannot over-emphasise how much I agree with this.
<rant> I am after spending most of the last two full weeks trying to import several hundred .csv files into an Excel spreadsheet. The files have the dates and times in a single field, in US format:
But my system's regional settings are set to the Irish standard:
Importing through "Power Query" (a lie if there ever was) was a fecking nightmare. Even when
Re: (Score:3)
There are different kinds of numbers to boot. Something related to time is going to be formatted differently than something related to precision.
Re: (Score:2)
If somebody is even asking this question then "creating their own" won't help.
The best you can hope for is:
a) Let them type something
b) Cross your fingers and hope that scanf() will do the right thing based on their locale setting.
Locale libraries or wrapped printf() (Score:2)
Yeah what the person with the question probably needs to do is use a locale library. Let the user choose a locale. Pass the chosen locale to the formatting functions provided by the library. That's the right answer 99.9% of the time this question is asked.
If locales truly don't answer the need, printf() format strings are widely used for specifying formats, and have been for many decades.
HOWEVER, printf and friends can also introduce vulnerabilities when you let the user specify the format string. It needs
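One way to limit the damage a user-supplied format string can do is to accept only a narrow, validated subset of specifiers. The sketch below uses Python's `format()` mini-language rather than C `printf`, and the whitelist (optional alignment, zero flag, width of at most two digits, optional thousands comma, optional precision, and an `f`/`e`/`g`/`%` type) is an assumption chosen purely for illustration.

```python
import re

# Hypothetical whitelist: align? zero-flag? width(0-2 digits) comma? .precision(1-2 digits)? type?
_SAFE_SPEC = re.compile(r"[<>^]?0?\d{0,2},?(\.\d{1,2})?[feg%]?")

def format_number(value, spec):
    """Format a number with a user-supplied spec, rejecting anything outside the whitelist."""
    if not _SAFE_SPEC.fullmatch(spec):
        raise ValueError(f"unsupported format spec: {spec!r}")
    return format(float(value), spec)

print(format_number(3.14159, "8.2f"))       # right-aligned in 8 columns
print(format_number(1234567.891, ",.2f"))   # thousands separators, 2 decimals
```

Capping the width keeps a user from requesting an enormous padded string, and anything that is not a plain numeric spec is refused before it reaches the formatter.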
Re: (Score:2)
Re:Betteridge was here (Score:4, Informative)
ISO has universal formats for most things.
ISO 8601 relates to time. For example, durations are PnYnMnDTnHnMnS. Never really took off... ISO date is an internationally accepted standard, it's obvious to everyone what 2021-08-14 is. At least I hope it is.
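To make the PnYnMnDTnHnMnS shape concrete, here is a minimal sketch that renders a number of seconds as an ISO 8601 duration. It only uses the day, hour, minute and second components (years and months are left out because their length varies), and the function name `iso8601_duration` is made up for the example.

```python
def iso8601_duration(total_seconds):
    """Render a non-negative whole number of seconds as an ISO 8601 duration."""
    if total_seconds < 0:
        raise ValueError("negative durations are not handled in this sketch")
    days, rem = divmod(int(total_seconds), 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    date_part = f"{days}D" if days else ""
    time_part = "".join(
        f"{n}{unit}" for n, unit in ((hours, "H"), (minutes, "M"), (seconds, "S")) if n
    )
    if not date_part and not time_part:
        return "PT0S"  # a zero duration still needs at least one component
    return "P" + date_part + ("T" + time_part if time_part else "")

print(iso8601_duration(11537))  # 3 hours, 12 minutes, 17 seconds
```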
Re: (Score:3)
>"it's obvious to everyone what 2021-08-14 is. At least I hope it is."
It certainly is. And I wish it was used everywhere. Several years ago, I started trying to convert to it, myself. Not only does it sort correctly, it conforms to most significant first, and nobody can be confused. Unlike the worst example, 03-06-11, which could be just about anything. Oh well.
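The sorting claim is easy to check. The sketch below compares plain string sorting of ISO dates with a month-first two-digit-year layout; the three sample dates are arbitrary.

```python
from datetime import date

dates = [date(2021, 8, 14), date(2011, 3, 6), date(2021, 1, 2)]
chronological = sorted(dates)

iso_strings = [d.isoformat() for d in dates]          # YYYY-MM-DD
us_strings = [d.strftime("%m-%d-%y") for d in dates]  # MM-DD-YY

# String order matches date order for the ISO layout...
assert sorted(iso_strings) == [d.isoformat() for d in chronological]
# ...but not for the month-first layout.
assert sorted(us_strings) != [d.strftime("%m-%d-%y") for d in chronological]
```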
Re: (Score:2)
It may be obvious to people, but computers may think the "-" is executable.
I use 2021au14 where this could be an issue.
I realise it does not sort so well, and the letters are locale specific. For the UK: ja, fe, mr, ap, my, jn, jy, au, se, oc, no, de.
This is, or was, used by London Transport (Buses, Underground, Overground)
Re: (Score:2)
I use 2021au14 where this could be an issue. I realise it does not sort so well, and the letters are locale specific.
For the UK: ja, fe, mr, ap, my, jn, jy, au, se, oc, no, de.
For months, why not use the more universal three-letter designation?
Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec ?
Re: (Score:2)
Is that universal? How well does it work for a big user base, like ... let's start at the top and say "China"?
Oh, sorry, you meant "universal amongst those using the Latin writing system". So, how well does it work in, say, Swahili, where the month names are more obviously based on the count of months since a "New Year" (which, obviously, does not correspond to the calendar year).
Mutter, mumble. Calendars based on lunar months, like thos
Re: (Score:2)
You're right. I should have said the more standardized three-letter abbreviations for months in English speaking locations like the UK location he mentioned.
But you could have raised that point in a single sentence. You must also get really incensed when they crown "Miss Universe" without consulting the rest of the universe for permission to grant that title. But perhaps they did. I'll check with an astronomer too.
Re: (Score:2)
I thought the thought-criminals behind "Miss Universe" had been strung up on the Gileadean Wall of Canceldom, long since. Or were they just thrown off the boat? One bloated toadish plutocrat looks very much like another after a while.
Re: Betteridge was here (Score:2)
Re: Betteridge was here (Score:2)
Re: (Score:2)
and nobody can be confused.
sigh
Re: (Score:2)
>"You mean it sorts correctly if you use a string comparison instead of a date comparison"
Correct
>"BTW 14 is not a correct number for a month, so your example is wrong."
No, MY example was 03-06-11
Re: (Score:3)
"The great thing about standards is that there are so many to choose from."
For ECMAscript Intl (Score:2)
here
https://developer.mozilla.org/... [mozilla.org]
only problem is if your device's browser is too old.
Re: Betteridge was here (Score:1)
binary (Score:2)
Re:binary (Score:5, Insightful)
Binary is most common in digital world. People just don't think about it.
Well... you still have big endian and little endian.
Re: (Score:2)
That's just how you store them in memory, it doesn't affect their meaning.
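A quick sketch of that point, using Python's `int.to_bytes` and `int.from_bytes`: the byte order differs, but the value read back is the same as long as the reader uses the order the writer used. The value 0x1234 is arbitrary.

```python
value = 0x1234

big = value.to_bytes(2, "big")        # most significant byte first: 0x12, 0x34
little = value.to_bytes(2, "little")  # least significant byte first: 0x34, 0x12

# The stored bytes differ, but each decodes back to the same number
assert big != little
assert int.from_bytes(big, "big") == value
assert int.from_bytes(little, "little") == value
```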
Re: (Score:2)
You can also have middle-endian. For example, a computer that stores a sixteen bit number as two eight-bit numbers listed in big-endian order, but each eight-bit number is little-endian.
Re: (Score:2)
You can also have middle-endian. For example, a computer that stores a sixteen bit number as two eight-bit numbers listed in big-endian order, but each eight-bit number is little-endian.
You can also have Modbus...
Re: (Score:2)
But I think they used two eighteen-bit words to comprise a 36-bit real, and there was a "short real" form with an eighteen bit index and an eighteen bit mantissa as well.
Re: binary (Score:2)
Re: (Score:2)
Do BCD as a compromise.
Re: (Score:2)
I havenâ(TM)t figured out what that weird symbol in from of them means but maybe itâ(TM)s the symbol of ten - £.
The world can't even agree on basic English punctuation.
Re: (Score:2)
+1 Funny
Re: (Score:2)
There is a corollary that a significant proportion of Slashdot's editors don't know their Apples from their Ora
Re: (Score:2)
He seeks a standard for number format specification, not a standard format.
Yes there are standards. Here's the most (Score:3, Informative)
Re: (Score:2)
I was suspecting that "some XML" would probably cover the bill, but that sounds like a particular range of XML-ish properties.
Nope (Score:4, Insightful)
No, there's no universal standard. Forget it.
Usually your OS has a locale setting. It makes C functions like scanf() and sprintf() work differently in different places. ...which leads to interesting bugs if you try to read text files with numbers that were created in a different country than yours. You need to know the locale of where the file was created, but nobody ever supplies that information.
Even standards like CSV files are different in other places. You'd think the clue to the separator would be in the 'C' part of "CSV", but no, CSV values can be separated by semicolons or glub-knows-what depending on where they get created.
Re: (Score:2)
I've written code to do this and I found the best option is to use SI units and a maximum of 3 digits before the decimal place. At least it was in my application.
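A minimal sketch of that approach, assuming three significant digits and the prefixes from pico to tera. The function name `si_format` and the prefix table are invented for the example and are not the commenter's actual code.

```python
import math

# Power of 1000 -> SI prefix (assumed range for this sketch)
_PREFIXES = {-4: "p", -3: "n", -2: "µ", -1: "m", 0: "", 1: "k", 2: "M", 3: "G", 4: "T"}

def si_format(value, digits=3):
    """Format value with `digits` significant digits and an SI prefix."""
    if value == 0:
        return "0"
    exp3 = math.floor(math.log10(abs(value)) / 3)  # which power of 1000
    exp3 = max(-4, min(4, exp3))                   # clamp to the known prefixes
    scaled = value / 1000.0 ** exp3
    # Rounding can push e.g. 999.96 up to 1000; move to the next prefix if so
    if abs(float(f"{scaled:.{digits}g}")) >= 1000 and exp3 < 4:
        exp3 += 1
        scaled /= 1000.0
    int_digits = len(str(int(abs(scaled))))
    decimals = max(digits - int_digits, 0)
    return f"{scaled:.{decimals}f}{_PREFIXES[exp3]}"

print(si_format(1234))     # kilo range
print(si_format(0.0042))   # milli range
```

Values outside the clamped range simply keep the largest or smallest prefix, which may or may not be what a real application wants.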
Re: (Score:2)
Even standards like CSV files are different in other places.
I had occasion to use pipe-separated-values (PSV), because the separated strings often contain commas. Using '|' as a separator avoided having to quote everything. The data was for circuit board parts lists and a parts inventory, which could be crunched by Python code, and presented as spreadsheets. Python's csv library and LibreOffice Calc are quite happy with this variant, as both give you a choice of separator.
I chose pipe as a separator, because SQLite uses that as a default, when representing a table r
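For illustration, reading pipe-separated values with Python's `csv` module only needs the `delimiter` argument. The parts-list rows below are made-up sample data.

```python
import csv
import io

# Hypothetical parts list; fields contain commas, so '|' is the delimiter
raw = "part|description|qty\nR1|Resistor, 10k, 1%|25\nC3|Capacitor, 100nF|10\n"

rows = list(csv.reader(io.StringIO(raw), delimiter="|"))
for row in rows:
    print(row)
```

Because the comma is no longer special, the description fields come through intact without any quoting.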
Trouble grokking the problem domain (Score:2)
Trying to be fair (Score:1)
There is the old adage that there are three major problems in software development, being:
1. Time
2. Names
3. Off-by-one errors
As for the first one, which seems relevant to the question, number formatting, as well as time formatting is locale-based and, unfortunately, sometimes decoupled from whatever one might choose as language.
As for myself, time and number formatting seems most intuitive for me in the Danish form, which is also followed elsewhere, whereas I prefer my UI language to be English (whatever th
Re:Trying to be fair (Score:5, Informative)
There is the old adage that there are three major problems in software development, being:
That adage is MUCH better when you say "two" instead of "three".
Re: (Score:2)
There are three recurring problems in computer science:
1. Date/time
2. Naming things
3. Off-by-one errors
4. Butchering jokes
Re: (Score:2)
1. Those who understand binary
10. Those who don't
Re: (Score:3)
There are four recurring problems in computer science:
1. Date/time
2. Naming things
3. Off-by-one errors
4. Butchering jokes
5. Monty Python references
Re:Trying to be fair (Score:4, Funny)
AMONG the recurring problems in computer science are:
1. Date/time ... I'll come in again.
10. Naming things
100. Off-by-one errors
4. People who don't understand binary
101. Butchering jokes
110. Monty Python references
Re: Trying to be fair (Score:1)
You are aware, of course, of what an off-by-one error is, right?
Re: (Score:2)
if there is (Score:5, Funny)
Re: (Score:3)
America wont use it.
Except if you call it the "American format".
hahaha zoomers (Score:2)
Trying to reinvent the edited numeric formats the mainframes had before they were born. Amazing the amount of "modern developers" that can't handle numbers internally either, not understanding issues of floats versus BCD nor rounding.
Use CLDR or ICU (Score:3)
I recommend using CLDR (http://cldr.unicode.org/). It has data to format numbers with measurement units, and recent versions include all of the language specific grammatical cases needed to format such a quantity in a sentence. ICU (http://site.icu-project.org/) is one of the common ways to access this localized data.
printf and time format strings... (Score:2)
ISO 8601 (Score:5, Informative)
Re:ISO 8601 (Score:4, Informative)
Re: (Score:1)
Hello 16,236. You beat me by days (weeks?) to Slashdot. Congrats.
From 16,273
Re: (Score:2)
Re: (Score:1)
How universal is universal? (Score:1)
It may help if you think about the task as formatting arbitrary strings with numbers as a part of them. Because that is what you are asking for.
Most of the use cases you present could be handled by something relatively simple like the printf format string method, which while not quite a standard, is pretty widely known. Or, oh, the horror, the Excel custom number format.
Is arbitrary formatting something your users want? You may be thinking of this from a programmer-centric perspective. Could you cover
Use the one (one of the ones?) in PHP. (Score:1)
Number formatting is one of those utility things that pops up twice a month in web development. Hence PHP pretty much has this covered. It's likely to be a thin inner api function for which an ancient C library exists that is the same foundation for number formatting in Perl, Python and probably a few other FOSS scripting languages. So it's also very likely you'll have a tough time finding something that's more powerful and/or stable.
Google "number formatting in PHP" and you'll have your standard.
You're we
Make your users smarter (Score:2)
Only accept IEEE 754 floating point numbers in Octal format.
Full Time Job (Score:3)
(many examples removed due to triggering error: "your comment looks like ascii art")
It sounds like he wants a method to specify formatting.
(ICU) Message Format for Java/C/PHP/JS/... is the most robust, but it is a programming language in itself, right down to conditional blocks and offsets, and lots of repetition to format the text portion correctly. But it covers Ordinals, Cardinals, Percentages, Ranges, Date, Time, etc.
Where each domain to format (number, date, time) has a massive backing library.
For the use-cases described, one would need to rely on a custom Formatter class to process the numbers for each of the complex handling (character truncation, max-length handling, alignment/padding specifications)
Adding a CSS engine would simplify formatting by allowing blocks to be sized independently of the data, and display to be nicely aligned/formatted/truncated, but that adds yet another layer of complexity that likely isn't wanted.
When you say "display as jjmm" this is still ambiguous, as the ICU specification requires "j" (user preferred hour "H"/"h" (24/12)), but also whether it is 0 based or 1 based ("h" or "k") to be defined as 00-23, 01-24, 00-11, 01-12; as different languages have different requirements here.
DateTime tends to use the ICU specification of "G", "yyy", "MMMdd", "jjmmss", "e", "v" for outlining which components of a time are needed (era, year, month, day-of-month, hour, minute, second, day-of-week, (am/pm,) timezone). Depending on the fields displayed, the month will need to conjugate accordingly: MR-1920 (nominative month), 19-dMR-20 (genitive month).
Representing a duration comes with added nuances, and requires format specifiers ( 00:00:00:000 00'00"00.000 00h00m00s 00 h 00 m 00 s (localized, and conjugated 00 mn 01 mny 02 mni 03 mns 04 mns 10 mn 11 mny, etc.))
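As a sketch of the simplest, non-localized case, the function below breaks a number of seconds into labelled components like the "3h 12mn 17s" from the question. The unit labels are passed in as a parameter precisely because they vary by language; the name `human_duration` and the default labels are assumptions for the example, and negative values are not handled.

```python
def human_duration(total_seconds, units=(("d", 86400), ("h", 3600), ("mn", 60), ("s", 1))):
    """Break a non-negative whole number of seconds into labelled parts, skipping zeros."""
    remaining = int(total_seconds)
    parts = []
    for label, size in units:
        count, remaining = divmod(remaining, size)
        if count:
            parts.append(f"{count}{label}")
    # Fall back to "0" plus the smallest unit when every component is zero
    return " ".join(parts) or f"0{units[-1][0]}"

print(human_duration(11537))
```

Conjugated unit names would need a lookup keyed on the count, which this sketch deliberately leaves out.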
Numbers have different formatting for spacing depending on language: #,00,000 or #,000,000, where the groupings change by language. Or the symbols change ( 100'000.00 100,000.00 100.000,00 100 000,00 )
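The grouping and symbol variations can be sketched with a small function that takes the separators and group sizes as parameters. Every name here (`group_digits`, `first_group`, `other_groups`) is hypothetical; a real application would normally pull these values from locale data rather than hard-code them.

```python
def group_digits(value, decimals=2, group_sep=",", decimal_sep=".",
                 first_group=3, other_groups=3):
    """Format a number with configurable separators and digit-group sizes."""
    sign = "-" if value < 0 else ""
    int_part, _, frac_part = f"{abs(value):.{decimals}f}".partition(".")
    # The rightmost group may differ in size from the groups to its left
    groups = [int_part[-first_group:]]
    rest = int_part[:-first_group]
    while rest:
        groups.insert(0, rest[-other_groups:])
        rest = rest[:-other_groups]
    grouped = group_sep.join(groups)
    return sign + grouped + (decimal_sep + frac_part if frac_part else "")

print(group_digits(100000, group_sep="'"))                  # apostrophe grouping
print(group_digits(100000, group_sep=".", decimal_sep=","))  # dot groups, comma decimal
print(group_digits(1234567, decimals=0, other_groups=2))     # 3-then-2 grouping
```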
Padding generally doesn't have a standard anywhere, and CSS is most commonly used. Otherwise, falling back to an excel like "0" padded "#,000,000" type specifier, may be hacked into something useful.
Alignment is not handled by any system that I am aware of. CSS is the most common, with limited support for decimal alignment. The most common is setting tab-stops with a type of "decimal alignment" or "left-alignment" or "right-alignment" using a combination of tab-stops and tabs to create the effect needed. As this requires interaction with other elements on a page, it is deemed out of scope for any formatter algorithms, and delegated to page/layout engines (e.g. CSS)
Significant Digits, Truncation, of numbers: equally not well supported. Again CSS truncation is nice here. But, outside of that it becomes complicated due to a standard spread of 3-6 digits before a change-over; special casing surrounding numbers so that "1100" doesn't become "1,1k" in a list of mostly 0-999;
Formatting units may require conjugations, and have custom spacing rules: 00 unit 00unit 00[half-space]unit
Special units may need special handling: Currency sign may be $0,00 $ 0,00 0,00$ 0,00 $. Percent/permille sign may have different spacing and placement within a given language.
Formatting negative numbers have different requirements than positive numbers in many cases and must be handled separately.
For each domain, there are many conventions available. Properly formatting times, dates, and numbers is a full time, titled job (12 years at it now). Allowing the user to customize is rife with error, as each display format has dozens of options, and configuration in addition to the raw-output specifiers. Not doing so will mean the output is incorrect for the majority of situations and languages.
Using standards internally is great, but should generally never be exposed to the user.
Giving a formatting tree is best:
Number - Curr
Long and short scale (Score:1)
No standard (Score:2)
Re: (Score:2)
We've upped our standards, so UP YOURS
ISO 8601 (Score:2)
But I believe it's still under discussion. For years. I would just look at a format that Logstash understands natively and go with that.
What we commonly use is the standard (Score:2)
I think it really depends on what you're trying to do. If it's for humans then of course you use things humans are familiar with; many things are increasingly decimal notation these days, but there are also fractions. For time you use the standard formats for those. As others mention, there are some standards for time formats. So the standard is really what you already see in everyday use, for humans. It's usually best to go with what everyone is already familiar with for human readable applications.
For comput
Look at what Excel does (Score:2)
Have a look at what Excel does for formatting numbers.
Select a cell and hit Ctrl 1 or Cmd 1
Look at the various formatting options. They have been designed (or evolved over many years) to be relatively straightforward for non-technical users.
They cover a vast range of different types - numbers, dates, times, currency etc.
Data entry is bad practice (Score:2)
If you let people type in stuff, they'll type in crap.
Just give them a drop-down list of all the allowable numbers formatted the way you want in one of those infinite scroll boxes.
Then they can just mouse-wheel down and click the number they want.
There are many advantages to this, for you, BOFH.
Describing number formats (Score:2)
I think this method (from Provue's Panorama database programming environment ) covers just about every conceivable option:
http://www.provue.com/panorama... [provue.com]
csw
Sounds like a nifty idea for a project (Score:1)
Why is this on slashdot? (Score:2)
Why is this even on Slashdot? This has been asked so SO many times in the past. I'll bet the one who's asking the question is just starting out with development or something. Do a Google search and you'll find there really is no standard way. And if it's about input, YOU decide how people need to input the data, and there are more than enough libraries that deal with this subject.
This is a question that, these days, belongs on Stack Overflow, but if you do a search there, you'll find more than enough answers.
This person
Do not try and re-invent TeX (Score:2)
It's already there for just this purpose.
There are methods... (Score:1)
I'm a fan of the Angular method. Check out https://angular.io/api/common#pipes for examples. The idea is you have a variable with a string in it. You then pass that number to a function of your choice which does whatever the function is designed to do. It also allows custom functions if the user wants to define their own formatting method. So sure, you can say the value is a number and have it formatted by the DecimalPipe. But for me I've added features like a phone number formatter, so I can simply s
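The same idea can be sketched outside Angular as a registry of named formatting functions chained with a `|` separator. This is not Angular's API; the `FORMATTERS` table, the formatter names, and the `name:arg` syntax below are all invented for the illustration.

```python
# Hypothetical registry of named formatters, loosely modelled on the pipe idea
FORMATTERS = {
    "fixed": lambda v, digits="2": f"{float(v):.{int(digits)}f}",
    "pad": lambda v, width="8": str(v).rjust(int(width)),
    "upper": lambda v: str(v).upper(),
}

def apply_pipes(value, expression):
    """Apply 'name:arg | name:arg' steps to value, left to right."""
    for step in expression.split("|"):
        name, *args = step.strip().split(":")
        if name not in FORMATTERS:
            raise KeyError(f"unknown formatter: {name}")
        value = FORMATTERS[name](value, *args)
    return value

print(apply_pipes(3.14159, "fixed:3 | pad:8"))
```

A user-defined formatter is just another entry added to the table, which is roughly the extensibility the comment describes.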
solution does not exist (Score:2)
none of the answers here understand what you're asking for, since they are trying to solve the wrong problem.
asking the user to input a formatting string is idiotic, even for technical users. the whole point of a phone app is to be easy to use and needing to look up format identifiers is not.
if your app really needs the flexibility, use a drop-down menu or other custom dialog to let them input the relevant time in one of N ways. maybe even have multiple input boxes which auto-update the compatible fields ba
Start by trying the standard library (Score:2)
The quickest solution (to at least get a functioning MVP) would be to let users supply a format string with format-specifiers from the standard library (for printf / String.Format / object.ToString) for whichever tech-stack you are using and live with whatever limitations they have (unless you are using JavaScript - which probably has no single 'standard' library for this)
Then when you've got that working figure out how to make a GUI to guide users through the complexity of creating a valid format string f