(Useful) Stupid Regex Tricks?
careysb writes to mention that in the same vein as '*nix tricks' and 'VIM tricks', it would be nice to see one on regular expressions and the programs that use them. What amazingly cool tricks have people discovered with respect to regular expressions in everyday life as a developer or power user?
IP and Hardware addresses (Score:5, Insightful)
And this one for MAC addresses
Re:IP and Hardware addresses (Score:4, Insightful)
Re:is it an rfc-822 compliant e-mail address? (Score:4, Insightful)
Mmmmm readable.
use Regexp::Common; (Score:5, Insightful)
$text_with_urls =~ m/$RE{URI}/;
$text_with_ips =~ m/$RE{net}{IPv4}/;
Re:ARGH!!!! (Score:2, Insightful)
So clearly, Slashdot's shit never stank?
No, seriously, why the bitching? Did you expect the site to just keep reporting dry stories about incremental Linux kernel upgrades for its entire existence? You expected a website to never change and never update with the times? Just because it's old doesn't mean it's sacred.
Re:IP and Hardware addresses (Score:4, Insightful)
If you mean "Is it an address that you can send IP traffic to?", then the answer is no. If you mean "Is it a valid value that can end up in an IP address field (e.g., in the response to the ipconfig command)?" then the answer is yes - it means that you've not got a connection.
Re:Do these questions really belong here? (Score:2, Insightful)
I like stackoverflow a lot and have been tangentially involved in other tech knowledge base-type sites, but they suffer from one typical problem.
People who already *have* certain knowledge don't often spend much time reading sites dedicated to dispensing that information.
Re:99 Bottles of Beer on the wall (Score:5, Insightful)
(I would quote the final result but /. won't allow that many "junk" characters.. let's hope that doesn't cripple this entire discussion.)
Interesting that a site for nerds doesn't allow a lot of characters commonly used in source code.
Re:Mainframe Formatting (Score:4, Insightful)
Re:Regexp-based address validation (Score:5, Insightful)
The regex is beautiful in the sense that it lets you not be one of those assholes who refuses valid email addresses.
(Useful) Stupid useless articles (Score:3, Insightful)
Dear slashdot editors,
slashdot.org is not stackoverflow.com [stackoverflow.com].
The articles and discussions here are not searchable in a sane way. Your recent attempts to mimic stackoverflow are just a waste of everybody's time because all those little tidbits that people post get lost in the internet noise immediately.
We know you're a bit desperate [alexa.com] for traffic these days. But this is not the way to go.
Re:IP and Hardware addresses (Score:4, Insightful)
Why isn't 0.0.0.0 or 10.* a valid IP address? Since when is the definition of IP address to be unicast and globally routable?
I'd rather take issue with the fact it completely fails on IPv6 addresses.
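To make the distinction concrete, here is a minimal Python sketch (not the regex from the parent post, which isn't reproduced here). It assumes "valid" means "syntactically a dotted quad with octets 0-255", so 0.0.0.0 and 10.* pass, and it falls back on the standard ipaddress module for IPv6. The names OCTET, IPV4_RE, is_ipv4_syntax and ip_version are made up for illustration.

```python
import ipaddress
import re

# One octet in the range 0-255, without leading zeros.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"
# Four octets separated by dots; no judgement about routability.
IPV4_RE = re.compile(rf"{OCTET}(?:\.{OCTET}){{3}}")


def is_ipv4_syntax(text: str) -> bool:
    """True if text is a syntactically valid dotted-quad IPv4 address."""
    return IPV4_RE.fullmatch(text) is not None


def ip_version(text: str):
    """Return 4 or 6 for a parseable address, or None otherwise."""
    try:
        return ipaddress.ip_address(text).version
    except ValueError:
        return None
```

Whether an address is unicast or globally routable is a separate question from whether it parses, so it is probably better answered after matching than baked into the pattern.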
Re:is it an rfc-822 compliant e-mail address? (Score:3, Insightful)
Does that thing allow nested comments, and escaping inside them? It doesn't look like it, it isn't recursive. (I have some in the email address I typically put online, ais523(524\)(525)x)@bham.ac.uk; that could be a good test for your email client, and is useful because I've never come across a spambot that can parse it.)
Recent versions of Perl (5.10 and later) let you write regexes recursively, and so does the third-party regex module for Python (the standard re module does not); that probably qualifies as a stupid regex trick, especially as it makes them more computationally powerful so they can handle things like email addresses. Or you could just sit wondering why email addresses allow nested comments anyway...
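For anyone stuck without recursive patterns, nesting and escaping can be handled by a small scanner that counts parenthesis depth. This is only a sketch under stated assumptions: strip_comments is a hypothetical helper, it uses no regex at all because Python's standard re module has no recursion, and it ignores quoted strings, which a real RFC 822 parser would also have to track.

```python
def strip_comments(address: str) -> str:
    """Remove RFC 822 style (nested) comments, honouring backslash escapes."""
    out = []
    depth = 0  # current comment nesting level
    i = 0
    while i < len(address):
        ch = address[i]
        if ch == "\\" and i + 1 < len(address):
            # A backslash escapes the next character; keep the pair only
            # when we are outside every comment.
            if depth == 0:
                out.append(address[i:i + 2])
            i += 2
            continue
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
        i += 1
    if depth != 0:
        raise ValueError("unbalanced comment parentheses")
    return "".join(out)
```

Fed the address above, it should leave just the local part and domain, which can then go to whatever simpler check you prefer.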
Re:email validation... FAIL (Score:4, Insightful)
Your regex doesn't allow + signs in the name part.
Nor, I would suspect, would it handle quoted strings, e.g. "Jeremy P"@example.com is technically a valid RFC 822 address.
And having just looked up the RFC 5322 spec which you quote, I see there are more cases you fail to take account of, e.g.
Jeremy P <jeremyp@example.com>
Also, what makes you think upper case in domain names is invalid? jeremyp@example.COM fails validation.
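A deliberately permissive check covers three of these cases without much effort. The sketch below is an assumption-laden illustration, not the parent's regex and not a full RFC 5322 parser: SIMPLE_ADDR_RE and looks_like_address are invented names, the display-name form is peeled off with the standard library's email.utils.parseaddr, and quoted local parts such as "Jeremy P"@example.com are still not handled.

```python
import re
from email.utils import parseaddr

# Permissive on purpose: an unquoted local part (including "+") and a
# dotted domain, matched case-insensitively.
SIMPLE_ADDR_RE = re.compile(
    r"[a-z0-9!#$%&'*+/=?^_`{|}~.-]+@[a-z0-9-]+(?:\.[a-z0-9-]+)+",
    re.IGNORECASE,
)


def looks_like_address(text: str) -> bool:
    """Accept 'Name <addr>' or a bare addr; reject only obvious non-addresses."""
    # parseaddr splits an optional display name from the address itself.
    _name, addr = parseaddr(text)
    return SIMPLE_ADDR_RE.fullmatch(addr) is not None
```

The design choice is to err on the side of accepting: a false positive costs one bounced message, while a false negative turns away a real user.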
Opposite (Score:4, Insightful)
IMHO, this is exactly the way that Slashdot should be going. Threads like this are interesting, add to the reservoirs of internet knowledge, and have the highest quality to noise ratios.
I (and I suspect many others) read Slashdot not for the latest +5 funny comment (though those can be fun to read) but to read the opinions of brilliant minds. And when those minds start trading secrets... Everyone wins.
Re:Validating credit card numbers (Score:3, Insightful)
The simplest way to do it, of course, is to just list all valid Luhn Algorithm numbers. Something like (.....384848583 | 938484845 | 8383838383......). Of course, this is probably not what you are looking for, because you will be listing a lot of numbers, and if your Luhn number is too big, then it won't be in your list.
So, as for a more general solution, it is possible because at each digit you can know whether your number matches so far or not. What you will be basically implementing is a regular expression that checks each digit and says, "does this digit move me to a state that is a valid number or an invalid number?" I could be wrong, but my initial estimate is that this will take less than a thousand states in a state machine (of course, the easiest way to do this is to design a state machine and then translate it to a regular expression).
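As a rough sketch of that state machine (written in Python rather than as a regex, and using the made-up name luhn_dfa_accepts): since the machine reads left to right and cannot know in advance which digits will be doubled, it tracks two running sums mod 10, one for each possibility. That gives on the order of a hundred states, consistent with the estimate above.

```python
def luhn_dfa_accepts(digits: str) -> bool:
    """Run the Luhn check as a left-to-right finite state machine."""
    if not digits or not (digits.isascii() and digits.isdigit()):
        return False
    # The state is a pair of sums, each kept mod 10:
    #   end_sum     - checksum if the digit just read is the final digit
    #   doubled_sum - checksum if the digit just read lands in a doubled slot
    end_sum, doubled_sum = 0, 0
    for ch in digits:
        d = int(ch)
        # Luhn doubling: double the digit, then add the digits of the result.
        doubled = 2 * d if d < 5 else 2 * d - 9
        # Reading one more digit swaps the roles of the two sums.
        end_sum, doubled_sum = (doubled_sum + d) % 10, (end_sum + doubled) % 10
    # Accept only if the "this was the last digit" checksum is zero.
    return end_sum == 0
```

Each transition depends only on the current pair and the next digit, so the same table could in principle be converted into a (very long) regular expression.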
To give an idea of what you are up against (and to help me find the answer to your question myself!) I implemented here a simple regular expression to determine if any binary addition will have an overflow at the last digit or not:
((0+1)+(1|(0+11)1+)+
You can do something similar, although much much longer, with the Luhn algorithm.
Hope that helps.