Providers Ignoring DNS TTL?
cluge asks: "It seems that several large providers give their users DNS servers that simply ignore DNS time to live (TTL). Over the past decade I've seen this from time to time. Recently it seems to be a pandemic, affecting very large cable/broadband and dial-up networks. A few tests against our broadband cable provider showed that only one of the three provided DNS servers picked up a change in seven days or less. After we turned in a trouble ticket with that provider, two of the three provided DNS servers were responding correctly, while the third was still providing bad information more than two weeks after that specific change. What DNS caches ignore TTL by default? Is there a valid technical reason to ignore TTL?"
"This struck me as odd, and I decided to run a few tests using my own domain. I lowered the TTL to twenty-four hours, made changes, and then checked to see when each change was picked up. I queried twelve outside DNS servers/caches that I had access to (thanks to my friends and relatives with dial-up and DSL who put up with me and my requests to reboot their machines daily!). Checks performed against these outside DNS servers indicate that it may take as much as four to five weeks before a DNS change is picked up! Most DNS servers picked up the change within 48 hours. A small number did not (three out of twelve - that's a quarter of them!).
This merits more study, and prompts a few questions. So, before I begin with a more serious broad study, I'd like to get some feedback on the problem as I've seen it. I know the tin foil hat crowd will see the failure to propagate DNS correctly as censorship, and the OS/bind/djb/whatever zealots will simply see this as an argument for their particular religion.
Based on the responses I get, I will then set up and test a couple of domains with different DNS servers for 6 weeks and report back the findings. [volunteers welcome!]"
I Noticed Too (Score:4, Insightful)
Re:Faulty system (Score:5, Insightful)
old TTL? (Score:3, Insightful)
Re:It's a strange pandemic... (Score:1, Insightful)
Re:I Noticed Too (Score:5, Insightful)
They've actually had to go to extra effort to break it on purpose.
BW (Score:2, Insightful)
The other problem is lazy or incompetent sysadmins...
Re:Yes there is (Score:5, Insightful)
Test case (Score:4, Insightful)
Leave the TTL unchanged for a while, then simply make a change to the record without modifying any other configuration, so that nothing else interferes with the test.
Why would you reboot? (Score:5, Insightful)
If you're rebooting client machines to check DNS records, then I'm forced to view your entire study with caution.
ELOGICFAULT (Score:5, Insightful)
Besides, this behavior blows up all sorts of geographical load balancing, datacenter failover, etc. type solutions (google for a F5 3DNS device sometime).
Bad stuff, mucking about with the TTL that someone has assigned to a record. It's not arbitrary information. To those fucking with TTLs, how about we arbitrarily alter the numbers in your paycheck? Oh? What's that? That doesn't seem like a good idea? Gee. Go Fig. HANDS OFF MY TTL, ASSHAT.
-AC
Re:Faulty system (Score:5, Insightful)
It's irresponsible tampering, it's that simple.
You did what? (Score:3, Insightful)
ipconfig
ipconfig
But wouldn't an easier way be just using dig to directly query the name servers?
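For anyone without dig handy, here is a rough Python sketch of the same idea: send a single A-record query straight to one named server and read back the TTL it reports. The function names (`build_query`, `parse_a_records`, `query_server`) are made up for illustration, it only handles plain A records over UDP, and it does not verify the response ID, so treat it as a toy rather than a dig replacement.

```python
import socket
import struct

def build_query(name, query_id=0x1234, recursion_desired=True):
    """Build a minimal DNS query packet asking for the A record of `name`."""
    flags = 0x0100 if recursion_desired else 0x0000  # RD bit
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN

def skip_name(packet, offset):
    """Return the offset just past a (possibly compressed) domain name."""
    while True:
        length = packet[offset]
        if length == 0:
            return offset + 1
        if length & 0xC0 == 0xC0:  # compression pointer: 2 bytes, ends the name
            return offset + 2
        offset += 1 + length

def parse_a_records(packet):
    """Return [(ttl_seconds, ipv4_string), ...] from the answer section."""
    _id, _flags, qdcount, ancount, _ns, _ar = struct.unpack("!HHHHHH", packet[:12])
    offset = 12
    for _ in range(qdcount):
        offset = skip_name(packet, offset) + 4  # skip QTYPE and QCLASS
    answers = []
    for _ in range(ancount):
        offset = skip_name(packet, offset)
        rtype, _rclass, ttl, rdlength = struct.unpack(
            "!HHIH", packet[offset:offset + 10]
        )
        offset += 10
        if rtype == 1 and rdlength == 4:  # A record
            answers.append((ttl, socket.inet_ntoa(packet[offset:offset + 4])))
        offset += rdlength
    return answers

def query_server(server_ip, name, timeout=3.0):
    """Send one UDP query to server_ip and return its A records with TTLs."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server_ip, 53))
        response, _addr = sock.recvfrom(512)
    return parse_a_records(response)
```

Calling `query_server` once against the domain's authoritative server and once against the ISP's cache, with the same name, shows directly whether the cache is holding an old address and what TTL it claims is left. No reboots needed.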
Re:People abusing it on the other end... (Score:2, Insightful)
Re:You did what? (Score:3, Insightful)
Re:You can use TTL to keep customers from leaving! (Score:3, Insightful)
Re:Why would you reboot? (Score:3, Insightful)
I'd imagine rebooting was easier.
Re:Bypass their DNS (Score:3, Insightful)
Re:People abusing it on the other end... (Score:4, Insightful)
If DNS bandwidth were a non-issue, you wouldn't even be able to set TTL -- it would just be hardwired to a small value. But it is. Don't lapse into spammer logic: "I'm only wasting a tiny amount of resources, so it doesn't matter." But it does matter, because lots of other people are thinking the same thing.
Re:I know AOL used to be an offender, likely still (Score:5, Insightful)
I'd be curious who's the audience for the site(s) you're talking about. I'm pretty uncomfortable calling tens of millions of users unimportant, especially when it comes to e-commerce. Different "additude", I guess. Or attitude, even.
I maintain an ancient AOL account specifically so I can see things the way that some of my customers' customers see them. But it has one other advantage: if I've just made DNS changes to a domain I care about, I set up a temporary new A record (like X.whatever.com) and then surf to it through AOL's proxies. This seems to get their name servers to notice that the SOA record is new, and it flushes out the rest of their cache. This seems to work on all sorts of servers, most of the time.
AOL used to do this (and probably still does) (Score:3, Insightful)
Caching provides a response much more quickly (albeit not always right), and for a large scale ISP, DNS lookups consume not-insignificant amounts of bandwidth. This used to cost much more than it does today, and I'll bet much of this continues out of inertia.
Re:If you want to help then (Score:3, Insightful)
So you're developing the methodology for something you've already done?
a troll, or just an ignoramus.
Mu. If you ask people for help, then insult them when they respond with legitimate concerns, you're going to have a tough time getting recruits. My statement was based entirely on the text you wrote, which was liberally peppered with statements that show you do not have a strong grasp of DNS concepts. Specifically:
You appeared to include details of your methodology, but included irrelevant details, and (as evidenced by your reply) omitted important ones. You have not yet even mentioned dig - whether you know what it is, or how to use it to troubleshoot DNS problems.
Testing wasn't carried out until a MONTH after I changed the TTL to be sure it had propagated correctly.
An example of important omitted information. Other things you omitted: Did you check the TTL (on the recursive server) before and after you made the change? Did you ensure that the recursive server was obtaining the new TTL, by checking the SOA? Did you determine if the recursive server was caching the SOA (technically you're not supposed to, but many DNS servers send a TTL with SOA replies, and it's possible that the implementation on your recursive server was caching the SOA.) Did you check the returned TTL via sequential, timed queries to see if it was changing properly?
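The last check, sequential timed queries, is easy to automate once you have the numbers. A minimal sketch, assuming you have already collected `(seconds_since_start, ttl_returned)` pairs from one recursive server; the function name, the labels, and the two-second tolerance are all invented here, not taken from any tool:

```python
def classify_ttl_samples(samples, tolerance=2):
    """Label each consecutive pair of (elapsed_seconds, ttl) observations.

    A cache that obeys TTL should report a value that drops by roughly the
    wall-clock time between queries until the entry expires and is refetched.
    """
    labels = []
    for (t0, ttl0), (t1, ttl1) in zip(samples, samples[1:]):
        expected = ttl0 - (t1 - t0)
        if expected <= 0:
            # The cached entry should have expired between the two queries,
            # so whatever TTL comes back belongs to a fresh fetch.
            labels.append("refetched after expiry")
        elif abs(ttl1 - expected) <= tolerance:
            labels.append("counting down")
        elif ttl1 > expected:
            # TTL was reset or pinned before expiry. Could also be a second
            # cache behind the same address, so do not over-interpret one hit.
            labels.append("higher than expected")
        else:
            labels.append("lower than expected")
    return labels


if __name__ == "__main__":
    observed = [(0, 3600), (60, 3540), (120, 3600)]
    for label in classify_ttl_samples(observed):
        print(label)
```

A run of "counting down" followed by a refetch is what a well-behaved cache looks like. Repeated "higher than expected" results from the same server are the ones worth writing up.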
there are other methods, this is by far the simplest for the non technical
It's also the one that provides the least amount of data, and is the least reliable. A Windows batch file (created by you) with a clicky-clicky icon is only marginally more difficult, and would provide better, more reliable data.
Re:DNS practices --- CHANGE THE !@#$%^& serial (Score:3, Insightful)
But I have seen a case where we had changed the serial and had lowered the TTL for the week preceding, and yet there were DNS servers out there that just refused to update. (AOL being one of them).
After we hit two weeks, and the IP still hadn't propagated, I did some digging -- somehow, 4 of the root name servers were forwarding queries to two development DNS servers that someone had set up, which weren't being maintained and getting updates. So yes, it was not the fault of the remote DNS servers that weren't taking the updates.
But it's not always just a matter of changing the serial.... other things can go wrong with DNS.
Re:If you want to help then (Score:3, Insightful)
If you don't want to volunteer, Fine.
If you think that the poster doesn't know how to plan the test; go make your own test.
He specifically said the test methodology will be improved on suggestions on the mailing list. How about contributing there instead of making an uppity jackass of yourself here.
(Would like to post my rant AC, but I don't want you to think the GP is responsible for this reply)
Where to begin... (Score:3, Insightful)
Asking friends to REBOOT? Why not just ipconfig
I also have a really hard time taking someone seriously that, in the opening question, mentions something like "well, zealots will argue, and tin foil hats will bitch" or whatever. Yea, he's really unbiased..
TTL affects the time you should cache the records, at least he seems to get this. So, he can't think of one reason why a large ISP might want to ignore TTL's?
I'll name a couple and leave it to this guy to fill in the rest:
A) Because a lot of really terrible DNS admins set the value way too low and leave it there?
B) Because ISP's might have a need to keep their cache database activity to a reasonable level?
GO on with your study! The results will probably prove to be very uninteresting.
New TTL won't take effect until old TTL runs out (Score:4, Insightful)
In fact, servers that are obeying TTL won't see the new record until the old record's TTL expires.
The querent doesn't say whether or not there was any wait for the old TTL to expire. They don't even mention what the old TTL was!
Ignoring TTLs not necessarily intentional (Score:5, Insightful)
1. Change in the hosting of a domain to new DNS servers without properly removing the domain from the old hosting DNS servers.
When this happens, a DNS server caching a domain's info will continue to check the old servers until the old server stops answering.
2. A change in the TTL of a domain to a lesser value.
If you change the TTL of a domain from 7 days to 1 hour, DNS servers currently caching that domain's information will hold onto it for 7 days before discovering the new TTL.
3. A bug in BIND 8 that prevents it from pulling updated information from the primary DNS server for a domain.
We see this rarely, but it requires a restart of an affected DNS server. We have not diagnosed the specific cause yet since we're moving servers to BIND 9.
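Point 2 can be put in numbers. A rough sketch, assuming the unluckiest case where a cache fetched the record just before the change was made; the function names and the dates are placeholders, not anything from the original test:

```python
from datetime import datetime, timedelta

def worst_case_pickup(change_time, old_ttl_seconds):
    """Latest moment a TTL-obeying cache may still serve the old record.

    A cache that fetched the record just before the change keeps it for the
    full OLD TTL; the new, shorter TTL only applies after that refetch.
    """
    return change_time + timedelta(seconds=old_ttl_seconds)

def is_suspicious(change_time, old_ttl_seconds, observed_stale_at):
    """True only if stale data was seen after the old TTL must have expired."""
    return observed_stale_at > worst_case_pickup(change_time, old_ttl_seconds)


if __name__ == "__main__":
    change = datetime(2005, 1, 1)          # hypothetical date of the change
    old_ttl = 7 * 24 * 60 * 60             # the 7-day TTL from point 2
    print(worst_case_pickup(change, old_ttl))
    print(is_suspicious(change, old_ttl, datetime(2005, 1, 5)))
    print(is_suspicious(change, old_ttl, datetime(2005, 1, 10)))
```

Stale answers inside that window say nothing about a server ignoring TTL. Only the ones seen after it are evidence.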
Re:Bypass their DNS (Score:2, Insightful)
We've put up with this sort of subversion for email (SMTP) out of necessity -- there just isn't any other way to deal with dumbass users with unpatched windows boxes sending 95% of the world's spam. Subverting DNS like this should be punishable by death. Face it people, any service can be targeted.
Re:DNS practices --- CHANGE THE !@#$%^& serial (Score:1, Insightful)
You see, the Internet is no longer a bunch of DARPA nerds playing with ftp and gopher. It's an important part of people's lives, often as important as the telephone. (More important for a considerable number of people.) It will only become more important. Capriciously blocking chunks of the network to save a few dimes on bandwidth and DNS server RAM is simply unacceptable. Seriously: the average mail transaction is thousands of bytes, the average web transaction is tens of kilobytes, and the average DNS transaction is under two hundred bytes. The bandwidth is negligible, and gigabytes of cache RAM are utterly cheap these days.
Re:DNS practices --- CHANGE THE !@#$%^& serial (Score:3, Insightful)
I don't know why you assume everyone should know what you meant. The rest of your hateful post made you look uninformed so folks probably generally presumed you were just a newbie admin with an inflated ego.
And why would they bump the TTL on their nameserver, anyway? Could you possibly mean that they should bump the serial number? I think you keep confusing record caching with zone transfers to secondary servers.
Also while we're on the subject of TTL's I know that our nameserver is actually set up to increase TTL's less than 24 hours to 24 hours. I believe that's in an RFC or best practices guide I read somewhere.
I presume you know nothing about global load balancing. Global load balancers, which are really just fancy DNS servers, work by varying the A records returned from queries. The GLBs monitor the servers (or more likely load balancer farms) and if one goes down the GLB will no longer resolve to that IP address. For that to work, the TTL must be set to a very short time. If an ISP ignores the TTL, it will cause problems for any of their customers who access the domain with the short TTLs. Many large sites with multiple data centers make use of GLBs to balance traffic across their data centers. You should not ignore TTLs or you may find that folks who rely on your DNS servers will occasionally be unable to access various sites. Since GLBs also tend to direct traffic toward less busy data centers, you will find that ignoring TTLs will also result in slower access for your clients to their favorite web sites. And if that's in a best practices guide, you might consider throwing that guide away.
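To put a number on it, here is a back-of-the-envelope sketch of what a 24-hour floor does to a short GLB TTL. The 30-second record TTL is an assumed value for illustration, and `effective_ttl` is a made-up helper, not a setting from any particular name server:

```python
def effective_ttl(record_ttl, resolver_min_ttl=0):
    """TTL a cache actually applies if it raises short TTLs to a floor."""
    return max(record_ttl, resolver_min_ttl)


if __name__ == "__main__":
    glb_ttl = 30                 # hypothetical short TTL set by a load balancer
    floor = 24 * 60 * 60         # the 24-hour floor described in the quoted post

    honest = effective_ttl(glb_ttl)
    clamped = effective_ttl(glb_ttl, floor)

    # Worst case, a client keeps getting a dead IP for this long after failover.
    print(f"honoring TTL: up to {honest} s of stale answers")
    print(f"with floor:   up to {clamped} s of stale answers")
```

Records that already carry a TTL above the floor are unaffected. It is only the short-TTL failover records that get hurt.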
I do know that TTL is a recommendation, that's all.
And I suppose stopping when the guard rail drops at a train track is technically just a recommendation, too. People have good reasons for lowering TTLs even if you don't seem to think so. Ignoring them can cause real problems.
I don't know why you need to interject the condescending, hateful speech in your posts. I would have blown it off with your apology, but then you included that unnecessary "you all should know that?!?!?)" crack in your latest post. You act like a genius and then make mistake after mistake in your technical statements, making you look like a buffoon. Why don't you relax and humble yourself a bit. Your ego is too inflated.