The Internet

How to Work Around Broken Port-80 Routing? 326

Dr. Zowie writes "My ISP places an opaque (intended to be transparent) web proxy between me and the rest of the world. It is causing me problems due to misconfiguration or misdesign. My question is twofold. On the micro level, what can I do in the short term to work around the broken routing (in the long term, I switch ISPs if it's not fixed)? On the macro level, what can we as a community do to prevent breakage of the net on a global scale by poorly designed routing hacks?"

Dr. Zowie continues: "I use a regional ISP with otherwise-very-good policies. However, they seem to be intercepting anything that comes from my home net on port 80, so that they can ``transparently'' cache web requests based on the payload of those packets. The proxy seems to work rather well in most cases: I never noticed it until I started using OpenNIC. Then I found that some web pages that should have resolved OK through the OpenNIC system failed even though routing on different ports worked OK.

"I did some experimentation using ``telnet'' on port 80 directly, and found that packets are being routed based only on the payload regardless of the original destination address: I can (for example) retrieve the Slashdot front page by using ``telnet www.google.com 80'' and asking for "http://www.slashdot.org http/1.1". The tech support folks seem to be stonewalling me: the main contact tells me that the behavior is "not broken" even though it clearly violates RFC 1812, the standard set of rules for IP routing.
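The telnet experiment above can be approximated in a few lines of Python. This is only a sketch of the probe, not the submitter's exact session: the function names are made up for illustration, and the `Host` header is added on the assumption that an HTTP/1.1 request needs one (the submission does not say whether one was typed).

```python
import socket


def build_probe_request(url: str) -> bytes:
    """Build an HTTP/1.1 request whose request line carries a full URL."""
    # Pull the host name out of the URL so a matching Host header can be sent.
    host = url.split("//", 1)[1].split("/", 1)[0]
    lines = [
        f"GET {url} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def send_probe(connect_host: str, url: str, timeout: float = 10.0) -> bytes:
    """Connect to connect_host on port 80 and ask it for an unrelated URL."""
    with socket.create_connection((connect_host, 80), timeout=timeout) as sock:
        sock.sendall(build_probe_request(url))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling `send_probe("www.google.com", "http://www.slashdot.org/")` reproduces the test described: if the reply is the Slashdot front page, something on the path is answering based on the payload rather than on the address the connection was opened to.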

"The practice of ``transparent'' proxy routing seems to be growing more widespread. It appears to break the internet standard in a way that works for most folks for now, but that breaks port 80 usage in general. Looking ahead, this breakage seems like a growing nightmare waiting to happen. At the very least, I expect more instances of my particular problem to appear as folks give up on the corporate hegemony of ICANN. More insidiously, transparent proxy routers break the layered nature of the internet protocol and restrict the flexibility that made it work in the first place. One would hope that such proxies would at least act like routers when the fancier proxying fails, but at least my ISP's doesn't. What about your ISP's?"

This discussion has been archived. No new comments can be posted.

  • by samrolken ( 246301 ) <samrolken AT gmail DOT com> on Saturday March 23, 2002 @04:17PM (#3213698)
    that's why I suggested adding 80 to 2^16 and setting your proxy to connect at that port. It's the same port, the auto-proxy-router thing just wouldn't see it as such.
  • by Jerf ( 17166 ) on Saturday March 23, 2002 @04:27PM (#3213741) Journal
    I reply to this because I bet a lot of people are going to think this.

    The real problem is that you're probably using port 80 for something other than its explicit purpose.

    No, that's not it at all. Follow the openNIC [unrated.net] link.

    What he's trying to do is resolve an address, via the perfectly standard and normal DNS protocol, with an alternative root server. This is also perfectly standard and normal. This is not a violation of DNS, nor any other protocol, nor is it a particularly weird thing to want to do. (Unusual, but perfectly normal.)

    The problem is that his ISP is catching all traffic to port 80, and redirecting it to their proxy. Thus, when he asks for "http://www.something.nonstandardroot", the web proxy is interfering with the request (presumably after his home computer correctly resolved the DNS address of www.something.nonstandardroot), catching the GET part of the HTTP request, extracting the server name, and attempting on its own to resolve the name.

    (Note this is a complete waste: The home computer has probably already resolved the address, now the proxy will resolve it again.)

    Unfortunately, the proxy is too ignorant to know how to resolve the alternate DNS address. It's not incapable in the technical sense, it just doesn't understand root servers it's not configured for. The problem is that this means that the perfectly normal and acceptable HTTP request, for an HTML document, on an IP address the client computer has already perfectly normally resolved, gets lost, because the proxy doesn't know how to resolve the address. Bad proxy!

    A workaround, albeit a sucky one, is to resolve the address on one's home computer, then go to that IP address manually. This still causes problems on subdomain-aware webservers, where several domains or subdomains may all come from the same IP address, and the server wants to use the host part of the HTTP GET request to differentiate what to serve. (You could code up a quick Python/TK script to do this, but it'll still suck.)
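A rough sketch of why the workaround is awkward, assuming a made-up address (192.0.2.10, from the documentation range) for the already-resolved site. Browsing to the bare IP puts the IP in the `Host` header, which a name-based virtual host cannot use; a script can send the real name instead, but an intercepting proxy that reads `Host` may then try to re-resolve that name and fail as before.

```python
def build_request(host_header: str, path: str = "/") -> bytes:
    """Build a plain HTTP/1.1 GET with the given Host header value."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host_header}",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def requests_for_resolved_ip(ip: str, name: str, path: str = "/") -> dict:
    """Return both request variants that could be sent to an already-resolved IP."""
    return {
        # What going to the IP "manually" produces: the server cannot tell
        # which of its hosted sites is wanted.
        "bare_ip": build_request(ip, path),
        # What a script can send: the right virtual host, but the name is
        # visible again to anything on the path that inspects the Host header.
        "named_host": build_request(name, path),
    }
```

Either variant would be written to a socket connected to the resolved IP on port 80; neither one avoids the proxy.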

    So, when you say a proxy is not required to route anything anywhere, you've accidentally hit on the exact problem: a proxy shouldn't be routing, because it may not know how. This proxy tries to. That's why it sucks.

    And to cover the last part of your post, there's absolutely nothing non-standard about any of this, except the behavior of the proxy, which is the only thing in this whole mess that hasn't "embrace[d] the DNS standard, HTTP standard and the routing standard". ICANN's root servers are not written into RFC's. They are merely common practice, one that many people, probably correctly, believe is an increasingly dangerous common practice [kuro5hin.org]. (You may not completely agree, but the opinions deserve consideration.)
  • by khuber ( 5664 ) on Saturday March 23, 2002 @04:27PM (#3213742)
    You might try something like the port + 65536 rule.

    How could a number outside 16 bits make it to a router since TCP only holds 16 bits for ports? If you wrap around to 80, you have 80, not 65616.
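The arithmetic behind this objection can be checked directly. A small sketch (helper names are illustrative): the TCP header stores a port as an unsigned 16-bit integer, so 80 + 65536 either fails to encode or is reduced modulo 2^16 back to 80.

```python
import struct

PORT_BITS = 16  # width of the TCP source/destination port fields


def wrap_to_16_bits(port: int) -> int:
    """Reduce a number to what would fit in a 16-bit port field."""
    return port % (1 << PORT_BITS)


def pack_port(port: int) -> bytes:
    """Encode a port the way the TCP header does: unsigned 16-bit, network byte order."""
    return struct.pack("!H", port)


if __name__ == "__main__":
    print(wrap_to_16_bits(80 + 65536))  # same value as plain port 80
    try:
        pack_port(80 + 65536)
    except struct.error as exc:
        print("cannot encode:", exc)
```

Either way there is no distinct "port 65616" on the wire for an interception rule to miss.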

    -Kevin

  • by ocip ( 200888 ) on Saturday March 23, 2002 @04:29PM (#3213747) Homepage
    If you look at it from your ISP's standpoint transparent proxies aren't as evil as you make it sound.

    99.9% of the ISP's clients aren't trying to do anything tricky, like this. Of those 99.9%, say, only 40% have a proxy server specified. These 40% get to enjoy faster web browsing--which is probably all they're doing anyway. The other 60% enjoy slightly less quick web browsing, but that's their own fault, right? They're the only ones losing out, right?

    Wrong. The ISP has to pay for bandwidth. The ISP doesn't like the proxy only because it makes browsing snappier, it likes the proxy because it also saves them on bandwidth costs! If the other 60% of the clients were using the proxy they might save 10%, or more, on total bandwidth costs.

    You could think of it like this, too: that's 10% more bandwidth available for the clients at no additional cost to the company (apart from the capital for the proxy server). Yes, they're not perfect, but they make a difference. When you weigh the pros and cons, well, it's obviously going to be worth it for the ISPs to have it installed.

    You could look around for an ISP that doesn't use a transparent proxy but, as you said, they're becoming more popular. Realise that they're not doing it to squash your freedom, but instead to provide better service and to save money.
  • Re:Education (Score:2, Insightful)

    by MrHat ( 102062 ) on Saturday March 23, 2002 @04:29PM (#3213751)
    I'll tell you what I'd do.

    1. Refuse to use the machines at school for any internet access. Period.

    2. Let the board and the teachers know why. Tell them they've taken a good thing and turned it into a complete waste of tax money by senselessly restricting.

    3. Ask the board why they think their current system is capable of making better judgements than their salaried teachers.

    This is probably why I really didn't get along with anyone in high school. But this stuff really ticks me off - usually some overzealous admin taking the liberty of forcing his/her idea of "good" on to everyone.
  • by theCoder ( 23772 ) on Saturday March 23, 2002 @04:43PM (#3213805) Homepage Journal
    My college [purdue.edu] has a similar set up because it saves an incredible amount of bandwidth. It's not to be mean, or malicious, or spy on your browsing habits, it's just to save bandwidth. And it does (I wish I had numbers to back this up, but I don't run the proxy).

    There have been problems with the proxy in the past (it not returning any data) and there are still some minor issues, but on the whole it works well (in that you don't ever notice it).

    It sounds like the ISP in question has a bug in their web cache code. If the web cache doesn't have the particular URL cached, it forwards the request to the intended destination. I'd bet it's trying, but it can't look up whatever OpenNIC URL is being specified (because it doesn't use OpenNIC). The ISP really should report this bug to the manufacturer.

    My advice is this -- get the ISP on your side to fix the problem. They won't remove the proxy, and they shouldn't have to if the bug is fixed.
  • I don't see a problem with what he's trying to do.

    The problem he's having is that he's asking for an OpenNIC web site, and not receiving the page. The problem is as follows:

    The "address" of the site he's looking for is present in two separate places in the request he's making. The IP Header includes the IP address of the site, and the HTTP header includes the URL, which includes the server name.

    When he requests a webpage from an OpenNIC TLD, his machine correctly resolves the hostname, and constructs a request, which is sent through his ISP. The web proxy intercepts the request, and tries to proxy his request, so that it can be cached for later lookups.

    Apparently, the Web cache is not configured to look up machines under OpenNIC TLDs. That's reasonable, but that shouldn't stop a web browser from being able to see the web page.

    If the web proxy can't identify the hostname present in the URL, it should simply pass it through, allowing the client (who already knows the IP), and the Web Server (who also, clearly, already knows its own IP) to communicate. This would prevent the client from gaining the benefit of the cache, but would allow the client and server to communicate.
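The pass-through behaviour argued for here could look roughly like the following. It is a sketch under assumptions, not any real proxy's logic: `choose_upstream` and the toy resolver are hypothetical, and the addresses come from documentation ranges.

```python
from typing import Callable, List, Tuple


def choose_upstream(
    original_dst_ip: str,
    host: str,
    resolve: Callable[[str], List[str]],
) -> Tuple[str, str]:
    """Decide where an intercepted request should go.

    Returns ("cache", ip) when the proxy can resolve the name itself, and
    ("passthrough", original_dst_ip) when it cannot, so the client still
    reaches the address it originally connected to.
    """
    try:
        addresses = resolve(host)
    except LookupError:
        addresses = []
    if not addresses:
        return ("passthrough", original_dst_ip)
    return ("cache", addresses[0])


# Toy resolver standing in for the proxy's DNS view; it knows no OpenNIC names.
_known = {"www.slashdot.org": ["198.51.100.7"]}


def toy_resolve(name: str) -> List[str]:
    return _known[name]  # raises KeyError (a LookupError) for unknown names
```

With this shape, a name the proxy has never heard of costs the client only the cache benefit, not the page.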

    By accusing the poster of "[choosing] to disregard the other relevant standards," I can only assume you're talking about his testing the web requests through a telnet client. I think that was an excellent troubleshooting procedure. It clearly identified the source of the problem.

    HTTP does have its own rules, but none of those rules should override TCP/IP. If this user makes a request to a web server, he has obviously already identified the IP address of the server, or he wouldn't be attempting an HTTP request. The caching proxy shouldn't be hijacking his request for any reason. It may be misconfiguration, or it may be broken proxy software, but it certainly isn't the user's fault.
  • by Bender Unit 22 ( 216955 ) on Saturday March 23, 2002 @05:02PM (#3213865) Journal
    Not for sure, most proxy/switch solutions can do IP spoofing so the remote webserver can't detect it. This is often done to avoid user/login problems on systems that base parts of their security on IPs. If the site then has its proxy rules set correctly in the meta tags or header information, the "hidden" proxy won't cause any problems or cache any information.
  • by Skapare ( 16644 ) on Saturday March 23, 2002 @05:25PM (#3213929) Homepage

    If you connect to a specific IP address, a transparent proxy should connect to that very same IP address. If it connects to any other for any reason, it is applying a sort of "routing" logic. Apparently what happens is because the client includes an HTTP version 1.1 "Host" header, the proxy prefers to do a DNS lookup on the hostname given, and (if it finds it) connect there instead of the client's original destination IP address.

    This is broken. If the proxy has a different idea of what domain names mean, it gets the wrong web site, or perhaps fails to get one at all. A correct transparent proxy implementation should always connect to the very same IP address the client tried to connect to without regard to the "Host" header (which must also be passed along). A DNS lookup can still be done to optimize the cache. If the destination IP address is in the list of A records from the DNS query, then it can simply be matched to the cache by name alone. However, if the IP address does not match any that DNS gets, then those pages can still be cached, but they must be cached under the tuple of both the destination IP address and the "Host" header name together (as this content can be different than any other for the same host name or the same IP address).
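The caching rule described in this comment can be written down compactly. A minimal sketch with hypothetical names: the proxy always fetches from the client's original destination IP, and only the cache key depends on whether that IP appears among the A records the proxy's own DNS returns for the `Host` name.

```python
from typing import Iterable, Tuple


def plan_fetch(
    dst_ip: str,
    host: str,
    a_records: Iterable[str],
) -> Tuple[str, Tuple[str, ...]]:
    """Return (ip_to_connect_to, cache_key) for an intercepted request.

    The connection target is never changed. If the proxy's DNS agrees with
    the client about the name, the page is cached by name alone; otherwise
    it is cached under the (destination IP, Host) pair.
    """
    if dst_ip in set(a_records):
        key: Tuple[str, ...] = (host,)
    else:
        key = (dst_ip, host)
    return dst_ip, key
```

For an OpenNIC name the proxy cannot resolve, `a_records` would simply be empty, so the content still gets fetched and cached, just under the more specific key.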

    Maybe someone can provide a list of which transparent proxy cache programs do it wrong, and which do it right (as I have not examined these programs). I don't know if peakpeak.com will change out the software once they find something that does it right (or even make a configuration change if it turns out that's all that is needed). Ironically, if you find an outside proxy server which can do it right for you, you could connect directly to that service via a different TCP port and end up defeating the efforts of your ISP to save upstream bandwidth by caching.
  • by GigsVT ( 208848 ) on Saturday March 23, 2002 @05:55PM (#3214016) Journal
    Well, yes and no, how could a proxy work with non-ICANN roots?

    It will try to resolve the address in the GET line, and fail, because it doesn't know about other TLDs.

    The only way to fix this broken proxy behavior is to have it ignore GET lines that it can't resolve, and instead forward the request intact to the IP address.
  • by jdavidb ( 449077 ) on Saturday March 23, 2002 @06:00PM (#3214032) Homepage Journal

    I agree with everything you say; proxy servers are a great thing for all involved and not a threat to freedom.

    But the problem is that this proxy server doesn't work right. My browser should look up the IP corresponding to the site, send a request on port 80, and get the response. In this case, it looks like the proxy is insisting on doing the lookup part, and so the user effectively can't change his DNS.

  • by Phroggy ( 441 ) <slashdot3@@@phroggy...com> on Saturday March 23, 2002 @10:37PM (#3214823) Homepage
    Where should a customer call to complain then?

    In my example, the source of the problem was the phone company, not the ISP. In the case of most large ISPs, the ISP has a contract with the phone company wherein the ISP orders the DSL service from the phone company, and the end user doesn't talk to the phone company. In this situation, an end user cannot even talk to the phone company about DSL problems. You can complain to your local Public Utilities Commission, and I would highly encourage you to do so.

    Should a customer just grab their ankles and say "please sir, may I have another".

    In the United States of America, with Republicans running the government? Yes. Congress is currently trying to pass legislation to make the situation even worse.

    In other situations, where the problem does lie at the ISP, the best place to go is straight to the top. Try to get in touch with the president of the company, or somebody else in upper management. They have the power to make it happen. Do not try to work your way up the chain (don't try to go any higher than the direct supervisor of the tech who answered the phone; that will get you nowhere); you have to go to the executive level.

    Poor ISP. They get money to do a job and they can't. So some pissant customer support moron complains about the customer.

    In my example, no competing ISP can do the job any better, except possibly the ISP that is owned by the phone company, and they do the rest of the job so poorly it's not worth it. In the case of problems at the ISP, understand that the company's management may not even be aware that there is a problem, and tech support often can't fix it. Complain to management, let them know there's a problem. If that doesn't resolve the issue, take your money elsewhere. That's capitalism at work.

    Poor you. You have a job that you can't do, so it's the customer's fault. Boo hoo hoo. I see a long career at McDonalds for a slacker like you.

    I never said anything was the customer's fault, nor did I say I can't do my job. Sometimes telling customers "no" is my job; I'm not slacking when I do it, and it's not the customer's fault I have to.
  • by Zeinfeld ( 263942 ) on Saturday March 23, 2002 @11:20PM (#3214931) Homepage
    Before addressing the technical issues this appears to be a really whiny sort of complaint. I suspect that the real issue is that the poster wants to force the rest of the world to support his eccentric choice of DNS root. This strikes me as an invented difficulty rather than a real one.

    Your problem is not one that HTTP or the proxy spec was designed to cover. When we developed HTTP the issue of ICANN did not exist. I certainly don't think it unreasonable for a proxy code writer to assume that users are using the Internet DNS system. If you want to do things differently you should expect problems; that is the way of the world.

    The host name header was introduced as a hack to alleviate the problem of IPv4 address exhaustion. There is actually a good reason for the proxy to dereference the DNS name itself since then it can do load balancing amongst http servers if the client does not.

    The proxy might also be using a new enhanced http protocol and so it is pretty important that it be able to access the DNS NAPTR records for the service and do the appropriate mapping.

    One way to address the problem would be to change the host header so that it has the alternic prefix to the dns name, if porn.xxx is an alternic name one would assume that there is a name something like porn.xxx.alternic.org that resolves in icann space. If you want to use non standard DNS configurations expect to have to patch applications.

    Proxy caches were really important in the early days of the web and still are for certain congested links. In the main however the content providers use techniques that mean that caching is very much less useful than it once was. Most content is active these days so it is only the images that cache well.

  • by Anonymous Coward on Saturday March 23, 2002 @11:40PM (#3214982)
    1. An HTTP proxy server is not a router.

    2. What is happening is that your *default gateway* (which really IS a router) is redirecting packets bound to port 80 to the proxy server. Your default gateway is doing the routing, NOT the proxy server. (Linux does a nice job at transparent proxying, btw.)

    3. The proxy server then tries to resolve the domain name using DNS.

    4. The DNS server the proxy server is configured to use, not knowing anything about these funky TLDs you're trying to access, can't find it. It tells the proxy server so.

    5. The proxy server comes back and gives you a nice, friendly error message telling you it can't resolve the host name.
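Steps 2 through 5 can be modelled as a toy simulation. Everything here is illustrative: the proxy address and port, the DNS table, and the wording of the error are assumptions, not details of any particular ISP or product.

```python
from typing import Dict, Tuple

PROXY_ADDR = ("10.0.0.2", 3128)  # hypothetical address of the ISP's proxy


def gateway_redirect(dst_ip: str, dst_port: int) -> Tuple[str, int]:
    """Step 2: the default gateway diverts port-80 traffic to the proxy."""
    if dst_port == 80:
        return PROXY_ADDR
    return (dst_ip, dst_port)


def proxy_handle(host: str, proxy_dns: Dict[str, str]) -> str:
    """Steps 3-5: the proxy resolves the Host name with its own DNS view."""
    ip = proxy_dns.get(host)
    if ip is None:
        # Step 5: the client gets an error page instead of the site.
        return f"error: cannot resolve {host}"
    return f"fetch from {ip}"


if __name__ == "__main__":
    proxy_dns = {"www.slashdot.org": "198.51.100.7"}  # no OpenNIC names
    print(gateway_redirect("203.0.113.5", 80))
    print(proxy_handle("www.something.nonstandardroot", proxy_dns))
```

Note that the client's original destination address plays no part after the redirect, which is exactly the behaviour the submitter observed.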

    Look...transparent proxying is to bandwidth what NAT is to private networks. It works, it works very well, it's in widespread use (getting wider every day, probably), and it's here to stay. If you really want to do something constructive to solve your problem, ask your ISP to configure their DNS to resolve the OpenNIC TLDs. They're a lot more likely to do that than they are to stop using transparent proxying (I know I would be).

  • by evilviper ( 135110 ) on Sunday March 24, 2002 @04:36PM (#3216971) Journal
    99.9% of the ISP's clients aren't trying to do anything tricky, like this.


    Well, in that case, they can stop supporting anything but windows, since it has a clear majority. Oh, and you can't use anything but IE since it's got a majority as well.

    The problem is that I don't pay for 'a service that allows me to view most web sites'. Rather, I pay for an 'internet service'. If anything that should work, doesn't, then they are violating their end of the contract... Not to mention probable false-advertising, etc.

    If it costs them 10% more bandwidth for those who choose not to use their optional proxy, then they should charge the customers 10% more.

    How about if the USPS decided to crush every package by 1cm because then they can fit more packages in each plane/truck. Besides, 99% of people have at least 1cm of padding to protect the package contents anyhow.

    It's exactly the same thing. Doing something that doesn't hurt too many people, in exchange for more profit. The fact that most people aren't going to be negatively affected doesn't make it right, or legal for that matter.
