The Internet

How to Work Around Broken Port-80 Routing?

Posted by michael
from the service-road dept.
Dr. Zowie writes "My ISP places an opaque (intended to be transparent) web proxy between me and the rest of the world. It is causing me problems due to misconfiguration or misdesign. My question is twofold. On the micro level, what can I do in the short term to work around the broken routing (in the long term, I switch ISPs if it's not fixed)? On the macro level, what can we as a community do to prevent breakage of the net on a global scale by poorly designed routing hacks?"

Dr. Zowie continues: "I use a regional ISP with otherwise-very-good policies. However, they seem to be intercepting anything that comes from my home net on port 80, so that they can ``transparently'' cache web requests based on the payload of those packets. The proxy seems to work rather well in most cases: I never noticed it until I started using OpenNIC. Then I found that some web pages that should have resolved OK through the OpenNIC system failed even though routing on different ports worked OK.

"I did some experimentation using ``telnet'' on port 80 directly, and found that packets are being routed based only on the payload regardless of the original destination address: I can (for example) retrieve the Slashdot front page by using ``telnet www.google.com 80'' and asking for "http://www.slashdot.org http/1.1". The tech support folks seem to be stonewalling me: the main contact tells me that the behavior is "not broken" even though it clearly violates RFC 1812, the standard set of rules for IP routing.

"The practice of ``transparent'' proxy routing seems to be growing more widespread. It appears to break the internet standard in a way that works for most folks for now, but that breaks port 80 usage in general. Looking ahead, this breakage seems like a growing nightmare waiting to happen. At the very least, I expect more instances of my particular problem to appear as folks give up on the corporate hegemony of ICANN. More insidiously, transparent proxy routers break the layered nature of the internet protocol and restrict the flexibility that made it work in the first place. One would hope that such proxies would at least act like routers when the fancier proxying fails, but at least my ISP's doesn't. What about your ISP's?"

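For anyone who wants to repeat the telnet experiment described in the submission, here is a minimal Python sketch. It connects to one host on port 80 but asks, in the request itself, for a URL on a different host. The function names and the "Connection: close" header are illustrative choices, not something taken from the submission. On an unintercepted path you would expect the host you connected to to answer (or refuse); if the other site's page comes back instead, something in the middle is acting on the payload rather than on the destination address.

```python
import socket


def build_proxy_style_request(url: str) -> bytes:
    """Build a minimal HTTP/1.1 request that names the target by absolute URL."""
    # Pull the host out of "scheme://host/path" for the Host header.
    host = url.split("://", 1)[1].split("/", 1)[0]
    return (
        f"GET {url} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def fetch_via(connect_host: str, url: str, timeout: float = 10.0) -> bytes:
    """Open a TCP connection to connect_host:80, ask for url, return the raw reply."""
    request = build_proxy_style_request(url)
    with socket.create_connection((connect_host, 80), timeout=timeout) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling `fetch_via("www.google.com", "http://www.slashdot.org/")` mirrors the test in the submission; what comes back depends entirely on your own network path.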
  • Use netcat... (Score:4, Informative)

    by samrolken (246301) <samrolken&gmail,com> on Saturday March 23, 2002 @03:02PM (#3213635)
    You can use netcat to route your own port 80 traffic. Simply get a good UNIX shell account, and configure your router to direct traffic to that. It becomes a real version of what you would be trying to do. However, I would bitch like crazy if my ISP did anything like that to me. If I want to connect to port 80 on something, I want to be connecting to that host's port 80. Any fiddling with it would sure make me drop that ISP in an instant.
  • Tunneling (Score:3, Interesting)

    by Matthaeus (156071) on Saturday March 23, 2002 @03:05PM (#3213651) Homepage
    I recently had this problem with my university account...They route all resnet web traffic through an old 386 proxy server that can't handle the load. Find a free proxy out there and SSH tunnel to it. I'm sure there are more elegant means of getting through a poorly configured proxy, but this'll work as a quick fix.
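A rough sketch of the SSH-tunnel approach, assuming you have a shell account on some outside machine and know of a proxy that machine can reach. The host names, user name, and port numbers below are hypothetical placeholders. The code only assembles the `ssh` command line (`-L` sets up a local port forward and `-N` skips running a remote command); it does not run it.

```python
import shlex


def ssh_forward_command(user, ssh_host, proxy_host, proxy_port, local_port=8080):
    """Return the argv for an SSH local port forward to a remote proxy.

    Traffic sent to localhost:local_port is carried inside the SSH session
    and delivered to proxy_host:proxy_port from the far end.
    """
    forward_spec = f"{local_port}:{proxy_host}:{proxy_port}"
    return ["ssh", "-N", "-L", forward_spec, f"{user}@{ssh_host}"]


# Hypothetical example values; substitute your own account and proxy.
command = ssh_forward_command("me", "shell.example.org", "proxy.example.net", 3128)
print(shlex.join(command))
```

With the tunnel up, the browser's HTTP proxy setting would point at localhost and the chosen local port. Because the traffic leaves your machine on the SSH port rather than port 80, an interceptor that only watches port 80 should not see it.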
  • find a friend who has a colocated server or dsl connection.

    then use that machine as a web proxy, or set up an ipsec tunnel to that machine and route your port 80 traffic through that tunnel.
  • by AntiNorm (155641) on Saturday March 23, 2002 @03:12PM (#3213678)
    Onenet [onenet.net] is the internet "service" provider to most state agencies within Oklahoma, including Oklahoma State University, where I am currently working on a BSEE. Neglecting Onenet's other issues (AOL's netadmins could do a better job than Onenet's), they have a "transparent" web cache proxy. More often than not, errors fetching a web page come not from the browser or the site itself as they should, but from the proxy. DNS errors from the proxy are not uncommon. As for switching ISPs, I can't, which really sucks. But for what I can reach on the net, I'm still getting ultra-cheap broadband :P.
  • same problem (Score:3, Interesting)

    by babycakes (564259) on Saturday March 23, 2002 @03:13PM (#3213684)
    We had pretty much the exact same problem with our ISP: if we sent HTTP requests out without any proxy configuration, they would often take a couple of tries to get through, since our ISP's transparent proxying didn't work. However, setting the browser's proxy settings to point at the proxy itself seemed to solve the problem, since the browser would then ask the proxy directly.

    Don't ask me why :)
    • Cisco WCCP/GRE tunneling had a large number of bugs until fairly recently (1-2 years ago), and still requires specific IOS versions. One such bug would cause the GRE tunnel (which connects the router to the cache) to collapse randomly. That is probably the bug you are seeing, and it is entirely unrelated to what the original poster's problem is.
  • Education (Score:3, Interesting)

    by radoni (267396) on Saturday March 23, 2002 @03:14PM (#3213688)
    At my high school, the current system for blocking webpages was introduced as a means to cache commonly used pages and make the District 225 intranet faster. The superintendent and members of the district board know very little about computers, so naturally it was approved. After the Columbine incident, a new feature was tacked on that blocked certain objectionable web sites. The recent WTC attack caused even more areas of the net to be restricted. Today, when I want to search "terrorism" for a paper on the war in Afghanistan, my results are blocked. Teachers have informed us that we must use the one non-blocked computer in the tech room, or do research at home.

    My friend set up an anonymous web surfing proxy on his home computer, and using this I can get whatever I want.

    There are publicly available anonymous port-80 proxies still around.

    • Re:Education (Score:2, Insightful)

      by MrHat (102062)
      I'll tell you what I'd do.

      1. Refuse to use the machines at school for any internet access. Period.

      2. Let the board and the teachers know why. Tell them they've taken a good thing and turned it into a complete waste of tax money by senselessly restricting.

      3. Ask the board why they think their current system is capable of making better judgements than their salaried teachers.

      This is probably why I really didn't get along with anyone in high school. But this stuff really ticks me off - usually some overzealous admin taking the liberty of forcing his/her idea of "good" on to everyone.
      • Re:Education (Score:2, Interesting)

        by Ryan Amos (16972)
        Obviously you've never dealt with a public school system. Refuse using the machines? You get written up and sent to the principal for disobeying the teacher. Tell the board anything? They don't listen to teachers or parents, let alone students. Teachers are often just as powerless as the students in administration matters. Unfortunately, public schools often operate on the assumption that the people on top are ALWAYS right.

        The school board of any decently large school district is generally uninterested in actually educating students or making things work well; most school board members are just there as a springboard for higher political office. They generally don't give a fuck about the students or education, so they lower test standards and claim that test scores have improved. If it makes them look good to "protect our children from evil" by blocking out these sites, they'll do it. School systems don't operate like normal organizations; the students' opinion carries next to no weight at all, as "it is up to us adults who know better to protect the students from what they don't know." Total BS, but that's the way public school works.
  • I used to have an ISP that let you host your own site on their webspace, but loading that site was just damn SLOW for anyone who tried. Pages hosted on another continent loaded much faster than pages on this ISP's own server in the same city.

    The thing is, they probably won't listen to problems like this, or your proxy issue in most cases. But I found a way to make them listen to you:

    Phone them up saying that you want to cancel the service. Mention something about their web hosting being broken. They will probably say that they will have a management person phone you back to confirm the process.

    When they do phone back, for me, the call was like "Hello, there was a call earlier about a slow connection?" And at this point you have someone on the line who is interested in helping you, has the power in the organisation to really fix things (because they're management or a senior tech), and wants to get your issue fixed so they don't lose your business. And THIS is when you really try to explain what's going on.

    This was my experience. Perhaps it will work for you.

    • This is a method of getting things solved that really does work ... but generally only with smaller businesses. Before I went to work for an ISP, I had an account with another ISP that was doing an excellent job. About a month into the job I called up the other ISP to cancel my service (as it was a dedicated full time nail-up service, and thus a bit more costly than plain old dial-up). The lady that answered took my request and proceeded to ask why I was dissatisfied with the service. I then said "Oh, no, I was perfectly happy with the service". She then asked "Oh, you're moving out of town?". To that I finally explained to her that I had taken a job with an ISP in town, and had free service through it. She told me "well, I guess it won't do any good then for the owner to give you a call and see if he can get you back".

      You're not likely to get that kind of response from a big national ISP. One customer is just too small a percentage for a manager to call to find out why you are unhappy. Mostly, whatever changes they might need to do to make you happy would be too costly for such a big company, anyway. The trouble with small ISPs is that there is such a great variation of competency. But when you get a good one, you have a gem. With the big ones, it's generally a fairly uniform level of pathetic service.

  • by BrookHarty (9119) on Saturday March 23, 2002 @03:20PM (#3213711) Homepage Journal
    Proxy servers - They might not be caching port 8080 or other proxy ports. Check http://tools.rosinstrument.com/proxy/ [rosinstrument.com]

    Bouncers - You set this program up on an external server on a port that's not filtered. You just point your browser at this IP/port and you're outside your filtered ISP. Check www.freshmeat.net [freshmeat.net]

    SSH - Tunnel or route from an external box.

    Really, if you can't go through it, go around it, either with software or networking.
    -
    Well, if crime fighters fight crime and fire fighters fight fire, what do freedom fighters fight? They never mention that part to us, do they? - George Carlin
    • Yes, you can use a proxy server outside your firewall that will fetch things for you, but you *still* have the problem that if you don't control the configuration of the proxy server, it might not do what you want - you're just choosing between differently configured proxies, one of which might do most of what you want.

      If you can use a tunnel server, like IPSEC or PPTP or SSH, which lets you pick the IP address to send your IP packets to but doesn't interpret the packets itself, you'll mostly be ok (you'll still have to make sure to do your own DNS if you want to resolve on alternate roots.)

  • by Lumpish Scholar (17107) on Saturday March 23, 2002 @03:27PM (#3213740) Homepage Journal
    (1) Line up a serious alternative ISP. Talk to their sales department; see if they do the same thing.

    (2) Talk to your ISP's sales department. Tell them your problem. Tell them you're ready to move. (Perhaps ask what the hit rate of the cache is, that is, if the overhead is worth it for them.) See if they offer any accommodation.

    (3) Go with the ISP that does what you want.

    If you're using them for DSL, you may not have a lot of choice.

    (As others suggested, if host resolution is your issue, you could run a local proxy on your 127.0.0.1 interface that converts host names into addresses.)
    • by Jerf (17166) on Saturday March 23, 2002 @03:34PM (#3213774) Journal
      (As others suggested, if host resolution is your issue, you could run a local proxy on your 127.0.0.1 interface that converts host names into addresses.)

      Unfortunately, that's not a complete solution. Example: Compare my home page [jerf.org] versus the IP address [65.196.231.181] that hostname resolves to.

      Lots of servers do this.
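To make the point about name versus bare IP address concrete, here is a minimal sketch of name-based virtual hosting: the server looks at the Host header of the request, not just the address it was reached on, to decide which site to serve. The `pick_site` function and the site table are hypothetical and only illustrate the idea; real web servers are configured rather than coded this way.

```python
def pick_site(raw_request: bytes, sites: dict, default: str) -> str:
    """Return the content a name-based virtual host would serve for a request."""
    # Headers end at the first blank line; latin-1 decodes any byte safely.
    header_block = raw_request.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    host = None
    for line in header_block.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            # Drop an optional ":port" suffix (this ignores IPv6 literals).
            host = value.strip().lower().split(":")[0]
            break
    return sites.get(host, default)


# Hypothetical site table: one named site, plus a fallback for anything else.
SITES = {"jerf.org": "Jerf's home page"}
DEFAULT = "generic page for the bare IP address"

by_name = b"GET / HTTP/1.1\r\nHost: jerf.org\r\n\r\n"
by_ip = b"GET / HTTP/1.1\r\nHost: 65.196.231.181\r\n\r\n"
print(pick_site(by_name, SITES, DEFAULT))
print(pick_site(by_ip, SITES, DEFAULT))
```

This is why a local proxy that swaps host names for addresses is not a complete fix: once the name is gone from the request, the server has nothing left to select the right site with.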
    • Very good suggestions, and I'm planning on doing steps (1) through (3).

      As far as a local proxy: that won't work with virtual hosting in a non-ICANN name space. The immediate problem is that I can't retrieve non-ICANN web pages because the proxy tries to resolve the non-ICANN name in the payload, using ICANN DNS. I can always ask for numeric addresses, but virtual hosting (where a server gives you different pages depending on the name you ask for) is widespread enough that there are many web pages I can't retrieve even in principle.

      Cheers,
      Craig
  • by ocip (200888) on Saturday March 23, 2002 @03:29PM (#3213747) Homepage
    If you look at it from your ISP's standpoint transparent proxies aren't as evil as you make it sound.

    99.9% of the ISP's clients aren't trying to do anything tricky, like this. Of those 99.9%, say, only 40% have a proxy server specified. These 40% get to enjoy faster web browsing--which is probably all they're doing anyway. The other 60% enjoy slightly less quick web browsing, but that's their own fault, right? They're the only ones losing out, right?

    Wrong. The ISP has to pay for bandwidth. The ISP doesn't like the proxy only because it makes browsing snappier, it likes the proxy because it also saves them on bandwidth costs! If the other 60% of the clients were using the proxy they might save 10%, or more, on total bandwidth costs.

    You could think of it like this, too: that's 10% more bandwidth available for the clients at no additional cost to the company (apart from the capital for the proxy server). Yes, they're not perfect, but they make a difference. When you weigh the pros and cons, well, it's obviously going to be worth it for the ISPs to have it installed.

    You could look around for an ISP that doesn't use a transparent proxy but, as you said, they're becoming more popular. Realise that they're not doing it to squash your freedom, but instead to provide better service and to save money.
    • That's a very good point.


      What I'm complaining about is that the proxy has a bad failure mode: when the usual resolution method fails, the proxy ought at the very least to fail softly and route the packets like a real router should. Unfortunately, it doesn't -- it generates an error message instead.

      • by Anonymous Coward on Saturday March 23, 2002 @10:40PM (#3214982)
        1. An HTTP proxy server is not a router.

        2. What is happening is that your *default gateway* (which really IS a router) is redirecting packets bound to port 80 to the proxy server. Your default gateway is doing the routing, NOT the proxy server. (Linux does a nice job at transparent proxying, btw.)

        3. The proxy server then tries to resolve the domain name using DNS.

        4. The DNS server the proxy server is configured to use, not knowing anything about these funky TLDs you're trying to access, can't find it. It tells the proxy server so.

        5. The proxy server comes back and gives you a nice, friendly error message telling you it can't resolve the host name.

        Look...transparent proxying is to bandwidth what NAT is to private networks. It works, it works very well, it's in widespread use (getting wider every day, probably), and it's here to stay. If you really want to do something constructive to solve your problem, ask your ISP to configure their DNS to resolve the OpenNIC TLDs. They're a lot more likely to do that than they are to stop using transparent proxying (I know I would be).
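The five steps above can be modelled in a few lines. This is only a toy simulation under stated assumptions: the DNS table is a plain dict, the addresses come from the documentation ranges, and `www.example.geek` is a made-up name standing in for an OpenNIC site. The `fall_back` flag models the soft-failure behaviour the submitter is asking for (use the address the client originally dialled); nothing here claims to be how any real proxy product is written.

```python
class ResolutionError(Exception):
    """Raised when the proxy's DNS cannot resolve a host name."""


def resolve(name, dns_table):
    """Look a name up in the proxy's own DNS view (a dict in this sketch)."""
    try:
        return dns_table[name]
    except KeyError:
        raise ResolutionError(name) from None


def proxy_upstream(host_header, original_dst_ip, dns_table, fall_back=False):
    """Decide which address an intercepting proxy connects to.

    Steps 3-5: the proxy resolves the name from the payload itself. If that
    fails it either reports an error (fall_back=False) or falls back to the
    address the client was originally trying to reach (fall_back=True).
    """
    try:
        return resolve(host_header, dns_table)
    except ResolutionError:
        if fall_back:
            return original_dst_ip
        raise


# Hypothetical data: the ISP's DNS knows one ordinary name and no OpenNIC names.
ISP_DNS = {"www.slashdot.org": "192.0.2.10"}
```

With this table, an ordinary name resolves as usual, while `proxy_upstream("www.example.geek", "198.51.100.7", ISP_DNS)` raises `ResolutionError` (the "friendly error message" case) and the same call with `fall_back=True` returns the client's original destination.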

    • I agree with everything you say; proxy servers are a great thing for all involved and not a threat to freedom.

      But the problem is that this proxy server doesn't work right. My browser should look up the IP corresponding to the site, send a request on port 80, and get the response. In this case, it looks like the proxy is insisting on doing the lookup part, and so the user effectively can't change his DNS.

    • Or the OP could look around for an ISP that does use transparent proxying correctly. This is not an unsolvable problem; all the proxy has to do is connect to the correct origin server, which is the IP address the client connected to. This is necessary because with transparent proxying, the GET request carries only a path (plus, in HTTP/1.1, a Host header), not a full URL.

    • 99.9% of the ISPs clients aren't trying to do anything tricky, like this.


      Well, in that case, they can stop supporting anything but windows, since it has a clear majority. Oh, and you can't use anything but IE since it's got a majority as well.

      The problem is that I don't pay for 'a service that allows me to view most web sites'. Rather, I pay for an 'internet service'. If anything that should work, doesn't, then they are violating their end of the contract... Not to mention probable false-advertising, etc.

      If it costs them 10% more bandwidth for those who choose not to use their optional proxy, then they should charge the customers 10% more.

      How about if the USPS decided to crush every package by 1cm because then they can fit more packages in each plane/truck. Besides, 99% of people have at least 1cm of padding to protect the package contents anyhow.

      It's exactly the same thing. Doing something that doesn't hurt too many people, in exchange for more profit. The fact that most people aren't going to be negatively affected doesn't make it right, or legal for that matter.
  • by tangent3 (449222) on Saturday March 23, 2002 @03:31PM (#3213761)
    Here in Singapore, ISPs are required by law to block port 80, forcing all outgoing http requests to go through a proxy server (which filters out webpages which are deemed unsuitable for Singaporeans to view, including www.playboy.com), or to have a transparent proxy server blocking out such requests.

    This has caused me many problems before, when my IP gets determined wrongly by the remote site (which naturally takes the proxy server's IP for my IP address). Some applications don't like the transparent proxy either, for example FrontPage Extensions (not my choice to use!), and an autopatching program which refused to download the latest version of a file, insisting on downloading only the file cached in the proxy server until the cache got flushed.

    The only real method of bypassing the proxy is to use another proxy server (since 8080 isn't blocked) outside the ISP's network. This tends to be really slow though.

    I guess I have to live with this until the government one day realises that proxy servers cannot stop the people from viewing pr0n, and it's probably not worth maintaining the proxy servers to meet the demands of all the net users in Singapore, not to mention maintaining the list of sites to block.
    • OK, this is a bit OT, but since you're from Singapore, I'm curious about something. I know that when filtering was proposed there, many people weren't happy about it. Has there ever been a move to form something akin to the EFF to protest this, or is the political situation still such that doing this would get you hauled into court by the government?

      The whole political situation there baffles me. More repressive governments have been forced to reform by popular protests. Why hasn't it happened in Singapore? You'd think that, with the extent to which the country is connected to the rest of the world, people would see what's happened in places like Indonesia, Thailand, Yugoslavia, etc. and want to do the same.

    • I guess I have to live with this until the government one day realises that proxy servers cannot stop the people from viewing pr0n, and it's probably not worth maintaining the proxy servers to meet the demands of all the net users in Singapore, not to mention maintaining the list of sites to block.

      The Singapore government is probably more concerned about stopping people accessing the numerous overseas sites run by the opposition movement. For those that don't follow Singapore politics, it is one of those countries where the government brings specious lawsuits against opposition politicians and elections are run in the manner of the old Soviet Union.

      Of course, since it is a capitalist pseudo-democracy, this rarely gets comment in the western media. When it does, the government has sued for libel under its mickey mouse libel laws in its kangaroo court system.

      All phone calls made in Singapore are tapped and the government analyses the telephone call logs to see who is talking to whom. It's kinda the state that Ashcroft would like.

  • by MattW (97290) <matt@ender.com> on Saturday March 23, 2002 @03:31PM (#3213762) Homepage
    First of all, the phrase "routing" is a misnomer. Web caching is something that happens on the application layer of the OSI model, layer 7, whereas "routing" refers to layer 3, which supplies IP routing for the TCP/IP protocol suite. What's broken is their caching, their cache server, or their proxying; pick a term.

    Second, there's a lot of ways around it which involve tunnelling.

    Tunnel to another box running a non-broken web cache. I used to tunnel my http traffic through ssh to my colocated boxes, which ran adzapper, and proxied through that.

    Tunnel at the IP layer by running any IP-in-IP encapsulation. If you have some version of windows, for example, you might convince someone with a server to run a PPTP server for you somewhere and you could tunnel through that. There are even Free PPTP Servers for Linux [poptop.org] available to help.

    Find someone who runs a little proxy for their own net with SOCKS, and bounce off their SOCKS proxy. Someone you know on another ISP probably has Wingate or the like running, and if they allowed it (and on some older versions, it will permit this by default), you could set your browser's SOCKS settings to bounce off their proxy server, and since SOCKS isn't on port 80, your ISP will probably ignore it.

    There are also a number of things you might discuss with your ISP to resolve the issue.

    Suggest that they switch to a less broken cache server. (Squid [squid-cache.org], anyone?)

    Suggest that they exempt you specifically from the cache server by telling it to ignore your ip address.

    Note that they have an obligation to make sure their caching software doesn't interfere with your browsing; so it will be necessary (and not cost-effective for them) for you to call for every problem you notice.

    Obviously, you'll need to probably speak to a whole number of supervisors, and probably eventually get transferred to a "real engineer", and they will probably hack in a fix (like exempting you only) rather than truly deal with the problem.

    If all else fails, then you may want to try issuing ultimatums, like, "If you can't fix this problem, then you can cancel my service." Tech support people are lazy, however, in some cases, and may just opt to cancel you. This is a harsh reality in the world of consumer bandwidth -- and it will be worse, soon, with bells closing their DSL lines to competition, meaning unless someone else builds a telephony infrastructure to you, you'll probably pick Cable vs 1 DSL provider, and if you don't like something at either of them, you're just out of luck.

    • >First of all, the phrase "routing" is a
      >misnomer. Web caching is something that happens
      >on the application layer of the OSI model,
      >layer 7, whereas "routing" refers to layer 3,
      >which supplies IP routing for the TCP/IP
      >protocol suite. What's broken is their caching,
      >their cache server, or their proxying; pick a
      >term.

      Thanks for the helpful comment!

      What I'm complaining about is that their router (layer 3) routes all port-80 packets to a cache server that looks at the payload only (layer 7) and not at the header at all. In short, they're not routing correctly; they've broken the layered structure of the protocol.
      • Possibly. There's a decent chance that the cache server is actually responsible for passing all the traffic, actually. A lot of routers can't properly route-cache if you try policy-based routing (which you must in order to route by port and not just destination IP + routing table). So ALL packets get passed through the cache server, but it just forwards non-port 80 traffic, since a mere receive/send is very quick, as its routing table will likely consist of only a dozen or so entries (the vast majority of which will be its default route out), and the cache server is likely to be sitting between their backbone routers which have to maintain BGP tables and the DSL lines/etc in question.

        Be sure to try the other helpful suggestion I read of trying port 65616 (that is the right port, btw) -- if your proxy server is stupid, it might pass that on. Of course, you'd have to type it into your URLs a lot, but it is still a way to get around the cache when you need to, if it works.
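The arithmetic behind the port 65616 suggestion: TCP port fields are 16 bits wide, so a value above 65535 cannot be carried as-is, and 65616 is exactly 65536 + 80. The sketch below shows the truncation. Whether a particular browser or operating system accepts an out-of-range port and truncates it this way, rather than rejecting it, is an assumption; many will simply refuse.

```python
def wrap_port(port: int) -> int:
    """Truncate a port number to the 16 bits a TCP header can carry."""
    return port & 0xFFFF


# 65616 == 65536 + 80, so only the low 16 bits (80) survive.
print(wrap_port(65616))  # prints 80
```

If the truncation happens after the proxy has already decided the traffic is not port 80, the request would slip past the cache, which is the trick being suggested.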
    • by Phroggy (441) <slashdot3NO@SPAMphroggy.com> on Saturday March 23, 2002 @04:19PM (#3213912) Homepage
      Tech support people are lazy, however, in some cases, and may just opt to cancel you.

      Au contraire. Tech support people are tired of listening to customers whine about problems that tech support people cannot fix. If customers have unreasonable expectations, and refuse to listen to us, it's far better for the company if they just cancel service and go elsewhere (becoming somebody else's problem).

      Also, nonchalance about canceling service is sometimes the best way to make customers understand that we really are doing our best to help them, and we're not just blowing them off. Sounds weird, but here's an example:

      Customer has a problem with their DSL service. We've identified that the problem lies with the phone company. Phone company has given us a commit date of Tuesday by end of business day for repair to be complete. For whatever reason, the customer feels like they've been dragged around, and their service isn't getting fixed. Customer says if they're not up and running by 9:00am Monday morning, they're cancelling service.

      Customer expects us to bend over backwards to get them up and running by 9:00am Monday morning. We can't. There is absolutely nothing we can do. It's out of our hands. Customer needs to understand this. Customer will have the same problem at any competing DSL ISP, but we're the ones who have identified the problem and are getting it fixed.

      We respond by repeating to the customer that we have been given a commit time of Tuesday by end of business day, but that we cannot guarantee that the issue will be resolved by then. We then offer to the customer that if this is unacceptable and they'd prefer to cancel service, although we'd hate to lose them as a customer, we'd be more than happy to transfer them to someone who can take care of that.

      This has the effect of making it clear to the customer that we really mean what we say. Usually, they shut up, keep their account, and let us do our jobs. Often, they'll ask to be transferred to get the account cancelled, then hang up during the transfer.

      The alternative is to offer the customer incentives to try to convince them to stay with us, such as offering a free month of service, or a credit on their account. This costs us money, and gains nothing - if the customer has the expectation that we're willing to give him free service, he'll try to take advantage of it in the future. Far too many ISPs have failed for this very reason.

      At the last few ISPs I've worked for, nearly all my coworkers have been genuinely interested in helping the customer, and we've been fortunate to have management that allows us to do so. I understand that at some companies this is not the case; those are obviously the ones to avoid.

      Sorry for ranting. Getting back on track: ultimatums like "if you don't fix this problem, I'll cancel my service" sometimes are a good idea. That will tell you whether or not you can get the issue resolved. Be prepared to actually cancel, because if they can't resolve the issue, that's what will happen. If they can but just don't want to, threatening to cancel may just be the incentive they need to get it done.
      • Well, I was once in support. The company I worked for (an early ISP in 95) had a fairly well known client for internet connecting - it was a visual basic piece of crap designed to get you connected and start a slip connection, along with a few trivial tools (like a very lame usenet reader, a gopher search, etc).

        I could relate anecdotes all day, but here's a good one: customer calls up, and is getting the message, "Out of memory" and the app is crashing. Customer only has 4 or 8 megs of RAM, a smidge low but not too low to run the app back in the world of WFW 3.11. I happened to divine this from listening to the call from the next cube over, and heard the person conclude that Windows was causing the problem and recommend to the customer that they reinstall! ARGH. I practically jumped over the cube wall and got them to transfer the customer over to me, where I discovered that the customer had their virtual memory settings set to 0MB, manually. That and the 4/8M of physical RAM did not mix.

        "Reinstall Windows" and "Reinstall XXXXX" (our product) were the favorite responses, and it was a culture of lazy, voodoo support, which is why I got customers calling back for the 12th time, sometimes. Add that to a hold time of over an hour, and it's no wonder that company faded into obscurity.

        My experience going the other direction is even more common. I can scarcely EVER call anyone responsible for supporting an Internet service and not get either (1) a moron or (2) a lazy moron. Admittedly, I don't expect a lot -- good engineers don't like doing technical support. But when you describe routing problems to your colo provider and their support staff tries to immediately claim it is a problem with some other provider before even checking a traceroute (which, if they knew what they were doing, would let them realize what an out-and-out fabrication their claim obviously was), you get fed up.

        However, all this aside, my advice to the original poster was based on this general idea: his support people will be familiar with stuff like checking DSL line-downs and scheduling truck rolls, debugging customer DNS/DHCP settings, etc. They will never have any reason to even wonder about a caching server, most likely, but because many tech support departments have poor to nonexistent escalation paths to engineering, they will attempt to shaft the customer, either knowingly or perhaps ignorantly. But one way or the other, they won't fight their way through to escalate properly. I think if your experience is different, it's the exception rather than the rule. (And bravo to your TS department if you do provide competent service time and again -- I've yet to come personally in contact with anyone who could do it, including a company where reps made an average of over $55k/yr.)
  • The BEST solution that unfortunately will never be implemented is to allow specifying a port number in a DNS lookup. Then when the browser or e-mail looks up the address, one could also specify a port that you want.

    Unfortunately, this ain't gonna happen without a rewrite of everything.

  • by billstewart (78916) on Saturday March 23, 2002 @03:39PM (#3213789) Journal
    Do you know what brand of attempting-to-be-transparent proxy cache server they're using? Proxy caching is an important performance enhancer for ISPs, corporate firewalls, and other bottleneck network environments, and "transparent" proxies are less trouble for the ISP and for the users as well (especially since many users wouldn't bother configuring their browsers for them unless either they're pre-configured by the ISP or forced to use the proxy by firewall rules that block non-proxy access).

    Of course, the problem with transparent servers is when they're not, and your ISP seems to have one that isn't. Is it possible to find out what kind it is, either by telnetting to the thing and looking at headers or by asking the ISP, and can you do bug reports to the vendor to get them to fix their product?

  • by theCoder (23772) on Saturday March 23, 2002 @03:43PM (#3213805) Homepage Journal
    My college [purdue.edu] has a similar set up because it saves an incredible amount of bandwidth. It's not to be mean, or malicious, or spy on your browsing habits, it's just to save bandwidth. And it does (I wish I had numbers to back this up, but I don't run the proxy).

    There have been problems with the proxy in the past (it not returning any data) and there are still some minor issues, but on the whole it works well (in that you don't ever notice it).

    It sounds like the ISP in question has a bug in their web cache code. If the web cache doesn't have the particular URL cached, it forwards the request to the intended destination. I'd bet it's trying, but it can't look up whatever OpenNIC URL is being specified (because it doesn't use OpenNIC). The ISP really should report this bug to the manufacturer.

    My advice is this -- get the ISP on your side to fix the problem. They won't remove the proxy, and they shouldn't have to if the bug is fixed.
    • Proxies are not inherently bad. As you point out, they do have important benefits. But a broken proxy, as peakpeak.com has, which fails to make the connection to the IP address the client tried to connect to, causes problems. The correct behaviour would be to do the DNS lookup to get a valid list of IP addresses. If the intended origin server the client tried to connect to is among those, then check the cache by hostname alone. If the IP addresses do not match, then check the cache for a combination of hostname and IP address. If the cache doesn't have the requested document, then connect to the proper IP address and fetch it.

      I don't know which proxy servers (can) do this properly. Hopefully someone can post (a link to) a proxy server review with this much detail. I do agree the ISP needs to fix the proxy, and it shouldn't be necessary to remove the proxy to accomplish it. If they can't fix the proxy, then a workaround is to route the one customer around it (route-map in Cisco IOS can do this).
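
      The lookup order described above can be sketched in a few lines. This is only an illustration of the idea, not how any particular proxy product works; the function name and the tuple-shaped cache keys are hypothetical.

```python
def plan_request(hostname, client_dst_ip, resolved_ips):
    """Decide the cache key and the upstream address for an intercepted request.

    hostname       -- name from the HTTP Host header
    client_dst_ip  -- IP address the client's TCP connection was aimed at
    resolved_ips   -- addresses the proxy's own DNS returned for hostname
                      (empty if the lookup failed)
    """
    host = hostname.lower()
    if client_dst_ip in resolved_ips:
        # The proxy's DNS agrees with the client: one shared entry per hostname.
        key = (host,)
    else:
        # DNS disagrees or knows nothing: cache this destination separately.
        key = (host, client_dst_ip)
    # On a cache miss, always fetch from the address the client chose.
    return key, client_dst_ip
```

      With this keying, a name the proxy cannot resolve (an OpenNIC name, say) still gets fetched from the client's chosen address and is simply cached on its own.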

    • We have a squid cache and during peak browsing, er, working times we see 40-50% cache hit rate.

      I think the byte savings isn't quite as good as that, but I don't have any solid data to back that up.

      The best I can say is that we had to shut the cache off for a day or so to do some maintenance and the help desk got a lot of calls about how "slow" the web was, in spite of the fact that not more than a few days prior we had *doubled* our internet bandwidth (single 1.5Mbit frame to MPP bonded dual point-to-point).

      I think that overall it provides much better bandwidth utilization (ie, fewer packets on the ISP link, even if the byte savings is only 10-20%) and the client browsing experience is a lot snappier.

      Our ISP used to have a whole statewide squid cache hierarchy which you could tune your local squid cache into if you wanted to -- I wish they still did, the aggregated caching would have been very nice.
  • AOL ignores ports (Score:4, Interesting)

    by Anonymous Coward on Saturday March 23, 2002 @03:51PM (#3213833)
    AOL's transparent proxy is a little worse. It ignores the port and proxies anything that looks like HTTP. Of course, they deny having a transparent proxy, but I was able to watch packets leaving our network headed for AOL and then watch altered packets come back from AOL.

    I stumbled across this when their proxy had some trouble with the cookies we were using and suddenly no one on AOL could use our service. A few minutes later they could again. Then they could not. During this time, I was running a packet logger on the outgoing traffic from our server and on the incoming traffic to a workstation I had connected to AOL. Everything worked fine until the server sent the cookie. Then AOL suddenly stopped sending more packets. This occurred on every port I tried, even ports reserved for other services.
  • by xanthan (83225) on Saturday March 23, 2002 @03:52PM (#3213835)
    The web cache is exhibiting correct behavior. When a forward proxy cache (transparent or not) gets a request in the form of GET http://www.site.com/ http/1.1, it will use the www.site.com address instead regardless of what original dns name you went to (www.google.com in your example). In the transparent case where the GET statement looks more like GET /content.html http/1.1, it will use the original destination address.

    In other words, it's your client that's broken. See RFC 2616 for details.
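
    The two request-line forms can be told apart mechanically. A minimal sketch of that distinction follows; the function name is made up for illustration, and it deliberately ignores the Host header, which a real cache would also consult.

```python
from urllib.parse import urlsplit


def upstream_host(request_line, original_dst):
    """Pick the host a forward proxy would contact for one HTTP request line.

    original_dst is the address the client's TCP connection was aimed at
    (only meaningful when the connection was intercepted).
    """
    method, target, version = request_line.split()
    if target.startswith("/"):
        # Origin-form, e.g. "GET /content.html HTTP/1.1": the request line
        # names no host, so fall back to the original destination.
        return original_dst
    # Absolute-form, e.g. "GET http://www.site.com/ HTTP/1.1": the URL wins,
    # whatever host the client happened to connect to.
    return urlsplit(target).hostname
```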

    The unfortunate truth is that more often than not, sites simply don't set their cache controls correctly. They forget that caches don't exist just on the server side but that they exist on the client side as well. Section 13 of RFC 2616 explains how they work in great detail and it really should be mandatory reading for any site administrator.

    If you're still looking for more information on web caching, check out Content Delivery Networks by Scot Hull. It was just released and is available on Amazon. There is an enlightening section on web caching that should clearly explain why what you're seeing is in fact correct behavior.

    • Well, yes and no, how could a proxy work with non-ICANN roots?

      It will try to resolve the address in the GET line, and fail, because it doesn't know about other TLDs.

      The only way to fix this broken proxy behavior is to have it ignore GET lines that it can't resolve, and instead forward the request intact to the IP address.
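
      A rough sketch of that fallback, assuming the proxy can still see the destination address of the connection it intercepted (the function name and the `resolve` parameter are hypothetical, added so the resolver can be swapped out):

```python
import socket


def choose_upstream(hostname, original_dst_ip, resolve=socket.gethostbyname):
    """Return the IP to fetch from: the proxy's DNS answer if there is one,
    otherwise the address the client was already connecting to."""
    try:
        return resolve(hostname)
    except OSError:
        # socket.gaierror is a subclass of OSError. The proxy's resolver does
        # not know the name (e.g. a non-ICANN TLD), so behave like a router
        # and pass the request on to the client's chosen address.
        return original_dst_ip
```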
    • by Dr. Zowie (109983) <slashdot@@@deforest...org> on Saturday March 23, 2002 @05:04PM (#3214045)
      Yep, the cache is behaving correctly for a cache. The problem is that it's behaving incorrectly for a router, because I can't send the http: requests I want to the hosts I want to send them to.

      I'm not familiar enough with the ins and outs of cache design to know whether RFC 2616 is designed primarily for ``transparent'' or ``selected'' proxies, but using a DNS resolution on the destination host seems to break the layered structure of the IP stack. In this case, packets that I've (layer 3) addressed to a specific host never get there, because (layer 4) they're being directed to another machine based on port, and the other machine (the cache) is routing them based on a name (layer 7) contained in the packet payload.

      That is acceptable behavior for a proxy to which I'm explicitly routing my http requests, but not for a router down which I'm sending port-80 IP packets.

    • Wrong. That's not what's happening. Ordinary proxying does use the modified GET request form where the URL is used in place of the URI. However, transparent proxying is different because the client is sending a URI, not a URL. And it's connecting to the origin server IP address directly, not to the proxy. The only way to identify the correct host is to use the IP address the client attempted to connect to. That's the transparent in "transparent proxy".

      If a client does attempt to connect to some IP address, and a transparent proxy won't use that IP address because it thinks the origin server is at another address, that's wrong. But if it has no idea what the origin server IP address is at all, even though the client was indeed connecting to it, then that's doubly wrong. A message from the transparent proxy saying it cannot find the IP address is simply stupid because it has the IP address the client connected to, since this is a transparent proxy.

  • If you want to find the IP address of a transparent proxy, simply point your web browser at a web page that will print out "your" IP address when you request a web page. Instead of printing the IP of your firewall or your host, it will print the transparent proxy's IP address.

    For example:

    After that, you may be able to do some more investigation into what kind of host it is and/or what kind of software it is running. (This is left as an exercise for the crac...err, reader.)
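
    If no such page is handy, one can be run on an outside machine. Below is a minimal sketch using Python's standard http.server; the class and function names are made up here, and it assumes you control a host outside the ISP that can listen on port 80 (the port being intercepted).

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def peer_report(client_address):
    """Format the (ip, port) pair of the connecting peer as a text body."""
    return (client_address[0] + "\n").encode("ascii")


class EchoPeerAddress(BaseHTTPRequestHandler):
    """Reply to every GET with the IP address the connection came from."""

    def do_GET(self):
        body = peer_report(self.client_address)
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        # Ask caches not to store the answer, so each visitor sees their own.
        self.send_header("Cache-Control", "no-store")
        self.end_headers()
        self.wfile.write(body)


def serve(port=80):
    """Run until interrupted; binding to port 80 usually needs privileges."""
    HTTPServer(("", port), EchoPeerAddress).serve_forever()
```

    Calling `serve()` on the outside host and browsing to it from home shows whichever address actually opened the connection.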

    • Not for sure; most proxy/switch solutions can do ip-spoofing so the remote webserver can't detect it. This is often done to avoid user/login problems on systems that base parts of their security on IPs. If the site then has its proxy rules set correctly in the meta tags or header information, the "hidden" proxy won't cause any problems or cache any information.
  • It's in the layers. (Score:4, Informative)

    by Bender Unit 22 (216955) on Saturday March 23, 2002 @03:58PM (#3213854) Journal
    Normally what you do is layer 4 switching, but note that you can do switching on layer 7 as well, which means you can have the switch do url based switching so that a part of the url determines that it should get switched. This requires much more power and is mostly done for server switching like load balancing.

    What happens in your case might be that they have placed a switch that can do at least layer 4 switching, between you and the internet.
    What then is done is that all port 80 requests coming from the client's side (you) are redirected to the proxy, which means that http requests on other ports will not be cached. Note that anonymous ftp can also be proxied.
    A "clever" proxy/switch solution can do ip-spoofing so the webserver gets your IP address and sends its reply back to you directly, but as there is a switch in between, it redirects the result to the proxy, which then sends the result back to you.

    A way to avoid it is to get a gateway somewhere that can channel your http traffic; you could set your browser to use this gateway as a proxy on any port. The switch will most likely not act on the traffic coming on this port and pass it through.

    The easy way would be installing a proxy server on a box that you have access to on the outside and configure it so that it won't cache anything.
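
    For scripted fetches, the same idea can be sketched with Python's standard urllib. The proxy host and port below are placeholders, not a real service; substitute a box you actually have access to, listening on something other than port 80.

```python
import urllib.request

# Hypothetical outside proxy; replace with a host and port you control.
OUTSIDE_PROXY = "http://proxy.example.net:8080"


def opener_via(proxy_url=OUTSIDE_PROXY):
    """Build a urllib opener that sends plain-HTTP requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)
```

    Something like `opener_via().open("http://www.dev.null/").read()` would then leave the home network on port 8080, and name resolution is left to the outside proxy.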

  • Hello,
    How can you detect transparent proxying? Or opaque proxying?
  • Run up their support costs until they start using
    a non-broken proxy cache. Technical solutions are
    nice, but they only fix the problem for *you*. If
    you care about your peers, and the community of
    users, solving the problem for *everyone* is much
    to be preferred. Most users won't even understand
    that they are being screwed by the ISP. They
    depend on you to resolve the issue. Keep calling
    support until they fix it.
  • ... are a pain in the rear. From time to time, the web proxy will just... die. No data from my box can go out on port 80 to any sites for a good 10-30 minutes. This is in addition to the usual crap with their gateways, which cause stalls in ALL data transfers at random intervals, for a solid 30-70 seconds. Ironically, that gateway problem stalls my large file downloads and makes it near impossible to view streaming media at any level of enjoyability... The two biggest features flaunted by broadband services like Comcast. Anyway, sorry for the OT rant. :P
  • It seems to me that ISPs use interception proxies to lower bandwidth costs. Here in Canada (Ontario at least), most of the big ISPs are talking about implementing bandwidth caps (5GB/month with excess charged at C$10/GB). I hope your ISP isn't doing both, as that would seem unnecessary and rather heavy-handed.

    My previous ISP, Sympatico [sympatico.ca], used to have a transparent interception caching proxy. It was quite troublesome and more of a translucent crashing poxy server. I remember being unable to access starwars.com for two weeks once, even though everything else seemed fine. It was particularly annoying for people whose MTU was set too high (they needed 1454 or less) as they would constantly get timeouts on HTTP POST, such as when trying to send email from a web interface like Hotmail or Yahoo. It was also a constant source of problems for people trying to author their own personal web pages as it would cache them and not show their updates.

    My current ISP, IStop.com [istop.com], has an optional proxy. This is great! I normally use it, but if I have problems, I can switch to a direct connection. They run Squid and they also seem to have some sort of advert filter running. I get their logo (cached by my browser) or "This ad zapped" messages in place of at least 80% of web ads, which saves me lots of irritation, and both of us save lots of bandwidth. Incidentally, they also have reasonable bandwidth caps: 10GB non-local + 10GB local (mail, news, proxy, etc) per month, with excess charged at C$3/GB.

    After a while, Sympatico reduced HTTP interception to large population centres like Toronto, Montreal and Ottawa. Finally, they stopped doing that too. I guess it was causing too many problems and costing them too much to deal with it. If my ISP were to introduce an interception proxy today, I would leave them immediately. It's just not worth the irritation and problems for the length of time it will take them to fix it or get rid of it. I do live in an area where there is plenty of DSL competition though.

    So that would be my advice: switch ISPs immediately. Don't waste anymore time or effort on these guys.
  • I am the network admin for a wireless isp that does transparent cacheing. If a user asks us to turn it off, we can disable it for their IP.

    For more than 99% of our users, they don't know what routing or cacheing is, much less that it's happening. For those that actually have issues with the proxy it's a quick modification to our ipchains rules. So far we've only had 2 such requests. Also, we disable the cacheing for business class users by default.

    I would hope that you would ask them to disable their transparent cacheing for you before doing something as rash as dropping them. It's my bet that most of their other users do not have this issue, and they may not even be aware that it is causing problems for you.
  • The original post describes the predicament that she/he is in, but doesn't even say what is broken, exactly!

    From the submission, it actually appears that the proxy is working exactly as configured. The end user, however, is breaking things himself by using nameservers other than his ISP's. That can't be described as a failure of the ISP by any means.

    Proxy servers add a lot of value to any network larger than, say your 3l33t home rig. The two main purposes I use them for are to reduce overall bandwidth usage, and to insert some level of malware protection. I've saved myself, and my company a lot of headaches by blocking silly virus code requests.

    It's nice that the post managed to include links to RFC, etc... it's too bad that they don't seem to really have an understanding of how networks, specifically the Internet, works.

    As others have commented there are plenty of alternative ways to get around this like SSH tunnels, VPNs, third-party proxies, etc...

    Just my own little $0.02 worth of a rant. Please drive through.

    -buffy

    • What's broken is routing. I can't route port-80 packets to the host of my choice. Proxies are all right as far as they go, but this one is mandatory. The ISP didn't advertise it and tech support takes the attitude that, well, it's transparent so it doesn't matter. It's just that it's not transparent.

      That's the problem with a lot of software "solutions": people don't think through all the worst case scenarios, just the main one, and the tool breaks when it's used in a way they didn't expect.

      • "Broken" routing is what they've done, NOT what that decision has "broken" for you.

        What I'm asking is to describe what you're seeing that doesn't work through the proxy. WHAT is broken? Are you having a problem connecting to a specific site, or collection of sites? Do certain streaming media not work? Come on, tell us what is wrong, so we can try to give you some proposed solutions.

        Complaining that you disagree with the decision of your ISP is not the same thing as offering up a real description of the resulting issues that occur because of that decision.

        I'm honestly very curious to hear, because I run such a proxy in my company's production networks, and sometimes my users are not actually the most vocal in telling my department if something is wrong. I'd like to hear what you're seeing.

        Thanks in advance.

        -db
        • Okay, cool.

          The problem I'm having is that, because I have to rely on the proxy's DNS to resolve a web hostname, I can't get certain HTTP requests to certain hosts.

          The specific example that I mentioned is the "http://www.dev.null" URL, but there are loads of other examples. In particular, "http://www.dev.null" is a completely valid URL that points to, well, the dev.null site. To resolve the URL into a host IP number to connect to, you have to be using the openNIC DNS tree (.null isn't supported by ICANN).

          So far so good. I can connect my web client, for example, to "https://www.dev.null" and get the secure pages for the dev.null site. But the main pages at "http://www.dev.null" are invisible to me -- the most I can get is an error message from my ISP's ``transparent'' caching proxy.

          The reason for the difference is that my home box uses OpenNIC -- the https: request gets resolved into an IP number and my local host just telnets to that host on the appropriate port and issues an encrypted HTTP request.

          The http: request fails because, although my host resolves the host-part of the URL into an IP number, that number is ignored by my ISP's ``transparent'' proxy. All port-80 packets emanating from my home network get intercepted and fed into the proxy. The problem is that the proxy attempts to resolve the URL using ICANN DNS -- so it's unable to identify an IP number to associate with the URL. It's apparently poorly coded, in that it doesn't know enough to try the IP destination address on the incoming stream that it's intercepting.

          My ISP reacted sort-of the way you just did: "well, if you're not using DNS, you're out of luck, pal." But that's Wrong. If they had an explicit proxy server and I had some choice about what happened to my packets, that would be OK -- but they don't. All of my port-80 packets are being intercepted by a piece of buggy 'ware. That (A) breaks the layered structure of the IP protocol, and (B) prevents me from accessing big chunks of webspace that I'd like to use.

          If you administer a similar transparent proxy setup, I encourage you to ensure that it falls back gracefully to being a router if it can't make sense of the http: requests. Apparently the most popular Cisco solutions and Linux ipchains router/proxies are pretty easy to configure correctly (according to some of the other answers).

          Cheers,
          Craig
    • The proxy server, when operating in the role of an intercepting transparent proxy, should make the connection to the same IP address as the client did. And it should do it immediately if it has no cache entry for that (hostname,IPaddress) tuple. The proxy can then do a DNS lookup on that hostname later, to find out what IP addresses can have a merged/shared cache; later requests using one of those IP addresses would get the merged/shared cache, and those using other IP addresses would be cached separately.

      So tell me why a proxy server can't do this.

      Of course, it's not very likely the ISP will fix the problem very soon. Other solutions, such as an outside proxy, are likely to be the quicker workarounds. Ironically, this would defeat the bandwidth gains for the ISP, although for only an amount of that one customer. So the ISP shouldn't care as it's so small, right? If it turns out they do get angry that the customer is bypassing their cache proxy, then it's time for them to get their clue.

  • There are a few workarounds to the problem of devices that you do not wish to handle your traffic doing so.

    I have seen tunneling via ip-ip, ssh, and other ipv4 protocols mentioned, however there is another option available, and that is to tunnel your traffic as ipv6 traffic over ipv4.

    It does take a bit of time to set up, but if you can find an agreeable ipv6 network provider to allow you to tunnel to their server, your traffic will not be handled by any transparent proxy server at your local ISP, regardless of the type of traffic that you are working with.

    I am not sure how complete the ipv6 implementation for Windows is yet, or, depending upon which version of Windows you may be running, if it is even an option, but for users working with Linux and BSD, this should not be a significant issue.

    Then again, I could be wrong.

    -Rusty
  • Yep (Score:3, Informative)

    by fanatic (86657) on Saturday March 23, 2002 @05:41PM (#3214178)
    I did some experimentation using ``telnet'' on port 80 directly, and found that packets are being routed based only on the payload regardless of the original destination address: I can (for example) retrieve the Slashdot front page by using ``telnet www.google.com 80'' and asking for "http://www.slashdot.org http/1.1".

    This is how a 'transparent proxy' is going to work. Any SYN to port 80, whether from telnet or a browser, is intercepted. Then the proxy uses the URL and host in the GET request and headers to get the page and send it to the client. Your only hope is to do your browsing on a port other than 80, as others have noted, by setting up a machine somewhere outside of your ISP that can receive your requests on something other than 80, then send them out on 80. Then you set your browser to proxy to that other machine on that other port, if it's an actual proxy.

    Of course the real cure is to shit-can your ISP. Not only are they messing with your ability to use the DNS root of your choice, but they also have the ability to track your activities. (well, more easily than just putting a sniffer on.) Wonder what they're doing on port 443?
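
    The telnet experiment quoted above can be scripted. The sketch below is only an illustration: the function names are hypothetical, and it adds a Host header and "Connection: close" that the original telnet session did not necessarily send.

```python
import socket
from urllib.parse import urlsplit


def build_probe(url):
    """Request bytes asking for url, regardless of which host we connect to."""
    host = urlsplit(url).hostname
    request = "GET %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n" % (url, host)
    return request.encode("ascii")


def probe(connect_host, url, timeout=10):
    """Connect to connect_host on port 80, ask for url, return the first bytes back."""
    with socket.create_connection((connect_host, 80), timeout=timeout) as sock:
        sock.sendall(build_probe(url))
        return sock.recv(4096)
```

    Running `probe("www.google.com", "http://www.slashdot.org/")` and getting Slashdot's page back would suggest something in the path is answering based on the payload rather than the destination address, though an origin server may respond in its own ways, so the result needs reading with some care.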
    • by Skapare (16644)

      There are 2 different kinds of proxy roles.

      One is where the client browser connects directly to the proxy. It provides the hostname or IP address in a URL on the GET (or other method) request line. That's the one to be used.

      The other is the interception of TCP traffic (usually port 80 specifically). In this case, the browser is unaware, and is connecting to a specific IP address. But most importantly, it is sending a URI, not a full URL, on the GET (or other method) request line.

      At the very least, a proxy which wants to connect to IP addresses it gets from DNS instead of the one the client was connecting to, should, if DNS gives it no addresses, connect to that IP address the client was using. The proxy peakpeak.com is using can't even accomplish this. Now that's just dumb.

      If it were to use the proper origin server IP address and the associated name from the HTTP "Host" header to first check the cache, then connect before doing a DNS lookup to that IP address, then performance is faster. There is no need to do a DNS lookup for this purpose (although it can be later used to know a set of IP addresses that can share a common cache).

  • My ISP (CTC) started doing this on my static dialup without warning. I noticed because 1) eBay pages suddenly required reloading in order to update (ie, if I quit the browser, and then went back to a dynamic eBay page, it was the same as before unless I reloaded the page). .. and then 2) I noticed when connecting to another machine, the address that showed up in the logs was not mine!

    Anyway, after poking the machine I discovered it was a Cisco something or other. I also discovered that if you sent a malformed or invalid request, it would STOP transparent proxying for a few minutes!

    So the solution I came up with was to telnet port 80 someplace (didn't matter where, because the proxy would pick it up) and type "PLEASE DON'T PROXY ME" and close the connection and then it would leave me alone for a few minutes.

    Most of the time I left it on as the proxying seemed to speed up the usual day-to-day surfing. But you might want to try a script to do this automatically. Probably this is just an option the engineers forgot to turn off (I believe by law they must turn off all customer-friendly services :-).
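
    Such a script might look like the sketch below. It assumes the proxy behaves as described here (an invalid request pauses interception for a few minutes), which is an observation about one Cisco box and may not hold elsewhere; the function name is made up.

```python
import socket


def ask_not_to_be_proxied(host, port=80,
                          message=b"PLEASE DON'T PROXY ME\r\n", timeout=10):
    """Send one deliberately invalid request to host:port, then hang up.

    The host hardly matters when port 80 is intercepted, since the proxy
    picks the connection up before it reaches the real destination.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(message)
```

    Calling it every couple of minutes from a loop or a cron job would keep the effect going, at the cost of the occasional junk connection.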

    After a few weeks of doing this, and making a few phone calls, the proxy mysteriously went away. Maybe they took my static dialup off the list, or they decided to do it for everybody. Whatever. I've been using Squid so it's pointless for me anyway.

  • Does it work with http 1.0? you could get them with that.. (1.0 doesn't require a 'Host: ' header, so the request could just say 'GET /index.html HTTP/1.0\n\n', and the proxy wouldn't know where to send the request).
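
    For reference, a minimal sketch of building such a request (the function name is hypothetical). On the wire the line endings are CRLF rather than bare newlines:

```python
def http10_request(path="/index.html"):
    """Build an HTTP/1.0 request that carries no Host header."""
    # Each line ends with CRLF; the empty line terminates the header block.
    return ("GET %s HTTP/1.0\r\n\r\n" % path).encode("ascii")
```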

    you could also set up a proxy on localhost that rewrites the Host header from 'Host: www.weird_ass.domain' to 'Host: www.weird_ass.domain.existing_domain.com', and then have the DNS server that resolves 'existing_domain.com' reply with the IP for 'www.weird_ass.domain' when it gets a request for 'www.weird_ass.domain.existing_domain.com'. Maybe the maintainers of the 'weird_ass.domain' zone already have that.
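
    The header-rewriting half of that idea is small. A sketch, with a made-up function name, assuming the Host value carries no port number and the request uses CRLF line endings:

```python
def rewrite_host(request, suffix):
    """Append "." + suffix to the Host header value of a raw HTTP request.

    request and suffix are bytes; everything else is passed through untouched.
    """
    head, sep, body = request.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    # lines[0] is the request line, so start looking at index 1.
    for i, line in enumerate(lines[1:], start=1):
        name, colon, value = line.partition(b":")
        if colon and name.strip().lower() == b"host":
            lines[i] = b"Host: " + value.strip() + b"." + suffix
    return b"\r\n".join(lines) + sep + body
```

    The local proxy would apply this to each outgoing request; the DNS side (answering for names under existing_domain.com) still has to exist somewhere.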

    You'll probably need a lot of custom code for something that can be fixed by changing ISPs tho.

  • This is done with the Web Cache Communication Protocol (WCCP) from his ROUTER. To find out whether a Cisco router is WCCP enabled, do a sh ip int (your int). You can look up the specs of the protocol to figure out how to try and bypass it. But you probably won't get there by using another proxy (tried it), because you will still go through the original proxy configured at the router before going anywhere.
  • by davew (820) on Sunday March 24, 2002 @09:50AM (#3215949) Homepage Journal

    I see what you mean. You are sending traffic to a particular address based on your own DNS resolution, and if the traffic is proxied, you want it to be sent to your chosen destination, not that of the proxy.

    In my opinion, the ISP is exhibiting correct behaviour.

    Picture this: the object of the exercise with the transparent proxy is to cache pages and increase speed for the customer, right? I think it's already been agreed earlier in the thread that this is not entirely evil.

    Let's say the proxy honours the destination IP address that you chose (I'm not sure how this would work in practice, but I'll go with it for now). It returns the web page from the server that your DNS picked, and caches it for the next guy.

    Another customer requests a page with the same name. What if they're using a DNS root where the answer conflicts with yours? The customer gets the "wrong" web page. Because cached objects eventually expire, this means that the customer might get a completely different site dependent only on the time and date they happened to request it.

    The ISP doesn't use the same DNS root you do, so they can't begin to troubleshoot the problem.

    I concede that the popular "alternate" DNS roots have few enough conflicts with the IANA-assigned roots at the minute, but even that is an irrelevancy - any solution that allows a customer to choose destination IP address on behalf of other customers opens up the ISP to a denial of service attack by a user less trustworthy than you or I. One could set up an arbitrary "root" server that resolves www.yahoo.com to my own site. Or google. Or some site that accepts credit card orders.

    I can't see any scalable way out of this without the ISP picking one root, and sticking with it. If that is so, then I think this is a fundamental problem with split roots and, if you really want to use them, be fully aware of what you're getting yourself into. Turning off the transparent proxy will help this time, but you won't be able to rely on being able to talk to any server on the internet that doesn't use the same root as yours, even the servers you don't (usually) need to know exist.

    Regards,
    Dave
