The Internet

Multihoming Suggestions w/o at Least a /24?

An anonymous reader asks: "I work for a small company that is looking to get a multihomed Internet connection for redundancy. The logical conclusion would be to get another internet connection from another provider. However, in the case of a primary connection failure, we need to be running BGP to keep our internally-hosted sites accessible to the Internet via the 2nd connection. The problem is that we only have a /28 (16 IPs), which is too small to make it past most route filters, and would mean that we still couldn't be reached if the primary T1 is down. So, what are our options? (and no, lying to get a /24 isn't a valid choice)"
This discussion has been archived. No new comments can be posted.
  • by amorsen ( 7485 ) <benny+slashdot@amorsen.dk> on Thursday February 20, 2003 @04:15PM (#5346498)
    The obvious choice is to get a second set of 16 addresses on the other connection, and then make your DNS server send out addresses to whichever connection currently works. Not all services like switching addresses, and sessions break when doing failover, but it might work for you. If you only care about outgoing traffic, load-balancing and failover is fairly easy to do and there are lots of products to help. Again, outgoing sessions will get killed if they happen to use the link that breaks.
    • by photon317 ( 208409 ) on Thursday February 20, 2003 @04:21PM (#5346562)

      Yeah, outbound traffic is easy; it's the inbound he's having a problem with, I'm sure. The problem with two sets of addresses and DNS switching is the caching. Even if you set your records to expire in 30 seconds or something crazy like that, at various levels the records *will* get cached much longer than that, and it will be "problematic" at best.

      This question is truly worthy of Ask Slashdot, which is a first in a long time. I have yet to see a good answer for someone who wants truly redundant internet connectivity and has too small an address space to really do BGP peering.

      I thought of one solution at the ISP end of things, which would require partnerships between ISPs. Two distinct competing ISPs could grab a decent-sized netblock and share it. They sell these IPs to customers wanting dual-homed access from both ISPs, and split the money. In this type of scenario the customer can BGP to both ISPs, who in turn BGP with each other and the real backbone, and you can get all the redundancy you need in case of ISP or wan-link failure.
      • So true about the Ask Slashdot; this is a worthy question, as most of the time you have to get ARIN and BGP in the loop.

        I was thinking that you could also do it on the cheap by running 2 lines with your typical outgoing redundant links. Which is fine.

        Then have a secondary DNS located outside your network, with a very, very short TTL and a perl script that checks your link state. Once one group of IPs goes down, your secondary remote DNS outside your network starts to feed your redundant group of IPs.

        I do something like this with my home cable IP, which is DHCP'ed. On my box at the house I have a script that checks for IP changes; if it does change, it notifies the remote DNS to change to the new IP. I don't see why you could not do something like this with 2 cheap links and small blocks. You would just have to worry about convergence and proxies that feed to your sites.

        Anyway, just leaking at the brain... I have been known to be wrong.
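        The link-state script described above might look something like this sketch (Python rather than perl; the gateway address, the record sets, and the hand-off to the remote secondary DNS are all hypothetical placeholders):

```python
import subprocess

# Hypothetical addresses -- substitute your own blocks and gateway.
PRIMARY_GW = "203.0.113.1"
PRIMARY_IPS = ["203.0.113.10"]   # A records served while the primary link is up
BACKUP_IPS = ["198.51.100.10"]   # A records served after failover

def link_up(gateway):
    """Treat the link as up if the gateway answers one ping."""
    try:
        return subprocess.call(
            ["ping", "-c", "1", "-W", "2", gateway],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL) == 0
    except OSError:
        return False

def records_to_serve(primary_ok, primary_ips, backup_ips):
    """Serve the primary block while it works; fail over otherwise."""
    return primary_ips if primary_ok else backup_ips

if __name__ == "__main__":
    ips = records_to_serve(link_up(PRIMARY_GW), PRIMARY_IPS, BACKUP_IPS)
    # Push the chosen A records to the remote secondary DNS here,
    # e.g. by generating an nsupdate batch (not shown).
    print("\n".join(ips))
```

        Run from cron every minute or so, this gives the remote DNS a fresh view of which block to hand out; the short TTL does the rest.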

    • To expand on the parent, research something like the Fatpipe WARP: http://www.fatpipeinc.com/warp/index.htm
  • and get your own /24, or whatever, saying you needed it to multi-home.

    but this was back in 1995-96; things may be different now.
  • I've seen & skimmed a paper on Cisco's website on doing a hackish form of multihoming by using two WAN links that are NATted against each other. Googling is left as an exercise to the interested.
  • by pruneau ( 208454 ) <pruneau@g m a i l .com> on Thursday February 20, 2003 @04:24PM (#5346595) Journal

    Of course, the usual question is: how much redundancy can you afford?

    Because before looking at technical solutions, you might want to review the contract with your access provider to include liabilities. The contract itself might cost more, but it might be simpler than a real redundant solution.

    Because unless you know for a fact that your access provider is unreliable and has bad support, playing the redundancy game might be a bit more expensive than "simply" getting a double connection to the internet.

    Let's do the exercise: you want a dual internet connection, that's OK, but you surely do not want a single router (= single point of failure). So you have to buy another router, most probably the same brand as the one you already have, so as to be able to use the (most probably) proprietary high-availability solution. That's provided your current model supports HA, or you will have to buy a more expensive one.

    Which brings to mind that having a redundant link (with an SLA :-) from the same provider might be an excellent idea: since they are probably aggregating your /28 into another subnet, your route advertisement won't get lost in their network before it gets aggregated. Just make sure it does not get aggregated on the next hop ;-) Well, if you are willing to pay for multi-homing, wouldn't it be easier to try to obtain an SLA with only one access provider, an SLA including a redundant routing connection, with some redundancy protocol handled

    • Ooops, remove the last sentence...
      Wish we could edit at least our own comments.
    • What happens when your providers outbound gets trashed? What happens when your providers BGP peers decide to drop them, or someone cuts their fiberline to InterNAP?
      Multihoming to dual providers provides a level of redundancy not possible with a single provider - you get multiple backbone access points.
      And yes, it's expensive, but sometimes that cost is worth it (not everything is monetarily measurable - not every organization is a business).
      • OK, a last one.

        You really want to do multi-homing ? Yes, you sure ?

        Well, then get two different providers with two different sets of addresses, set up a router for each one of your feeds, and set up your servers to have two different IP addresses per machine. Of course, to be fully redundant, you have to register two different DNS domains (one in .com and one in .org will definitely look k00wl).

        What? It does not work? Asymmetric routing? That's for weenies.

        Be really redundant, duplicate everything (but your boss: this one should be replaceable).

  • Fake it with DNS? (Score:2, Interesting)

    by zcat_NZ ( 267672 )
    Set up your servers with a different IP for each route. Set up DNS inside your network so that the DNS server on one interface returns IP addresses that go through that interface, and vice-versa.. with a short expiry time.

    If the main link goes down, so does the primary nameserver. The secondary nameserver (on the backup link) then returns IP's that are routed through the backup link.

    This should work, but it probably goes against several RFC's..
    • If you're doing this, keep in mind that -- to the outside world -- both your nameservers are equal. You'll get approximately half the clients querying your secondary nameserver, meaning that approximately half the traffic will hit that second link.

      Of course, if they have similar amounts of bandwidth, etc., you might actually consider this to be a good thing :-).

      It's still not going to work great for fail-over, though (thanks to things like DNS caches). As distasteful as it might be, the most cost-effective way for you to approach this is probably going to be with an SLA (service-level agreement) with your ISP.

    • This should work, but it probably goes against several RFC's..

      hehe.. That's my favourite quote. The tasks I'm given at work require that I say that every other day. I thought I was the only one. ;)
    • Re:Fake it with DNS? (Score:3, Informative)

      by aminorex ( 141494 )
      I think that if DNS is the best you can do, you should round-robin the IPs from each link on both servers. Only drop the IPs from link1 if link1 goes down. Then even if there is some dead cache on the network, at least your clients can reach the server by trying again.
      • Trying again won't work. The client's DNS server will still cache your old IP info. It won't ask again until the cache times out. You can try and put in a very low cache timeout, but not everyone listens to your cache timeout. I know that from experience.
  • by GoRK ( 10018 ) on Thursday February 20, 2003 @05:15PM (#5347145) Homepage Journal
    You're going to have to do your own redundant routing between you and a network that is properly multi-homed with BGP out to the larger internet to make this work the way I think you really want it to.

    First, find an upstream ISP that is multi-homed to your satisfaction. Buy some IP's from them and put in a router or two for redundancy.

    Next, build two or more tunnels to the ISP over different circuits or providers and run your own small BGP network on private IPs between the router at your multihomed ISP and the routers on either end of your connection. Assign the IPs that belong on the multihomed network locally and let your own routers run BGP (or OSPF, or whatever else you want to use instead, like load balancing) between your LAN and the multihomed network.

    It's hackish. It will be fairly expensive. It will also, however, work, let you keep your servers on-site, and give you greater control over redundancy and failover than you'd get with two upstream providers allowing you to use BGP anyway.

    In the end, it might work out to be cheaper in the long run, since you won't have to pay any upstream ISP for letting you do BGP. You'll just have to pay for colocation somewhere, which could be a lot cheaper.

    ~GoRK
    • Next, build two or more tunnels to the ISP over different circuits or providers and run your own small BGP network on private IPs between the router at your multihomed ISP and the routers on either end of your connection.
      Yes, I suppose this would actually work. In fact, it's pretty much what I'm doing for my own network :-). You might not want to use BGP, though; it isn't really designed to run across several hops like that, and will probably flap more than you'd like.

      OSPF will do the job just fine, and provide reasonably quick convergence too.

      • Since the "several hops" are actually just a tunnel, BGP would still run fine, but the other alternatives such as OSPF as I mentioned would be 1) simpler to set up and 2) more "in tune" with the actual underlying routes and 3) way easier to load balance .. if you're running two pipes, you might as well have them both carrying traffic if they can.

        ~GoRK
  • by FreeLinux ( 555387 ) on Thursday February 20, 2003 @05:16PM (#5347152)
    If the use of BGP is out of the question, there seems to be only one alternative. However, this solution still leaves the ISP as a single point of failure.

    The option is Virtual Router Redundancy Protocol (VRRP). A brief description of VRRP, including a diagram, can be found here [cisco.com]. Keep in mind that there are numerous other manufacturers that support the VRRP standard, you don't *have* to go with Cisco. Also, remember that with VRRP there is still a single point of failure, the ISP. This means that your ISP had better be a good one.
    • As far as the ISP goes just make sure that they have the following.

      1) At least 2 independent fiber connections that use geographically different routes to get to the building.

      2) Dual redundant UPS power.

      3) At least 1 generator.

      4) A/C should run on UPS or Generator power.

      5) A fully redundant network(mesh topology?).

      After you find an ISP with ALL of these, you need to get a connection on each of the incoming fiber connections. As far as your end of things, 2 routers would be good, but you can pick up something like an OLD and inexpensive Cisco 75xx or Bay BCN with dual serial or T-1 cards and dual CPUs, as well as at least dual power supplies.

      Reliable connections don't have to cost an arm and a leg, just a LOT of research and careful planning!

      Good luck!
  • by Richard_at_work ( 517087 ) on Thursday February 20, 2003 @05:16PM (#5347158)
    How about spending money to have a reverse proxy off site, in a colo somewhere, that handles which line to send it down. Clients connect to the advertised IP address for a site from DNS, which is the colo proxy/whatever, and then are either dealt with transparently, like a true proxy, or redirected to whichever line is up at the time.

    It's something I have intended to look into for work, as it would just be an extension of what we currently use for firewalling anyway: port 80 is redirected from the gateway to a machine behind the firewall. To carry out a port 80 redirect on two publicly available IPs is probably just as trivial; in fact, as I've been thinking this through, I have tried it with OpenBSD on both ends and Apache as the webserver, and a rdr on the outside box gets my webpage fine.
  • Multihome to a single ISP that has multiple redundant backbone connections and do iBGP with them. Their summarized aggregate routes will be multihomed on their backbone. You can then peer with them for your smaller subnet. I know it isn't as good as peering with two independent ISPs. Maybe you can connect to the same ISP at two different POPs to alleviate this somewhat.
  • While your providers are almost certainly supposed to be following address allocation policies from their address providers, you might be able to get them to issue you a /24 nonetheless. You can show a legitimate need for this amount of address space, even though you don't plan to use it for addressing machines--you want to multihome. Try to get your current provider to listen. If they don't, while shopping around for other providers (which you will need anyway if you want to multihome), make a /24 a requirement of your contract. Renumbering 16 hosts won't be fun, but it's not the worst thing in the world at all.
  • Just put dual NICs in all of the servers, and give them all an IP address on each network. If you advertise them equally, all should be well, even if a line fails, you still have a 50% chance of the web server getting hit with the good line.

    I know it's not strictly necessary to do the redundant hardware/network thing, but why not? It's only a few $20 network cards and a switch.

    If you have multiple MX records (one per IP), then you won't lose any email either.

    It would be nice to dual-home, but dual IP is a workable solution for the small business.

    --Mike--

    • Note that this is not necessary, as your firewall can translate multiple public addresses to one private address. Note that you want a firewall that maintains state info, such as Netfilter on Linux, Cisco PIX, etc.
  • http://www.rainfinity.com/products/rainconnect.html

    We evaluated their software, but since we have a /24 determined we had already implemented what they could offer.
  • Hello,

    There is no need to lie. Going multihomed is reason enough to request and obtain a /24 from one of your two providers, despite the fact that your network size only requires a /28. I have performed this exercise for companies of your size many times over, and trust me, any major network provider will give you a /24 if you are switching over to BGP and getting a second connection.

    The effect of imposing a /24-or-greater limit on BGP routes is that providers need to be more sensitive to the needs of companies who, when considering network size alone, can't justify a /24. Thus, going multi-homed is enough of a qualifier by itself to obtain a /24 from an upstream ISP.

    -James

  • I don't see anything noting what your T1 provider's policy/procedure is for failure of their service.

    Have you asked them?

    Based on the information provided, here are my suggestions:

    1) Are satellite connections doable for you? It can be expensive, but if your company needs a quick fix until the main T1 comes back up, this may work for you.

    2) Would it be possible to switch your company's failed T1 through a secondary route to the internet? Most T1 providers offer this.

    3) DSL. I know this may seem weird, but in the event of an outage of service, you need to provide the bare minimum of connectivity to your customers internally and externally.

    Dolemite
  • If something's worth doing, it's worth doing right. The main consideration from ARIN, as I recall, is a 70% allocation of existing address space. If you've got a /28, two addresses are used for network/broadcast, at least one on a router, and at least one on a firewall.

    You do have (a) firewall(s), don't you?

    At a minimum, that's already 25% of the existing address space in use, without including any servers you may have, or any NAT addressing you may be doing.

    Break your /28 up into smaller networks, maybe like this, and a larger range is easy to justify:

    /29 - 6 available host addresses
    /30 - 2 available host addresses
    /32 - single host (loopback)
    /32 - single host (loopback)

    This scheme leaves you just 5 usable addresses, assuming you've got two routers and a firewall. The available /30 could be used for NAT translations, or any number of other things.

    With 11/16 addresses used, you're at 68% of your address space. One more host puts you over the top.
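    A quick check of that arithmetic (the 5 leftover usable addresses are the ones counted above):

```python
# A /28 holds 16 addresses; the scheme above leaves 5 of them usable.
TOTAL = 2 ** (32 - 28)        # 16 addresses in a /28
usable_left = 5               # leftover per the scheme above
used = TOTAL - usable_left    # 11 addresses accounted for
pct = 100 * used / TOTAL
print(f"{used}/{TOTAL} = {pct:.2f}% utilized")  # 11/16 = 68.75% utilized
```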

    Use this type of addressing to get the better of the two ISPs you choose to delegate a routeable block to you.

    Obtain an ASN from ARIN. Inform the other ISP that you want to BGP peer with them; tell them whether you want full routes or summary routes only advertised to you, and give them your ASN and the IP addresses of the router(s) they'll be peered with. Give the first ISP the same info. Both ISPs should give you their ASNs and peering info as well. Here's the catch... make sure you don't turn yourself into a transit area between the two ISPs. Filter your BGP adverts so that you're only advertising routes originated in your AS.

    Simple, huh?

    Joe
  • Radware Linkproof [radware.com]
    Magic Radware box... check out the Linkproof or Linkproof LT depending on your needs. Dynamically add or drop providers seamlessly, use your existing IPs; it handles the failover through a dynamic DNS system onboard.
  • I have two ISP connections- both look like ethernet to me. One happens to be wireless and the other comes in over a telco circuit, but the handoff is ethernet.

    After much searching and testing I built my router using FreeBSD and IPFW. More on that further down.

    Each ISP has given me a block of addresses from their CIDR block. I multihome proxy servers and email servers for inbound and outbound connections. They have one interface with multiple IPs bound. Nothing special there. Their default route is my FreeBSD router.

    The FreeBSD router has multiple ethernet interfaces: one per ISP and one for my servers. The ISP-facing interfaces have /30 addresses for routing purposes, and "my" side has the /25 and /27 blocks they assigned me from their pools.

    The default route on the freebsd box is one of the providers.

    I use IPFW for egress routing. Packets on the OUT side of the interfaces facing the ISPs are checked for source addressing and either allowed through or pushed over to the proper interface. Works like a charm.

    interfaces:

    em0 aa.bb.dd.128/25 (my side)
    em0 xx.yy.zz.192/27 (also my side)
    em1 aa.bb.cc.220/30 ISP A
    em2 xx.yy.zz.188/30 ISP B

    the rules I use:

    ipfw -q add 100 fwd xx.yy.zz.189 ip from xx.yy.zz.192/27 to any out xmit em1

    ipfw -q add 201 fwd aa.bb.cc.221 ip from aa.bb.dd.128/25 to any out xmit em2

    There's also ipfilter in there handling filtering. IPFW only handles the egress routing.

    DNS fills the gaps. I return at least two A records for the hosts I publish.

    I used Linux for a short time in this router function but got bored with problematic network drivers.

    That Radware device and the one by F5 are doing the same thing, but for at least 5 figures. I looked at them and then opted for this cheaper solution. I bought duplicate router hardware and keep a cold spare.
  • Keep an eye on these: 2002_7 [arin.net] and 2002_3 [arin.net]
  • You're not going to get bulletproof 24x7x365 for cheap, so first get over that idea. What you do want is the most people able to use your services the most amount of time.

    This applies to web servers as that is my primary concern, but could be used for other types of services.

    Set up 4-10 front end web servers and do round robin DNS between them for your site. The 4-10 all need to be colocated on different parts of the Internet; different providers, all in data centers, is ideal. These sites can be load balanced/failover protected for more $$$ of course. These are your initial contact points for a client, and they have either your home page or a forward to your currently active site(s). This speeds up your ability to rapidly change from the live datacenter for your real site to your backup datacenter.

    You run a monitoring / management program that's constantly communicating to these servers which is the current live site. (This part of the solution must be VERY secure, otherwise you're going to introduce downtime.) Very secure and also automated fast, with a safe default to fall back on, like a local static page.

    Now if your main site gets hit by a 500MB/sec tcp/udp flood on random ports from randomly spoofed ip addresses (like a certain site I manage did 2 months ago), you simply move the current live site to backup site 2, or 3, or 4. If the attack is directed at your DNS, then it has 4-10 times the total networks to flood and should receive substantially more attention than were it to take down only 1. If you have 10 front end sites up and one gets taken down, then 10% of the people asking for DNS are going to get the wrong site, at least until you update your DNS.

    Anyway, this is a proposal to a problem, made in hindsight, but it's one we're working on implementing.

    best of luck to you! If there is some poor man's BGP out there I'd love to hear about it.

    in re-reading this I realize it doesn't seem directly applicable to the needs (cost, etc) but if you were to use 1-2U per colocation facility and less than 1MB/month each you could pay in the few hundreds per front end site, and keep your local facility....

    I run Linux on my HP Laptop [submarinefund.com]

  • by JWSmythe ( 446288 ) <jwsmythe@nospam.jwsmythe.com> on Friday February 21, 2003 @01:31AM (#5350221) Homepage Journal
    I know this method will get flamed by quite a few people, but it works very well.

    We want 0 downtime. There's no way to guarantee that any equipment is without failure. Something can/will always break. That's something you have to accept.

    voyeurweb.com is located in colo facilities in both New York and Tampa. Each facility has its own network drop. The size doesn't matter, but for reference, it's 1000base fiber in each location.

    We have at least 5 machines in each location. Each machine has its own IP, and in some cases multiple IPs just to increase its load (faster machines can handle heavier loads).

    You put multiple A records in your DNS. When a customer browses to your site, they get any one of the IP's randomly. Here's what an 'nslookup' returns for voyeurweb.com

    > nslookup voyeurweb.com | grep Address
    Address: 63.208.2.23
    Address: 63.208.2.25
    Address: 63.208.2.62
    Address: 63.208.2.64
    Address: 63.208.2.84
    Address: 63.208.2.97
    Address: 209.247.59.14
    Address: 209.247.59.15
    Address: 209.247.59.16
    Address: 209.247.59.17
    Address: 209.247.59.84
    Address: 209.247.59.85
    Address: 209.247.59.86
    Address: 209.247.59.87

    The 209.247.59 IPs are in New York. The 63.208.2 IPs are in Tampa. We're favoring the New York network a little bit, because we have some other specialized sites running in Tampa, and want an equal load between the cities. Right now, we're pulling about 450Mb/s per city at peak time. Just half of our 1000Mb/s drop. We've just added 1Gb/s fiber in Los Angeles, to increase our redundancy. How redundant you make yourself is really up to how much the bosses want to spend, and how safe you want your site. Like I said, we want 0 downtime, and we achieve it.

    The nice part is, if a machine fails, the client hangs for a few seconds, and then goes off to the next IP. If all the IP's in a city fail, the client can potentially hang for up to 30 seconds, before going to a server that works. Your browser will continue to use any IP that works.
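    The fallback behavior being relied on here can be sketched roughly like this (the connect step is injected as a function so the walk itself is clear; real browsers do the same thing internally with the list of A records):

```python
def first_reachable(addresses, try_connect):
    """Walk the resolved A records in order and return the first
    address that accepts a connection, or None if all are down."""
    for addr in addresses:
        if try_connect(addr):
            return addr
    return None

# Stand-in connect function: the first two "servers" are down,
# so the client ends up on the third address in the list.
down = {"63.208.2.23", "63.208.2.25"}
pick = first_reachable(
    ["63.208.2.23", "63.208.2.25", "209.247.59.14"],
    lambda addr: addr not in down)
print(pick)  # 209.247.59.14
```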

    We use a relatively short ttl in our DNS records, so if we decide to shut down all the servers in a city for any reason, within an hour all the traffic stops to them. I've done this many many times now, I'm 100% sure it works.

    If we have a known problem (say a server has a hardware failure), you take it out of your DNS, and within an hour, no one is even trying to hit it.

    We've done this with pairs of machines, or in the case of voyeurweb.com, up to 25 machines.

    It's so simple it shouldn't work. I've been told by quite a few people that it won't work, but then I'll prove to them that it does..

    Before we did this (before I was admin), if a machine failed, thousands of viewers would write in complaining immediately. Now, I can take a few machines down for maintenance, and no one notices. If we have a mystery crash at 4am, it's not fatal; we fix it when we can.

    If you're a viewer, you probably didn't notice that we shut down all of Tampa for voyeurweb.com, for a week because of provider problems (lack of available bandwidth). You probably didn't even notice when we swapped out all the New York servers.. We were polite with some of them, but got bored with it, and just started yanking cables after a while.. We didn't receive a single Email asking why the sites were down. When a server went down, we did notice an increased load across the rest of them. It's nice having a *LOT* of servers running. If you have 10, you only get a 10% load increase across all the other servers when one goes down. :)

    We don't depend on BGP. We don't depend on expensive load balancing equipment. We don't depend on anything other than the fact that people use browsers, and they resolve IP's through DNS.

    In your case, you should have two ISP's providing you bandwidth. Each ISP should issue you a block of IP's from their available pool (like, it's hard to be on the Internet without it).. I'd say, if you want a site that stays up, set up a pair of mirrored servers. Give the first one an IP and gateway of the first ISP, and the second one an IP and gateway of the second ISP.. I could name off over a dozen sites that do this now, but I won't. :)

    If you want to get real fancy, get one machine, put two IP's on it (one from each provider), and have a script monitor each gateway. If one fails, switch to the other. But this doesn't do anything for redundancy if one server should fail.
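    That gateway-monitoring script might be sketched like this (Python; the gateway addresses are placeholders, and the actual route change is left as a comment since the command varies by platform):

```python
import subprocess

GATEWAYS = ["203.0.113.1", "198.51.100.1"]  # hypothetical ISP gateways

def reachable(gateway):
    """One ping with a short timeout; False if ping fails or is absent."""
    try:
        return subprocess.call(
            ["ping", "-c", "1", "-W", "2", gateway],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL) == 0
    except OSError:
        return False

def pick_gateway(gateways, is_up):
    """Prefer the first (primary) gateway; fall back down the list."""
    for gw in gateways:
        if is_up(gw):
            return gw
    return None

def switch_default_route(gateway):
    # Platform-specific, e.g.:
    # subprocess.call(["route", "change", "default", gateway])
    pass
```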

    I hope I've explained this well. I've never seen it well documented anywhere. It's in the BIND documentation somewhere, but they have a real convoluted method of CNAME's and A records, which other documentation says are completely against some of the RFC's, so you shouldn't do it.

    Even our evil nemesis does it...

    >nslookup microsoft.com | grep Address | sort
    Address: 207.46.134.155
    Address: 207.46.134.190
    Address: 207.46.134.222
    Address: 207.46.249.190
    Address: 207.46.249.222
    Address: 207.46.249.27

    Two different networks, splitting the load. :)

    If CNN does it, it must be good.

    > nslookup cnn.com | grep Address | sort
    Address: 64.236.16.116
    Address: 64.236.16.20
    Address: 64.236.16.52
    Address: 64.236.16.84
    Address: 64.236.24.12
    Address: 64.236.24.20
    Address: 64.236.24.28
    Address: 64.236.24.4

    BTW, yes we get constant DoS attacks against us.. Sometimes I entertain myself by watching the logs. :) But, it's pretty hard to take down 10 servers that can each push out 150Mb/s (dual NIC's bound together with teql).

    To the script kiddie that "took down" one of our machines the other night.. Ummm, you didn't. I was annoyed at seeing the logs, and dropped all traffic from your network. :)
    • One problem I have run into with the "DNS way" is that if a home user opens their browser and resolves the name to an IP address, it tends to cache that until they close the browser, whatever the TTL.

      At the time I only tested with IE but Mozilla may do the same thing. I don't know.


      • It's possible, but when we change our DNS, users stop hitting it after an hour.. It could be that the users open and close their browser frequently, or the newer browsers are better behaved.
      • Well, he's talking about DNS returning multiple IP addresses for each name lookup. Not round-robin returning a single different IP for each name lookup.

        Probably would work for intelligent browsers.

        Works for most apps and users - since most people have been trained to "reload/retry/reboot"...

        It works for some telnet clients - they try the alternate IPs.
    • It's simply round-robin DNS to a bunch of web servers. It's a reasonable technique for your goals, but it's got nothing to do with the original question, which was how to keep his company online when one ISP/connection/link fails, and to have both outgoing and incoming traffic of all kinds find the right place to go.


      • Actually, it should work fine for him..

        With two servers, and IP's on both networks, even if they're both on his same local network, he'll remain up, regardless if one of the lines dies..

        The only concern that I could immediately see, would be more complicated stuff, like database work. But, that's easy enough, since he can tell each machine that both networks are local, so he doesn't go outside of his network for the connection.

  • We have two broadband links to our small downtown office. Each of these links terminates at an OpenBSD firewall. We briefly looked at what kinds of fail-over models being multihomed in the DHCP ghetto makes available to us (actually our ADSL service provides two static IP addresses, but they still grant you those addresses via DHCP leases).

    We can do some cool things with Zebra (OSPF) over our VPN (all our offices / developers have their networks in different portions of the 10.x space). If segments of our VPN fail, the traffic would find other segments forming a viable route to the final destination. Within the VPN we can pretend that we have a real network.

    Fail-over to the outside world is more difficult.

    One issue which no-one has mentioned here is stateful firewalling. Even if we can broadcast route information to the two OpenBSD gateway machines, it will break stateful firewalling completely. That said, most of our traffic is not session oriented. A broken HTTP request will fail-over at the application layer. No biggy.

    The primary form of session oriented traffic we use is OpenSSH tunneling. The simple solution here is to remove stateful firewalling on the SSH port. I have to doubt that stateful firewalling your SSH ports accomplishes much.

    For fail-over of our public web servers we decided against nasty DNS schemes. You don't get load balancing (just load sharing, which is not the same thing), there are issues about DNS record caching, your control is poor, when problems arise your ability to investigate the problems is all in a muddle. We don't tolerate that kind of thing here.

    Instead we are directing all public URLs to a highly reliable and well connected public hosting company. That host is returning a frameset document containing a single frame window, with the embedded frame directed at one or the other of our two broadband services. Our firewalls are sending pings to the public hosting company so that the PHP script there can decide which of the two URLs to send out on each request.
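    The choice that PHP script makes can be sketched like this (the names and the 30-second staleness window are assumptions; the firewalls' pings update last_seen, and each request is steered to whichever link reported most recently):

```python
import time

STALE_AFTER = 30  # seconds without a ping before a link is considered dead

def pick_url(last_seen, urls, now=None):
    """Return the URL of the most recently heard-from link,
    or None when every link's heartbeat has gone stale."""
    now = time.time() if now is None else now
    fresh = {link: t for link, t in last_seen.items()
             if now - t <= STALE_AFTER}
    if not fresh:
        return None
    return urls[max(fresh, key=fresh.get)]

# Example: link "a" pinged 10s ago, link "b" 20s ago -> serve link "a".
urls = {"a": "https://link-a.example/", "b": "https://link-b.example/"}
print(pick_url({"a": 100, "b": 90}, urls, now=110))
```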

    There are plenty of extra complications since we are also thunking requests into different SSL compartments and we are using the OpenBSD firewalls to run chroot Apache in proxy mode (which allows them to filter the URLs before they reach the internal content servers).

    With the right magic additions to the frameset code we can make the public URL track navigation through our content without revealing the ugliness of our internal fault-tolerant, SSL compartmentalized URLs.

    Framesets suck, but not nearly as much as borking the DNS standard.

    We are fortunate that our client base is narrow and we can impose client browser requirements. Our traffic volumes are low enough that we can afford the overhead of trampoline packets on each URL access.

    We achieved a fairly low standard of fail-over in the end, but the good thing is that we didn't have to involve anyone else, and the fail-over we do have is sharply defined and easy to diagnose.
  • by ikekrull ( 59661 ) on Friday February 21, 2003 @08:26AM (#5351478) Homepage
    The internet is not about fault-tolerance and ability to 'survive nuclear attack' for anybody who isn't an ISP, large corporation or government department.

    That idea went out the window with the introduction of CIDR.

    There were good reasons for this, primarily the unmanageable growth of route-tables.

    IPV6 will never see the light of day because if IPV4 can't be routed economically out to the edge of the network, then increasing the address space by a large factor will not help matters.

    There is no way around this except by removing CIDR for a decent proportion of the internet, but route-tables will of course balloon hugely.

    So, while 'the people' want to be able to multihome etc. 'the backbones' don't want to have to scale up their routing capacity by a large factor.

    Which all boils down to the conclusion that the powers-that-be on the internet have decreed 'thou shalt not multihome unless thy pockets are extremely deep and thou hast at least a /16'

    This is unfortunately the way it is, and won't change under the current 'internet regime'.

  • What usually goes down on a T1? Normally it's a router module, the smartjack (most often, it seems!), or something at the CO. *ALL* of these can be easily remedied with a second T1 with load balancing and failover. Nothing fancy. We do it here with CEF and it works great. Just run each T1 to a separate router module.

    What would BGP buy you? Well, if you ran your second T1 to another CO and to another ISP POP it would let you survive an entire CO or POP outage. But, how often do those happen with a good ISP? Almost never. How much would it cost to run that T1 to another CO? A ton.

    I've already been down this road and found out that a dual T1 setup with something like CEF takes care of the job, unless you want to spend a fortune.
    • If you do BGP you'll need a bigger router. A small router won't do the job. When I was looking BellSouth required at least a 3640 w/ 128MB as the minimum for any BGP customer.
