
Scaling Server Setup for Sharp Traffic Growth?

Posted by Cliff
from the a-system-that-won't-fall-over-with-massive-use dept.
Ronin asks: "We are a young startup developing yet another collaborative platform for academic users. Our platform (a) requires users to log-on to the website for extended period of time, and (b) is content intensive - stuff like courses, quizzes and assignments gets posted regularly. We're using a LAMP setup on a 1 GB P4 server. Our user base is small (about 1,200 users, 5-7% connected at any given time) but we expect it to grow rapidly. We expect sharp traffic growth, and are working to scale our server software & hardware setup linearly. What kind of server setup plan should we go for, keeping in mind our content-heavy application and that we may have to scale up rapidly? Can anyone share his/her experience with LAMP in dealing with scalability of high-traffic sites? Taking clues from the Wikimedia servers, we understand that the final configuration involves proxy caching for content, database master/slave servers and NFS servers. We of course don't have such high traffic, but it will be interesting to note what kind of server config you'd go for."
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward
    Why don't you post a link to your site so we can make more informed recommendations?

    heh heh heh

  • Coral cache? (Score:3, Informative)

    by _LORAX_ (4790) on Tuesday December 20, 2005 @06:30PM (#14304394) Homepage
    They even have documentation for how to set up your server to offload "heavy" items to the coral cache without impacting your control of the site.

    http://wiki.coralcdn.org/wiki.php/Main/Servers [coralcdn.org]
  • YAGNI (Score:2, Informative)

    by GigsVT (208848) *
    Stay ahead of the load, but don't build in capacity you don't need.

    CPUs are especially expensive right now for very little gain on the high-end.

    Make sure to use a PHP accelerator if anything is CPU bound. (assuming you are using the real LAMP and not one of those redefinitions of the P that anti-PHP revisionists like to use).

    The next step is caching of static content. This can be as simple as making image links point to a different server, or you could take the next step with an inwardly facing proxy.

    But re
  • by SmallFurryCreature (593017) on Tuesday December 20, 2005 @07:08PM (#14304741) Journal
    When your current server is growing too small but you are not yet ready or willing to go to the big boys and simply buy a solution.

    The easiest way is to split your server. A typical website has 4 pieces which can be easily separated: scripts (if you use LAMP, Perl or PHP), images/static content, the database and finally logging.

    If your site is "normal" you can get some very good results by splitting these tasks up across different servers. Each task really asks for a different hardware setup.

    Scripts tend to be small but require a fair chunk of crunching power. There is little point in SCSI as the scripts could just remain in memory without ever needing to swap. Depending on your scripts you don't need gigs of memory either. What you could really use is a multi-core machine. Server-side scripting practically begs for multi-core. Why process one page request when you can do 2 or 4? It may be me but I had better results with dual P3 than single P4.

    Images are almost the opposite. Depending of course on your site they could easily come to several gigs and, worse, constantly change. If you cannot fit your content in memory you had better have a fast HD. SCSI still is the best for this. CPU power on the other hand is less important. What you may want to look into is whether your hardware is optimized. I believe that Linux has some support for more direct throughput (reducing the number of times the image is shuffled around before going out of your network card). Here too I have had really good experiences preferring multi-core over raw gigahertz power.

    Database is in a class of its own. With certain databases there isn't even a benefit to having multi-core, it seems, perhaps due to whole-table locking. The main advantage you can get by separating it from the rest is that it means your Apache server can concentrate on one task. Also, removing the outside connection on your database is a nice bit of extra security. The database server really depends on the way your site is set up. For a typical page request I usually assume the following: 1 script request, a dozen image requests, 3 database queries (verification, retrieval, update).
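
    The per-page mix above can be turned into a rough load estimate for each of the split servers. This is only a back-of-envelope sketch: the 1/12/3 mix comes from the comment, while the function name and the page-view rate are assumptions made up for illustration.

```python
# Requests generated by one page view, per the rule of thumb in the comment above.
PER_PAGE = {"script": 1, "image": 12, "db_query": 3}


def tier_load(page_views_per_second: float) -> dict:
    """Return the requests per second each tier sees for a given page-view rate."""
    return {tier: count * page_views_per_second for tier, count in PER_PAGE.items()}


if __name__ == "__main__":
    # Assumed rate for illustration only; measure your own traffic instead.
    for tier, rate in tier_load(5).items():
        print(f"{tier}: {rate:.0f} requests/s")
```

    Even a crude estimate like this shows why the image and database tiers tend to need attention before the script tier does.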

    Logging is often overlooked but it takes up a serious amount of resources. Not logging is an option of sorts but I don't like it. Switching it to a machine dedicated to the task can seriously speed up your other servers AND provide a level of extra security. A logging server can be very lightweight and just needs a decent HD setup.

    Anyway, that is the amateur's way to save a website creaking at the seams when there is no money to get a pro solution. It is a hassle as you now have four machines to admin, but it is easy to set up and usually does not require a major redesign.

    Load balancing and stuff sounds nice but most customers get such odd reactions when they hear the prices charged.

    • I agree with much of this post. Once split up, you can also easily see where you need to grow. For one of my growing sites that was very image and database dependent, I used 1 primary machine, 2 database machines and 6 image/static content machines. The image servers were all just mirrors of the main site, and put into round robin DNS, which provides pretty good, but not perfect, load balancing. I didn't have a logging machine, but I did have another machine for backups and other things like email. Pe
    • With certain databases there isn't even a benefit to having multicore it seems perhaps due to whole table locking

      What database worthy of high traffic uses full table locking? MySQL doesn't and the big players (Oracle, SQL Server, Sybase) don't either. I don't think you want to power your high traffic web site with Access.
    • by sootman (158191) on Wednesday December 21, 2005 @12:55AM (#14306797) Homepage Journal
      These are all very good tips. There are also several things you can do with just one box:

      - PHP has lots of caching options available and other things that can boost performance. Learn them. One good overview is in the powerpoint slideshow here. [preinheimer.com] Just like you can't put a heavy building on a weak foundation, it's very hard to speed up an app that's badly written in the first place.

      - SQL can be badly misused. Make sure that your page uses as few queries as possible and that those queries are as good as possible. Don't use PHP for things that SQL does very well--joins, filtering, etc. Your goal should be for every database query to return as much information as you need to build the page and not an ounce more.

      - you can take a half-step towards multiple boxes by running multiple servers on one box. Apache is great but it's overkill for static work like serving images--look at tux, boa, lighttpd, thttpd, etc. for those duties. For example, serve the app from www.example.com on Apache and the images from images.example.com via Boa. Or, have Apache on :80 and serve images via Boa on :8080.

      - the last thing to do before splitting up to multiple servers is to get one better box. from the box you describe, you might realize a 200-300% improvement with a fast dual-CPU box with 2-4 GB RAM and either a) RAID or b) different disks for different tasks--logs (writes) on one, images (reads) on another, etc.

      - be scientific. measure, make one change, and measure again.

      - many things can be quickly tested before being fully implemented. turn off logging and see if performance improves. if it doesn't, then there's no reason to go through the trouble of making /var/log/ an NFS-mounted share. visit the site using a browser with images turned off to see how much faster it is when images aren't being asked for.

      - on a related note, determine where the bottlenecks are before optimizing. There's no reason to split image-serving duties if the only image you have is your logo and a couple nav buttons.

      - if possible, when you're done, do a writeup and submit it to slashdot. I always say "the best way to be successful is to find someone who has done what you want to do and copy them" and your experiences might help the next person who's in the same boat you're in now.

      - talk to people who have experience building fast servers. there's lots of stuff to know. for just one example, I've often heard that PIIIs and PIII Xeons are better than P4s for almost all server duties. there are religious wars in server land as well--SCSI vs. ATA, etc.--but talk to a few people and patterns will emerge.
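
      To make the SQL tip in the list above concrete, here is a small sketch that builds the same result two ways: one query per course, and a single JOIN. It uses Python's sqlite3 module as a stand-in for MySQL, and the table and column names are made up for illustration.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with hypothetical course data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE courses (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE assignments (id INTEGER PRIMARY KEY, course_id INTEGER, name TEXT);
        INSERT INTO courses VALUES (1, 'Algebra'), (2, 'Biology');
        INSERT INTO assignments VALUES (1, 1, 'Quiz 1'), (2, 1, 'Quiz 2'), (3, 2, 'Lab 1');
        """
    )
    return conn


def assignments_query_per_course(conn):
    """Anti-pattern: one query for the courses, then one more query per course."""
    rows = []
    courses = conn.execute("SELECT id, title FROM courses ORDER BY id").fetchall()
    for course_id, title in courses:
        names = conn.execute(
            "SELECT name FROM assignments WHERE course_id = ? ORDER BY id",
            (course_id,),
        ).fetchall()
        rows.extend((title, name) for (name,) in names)
    return rows


def assignments_joined(conn):
    """Preferred: let the database do the join and return only the needed columns."""
    return conn.execute(
        "SELECT c.title, a.name FROM courses c "
        "JOIN assignments a ON a.course_id = c.id "
        "ORDER BY c.id, a.id"
    ).fetchall()
```

      Both functions return the same rows; the second does it in one round trip regardless of how many courses exist.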
  • Usually the thing that kills you is the database. Make sure you have a well designed schema, every bit of RAM that you can cram on the db server machine, a SCSI RAID array configured as 1+0, a fast multiport ethernet card connected directly to the web host and make damn sure you have a well implemented connection pool used properly by your application.

    Also make sure the server code is well designed, i.e. no select * stuff, just get what you actually need.
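
    As a rough illustration of the two points above (a connection pool, and selecting only the columns you need), here is a minimal sketch. It is not a production pool: it uses sqlite3 with a temporary file as a stand-in for a real database server, and every name in it is hypothetical.

```python
import os
import queue
import sqlite3
import tempfile
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: connections are opened once and reused by callers."""

    def __init__(self, database: str, size: int):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False lets a pooled connection move between threads.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self, timeout: float = 5.0):
        conn = self._idle.get(timeout=timeout)  # blocks while the pool is exhausted
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always hand the connection back


def fetch_user_name(pool: ConnectionPool, user_id: int):
    """Fetch only the column the page needs, not SELECT *."""
    with pool.connection() as conn:
        row = conn.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
    return row[0] if row else None


def make_demo_pool() -> ConnectionPool:
    """Build a pool over a throwaway database file with one sample user."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    pool = ConnectionPool(path, size=2)
    with pool.connection() as conn:
        conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, bio TEXT)")
        conn.execute("INSERT INTO users VALUES (1, 'ronin', 'a long biography')")
        conn.commit()
    return pool
```

    The context manager is what makes the pool "used properly": a connection goes back to the pool even when the query raises.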
  • by aminorex (141494) on Tuesday December 20, 2005 @09:19PM (#14305663) Homepage Journal
    Cheap scalability means load balancing over commodity components, which you can add quickly to a set for linear scaling. The first challenge is where the client traffic comes in the door. If you can't get them in, you can't serve them. When you add commodity components, you reduce MTBF, so your configuration needs to do dynamic failover and rebalancing.

    The best way I know to scale your front door is to start with two netfilter firewalls sharing a MAC, and getting load balance by MAC layer filtering rules. It's pretty easy to plug in additional firewall transit capacity and to script-in failover using a heartbeat daemon. You can do firewalls in failover pairs more quickly and easily than you can do odd-numbered rings, but both are quite doable by relatively straightforward scripting and configuration.

    I strongly recommend against breaking your traffic into categories, like static pages, etc., and balancing load by moving different categories to different servers. If you do that, you end up with way too much hardware underloaded, and way too much hardware overloaded, and either no failover provisioning, or else a very complex failover configuration. Instead, make the individual servers identical, and cheap. Just add more clones to the pack as needed, and keep the traffic balanced.
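
    The "identical clones" idea can be sketched as a round-robin picker that skips nodes marked down. This is a toy model of the balancing logic only, not the MAC-layer netfilter setup described above; the class and node names are invented for illustration.

```python
class RoundRobinBalancer:
    """Rotate requests over identical nodes, skipping any that are marked down."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.down = set()
        self._next = 0

    def add(self, node):
        """Add another clone to the pack."""
        self.nodes.append(node)

    def mark_down(self, node):
        """Record a failed health check (for example, a missed heartbeat)."""
        self.down.add(node)

    def mark_up(self, node):
        self.down.discard(node)

    def pick(self):
        """Return the next healthy node, or raise if none are left."""
        for _ in range(len(self.nodes)):
            node = self.nodes[self._next % len(self.nodes)]
            self._next += 1
            if node not in self.down:
                return node
        raise RuntimeError("no healthy nodes available")
```

    Because every node can serve every request, failover is just a matter of leaving a node out of the rotation.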

    By this time you're starting to see my basic approach to scalable commodity 'nix clusters. See this lame ASCII art [southoftheclouds.net] for detail. It amounts to a series of independently scalable layers: firewalling, app serving, db caching, db serving.

    The memcached layer is indicated if you have a lot of read-only db traffic. These nodes are cheap, don't even really need hard drives. You could boot them off of CD or off the network, diskless. They hold as much RAM as possible. The number of MC servers required depends strongly on how much RAM each can hold but the amount of RAM required per DB node depends on the characteristics of your application DB traffic.

    I'd rather install a memcached server and keep a hot DB spare than try to maintain transparent failover on a DB cluster. Coherence requirements complicate the performance curves when you have multiple DBs accepting write operations, which can lead to unpleasant surprises. Delay scaling your DB cluster as long as you can.
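
    The caching layer described above is usually used in a "check the cache, fall back to the database, then fill the cache" pattern. The sketch below shows that pattern with an in-process dictionary standing in for memcached and another for the database; it does not use a real memcached client, and all class and key names are hypothetical.

```python
import time


class TinyCache:
    """In-process stand-in for a memcached client (illustration only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired entries behave like misses
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, time.monotonic() + ttl_seconds)

    def delete(self, key):
        self._data.pop(key, None)


class CourseStore:
    """Reads go through the cache; writes go to the 'database' and invalidate."""

    def __init__(self, cache, rows):
        self.cache = cache
        self.rows = dict(rows)  # stands in for the database
        self.db_reads = 0

    def get_title(self, course_id):
        key = f"course:{course_id}"
        cached = self.cache.get(key)
        if cached is not None:
            return cached
        self.db_reads += 1  # cache miss: hit the database once
        title = self.rows[course_id]
        self.cache.set(key, title, ttl_seconds=60)
        return title

    def update_title(self, course_id, title):
        self.rows[course_id] = title
        self.cache.delete(f"course:{course_id}")  # avoid serving stale data
```

    Read-mostly traffic then touches the database only on a miss or after a write, which is the case where this layer pays off.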
    • by aminorex (141494) on Tuesday December 20, 2005 @10:06PM (#14305983) Homepage Journal
      I should mention that if you didn't code-in memcached, you probably don't want to retrofit it, just for performance tuning or capacity scaling. In that case, I should suggest C-JDBC [objectweb.org]. You don't need to use a Java AS node in order to use C-JDBC, either.

      I haven't made a production deployment of C-JDBC, so I defer to the experience of others, but from my research, it looks like a hot ticket for scaling DB performance while simultaneously isolating you from the specificities of a given DB product.
    • Some good comments there :)

      I can't see your diagram, but I'd certainly echo the use of Danga's memcached [danga.com]. I use it on my site, and found that it saves a lot of database access via the caching.

      There's a brief introduction to memcached with perl [debian-adm...ration.org] I wrote to explain it for newcomers, but bindings are available for PHP, and many many other languages.

      Secondly I'd look at cheap clustering with pound [debian-adm...ration.org]. This is much better than using Round Robin DNS as another poster mentioned, since it avoids clients getting sent

  • by plsuh (129598) <plsuhNO@SPAMgoodeast.com> on Tuesday December 20, 2005 @09:58PM (#14305927) Homepage
    Geez, first thing to do is profile the application, under expected heavy usage patterns. This can be a bunch of looping scripts running wget or the like, or a bunch of testers (never underestimate the cost-effectiveness of a bunch of student volunteers on a weekend day - they likely will work for donuts and juice), or a commercial load test tool.

    See how hot and heavy things can get before something chokes. Then you'll know whether your application is compute-bound, memory-bound, disk I/O-bound, or what. Also, whether it's Apache itself or the MySQL database that's getting hung up.
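
    A "bunch of looping scripts" can be as small as the sketch below, which fires concurrent GET requests and reports latencies. To keep it self-contained it starts a trivial local HTTP server as the target; against a real site you would pass your own URL to run_load_test. The handler, function names and request counts are all assumptions for illustration.

```python
import threading
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Ignore any proxy settings so local requests go straight to the server.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class DemoHandler(BaseHTTPRequestHandler):
    """Trivial target used only so the example runs without a real site."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the console quiet during the test


def timed_get(url):
    """Fetch one URL and return (HTTP status, seconds taken)."""
    start = time.perf_counter()
    with OPENER.open(url, timeout=10) as response:
        response.read()
        status = response.status
    return status, time.perf_counter() - start


def run_load_test(url, requests=40, concurrency=5):
    """Issue `requests` GETs with `concurrency` workers and summarize latency."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_get, [url] * requests))
    latencies = sorted(seconds for _, seconds in results)
    return {
        "ok": sum(1 for status, _ in results if status == 200),
        "median_s": latencies[len(latencies) // 2],
        "max_s": latencies[-1],
    }


def demo():
    """Run the load test against a throwaway local server."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return run_load_test(f"http://127.0.0.1:{server.server_port}/")
    finally:
        server.shutdown()
        server.server_close()
```

    Raising the concurrency step by step while watching CPU, memory and disk on the server is one way to find which resource gives out first.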

    Also, look at your current usage logs. You say that your site, "requires users to log-on to the website for extended period of time" and also that it has "about 1,200 users, 5-7% connected at any given time". Are there usage patterns or spikes that you need to worry about? Is there a morning login activity spike? Is there a lunch spike or a leaving-for-the-evening spike? How high are they relative to the general background and to each other? What about popular pages? Are there three or four pages that could be statically generated on a periodic basis to relieve a big chunk of the load? How much of the site can realistically be cached across all users, vs. across a user, vs. must be generated afresh with each request? During the long logged-on period, are users actively doing things the whole time, or are they doing a "click here, three clicks half an hour later, another click ten minutes after that" kind of sporadic activity pattern?

    Once you know where the bottlenecks are and the likely usage patterns, then you can apply the optimizations that other folks have spoken about. I've deployed a number of large-scale WebObjects systems, and one thing I can assure you is that your initial impression of what's important to users and what really is going to cause a load is wrong. Users will find new ways to work and a seemingly innocuous routine may end up being called thousands of times.

    --Paul

    PS - don't forget code optimization. At least half of the slowdowns that I have found in deploying web apps can be classified as bone-headed programmer issues. E.g. - inserting nodes into a linked list one by one, when I should have known that the inserts would come in groups that needed to go into the same place. I should have gathered them up (and later did) and done a single insert instead of repeatedly traversing the linked list. :-P
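
    The linked-list example in the PS can be sketched as follows. This is not the original WebObjects code, just a guess at the shape of the problem: a sorted singly linked list with a one-at-a-time insert and a batched insert, plus a step counter so the two can be compared.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


class SortedList:
    """Sorted singly linked list that counts how many nodes it walks past."""

    def __init__(self):
        self.head = None
        self.steps = 0

    def _link(self, prev, node):
        """Place `node` after `prev`, or at the head when `prev` is None."""
        if prev is None:
            self.head = node
        else:
            prev.next = node

    def insert(self, value):
        """Insert one value, walking from the head every time."""
        prev, cur = None, self.head
        while cur is not None and cur.value <= value:
            prev, cur = cur, cur.next
            self.steps += 1
        self._link(prev, Node(value, cur))

    def insert_batch(self, values):
        """Sort the batch, then merge it in during a single pass over the list."""
        prev, cur = None, self.head
        for value in sorted(values):
            while cur is not None and cur.value <= value:
                prev, cur = cur, cur.next
                self.steps += 1
            node = Node(value, cur)
            self._link(prev, node)
            prev = node  # the next value continues from here, not from the head

    def to_list(self):
        out, cur = [], self.head
        while cur is not None:
            out.append(cur.value)
            cur = cur.next
        return out
```

    When a group of values lands in the same neighbourhood, the batched version walks to that spot once instead of once per value.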
  • Sharp traffic will shred the hell out of most cable sheathing and scratches up the insides of your fiber optic cables. And it's even more important to not look into a fiberoptic link with sharp traffic coming out of it. Burns your retina and scratches your cornea. Good luck.
    • I've had pretty good results with wireless.

      Sharp traffic just sails right through with no problems.
  • by cornjones (33009) on Wednesday December 21, 2005 @01:21AM (#14306890) Homepage
    There are a couple of things here but the first and foremost point is to partition your app. Loose coupling is the name of the game. This will give you the ability to upgrade hard hit parts of your app individually.

    Note that where this all lives on the hardware is really immaterial if you partition it correctly. If something is taking too much power, move it to its own setup.

    Static content is relatively straightforward. Have a specific web service for your static content (css, img, scripts, swf, etc). If your service grows to necessitate it, get a content caching service (akamai, savvis, mirrorimage). This part of your site will be the most able to scale quickly (read immediately) as they are already pushing enough traffic to make yours basically immaterial.

    You will need some sort of load balancing solution. Something like NLB is fine if you can't afford dedicated hardware. Not sure what the current hotness on Linux is but I am sure somebody here will chime in on that.

    Make sure your application layer is stateless as much as possible. If you do this correctly, you can add web heads as necessary to handle the traffic. If you do have to rely on state you can either keep it in the client (know that they will lie to you and handle appropriately) or keep it in the db. If you absolutely must keep state at the web head, accept that at least 5% of your traffic is going to bounce around between boxes no matter what sort of sticky algorithms you use at the load balancer. This is a function of one user coming through different IPs between clicks (due to large network routing) and there is really little you are going to do about it. Web heads are cheap, so make sure you have plenty and can yank them out for maintenance (planned or otherwise) while the service still runs.
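
    One common way to keep state in the client while assuming "they will lie to you" is to sign it, so any web head can verify it without shared session storage. The sketch below uses an HMAC over the serialized state; the key, function names and token layout are assumptions for illustration, and the state is signed but not encrypted, so it should not hold secrets.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical key for the example; every web head would load the same real secret.
SECRET_KEY = b"replace-with-a-real-secret"


def encode_state(state: dict) -> str:
    """Serialize the state and append an HMAC-SHA256 signature."""
    payload = base64.urlsafe_b64encode(json.dumps(state, sort_keys=True).encode())
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return f"{payload.decode()}.{signature}"


def decode_state(token: str):
    """Return the state dict, or None when the token is malformed or tampered with."""
    try:
        payload, signature = token.rsplit(".", 1)
    except ValueError:
        return None
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))
```

    A token produced by one web head verifies on any other, so the load balancer no longer needs to be sticky for this piece of state.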

    The db's are always the hardest part. At the end of the day, all the data has to live somewhere. Your db's will probably be the most expensive part of your app. Make sure you take advantage of the full IO of the system by partitioning: logging on one set of spindles, each heavily used db on its own set of spindles as much as possible. If your dbs are read only, you can put them behind a load balancer but if they are read/write that may not be an option. A preferred way in the win world is to have clusters and a SAN but that is expensive. A less expensive way is to have a replicated warm failover available for a scripted save.

    Keep in mind that you will be taking chunks of your system out for maintenance (planned and unplanned) so make sure each function is replicated across at least two pieces of hardware.

    some other points...
    make sure you keep stats and watch your trends. It is unlikely you will drastically change your traffic patterns overnight (barring an Event). Have a relationship w/ your supplier and know how long new hardware takes to be delivered.
    If you can afford it, keep your peak traffic under 60% capacity, when you start getting much over 80% you are running too hot. Depending on how professional you are going, don't cheese out on the small shit. HA network gear and dual nics have saved me many a time.
    Know your single points of failure. Make sure you monitor these very closely and, preferably, automate your failover. Assuming adequate monitoring, the computers will know much more quickly than you will if something is wrong and if you script it, you can recover quickly. Even if it does this, you want to know when shit breaks. Don't assume b/c you have two of something that both are ready and running. Test your failovers.
    There are lots of fun queueing apps that allow you to decouple your applications even further. It is another service to run but, depending on scale, it can be worth it.
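
    The 60%/80% rule of thumb and the advice to know hardware lead times can be combined into a small check. The thresholds come from the comment above; the function names, the linear-growth assumption and the sample numbers are made up for illustration.

```python
def headroom_status(peak_load: float, capacity: float,
                    target: float = 0.60, hot: float = 0.80) -> str:
    """Classify peak utilization against the 60% target and 80% 'too hot' line."""
    utilization = peak_load / capacity
    if utilization > hot:
        return "too hot"
    if utilization > target:
        return "plan an upgrade"
    return "ok"


def weeks_until_hot(peak_load: float, capacity: float,
                    weekly_growth: float, hot: float = 0.80):
    """Weeks until peak load crosses the 'hot' line, assuming linear growth.

    Returns None when load is flat or shrinking.
    """
    if weekly_growth <= 0:
        return None
    remaining = hot * capacity - peak_load
    return max(0.0, remaining / weekly_growth)
```

    Comparing the result of weeks_until_hot with your supplier's delivery time tells you whether the next order already needs to go out.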
    • I think the original question from Ronin suggests he will not know what "partitioning" means. Indeed, the parent seems to be using it differently from my own understanding of the term.

      Let me rephrase the parent like this: make sure that you have easy access to your domain's DNS configuration so that you can easily create new machine names in your domain. *Initially, all of these new names can point to a single machine.* When you write your app, you can create multiple databases -- for instance -- on multipl
      • I mean partitioning in this way... Logically segregate functional areas. Don't put your business logic in the db. Don't intermingle your static and your dynamic content. Draw out your application and define strict interaction between functions. If you are going this far, separate your view controls from your middle tier business logic. This will allow you to pull the app apart as it grows without major reworking. As others have mentioned, separate web sites for static and dynamic content is a start (e
