Scaling Server Setup for Sharp Traffic Growth?
Ronin asks: "We are a young startup developing yet another collaborative platform for academic users. Our platform (a) requires users to stay logged on to the website for extended periods of time, and (b) is content intensive: things like courses, quizzes, and assignments get posted regularly. We're running a LAMP stack on a single 1 GB P4 server. Our user base is small (about 1,200 users, with 5-7% connected at any given time), but we expect it to grow rapidly. Anticipating sharp traffic growth, we are working out how to scale our server software and hardware setup linearly. What kind of server setup plan should we go for, keeping in mind our content-heavy application and the fact that we may have to scale up rapidly? Can anyone share his/her experience with LAMP in dealing with the scalability of high-traffic sites? Taking cues from the Wikimedia servers, we understand that the final configuration involves proxy caching for content, database master/slave servers, and NFS servers. We don't have anywhere near that much traffic, of course, but it would be interesting to hear what kind of server config you'd go for."
Start with a scalable pipe (Score:4, Insightful)
The best way I know to scale your front door is to start with two netfilter firewalls that share a MAC address, balance load with MAC-layer filtering rules, and do dynamic failover and rebalancing. It's pretty easy to plug in additional firewall transit capacity and to script in failover using a heartbeat daemon. You can do firewalls in failover pairs more quickly and easily than you can do odd-numbered rings, but both are quite doable with relatively straightforward scripting and configuration.
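As a concrete sketch of the shared-MAC approach: on Linux, the iptables CLUSTERIP target implements exactly this kind of MAC-layer load splitting. The IP address and cluster MAC below are placeholders; adapt them to your own network.

```shell
# Run on node 1 of a two-node firewall pair.  192.0.2.10 is a
# placeholder shared service IP; the cluster MAC must be a multicast
# MAC (01:00:5E:...) so both boxes receive every frame, and each box
# answers only for its own hash bucket of source IPs.
iptables -A INPUT -d 192.0.2.10 -i eth0 -p tcp --dport 80 \
  -j CLUSTERIP --new --hashmode sourceip \
  --clustermac 01:00:5E:00:00:20 --total-nodes 2 --local-node 1

# On node 2, use the same rule with --local-node 2.  When heartbeat
# declares a peer dead, a failover script on the survivor claims the
# dead node's share of the traffic:
echo "+2" > /proc/net/ipt_CLUSTERIP/192.0.2.10
```

The nice property is that adding transit capacity is just another box with the same rule and a higher --total-nodes count, plus a rebalance of the node assignments.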
I strongly recommend against breaking your traffic into categories (static pages, etc.) and balancing load by moving different categories to different servers. If you do that, you end up with way too much hardware underloaded, way too much hardware overloaded, and either no failover provisioning or a very complex failover configuration. Instead, make the individual servers identical and cheap. Just add more clones to the pack as needed and keep the traffic balanced.
By this time you're starting to see my basic approach to scalable commodity 'nix clusters. See this lame ASCII art [southoftheclouds.net] for detail. It amounts to a series of independently scalable layers: firewalling, app serving, DB caching, DB serving.
The memcached layer is indicated if you have a lot of read-only DB traffic. These nodes are cheap and don't really even need hard drives; you could boot them off of CD or off the network, diskless. They should hold as much RAM as possible. The number of memcached servers required depends strongly on how much RAM each can hold, while the total amount of cache RAM required depends on the characteristics of your application's DB traffic.
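The way the app servers use that layer is the usual cache-aside pattern: check memcached first, fall through to the DB on a miss, and populate the cache on the way back. A minimal sketch, with a tiny in-memory stub standing in for a real memcached client (e.g. pymemcache) so it runs standalone, and a hypothetical course-lookup query as the example workload:

```python
class FakeMemcache:
    """Stand-in for a real memcached client; same get/set shape."""
    def __init__(self):
        self._store = {}
    def get(self, key):
        return self._store.get(key)
    def set(self, key, value, expire=300):
        self._store[key] = value

DB_CALLS = 0

def query_db(course_id):
    """Pretend to hit MySQL; count calls to show the cache working."""
    global DB_CALLS
    DB_CALLS += 1
    return {"id": course_id, "title": f"Course {course_id}"}

cache = FakeMemcache()

def get_course(course_id):
    key = f"course:{course_id}"
    row = cache.get(key)
    if row is None:                      # cache miss: go to the DB
        row = query_db(course_id)
        cache.set(key, row, expire=300)  # short TTL bounds staleness
    return row

get_course(7)
get_course(7)
get_course(7)
print(DB_CALLS)  # prints 1: only the first lookup reached the DB
```

With read-heavy traffic like course and quiz pages, most requests never touch MySQL at all, which is what lets the cache layer scale independently of the DB layer.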
I'd rather install a memcached server and keep a hot DB spare than try to maintain transparent failover on a DB cluster. Coherence requirements complicate the performance curves when you have multiple DBs accepting write operations, which can lead to unpleasant surprises. Delay scaling your DB cluster as long as you can.
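When you do eventually outgrow one DB box, the single-writer idea above still applies: keep one master for writes and round-robin reads over replicas, so there is never a multi-master coherence problem. A sketch of that routing, with stub connection objects in place of real MySQL handles:

```python
import itertools

class Conn:
    """Stub for a DB connection; records what it was asked to run."""
    def __init__(self, name):
        self.name = name
        self.log = []
    def execute(self, sql):
        self.log.append(sql)
        return self.name

class Router:
    """Send writes to the single master, reads round-robin to replicas."""
    def __init__(self, master, replicas):
        self.master = master
        self._replicas = itertools.cycle(replicas)
    def execute(self, sql):
        if sql.lstrip().upper().startswith("SELECT"):
            return next(self._replicas).execute(sql)
        return self.master.execute(sql)

router = Router(Conn("master"), [Conn("replica1"), Conn("replica2")])
print(router.execute("INSERT INTO quizzes VALUES (1)"))  # -> master
print(router.execute("SELECT * FROM quizzes"))           # -> replica1
print(router.execute("SELECT * FROM quizzes"))           # -> replica2
```

Note the caveat: naive routing like this can read stale data from a lagging replica right after a write, which is another reason to lean on memcached first and postpone the replica tier as long as you can.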