Building/Testing of a High Traffic Infrastructure?

Please create an account to participate in the Slashdot moderation system

Building/Testing of a High Traffic Infrastructure? 231

Posted by Cliff on Saturday November 13, 2004 @01:21PM from the transactions-and-pages-per-second dept.

New Breeze asks: "I'm currently working on my first web 'application', and have discovered that I know less than nothing about setting up the infrastructure to manage a high traffic system. Where does one go to learn about setting up the infrastructure required to host something like Slashdot? Or do you just say, 'Not my area!' and help them find a consultant?"

"My experience is pretty much limited to:

1. Install the web server on one box, the database on the same box if it's a small installation or a separate box if performance seems like it will need it. Add more memory and processors based on SWAG criteria. (Scientific Wild Ass Guess)

2. Contract with a hosting company.

I had a potential customer ask what I would recommend if they wanted to self host, they have around 300 remote locations and would have multiple users from each location hitting the application at the same time, so saying a couple of beefy servers probably isn't the right answer.

I haven't a clue. The last place I worked with on something like this hired a high dollar consultant who spend a huge pile of their money setting up a load balanced, oracle parallel server redundant everything system.

How do you test it? I've worked where they actually had a room with hundreds of systems on racks that they would configured to run test transactions against different servers and software builds for stress testing, but that's not in my budget..."

This discussion has been archived. No new comments can be posted.

Building/Testing of a High Traffic Infrastructure?

Load All Comments

Search 231 Comments Log In/Create an Account

Comments Filter:

A Beowolf Cluster of Course (Score:5, Funny)

by l810c ( 551591 ) * writes: on Saturday November 13, 2004 @01:24PM (#10807307)

That one was easy, ...Next

Share
twitter facebook
Ask a Pr0n serving company (Score:5, Insightful)

by chia_monkey ( 593501 ) writes: on Saturday November 13, 2004 @01:24PM (#10807309) Journal

Seriously...they know all about serving up content on high traffic sites. Not only is it high traffic, but it's rather big files that they're delivering. When we're testing the networks that we set up, both wired and wireless, we often visit pr0n sites for our benchmarks.

Share
twitter facebook
- Re:Ask a Pr0n serving company (Score:4, Funny)
  
  by AndroidCat ( 229562 ) writes: on Saturday November 13, 2004 @01:27PM (#10807326) Homepage
  
  Okay, let's put it to a real test. Post some good pr0n links and we'll try to slashdot them!
  
  Parent Share
  twitter facebook
  - Re:Ask a Pr0n serving company (Score:5, Informative)
    
    by Neil Blender ( 555885 ) writes: <neilblender@gmail.com> on Saturday November 13, 2004 @01:48PM (#10807423)
    
    Okay, let's put it to a real test. Post some good pr0n links and we'll try to slashdot them!
    
    Having worked for porn sites, I will tell you this: They, more than anybody, will rise to the challenge. Porn = Traffic = Ad Click Throughs = Money. If a porn site sees a sudden rise in traffic, they will drop more servers into their delicately load balanced system without blinking an eye. Porns sites aren't slow. There is a reason why.
    
    Parent Share
    twitter facebook
    - Re:Ask a Pr0n serving company (Score:2)
      
      by DickBreath ( 207180 ) writes:
      
      Having worked for porn sites, I will tell you this: They, more than anybody, will rise to the challenge.
      
      Then please post some links.
      I'm sure Slashdot users will also rise to the challenge.
      (please only post links to sites that can be successfully navigated using one hand.)
    - - Re:most porn companies are clueless (Score:2, Interesting)
        
        by Neil Blender ( 555885 ) writes:
        
        Most porn companie are clueless,...They get a MAJOR SCREWING from hosting companies that charge big $$ to figure out how to handle the load.
        
        Real porn companies don't host, they colocate. And real porn companies - real porn companies - are well advanced beyond your Slashdots and your CNN.coms. They don't push an agenda, they push what serves millions of page views without 500s or login problems or 'nothing to see here, move along' warnings. Porn is always bleeding edge on the technology front. And porn
        
        Re:most porn companies are clueless (Score:2)
        
        by jobugeek ( 466084 ) writes:
        
        No credit companies are getting tired of Joe Asshole getting his rocks off at porn sites then calling the CC company when the bill comes claiming he didn't go there.
        
        Re:most porn companies are clueless (Score:2, Funny)
        
        by AndroidCat ( 229562 ) writes:
        
        I think their current economic model consists of getting rich off of each others' click-throughs and showing thumbnails of each others' thumbnails.
      - Re:most porn companies are clueless (Score:4, Funny)
        
        by AK Marc ( 707885 ) writes: on Saturday November 13, 2004 @10:05PM (#10810161)
        
        They get a MAJOR SCREWING from hosting companies that charge big $$ to figure out how to handle the load.
        
        Well, if you are in the porn industry, you should expect a major screwing. However, I would expect the porn industry to know how to handle a load.
        
        Parent Share
        twitter facebook
- (File throughput) != (database connectivity) (Score:4, Informative)
  
  by turnstyle ( 588788 ) writes: on Saturday November 13, 2004 @02:04PM (#10807490) Homepage
  
  Popular porn site likely do lots of bandwidth, but that doesn't necessarily mean lots of database hits.
  Accommodating "high traffic" that is mostly bandwidth intensive is quite a different problem than accommodating traffic that is database intensive.
  
  Parent Share
  twitter facebook
  - Re:(File throughput) != (database connectivity) (Score:2)
    
    by LiquidCoooled ( 634315 ) writes:
    
    Don't most porn sites now run database backends to manage the linking and hit rates?
    Those that *ahem* list a "todays best" area at the top of the page, followed by the daily links are certainly DB driven.
    
    Theres management required for referrer chains and user management.
    Thats without even getting to the individual gallery/image/movie pages that are decorated with links and adverts depending on where they come from.
    I've even seen sites now with a blogger style entrance and backend tpg style archives.
    
    They a
- Re:Ask a Pr0n serving company (Score:5, Interesting)
  
  by DogDude ( 805747 ) writes: on Saturday November 13, 2004 @02:06PM (#10807509)
  
  Absolutely. Most companies hand the whole thing off to a hosting companies that specialize in porn hosting. These places are rooms upon rooms of racks, on raised floors, 6 times redundant connections, dual power backup systems (generators), and all the fiber you could ever want. They're the best. Take a look Candid Hosting [candidhosting.com]. They had a few hurricanes go over them, and they didn't bat en eye. Incredible uptime.
  
  Parent Share
  twitter facebook
  - Oh yeah... (Score:2)
    
    by DogDude ( 805747 ) writes:
    
    Oh yeah, and a lot of these places even have biometric scanners to get into the 24/7 monitoring room and the server rooms. They have standard hardware setups that they generally use pre-ghosted installs of FreeBSD or Windows 2000. Of course, everything is RAID-5 and backed up religiously. The best in the business.
  - Re:Ask a Pr0n serving company (Score:2, Insightful)
    
    by Barryke ( 772876 ) writes:
    
    Candid Hosting is expensive, at least in comparision to dutch hosting (Amsterdam Internet Exchange)
- Re:Ask a Pr0n serving company (Score:2, Informative)
  
  by emmetropia ( 527623 ) writes:
  
  I've done the porn gig for a while, and I can tell you a "secret" that won't work all that well for a web application. Anything on a porn site that can be served up statically, usually is. If it's a "page of the week" it usually is generated once a week, so that you don't need to pull from the database on every single hit. At least, that's they way it works/worked for us.
  
  If you do need to hit the database with almost every page load, there's a few simple tricks. These are what I use with PHP, which
- Adult hosts (Score:2)
  
  by base_chakra ( 230686 ) * writes:
  
  I'm not so sure I agree. Over the past three years I've done a considerable amount of work in the adult web industry (yes, actual work of the non-fun variety). The most professional hosts that allow adult sites aren't exclusively "adult hosts". Actually, many adult hosts are terribly unreliable and/or unprofessional, and few of them offer managed hosting, which is what this person needs.
  
  That said, there are a number of extremely reliable, professional managed hosting companies that attract a lot of adult
- Re:Ask a Pr0n serving company (Score:2)
  
  by sevinkey ( 448480 ) writes:
  
  I work for a sister company of one of those, and yeah, we can definitely handle the bandwidth. We have a 1gbps pipe from Level3 and that's filled up before. We get DDOS attacks every day, but they do the "open a connection and forget it" type of attacks since bandwidth isn't a problem.
  
  My database system gets about 300,000 hits a day, and that's just a small cluster of windows servers now days...
  
  Mainly the way we do it is to try our best to make a resilent redundant system, and then monitor for bottlenec
Post a URL (Score:5, Funny)

by MarkusQ ( 450076 ) writes: on Saturday November 13, 2004 @01:24PM (#10807310) Journal

Post a URL here and we'll help.
-- MarkusQ
P.S. Clever use of the text describing the link may help you control how much trafic you get, from low ("M. Moore Nude!") to high ("SCO caught robbing courthouse").

Share
twitter facebook
Simple Flow chart for learning (Score:5, Funny)

by drachenfyre ( 550754 ) writes: on Saturday November 13, 2004 @01:26PM (#10807320) Homepage

1. Submit Link to slashdot with your webserver hosting a lot of large video files supporting the link.
2. Have link approved (Note - duplicate any story just posted is probably the best way to get approval and lots of people crying dupe)
3. Learn what caused the webserver to melt and how long it took to melt.
4. Fix the problem that caused step #3
5. Repeat 1-4 until server doesn't melt.
6. Congrats! You've learned how to host a high demand web server.

Share
twitter facebook
- Re:Simple Flow chart for learning (Score:2)
  
  by ReelOddeeo ( 115880 ) writes:
  
  You forgot...
  7. Profit
Test using Slashdot itself! (Score:3, Insightful)

by xmas2003 ( 739875 ) writes: on Saturday November 13, 2004 @01:26PM (#10807321) Homepage

In answer to your question about testing, have your web site /.'ed and see how it handles the Slashdot Effect [komar.org] which is a pretty good stress test! ;-)
P.S. When I first tried to read this story, I got "Nothing for you to see here. Please move along" ... somewhat ironic I'd say ...

Share
twitter facebook
- Re:Test using Slashdot itself! (Score:2)
  
  by aldousd666 ( 640240 ) writes:
  
  there are a million webmasters with ad-sponsored sites who'd love to get slashdotted. Many are called, but few are chosen
  - Re:Test using Slashdot itself! (Score:2)
    
    by Jeff DeMaagd ( 2015 ) writes:
    
    Many are called, but few are chosen
    
    Many people ask how many of the chosen are paying customers, like Roland Piquepaille.
hmm (Score:2, Informative)

by Anonymous Coward writes:

it really depends on what you need.

In my experience though hardware (especially memory) and bandwidth come before a superoptimized software front-end & database.

A good introduction I can recommend is called "Developing IP-Based Services: Solutions for Service Providers and Vendors" - I forget who wrote it. But definatly worth reading on the subject.
PLEASE (Score:5, Funny)

by asadodetira ( 664509 ) writes: on Saturday November 13, 2004 @01:29PM (#10807339) Homepage

Please don't tell him. We don't need another slashdot. Servers worlwide surrender

OK.It's easy. There are three steps involved
1.Build a low performance infrastructure.
2.Put a RT sticker and chromed exhaust pipes
3.Done

Share
twitter facebook
- - Re:PLEASE (Score:2)
    
    by rizzo420 ( 136707 ) writes:
    
    and it has to have a loud exhaust system and a big ass woofer so all you hear is the plastic shit on the car vibrating. tinted windows and cd hanging from rear view mirror optional.
Do the math (Score:5, Insightful)

by MarkusQ ( 450076 ) writes: on Saturday November 13, 2004 @01:33PM (#10807360) Journal

First step, do the math.
What was once a "high volume" app may be nothing for modern equipment. You're talking about on the order of 1K concurrent users (300 sites * several users per site).
If "use" means manually typing data into forms, viewing mostly static pages, etc. this isn't really a very "high volume" application, and a single decent server should handle it.
If, on the other hand, "use" means constantly running complex queries against a billion item data set, you're doomed.
So where do you fall in this spectrum?
Coming up next...where's the bottleneck?
-- MarkusQ

Share
twitter facebook
- Re:Do the math (Score:2)
  
  by ArbitraryConstant ( 763964 ) writes:
  
  "If, on the other hand, "use" means constantly running complex queries against a billion item data set, you're doomed."
  
  Unless you're Google.
- Re:Do the math (Score:4, Informative)
  
  by New Breeze ( 31019 ) writes: on Saturday November 13, 2004 @02:49PM (#10807755) Homepage
  
  300 sites, between 12 and 200 concurrent users at a site.
  
  It's a CRM system, i.e. some basic data entry, some portions are transaction processing. i.e. the workflow portion for the base part of the app is very simply:
  
  Search for customer by various criteria.
  No customer found, add one.
  Retrieve customer information.
  Add current order information being stored for this customer.
  Process loyalty/discount programs to see if customer qualifies for an award.
  Return award to order entry system for processing.
  
  There's a lot more to it, but that's the meat of it. It's fairly data intensive, there is a great deal of information stored for customers for use in data mining the collected information. It's primarly web service based, but there is a fairly extensive management and reporting tool that is all HTML based.
  
  My guess is going to be that the bottleneck is going to the the database, but we've done extensive testing with a million customer sample database running multiple instances of test applications from 10 other boxes, but that doesn't exactly prove much as it's too predictable.
  
  Parent Share
  twitter facebook
  - Re:Do the math (Score:4, Informative)
    
    by mrjohnson ( 538567 ) writes: on Saturday November 13, 2004 @04:31PM (#10808350) Homepage
    
    Aight, I'll take a stab...
    
    You'll need a hardware SSL loadbalancer, with redundancy:
    
    http://www.coyotepoint.com/e450.htm
    
    (Two of those).
    
    You'll need at lease two web servers with CPU and RAM. The requirements on these boxes really depend on the app. I'd make them dualies at least, with fast Xeon processors (Good bang for the buck). A couple gigs of ram each. You can add servers to the load balancer later if you need to. Disk doesn't really matter, but I'd use a SCSI mirrored root volume for reliability.
    
    The database needs to be redundant, and since you think it'll be the bottleneck, an Oracle RAC setup would seem to fit your needs. I really don't like Oracle from a developer stand-point, but two big servers with Oracle running on raw disks for performance is a tough combo to beat. Expensive and long on the setup and install, though.
    
    With all of that, you'd have a large initial investment, but something that would grow with your needs. You could add new apps to this setup later on, as well. However, if you needed to run this on the cheap, the Linux Virtual Server project does a lot of this with simple Linux boxes.
    
    If this is too expensive, the first thing to take out is the hardware SSL. I included it because I want them, not because I have 'em. :-) Check out pound:
    
    http://www.apsis.ch/pound/
    
    A couple Linux boxes with failover setup and you've saved a good 40-70 grand. Requires some expertise.
    
    Parent Share
    twitter facebook
  - Re:Do the math (Score:2)
    
    by MarkusQ ( 450076 ) writes:
    
    So, tens of thousands of concurent users on a single web application (9,000 to 36,000). The first thing I'd do is check those numbers. That sounds high (e.g., larger than anything Citibank, or Intel is running). If the numbers are acurate, you may want to reconsider making it a web app.
    But the real question isn't how many users, it's how much load they'll impose. Assuming that the number of users is correct, it sounds like a POS situation (but 120 registers at one site?), so maybe one full transaction
- - How beefy? (Score:2)
    
    by grahamsz ( 150076 ) writes:
    
    Throwing hardware at the problem is usually pretty cost effective - given that consultants are expensive.
    
    I've seen a single (fairly dated) 12 cpu sparc box serve up about 600 simultanous connections for a cgi driven application without faltering.
    
    Get a system that you can ramp up and keep adding processors and ram to and you should be able to handle to load that you are talking about with two boxes (one for FE and one for BE)
Apache Benchmark is your friend (Score:5, Informative)

by sseremeth ( 716379 ) * writes: on Saturday November 13, 2004 @01:34PM (#10807369)

If you want to throw some serious load at your equipment, get a few other systems saturating your network with Apache Benchmark (ab) requests. It gives lots of useful data, like response times, etc. . And you're best off toppling the application and trying to find the cause that it failed and working on that as someone already suggested. The rinse and repeat.

Looks like Apache has updated their tools since the last time I had to do this...

http://httpd.apache.org/test/

Share
twitter facebook
- Re:Apache Benchmark is your friend (Score:3, Interesting)
  
  by oneishy ( 669590 ) writes:
  
  Another really good tool for stress testing web apps is Microsofts Web Application Stress Tool [microsoft.com]. It allows you to configure testing for a set of different virtual users, and also supports https, stores cookies if you want, etc. An all round good featured tool. One of the best features for testing a load ballanced app is it's ability to seamlessly distribute the testing load across multiple client machines, thus really providing a realistic load.
- Re:Apache Benchmark is your friend (Score:2)
  
  by gregfortune ( 313889 ) writes:
  
  Jmeter [apache.org] is pretty nice as well.
- Re:Apache Benchmark is your friend (Score:2)
  
  by ryochiji ( 453715 ) writes:
  
  One thing to keep in mind when you're test using programs like ApacheBench is that laboratory tests don't necessarily simulate real-world scenarios well.
  
  For example, a server hooked up to a "client" on your lan will be able to support a hell of a lot more requests than in the real world. This is because, even if your application responds quickly, your web server process has to stay up to send the output back to the client. In a lab network, this usually takes hardly anytime, while an actual modem user hi
Look at the other high load websites (Score:4, Informative)

by GrAfFiT ( 802657 ) writes: on Saturday November 13, 2004 @01:36PM (#10807378) Homepage

Check out what those guys do at Wikipedia [wikimedia.org]. Don't forget to look at their useful links at the bottom.
Or maybe it's overkill.

Share
twitter facebook
- Re:Look at the other high load websites (Score:2)
  
  by StillAnonymous ( 595680 ) writes:
  
  That's a pretty sweet setup! It also goes to show all those naysayers that Fedora Core can and IS used in production environments with obvious success.
- look at siege, httperf, and autobench instead (Score:3, Informative)
  
  by smitty45 ( 657682 ) writes:
  
  better than ab is siege, which can deal with HTTP/1.1 Keep-Alives, and give more regression-style stats. it's at joedog.org.
  
  better than siege would be something like httperf, and autobench, which will give you some indication whether or not your client generating the requests is still healthy. autobench also allows you to run multiple instances of httperf on different machines, and then aggregate the numbers after the test.
  
  remember folks, there are only 65535 (minus 1024) ports that any machine can be u
Dear Slashdot... (Score:5, Insightful)

by Anonymous Coward writes: on Saturday November 13, 2004 @01:36PM (#10807385)

I'm currently working cooking in a restaurant, and have discovered that I know less than nothing about performing stomach surgery. Where does one go to learn about the techniques and tools necessary for curing stomach cancer? Or do you just say, 'Not my area!' and help them find an oncologist?

Seriously.. you have a lot to learn, and a lot of what you need to know just comes from experience which you can't get from a book.

First: learn how everything works. When you click a link in your "application" (why the quotes?), what happens? For instance, does it run a Controller object? If you're using a language like Ruby or Perl, is it "pre-compiled" or does it have to interpret a script on each hit? Does the controller then go to the database and populate variables, then insert them into a template, then render the template? Is the template cached? How are your database settings? Enough memory for joins? Are all your queries using the appropriate indexes? Are you familiar with your database's performance-measuring variables and tools? Are you pulling more data than you need to in each query?

Once you have an understanding of what's happening, then you can start measuring. Where are the bottlenecks? This is a very important thing to keep in mind in programming or system architecture: DON'T OPTIMIZE UNLESS YOU NEED TO! Keep your system and code as simple as possible. For instance don't cache things in your program (making it more complicated and harder to maintain) unless you have a BENCHMARK IN HAND showing a performance bottleneck.

You might not need to move your database to another machine. What you need to do depends on your app.

Yes, you will need to do a lot of testing to identify your "first round" of bottlenecks. You need to build a lot of diagnostics into your app to help you identify how long different steps take.

Always deploy your app in stages, one site at a time, until you start identifying some problems. Then fix those problems before continuing deployment. Never "flip a switch" and reveal any change all at once.

Good luck!

Share
twitter facebook
- Re:Dear Slashdot... (Score:2, Informative)
  
  by New Breeze ( 31019 ) writes:
  
  I think there was a misunderstanding. I know how the application works in great detail. I know that it can be scaled up across multiple machines. It will scale up.
  
  What I don't know is how to judge what hardware to reccomend to someone wanting to self host.
  
  I'm pretty damn sure that only comes from 1) Testing, and I'm not buying $10's of thousands in hardware to test with and 2) Experience, which I don't have.
  
  The last place I worked at had a rack of nice quad zeon processor systems in their lab that the
How to do it with little/no budget (Score:5, Informative)

by multipartmixed ( 163409 ) * writes: on Saturday November 13, 2004 @01:38PM (#10807390) Homepage

First off, if this is a "must succeed with no problems" project, all bets are off -- hire an experienced consultant so you have someone to blame. Also, this technique only works when you have the type of site which will *build up* to expected load -- not get turned on instantly.

This is tough to generalize without knowing specifics, but here goes:

1. Make sure your application can work correctly when load balanced across multiple boxes
2. Keep webserving and DB work on different machines
3. Make sure your application can work with another database without much work (this gives you the option to hire, say, an Oracle DBA and buy an Oracle license if MySQL can't keep up.. does it even support row-locking yet?)
4. Have extra hardware handy, in the rack. Do NOT turn it on yet.
5. Observe the application running; determine bottlenecks, tune
6. If you can't tune it to perform adequately, NOW is the time to break out the extra hardware while re-evaluating the implementation.

If you throw all your hardware at the problem at once, you get very little warning when the shit starts to hit the fan, and no response scenario. Do NOT make that mistake. Load, test, tune, repair, repeat.

Share
twitter facebook
- Re:How to do it with little/no budget (Score:2, Interesting)
  
  by coofercat ( 719737 ) writes:
  
  In my experience (having played at being the highly paid consultant who comes in to fix stuff once you've messed it up) I'd always point the finger at the linkage between components ("components" being items in your architecture, including the people you're using to help you). In a three tier environment (a sensible approach, almost regardless of your technology), the database is often a problem. DBAs jump on that pretty quickly, so what's left? Networks are normally easily sorted, but you may still find y
- Re:How to do it with little/no budget (Score:2)
  
  by smitty45 ( 657682 ) writes:
  
  "MySQL can't keep up.. does it even support row-locking yet"
  
  It does, and has, for quite some time. Innodb.
Interesting.... (Score:2, Interesting)

by killerasp ( 456561 ) writes:

this is a very interesting topic. I just just started my new job where i was coming from an internship previously. There we had a web server, database server, a devbox and a log processing box for webtrends analysis. But now at my new job im being introduced to high level PIX boxes, F5 load balancers, redudant web servers, transaction servers, etc. One thing i just learned the other day is that they use the F5 to handle SSL encryption/decryption instead of relying on the webservers. I never knew that was po
- RE: Using F5's to encrypt data (Score:3, Informative)
  
  by Em Ellel ( 523581 ) writes:
  
  It may or may not be a great idea depending on your situation. For one - the cost of SSL card for F5 is so high, it may be easier to just get extra servers. For another, I work with some banking applications and having data sent cleartext, even on an inside network directly connected to load balancers is NOT a valid option.
  
  However if local security can be ignored and you have the money to spend, F5's offer a nice offload of encryption processing. But then again, so do hardware cards for individual servers.
Two scenarios: (Score:5, Insightful)

by Jerf ( 17166 ) writes: on Saturday November 13, 2004 @01:38PM (#10807392) Journal

1: Gradual growth. Find bottleneck, remove it. Repeat. Make sure to start with a growable database and web site technology, but that shouldn't be too tough. Also, stay ahead of the game, always with overcapacity, both to cover for outages and for sudden growth spurts.

2: Instant growth from 0 to thousands+: Hire someone who knows what they are doing. In the first scenario, you have the time to learn what is actually going on, which is an advantage. In this one, you don't, and the customer base is to big (i.e., $$$) to screw with.

That basically covers it. Specific advice will vary widely based on databases and web technology deployed, so just about any other specific advice you get here is as likely to be wrong for you as right.

Share
twitter facebook
no. 1 cause of downtime (Score:2, Interesting)

by morten poulsen ( 220629 ) writes:

I run a site which peaks above 5,000 page views/second. That part is static, and runs thttpd. No problems at all.

The other part is dynamic. It runs on Apache (load balanced, no problem) with a PostgreSQL server. If you don't need it's features, "just say no"!

It is the single part in our system that causes most problems. When your tables grow semi-large (less than 800k rows) and you do a few joins, it chooses strange - and slooooow - ways to execute your queries. Combine that with a few journalists who wan
- Re:no. 1 cause of downtime (Score:2)
  
  by mrjb ( 547783 ) writes:
  
  I built a site that was relatively heavy on calculations, dynamically showing stats on collected data. Fortunately I won't need to expect millions of users (because I'm not posting the link here :P )
  
  Those stats are calculated periodically by a cron job now, so that table joins are no longer needed for serving the data itself. That alone turned the site a lot lighter.
  
  Of course the cron job still needed to do table joins, but it too was refactored and runs in less than 80 seconds now instead of the former 4
Depends (Score:4, Insightful)

by JediTrainer ( 314273 ) writes: on Saturday November 13, 2004 @01:43PM (#10807412)

What constitutes 'high traffic' for you?

I've been developing a high traffic site (well, maybe medium traffic) at about 1.5 million transactions per month. We have customers using the site all over North America, plus a few in Europe and Asia, and the whole thing is hosted internally off of our 10MB link.

We have each 'tier' clustered as a pair of servers - 1Ghz/256M is more than sufficient for our 2 Apache servers. 3Ghz/1GB is our Tomcat tier, and I'm not sure what the DB runs on, but they're the beefiest servers of all the tiers.

Within the app architecture, try to ensure that you can scale to more servers. We have the ability to add more servers to any of the above tiers without any changes, plus any long-running processes (complicated reports and such) get dispatched to a fourth layer of servers we call 'backend' (by RMI). These 'backend' servers can be low-end (300mhz/256M are fine), because they run non-time-critical tasks and generally might email their results or whatever.

In this way, we've avoided the EJB complications while also having full redundancy at every level. There was some custom framework involved, but it's been working well. Our application was complex enough to warrant an advanced framework (similar to Struts, except we wrote ours before Struts came out), yet EJB seemed too heavy for what we wanted to accomplish. Of course it didn't hurt that the only thing we paid licenses for was the DB.

Importantly, though, this was the right solution *for us*. It's serving us well, and already scaling well beyond the number of customers we originally anticipated would be using it. While this meets our needs fairly well, it may or may not be the right type of solution for what you're looking for, particularly because I don't know what your application is supposed to do.

Share
twitter facebook
- - Re:Depends (Score:2)
    
    by JediTrainer ( 314273 ) writes:
    
    You'll be surprised... 1 million hits/mo assuming a normal distribution is only about 1 hit every 2-3 seconds. Not exactly a big load unless each hit generates a huge complex page that sucks down your bandwidth. Figuring a normal 8-5 day M-F, that goes up to 2 hits/sec.
    
    That's right - I know it's not terribly bad. But during the workday, there's some serious peaks and valleys when it comes to our traffic. The numbers you quote assume an even distribution of hits over the 9-5 (actually we normally see traf
A few basic things... (Score:5, Informative)

by Em Ellel ( 523581 ) writes: on Saturday November 13, 2004 @01:48PM (#10807421)

These are some very basic thoughts on the subject. They may not be 100% right for you, but will get you thinking in the right way:

Rule 1 - Three tier archictecture is popular for a reason - it works. Offload user interface (web) to dedicated boxes, make application itself run on separate boxes and make database separate

Rule 2 - When possible, scale horizontaly not vertically. Make sure your application is as stateless as possible and is capable of you just dropping in an extra server when needed without a lot of reconfiguring. Make sure you can survive a loss of a server without loss of data. Lots of cheap servers will most always work out better (and cheaper) than one big ass box.

Rule 3 - Make as much of your application as static as possible. Even pseudostatic data (something that updates every minute or so) should be made static and have a process re-generating it every minute or so. Not wasting your CPU time to render a menu or something on every hit will add up fast under heavy stress.

Rule 4 - Strip your HTML. For example, some crappier web languages (think ColdFusion) have a tendancy of inserting spaces for every line of code etc. A large application running CF (dont ask) would insert enough spaces to make a simple page hundreds of kb in size. Just turning on "the write to output only on demand" option will drop size of the page to next to nothing. So know what it is that you are producing on output and make sure it is lean. Turning on server side compression solves this better, however adds to CPU requirements. On trully stateless web servers this just mean you need more web servers. So MAKE YOUR WEB SERVERS STATELESS.

Rule 5 - Know how many users your upstream connection can handle (in simplest terms - average size of HTML communication * number of users) and make sure you do not exceed it. Limit your connectivity at load ballancer. Having some users not be able to access your site is better than having ALL users not be able to access your site. Make sure you get plenty of bandwidth to spare. If you are setting up a multi-site presence, make sure your intersite communication is a - not going over same line as incoming and b - has sufficient bandwith and latency to serve the traffic.

Rule 6 - Professional load testing tools cost big bucks. But if you are carefull you can fake it with some open source software. Google it. When testing remember to take into consideration the limitation of your tester system and bandwidth.

-Em

Share
twitter facebook
- Can you qualify some of this stuff? (Score:3, Insightful)
  
  by UVABlows ( 183953 ) writes:
  
  I don't really understand a lot of the stuff you said (I am not a sysadmin). For example:
  
  What does it mean to not scale "vertically"? When I read that, the only thing that comes to mind is to put the boxes next to each other, not on top of each other. From context I gather that horizontally means extra machines, but what does vertically mean?
  
  For "dropping in an extra server when needed without a lot of reconfiguring", what do you mean by "a lot of reconfiguring"? Obviously you need to get the machine,
  - Re:Can you qualify some of this stuff? (Score:4, Informative)
    
    by Em Ellel ( 523581 ) writes: on Saturday November 13, 2004 @02:27PM (#10807607)
    
    What does it mean to not scale "vertically"? When I read that, the only thing that comes to mind is to put the boxes next to each other, not on top of each other. From context I gather that horizontally means extra machines, but what does vertically mean?
    
    Horizontal scaling - adding more machine
    Vertical scaling - adding more CPU/Memory/etc to existing machines.
    
    For example, a horizontally scaled application may have 20 1u 1cpu servers, a vertically scaled one has a Sun E15k heating up the room.
    
    For "dropping in an extra server when needed without a lot of reconfiguring", what do you mean by "a lot of reconfiguring"? Obviously you need to get the machine, install the os, set up networking, install the web server, setup the web application, point it at the database, etc. How does the application being "stateless" help? I guess, what are some examples of state that an application can have that will make configuring an additional web server difficult?
    
    Reconfiguring the application not the servers. A stateless web server does not store any user state. Meaning that if a user hits web server A for one request, and web server B for another, the user will not know the difference. Also meaning that if you add another server, you do not need to worry about conflicts, sharing data, etc. Stateless servers can be taken offline or brought online without any fuss. They become a commodity appliance and if you need more, you just get more. In realistic terms this means that if you need state for the application (login, etc) you either store the state on the client's machine in a cookie (BAD, all sorts of abuse is possible) or better store an temporary ID in a cookie (or in URL) and store state in App server or (better) DB. A lot of web servers and app servers offer clustering to solve the state issue. While this may or may not work, most of the time it is a marketing hype that rarely lives up to expectations and add extra load. It also violates KISS principle (Keep It Simple Stupid) and will give you more headache than it is worth.
    
    Concerning the pseudo static data regeneration, what if the thing that was being updated was only accessed once every half-hour on average? I am assuming then that generating the page on demand would be better?
    
    Use your brain. The idea is to lower CPU requirement and potential risk from overloading, not just to use a cool trick. Do whatever works best.
    
    I don't really know what you mean by "MAKE YOUR WEB SERVERS STATELESS". I mean, they have to know if a request just came in, where the data is, what time it is etc, and that stuff gives it state. I am assuming you mean something else by stateless but I cannot figure it out.
    
    State implies retained state across MULTIPLE connections/hits. Most application require state, however state does not need to be kept on the web servers and sometimes not even on app servers.
    
    HTH
    
    -Em
    
    Parent Share
    twitter facebook
    - Re:Can you qualify some of this stuff? (Score:2)
      
      by CerebusUS ( 21051 ) writes:
      
      Another note on stateless design. Be sure all your app coders understand the goal there, and understand the security risk in doing it wrong. At one of my former jobs they were working hard to make their code (an e-commerce app) stateless and ended up putting all the shopping cart details into cookies. Including the price of the items in the basket.
      
      A few weeks later they discovered things were being bought for a dime instead of $30. They fixed it then, but that shouldn't have made it off the design boar
      - Re:Can you qualify some of this stuff? (Score:2)
        
        by UVABlows ( 183953 ) writes:
        
        Was their app still stateless after they fixed the problem or did the solution involve the app keeping state?
    - Re:Can you qualify some of this stuff? (Score:2)
      
      by UVABlows ( 183953 ) writes:
      
      I see now that a distinction is made between "web server" and "app server". A web server serves static content and an app server runs the servlets or php scripts or whatever generates the dynamic content?
      
      I gather that machines in whichever tier (web, app, or db server) is used by the state-tracking mechanism cannot be turned off while the site is running because the users whose state was stored in the machine being removed would be viewed as new visitors to the site. Is this the reason to make the web se
      - Re:Can you qualify some of this stuff? (Score:2)
        
        by Em Ellel ( 523581 ) writes:
        
        Look up "three-tier architecture".
        
        A quick rundown:
        
        Web servers acts as presentation layer, putting together HTML pages. They contain no business logic they only know how to take a request, as app server for information and render this information. In addition to web servers, presentation can be IVR (phone interface), WAP serverfor mobile phone, or any other user interface. Because app server knows nothing on how to present user data, it could care less what is the presentation layer as long as presentation
  - Re:Can you qualify some of this stuff? (Score:2)
    
    by SuiteSisterMary ( 123932 ) writes:
    
    For "dropping in an extra server when needed without a lot of reconfiguring", what do you mean by "a lot of reconfiguring"? Obviously you need to get the machine, install the os, set up networking, install the web server, setup the web application, point it at the database, etc. How does the application being "stateless" help? I guess, what are some examples of state that an application can have that will make configuring an additional web server difficult?
    When you add a new server, do you need only a) ma
  - Re:Can you qualify some of this stuff? (Score:2)
    
    by CmdrGravy ( 645153 ) writes:
    
    I am not qualified at all for any of this but I'll have a go at answering your questions.
    
    "What does it mean to not scale vertically"
    
    I think he means that if you need more capacity you can just should just be able to add another web server say, or another database server and not have to add one of each each time you need more capacity. Not 100% sure of that myself though !
    
    "dropping in an extra server when needed without a lot of reconfiguring"
    
    He means that you should just be able to image another server
  - - Re:Can you qualify some of this stuff? (Score:2)
      
      by UVABlows ( 183953 ) writes:
      
      So if a web application needs to keep state (eg someone being logged in), is there a better option than the two you presented or do you just have to pick one even though it will have a downside?
- Re:A few basic things... (Score:2)
  
  by CerebusUS ( 21051 ) writes:
  
  The parent is probably the best answer I've read so far, but I thought I could add a few things.
  
  If your app relies on heavy database usage, you either need to invest the time to make yourself a decent SQL coder / administrator or invest the money to hire one for awhile. Having a farm of webservers capable of handling a million hits an hour doesn't do any good if the application locks the table each of those hits is trying read. This is an oversimplification, but a good SQL admin will be able to watch you
Mischevious piggybacking (Score:2)

by Silas ( 35023 ) * writes:

Hire a consultant to do it right from the start.

If you can't, this is sort of a mischevious way of doing it, but one that can work well in a pinch. Get your basic requirements down in writing (bandwidth, OS and app software, server requirements, disk space, backup scheme, etc.) and then contact one of the high-end services like Rackspace to ask for a proposal for their services based on those requirements. In the resulting conversations, you'll learn a lot about what kind of infrastructure is "standard"
high traffic system (Score:3, Interesting)

by psin psycle ( 118560 ) writes: <psinpsycle.yahoo@com> on Saturday November 13, 2004 @01:57PM (#10807460) Homepage

I've worked on a very high traffic system. At one point we were pushing 100MBPS in traffic. I had about 15 servers, 1 database server, and a load balancer. The traffic was mostly static html pages, with a bit of php/mysql for about 1/10th of the traffic.

We had a master database server that was distributed to all the webservers. When reading from the database, each webserver would read it's own local copy. mysql replication kept the data on the local webservers fresh.

Updates to the database were easy as only a small number of users were doing any updates. All updates were able to go through one server and wrote directly to the master database.

The load balancer was managed by the hosting company. It simply made sure that all the webservers shared the traffic load. Any webserver that died for whatever reason would automatically stop getting traffic sent to it.

Share
twitter facebook
- Re:high traffic system (Score:2)
  
  by smitty45 ( 657682 ) writes:
  
  This setup certainly can go some distance, but at some point, replication can become too much for each read slave.
Test and define your usage (Score:2, Informative)

by DeBaas ( 470886 ) writes:

There is a reason why this is a specialty. There isn't a clear answer.

The answer depends on many factors such as:
- how heavy are the pages (many pictures?)
- what's the platform (Lamp/J2EE/etc....)
- how is the usage?, if someone gives you a figure for concurrent users, ask yourself what they mean by that. Some apps have users contstantly submitting, others once in a few minutes
- how are they connected? Reverse proxy can really help for slow connections!
- if you have performance problems, investigate where t
Suggestions (Score:2)

by Facekhan ( 445017 ) writes:

If this is just for internal users and telecommuters then you really need to get an idea of how many people will actually be using the app and then put it on a server and simulate the effects of more and more users until it starts to tax the system. THen you can calculate how many users each server can support at 40-60% load and get that many servers behind a loadbalancing device. If its only few servers you can use a router to run the loadbalancing or get a dedicated load balancing device to do it.

I have
Read a lot, ask a lot of questions (Score:3, Informative)

by ToasterTester ( 95180 ) writes: on Saturday November 13, 2004 @02:12PM (#10807534)

This is one of those areas that is there is no set answer. There are lots of articles on the topic, but usually on systems larger than you plan to do. Go to user groups, but many in user groups are doing smaller site, but some might be doing what you are.

Main thing is define what you call a lot of traffic. A lot to one person isn't a lot to another.

Then nail down your budget that will be your most defining factor.

Then when designing use a design that is easy to scale. That way if you are off you can scale with little pain.

Personally I would put money into the database server, they can be real pain to scale. The web side design as a farm even if only two web servers to start with. Decide how you plan to load balance. A couple web boxes DNS round robin will do, but bigger you have to look to real load balancing options. Also what is your SLA that will determine how big your farm needs to be or if to keep hot or cold spare boxes around. IF a farm how are you going to keep content in sync? Then power, cooling, Security, and on and on. Its a lot of work, but when done and everyone is happy you can't wait for a even bigger project.

Share
twitter facebook
Performance planning and scalability (Score:2, Informative)

by punker ( 320575 ) writes:

I work for a website that does alot of traffic (it's a specialized industy, and no it's not pr0n). The site pushes about 10Mbit/s from 9-5 during the week through 6 webservers. There are a couple things you need to look at as far as making a site like that work.
The first thing you should do is look at your system and determine what your resource drains are. Do you have a database? Is it read-write or read-only? What are your replication and growth options for that app? That affects your scalability a
Experience is key (Score:5, Informative)

by xrayspx ( 13127 ) writes: on Saturday November 13, 2004 @02:17PM (#10807566) Homepage

Knowledge comes from Experience, and experience comes from Doing.

Mistakes will be made, They key is in mitigating the effects of those mistakes. Redundancy and Manageability are your two biggest buzzwords here. A good load test and utilization projections are definitely key, but no matter what you think your userbase will be, if it's a public application, you'll almost certainly be wrong. Try to prepare for the most traffic possible.

Redundancy on every level, including switching infrastructure is a very good plan. Any decent server sold can use multiple bonded NICs for reduncancy, if possible design your network such that if a switch fails, your network will fail over to another switch, etc.

I would suggest going to many local datacenters and interviewing each with probing questions relating to your situation. You will find that they are all relatively equal in terms of Standard DC items:
Diversity of route (physical entrance of cabling into the building) and redundant carriers.

Cooling

Power and backup gens

The things they differ on will be the readiness of their NOC team (do you have to fill out a web-form or call a call-center in East St. Louis to get a problem fixed in San Jose, or can you just "call the NOC and somene goes to your cage"), the monitoring/alerting they provide their customers for issues on the datacenter network. Infrastructure-wise, most DC's can provide you with Ping/Power/Pipe, but the service and SLAs are where they get points.

Do a LOT of reading. Depending on your platform, you have many choices. Linux vendors and Microsoft both have good platforms WRT building redundant networks, provided you do your homework.

Which brings you to manageability. Make sure that you have a deployment framework you can live with right from the start. Deploying code by hand is alright when you have 2 sites in IIS x 3 or 4 machines, but it gets hairy when you have 15 sites x 20 webservers. Make sure you can deploy web content, mid-tier apps, etc, with the "click of a button". This helps to ease the possibility of repetitive mistakes being made. Depending on the app, you may have to roll-your-own, but it's worth it.

Scalability. Make sure you pick a DC that can grow with you. If you plan to start out with 4 1u rackmount webservers and maybe a 7u DB, plus some storage array, make sure there is "room to move" in the DC without needing to cross-connect all over their facility with a cage here and a couple cabinets on the other end. Scalability testing by your engineers would be a great plan also. During load testing, if you're planning on using 2 mid-tier servers to process "Project X" from the web-users, set up 6 or 8 and load them up with bogus traffic. See how long it takes to kill your DB server.

Monitoring/analysis. Make sure you have a monitoring system into which you can hook custom monitors and alerts. Of your installation, those parts with the lowest levels of monitoring will be the ones most prone to breakage. Good packages here are NetCool and HP Openview. Expensive though. It's something you can probably write in-house until you need to spend the big bucks for an enterprise package.

Look to do a lot of reading, but break it into chunks. There is (I hope) no book called "Building and Maintaining High Traffic Enterprise Networks, for dummies, vol2". Every network will be different. But if you componentize your search, you will yeild great results. If you look to build your own monitoring or code deployment system, read up on WMI, read Cisco related newsgroups for network layer redundancy, etc.

Consultant is NOT a dirty word. Make sure you hire one for the right reasons. You do not want someone to come in and "make it so". You want someone with more experience than you have to work WITH you to design a network that you understand, can maintain, and which will scale. There's an art to it, hire Chris van Allsburg, not Picasso, Dali or certainly not Poll
Read the rest of this comment...

Share
twitter facebook
- Re:Experience is key (Score:3, Funny)
  
  by xrayspx ( 13127 ) writes:
  
  Fuck me. I'm ripping the "K" and "Y" keys from my laptop right now. I can't believe that post. I sound lie some bonehead owledge engineer.
  
  eep it real.
  - Re:Experience is key (Score:2)
    
    by soloport ( 312487 ) writes:
    
    Uh... I rather enjoyed your post. No need to bash one's self over a good, knowledgeable post. Really.
    - Re:Experience is key (Score:2)
      
      by h4rm0ny ( 722443 ) * writes:
      
      Uh... I rather enjoyed your post. No need to bash one's self over a good, knowledgeable post. Really.
      
      eah, me too. Anone that nowledgeable is worth coping. The clearl now stuff.
Follow the course! (Score:2)

by internet-redstar ( 552612 ) * writes:

In Belgium (Europe) there's the Linux 'DEEP SPACE' course, of course :)
See it at deepspace.linuxbe.com [linuxbe.com]
Money and time (Score:2)

by hoofie ( 201045 ) writes:

If you want to build a high-traffic web site application, you'll almost certainly need loads of money and time.
At the company I work for, we have re-built our entire web site system and internal systems over the last couple of years. We've gone from single processor compaq server with webapp and DB on one to a load-balanced multi-application server [all dual processor] with primary and backup oracle databases. Why - because our traffic [both paying and just visiting] was expanding dramatically all the time
ipvs, LAMP (Score:2)

by dougnaka ( 631080 ) * writes:

I'd highly recommend a LAMP setup with ip virtual server My experience says apache/php/mysql(or postgresql) is a good way to scale.
Buy 2 good load balancers with redundant power supplies, SCSI disks with hardware RAID. Depending on how much database your app needs that's where your hardest to avoid point of failure will be, look into what slashdot does for high performance, I forget the name of the software but it's a distributed caching type system, linux journal had an article about it and it looked ver
...as always, it depends... (Score:4, Insightful)

by Bob Bitchen ( 147646 ) writes: on Saturday November 13, 2004 @02:27PM (#10807608) Homepage

First off, I'd say you're doing this bass-ackwards. You really should have already answered all these and many other questions before ever laying fingers to keyboard.

It depends on lots of things. Who's going to manage the self-hosted host? If they have an IT dept. maybe they can provide the hardware sizing. In any case you will first need to establish the usage patterns and then go forward from there.

Share
twitter facebook
ApacheCon.com - learn from the experts (Score:3, Informative)

by dirkx ( 540136 ) writes: <dirkx@vangulik.org> on Saturday November 13, 2004 @02:33PM (#10807627) Homepage

Just hop on a plane to LasVegas - We're having the ApacheCon (http://www.apachecon.com) [apachecon.com] this week - with at least half a dozen tail on that topic (in the httpd, java, perl and php fields). Though the more hands on oriented tutorials will already start today - :-)
A good alternative is the book by OReilly - Web Performance Tuning (http://www.website-owner.com/books/servers/webtun ing.asp).
Dw.

Share
twitter facebook
Load balancer + content differentiation (Score:5, Informative)

by chrysalis ( 50680 ) writes: on Saturday November 13, 2004 @02:34PM (#10807635) Homepage

I have some experience with administration of web sites with very high traffic. My previous experience was with p0rn sites (lots of sites, lots of concurrent accesses). My current job is at Skyrock / Skyblog, that serves about 25 million pages every day.

In both jobs, the infrastructure was extremely similar.

The entry point is one (or more) load balancer.
A load balancer will not only blindly allow you to have multiple backends. It will also accept client connections, buffer the request, get the data from already established (keepalive) sessions, buffer it, and transmit it though large chunks to the client. This, alone, really helps to reduce the number of Apache processes that are taking resources (especially memory) for nothing.

The load balancer can also do other things, like protecting the servers against some attacks, plotting the current workload of every backend, compress HTML pages, etc.

At my previous job, we were using Foundry Serverirons. Now, we are using Zeus ZXTM http://www.zeus.co.uk/ [zeus.co.uk] with great success. Although it's very expensive software, it's way cheaper than Foundries, way more configurable, way more user-friendly and we are very pleased with it so far. A single PC handle 300 Mb/s (Linux 2.6 is needed for epoll).

The load balancer can also be configured to send the requests to this or that server according to the request.

Thus, servers are dedicated to specific tasks.

We have a bunch of static servers for static HTML, CSS, images, etc. They run minimal Apache servers, designed for speed, with NPTL and the worker MPM. Non-forking servers like thttpd or lighttpd is also an option. The static servers are mainly old P3 machines, with only 512 Mb RAM.

Then, we have servers for PHP. The Apache they are running is huge (our web sites need a lot of modules), the hosts are dual 3 Ghz Xeon with 2 Gb RAM and there are some other specific tweaks.

Content differentiation is important. It's a waste to spawn huge Apache process to serve static stuff, just because the same host should also be able to serve PHP. Also, tuning (esp. NFS) is very different for static and dynamic content. And as a specialized server often serves the same files, caching is more efficient.

We run Gentoo Linux on all web servers, plus one DragonFlyBSD (mostly for testing).

The same content differentiation is made for SQL server. One SQL server serves one sort of thing, so that caching is efficient. Also don't forget that on x86, Linux and MySQL can hardly use more than 2 Gb of RAM. So with big tables, this is really annoying. We are switching SQL servers to Transtec Opteron-based servers for that.

On high traffic infrastructures, the I/O is often the bottleneck especially if you serve a lot of different content.

For our blog service, we had to buy a Storagetek disk array with 56 disks (fiber channel, 15k) in RAID 10. As NFS would introduce too much delay, we directly plugged two web servers to the controller of the disk array. These web servers are the NFS servers for the PHP servers, but they also directly serve the static content.

The access time of hard disk is really annoying. For shared data, but also for databases. We found that RAID 5 was way too slow (even with the high-end Storagetek/LSI controller) since we have about 1 write for 5 reads. So we had to switch everything to RAID 10. It really performs better, but it's obviously more expensive.

Another bottleneck was the share of PHP sessions between all load-balanced PHP server. We first used a MySQL/InnoDB-based solution, but it poorly scaled. That's why I had to write specific software : Sharedance http://sharedance.pureftpd.org/ [pureftpd.org]

In a high-traffic infrastructure, my hint would be to use many modest, but specialized servers over one huge mega-fast server that does everything. This is way more scalable. And easier to manage, even from a financial point of view. You can b
Read the rest of this comment...

Share
twitter facebook
- Re:Load balancer + content differentiation (Score:2)
  
  by smitty45 ( 657682 ) writes:
  
  Did you try load balancing Innodb slaves ?
  
  btw, mysql on opterons is quite excellent, but don't even think of using 2.6.xx kernels on AMD64. just fyi, it was pretty awful when we tried it.
- Memcached? (Score:3, Interesting)
  
  by AmVidia HQ ( 572086 ) writes:
  
  Your sharedance software is interesting. Don't know if you are aware of memcached though, (http://www.danga.com/memcached/ [danga.com], by Livejournal guys) and if so did it lack something that prompted you to write your own?
Lessons since '99... (Score:5, Informative)

by xanthan ( 83225 ) writes: on Saturday November 13, 2004 @02:42PM (#10807684)

You don't mention if you're on the applications side of the world or the network, so I'll cover a little about both.

1. If you're on the app side, make friends with the network side and vice versa. To understand web site management and acceleration, you will need to know about both parts. Making peace with the other team is crucial to a successful site.

2. If you are on the app side, start thinking about concurrency from the start. You're going to have not 2-3 users at the same time, but more like hundreds if not thousands. This means that you can't do things like lock up tables and the like in the database. If at all possible write your application so that users don't need to come back to the same server to track their session information. Make sure each request is tracked quickly and easily. Also, differentiate your static content from the dynamic content -- you'll eventually want to cache the static content and life will be easier with static objects being served out of a known location. And please... please, please, please... make sure your app generates clean HTTP headers. Set your cache controls correctly, don't duplicate headers, don't be a smart-ass with your headers. Just use clean headers. ASSUME that there will be proxies between you and the client. ASSUME that you will not be able to control all of them.

3. Don't forget about megaproxies. Depending on the nature of your site, you're going to have a ton of your users coming from a small handful of addresses. (e.g., AOL) While some megaproxies have fixed the issue of a single user coming out of multiple proxy servers, all have not. This means anything that you use for client IP persistence is broken.

4. Client IP addresses... don't assume you have them. Don't assume they represent a unique user. They don't. Many load balancers/web accelerators also need to act as proxy and will replace the client IP address anyway. (Don't stress about logging -- any reasonable one will insert the client IP address in a HTTP header that you can extract like X-Forwarded-For:)

5. Peak load on your web servers. Apache can go fast, scale, blah blah blah... my ass. It's not the web server or operating system that is going to determine your peak performance. It is your application itself. Be prepared to fess up to the reality that your application peak performance is not going to be hundreds or thousands of requests per second unless you go insane with the optimization. (e.g., write your application into the web server and embed the whole thing into the kernel, etc.) Assume you're more likely going to get a few dozen requests/sec per app server. Keep that in mind as you plan server purchases and scaling.

6. HTTP request does not equal TCP connection. Don't assume that. With HTTP multiplexing like the stuff that Netscaler does (web accelerator), you're going to see most of your requests coming out of a small handful of TCP connections. Make sure your application supports that. Even if you don't use a web accelerator, browsers will do that do. Don't cheat and force the connection closed on every HTTP request, your web server will crap.

7. This is related to 6, but don't forget that web connections are very short lived compared to what the original designers of TCP were thinking about. As a result, you're going to run into cases where you run out of epheral ports (netstat -an will show a ton of ports in TIME_WAIT) even though your machine is idle. This is why HTTP Multiplexing is important -- you don't want a lot of connection churn. Yes, you can tweak your OS settings so that TIME_WAIT expires quickly, but that isn't going to help your overall performance. (TCP connection setup/teardown is a huge burden on a HTTP request that may only span a few packets...)

8. Look into HTTP acceleration technology from the get go. I've used several different brands and I've found Netscaler's to be the best. They are crazy fast and capable boxes that have a ton of features (like the HTTP multiplexing, SSL acceleration, HTTP compression, web
Read the rest of this comment...

Share
twitter facebook
- Re:Lessons since '99... (Score:2)
  
  by smitty45 ( 657682 ) writes:
  
  All excellent points, and I love Netscalers as well. I will add to..."As a result, you're going to run into cases where you run out of epheral ports" ...but remember, this is only a port limit per *IP*. Adding more subnet IPs to the other side of the connection (like on a Netscaler) can help HUGELY to break down that limit.
  
  and to "6. HTTP request does not equal TCP connection. Don't assume that."
  
  I will add: HTTP multiplexing (Netscaler or not) will just plainly not work without Keep-Alive connections bu
  - Re:Lessons since '99... (Score:2)
    
    by smitty45 ( 657682 ) writes:
    
    oh wait:
    
    10. Do #9 until replication becomes too much. Then, federate your databases and stop load balancing them. Build some smarts into your frontends so they can direct traffic to the right db, which are all masters at this point, slaves are only for backup.
to get background info...ask vendors (Score:2)

by museumpeace ( 735109 ) writes:

As others have suggested, you may already be in over your head. But even to pick a consultant, you need to have a rough idea of the options and their cost/benefit trade-offs. The large vendors: IBM, Sun, Microsoft etc and some second-tier vendors such as Netscape and BEA have overviews of the application and architecture of their products on their respective web sites...that will cost you a day of reading and give you a headache from reading conflicting claims of superiority [mindcraft.com] BUT, you will know the jargon
Scaling a high traffic site (Score:2, Insightful)

by DFossmeister ( 186254 ) writes:

First, although 300 locations with a few users each may sound like a high-volume site, it is not. I don't want to burst any bubbles, but it simple is not high-traffic in today's world. I work with large e-tailing sites that get 200,000 unique visitors per hour.

The first step is to determine the type of load you will receive. Is it call-center type traffic, where they will have dedicated staff accessing the application, or will it be more like Internet traffic that comes in waves when it feels like it?
Tips (Score:2)

by NerveGas ( 168686 ) writes:

First, separate the web serving from the database server, put them on different machines.

Second, web serving is easily (and massively) scalable. Buy a file server with a good RAID array (and backups!), then a bunch of front-end web servers. Start with round-robbing DNS for load-balancing. If you want, move to some LVS-based load balancers for failover, etc..

Third, database clustering is not an easy thing to do - if your database server doesn't offer good, scalable clustering, then you just have to buy
Hire a consultant (Score:2)

by nurb432 ( 527695 ) writes:

Real networking isnt something you pick up overnight, might as well spend the money upfront and do it right.
I thought I'd add my 2cents (Score:2)

by ResQuad ( 243184 ) writes:

I work at a company thats a hosted CRM. My first an most important sugestion is: MAKE SURE THE APPLICATION IS EFFECIENT. If you app does alot of unessisary crap in the database or what not, you are screwed no matter what you use.

Past that, The sugestions for a hardware load balancers are right on par. You have (for example) 2 machines hosting the same application, behind a hardware LB. You have a nice SQL machine, you are set. Add machines as needed (the SQL part gets tricky though).
Large Scale Infrastructures (Score:3, Informative)

by Floody ( 153869 ) writes: on Saturday November 13, 2004 @05:36PM (#10808700)

1. Foundry ServerIrons at the front-end layer.

2. Front-end proxying/caching. Not just static content either, take dynamic content that need not be updated often and put it on the front-end in a fashion that does not require over-weight httpds (i.e. no mod_perl). Use session affinity tricks on the front end (such as mod_rewrite with cookies). squid for caching as necessary.

3. Back-end heavy servers should have a maximum amount of memory, and obviously lower maxclients.

4. NetApp storage on the back-end, scaled as needed.

5. http://www.backhand.org/mod_log_spread/ [backhand.org]

6. Well designed network topology and aggressive switch partitioning: hint, use vlans and minimize trunking.

Share
twitter facebook
an answer... (Score:3, Informative)

by drasfr ( 219085 ) writes: <revedemoi&gmail,com> on Saturday November 13, 2004 @08:35PM (#10809751)

yes, i Know you asked how can find how to setup a high traffic architecture. I think you came at the right place on Slashdot.

Although I have never seen really many documentation online, I have setup many architecures in the past, and still able to handle very high volume traffic :10s of millions of pages views a day, most of them dynamic.

It really all depends on ONE factor: money.

I will give 2 choices, I have implemented both:
Appropriate budget:
Frontends/Load Balancing: We had a pair of of Big/IPs with SSL accelerator, configured for redundancy, that rocks.
behind them, we had a clustered NetApp F840, with gigabit interfaces, on a gigabit networks.
Frontends: We were running Apache, with all the binaries, config, webpages, perl scrips located on the shared filesystem. Each machine was a dual CPU, 2GB memory, 2x36GB scsi drive, we had 26 of them, double the capacity really needed so if a machine or two were to go down during the night, no need to worry and it would wait for next day, business hours, great for peaks as well.

As a database backup we had an Oracle Cluster on a SUN 6650, 14CPUs, 14GB of Memory, connected on an EMC storage. One machine was configured as the master, the other as a standby with the possibility to take down the primary and mount it's filesystem directly from the SAN. Pretty much all the config was on the SAN, on different volumes, and could be mounted on either machine. Each volumes had a copy and an hourly update in case of failure of the primary volume.

Now for a more realistic scenario with low budget:
- Load Balancer: Get 2 Linux machines, I'd suggest machines with 2GB Memory, 2x36GB Disk, 2x3Ghz CPUs, with Linux Virtual Server. (http://www.linuxvirtualserver.org/)
- Build 2 Linux machines that you would use as NFS Server (If you are short in budget also could use them as Oracle or Mysql Server), configure them with 2 external scsi arrays that can be mounted on either machine. If you are really short in budget, don't use external array, but big enough internal drives, and for example rsync to replicate the data between the 2 of them. (I would personnally use LVM, establish a snapshot copy on the master and do a rsync of this snapshot. If you have a database on it, put it in quiet (hot-backup) mode while you do the snapshot
).
- FrontEnds: Get a couples of machines with 2 CPUs, 2GB memory for example, 2x36Gb drives. Configure them to mount the filesystem from the NFS servers.

- Database, it is budget, use Mysql (or Oracle this would work), configure one machine as Master, the other as read-only. Have all your machines interrogating either machines for read-only requests, and going to the master only for write requests.

If you need more power: configure more frontends, configure more read-only slave database server. Now if you are write intensive, more than reads, on the database, then it becomes a bit more complicated.

if you want to know more, contact me off-list.

Share
twitter facebook
Here's what we do in state government: (Score:3, Interesting)

by crazyphilman ( 609923 ) writes: on Sunday November 14, 2004 @12:41AM (#10810961) Journal

Our sites obviously have to serve millions of people, so they have to be pretty robust. I can't tell you every detail because we're all pretty specialized and don't get to see everything ourselves, but from working with our database guys and network guys, I do have a pretty good 10,000 foot picture of how things work. Here's a general sense of what you'll have to do to really be robust:

1. Your database gets its own server, as powerful as you can afford. If you're a really big site, you're using Oracle, and really, a database cluster rather than a single server. IMPORTANT: Only the DBA can touch the production databases. Developers MUST submit requests to the DBA for any changes. Nobody should be touching a production database from their desktop, other than maybe being able to run queries to check data, and they use a separate, limited login for that. Changes are done by the DBA ONLY.

2. You put a firewall between the database server and your middleware server. The firewall is a dedicated device, and you're careful about the ports you leave open. Only the middleware server and DBA workstations on your intranet can touch the database.

3. Your middleware server(s) are as powerful as you can afford (this will be a theme here) and ONLY run middleware. This means, business rule processing. Everything that touches a database in any way MUST come through middleware -- no direct connections, ever. IMPORTANT: developers don't directly install middleware; network staff only.

4. A firewall (again, dedicated device) between the middleware server and the web server. Only the web server (and network staff workstations on your intranet) are allowed to touch the middleware server.

5. A set of web servers for your websites, as powerful as you can afford (hate to keep repeating this, but if you skimp you'll end up screwing yourself down the road). IMPORTANT: Developers should NEVER have access to production web servers; they should give their stuff to the networking staff when it's ready. Also, if you're doing FTP and such, put it on a separate server.

6. A firewall outside your web server, which only permits port 80 traffic and is twice as paranoid as your other firewalls. Log everything "funny".

In general, you'll have to hire some people: someone really good at security, to configure all your firewalls, someone good at setting up load-balancing to set up all three layers, someone to help you set up a good development environment...

One thing lots of people overlook: You'll want a "sandbox", i.e. a dedicated set of test database, middleware server, and web server that your developers can play with when working on their sites. You'll also want to set up a UAT (User Acceptance Test) environment similar to your sandbox, so projects can be moved to UAT for testing before being rolled out to production. You can't do UAT on a sandbox; sandboxes are constantly changing. You need a stable environment for UAT.

Anyway... Hope that helps, it's just advice, you know? Not all of it directly addresses high-volume sites, some of it is about site stability and security, but I think it all ties in together. If your site is being changed by developers, it won't be stable... And if you don't have a paranoid firewall setup, it won't be secure. A lot of webmasters would consider this layout to be (putting it politely) seriously paranoid, but hell, just because you're paranoid doesn't mean they're not out to get you. And, anyway, like I said, high volume does imply these other considerations...

Good luck!

Share
twitter facebook
See how Wikipedia does it on a shoestring (Score:4, Insightful)

by gtoomey ( 528943 ) writes: on Sunday November 14, 2004 @01:38AM (#10811222)

Look how Wikipedia organises its cluster [wikimedia.org] on a shoestring budget.
- Over 750 requests/second [wikimedia.org] on 29 - servers average >20 requests/second each (Yes I know some are not http servers) . Compare that to some commercial solutions.
- commodity hardware
- squid for cacheing/load balancing, feeding Apache
- multi-tiered archtieture
- dual Opteron for the master mysql database

Share
twitter facebook
- Re:weekend gmail invites (Score:4, Informative)
  
  by Japong ( 793982 ) writes: on Saturday November 13, 2004 @01:27PM (#10807327)
  
  These are very obvious links to a shock site, ignore them and mod parent down. Seriously, AC, don't you get tired of this?
  
  Parent Share
  twitter facebook
  - Re:weekend gmail invites (Score:2)
    
    by Sique ( 173459 ) writes:
    
    You should'nt have clicked on the link, just copied the URL into your browser. Sheesh. Even the simplest gags still work around her.
- Re:ahhh ask slashdot... (Score:2, Informative)
  
  by StillAnonymous ( 595680 ) writes:
  
  It's also a good opportunity for people to learn from other's experiences. Christ, man, I don't see why people have to hoard their knowledge. What kind of example does that set?
- Re:The obvious answer is: (Score:2, Interesting)
  
  by gal1264 ( 470552 ) writes:
  
  Everyone has to start somewhere right?
  
  What's your background. There's lots of different ways to solve every problem. I think it's much more of an assessment of what kind of problems you're good at solving. If you think you can conceptualize what your system needs to do, and evaluate different components objectively do it.
  
  Coming from someone who's implemented some massive testing infrastrucutres and custom tools, worked on computational biology frameworks, as well as well as currently working on fault tole
- Re:The obvious answer is: (Score:3, Informative)
  
  by twigles ( 756194 ) writes:
  
  Jesus what an asshole this parent poster is. Someone asks for advice and this arrogant guy calls them incompetent for not being born with the knowledge. Someone please mod him troll; this is exactly why non-techies think we're all arrogant.
- Re:This is the type of question (Score:2, Insightful)
  
  by deesine ( 722173 ) writes:
  
  You've got to kidding, right?!
  
  This guy's asking how he might setup a race car for the NASCAR circuit. And you're telling him; forget about $big block engines, forget about $super injected fuel & exhaust flow, forget about $blue-printing the motor...you can get the same performance from your Escort, just press harder on the gas pedal!
  
  Thanks for the laugh! LOL
  
  -d
- Re:Lots of factors to consider, primarily budget. (Score:2)
  
  by kylegordon ( 159137 ) writes:
  
  The 1750's are about 15k each.
  Remind me not to use your supplier... From dell.com a 1750 starts at $949...

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

A Beowolf Cluster of Course (Score:5, Funny)

Ask a Pr0n serving company (Score:5, Insightful)

Re:Ask a Pr0n serving company (Score:4, Funny)

Re:Ask a Pr0n serving company (Score:5, Informative)

Re:Ask a Pr0n serving company (Score:2)

Re:most porn companies are clueless (Score:2, Interesting)

Re:most porn companies are clueless (Score:2)

Re:most porn companies are clueless (Score:2, Funny)

Re:most porn companies are clueless (Score:4, Funny)

(File throughput) != (database connectivity) (Score:4, Informative)

Re:(File throughput) != (database connectivity) (Score:2)

Re:Ask a Pr0n serving company (Score:5, Interesting)

Oh yeah... (Score:2)

Re:Ask a Pr0n serving company (Score:2, Insightful)

Re:Ask a Pr0n serving company (Score:2, Informative)

Adult hosts (Score:2)

Re:Ask a Pr0n serving company (Score:2)

Post a URL (Score:5, Funny)

Simple Flow chart for learning (Score:5, Funny)

Re:Simple Flow chart for learning (Score:2)

Test using Slashdot itself! (Score:3, Insightful)

Re:Test using Slashdot itself! (Score:2)

Re:Test using Slashdot itself! (Score:2)

hmm (Score:2, Informative)

PLEASE (Score:5, Funny)

Re:PLEASE (Score:2)

Do the math (Score:5, Insightful)

Re:Do the math (Score:2)

Re:Do the math (Score:4, Informative)

Re:Do the math (Score:4, Informative)

Re:Do the math (Score:2)

How beefy? (Score:2)

Apache Benchmark is your friend (Score:5, Informative)

Re:Apache Benchmark is your friend (Score:3, Interesting)

Re:Apache Benchmark is your friend (Score:2)

Re:Apache Benchmark is your friend (Score:2)

Look at the other high load websites (Score:4, Informative)

Re:Look at the other high load websites (Score:2)

look at siege, httperf, and autobench instead (Score:3, Informative)

Dear Slashdot... (Score:5, Insightful)

Re:Dear Slashdot... (Score:2, Informative)

How to do it with little/no budget (Score:5, Informative)

Re:How to do it with little/no budget (Score:2, Interesting)

Re:How to do it with little/no budget (Score:2)

Interesting.... (Score:2, Interesting)

RE: Using F5's to encrypt data (Score:3, Informative)

Two scenarios: (Score:5, Insightful)

no. 1 cause of downtime (Score:2, Interesting)

Re:no. 1 cause of downtime (Score:2)

Depends (Score:4, Insightful)

Re:Depends (Score:2)

A few basic things... (Score:5, Informative)

Can you qualify some of this stuff? (Score:3, Insightful)

Re:Can you qualify some of this stuff? (Score:4, Informative)

Re:Can you qualify some of this stuff? (Score:2)

Re:Can you qualify some of this stuff? (Score:2)

Re:Can you qualify some of this stuff? (Score:2)

Re:Can you qualify some of this stuff? (Score:2)

Re:Can you qualify some of this stuff? (Score:2)

Re:Can you qualify some of this stuff? (Score:2)

Re:Can you qualify some of this stuff? (Score:2)

Re:A few basic things... (Score:2)

Mischevious piggybacking (Score:2)

high traffic system (Score:3, Interesting)

Re:high traffic system (Score:2)

Test and define your usage (Score:2, Informative)

Suggestions (Score:2)

Read a lot, ask a lot of questions (Score:3, Informative)

Performance planning and scalability (Score:2, Informative)

Experience is key (Score:5, Informative)

Re:Experience is key (Score:3, Funny)

Re:Experience is key (Score:2)

Re:Experience is key (Score:2)

Follow the course! (Score:2)

Money and time (Score:2)

ipvs, LAMP (Score:2)

...as always, it depends... (Score:4, Insightful)

ApacheCon.com - learn from the experts (Score:3, Informative)

Load balancer + content differentiation (Score:5, Informative)

Re:Load balancer + content differentiation (Score:2)