Real World Webserver Price vs. Performance Figures?
Borgoth asks: "At my company we just broke 10 million pageviews per day. We use 5 2-processor 1U off-the-shelf Intel boxes running Apache, Linux, mod_perl, and MySQL. This averages out to about 2 million pageviews per day per server (about 20 million hits/server, including images). Most of our pages have some dynamism using mod_include SSIs, and maybe one pageview in five directly results in a db query. We think we should be pretty happy that we're doing so much with so little, but we don't really have any idea how much horsepower other sites are using in their server farms. So, what sort of webfarms do Slashdot readers maintain, and how does their performance compare?"
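For scale, the daily totals in the question can be turned into average per-second rates. This is a minimal sketch using only the figures quoted above (10 million pageviews a day over 5 servers, about 20 million hits per server); the function names are made up for illustration, and the results are averages over 24 hours, so real peaks will be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_server(total_per_day: float, servers: int) -> float:
    """Split a daily total evenly across a number of servers."""
    return total_per_day / servers


def per_second(per_day: float) -> float:
    """Convert a daily total into an average per-second rate."""
    return per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    pageviews = per_server(10_000_000, 5)  # figures from the question
    hits = 20_000_000                      # hits per server, from the question
    print(f"pageviews/server/day:     {pageviews:,.0f}")
    print(f"avg pageviews/sec/server: {per_second(pageviews):.1f}")
    print(f"avg hits/sec/server:      {per_second(hits):.1f}")
```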
not many comparisons (Score:3, Insightful)
Well I really dont know.. (Score:5, Funny)
Well CT? (Score:2)
If they maintained their own servers.
If they're not already compiling an answer that isn't a flippant troll like this;)
Re:Well CT? (Score:5, Informative)
The answer for slashdot is more complex because we have three groups.
Article/comment servers can handle 200K page views apiece.
Index/All can handle 100K.
Static/XML can take a million per server.
I have a fix going in this week which should raise the Article/Comment number. For index, I am looking at a new system for caching the stories that should increase what the index servers can handle.
My friend's site... (Score:5, Funny)
Re:My friend's site... (Score:3, Funny)
Re:My friend's site... (Score:3, Funny)
Re:My friend's site... (Score:2)
Serving only static files, on a threaded webserver and a decent OS, even a lowly Pentium can saturate a T1 without breaking a sweat. Hell, I remember saturating 155 Mbit/s ATMs (100x faster than a T1) on 1994-vintage DEC Alphas. I would say that most Slashdottings flatten routers and pipes rather than servers.
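The bandwidth point can be checked with a little arithmetic. The sketch below assumes a T1 line rate of 1.544 Mbit/s and an average response size of 10,000 bytes; the response size is an illustrative assumption, not a figure from this thread.

```python
T1_BITS_PER_SEC = 1_544_000     # T1 line rate
ATM_BITS_PER_SEC = 155_000_000  # the 155 Mbit/s figure from the comment


def requests_to_saturate(link_bits_per_sec: float, response_bytes: float) -> float:
    """Responses per second needed to fill a link, ignoring protocol overhead."""
    return link_bits_per_sec / 8 / response_bytes


if __name__ == "__main__":
    size = 10_000  # assumed average response size in bytes
    print(f"T1 full at about {requests_to_saturate(T1_BITS_PER_SEC, size):.1f} responses/sec")
    print(f"155 Mbit/s is about {ATM_BITS_PER_SEC / T1_BITS_PER_SEC:.0f}x a T1")
```

Under those assumptions, a couple of dozen static responses a second is enough to fill the pipe, which is far below what the server itself can produce.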
Re:My friend's site... (Score:2, Funny)
Seriously, though, I'm glad you brought that up. That'll make him feel a whole lot better. Sometimes the "myth of the almighty Slashdotting" can get even the most level-headed webmaster up in arms. Now if only he'd return my phone call...
Re:My friend's site... (Score:1)
Re:My friend's site... (Score:2, Funny)
- rwsorden, May 5, 2003 from Infamous Technology Quotes
Hard to say (Score:5, Insightful)
Was the vehicle a rowboat or a train?"
Every site is different. I don't really care that the servers are 1U; I'd rather know things like how large the database is and whether it sees mostly cached reads or read-write activity. How big is the pipe? What is the CPU speed and RAM size? What is the speed and type of disk? How many bytes are transferred?
Incidentally, a much more important number is peak capacity, i.e. what is your 5-minute peak load? Whatever you can reasonably handle for 5-10 minutes you can probably handle constantly, but a supposedly high-volume site can melt down when it gets flashed up on the morning news or Slashdot.
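One way to get at the 5-minute peak is to slide a 300-second window over the request timestamps from an access log. This is a rough sketch: it assumes the timestamps have already been parsed into seconds and sorted, and the function name is hypothetical.

```python
from collections import deque


def peak_in_window(timestamps, window_seconds=300):
    """Return the largest number of requests seen in any window of the given length.

    timestamps: request times in seconds, sorted ascending.
    Each window covers the interval (t - window_seconds, t].
    """
    window = deque()
    peak = 0
    for t in timestamps:
        window.append(t)
        # Drop requests that have fallen out of the window ending at t.
        while window[0] <= t - window_seconds:
            window.popleft()
        peak = max(peak, len(window))
    return peak


if __name__ == "__main__":
    sample = [0, 100, 200, 299, 300, 1000]  # made-up request times
    print(peak_in_window(sample))
```

Dividing the result by the window length gives a peak requests-per-second figure to compare against the daily average.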
You should move all static content (Score:2, Interesting)
Re:You should move all static content (Score:2)
We use Boa [boa.org], which is a little faster (apparently).
That is if it doesn't randomly decide to fall over, destroy your indexes, or corrupt your data.
*grumble*
You're missing the important stuff (Score:4, Informative)
What is your system load? If it's less than 1, you've got processor power to spare. If it's more than one, you could add more processors IF you think that site response is too slow.
What is the throughput to your disks? Actually benchmark this with vmstat or something like that. If that shows that your disks are constantly maxed you could get more servers to spread the disk activity around, or you could build a faster disk subsystem if you've got a centralized database. Smart architecting helps too. Don't run the database on the same processors that run scripts and serve pages. Use the database load handling features to improve that specific part of the site. See what pages you can generate statically - I doubt that every single page on a site needs to be from the database.
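As a quick way to look at the load figure, here is a small sketch that reads the 1-minute load average and divides it by the number of CPUs. The per-CPU normalisation and the function names are assumptions added for illustration, and the result is only a rough signal: on Linux the load average also counts processes waiting on I/O, so a high value does not always mean the processors are busy.

```python
import os


def load_per_cpu(load_average: float, cpu_count: int) -> float:
    """Load average divided by CPU count; below 1.0 suggests idle CPU time."""
    return load_average / cpu_count


def describe(load_average: float, cpu_count: int) -> str:
    """Turn the per-CPU load into a one-line summary."""
    if load_per_cpu(load_average, cpu_count) < 1.0:
        return "spare capacity"
    return "work is queuing"


if __name__ == "__main__":
    try:
        one_minute = os.getloadavg()[0]  # Unix only
    except (AttributeError, OSError):
        one_minute = None
    if one_minute is not None:
        cpus = os.cpu_count() or 1
        print(f"load {one_minute:.2f} on {cpus} CPUs: {describe(one_minute, cpus)}")
```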
Re:You're missing the important stuff (Score:3, Interesting)
This is not true. System load is the average number of blocked processes. They may be blocked waiting for processor time, but they may also be blocked waiting for a lot of other stuff. So, the 100% usage system load depends on what you are doing with the server. You can have a system keep a load average of 20 and yet show unal
Re:You're missing the important stuff (Score:2)
Depends on what tools you are using. Many (uptime on RedHat for example) exclude processes blocked by I/O.
Re:You're missing the important stuff (Score:2)
I've seen the behaviour I've described across SuSE, Debian and Gentoo systems. Servers with many network connections and lots of disk I/O show high load averages and unallocated CPU time.
My Anecdotal Evidence (Score:5, Informative)
The total outfit is 8 servers, 6 active: 1 DB server with one hot backup (dual P-III 750, 1.5 GB), 4 web servers (~1.1 GHz, 1 GB), 1 uniproc dedicated image server (1 GHz, 1 GB) with a hot backup.
The 4 web servers toss a combined total of about 1.5 million pageloads a day, of which 1.4 mil are dynamically generated using FastCGI/Perl and the others are shtml and stylesheets. A lot of the data that is queried from the DB server can be, and is, cached on the web heads for better performance, so that during peak times the DB server doesn't have to do much more than 80 queries/sec. The image server, using stock Apache 1.3, does something like 3m serves a day without much sweat since it's all static content.
All told that works out to each web server doing something like 375,000 pageviews a day. I don't have a barometer of whether that's good or not, but honestly I worry more about bandwidth than computrons.
I think you should be pretty happy with what you're doing. I don't know the current figures, but last September Slashdot was doing 2.4m pageviews a day with ~10 web heads (as gleaned from 'Taco's journal). Understand that it's not an apples-to-apples comparison, since I guess you're serving more static content while Slashdot (and my site) are by and large dynamic.
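The web-head caching mentioned in this comment could look roughly like the sketch below: a small in-process cache that keeps a query result for a fixed time before asking the database again. The class name, the `loader` callback and the 30-second lifetime are all hypothetical; the comment does not say how the cache was actually built.

```python
import time


class TTLCache:
    """Keep loaded values in memory for ttl_seconds, then reload on next use."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        """Return the cached value for key, calling loader(key) only on a miss."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader(key)  # e.g. a function that runs the DB query
        self._store[key] = (now + self._ttl, value)
        return value


if __name__ == "__main__":
    cache = TTLCache(ttl_seconds=30)
    fake_query = lambda key: f"row for {key}"  # stand-in for a real DB call
    print(cache.get_or_load("story:1", fake_query))
    print(cache.get_or_load("story:1", fake_query))  # served from memory
```

The trade-off is staleness: a longer lifetime means fewer queries but older data on the page.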
DB or not DB? (Score:4, Insightful)
You neglected to mention what DBMS you use. Or is it a given nowadays that everybody uses MySQL?
Which is my cue for my usual anti-MySQL flame. Except that it's old, I'm tired of doing it, you've all heard it. Still, I'd like to see some serious benchmarks comparing MySQL with PostgreSQL, Firebird, and Berkeley DB. With attention to realistic web-style queries, scalability and (except for Berkeley DB, of course) complex queries.
Re:DB or not DB? (Score:1)
I can flame MySQL with the best of them, but it's still the one I choose because it's the one that sucks the least for what I want it to do. I have yet to find a DB engine that does not b
Re:DB or not DB? (Score:2)
Re:DB or not DB? (Score:1)
Re:DB or not DB? (Score:2)
Re:DB or not DB? (Score:1)
How would you know, if you haven't touched it in three years?
Re:DB or not DB? (Score:2)
If Postgresql was corrupted, it was likely running on a hard drive with bad sectors or in a machine with bad memory. The fact that MySQL doesn't tax your machine as hard makes it more likely that postgresql will show these errors.
I've been running Postgresql for 4 years, and have
Re:DB or not DB? (Score:1)
Create two tables, one innodb, the other myisam. Run a transaction against both of them. Roll it back halfway through.
MySQL doesn't tell you that you can't run a transaction on a MyISAM table, it doesn't tell you it didn't roll back the MyISAM table, in fact, it happily acts like it got the whole thing right, and rolled it back. Except the changes you made
Re:DB or not DB? (Score:2)
Re:DB or not DB? (Score:2)
MySQL is missing so many key features that benchmarking MySQL against Postgresql is useless.
How fast is MySQL when it has check constraints on data? No one knows, since it doesn't have them.
How fast is MySQL when you use a complex sub select? No one knows, it can't do more than a few common cases yet.
How fast is MySQL when you use stored procedures? Triggers? custom data types? Unions?
None of those things are there. WHEN MySQL gets around to havin
Re:DB or not DB? (Score:2)
You compare cargo capacity and speed, of course. This would then help you choose based on the relative importance of these factors for your application. Though I suspect most people would choose some kind of compromise, such as a semi truck.
What's "key" to you is not "key" to everybody. Thousands of webmasters claim that MySQL has the features they need. Maybe
Re:DB or not DB? (Score:2)
It may be hard to believe, but you can do more with a database than build a web site
MySQL is a great backend data store for web sites. However, it is NOT ACID compliant, and shouldn't be compared against an ACID compliant database.
Postgresql uses Write ahead logging, and you can pull the power plug on the machine in the middle of 1000 concurrent
Re:DB or not DB? (Score:2)
On the other hand, you seem to be reading a lot into my use of the word "benchmark". Perhaps you're assuming I'm like all those marketroid drones who use bogus benchmarks to "prove" that a particular product is "superior". In real life, benchmarks only prove that a product does one particular thing in one particular circumstance better. Benchmarks have their legitimate uses, but only if you bear in mind thei
Re:DB or not DB? (Score:2)
select enum from table except (select enum from table2) kinda stuff, and mysql just can't do it.
The part about maintaining two databases is valid, but only so far. The biggest costs of running Oracle are license fees and maintenance. MySQL doesn't really need a whole lot of maintenance, since it's more or less self maintaining,
Re:DB or not DB? (Score:2)
Many large companies have site licenses for enterprise software, including whatever DBMS is standard. Maybe they could negotiate smaller fees by saying, "we use MySQL for the small stuff!" but I doubt it!
Except that you can fake it on the client side by doing multiple queries and merging the data. Yeah, that's painfully ineffi
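Faking an EXCEPT on the client, as described here, might be sketched like this: run the two queries separately and take the difference in application code. The example uses an in-memory SQLite database only so that it runs on its own; the table names `t1` and `t2`, the column `val` and the helper names are hypothetical, and against MySQL the same merge would sit behind whatever client library is in use.

```python
import sqlite3


def except_on_client(conn, left_sql, right_sql):
    """Emulate `left EXCEPT right` with two queries merged in the client."""
    left = [row[0] for row in conn.execute(left_sql)]
    right = {row[0] for row in conn.execute(right_sql)}
    seen = set()
    result = []
    for value in left:
        # EXCEPT returns distinct rows, so skip duplicates as well as matches.
        if value not in right and value not in seen:
            seen.add(value)
            result.append(value)
    return result


def demo_connection():
    """Build a tiny in-memory database to try the merge against."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (val TEXT)")
    conn.execute("CREATE TABLE t2 (val TEXT)")
    conn.executemany("INSERT INTO t1 VALUES (?)", [("a",), ("b",), ("b",), ("c",)])
    conn.executemany("INSERT INTO t2 VALUES (?)", [("b",)])
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(except_on_client(conn, "SELECT val FROM t1", "SELECT val FROM t2"))
```

The cost is visible in the code: both result sets cross the network and sit in client memory before anything is filtered.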
CORRECTION (Score:2)
Try this guy (Score:5, Informative)
Frankly, I don't think that even Slashdot gets as many page views per day as you do.
My company (Score:3, Informative)
Sorry, no FAQ for that! (Score:5, Insightful)
You have a website that has its needs. I can't imagine what kind of application you are using, how much memory it needs, whether it is processor intensive or disk intensive, or both. Depending on how your website works, there are a variety of solutions available. One solution to one problem might actually cause more problems for you if applied inappropriately.
It might make a lot of sense to consolidate the database onto an advanced server -- with 2 procs, RAID SCSI drives, and a fair amount of memory. It might make a lot of sense to get cheaper boxes with more memory and only one processor to run the web servers. Perhaps you can mount them all off of one giant NFS file server, and have the data that the web servers need held in a cache on the web server. It might make a lot of sense to go talk to IBM and Sun and see what they have to offer as well. It might also make a lot of sense to redesign the way your web application works to reduce the load.
But no one can tell you the right way to do it, because your situation is unique. No one can even give you a good estimate of cost. Your best bet if you are truly lost is to hire someone to analyze your code, your servers, and your needs, and come up with a plan. Those guys cost a bit of money, and finding a good one is near impossible. You're better off studying up on what your website really needs and experimenting with possible solutions.
This is where you start to realize why web people can earn up to 6 digits. We don't just design web sites or program applications. We have to make sure they scale as well.
Using mod_gzip? (Score:5, Informative)
Re:Using mod_gzip? (Score:2, Insightful)
Re:Using mod_gzip? (Score:2, Interesting)
Re:Using mod_gzip? (Score:2, Informative)
Re:Using mod_gzip? (Score:1)
.com ebay bargains and party pics (Score:1, Offtopic)
Offtopic -2, Lovely Ladies +5
Re:.com ebay bargains and party pics (Score:2)
Our stats (Score:4, Informative)
We are comfortably serving 2.5M dynamically generated pageviews every month across 3 webheads, 1 software load balancer and two large DB servers. This is all mod_perl work here. Last I looked we were doing about 1.5TB/month in bandwidth from these dynamic pages.
Webhead data (currently 3, adding 2 more soon):
2x 1.67 GHz Athlon
3 GB RAM / 18 GB SCSI disk (only used for logs, content is read over NFS)
LB data (we're moving this to a Cisco CSS 11050):
1x 1.4 GHz PIII
2 GB RAM / disk unimportant, it's never touched.
Software load balancer: Pound, quite an amazing piece of software.
DB server (one live, one hot-spare):
4x 1.6 GHz Xeon (PowerEdge 6650)
4 GB RAM / big ass disks and a 40 GB database
MySQL currently sees about 500-600 queries per second on the DB. We need to implement more server-side caching, though: we are seeing an alarming 54% query cache hit rate (4.0.12).
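A query cache hit rate like the one quoted can be worked out from two server counters. The sketch below assumes the common definition, hits divided by hits plus SELECTs that had to be executed, using the `Qcache_hits` and `Com_select` values from `SHOW STATUS`; whether the 54% above was calculated this way is an assumption, and the counter values in the example are made up.

```python
def query_cache_hit_rate(qcache_hits: int, com_select: int) -> float:
    """Fraction of SELECTs answered from the query cache.

    qcache_hits: value of the Qcache_hits status counter.
    com_select:  value of the Com_select status counter (SELECTs actually run).
    """
    total = qcache_hits + com_select
    return qcache_hits / total if total else 0.0


if __name__ == "__main__":
    # Illustrative counter values only.
    rate = query_cache_hit_rate(qcache_hits=5_400_000, com_select=4_600_000)
    print(f"query cache hit rate: {rate:.0%}")
```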
One thing I'm looking at is doing less computation on the forward-facing webservers and instead using SOAP to build the page components from a separate cluster of application servers. Preliminary testing is promising.
at a previous job (Score:2, Informative)
What are you running? (Score:3, Interesting)
Using a web server which pre-forks (for example, Apache 1.3.x) is probably the best way to dramatically reduce performance and scalability in most situations. The sheer number of processes under high load makes most schedulers crap themselves.
Multithreading (Apache 2.x, for example) can greatly improve performance and scalability, as can single-process, single-threaded servers that multiplex non-blocking I/O, such as thttpd, Boa or Zeus.
Once one has selected a server which works efficiently for their content and fine-tuned their OS, one can move on to actual processing power and system throughput.
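To make the single-process, single-threaded model concrete, here is a toy sketch of a multiplexing server built on Python's standard `selectors` module. It is not how thttpd, Boa or Zeus are implemented; the fixed response and all function names are made up, and a real server would also have to handle partial reads and writes.

```python
import selectors
import socket

RESPONSE = b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok"


def handle_client(sel, conn):
    """Read a request from a ready socket, answer it, and close."""
    data = conn.recv(4096)
    if data:
        conn.sendall(RESPONSE)  # small enough to fit in the send buffer here
    sel.unregister(conn)
    conn.close()


def accept_client(sel, listener):
    """Accept a new connection and watch it for readability."""
    conn, _addr = listener.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, handle_client)


def run_once(sel, timeout=1.0):
    """Dispatch every socket that is ready right now; return how many."""
    events = sel.select(timeout)
    for key, _mask in events:
        key.data(sel, key.fileobj)  # key.data holds the callback
    return len(events)


def make_server(host="127.0.0.1", port=0):
    """Create a non-blocking listener registered with a fresh selector."""
    listener = socket.socket()
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(128)
    listener.setblocking(False)
    sel = selectors.DefaultSelector()
    sel.register(listener, selectors.EVENT_READ, accept_client)
    return sel, listener
```

A single loop calling `run_once` serves every connection from one process, so the scheduler sees one runnable task however many clients are connected.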
Re:What are you running? (Score:2, Informative)
1. Linux maps threads to processes so you get a mass of processes anyway.
2. If you want to run things that are not thread-safe, like PHP, you have to pre-fork. In fact, PHP's web site says not to run PHP with Apache 2.x on a UNIX-like OS on any production web site, because most libs are not thread-safe. That means mod_perl and mod_* are going to have the same problems. It may work, it may not; that is not what I want to base my job on.
3. Single process, single
Consider the whole picture.. (Score:4, Insightful)
For example, I had a site that I ran for a while that was fairly poorly built from an application perspective. However, the client had prepped a flash load (ie: a bursty, concentrated load) for a specific time period.. and I had about a month to prepare. The problem was that we couldn't rewrite the apps part of the site to ease the congestion, nor could we rewrite some apps to be distributed to multiple servers. (They stored state on the server..)
So, I brought in a Foundry ServerIron, and used the URL switching to map all static files/items to a pair of Ultra 5 workstations. These had a bunch of memory and had iPlanet Enterprise Server configured with very aggressive caching parameters. For the dynamic content, I also increased any caching parameters available.
(This is high level, but you get the idea. Basically, serve as much out of memory as possible.. other tuning issues.. turn off name resolution obviously.. make sure you aren't I/O bound.. or network bound for that matter.)
The day came around and we served 5 or 6 million hits in two hours or so.. the average load on the servers was around 0.1. In fact, even on the servers with the static content getting lots of hits, there was only really disk activity when access logs were flushed to disk (Every 30 seconds)..
So, don't just think about servers.. consider all options when trying to balance and handle your load.
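The URL switching described above boils down to a routing decision on the request path. The sketch below is only an illustration of that idea in software, not the ServerIron's configuration: the list of static extensions, the backend names and the class name are all hypothetical.

```python
import itertools
import posixpath

# Assumed set of extensions to treat as static content.
STATIC_EXTENSIONS = {".gif", ".jpg", ".png", ".css", ".js", ".html"}


class UrlSwitch:
    """Send static paths to one pool and everything else to another."""

    def __init__(self, static_pool, dynamic_pool):
        # Round-robin within each pool.
        self._static = itertools.cycle(static_pool)
        self._dynamic = itertools.cycle(dynamic_pool)

    def pick_backend(self, url_path):
        """Choose a backend name from the path's file extension."""
        path = url_path.split("?", 1)[0]  # ignore any query string
        extension = posixpath.splitext(path)[1].lower()
        pool = self._static if extension in STATIC_EXTENSIONS else self._dynamic
        return next(pool)


if __name__ == "__main__":
    switch = UrlSwitch(["static-1", "static-2"], ["app-1"])
    for path in ["/img/logo.gif", "/style.css?v=2", "/cgi-bin/order"]:
        print(path, "->", switch.pick_backend(path))
```

Keeping the stateful application on its own backend while the static pool absorbs the bulk of the hits matches the split described in the comment.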