
Learning High-Availability Server-Side Development?

fmoidu writes "I am a developer at a mid-size company, and I work primarily on internal applications. The users of our apps are business professionals who are forced to use them, so they are more tolerant of access times being a second or two slower than they could be. Our apps' total potential user base is about 60,000 people, although we normally see only 60-90 concurrent users during peak usage. The work being done is generally straightforward reads or updates that typically hit two or three DB tables per transaction. So this isn't a complicated site, and the usage is pretty low. The problems we address are typically related to maintainability and dealing with fickle users. From what I have read in industry papers and from conversations with friends, the apps I have worked on just don't face scaling issues. Our maximum load during typical usage is far below the maximum potential load of the system, so we never spend time considering what would happen under extreme load. What papers or projects are available for an engineer who wants to learn to work in a high-availability environment but isn't in one?"
  • Re:2 words (Score:3, Interesting)

    by teknopurge ( 199509 ) on Thursday August 23, 2007 @11:19AM (#20330859) Homepage
    I just finished reading that paper and was left with the impression that I had wasted 10 minutes. I could not find a single insightful part of their algorithm - in fact, I can enumerate several 'prior art' occurrences from my CPSC 102 class during my undergrad; all were lab assignments.

    I did, however, find this sentence disturbing:

    However, given that there is only a single master, its failure is unlikely; therefore our current implementation aborts the MapReduce computation if the master fails.
    Huh? So, because there is only one master it is unlikely to fail? This job takes hours to run. This is like saying that if you have one web server, it is unlikely to fail. I can't help but think this is a logical fallacy. I don't care how simple or complicated a job is - a single point of failure is a single point of failure.
  • by Applekid ( 993327 ) on Thursday August 23, 2007 @11:20AM (#20330873)
    Our in-house applications don't get built around performance at all (personally I find it disappointing but I don't write the rules... yet). We generally scale outwards: replicated databases, load distribution systems, etc.

    Many of the code guidelines we have established are there to aid in this. Use transactions, don't lock tables, use stored procedures and views for anything complicated, things like that (a rough sketch of the transaction guideline follows this comment).

    I guess my answer is that we delegate it to the server group or the DBA group and let them deal with it, which I suppose means the admins there are pretty good at what they're doing. :)
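    To make the "use transactions, don't lock tables" guideline concrete, here is a minimal JDBC sketch (the accounts table and its columns are invented for illustration): two updates either both commit or both roll back, with no explicit table lock.

        import java.sql.Connection;
        import java.sql.PreparedStatement;
        import java.sql.SQLException;

        public class TransferExample {
            // Hypothetical example: move an amount between two rows inside
            // one transaction instead of locking the whole table.
            public static void transfer(Connection conn, int fromId, int toId,
                                        int amount) throws SQLException {
                boolean oldAutoCommit = conn.getAutoCommit();
                conn.setAutoCommit(false); // begin the transaction
                try (PreparedStatement debit = conn.prepareStatement(
                         "UPDATE accounts SET balance = balance - ? WHERE id = ?");
                     PreparedStatement credit = conn.prepareStatement(
                         "UPDATE accounts SET balance = balance + ? WHERE id = ?")) {
                    debit.setInt(1, amount);
                    debit.setInt(2, fromId);
                    debit.executeUpdate();
                    credit.setInt(1, amount);
                    credit.setInt(2, toId);
                    credit.executeUpdate();
                    conn.commit(); // both updates succeed or neither does
                } catch (SQLException e) {
                    conn.rollback(); // on any failure, undo the partial work
                    throw e;
                } finally {
                    conn.setAutoCommit(oldAutoCommit);
                }
            }
        }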
  • by DonRoberto ( 1146645 ) on Thursday August 23, 2007 @11:34AM (#20331033)
    Having worked on both academic and enterprise software designed specifically to scale, I've noticed four things that are incredibly important to scalability:

    Languages - I recently saw a multi-million dollar product fail because of performance problems. A large part of it was that they wanted to build performance-critical enterprise server software but wrote it mostly in a language that emphasized abstraction over performance and was designed for portability, not speed. The language, of course, was Java. Before I get flamed about Java: the issue was not Java itself and alone, but it was partly the use of a language not designed for a key project objective, performance. The abstraction, I would argue, hurt the project more than all the other performance issues associated with bytecode. Relevant books on this subject are everywhere.

    Libraries - Using other people's code (e.g. search software, DB apps, etc.) will always introduce scalability weaknesses and performance costs in expected and unexpected places. Haphazardly choosing what software to get in bed with can come back to bite you later. It is an occupational hazard, and each database product, framework, and even hardware configuration has its own pitfalls. Many IT books on enterprise performance, and even whitepapers and academic papers, can provide more information.

    Abstraction - There is no free lunch. When you make things easier to code, you typically incur a performance penalty somewhere. In C++, Java, and most other high-level languages, modularity and abstraction eventually add so much hidden machinery that developers either lose track of the subtle costs everything is incurring, or are suddenly put in a position where they can't go back and rewrite everything. Sometimes it is better to write a clean, low-level API and limit the abstraction eye candy, or it will come back to bite you. On the other hand, sometimes a poor low-level API is worse than a cleanly abstracted high-level API. In practice, though, few complex, performance-oriented systems are architected in very high-level languages. I have seen few books on this subject; it is pure software engineering. Design patterns might help, however.

    Audience - Both the clientele and the developer audience make a big difference. Give an idiot a hammer with no instructions... and you get the point. Make sure your developers know what they're doing and what the priorities are, and design your interfaces and manuals with scalability in mind. Why have a script perform a hundred macro operations when a well-designed API could provide better performance with a single call? (A sketch of that idea follows this comment.) This entails both HCI and project development experience.

    Wish I could suggest more books, but there's just too many.
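    To make the "one well-designed call instead of a hundred macro operations" point concrete, here is a rough JDBC sketch (the events table is invented) of exposing a batch API so callers pay one database round trip instead of N:

        import java.sql.Connection;
        import java.sql.PreparedStatement;
        import java.sql.SQLException;
        import java.util.List;

        public class BatchInsertExample {
            // Hypothetical "events" table; the point is the batching
            // pattern, not the schema.
            public static void insertAll(Connection conn, List<String> messages)
                    throws SQLException {
                try (PreparedStatement ps = conn.prepareStatement(
                        "INSERT INTO events (message) VALUES (?)")) {
                    for (String msg : messages) {
                        ps.setString(1, msg);
                        ps.addBatch();     // queue locally...
                    }
                    ps.executeBatch();     // ...then send in one round trip
                }
            }
        }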
  • Statelessness (Score:4, Interesting)

    by tweek ( 18111 ) on Thursday August 23, 2007 @12:13PM (#20331559) Homepage Journal
    I don't know if anyone has mentioned it, but the key to a web application scaling horizontally is statelessness. It's much easier to throw another server behind the load balancer than it is to upgrade the capacity of one. I've never been a fan of sticky sessions myself. This requires a different approach to development in terms of memory space and whatnot: with a horizontally scaled front tier, you can't guarantee that someone will be talking to the same server on the next request that they were on the previous one. It requires a little more overhead, either to replicate session state across all application servers or to push it down to the database tier by persisting everything. (A sketch of the latter idea follows below.)

    At least that's my opinion.
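    Here is a minimal sketch of what statelessness can look like in servlet code, assuming session state lives in some shared tier. SHARED_STORE below is a made-up stand-in; in practice it would be the database or a shared cache such as memcached:

        import java.io.IOException;
        import java.util.Map;
        import java.util.concurrent.ConcurrentHashMap;
        import javax.servlet.http.Cookie;
        import javax.servlet.http.HttpServlet;
        import javax.servlet.http.HttpServletRequest;
        import javax.servlet.http.HttpServletResponse;

        public class StatelessCartServlet extends HttpServlet {
            // Stand-in for the shared tier. A per-JVM map like this defeats
            // the purpose; a real deployment would read/write the database
            // or a shared cache so ANY app server can handle ANY request.
            private static final Map<String, String> SHARED_STORE =
                    new ConcurrentHashMap<String, String>();

            @Override
            protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                    throws IOException {
                // Identify the user from a cookie rather than from in-memory
                // session state, so the load balancer can route the next
                // request to a different server without losing anything.
                String userId = null;
                Cookie[] cookies = req.getCookies();
                if (cookies != null) {
                    for (Cookie c : cookies) {
                        if ("userId".equals(c.getName())) {
                            userId = c.getValue();
                        }
                    }
                }
                String cart = (userId == null) ? null
                        : SHARED_STORE.get("cart:" + userId);
                resp.getWriter().println(cart == null ? "cart is empty" : cart);
            }
        }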
  • Lots of Options (Score:3, Interesting)

    by curmudgeon99 ( 1040054 ) on Thursday August 23, 2007 @12:34PM (#20331897)

    First of all, excellent question.

    Second: ignore the ass above who said to dump Java. Modern HotSpot JVMs have made Java as fast as, or faster than, C/C++. The guy is not up to date.

    Third: Since this is a web app, are you using an HttpSession/sendRedirect or just a page-to-page RequestDispatcher/forward? As much of a pain as it is, use the RequestDispatcher (see the sketch after this list).

    Fourth: see what your queries are really doing by looking at the explain plan.

    Fifth: add indexes wherever practical.

    Sixth: Use AJAX wherever you can. The response time for an AJAX call is amazing, and it is really not that hard to do Basic AJAX [googlepages.com].

    Seventh: Use JProbe to see where your application is spending its time. You should be bound by the database; anything else is not appropriate.

    Eighth: Based on your JProbe findings, make code changes to, perhaps, keep a frequently-used object from the database in a class (static) variable.

    These are several ideas that you could try. The main thing that experience teaches is this: DON'T optimize and change your code UNTIL you have PROOF of where the slow parts are.
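    For the third point above, this is roughly what the RequestDispatcher approach looks like in a servlet (the /results.jsp path is invented). A forward() stays inside the container on the same request, while sendRedirect() would cost the browser a second round trip:

        import java.io.IOException;
        import javax.servlet.RequestDispatcher;
        import javax.servlet.ServletException;
        import javax.servlet.http.HttpServlet;
        import javax.servlet.http.HttpServletRequest;
        import javax.servlet.http.HttpServletResponse;

        public class SearchServlet extends HttpServlet {
            @Override
            protected void doPost(HttpServletRequest req, HttpServletResponse resp)
                    throws ServletException, IOException {
                req.setAttribute("results", runSearch(req.getParameter("q")));

                // Server-side forward: the JSP renders within the SAME
                // request, and request attributes carry the data.
                // sendRedirect() would instead return a 302 and force the
                // browser into a second round trip, losing the attributes.
                RequestDispatcher rd = req.getRequestDispatcher("/results.jsp");
                rd.forward(req, resp);
            }

            private Object runSearch(String query) {
                return "..."; // stand-in for the real lookup
            }
        }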

  • by fimbulvetr ( 598306 ) on Thursday August 23, 2007 @01:36PM (#20332725)
    The point was that he seemed to consider it so academic and so "well known" that he could just dismiss it without considering it.

    Google seems to have taken this elementary technique and turned it into something that can kick the crap out of an over-engineered solution under the right circumstances. I've read the paper, and assuming it is really used the way they say it is, I can say it does a fantastic job of performing AND staying highly available, based on my personal experience with Gmail, Google Search, Groups, AdWords, Maps, Analytics, etc.

    Fanboy? Maybe, depending on your definition. Impressed? Hell yes.
  • by Kenneth Stephen ( 1950 ) on Thursday August 23, 2007 @05:08PM (#20335687) Journal

    I'm afraid the parent post is an example of not seeing the forest for the trees.

    Application code should never, ever be aware of deployment issues. Making it aware of such things is a sure way to create nightmares when your environment changes. For example, let's say you have to send mail. You could take the option of always talking to localhost under the assumption that your app will always be deployed on a machine with a mail server. But consider the case where the app is taken and deployed in a production environment with a firewall around it, and to send mail you have to talk to another system. Your app breaks. The right way to do this is to externalize the existence of a mail server into some properties/config file that gets updated at deployment time.
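    A minimal Java sketch of that advice (the file name and property key are invented): read the mail host from a properties file at startup instead of hard-coding localhost, so the same build runs unchanged behind a firewall:

        import java.io.FileInputStream;
        import java.io.IOException;
        import java.util.Properties;

        public class MailConfig {
            // Deployment-specific values live in a file that ops can edit,
            // so the code never has to know where it is deployed.
            public static String mailHost() throws IOException {
                Properties props = new Properties();
                try (FileInputStream in = new FileInputStream("app.properties")) {
                    props.load(in);
                }
                return props.getProperty("mail.smtp.host", "localhost");
            }
        }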

    This is so fundamental that it seems obvious. Let's apply this philosophy to the case at hand: the application should never have to know whether there is a failover server / hot standby / cluster in place or not. It should just assume that it is going to execute a statement, and if there is a failure, the transaction will roll back. Whether the transaction errors out and rolls back depends on the properties of the environment. For example, DB2 can do clustering / HADR (High Availability Disaster Recovery). And on AIX you have server clustering solutions like HACMP, transaction checkpointing, partition mobility, and a whole host of other technologies that can step in so that a database or database server failure does not cause the application to fail.

    If a database server ever introduces an API for making applications aware of failover issues, that's a sure sign the database architects are asleep at the wheel.
