
Choice of Language for Large-Scale Web Apps?

anyon wonders: "PHP is the most popular language for the web. eBay uses ISAPI (C), Google uses C/C++ (search), Java (gmail), and Python. Microsoft uses ASP (what else?). For a small web site, it really doesn't matter. What's your take on language choice for large-scale web applications? Maybe language choice is irrelevant, only good people (developers) matter? If you can get the same good quality people, then what language would you choose? Considering the following factors: performance, scalability, extendibility, cost of development (man-month), availability of libraries, cost of libraries, development tools? Has there been a comprehensive comparison done?"
  • Polyglot (Score:3, Insightful)

    by FTL ( 112112 ) * <slashdot.neil@fraser@name> on Saturday July 30, 2005 @02:16PM (#13202966) Homepage
    What's your take on language choice for large-scale web applications?

    As many as possible. Use PHP for the front end, Perl for input parsing, Euphoria for the graphics, JavaScript on the client-side, Moo for the database and Python for the glue to hold things together.

    Every language has strengths and weaknesses. There is no killer language. A good carpenter has lots of tools and uses the most suitable tool(s) for each task. Likewise a programmer should be skilled in many languages and should pick the most appropriate one for each task. Learn as many programming languages as you can, and when you've done that, learn a few more.

    [The feeling of job security is also rather nice.]

    • Re:Polyglot (Score:2, Informative)

      by l33t.g33k ( 903780 )
      Google uses Java (gmail)

      Not really. They use JavaScript for that, which is quite different.
    • Re:Polyglot (Score:5, Insightful)

      by Tablizer ( 95088 ) on Saturday July 30, 2005 @03:04PM (#13203275) Journal
      As many as possible. Use PHP for the front end, Perl for input parsing, Euphoria for the graphics, JavaScript on the client-side, Moo for the database and Python for the glue to hold things together. Every language has strengths and weaknesses.


      It will just produce a job ad that says:

      Required: 3+ years experience in PHP, Perl, JavaScript, Euphoria, Moo, and Python.

      Then when they can't find any individual to fit the bill (surprise!), they will lobby Congress for more visa workers so that they can hunt the entire globe for the "best and brightest".

      (Hmmmmm. What the hell is "Moo"?)
      • Re:Polyglot (Score:4, Informative)

        by FooAtWFU ( 699187 ) on Saturday July 30, 2005 @04:08PM (#13203647) Homepage
        Last I checked, a MOO was a MUD, Object Oriented. Most MOOs are probably based off the LambdaMOO server, which was initially developed at PARC; the original LambdaMOO is available via your favorite telnet or MOO client at lambda.moo.mud.org port 8888.

        However, I would find such a system to be extremely unsuitable as a general-purpose database.

    • Re:Polyglot (Score:5, Insightful)

      by Lord Ender ( 156273 ) on Saturday July 30, 2005 @04:50PM (#13203872) Homepage
      If I were your boss, I would hire an intern and have him rewrite your apps from scratch with a single, maintainable language. Once he is done, I would hire him for half of what I pay you, then give you the boot. Job security through incompetence?
  • Perl. (Score:5, Funny)

    by Anonymous Coward on Saturday July 30, 2005 @02:16PM (#13202968)
    For everything.
    • Seconded! (Score:3, Informative)

      by XanC ( 644172 )
      It does it all, and it values the most expensive component of software (for all but the biggest Web apps): programmer time.
      • Re:Seconded! (Score:4, Insightful)

        by cahiha ( 873942 ) on Saturday July 30, 2005 @07:26PM (#13204609)
        It does it all, and it values the most expensive component of software (for all but the biggest Web apps): programmer time.

        Programmers also have to debug and maintain that software, and that makes Perl one of the most wasteful languages in terms of programmer time.
    • by LibertineR ( 591918 ) on Saturday July 30, 2005 @04:32PM (#13203785)
      And people wonder why geeks don't get laid.

      It is Saturday, and instead of being out in the sunshine, taking in rays, talking to women, GOING OUTSIDE, here we are, in front of our screens debating about which language to build our web apps with? Can we suck enough?

      Don't bother replying, because when this damn compile is done, I am going outside if it kills me. I won't be here to read any replies, dammit.

  • Perl? (Score:2, Interesting)

    by hanshq.net ( 671857 )
    Perl is also a nice choice. Sites Running mod_perl [apache.org]
  • Who do you have? (Score:3, Interesting)

    by rob_squared ( 821479 ) <{rob} {at} {rob-squared.com}> on Saturday July 30, 2005 @02:18PM (#13202984)
    Smarter people than myself have said it: if the people you have know a certain language, use that. Don't force them to use something else even if it is perceived to be better. Now if you're going out and specifically hiring people for this project, things get a whole lot more touchy-feely and you'll be forced to do much research. But then again, you're probably expecting to do a lot of that anyway.
    • We support PHP, Perl, and Cold Fusion on our web servers and the choice depends on who's coding it and who's going to support it. I've got several people who grew up with HTML and CFML and they like that whole "world", and their comfort level makes it easier for me.
  • WebObjects (Score:5, Interesting)

    by lightningrod220 ( 705243 ) on Saturday July 30, 2005 @02:18PM (#13202987)
    Apple uses WebObjects for its online store and the iTunes store. Consider that those go under a lot of stress. Those seem to be the biggest examples of its use, so I don't know what kind of performance it does in other situations. But for an all-around package, it seems to be pretty good.
    • Re:WebObjects (Score:3, Interesting)

      by Pius II. ( 525191 )
      I started using WO last week, and I have to say, it's great. I was able to go from "I don't know what a database is" to deploying my own Java client for my web page interface in about two hours. Of course, knowing Cocoa, Cocoa Bindings and the corresponding patterns helped a lot.
      BTW, according to the blurb on the (German) Apple home page, other large users of WO include the Deutsche Bank, O2, Consors, Bayer and T-Systems.

      Fuck T-Systems!
  • Java Java Java! (Score:4, Informative)

    by FortKnox ( 169099 ) * on Saturday July 30, 2005 @02:18PM (#13202988) Homepage Journal
    No question about it!

    performance, scalability, extendibility, cost of development (man-month), availability of libraries, cost of libraries, development tools

    Performance? Assembly will give you the best performance, followed by C and C++. None of the three has great support for web apps.
    However, Java is almost exclusively being used for large enterprise websites. It's powerful enough to handle the big jobs, and using the appropriate app server will give you great performance.
    Cost of development is heavy in initial development, but pays for itself in maintenance. Most libraries and APIs are free in java (struts, spring, hibernate, tapestry, etc etc etc...). I'd say they are second to perl in terms of freely available and powerful libraries and APIs.
    Development tools? Just check out the (free!) eclipse platform.

    In my mind there is no question that Java (more specifically J2EE) is the best option for general large scale enterprise applications.
    • Re:Java Java Java! (Score:5, Insightful)

      by Surt ( 22457 ) on Saturday July 30, 2005 @02:32PM (#13203087) Homepage Journal
      Actually, odds are that hand written assembly will underperform compiled c these days. Hiring or training people that can write better assembler than a modern compiler is very very difficult.

      But for web development, Java is generally the right choice for the backend. Lots of competent people available who will require no learning curve. The support tools available for java on the backend are also clearly the best right now, as you pointed out (hibernate etc.). The tools for working in java are also a step ahead of anything else right now (idea and even its slightly retarded younger brother eclipse are both way ahead of the tools for any other language).

      • Code management (Score:3, Insightful)

        by sterno ( 16320 )
        Another aspect of Java for dealing with large sites is that it lends itself to cleaner code and better organization. PHP pages end up being a bunch of pages which means you get UI and business logic all entangled. In java, there's a lot of ways to avoid that mess and make a more organized and more readily maintained system.
        • Hmmm.... (Score:5, Interesting)

          by einhverfr ( 238914 ) <{moc.liamg} {ta} {srevart.sirhc}> on Saturday July 30, 2005 @04:10PM (#13203662) Homepage Journal
          I actually like PHP for large-scale web apps. However, I agree that many PHP programmers do create unmanageable code. This is, however, a programmer issue rather than a language issue.

          I started writing HERMES (a CRM framework/app) in PHP and it is now over 20k lines and when I have time to add enhancements it will grow again. The code is incredibly manageable simply because the complexity of the application meant that I had to divide the code into four main areas (each handled in different sets of files):
          1) Main engine(s)/UI framework
          2) UI generation code/data input screens
          3) UI event handling code
          4) Core object logic.

          This way, if you want to change the user interface, you just change the user interface. System-wide changes get made in one place where screen-specific changes get made somewhere else.

          Everything is relatively well abstracted, so the code is very manageable.
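A rough sketch of the four-way split described above, for readers who want something concrete. It is written in Python rather than PHP, and every name in it (`Contact`, `render_contact_form`, `handle_contact_submit`, `dispatch`) is hypothetical; nothing here is taken from HERMES itself.

```python
# Hypothetical illustration of the four areas; not code from HERMES.
from dataclasses import dataclass
from html import escape


# 4) Core object logic: knows nothing about screens or requests.
@dataclass
class Contact:
    name: str
    email: str

    def rename(self, new_name: str) -> None:
        if not new_name.strip():
            raise ValueError("name must not be empty")
        self.name = new_name.strip()


# 2) UI generation: turns an object into markup and nothing else.
def render_contact_form(contact: Contact) -> str:
    return (
        '<form method="post">'
        f'<input name="name" value="{escape(contact.name, quote=True)}">'
        f"<span>{escape(contact.email)}</span>"
        "</form>"
    )


# 3) UI event handling: maps submitted fields onto core-logic calls.
def handle_contact_submit(contact: Contact, fields: dict) -> None:
    contact.rename(fields.get("name", ""))


# 1) Main engine: the only place that wires the other three together.
def dispatch(contact: Contact, method: str, fields: dict) -> str:
    if method == "POST":
        handle_contact_submit(contact, fields)
    return render_contact_form(contact)
```

With this layout, changing the markup only touches `render_contact_form`, and a validation change only touches `Contact`.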

          Now, other languages have very specific problems associated with them:

          1) Scripted languages in general: slow performance

          2) Compiled languages in general: Requires rebuild before changes take effect, so testing and retesting is slowed down.

          3) Java/.Net/Byte-code languages: Worst of #1 and #2 above.

          4) Python: Performs a little better than most scripting languages, but there are times when its reference-based structure can cause bugs to be very difficult to find.

          5) PHP: Many PHP programmers write readable but unmaintainable code.

          6) Perl: Many Perl programmers write maintainable but unreadable code.

          7) LISP: See Perl only even more so.

          8) ASP. ASP is only really useful in large apps when paired with COM objects written in C++ or VB. So you have the problems with a scripted language combined with the problems of compiled languages.

          But again, many of the worst issues are programmer rather than language issues. Then again, depending on your project, you may have to eliminate possibilities because of language capabilities.
          • Re:Hmmm.... (Score:4, Informative)

            by sterno ( 16320 ) on Saturday July 30, 2005 @06:13PM (#13204274) Homepage
            But again, many of the worst issues are programmer rather than language issues.

            While that's true, each language has semantics that either encourage or discourage their worst behaviors.

            As for Java and your comments above: Java's scripted aspects (JSP pages) are actually translated into Java code and compiled to byte code before they run. So the first time a page runs, it will be slow because of the conversion. After that first run it will be as fast as any other compiled code.

            As for the issues with compiled code, development environments for Java make much of this process quite easy. If you make changes to your code that don't require changes to method signatures, just the chunk of code you modified can be recompiled. In NetBeans, which is what I use, I just click a button and my code is ready to test in less than a second.

          • Re:Hmmm.... (Score:3, Interesting)

            6) Perl: Many Perl programmers write maintainable but unreadable code.

            7) LISP: See Perl only even more so.

            Hmm... having read quite a bit of code in modern Common Lisp packages, I'm going to have to disagree. For the most part, the code was quite readable and understandable. Some of it took a while to get the hang of (Araneida, for instance), but this was because what it was doing was large.

            That's the average. On the other side of it I've seen code of blinding clarity that expressed its intentions so

          • Re:Hmmm.... (Score:4, Informative)

            by FunkyMonkey ( 79263 ) on Saturday July 30, 2005 @06:40PM (#13204392)
            20k lines of code? That is minuscule. I've got a mid-sized enterprise system that's got 20k FILES containing millions of lines of code integrating a dozen disparate systems over a network of 50 or so servers. It handles thousands of concurrent users performing transactions, not just viewing content. That's just a mid-sized system. Some large-scale systems use clusters of hundreds of servers. Not to bash what you're doing, but I think you could use a little perspective on the size of your application.

            I don't care if you've got a freakin army of PHP programmers, you're never going to build a system that can scale like Java.

            1) Scripted languages in general: slow performance

            2) Compiled languages in general: Requires rebuild before changes take effect, so testing and retesting is slowed down.

            3) Java/.Net/Byte-code languages: Worst of #1 and #2 above.

            Don't believe the hype about Java's performance. Today's just-in-time compilers can optimize code as well as hand optimized code and they don't waste resources optimizing paths that don't get executed. There are many benchmarks out there that confirm that Java's performance is comparable to C++ and even better in some areas.

            http://www.javaworld.com/javaworld/jw-02-1998/jw-02-jperf_p.html [javaworld.com]
            http://www.tommti-systems.de/go.html?http://www.tommti-systems.de/main-Dateien/reviews/languages/benchmarks.html [tommti-systems.de]
            http://java.sys-con.com/read/45250.htm?CFID=29694&CFTOKEN=101A9EF8-9F8D-153A-37A5E0A40D3EE24A [sys-con.com]

            I agree with your point though, there are a huge number of crappy programmers out there. Good programmers write good code in whatever language they are using.

            So, what is good code?

            IMHO, good code performs well and is easy to understand and use.
          • Tiny. (Score:3, Insightful)

            by C10H14N2 ( 640033 )
            If your code is at 20,000, you haven't even begun to get to the point where manageable code is truly problematic. A skilled developer can get a grip on that (about 400 printed pages) in a day unless it is utterly obfuscated.

            Now, with respect to #1 and #2 as applied to #3? The WORST of execution and compilation time generalized to _all_ bytecode? WTF?

            With a proper J2EE development environment (no .Net here), my compile/build/deploy cycle on most projects takes one command and, guffaw, 20k lines would compile
          • Lisp code is extremely easy to read and maintain -- it's just the opposite of Perl, not "more so".

            Are you going to trot out the "parentheses are hard to read" argument? Well take a look at XML: that has TWICE THE NUMBER OF PARENTHESES, only they're pointy instead of curved. That old "I'm afraid of parentheses" argument is bullshit.

            Now look at Perl code: instead of seeing the explicit, unambiguous parentheses in the code, you have to remember and resolve all of the implicit and complex special syntax rul

    • Re:Java Java Java! (Score:5, Insightful)

      by drgonzo59 ( 747139 ) on Saturday July 30, 2005 @04:08PM (#13203650)
      You are right, performance from the language point of view is won by assembler, but often it is the choice of the algorithm that will make the big difference. A bubble sort in assembler of 1 million items might be slower than a quicksort of the same million items in python.
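To make the algorithm-versus-language point concrete, here is a small sketch that counts comparisons for a plain bubble sort and a simple quicksort on the same input. It is an illustration only: it uses a few thousand items rather than a million so it finishes quickly, it counts comparisons instead of measuring wall-clock time, and it says nothing about assembler. The function names are made up for this example.

```python
import random


def bubble_sort(items):
    """Return (sorted copy, comparison count) using plain bubble sort."""
    data = list(items)
    comparisons = 0
    for end in range(len(data) - 1, 0, -1):
        for i in range(end):
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data, comparisons


def quicksort(items):
    """Return (sorted copy, comparison count) using a simple quicksort."""
    data = list(items)
    if len(data) <= 1:
        return data, 0
    pivot = data[len(data) // 2]
    smaller, equal, larger = [], [], []
    comparisons = 0
    for x in data:
        comparisons += 1  # count one three-way comparison against the pivot
        if x < pivot:
            smaller.append(x)
        elif x > pivot:
            larger.append(x)
        else:
            equal.append(x)
    left, left_count = quicksort(smaller)
    right, right_count = quicksort(larger)
    return left + equal + right, comparisons + left_count + right_count


if __name__ == "__main__":
    rng = random.Random(0)
    values = [rng.randrange(1_000_000) for _ in range(2_000)]
    _, bubble_count = bubble_sort(values)
    _, quick_count = quicksort(values)
    print("bubble sort comparisons:", bubble_count)
    print("quicksort comparisons:  ", quick_count)
```

Bubble sort always performs n(n-1)/2 comparisons in this form, so the gap widens quickly as the input grows, whatever language either sort is written in.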

      Often when someone asks the question "what languages do you know?" or "what languages are the best?" it shows a lack of CS background and experience. The right question is "what programming paradigm would you use?" or "what programming paradigm is better?" (Of course when you come down to a specific problem, then the choice of libraries might determine the language, but the original poster only specified "large web application" as the requirement so talking about a specific language is pointless).

      The difference between the two questions reflects the difference between the two types of education most programmers have. Some have gone to 4-year colleges and got a "Computer Science" degree, while some learned a language in their spare time, or went to technical college. The people from the technical college will know just one language and ask others what languages are the best, what languages they use etc. To them learning a new language is hard. What a CS degree teaches (or should teach) is different programming paradigms (procedural, functional, object oriented) along with algorithms and data structures. So if someone knows how to think in terms of objects when they solve the problem, they can program in Java, C++, Python, Ruby and other object-oriented languages.

      I used C++ in college, then I learned Java, now I use primarily Python. All I had to do is learn the syntax and some of the common library functions -- all can be done with a good reference book and/or Google in a couple of weeks.

      Or if a problem can be better solved with a functional or logic-programming approach, I would use Lisp or Prolog (you can use Lisp for websites too!).

      So, I think the original question should have specified the problem more exactly or asked which paradigm would be better, rather than giving a general requirement ("large web application") and then asking for a specific language. This is bound to lead to nothing but arguments about why everyone's favorite language is best, and that's about it.

  • by Qui-Gon ( 62090 ) * on Saturday July 30, 2005 @02:19PM (#13202994) Homepage Journal

    And you said it...

    Maybe language choice is irrelevant, only good people (developers) matter?

  • by Brento ( 26177 ) <{brento} {at} {brentozar.com}> on Saturday July 30, 2005 @02:20PM (#13203000) Homepage
    You're using examples of Ebay, Google and Microsoft's web sites as your "large-scale" web app description. If you truly do want to build something as large-scale as that, then you're going to have a lot of hiring to do. Take a look at your local market - or even better, place ads for architect-level people in each of the languages you're considering. See what kinds of people you get, and that should weigh into your decision.
  • Erlang!

    I would elaborate, but I'm afraid it would go straight over the heads of all you imperative programming dweebs. </smug>
    • Well, while I love the elegance of a well-designed functional implementation, functional languages are easy to abuse, creating unreadable code (that might work but is hell to maintain).

      Though I admit to never having used Erlang, I have used others.
  • ASP.NET w/C# (Score:3, Insightful)

    by gfody ( 514448 ) on Saturday July 30, 2005 @02:23PM (#13203013)
    this should've been a survey
  • by Dysenteryduke ( 903867 ) on Saturday July 30, 2005 @02:24PM (#13203021)
    For large scale applications, java, c/c++, perl, PHP just don't cut it. You should really check out mod_fortran. Everything you love about fortran with none of the hype.
  • I've been getting into Ruby on Rails [rubyonrails.com] recently, and am most excited by how Rails makes it very clear what the "best practices" for organizing and building your application are.

    I have long despaired of learning that same information for PHP (with which I have much more experience). I've not yet found a book or other documentation that provides a concrete approach. And looking at existing large-scale projects, e.g., WordPress and others, reveals a myriad of different philosophies. It leaves developers basically trying different things out on different projects, and picking up their own favorite best practices as they go along.

    While it's great that the languages are so flexible, well, sometimes it's nice to be guided to a known solid approach. It leads to consistency among and across many developers and time. This makes it easier for new developers to join or take over a project, or even for the original developer to do maintenance on components which were written long ago.

    So, where are the recommended approaches for organizing and constructing large-scale applications for PHP (and Python, etc.)?
  • Slashdot uses Perl! Regardless, if I were to choose any web language I would use Perl. Libs? Free! Tons of devs out there too.

  • Python (Score:4, Informative)

    by g-to-the-o-to-the-g ( 705721 ) on Saturday July 30, 2005 @02:27PM (#13203039) Homepage Journal
    Python is the way to go [zope.org]. For anyone who's built custom sites based on Zope, I think they would agree with me. Python is really easy to use for building big apps for use in web stuff, and Zope provides an easy-to-code-for application server.
    • by Some Random Username ( 873177 ) on Saturday July 30, 2005 @03:01PM (#13203250) Journal
      You can certainly make a large, high-traffic site in Python. But not with Zope. Zope is brutally slow, and the only thing you can do about it is shove a cache in front of it, which does nothing to help speed up user-specific content.

      Just use a decent python web framework with a real webserver, zope is a waste of time.
      • by MikeFM ( 12491 ) on Saturday July 30, 2005 @03:54PM (#13203558) Homepage Journal
        A solution I like is to write a Python backend that is exposed to the frontend as XML-RPC. Then use the language your designers find easiest to work in for front-end coding.. usually PHP.

        Python is great for the backend because it has good namespace support, which helps a lot for big complex programs. PHP on the other hand is well known and extremely easy for doing various web-scripting type tasks. I have a little PHP function that gets called by the PHP server for every page (without needing to be in the code exposed to the PHP coders) that simply passes the page inputs to Python over XML-RPC and puts the response into a global variable. Then the PHP coders just display the results however it needs to be done based on the inputs and outputs.

        Some nice benefits of such a split system are that it's easy to keep UI logic separate from application logic and it's easy to split your application up over multiple servers so that it can scale to any load. For example you might have two PHP servers, three Python servers, and a DB server dividing the load. Normal load balancing techniques work just fine for deciding how the machines talk to each other. Pretty nice to be able to just throw another server in where it's needed if you suddenly find a 9/11-type day where your site is getting unexpectedly high loads.

        Of course you can split your processing up into more levels if you need to. I like to abstract out all my queries into their own XML-RPC interface that sits in front of the DB so as to not allow direct access to the DB for security reasons. Anyone trying to hack the DB would have to use my stored queries and work through my XML-RPC interface rather than being able to access the DB directly. If you're dealing with sensitive information it's just another layer of protection. If you have to access third-party systems that use some unstandardized method of communicating, then it can help to keep your code clean if you create a proxy interface between those systems and your own that speaks XML-RPC. This way the code for speaking to that other system is a completely separate code base and your main code base is kept clean.
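For anyone who wants to see the shape of the split described above, here is a minimal sketch using Python's standard `xmlrpc` modules. The function `get_order_summary`, its data, and the helper `start_backend` are invented for illustration; the comment above does not show its actual interface. The client half stands in for the PHP front end, which would make the same call through its own XML-RPC client.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def get_order_summary(order_id):
    """Hypothetical application-logic call; a real one would hit the DB."""
    orders = {1: {"customer": "alice", "total": 42}}
    order = orders.get(order_id)
    if order is None:
        return {"found": False}
    return {"found": True, **order}


def start_backend(host="127.0.0.1", port=0):
    """Start the XML-RPC backend in a daemon thread; return (server, url)."""
    server = SimpleXMLRPCServer((host, port), logRequests=False, allow_none=True)
    server.register_function(get_order_summary)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    actual_host, actual_port = server.server_address
    return server, f"http://{actual_host}:{actual_port}/"


if __name__ == "__main__":
    server, url = start_backend()
    # The front end (PHP in the comment above) would make this same call.
    backend = ServerProxy(url)
    print(backend.get_order_summary(1))
    server.shutdown()
    server.server_close()
```

Because the front end only ever sees the registered functions, the same pattern covers the "stored queries only" layer in front of the database: expose the queries you allow and nothing else.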
  • by Kenja ( 541830 ) on Saturday July 30, 2005 @02:27PM (#13203041)
    Hell, if I have to suffer so should all of you.

    (yes I program with this monstrosity of a system)

    • Suddenly, your sig makes much more sense.
    • by LS ( 57954 )
      run away from your job... NOW!!! If you are able to get a functional system out of domino, you are definitely skilled enough to use a real environment to build web-apps. run now before your resume is subsumed!!!
    • Re:Lotus Domino (Score:3, Informative)

      by tigersha ( 151319 )
      Holy mother of Jesus. I have had to program that heap of tripe for years now. And my boss decided to write a large system in Notes with a mathematics student and I had to take the project over. The last five years have been hell.

      Notes's web part is not too bad and it's unbeatable for putting up a really quick form for people to use if the looks are not too important, but the database is utterly atrocious.

      The other problem with Notes is the horrifying development environment. You cannot, for instance, search your
  • Depends (Score:4, Insightful)

    by boner ( 27505 ) on Saturday July 30, 2005 @02:28PM (#13203049)
    What you are asking is a dilemma that has been around since the invention of different programming languages. My personal opinion is that the best investment of your time is designing the web-app itself. Once you understand the feature set you require/desire then it makes sense to start looking at how the feature set requirements map to the available languages from a development and performance point of view.

    Most people tend to forget to take a productivity point of view and let themselves be guided by whatever is available or what's cool. If you follow a productivity approach it will help you make the trade-off decisions between interpreted languages like PHP and compiled languages like C/C++, with ASP and Java somewhere in between.

    There is a balance between development and production. When you go live, if your web-app is well designed, it should be easy to add additional hardware to compensate for performance issues (a server costs about US$2,000, or the equivalent of 10-20 hours of developer time).

    The single most important piece of advice after recommending that you spend more time on designing the app: don't get married to the language. Be prepared to use PHP to develop quickly and understand what works and what doesn't for your web-app. Once you have solved the usability bugs, investigate how you can drive efficiency by choosing a different language or not.

    There is no template for what is the best environment, only your common sense, and oh... did I mention that you should spend more time designing your app?

  • by Foofoobar ( 318279 ) on Saturday July 30, 2005 @02:28PM (#13203050)
    I use PHP myself because it focuses on one thing and doesn't get distracted by trying to do more than it's built to do: serving dynamic web pages.

    Sure, you can use it to dynamically generate images, PDFs and a lot more, but these things tend to slow it down and detract from what it is meant to do. They should be handled by third-party apps, preferably on a different server; that way you separate your processes and keep PHP focused on its task.

    Plus, with the improvements in the Zend engine and its object-oriented programming, PHP is now comparable to and sometimes even faster than Java.

    People will say that it doesn't scale but they base this opinion on a preset prejudice or on the scalability of the underlying architecture. But PHP's engine is actually more compact than the JVM because it has less to focus on and thus can scale along side Apache, the entire way.

    And with tons of larger companies moving to PHP, it has proven it can handle the load.

    My only complaint though is developers who try to do EVERYTHING in PHP. With all the added modules, it does have the potential but do you really want to waste processing power letting PHP handle all these extra tasks? Use PHP for dynamic webpages and any added processing you need to do, I suggest moving to a secondary app preferably built in C/C++ or even Java. That way you get the most bang for your buck.
    • by Space cowboy ( 13680 ) * on Saturday July 30, 2005 @03:30PM (#13203421) Journal
      People will say that it doesn't scale but they base this opinion on a preset prejudice or on the scalability of the underlying architecture.

      What people mean by 'it doesn't scale' is that it doesn't scale. Not that it doesn't run fast enough or have enough functionality for pretty much anything at the small-to-medium sized website...

      I have a set of 200 or so websites all running through a self-built PHP template-based content-management system (hey, this was 8 years ago, they were rare then! :-) that has stood the test of time admirably. It's only got a few million pages in its CMS, but it's pretty cool:
      • Typical page-creation is ~0.01 secs for complex pages
      • Copes with (currently) several million users
      • Handles email list management (opt-in only, don't flame me :-)
      • Separates the content from the formatting. Formatting is by recursive template instantiation.
      • Can embed run-at-page-delivery-time PHP modules as CMS elements
      • Has an Ad-server (flash, DHTML or images) which guarantees ad-placement in slots at a pre-paid rate
      • Copes well with binary data (PDFs, images, movies, etc.)
      • Handles image galleries from both user/admin perspective
      • Has sections where extranet companies can "own" part of the sites
      • Complete messageboard system, any number of boards, skinnable.
      • Manages products, shopping basket etc. and integrates with online purchasing providers
      • Provides newsfeeds in a variety of formats (RSS, XML via FTP, etc.)
      • Provides a *fast* fulltext search that uses phrases, booleans, etc.
      • Layers facilities on top of search (eg: site-editor can embed results of a search into an email (s)he composes. Preview, then deliver to opt-in list.)

      And with all those features it's still not scalable. I can't split the system over multiple webservers and begin a transaction on one webserver, have a hardware failure, and have it complete on a different webserver.

      I serve about a million page-impressions a day (less at weekends) so I'm hardly "big iron", but at the moment it's all serving from a single machine(*) with a manual backup ready-to-go. We're (probably) about to triple our daily throughput (time to splash some cash :-), so scalability has become more important, and I'm looking into the best way of doing this.

      I can't have the above level of scalability but I can divide up the work over (say) 4 cloned webservers, and use round-robin DNS (low TTL) or transparent-proxy load-balancing to share the load. Then at least if one of the machines goes down (not the proxy ;-), I can have it automatically react and recover.
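As a toy model of the "rotate across cloned webservers and route around a dead one" idea above, here is a small sketch. It is not how round-robin DNS or a real load-balancing proxy is configured; the class `RoundRobinBalancer` and the backend names are hypothetical and only show the selection logic.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backends in rotation, skipping any marked as down."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._down = set()
        self._rotation = cycle(self._backends)

    def mark_down(self, backend):
        self._down.add(backend)

    def mark_up(self, backend):
        self._down.discard(backend)

    def next_backend(self):
        # Try each backend at most once per call so we never loop forever.
        for _ in range(len(self._backends)):
            candidate = next(self._rotation)
            if candidate not in self._down:
                return candidate
        raise RuntimeError("no healthy backends available")
```

Note that this only redirects new requests. It does nothing for a request that was in flight on the machine that died, which is exactly the limitation the comment goes on to describe.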

      We're probably going to have 2 database servers as well - one in slave mode, one in master mode (all writes to the master, because we use MySQL). The single point of failure then becomes the proxy gateway (because RR DNS is a bit of a pain), so we can have a spare standing by - the configuration of a load-balancing proxy is pretty trivial, and doesn't depend on anything else, so it can be sitting ready to run and swapping ethernet patch cables ought to be all that is necessary.

      And that's about as "scalable" as I can make it - not very. All I'm doing is duplicating hardware for speed and reliability. I can have robustness against a machine dying, but that's about as far as I can go. True scalability allows the operation the machine was doing when it died to complete successfully, and PHP ain't there (yet). I guess you could implement it in s/w using lots of state tables, and perhaps get 80% of the way there, but it's an add-on not a built-in, and not a complete solution. Better to go with something that works if you need it...

      Just MHO.


      (*) It is a bit of a beast of a machine though :-)
      • by esconsult1 ( 203878 ) on Saturday July 30, 2005 @07:30PM (#13204636) Homepage Journal

        Then I guess you never heard about using database driven sessions. The way you've designed that bad boy, it wouldn't scale in any language.

        Here's what we do:

        • 8 Apache Webservers
        • 3 Million pageviews per day
        • Distributed PHP sessioning (Postgresql based)
        • PHP module
        • Postgresql (no worries with MySQL write locks)
        Scaling? We add new machines in the mix, tell our load balancer about the new machines, and we've scaled linearly. A machine goes down? The load balancer redirects to another machine and the session continues without a beat.
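
        For anyone who hasn't seen database driven sessions, here is a minimal sketch of the idea in Python. It is not our actual PHP code: sqlite3 stands in for Postgresql so the example runs anywhere, and DbSessionStore and the table layout are invented for illustration.

```python
import json
import sqlite3
import uuid


class DbSessionStore:
    """Keep session data in a shared database instead of on one webserver."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS sessions "
            "(id TEXT PRIMARY KEY, data TEXT NOT NULL)"
        )

    def create(self):
        session_id = uuid.uuid4().hex
        self.save(session_id, {})
        return session_id

    def save(self, session_id, data):
        self.conn.execute(
            "INSERT OR REPLACE INTO sessions (id, data) VALUES (?, ?)",
            (session_id, json.dumps(data)),
        )
        self.conn.commit()

    def load(self, session_id):
        row = self.conn.execute(
            "SELECT data FROM sessions WHERE id = ?", (session_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None


# One shared database, two "webservers" that each talk to it.
shared_db = sqlite3.connect(":memory:")
web1 = DbSessionStore(shared_db)
web2 = DbSessionStore(shared_db)

sid = web1.create()
web1.save(sid, {"user": "alice", "cart": [42]})

# web1 dies; the load balancer sends the next request to web2.
recovered = web2.load(sid)
```

        Because the session lives in the database and not in a webserver's local files or memory, any machine in the pool can pick up the next request.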

        Bottleneck? The database, but then you throw big iron at that.

        Look, the web is stateless, if applications are designed from the get-go realizing that fact, heck, you can get a shell script sitting in cgi-bin to scale with your server pool.

        There's absolutely nothing in PHP that inherently causes it not to scale. Sure, other languages have easier and sometimes better features built in, but if you're already using PHP, implementing those features is usually worth the few programming hours of effort instead of switching to another language/platform.

    • Java is called a language but in this context it is more of a platform which, frankly, is older, more robust and better thought-out than anything PHP has to offer--at this point. I believe PHP is great for small to medium scale web sites, but once you start to deal with the large structures that enterprise systems require, PHP is just not an option--if you want packages already available to you which are thought-out, mature and stable, like all the various J2EE solutions available.

      PHP very well may be faster for an individual page--but what are you comparing that to? Tomcat set up to use JSP? Well, there's a lot of infrastructure there that a PHP developer is probably not going to use for a simple dynamic page. And the fact is, PHP is incorporating a lot of 'heavier' OO features now whose effective use is debatable when considering web apps tied to the HTTP protocol--why build and tear down your entire OO structure every time you load a page? To do that intelligently you want an application server caching these objects...and then we start talking about Java and all the years it has on PHP there.

      So, I'm really just saying--some things are right for some projects, others for other projects. Choose wisely.

      • I could not agree more. I used PHP extensively before and often laughed at J2EE as oversized and overblown for web sites. But I constantly hit the wall with every more complex web application. The fact that your goal is to build up your data structures as quickly as possible, only to display them and destroy them one moment later, makes any real programming a pain. I even looked into building PHP application servers, but they are far, far from mature. Most are just trying out the possibilities.

        Compare to jav
  • by Dominic_Mazzoni ( 125164 ) on Saturday July 30, 2005 @02:28PM (#13203052) Homepage
    You didn't make it clear who is doing the development.

    If you're doing the development by yourself, then obviously you should weigh the choices and pick the language that will work best for you. Development time, for example, is highly dependent on how well you already know the languages.

    However, if you already have a developer, or a team of developers, to do this development, then whatever you do don't force them to use what you think is the best language. That's a guaranteed way to lower productivity and morale if they think it's a poor choice! Ask them to make recommendations. Maybe even spend a couple of days prototyping various things in different languages first.

    One of the nicest things about back ends is that it doesn't matter what language you use (nobody can tell from the outside) and you can easily mix and match languages. There's nothing wrong with writing the majority of the code in PHP or Python for rapid development, but using Java or C++ extensions for a few of the computationally-intensive algorithms.
  • I like the language... Microsoft took Java, added language features I was missing from Borland's C++ Builder (events, properties, delegates), and let you call a Windows DLL.

    I've also enjoyed the platform. It was rather trivial to change our logging system to use non-blocking calls to send these logs over the MessageQueue, and then to have an NT service receive those messages for later processing.
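
    The same pattern (the request thread drops the log record on a queue and something else does the slow work later) can be sketched in Python with the standard library's QueueHandler and QueueListener. This is not the MessageQueue/NT service setup described above, just an in-process analogue; ListHandler and the logger name are invented for the example.

```python
import logging
import logging.handlers
import queue


class ListHandler(logging.Handler):
    """Stand-in for the slow consumer (the NT service in the setup above)."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


log_queue = queue.Queue()
sink = ListHandler()

# The listener drains the queue on its own thread.
listener = logging.handlers.QueueListener(log_queue, sink)
listener.start()

logger = logging.getLogger("webapp.sketch")
logger.setLevel(logging.INFO)
logger.propagate = False
# The request path only pays for putting a record on the queue.
logger.addHandler(logging.handlers.QueueHandler(log_queue))

logger.info("order %s placed", 1234)
logger.info("order %s shipped", 1234)

listener.stop()  # flushes whatever is still queued, then joins the thread
```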

    When the database CPU load went up to 70% we implemented caching using their intrinsic ASP.Net Cache objects on often
  • by 00_NOP ( 559413 ) on Saturday July 30, 2005 @02:30PM (#13203074) Homepage
    Perl is a great choice. You can do anything with it and nobody else understands what your code does so they have to get you to maintain it :)
  • Microsoft (Score:2, Informative)

    by minus_273 ( 174041 )
    Microsoft uses ASP (what else?).

    err, no. MS does not use asp, they use ASP.net. There is a BIG difference between the two. The former is VB and the latter is C#,VB.net,J#,managed c++ etc etc. basically any language that runs in .net
  • I would call Yahoo a "large scale web app" (to put it mildly). Yahoo uses PHP [internet.com] and the founder of PHP [wikipedia.org] works for Yahoo.


  • PHP (Score:3, Interesting)

    by AVryhof ( 142320 ) <`moc.hcraeserfohyrv' `ta' `soma'> on Saturday July 30, 2005 @02:32PM (#13203084) Homepage
    We have a small website (85,000 hits a day)

    So here's the rundown of what we use...
    CGI/Backend: PHP

    Client Side: Javascript

    Presentation: CSS/HTML 34 (Somewhere between 3.2 and 4)

    Then of course there is the PHP and static generated RSS feeds.
  • Ruby on Rails (Score:3, Informative)

    by threaded ( 89367 ) on Saturday July 30, 2005 @02:37PM (#13203108) Homepage
    Ruby on Rails, try it, you won't want to use anything else. Ruby on Rails is just so sweet, just like the original Java alpha was all those years ago.
  • PHP is hugely popular, but it's one of the few modern software systems lacking native support for Unicode. Unicode is important because of the first W in WWW: even if i18n is not part of the initial project, I would be wary of architecting a big new system in 2005 using a language whose string support is based on 1-byte characters [php.net]:

    In PHP, a character is the same as a byte, that is, there are exactly 256 different characters possible. This also implies that PHP has no native support of Unicode.
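
    To see why "a character is a byte" bites, here is a small Python illustration (Python, not PHP, because it keeps text and bytes as separate types). The sample string is arbitrary.

```python
# "naïve café", written with escapes so the file encoding doesn't matter.
text = "na\u00efve caf\u00e9"
encoded = text.encode("utf-8")

char_count = len(text)      # counts characters
byte_count = len(encoded)   # counts bytes; each accented letter takes two

# Cutting after 3 *bytes* lands in the middle of the two-byte "ï".
try:
    encoded[:3].decode("utf-8")
    split_is_safe = True
except UnicodeDecodeError:
    split_is_safe = False
```

    Any length, substring or truncation routine that counts bytes gives wrong answers on text like this as soon as it is stored as UTF-8.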

  • by blueZhift ( 652272 ) on Saturday July 30, 2005 @02:42PM (#13203136) Homepage Journal
    I still use PHP for a lot of personal work and quick stuff, but I've been leaning more and more on Python [python.org], Zope [zope.org], and Plone [plone.org] for building stuff at my day job. If you need to quickly and easily implement role based security, Zope makes it drop dead easy because it's built in and through ZEO [zope.org], zope apps can be highly scalable. Of course as with most things, use whatever technologies get the job done. For example, my Zope apps live behind an Apache server that I use for SSL as well as access control.
  • by BigGerman ( 541312 ) on Saturday July 30, 2005 @02:45PM (#13203146)
    (for those who actually care to get something out of the door)


    front end - Tomcat running JSPs (JSTL or Velocity for templating)

    in the middle - Spring and Spring MVC

    Closer to database - Hibernate.

    Ideally, everything running in same JVM. Add more servers for scalability front-ending them with load balancer with sticky sessions.
    No J2EE fluff, easy to find people, good productivity.
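
    One simple way to picture sticky sessions is hashing the session id to pick a server, sketched here in Python. Real load balancers often use a cookie or a lookup table instead; the function name and the app1..app3 hostnames are assumptions for the example.

```python
import hashlib


def sticky_backend(session_id, backends):
    """Map a session id to the same backend every time it is seen."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


servers = ["app1", "app2", "app3"]  # hypothetical Tomcat instances
first_hit = sticky_backend("JSESSIONID-abc123", servers)
second_hit = sticky_backend("JSESSIONID-abc123", servers)
```

    The catch with plain modulo hashing is that adding or removing a server remaps most sessions, so the sessions either have to be replicated or users get logged out when the pool changes.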

  • by PhotoBoy ( 684898 ) on Saturday July 30, 2005 @02:45PM (#13203149)
    It's been my experience that language is mostly irrelevant when building a large, scalable web app.

    There is certainly a difference in performance between various web languages/libraries but the most important aspect is how well you design your app to scale across multiple servers. Even if you were to spend years writing the most tightly coded app in Assembly that is 99.9% efficient you will still reach a point where you need to use more than one server.

    As long as your app is designed with scaling to multiple servers in mind the choice of language should merely be down to what your team is best able to work with and support. It's no good doing everything in ISAPI just because eBay does it if your team is mainly experienced in Perl. Building the app to work well with multiple servers that are clustered according to their function (e.g. a DB cluster, load balanced webservers, large scale storage solution, etc) is the best way to ensure a scalable solution. Picking a database server for example that easily allows you to add a new machine to the cluster should be more important than language choice. Picking high availability software that doesn't require downtime every time you need to add a new server is very important.

    Maybe I sound like I'm advocating writing sloppy code and just throwing lots of servers at the solution, but it's worth considering how today's top of the range server will be the cheapest low range machine in a few years. This means you can either pile high with cheap boxes or buy fewer but more powerful servers which have double the capacity of the cheaper server. It's certainly the solution that's worked well for Google...
  • by adolfojp ( 730818 ) on Saturday July 30, 2005 @02:54PM (#13203200)
    Large standard library
    Excellent MVC model
    Integrated caching capabilities
    You can compile your libraries before uploading
    Excellent Web Services model
    Free tools
    Works on Linux (through mono)
    Large third party support
    Very Fast
    Easier to use and deploy than J2EE :-P
  • by psykocrime ( 61037 ) <mindcrime@cp p h a c k e r . c o .uk> on Saturday July 30, 2005 @04:02PM (#13203608) Homepage Journal
    saying that E-Bay uses ISAPI / C may be oversimplifying things. I see that some of their URLs still include isapi.dll, which does suggest using ISAPI. But they had gone on the public record a few years ago as saying they were migrating to Java / J2EE, specifically IBM WebSphere software.

    http://computerworld.com/softwaretopics/software/appdev/story/0,10801,63692,00.html [computerworld.com]

    I would guess that they're actually using a mix of technologies. Any insiders have any insight they can share? Even anonymously?
  • by markv242 ( 622209 ) on Saturday July 30, 2005 @04:36PM (#13203803)
    The only reason people think they use ISAPI is because that's what they originally used, and an executive decision was made to not break any existing links at any time, ever. Check the Powered by Java [ebaystatic.com] image. The /ws/eBayISAPI.dll that you see in all of the requests just invokes a servlet.
  • Pfft (Score:3, Insightful)

    by defile ( 1059 ) on Saturday July 30, 2005 @05:36PM (#13204101) Homepage Journal

    For small web site, it really doesn't matter.

    Same is true for a large site.

    A good way to define "large site" is "beyond the hardware capabilities of a single computer". For example, if you made a hand optimized assembly version of Slashdot that had its own network driver, TCP/IP stack, etc. its load would still probably be beyond the role of any one commodity computer.

    When you throw this kind of a load at computers, many basic assumptions start to break -- you inevitably exercise a use case that is quite uncommon with no off-the-shelf solution that fits the bill quite right.

    Of course, since large sites mean big business, vendors want you to believe that their solution can grow towards infinity. But don't be fooled: there are no silver bullets.

    Getting into a religious war over what RDBMS, language, OS, etc. to use is pointless -- you just cannot avoid refactoring/rewriting major chunks of a project through its lifetime. It is undeniable.

    Better to pick what your group is most comfortable with and just take it from there.

  • by Tablizer ( 95088 ) on Saturday July 30, 2005 @05:47PM (#13204148) Journal
    [fill_in_the_blank] is the way to go. See [blank].org for more. For anyone who's built custom sites based on [blank], I think they would agree with me. [blank] is really easy to use for building big apps for use in web stuff, and [blank] provides an easy-to-code-for application framework that saves lots of time and money.

    Best of all, it is [blank]-oriented so that you just snap functionality together like Lego blocks to get an instant app that runs at the speed of light almost right out of the box! And [blank] scales to every user on the entire planet. And it plugs into XML.

    Only a Devry graduate would use anything different. Go with [blank]!
  • by MarkWatson ( 189759 ) on Saturday July 30, 2005 @06:33PM (#13204359) Homepage
    I think that Java is the gold standard for small and large web portals in terms of reliability, good performance, etc. I have done portals that simply use Tomcat with either Prevayler or Hibernate/JDBC for persistence that basically run forever, until we want to do a software upgrade.

    That said, for CRUD applications, RoR is good - the scaffolding gets you up and running quickly, and views, controllers, etc. are easily customized.

    I used to use Python and Common Lisp a fair amount, but not recently. The UnCommon Web Common Lisp package looks good; I would like to check it out in some detail when it is more mature. It uses continuations (like Seaside for Squeak and VisualWorks Smalltalk) to manage state between web pages.
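
    For a feel of what continuation-style page flow looks like, here is a loose Python analogy using a generator. Generators are not full continuations (you can't copy one to handle the back button, for instance), and this is not how UnCommon Web or Seaside are implemented; checkout_flow and the sample inputs are made up.

```python
def checkout_flow():
    """A multi-page conversation written as straight-line code."""
    # Each yield "renders a page" and suspends until the user answers.
    name = yield "What is your name?"
    item = yield f"Hello {name}, what would you like to order?"
    yield f"Thanks {name}, your {item} is on its way."


flow = checkout_flow()
page1 = next(flow)            # first request
page2 = flow.send("Ada")      # second request carries the form input
page3 = flow.send("teapot")   # third request
```

    The local variables name and item survive between "requests" without any explicit session bookkeeping, which is the appeal of the approach.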

    Sure, there is some overhead for using multiple languages and frameworks, but I have always believed that it is best to be a "generalist" who can drill down when required.
