Slashdot
Choice of Language for Large-Scale Web Apps?
Posted by
Cliff
on Sat Jul 30, 2005 01:14 PM
from the dilemma-of-many-choices dept.
anyon wonders: "PHP is the most popular language for the web. eBay uses ISAPI (C), Google uses C/C++ (search), Java (gmail), and Python. Microsoft uses ASP (what else?). For a small web site, it really doesn't matter. What's your take on language choice for large-scale web applications? Maybe language choice is irrelevant, and only good people (developers) matter? If you can get people of the same quality, what language would you choose, considering the following factors: performance, scalability, extendibility, cost of development (man-month), availability of libraries, cost of libraries, development tools? Has a comprehensive comparison been done?"
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Polyglot (Score:3, Insightful)
As many as possible. Use PHP for the front end, Perl for input parsing, Euphoria for the graphics, JavaScript on the client-side, Moo for the database and Python for the glue to hold things together.
Every language has strengths and weaknesses. There is no killer language. A good carpenter has lots of tools and uses the most suitable tool(s) for each task. Likewise a programmer should be skilled in many languages and should pick the most appropriate one for each task. Learn as many programming languages as you can, and when you've done that, learn a few more.
[The feeling of job security is also rather nice.]
Re:Polyglot (Score:5, Insightful)
Noooooo!
It will just produce a job ad that says:
Required: 3+ years experience in PHP, Perl, JavaScript, Euphoria, Moo, and Python.
Then when they can't find any individual to fit the bill (surprise!), they will lobby Congress for more visa workers so that they can hunt the entire globe for the "best and brightest".
(Hmmmmm. What the hell is "Moo"?)
Parent
Re:Polyglot (Score:4, Informative)
However, I would find such a system to be extremely unsuitable as a general-purpose database.
Parent
Re:Polyglot (Score:5, Insightful)
Parent
Re:Polyglot (Score:5, Funny)
Parent
Wrong (Score:5, Insightful)
Wrong.
AJAX asynchronously calls any server-side technology without needing a page redraw. It could be PERL, ASP, or anything else that can respond to an HTTP Request.
Please read the docs about Ajax before telling me something that has nothing to do with it.
Please follow your own advice.
Parent
Re:Polyglot (Score:5, Informative)
Adaptive Path has a nice article introducing Ajax called Ajax: A New Approach to Web Applications [adaptivepath.com].
Parent
Re:Polyglot (Score:4, Funny)
Parent
Perl. (Score:5, Funny)
Seconded! (Score:3, Informative)
Re:Seconded! (Score:4, Insightful)
Programmers also have to debug and maintain that software, and that makes Perl one of the most wasteful languages in terms of programmer time.
Parent
Everything, huh? (Score:5, Funny)
It is Saturday, and instead of being out in the sunshine, taking in rays, talking to women, GOING OUTSIDE, here we are, in front of our screens debating about which language to build our web apps with? Can we suck enough?
Don't bother replying, because when this damn compile is done, I am going outside if it kills me. I won't be here to read any replies, dammit.
Parent
Re:Everything, huh? (Score:5, Funny)
The problem with talking to women is that so few of them have anything interesting to say about whether or not C++ is better than Perl...
Parent
Re:Everything, huh? (Score:5, Funny)
Parent
Who do you have? (Score:3, Interesting)
WebObjects (Score:5, Interesting)
Java Java Java! (Score:4, Informative)
performance, scalability, extendibility, cost of development (man-month), availability of libraries, cost of libraries, development tools
Performance? Assembly will give you the best performance, followed by C and C++. None of the three has great support for web apps.
However, Java is almost exclusively being used for large enterprise websites. It's powerful enough to handle the big jobs, and using the appropriate app server will give you great performance.
Cost of development is heavy in initial development, but pays for itself in maintenance. Most libraries and APIs are free in Java (Struts, Spring, Hibernate, Tapestry, etc.). I'd say they are second only to Perl in terms of freely available and powerful libraries and APIs.
Development tools? Just check out the (free!) Eclipse platform.
In my mind there is no question that Java (more specifically J2EE) is the best option for general large scale enterprise applications.
Re:Java Java Java! (Score:5, Insightful)
But for web development, Java is generally the right choice for the backend. Lots of competent people available who will require no learning curve. The support tools available for java on the backend are also clearly the best right now, as you pointed out (hibernate etc.). The tools for working in java are also a step ahead of anything else right now (idea and even its slightly retarded younger brother eclipse are both way ahead of the tools for any other language).
Parent
Hmmm.... (Score:5, Interesting)
I started writing HERMES (a CRM framework/app) in PHP and it is now over 20k lines and when I have time to add enhancements it will grow again. The code is incredibly manageable simply because the complexity of the application meant that I had to divide the code into four main areas (each handled in different sets of files):
1) Main engine(s)/UI framework
2) UI generation code/data input screens
3) UI event handling code
4) Core object logic.
This way, if you want to change the user interface, you just change the user interface. System-wide changes get made in one place where screen-specific changes get made somewhere else.
Everything is relatively well abstracted, so the code is very manageable.
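For anyone who wants to see what that kind of four-way split looks like in miniature, here is a rough sketch. It is not HERMES code (that is PHP and isn't shown in this thread); it is written in Python, and every class and function name in it is made up purely for illustration.

```python
import html

# Hypothetical sketch of a four-layer split; none of these names come from HERMES.


# 4) Core object logic: knows nothing about screens or requests.
class Customer:
    def __init__(self, name, email):
        self.name = name
        self.email = email

    def rename(self, new_name):
        if not new_name.strip():
            raise ValueError("name must not be empty")
        self.name = new_name.strip()


# 2) UI generation: turns an object into markup, and nothing else.
def render_customer_form(customer):
    safe_name = html.escape(customer.name, quote=True)
    return f'<form method="post"><input name="name" value="{safe_name}"></form>'


# 3) UI event handling: maps submitted form fields onto core-logic calls.
def handle_rename_event(customer, form_fields):
    customer.rename(form_fields.get("name", ""))
    return customer


# 1) Main engine: routes a request to the event handler, then to the renderer.
def dispatch(customer, method, form_fields=None):
    if method == "POST":
        handle_rename_event(customer, form_fields or {})
    return render_customer_form(customer)
```

The point of the split is that changing how the form looks only touches `render_customer_form`, while changing the validation rule only touches `Customer.rename`.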
Now, other languages have very specific problems associated with them:
1) Scripted languages in general: slow performance
2) Compiled languages in general: Requires rebuild before changes take effect, so testing and retesting is slowed down.
3) Java/.Net/Byte-code languages: Worst of #1 and #2 above.
4) Python: Performs a little better than most scripting languages, but there are times when its reference-based structure can cause bugs to be very difficult to find.
5) PHP: Many PHP programmers write readable but unmaintainable code.
6) Perl: Many Perl programmers write maintainable but unreadable code.
7) LISP: See Perl only even more so.
8) ASP. ASP is only really useful in large apps when paired with COM objects written in C++ or VB. So you have the problems with a scripted language combined with the problems of compiled languages.
But again, many of the worst issues are programmer rather than language issues. Then again, depending on your project, you may have to eliminate possibilities because of language capabilities.
Parent
Re:Hmmm.... (Score:4, Informative)
While that's true, each language has semantics that either encourage or discourage their worst behaviors.
As for your comments on Java: its scripted aspects are actually translated into code and compiled to bytecode before they run. So the first time a page runs, it will be slow because of the conversion. After that first run, it will be as fast as the compiled code.
As for the issues with compiled code, development environments for Java make a lot of this process quite easy. If your changes don't alter any method signatures, only the chunk of code you modified needs to be recompiled. In NetBeans, which is what I use, I just click a button, and my code is ready to test in less than a second.
Parent
Re:Hmmm.... (Score:4, Informative)
I don't care if you've got a freakin army of PHP programmers, you're never going to build a system that can scale like Java.
1) Scripted languages in general: slow performance
2) Compiled languages in general: Requires rebuild before changes take effect, so testing and retesting is slowed down.
3) Java/.Net/Byte-code languages: Worst of #1 and #2 above.
Don't believe the hype about Java's performance. Today's just-in-time compilers can optimize code as well as hand optimized code and they don't waste resources optimizing paths that don't get executed. There are many benchmarks out there that confirm that Java's performance is comparable to C++ and even better in some areas.
http://www.javaworld.com/javaworld/jw-02-1998/jw-
http://www.tommti-systems.de/go.html?http://www.t
http://java.sys-con.com/read/45250.htm?CFID=29694
I agree with your point though, there are a huge number of crappy programmers out there. Good programmers write good code in whatever language they are using.
So, what is good code?
IMHO, good code performs well and is easy to understand and use.
Parent
Re:Java Java Java! (Score:5, Insightful)
Writing efficient assembly code today is at least 3 or 4 orders of magnitude harder work than it was in the 60s or 70s, and there are far fewer experts available to hire today than there were back then. There are maybe 3 or 4 major computer game developers still doing hand assembly optimization these days, and those guys would be extremely hard to hire away from their current jobs. Most games are just developed in C, and are bound by the performance of the video card anyway, so optimization on the CPU just isn't that important any more.
Parent
Re:Not all true (imo) (Score:5, Insightful)
Now, I've never used IDEA for a prolonged period of time - I couldn't get into it, and was happy enough with Eclipse not to worry. (The fact that Eclipse is free helps - it would be difficult to persuade my company to pay for loads of licences for IDEA when Eclipse is perfectly all right and free.)
I do, however, use Visual Studio
1) Refactoring. Yes, there are tools available to help - but it's free and bundled into Eclipse.
2) Organise imports. Even with VS 2005 having some limited support, it doesn't help nearly as much as it should.
3) Built-in unit testing tools. Using TDD.NET to fire up NUnit GUI (or any of the other things it can do) is much, much uglier than the built-in support for JUnit in Eclipse.
4) Ant support in Eclipse. Our Java build script is *so* much nicer than the nastiness VS.NET encourages. I'm looking forward to investigating the VS 2005 integration with MSbuild.
5) "Hold down ctrl to make anything a hyperlink" - want to go to where a method, variable, class etc is declared? Just hold down ctrl and click. Navigation was never simpler.
6) Search for all references (etc) - in theory there's "go to definition" in VS.NET 2003, but half the time it doesn't work when you're in a large solution, and I don't believe there's any way of finding all references.
7) The VSS plugin for Eclipse is actually better in my view than the VS.NET support... much easier to understand the configuration, change it on a per project basis etc.
8) Launching Tomcat in a debugger with Eclipse (even without any extra plugins) seems a lot more reliable than trying to make sure that IIS has actually caught up with changes. Why do web projects need IIS to be running even to open in VS.NET? It's crazy.
9) Quick Fix and other source options - get Eclipse to write code for you, fix code for you, extract constants, etc. Fantastic stuff - especially in test-first development, where you can write code which uses the API you *want* to exist, then tell Eclipse to create the shell of that API for you.
10) Compile on save with a really good incremental compiler. This saves huge amounts of time. Oh, and changes really do happen, unlike in VS.NET where if you change an embedded resource, a normal build sometimes picks up the change but sometimes doesn't. (Not to mention VS.NET locking access to files it's built quite often, meaning you can't rebuild them without restarting VS.NET - particularly in terms of XML documentation.)
These are not esoteric features which are hardly ever used - although I could list loads of those too, if you want. These are things I use *every day*. My pair programmer and I are *always* saying how much easier our C# work would be if VS.NET supported the features above. Half of them aren't even in VS 2005 beta 2, as far as I can see - or at least aren't as well implemented. Funnily enough, I can't remember the last time we said something similar the other way round...
So, I've given some of my reasons why I think Eclipse isn't just a step ahead of VS.NET, but leaps and bounds. Now, why do you think VS.NET is better than Eclipse, and do you really not care about the above features?
Parent
Re:Java Java Java! (Score:5, Insightful)
Often when someone asks the question "what languages do you know?" or "what languages are the best?" it shows a lack of CS background and experience. The right question is "what programming paradigm would you use?" or "what programming paradigm is better?" (Of course when you come down to a specific problem, then the choice of libraries might determine the language, but the original poster only specified "large web application" as the requirement so talking about a specific language is pointless).
The difference between the two questions reflects the difference between the two types of education most programmers have. Some have gone to 4-year colleges and got a "Computer Science" degree, while some learned a language in their spare time, or went to technical college. The people from the technical college will know just one language and ask others what languages are the best, what languages they use, etc. To them, learning a new language is hard. What a CS degree teaches (or should teach) is different programming paradigms - procedural, functional, object oriented - along with algorithms and data structures. So if someone knows how to think in terms of objects when they solve a problem, they can program in Java, C++, Python, Ruby and other object-oriented languages.
I used C++ in college, then I learned Java, now I use primarily Python. All I had to do is learn the syntax and some of the common library functions -- all can be done with a good reference book and/or Google in a couple of weeks.
Or if a problem can be better solved with a functional approach, I would use Prolog or Lisp (you can use Lisp for websites too!).
So, I think the original question should have specified the problem more exactly or asked which paradigm would be better, rather than giving a general requirement ("large web application") and then asking for a specific language. This is bound to lead to nothing but arguments about why everyone's favorite language is best, and that's about it.
Parent
The only thing that counts... (Score:3, Insightful)
And you said it...
Maybe language choice is irrelevant, only good people (developers) matter?
Consider the employment market (Score:3, Insightful)
ASP.NET w/C# (Score:3, Insightful)
all these new languages are hype (Score:5, Funny)
Re:all these new languages are hype (Score:5, Funny)
Parent
Re:all these new languages are hype (Score:5, Funny)
I read that as "Everything you love about fortran with none of the hope."
Fortran? Seriously, what's the matter? Was Emacs not available?
Just because you can, doesn't mean you should.
Parent
Where are the best practices for each language? (Score:3, Insightful)
I have long despaired of learning that same information for PHP (with which I have much more experience). I've not yet found a book or other documentation that provides a concrete approach. And looking at existing large-scale projects, e.g., WordPress and others, reveals a myriad of different philosophies. It leaves developers basically trying different things out on different projects, and picking up their own favorite best practices as they go along.
While it's great that the languages are so flexible, well, sometimes it's nice to be guided to a known solid approach. It leads to consistency among and across many developers and time. This makes it easier for new developers to join or take over a project, or even for the original developer to do maintenance on components which were written long ago.
So, where are the recommended approaches for organizing and constructing large-scale applications for PHP (and Python, etc.)?
Python (Score:4, Informative)
Please stop insulting python. (Score:5, Interesting)
Just use a decent python web framework with a real webserver, zope is a waste of time.
Parent
Python + PHP + XML-RPC (Score:5, Interesting)
Python is great for the backend because it has good namespace support, which helps a lot in big, complex programs. PHP, on the other hand, is well known and extremely easy to use for various web-scripting tasks. I have a little PHP function that gets called by the PHP server for every page (without needing to be in the code exposed to the PHP coders) that simply passes the page inputs to Python over XML-RPC and puts the response into a global variable. Then the PHP coders just display the results however they need to, based on the inputs and outputs.
Some nice benefits of such a split system are that it's easy to keep UI logic separate from application logic, and it's easy to split your application up over multiple servers so that it can scale to any load. For example you might have two PHP servers, three Python servers, and a DB server dividing the load. Normal load balancing techniques work just fine for deciding how the machines talk to each other. Pretty nice to be able to just throw another server in where it's needed if you suddenly find a 9/11-type day where your site is getting unexpectedly high loads.
Of course you can split your processing up into more levels if you need to. I like to abstract out all my queries into their own XML-RPC interface that sits in front of the DB so as to not allow direct access to the DB, for security reasons. Anyone trying to hack the DB would have to use my stored queries and work through my XML-RPC interface rather than being able to access the DB directly. If you're dealing with sensitive information, it's just another layer of protection. If you have to access third-party systems that use some nonstandard method of communicating, then it can help to keep your code clean if you create a proxy interface between those systems and your own that speaks XML-RPC. This way the code for speaking to that other system is a completely separate code base and your main code base is kept clean.
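To make the Python half of that split concrete, here is a minimal sketch using only the standard library's `xmlrpc` modules. It is not the poster's code: `handle_page`, `start_backend` and `call_backend` are names invented for this example, and `call_backend` is a Python stand-in for the per-page PHP function described above (the real front end would make the same call from PHP).

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def handle_page(page_name, inputs):
    """Hypothetical application logic: take the page inputs, return data to display."""
    if page_name == "greeting":
        return {"message": "Hello, " + inputs.get("user", "guest")}
    return {"error": "unknown page: " + page_name}


def start_backend(host="127.0.0.1", port=0):
    """Start the XML-RPC backend in a background thread; port 0 picks a free port."""
    server = SimpleXMLRPCServer((host, port), logRequests=False, allow_none=True)
    server.register_function(handle_page, "handle_page")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_backend(url, page_name, inputs):
    """Stand-in for the front end: one XML-RPC call per page request."""
    with ServerProxy(url, allow_none=True) as proxy:
        return proxy.handle_page(page_name, inputs)
```

Because the front end only knows a URL, pointing it at a load balancer in front of several backend processes needs no change to either side.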
Parent
Lotus Domino (Score:5, Funny)
(yes I program with this monstrosity of a system)
Depends (Score:4, Insightful)
Most people tend to forget to take a productivity point of view and let themselves be guided by whatever is available or what's cool. If you follow a productivity approach it will help you make the trade-off decisions between interpreted languages like PHP and compiled languages like C/C++, with ASP and Java somewhere in between.
There is a balance between development and production. When you go live, if your web app is well designed, it should be easy to add hardware to compensate for performance issues (a server costs about US$2,000, or the equivalent of 10-20 hours of developer time).
The single most important piece of advice after recommending that you spend more time on designing the app: don't get married to the language. Be prepared to use PHP to develop quickly and understand what works and what doesn't for your web-app. Once you have solved the usability bugs, investigate how you can drive efficiency by choosing a different language or not.
There is no template for what is the best environment, only your common sense, and oh... did I mention that you should spend more time designing your app?
Depends on what you want to do... (Score:5, Insightful)
Sure, you can use it to dynamically generate images, PDFs and a lot more, but these things tend to slow it down and detract from what it is meant to do. They should be handled by third-party apps, preferably on a different server; that way you separate your processes and keep PHP focused on its task.
Plus, with the improvements in the Zend engine and its object-oriented programming, PHP is now comparable to, and sometimes even faster than, Java.
People will say that it doesn't scale, but they base this opinion on prejudice or on the scalability of the underlying architecture. But PHP's engine is actually more compact than the JVM because it has less to focus on, and thus it can scale alongside Apache the entire way.
And with tons of larger companies moving to PHP, it has proven it can handle the load.
My only complaint though is developers who try to do EVERYTHING in PHP. With all the added modules, it does have the potential but do you really want to waste processing power letting PHP handle all these extra tasks? Use PHP for dynamic webpages and any added processing you need to do, I suggest moving to a secondary app preferably built in C/C++ or even Java. That way you get the most bang for your buck.
Re:Depends on what you want to do... (Score:5, Informative)
What people mean by 'it doesn't scale' is that it doesn't scale. Not that it doesn't run fast enough or have enough functionality for pretty much anything at the small-to-medium sized website...
I have a set of 200 or so websites all running though a self-built PHP template-based content-management system (hey, this was 8 years ago, they were rare then!
And with all those features it's still not scalable. I can't split the system over multiple webservers and begin a transaction on one webserver, have a hardware failure, and have it complete on a different webserver.
I serve about a million page impressions a day (fewer at weekends), so I'm hardly "big iron", but at the moment it's all served from a single machine(*) with a manual backup ready to go. We're (probably) about to triple our daily throughput (time to splash some cash
I can't have the above level of scalability but I can divide up the work over (say) 4 cloned webservers, and use round-robin DNS (low TTL) or transparent-proxy load-balancing to share the load. Then at least if one of the machines goes down (not the proxy
We're probably going to have 2 database servers as well - one in slave mode, one in master mode (all writes to the master, because we use MySQL). The single point of failure then becomes the proxy gateway (because RR DNS is a bit of a pain), so we can have a spare standing by - the configuration of a load-balancing proxy is pretty trivial, and doesn't depend on anything else, so it can be sitting ready to run and swapping ethernet patch cables ought to be all that is necessary.
And that's about as "scalable" as I can make it - not very. All I'm doing is duplicating hardware for speed and reliability. I can have robustness against a machine dying, but that's about as far as I can go. True scalability allows the operation the machine was doing when it died to complete successfully, and PHP ain't there (yet). I guess you could implement it in s/w using lots of state tables, and perhaps get 80% of the way there, but it's an add-on not a built-in, and not a complete solution. Better to go with something that works if you need it...
Just MHO.
Simon
(*) It is a bit of a beast of a machine though
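As a rough illustration of the "cloned webservers behind a load balancer" setup, here is a small Python sketch of round-robin selection that skips machines marked as down. It is an assumption-laden toy, not how any particular proxy works: `RoundRobinPool` and the `web1`..`web4` names are invented for the example, and health checking is reduced to manual `mark_down` calls.

```python
import itertools


class RoundRobinPool:
    """Hands out backends in rotation, skipping any that are marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.down = set()
        self._cycle = itertools.cycle(self.backends)

    def mark_down(self, backend):
        self.down.add(backend)

    def mark_up(self, backend):
        self.down.discard(backend)

    def next_backend(self):
        # Try each backend at most once per call so a fully dead pool fails fast.
        for _ in range(len(self.backends)):
            candidate = next(self._cycle)
            if candidate not in self.down:
                return candidate
        raise RuntimeError("no healthy backends available")
```

Note that this only routes new requests away from a dead machine. Whatever that machine was doing when it died is still lost, which is exactly the limitation described above.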
Parent
Re:Depends on what you want to do... (Score:5, Informative)
Then I guess you've never heard of database-driven sessions. The way you've designed that bad boy, it wouldn't scale in any language.
Here's what we do:
- 8 Apache Webservers
- 3 Million pageviews per day
- Distributed PHP sessioning (Postgresql based)
- PHP module
- Postgresql (no worries with MySQL write locks)
Scaling? We add new machines to the mix, tell our load balancer about the new machines, and we've scaled linearly. A machine goes down? The load balancer redirects to another machine and the session continues without missing a beat. Bottleneck? The database, but then you throw big iron at that.
Look, the web is stateless, if applications are designed from the get-go realizing that fact, heck, you can get a shell script sitting in cgi-bin to scale with your server pool.
There's absolutely nothing in PHP that inherently causes it not to scale. Sure, other languages have easier and sometimes better features built in, but if you're already using PHP, implementing those features is usually worth the few programming hours of effort instead of switching to another language/platform.
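For readers who haven't seen database-driven sessions, here is a minimal sketch of the idea in Python. It is only an illustration under stated assumptions: SQLite stands in for the PostgreSQL setup described above, and the `DbSessionStore` class, the `sessions` table and its columns are all invented for this example.

```python
import json
import secrets
import sqlite3


class DbSessionStore:
    """Keeps session state in a shared database instead of on one web server."""

    def __init__(self, connection):
        self.conn = connection
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS sessions (id TEXT PRIMARY KEY, data TEXT NOT NULL)"
        )
        self.conn.commit()

    def create(self):
        # Random, hard-to-guess ID that would be handed to the browser as a cookie.
        session_id = secrets.token_hex(16)
        self.save(session_id, {})
        return session_id

    def load(self, session_id):
        row = self.conn.execute(
            "SELECT data FROM sessions WHERE id = ?", (session_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None

    def save(self, session_id, data):
        self.conn.execute(
            "INSERT OR REPLACE INTO sessions (id, data) VALUES (?, ?)",
            (session_id, json.dumps(data)),
        )
        self.conn.commit()
```

If two web servers each open their own connection to the same database, a session saved by one can be loaded by the other, so the load balancer is free to send the next request anywhere.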
Parent
The real problem in comparing Java and PHP (Score:5, Insightful)
Java is called a language but in this context it is more of a platform which, frankly, is older, more robust and better thought-out than anything PHP has to offer--at this point. I believe PHP is great for small to medium scale web sites, but once you start to deal with the large structures that enterprise systems require, PHP is just not an option--if you want packages already available to you which are thought-out, mature and stable, like all the various J2EE solutions available.
PHP very well may be faster for an individual page--but what are you comparing that to? Tomcat set up to use JSP? Well, there's a lot of infrastructure there that a PHP developer is probably not going to use for a simple dynamic page. And the fact is, PHP is incorporating a lot of 'heavier' OO features now whose effective use is debatable when considering web apps tied to the HTTP protocol--why build and tear down your entire OO structure every time you load a page? To do that intelligently you want an application server caching these objects...and then we start talking about Java and all the years it has on PHP there.
So, I'm really just saying--some things are right for some projects, others for other projects. Choose wisely.
Parent
Language != module (Score:5, Insightful)
As far as your opinions on PHP not scaling, tell that to IBM, Avaya, Hewlett Packard, Disney, Sprint and the others who get millions of hits a day using PHP. Seems to me if sites that get millions of hits a day can handle the bandwidth using PHP, that it JUST MIGHT be able to scale.
And as for the worst security history, you are again confusing bad programming with the language it is written in. For this analogy, C# and VB still hold that title. Just because the language allows you to make mistakes in your programming does not mean it is the language's fault when you create a recursive function that loops perpetually.
I suggest trying a course in logic; it makes your programming better and your argumentative rhetoric make more sense.
Parent
Re:You are contradicting yourself. (Score:5, Insightful)
Constant exploits? For PHP, or for crappily-written content management systems (ahem, phpnuke) that happen to be written in PHP?
CERT has issued two advisories for PHP itself: CA-2002-05 and CA-2002-20. Looking through the changelog [php.net] I see only a handful of security fixes.
As in most languages, it's possible to write insecure code. I've seen code that executes stuff on the command line, right from a GET string. It's just as possible to write secure code.
One problem with PHP is that it's a simple language, and a lot of beginners with no experience pick it up and can use it to write applications. Knowing nothing about software development or security issues, they tend to write bad, insecure code. This has nothing to do with the language; it simply has to do with the developers. If Python or Ruby came into incredibly widespread use (i.e., available on pretty much any hosting account you can buy, like PHP is), then you'd probably see the same thing happening. It doesn't say anything about the languages; it's simply a matter of inexperienced developers writing bad code.
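The "command line straight from a GET string" mistake is language-independent, so here is a small Python sketch of both the bad and the safer version. Everything in it is hypothetical: the `report` parameter, the `ALLOWED_REPORTS` whitelist and both function names are invented for illustration, and the child process is just the Python interpreter echoing its argument.

```python
import subprocess
import sys

ALLOWED_REPORTS = {"daily", "weekly"}  # hypothetical whitelist of valid inputs


def build_command_unsafe(report):
    # DON'T: the request parameter is pasted into a shell command line, so
    # "daily; rm -rf ~" would become two commands if this string hit a shell.
    return "echo generating " + report


def run_report_safe(report):
    # Validate against known-good values before doing anything else.
    if report not in ALLOWED_REPORTS:
        raise ValueError("unknown report")
    # Pass arguments as a list and never go through a shell.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print('generating', sys.argv[1])", report],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

The same two habits (validate the input, avoid the shell) are available in PHP as well, which is the point being made above: the hole comes from the developer, not the language.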
Parent
Let your developers decide (Score:3, Insightful)
If you're doing the development by yourself, then obviously you should weigh the choices and pick the language that will work best for you. Development time, for example, is highly dependent on how well you already know the languages.
However, if you already have a developer, or a team of developers, to do this development, then whatever you do don't force them to use what you think is the best language. That's a guaranteed way to lower productivity and morale if they think it's a poor choice! Ask them to make recommendations. Maybe even spend a couple of days prototyping various things in different languages first.
One of the nicest things about back ends is that it doesn't matter what language you use (nobody can tell from the outside) and you can easily mix and match languages. There's nothing wrong with writing the majority of the code in PHP or Python for rapid development, but using Java or C++ extensions for a few of the computationally intensive algorithms.
If you are a singlehanded developer (Score:3, Funny)
Our standard enterprise stack these days (Score:5, Insightful)
Java:
front end - Tomcat running JSPs (JSTL or Velocity for templating)
in the middle - Spring and Spring MVC
Closer to database - Hibernate.
Ideally, everything running in same JVM. Add more servers for scalability front-ending them with load balancer with sticky sessions.
No J2EE fluff, easy to find people, good productivity.
ASP.NET... no, really (Score:4, Informative)
Excellent MVC model
Integrated caching capabilities
You can compile your libraries before uploading
Excellent Web Services model
Free tools
Works on Linux (through mono)
Large third party support
Very Fast
Easier to use and deploy than J2EE
Are you sure about E-Bay? (Score:4, Informative)
http://computerworld.com/softwaretopics/software/
I would guess that they're actually using a mix of technologies. Any insiders have any insight they can share? Even anonymously?
Actually, eBay uses Java. (Score:5, Informative)
[blank] rocks! (Score:5, Funny)
Best of all, it is [blank]-oriented so that you just snap functionality together like Lego blocks to get an instant app that runs at the speed of light almost right out of the box! And [blank] scales to every user on the entire planet. And it plugs into XML.
Only a Devry graduate would use anything different. Go with [blank]!
Re:What about security? (Score:4, Interesting)
Although I haven't tried it, you can get similar benefit from using Jython. Having two languages like Python and Java at your disposal has got to be a godsend for a large web app. I'm not sure if you still get to use C modules if using Jython.
Parent