Ask Slashdot: Which OSS Database Project To Help?
DoofusOfDeath writes "I've done a good bit of SQL development / tuning in the past. After being away from the database world for a while to finish grad school, I'm about ready to get back in the game. I want to start contributing to some OSS database project, both for fun and perhaps to help my employment prospects in western Europe. My problem is choosing which OSS DB to help with. MySQL is the most popular, so getting involved with it would be most helpful to my employment prospects. But its list of fundamental design flaws (video) seems so severe that I can't respect it as a database. I'm attracted to the robust correctness requirements of PostgreSQL, but there don't seem to be many prospective employers using it. So while I'd enjoy working on it, I don't think it would be very helpful to my employment prospects. Any suggestions?"
postgresql (Score:5, Insightful)
You will probably be happier in one of the fewer PostgreSQL shops. Think about it: do you want to get it done quick and dirty, or the right way?
Salesforce.com is hiring PostgreSQL guys like mad (Score:5, Insightful)
Re:Hierarchical, er "NoSQL", DBs? (Score:3, Insightful)
NoSQL DBs suffer pretty badly from the inner-platform effect, where the users end up implementing their own classic SQL RDBMS on top of the NoSQL store. "I don't have joins... well, I'll write one in Ruby". You could probably do the community a huge service by PROPERLY reimplementing at least an API-compatible MySQL system on top of a variety of NoSQL services. That way devs could be buzzword compliant, while not actually having to change anything (well, the sysadmins will throw fits at the change for the sake of change, but no one cares about them).
http://en.wikipedia.org/wiki/Inner-platform_effect [wikipedia.org]
The ones that don't inner-platform aren't really using them as "databases" so much as simple persistent stores. Like dumping data into a CSV file. Maybe persistent stores with advanced parallelization, but just persistent stores.
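To make the "I'll write my own join" point concrete, here is a minimal sketch (in Python rather than Ruby) of the kind of application-level join people end up hand-rolling over a key-value store. The `hash_join` helper and the sample `users`/`orders` records are hypothetical, made up purely for illustration; a real RDBMS does this, plus planning, indexing and transactions, for you.

```python
from collections import defaultdict


def hash_join(left, right, key):
    """Inner-join two lists of dicts on `key` (hypothetical helper).

    Builds an in-memory index over `right`, then probes it once per
    row of `left`: a hand-rolled version of a classic hash join.
    """
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)

    joined = []
    for lrow in left:
        # .get() avoids creating empty entries for unmatched keys
        for rrow in index.get(lrow[key], []):
            joined.append({**rrow, **lrow})
    return joined


# Hypothetical records, as they might come back from a document store
users = [{"user_id": 1, "name": "ann"}, {"user_id": 2, "name": "bob"}]
orders = [
    {"user_id": 1, "item": "book"},
    {"user_id": 1, "item": "pen"},
    {"user_id": 3, "item": "ink"},
]

rows = hash_join(users, orders, "user_id")
for row in rows:
    print(row)
```

Only the two orders belonging to user 1 survive the join; user 2 has no orders and the order for user 3 has no matching user, which is exactly the inner-join behavior `SELECT ... FROM users JOIN orders USING (user_id)` would give you for free.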
Re:Video (Score:1, Insightful)
I will note that Postgres also inserts a default value of NULL (same as MySQL) when no value is given and the column has no default value.
So if the default value is NULL then Postgres uses NULL as the default? SHOCK AND FUCKING HORROR. If that's the only "complaint" you can make about Postgres then it's basically perfect.
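For anyone who wants to see the behavior being argued about, here is a rough sketch using Python's built-in `sqlite3` module. Note the assumption: SQLite is only a stand-in here because it runs without a server, so this is not a test of PostgreSQL or MySQL themselves. The table and column names are made up for illustration.

```python
import sqlite3

# In-memory SQLite database, used only as a convenient stand-in
conn = sqlite3.connect(":memory:")

# Nullable column with no DEFAULT clause: omitting it stores NULL
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, note TEXT)")
conn.execute("INSERT INTO t (id) VALUES (1)")
note = conn.execute("SELECT note FROM t WHERE id = 1").fetchone()[0]
print("nullable column got NULL:", note is None)

# NOT NULL column with no DEFAULT clause: SQLite rejects the insert
conn.execute("CREATE TABLE strict_t (id INTEGER PRIMARY KEY, note TEXT NOT NULL)")
try:
    conn.execute("INSERT INTO strict_t (id) VALUES (1)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print("NOT NULL column rejected the insert:", rejected)
```

The first case is the uncontroversial one described above: no value, no declared default, so the column ends up NULL. The second case is included only to show what a constraint violation looks like when the column forbids NULL.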
Re:If "employment prospects" Are All That Matter . (Score:3, Insightful)
If employment prospects are all that matter, stay away from the major SQL databases. They're mostly feature complete, have large established developer communities that are hard to break into (sometimes requiring employment at the sponsoring company) and often have a lot of legacy baggage that limits what you can accomplish.
Meanwhile, in the NoSQL world, people are busy re-inventing the wheel. You can take decades-old techniques and apply them to new features of these databases. For example, Redis doesn't have true clustering support. There's a preliminary draft and some exploration, but it's still really nascent. If you've got the DB chops to implement it and do it well, there's a ton of places that would hire you.
The downside is, of course, that you end up working with NoSQL databases, but the ratio of employment prospects to actual work and knowledge is a lot higher.
Noseql (Score:5, Insightful)
First I should probably burn some karma and say "what a load of garbage". The headline asks what OSS database to HELP with, but the article summary might as well read "Which free SQL-compatible database to learn to use". And on top of that it contains the answer already, along with questionable dirt-showing on MySQL which makes it read like a guerilla-ad for PostgreSQL.
But in any case, it makes a major, huge difference whether the question is "which database codebase to contribute improvements to" or "which free database to learn for best employment chances". Sounds like it's the latter, and in that case a follow-up question is what kind of employment. The one correct answer is "whichever database your employer is using" - don't expect to be able to choose a job on the basis of what database engine they happen to be using in one of the departments at the time. Second best answer is go with both; and again it makes a huge difference whether it's for self-employed web-site design or financial analysis for a stock brokerage firm.
And if you actually went with MySQL, the next question is which database engine. Huh, you ask? Well you see, MySQL is not a single database engine; in actuality it's a front-end to pluggable database engines. The stock release features at least MyISAM, InnoDB, Heap, BDB, NDB and Archive (and a few variations). In general it's a choice between MyISAM and InnoDB, which are a whole different story. When most people say "MySQL has such and such problem" they're actually talking about MyISAM, but MySQL has defaulted to the InnoDB engine for years.
But the third and best answer is "none of the above". In most cases everybody seeking employment in a relevant job will be fluent in SQL and have at least some experience with both MySQL and PostgreSQL, and it'll be rare for the employer to be at all interested in your ability to actually "hack" the database source. NoSQL databases offer ample opportunity to differentiate both on the job market and in the business competitiveness arena by improving the source code (and in most cases, as long as the binaries stay in-house, so can the source, which makes bosses happy, but consult your OSS license).
Why MySQL Wins (Score:3, Insightful)
I love PostgreSQL in theory but hate it in practice. It's a pain in the ass to work with... not very productive. For a long time, I felt it was worth it to endure this for the superior design, feature set, and technical correctness.
But one day I realized that I need to get things done, and switched to MySQL. The learning curve was small but the main kicker was that things just worked and were easily reworked. There are risks, limitations, and problems. It's very imperfect but I get things done now... and never have to, or care to, think about the purist philosophies in which I used to love to indulge.
In the end, you have to give up perfection to go anywhere. Otherwise, it's like having to get half-way there first, meaning you have to get half-way to half-way first, etc., recursively forever. With MySQL I take a reasonable number of precautions for things that can go wrong, ensure there are good backups, and deal with the others as they come.
Now I think MySQL is superior for practical use by a long shot. And I think that's why it's adopted so heavily.
The key ingredients to successful technologies are:
(1) You can do something obviously cool or useful with it.
(2) It's quick and easy to learn and use.
And that's it. This is why so many successful things are made by idiots. Look at HTML. It was made by Tim Berners-Lee back when he knew very little. But 12 year olds were picking it up and making cool (at the time) web pages. Now he knows so much more and has tons of backing from heavyweight organizations and money but cannot seem to even force the success of the Semantic Web. It's hard to learn and hard to work with even when you learn it. Furthermore, it's not obvious to most what cool or useful things you can do with it. Proponents keep saying it'll mature and will be easier when tools and libraries are available to make it easier... That misses the point. Even the tools mostly suck and are buggy because the basic tech. is a pain in the ass to work with. There are philosophical visionaries galore but no substantial progress beyond what grants and job requirements force people to do... and there won't be.
Matthew
Re:With shared hosting (Score:5, Insightful)
Re:Video (Score:5, Insightful)
The simplest way to say it is that MySQL is really more of a data store than a database. You can store stuff in it, and it'll get the data back reasonably efficiently, but in terms of actually operating as a proper compliant database for critical information it just isn't designed that way. It works great for storing the back end for your web server, but if you wanted to store complex data in it and needed it to be 100% accurate, transactional, and reliable, the product just doesn't fit the bill. For all that it's got a paid "enterprise" edition, it's really more in the space of something like SQLite or SQL CE than it is in the space of Oracle, and again it's not an issue of whether it can scale or whether it's buggy; it simply isn't designed to be compliant to the required level. That's largely the reason it works so well as a LAMP back end and is so easy to administer, but it just isn't fit for purpose for much more.