Want to read Slashdot from your mobile device? Point it at m.slashdot.org and keep reading!

 



Forgot your password?
typodupeerror
×
Slashdot.org News

Comparing MySQL and Postgresql 34

Mr Poet would like to start a discussion on the following: "I am an AVID user of Postgresql, over the past 6 months I have seen more and more about MySQL. I have done a little research and outside of MySQL being extremely fast I have not seen any reason why people are so excited about it. This is not a slam on MySQL. I just don't understand the reasons. Could we have a technical discussion as to why?" That's a good question. What are the strengths and weaknesses of both programs. Are there areas of deployment where one is more appropriate than the other, or are these programs fairly interchangeable?
This discussion has been archived. No new comments can be posted.

Comparing MySQL and Postgresql

Comments Filter:
  • Unless I had a very small, maintainable, perfect program running on top of perfect libraries, I'd hesitate to use threads where they weren't necessary. Threads make everything so much more fragile, and limit your choices (and chances) for error recovery--or even error tracking, as the thread causing the corruption may not be affected by it. At least with separate processes you have a limited section of shared memory, accesses to which you can limit to a fairly small section of code. With threads, if you have a memory corrupting bug and you can't review the entire program to find it (or don't find it when you look), you're pretty much screwed...

    Remember, kids, the thread you seg fault may not be your own...
  • FWIW, PostgreSQL still doesn't do row locking. It doesn't lock while you read. Period. This is substantially better than row locking, provided you're ready for the results.
  • Just as an important note, only two RDBMS's support truly parallel executiion og queries -- Oracle and DB2. This feature is called MPP in Oracle and costs mucho buckos. MPP Oracle is one of the fastest(if not the), most scabale databases availible. It would be interesting to see if Oracle will port this feature to Beowulf.

    Threads are merely a design issue. mySQL decided to execute mutliple queries over threads rather than processes. Oracle, Informix, DB2, and PostgresSQL use multiple processes in UNIX environments. This largely due to their age. When they were first written UNIX threads either didn't exist or weren't standard enough to work reliabily across platforms. IMHO, threads are better because they can be started and closed fater and require fewer resources, but then, processes are cool as well.
  • I believe that the HAVING clause is designed for
    the use of aggregate comparisons.
  • I've been happily using MySQL for three years; it saved our asses when one an outside vendor implemented something using mSQL. Their program worked, but took two or three minutes to run, and our client was freaking out. I discovered MySQL, ported the program over, and the MySQL system took two or three seconds to run.

    I kinda miss subselects, but not much; they're scheduled for a point release in the near future. I think the word from Monty is "September", but don't hold me to that.

    I've never missed transactions. The one time they would have been nice I was able to catch the exception and delete the incomplete information.

    MySQL's license is pretty simple -- as long as you don't sell a product that can only function with MySQL, you don't need a license. If you provide a MySQL adaptor, a PostgreSQL adaptor, and an Oracle adaptor, I don't think you need one.
  • Actually in 6.5 there now is a serial type that will create the sequences and and the indexes for you and automatically increment the count with each insert.
  • Thus said, in my setup (web backend database), I choosed PostgreSQL, mainly for the totally random reason that MySQL was'nt compiling on my developpement platform (Debian 2.1/Sparc) and PostgreSQL was packaged for that distribution.

    However, I had made some research on the issue. Here are my conclusion :

    - MySQL is reputed faster (MySQL developper admit that speed is no. 1 priority), but PostgreSQL advocate point that the speed disadvantage of PostgreSQL is mostly due to fsync() call for every write request to the db, wich is a Good Thing (TM) but can be disabled if speed is an issue. My personnal opinion is that fast hardware is cheap, thus lowering the importance of this issue.

    - MySQL has a lot of nice third-partie support and goodies and better doc. However, the doc for PostgreSQL 6.5 had been much improved and is quite helpful.

    - As pointed out in previous post, support for transaction is a plus for PostgreSQL. Also, the PostgreSQL developpement team seem genuinely focused on SQL92 compliance, another Good Thing (TM) from my newbie point of view.

    - Another nice aspect of PostgreSQL I don't know much about is native geometric data type (could be interesting in GIS application) and support for BLOB (don't know if MySQL support BLOB). I don't need these features, but it's nice to know there available.

    Thus said, PostgreSQL has some shortcoming (at least in v.6.3.2, the one I use):

    - Can't use aggregate function in WHERE clause. For example, the following statement is legal with MySQL :

    SELECT * FROM sometable WHERE id = max(id);

    In PostgreSQL, you need some hoopla to do this :

    SELECT * FROM sometable WHERE id = (SELECT max(id) FROM sometable);

    This might be fixed in Pg 6.5, I don't know.

    - There is no AUTO_INCREMENT in PostgreSQL. You have to create a "sequence" (CREATE SEQUENCE seq;) and invoke nextval('seq') to extract a serial number. This is fixed in v6.5 : PostgreSQL now have a SERIAL data type that is auto-incrementing.

    That being said, don't take my word for it (I *really* don't have much experience with database).

    Just my 0.02 CDN$.
  • You only pay if you want to sell a product
    derived from or including mysql that is non
    redistributable or non-free.

    or summat like that. either way for practical
    use, say on a website, even if you are a footsie
    100 company, you will not have to pay.

    It is true that it is not fully GPL'd though.
  • This happens to be a question I have some experience in. I have been a professional MySQL developer for about 2 years now working at a 400 million dollar corporation. We first started looking at RDBMS's for the specific task of creating a separated, abstracted data reporting model in which we converted our OLTP database into a more friendly OLAP database. The special needs associated with OLAP made us look first for speed, second for flexibility, third for hooks into programming languages.

    I can say beyond a shadow of a doubt that for OLAP and dimensional data modeling MySQL is without peer. One of the applications I have written has an active concurrent userbase of nearly 10,000 clients. (Not 10,000 possible clients, 10000 concurrent clients!) And the MySQL engine has no problems cranking out the data that they need.

    MySQL has excellent hooks into a plethora of programming languages. The first applications written at my corporation were written in Perl with the DBI interface. The interface between Perl and MySQL is flawless. I have never had a single problem in dealing with recordsets via the DBI interface. Once we shifted away from pure reporting applications and more into dynamic web-based analytical processing a shift was made to PHP3. The PHP hooks into MySQL are amazing, the speed is unmatched and the ease of use alowed for excellent rapid application development in our high pressure enviroment. Once you learn however, that until PHP4, PHP did not have a true garbage collector so if you didn't explicitly free your result sets, you ended up with Apache processes that took up 140Mbs. of ram ;)

    The current shift is away from PHP and into a more formalized OO model with Java Servlets. The MM Mysql Drivers (available for download from a link on the MySQL download page) are excellent and fast. I wrote a custom database connection pooling algorithm based on some nice circular skip list heuristics, and we have never had a problem with it. I am still not sure if I like it as much as PHP, but the ties into MySQL are amazing.

    As for MySQL's limitations... well, it does NOT have subselects... yet. With MySQL 3.23.* we will soon have subselects. At first this was something that really bothered me, until I realized that 90% of all things I needed to do via subselects were easily and more efficiently implemented as joins. And the rest could easily be implemented with the use of temp-tables (after all, that's what a subselect does anyways ;).

    The lack of transaction and rollback capabilites can be a problem, but with the soon advent of atomic operations we will have most of the neccessary tools to emulate some of the functionality of transaction-rollback mechanisms. This, however, is a valid complaint, and if you are developing an OLTP instead of an OLAP application, it might be a good idea to go with an Oracle, Postgress, or Solid.

    And finally, the MySQL mailing list has some of the most brilliant minds in the world of database development I have personally ever encountered. There is seldom a question asked that does not either receive a professional concise answer, or a pointer to the proper place too receive that answer. So, all that said, look at the subject of the message for a concise version of this posting, and if anyone has anything they would like to dicuss with me about MySQL development, feel free to get in touch with me.

    ed [mailto]
  • That's what it comes down to. Consider gdbm/dbm as even lower on the functionality list, and even faster (I reckon, but don't truly know). Not all programs need a full RDBMS. That's all, no hidden agenda.

    --
  • If your comment about "nuts" is my other comment above, you din't read it write, or I didn't write it clearly enough. There is a continuum from a full RDBMS on one end to gdbm/dbm on the other, with mysql being in the middle. There certainly are times when even mysql is overkill, and gdbm is perfectly good enough. And there are times when mysql is not good enough, and you need a real RDBMS with transactions.

    --

  • >- Can't use aggregate function in WHERE clause. For example, the following statement is legal with MySQL :
    >SELECT * FROM sometable WHERE id = max(id);

    Hmm...IIRC, you can't do this in SQL92. So it's actually a symptom of PostgresQL's stricter adherence to SQL.

    Of course, with my luck, I'm totally wrong about this, and deserve every flame I get because of it.
  • 1) Don't you mean that MySQL can perform multiple (read) operations on *same* tables simultaneously? Doing differenting things to different tables sounds like no big deal, not nearly as hard as doing different things to the same table.

    2) Is anyone aware of a current benchmark comparing PostgreSQL to other DBs? The standard line about PostgreSQL performance seems to be "PostgreSQL was pretty slow, but 6.5 is a huge improvement"... with no numbers to compare.

  • Interbase has had a similar design for a long time, which has not helped it get anywhere in the marketplace.

  • I've never missed transactions. The one time they would have been nice I was able to catch the exception and delete the incomplete information.

    That's absurd! That is what you use transactions for -- ensuring concurrency and atomicity among updates. MySQL is barely a relational database, and you could do almost as well using serialized hashtable objects in your choice of programming languages. :-)

    I have never really seen any speed problems with postgresql -- I built a web-based groupware and e-mail application for a liberal arts college and based it on postgresql; at any given time, the db has tens of thousands of rows (maintaining user preferences, info about postponed messages, addressbooks, and myriad other things in the email app alone), and performance is comparable to if not better than an Oracle namespace of similar magnitude on the development server at my current job doing db development for a major pharmaceuticals company.

    The moral of the story: if you're backing a web site, your bottleneck is most likely not going to be the DB, unless you're opening the DBMS with every page you serve (i.e. use a connection pool OR ELSE). You're more likely to be network-I/O-bound most of the time, and a "lightning fast" DBMS won't help that at all. However, if you're writing an application that requires that more than one person access/update the data simultaneously, you'll want transactions, or else you can deal with the mangled db.

    Good luck. I'd suggest you read Philip and Alex's Guide to Web Publishing [photo.net] for more info on why this is important. You should also check out Mr. Greenspun's clever ArsDigita Community System -- it does almost everything that a web/db application needs to do, and it caches db connections.

  • Many people write web application. These application don't need all the extra goodies of a (more or less) complete SQL implementation.

    The big differences are:
    1) Transaction management
    2) Stored procedures

    Take for example a "Slashdot" system. This system doesn't require very much of the database.

    It doesn't need transactions. Most action is making selects on texts to view. After that the
    "transaction" is complete. The second most use option is probably submitting a text (such as this). This can be done with a single insert command. The last option is updating user settings. This may require transaction handling: what happens if the user is "logged in" twice and changes his user settings at the same time? This situation is probably considered "unlikely" or an "acceptable bug". From this point of view, Slashdot is just like many other web-sites: all it requires is speed.

    However, I you try to build a real world financial administration,you cannot live with such "acceptable bugs". This requires more, most important the two features mentioned above.

    Now the question is whether many organisation would allow such critical data to live on PostgreSQL. System administrators are used to Oracle, Sybase, Informix etc. These names they trust. PostgreSQL only has become reasonably stable in the last 1.5 year and still has to build its name. Only a few months ago some critical features (e.g. row-locking) have been implemented to make it feasible to make such an administration system. The name still has to grow, but has a great potential. After it has done so, you will find more comments of people being enthousiastic about PostgreSQL.
  • by 7021 ( 15479 )
    Interesting I see this on Ask /. cause i was gonna submit the same question. We are currently in the process of setting up a web app for our clients and are trying to decide which db will run the app. We currently develop using postgres but the speed of MySQL is very attractive.

    I have done as much research as Google could point me to and as far as i can see MySQL screams but it is not feature rich. Here is the summary of an afternoon (last week) of research (Disclaimer: I still have to do benchmarks myself but this is a start.)
    MySQL is fast but it lacks some important stuff that we decided we couldn't live with out.

    • Doesn't support inner selects - I tried to convince my self that I could live without these but today while doing some data cleaning i was using them all over the place.
    • no support for transactions, they suggest table locking but when you are updating several tables at once locking all of them maynot be the best thing to do. They are going to implement some sort of transaction process but rollbacks won't be incorporated.

    Postgres on the other hand was developed in a very academic enviroment and is extremly feature rich. The major draw back to postgres I could see was that it doesn't scale very well and i am not sure if 6.5 addressed any of these issues. That being said if the db won't scale then it really doesn't matter how many features you have.

    They guys who developed MySQL have a really cool and usfull tool on there web site that runs a script and lets you know exactly (It is VERY detailed) what features each db has.

    I strongly suggest anyone insterested in benchmarking features of dbs check this out. db compare [mysql.com]

    We decided that we would use postgres until we do our own benchmarks. It may scale enough for what we need, the load shouldn't be too high for our use.

    more info would be great, also the timeline for MySQL to incorporate inner selects would also be handy cause i know they havbe it at the tops of the features they are going to implement.

  • Although MySQL lacks some features the Postgres has, MySQL is fully threaded whereas Postgres is not. In short, Postgres cannot perform parallel operations; MySQL can operate on different tables simultaneously.
  • I've been using MySQL to backend my web sites for about 6 months. It is fast.

    However, what are the real comparison numbers? Are we talking 3-4% on 1 million rows?

    If PostgreSQL [postgresql.org] is only slightly slower that MySQL [mysql.com], I would consider switching.

    A.

    --
    Adam Sherman

  • Too bad the PostgreSQL documentation is so misleading about commercial databases not having this feature. Oracle has it. So does Solid [solid.com].
  • ...and support for BLOB (don't know if MySQL support BLOB). I don't need these features, but it's nice to know there available.

    MySQL supports them. I really like the ease with which one can throw around large blocks of text using MySQL - easier than with postgres or Oracle IMHO.

    I looked long and hard at both of these when I started doing db backed web development (about 1 year ago). I started with postgres, but there were occasional dropped connections and even server crashes. I really liked the feature set of postgres, but the consensus at that time was that mySQL was more stable - so I switched. And it was true. Not one crash or problem of any kind yet with lean and mean MySQL. I would like to see a larger limit on varchar, and subselects (coming soon!).

  • Actually, Informix uses threads.

    Sybase and Oracle still use separate processes, but I think Informix uses threads.

  • You point out the two missing bits of mysql.. inner selects are nice, but you could write a routine that used a temporary table and hid that from you. A key mysql speed advantage, although it screams everywhere really, is the setup and tear down of database connections. Great for web. It has been very stable for me, compared to sybase, no worrys with segments and killing stuck processes and backup servers and asking how full is the database and getting back the answer "it depends"... I would rather be on mysql than anything because of the speed, the runs-everywhere nature (runs quite happily on a little laptop) the regexps, blobs and alter table statement (insert columns and modify data types on the fly in a production running database! hell yeah!), the speed, the documentation, the datatype support, the speed, the mysql command line environment, the stability, the speed and did I mention the speed? they do have some various join options I think that can help with inner selects, did you read their chapter on how to work round the missing features? I thought it was quite good. The AC suggesting DB files as an alternative to mysql is nuts, unless you are storing one huge table of key=value pairs, forget DB files. -Justin
  • Go to the PostgreSQL site and look for documents about MVCC. It's a great improvement over previous versions of PostgreSQL and was added in 6.5. Hardly any other database (even commercial) has this cool feature. It's about accessing and updating the same row concurrently.
    --
    OS lover
  • click here [usask.ca]

    Excerpt:

    Unlike most other database systems which use locks for concurrency control, Postgres maintains data consistency by using a multiversion

    model. This means that while querying a database each transaction sees a snapshot of data (a database version) as it was some time ago,
    regardless of the current state of the underlying data. This protects the transaction from viewing inconsistent data that could be caused by
    (other) concurrent transaction updates on the same data rows, providing transaction isolation for each database session.

    The main difference between multiversion and lock models is that in MVCC locks acquired for querying (reading) data don't conflict with
    locks acquired for writing data and so reading never blocks writing and writing never blocks reading.

    --
    OS lover
  • As far as I can see, the tradeoff goes like
    this:

    PostGresQL:


    • BAD: Lousy documentation, many
      undocumented bugs, inconsistent behavior
      from one release to the next. Not reliable.
    • GOOD: Has cursors, transactions, and rollbacks.


    MySQL: The opposite.

Get hold of portable property. -- Charles Dickens, "Great Expectations"

Working...