


Open Source XML Databases?

tarun asks: "I am creating the next version of my open source UDDI registry and decided to use an XML database backend, if I can find a good one. I made this choice because I was impressed by Oracle's and DB2's XML capabilities in my past lives. However, when I tried looking for an open-source alternative, it seems there is nothing around except perhaps Xindice, which clearly is less than perfect. I am looking for something that can work with more than one existing database (I will ship my software with MySQL, but if a large organization wants to deploy it, it should be able to do so using Oracle, DB2 or whatever they want to use), and Xindice currently only works with Berkeley DB. Also, I am looking for something that can create database tables for me given an XML schema (I can tweak it later to create indexes, stored procedures, etc.) and, given an XML document, write it to these tables. If it supports something standard like XUpdate or XQuery, that is even better."

"There are some other XML database projects, but either they have too few features or they are not open source. What is the XML-aware portion of the Slashdot community using? Have you ever run across such problems? Do you guys create your database schemas by painstakingly copying every element in every XML schema you have to handle to database tables and write huge amounts of parsing/deparsing code both ways?"

This discussion has been archived. No new comments can be posted.

  • Right here, baby (Score:1, Informative)

    What features doesn't eXist [sourceforge.net] have that you need?
    • Re:Right here, baby (Score:3, Informative)

      by tarun ( 83353 )
      From eXist homepage,

      XPath support: XPath support is still preliminary. Some functions and numerical operators are missing, and only abbreviated XPath syntax is supported. The parser also has some problems recognizing the full range of Unicode characters. I have started to write a new XPath parser (using JavaCC instead of ANTLR) to overcome these limitations.

      XUpdate: The basic model has been designed to provide efficient, index-based retrieval. As a drawback, eXist does not currently support direct manipulations of the DOM tree like node insertions or removals. A document always has to be deleted or updated as a whole.

      This is clearly a major restriction for applications which need to directly manipulate the DOM tree. Such applications have to create a new document (as XSLT does) and insert this into the DB after all transformations are done. Documents should be kept small to easily reinsert them whenever they change.

      DOM manipulation methods and XUpdate are planned for one of the next releases.

  • by jefflinwood ( 20955 ) on Friday August 09, 2002 @12:33PM (#4040018) Homepage
    Try using an XML to Java toolkit with a Java object-relational mapper. XML databases aren't really all they are cracked up to be. Really quickly, projects you should look at are: Also, you might want to check out Brett McLaughlin's O'Reilly book on Java XML data binding, or the Wrox Professional Java XML book.
  • The link should be http://soapuddi.sourceforge.net/ [sourceforge.net]
  • Why XML? (Score:4, Interesting)

    by jacoberrol ( 561252 ) <jacoberrol@nOSPAM.hotmail.com> on Friday August 09, 2002 @01:25PM (#4040424)
    Could someone explain to me the benefits of an XML database? I can't think of an XML document that can't be expressed as a relation. I've seen vague references to performance advantages and ease of development [xmldb.org] for certain applications, but I haven't heard a convincing argument. Am I missing something, or is this just XML hype?
    • Re:Why XML? (Score:1, Insightful)

      by jashua ( 558596 )
      I've found the benefit is in retrieving XML documents from the database. It's easy to parse a document and write it to relational tables, but with complicated documents it can be a real pain to write XML from the data in a relational database. It's also more coding work.
    • Re:Why XML? (Score:5, Informative)

      by DevilM ( 191311 ) <devilm.devilm@com> on Friday August 09, 2002 @02:35PM (#4041038) Homepage
      Sure! To use XML with an RDBMS you have to do one of two things. The first is to map the XML to a relational schema. It is well understood that doing this has two main problems. The first problem is the resulting schema: supporting hierarchical data results in a complex and ugly schema. The second problem is a spatial one. To retrieve a given XML document, the database must pull data from a variety of pages. This results in poor performance, as the database has no context for storing the different bits of an XML document together on disk. FYI, this is one of the main reasons for the creation of OO databases.

      The second way of handling XML in an RDBMS is to store the document as a CLOB. Storing it as a CLOB has the advantage of solving the two issues above, but introduces one of its own: you can't query the data that is represented by the CLOB because it is all stored in a single column. This means you have to extract the document from the CLOB and parse it before being able to use any of the data. Some databases now have built-in XML parsers so you can do this from stored procedures and combine the XML document with tabular data, but the performance sucks.

      I do cover why you would want to use an XML database and how to use Xindice in an article I wrote for DevX that can be found here [devx.com].
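The two RDBMS approaches described above can be sketched roughly as follows. This is a minimal illustration using Python's standard sqlite3 and ElementTree modules; the sample document, the table names (orders, items, order_docs), and the column names are all made up for the example and come from nowhere in this thread.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are illustrative.
DOC = """<order id="7">
  <customer>Ada</customer>
  <item sku="A1" qty="2"/>
  <item sku="B9" qty="1"/>
</order>"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders     (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE items      (order_id INTEGER, sku TEXT, qty INTEGER);
    CREATE TABLE order_docs (id INTEGER PRIMARY KEY, body TEXT);
""")

root = ET.fromstring(DOC)
order_id = int(root.get("id"))

# Approach 1: shred the hierarchy into one table per element type.
conn.execute("INSERT INTO orders VALUES (?, ?)",
             (order_id, root.findtext("customer")))
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(order_id, i.get("sku"), int(i.get("qty"))) for i in root.findall("item")],
)

# Approach 2: keep the whole document in a single text (CLOB-like) column.
conn.execute("INSERT INTO order_docs VALUES (?, ?)", (order_id, DOC))

# Shredded data can be queried directly in SQL.
shredded_qty = conn.execute(
    "SELECT SUM(qty) FROM items WHERE order_id = ?", (order_id,)
).fetchone()[0]

# The CLOB has to be fetched and parsed before its contents are usable.
body = conn.execute(
    "SELECT body FROM order_docs WHERE id = ?", (order_id,)
).fetchone()[0]
clob_qty = sum(int(i.get("qty")) for i in ET.fromstring(body).findall("item"))

print(shredded_qty, clob_qty)
```

Even in this tiny case the shredded form needs two tables for one small document, while the CLOB form pushes all of the querying work into application code.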
      • Ok, so what you're talking about here and in your article is a special-purpose database for storing XML documents. This is an important distinction, because I think a lot of people would read XML database to mean: a general-purpose data store, which uses XML documents to store the data.

        One other question, and forgive me if this is naive because I'm still thinking in relations. Can an XML database generate XML dynamically from other documents?

        Say I have an RDBMS with three tables: STUDENTS, CLASSES, ENROLLMENTS. I can join STUDENTS and ENROLLMENTS to give me all of the students taking a particular class. How does an XML database handle that? How do I create relationships between the documents?
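For reference, the relational side of that question might look like the sketch below. It uses Python's sqlite3 with an in-memory database; the column names and sample rows are assumptions added for illustration, and only the three table names come from the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students    (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE classes     (class_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE enrollments (student_id INTEGER, class_id INTEGER);
    INSERT INTO students    VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO classes     VALUES (10, 'Databases'), (20, 'Compilers');
    INSERT INTO enrollments VALUES (1, 10), (2, 10), (2, 20), (3, 20);
""")

def students_in_class(class_id):
    """Join STUDENTS and ENROLLMENTS to list everyone taking one class."""
    rows = conn.execute(
        """SELECT s.name
           FROM students s
           JOIN enrollments e ON e.student_id = s.student_id
           WHERE e.class_id = ?
           ORDER BY s.name""",
        (class_id,),
    )
    return [name for (name,) in rows]

print(students_in_class(10))
```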
        • (don't get too much into my implementation, this is just an example)

          If you have students belonging to many classes then you might have a STUDENT tag with many CLASS tags with CLASSID attributes. An XML document for these students would contain many STUDENTs. And if later you decide those students should be broken up into dormitories, then you might arrange STUDENTs under DORMITORY tags.

          RDBMSs aren't good at the latter idea: taking data and moving it around in a tree. RDBMSs can't deal with trees very well at all. Often, once you get back to a tree, you'll find that what took many tables is really just a way of representing many flat parts of a tree.

          It doesn't suit everything, and I'd imagine that these databases would be RDBMSs too, with the XML document as a table type.

          You'd use XPath to select a node on the XML tree. But this is a lightweight method, and it only suits simple situations. XQuery is for anything more complex.

          For an XPath query, if you wanted a student who had an attribute of studentID="4", then the query might look like "student[@studentID='4']".
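In XPath, an attribute match is written as a predicate in square brackets on the element being selected. The sketch below tries this with Python's ElementTree, which supports only a subset of XPath; the sample document and its attribute values are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: each student nests the classes they are taking.
DOC = """<students>
  <student studentID="4" name="Ann">
    <class classID="10"/>
    <class classID="20"/>
  </student>
  <student studentID="5" name="Bob">
    <class classID="10"/>
  </student>
</students>"""

root = ET.fromstring(DOC)

# Select the student element whose studentID attribute equals "4".
matches = root.findall("student[@studentID='4']")
names = [s.get("name") for s in matches]

# The relational "join" becomes a walk down the tree: keep the students
# that have a class child with classID 10. Plain Python filtering is used
# here because ElementTree does not support nested predicates.
in_class_10 = [
    s.get("name")
    for s in root.findall("student")
    if s.find("class[@classID='10']") is not None
]

print(names, in_class_10)
```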

        • Can an XML database generate XML dynamically from other documents?
          Sure, and this is the way that many RDBMS vendors have tacked on XML support. This is just a wrapper though, and it wouldn't have the speed or usefulness of a proper XML database.

          One forms data relationships in the same way as before. Just mark part of one tree as relating to part of another. That hasn't really changed (from what I've read).

      • Re:Why XML? (Score:3, Interesting)

        What about using a hierarchical DB? They *are* the most efficient kind, in all terms.
    • Am I missing something, or is this just XML hype?

      Basically, yes. "XML database" doesn't mean much about the database itself, unless you mean that the file format used to store the data is XML (which is pretty much uninteresting, except for being fairly braindead for many sorts of data sets). It tells you nothing about how, logically, the data is organized and what operations it supports (which is what "object database", "relational database", "hierarchical database", etc. attempt to convey), which is generally what a programmer using the database is _most_ interested in.

      It may mean that data is presented in XML at query time and XML queries are accepted; if so, that's a moderately more interesting claim but really speaks to a database interface (a la JDBC or pydb) rather than anything interesting about the database itself. Which is not to be dismissed, but formatting results as XML is trivial compared to having to implement, for instance, relations in code, and that sort of interface can be added to any kind of underlying database.

      It doesn't really speak at all about the on-disk storage structures (even if data is "stored as XML"), which is often the most interesting thing from a performance standpoint and often interesting from a usability standpoint (e.g. "can efficiently store data in the existing native filesystem" is often mandatory for non-dedicated applications).

  • Umm..no (Score:3, Informative)

    by quinto2000 ( 211211 ) on Friday August 09, 2002 @01:25PM (#4040425) Homepage Journal
    Do you guys create your database schemas by painstakingly copying every element in every XML schema you have to handle to database tables and write huge amounts of parsing/deparsing code both ways?
    I use Dia [lysator.liu.se], Agata [codigolivre.org.br], and Dia2SQL [debian.org]. (There are several variants of Dia2SQL). What exactly are the benefits of XML? I'm new to databases, but this way seems to me more efficient. At what level of complexity does the XML schema become so useful?
  • Remember what they say about XML databases: SELECT * FROM books.
  • by DevilM ( 191311 ) <devilm.devilm@com> on Friday August 09, 2002 @02:26PM (#4040975) Homepage
    I am curious what the poster thinks is wrong with Xindice. He seems to indicate that he is looking for an XML to RDBMS mapping engine, not a native XML database. Further, he seems to think that Xindice is limited to one type of store. This is simply not true.

    In Xindice, XML documents are stored in collections that have filers associated with them that do the actual storage. Xindice provides a Filer interface as well as several different implementations, both in-memory and persistent. Additionally, it would be quite trivial to implement a DBFiler that stored the data in an RDBMS.
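The general idea of a pluggable storage layer can be sketched as below. This is NOT the Xindice Filer API (which is a Java interface); the class and method names here are hypothetical Python stand-ins, meant only to show how a collection can delegate storage to an in-memory backend or an RDBMS-backed one.

```python
import sqlite3
from abc import ABC, abstractmethod

class Filer(ABC):
    """Hypothetical storage interface; not the real Xindice Filer."""

    @abstractmethod
    def write(self, doc_key, document): ...

    @abstractmethod
    def read(self, doc_key): ...

class MemoryFiler(Filer):
    """Keeps documents in a dict; contents are lost when the process exits."""

    def __init__(self):
        self._docs = {}

    def write(self, doc_key, document):
        self._docs[doc_key] = document

    def read(self, doc_key):
        return self._docs.get(doc_key)

class DBFiler(Filer):
    """Stores each document as a text column in a relational table."""

    def __init__(self, conn):
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS docs (doc_key TEXT PRIMARY KEY, body TEXT)"
        )

    def write(self, doc_key, document):
        self._conn.execute(
            "INSERT OR REPLACE INTO docs VALUES (?, ?)", (doc_key, document)
        )

    def read(self, doc_key):
        row = self._conn.execute(
            "SELECT body FROM docs WHERE doc_key = ?", (doc_key,)
        ).fetchone()
        return row[0] if row else None

class Collection:
    """A collection delegates all storage to whichever filer it was given."""

    def __init__(self, filer):
        self._filer = filer

    def put(self, doc_key, xml_text):
        self._filer.write(doc_key, xml_text)

    def get(self, doc_key):
        return self._filer.read(doc_key)

books = Collection(DBFiler(sqlite3.connect(":memory:")))
books.put("b1", "<book><title>Example</title></book>")
print(books.get("b1"))
```

Swapping DBFiler for MemoryFiler changes nothing in the calling code, which is the point of putting the storage behind an interface.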

    So again I ask, what is wrong with Xindice?
  • What do you want? (Score:1, Interesting)

    by Anonymous Coward
    Do you want to force XML into a relational database, or do you want to do relational operations on XML data in an XML database? Are you more interested in indexing/searching XML data, like information retrieval systems do, or updating it like relational databases do?
  • by broody ( 171983 )
    You're looking in the wrong place.

    It sounds like what you're looking for is really an object-relational wrapper rather than an XML-based RDBMS. Take a look at OJB [apache.org] and see if it is not really what you are seeking.
