Linux Software

What Is XML And Why Should I Care?

Anonymous Coward asks: "I've been reading a lot about XML, and I know Slashdot uses it for some features, but I haven't found a site or tutorial that gives clear, good examples of it. There is a lot of software for Windows for developing XML applications. However, I have only seen XML parsers for Linux (no applications). Many of the tutorials found on the Web about XML are not that good. It's all abstract. I am looking for good examples..." This may be a FAQ, but I too would like to see the "Layman's Description of XML and Why It is Cool."

  • Let's say I want to move stuff from one database to another. Here's how it would be done with XML.

    <company name="slashdot" employees="2">
    <employees>
    <employee>
    <name value="rob malda"/>
    <nick value="cmdrtaco"/>
    </employee>
    <employee>
    <name value="jeff bates"/>
    <nick value="hemos"/>
    </employee>
    </employees>
    <homepage>
    <url value="http://slashdot.org/"/>
    <title value="slashdot"/>
    <slogan value="news for nerds. stuff that matters."/>
    </homepage>
    </company>
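
A minimal sketch of the receiving side of that transfer, assuming Python's standard xml.etree.ElementTree and sqlite3 modules. The table layout and the load_employees helper are made up for illustration; the XML is the sample above.

```python
import sqlite3
import xml.etree.ElementTree as ET

# The sample document from the comment above.
XML = """<company name="slashdot" employees="2">
  <employees>
    <employee>
      <name value="rob malda"/>
      <nick value="cmdrtaco"/>
    </employee>
    <employee>
      <name value="jeff bates"/>
      <nick value="hemos"/>
    </employee>
  </employees>
  <homepage>
    <url value="http://slashdot.org/"/>
    <title value="slashdot"/>
    <slogan value="news for nerds. stuff that matters."/>
  </homepage>
</company>"""


def load_employees(xml_text, conn):
    """Parse the XML and insert one row per <employee> (hypothetical schema)."""
    root = ET.fromstring(xml_text)
    company = root.get("name")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS employee (company TEXT, name TEXT, nick TEXT)"
    )
    rows = [
        (company, emp.find("name").get("value"), emp.find("nick").get("value"))
        for emp in root.iter("employee")
    ]
    conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # stand-in for the destination database
count = load_employees(XML, conn)
rows = conn.execute("SELECT name, nick FROM employee ORDER BY nick").fetchall()
print(count, rows)
```

The sending database would do the reverse: walk its rows and write out elements. Neither side needs to know anything about the other's storage.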

    --

  • The key reason for the lack of "killer applications" in XML is that it most immediately benefits information consumers, not producers. XML is most effective at bridging gaps between dissimilar systems. Slashdot's use of XML for the slashboxes is an excellent example. XML makes it possible for Slashdot to offer foreign information to its native audience without having to produce that information internally.

    An information producer doesn't get any immediate gain by providing their information in XML. The gain comes when a suitable number of consumer sites have taken advantage of the XML data the producer has supplied, thereby generating traffic returning to the producer. But just the fact that I've offered my entire site's information as XML doesn't profit me at all. The gain is dependent upon consumer sites using that XML information.

    I'm sure if I'd been more interested in Economics while an undergraduate I'd be able to pepper this post with impressive-sounding econ terms. Consider yourself lucky.


    Why'd you say 'burma'?
  • Of course lisp expressions are less robust than XML, because you can't tell what a parenthesis matches.
  • ...is Glade. It saves the entire user interface of a GTK+ application as a big XML file.

    The output is something like this:

    <widget>
    <name>myButton</name>
    <type>GtkButton</type>
    <label>This is my button.</label>
    <signal>
    <signalname>clicked</signalname>
    <handler>on_myButton_clicked</handler>
    </signal>
    </widget>

    (or something like that.)
    With libglade, you can just (at runtime) read the XML file and build the user interface on the fly. Among other good things, this allows you to tweak UI details without a recompile. ("Dang, I misspelled that label! Gotta relink, now!")
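
To make the "build it at runtime" idea concrete, here is a rough sketch in Python. It is not libglade's API and does not create real GTK+ widgets; it only shows the general pattern of reading the description and wiring handler names to functions. The build_widget helper and the HANDLERS table are hypothetical, and the XML is the approximate sample above.

```python
import xml.etree.ElementTree as ET

# The approximate Glade-style description from the comment above.
UI_XML = """<widget>
  <name>myButton</name>
  <type>GtkButton</type>
  <label>This is my button.</label>
  <signal>
    <signalname>clicked</signalname>
    <handler>on_myButton_clicked</handler>
  </signal>
</widget>"""


def on_myButton_clicked():
    # Application code; the XML only refers to it by name.
    return "button clicked"


# Hypothetical lookup table from handler names in the XML to functions.
HANDLERS = {"on_myButton_clicked": on_myButton_clicked}


def build_widget(xml_text, handlers):
    """Turn one <widget> description into a plain dict (stand-in for a real widget)."""
    node = ET.fromstring(xml_text)
    widget = {
        "name": node.findtext("name"),
        "type": node.findtext("type"),
        "label": node.findtext("label"),
        "signals": {},
    }
    for sig in node.findall("signal"):
        widget["signals"][sig.findtext("signalname")] = handlers[sig.findtext("handler")]
    return widget


button = build_widget(UI_XML, HANDLERS)
result = button["signals"]["clicked"]()  # simulate a click
print(button["label"], result)
```

Changing the label means editing the text file and restarting, with no compiler involved.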

    Also, it's just a big text file, so while you might find it easier to build the initial interface in Glade, the tweaking can be done easily enough with vi or whatnot...

    --ryan.

  • Miva Merchant, a shopping cart/transaction package, is based on "Mivascript", which is an XML schema for e-commerce type apps. The parser is called Miva Empressa. I've used it on Linux and Solaris, but I think they also have an NT and HP-UX version.

    Twister, from bCandid Software, is an NNTP-to-Web toolkit for serving Usenet with a Web interface and building large-scale message boards. It's used at Deja and ZDnet, to name a few. Twister has functions and several other elements which are compiled into a template file. The data is displayed according to the XML in the template.
    Dave

  • Great, XML is the new god. It's a data structure. But what can I do with it? So let's say I have an XML document like so:

    <whoAreYou>
    <firstName>Bob</firstName>
    <lastName>Dobbs</lastName>
    <likes>
    <favoriteColor>Periwinkle</favoriteColor>
    </likes>
    </whoAreYou>

    Now what? How can I interface with this in a web application? What if I have 10,000 records? Should I NOT use a SQL database and put everything into this big HTML-like file? That doesn't seem efficient.

    The million dollar question: XML is a data structure, but is it a *database*?

    I found Webmonkey's intro (http://hotwired.lycos.com/webmonkey/98/41/index1a.html?tw=authoring) helpful.
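
One possible answer to the "now what?" part, sketched in Python with the standard library. The record_to_dict and render helpers are hypothetical names, and this is not tied to any particular web framework. The sketch treats the XML record as something that arrives from elsewhere and gets turned into ordinary program data; it says nothing against keeping 10,000 records in a SQL database and using XML only to move them around.

```python
import xml.etree.ElementTree as ET
from html import escape

# The sample record from the comment above.
RECORD = """<whoAreYou>
  <firstName>Bob</firstName>
  <lastName>Dobbs</lastName>
  <likes>
    <favoriteColor>Periwinkle</favoriteColor>
  </likes>
</whoAreYou>"""


def record_to_dict(xml_text):
    """Pull the fields out of one <whoAreYou> record."""
    root = ET.fromstring(xml_text)
    return {
        "first": root.findtext("firstName"),
        "last": root.findtext("lastName"),
        "color": root.findtext("likes/favoriteColor"),
    }


def render(person):
    """Produce an HTML fragment a web page could include."""
    return "<p>{} {} likes {}</p>".format(
        escape(person["first"]), escape(person["last"]), escape(person["color"])
    )


person = record_to_dict(RECORD)
page = render(person)
print(page)
```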
  • You are right of course, but the real problem at the application level has always been persistent storage of data and the abandonment of undocumented proprietary file formats.

    Last year I had to deal with two sets of data that were in dead, undocumented file formats. One I had to reverse engineer to get the data out. For the second, I found a third-party program that could read part of the files and did the rest by hand.

    XML creates the possibility of an industry standard for the storage and subsequent transmission of data. The self-defining nature guarantees that I will not have to deal with the month of hell I had to go through. It will now be just a day or two of hell.

    As to the history, everyone I know who deals with data translation is an SGML expert. XML feels pretty natural to them. The XML hyperbole will crash in another year or so, and soon after we'll have a clear view of the real utility of this markup language.

    All in all, there's no reason to be bitter...well unless you were part of the Cyc project that was one of the catalysts of the fall for AI.
  • by Anonymous Coward
    In the most basic utility definition, XML is a standardized method for structuring data.

    Of course this may not have any direct relevance to you. What it means is that I can create a data format based on XML that you can:

    1) Use directly, given a Document Type Definition (DTD) or a descriptive XML Schema, if there is one.

    2) Translate into your own format using the same DTD or XML Schema.

    Since XML has a really, really basic overall structure (see the <tag>...</tag> and <tag/> post above), it is very human readable. It is also very easy to parse (at this basic level).

    All of this combines to create an environment where the possibility exists for many different things.

    1) Different applications will be able to share the same common data. For example, you could do basic word processing in Tuxedo Office and edit the mathematical equations within a document with a specialized program.

    2) Create industry standard XML-based DTDs for specifying orders etc. and link businesses directly to other businesses without having to deal with the EDI middlemen.

    3) Anything you can think of...

    It is still early in the XML game, and the implications may be wide and personal, or it could end up as just another tool for those currently using SGML. In the short term you'll see it pop up as an exportable format in desktop applications. Perhaps later XML will be used for the internal formats as well.

    It's a good format and I've started using it to replace INI files used to store data in some contract applications I'm working on. When combined with simple structured storage, it almost makes me want to expand on this to create a mid-sized database application.
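
As a sketch of the INI-replacement idea, assuming Python's standard xml.etree.ElementTree: the settings layout (section and option elements) and both helper functions are invented for this example, not taken from the poster's applications.

```python
import xml.etree.ElementTree as ET

# A made-up settings file: sections and options, much like an INI file.
SETTINGS_XML = """<settings>
  <section name="window">
    <option name="width">800</option>
    <option name="height">600</option>
  </section>
  <section name="user">
    <option name="name">guest</option>
  </section>
</settings>"""


def load_settings(xml_text):
    """Read the XML into a nested dict: {section: {option: value}}."""
    root = ET.fromstring(xml_text)
    return {
        section.get("name"): {
            opt.get("name"): opt.text for opt in section.findall("option")
        }
        for section in root.findall("section")
    }


def save_settings(settings):
    """Write the nested dict back out as an XML string."""
    root = ET.Element("settings")
    for sec_name, options in settings.items():
        section = ET.SubElement(root, "section", name=sec_name)
        for opt_name, value in options.items():
            ET.SubElement(section, "option", name=opt_name).text = str(value)
    return ET.tostring(root, encoding="unicode")


settings = load_settings(SETTINGS_XML)
settings["window"]["width"] = "1024"          # change one value
round_trip = load_settings(save_settings(settings))
print(round_trip)
```

Unlike a flat INI file, the same approach extends to options nested several levels deep without inventing a naming scheme for the keys.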
  • www.xml.com [xml.com] has a good startup guide [xml.com] to XML for those who still don't know what it is.

    The best thing about it is that it is an open, easy to parse data format. Creating your own apps is easily done in any programming language, so you don't need to rely on other people to create the software for you.

    And because an XML document can carry or reference its own DTD (the rules on what is allowed within a document), others can easily use their own XML software with your data in ways you never intended. Which is a feature, apparently.

    But XML is, by definition, abstract. Once you start using it, it might make more sense. Look at it as a structured text file and it might be easier.

  • One hears a lot about XML this and XML that. XML isn't really that complicated. It's basically just a text format with built-in structure (markup). Don't get me wrong: I do think XML is very useful. But I also find that XML is a bit misunderstood, especially in the media. "XML Primer" was a good book I read on this. I don't remember the author though. Oh yeah, and um "first post". -dennis
  • by stab ( 26928 ) on Friday March 24, 2000 @06:53AM (#1176494) Homepage
    I recommend you scope out the WebReference [webreference.com] site for a lot of information on XML.

    In particular, the XML Expert [webreference.com], as he's known, has posted a number of interesting articles, and if you start from the bottom, you should be well aware of what the fuss is by the time you finish reading!

  • XML is little more than a syntax for writing down tree structures. It is equivalent in expressive power to S-expressions (the core data type in the Lisp family of languages - 42 years old in 2000). I'm far from the only person who has noticed this - go see this PDF [bell-labs.com] by Philip Wadler at Bell Labs, especially pages 5-8. He also has some other good XML links [bell-labs.com].
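
The tree equivalence is easy to see in a few lines of Python. This is only an illustrative sketch: the to_sexpr function and the "(@name value)" convention for attributes are assumptions made here, not anything from the linked paper, and the sketch ignores text that follows a child element and does no escaping of quotes.

```python
import xml.etree.ElementTree as ET


def to_sexpr(elem):
    """Render an element tree as a Lisp-style S-expression string."""
    parts = [elem.tag]
    # Attributes become (@name "value") sub-lists (an arbitrary convention).
    for key, value in elem.attrib.items():
        parts.append('(@{} "{}")'.format(key, value))
    # Leading text content, if any, becomes a quoted string.
    text = (elem.text or "").strip()
    if text:
        parts.append('"{}"'.format(text))
    # Child elements are rendered recursively.
    parts.extend(to_sexpr(child) for child in elem)
    return "(" + " ".join(parts) + ")"


doc = ET.fromstring('<foo id="1"><bar>Hello</bar><baz/></foo>')
sexpr = to_sexpr(doc)
print(sexpr)  # (foo (@id "1") (bar "Hello") (baz))
```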

    The only reason XML excites people is it looks like something they're familiar with (SGML/HTML). All the old representational issues that the AI community has been grappling with for decades will now be recast into XML terms, and "solved" in half-ass fashion by people who won't know where to look in the literature for existing techniques.

    Microsoft loves XML - have you thought about why they like it? It's because they can claim to be "open" and "conforming to standards" by using its syntax, and still have enough control over the underlying semantics to keep developers and users on the upgrade treadmill.

    XML - more than enough rope, with a godawful syntax to boot. Long live () and ""!

  • by Anonymous Coward on Friday March 24, 2000 @06:34AM (#1176496)
    It's one of those things where, once you start using it, you start to see how significant it really is. Let me go over some high points:

    1. The web has shown us how useful a mechanism that plain text is for communication. In this day, essentially anyone or anything can read simple text. It is ubiquitous (I will use this word again).

    2. When two things need to communicate, they need to establish a method of communication. In the annals of the computer industry, many forms of communication involve "one-off" type plain text communication mechanisms. Think flat files. Think fielded files (COBOL copylib's anyone??? ARRRGH!!!) Think comma delimited, tab delimited, etc. XML is essentially a contender in this arena. XML happens to be better.

    3. XML is a better mechanism for many reasons.

    a. It represents hierarchical data well (this is a key piece). It is difficult to effectively represent "has-a" type relationships in a tab delimited file... (Customers have orders, orders have items, items have descriptions, etc.)

    b. It has built in mechanisms so that using third party tools called parsers, you can (without writing a line of code), validate that an XML document is *syntactically* correct. Think about how important this is when communicating between two systems. When you know before you even touch the data that it is syntactically correct, that simplifies things a great deal.

    c. It is human readable. Tags are meant to be self describing, so that you can look at an XML document, and have a clue what the data represents that you are looking at.

    4. When combined with ubiquitous (to use that word again) protocols like HTTP over TCP/IP, which are supported by most systems today, XML becomes an extremely effective form of communication between two arbitrary systems. The operating systems, hardware platforms, and underlying architectures become completely irrelevant (with respect to the two systems) because the form of communication is so trivial to use.
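
Points 3a and 3b can be sketched in Python with the standard xml.etree.ElementTree parser. The customer/order/item document and the two helper functions are invented for illustration. Note that this parser only checks that a document is well-formed (syntactically correct); it does not validate against a DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical "has-a" data: a customer has orders, orders have items.
GOOD = """<customer name="Acme">
  <order id="1001">
    <item sku="A-1"><description>Widget</description></item>
    <item sku="B-2"><description>Gadget</description></item>
  </order>
</customer>"""

# Tags closed in the wrong order: not well-formed.
BAD = "<customer><order></customer></order>"


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def item_descriptions(xml_text):
    """Walk the hierarchy and list (order id, sku, description) triples."""
    root = ET.fromstring(xml_text)
    return [
        (order.get("id"), item.get("sku"), item.findtext("description"))
        for order in root.findall("order")
        for item in order.findall("item")
    ]


print(is_well_formed(GOOD), is_well_formed(BAD))
print(item_descriptions(GOOD))
```

The syntax check happens before any application code looks at the data, which is the point made in 3b.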

    *Obviously* building a system based on XML is no small matter. *Obviously* XML is not the end-all be-all of the computing world. *Obviously* XML is not going to cure cancer.

    But it is *really* cool...
  • by dlc ( 41988 ) <(dlc) (at) (sevenroot.org)> on Friday March 24, 2000 @06:54AM (#1176497) Homepage

    XML suffers from the same problems that a lot of "popular" technologies suffer from: overhype. XML has a lot of potential to change the way you move data around. You can share data between totally different applications. You can post an XML version of your headlines (such as sites like Slashdot, LWN, Freshmeat, et al do) and have other sites snarf them to list the headlines. It's a dessert topping and a floor wax. You may not believe it, but it will cure your asthma, too.

    OK, enough hype.

    XML is a data description standard that relies on pairs of tags, which are enclosed by < and >. These tags can be nested, and this nesting represents a hierarchy. For example:

    <foo>Hello!</foo>

    Here, the foo tag has a value of "Hello!". I could just as easily have written the same thing using attributes:

    <foo value="Hello!" />

    This tells us the same thing. Here is a nested tag:

    <foo>
    <bar>Hello!</bar>
    </foo>

    So, bar lives inside foo, and has the value "Hello!". Who cares about all this stuff? Why does this matter? Glad you asked.
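
For what it's worth, here is how a program might read each of those three forms, sketched with Python's standard xml.etree.ElementTree. The variable names are arbitrary.

```python
import xml.etree.ElementTree as ET

# The value as element content, as an attribute, and inside a nested element.
as_text = ET.fromstring("<foo>Hello!</foo>")
as_attr = ET.fromstring('<foo value="Hello!" />')
nested = ET.fromstring("<foo><bar>Hello!</bar></foo>")

text_value = as_text.text              # content between <foo> and </foo>
attr_value = as_attr.get("value")      # the value="..." attribute
nested_value = nested.find("bar").text # content of the child <bar>

print(text_value, attr_value, nested_value)
```

The information is the same in each case; the choice between content and attributes is up to whoever designs the format.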

    Basically what it means is that I can take my data, in whatever format I keep it in (whether it be text files, HTML files, PGP-scrambled MD5 hashes, or even something really stupid like an Access database), convert it to XML format, and it is easily usable by other programs.

    How? you ask. In order for a document to be well-formed XML, it has to meet some pretty stringent requirements, such as all tags being closed and properly nested. In addition, you can define your data types in advance in a DTD (Document Type Definition). Some standard XML document types are RSS (Rich Site Summary, a Netscape-induced standard that lets you describe a site's contents; this is what Slashdot uses) and CDF (Microsoft's Channel Definition Format, used for their (failed) push technologies).

    Yeah, great. Contrary to what you may be reading and such, XML is not revolutionary; XML is not earth-shattering; XML is not new. XML is a good idea that just happens to have a lot of people, and therefore a lot of momentum, behind it.

    How do I use it? Well, the first thing XML requires is a parser. A parser (usually) reads in the XML, turns it into some sort of a parse tree, and then outputs it into some format your target application finds useful. There are many parsers out there, many written in Java, but of course also in C, Perl, Python, Tcl, and others. Slashdot uses the XML::RSS module to slurp in headlines from the 8 zillion other sites that make up the slashboxes on the right of the page.
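
A rough sketch of that headline-slurping step, using Python's standard xml.etree.ElementTree rather than the Perl XML::RSS module Slashdot uses. The feed contents, the example.com URLs, and the headlines and render_box helpers are all made up for illustration; a real consumer would fetch the feed over HTTP first.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 0.91-style feed; titles and URLs are placeholders.
FEED = """<rss version="0.91">
  <channel>
    <title>Example News</title>
    <link>http://example.com/</link>
    <item>
      <title>First headline</title>
      <link>http://example.com/1</link>
    </item>
    <item>
      <title>Second headline</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>"""


def headlines(feed_text, limit=10):
    """Return up to `limit` (title, link) pairs from the feed."""
    root = ET.fromstring(feed_text)
    items = root.findall("./channel/item")
    return [(i.findtext("title"), i.findtext("link")) for i in items[:limit]]


def render_box(feed_text):
    """Format the feed as a plain-text box: site title, then one line per headline."""
    root = ET.fromstring(feed_text)
    lines = [root.findtext("channel/title")]
    lines += ["  * {} <{}>".format(t, l) for t, l in headlines(feed_text)]
    return "\n".join(lines)


print(render_box(FEED))
```

The parser turns the text into a tree, and the few lines after it turn the tree into whatever the consuming site wants to display.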

    The story of XML is the story of potential. There is tons of it: Potential to share data among applications and among businesses. But you still have to do much of the work. Premade solutions (such as Perl's XML::RSS) tend to be specific purpose solutions, or very general purpose (like the expat parser, written in C, which you plug in to your application to parse XML). Most of the work still needs to be done by the programmer in question; XML provides a framework for data sharing.

    This framework is entirely developed by the developer who controls the data. While you can use predefined DTDs if you want, you are not at all obligated to do so. Recently /. ran a review [slashdot.org] of "Docbook: The Definitive Guide"; Docbook is an example of a premade DTD for technical writing and documentation. But of course everyone's data is different, so your DTD will reflect your data exactly, without you having to modify it to fit into someone else's schema.

    XML is only a part of the story; it describes the data itself, with nothing about how the data should be presented or connected to other data sources. These other parts have their own markup languages, XSL (eXtensible Stylesheet Language) and XLL (eXtensible Linking Language). There are tons of X_L languages (eXtensible Query Language (XQL) anyone?) which are designed to fill in the various gaps.

    Microsoft, for all their faults, have been doing a lot with XML lately. They are moving the native formats for their Office suite to be XML-based; there's CDF I mentioned earlier; they developed a business-to-business language called BizTalk (which is just a DTD and some assorted supporting programs/parsers/etc). IBM has also done a great deal with XML and Java, producing parsers and translators.

    Hope I didn't ramble or jump around too much.

    darren


    Cthulhu for President! [cthulhu.org]
