DTDs for Internal IT Documents?
Saqib Ali asks: "A DTD (Document Type Definition) defines the document structure with a list of legal elements. The DocBook DTD is widely used for creating Linux-related documentation. However, I am looking for an XML DTD that is better suited to internal IT documentation and is easy to learn and use. Preferably, I would like a DTD that can be used with OpenOffice. What DTDs are other Slashdot readers using for internal IT documentation? I have created documentation using the DocBook DTD and hosted it on Apache Cocoon. Cocoon lets me transform the XML to HTML or PDF. I would like to keep the same backend infrastructure (i.e. Cocoon) but try out other DTDs that are suited to IT-related documentation. Any ideas?"
Why not Docbook? (Score:5, Interesting)
-molo
Re:Why not Docbook? (Score:2)
Re:Why not Docbook? (Score:4, Informative)
1) I have been using DocBook for a few years, so I know it pretty well. However, to newbies DocBook can be overwhelming (I have been there). I know there is also Simplified DocBook, but there are no good WYSIWYG editors for it either.
2) OpenOffice doesn't have very good support for DocBook. Setting up OpenOffice to support DocBook can be very tricky.
3) Not many good WYSIWYG editors support the DocBook DTD. And if they do, they are not that easy to set up.
What I am looking for is an "Open Standard" DTD that can be used to create simple documentation and is well suited to IT-related material.
Re:Why not Docbook? (Score:3, Interesting)
In summary, I don't think there's a better solution than DocBook, despite the fact that it does not fulfill all of your requirements.
Re:Why not Docbook? (Score:3, Funny)
How about MS Word 2003?
Re:Why not Docbook? (Score:3, Informative)
Laugh it up, but Infopath would allow you to define a schema (DocBook) and let you use Word as an engine to enter the data as one gigantic "form".
Caveats:
1) I have no idea how well it would handle "non-form based" data such as this. It seems it was meant for work-flow enhancement for smaller amounts of data than a book.
2) Infopath seems to be really, really expensive. It is probably outside of the budget of any open-source project.
That said, the solution does exist.
Re: Docbook WYSIWYG (Score:5, Informative)
Issues with today's XML editors (Score:3, Insightful)
One thing I haven't seen much of is an XML editor that validates the document against a DTD or XML Schema in real time. The ones I've seen check only XML syntax (i.e. well-formedness) in real time, with syntax highlighting as a beneficial side effect. But you have to start validation against the DTD/Schema by hand (i.e. push a button), and the feedback is in many cases raw output from the validator presented in a separate pane or window. What I'd like is the highlighting to give me useful feedback the
Re:Issues with today's XML editors (Score:2)
For large documents, are computers powerful enough yet for full real-time schema validation in a way that isn't annoying as hell? I'm still angry over having to turn off every one of Word's magic doodads before it doesn't constantly annoy the piss out of me.
Re:Issues with today's XML editors (Score:3, Informative)
I'm in daily debt to the folks who wrote these.
Re:Issues with today's XML editors (Score:2)
I think it would be better to make the editing commands depend on the DTD, so rather than typing < a > to begin an <a> element, there is a toolba
Re:Issues with today's XML editors (Score:2)
You could have an editor where you type arbitrary characters and which then warns if they veer off the DTD, as you suggest. But this would get annoying since if you are typing 'hello' then the document is not well-formed from typing the first character until you have typed the last. So you'd get lots of warning lights all the time.
It's how most "programmer editors" work. It actually makes for a fast and easy to use syntax checker: if you see that all the text has turned green it's very easy to go up an
Re: Docbook WYSIWYG (Score:1)
Re: Docbook WYSIWYG (Score:2)
We use Morphon in our company for DocBook documents. The primary draft writers, industrial engineering interns (it's for an industrial engineering application), had very little trouble figuring it out.
What about the humanities? (Score:2)
DTDs for the humanities (Score:4, Informative)
Just off the top of my head, I recall TEI [tei-c.org] and TEI-lite being in widespread use. There are quite a few subsets of both. In general it's often easier to strip an existing DTD down to what you need than to try to make a new one from scratch.
DocBook, as others have mentioned, is good for simple documents, and ISO-12083 [xmlxperts.com] is an additional option for more complex ones.
DTDs are passé (Score:5, Insightful)
What you want is a Relax-NG Schema. DTDs only define the barest bones of XML structure. Validating against a schema lets you verify all kinds of things that a DTD can't even express.
(Don't be confused by W3C Schemas. That format stinks.)
Is a schema important for documentation? It depends on how much structure you need, which largely depends on how many uses you have for the documents. My employer actually puts the documentation itself in the schema, and generates manuals from the same text that validates important input files.
Re:DTDs are passé (Score:2, Informative)
DTDs are '90s technology.
What you want is a Relax-NG Schema.
I think you are confused about what a DTD is. It stands for Document Type Definition, and includes all the human-readable specification that describes what all the elements mean as well as the formal, machine-readable part of the specification.
You absolutely cannot get away from DTDs if you are to edit XML sanely. It's essentially editing with a blindfold on otherwise.
This is a completely different matter to what you should use for v
Re:DTDs are passé (Score:4, Informative)
It stands for Document Type Definition, and includes all the human-readable specification that describes what all the elements mean as well as the formal, machine-readable part of the specification
Hmm. You mean it has comments? Because a DTD is nothing more than a machine-readable specification, which often (but not always) comes with comments.
If you somehow came to the conclusion that xml schemas (in general) are not meant to be human readable, take a look at the compact syntax [oasis-open.org] for RNG.
Norm Walsh (i.e. Mr. DocBook) is already making progress [walsh.name] replacing the DTD infrastructure of DocBook with RNG. And guess what, he uses an editor (nXML-mode [thaiopensource.com] for GNU Emacs) that supports RNG!
I guess he must wear a blindfold? Or maybe you should take a read of James Clark's paper on the design of Relax NG [thaiopensource.com]?
Re:DTDs are passé (Score:2)
If you want to use "DTD" to mean that, that's your business. However, the rest of the world uses the term in the sense defined in the XML TR.
Re:DTDs are passé (Score:2)
Sorry, you are wrong and the grandparent poster is correct. DTDs, XML Schema, and RELAX NG are all systems for describing families of XML documents. The grandparent poster is saying that one of the systems (RELAX NG) is superior to the other two. I'd agree with that assessment...
Validation is just checking that an XML document matches a specification, no matter whether the specification is written as a DTD, an XML Schema, or RELAX NG schema. It is not a separate issue at all.
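To make the point about validation concrete, here is a minimal sketch in Python. It is a toy content-model check standing in for a real validator, not an implementation of DTD, XML Schema, or RELAX NG validation. The element names and the `ALLOWED_CHILDREN` table are hypothetical; the idea is only that "valid" means "well-formed and matching some specification", whatever notation the specification is written in.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: element name -> allowed child element names.
# A real DTD or RELAX NG schema expresses far more (order, counts, attributes).
ALLOWED_CHILDREN = {
    "doc": {"title", "section"},
    "section": {"title", "para"},
    "title": set(),
    "para": set(),
}


def check(xml_text):
    """Return a list of problems; an empty list means the document passed."""
    try:
        root = ET.fromstring(xml_text)  # step 1: well-formedness
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]

    problems = []
    for parent in root.iter():  # step 2: compare against the model
        allowed = ALLOWED_CHILDREN.get(parent.tag)
        if allowed is None:
            problems.append(f"unknown element <{parent.tag}>")
            continue
        for child in parent:
            if child.tag not in allowed:
                problems.append(
                    f"<{child.tag}> not allowed inside <{parent.tag}>"
                )
    return problems
```

Swapping the table for a different specification language changes how the rules are written down, not what the check is for.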
Re:DTDs are passé (Score:3, Informative)
Except that DTDs are also currently the only standard way to expand general entities in a document. I wish there was a standard entity definition language independent of validation languages such as RELAX NG. Tim Bray once had an idea [tbray.org], but that seems to have gone nowhere. :-( XSLT could be used to do it, but transformations of that nature are slow and clunky when compared with entities. XML 2.0 needs a new doctyp
Re:DTDs are passé (Score:4, Insightful)
(Don't be confused by W3C Schemas. That format stinks.)
This is why XML still sucks. The technology is volatile, even down to the schema format!
So, even after several years of not knowing what to focus on to learn how to use XML effectively, I still wouldn't know what to focus on to learn how to use XML effectively. Standard interchange my ass.
Re:DTDs are passé (Score:2)
Re:DTDs are passé (Score:1)
They are also all using it differently. The only benefit to XML right now is a common syntax and a large number of APIs to access it. Above syntax, there is little common ground in how XML is actually applied. Also, I'd argue that only a few XML-related skills are really transferable from one job to another, because the second job is very likely to be using different APIs, schema definitions, etc.
Re:DTDs are passé (Score:2)
Skip XML for source. (Score:5, Interesting)
Been using it for a year, and I'm absolutely flippin' delighted with it. Structured documentation that's both open-standard and eminently readable, yet delivers great PDFs.
Re:Skip XML for source. (Score:2)
Just something to consider.
DTD? No, I use XSDs. (Score:4, Informative)
I usually use XSD with JAXB (Java Architecture for XML Binding) because it provides me with an object-oriented approach to reading/writing that specific format. There are also a couple of projects that do the same for C++ (Rogue Wave). Personally, I *love* XML data binding, because I no longer have to deal with DOM or SAX.
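For readers who haven't seen data binding, here is a rough sketch of the idea in Python rather than Java. It is not JAXB and nothing is generated from an XSD: the `Inventory` and `Server` classes, the element names, and both helper functions are hypothetical, written by hand to show what "objects in, objects out" looks like compared with walking a DOM tree in application code.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class Server:
    name: str
    ip: str


@dataclass
class Inventory:
    site: str
    servers: List[Server] = field(default_factory=list)


def load_inventory(xml_text: str) -> Inventory:
    """Bind an <inventory> document to plain objects."""
    root = ET.fromstring(xml_text)
    return Inventory(
        site=root.get("site", ""),
        servers=[
            Server(name=s.get("name", ""), ip=s.findtext("ip", default=""))
            for s in root.findall("server")
        ],
    )


def dump_inventory(inv: Inventory) -> str:
    """Serialize the objects back to the same XML shape."""
    root = ET.Element("inventory", site=inv.site)
    for s in inv.servers:
        el = ET.SubElement(root, "server", name=s.name)
        ET.SubElement(el, "ip").text = s.ip
    return ET.tostring(root, encoding="unicode")
```

The rest of the program then works with `inv.servers[0].ip` and never touches the parser; a binding tool automates writing the two functions above from the schema.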
Re:DTD? No, I use XSDs. (Score:2)
I think the thousands of acronyms that came along with XML have ruined a whole generation of computer science students.
Re:DTD? No, I use XSDs. (Score:4, Insightful)
Would you rather I had said this:
I don't use Document Type Definitions, but I do use Extensible Markup Language Schema Definitions for all of my document formats. There is a good link to learn the basics here [w3schools.com].
I usually use Extensible Markup Language Schema Definitions with Java Architecture for Extensible Markup Language Binding because it provides me with an Object-Oriented approach to reading/writing that specific format. There are also a couple projects to do that for C++ (Rogue Wave). Personally, I *love* Extensible Markup Language Data Binding, because I no longer have to deal with Document Object Model or Simple Application Programming Interface for Extensible Markup Language.
Realistically, these acronyms are the only thing that makes our posts even readable/understandable.
Re:DTD? No, I use XSDs. (Score:2)
I agree, but the constant introduction of XML-related acronyms creates a sort of attention deficit disorder among young programmers. There are so many bandwagons, right now, that it is quite mind-spinning. In fact, I'm now wondering what was so bad about plain text and CGI programs, after all.
Re:DTD? No, I use XSDs. (Score:3, Informative)
Perhaps...
It's always a compromise (Score:5, Insightful)
As a compromise, you might want to check out the progress on XHTML 2.0 [w3.org]. It has a syntax not too far removed from the HTML we've all seen before, but 2.0 is closer to the DocBook model of semantics rather than presentation. It is also more likely to be supported by 3rd party clients in the future. (There is already an XHTML 2.0 renderer available for Mozilla.)
Whatever you do, make sure the markup relates to meaning and NOT how it looks. Looks change, but if you don't attend to the meaning/semantics from the beginning, it is prohibitively difficult to put it in later. For example, it's easy to make all annotations and citations red. It's not so cut and dried to change all red text to annotations (when citations or emphasized text may be formatted in red).
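The annotations-and-citations example can be sketched in a few lines of Python. This is an illustration under assumptions, not anyone's production stylesheet: the `<citation>` and `<annotation>` element names, the `STYLE` table, and `to_html` are all made up for the purpose.

```python
import xml.etree.ElementTree as ET
from html import escape

SOURCE = (
    "<para>See <citation>RFC 2119</citation> and "
    "<annotation>check with ops</annotation>.</para>"
)

# Presentation lives in one table. Both kinds of element happen to be red
# today; either can be restyled later without touching any document.
STYLE = {"citation": "color:red", "annotation": "color:red"}


def to_html(elem, style):
    """Render the children of a semantic element as styled HTML spans."""
    parts = [escape(elem.text or "")]
    for child in elem:
        css = style.get(child.tag, "")
        inner = to_html(child, style)
        parts.append(f'<span class="{child.tag}" style="{css}">{inner}</span>')
        parts.append(escape(child.tail or ""))
    return "".join(parts)


html_now = to_html(ET.fromstring(SOURCE), STYLE)
html_later = to_html(ET.fromstring(SOURCE), {**STYLE, "annotation": "color:blue"})
```

Going from meaning to looks is a table lookup. Had the source only said "this text is red", there would be nothing to look up to tell the citation from the annotation.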
Why, O Why? (Score:4, Insightful)
foo=bar*.734
Re:Why, O Why? (Score:4, Interesting)
# "Write Once, Publish Anywhere" - You only have to prepare a single document; you can either use it as it is with stylesheets or convert it later to different physical media and formats, including plain text, XHTML and PDF.
# Since your documents are in a non-proprietary text format, you can edit them with any text editor, and ensure their continuity and cross-platform compatibility.
# The physical layout of documents is separated from the content.
# Retrieving specific information from a structured document is very easy.
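As an illustration of the last point, here is a small Python sketch that pulls every command out of a procedure document. The `<procedure>`, `<step>`, and `<command>` elements and the sample content are hypothetical (loosely DocBook-flavoured), chosen only to show that retrieval becomes a one-line query once the structure is explicit.

```python
import xml.etree.ElementTree as ET

DOC = """<procedure id="restore-db">
  <title>Restoring the database</title>
  <step><para>Stop the service.</para><command>service db stop</command></step>
  <step><para>Load the dump.</para><command>db-restore /backup/latest.dump</command></step>
</procedure>"""


def commands(xml_text):
    """Return the text of every <command> element, in document order."""
    root = ET.fromstring(xml_text)
    return [c.text for c in root.iter("command")]
```

Doing the same against a word-processor file or free-form text would mean guessing from fonts or indentation which lines are commands.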
Too close to home. (Score:3, Interesting)
The company I'm working for are currently working on a "business narrative markup language" which is intended to be an overly simple markup for writing business documentation. I think it's going fairly well, though without a trial in the real world it's hard to say.
The main problem we found with XHTML was a complete lack of useful structure. For example, in XHTML2 you can have plain text inside a section, or a paragraph inside a section with plain text inside the paragraph, or <l> elements inside the paragraph with plain text inside them. That is three different ways to mark up exactly the same thing. If you don't know why that is bad, try to write an XPath which selects the first line of text of the first section in the document.
The main problem we found with DocBook is that there are far too many elements for any mere mortal to learn, and we wanted a secretary to be able to understand our language. By contrast, ours achieves most of its structure through nested items, and blocks of text (effectively paragraphs.)
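The "first line of the first section" exercise can be sketched in Python to show where the pain comes from. This is a simplified illustration: namespaces are omitted, the three sample documents are invented, and `first_line` is a hypothetical helper, not a complete treatment of XHTML2 content models.

```python
import xml.etree.ElementTree as ET

# Three markups of the same content, as described in the post.
VARIANTS = [
    "<doc><section>First line</section></doc>",
    "<doc><section><p>First line</p></section></doc>",
    "<doc><section><p><l>First line</l><l>Second line</l></p></section></doc>",
]


def first_line(xml_text):
    """Return the first line of text in the first <section>, or None."""
    section = ET.fromstring(xml_text).find("section")
    if section is None:
        return None
    # Case 1: bare text directly inside the section.
    if section.text and section.text.strip():
        return section.text.strip()
    p = section.find("p")
    if p is None:
        return None
    # Case 2: text directly inside the paragraph.
    if p.text and p.text.strip():
        return p.text.strip()
    # Case 3: explicit line elements inside the paragraph.
    line = p.find("l")
    return line.text.strip() if line is not None and line.text else None
```

Every consumer of the documents has to carry those three branches; a stricter vocabulary that allows only one of the shapes collapses the function to a single lookup.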
There are also groups in OASIS trying to nail exactly this sort of thing but XHTML2 is being a fairly big pressure, to say the least.
On a related topic... (Score:3, Interesting)
A "roll your own" approach works quite well. Assuming you only need a small number of features, you could either just devise a small set of elements to do the job, or you could devise an outer structure and include XHTML or XHTML2 as the content of sections in that structure.
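A minimal sketch of the second option, a home-grown outer structure with XHTML inside, might look like this in Python. The `<entry>`, `<topic>`, and `<body>` elements, the sample date, and the `summarize` helper are all hypothetical; only the XHTML namespace URI is standard.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

ENTRY = f"""<entry xmlns:h="{XHTML}" date="2004-05-01">
  <topic>backup rotation</topic>
  <body><h:p>Moved the nightly job to <h:code>cron.d</h:code>.</h:p></body>
</entry>"""


def summarize(xml_text):
    """Split an entry into home-grown metadata and XHTML paragraph text."""
    root = ET.fromstring(xml_text)
    # The invented outer elements carry the semantics ...
    meta = {"date": root.get("date"), "topic": root.findtext("topic")}
    # ... while anything in the XHTML namespace is treated as content.
    body = root.find("body")
    paragraphs = (
        ["".join(p.itertext()) for p in body.findall(f"{{{XHTML}}}p")]
        if body is not None
        else []
    )
    return meta, paragraphs
```

Keeping the borrowed vocabulary in its own namespace means new outer elements can be invented as needed without colliding with anything XHTML defines.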
This is the way I'm working on the development weblog I've been rolling. If I need a new element to bring in more semantics, I invent it at the time I need it. Since nobody else needs to use the content, and XSL does all the present