Using the DocBook DTD for Internal Documents?
Saqib Ali asks: "These days, most of the Linux Documentation is created using DocBook DTD. I was wondering if it will be useful for a large Enterprise to create Internal IT documents using DocBook DTD. Any success stories where a large enterprise converted all of its internal IT documentation to DocBook, with management's support? Any other things/issues to keep in mind before embarking on such a mission?"
I end up having a lot of the same questions (Score:3, Interesting)
Re:I end up having a lot of the same questions (Score:3, Interesting)
DocBook, on the other hand, has a lot of complicated markup-- I mean who enjoys using the PARA tag to open and close each and every paragraph? It would drive me insane. Then, after you finally find an editor that suits your needs, you still have to monkey around trying to convert the documents. I was able to get a DB file into HTML without too much pain, but PDF? Never managed it. I spent too many hours on what essentially would have translated the DB XML into TeX source anyway! Why not just write in TeX and be done with it?
Finally, there is LyX for LaTeX which looks to be a WYSIaboutWYG editor, although I find it very convenient to just use emacs. I think the only problem I've had so far is getting figures to lay out within the text how I want, whereas TeX is pretty happy shoving them later, so that the body of the text can remain as fluid as possible. You can see the results on my site [ichimunki.com] (where I suppose I ought to include a tarball of the actual LaTeX source files and the simple shell script that drives all the processing).
Your site.. (Score:3, Interesting)
Ignoring the utterly braindead ``foo'' quotes, those filenames are ultra lame.
DocBook lets you specify a section ID which ends up being mapped to a filename when generating HTML; doesn't LaTeX have something like that?
Re:Your site.. (Score:2, Interesting)
And yes, there is an option to have the resulting
In a perfect world, I'd like to see a system that combined the best of Wiki, TeX, and DocBook (I have nothing against XML, I just don't know if I'm in love with DB's DTD yet), so that you could have the pages be fairly interactive for online references (especially useful in a corporate setting), but still generate standalone documents from the entire work. All with complete revision control, of course.
I settled on what seemed to be the best compromise available so that I could have a single set of source files produce both printed matter and a website. Ultimately the possibilities with XML seem greater (via stylesheets and xsltproc and custom document parsers written in languages like Perl or Ruby), but getting from XML to PostScript or PDF is the part I had problems with. I like to think if I had problems with it, so would others. But then I limited myself to Free Software, whereas someone willing to use non-Free software might easily find an off-the-shelf package to get around the PS/PDF hurdle.
Re:Your site.. (Score:3, Informative)
Check out Apache Cocoon [apache.org] and Norman Walsh's DocBook stylesheets [sourceforge.net] at Sourceforge. It sounds very much like what you are looking for both for batch processing of documents (using command-line mode) and for online dynamic presentation. There is even a serializer to PCL5 in case you ever wanted to send directly to HP-compatible printers.
Re:Your site.. (Score:2)
This is a really clever program that allows you to take a regular web page and produce very nice PDFs (or PostScript) from it. It supports a few new tags that let you do things like page breaks, headers/footers and such that always should have been in HTML (even if only as a hint for printing) but weren't. It automatically builds tables of contents (fully clickable in the PDF), cover pages, and the like, too.
I've started using this tool more and more often over the last few months. It's just too handy for words. You can find it at Easy Software [easysw.com]. (And yes, it's open source.)
Re:Your site.. (Score:2)
You have a solution and you seem to like it. My problem with it is the mixture of content and layout. Bold, italics, strikeout, and underline have no intrinsic meaning: they are visual cues for underlying themes. When they are the only model, you by definition lose the semantic background to the document. "What's the problem," you ask? For static HTML and PDF presentation, there is no problem for human readers. But it removes the possibility of automated, intelligent indexing and categorization.
I retort with Yoda. "Is the dark side stronger?" "No! Quicker. Easier. More seductive."
Re:Your site.. (Score:2)
Well, you can specify left and right quotes in HTML: latex2html should either use them, <q> tags, or normal double quotes. Not abusing backticks (note they don't look anything like the mirror of ' in an awful lot of fonts, including ones very popular online, such as Verdana).
Doing it TWICE to emulate double quotes means the author of latex2html is going to hell for sure (along with 1001 online newspaper editors)
Ah, yes, that's better. Better to be dependent on title and have some meaning than be dependent on order in the table of contents and have little meaning
Yes, that would be nice. WebDAV with a versioning backend like Subversion has some potential for document management - better imo than the million-and-one forms approach.
Document formats are a little more hairy. Sometimes I feel like using something like AFT, which is pretty close to plain text. Other times I want to use XHTML, or DocBook, or my own schema. Some front-end which handles all of them would be nice
I'm not really bothered by print, but I do want my documents to be stand-alone from the website. I want navigation elements to grow dynamically from the metadata in my documents, or from some external metadata file. I also want to be able to generate documents from databases etc and have them plug in nicely with the filesystem and keep nice abstract and stable URIs.
Unfortunately I'm pretty sure I'm gonna have to write this myself. Being a professional slacker, this will likely take a while
DocBook to PS/PDF isn't too hard. If you can find a generalised XSL:FO engine you would be able to use an arbitrary XML document provided you have a stylesheet for it. Failing that, a browser, CSS print media rules and an option to print to a PS file would probably be ok; converting PS to PDF shouldn't be a problem, and CSS can style any XML document you like.
Sounds like you already made up your mind (Score:2, Insightful)
As far as markup goes, one of the reasons for using the open/close tag pair in XML was because so many people have written HTML and are used to that model.
As for complicated markup, there is a Simplified DocBook [oasis-open.org] that reduces the number of elements you have to know and keep track of while still remaining 100% DocBook compatible. Write a little now, and as your experience and comfort grows, so can your markup choice. Simplified DocBook now, full DocBook when the volume of documentation requires it later (by that time, hopefully, more editors will have come out).
DocBook to PDF is handled by converting to XSL:FO (not to be confused with XSLT) syntax and serializing with something like FOP. LaTeX is actually closer to XSL:FO than to DocBook. If you're trying to convert to PDF by hand, you're expending more effort than you needed to. You can find premade stylesheets for HTML and FO [sourceforge.net] and documentation about how to use them without reinventing the wheel. The advantage of going to XSL:FO instead of a direct DocBook-to-PDF is that there are serializers out there to output FO syntax to PDF, PostScript, PCL5, and RTF. It would be a shame to just make a one trick pony.
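To make the two steps concrete, here is a rough sketch in Python that only builds the command lines for the transform step and the serialize step. It assumes xsltproc and FOP are the tools in use; the file names and the stylesheet path (docbook-xsl/fo/docbook.xsl) are made up for illustration and will differ depending on where the stylesheets are unpacked. Nothing is executed here.

```python
import shlex


def docbook_to_pdf_commands(source_xml, fo_stylesheet, out_pdf):
    """Return the two command lines: DocBook -> FO, then FO -> PDF.

    The commands are only built, not run. Paths are hypothetical.
    """
    fo_file = out_pdf.rsplit(".", 1)[0] + ".fo"
    return [
        # Step 1: XSLT turns the DocBook source into XSL formatting objects.
        ["xsltproc", "-o", fo_file, fo_stylesheet, source_xml],
        # Step 2: FOP serializes the formatting objects to PDF.
        ["fop", "-fo", fo_file, "-pdf", out_pdf],
    ]


for cmd in docbook_to_pdf_commands(
    "manual.xml", "docbook-xsl/fo/docbook.xsl", "manual.pdf"
):
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping the intermediate .fo file around is what lets a different serializer be swapped in for the second step.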
As for emacs, there are emacs extensions written for DocBook [oasis-open.org] that help you with tag choices and automatically close the tags for you. Isn't that one of the main complaints you had about the syntax? And you're comfortable with emacs, right?
Note that you are using LaTeX to drive the layout. This is not how to use DocBook. In fact, DocBook goes out of its way to avoid any layout information in the file. Say you want to search for all documents with a section title that contains "apple". Anyone with a document parser can implement this no matter who wrote the DocBook file at any organization. With LaTeX you could do this only as long as everyone agreed upon the element identifiers -- which doesn't happen at every company. DocBook is content, HTML and PDF are layout, and never the twain shall meet...except during the transformation step.
If you prefer LaTeX, peace be with you. But they cannot really be compared as LaTeX -- while possible in implementation -- does not enforce a distinction between semantic content and layout presentation. DocBook does. This adds some complexity for the initial startup sometimes, but it pays off when you actually have to organize and index those documents in an archive. You should talk to the folks at the Linux Documentation Project for more insight on this.
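The "apple" search can be sketched in a few lines with Python's standard xml.etree.ElementTree. The sample document and the function name are invented for illustration, and the sketch assumes sections are marked up as plain section elements with no XML namespace; documents that use sect1, sect2 and so on would need those tags added.

```python
import xml.etree.ElementTree as ET

# Hypothetical DocBook-style document, inlined so the sketch is self-contained.
SAMPLE = """<article>
  <title>Fruit notes</title>
  <section id="apples"><title>Growing apple trees</title><para>...</para></section>
  <section id="pears"><title>Pears</title><para>...</para>
    <section id="cider"><title>Apple cider</title><para>...</para></section>
  </section>
</article>"""


def section_titles_containing(xml_text, needle):
    """Return titles of all sections (at any depth) that contain needle."""
    root = ET.fromstring(xml_text)
    hits = []
    for section in root.iter("section"):
        title = section.find("title")
        if title is None:
            continue
        # itertext() also picks up text inside inline markup in the title.
        text = "".join(title.itertext()).strip()
        if needle.lower() in text.lower():
            hits.append(text)
    return hits


print(section_titles_containing(SAMPLE, "apple"))
```

The query never looks at fonts or layout, only at element names, which is why it works regardless of who wrote the file.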
Re:Sounds like you already made up your mind (Score:1)
hunh? (Score:2)
XML processed with XSLT and serialized through FOP. Where is LaTeX used? XSLT doesn't have anything to do with LaTeX and FOP has nothing to do with LaTeX. Where do they rely on LaTeX?
Oh! You were talking about the LaTeX converters that Norman Walsh made available. Sorry. There's the confusion. If you use the FO stylesheets and FOP or iText for the PDF serialization, things are much much simpler. LaTeX shouldn't come into play unless you really want to use LaTeX.
And you are right that it is quite possible to make layout-free LaTeX. My statement was only that it does not enforce the separation of content and layout. This is the same as saying that there is nothing stopping a programming team from making clean, readable C with uniform indentation of code blocks, but Python doesn't allow the choice: clean, uniform indentation is an intrinsic piece.
It was not my intention to say that LaTeX made it impossible or even unduly difficult. Sorry for the confusion.
Re:hunh? (Score:1)
Re:I end up having a lot of the same questions (Score:4, Informative)
All Simplified DocBook files are also completely valid DocBook documents. But there are far fewer elements and constructs to keep in your head. It's also geared toward smaller items such as articles instead of complete books. At my company, we made a couple of template documents and then just had people fill in the blanks. People ended up working faster once we got them to stop worrying about formatting and styling (non-trivial).
Start writing in SD and as the collection of documents grows, you can look into combining them into a cohesive DocBook collection as time permits and your experience level grows.
one open source approach ... (Score:4, Informative)
Re:one open source approach ... (Score:1)
It may work... (Score:2)
Check out NTSGML pages (though they have not been updated for some time) if you end up doing this all under Windows. Also, I'd recommend sticking with generic SGML, not XML -- RTF converters for XSLT are not that good (I was not able to produce a single readable doc).
Oh for heaven's sake (Score:2)
In my opinion, XSLT should not be used to generate something like RTF directly. XSLT was made to transform one XML schema to another. Period. Anything else is like trying to put the square peg in the round hole.
Re:Oh for heaven's sake (Score:2)
That is what I used. Problem is, I guess, that I was trying to do it under WinNT and there may have been a few quirks that just would not let it work fully. For one, jfor would never produce anything anywhere resembling what was expected.
Another annoying thing was that I actually had to run a web server on my laptop to be able to generate anything: all the tools (except, I think, xsltproc) were very insistent on going to the OASIS website to read the latest & greatest DTD! Maybe, again, I have missed something, but I could not persuade either saxon or xerces/xalan to use a local copy of the DTD...
Re:Oh for heaven's sake (Score:3, Informative)
From http://xslt-process.sourceforge.net/docbook.php
I also know that there's a way to specify it as a general resource and to have a catalog that keeps from having to hardcode each file to a path, but I don't remember the syntax or the steps offhand.
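This is not the catalog syntax itself, but the idea behind a catalog (map a public identifier to a local copy instead of fetching it) can be sketched with the EntityResolver hook in Python's standard xml.sax. The DocBook version (4.2) and the local path are assumptions for illustration only.

```python
from xml.sax.handler import EntityResolver

# Assumed mapping: public identifier -> local copy of the DTD.
# The local path is hypothetical; adjust it to wherever the DTD is installed.
LOCAL_DTDS = {
    "-//OASIS//DTD DocBook XML V4.2//EN":
        "/usr/share/xml/docbook/4.2/docbookx.dtd",
}


class LocalDTDResolver(EntityResolver):
    """Answer known public identifiers from disk, pass others through."""

    def resolveEntity(self, publicId, systemId):
        # Returning a string tells the parser which system identifier to read.
        return LOCAL_DTDS.get(publicId, systemId)


resolver = LocalDTDResolver()
print(resolver.resolveEntity(
    "-//OASIS//DTD DocBook XML V4.2//EN",
    "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd",
))
```

A resolver like this would be handed to a SAX parser with setEntityResolver(); whether a given parser actually consults it for the DOCTYPE depends on that parser's settings, so treat this as a sketch of the mechanism rather than a drop-in fix.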
Hope this helps with your laptop problem.
Re:Oh for heaven's sake (Score:2)
I try using XML to structure my docs... (Score:3, Insightful)
book
|
+--chapter
+--chapter
|  |
|  +--section
|  +--section
|
+--chapter
ad nauseam. Not chapter titles, not section titles, the literal words chapter and section. Multiply this by hundreds of sections.
How. Completely. Useless.
Until I can find an XML editor with some bloody sense to its structure navigator, I would rather use Word. And no, I don't really want to use a WYSIWYG editor, because I want to know what XML it generates for my custom xslt snippets (which I might add I also have similar problems navigating with these brain dead editors)
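For what it's worth, a navigator that shows titles instead of bare element names is not much code. A minimal sketch with Python's standard xml.etree.ElementTree, assuming only book, chapter and section count as structural elements and that each carries a plain-text title child (the sample document is invented):

```python
import xml.etree.ElementTree as ET

# Elements treated as outline nodes; an assumption for this sketch.
STRUCTURAL = {"book", "chapter", "section"}

SAMPLE = """<book><title>Ops Handbook</title>
<chapter><title>Backups</title>
<section><title>Nightly jobs</title><para>...</para></section>
<section><title>Restores</title><para>...</para></section>
</chapter>
<chapter><title>Monitoring</title></chapter>
</book>"""


def outline(element, depth=0):
    """Return indented 'tag: title' lines for every structural element."""
    lines = []
    if element.tag in STRUCTURAL:
        # findtext() reads only the title's leading text, not inline markup.
        title = element.findtext("title", default="(untitled)").strip()
        lines.append("  " * depth + element.tag + ": " + title)
        depth += 1
    for child in element:
        lines.extend(outline(child, depth))
    return lines


print("\n".join(outline(ET.fromstring(SAMPLE))))
```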
Re:I try using XML to structure my docs... (Score:2)
Re:I try using XML to structure my docs... (Score:3, Interesting)
Re:I try using XML to structure my docs... (Score:3, Interesting)
As it should be (Score:2)
As for wanting to know what the underlying XML is, "why!?!" For something like Word, where only formatting information is saved, I could see your concern. This is like the HTML output of Frontpage and Dreamweaver. But DocBook is a semantic construct with no formatting information. What you see in a GUI should be far less variable in the output data below.
With DocBook, you already know what code snippets it is generating without even looking at your editor; it's rigidly defined in the DTD. Your XSLT should be written to the DTD, not to a document.
We did it. (Score:5, Insightful)
Anyone who was not a programmer balked at the idea of having to write documentation in a (Gasp!) markup language. "Just give me Word!" they would whine.
There is a lot of overhead associated with DocBook that most non-technical people don't want to deal with. They want a WYSIWYG editor, and will cry, kick, scream, and intentionally be completely unproductive until they get it.
Re:We did it. (Score:2)
Re:We did it. (Score:3, Funny)
Except a union apparently.
Re:We did it. (Score:1, Offtopic)
<RANT>
Where I used to live (Victoria, BC), janitors in the hospital got paid more than the medical staff. Why? Because the medical staff were considered an essential service by the government, and therefore not allowed to strike. Because the janitors were not considered essential, they were allowed to strike, and therefore drove up their pay rates.
Unions were useful in their day. They eliminated harsh working conditions. Now the government performs that task with laws, and unions have become superfluous.
</RANT>
Re:We did it. (Score:1, Offtopic)
The government created the Canadian health service, which in turn made it impossible for medical workers to negotiate via collective bargaining.
The "lazy union worker" image is just that -- an image pushed by business and the media. And while some things, particularly seniority systems and the grievance process, seem very strange and wasteful, they are there because employers like railroads, meat packers, health services and school boards screwed their employees in those areas.
I expect anti-union attitude amongst IT staff and programmers will change as their jobs are rendered obsolete by automation and cheap competition.
Re:We did it. (Score:1, Offtopic)
Read Fast Food Nation [amazon.com] , and then either 1) reiterate your claim, explaining how a slaughterhouse as described by Schlosser doesn't constitute a harsh working condition, or 2) refute the factual evidence presented about slaughterhouses. Hint: no one in the meat industry has been able to find factual errors in Schlosser's account.
There are plenty of other examples, of course, that's simply the first that comes to mind. Harsh working conditions exist, and industry has figured out how to work with government to prevent safety regulations from being implemented.
Re:We did it. (Score:1, Insightful)
An accountant should not have to write in DocBook or any other markup language.
Use a WYSIWYG editor and translate it to DocBook.
Re:We did it. (Score:2, Insightful)
You know, I normally find your posts pretty thoughtful, and I often agree with them. But this time I think you're way off the mark. "Discipline them?" If you treat people like children, you shouldn't really be surprised if they act like children in return, should you?
General-purpose computers are great things because they allow people to use the tools they find most effective to get the job done. In this example, what's the job? Producing documentation. (The submitter was talking about internal documentation, but the OP was talking about docs in general, evidently.) To produce documentation, you should use the tool that's best suited for producing documentation, not the one that looks coolest on paper or that has the neatest feature set or whatever.
Writing structured documents in something like LaTeX (with which I have some experience) or XML (with which I have less) works well up to a point... but only up to a point. If your document is going to be basically prose-- unformatted paragraphs organized into sections, chapters, and books-- then writing with a markup language will probably work well. The ratio of markup to content will be small, so you can just concentrate on your words.
But if you want to create even something as simple as a bulleted list, suddenly you have to deal with markup. Creating a bulleted list in Word is trivial; you click the "bulleted list" button and go to town. Creating a bulleted list in LaTeX or XML is more work, and it scatters markup throughout your document in an unappealing and unpleasant way.
So markup works in some situations, but in others it's not a good solution. This is what we should be talking about here. Not talking about disciplining coworkers who "act like spoiled two-year-olds."
I just think you're forgetting what the purpose of computers and IT is: to give people the tools they need to do their jobs. Any system that requires its users to work in a way that they're not happy with is flawed, and could be improved somehow.
(Sorry about the rant.)
Re:We did it. (Score:2)
Re:We did it. (Score:1)
My opinion on the whole matter is that people should use whatever tools they like to do their jobs-- to the extent that it's practical for them to do so. XML might have some technical merits over Microsoft Word, but if the writer wants to use Word, that's his call.
But that's just my opinion.
Re:We did it. (Score:1)
Erm. . . How would this be unreasonable? How is it any more unreasonable than expecting a programmer to use language 'foo' for an application the company is developing? If the technical writer wishes to get paid, then they need to do their job, and that means doing what their employer tells them to do. If that includes using XML or LaTeX, then they either do their job, or find a new job.
When you put unreasonable demands on people-- people who are just trying to do their jobs, by the way-- it's pretty likely that people are going to respond unreasonably.
I'll note again that I don't think mandating a specific way of writing things is at all unreasonable. If these people are trying to do their jobs, then they'll do it as they're told to. . . that *is* their job. Being told how to do your job is not an uncommon, nor unreasonable, thing (within reason of course, micro-managing is a Bad Thing (tm)).
My opinion on the whole matter is that people should use whatever tools they like to do their jobs-- to the extent that it's practical for them to do so. XML might have some technical merits over Microsoft Word, but if the writer wants to use Word, that's his call.
That's a great idea, but what do you do then, when you've got 10 different content authors using 11 (One of them got annoyed halfway through a project, and decided to try something new) different frameworks to develop their writing?
And worse yet, what happens when you decide to combine two different authors' works into a single work, when they've both used different tools?
While I'm all for people being allowed some individual choice in how they do their job, there is a limit that has to be considered. If they're working for a company, that company gets to decide both what they need to do for their job, and how they need to do it. If the company standardizes on a single format, such as XML/DocBook, or LaTeX, or HTML, or whatever it is, then everyone at that company should be using it. Regardless of whether they'd rather be using something else, they're being paid to do what their employer tells them to.
Re:We did it. (Score:1)
It's unreasonable like carving a roast beast-- er, sorry, too much Dr. Seuss-- carving a roast beef with a screwdriver is unreasonable. If the person doing the job finds the tool inappropriate, maybe the mandate should be reconsidered.
I'll note again that I don't think mandating a specific way of writing things is at all unreasonable.
Ah, but that's the thing. Mandating the use of XML for technical writing gets in the way of the job. If you're spending time tweaking document structure in an obscure language, you're not writing.
All I'm saying is this: you will almost certainly gain more efficiency and productivity by letting your people do their jobs with the tools they prefer than by requiring the use of any one tool, no matter what its technical or political merits might be.
Re:We did it. (Score:2)
Or perhaps the person doing the job should realize that no job is perfect, and at some point they're going to have to accept some restrictions from their employer on how they do their job. At least, if they want to get paid.
Your analogy of carving a roast with a screwdriver doesn't really hold up, because most of the things we're discussing here, LaTeX, XML, etc, were specifically designed for authors. A better analogy would be that you are carving a roast, and need to pick a knife. LaTeX would be one type of knife, while XML/DocBook would be another type.
Just because someone doesn't like the knife they were given doesn't mean that it's the wrong knife. They may just be ignorant of it. Or it may be that the company is standardizing on a single type of knife so that it can more easily share the knives among employees.
Ah, but that's the thing. Mandating the use of XML for technical writing gets in the way of the job. If you're spending time tweaking document structure in an obscure language, you're not writing.
Have you ever used XML (Assuming that we're specifically talking about DocBook, as that was designed specifically for use by authors, particularly technical writers)?
DocBook/XML was specifically designed for creating documents and books. Additionally, XML is not an obscure language, nor very difficult to work with. Especially in this age of the Web, everyone is familiar with HTML, making DocBook fairly easy to pick up. As if that wasn't easy enough, there are numerous XML editors available that can make it even easier to work with.
Unless all writing is done in plain text, you will have to deal with some work to make it presentable. Whether that be in a word processor, in LaTeX, in DocBook, whatever, it will have to be done. The question that has to be asked is which format will provide the greatest benefits with the fewest detriments. Depending on the goals of the company, the individual authors may very well not be the best people to make these decisions.
All I'm saying is this: you will almost certainly gain more efficiency and productivity by letting your people do their jobs with the tools they prefer than by requiring the use of any one tool, no matter what its technical or political merits might be.
Ah, but you're looking at this in a very limited way. Yes, you may gain more efficiency in the short run, by individual authors, by letting each person use whatever they want. But in the long run, you could end up spending literally 10 times as long making the end product meet the company's needs.
It's easy for an individual person to look at the situation and say, "I could write this document in only three hours if I could do it in 'foo', but doing it in DocBook/XML will take me four hours", and think that it would be much more efficient to write it in 'foo'. But if this individual is writing a single article that will be combined with four other articles into a single work, and it will take six hours for someone to combine the five differently formatted articles into that single work, then collectively, you've just lost an hour's worth of work.
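Spelled out as arithmetic, in a small Python sketch. It assumes all five authors see the same three-versus-four-hour difference and that merging articles already in the common format takes no extra time; neither assumption is stated outright above, and the function name is made up.

```python
def total_hours(authors, hours_per_article, merge_hours):
    """Total effort: everyone's writing time plus the time to combine."""
    return authors * hours_per_article + merge_hours


# Everyone picks their own tool: faster writing, but six hours to merge.
mixed_tools = total_hours(authors=5, hours_per_article=3, merge_hours=6)

# Everyone writes DocBook: slower writing, merge assumed to cost nothing.
common_format = total_hours(authors=5, hours_per_article=4, merge_hours=0)

print(mixed_tools, common_format, mixed_tools - common_format)
```

That comes to 21 hours against 20, which is the one lost hour.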
And no, this isn't a purely theoretical example. At a previous employer, we had a situation like this occur. Eventually, we standardized on a single framework for all technical writing and documentation. At first, it did slow people down a little bit, as they were forced to learn the new system. Once everyone became used to it, though, it worked *much* better than before. Being able to easily share and merge documents allowed us to create a single, central, information repository, easily accessible and usable by everyone.
Lastly, while you throw out technical merits with a single statement, it's not something to be overlooked. Depending on what your end goals are, you may *need* to consider technical merits in order to get the job done. For example, if your end result needs to be available as a PDF file, then you better be using tools that support PDF generation. If you're not, then no matter how productive you might think you are, you're never going to get your job done. Sometimes it's more important to fit your tools to your job, than to fit them to a specific person.
Re:We did it. (Score:1)
That doesn't sound right to me. LaTeX is a typesetting system, not an authoring system. The distinction is subtle, but important. I've had many jobs in my life-- mostly 'cause I have a short attention span and I keep getting fired-- and along the way I've been a typesetter, a programmer, and most recently an author. Putting on my typesetter hat, LaTeX rocks. It's a fantastic typesetting system, all praise be to Knuth and Lamport. But as an author, it's definitely not optimal. If I want to italicize a word-- something authors do a lot-- I have to type {\it whatever}. That's not author-friendly. XML is far worse. XML, in my author opinion, isn't really meant to be human-readable. It gets in the way of the words, and as an author, words are all that count, you know?
So LaTeX and XML are really awful systems for authors. With all the tools at my disposal, I still find myself using Microsoft Word with a very narrow set of predefined styles for creating structured documents.
See, the thing is, this simply isn't true. Technical writers-- that's what we're talking about here-- come from two basic camps. They're either technology people who become writers, or they're writers who write technology stuff to pay their bills while they work on the great American novel on weekends. Programmers and geeks-- I use the term reflexively and affectionately-- are familiar with HTML. Writers aren't, and don't particularly want to be. Asking people who just want to write to scatter XML markup through their documents is like trying to teach a pig to sing: it wastes time and it annoys the pig.
If you're going to have your writer or writers using a tool anyway, why not just let them use the one they're already familiar with? Why try to force a new one down their throats just because it produces XML?
And while we're on the subject, don't bother taking your XML documents to a printer to get typeset and published. The print world-- actual ink on paper stuff-- cares about traditional stylesheets, from Word or Quark or FrameMaker, not XML.
Agreed. But guys from the IT department sure as hell aren't qualified to make the call, either. Compromises will have to be made, and that involves getting input and feedback from your people rather than simply dictating to them.
Look, let's talk about the real world here. In the vast majority of cases, technical writing goes from the writer to page layout to the printer. In many cases, the process of printing the documents might be supplemented-- or even completely replaced-- by the creation of PDFs, but the process is the same. Writer to page layout to printer.
Most page layout gets done in one of three pieces of software: QuarkXPress, Adobe FrameMaker, or Adobe InDesign. If the page layout is done with FrameMaker, then the best thing is for the documents to have been written with FrameMaker. This is fine, because FrameMaker is a good tool for writers. Most writers at least know it exists-- unlike LaTeX or XML-- and if they're not familiar with it, they can learn their way around it in minutes thanks to the familiar UI-- unlike LaTeX or XML. But FrameMaker is falling out of favor in many circles, replaced by either QuarkXPress or InDesign. In either of those pieces of software, the layout artist has to import text documents provided by writers and flow them through design templates. Layout artists like their jobs to be easy; the best possible scenario is if the documents provided by the writers can drop right in, and arrange and format themselves based on previously designed stylesheets. Can you do that with XML? No. Can you do it with LaTeX? No. Can you do it with Microsoft Word? Hell, yeah, easy as pie. So which tool is the best for the job in that situation?
Now, there are exceptions to this rule. Until recently, one major UNIX systems vendor I worked with still produced all their documentation using troff on UNIX workstations. I don't know what tools O'Reilly uses for layout, but I understand that theirs is an all-UNIX workflow as well. Of course, O'Reilly prefers that its authors submit Word or FrameMaker files, so that just goes to prove my point.
It's late, and I'm tired. Let me just wrap this up by saying this: go find a technical writer. Explain to her (most of the writers I know are women; this may or may not be typical) that you want her to do all of her writing using LaTeX or XML from now on. Explain to her what this means. After you get out of the hospital, come tell me how it went. I'll be interested to hear.
Re:We did it. (Score:3, Interesting)
Try LyX [lyx.org].
Just click "title" and type the title. Click a button to turn italics on/off, etc.
See http://bgu.chez.tiscali.fr/doc/db4lyx/ and http://www.lyx.org/help/xml/xml.php
-Peter
No Suitable Editors (Score:4, Interesting)
DocBook is a great spec, but the editors suck for the most part. LyX can't import DocBook reliably, and your DocBook is stored as a LyX file (LaTeX, I think). LyX's DocBook stuff can be a bear to set up, even on a system like RedHat where most of the software comes installed. I only recommend LyX to people who have experience with it; for someone who just wants to write docs, it tends to be more trouble than it's worth.
FrameMaker will probably do everything you want and be a godsend with lots of nice features, but you'll pay for it, $800 for Win/Mac and ~$1300 for Unix.
XMLmind is pretty cool, it does DocBook well but is a little slow, it has a little bit of a learning curve, but is prolly the best DocBook editor I've found for free. It's not Open Source though. It is written in Java, so you might have some speed issues, depending on the platform you run it on. I've been recommending XMLmind to everyone I know that asks about DocBook, it has a tree view of the DOM as well as a WYSIWYM view with stylesheets applied on the fly. It has property editors and a pretty smart insert tool that follows the DTD, only allowing you to insert allowed tags into other tags. It feels like more of a programmer's tool than FrameMaker, but it should be fairly easy for most WYSIWYG users to adjust.
<rant>
I don't understand why on God's green earth OpenOffice, AbiWord, KOffice, and everyone else in the open source world have neglected this area. It's been three years since the LDP went to DocBook, and GNOME uses DocBook as its doc format. Why in the hell don't we have decent document writing tools when everyone is always screaming about the lack of documentation in the open source world?
If we want more docs written, it needs to be easier to write them, and it shouldn't involve learning all about SGML or XML engines as well as a markup language. DocBook is too big to keep in my head, and I shouldn't have to think hard about how to write docs when my focus is the content I want to write. Organizing technical info on a difficult subject is hard enough; stopping every five minutes to look up a DocBook tag or to better understand the structure is a huge barrier to getting the work done.
</rant>
But that's just my $.02
Re:No Suitable Editors (Score:1, Informative)
Re:No Suitable Editors (Score:2)
What is really unfortunate is that, even if you somehow convince people to use this tool, once they discover that <citation> produces essentially the same formatting as <image_caption> (or whatever two tags), then they'll either use the two interchangeably for whatever, or they'll use one or the other exclusively for things that are unrelated to citations or captions. Nobody except programmers cares at all about document structure, and you can't force them to. All people want, and all they'll think about, is pretty layout.
(rant mode off)
Re:No Suitable Editors (Score:2)
I think that most documentation people can understand such distinctions. To drive the point home better, use different styles for each -- at least while they are editing. You can do this with the WYSIWYG editors such as Morphon -- just use a different color for each. Or you could create preview stylesheets out of the standard Norm Walsh templates.
Re:No Suitable Editors (Score:2)
No doubt the best documentation people understand this, but in my experience, most either don't understand it or don't care. And if you enforce the difference between types like this, then what they see is ugly, and it'll be nothing like what they eventually get. This, reasonably, makes them resistant to using the software.
Actually, in my experience, most people working on documentation were dragged there from something else they'd rather be working on, and often even have to be shown such advanced concepts as copy and paste. Therefore creating documentation should be really easy, but worrying about structure just isn't easy. Making this bold and that italic is easy, though. This problem won't be solved until we can create heuristics that just figure out what you mean when you make a block of text such-and-such a style, or at the very least can separate the "styled text" part of a document from the "containing layout" part, and can reliably tell the important styles apart from the ones that change between presentations. Either that, or every company hires expensive professional documentors.
Part of the problem, I think, is that many people who work on documentation were trained on typewriters or desktop publishing software. And though those have justifiably gone out of fashion, nobody except programmers is interested in learning what they see as the paradigm of the week.
Re:No Suitable Editors (Score:1)
Re:No Suitable Editors (Score:1)
As for producing DocBook natively, the NetBeans [htpt] Java IDE has an XML module that is pretty slick. Good ol' X?Emacs in PSGML mode is what I use to create and edit DocBook on the fly; it works really well (although the indentation engine is pretty flaky). Those are both open source.
AbiWord supposedly can save in DocBook V4.1.2 XML format, but its output filter left a lot to be desired the last time I checked. OpenOffice's native format is XML, so a set of XSL stylesheets is all that's needed to DocBookify it. We may be working on developing just such stuff over here.
Re:No Suitable Editors (Score:2)
Because good editors are hard to write and a vast majority of the sufficiently talented coders who could do it still don't grasp the concept of content being separate from layout. You can't code what you don't understand.
That coupled with -- what others have touched on -- users who can't accept that what they edit is not necessarily what it will ultimately look like.
"I want to put this in italics."
"Why?"
"Because the image captions should all be in italics here."
"So put the text in a <caption> tag."
"But it's not in italics in my editor."
"It will be in italics when it's published."
"But it's not in italics in my editor."
*sigh*
You're right, we need better editors.
Here's a site that does just that (Score:1, Interesting)
What about training for the users? (Score:1)
What about OpenOffice.org (Score:3, Interesting)
Re:What about OpenOffice.org - progress! (Score:1)
--
Simon
In progress of converting (Score:2)
So far we've completed converting 3 of our "books" from Script to DocBook. The largest book has over 175 chapters and about 600 pages. The most time-consuming problem was the project requirement that the DocBook version look very similar to the Script version. We used the XSL stylesheets from docbook.sf.net [sf.net] and FOP [apache.org].
Script is a formatting language (think RTF) and DocBook is a markup language. There was a lot of inconsistent formatting in the Script versions, which decreased readability. The consistent formatting of correctly marked-up DocBook is a very good thing.
I spent a lot of time customizing the XSLT stylesheets. XSLT has a nice mechanism that allows you to import a stylesheet and then override parts of it. This is really nice because we can upgrade the upstream stylesheets without modifying our customizations. This isn't completely true if there are big structural changes to the upstream stylesheets, but since our changes are in separate files it's rather easy to refit our customizations.
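For readers who haven't seen the import-and-override mechanism, here is a minimal sketch of what such a customization layer might look like, generated with Python's standard library. It is not the poster's actual stylesheet. The parameter values and the file name house-style.css are assumptions made up for illustration; the import URI is the canonical one for the stock DocBook XSL HTML stylesheet, and a local path or an XML catalog entry would normally stand in for it.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"
# Canonical URI of the upstream DocBook XSL HTML stylesheet.
UPSTREAM = "http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl"


def build_customization_layer(upstream_href, params):
    """Return an xsl:stylesheet element that imports the upstream
    stylesheet and then overrides the given parameters."""
    ET.register_namespace("xsl", XSL_NS)
    root = ET.Element(f"{{{XSL_NS}}}stylesheet", {"version": "1.0"})
    # xsl:import must precede every other top-level element; anything
    # declared after it in this file takes precedence over the import.
    ET.SubElement(root, f"{{{XSL_NS}}}import", {"href": upstream_href})
    for name, value in params.items():
        param = ET.SubElement(root, f"{{{XSL_NS}}}param", {"name": name})
        param.text = value
    return root


# Illustrative overrides only: number the sections and attach a
# hypothetical house CSS file to the generated HTML.
layer = build_customization_layer(
    UPSTREAM,
    {"section.autolabel": "1", "html.stylesheet": "house-style.css"},
)
stylesheet_text = ET.tostring(layer, encoding="unicode")
print(stylesheet_text)
```

Because the overrides live in their own file, replacing the upstream stylesheets only means re-pointing the import, which is the upgrade path described above.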
We had two people working on this project: me, customizing the stylesheets, and another person who took the Script source and added DocBook tags. We were committed to the project and were able to stick with it until completion. This worked very well.
I encouraged another department to give DocBook a try, and this didn't work so well. They currently only publish their internal docs to HTML, and their documentation source was written in HTML. Given the overhead of DocBook and their lack of desire for paper output, it wasn't worth it for them.
Previously we could only print to paper. Now we have a single source to generate HTML, PDF, paper (from PDF), and Windows Compiled HTML Help files (basically HTML with extra meta info).
Some people seem to just not understand the advantages of marking up the structure of the document instead of the formatting. If you want to use DocBook because of the hype, then odds are you'll piss people off in the short term, maybe long term too, by forcing it on them. If you and management understand the long-term advantages of structured documentation, then I really recommend DocBook.
docbook sucks!!! (Score:1)
for all the reasons stated above and...
i was unable to produce a simple Howto document (bulleted list) because the docbook.xsl file had error(s).
when i reported these to the author (?) i was ignored.
now over a year later i'm kicking myself for not finishing my version of what docbook should be: doc-this! [doc-this.com]
i have been asked recently to finish this so i guess maybe it's worth the effort.
Re:docbook sucks!!! (Score:1)
DocBook Rocks!
Here's what a publisher does (Score:1)
Don't use any formatting when writing your text: no bold, no italics, nothing. When there's a figure, place [FIGURE ##] where ## is the number of the figure. I repeat, do not do any formatting; we won't accept your document if it's formatted.
I'm pretty sure that they were taking this unformatted text and transforming it into DocBook.
So you may want to do this: ask your non-technical people to write unformatted text, and hire a technical person (programmer) to do the markup.
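A rough sketch of what that markup step might look like if scripted rather than done by hand. This is a guess at the workflow, not the publisher's actual tooling. It assumes paragraphs are separated by blank lines, that a figure marker sits alone in its own paragraph, and that figure files are named figureN.png (a made-up convention). The output has not been checked against the DTD.

```python
import re
from xml.sax.saxutils import escape

# Matches a paragraph consisting solely of a marker such as "[FIGURE 12]".
FIGURE_RE = re.compile(r"^\[FIGURE (\d+)\]$")


def text_to_docbook(text, title="Untitled"):
    """Wrap blank-line-separated paragraphs in a DocBook <article>."""
    parts = ["<article>", f"  <title>{escape(title)}</title>"]
    for block in re.split(r"\n\s*\n", text.strip()):
        block = " ".join(block.split())  # collapse hard line wraps
        if not block:
            continue
        match = FIGURE_RE.match(block)
        if match:
            number = match.group(1)
            # Hypothetical file naming scheme; adjust to the real one.
            parts.append(
                "  <mediaobject><imageobject>"
                f'<imagedata fileref="figure{number}.png"/>'
                "</imageobject></mediaobject>"
            )
        else:
            # Escape &, < and > so the writer's text cannot break the XML.
            parts.append(f"  <para>{escape(block)}</para>")
    parts.append("</article>")
    return "\n".join(parts)


sample = "First paragraph,\nhard-wrapped.\n\n[FIGURE 3]\n\nText after the figure."
print(text_to_docbook(sample, title="Demo"))
```

The writers never see a tag; everything structural is added by whoever owns the script.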
don't use markup, generate it (Score:2, Insightful)
Markup should always happen
At work, when doing professional documentation, our layout people extract the raw text and apply it to their own FrameMaker setups, so all the formatting our developers do is really in vain. The doc dept. has no trouble with my plain text stuff.
DocBook itself is fine, but make life simple for the writers: don't make them think about markup (as much as possible, anyway). My vote is on the plain-text editors + filters.
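To make "plain-text editors + filters" concrete, here is a small sketch of one such filter. The conventions it recognizes (a leading "* " for bullets, blank lines between blocks) are assumptions chosen for the example, not anything the poster specified, and it emits a fragment rather than a complete document.

```python
from xml.sax.saxutils import escape


def plain_to_docbook(lines):
    """Turn plain-text lines into a list of DocBook markup lines.

    Consecutive non-blank lines form a <para>; lines starting with
    '* ' are collected into an <itemizedlist>.
    """
    out, para, items = [], [], []

    def flush_para():
        if para:
            out.append(f"<para>{escape(' '.join(para))}</para>")
            para.clear()

    def flush_items():
        if items:
            out.append("<itemizedlist>")
            for item in items:
                out.append(f"  <listitem><para>{escape(item)}</para></listitem>")
            out.append("</itemizedlist>")
            items.clear()

    for raw in lines:
        line = raw.strip()
        if line.startswith("* "):
            flush_para()              # a bullet ends any open paragraph
            items.append(line[2:].strip())
        elif line:
            flush_items()             # ordinary text ends any open list
            para.append(line)
        else:
            flush_para()              # a blank line ends whatever is open
            flush_items()
    flush_para()
    flush_items()
    return out


notes = ["Install steps:", "", "* unpack the archive", "* run the installer"]
print("\n".join(plain_to_docbook(notes)))
```

Feeding it an open file or sys.stdin instead of a list would let it sit in a shell pipeline, since any iterable of lines works.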
My CDN$.02.