On The CopyLeft Of DTDs
"Writing a DTD is a challenge in itself (my company had never tried to go to the Web before, and had never heard of XML until my project). To make the system work, we would then have to write software to adapt our suppliers' data models to ours: for n suppliers we would need 2(n-1) correspondences (import and export) from their data models to ours, which gets expensive on a large scale. Having a common model would help, especially for small companies not on the Web yet (those which rely only on paper data sheets, for instance). My opinion, as there is no standard in our industry like RosettaNet, is that we could speed things up and avoid the babelization of XML tags by releasing our model under a Copylefted licence, lowering the cost and hassle for others in our market to build electronic publishing tools. Of course, there is a lot of money invested in our DTD, so what if competitors try to steal it?
Would the Copyleft of our DTD be a good idea?"
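To put rough numbers on the cost argument in the question, here is a minimal Python sketch. It is not the submitter's own count: it assumes an industry-wide view in which every company keeps its own data model, and it compares one converter per ordered pair of companies with one import and one export converter per company against a shared model. The function names are made up for illustration.

```python
def pairwise_converters(companies: int) -> int:
    """Converters needed when every company maps directly to every other
    company's model (one converter per ordered pair)."""
    return companies * (companies - 1)


def shared_model_converters(companies: int) -> int:
    """Converters needed when each company only imports from and exports to
    one common model."""
    return 2 * companies


for n in (2, 5, 20):
    print(n, pairwise_converters(n), shared_model_converters(n))
```

Under these assumptions the pairwise count grows quadratically while the shared-model count grows linearly, which is the sense in which a common DTD lowers the cost for everyone in the market.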
XML (Score:1)
Free is great, but (Score:2)
Propose a standard (Score:2)
copylefted DTD (Score:2)
You might have a look at the freely available DocBook DTD. If you can use that, you don't have to roll your own.
If you copyleft your DTD, with a GPL-like license, nobody can steal it, because it's free. You might even create a standard, if it's a usable DTD. And you could share the load for data conversion, by asking your contributors to format the data according to your open DTD before submitting it.
I'm not really sure if there are really any downsides, unless your DTD is in some way your critical moneymaking resource (although I can't imagine how).
Just my $0.02.
Roland
Mainly depends on your internal politics. (Score:2)
In the grand scheme of things, it won't help your competition much, as they'd just spend the time to develop their own in-house solutions anyways when they felt the need. The practical effect of releasing the spec is that you've made a fixed, one-time donation of manpower to your competitors (they no longer have to develop their own versions of this spec).
On the other hand, there is little direct benefit to you in releasing the spec. Some groups will adopt it, others won't, and you'll still have to spend a lot of time beating on your customers to use it properly. The good news is that a) free/open tools to perform conversion to/from common formats may become available, which reduces your support load to your customers (you'd otherwise have to provide the tools yourselves), and b) the spec may be extended by others when shortcomings are noticed. This is a benefit - you get R&D for free.
In practice neither effect is likely to be large unless you get lucky/unlucky. Your competitors will probably develop their own in-house specs tailored to their own needs anyways, and unless this is spectacularly useful, the Open Source and Free Software communities are unlikely to glom on to it to the extent required for free (beer) tools and an improved spec to appear.
What will determine whether management approves/disapproves this idea is a) whether they're optimistic about the OSS/FS community's ability to spontaneously produce tools, and b) how cagey they are about their "intellectual property". Most likely scenario: They'll see no benefit and some potential loss, and more importantly see a chunk of their IP hanging out there for the world to see. Project not approved.
But, IMO it's still worth a shot, as long as you state your justifications carefully and do your research.
Hey... (Score:2)
Wasn't DTD banned in 1972 for causing Bad Things(TM)?
Re:XML (Score:3)
Are you a moron? The whole question is about definition -- contrary to popular belief, XML is not a unified standard for representing structured data; it's an umbrella standard for different kinds of data representation formats. And to use any of those XML-based formats one needs:
XML was and is criticized for lacking any means to convert the second into code that can be automatically included in the first and used to create programs that operate on the data according to its semantics -- all a DTD is good for is to automatically determine whether a given input complies with it (this is called "validation", even though it never guarantees that the data is valid or consistent from the application or data-model point of view), and for a human to read the description and write code to process the data.
While XML still sucks because no such connection between formats and semantics can be established, the original question was about publishing the first (and hopefully the second), so that others will be able to write applications that use the same format. A DTD can apply to either XML or SGML, but in this case there isn't much difference between them in the results for the programmer, who will end up doing all the work after some simple parser has deserialized the data.
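As an illustration of that last point, here is a minimal Python sketch (not from the thread) of the work left to the programmer once the parser has done its part. The element names `transaction`, `quantity`, and `price` are hypothetical; the assumption is a DTD that declares both children as plain #PCDATA, so the document below would pass structural validation even though it makes no sense to the application.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: structurally it would satisfy a DTD that declares
# <transaction> as containing <quantity> and <price>, each holding #PCDATA.
DOC = "<transaction><quantity>-5</quantity><price>abc</price></transaction>"


def semantic_errors(xml_text: str) -> list[str]:
    """Return application-level problems that structural validation cannot see."""
    root = ET.fromstring(xml_text)  # the parser only deserializes the text
    errors = []
    quantity = root.findtext("quantity", default="")
    price = root.findtext("price", default="")
    try:
        if int(quantity) <= 0:
            errors.append("quantity must be positive")
    except ValueError:
        errors.append("quantity is not an integer")
    try:
        if float(price) < 0:
            errors.append("price must not be negative")
    except ValueError:
        errors.append("price is not a number")
    return errors


print(semantic_errors(DOC))
```

Every rule in `semantic_errors` is hand-written knowledge about the data model; none of it can be derived from the DTD itself.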
Standardization. (Score:5)
A DTD is supposed to standardize data formatting, isn't it? Think less "copyleft" and more "standardized". This is one situation where the Artistic license makes sense, because it requires nonstandard versions to be labeled as such.
The Artistic license is so vague, though, that you might want to have your legal department draft something based on the BSD license, with a clause that hacked versions would have to be relicensed under a different name. That would give developers maximum freedom without compromising the standard. In other words, they could steal your code but they couldn't steal your brand name, similar to Red Hat.
A GPL'd DTD would compel other developers to release refinements, but it would do nothing to protect your brand. Brand theft would be far more damaging than code theft.
Re:No danger (Score:2)
Re:Hey... (Score:1)
GNU FDL (Score:1)
DDT????? (Score:1)
Re:No danger (Score:1)
Don't use DTD - use XML Schema (Score:3)
If you are working on a new project, use XML Schema rather than DTDs. DTDs are a hangover from the days of SGML and do not give you much control over the content of your documents.
If you use XML Schema, you can specify the format and content of your fields exactly and validate the document much more precisely than PCDATA / CDATA permits.
Go and have a look at the W3C site before you commit yourself; it is an easy change at the start of a project but will be much harder later.
A description of XML Schema can be found at http://www.w3.org/XML/Schema.
GPLing DTD's makes no sense (Score:5)
If I were you, I would use something very similar to the Docbook copyright notice:
Copyright 1992-2000 HaL Computer Systems, Inc.,
O'Reilly & Associates, Inc., ArborText, Inc., Fujitsu Software
Corporation, Norman Walsh, and the Organization for the Advancement
of Structured Information Standards (OASIS).
$Id: docbookx.dtd,v 1.12 2000/08/27 15:15:26 nwalsh Exp $
Permission to use, copy, modify and distribute the DocBook XML DTD
and its accompanying documentation for any purpose and without fee
is hereby granted in perpetuity, provided that the above copyright
notice and this paragraph appear in all copies. The copyright
holders make no representation about the suitability of the DTD for
any purpose. It is provided "as is" without expressed or implied
warranty.
If you modify the DocBook DTD in any way, except for declaring and
referencing additional sets of general entities and declaring
additional notations, label your DTD as a variant of DocBook.
HH
Who are you? (Score:2)
How difficult would it be for a competitor to "steal" the DTD anyway? I mean, copy your ideas whilst renaming tags, restructuring the DTD a bit, and so on, till it wasn't provably derived from your DTD? The only point of you having a non-free license to defend your DTD is if this kind of defense might work. If your DTD would be easy to duplicate anyway, then you're not getting any security from a non-free license.
As to whether copylefting the DTD would help your company, I think the answer largely depends upon who you are, and your relationship with your suppliers. If you are having problems persuading your suppliers to use your DTD, then being able to point to the open license might help: "this is poised to become the standard". On the other hand, if all your suppliers are happy to use the DTD already, then you won't make any short-term gain. You might make long-term gain if future suppliers would be more willing to use a copylefted DTD; but that depends on what your industry's like and what kind of stance your suppliers are likely to take.
Copyleft probably won't protect a DTD (Score:4)
On the other hand, don't expect the copyleft to protect your DTD. If anyone wants to use the data format in a proprietary application, well, they might not be able to use your DTD directly, but they can clone it and the result would probably not be considered a derivative work.
There are a few rights that we want to protect for the good of Free Software. We don't want API copyrights to be enforceable. We want to have the right to reverse-engineer for purposes of compatibility. We don't want to have a Microsoft come along and say "You can't make word processors that are Word-compatible, the file format is copyrighted". Asserting the copyleft on a file format isn't compatible with this. However, a DTD isn't a file format, just its description. Thus, go ahead and copyleft your DTD, but be aware of the limitations.
Thanks
Bruce
"Copyright" DTDs make no sense (Score:3)
For seconders, there are already a bijillion incompatible DTDs out there. The world doesn't need more.
And most importantly, requiring your suppliers and/or customers to conform to a closed-source DTD *COSTS THEM RESOURCES.* You shoot yourself in the foot when you do that: as soon as someone with a cheaper solution comes along, kiss your contract goodbye.
The best thing you can do is work *with* your competition to develop a *single* DTD that saves all your suppliers/customers money. Compete on the basis of service, of added-value, or something else that counts. Competing based on proprietary DTDs is just utterly stupid.
--
Release Those DTDs as Open Source! (Score:2)
Unless you want your data to be inaccessible to anyone else. What would be the point of a company declaring ``We're Open! We use XML!'' and then tying up the use of the data with some silly license attached to the DTD?
I'd love to see something big happen to XML. But then I had high hopes for EDI way back when. It turned into a total mess where every implementation was a custom job, so it was doomed to fall on its face, and far fewer companies wanted to take advantage of it. Each job was custom because no one could agree on things like what ``customer code'' meant. It's hard enough to get two divisions of the same company to agree on that, let alone two separate companies. Along comes XML, and it just might fall on its face for similar reasons.
--
What in God's name... (Score:1)
Re:XML (Score:1)
Fear of competitors (Score:1)
Re:Propose a standard (Score:1)
Re:What in God's name... (Score:1)
Are DTDs taking the world by storm ? (Score:1)
But then, it does not seem to catch on. This whole W3C XML1.0v4 thing seems to evolve into some kind of niche, in which some people want to play and many more others don't. We've all looked at it, and are not really impressed. I would dare to say: It fails to become "hot".
And then we've got those little idiosyncratic languages like XSLT and stuff, that aren't very impressive either. I'll stick with Java, if I need to convert complex data structures.
I'm rather convinced that releasing DTDs won't convince that many extra people to play the DTD game.
You could instead release some novel Martian poetry into the wild and hope that people will read it.
Steal? (Score:1)
Eh? (Score:2)
This sucks.
The extent of Roman Power (Score:1)
DTD need consensual XML standards (Score:2)
The difficulty comes from getting two sets of people to agree on what the object definitions are or are going to be. That requires collaboration and cooperation, two things that are not going to come from any software effort.
All software developers tend to treat the invention of fire as their exclusive intellectual property, and you can eat your meat cold and bloody or pay them for the privilege of cooking your steak.
The effort will have to come from consortia of clients and related firms who use data processing but aren't in that business.
That said, yes, you can publish the DTD specifications arrived at by the consortia, and they will be adequately covered by the document copyright.
Though I think that using copyleft would allow you to avoid stupidity like the RAMBUS debacle.
Newton said, "I see far because I stand on the shoulders of giants." Linus Torvalds, RMS, et alia are giants. Bill Gates is a big dip in the level playing field. Emulate Linus and you stand a chance. Emulate Bill and your effort will degenerate into a pack of wild dogs tearing at a haunch.
Re:"Copyright" DTDs make no sense (Score:2)
Actually, a "clean-room" implementation, if possible at all, would have to have different tag names, or else you would get sued for violating the copyright. But with different tag names, it would not actually work with data tagged the original way. Those "open" documents would not be so open after all.
IMO, copyrighted DTDs will be the major weapon in the next generation of attempts to corner the market via proprietary data formats.
Alas, that's a vision somewhat different from the promise of XML.
it wouldn't be 'theft' (Score:1)
If you're asking questions like this, your company probably isn't ready for GPL'ing your work. Here's the deal: if your competitors use (umm, that'd be 'steal' in your parlance) your DTD, their suppliers/vendors/partners will likely do so too. How many of these suppliers/vendors/partners do you have in common with your competitors? I think you can smell what I stepped in here...
The whole purpose behind GPL'ing something should be to encourage/enable its use (spare me the ethical/moral lectures please, I'm talking in practical terms here) by others, be they friend or foe. In the long run, the more companies that use your DTD, the fewer you'll have to write custom code for.
Slashdot's Own Example Of DTD/XML Use (Score:2)
Slashdot (again, Slash if it's set up to) produces all headlines in a convenient, machine-readable format. It can be found at www.slashdot.org/slashdot.xml.
At the same time, the DTD for this file (called 'Backslash', and found at www.slashdot.org/backslash.dtd) essentially describes to an XML parser what is and is not allowed in the file. It essentially defines what constitutes a "valid" document: a document is valid when, compared against the DTD, it conforms to the definition.
"Well-formed" is another XML term, which means that the document is at least formatted correctly according to the XML definition (for example, empty-element tags end with a forward slash, as in <br/>).
If you're interested in learning about XML and this DTD stuff, as well as the latest proposals that are meant to replace DTDs (such as XML Schema), check out the official W3C site at www.w3.org/XML/.
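For anyone who wants to see the "well-formed" half of that distinction in practice, here is a minimal Python sketch using the standard library's non-validating parser. The `story` and `title` element names are made up for the example and are not taken from the Slashdot DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """True if the text parses as XML at all.

    This says nothing about validity: no DTD is consulted here.
    """
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Properly nested tags, including an empty-element tag (<br/>).
print(is_well_formed("<story><title>Hello</title><br/></story>"))

# <title> is never closed, so the document is not well-formed.
print(is_well_formed("<story><title>Hello</story>"))
```

Checking validity against a DTD is a separate step that needs a validating parser; `xml.etree.ElementTree` does not do it.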
Re:Don't use DTD - use XML Schema (Score:2)
Well... (Score:1)
As many others have already pointed out, try to make it known that you are the ones who got this standard out.
However, you have to be careful that the standard you are pushing is a very good one. Once you're sure, go out and make yourselves heard. Coz if it turns out not-as-good-as-you-thought, get ready for some shit big time.
I had a very similar experience last year in my old job. My project was to export an in-house VHDL compiler's internal data structures to XML and then visualise them in HTML (as an application for the format).
What we did (after we'd finished it) was sit down and write a paper on it, submitted it to a conference, it got accepted and there we were getting ourselves some +ve publicity. So, that's definitely something to keep in mind.
Now, having gone through this research, here are a few hints for later down the road:
Trian
(off to get some rest and a beer coz I've spent a few hours too many in front of this machine)
BizTalk (Score:1)
There's nothing here that can't be done open-source, as far as I know. And even if you don't want to go BizTalk right away, definitely consider the implications of what they've done before going and implementing anything.
It must be said that every once in a while M$ does something actually kinda innovative. They are doing a lot of cool stuff with XML that no one else is doing, so you have to give them credit for that much. Their OS's, on the other hand...
Re:Don't use DTD - use XML Schema (Score:1)
DTD's may not be subject to copyright, anyway (Score:2)
Copyright protects the expression of ideas, and not ideas themselves (that's what patents are for). There's a copyright law concept called "the merger doctrine" that says (more or less) that you cannot copyright a work that represents the only possible expression of an idea -- to do so would result in copyright protecting the idea along with its expression, and that's beyond a copyright's power. The idea and its only expression are said to merge, and thereby fall out of the scope of copyright protection.
(The case that set this idea out was Baker v. Selden, which was decided in the late nineteenth century and had to do with a book of accounting forms -- the expression of the form was its idea, and as a result people were free to copyright the layout of the form.)
This is the reason right-thinking people believe that APIs cannot be copyrighted -- by definition, the API is the only accurate expression of the idea represented by the interface, and the merger doctrine applies.
A DTD would likely be subject to the same reasoning.
Re:DTD's may not be subject to copyright, anyway (Score:1)
Sorry -- cut and paste error -- that should be "... as a result, people were free to _copy_ the layout of the form."
It's the only way... (Score:1)
Besides, publishing your DTD will give you positive feedback. It doesn't really matter if the competition starts using your XML structure. You are the ones who already have all your systems developed to support that kind of XML document. In fact, if your competitors also start using your DTD, then more and more customers will start to use it. But your competitors will start from zero, while you have the knowledge, the prestige of having invented it, the actual systems already working, etc.
However, if you are planning to do this, don't use a DTD; DTDs are probably headed for a quick death. XML Schema definitions are better, not only because they allow you to describe your XML document more precisely, but because they are XML themselves.
Actually, if you can convince your customers to send you their data according to your DTD, then you are already doing a good thing in bringing all those people to the beautiful world of XML.
Re:Don't use DTD - use XML Schema (Score:1)
DTD is an interface... (Score:2)
I believe that copyright law says that you cannot prevent anyone from using an interface. Any license that restricts access to the interface is taking *away* a right that the user already possesses. This is a pretty big step for copyleft to take, and I don't know that it is legally valid without an end user license.
Another option would be a "weak" copyleft, that guarantees access to the original DTD, but does not restrict any software that uses the DTD. Sort of an LGPL for DTDs. I know you guys want a world where the people you don't like don't exist, but you twist the meaning of "freedom" beyond recognition when you dictate the license that other people's XML documents must be under. (I'm not leveling this solely at the copyleft community, but also at the commercial firms that do the same with proprietary licenses).
Re:XML (Score:2)
As I recall, an XML DTD can optionally include CSS information. This would enable a user agent (browser) to display an xml document correctly.
Who said anything about displaying information? Most information is meant to be processed far more often than it is displayed, and my complaint is about the inability to create any processing routines without using some external information, even when the standard could easily be formalized as constraints and processing algorithms that correspond to the nature of the data. I understand that in this GUI-obsessed consumer software industry people are more likely to think first about pretty ways of displaying the data, but for any kind of real work this would be the tail wagging the dog.
Re:No danger (Score:2)
Re:No danger (Score:2)
Under certain circumstances you can use a DTD to create an XML document, and then send someone the document without the DTD, because XML may still work without it.
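A minimal Python sketch of that situation, assuming a non-validating parser (here the standard library's ElementTree) and a document that uses no entities or default attribute values defined in the DTD. The file name `catalog.dtd` and the element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# The DOCTYPE names an external DTD ("catalog.dtd", a hypothetical file) that
# the recipient does not have; a non-validating parser does not need it.
DOC = """<?xml version="1.0"?>
<!DOCTYPE catalog SYSTEM "catalog.dtd">
<catalog><item sku="A1">Widget</item></catalog>"""

root = ET.fromstring(DOC)
items = [(item.get("sku"), item.text) for item in root.iter("item")]
print(items)
```

The catch is in the "certain circumstances": anything the DTD would have supplied, such as entity definitions or attribute defaults, is simply missing when the DTD is not there.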
XML doc control is different from open-source programming in that a DTD is not a program, it's more like a config file, and there is no point in keeping it secret.
If your DTD is a useful tool, it makes sense to allow others to use it; if it's really useful it may even become a de facto standard.
But there are so many useful DTDs out there already that creating your own should only be done after a document analysis has demonstrated that none of the existing ones will fit the bill.
///Peter
--
Schemes (Score:1)
Why are you worried? (Score:1)
Either way, if I were you, I'd learn Schemas. They have just become a Candidate Recommendation and are much more powerful in what they can define and do. And, more importantly, Schemas are XML files themselves so they can be transformed by XSLT if need be, or they can be parsed and processed, for instance, to provide the contents in a drop down list.
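Because a schema is itself an XML file, the drop-down idea can be sketched with an ordinary XML parser. The example below is a minimal Python illustration, not a full schema processor: the `currency` type is invented, and the namespace URI is the one from the final W3C Recommendation, which is an assumption here since drafts of the spec used different URIs.

```python
import xml.etree.ElementTree as ET

# Namespace of the final XML Schema Recommendation (an assumption; earlier
# drafts used other URIs).
XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema fragment defining an enumerated "currency" type.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="currency">
    <xs:restriction base="xs:string">
      <xs:enumeration value="EUR"/>
      <xs:enumeration value="USD"/>
      <xs:enumeration value="JPY"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>"""


def enumeration_values(schema_text: str, type_name: str) -> list[str]:
    """Collect the allowed values of a named simpleType, in document order."""
    root = ET.fromstring(schema_text)
    for simple_type in root.iter(XS + "simpleType"):
        if simple_type.get("name") == type_name:
            return [e.get("value") for e in simple_type.iter(XS + "enumeration")]
    return []


print(enumeration_values(SCHEMA, "currency"))
```

The returned list could feed a form widget directly; doing the same with a DTD would require a dedicated DTD parser, since DTD syntax is not XML.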
Re:XML (Score:1)
Re:"Copyright" DTDs make no sense (Score:1)
I think your goal is laudable, but you're ignoring the real question - who should pay the development cost of these standards? Since it is so easy to copy each other's work, maybe some involvement by industry associations and universities is needed.
Lawyers have to do some pro bono work to stay in their professional association. Could industry associations make the same requirement? To stay in the association, you must contribute to the standards effort or compensate those who do.
Why not make it a business school project to gather DTDs? Make the cost of the donations tax deductible?
Re:XML (Score:3)
Can you give an example of what you mean by creating processing routines without using some external information?
Any kind of data that reflects something in real life. For example, description of financial transactions, where only certain combinations of values are valid, and the effect of transaction should be calculated using known algorithms based on the document content itself and the database that describes entities involved that can be known only partially to each party (say, I don't know, how much money a brokerage and the government have, but I do know how to calculate brokerage fee and taxes when I sell stock, and I know how it affects my account -- but I can't just ask INS and brokerage to give me a bunch of machine-readable definitions that when compiled will allow me to process those things automatically).
XML 1.0 is the W3C's first Recommendation for the Extensible Markup Language, a system for defining, validating, and sharing document formats on the Web
I don't see the word "displaying" or any synonym for it anywhere in this definition. Documents may be shared for any kind of purpose, and displaying is just one of them.
The primary use of the web has been displaying information since it was invented. As far as I can tell, XML does exactly what it was designed to do.
Not true. HTTP was created as a protocol for transferring HTML files and images; however, full MIME type support was added in HTTP 1.0, and the protocol was transformed into a "better FTP". I don't think the rpm file that I downloaded yesterday to update my Red Hat distribution was ever meant to be displayed -- it serves its purpose only by being processed by the rpm utility, and most of the data it contains is not human-readable at all. XML is supposed to be used in the same way -- in my example of a financial transaction (for which the OFX format may be used -- it had first an SGML-based and later an XML-based version) the functionality is completely unrelated to the display of data, and even when such information is displayed, financial transactions usually are not displayed in the form in which they are performed, but are converted and combined together to be human-readable using external algorithms.
MOD This UP! (Score:1)
First, you should ask, can you COPYRIGHT a DTD?
Answer: NO
Then the copyleft or GPL issue never comes up.
Re:Eh? (Score:1)