Slashdot
Is Dedicated Hosting for Critical DTDs Necessary?
Posted by
Cliff
on Thu May 17, 2007 06:07 PM
from the might-the-W3C-be-interested dept.
pcause asks: "Recently there was a glitch, when someone at Netscape took down a page that had an important DTD (for RSS), used by many applications and services. This got me thinking that many or all of the important DTDs that software and commerce depend on are hosted at various commercial entities. Is this a sane way to build an XML-based Internet infrastructure? Companies come and go all of the time; this means that the storage and availability of those DTDs is in constant jeopardy. It strikes me that we need an infrastructure akin to the root server structure to hold the key DTDs that are used throughout the industry. What organization would be the likely custodian of such data, and what would be the best way to ensure such an infrastructure stays funded?"
Related Stories
Technology: Netscape Restores RSS DTD, Until July 134 comments
Randall Bennett writes "RSS 0.91's DTD has been restored to its rightful location on my.netscape.com, but it'll only stay there till July 1st, 2007. Then, Netscape will remove the DTD, which is loaded four million times each day. Devs, start your caching engines."
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
I know! (Score:5, Funny)
Mhahahahaha. Yeah. I know, I crack myself up.
Re:I know! (Score:5, Funny)
Mhahahahaha. Yeah. I know, I crack myself up.
Re:Catalog files? (Score:5, Insightful)
Is there some technical reason I'm not aware of that means it has to stay somewhere central?
There shouldn't be, yet I would be greatly surprised if some application didn't match on the entire DTD string, hostname and all.
I am equally baffled at what applications need the DTD for anyway. Except for generic XML applications, what use is a DTD? Most applications only handle a fixed few XML document types anyway.
Finally, if they really need that DTD... any distro has most major DTDs available. No reason why they couldn't carry a few extra. Should be easy to just search for them locally.
Centralization (Score:5, Insightful)
Regards.
Don't use them (Score:5, Insightful)
Sure, DTD files are necessary for development. If your app requires that they be used to validate something in real time each time it comes in from a client or whatever, then use an internal copy of the version of the DTD file that you support. If the host makes a change to it (or drops it, or lets it get hacked), your app won't break, and you can decide when you will implement and support that change.
I really don't see what is gained by making the real time operation of your application dependent on the availability and pristinity of remotely and independently hosted files. It just makes you fragile, and you can get all the benefits you need from just checking the files during your maintenance and development cycles.
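A minimal sketch of that maintenance-time check, assuming Python: compare a fingerprint of the bundled copy against whatever the host currently serves, and only act on a difference when you choose to. The function names are made up for illustration, and how you obtain the upstream bytes is left out on purpose.

```python
import hashlib


def fingerprint(dtd_bytes: bytes) -> str:
    """Return a SHA-256 hex digest identifying one exact version of a DTD."""
    return hashlib.sha256(dtd_bytes).hexdigest()


def upstream_changed(bundled: bytes, upstream: bytes) -> bool:
    """True if the hosted DTD no longer matches the copy shipped with the app.

    Intended for a maintenance or CI job, not for the request path:
    the running application only ever reads the bundled copy.
    """
    return fingerprint(bundled) != fingerprint(upstream)
```

A changed fingerprint is a prompt to review the new DTD, not a reason for the running app to behave differently.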
Re:Don't use them (Score:5, Informative)
Re:Centralization (Score:5, Informative)
This is known as a URN [wikipedia.org]. URLs and URNs are together known as URIs.
w3c (Score:5, Insightful)
Re:w3c (Score:5, Funny)
DTD? (Score:4, Insightful)
Re:DTD? (Score:5, Informative)
Re:DTD? (Score:5, Funny)
In case of death... (Score:5, Insightful)
Sane? (Score:5, Insightful)
No (Score:5, Insightful)
You shouldn't be using DTDs any more. Validation is better achieved with RelaxNG, and you shouldn't use them for entity references because then non-validating parsers won't be able to handle your code.
For those document types that already use DTDs, either you ship the DTDs with your application, or you cache them the first time you parse a document of that type.
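The "cache them the first time" option could look roughly like this in Python. This is only a sketch: `cached_dtd` and its `fetch` parameter are hypothetical names, and the actual download is injected by the caller so the caching logic stays independent of any HTTP library.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cached_dtd(url: str, cache_dir: Path, fetch: Callable[[str], bytes]) -> bytes:
    """Return the DTD for `url`, calling `fetch` only if no cached copy exists."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so any identifier maps to a safe, fixed-length file name.
    name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".dtd"
    path = cache_dir / name
    if path.exists():
        return path.read_bytes()  # cache hit: no network access at all
    data = fetch(url)             # cache miss: first document of this type
    path.write_bytes(data)
    return data
```

Once a copy is on disk, later parses never touch the network, so the original host going away no longer matters to the client.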
The Netscape DTD issue was caused not by the DTD being unavailable, but by some client applications not being sufficiently robust. You shouldn't be looking at the hosting to solve the problem.
DTDs, XML entities and the non-breaking space (Score:4, Funny)
It's trivial to define "&nbsp;" yourself in a DTD (<!ENTITY nbsp "&#xa0;">), and many of the standard DTDs out there do define it, but by the XML 1.0 standard it's got to be defined somewhere or else the XML won't parse.
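A small sketch of that behaviour using Python's standard `xml.etree.ElementTree`; the two sample documents and the helper name `parses` are invented for illustration. The first document declares the entity in its internal DTD subset with the hex character reference `&#xa0;`; the second uses it without any declaration.

```python
import xml.etree.ElementTree as ET

# Entity declared in the internal subset: no external DTD needs to be fetched.
WITH_DECL = '<!DOCTYPE p [<!ENTITY nbsp "&#xa0;">]><p>a&nbsp;b</p>'
# Same content, but the entity is not declared anywhere.
WITHOUT_DECL = '<p>a&nbsp;b</p>'


def parses(doc: str) -> bool:
    """Report whether the document is accepted by the parser."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# The declared entity expands to U+00A0 (no-break space) in the element text.
text = ET.fromstring(WITH_DECL).text
```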
Perhaps something like "pool.ntp.org"? (Score:5, Insightful)
NTP.org [ntp.org] maintains a pool of public NTP servers that are accessible via the hostname "pool.ntp.org", so perhaps something similar would work for a global DTD repository. An industry organization with a vested interest (the W3C seems the most logical) could maintain the DNS zone, and organizations could volunteer some server space and bandwidth to host a mirror of the collected pool of DTDs. Volunteering organizations might come and go, but when that happens it's just a matter of updating the DNS zone to reflect the change, and everyone using DTDs just needs to know that a single generic hostname will always provide a copy of the required DTD.
Just a thought...
using non-local cached copy considered harmful (Score:5, Interesting)
Doing anything else strikes me as fundamentally dangerous and insecure: it turns a remote DNS vulnerability into an easy application DoS (or worse).
Call me crazy... (Score:5, Interesting)
Seriously, what the fuck were they thinking relying on a server to be always available?
Re:Call me crazy... (Score:5, Funny)
Your trust in the world is cute.
URI vs URL (Score:5, Insightful)
"http://my.netscape.com/publish/formats/rss-0.91.
It is silly that millions of RSS readers fetch a non-changing file from the same web site every day. It is only very slightly less silly that they fetch it from the web at all.
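To illustrate treating the identifier as a name rather than something to download, here is a rough Python sketch that reads the DOCTYPE identifiers with the standard `xml.parsers.expat` module and compares them as plain strings. The sample document, its public identifier, and the `example.org` URL are placeholders, not the real RSS 0.91 identifiers.

```python
from xml.parsers import expat

# Placeholder document; both identifiers are made up for this sketch.
SAMPLE = (
    b'<!DOCTYPE rss PUBLIC "-//Example//DTD RSS 0.91//EN" '
    b'"http://example.org/formats/rss-0.91.dtd">'
    b'<rss version="0.91"/>'
)


def doctype_ids(xml_bytes: bytes) -> dict:
    """Return the DOCTYPE name, SYSTEM id and PUBLIC id without fetching anything."""
    found = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        found.update(name=name, system_id=system_id, public_id=public_id)

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    # No external-entity handler is installed, so the DTD is never loaded.
    parser.Parse(xml_bytes, True)
    return found


def is_known_format(ids: dict, known_system_ids: set) -> bool:
    """Recognise a document type by comparing its identifier as an opaque string."""
    return ids.get("system_id") in known_system_ids
```

The URL here works like a label: the reader decides which format it is looking at from the string alone.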
EXACTLY (Score:5, Insightful)
A DTD spec SHOULD have both a PUBLIC identifier and a SYSTEM identifier. The system identifier is strongly recommended to be a URL so that a validating parser can fetch the DTD if the DTD is not found in the system catalog.
The system catalog is supposed to map from the PUBLIC identifier to a local file, so that the parser needn't go to the network.
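The lookup order described here can be sketched in a few lines of Python. This is a toy stand-in, assuming the catalog is just a dict from PUBLIC identifier to a local path; real system catalogs are files in their own format, and `resolve_dtd` is a hypothetical name.

```python
from pathlib import Path
from typing import Mapping, Optional


def resolve_dtd(public_id: Optional[str], system_id: str,
                catalog: Mapping[str, Path]) -> str:
    """Pick where a parser should load a DTD from.

    A catalog entry for the PUBLIC identifier wins if the file exists;
    otherwise fall back to the SYSTEM identifier (normally a URL).
    """
    if public_id is not None:
        local = catalog.get(public_id)
        if local is not None and local.is_file():
            return local.resolve().as_uri()  # file:// URI, no network needed
    return system_id
```

Only documents whose PUBLIC identifier is missing from the catalog ever cause a network fetch.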
If you are running a recent vintage Linux, look in
So:
Not again (Score:4, Informative)
Supply local DTDs with your app (Score:5, Interesting)
When prototyping our "offline mode", we ran into this exact same problem because the XML APIs we used wanted to validate XML against online DTDs. We amended the validator's resolver to use locally embedded or cached DTDs for all our doctypes, problem solved.
In my app it was an obvious problem to solve because offline usage was a big scenario, but I could imagine that being "out of scope" for a less-than-robust website.
I have a server in my basement we could use. (Score:5, Funny)