What's Wrong with DTDs?
Remember that the XML DTD describes the structure of the elements in an XML document and that the document can be validated against the DTD to check that it conforms to that structure. This ability to validate the contents of a document gives XML an edge over other methods of marking up text. There is, for example, no comparable way to check the markup of TeX code (a computer typesetting language still popular with academics in mathematics and the sciences) or TROFF code (the primitive markup language used to format online help information for the UNIX man utility).
The DTD also provides a certain amount of control over attributes, but offers very little ability to check the actual data inside the elements. It can check that there is a DATE element, but it has no way of confirming that what is inside that DATE element isn't absolute nonsense that couldn't be turned into data no matter how much imagination you applied to the task.
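To make the DATE example concrete, here is a minimal sketch in Python. The ORDER wrapper element, the sample documents, and the ISO-style date format are assumptions made for illustration only. The first function performs the kind of structural check a DTD can express; the second performs the content check it cannot.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical fragments: both would satisfy a DTD declaration such as
# <!ELEMENT DATE (#PCDATA)>, because the DTD only sees "some text".
GOOD = "<ORDER><DATE>1999-03-14</DATE></ORDER>"
BAD = "<ORDER><DATE>next Tuesday-ish</DATE></ORDER>"


def has_date_element(xml_text):
    """Structural check, roughly what a DTD can express."""
    return ET.fromstring(xml_text).find("DATE") is not None


def date_is_usable(xml_text, fmt="%Y-%m-%d"):
    """Content check a DTD cannot express: does the text parse as a date?"""
    text = ET.fromstring(xml_text).findtext("DATE", default="")
    try:
        datetime.strptime(text.strip(), fmt)
        return True
    except ValueError:
        return False
```

Both sample documents pass the structural check, but only the first passes the content check. That second check has to live in application code, outside the DTD.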
If you want to use XML to encode the contents of a database or to transfer credit card transaction data across the Internet, which is a much more commercially viable proposition, there have to be ways of enforcing tighter control over the data. Most of these systems work on the age-old principle of "garbage in, garbage out" (GIGO), and the only way to avoid a lot of manual cleaning up is to ensure that the data that goes in is clean. You want to be able to ensure that a numerical value doesn't contain any text, that a currency value has exactly two digits to the right of the decimal point (or comma, depending on the currency), and so on. Here, perhaps because XML is a derivative of SGML, which is very document-oriented and considers documents rather than data, the DTD fails dismally.
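The two checks just mentioned are the kind of constraint a schema language would need to express. As a rough sketch in Python (the function names and the exact rules are assumptions, not part of any schema proposal), they amount to something like this:

```python
import re


def is_numeric(text):
    """True if the text is an optionally signed number with no stray characters."""
    return re.fullmatch(r"[+-]?\d+(\.\d+)?", text.strip()) is not None


def is_currency(text, separator="."):
    """True if the text has exactly two digits after the decimal separator.

    The separator is a parameter because some currencies are written
    with a comma rather than a point.
    """
    pattern = r"\d+" + re.escape(separator) + r"\d{2}"
    return re.fullmatch(pattern, text.strip()) is not None
```

With these definitions, is_numeric("4two") and is_currency("19.9") are both rejected, while is_currency("19,99", separator=",") is accepted. A DTD offers no way to state either rule.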
To add to this already major problem, there is a general feeling that it is already asking a lot for people to learn the syntax of the XML language. The XML DTD, however, isn't really an XML document, and it isn't even an SGML document; the DTD has a syntax all its own. Apart from the fact that this increases the burden of learning even further, it takes away one of the major potential strengths of XML: automation.
Although there aren't that many XML tools yet, there probably will be soon. The situation will probably prove comparable to the way HTML developed. In the beginning there were very few tools and most people used whatever tools they had. They were often reduced to writing HTML code by hand. It didn't take long for sophisticated tools to arrive and we have now reached the point where it is quite easy to produce HTML documents without having a clue what HTML looks like.
The story of XML's development has been driven by a desire to automate, a desire to make machine-generation of XML a genuine possibility. XSL went from being a sophisticated variant of the LISP programming language to a more simplified style language that shares XML's syntax. XLink and XPointer have followed a similar path from being a blend of SGML and HyTime to a simplified language that also shares XML's syntax. The end result is that it becomes realistically possible to consider machine-generating the XML code and the code for its appearance and its linking to other documents.
The general trend in the initiatives aimed at developing an alternative to DTDs is towards an XML schema that provides the tighter content control that data applications require, while sharing XML's syntax in order to implement a complete machine-generation environment.
One weakness of XML that has been drawing increasing attention is XML's data model, and this is the last nail in the DTD's coffin. The relationship between elements in an XML document is purely hierarchical; there's no way to express relationships in a richer fashion. (There is an SGML application called Topic Navigation Maps that offers a solution, but this hasn't had much impact yet.) If you consider the problems of transferring massive amounts of data across an already overburdened Internet, matters of scale and economy become crucial. Many believe that what XML needs is an object-oriented hierarchy; it must be possible to work with classes of objects, like purchases and sales. These classes would be subclasses of a wider class, such as a transaction, from which they could inherit properties like value, date, credit card number, and so on, while adding "local" properties of their own, such as sales discount. None of this could possibly be achieved with a DTD, but all of it is inherent in the design of some of the possible replacements.
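For readers who think in programming terms, the class relationship described here can be sketched in Python. This only illustrates the inheritance idea and is not the syntax of any schema proposal; the class and field names are assumptions drawn from the examples in the text.

```python
import datetime
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Transaction:
    """The wider class: properties shared by every kind of transaction."""
    value: Decimal
    date: datetime.date
    credit_card_number: str


@dataclass
class Purchase(Transaction):
    """Inherits value, date, and credit card number; adds nothing of its own."""


@dataclass
class Sale(Transaction):
    """Inherits the shared properties and adds one "local" property."""
    sales_discount: Decimal = Decimal("0")
```

A Sale carries everything a Transaction does plus its own sales_discount, while a Purchase has no such property. A DTD can declare the elements involved, but it has no way to say that one element type is a specialization of another.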
The XML DTD isn't dead yet, but it can only be a matter of time before it becomes obsolete. There's no single replacement, but many are waiting in the wings. It has been claimed that they are contending candidates, especially since Microsoft is one originator (XML-Data) and Netscape is another (RDF). It isn't a genuine contest, though, because these two schemas don't necessarily address the same problems. A third schema you will encounter (DCD) manages to combine them both. All we can do is let the interested parties sort out exactly which one is going to be the replacement or (more likely) come up with the ultimate replacement.