XML

Structure of an XML Document

The structure of an XML document can be defined by two standards. The first is the XML specification, which defines the basic rules for building all XML documents. You can see the specification at the following Web site: http://www.w3.org/TR/1998/REC-xml-19980210. Any XML document that meets the rules defined by the XML specification is called a well-formed XML document. Checking whether a document is well formed means checking that it has the correct structure (syntax).

For example, one of the rules for a well-formed document is that every XML element must have a start tag and a matching end tag; an element with no content may instead be written as a single empty-element tag such as <br/>. If an element in an XML document is missing either tag, the document is not well formed. An XML-compliant application such as Microsoft Internet Explorer 5 can easily verify whether an XML document conforms to the XML specification.
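
The start-tag and end-tag rule can be tried out with any XML parser. The following is a minimal sketch in Python, assuming the standard library parser xml.etree.ElementTree; the helper name is_well_formed and the sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as well-formed XML, else False."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Every start tag has a matching end tag.
good = "<message><body>Hello</body></message>"

# The body element is never closed, so the parser rejects the document.
bad = "<message><body>Hello</message>"

print(is_well_formed(good))  # True
print(is_well_formed(bad))   # False
```

The parser reports only that the syntax is broken; it says nothing about whether the content of the document makes sense.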

The second standard, which is optional, is created by the authors of the document and is defined in a document type definition (DTD). When a well-formed XML document also meets the rules defined in the DTD, it is called a valid XML document. Validity is a check on content rather than on syntax. For example, suppose you have created a DTD that constrains the body element to only one instance in the entire document. If the document contained two instances of the body element, it would not be valid. Thus, using the DTD and the rules contained in the XML specification, an application can verify that an XML document is both valid and well formed. Schemas serve the same purpose as DTDs, but they use a different format. DTDs and schemas are useful when a group of documents shares a common set of rules. Applications can be written to produce documents that are valid according to the DTD and well formed according to the current XML standard.
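
The difference between the two checks can be sketched in code. The Python standard library parser does not validate against a DTD, so the example below hand-codes the single-body rule instead of reading it from a DTD; a validating parser would do that work in practice. The function name check_document and the message root element are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# A DTD could express part of this rule with a declaration such as
#   <!ELEMENT message (body)>
# Here the rule is hand-coded, because this parser does not validate.


def check_document(xml_text):
    """Classify a document as 'not well formed', 'not valid', or 'valid'."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Syntax errors are caught first; validity is never examined.
        return "not well formed"
    # Stand-in for the DTD constraint: exactly one body element
    # anywhere in the document.
    body_count = sum(1 for _ in root.iter("body"))
    return "valid" if body_count == 1 else "not valid"


print(check_document("<message><body>one</body></message>"))
print(check_document("<message><body>one</body><body>two</body></message>"))
print(check_document("<message><body>one</message>"))
```

The second document is perfectly good XML syntax and still fails, which is the distinction between well formed and valid.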

Many industries are currently writing standard DTDs and schemas. These standards will be used to create XML documents that share information among the members of the industry. For example, a committee of members from the medical community could determine the essential information for a patient and then use that information to build a patient record DTD. Patient information could be sent from one medical facility to another by writing applications that create messages containing an XML document built according to the patient record DTD. When an XML patient message is received, the patient record DTD would be used to verify that the patient record is valid, that is, that it contains all of the required information. If the XML patient message is invalid, the message would be sent back to the sending facility for correction. The patient record DTD and schema could be stored in a repository accessible through the Internet, allowing any medical facility to check the validity of incoming XML documents. One of the goals of BizTalk is to create a repository of schemas.
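
The receive, verify, and return flow described above might look like the following Python sketch. Everything specific in it is hypothetical: the element names in REQUIRED_FIELDS, the patient root element, and the function names are invented for illustration, and the required-field check is a hand-coded stand-in for validation against a real patient record DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical required elements; a real patient record DTD would be
# defined by the industry committee, not by this list.
REQUIRED_FIELDS = ("name", "dateOfBirth", "recordId")


def missing_fields(xml_text):
    """Return the required child elements that are absent or empty."""
    root = ET.fromstring(xml_text)
    missing = []
    for field in REQUIRED_FIELDS:
        element = root.find(field)
        if element is None or not (element.text or "").strip():
            missing.append(field)
    return missing


def handle_incoming(xml_text):
    """Accept a complete record, or say why it should be sent back."""
    try:
        missing = missing_fields(xml_text)
    except ET.ParseError:
        return "return to sender: not well formed"
    if missing:
        return "return to sender: missing " + ", ".join(missing)
    return "accepted"


complete = (
    "<patient><name>A. Example</name>"
    "<dateOfBirth>1970-01-01</dateOfBirth>"
    "<recordId>42</recordId></patient>"
)
incomplete = "<patient><name>A. Example</name></patient>"

print(handle_incoming(complete))
print(handle_incoming(incomplete))
```

Because both facilities would check against the same shared definition, the sender can run the same test before transmitting and avoid the round trip.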

In this tutorial, we will begin the process of creating an XML document that can be used to build Internet applications. Ideally, you will want a document that can be read in three ways: as an XML document by an XML-compliant browser, as an HTML document with style sheets by non-XML-compliant browsers that understand cascading style sheets (CSS), and as plain HTML by browsers that recognize neither CSS nor XML.

We will focus here on the process of creating a well-formed document. We'll review the rules that must be met by a well-formed document and create a well-formed document that can be used to display XML over the Internet in any HTML 4-compliant Web browser. In tutorial 4, you'll learn how to create a DTD for this well-formed document, and in tutorial 5, we will rework the DTD to make it more concise.