XML

Criteria for Well-Formed XML Documents

To be well formed, your XML document must meet the following requirements:

  1. The document must contain a single root element.

  2. Every element must be correctly nested.

  3. Each attribute can have only one value.

  4. All attribute values must be enclosed in double quotation marks or single quotation marks.

  5. Elements must have begin and end tags, unless they are empty elements.

  6. Empty elements are denoted by a single tag ending with a slash (/).

  7. Isolated markup characters are not allowed in content. The special characters <, &, and > are represented as &lt;, &amp;, and &gt; in content sections.

  8. A double quotation mark is represented as &quot;, and a single quotation mark is represented as &apos; in content sections.

  9. The sequences <![CDATA[ and ]]> cannot be used as ordinary text in content.

  10. If a document does not have a DTD, the values for all attributes must be of type CDATA by default.
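Several of these rules can be checked mechanically by handing a document to an XML parser, which rejects anything that is not well formed. The following is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module (neither is part of this chapter's tooling); the helper name is_well_formed and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents; each label names the rule it exercises.
samples = {
    "well formed": '<order id="1"><item/></order>',
    "two root elements (rule 1)": "<a/><b/>",
    "incorrect nesting (rule 2)": "<a><b></a></b>",
    "attribute given twice (rule 3)": '<a x="1" x="2"/>',
    "unquoted attribute value (rule 4)": "<a x=1/>",
    "missing end tag (rule 5)": "<a><b></a>",
    "isolated ampersand (rule 7)": "<a>fish & chips</a>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the first sample is accepted; the parser reports an error for each of the others.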

Rules 1 through 6 have been addressed in this chapter. If you need to use the special characters listed in rules 7 and 8, be sure to use the appropriate entity references, including the terminating semicolon. The sequence in rule 9 has a special meaning in XML and so cannot be used in content sections or names. We will discuss this sequence in Chapter 5. The CDATA type referred to in rule 10 is character data: a string of any allowable characters. In our sample document, every attribute value is such a string, so rule 10 is satisfied.
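The replacements described in rules 7 and 8 do not have to be typed by hand. As a rough sketch, assuming Python's standard xml.sax.saxutils module (an assumption, not something this chapter relies on), escape replaces &, <, and > in content, and quoteattr escapes and quotes an attribute value. The element name note and the sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Raw strings that contain the special characters from rules 7 and 8.
raw_text = 'if a < b && b > c then "done"'
raw_attr = 'Tom\'s "special" <deal>'

# escape() replaces &, <, and > with &amp;, &lt;, and &gt;.
content = escape(raw_text)

# Extra replacements can be passed in for the quotation marks of rule 8.
quoted = escape(raw_text, {'"': "&quot;", "'": "&apos;"})

# quoteattr() escapes the value and wraps it in quotation marks.
attr = quoteattr(raw_attr)

doc = f"<note title={attr}>{content}</note>"
print(doc)

# Parsing the result gives back the original, unescaped strings.
root = ET.fromstring(doc)
print(root.text == raw_text, root.get("title") == raw_attr)
```

The parser turns each entity reference back into the character it stands for, so the application reading the document sees the original text.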

Adding the XML Declaration

XML Notepad does not add the XML declaration to an XML document. The XML declaration is optional, but if it is provided it must be the very first thing in the document, with nothing before it (not even white space or a comment). The syntax for the declaration is shown here:

  <?xml version="version_number" encoding="encoding_declaration"
        standalone="standalone_status"?>

The version attribute identifies the version of the XML standard that this document complies with. The encoding attribute names the character encoding used to store the document, such as UTF-8, UTF-16, or ISO-8859-1. Because XML is based on Unicode, you can create documents in any language or character set. The standalone attribute specifies whether the document depends on markup declarations in external files (standalone="no") or is complete by itself (standalone="yes"). Only the version attribute is required; encoding and standalone are optional.
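To see how a parser treats the declaration, the sketch below reads the three attributes back from a parsed document. It assumes Python's standard xml.dom.minidom module, which is not part of this chapter's tooling; the helper names read_declaration and parses and the order element are hypothetical.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def read_declaration(data: bytes):
    """Parse data and return (version, encoding, standalone) as the parser saw them."""
    doc = minidom.parseString(data)
    return doc.version, doc.encoding, doc.standalone


def parses(data: bytes) -> bool:
    """Return True if the parser accepts data."""
    try:
        minidom.parseString(data)
        return True
    except ExpatError:
        return False


# A document that starts with a full declaration.
with_decl = b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?><order/>'
print(read_declaration(with_decl))

# The declaration is optional, so this document is still accepted.
without_decl = b"<order/>"
print(read_declaration(without_decl))

# A declaration that is not at the very start of the document is an error.
misplaced = b'\n<?xml version="1.0"?><order/>'
print(parses(misplaced))
```

When the declaration is omitted, minidom leaves all three values unset, and the parser falls back to its defaults.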