
A Schema for a Data-Oriented XML Document

The example used up to this point has been a document-oriented XML document whose only data type is string. To see how the other data types work, in this section we'll create an example using the Northwind Traders database. (This database ships with Microsoft Access, Visual Studio, and Microsoft SQL Server 7.) For the Customers and Categories tables, you could create the schema shown below.

  <?xml version="1.0"?>
  <schema targetNamespace = "http://www.northwind.com/Category"
      xmlns = "http://www.w3.org/1999/XMLSchema"
      xmlns:Categories = "http://www.northwind.com/Category">
      <simpleType name="String15" base="string">
          <maxLength value = "15"/>
          <minLength value = "0"/>
      </simpleType>
      <simpleType name="String5" base="string">
          <maxLength value = "5"/>
          <minLength value = "0"/>
      </simpleType>
      <simpleType name="String30" base="string">
          <maxLength value = "30"/>
          <minLength value = "0"/>
      </simpleType>
      <simpleType name="String60" base="string">
          <maxLength value = "60"/>
          <minLength value = "0"/>
      </simpleType>
      <simpleType name="String10" base="string">
          <maxLength value = "10"/>
          <minLength value = "0"/>
      </simpleType>
      <simpleType name="String24" base="string">
          <maxLength value = "24"/>
          <minLength value = "0"/>
      </simpleType>
      <simpleType name="String40" base="string">
          <maxLength value = "40"/>
          <minLength value = "0"/>
      </simpleType>
      <element name = "Categories">
          <complexType content = "elementOnly">
              <group>
              <sequence>
              <element ref = "Categories.CategoryID"
                       minOccurs = "1" maxOccurs = "1" />
              <element ref = "Categories.CategoryName"
                       minOccurs = "1" maxOccurs = "1" />
              <element ref = "Categories.Description"
                       minOccurs = "0" maxOccurs = "1" />
              <element ref = "Categories.Picture" minOccurs = "0"
                       maxOccurs = "1"/>
              </sequence>
              </group>
          </complexType>
      </element>
      <element name = "Categories.CategoryID" type = "integer">
          <annotation>
              <documentation>Number automatically assigned to a new
                             category
              </documentation>
          </annotation>
      </element>
      <element name = "Categories.CategoryName" type = "String15">
          <annotation>
              <documentation>Name of food category</documentation>
          </annotation>
      </element>
      <element name = "Categories.Description" type = "string"/>
      <element name = "Categories.Picture" type = "binary">
          <annotation>
              <documentation> Picture representing the food category
              </documentation>
          </annotation>
      </element>
      <element name = "Customers">
          <complexType content = "elementOnly">
              <group>
                  <sequence>
                      <element ref = "Customers.CustomerID"
                               minOccurs = "1" maxOccurs = "1"/>
                      <element ref = "Customers.CompanyName"
                               minOccurs = "1" maxOccurs = "1"/>
                      <element ref = "Customers.ContactName"
                               minOccurs = "1" maxOccurs = "1"/>
                      <element ref = "Customers.ContactTitle"
                               minOccurs = "0" maxOccurs = "1"/>
                      <element ref = "Customers.Address"
                               minOccurs = "1" maxOccurs = "1"/>
                      <element ref = "Customers.City" minOccurs = "1"
                               maxOccurs = "1"/>
                      <element ref = "Customers.Region"
                               minOccurs = "1" maxOccurs = "1"/>
                      <element ref = "Customers.PostalCode"
                               minOccurs = "1" maxOccurs = "1"/>
                      <element ref = "Customers.Country"
                               minOccurs = "1" maxOccurs = "1"/>
                      <element ref = "Customers.Phone" minOccurs = "1"
                               maxOccurs = "1"/>
                      <element ref = "Customers.Fax" minOccurs = "0"
                               maxOccurs = "1"/>
                  </sequence>
              </group>
          </complexType>
      </element>
      <element name = "Customers.CustomerID" type = "String5">
          <annotation>
              <documentation>
                  Unique five-character code based on customer name
              </documentation>
          </annotation>
      </element>
      <element name = "Customers.CompanyName" type = "String40"/>
      <element name = "Customers.ContactName" type = "String30"/>
      <element name = "Customers.ContactTitle" type = "String30"/>
      <element name = "Customers.Address" type = "String60">
          <annotation>
              <documentation>Street or post-office box</documentation>
          </annotation>
      </element>
      <element name = "Customers.City" type = "String15"/>
      <element name = "Customers.Region" type = "String15">
          <annotation>
              <documentation>State or province</documentation>
          </annotation>
      </element>
      <element name = "Customers.PostalCode" type = "String10"/>
      <element name = "Customers.Country" type = "String15"/>
      <element name = "Customers.Phone" type = "String24">
          <annotation>
              <documentation>
                  Phone number includes country code or area code
              </documentation>
          </annotation>
      </element>
      <element name = "Customers.Fax" type = "String24">
          <annotation>
              <documentation>
                  Fax number includes country code or area code
              </documentation>
          </annotation>
      </element>
  </schema>

Notice that Categories and Customers have been used as name prefixes, separated from the field name by a period, to identify which table each element belongs to. If you look in the Northwind Traders database, you'll see that the field data types and the lengths of the character fields match those in the database. The field descriptions stored in the Northwind Traders database were carried over into the schema as documentation elements. You can see that it's fairly easy to convert a database table into a schema.
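As a rough illustration of what these declarations constrain, the following Python sketch hand-checks one Categories record against a few of the rules in the schema. It is not a schema validator. The sample record, the check_category function, and the REQUIRED and MAX_LENGTHS tables are all assumptions made for this example, the namespace declaration is left out of the sample for brevity, and only the required elements, the integer CategoryID, and the 15-character limit on CategoryName are checked.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document; the values are illustrative only.
SAMPLE = """\
<Categories>
    <Categories.CategoryID>1</Categories.CategoryID>
    <Categories.CategoryName>Beverages</Categories.CategoryName>
    <Categories.Description>Soft drinks, coffees, teas</Categories.Description>
</Categories>
"""

# Elements declared with minOccurs = "1" in the Categories complex type.
REQUIRED = ["Categories.CategoryID", "Categories.CategoryName"]

# Length limit taken from the String15 simple type used by CategoryName.
MAX_LENGTHS = {"Categories.CategoryName": 15}


def check_category(xml_text):
    """Return a list of problems found; an empty list means every check passed."""
    root = ET.fromstring(xml_text)
    # Map each child element name to its text content.
    fields = {child.tag: (child.text or "") for child in root}
    problems = []
    for name in REQUIRED:
        if name not in fields:
            problems.append("missing " + name)
    for name, limit in MAX_LENGTHS.items():
        if name in fields and len(fields[name]) > limit:
            problems.append(name + " exceeds " + str(limit) + " characters")
    if "Categories.CategoryID" in fields:
        try:
            int(fields["Categories.CategoryID"])
        except ValueError:
            problems.append("Categories.CategoryID is not an integer")
    return problems
```

Calling check_category(SAMPLE) returns an empty list. Replacing the category name with a 16-character string makes the function report that Categories.CategoryName exceeds 15 characters, which is the same constraint the String15 type expresses declaratively.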

Now that we have discussed schemas, the next step is to see how they work with namespaces. In the following section, we'll examine how to use namespaces in schemas.

Namespaces and Schemas

In Chapter 6, we looked at using namespaces for DTDs. Namespaces can be read and interpreted in well-formed XML documents. Unfortunately, DTDs are not well-formed XML, so if you use a namespace in a DTD, the namespace cannot be resolved. Let's look at the following DTD as an example:

  <!DOCTYPE doc [
  <!ELEMENT doc (body)>
  <!ELEMENT body EMPTY>
  <!ATTLIST body bodyText CDATA #REQUIRED>
  <!ELEMENT HTML:body EMPTY>
  <!ATTLIST HTML:body HTML:bodyText CDATA #REQUIRED>
  ]>

A valid usage of this DTD is shown here:

  <doc><body bodyText="Hello, world"/></doc>

The following usage would be invalid, however, because the HTML:body element is not defined as a child element of the doc element:

  <doc><HTML:body bodyText="Hello, world"/></doc>

As far as the DTD is concerned, the HTML:body element and the body element are two completely different elements. A DTD cannot resolve a namespace and break a qualified name into its prefix (HTML) and its name (body), so the prefix and the name simply become one word. We want to use namespaces and still be able to separate the prefix from the name. Schemas enable us to do this.
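To make the contrast concrete, here is a minimal Python sketch of what a namespace-aware parser does with the same names. It uses the standard xml.etree.ElementTree module, which replaces each prefix with the namespace URI it is bound to and reports names in the form {uri}name. The namespace URI, the DOC string, and the split_name helper are assumptions introduced for this illustration.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI, chosen only for illustration.
HTML_URI = "http://www.w3.org/TR/REC-html40"

# The HTML prefix must be declared before a namespace-aware parser accepts it.
DOC = (
    '<doc xmlns:HTML="%s">'
    '<HTML:body HTML:bodyText="Hello, world"/>'
    "</doc>" % HTML_URI
)


def split_name(qualified):
    """Split ElementTree's '{uri}name' form into a (uri, name) pair."""
    if qualified.startswith("{"):
        uri, name = qualified[1:].split("}", 1)
        return uri, name
    # No namespace: the name stands on its own.
    return None, qualified


root = ET.fromstring(DOC)
body = root[0]

# The element name and the attribute name are each resolved separately.
element_name = split_name(body.tag)
attribute_names = [split_name(name) for name in body.attrib]
```

Here element_name holds the namespace URI and the name body as two separate values, and the attribute is resolved to the same URI with the name bodyText. The unprefixed doc element has no namespace at all. A DTD, by comparison, would see only the single word HTML:body.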