Digging into Complex Types : XML

Complex types can be broken down into four major classifications, as follows:

Empty elements
Element-only elements
Mixed elements
Sequences and choices

The next few sections explore these different complex types in detail.

Empty Elements

Empty elements contain no text content or child elements but are capable of having attributes. In fact, attributes are the only way to associate information with empty elements. You create empty elements using the xsd:complexType element in conjunction with the xsd:complexContent element. Following is an example of how you create an empty element:

<xsd:element name="automobile">
  <xsd:complexType>
    <xsd:complexContent>
      <xsd:restriction base="xsd:anyType">
        <xsd:attribute name="vin" type="xsd:string"/>
        <xsd:attribute name="year" type="xsd:gYear"/>
        <xsd:attribute name="make" type="xsd:string"/>
        <xsd:attribute name="model" type="xsd:string"/>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:element>

Although this may seem like a lot of work to simply create an empty element with a few attributes, each piece has a job. The xsd:complexType and xsd:complexContent elements establish that this is a complex type, whereas the xsd:restriction element restricts the general-purpose base type (xsd:anyType) so that no element or text content remains. Finally, the attributes for the element are created using the familiar xsd:attribute element. Following is an example of how you would use the automobile element in an XML document:

<automobile vin="SALHV1245SA661555" year="1995"
  make="Land Rover" model="Range Rover"/>
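
XML Schema also allows a shorter way to write the same kind of declaration: a complex type that declares only attributes and no content model describes an empty element. The following is a minimal sketch of that shorthand, wrapped in an xsd:schema element so that it stands on its own. It uses xsd:gYear, which is the name the final XML Schema Recommendation gives to the year data type.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- No content model is declared, so automobile must be empty -->
  <xsd:element name="automobile">
    <xsd:complexType>
      <xsd:attribute name="vin" type="xsd:string"/>
      <xsd:attribute name="year" type="xsd:gYear"/>
      <xsd:attribute name="make" type="xsd:string"/>
      <xsd:attribute name="model" type="xsd:string"/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

The sample automobile element shown above should validate against this declaration unchanged.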

Element-Only Elements

Element-only elements are elements that contain only child elements with no text content. They can also contain attributes, of course, but no text content is allowed within an element-only element. To create an element-only element, you use the xsd:complexType element and place the child element declarations inside a content model group such as xsd:sequence, which is covered later on this page. Following is an example of an element-only element that contains a single child element:

<xsd:element name="assets">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="automobile" type="automobileType"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>

This code presents a new wrinkle because the child element of assets is declared as type automobileType. This kind of named complex type is created much like the named simple types you saw earlier in the tutorial: you declare the xsd:complexType element on its own, outside of any element declaration, and give it a name attribute. Any complex type, including an element-only one, can be coded as a named type in this way. Following is an example of how you might code the automobileType named complex data type:

<xsd:complexType name="automobileType">
  <xsd:complexContent>
    <xsd:restriction base="xsd:anyType">
      <xsd:attribute name="vin" type="xsd:string"/>
      <xsd:attribute name="year" type="xsd:gYear"/>
      <xsd:attribute name="make" type="xsd:string"/>
      <xsd:attribute name="model" type="xsd:string"/>
    </xsd:restriction>
  </xsd:complexContent>
</xsd:complexType>

This is the same empty complex type you saw in the previous section, with the same attributes, except in this case it has been created as a named type. Following is an example of XML code that uses the assets element, automobile element, and automobileType complex type:

<assets>
  <automobile vin="SALHV1245SA661555" year="1995"
    make="Land Rover" model="Range Rover"/>
</assets>

You might be wondering exactly how useful the assets element is because it can contain only a single automobile element. In reality, practically all element-only elements are capable of storing multiple child elements, sometimes of different types. Child elements are organized with a special construct known as a sequence, which can list several elements and control how many times each one may occur. You learn about sequences a little later on this page, in the section titled "Sequences and Choices."
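
As a preview, the following sketch shows one way the assets element could be allowed to hold any number of automobile elements. The minOccurs and maxOccurs settings are assumptions chosen for illustration, and automobileType is trimmed to a single attribute to keep the listing short.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Trimmed version of automobileType: attributes only, no content -->
  <xsd:complexType name="automobileType">
    <xsd:attribute name="vin" type="xsd:string"/>
  </xsd:complexType>

  <xsd:element name="assets">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Zero or more automobile children are allowed -->
        <xsd:element name="automobile" type="automobileType"
          minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

With this declaration, an assets element can be empty or can list one automobile after another.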

Mixed Elements

Mixed elements contain both text and child elements and are the most flexible of all elements. Text-only elements are considered a type of mixed element and can contain only text with no child elements. You create text-only elements using the xsd:complexType element in conjunction with the xsd:simpleContent element. Following is an example of a text-only element:

<xsd:element name="distance">
  <xsd:complexType>
    <xsd:simpleContent>
      <xsd:extension base="xsd:decimal">
        <xsd:attribute name="units" type="xsd:string" use="required"/>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>
</xsd:element>

The distance element can be used to store a distance traveled and is capable of using different units of measure to give meaning to the numeric content it stores. The actual distance is located in the element's content, whereas the units are determined by the units attribute, which is a string. It's important to notice the extra use attribute, which is set to required. This attribute setting makes the units attribute a requirement of the distance element, which means you must assign a value to the units attribute. Following is an example of how the distance element and units attribute might be used in an XML document:

<distance units="miles">12.5</distance>

Although text-only elements are certainly useful in their own right, there are some situations where it is necessary to have the utmost freedom in coding element content, and that freedom comes with the mixed element. Mixed elements are created similarly to other complex types but with the addition of the mixed attribute, which is set to true on the xsd:complexType element. Keep in mind that mixed types allow for text and child element content, as well as attributes. Following is an example of a mixed type:

<xsd:element name="message">
  <xsd:complexType mixed="true">
    <xsd:sequence>
      <xsd:element name="emph" type="xsd:string"/>
    </xsd:sequence>
    <xsd:attribute name="to" type="xsd:string" use="required"/>
    <xsd:attribute name="from" type="xsd:string" use="required"/>
    <xsd:attribute name="timestamp" type="xsd:dateTime" use="required"/>
  </xsd:complexType>
</xsd:element>

In this example, a mixed element is created that can contain text, an emph element, and three attributes. Admittedly, I skipped ahead a little by placing the emph child element in a sequence, but that will be cleared up in the next section. Following is an example of how the message element might be used in an XML document:

<message to="you" from="me" timestamp="2001-03-14T12:45:00">
I hope you return soon. I've <emph>really</emph> missed you!
</message>

In this example the emph child element is used to add emphasis to the word "really" in the message.
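
One detail of the declaration above is worth spelling out: because emph is listed in the sequence without any occurrence settings, every message must contain exactly one emph element. The following sketch is a hypothetical variation that lets emph appear any number of times, including not at all. It declares the timestamp with xsd:dateTime, the date-and-time type in the final XML Schema Recommendation.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="message">
    <!-- mixed="true" allows text between and around the child elements -->
    <xsd:complexType mixed="true">
      <xsd:sequence>
        <!-- Zero or more emph elements may be scattered through the text -->
        <xsd:element name="emph" type="xsd:string"
          minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
      <xsd:attribute name="to" type="xsd:string" use="required"/>
      <xsd:attribute name="from" type="xsd:string" use="required"/>
      <xsd:attribute name="timestamp" type="xsd:dateTime" use="required"/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Under this variation, a message with plain text only, or with several emphasized words, would be acceptable alongside the single-emph message shown earlier.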

Sequences and Choices

One powerful aspect of complex types is the ability to organize elements into sequences and choices. A sequence is a list of child elements that must appear in a particular order, whereas a choice is a list of child elements from which exactly one must be used. You create a sequence with the xsd:sequence element, which houses the elements that make up the sequence. Following is an example of creating a sequence:

<xsd:element name="quiz">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="question" type="xsd:string"/>
      <xsd:element name="answer" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>

In this example, the quiz element contains two child elements, question and answer, that must appear in the order specified. By default, a sequence can occur only once within an element. However, you can use the minOccurs and maxOccurs attributes to allow the sequence to occur multiple times. For example, if you wanted to allow the quiz element to contain up to 20 question/answer pairs, you would code it like this:

<xsd:element name="quiz">
  <xsd:complexType>
    <xsd:sequence minOccurs="1" maxOccurs="20">
      <xsd:element name="question" type="xsd:string"/>
      <xsd:element name="answer" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>

You can set the maxOccurs attribute to unbounded to allow for an unlimited number of sequences. The maxOccurs attribute can also be used with individual elements to control the number of times they can occur.

Following is an example of how you might use the quiz element in an XML document:

<quiz>
  <question>What does XML stand for?</question>
  <answer>eXtensible Markup Language</answer>
  <question>Who is responsible for overseeing XML?</question>
  <answer>World Wide Web Consortium (W3C)</answer>
  <question>What is the latest version of XML?</question>
  <answer>1.0</answer>
</quiz>
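
The point about applying maxOccurs to individual elements can be sketched with a variation on the quiz element. In this hypothetical version, the sequence as a whole may repeat without limit, and each question may be followed by anywhere from zero to three answer elements. The specific limits are assumptions chosen only for illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="quiz">
    <xsd:complexType>
      <!-- The question/answer group may repeat any number of times -->
      <xsd:sequence maxOccurs="unbounded">
        <xsd:element name="question" type="xsd:string"/>
        <!-- Each question may have from zero to three answers -->
        <xsd:element name="answer" type="xsd:string"
          minOccurs="0" maxOccurs="3"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

The three-question quiz document shown above fits this declaration as well, because each of its questions has exactly one answer.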

If you want to allow an element to contain one of a series of optional elements, you can use a choice. A choice allows you to list several child elements and/or sequences, with only one of them allowed for use in any given element. Choices are created with the xsd:choice element, which contains the list of choice elements. Following is an example of a choice:

<xsd:element name="id">
  <xsd:complexType>
    <xsd:choice>
      <xsd:element name="ssnum" type="xsd:string"/>
      <xsd:sequence>
        <xsd:element name="name" type="xsd:string"/>
        <xsd:element name="birthdate" type="xsd:date"/>
      </xsd:sequence>
      <xsd:element name="licensenum" type="xsd:string"/>
    </xsd:choice>
  </xsd:complexType>
</xsd:element>

In this example, an element named id is created that allows three different approaches to providing identification: social security number, name and birth date, or driver's license number. The choice is what makes it possible for the element to accept only one of the approaches. Notice that a sequence is used with the name and birth date approach because it involves two child elements. Following is an example of a few id elements that use each of the different choice approaches:

<id>
  <ssnum>123-89-4567</ssnum>
</id>
<id>
  <name>Milton James</name>
  <birthdate>1969-10-28</birthdate>
</id>
<id>
  <licensenum>12348765</licensenum>
</id>

If you're looking to create content models with less rigid structure, you might consider using the xsd:all element, which is used to create complex types whose child elements can appear in any order. The xsd:all element is used much like a sequence except that the order of the child elements is not fixed; in XML Schema 1.0, each child element within it can appear at most once, and you can make a child optional by setting its minOccurs attribute to 0.
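
The following sketch shows what an xsd:all content model might look like. The contact element and its name, email, and phone children are hypothetical names introduced only for this illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="contact">
    <xsd:complexType>
      <!-- Children may appear in any order, each at most once -->
      <xsd:all>
        <xsd:element name="name" type="xsd:string"/>
        <xsd:element name="email" type="xsd:string"/>
        <!-- minOccurs="0" makes phone optional -->
        <xsd:element name="phone" type="xsd:string" minOccurs="0"/>
      </xsd:all>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

A contact element could then list email before name, or name before email, with or without a phone element. If you need child elements that can repeat freely in any order, a choice with maxOccurs set to unbounded is one alternative worth considering.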

One last topic worth covering before moving on to a complete XSD example has to do with how data types are referenced. With the exception of the root element, which is automatically referenced in an XSD, global document components must be referenced in order to actually appear as part of a document's architecture. You should consider using a global component when you have an element or attribute that appears repeatedly within other elements. In most of the examples you've seen, the components have been declared locally, which means they are automatically referenced within the context that they appear. However, consider an element, such as the following one, which is declared globally:

<xsd:element name="password">
  <xsd:simpleType>
    <xsd:restriction base="xsd:string">
      <xsd:minLength value="8"/>
      <xsd:maxLength value="12"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:element>

Although this element has been declared and is ready for use, it doesn't actually appear within the structure of an XSD until you reference it. You reference elements using the ref attribute, which applies to both elements and attributes. Following is an example of how the password element might be referenced within another element:

<xsd:element name="login">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="userid" type="xsd:string"/>
      <xsd:element ref="password"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>

In this example the userid element is created and used locally, whereas the password element is referenced from the previous global element declaration. Whether or not you use elements and attributes locally or globally primarily has to do with how valuable they are outside of a specific context; if an element or attribute is used only in a single location then you might as well simplify things and keep it local. Otherwise, you should consider making it a global component and then referencing it wherever it is needed using the ref attribute.

The difference between local and global elements has to do with how they are created, which determines how you can use them. An element (userid in the previous example) declared within another element is considered local to that element, and can only be used within that element. A global element (password in the previous example) is declared by itself and can be referenced from any other element.
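
Because the ref attribute applies to attributes as well as elements, the same idea can be sketched with a global attribute. In the following example, the lang attribute and the title and summary elements are hypothetical names chosen for illustration. The attribute is declared once at the top level and then referenced from two different elements.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Global attribute: declared once, directly under xsd:schema -->
  <xsd:attribute name="lang" type="xsd:language"/>

  <xsd:element name="title">
    <xsd:complexType>
      <xsd:simpleContent>
        <xsd:extension base="xsd:string">
          <!-- Reference to the global attribute, required here -->
          <xsd:attribute ref="lang" use="required"/>
        </xsd:extension>
      </xsd:simpleContent>
    </xsd:complexType>
  </xsd:element>

  <xsd:element name="summary">
    <xsd:complexType>
      <xsd:simpleContent>
        <xsd:extension base="xsd:string">
          <!-- Same global attribute, left optional for this element -->
          <xsd:attribute ref="lang"/>
        </xsd:extension>
      </xsd:simpleContent>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Each referencing element decides for itself whether the attribute is required, while the attribute's name and data type stay defined in one place.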