Creating Simple Data Types

New simple data types can be created by using simpleType elements. A simplified version of the DTD declarations for the simpleType element is shown below. (For the complete declarations, see the schema specification at http://www.w3.org/TR/xmlschema-2/.)

  <!ENTITY % ordered ' (minInclusive | minExclusive) | (maxInclusive |
      maxExclusive) | precision | scale '>
  <!ENTITY % unordered 'pattern | enumeration | length | maxLength |
      minLength | encoding | period'>
  <!ENTITY % facet '%ordered; | %unordered;'>
  <!ELEMENT simpleType ((annotation)?, (%facet;)*)>
  <!ATTLIST simpleType
      name     NMTOKEN        #IMPLIED
      base      CDATA         #REQUIRED
      final     CDATA           ''
      abstract (true | false) 'false'
      derivedBy (list | restriction | reproduction) 'restriction'>
  <!ELEMENT annotation (documentation)>
  <!ENTITY % facetAttr 'value CDATA #REQUIRED'>
  <!ENTITY % facetModel '(annotation)?'>
  <!ELEMENT maxExclusive %facetModel;>
  <!ATTLIST maxExclusive %facetAttr;>
  <!ELEMENT minExclusive %facetModel;>
  <!ATTLIST minExclusive %facetAttr;>
  <!ELEMENT maxInclusive %facetModel;>
  <!ATTLIST maxInclusive %facetAttr;>
  <!ELEMENT minInclusive %facetModel;>
  <!ATTLIST minInclusive %facetAttr;>
  <!ELEMENT precision %facetModel;>
  <!ATTLIST precision %facetAttr;>
  <!ELEMENT scale %facetModel;>
  <!ATTLIST scale %facetAttr;>
  <!ELEMENT length %facetModel;>
  <!ATTLIST length %facetAttr;>
  <!ELEMENT minLength %facetModel;>
  <!ATTLIST minLength %facetAttr;>
  <!ELEMENT maxLength %facetModel;>
  <!ATTLIST maxLength %facetAttr;>
  <!-- This one can be repeated. -->
  <!ELEMENT enumeration %facetModel;>
  <!ATTLIST enumeration %facetAttr;>
  <!ELEMENT pattern %facetModel;>
  <!ATTLIST pattern %facetAttr;>
  <!ELEMENT encoding %facetModel;>
  <!ATTLIST encoding %facetAttr;>
  <!ELEMENT period %facetModel;>
  <!ATTLIST period %facetAttr;>
  <!ELEMENT documentation ANY>
  <!ATTLIST documentation
            source   CDATA #IMPLIED
            xml:lang CDATA #IMPLIED>

As you can see, a simple data type, which the simpleType element represents, can be either ordered or unordered. The values of an ordered type can be placed in a specific sequence. Nonnegative integers are ordered; that is, you can start at 0 and continue to the maximum integer value. Unordered data types have no such sequence; Boolean is one example. Using the preceding DTD, you can create your own simple data types. These simple data types can then be used in your schemas to define elements and attributes.

Unordered data types include the Boolean and binary data types. All of the numeric data types are ordered. Strings are also ordered, but when you define your own string data types, you will define them with the unordered facets.

For each data type, numerous possible child elements can be used to define the simpleType element. Each child element has a value attribute that holds the value for that child element and can contain an optional annotation. The child elements define facets for the data types you create.
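As a minimal sketch of that shape, following the simplified DTD above, the declaration below gives one facet an annotation and leaves the other empty. The type name percentage and the documentation text are hypothetical, chosen only for illustration:

  <simpleType name="percentage" base="integer">
      <minInclusive value="0">
          <annotation>
              <documentation>Lowest allowed value</documentation>
          </annotation>
      </minInclusive>
      <maxInclusive value="100"/>
  </simpleType>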

Let's look now at how to create simple data types using ordered and unordered facets.

Using ordered facets

In the DTD shown earlier, the ordered facets are maxExclusive, minExclusive, maxInclusive, minInclusive, precision, and scale. The value of maxExclusive is the smallest value outside the upper bound of the data type's value space, and the value of minExclusive is the largest value outside the lower bound. Thus, if you wanted to have an integer data type with a range of 100 to 1000, the value of minExclusive would be 99 and the value of maxExclusive would be 1001. The simple data type could be declared as follows:

  <simpleType name="limitedInteger" base="integer">
      <minExclusive value="99"/>
      <maxExclusive value="1001"/>
  </simpleType>

The minInclusive and maxInclusive facets work in the same way as minExclusive and maxExclusive, except that the minInclusive value is the lower bound of the value space for a data type, and the maxInclusive is the upper bound of the value space for a data type. Our simple data type could be rewritten as follows:

  <simpleType name="limitedInteger" base="integer">
      <minInclusive value="100"/>
      <maxInclusive value="1000"/>
  </simpleType>

Precision is the number of digits that will be used to represent a number. The scale, which must always be less than the precision, represents the number of digits that will appear to the right of the decimal point. For example, a data type that goes up to and includes 1,000,000 and that has two digits to the right of the decimal point (1,000,000.00) has a precision of 9 (ignoring the commas and the decimal point) and a scale of 2. The declaration would look as follows:

  <simpleType name="TotalSales" base="decimal">
     <minInclusive value="0"/>
     <maxInclusive value="1000000"/>
     <precision value="9"/>
     <scale value="2"/>
  </simpleType>

If you had left out the maxInclusive facet, values up to 9,999,999.99 would have been valid. If you had needed only values less than 1,000,000, the following declaration would have been sufficient:

  <simpleType name="TotalSales" base="decimal">
     <precision value="8"/>
     <scale value="2"/>
  </simpleType>
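Once declared, a simple data type is referenced by name when you define elements. The following one-line sketch assumes the element declaration syntax of the same schema draft; the element name regionalSales is hypothetical:

  <element name="regionalSales" type="TotalSales"/>

Under that assumption, the content of an instance such as <regionalSales>125000.50</regionalSales> would be checked against the precision and scale facets of TotalSales.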

Now that you have learned how to use ordered facets to create simple data types, let's look at how to use unordered facets to create simple data types.

Using unordered facets

In the DTD shown earlier, you can see that the unordered facets are period, length, maxLength, minLength, pattern, enumeration, and encoding.

For time data types, you can use the period facet to define the frequency of recurrence of the data type. The value of the period facet is a timeDuration. For example, if you wanted to create a special holiday data type that includes recognized U.S. holidays, you could use the following declaration, which lists the recurring dates with the enumeration facet:

  <simpleType name="holidays" base="date">
     <annotation><documentation>Some U.S. holidays</documentation></annotation>
     <enumeration value='--01-01'>
        <annotation><documentation>New Year's Day</documentation></annotation>
     </enumeration>
     <enumeration value='--07-04'>
        <annotation><documentation>Fourth of July</documentation></annotation>
     </enumeration>
     <enumeration value='--12-25'/>
  </simpleType>

When you use the length facet, every value of the data type must have exactly that length. Using length, you can create fixed-length strings. The maxLength facet represents the maximum length a value of the data type can have. The minLength facet represents the minimum length a value can have. Using minLength and maxLength together, you can define a variable-length string that can be as short as minLength and as long as maxLength.
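A minimal sketch of these three facets, written in the same declaration style as the earlier examples. The type names stateCode and userName and the chosen lengths are hypothetical:

  <!-- Fixed length: every value has exactly two characters -->
  <simpleType name="stateCode" base="string">
      <length value="2"/>
  </simpleType>

  <!-- Variable length: between 3 and 20 characters -->
  <simpleType name="userName" base="string">
      <minLength value="3"/>
      <maxLength value="20"/>
  </simpleType>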

The pattern facet constrains the value space of the data type by constraining its lexical space (the set of valid literal representations); its value is a regular expression. The enumeration facet limits the value space to a specified set of values. The encoding facet is used for binary types, which can be encoded as either hex or base64. In addition to containing facets, simple data types also have a set of attributes that can be used to define the data type. Let's now take a look at these attributes.
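Before turning to those attributes, here is a minimal sketch of the pattern, enumeration, and encoding facets in the same declaration style. The type names, the regular expression, and the enumerated values are hypothetical, and base="binary" assumes the built-in binary type of the same schema draft:

  <!-- Two uppercase letters, a hyphen, then four digits -->
  <simpleType name="partNumber" base="string">
      <pattern value="[A-Z]{2}-\d{4}"/>
  </simpleType>

  <!-- Only the three listed values are allowed -->
  <simpleType name="shirtSize" base="string">
      <enumeration value="small"/>
      <enumeration value="medium"/>
      <enumeration value="large"/>
  </simpleType>

  <!-- Binary content encoded as base64 -->
  <simpleType name="thumbnail" base="binary">
      <encoding value="base64"/>
  </simpleType>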