XML

Internal Entities

Let's begin by looking at internal entities. An entity that is going to be used in only one DTD can be an internal entity. If you intend to use the entity in multiple DTDs, it should be an external entity. In this section, you'll learn how to declare internal entities, where to insert them, and how to reference them.

Internal General Entities

Internal general entities are the simplest among the five types of entities. They are defined in the DTD section of the XML document. First let's look at how to declare an internal general entity.

Declaring an internal general entity

The syntax for the declaration of an internal general entity is shown here:

<!ENTITY name "string_of_characters">
NOTE
As you can see from the syntax line above, characters such as angle brackets (< >) and quotation marks (" ") are used specifically for marking up the XML document; they cannot be used as content directly. So to include such a character as part of your content, you must use one of XML's five predefined entities. The references for these predefined entities are &amp;, &lt;, &gt;, &quot;, and &apos;. Their replacement text is &, <, >, ", and ', respectively.
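
As a small illustration of the predefined entities, the sketch below parses an element whose content uses all five references. It assumes Python's standard xml.etree.ElementTree module as the parser, which is not part of the original text; the element name note and its content are hypothetical.

```python
import xml.etree.ElementTree as ET

# The five predefined entities need no declaration in a DTD.
source = "<note>&lt;tag&gt; &amp; &quot;quoted&quot; &apos;text&apos;</note>"

# The parser substitutes the replacement text for each reference.
note = ET.fromstring(source)
decoded = note.text
print(decoded)
```

The value of decoded contains the literal markup characters, which could not have been typed directly into the element content.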

You can create your own general entities. General entities are useful for associating names with foreign language characters, such as ü or ß, or escape characters, such as <, >, and &. You can use Unicode character values in your XML documents as replacements for any character defined in the Unicode standard. These are called character references.

To use a Unicode character in your XML document, precede the character's numeric value with &# and follow it with a semicolon (;). You can use either the character's hexadecimal value, written with a leading x, or its decimal value. For example, in Unicode, ü is represented as xFC and ß is represented as xDF. These two characters' decimal values are 252 and 223. Thus, in your DTD you could create general entities for the preceding two characters as follows:

  <!ENTITY u_um "&#xFC;">
  <!ENTITY s_sh "&#xDF;">

The two entities could also be declared like this:

  <!ENTITY u_um "&#252;">
  <!ENTITY s_sh "&#223;">
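
To check that the hexadecimal and decimal forms name the same characters, the following sketch parses both forms and compares the results. It assumes Python's standard xml.etree.ElementTree module; the element name c is a placeholder chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hexadecimal character references (note the leading x and the closing semicolon).
hex_form = ET.fromstring("<c>&#xFC;&#xDF;</c>").text

# The same two characters written as decimal character references.
dec_form = ET.fromstring("<c>&#252;&#223;</c>").text

# Both forms resolve to the same two characters.
same_characters = hex_form == dec_form
```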

Using internal general entities

To reference a general entity in the XML document, you must precede the entity with an ampersand (&) and follow it with a semicolon (;). For example, the following XML statement references the two general entities we declared in the previous section:

  <title>Gr&u_um;&s_sh;</title>

When the replacement text is inserted by the parser, it will look like this:

  <title>Grüß</title>
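
The substitution can be observed with a parser. The sketch below is one possible way to do so, assuming Python's standard xml.etree.ElementTree module; it places the two entity declarations in the document's internal DTD subset and reads back the content of the title element. The variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# A complete document: the internal DTD subset declares the two entities,
# and the title element references them.
DOCUMENT = """<!DOCTYPE title [
  <!ENTITY u_um "&#xFC;">
  <!ENTITY s_sh "&#xDF;">
]>
<title>Gr&u_um;&s_sh;</title>"""

title = ET.fromstring(DOCUMENT)

# The parser has already replaced both references with their replacement text.
title_text = title.text
print(title_text)
```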

Internal general entities can be used in three places: in the XML document as content for an element; within the DTD, in an attribute declared as #FIXED or as the default value for an attribute; and within other general entities inside the DTD. We used the first location in the preceding example (<title>Gr&u_um;&s_sh;</title>).

The second place you can use an internal general entity is within the DTD in an attribute with a #FIXED data type declaration or as the default value for an attribute. For example, you can use the following general entities in your DTD declaration to create entities for several colors:

  <!ENTITY Cy "Cyan">
  <!ENTITY Lm "Lime">
  <!ENTITY Bk "Black">
  <!ENTITY Wh "White">
  <!ENTITY Ma "Maroon">

Then if you want the value of the bgcolor attribute for tr elements to be White for all XML documents that use the DTD, you could include the following line in the previous DTD declaration:

  <!ATTLIST tr align (Left | Right | Center) 'Center'
          valign (Top | Middle | Bottom) 'Middle'
          bgcolor CDATA #FIXED "&Wh;">

The internal general entities must be defined before they can be used in an attribute default value since the DTD is read through once from beginning to end. In this case, internal general entities for several colors have been created. The bgcolor attribute is declared with the keyword #FIXED, which means that its value cannot be changed by the user; the value will always be White. The color general entities could also be used as content for the elements in the body section of the XML document.

You can use the internal general entity as a default value, for example, bgcolor CDATA "&Wh;". In this case, if no value is given, &Wh; is substituted for bgcolor when the XML attribute is needed in the document body, and that reference will be converted to White.
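
The effect of the attribute declaration can be sketched as follows. This example assumes Python's standard xml.etree.ElementTree module, whose underlying parser reports attribute defaults declared in the internal DTD subset; the root element table and the row content are hypothetical and only serve to make the document complete.

```python
import xml.etree.ElementTree as ET

# The entity Wh is declared before the ATTLIST that references it.
DOCUMENT = """<!DOCTYPE table [
  <!ENTITY Wh "White">
  <!ATTLIST tr align (Left | Right | Center) 'Center'
               valign (Top | Middle | Bottom) 'Middle'
               bgcolor CDATA #FIXED "&Wh;">
]>
<table><tr>first row</tr></table>"""

# The tr element in the body specifies no attributes of its own.
row = ET.fromstring(DOCUMENT).find("tr")

# The attributes come from the DTD defaults, with &Wh; already expanded.
row_attributes = dict(row.attrib)
print(row_attributes)
```

Note that this parser does not validate, so it supplies the declared defaults but would not reject a document that tried to override the #FIXED value.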

NOTE
You can use an internal general entity in a DTD for a #FIXED attribute, but the attribute value will be assigned in the XML document's body only when the attribute is referenced. You cannot use an internal general entity in an enumerated type attribute declaration because the general entity would have to be interpreted in the DTD, which cannot happen.

The third place you can use internal general entities is within other general entities inside the DTD. For example, we could use the preceding special character entities as follows:

  <!ENTITY u_um "&#252;">
  <!ENTITY s_sh "&#223;">
  <!ENTITY greeting "Gr&u_um;&s_sh;">

At this point, it's not clear whether greeting will be replaced with Gr&u_um;&s_sh; in the XML document's body and then converted to Grüß or whether greeting will be replaced directly with Grüß when the entity is parsed. The order of replacement will be discussed in the section "Processing Order" later in this chapter.
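
Whatever the order of replacement, the final result can be checked with a parser. The sketch below assumes Python's standard xml.etree.ElementTree module and a hypothetical title element that references greeting; it shows only the end result, not the intermediate steps.

```python
import xml.etree.ElementTree as ET

# greeting is built from the two character entities declared before it.
DOCUMENT = """<!DOCTYPE title [
  <!ENTITY u_um "&#252;">
  <!ENTITY s_sh "&#223;">
  <!ENTITY greeting "Gr&u_um;&s_sh;">
]>
<title>&greeting;</title>"""

# A single reference in the body yields the fully expanded text.
greeting_text = ET.fromstring(DOCUMENT).text
print(greeting_text)
```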

CAUTION
When you include general entities within other general entities, circular references are not allowed. For example, the following construction is not correct:

  <!ENTITY greeting "&hello;! Gr&u_um;&s_sh;">
  <!ENTITY hello "Hello &greeting;">

In this case, greeting is referencing hello, and hello is referencing greeting, making a circular reference.
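
A parser is expected to reject such a document once one of the entities is referenced. The sketch below assumes Python's standard xml.etree.ElementTree module; the helper parse_or_error and the title element are hypothetical names introduced for illustration.

```python
import xml.etree.ElementTree as ET

# greeting refers to hello, and hello refers back to greeting.
DOCUMENT = """<!DOCTYPE title [
  <!ENTITY u_um "&#252;">
  <!ENTITY s_sh "&#223;">
  <!ENTITY greeting "&hello;! Gr&u_um;&s_sh;">
  <!ENTITY hello "Hello &greeting;">
]>
<title>&greeting;</title>"""


def parse_or_error(source):
    """Return the root element's text, or a message if parsing fails."""
    try:
        return ET.fromstring(source).text
    except ET.ParseError as error:
        return f"rejected: {error}"


# Expanding &greeting; would never terminate, so the parser reports an error.
result = parse_or_error(DOCUMENT)
print(result)
```

The error is raised when the reference in the document body is expanded, so the exact message depends on the parser in use.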