Document Type Definition (DTD) Tutorial
This tutorial covers the basics of DTDs. Before reading it, you should already
be familiar with XML; you may want to read my XML tutorial first.
DTD stands for Document Type Definition.
XML lets applications share data easily. However, XML lets you make up your own
set of tags. A DTD defines the legal elements and their structure in an XML
document.
Independent developers can agree to use a common DTD for exchanging XML data.
Your application can use this agreed-upon DTD to verify the data it receives.
The DTD can also be used to verify your own data.
DTD standards are defined by the
World Wide Web Consortium (W3C). The W3C site provides a comprehensive
reference of DTDs.
However, this tutorial focuses on Microsoft's implementation of XML and DTDs.
All examples contained herein require Internet Explorer 5.0 or later.
Using a DTD in an XML Document
DTDs can be declared inline in your XML code or
they can reference an external file.
This is an example of an internal DTD. You can
open it in Explorer then select View | Source to view the complete XML
document with the DTD included.
<?xml version="1.0"?>
<!DOCTYPE message [
<!ELEMENT message (to,from,subject,text)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT subject (#PCDATA)>
<!ELEMENT text (#PCDATA)>
]>
<message>
<to>Dave</to>
<from>Susan</from>
<subject>Reminder</subject>
<text>Don't forget to buy milk on the way home.</text>
</message>
The first element declaration (line 3 of the document) defines the <message>
element as having four child elements, in this order: <to>, <from>, <subject>
and <text>.
<!ELEMENT message (to,from,subject,text)>
The next declaration, <!ELEMENT to (#PCDATA)>, defines the <to> element to be
of type #PCDATA.
Here is the same document with an external DTD
reference:
<?xml version="1.0"?>
<!DOCTYPE message SYSTEM "message.dtd">
<message>
<to>Dave</to>
<from>Susan</from>
<subject>Reminder</subject>
<text>Don't forget to buy milk on the way home.</text>
</message>
And here is the referenced DTD file. You can
open it in Explorer and select View | Source as above.
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT message (to,from,subject,text)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT subject (#PCDATA)>
<!ELEMENT text (#PCDATA)>
Before talking further about DTDs, we should review some of the basic XML
components.
-
XML elements
An XML element is made up of a start
and end tag with data in between. The tags
describe the data. The data is called the value of
the element. Elements can have a value, child elements or they can be empty
(have no value).
For example, this XML element is a <director> element with the value "Bill
Smith".
<director>Bill Smith</director>
-
Attributes
An element can optionally contain one or more attributes
in its start tag. An attribute is a name-value pair separated by an equal sign
(=). Attribute values must always be quoted.
<CITY ZIP="01085">Westfield</CITY>
ZIP="01085" is an attribute of the <CITY> element.
-
Parsed character data - PCDATA
As we said, elements have values. PCDATA is text that the XML parser examines
for markup: any tags in the text are treated as child elements, and any entity
references are expanded.
-
Character data - CDATA
CDATA is text that the parser does not examine. Tags and entity references
inside it are not expanded; the text is treated as a single item.
-
Entities
Certain characters, like "<", have special meaning in XML. If you want to
use these special characters in your data you need a way to tell the XML
parser not to interpret them as having their normal meaning.
Entities are sets of characters that can be used to represent these special
characters or other text.
Entity | Special Character
&lt; | <
&gt; | >
&amp; | &
&quot; | "
&apos; | '
To use "<" in your data, you would use the entity "&lt;" instead.
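As a small illustration of the table above, the sketch below writes the same text twice: once with predefined entities and once inside a CDATA section. The <note> and <rule> element names are hypothetical and used only for this example.

```xml
<?xml version="1.0"?>
<!-- Hypothetical <note> and <rule> elements, for illustration only. -->
<note>
  <!-- Predefined entities stand in for the special characters. -->
  <rule>price &lt; 100 &amp; quantity &gt; 0</rule>
  <!-- A CDATA section is not parsed, so no escaping is needed. -->
  <rule><![CDATA[price < 100 & quantity > 0]]></rule>
</note>
```

Both <rule> elements should end up with the same value: price < 100 & quantity > 0.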
In a DTD, elements are declared using an element declaration
with the following syntax:
<!ELEMENT element-name (element-content)>
-
Empty elements
Empty elements are declared with the EMPTY keyword in place of the
parenthesized content: <!ELEMENT element-name EMPTY>.
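Here is a minimal sketch of an empty element in use. The <urgent> element is a hypothetical flag added to a shortened message example; it is not part of the original message DTD.

```xml
<?xml version="1.0"?>
<!DOCTYPE message [
<!ELEMENT message (to,urgent)>
<!ELEMENT to (#PCDATA)>
<!-- urgent is a hypothetical flag element that carries no content -->
<!ELEMENT urgent EMPTY>
]>
<message>
<to>Dave</to>
<urgent/>
</message>
```

An empty element can be written as <urgent/> or as <urgent></urgent>, but it cannot contain text or child elements.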
-
Elements with data
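If an element has data, you must specify the type of
its data.
<!ELEMENT element-name (#PCDATA)>
<!ELEMENT element-name ANY>
Example:
<!ELEMENT message (#PCDATA)>
ANY allows any mix of text and declared elements. Note that ANY is written
without parentheses.
#PCDATA is parsed character data: text in which the parser expands entity
references. An element declared as (#PCDATA) cannot contain child elements.
CDATA is not a valid content type for an element; it is used only as an
attribute type.
To mix text with child elements, use a mixed-content declaration of the form
(#PCDATA|child-element)*. Those child elements must also be declared.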
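The following sketch shows one way an element could hold both text and child elements, using a mixed-content declaration. The <b> element is a hypothetical child introduced only for this example.

```xml
<?xml version="1.0"?>
<!DOCTYPE text [
<!-- Mixed content: text interleaved with zero or more <b> elements -->
<!ELEMENT text (#PCDATA|b)*>
<!-- The child element must be declared too -->
<!ELEMENT b (#PCDATA)>
]>
<text>Don't forget to buy <b>milk</b> on the way home.</text>
```

In a mixed-content declaration, #PCDATA must come first and the group must end with *.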
-
Elements with children - Sequences
If an element has child elements, the child elements must be listed in the
same order in which they appear in the document.
<!ELEMENT element-name (child-element,child-element,...)>
Example, using the message document:
<!ELEMENT message (to,from,subject,text)>
The child elements must also be declared. Here are their declarations:
<!ELEMENT message (to,from,subject,text)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT subject (#PCDATA)>
<!ELEMENT text (#PCDATA)>
-
DOCTYPE definition
With an internal DTD, you need a DOCTYPE definition to enclose the DTD
declarations.
<!DOCTYPE root-element [element declarations]>
Example:
<?xml version="1.0"?>
<!DOCTYPE message [
<!ELEMENT message (to,from,subject,text)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT subject (#PCDATA)>
<!ELEMENT text (#PCDATA)>
]>
<message>
<to>Dave</to>
<from>Susan</from>
<subject>Reminder</subject>
<text>Don't forget to buy milk on the way home.</text>
</message>
-
Element instances
You can specify how many times an element can occur in a document.
Symbol | Instances
none | exactly 1 time
* | 0 or more times
+ | 1 or more times
? | 0 or 1 time
For example:
<!ELEMENT message (to+,from,subject?,text)>
Here <to> must occur at least once, <subject> is optional, and <from> and
<text> must each occur exactly once.
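To see the occurrence symbols at work, here is a sketch of a document that should validate against the content model (to+,from,subject?,text). The second recipient, Ellen, is made up for the example.

```xml
<?xml version="1.0"?>
<!DOCTYPE message [
<!ELEMENT message (to+,from,subject?,text)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT subject (#PCDATA)>
<!ELEMENT text (#PCDATA)>
]>
<message>
<!-- to+ allows more than one recipient -->
<to>Dave</to>
<to>Ellen</to>
<from>Susan</from>
<!-- subject? lets us leave the subject out -->
<text>Don't forget to buy milk on the way home.</text>
</message>
```

Leaving out <to> or <from>, or repeating <from>, would make the document invalid.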
DTD Attribute Declaration
In a DTD, attributes are declared using an ATTLIST declaration.
An attribute declaration specifies the associated element, the attribute, its
type, and possibly its default value. Here are the syntax variations:
<!ATTLIST element-name attribute-name attribute-type "default-value">
<!ATTLIST element-name attribute-name attribute-type #FIXED "fixed-value">
<!ATTLIST element-name attribute-name (Val1|Val2|..) "default-value">
<!ATTLIST element-name attribute-name attribute-type #IMPLIED>
<!ATTLIST element-name attribute-name attribute-type #REQUIRED>
-
Attribute type
Attributes can have these types:
Type | Description
CDATA | Character data
ENTITY | An entity
ENTITIES | List of entities
ID | Unique ID
IDREF | ID of another element
IDREFS | List of IDs of other elements
NMTOKEN | XML name token
NMTOKENS | List of XML name tokens
NOTATION | Name of a notation
(val1|val2|...) | One value from the listed values
xml: | Predefined value
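As a sketch of a few of these types together, the example below uses ID, IDREF and NMTOKEN attributes. The <staff> document and its attribute names (id, manager, dept) are hypothetical.

```xml
<?xml version="1.0"?>
<!DOCTYPE staff [
<!ELEMENT staff (person+)>
<!ELEMENT person (#PCDATA)>
<!-- id values must be unique within the document -->
<!ATTLIST person id ID #REQUIRED>
<!-- manager must match the id of some element in the document -->
<!ATTLIST person manager IDREF #IMPLIED>
<!-- dept must be a single name token (no spaces) -->
<!ATTLIST person dept NMTOKEN #IMPLIED>
]>
<staff>
<person id="p1" dept="sales">Susan</person>
<person id="p2" manager="p1" dept="sales">Dave</person>
</staff>
```

A validating parser should reject the document if two elements share the same id, or if manager names an id that does not exist.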
-
Default values
Attribute default values can be:
Default Value | Description
"value" | If no value exists in the XML data, the value specified in the DTD will be used
#FIXED "value" | The attribute always has this value; if another value exists in the XML data, an error will occur
#IMPLIED | The value doesn't have to be supplied in the XML data
#REQUIRED | If the XML data doesn't have a value, an error will occur
-
DTD Examples
Here are five sample DTD statements (rules 1 to 5):
<!ATTLIST person gender CDATA "male">
<!ATTLIST person gender CDATA #FIXED "male">
<!ATTLIST person gender CDATA #REQUIRED>
<!ATTLIST person gender CDATA #IMPLIED>
<!ATTLIST person gender (male|female) "male">
Here is an XML statement that will satisfy all of the above DTD statements.
<person gender="male">
This XML statement does not satisfy DTD rule 2, which requires a value of "male".
<person gender="female">
This XML statement fails DTD rule 5 (as well as rule 2) because "unknown" is
not an acceptable value.
<person gender="unknown">
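To put an attribute default in context, here is a sketch of a complete document using the enumerated declaration. The <people> root element and the element contents are assumptions made for the example.

```xml
<?xml version="1.0"?>
<!DOCTYPE people [
<!ELEMENT people (person+)>
<!ELEMENT person (#PCDATA)>
<!ATTLIST person gender (male|female) "male">
]>
<people>
<!-- gender is omitted, so the parser supplies the default "male" -->
<person>Dave</person>
<person gender="female">Susan</person>
</people>
```

An application reading this document through a parser that processes the DTD should see gender="male" on the first <person> even though it is not written in the data.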
Recall that entities are variables that represent other values. The value of
the entity is substituted for the entity reference when the XML document is
parsed.
Entities can be defined internally or
externally to your DTD.
Internal declaration:
<!ENTITY entity-name "entity-value">
Example:
<!ENTITY website "http://www.TheScarms.com">
External declaration:
<!ENTITY entity-name SYSTEM "entity-URL">
Example:
<!ENTITY website SYSTEM "http://www.TheScarms.com/entity.xml">
The internal entity declaration above makes this line of XML valid. Note that
an entity reference starts with "&" and ends with ";".
XML line:
<url>&website;</url>
Evaluates to:
<url>http://www.TheScarms.com</url>
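Putting the pieces together, here is a sketch of a complete document that declares the website entity in an internal DTD and then uses it. Treating <url> as the root element is an assumption made to keep the example short.

```xml
<?xml version="1.0"?>
<!DOCTYPE url [
<!ELEMENT url (#PCDATA)>
<!-- Internal entity: the quoted text replaces every &website; reference -->
<!ENTITY website "http://www.TheScarms.com">
]>
<url>&website;</url>
```

The entity must be declared before it is referenced, and the reference must end with a semicolon.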