Document Object Model (DOM) Tutorial

This tutorial covers the basics of the XML DOM. Before reading it you should already be familiar with XML and DTDs; you may want to read my XML and DTD tutorials first.

What is a DOM?

DOM stands for Document Object Model.

The DOM is an interface that exposes an XML document as a tree structure made up of nodes. The DOM lets you programmatically navigate the tree and add, change, and delete any of its elements.

The DOM programming interface standards are defined by the World Wide Web Consortium (W3C). The W3C site provides a comprehensive reference of the XML DOM.

However, the discussion here focuses on Microsoft's implementation of XML and the XML DOM. All samples therefore require Microsoft Internet Explorer 5.0 or later, which includes the Msxml parser. All references to the Msxml parser, in text or in sample code, assume Msxml V2.5 or later. For more information, or to download Microsoft's XML products, visit their site.

Overview

To manipulate an XML document you first load it into your computer's memory using an XML parser. As stated above, the parser discussed here is the Msxml parser from Microsoft. Once the XML document is loaded, its data can be manipulated using a DOM.

A DOM treats the XML document as a tree. The DocumentElement is the top or root of the tree. This root element can have one or more child nodes which represent the branches of the tree.

The four main objects exposed by a DOM are the DOMDocument, XMLDOMNode, XMLDOMNodeList and XMLDOMNamedNodeMap. Each is discussed in its own section below.

For most XML documents, the most common types of nodes are element, attribute, and text. Attributes differ from the other node types because they are not considered child nodes of a parent. A separate programming interface, the XMLDOMNamedNodeMap, is used for attributes.

DOMDocument


Creating a DOM
Using the XML DOM begins when you create a DOMDocument object. You can then load, parse, navigate, and manipulate XML files. The VB code to create a DOMDocument is:

 

Dim xmlDoc As New DOMDocument
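
The declaration above needs a project reference to the Microsoft XML library. If you would rather not add one, late binding is an alternative. This is only a sketch: it assumes the legacy ProgID "Microsoft.XMLDOM" is registered on the machine, and the Sub name and the inline XML are made up for illustration.

```vb
' Late-bound creation: no project reference, but no compile-time type checking.
Sub CreateDomLateBound()
    Dim xmlDoc As Object

    ' ProgID assumed to be registered by the installed Msxml parser.
    Set xmlDoc = CreateObject("Microsoft.XMLDOM")

    xmlDoc.async = False
    xmlDoc.loadXML "<customer><first_name>Joe</first_name></customer>"

    ' Display the name of the root element.
    MsgBox xmlDoc.documentElement.nodeName
End Sub
```

Early binding (Dim ... As New DOMDocument) is usually nicer to work with because the compiler can check property and method names for you.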


Loading and Saving Data
Use the load or loadXML method to load XML into the DOM. The load method takes a path or URL to an XML file. The loadXML method takes a string containing the XML data. After loading, xmlDoc contains a tree consisting of the parsed contents of reports.xml.

 

xmlDoc.load("http://xmlfiles/reports.xml")
xmlDoc.load("c:\temp\reports.xml")

xmlDoc.loadXML("<customer><first_name>Joe</first_name>" & _
    "<last_name>Smith</last_name></customer>")

To save a parsed XML document to a file, use the save method. It can take a file name as a string.

 

xmlDoc.save("c:\temp\reports.xml")


Load and Parse Flags
File loading and parsing are done asynchronously by default. This means your app is free to do other work while the file is being loaded. Also by default, any well-formed XML document can be loaded.

You can change this behavior by setting a few properties. Here, the XML is loaded synchronously and validated against a DTD. Also, any external references in the DTD are resolved.

 

xmlDoc.async = False
xmlDoc.validateOnParse = True
xmlDoc.resolveExternals = True
xmlDoc.load("reports.xml")
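
With a synchronous load you can test the outcome right away, because the load method returns True on success and False on failure. Here is a minimal sketch of that check. The Sub name is hypothetical, and reports.xml is assumed to sit in the current directory.

```vb
Sub LoadWithValidation()
    Dim xmlDoc As New DOMDocument

    xmlDoc.async = False             ' Block until loading finishes.
    xmlDoc.validateOnParse = True    ' Validate against the DTD.
    xmlDoc.resolveExternals = True   ' Resolve external references in the DTD.

    ' load returns True on success and False on failure.
    If xmlDoc.load("reports.xml") Then
        MsgBox "Loaded root element: " & xmlDoc.documentElement.nodeName
    Else
        MsgBox "Load failed: " & xmlDoc.parseError.reason
    End If
End Sub
```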


Accessing Document and Error Information
You can retrieve the DTD used by the XML document, the path or URL of the file that was loaded and a string containing the entire contents of the XML document.

You can also get detailed information on errors that occurred during parsing.

 

Dim myDocType As IXMLDOMDocumentType

xmlDoc.async = False                 'Wait for the load to finish.
xmlDoc.load("reports.xml")

If xmlDoc.parseError.errorCode <> 0 Then
    MsgBox ("A parse error occurred.")
Else
    Set myDocType = xmlDoc.doctype   'Nothing if the document has no DTD.
    If Not myDocType Is Nothing Then
        MsgBox (myDocType.name)      'Display the DTD used.
    End If
    MsgBox (xmlDoc.url)              'Path or url of the XML file.
    MsgBox xmlDoc.documentElement.xml    'Display the actual XML data.
End If

Here is some more error information that is available:

Error Property   Description
errorCode        Error code of the last parse error
filepos          Absolute file position of the error
line             Line the error occurred on
linepos          Position within the line
reason           Error description
srcText          Line of XML that contains the error
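
To show how the parseError properties fit together, here is a small sketch that builds one message from them. DescribeParseError and ShowParseError are hypothetical names of my own, and the XML string is deliberately malformed just to trigger an error.

```vb
' Hypothetical helper: summarizes the last parse error of a document.
Function DescribeParseError(ByVal doc As DOMDocument) As String
    With doc.parseError
        If .errorCode = 0 Then
            DescribeParseError = "No error."
        Else
            DescribeParseError = "Error " & .errorCode & ": " & .reason & vbCrLf & _
                "Line " & .line & ", position " & .linepos & vbCrLf & _
                "Source: " & .srcText
        End If
    End With
End Function

Sub ShowParseError()
    Dim xmlDoc As New DOMDocument

    xmlDoc.async = False
    ' The first_name tag is never closed, so parsing fails on purpose.
    xmlDoc.loadXML "<customer><first_name>Joe</customer>"
    MsgBox DescribeParseError(xmlDoc)
End Sub
```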


Accessing the DOM Tree
You can access the tree starting at the root and walking down the tree or by querying for a specific node. You navigate to the root element using the documentElement property which returns the root element as an XMLDOMNode object.

 

Dim xmlDoc As New DOMDocument
Dim root As IXMLDOMElement
Dim child As IXMLDOMNode

xmlDoc.async = False
xmlDoc.load("reports.xml")

'Set root to the document's root element.
Set root = xmlDoc.documentElement

'Walk from the root to each of its child nodes.
For Each child In root.childNodes
    MsgBox child.text
Next

To navigate to a specific node in the tree use the getElementsByTagName method. This method takes a string containing a specific tag name and returns all element nodes with this tag name.

 

Dim ElemList As IXMLDOMNodeList
Dim xmlDoc As New DOMDocument
Dim i As Long

xmlDoc.async = False
xmlDoc.load("reports.xml")
Set ElemList = xmlDoc.getElementsByTagName("AUTHOR")
For i = 0 To (ElemList.length - 1)
    MsgBox ElemList.item(i).xml
Next


Creating Nodes
The DOMDocument object provides a generic createNode method that lets you create nodes by supplying a node type, name, and namespaceURI. I say generic because it also provides individual methods to create most of the following specific node types.

 

Node Type                     Value   Description
Node_Element                  1       Node is an element
Node_Attribute                2       Node is an attribute of an element
Node_Text                     3       Node represents the text content of a tag
Node_Cdata_Section            4       A CDATA section in the XML source. CDATA sections escape text that would otherwise be interpreted as markup
Node_Entity_Reference         5       A reference to an entity in the XML document
Node_Entity                   6       Node represents an expanded entity
Node_Processing_Instruction   7       A processing instruction from the XML document
Node_Comment                  8       Node represents a comment in the XML document
Node_Document                 9       Represents a document object, which, as the root of the document tree, provides access to the entire XML document
Node_Document_Type            10      Represents the document type declaration (DTD), indicated by the <!DOCTYPE> tag
Node_Document_Fragment        11      A document fragment node associates a node or subtree with a document without actually being contained within the document
Node_Notation                 12      Represents a notation in the document type declaration (DTD)

Here's an example of creating an attribute node:

 

Dim xmlDoc As New DomDocument
Dim MyNode As IXMLDOMNode

xmlDoc.load("C:\books.xml")
Set MyNode = xmlDoc.createNode(2, "XML", "")
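
The individual create methods mentioned above are often easier to read than createNode with a numeric type. Here is a rough sketch that builds a small document from scratch with createElement, createTextNode and createComment. The Sub name and the element names are made up for illustration.

```vb
Sub BuildCustomer()
    Dim xmlDoc As New DOMDocument
    Dim root As IXMLDOMElement
    Dim nameElem As IXMLDOMElement

    ' Create the root element and attach it to the empty document.
    Set root = xmlDoc.createElement("customer")
    xmlDoc.appendChild root

    ' Create a child element with a text node inside it.
    Set nameElem = xmlDoc.createElement("first_name")
    nameElem.appendChild xmlDoc.createTextNode("Joe")
    root.appendChild nameElem

    ' Add a comment node as the root's last child.
    root.appendChild xmlDoc.createComment("created in code")

    ' Display the XML of the whole document.
    MsgBox xmlDoc.xml
End Sub
```

Note that a newly created node is not part of the tree until you attach it with a method such as appendChild.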

 

XMLDOMNode

The XMLDOMNode object is the main object within a DOM. The DOMDocument object is itself an XMLDOMNode. So are the members of node lists and named node maps which are discussed later.


Accessing Node Information
The XMLDOMNode object has several properties which provide info about a node. Here are the simpler ones:

Node Property    Description
hasChildNodes    True if this node has children.
namespaceURI     Returns the URI (universal resource identifier) for the namespace (the "uuu" portion of the namespace declaration xmlns:nnn="uuu").
parsed           True if the node and all descendants have been parsed and instantiated. During asynchronous access, not all of the document tree may be available. Before performing XSL transformations or pattern-matching operations, it is useful to know if the entire tree below this node is available for processing.
xml              Returns a string containing the XML representation of the node and all its descendants.
nodeName         Returns the qualified name for the element, attribute, or entity reference. Ex: returns xxx:yyy for the element <xxx:yyy>. The return value depends on the nodeType.
nodeType         Returns an integer representing the XML DOM node type.
nodeTypeString   Returns a string representing the XML DOM node type.
specified        Returns True if the attribute is explicitly specified in the element. Returns False if the attribute value comes from the DTD or schema. Returns True on non-attribute nodes.

This example illustrates a few of the above properties. It takes the first child of the root element, displays its namespace URI, reports whether it has been parsed, and displays the number of its child nodes.

 

Dim xmlDoc As New DOMDocument
Dim currNode As IXMLDOMNode
Dim strXML As String

xmlDoc.async = False
xmlDoc.load("c:\books.xml")
Set currNode = xmlDoc.documentElement.firstChild

strXML = currNode.xml

MsgBox currNode.namespaceURI

If currNode.parsed Then
  MsgBox ("node was parsed")
End If

If currNode.hasChildNodes Then
  MsgBox currNode.childNodes.length
Else
  MsgBox ("no child nodes")
End If


Setting Node Information
The data in an XML file is exposed in the DOM as node values. Node values might be the value of an attribute or the text within an XML element.

The nodeValue property provides access to values of attributes, text nodes, comments, processing instructions, and CDATA section nodes.

To get the value of an element type node, you can navigate to its element's children (the text nodes within) and call nodeValue on them or use the text property.

This code sets the value of an attribute and an element.

 

Set newAttNode = xmlDoc.createAttribute("newAtt")
newAttNode.nodeValue = "hello world"

If (elem1.text = "hello world") Then
  elem1.text = "hi! world"
End If
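
The difference between nodeValue and text on an element is easy to trip over, so here is a small sketch of it. The Sub name and the inline XML are assumptions of mine, not part of the samples above.

```vb
Sub ReadAndSetValues()
    Dim xmlDoc As New DOMDocument
    Dim elem As IXMLDOMElement

    xmlDoc.async = False
    xmlDoc.loadXML "<customer id=""42""><first_name>Joe</first_name></customer>"
    Set elem = xmlDoc.documentElement.firstChild     ' The first_name element.

    ' nodeValue of an element node is Null; the data lives in its text child.
    MsgBox IsNull(elem.nodeValue)
    MsgBox elem.firstChild.nodeValue                 ' Value of the text node.
    MsgBox elem.text                                 ' Same data via the text property.

    ' Change the element's text and an attribute's value.
    elem.text = "Joseph"
    xmlDoc.documentElement.attributes.getNamedItem("id").nodeValue = "43"
    MsgBox xmlDoc.xml
End Sub
```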


Navigating Through Nodes
From an XMLDOMNode object, you can navigate to its parent (parentNode), its children (childNodes, firstChild, lastChild), its siblings (previousSibling, nextSibling), or the document object the node belongs to (ownerDocument).

If the node type is element, attribute, or entityReference, you can call the definition property to navigate to the schema definition of the node.

If the node type is element, processingInstruction, documentType, entity, or notation, you can navigate to the attributes on the node using the attributes property.

These properties return the indicated node, or null (Nothing in VB) if the node doesn't exist.

This example illustrates how to navigate the DOM tree.

 

Dim xmlDoc As New DOMDocument
Dim currNode As IXMLDOMNode
Dim newNode As IXMLDOMNode
Dim rootNode As IXMLDOMNode
Dim oNodeList As IXMLDOMNodeList

xmlDoc.async = False
xmlDoc.load("c:\books.xml")
Set rootNode = xmlDoc.documentElement
'
' Get the parent of a node and display the parent's XML.
'
Set currNode = xmlDoc.documentElement.childNodes.item(1).childNodes.item(0)
Set newNode = currNode.parentNode
MsgBox newNode.xml
'
' Display the XML for the root node's first child.
'
Set currNode = xmlDoc.documentElement.firstChild
MsgBox currNode.xml
'
' Create a new element and insert it before the last child of the top-level node.
'
Set newNode = xmlDoc.createNode (1, "VIDEOS", "")
Set currNode = rootNode.insertBefore(newNode, rootNode.lastChild)
'
' Get a list of the root's children and display the XML for each child.
'
Set oNodeList = rootNode.childNodes
For Each currNode in oNodeList
  MsgBox currNode.xml
Next
'
' Get a node, get its previous sibling, display the sibling's XML.
'
Set currNode = xmlDoc.documentElement.childNodes.item(1)
Set newNode = currNode.previousSibling
MsgBox newNode.xml

You can also navigate to other nodes in the tree using the selectNodes and selectSingleNode methods. These methods take an XSL Pattern as an argument and return the node or nodes that match that query. For more information about XSL Patterns, see my XSL Tutorial.
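
As a quick illustration of the two query methods, here is a sketch that uses simple path patterns. The Sub name and the inline BOOKS document are assumptions for the example; the real books.xml may be structured differently.

```vb
Sub QueryNodes()
    Dim xmlDoc As New DOMDocument
    Dim node As IXMLDOMNode
    Dim matches As IXMLDOMNodeList

    xmlDoc.async = False
    xmlDoc.loadXML "<BOOKS><BOOK><AUTHOR>Smith</AUTHOR></BOOK>" & _
                   "<BOOK><AUTHOR>Jones</AUTHOR></BOOK></BOOKS>"

    ' selectSingleNode returns the first match, or Nothing if none is found.
    Set node = xmlDoc.selectSingleNode("BOOKS/BOOK/AUTHOR")
    If Not node Is Nothing Then MsgBox node.text

    ' selectNodes returns every match as a node list.
    Set matches = xmlDoc.selectNodes("//AUTHOR")
    For Each node In matches
        MsgBox node.text
    Next
End Sub
```

Always test the result of selectSingleNode against Nothing before using it, since a pattern that matches no node is not an error.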


Manipulating the Children of a Node
There are four methods that let you manipulate the children of a node. Each one takes a node object as an argument. They are: appendChild, replaceChild, removeChild and insertBefore.

 

Dim xmlDoc As New DOMDocument
Dim refNode As IXMLDOMNode
Dim newNode As IXMLDOMNode
Dim root As IXMLDOMNode

xmlDoc.async = False
xmlDoc.load("c:\books.xml")
Set root = xmlDoc.documentElement
'
' Create a new "PAGES" node. Insert it before the first child of the root's second child. Display the XML.
'
Set newNode = xmlDoc.createElement("PAGES")
Set refNode = root.childNodes.item(1).firstChild
root.childNodes.item(1).insertBefore newNode, refNode
MsgBox root.childNodes.item(1).xml
'
' Remove a child node.
'
Set refNode = root.childNodes.item(1).firstChild
root.childNodes.item(1).removeChild refNode
MsgBox root.childNodes.item(1).xml
'
' Replace the specified child with the new "pages" node.
'
Set newNode = xmlDoc.createElement("PAGES")
root.childNodes.item(1).replaceChild newNode, root.childNodes.item(1).childNodes.item(0)
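
The example above doesn't use appendChild, so here is a short sketch of it. It also shows that removeChild hands back the node it removed, which lets you move a node by removing and re-appending it. The Sub name and the inline XML are made up for illustration.

```vb
Sub AppendAndRemove()
    Dim xmlDoc As New DOMDocument
    Dim root As IXMLDOMNode
    Dim newNode As IXMLDOMNode
    Dim removed As IXMLDOMNode

    xmlDoc.async = False
    xmlDoc.loadXML "<BOOKS><BOOK/></BOOKS>"
    Set root = xmlDoc.documentElement

    ' appendChild adds the node after the current last child.
    Set newNode = xmlDoc.createElement("VIDEO")
    root.appendChild newNode
    MsgBox root.xml

    ' removeChild returns the node it removed, so it can be reused.
    Set removed = root.removeChild(root.firstChild)
    root.appendChild removed          ' Move the former first child to the end.
    MsgBox root.xml
End Sub
```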



XMLDOMNodeList

The XMLDOMNodeList object is a collection of nodes. It is returned by the childNodes property and by the selectNodes and getElementsByTagName methods.

You can iterate sequentially through the nodes in the list with For Each, as shown in the earlier examples, or by using the nextNode method shown below. The length property indicates the number of nodes in the list.

 

Dim xmlDoc As New DOMDocument
Dim currNode As IXMLDOMNode
Dim oNodeList As IXMLDOMNodeList

xmlDoc.async = False
xmlDoc.load("c:\books.xml")
'
' Get a list of the nodes and display their text.
'
Set oNodeList = xmlDoc.getElementsByTagName("AUTHOR")
For i = 0 TO (oNodeList.length -1)
  Set currNode = oNodeList.nextNode
  MsgBox currNode.text
Next

To access nodes randomly, use the item method. This allows you to navigate directly to a specific node. The first node has an index of zero.

 

Dim xmlDoc As New DOMDocument
Dim oNodeList As IXMLDOMNodeList

xmlDoc.async = False
xmlDoc.load("c:\books.xml")
'
' Get a list of the nodes and display their text.
'
Set oNodeList = xmlDoc.getElementsByTagName("AUTHOR")
For i = 0 TO (oNodeList.length -1)
  MsgBox oNodeList(i).text
Next



XMLDOMNamedNodeMap

An XMLDOMNamedNodeMap object is returned by the attributes property. The XMLDOMNamedNodeMap object differs from the node list because it is a collection of nodes that can also be accessed by name.

Just like a node list, a named node map has a length property and can be accessed using its item method. It also exposes the nextNode method.

However, you can also access the members of a named node map by name using getNamedItem and getQualifiedItem. The getNamedItem method takes the name of the desired node as a parameter; the getQualifiedItem method takes the name and namespaceURI of the desired node. Each method returns a node object.

This code gets the value of the ID attribute on the elem1 element and assigns that value to the variable "idValue".

 

idValue = elem1.attributes.getNamedItem("ID").nodeValue
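
One thing to watch for: getNamedItem returns Nothing when the attribute isn't there, so the one-liner above fails on an element without an ID. Here is a sketch that walks all the attributes and then tests for a missing one. The Sub name, the inline XML, and the attribute names are assumptions for the example.

```vb
Sub ListAttributes()
    Dim xmlDoc As New DOMDocument
    Dim attr As IXMLDOMNode
    Dim idNode As IXMLDOMNode

    xmlDoc.async = False
    xmlDoc.loadXML "<customer ID=""42"" region=""east""/>"

    ' Walk every attribute in the named node map.
    For Each attr In xmlDoc.documentElement.attributes
        MsgBox attr.nodeName & " = " & attr.nodeValue
    Next

    ' getNamedItem returns Nothing when the attribute is absent, so test first.
    Set idNode = xmlDoc.documentElement.attributes.getNamedItem("missing")
    If idNode Is Nothing Then MsgBox "no such attribute"
End Sub
```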


Manipulating a Named Node Map
These methods allow you to manipulate named node maps: setNamedItem, removeNamedItem and removeQualifiedItem.

The setNamedItem method takes an XML node object as a parameter, adding that node to the named node map. If an attribute already exists with the same name, the old attribute is replaced. This example creates a new attribute node with the name "ID" and adds it to the attributes of elem1:

 

Set idAtt = xmlDoc.createAttribute("ID")
elem1.attributes.setNamedItem idAtt

The removeNamedItem method takes a node name as a parameter, removing the node with that name. The removeQualifiedItem method takes a node name and namespaceURI as its parameters, removing the corresponding attribute.
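
Here is a rough sketch that puts setNamedItem and removeNamedItem together: it replaces an existing ID attribute and then removes another attribute by name. The Sub name, the inline XML, and the region attribute are made up for illustration.

```vb
Sub ReplaceAndRemoveAttributes()
    Dim xmlDoc As New DOMDocument
    Dim attrs As IXMLDOMNamedNodeMap
    Dim idAtt As IXMLDOMAttribute

    xmlDoc.async = False
    xmlDoc.loadXML "<customer ID=""42"" region=""east""/>"
    Set attrs = xmlDoc.documentElement.attributes

    ' setNamedItem replaces the existing ID attribute with the new one.
    Set idAtt = xmlDoc.createAttribute("ID")
    idAtt.nodeValue = "43"
    attrs.setNamedItem idAtt

    ' removeNamedItem removes the attribute with the given name.
    attrs.removeNamedItem "region"

    MsgBox xmlDoc.xml
End Sub
```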

 

 





If you use this code, please mention "www.TheScarms.com"



© Copyright 2024 TheScarms