Working with the DOM
The DOM allows applications to work with XML document structures and information as program structures rather than text streams. Applications and scripts can read and manipulate these structures without knowing the details of XML syntax, taking advantage of the facilities built into the DOM API of MSXML.
The DOM uses two key abstractions: a tree-like hierarchy and nodes that represent document content and structures. The hierarchy is composed of these nodes, which may contain or be contained by other nodes. For developers, this means that much of the work of XML processing requires navigating this tree structure to find or modify the information it contains. Working with XML requires thinking of information in terms of nested containers, and making sure that information is put into or retrieved from the right container.
The DOM treats nodes as generic objects, making it possible to create a script that loads a document and then traverses all of the nodes, reporting what it finds in the tree.
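The generic traversal described above can be sketched without MSXML at all. The example below is a minimal illustration that uses plain JavaScript objects as a stand-in for parsed nodes; `makeNode`, `walk`, `doc`, and `report` are hypothetical names, and storing text directly on a node is a simplification (a real DOM keeps text in child text nodes).

```javascript
// Hypothetical stand-in for a parsed document: each node has a name,
// an optional text value, and an array of child nodes.
function makeNode(name, text, children) {
  return { nodeName: name, text: text || "", childNodes: children || [] };
}

var doc = makeNode("#document", "", [
  makeNode("catalog", "", [
    makeNode("book", "", [
      makeNode("title", "XML Basics"),
      makeNode("price", "29.95")
    ])
  ])
]);

// Depth-first walk: record the node itself, then visit each child in order.
// Loop counters are declared with var so recursive calls cannot overwrite them.
function walk(node, depth, lines) {
  var pad = "";
  for (var i = 0; i < depth; i++) {
    pad += "  ";
  }
  lines.push(pad + node.nodeName + (node.text ? ": " + node.text : ""));
  for (var c = 0; c < node.childNodes.length; c++) {
    walk(node.childNodes[c], depth + 1, lines);
  }
  return lines;
}

// One line per node, indented two spaces per level of nesting.
var report = walk(doc, 0, []).join("\n");
```

Because every node exposes the same small set of members, the walker needs no knowledge of what the document means; it only follows the containment structure.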
The following are exposed by the XML DOM.
The DOM programming interfaces enable applications to traverse the tree and manipulate its nodes. Each node is defined as a specific node type, according to the XML DOM enumerated constants, which also define valid parent and child nodes for each node type. For most XML documents, the most common node types are element, attribute, and text. Attributes occupy a special place in the model because they are not considered child nodes of a parent, and are treated more like properties of elements. An additional programming interface, IXMLDOMNamedNodeMap, is provided for attributes.
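To make the distinction between child nodes and attributes concrete, here is a small sketch that models one element with plain JavaScript objects rather than MSXML. The numeric node type values (1, 2, 3) are the standard DOM values for element, attribute, and text; the `book` object and this free-standing `getNamedItem` helper are assumptions made for illustration, and the helper only approximates by-name lookup on a named node map.

```javascript
// Node type values used by the DOM for the three most common kinds of node.
var NODE_ELEMENT = 1;
var NODE_ATTRIBUTE = 2;
var NODE_TEXT = 3;

// Hypothetical plain-object model of <book id="bk101">XML Basics</book>.
// The attribute lives in a separate "attributes" list, not in childNodes.
var book = {
  nodeType: NODE_ELEMENT,
  nodeName: "book",
  attributes: [
    { nodeType: NODE_ATTRIBUTE, nodeName: "id", nodeValue: "bk101" }
  ],
  childNodes: [
    { nodeType: NODE_TEXT, nodeName: "#text", nodeValue: "XML Basics" }
  ]
};

// Look up an attribute by name. Returns null when no attribute matches.
function getNamedItem(attributes, name) {
  for (var i = 0; i < attributes.length; i++) {
    if (attributes[i].nodeName === name) {
      return attributes[i];
    }
  }
  return null;
}

var idAttr = getNamedItem(book.attributes, "id");
var childCount = book.childNodes.length; // counts the text node only
```

A walker that visits only `childNodes` would never see the `id` attribute, which is why the samples in this topic walk attributes in a separate function.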
Examples
This sample Active Server Pages (ASP) script uses the MSXML parser to parse a document into a DOM tree, then move down the tree from the root node and report the kinds of nodes it encounters and their content.
VBScript
The first version uses Microsoft Visual Basic® Scripting Edition (VBScript) to load the document and walk the tree. The path of the XML document is set in the xmlFile variable near the bottom of the script. The script parses that document and presents a tree, or reports the position of the parse error if the document cannot be loaded.
<%@LANGUAGE=VBScript%>
<html>
<head>
<title>Tree walk test - VBScript</title>
</head><body>
<h1>XML Parsing - DOM Tree Walk Demo</h1>
<%
Function createXmlDomDocument(xd)
On Error Resume Next
Set xd = CreateObject("MSXML2.DOMDocument.6.0")
If (IsObject(xd) = False) Then
Response.Write("DOM document not created. Check MSXML version used in createXmlDomDocument.")
Else
Set createXmlDomDocument = xd
End If
End Function
Function attributeWalk(node)
For i=1 to indent
Response.Write(" ")
Next
For Each attrib In node.attributes
Response.Write("|--")
Response.Write(attrib.nodeTypeString)
Response.Write(":")
Response.Write(attrib.name)
Response.Write("--")
Response.Write(attrib.nodeValue)
Response.Write("<br />")
Next
End Function
Function treeWalk(node)
Dim nodeName
indent=indent+2
For Each child In node.childNodes
For i=1 to indent
Response.Write(" ")
Next
Response.Write("|--")
Response.Write(child.nodeTypeString)
Response.Write("--")
If child.nodeType<3 Then
Response.Write(child.nodeName)
Response.Write("<br />")
End If
If (child.nodeType=1) Then
If (child.attributes.length>0) Then
indent=indent+2
attributeWalk(child)
indent=indent-2
End If
End If
If (child.hasChildNodes) Then
treeWalk(child)
Else
Response.Write child.text
Response.Write("<br />")
End If
Next
indent=indent-2
End Function
' Set xmlFile to the full path of an XML file located in the same secured,
' authorized directory as the Web virtual directory from which this ASP page is loaded and executed.
xmlFile = "C:\Inetpub\wwwroot\testing\books.xml"
Dim root
Dim xmlDoc
Dim child
Dim indent
indent=0
Set xmlDoc = createXmlDomDocument(xmlDoc)
xmlDoc.async = False
xmlDoc.validateOnParse=False
xmlDoc.load xmlFile
If xmlDoc.parseError.errorcode = 0 Then
'Walk from the root to each of its child nodes:
Response.Write("<pre>")
treeWalk(xmlDoc)
Response.Write("</pre>")
Else
Response.Write("<P>There was an error in : " & xmlFile &"</P><P>Line: " & _
xmlDoc.parseError.line & _
"<BR>Column: " & xmlDoc.parseError.linepos & "</P>")
End If
%>
</body>
</html>
At the bottom, the script contains a main routine that loads the document and, if it parses without error, passes it to the tree walker; otherwise, it reports where the parse error occurred. This script relies on the treeWalk function, a recursive function that moves from node to node in the tree and presents a suitably formatted version of the contents. That function in turn relies on the attributeWalk function to present attribute content, because attribute nodes are not considered children of element nodes within the DOM.
JScript
The Microsoft JScript® version is similar to the VBScript version. However, it requires extra lines of code to avoid overwriting variables during the recursive tree walking.
<%@LANGUAGE=JScript%>
<html>
<head>
<title>Tree walk test - JScript</title>
</head><body>
<h1>XML Parsing - DOM Tree Walk Demo</h1>
<%
function createXmlDomDocument()
{
try {
var xd = new ActiveXObject("MSXML2.DOMDocument.6.0");
}
catch(e) {
Response.Write("DOM document not created. Check MSXML version used in createXmlDomDocument.");
return;
}
return xd;
}
function attribute_walk(node) {
for (k=1; k<indent; k++) {
Response.Write(" ");
}
for (m=0; m<node.attributes.length; m++){
attrib = node.attributes.item(m);
Response.Write("|--");
Response.Write(attrib.nodeTypeString);
Response.Write(":");
Response.Write(attrib.name);
Response.Write("--");
Response.Write(attrib.nodeValue);
Response.Write("<br />");
}
} //end attribute_walk
function tree_walk(node) {
indent=indent+2;
for (current=0; current<node.childNodes.length; current++) {
child=node.childNodes.item(current);
for (j=1; j<indent; j++){
Response.Write(" ");
}
Response.Write("|--");
Response.Write(child.nodeTypeString);
Response.Write("--");
if (child.nodeType<3) {
Response.Write(child.nodeName);
Response.Write("<br />");
}
if (child.nodeType==1) {
if (child.attributes.length>0) {
indent=indent+2;
attribute_walk(child);
indent=indent-2;
}
}
if (child.hasChildNodes()) {
//store information so recursion is possible
depthList[depth]=current;
depth=depth+1;
tree_walk(child);
//return from recursion
depth=depth-1;
current=depthList[depth];
}else{
Response.Write (child.text);
Response.Write("<br />");
} //end if (child.hasChildNodes)
} //end for (current=0; ...) loop
indent=indent-2;
} //end tree_walk
//recursion-tracking variables
depth=0;
depthList=new Array();
indent=0;
xmlFile=new String();
// Set xmlFile to the full path of an XML file located in the same secured,
// authorized directory as the Web virtual directory from which this ASP page is loaded and executed.
xmlFile = "C:\\Inetpub\\wwwroot\\testing\\books.xml";
xmlFile=""+xmlFile; //makes string clean for passing to MSXML
xmlPresented=false;
var xmlDoc = createXmlDomDocument();
xmlDoc.async = false;
xmlDoc.validateOnParse=false;
if ((xmlFile)) {
xmlDoc.load(xmlFile);
if (xmlDoc.parseError.errorCode == 0) {
Response.Write("<pre>");
tree_walk(xmlDoc);
Response.Write("</pre>");
xmlPresented=true;
}
}
if (xmlPresented==false){
Response.Write("<P>There was an error in : " + xmlFile + "</P><P>Line: " +
xmlDoc.parseError.line +
"<BR>Column: " + xmlDoc.parseError.linepos + "</P>");
}
%>
</body></html>
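The extra bookkeeping in the JScript sample (depth and depthList) is needed because its loop counter is assigned without var and is therefore shared by every recursive call. The following sketch, which uses a hypothetical plain-object tree instead of MSXML nodes, contrasts a shared counter with one declared locally. The names `tree`, `walkShared`, and `walkLocal` are illustrative assumptions, not part of the sample.

```javascript
// Hypothetical three-level tree built from plain objects.
var tree = {
  name: "a",
  kids: [
    { name: "b", kids: [ { name: "d", kids: [] } ] },
    { name: "c", kids: [] }
  ]
};

// Shared counter: every call to walkShared reads and writes this one variable.
var idx;

// The recursive call overwrites idx, so the outer loop resumes from the
// wrong position and can skip siblings.
function walkShared(node, seen) {
  seen.push(node.name);
  for (idx = 0; idx < node.kids.length; idx++) {
    walkShared(node.kids[idx], seen);
  }
  return seen;
}

// Counter declared with var inside the function: each call has its own copy.
function walkLocal(node, seen) {
  seen.push(node.name);
  for (var n = 0; n < node.kids.length; n++) {
    walkLocal(node.kids[n], seen);
  }
  return seen;
}

var shared = walkShared(tree, []).join(","); // "a,b,d" (node c is skipped)
var local = walkLocal(tree, []).join(",");   // "a,b,d,c"
```

Declaring the counter and the current child with var inside the walker is one way to make the save-and-restore step unnecessary; saving the position in an array, as the sample does, is another.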