Reading XML File With JScript

I am Titus, an SDET on the JScript team. Some time back I came across a requirement to take an XML file and return a tree listing. The tree listing had to contain every node in the file with its proper parent/child relationships, along with a clear way to tell nodes that carry a value apart from nodes that do not. Let’s call nodes with a value properties. I did this in JScript. In this post you will learn how to read and parse an XML file using Microsoft’s XML DOM and use the result to build the tree listing.

Let’s take a sample XML file, say test.xml (it can be a URL or a file on your system), to get a clear picture of the kind of tree listing required. Later we will look at the actual code.

 

The XML file can be read as follows:

Root node: name is BookList, has two child nodes

  childnode0: name is Book, has two properties

    Prop0: Author has a value Paul
    Prop1: Price has a value 10.3

  childnode1: name is Book, has three properties

    Prop0: Author has a value Joe
    Prop1: Price has a value 20.95
    Prop2: Title has a value Web 2.0
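
The original listing of test.xml is not reproduced here. A file consistent with the description above might look like the sketch below, held in a JScript string so it can be handed to a parser directly. The layout and the sampleXml name are assumptions; only the element names and values come from the description.

```javascript
// Hypothetical reconstruction of test.xml, based only on the description above.
// sampleXml is an illustrative name, not part of the original post.
var sampleXml = [
    '<BookList>',
    '  <Book>',
    '    <Author>Paul</Author>',
    '    <Price>10.3</Price>',
    '  </Book>',
    '  <Book>',
    '    <Author>Joe</Author>',
    '    <Price>20.95</Price>',
    '    <Title>Web 2.0</Title>',
    '  </Book>',
    '</BookList>'
].join("\n");
```

With MSXML, a string like this can be parsed with xmlDoc.loadXML(sampleXml) instead of loading a file or URL.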

The Required Tree Listing after parsing test.xml is

Field    Meaning
nName    NodeName
nValue   NodeValue
cNodes   List of Child Nodes
cProps   List of Child Properties
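
As a rough picture of that structure, the tree for the test.xml described earlier can be written out by hand as plain JScript objects. This is a sketch of the expected shape, not output captured from ReadXMLFile. The expectedTree name is made up for illustration, and the values are shown as strings because that is how the DOM reports text.

```javascript
// Hand-written sketch of the tree listing for the test.xml described earlier.
// expectedTree is a hypothetical name; the field names come from the legend.
// The real XMLNode objects also carry hasCNodes/hasCProps flags, omitted here.
var expectedTree = [{
    nName: "BookList", nValue: null, cProps: null,
    cNodes: [
        {
            nName: "Book", nValue: null, cNodes: null,
            cProps: [
                { nName: "Author", nValue: "Paul", cNodes: null, cProps: null },
                { nName: "Price", nValue: "10.3", cNodes: null, cProps: null }
            ]
        },
        {
            nName: "Book", nValue: null, cNodes: null,
            cProps: [
                { nName: "Author", nValue: "Joe", cNodes: null, cProps: null },
                { nName: "Price", nValue: "20.95", cNodes: null, cProps: null },
                { nName: "Title", nValue: "Web 2.0", cNodes: null, cProps: null }
            ]
        }
    ]
}];
```

Nodes that carry a value end up in their parent's cProps, and nodes that only group other nodes end up in cNodes.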

 

The ReadXMLFile function in the code listing below returns the Tree Listing as required.

Often you already know the contents of the XML file and are interested only in the nodes with a specific name. Calling ReadXMLFile with the node name as the second argument returns just that list.

For test.xml, a call to ReadXMLFile("test.xml", "Author") returns a list of the two Author properties, with the values Paul and Joe, whereas a call to ReadXMLFile("test.xml", "Book") returns a list of the two Book nodes, each carrying its own properties in cProps.
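
Written out by hand, the two results for the test.xml described earlier would be expected to have roughly the following shapes. This is a sketch derived from reading the code, not captured output; prop, authorList, and bookList are hypothetical names used only for illustration.

```javascript
// Hypothetical helper for writing property entries by hand.
function prop(name, value) {
    return { nName: name, nValue: value, cNodes: null, cProps: null };
}

// Expected shape of ReadXMLFile("test.xml", "Author"):
// a flat list of property nodes, one per Author element.
var authorList = [ prop("Author", "Paul"), prop("Author", "Joe") ];

// Expected shape of ReadXMLFile("test.xml", "Book"):
// a list of Book nodes, each holding its own properties in cProps.
var bookList = [
    { nName: "Book", nValue: null, cNodes: null,
      cProps: [ prop("Author", "Paul"), prop("Price", "10.3") ] },
    { nName: "Book", nValue: null, cNodes: null,
      cProps: [ prop("Author", "Joe"), prop("Price", "20.95"), prop("Title", "Web 2.0") ] }
];
```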

If you look carefully at the tree listing, cNodes and cProps are both arrays, so you can reach any node by using the proper index values.
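
Index access works when the position of a node is known in advance. When it is not, a small lookup by name can help. The getPropValue helper below is a hypothetical addition and not part of the listing that follows; it assumes only the nName, nValue, and cProps fields described above.

```javascript
// Hypothetical helper: returns the value of the first property called
// propName on node, or null when the node has no such property.
function getPropValue(node, propName) {
    var i;
    if (node == null || node.cProps == null) {
        return null;
    }
    for (i = 0; i < node.cProps.length; i++) {
        if (node.cProps[i].nName == propName) {
            return node.cProps[i].nValue;
        }
    }
    return null;
}

// A hand-written node shaped like the entries found in cNodes.
var book = {
    nName: "Book", nValue: null, cNodes: null,
    cProps: [
        { nName: "Author", nValue: "Joe", cNodes: null, cProps: null },
        { nName: "Title", nValue: "Web 2.0", cNodes: null, cProps: null }
    ]
};

var byIndex = book.cProps[1].nValue;        // direct index access
var byName = getPropValue(book, "Title");   // same value, found by name
var missing = getPropValue(book, "Price");  // null: this book has no Price
```

The lookup is linear in the number of properties, which is fine for small nodes like these.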

Here goes the actual code:

 var NODE_ELEMENT = 1;
 var NODE_ATTRIBUTE = 2;
 var NODE_TEXT = 3;
  
 /**** INTERNALLY USED FUNCTIONS ****/

 /*
  * Builds up the xmlNode list on parentXMLNode
  * by iterating over each node in childNodesLst
  */
 function getXMLNodeList_1(childNodesLst, parentXMLNode)
 {
     var i;
     var curNode;
     var arrLen;

     //traverse nodelist to get nodevalues and all child nodes
     for (i = 0; i < childNodesLst.length; i++) {
         //we will ignore all other node types like
         //NODE_ATTRIBUTE, NODE_CDATA_SECTION, …
         if (childNodesLst[i].nodeType == NODE_ELEMENT
             || childNodesLst[i].nodeType == NODE_TEXT) {
             if (childNodesLst[i].nodeType == NODE_TEXT) {
                 //we got the value of the parent node, populate
                 //parent node and return back
                 parentXMLNode.nValue = childNodesLst[i].nodeValue;
                 return;
             }

             //we have a new NODE_ELEMENT node
             curNode = new XMLNode(childNodesLst[i].nodeName, childNodesLst[i].nodeValue);

             //hasChildNodes is a DOM method, so call it with parentheses
             if (childNodesLst[i].hasChildNodes()) {
                 getXMLNodeList_1(childNodesLst[i].childNodes, curNode);
                 if (curNode.nValue != null) {
                     //we need to add this as a property to the parent node
                     if (parentXMLNode.cProps == null) {
                         parentXMLNode.cProps = new Array();
                         parentXMLNode.hasCProps = true;
                     }
                     arrLen = parentXMLNode.cProps.length;
                     parentXMLNode.cProps[arrLen] = curNode;
                 } else {
                     //we need to add this as child node to the parent node
                     if (parentXMLNode.cNodes == null) {
                         parentXMLNode.cNodes = new Array();
                         parentXMLNode.hasCNodes = true;
                     }
                     arrLen = parentXMLNode.cNodes.length;
                     parentXMLNode.cNodes[arrLen] = curNode;
                 }
             } else {
                 //no use of such a node
                 //mark curNode as null so it can be garbage collected
                 curNode = null;
             }
         }
     }
     return;
 }
  
 /*
  * Generates appropriate XMLNodeList from nodes
  * in childNodes
  */
 function getXMLNodeList(childNodes)
 {
     var xmlNode = new XMLNode(null, null);
     getXMLNodeList_1(childNodes, xmlNode);

     var xmlNodeList = null;
     if (xmlNode.hasCNodes) {
         xmlNodeList = xmlNode.cNodes;
     } else if (xmlNode.hasCProps) {
         xmlNodeList = xmlNode.cProps;
     }
     return xmlNodeList;
 }
  
 /**** INTERNALLY USED FUNCTIONS ****/

 /* XMLNode DataStruct */
 function XMLNode(ndName, ndVal)
 {
     this.nName = ndName; //XMLNode name
     this.nValue = ndVal; //the value (if any) associated with XMLNode
     //As of now only property nodes have associated values
     this.hasCNodes = false; //Bool to mark presence of Child Nodes
     this.cNodes = null; //List of child nodes (of type XMLNode)
     this.hasCProps = false; //Bool to mark presence of Property Nodes
     this.cProps = null; //List of property nodes (of type XMLNode)
 }
  
 /* Exposed Functions */
 function ReadXMLFile(fileName, tagName)
 {
     if (arguments.length < 1 || arguments.length > 2) {
         return null;
     }

     var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");

     //load the file synchronously
     xmlDoc.async = false;
     try {
         xmlDoc.load(fileName);
     } catch(e) {
         //failed to load xml file
         return null;
     }

     //let's get the child nodes
     var childNodes = null;
     if (arguments.length == 2) {
         try {
             childNodes = xmlDoc.getElementsByTagName(tagName);
         } catch(e) {
             return null;
         }
     } else {
         childNodes = xmlDoc.childNodes;
     }
     return (getXMLNodeList(childNodes));
 }
  
 var xmlNodes;

 xmlNodes = ReadXMLFile("https://www.noweb.com/test.xml");
 //For a file on your system
 //xmlNodes = ReadXMLFile("C:\\My Documents\\test.xml");

 //root node name is
 var RootNodeName = xmlNodes[0].nName;

 xmlNodes = ReadXMLFile("https://www.noweb.com/test.xml", "Book");
 var cntBooks = xmlNodes.length;

 xmlNodes = ReadXMLFile("https://www.noweb.com/test.xml", "Author");
 var authorName = xmlNodes[0].nValue;
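
The usage above indexes the result directly, but ReadXMLFile returns null when nothing could be read, so a caller may want to guard against that. One way to inspect a result safely is to render it as indented text. The dumpTree helper below is a hypothetical sketch, not part of the original listing; it relies only on the nName, nValue, cNodes, and cProps fields, and the sample list stands in for a real ReadXMLFile call so the snippet does not depend on MSXML.

```javascript
// Hypothetical helper: renders a list of XMLNode-shaped objects as
// indented text, properties first, then child nodes.
function dumpTree(nodes, indent) {
    var out = "";
    var i;
    if (nodes == null) {
        return out;  // nothing to show, for example after a failed load
    }
    indent = indent || "";
    for (i = 0; i < nodes.length; i++) {
        if (nodes[i].nValue != null) {
            out += indent + nodes[i].nName + " = " + nodes[i].nValue + "\n";
        } else {
            out += indent + nodes[i].nName + "\n";
        }
        out += dumpTree(nodes[i].cProps, indent + "  ");
        out += dumpTree(nodes[i].cNodes, indent + "  ");
    }
    return out;
}

// Stand-in for the list ReadXMLFile would return for a one-book file.
var sample = [{
    nName: "BookList", nValue: null, cProps: null,
    cNodes: [{
        nName: "Book", nValue: null, cNodes: null,
        cProps: [
            { nName: "Author", nValue: "Paul", cNodes: null, cProps: null },
            { nName: "Price", nValue: "10.3", cNodes: null, cProps: null }
        ]
    }]
}];

var listing = dumpTree(sample);
// listing now holds:
// BookList
//   Book
//     Author = Paul
//     Price = 10.3
```

Under cscript, the text could be printed with WScript.Echo(dumpTree(ReadXMLFile("test.xml"))); a null result simply prints an empty string.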

Hope you enjoyed the blog!

Thanks,

Titus

Comments

  • Anonymous
    April 01, 2008
    Instead of using JavaScript "constants" for the node types, why not implement it properly so that the element returns the proper integer code: http://developer.mozilla.org/en/docs/DOM:element.nodeType

  • Anonymous
    April 02, 2008
    How to make it work in firefox?

  • Anonymous
    April 02, 2008
    here's an even better idea: http://en.wikipedia.org/wiki/E4X also already part of FF and ECMA4/ AS3. a lot less convoluted and a lot more elegant looking that the solution above.

  • Anonymous
    April 02, 2008
    Is there ever a difference between MSXML.DOMDocument and Microsoft.XMLDOM? I checked the registry on my machine and they're both going to CLSID {2933BF90-7B36-11D2-B20E-00C04F983E60}. That goes to "%SystemRoot%\system32\msxml3.dll", and my understanding is that that's version 3 of the library. Why not use "MSXML2.DOMDocument.6.0", which maps to CLSID {88d96a05-f192-11d4-a65f-0040963251e5} (which uses c:\WINDOWS\system32\msxml6.dll) instead? Unless I'm insane or mis-remembering, Version 6 performs some operations a lot faster than 3 (like selectNodes()). Yeah, if a 7 comes out the code will need adjustment, but from what I've seen I don't mind making a minor adjustment to a constant somewhere. See http://blogs.msdn.com/xmlteam/archive/2006/10/23/using-the-right-version-of-msxml-in-internet-explorer.aspx

  • Anonymous
    April 02, 2008
    Yawn.  This is a great exercise for a CS 101 class, but converting an XML document to its JavaScript-native equivalent representation has been done a thousand times already and is a foundational skill for a web developer (I sometimes use it as an interview question).  Google "XML to JSON" and you get the idea. Now, some native browser support for E4X would be nice.

  • Anonymous
    April 02, 2008
    What?! 1.) Fix the Node Constants in JScript: [bug 256] http://webbugtrack.blogspot.com/2007/10/bug-256-dom-nodetype-constants-are-not.html 2.) Why on earth are you using ActiveX for this?  What part of Web Standards slipped by you? Use XMLHTTPRequest (the "almost" native) one added in IE7, with a fallback to ActiveX only if the user is on a really old version of IE. 3.) Does the term JSON ring a bell? It has been around for ages, and does what you are trying to do a 100 times better.

  • Anonymous
    April 03, 2008
    @Gerome: The reason for using ActiveX was that we wanted it to work even from non-browser hosts, especially cscript. @TMO: Thanks for pointing out the extra memory consumption. The intended audience for the blog is mainly novice JScript programmers who want to learn how to parse an XML file using JScript, so we did not focus on memory or performance optimizations. Thanks all for your invaluable comments.

  • Anonymous
    June 15, 2008
    but but, this will only work in IE, I don't think any developers who do cross-browser app's will actually use this

  • Anonymous
    November 19, 2008
    What a nice example telling us clearly why is JSON so much better than XML. Just store your books in an js file like this : var BookList =   [  { Author: "Paul", Price : 10.30 } ,  { Author: "Joe", Price : 20.95, Title: "Web 2.0" } ] ; Why XML ?

  • Anonymous
    November 25, 2009
    More vendor lock in with code that ties corporations to IE, I really wish no one used proprietry browser code in this day and age... If you want to provide a serious method for XML then E4X is still waiting to be put into IE. Please oh please by IE9.
