Using SelectSingleNode (or SelectNodes) on XML where the default namespace has been set
I've been stumped by this one at least twice over the last couple of years, so I thought it was a good candidate to write up here.
I was trying to select a node from some standard XHTML where the default namespace was set. In other words, the XHTML was something like:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "https://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"[]>
<html xmlns="https://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<title>MSN Search News: Microsoft</title> ...
Note the xmlns attribute on the root <html> node.
Without thinking too hard, I first tried to find the title of the page by going ...
XmlDocument resultsXhtml = new XmlDocument();
resultsXhtml.Load("https://search.msn.com/news/results.aspx?q=Microsoft");
XmlNode metaNode = resultsXhtml.SelectSingleNode("//title");
... which left metaNode as null.
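This took me a little while to figure out. An unprefixed name in an XPath 1.0 query always refers to an element in no namespace, so "//title" never matches a title element that sits in the XHTML namespace. Clearly I needed to indicate in the XPath query that the title tag is in the default namespace, but how can I do that if that namespace has no prefix in the actual XML?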
The solution (reasonably obviously!) is to register a prefix of my own choosing in an XmlNamespaceManager object, and then use that namespace manager when doing the select. Here's some code that works:
XmlDocument resultsXhtml = new XmlDocument();
resultsXhtml.Load("https://search.msn.com/news/results.aspx?q=Microsoft");
// Bind a prefix of our own choosing to the document's default namespace URI.
XmlNamespaceManager namespaceManager = new XmlNamespaceManager(resultsXhtml.NameTable);
namespaceManager.AddNamespace("myprefix", "https://www.w3.org/1999/xhtml");
// Use that prefix in the XPath and pass the manager to the select call.
XmlNode metaNode = resultsXhtml.SelectSingleNode("//myprefix:title", namespaceManager);
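If you want to try this without a network call, here is a minimal self-contained sketch of both queries side by side. It assumes a trimmed-down inline copy of the XHTML shown earlier, loaded with LoadXml instead of Load; the NamespaceDemo class and its helper names are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Xml;

public static class NamespaceDemo
{
    // Namespace URI copied from the xmlns attribute of the sample document.
    public const string XhtmlNamespace = "https://www.w3.org/1999/xhtml";

    // Trimmed-down stand-in for the downloaded page (assumption: no DOCTYPE,
    // so the parser never tries to fetch a DTD).
    public const string Xhtml =
        "<html xmlns=\"https://www.w3.org/1999/xhtml\" lang=\"en\" xml:lang=\"en\">" +
        "<head><title>MSN Search News: Microsoft</title></head>" +
        "<body /></html>";

    public static XmlDocument LoadSample()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xhtml);
        return doc;
    }

    // Unprefixed query: XPath looks for a <title> element in no namespace.
    public static XmlNode FindTitleWithoutPrefix(XmlDocument doc)
    {
        return doc.SelectSingleNode("//title");
    }

    // Prefixed query: "myprefix" is bound to the XHTML namespace URI
    // through the namespace manager passed to SelectSingleNode.
    public static XmlNode FindTitleWithPrefix(XmlDocument doc)
    {
        XmlNamespaceManager namespaceManager = new XmlNamespaceManager(doc.NameTable);
        namespaceManager.AddNamespace("myprefix", XhtmlNamespace);
        return doc.SelectSingleNode("//myprefix:title", namespaceManager);
    }

    public static void Main()
    {
        XmlDocument doc = LoadSample();
        Console.WriteLine(FindTitleWithoutPrefix(doc) == null); // prints "True"
        Console.WriteLine(FindTitleWithPrefix(doc).InnerText);  // prints "MSN Search News: Microsoft"
    }
}
```

The same namespace manager can be passed to SelectNodes when more than one match is expected.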
What I find interesting about this problem is the way you have to think about namespaces and XPath queries. The namespace is a logical entity identified by its URI, not by the prefix used in the actual XML. You can therefore register that URI with any prefix you want in your XPath, which isn't a completely intuitive concept, to me at least!
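To make the "URI, not prefix" point concrete, here is a small sketch under a few assumptions: PrefixedXml is a hypothetical document that binds the prefix "h" to the namespace, and PrefixDemo and SelectTitle are illustrative names of my own. The query matches with any prefix as long as the registered URI is the same string, and stops matching as soon as the URI differs by even one character.

```csharp
using System;
using System.Xml;

public static class PrefixDemo
{
    // Hypothetical document that uses the prefix "h" for the namespace.
    public const string PrefixedXml =
        "<h:html xmlns:h=\"https://www.w3.org/1999/xhtml\">" +
        "<h:head><h:title>Prefixed</h:title></h:head></h:html>";

    // Runs "//prefix:title" with a caller-chosen prefix bound to a caller-chosen URI.
    public static XmlNode SelectTitle(string xml, string prefix, string namespaceUri)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNamespaceManager manager = new XmlNamespaceManager(doc.NameTable);
        manager.AddNamespace(prefix, namespaceUri);
        return doc.SelectSingleNode("//" + prefix + ":title", manager);
    }

    public static void Main()
    {
        const string uri = "https://www.w3.org/1999/xhtml";

        // The query prefix differs from the document's "h" prefix: still a match.
        Console.WriteLine(SelectTitle(PrefixedXml, "x", uri) != null);        // prints "True"
        Console.WriteLine(SelectTitle(PrefixedXml, "anything", uri) != null); // prints "True"

        // Same prefix as the document, but a URI with a trailing slash: no match.
        Console.WriteLine(SelectTitle(PrefixedXml, "h", uri + "/") == null);  // prints "True"
    }
}
```

The last call shows that the prefix in the XPath is resolved through the namespace manager, not through the declarations in the document, and that namespace URIs are compared as exact strings.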
Comments
Anonymous
November 21, 2005
Actually, the whole idea that it is the URI that is the logical entity, and NOT the prefix is something that took a while for me to "get" also. It was only when I was working with a lot of files that had an un-prefixed namespace that I finally figured it out!
Anonymous
January 25, 2006
Thanks -- I was struggling for AGES with this. The documentation is as clear as mud...
Anonymous
July 28, 2009
Thanks, this helped me out. Was wondering why my XPath was not working until I stumbled on this post.
Anonymous
December 18, 2009
While I understand that the prefix for the xpath query is controlled by the XmlNamespaceManager and can be different than the prefixes used in the xml itself, it disturbs me that one can set the "default namespace" for the XmlNamespaceManager and that default is ignored in the xpath query. This to me is a bug in the implementation, which should be corrected to avoid untold hours of frustration by developers attempting to discover this workaround. Thanks for your post, it did indeed shorten the amount of time that I was frustrated.
Anonymous
August 25, 2010
Thanks a lot!
Anonymous
April 25, 2013
Great post. This had me stumped.
Anonymous
October 29, 2013
Great explanation. Thanks!
Anonymous
December 19, 2013
What about if the XML doesn't have any namespace to refer to?
Anonymous
December 19, 2013
Elsa, I'm not sure I understand your question. If there is no namespace set, doesn't the node query just work without any prefix? So in my example, XmlNode metaNode = resultsXhtml.SelectSingleNode("//title"); would actually return the title node if the XML had no namespaces set.
John
Anonymous
March 05, 2014
Here the namespace is hardcoded. Can we process multiple XML docs that have the same child node title but different namespaces (the namespace will be decided at runtime)? For example:
XML Document 1
<Root xmlns="http://abc.com/example"> </Root>
XML Document 2
<Root xmlns="http://xyz.com/example2"> <title> Title 2 </title> </Root>
Anonymous
March 05, 2014
Manish - it's been a long time since I wrote this article (and I don't do much XML programming nowadays!), but I think the answer is that because the namespaces of the two documents are different, the documents aren't actually the same. They may look similar, with the same structure, but the different namespaces mean they are actually different.
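That said, if the goal is only to run the same query against documents whose namespace isn't known until runtime, one possible approach is to read the namespace URI from the root element and register that instead of a hardcoded string. This is a rough sketch, not a tested production solution: RuntimeNamespaceDemo and ReadTitle are names I made up, the "Title 1" element in the first document is a hypothetical addition for the demo, and it assumes the title is a direct child of the root and shares the root's namespace.

```csharp
using System;
using System.Xml;

public static class RuntimeNamespaceDemo
{
    // Returns the trimmed text of the root's <title> child, or null if there is none.
    public static string ReadTitle(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        // Namespace of the root element, discovered at runtime ("" if none).
        string rootNamespace = doc.DocumentElement.NamespaceURI;

        XmlNode title;
        if (rootNamespace.Length == 0)
        {
            // No namespace: the plain unprefixed query works.
            title = doc.SelectSingleNode("/*/title");
        }
        else
        {
            // Bind an arbitrary prefix ("ns") to whatever namespace the root uses.
            XmlNamespaceManager manager = new XmlNamespaceManager(doc.NameTable);
            manager.AddNamespace("ns", rootNamespace);
            title = doc.SelectSingleNode("/*/ns:title", manager);
        }

        return title == null ? null : title.InnerText.Trim();
    }

    public static void Main()
    {
        // "Title 1" is a hypothetical element added for this demo.
        Console.WriteLine(ReadTitle(
            "<Root xmlns=\"http://abc.com/example\"><title> Title 1 </title></Root>"));
        Console.WriteLine(ReadTitle(
            "<Root xmlns=\"http://xyz.com/example2\"><title> Title 2 </title></Root>"));
    }
}
```

Another option is an XPath such as //*[local-name()='title'], which ignores namespaces altogether, at the cost of also matching title elements from any other namespace in the document.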