How to use annotations to transform LINQ to XML trees in an XSLT style (LINQ to XML)
Annotations can be used to facilitate transforms of an XML tree.
Some XML documents are "document-centric with mixed content." With such documents, you don't necessarily know the shape of child nodes of an element. For instance, a node that contains text may look like this:
<text>A phrase with <b>bold</b> and <i>italic</i> text.</text>
For any given <text> element, there may be any number of child <b> and <i> elements. The same is true in a number of other situations: pages can contain a variety of child elements, such as regular paragraphs, bulleted paragraphs, and bitmaps, and cells in a table can contain text, drop-down lists, or bitmaps. One of the primary characteristics of document-centric XML is that you don't know which child elements any particular element will have.
If you want to transform elements in a tree where you don't necessarily know much about their children, the annotation-based approach described here is an effective one.
The summary of the approach is:
- First, annotate elements in the tree with a replacement element.
- Second, iterate through the entire tree, creating a new tree where you replace each element with its annotation. The examples in this article implement the iteration and creation of the new tree in a function named XForm.
In detail, the approach consists of:
- Execute one or more LINQ to XML queries that return the set of elements that you want to transform from one shape to another. For each element in the query, add a new XElement object as an annotation to the element. This new element will replace the annotated element in the new, transformed tree. This is simple code to write, as demonstrated by the example.
- The new element that's added as an annotation can contain new child nodes; it can form a subtree with any desired shape.
- There is a special rule: if a child element of the new element is in a special namespace that's made up for this purpose (in this example, the namespace is http://www.microsoft.com/LinqToXmlTransform/2007), then that child element isn't copied to the new tree. Instead, if the local name of the element is ApplyTransforms, the child nodes of the element in the source tree are iterated and copied to the new tree (with the exception that annotated child elements are themselves transformed according to these rules).
- This is somewhat analogous to the specification of transforms in XSL. The query that selects a set of nodes is analogous to the XPath expression for a template. The code to create the new XElement that's saved as an annotation is analogous to the sequence constructor in XSL, and the ApplyTransforms element is analogous in function to the xsl:apply-templates element in XSL.
- One advantage to taking this approach is that, as you formulate queries, you're always writing queries on the unmodified source tree. You don't need to worry about how modifications to the tree affect the queries that you're writing.
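The approach rests on the fact that annotations are attached to a node without becoming part of its XML content. The following is a minimal sketch of that mechanism only, not of the transform itself; the class name AnnotationBasics and the helper BuildAnnotatedTree are hypothetical names chosen for illustration.

```csharp
using System;
using System.Xml.Linq;

public static class AnnotationBasics
{
    // Attaches a replacement element to <Paragraph> as an annotation and returns the root.
    public static XElement BuildAnnotatedTree()
    {
        XElement root = XElement.Parse("<Root><Paragraph>Some text.</Paragraph></Root>");
        root.Element("Paragraph").AddAnnotation(new XElement("para"));
        return root;
    }

    public static void Main()
    {
        XElement root = BuildAnnotatedTree();

        // Annotations aren't part of the XML content, so the serialized tree is unchanged.
        Console.WriteLine(root.ToString(SaveOptions.DisableFormatting));
        // <Root><Paragraph>Some text.</Paragraph></Root>

        // Annotation<T>() returns the first annotation of type T, or null if there is none.
        Console.WriteLine(root.Element("Paragraph").Annotation<XElement>().Name); // para
        Console.WriteLine(root.Annotation<XElement>() == null);                   // True
    }
}
```

Because the source tree serializes and queries exactly as before, any number of queries can add annotations without interfering with one another.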
Example: Rename all paragraph nodes
This example renames all Paragraph nodes to para.
XNamespace xf = "http://www.microsoft.com/LinqToXmlTransform/2007";
XName at = xf + "ApplyTransforms";

XElement root = XElement.Parse(@"
<Root>
    <Paragraph>This is a sentence with <b>bold</b> and <i>italic</i> text.</Paragraph>
    <Paragraph>More text.</Paragraph>
</Root>");

// replace Paragraph with para
foreach (var el in root.Descendants("Paragraph"))
    el.AddAnnotation(
        new XElement("para",
            // same idea as xsl:apply-templates
            new XElement(xf + "ApplyTransforms")
        )
    );

// The XForm method, shown later in this article, accomplishes the transform
XElement newRoot = XForm(root);

Console.WriteLine(newRoot);
Imports <xmlns:xf="http://www.microsoft.com/LinqToXmlTransform/2007">
Module Module1
Dim at As XName = GetXmlNamespace(xf) + "ApplyTransforms"
Sub Main()
Dim root As XElement = _
<Root>
<Paragraph>This is a sentence with <b>bold</b> and <i>italic</i> text.</Paragraph>
<Paragraph>More text.</Paragraph>
</Root>
' Replace Paragraph with para.
For Each el In root...<Paragraph>
' same idea as xsl:apply-templates
el.AddAnnotation( _
<para>
<<%= at %>></>
</para>)
Next
' The XForm function, shown later in this article, accomplishes the transform
Dim newRoot As XElement = XForm(root)
Console.WriteLine(newRoot)
End Sub
End Module
This example produces the following output:
<Root>
<para>This is a sentence with <b>bold</b> and <i>italic</i> text.</para>
<para>More text.</para>
</Root>
Example: Calculate averages and sums and add them as new elements to the tree
The following example calculates the average and sum of the Data elements and adds them as new elements to the tree.
XNamespace xf = "http://www.microsoft.com/LinqToXmlTransform/2007";
XName at = xf + "ApplyTransforms";

XElement data = new XElement("Root",
    new XElement("Data", 20),
    new XElement("Data", 10),
    new XElement("Data", 3)
);

// while adding annotations, you can query the source tree all you want,
// as the tree isn't mutated while annotating.
var avg = data.Elements("Data").Select(z => (Decimal)z).Average();
data.AddAnnotation(
    new XElement("Root",
        new XElement(xf + "ApplyTransforms"),
        new XElement("Average", $"{avg:F4}"),
        new XElement("Sum",
            data
            .Elements("Data")
            .Select(z => (int)z)
            .Sum()
        )
    )
);

Console.WriteLine("Before Transform");
Console.WriteLine("----------------");
Console.WriteLine(data);
Console.WriteLine();
Console.WriteLine();

// The XForm method, shown later in this article, accomplishes the transform
XElement newData = XForm(data);

Console.WriteLine("After Transform");
Console.WriteLine("----------------");
Console.WriteLine(newData);
Imports <xmlns:xf="http://www.microsoft.com/LinqToXmlTransform/2007">
Module Module1
Dim at As XName = GetXmlNamespace(xf) + "ApplyTransforms"
Sub Main()
Dim data As XElement = _
<Root>
<Data>20</Data>
<Data>10</Data>
<Data>3</Data>
</Root>
' While adding annotations, you can query the source tree all you want,
' as the tree isn't mutated while annotating.
data.AddAnnotation( _
<Root>
<<%= at %>/>
<Average>
<%= _
String.Format("{0:F4}", _
data.Elements("Data") _
.Select(Function(z) CDec(z)).Average()) _
%>
</Average>
<Sum>
<%= _
data.Elements("Data").Select(Function(z) CInt(z)).Sum() _
%>
</Sum>
</Root> _
)
Console.WriteLine("Before Transform")
Console.WriteLine("----------------")
Console.WriteLine(data)
Console.WriteLine(vbNewLine)
' The XForm function, shown later in this article, accomplishes the transform
Dim newData As XElement = XForm(data)
Console.WriteLine("After Transform")
Console.WriteLine("----------------")
Console.WriteLine(newData)
End Sub
End Module
This example produces the following output:
Before Transform
----------------
<Root>
<Data>20</Data>
<Data>10</Data>
<Data>3</Data>
</Root>
After Transform
----------------
<Root>
<Data>20</Data>
<Data>10</Data>
<Data>3</Data>
<Average>11.0000</Average>
<Sum>33</Sum>
</Root>
Example: Create a new transformed tree from the original annotated tree
A small function, XForm, creates a new transformed tree from the original, annotated tree. The following is pseudocode for this function:
The function takes an XElement as an argument and returns an XElement.
If an element has an XElement annotation, the returned XElement has these characteristics:
- The name of the new XElement is the annotation element's name.
- All attributes are copied from the annotation to the new node.
- All child nodes are copied from the annotation, with the exception that the special node xf:ApplyTransforms is recognized, and the source element's child nodes are iterated. If the source child node isn't an XElement, it's copied to the new tree. If the source child is an XElement, then it's transformed by calling this function recursively.
Otherwise, the returned XElement has these characteristics:
- The name of the new XElement is the source element's name.
- All attributes are copied from the source element to the destination element.
- All child nodes are copied from the source element.
- If the source child node isn't an XElement, it's copied to the new tree. If the source child is an XElement, then it's transformed by calling this function recursively.
The following is code for this function:
// Build a transformed XML tree per the annotations
static XElement XForm(XElement source)
{
    XNamespace xf = "http://www.microsoft.com/LinqToXmlTransform/2007";
    XName at = xf + "ApplyTransforms";

    if (source.Annotation<XElement>() != null)
    {
        XElement anno = source.Annotation<XElement>();
        return new XElement(anno.Name,
            anno.Attributes(),
            anno
            .Nodes()
            .Select(
                (XNode n) =>
                {
                    XElement annoEl = n as XElement;
                    if (annoEl != null)
                    {
                        if (annoEl.Name == at)
                            return (object)(
                                source.Nodes()
                                .Select(
                                    (XNode n2) =>
                                    {
                                        XElement e2 = n2 as XElement;
                                        if (e2 == null)
                                            return n2;
                                        else
                                            return XForm(e2);
                                    }
                                )
                            );
                        else
                            return n;
                    }
                    else
                        return n;
                }
            )
        );
    }
    else
    {
        return new XElement(source.Name,
            source.Attributes(),
            source
            .Nodes()
            .Select(n =>
            {
                XElement el = n as XElement;
                if (el == null)
                    return n;
                else
                    return XForm(el);
            }
            )
        );
    }
}
' Build a transformed XML tree per the annotations.
Function XForm(ByVal source As XElement) As XElement
If source.Annotation(Of XElement)() IsNot Nothing Then
Dim anno As XElement = source.Annotation(Of XElement)()
Return _
<<%= anno.Name.ToString() %>>
<%= anno.Attributes() %>
<%= anno.Nodes().Select(Function(n As XNode) _
GetSubNodes(n, source)) %>
</>
Else
Return _
<<%= source.Name %>>
<%= source.Attributes() %>
<%= source.Nodes().Select(Function(n) GetExpandedNodes(n)) %>
</>
End If
End Function
Private Function GetSubNodes(ByVal n As XNode, ByVal s As XElement) As Object
Dim annoEl As XElement = TryCast(n, XElement)
If annoEl IsNot Nothing Then
If annoEl.Name = at Then
Return s.Nodes().Select(Function(n2 As XNode) GetExpandedNodes(n2))
End If
End If
Return n
End Function
Private Function GetExpandedNodes(ByVal n2 As XNode) As XNode
Dim e2 As XElement = TryCast(n2, XElement)
If e2 Is Nothing Then
Return n2
Else
Return XForm(e2)
End If
End Function
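The C# XForm function above can be written more compactly with pattern matching. The following is a sketch under the assumption that C# 7.0 or later is available; the names CompactTransform, TransformNode, and XFormCompact are hypothetical, and the logic is intended to mirror the pseudocode rather than replace the listing above.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class CompactTransform
{
    static readonly XNamespace xf = "http://www.microsoft.com/LinqToXmlTransform/2007";
    static readonly XName at = xf + "ApplyTransforms";

    // Elements are transformed recursively; every other node is passed through as-is.
    static XNode TransformNode(XNode node) =>
        node is XElement el ? XFormCompact(el) : node;

    public static XElement XFormCompact(XElement source)
    {
        XElement anno = source.Annotation<XElement>();

        // No annotation: keep the name and attributes, and transform the children.
        if (anno == null)
            return new XElement(source.Name,
                source.Attributes(),
                source.Nodes().Select(n => TransformNode(n)));

        // Annotation present: copy it, expanding xf:ApplyTransforms into the
        // transformed child nodes of the source element.
        return new XElement(anno.Name,
            anno.Attributes(),
            anno.Nodes().Select(n =>
                (n is XElement e && e.Name == at)
                    ? (object)source.Nodes().Select(c => TransformNode(c))
                    : n));
    }

    public static void Main()
    {
        XElement root = XElement.Parse(
            "<Root><Paragraph>Text with <b>bold</b>.</Paragraph></Root>");

        foreach (XElement el in root.Descendants("Paragraph"))
            el.AddAnnotation(new XElement("para", new XElement(at)));

        Console.WriteLine(XFormCompact(root).ToString(SaveOptions.DisableFormatting));
        // <Root><para>Text with <b>bold</b>.</para></Root>
    }
}
```

As in the original function, nodes that already belong to a tree are copied when they're added to the new XElement, so the source tree and its annotations are left untouched.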
Example: Show XForm in typical uses of this type of transform
The following example includes the XForm function and a few of the typical uses of this type of transform:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
class Program
{
static XNamespace xf = "http://www.microsoft.com/LinqToXmlTransform/2007";
static XName at = xf + "ApplyTransforms";
// Build a transformed XML tree per the annotations
static XElement XForm(XElement source)
{
if (source.Annotation<XElement>() != null)
{
XElement anno = source.Annotation<XElement>();
return new XElement(anno.Name,
anno.Attributes(),
anno
.Nodes()
.Select(
(XNode n) =>
{
XElement annoEl = n as XElement;
if (annoEl != null)
{
if (annoEl.Name == at)
return (object)(
source.Nodes()
.Select(
(XNode n2) =>
{
XElement e2 = n2 as XElement;
if (e2 == null)
return n2;
else
return XForm(e2);
}
)
);
else
return n;
}
else
return n;
}
)
);
}
else
{
return new XElement(source.Name,
source.Attributes(),
source
.Nodes()
.Select(n =>
{
XElement el = n as XElement;
if (el == null)
return n;
else
return XForm(el);
}
)
);
}
}
static void Main(string[] args)
{
XElement root = new XElement("Root",
new XComment("A comment"),
new XAttribute("Att1", 123),
new XElement("Child", 1),
new XElement("Child", 2),
new XElement("Other",
new XElement("GC", 3),
new XElement("GC", 4)
),
XElement.Parse(
"<SomeMixedContent>This is <i>an</i> element that " +
"<b>has</b> some mixed content</SomeMixedContent>"),
new XElement("AnUnchangedElement", 42)
);
// each of the following serves the same semantic purpose as
// XSLT templates and sequence constructors
// replace Child with NewChild
foreach (var el in root.Elements("Child"))
el.AddAnnotation(new XElement("NewChild", (string)el));
// replace first GC with GrandChild, add an attribute
foreach (var el in root.Descendants("GC").Take(1))
el.AddAnnotation(
new XElement("GrandChild",
new XAttribute("ANewAttribute", 999),
(string)el
)
);
// replace Other with NewOther, add new child elements around original content
foreach (var el in root.Elements("Other"))
el.AddAnnotation(
new XElement("NewOther",
new XElement("MyNewChild", 1),
// same idea as xsl:apply-templates
new XElement(xf + "ApplyTransforms"),
new XElement("ChildThatComesAfter")
)
);
// change name of element that has mixed content
root.Descendants("SomeMixedContent").First().AddAnnotation(
new XElement("MixedContent",
new XElement(xf + "ApplyTransforms")
)
);
// replace <b> with <Bold>
foreach (var el in root.Descendants("b"))
el.AddAnnotation(
new XElement("Bold",
new XElement(xf + "ApplyTransforms")
)
);
// replace <i> with <Italic>
foreach (var el in root.Descendants("i"))
el.AddAnnotation(
new XElement("Italic",
new XElement(xf + "ApplyTransforms")
)
);
Console.WriteLine("Before Transform");
Console.WriteLine("----------------");
Console.WriteLine(root);
Console.WriteLine();
Console.WriteLine();
XElement newRoot = XForm(root);
Console.WriteLine("After Transform");
Console.WriteLine("----------------");
Console.WriteLine(newRoot);
}
}
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text
Imports System.Xml
Imports System.Xml.Linq
Imports <xmlns:xf="http://www.microsoft.com/LinqToXmlTransform/2007">
Module Module1
Dim at As XName = GetXmlNamespace(xf) + "ApplyTransforms"
' Build a transformed XML tree per the annotations.
Function XForm(ByVal source As XElement) As XElement
If source.Annotation(Of XElement)() IsNot Nothing Then
Dim anno As XElement = source.Annotation(Of XElement)()
Return _
<<%= anno.Name.ToString() %>>
<%= anno.Attributes() %>
<%= anno.Nodes().Select(Function(n As XNode) _
GetSubNodes(n, source)) %>
</>
Else
Return _
<<%= source.Name %>>
<%= source.Attributes() %>
<%= source.Nodes().Select(Function(n) GetExpandedNodes(n)) %>
</>
End If
End Function
Private Function GetSubNodes(ByVal n As XNode, ByVal s As XElement) As Object
Dim annoEl As XElement = TryCast(n, XElement)
If annoEl IsNot Nothing Then
If annoEl.Name = at Then
Return s.Nodes().Select(Function(n2 As XNode) GetExpandedNodes(n2))
End If
End If
Return n
End Function
Private Function GetExpandedNodes(ByVal n2 As XNode) As XNode
Dim e2 As XElement = TryCast(n2, XElement)
If e2 Is Nothing Then
Return n2
Else
Return XForm(e2)
End If
End Function
Sub Main()
Dim root As XElement = _
<Root Att1='123'>
<!--A comment-->
<Child>1</Child>
<Child>2</Child>
<Other>
<GC>3</GC>
<GC>4</GC>
</Other>
<SomeMixedContent>This is <i>an</i> element that <b>has</b> some mixed content</SomeMixedContent>
<AnUnchangedElement>42</AnUnchangedElement>
</Root>
' Each of the following serves the same semantic purpose as
' XSLT templates and sequence constructors.
' Replace Child with NewChild.
For Each el In root.<Child>
el.AddAnnotation(<NewChild><%= CStr(el) %></NewChild>)
Next
' Replace first GC with GrandChild, add an attribute.
For Each el In root...<GC>.Take(1)
el.AddAnnotation(<GrandChild ANewAttribute='999'><%= CStr(el) %></GrandChild>)
Next
' Replace Other with NewOther, add new child elements around original content.
For Each el In root.<Other>
el.AddAnnotation( _
<NewOther>
<MyNewChild>1</MyNewChild>
<<%= at %>></>
<ChildThatComesAfter/>
</NewOther>)
Next
' Change name of element that has mixed content.
root...<SomeMixedContent>(0).AddAnnotation( _
<MixedContent><<%= at %>></></MixedContent>)
' Replace <b> with <Bold>.
For Each el In root...<b>
el.AddAnnotation(<Bold><<%= at %>></></Bold>)
Next
' Replace <i> with <Italic>.
For Each el In root...<i>
el.AddAnnotation(<Italic><<%= at %>></></Italic>)
Next
Console.WriteLine("Before Transform")
Console.WriteLine("----------------")
Console.WriteLine(root)
Console.WriteLine(vbNewLine)
Dim newRoot As XElement = XForm(root)
Console.WriteLine("After Transform")
Console.WriteLine("----------------")
Console.WriteLine(newRoot)
End Sub
End Module
This example produces the following output:
Before Transform
----------------
<Root Att1="123">
<!--A comment-->
<Child>1</Child>
<Child>2</Child>
<Other>
<GC>3</GC>
<GC>4</GC>
</Other>
<SomeMixedContent>This is <i>an</i> element that <b>has</b> some mixed content</SomeMixedContent>
<AnUnchangedElement>42</AnUnchangedElement>
</Root>
After Transform
----------------
<Root Att1="123">
<!--A comment-->
<NewChild>1</NewChild>
<NewChild>2</NewChild>
<NewOther>
<MyNewChild>1</MyNewChild>
<GrandChild ANewAttribute="999">3</GrandChild>
<GC>4</GC>
<ChildThatComesAfter />
</NewOther>
<MixedContent>This is <Italic>an</Italic> element that <Bold>has</Bold> some mixed content</MixedContent>
<AnUnchangedElement>42</AnUnchangedElement>
</Root>
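Because the source tree is never modified, it can be reused for a second, different transform. One point to watch is that Annotation<XElement>() returns the first XElement annotation on a node, so annotations left over from an earlier pass would take precedence over new ones. The following sketch shows one way to clear them first; the class and helper names (AnnotationCleanup, CountAnnotated, ClearTransformAnnotations) are hypothetical and aren't part of the examples above.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class AnnotationCleanup
{
    // Counts the elements in the tree that carry at least one XElement annotation.
    public static int CountAnnotated(XElement root) =>
        root.DescendantsAndSelf().Count(e => e.Annotation<XElement>() != null);

    // Removes every XElement annotation so the tree can be annotated afresh.
    public static void ClearTransformAnnotations(XElement root)
    {
        foreach (XElement el in root.DescendantsAndSelf())
            el.RemoveAnnotations<XElement>();
    }

    public static void Main()
    {
        XElement root = XElement.Parse("<Root><Child>1</Child><Child>2</Child></Root>");
        foreach (XElement el in root.Elements("Child"))
            el.AddAnnotation(new XElement("NewChild", (string)el));

        // A second annotation of the same type doesn't replace the first one.
        XElement firstChild = root.Elements("Child").First();
        firstChild.AddAnnotation(new XElement("Ignored"));
        Console.WriteLine(firstChild.Annotation<XElement>().Name); // NewChild

        Console.WriteLine(CountAnnotated(root)); // 2
        ClearTransformAnnotations(root);
        Console.WriteLine(CountAnnotated(root)); // 0
    }
}
```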