Data Transformations with LINQ (C#)

Language-Integrated Query (LINQ) is not only about retrieving data. It is also a powerful tool for transforming data. By using a LINQ query, you can use a source sequence as input and modify it in many ways to create a new output sequence. You can modify the sequence itself without modifying the elements themselves by sorting and grouping. But perhaps the most powerful feature of LINQ queries is the ability to create new types. This is accomplished in the select clause. For example, you can perform the following tasks:

  • Merge multiple input sequences into a single output sequence that has a new type.

  • Create output sequences whose elements consist of only one or several properties of each element in the source sequence.

  • Create output sequences whose elements consist of the results of operations performed on the source data.

  • Create output sequences in a different format. For example, you can transform data from SQL rows or text files into XML.

These are just several examples. Of course, these transformations can be combined in various ways in the same query. Furthermore, the output sequence of one query can be used as the input sequence for a new query.

Joining Multiple Inputs into One Output Sequence

You can use a LINQ query to create an output sequence that contains elements from more than one input sequence. The following example shows how to combine two in-memory data structures, but the same principles can be applied to combine data from XML or SQL or DataSet sources. Assume the following two class types:

class Student
{
    public string First { get; set; }
    public string Last {get; set;}
    public int ID { get; set; }
    public string Street { get; set; }
    public string City { get; set; }
    public List<int> Scores;
}

class Teacher
{
    public string First { get; set; }
    public string Last { get; set; }
    public int ID { get; set; } 
    public string City { get; set; }
}

The following example shows the query:

class DataTransformations
{
    static void Main()
    {
        // Create the first data source.
        List<Student> students = new List<Student>()
        {
            new Student {First="Svetlana",
                Last="Omelchenko", 
                ID=111, 
                Street="123 Main Street",
                City="Seattle",
                Scores= new List<int> {97, 92, 81, 60}},
            new Student {First="Claire",
                Last="O’Donnell", 
                ID=112,
                Street="124 Main Street",
                City="Redmond",
                Scores= new List<int> {75, 84, 91, 39}},
            new Student {First="Sven",
                Last="Mortensen",
                ID=113,
                Street="125 Main Street",
                City="Lake City",
                Scores= new List<int> {88, 94, 65, 91}},
        };

        // Create the second data source.
        List<Teacher> teachers = new List<Teacher>()
        {                
            new Teacher {First="Ann", Last="Beebe", ID=945, City = "Seattle"},
            new Teacher {First="Alex", Last="Robinson", ID=956, City = "Redmond"},
            new Teacher {First="Michiyo", Last="Sato", ID=972, City = "Tacoma"}
        };

        // Create the query. 
        var peopleInSeattle = (from student in students
                    where student.City == "Seattle" 
                    select student.Last)
                    .Concat(from teacher in teachers
                            where teacher.City == "Seattle" 
                            select teacher.Last);

        Console.WriteLine("The following students and teachers live in Seattle:");
        // Execute the query. 
        foreach (var person in peopleInSeattle)
        {
            Console.WriteLine(person);
        }

        Console.WriteLine("Press any key to exit.");
        Console.ReadKey();
    }
}
/* Output:
    The following students and teachers live in Seattle:
    Omelchenko
    Beebe
 */

For more information, see join clause (C# Reference) and select clause (C# Reference).

Selecting a Subset of each Source Element

There are two primary ways to select a subset of each element in the source sequence:

  1. To select just one member of the source element, use the dot operation. In the following example, assume that a Customer object contains several public properties including a string named City. When executed, this query will produce an output sequence of strings.

    var query = from cust in Customers
                select cust.City;
    
  2. To create elements that contain more than one property from the source element, you can use an object initializer with either a named object or an anonymous type. The following example shows the use of an anonymous type to encapsulate two properties from each Customer element:

    var query = from cust in Customer
                select new {Name = cust.Name, City = cust.City};
    

For more information, see Object and Collection Initializers (C# Programming Guide) and Anonymous Types (C# Programming Guide).

Transforming in-Memory Objects into XML

LINQ queries make it easy to transform data between in-memory data structures, SQL databases, ADO.NET Datasets and XML streams or documents. The following example transforms objects in an in-memory data structure into XML elements.

class XMLTransform
{
    static void Main()
    {            
        // Create the data source by using a collection initializer. 
        // The Student class was defined previously in this topic.
        List<Student> students = new List<Student>()
        {
            new Student {First="Svetlana", Last="Omelchenko", ID=111, Scores = new List<int>{97, 92, 81, 60}},
            new Student {First="Claire", Last="O’Donnell", ID=112, Scores = new List<int>{75, 84, 91, 39}},
            new Student {First="Sven", Last="Mortensen", ID=113, Scores = new List<int>{88, 94, 65, 91}},
        };

        // Create the query. 
        var studentsToXML = new XElement("Root",
            from student in students
            let x = String.Format("{0},{1},{2},{3}", student.Scores[0],
                    student.Scores[1], student.Scores[2], student.Scores[3])
            select new XElement("student",
                       new XElement("First", student.First),
                       new XElement("Last", student.Last),
                       new XElement("Scores", x)
                    ) // end "student"
                ); // end "Root" 

        // Execute the query.
        Console.WriteLine(studentsToXML);

        // Keep the console open in debug mode.
        Console.WriteLine("Press any key to exit.");
        Console.ReadKey();
    }
}

The code produces the following XML output:

< Root>
  <student>
    <First>Svetlana</First>
    <Last>Omelchenko</Last>
    <Scores>97,92,81,60</Scores>
  </student>
  <student>
    <First>Claire</First>
    <Last>O'Donnell</Last>
    <Scores>75,84,91,39</Scores>
  </student>
  <student>
    <First>Sven</First>
    <Last>Mortensen</Last>
    <Scores>88,94,65,91</Scores>
  </student>
</Root>

For more information, see Creating XML Trees in C# (LINQ to XML).

Performing Operations on Source Elements

An output sequence might not contain any elements or element properties from the source sequence. The output might instead be a sequence of values that is computed by using the source elements as input arguments. The following simple query, when it is executed, outputs a sequence of strings whose values represent a calculation based on the source sequence of elements of type double.

Note

Calling methods in query expressions is not supported if the query will be translated into some other domain. For example, you cannot call an ordinary C# method in LINQ to SQL because SQL Server has no context for it. However, you can map stored procedures to methods and call those. For more information, see Stored Procedures [LINQ to SQL].

class FormatQuery
{
    static void Main()
    {            
        // Data source. 
        double[] radii = { 1, 2, 3 };

        // Query.
        IEnumerable<string> query =
            from rad in radii
            select String.Format("Area = {0}", (rad * rad) * 3.14);

        // Query execution.  
        foreach (string s in query)
            Console.WriteLine(s);

        // Keep the console open in debug mode.
        Console.WriteLine("Press any key to exit.");
        Console.ReadKey();
    }
}
/* Output:
    Area = 3.14
    Area = 12.56
    Area = 28.26
*/

See Also

Tasks

How to: Combine Data with LINQ by Using Joins (Visual Basic)

Reference

select clause (C# Reference)

Concepts

LINQ Query Expressions (C# Programming Guide)

Other Resources

LINQ (Language-Integrated Query)

LINQ to SQL [LINQ to SQL]

LINQ to DataSet

LINQ to XML [from BPUEDev11]