Querying DataSets – Introduction to LINQ to DataSet

Greetings fellow data junkies! My name is Erick Thompson, a PM at Microsoft who is working on driving ADO.NET forward to utilize the latest and best technology that we have coming down the pipe. In that context, I am excited to tell you about LINQ to DataSet, a new and powerful way to write queries against the DataSet.

The DataSet is a very useful in-memory representation of data, and acts as the core of a wide variety of data based applications. When the DataSet is loaded with data, it is usually by way of a DataAdapter. The DataAdapter generally restricts the data that is loaded into the DataSet, as it is rarely practical for an entire database to be loaded into a single instance of a DataSet. Once the data is loaded, there is often a need to perform additional queries on the data. This is where LINQ to DataSet comes in.

Select, RowFilter and Find

The existing methods that can be used to write queries against the DataSet all utilize string based expressions, which have a SQL like syntax. While these expressions work quite well, they are restricted to a limited set of operators that exist in the DataSet, and due to the fact that they are string based, they do not provide any compile time checking for validity. In addition, they represent another query language that the developer needs to learn.

LINQ

LINQ is a way to make queries and set based operations first class citizens in the .NET world. It allows queries to be written in the development language, and provides compile time type checking. In addition, LINQ allows for the full power of the framework to be utilized when writing queries. LINQ to DataSets brings this power to your DataSet based application.

Query Expression and Method Query

There are two primary ways to use LINQ from your application, using Query Expressions and Method Queries. The advantage of using a Query Expression is that it looks very similar to a SQL statement, so the learning curve is not very steep. For example, if you wanted to find all the customers whose last name is Smith, and you wanted to select the first name, you could write the following Query Expression.

var query = from r in customerDataTable.AsEnumerable()

            where r.Field<string>("LastName") == "Smith"

            select r.Field<string>(“FirstName”);

As you might guess from the AsEnumerable and the semicolon, when this code is compiled, it actually turns into a sequence of method calls which represents the query. The above statement is equivalent to the follow Method Query.

var query = customerDataTable.AsEnumerable()

    .Where(dr => dr.Field<string>("LastName") == "Smith")

    .Select(dr => dr.Field<string>("FirstName"));

While this query is somewhat trivial, you can certainly see that the Query Expression is a much more user friendly way to write LINQ queries. However, the set of queries you can write with Query Expressions is a subset of those you can write with a Method Query. For a full list of what you can do with both query formats, and for a general introduction to LINQ, see the LINQ website.

And now for something completely different

While the above query is nice, and does not rely on a string based query language, it is a query that you could easily write with ADO.NET v2. LINQ gives you a lot more power, and gives you the ability to write some very interesting queries. Some of the queries just are not possible, or are quite difficult, using the current query solution.

As an example, if you wanted to quickly find the count of orders with a total over 100, you would currently have to create a new DataView with the correct filter, then either find the count from the DataView, or create a new computed column. It is not too bad, but all that code is error prone and far too much work. All you have to do is you have a mistake in one of your strings, and run-time exception, here I come! Instead, with LINQ to DataSet, you can write the following.

var query = orderDataTable.AsEnumerable()

    .Count(dr => dr.Field<int>("Total") > 100);

As another more complex example, if you have customers and orders tables, and you want to find the total order amount for each order and show the customer name, you could write a Method Query like the following.

var groupedOrders = customers.Join

    (orders,

    o => o.Field<string>("CustomerID"),

    c => c.Field<string>("CustomerID"),

    (c, o) => new {

                   Customer = c.Field<string>("CustomerName"),

                   OrderID = o.Field<int>("OrderID"),

                   Total = o.Field<decimal>("Total")

               }

    )

    .GroupBy(r => r.OrderID);

As a third example that really shows the power of LINQ, let us say that you have a function that calculates the distance from a given zip code to your shipping depot, and to minimize shipping time, you want to send out distant shipments first. Without LINQ, this operation would involve a lot of code, either creating a temporary DataTable will all the distances and zip codes, or writing an efficient sorting algorithm yourself. With LINQ, all you need is the following query.

public int Distance(int zip)

{

    // Compute the distance

}

var query = from dr in orderDataTable.AsEnumerable()

            orderby Distance(dr.Field<int>("Zip")) descending

            select dr.Field<int>("OrderID");

How cool is that?

Coming soon…

In future LINQ to DataSet posts, I will talk more about how we get type checking, handle nulls, and talk about some cool features of VB.NET that make the whole process easier.

Erick Thompson

ADO.NET Program Manager

LINQ to DataSet Part 2

LINQ to DataSet Part 3

Comments

  • Anonymous
    January 26, 2007
    Any chance we can fill datasets using linq?

  • Anonymous
    January 26, 2007
    These are very good news, and I'm really looking forward to being able to manipulate DataSets with LINQ in a production environment...

  • Anonymous
    January 29, 2007
    _uacct = "UA-1265582-1"; urchinTracker(); There are not a lot of guidelines about blogging, the one that

  • Anonymous
    January 30, 2007
    I have already said my piece on the Vista launch but also Office 2007 launches today which really rocks.

  • Anonymous
    February 01, 2007
    How about a disk streaming model for the dataset where it can be used for large data objects with dlink?

  • Anonymous
    February 01, 2007
    As part of the process of bringing our technology babies into the world, I wrote a series of blog postings

  • Anonymous
    February 02, 2007
    If you are happy to mix query syntax along side traditional coding, then LINQ against DataSets is looking...

  • Anonymous
    February 09, 2007
    Here are a few good links to Orcas material (some old and some new). There is a lot of good and bad material

  • Anonymous
    February 12, 2007
    Pingback from http://oakleafblog.blogspot.com/2007/02/linq-to-dataset-documentation-series.html --rj

  • Anonymous
    February 15, 2007
    Today, someone asked on an internal mailing list, what does LINQ offer, than I cannot do with DataSet.Select()?

  • Anonymous
    August 24, 2007
    With the introduction of LINQ to DataSet there finally exists a full featured query language for the

  • Anonymous
    October 18, 2007
    Time for another weekly roundup of news that focuses on .NET, agile and general development content:

  • Anonymous
    February 24, 2008
    Boa Tarde Pessoal, É com muito prazer que público o material referente a Seção Técnica desta sexta-feira

  • Anonymous
    April 10, 2008
    If you want to learn more about LINQ to DataSet, we have a complete chapter about it from LINQ in Action available as a free download here : http://manning.com/marguerie

  • Anonymous
    July 01, 2008
    Why not just straight SQL statements against data tables in a dataset??!!  This seems like a complete no brainer to me.   .NET is about six years old now.  Isn't that plenty of time to have implemented something so obvious?  Zero learning curve for existing database app developers, complete portability (or nearly so) of code whether the data resides on the server or in memory in the app, etc. etc.  Anything less than full SQL compatibility makes no sense.  While it may be better than nothing it certainly doesn't rise to the level of cool.

  • Anonymous
    January 20, 2009
    How to use LINQ to Dataset in the Windows Form Application with the VB

  • Anonymous
    March 19, 2009
    Linq to DataSet Visual Basic Example &#39; Fill the Customer table that is part of the AjaxDataSet Dim

  • Anonymous
    August 12, 2009
    Thanks for lovely blog. This really help me out in solving my bool related prb.