Improving ObjectQuery.Include

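UPDATE: There’s a bug in the code below. See the follow-up post for the fix: http://blogs.msdn.com/stuartleeks/archive/2009/04/24/improving-objectquery-t-include-updated.aspx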

One of the great features of LINQ To SQL and LINQ To Entities is that the queries you write are checked by the compiler, which eliminates typing errors in your query. Unfortunately, the ObjectQuery<T>.Include function (which is used to eager-load data that isn’t directly included in the query) takes a string parameter, opening up opportunities for typos to creep back in. In this post I’ll present some sample code that illustrates one way that you can work round this. To start with, let’s take a quick look at a query against an entity model on the Northwind database.

 var query = from customer in context.Customers
            where customer.Orders.Count > 0
            select customer;

This query simply retrieves customers that have at least one order. If the code that consumes the query results then needs the order data, that data won’t have been loaded. We can use the Include function to ensure that it is loaded up front:

 var query = from customer in context.Customers.Include("Orders")
            where customer.Orders.Count > 0
            select customer;

Notice the Include("Orders") call that we’ve inserted, which instructs Entity Framework to retrieve the Orders for each Customer. It would be much nicer if we could use a lambda expression to specify which property to load:

 var query = from customer in context.Customers.Include(c => c.Orders)
            where customer.Orders.Count > 0
            select customer;

It turns out that this is very easy to achieve by using an extension method:

 using System;
using System.Data.Objects;
using System.Linq.Expressions;
using System.Reflection;

public static class ObjectQueryExtensions
{
    public static ObjectQuery<TSource> Include<TSource, TPropType>(this ObjectQuery<TSource> source, Expression<Func<TSource, TPropType>> propertySelector)
    {
        // The lambda body must be a simple member access such as c => c.Orders
        MemberExpression memberExpression = propertySelector.Body as MemberExpression;
        if (memberExpression == null)
        {
            throw new InvalidOperationException("Expression must be a member expression: " + propertySelector);
        }
        // Pass the member's name to the built-in string-based Include
        MemberInfo propertyInfo = memberExpression.Member;
        return source.Include(propertyInfo.Name);
    }
}

This Include extension method allows the query syntax above with the lambda expression. When the Include method is called, it inspects the expression tree. If the method is used as intended, the tree will describe accessing a member of the TSource class. We can then use the name of that member to call the original Include function.
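
To see the expression-tree inspection in isolation, here is a minimal sketch that runs without Entity Framework. The Customer and Order classes and the GetMemberName helper are hypothetical stand-ins for illustration, not part of the code above; the only assumption is that the lambda body is a simple member access.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical plain classes standing in for the generated entity types
public class Order { }

public class Customer
{
    public List<Order> Orders { get; set; }
}

public static class MemberNameDemo
{
    // Hypothetical helper: returns the name of the member accessed by the lambda body
    public static string GetMemberName<TSource, TProp>(Expression<Func<TSource, TProp>> selector)
    {
        MemberExpression member = selector.Body as MemberExpression;
        if (member == null)
        {
            throw new InvalidOperationException("Expression must be a member expression: " + selector);
        }
        return member.Member.Name;
    }

    public static void Main()
    {
        // The compiler builds an expression tree for the lambda rather than a delegate,
        // so the property is never actually read here
        Console.WriteLine(GetMemberName((Customer c) => c.Orders)); // prints "Orders"
    }
}
```

A lambda whose body is not a member access, such as c => c.Orders.Count + 1, would hit the exception, because its body is a binary expression.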

Whilst this solves the problem as I described it above, what if the code consuming the query also needed the order details? With the original Include function we can write:

 var query = from customer in context.Customers.Include("Orders.Order_Details")
            where customer.Orders.Count > 0
            select customer;

Notice that we can pass a path to the properties to include. The extension method we wrote doesn’t give us a way to handle this case. I imagine using something like the syntax below to describe this situation:

 var query = from customer in context.Customers.Include(c => c.Orders.SubInclude(o => o.Order_Details))
            where customer.Orders.Count > 0
            select customer;

Here, we’ve specified that we want to include Orders, and then also that we want to include Order_Details. Adding this support is a bit more code, but not too bad:

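 using System;
using System.Data.Objects;
using System.Data.Objects.DataClasses;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

public static class ObjectQueryExtensions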
{
    public static ObjectQuery<TSource> Include<TSource, TPropType>(this ObjectQuery<TSource> source, Expression<Func<TSource, TPropType>> propertySelector)
    {
        string includeString = BuildString(propertySelector);
        return source.Include(includeString);
    }
    private static string BuildString(Expression propertySelector)
    {
        switch(propertySelector.NodeType)
        {
            case ExpressionType.Lambda:
                LambdaExpression lambdaExpression = (LambdaExpression)propertySelector;
                return BuildString(lambdaExpression.Body);

            case ExpressionType.Quote:
                UnaryExpression unaryExpression = (UnaryExpression)propertySelector;
                return BuildString(unaryExpression.Operand);

            case ExpressionType.MemberAccess:
                MemberInfo propertyInfo = ((MemberExpression) propertySelector).Member;
                return propertyInfo.Name;

            case ExpressionType.Call:
                MethodCallExpression methodCallExpression = (MethodCallExpression) propertySelector;
                if (IsSubInclude(methodCallExpression.Method)) // check that it's a SubInclude call
                {
                    // argument 0 is the expression to which the SubInclude is applied (this could be member access or another SubInclude)
                    // argument 1 is the expression to apply to get the included property
                    // Pass both to BuildString to get the full expression
                    return BuildString(methodCallExpression.Arguments[0]) + "." +
                           BuildString(methodCallExpression.Arguments[1]);
                }
                // else drop out and throw
                break;
        }
        throw new InvalidOperationException("Expression must be a member expression or a SubInclude call: " + propertySelector.ToString());

    }

    private static readonly MethodInfo[] SubIncludeMethods;
    static ObjectQueryExtensions()
    {
        Type type = typeof (ObjectQueryExtensions);
        SubIncludeMethods = type.GetMethods().Where(mi => mi.Name == "SubInclude").ToArray();
    }
    private static bool IsSubInclude(MethodInfo methodInfo)
    {
        if (methodInfo.IsGenericMethod)
        {
            if (!methodInfo.IsGenericMethodDefinition)
            {
                methodInfo = methodInfo.GetGenericMethodDefinition();
            }
        }
        return SubIncludeMethods.Contains(methodInfo);
    }

    public static TPropType SubInclude<TSource, TPropType>(this EntityCollection<TSource> source, Expression<Func<TSource, TPropType>> propertySelector)
        where TSource : class, IEntityWithRelationships
        where TPropType : class
    {
        throw new InvalidOperationException("This method is only intended for use with ObjectQueryExtensions.Include to generate expression trees"); // not actually called at runtime - we just want the expression!
    }
    public static TPropType SubInclude<TSource, TPropType>(this TSource source, Expression<Func<TSource, TPropType>> propertySelector)
        where TSource : class, IEntityWithRelationships
        where TPropType : class
    {
        throw new InvalidOperationException("This method is only intended for use with ObjectQueryExtensions.Include to generate expression trees"); // not actually called at runtime - we just want the expression!
    }
}

This code still has the Include method with the original signature, and adds a couple of SubInclude extension methods. You can see that the code to extract the property name has been pulled out into a separate method (BuildString). This now also handles some additional NodeTypes so that we can handle the SubInclude calls inside the Include call. There are some checks in place to ensure that any method call we encounter really is a SubInclude call (using the IsSubInclude method). With this code, we can write the previous query as well as:

 var query = from customer in context.Customers.Include(c => c.Orders.SubInclude(o => o.Order_Details).SubInclude(od => od.Products))
            where customer.Orders.Count > 0
            select customer;

This query includes the Orders, Order Details, and Products as if we’d called Include("Orders.Order_Details.Products"), and in fact this is what the code will do! Additionally, it doesn’t matter whether you chain the SubInclude calls (as above) or nest them:

 var query = from customer in context.Customers.Include(c => c.Orders.SubInclude(o => o.Order_Details.SubInclude(od => od.Products)))
            where customer.Orders.Count > 0
            select customer;

Both of these queries have the same effect, so it’s up to you which style you prefer.
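
As the comments below mention, the follow-up post adds logic to the MemberAccess case to handle nested properties that aren’t collections, such as o => o.Customer.ShippingAddress. With the BuildString shown above, such an expression yields only the last member name. The following is a rough sketch of one way to walk the whole chain of member accesses. It is not the code from the follow-up post, and the Order, Customer and Address classes and the GetMemberPath helper are hypothetical names used purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical plain classes with single-valued (non-collection) navigation properties
public class Address { }

public class Customer
{
    public Address ShippingAddress { get; set; }
}

public class Order
{
    public Customer Customer { get; set; }
}

public static class MemberPathDemo
{
    // Hypothetical helper: builds a dotted path from a chain of member accesses
    public static string GetMemberPath<TSource, TProp>(Expression<Func<TSource, TProp>> selector)
    {
        // The outermost member (ShippingAddress) is visited first, so a stack
        // is used to reverse the order when the names are joined
        Stack<string> names = new Stack<string>();
        MemberExpression member = selector.Body as MemberExpression;
        while (member != null)
        {
            names.Push(member.Member.Name);
            // Step inwards; this becomes null once we reach the lambda parameter
            member = member.Expression as MemberExpression;
        }
        return string.Join(".", names.ToArray());
    }

    public static void Main()
    {
        Console.WriteLine(GetMemberPath((Order o) => o.Customer.ShippingAddress)); // prints "Customer.ShippingAddress"
    }
}
```

This only covers chains of plain property accesses. Collection properties still need SubInclude, because the collection type itself does not expose the properties of its items.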

The code isn’t production ready (and as always, is subject to the standard disclaimer: “These postings are provided "AS IS" with no warranties, and confer no rights. Use of included script samples are subject to the terms specified at https://www.microsoft.com/info/cpyright.htm”). However, there are a couple of other features that I think are worth briefly mentioning:

  • The IsSubInclude function works against cached MethodInfo objects for the SubInclude methods. Because these methods are generic, we have to get the generic method definition before comparing.
  • The SubInclude functions are not intended to be called at runtime - they are purely there to get the compiler to generate the necessary expression tree!
  • Generic type inference is at work when we write the queries. Notice that we could write Include(c => c.Orders) rather than spelling out both type arguments, as in Include<Customer, EntityCollection<Order>>(c => c.Orders). Imagine how much less readable the code would become, especially when including multiple levels.
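
The first point can be seen in a small standalone sketch. The Sample class, its Echo method and the Normalise helper are hypothetical and exist only to show how a constructed generic method compares with its definition.

```csharp
using System;
using System.Reflection;

// Hypothetical class with a single generic method
public static class Sample
{
    public static T Echo<T>(T value) { return value; }
}

public static class GenericDefinitionDemo
{
    // Mirrors the normalisation step in IsSubInclude
    public static MethodInfo Normalise(MethodInfo method)
    {
        if (method.IsGenericMethod && !method.IsGenericMethodDefinition)
        {
            return method.GetGenericMethodDefinition();
        }
        return method;
    }

    public static void Main()
    {
        // What reflection returns for the method as declared: Echo<T>
        MethodInfo definition = typeof(Sample).GetMethod("Echo");
        // What a call site inside an expression tree refers to: Echo<int>
        MethodInfo constructed = definition.MakeGenericMethod(typeof(int));

        Console.WriteLine(constructed.Equals(definition));            // prints "False"
        Console.WriteLine(Normalise(constructed).Equals(definition)); // prints "True"
    }
}
```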

I found it quite interesting putting this code together as it pulls in Extension Methods, LINQ To Entities and Expression Trees. The inspiration for this came from the LinkExtensions.ActionLink methods in the ASP.Net MVC framework, which do the same sort of thing for ActionLinks. I’m not sure if I’m entirely satisfied with the syntax for including multiple levels, but it is the best I’ve come up with so far. If you’ve any suggestions for how to improve it then let me know!

Comments

  • Anonymous
    September 08, 2008
    Maybe I’m biased because he’s on the same team as me (ahem, sorry for the plug :) ), but I really liked

  • Anonymous
    September 14, 2008
    ADO.Net Data Services allows you to expose your LINQ To Entities model (or LINQ To SQL model, or even

  • Anonymous
    October 14, 2008
    Stuart Leeks recently put together a great post on improving the 'Include' method on ObjectQuery<T>

  • Anonymous
    January 22, 2009
    Please help. Your example is nice; but, I think too complicated for me. I am just trying to avoid the hard-code in this query... var query = from customer in context.Customers.Include("Orders") ...so I want to do something like... var query = from customer in context.Customers.Include(MyEntities.Orders.TableName) ...but I cannot do that because Linq does not expose TableName/ColumnName like ADO (which it should, BTW)... ...so how can one get rid of that hardcode? Please advise. Thank you. -- Mark Kamoski

  • Anonymous
    January 22, 2009
    Hi Mark, If you add the ObjectQueryExtensions class to your project then instead of   var query = from customer in context.Customers.Include("Orders") you can use   var query = from customer in context.Customers.Include(customer=>customer.Orders) With the latter, the compiler verifies the customer.Orders property, so you avoid a hardcoded string. Note that with this approach, you don't have to worry what the table name is (in fact with Entity Framework, the entity could be coming from a number of tables). -- Stuart

  • Anonymous
    April 17, 2009
    If you’ve read my last post on Types of Auditing , you should be primed for this one; I’m looking at

  • Anonymous
    April 24, 2009
    Having spent some time using the sample from my previous post on ObjectQuery.Include, I’ve encountered

  • Anonymous
    April 24, 2009
    Stuart -- What exactly do you mean by saying "UPDATE: There’s a bug in the code below – see this post for the update"? Is this current page...   http://blogs.msdn.com/stuartleeks/archive/2008/08/27/improving-objectquery-t-include.aspx#9566780 ...the page to which the phrase "this post" is referring and is the code fixed inline already as I read the page? Is the phrase "this post" supposed to be a link to another page? Something else? Also, to further confuse me (easily done, I admit) your post "Friday, April 24, 2009 10:25 AM by Stuart Leeks" that says this... "Having spent some time using the sample from my previous post on ObjectQuery.Include, I’ve encountered" ...seems a bit truncated, no? Also, I did a diff on the code that we copied from this page about 2 months ago against the code that is on this page now and diff says there are no differences. So, where is the "UPDATE"? Can you help? Thank you. -- Mark Kamoski

  • Anonymous
    April 24, 2009
    Hi Mark, There should indeed have been a link on the "this post" text in the update message. Thanks for pointing it out - it's corrected now! The new code has some extra logic in the BuildString method (under the ExpressionType.MemberAccess case) to handle nested properties that aren't collections. The post on 24th April should continue with "... encountered a bug" and then give the updated code. Let me know if you are still having problems loading that page. -- Stuart

  • Anonymous
    April 24, 2009
    Stuart -- I have it now. Thank you VERY much for all your hard work. This is REALLY helping our work here so you should know we appreciate it. Thank you. -- Mark Kamoski

  • Anonymous
    July 31, 2009
    A better method that doesn't require nested include methods:

    public static ObjectQuery<T> Include<T, R>(this ObjectQuery<T> query, Expression<Func<T, R>> selector)
    {
        var path = new StringBuilder();
        for (var expression = selector.Body; expression is MemberExpression; expression = ((MemberExpression)expression).Expression)
        {
            path.Insert(0, "." + ((MemberExpression)expression).Member.Name);
        }
        path.Remove(0, 1);
        return query.Include(path.ToString());
    }

    This allows you to do context.Customers.Include(c => c.Orders.Order_Details.Products) instead of the ugly include method nesting.

  • Anonymous
    July 31, 2009
    Hi MegaShine, I'm not sure how your extension method would work (at least for the example you give), since Customer.Orders is a collection property and doesn't have an Order_Details property (although items in the collection do). So in this case I believe that you do need to use a nested include. In the case where the relationships being included are 1:1 rather than 1:many then you can indeed use the technique you mentioned to allow Include(o=>o.Property1.Property2) without nesting. In fact, I've added this to the code in my follow-up post that fixed a bug, but didn't call out this feature: http://blogs.msdn.com/stuartleeks/archive/2009/04/24/improving-objectquery-t-include-updated.aspx -- Stuart

  • Anonymous
    August 08, 2010
    blogs.microsoft.co.il/.../say-goodbye-to-the-hard-coded-objectquery-t-include-calls.aspx