Tip 15 - How to avoid loading unnecessary Properties

UPDATE: Made a couple of important corrections re-which Original Values are required.

Problem:

Imagine if you query blog posts:

var myPosts = from post in ctx.Posts
orderby post.Created descending
select post;

Just so you can output the post titles etc.

foreach(var post in myPosts)
{
Console.WriteLine("{0} on {1}", post.Title, post.Created);
}

Well you've just done a lot of wasted work loading all the properties you don't actually need.

Read-only Solution:

For read-only situations the solution is trivial.

You just do a projection:

var myPosts = from post in ctx.Posts
orderby post.Created descending
select new {post.Title, post.Created};

Now you've avoided loading the properties you don't actually need.

This is particularly important for entities with lots of properties or with any properties mapped to a blob column in database, for example something like a Body property mapped to an nvarchar(max) column.

Read-write Solution:

But what if you want to modify the Entity?

Now a Projection is no good, because unless you have a whole object you don't get any object services, which means no update.

Hmm...

As always the key to coming up with a solution is to understand how the Entity Framework works.

When updating an Entity, the Entity Framework, sends updates to the database in this form (pseudo code):

UPDATE [Table]
SET
ModifiedProperty1 = NewValue1,
ModifiedProperty2 = NewValue2,
...
ModifiedPropertyN = NewValueN
WHERE
KeyProperty = KeyValue AND
ModifiedProperty1 = OriginalValue1 AND
ModifiedProperty2 = OriginalValue2 AND
...
ModifiedPropertyN = OriginalValueN
 

Notice that properties that aren't modified don't appear anywhere in the update Command.

Key Insight: This means you only need to know original property values for keys*

With this insight in mind we can try something like this:

  1. Project just the columns we need to read-write
  2. Fabricate an entity from the projection, ignoring columns we don't care about.
  3. Attach(...) that 'partially correct' entity
  4. Make necessary changes to the entity
  5. SaveChanges(...)

Which will allow us to change our entity without materializing properties we aren't interested in!

Here is some working code that does just that:

// Project just the columns we need
var myPosts = from post in ctx.Posts
orderby post.Created descending
select new {post.ID, post.Title};

// Fabricate new Entities in memory.
// Notice the use of AsEnumerable() to separate the in db query
// from the LINQ to Objects construction of Post entities.
var fabricatedPosts = from p in myPosts.AsEnumerable()
select new Post{ID = p.ID, Title = post.Title};

// Now we attach the posts
// And call a method to modify the Title
foreach(var p in fabricatedPosts)
{
ctx.AttachTo("Posts", p);
p.Title = ChangeTitle(p.Title);
}
ctx.SaveChanges();    

Notice we only retrieved the ID property (the key) and the Title property (the thing we wanted to modify) but we still managed to make updates.

TA DA!

*Caveats / Complications:

This tip doesn't apply if you use Stored Procedures to update the entity.

If you think about the way the stored proc works you can see why. When using a stored procedure for update, all current values (and some original values) are mapped to parameters regardless of whether things have been modified or not. Which basically means you have to have all original values :(

In addition you sometimes need to tell the Entity Framework a few other original values, because without them your update won't succeed:

  • Concurrency Properties: The original value of Concurrency Properties is included in the where clause of the update, to make sure you can only update the database if you display knowledge of the current database version. So without the correct original value the update won't succeed.
  • EntityReference EntityKey values: You will need know the original value for 0..1 relationships too, even if you aren't changing the relationship. For example if an Order has a Customer you will need to know the CustomerReference.EntityKey, then you can initialize a new Order with the relationships already established. This problem goes away in the future if you use FK Properties (coming in .NET 4.0).
  • Properties referenced by C-Side Mapping Conditions: the value of a C-Side mapping condition is used to working out which mapping applies, so without the correct original value the correct update mapping can't be established. Most people don't use this feature.

Comments

  • Anonymous
    April 24, 2009
    PingBack from http://blogs.msdn.com/alexj/archive/2009/03/26/index-of-tips.aspx
  • Anonymous
    April 26, 2009
    One passing comment about the where portion.WHERE    KeyProperty = KeyValue AND    ModifiedProperty1 = OldValue1 AND    ModifiedProperty2 = OldValue2 AND    ...    ModifiedPropertyN = OldValueNIf the key uniquely identifies a particular row, why require old values? In the leanest scenario, you would only need ID column(s) to generate an update statement as the end result.
  • Anonymous
    April 27, 2009
    Tanveer,You are of course right. I've updated the tip accordingly. Thanks for that.Alex
  • Anonymous
    April 27, 2009
    Thanks to I've made a few important corrections to Tip 15 . I've corrected the bit that said the UPDATE