Exploring the Performance of the ADO.NET Entity Framework - Part 1

Performance Matters

No matter what type of application or service you are developing, at some point in its lifecycle performance matters; this is especially true when accessing data. In this article, I’m going to spend some time exploring the performance of the ADO.NET Entity Framework by breaking down the stack and showing how to speed up a simple query. I also explain the performance characteristics of the Entity Framework. This is the first post in a two-part series, so keep checking back.

 

So we’re all on the same page, here’s the configuration that I’m using to run my tests:

· Visual Studio 2008.

· SQL Express (installed with Visual Studio).

· ADO.NET Entity Framework Beta 3.

· Entity Framework Tools December 2007 CTP.

· I work on the Entity Framework team, so I have the symbols loaded.

· A C# console application built under the release mode configuration.

· I’m using Northwind as my database.

· I’m running on my laptop, which has a dual-core 2 GHz processor and 3 GB of RAM.

· I’m using the profiler that ships with Visual Studio.

 

The General Disclaimer

Just as a disclaimer, or a thought-provoking exercise, whichever suits you better, I want to explain why you would use the EF and the Entity Data Model (EDM) and how this relates to performance. While this is an important point, I’ll make it as briefly as I can. Whenever you add a layer of abstraction, such as an EDM that reshapes the relational schema of a database, there is going to be a performance cost. With that said, the ADO.NET team has taken on the challenge of minimizing that cost in this and in future releases.

Query Execution

I am using the following Northwind model as the basis for most of my exploration into performance.

 

 

Figure 1 Northwind Data model using the EDM designer

With this model, I can write the following code, which queries for the EntityType Order in the EntitySet Orders and then iterates over the returned result set.

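// Create the context; it is disposed at the end of the using block.
using (NorthwindEntities ne = new NorthwindEntities())
{
    // Enumerating the Orders EntitySet executes the query and returns each Order.
    foreach (Order o in ne.Orders)
    {
        int i = o.OrderID;  // Touch a property on each returned entity.
    }
}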

 

I ran this query through 10 iterations in my project, and each query iterated over a total of 848 rows of data. Here are the results of those ten iterations, in milliseconds.

First run (ms): 4241

Next 9 (ms): 13, 13, 14, 13, 13, 13, 13, 24, 14
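The post does not show how the timings were collected, so the following is only a minimal sketch of one way to gather per-iteration numbers like these. The QueryTimer class and its Measure method are hypothetical names introduced for illustration, and the Thread.Sleep call is a placeholder for the using/foreach block that enumerates ne.Orders.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical timing helper; not part of the Entity Framework or the original test project.
static class QueryTimer
{
    // Runs the supplied work 'iterations' times and returns the elapsed
    // wall-clock time of each run in milliseconds.
    public static long[] Measure(Action work, int iterations)
    {
        long[] timings = new long[iterations];
        for (int i = 0; i < iterations; i++)
        {
            Stopwatch watch = Stopwatch.StartNew();
            work();
            watch.Stop();
            timings[i] = watch.ElapsedMilliseconds;
        }
        return timings;
    }

    static void Main()
    {
        // Placeholder workload (assumption): replace the lambda body with the
        // code that creates the context and iterates over the query results.
        long[] timings = Measure(() => Thread.Sleep(5), 10);

        Console.WriteLine("First run (ms): " + timings[0]);
        for (int i = 1; i < timings.Length; i++)
        {
            Console.WriteLine("Run " + (i + 1) + " (ms): " + timings[i]);
        }
    }
}
```

Reporting the first run separately from the rest keeps the one-time startup cost from being averaged into the warm numbers.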

 

The first thing to notice is that the initial iteration takes 4241 milliseconds. The first time you create an ObjectContext and execute any operation that accesses the database, a few expensive operations occur. The pie chart below shows a breakdown, by percentage, of where the time is spent in the initial run.

 

 

Figure 2 – Code area where the first run spends execution time

Here’s what each of those sections means.

· Loading Metadata 11% – The metadata consists of conceptual, mapping, and logical views of the data model, as defined in the CSDL, MSL, and SSDL files, respectively. During the loading of the metadata, the EDM files are loaded in the MetadataWorkspace and cached in a global cache so other workspaces can take advantage of the existing metadata.

· Initializing Metadata 14% – During the initialization of the metadata, the connection is opened and the actual metadata information is retrieved from the ADO.NET Data Provider, which gets the information from the database.

· Opening Connection 8% – This is simply opening the database connection; the first time is usually the slowest.

· View Generation 56% – A big part of creating an abstracted view of the database is providing the actual view for queries and updates in the store’s native language. During this step, the store views are created. The good news is there is a way of making view generation part of the build process so that this step can be avoided at run time.

· Load Assembly 2% – The CLR types must be validated against the metadata.

· Tracking 1% – As part of the Object Services layer, there is a state manager that tracks changes made to entity objects. This is where objects are added to the state manager. Each object’s identity is created, and this entity key is used to search for instances of the same entity type with the same key in the state manager. If a match is found, the merge option is used to determine the next steps.

· Materialization 7% – This is the process of actually creating the object and filling in all the properties taken from the returned DbDataReader.

· Misc 1% – This includes all of the other small operations, such as execution of the SqlCommand, SQL Server query execution, and application code.
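To illustrate why a globally cached load makes only the first run pay the metadata cost, here is a minimal sketch of a load-once cache. It is a simplified model under stated assumptions, not the Entity Framework’s MetadataWorkspace: the MetadataCache class, its Get method, the LoadCount counter, and the string standing in for parsed metadata are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; not the Entity Framework's metadata cache.
static class MetadataCache
{
    private static readonly object Gate = new object();
    private static readonly Dictionary<string, string> Loaded =
        new Dictionary<string, string>();

    // Counts how many times the expensive load actually ran.
    public static int LoadCount;

    // Returns the cached metadata for a model, loading it on first request only.
    public static string Get(string modelName)
    {
        lock (Gate)
        {
            string metadata;
            if (!Loaded.TryGetValue(modelName, out metadata))
            {
                metadata = LoadExpensive(modelName);
                Loaded.Add(modelName, metadata);
            }
            return metadata;
        }
    }

    // Stand-in (assumption) for reading and parsing the CSDL, MSL, and SSDL files.
    private static string LoadExpensive(string modelName)
    {
        LoadCount++;
        return "metadata for " + modelName;
    }
}

static class Program
{
    static void Main()
    {
        MetadataCache.Get("Northwind");   // first request performs the load
        MetadataCache.Get("Northwind");   // second request is served from the cache
        Console.WriteLine(MetadataCache.LoadCount); // prints 1
    }
}
```

Because the cache is static, every later caller in the same process reuses the loaded result, which matches the pattern of one slow first run followed by fast ones.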

When I run the EDM Generator (EdmGen.exe) command-line tool in view generation mode (/mode:ViewGeneration), the output is a code file (either C# or Visual Basic) that I can include in my project. Having the views pre-generated reduces the startup time to 2933 milliseconds, a decrease of about 31% from the 4241 milliseconds measured above. For scenarios where view generation is the primary cost, such as when an object context is created for only a few queries, pre-generating these views and deploying them with your application is a good solution. The downside of this solution is the need to keep the generated views synchronized with changes to the model.

 

If we remove the cost of startup and take a look at just the cost to execute the query and return objects, here’s the breakdown of percent per operation.

 

Figure 3 - Warm query results

 

· ObjectContext construction 1.38% – This is the cost of creating the ObjectContext and looking up the existing metadata information based on the information already loaded from the first run. This is important for Web service and ASP.NET scenarios where the short-lived context is the programming pattern. Because the service itself lives for a long time, you’ll see the cost of metadata loading and query creation significantly reduced by caching.

· Query creation 11.02%– Each time a query is created in an ObjectContext, the Entity SQL query command is cached. This is the one-time overhead of query creation. The good news is that since query creation is cached, similar queries are executed faster. Later, I show how to use compiled LINQ queries to speed this process up even more.

· EntityKey creation 0.28% – No matter what merge option is used, there is always a cost for EntityKey creation.

· Relationship span 4.13% – For queries where the returned objects are tracked in the ObjectStateManager, we include the related ends. This is invisible to the user, except now EntityReferences have an EntityKey for relationships.

· Object lookup 1.38% – The cost of using an object’s EntityKey to determine if the object already exists in the ObjectStateManager.

· Materialization 73% – The cost of reading from the DbDataReader and creating an object.

· Misc 9.92% – Cost of the remaining stuff, including execution in SQL Server.

 

To better understand how the query process affects performance, here’s the basic logic of a query, in order of execution.

· The query is broken into parts so that query caching can occur, resulting in a query plan.

· The query is passed through to the .NET Framework data provider and executed against the database.

· The results are returned and the user iterates over the results.

· On each entity, the key properties are used to create an EntityKey.

· If the query is a tracking query then the EntityKey is used to do identity resolution.

· If the EntityKey is not found, the object is created and the properties copied into the object.

· The object is added to the ObjectStateManager for tracking.

· If merge options are used, then the properties follow the merge option rules.

· If the object is related to other objects in the ObjectContext, then the relationships between the entities are connected.
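The key creation, identity resolution, and tracking steps above can be sketched with a plain dictionary. This is a simplified model under stated assumptions, not the ObjectStateManager itself: OrderRow, Order, SimpleStateManager, and Resolve are hypothetical names, the integer OrderID stands in for an EntityKey, and the row values are sample data.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for illustration; these are not Entity Framework types.
class OrderRow { public int OrderID; public string ShipCity; }
class Order { public int OrderID; public string ShipCity; }

class SimpleStateManager
{
    // Tracked objects, indexed by key (an int here, in place of an EntityKey).
    private readonly Dictionary<int, Order> tracked = new Dictionary<int, Order>();

    public int Count { get { return tracked.Count; } }

    // Returns the tracked instance for the row's key, creating a new
    // object only when that key has not been seen before.
    public Order Resolve(OrderRow row)
    {
        int key = row.OrderID;                  // build the key from the key property

        Order existing;
        if (tracked.TryGetValue(key, out existing))
        {
            // Key found: reuse the tracked object. This sketch keeps the
            // existing values, which is only one possible merge rule.
            return existing;
        }

        Order created = new Order();            // key not found: create the object
        created.OrderID = row.OrderID;          // and copy the properties into it
        created.ShipCity = row.ShipCity;
        tracked.Add(key, created);              // add it to the tracker
        return created;
    }
}

static class Program
{
    static void Main()
    {
        SimpleStateManager manager = new SimpleStateManager();

        // Two result rows carrying the same key (sample values).
        Order first = manager.Resolve(new OrderRow { OrderID = 10248, ShipCity = "Reims" });
        Order second = manager.Resolve(new OrderRow { OrderID = 10248, ShipCity = "Reims" });

        Console.WriteLine(ReferenceEquals(first, second)); // prints True
        Console.WriteLine(manager.Count);                  // prints 1
    }
}
```

The sketch omits relationship fix-up and the other merge options; it only shows why a tracking query pays for a key lookup on every returned row.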

 

In my next post, I’ll show some of the performance improvements that can be made on the query itself and how Entity SQL and LINQ to Entities perform.

Brian Dawson
Program Manager, ADO.NET Team

Comments

  • Anonymous
    February 07, 2008
    A few days ago the ADO.NET team published an interesting post on Entity Framework performance

  • Anonymous
    February 07, 2008
    Two great posts appeared recently on the ADO.NET team blog. The first details some of the performance

  • Anonymous
    February 10, 2008
    So if it takes so much time to create the ObjectContext, should I create it beforehand? In other words, in a Windows app, can I just create the ObjectContext when the app loads and the user logs in, and then use the ObjectContext until the user logs out or closes the app? In the demos and examples I see these objects being created in a "using", implying that I should create it before every chunk of database operations. Can you please clarify that? Thanks!

  • Anonymous
    February 11, 2008
    Great question about the connection. The first opening does include getting additional information. We do open and close the connection when needed, and the times get faster over time. As for the ObjectContext and best practice, this really just depends on your application. The first run really is the slowest with the ObjectContext. There is more than just performance you need to take into account when considering using a longer-running ObjectContext. For example, do you need to track entities longer? If so, then "using" may not be the right pattern. The using makes a nice pattern for a short-lived context, such as in the ASP.NET scenarios.

  • Anonymous
    February 11, 2008
    Thank you very much Brian! The scenario I had in mind when suggested the use of a long running ObjectContext was a Windows app, in which tracking entities for a long time is required - an order taking app for instance. Thanks again.

  • Anonymous
    February 12, 2008
    Exploring the Performance of the ADO.NET Entity Framework

  • Anonymous
    February 13, 2008
    I have somewhat neglected my blog lately, but now that TechDays is over (from

  • Anonymous
    February 13, 2008
    In this post I discuss a number of good practices for improving the efficiency of the ADO.NET Entity Framework

  • Anonymous
    February 20, 2008
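    Overview: the first installment of the recommended-articles series after the Spring Festival, with 10 articles in total: 1. ASP.NET MVC Example Application over Northwind with the Entity Framework...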

  • Anonymous
    February 20, 2008
    As Unai pointed out a few days ago, the ADO.NET development team's blog is publishing a

  • Anonymous
    February 21, 2008
    The first post, Exploring the Performance of the ADO.NET Entity Framework - Part 1 , began: Performance

  • Anonymous
    February 28, 2008
    It's cool to have an idea of the performance of the EF. Nonetheless, I think a comparison must be shown between the dataset world and Linq to Entities. It would be interesting to have the overhead of each compared.

  • Anonymous
    March 27, 2008
    There have been a few questions from the last performance blog post about how the Entity Framework compares

  • Anonymous
    March 27, 2008
    The ADO.NET team has just published two new posts: the first covers the use of procedures

  • Anonymous
    May 01, 2008
    It doesn't take a performance test to tell you that EF will be MUCH, MUCH faster than any DataSet, which has to be some of the worst code in the entire BCL. I've written an ORM that uses tons of reflection, runtime schema analysis, runtime SQL generation, etc - and it outperformed typed DataSets (where queries and all such are generated at design-time) by several factors.

  • Anonymous
    May 24, 2008
    For everybody who is interested on internals of Entity Framework (ER) in relation to referential integrity

  • Anonymous
    May 31, 2008
    Performance Matters No matter what type of application you are developing at some point in the lifecycle of a software application or service, performance matters; this is especially true when accessing data. In this article, I’m going to spend some tim

  • Anonymous
    June 20, 2008
    Background To set the stage for this article, do take a look at Exploring the Performance of the ADO.NET

  • Anonymous
    July 01, 2008
    Concurrently with the finalization of the initial LINQ release bits, community previews of complementary

  • Anonymous
    August 12, 2008
    Why does actual metadata information need to be retrieved from the database in runtime, taking 14% of the exec time (Initializing Metadata)? I thought the local metadata in all those XML files would be sufficient. I'm sure it serves a purpose, but I don't like the idea of burdening a production db server with metadata queries.

  • Anonymous
    August 20, 2008
    And what about the performance in the latest EF release? Has something changed?

  • Anonymous
    October 01, 2008
    Yes, it would be interesting to know this for the latest EF release.

  • Anonymous
    January 11, 2009
    If we’ll skip exact details, we can say, that internal behavior of whole modeling and mapping is based

  • Anonymous
    April 08, 2009
    I would say a good page to start from is this document on MSDN: Performance Considerations for

  • Anonymous
    January 31, 2011
    this is nice post, but can you give brief introduction of comparing Linq to entities performance with stored procedures.

  • Anonymous
    March 28, 2012
    i totally agree with Morten Mertner i don't use entity framework anymore in my applications, because it is extremely SLOW!!! i tried some steps proposed in this post (not all). even if i try, how much time it will take to setup???!!!! so my conclusion is: i prefer to use old ado.net instead of this problematic evolution for nothing. thanks you, bye

  • Anonymous
    April 02, 2012
    @kamal101 - Be sure to take a look at this post about the performance improvements in EF blogs.msdn.com/.../sneak-preview-entity-framework-5-0-performance-improvements.aspx

  • Anonymous
    December 17, 2013
    Is it possible to repeat these tests for version 5.0 of the EF? I have been struggling to get the symbols loaded and also understand what elements need to be inspected to get the %s as above.

  • Anonymous
    December 18, 2013
    @Mark - We are planning to put together an article with some numbers for the latest releases, but our team is heads down working on releases/features at the moment so it won't be immediately. Here is some helpful documentation about performance in EF - msdn.microsoft.com/.../hh949853. It was written for EF5 but still applies to EF6.