New version of TraceEvent / PerfMonitor Posted to bcl.codeplex.com

For several years now, I have had code called the 'TraceEvent library' that allows you to access ETW files (ETL files) from C#.  

However, for over a year now I have not updated the public version of that library.   Well, that time has ended.

I updated the TraceEvent library as well as the PerfMonitor sample at https://bcl.codeplex.com to sync up with the latest internal versions.   The TraceEvent library is version 1.2.7 (corresponding to the version of PerfView that uses it), and the version of PerfMonitor is Version 2.0. 

For those of you who don't know, the TraceEvent library is the power behind the PerfView tool, and that makes it pretty powerful.   With it you can

  1. Use the TraceEventSession class to turn on ETW providers, including the Windows Kernel provider (CPU sampling) and the .NET Runtime providers.  You can turn on capturing stack traces for most events.
  2. Tell the session to either log the data to an (ETL) file or send it to an in-memory buffer for 'real time' consumption of the events.
  3. Use the TraceEventSource class to read the resulting data (either ETL files or the in-memory buffer), and parse the resulting payloads.
  4. Rely on the library's built-in support for the Kernel and .NET Runtime providers.
  5. Parse System.Diagnostics.Tracing.EventSource data using the 'DynamicTraceEventParser' class.  Thus you can always properly parse EventSource data.
  6. Use the RegisteredTraceEventParser, which knows how to decode any 'Manifest Based' provider (basically any OS supplied provider).
  7. Use the 'TraceLog' class, a high level abstraction which knows about Processes, Threads, Modules, and how to decode Stacks.  It has routines (the DiaLib) that know how to look up addresses in PDBs to get symbolic names, as well as get symbolic names for Just in Time (JIT) compiled code.   The result is that you can get at symbolic names for stacks (this is how PerfView works).
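
As a rough illustration of items 1 through 4, here is a minimal sketch that collects .NET Runtime GC events to an ETL file for a few seconds and then reads them back. It makes some assumptions: the namespaces shown are those of the later NuGet-packaged TraceEvent releases and may differ in version 1.2.7, the session name and file name are made up for the example, and the program must run with administrator rights on Windows.

```csharp
using System;
using System.Threading;
using Microsoft.Diagnostics.Tracing;          // assumed namespaces (later NuGet releases);
using Microsoft.Diagnostics.Tracing.Parsers;  // the CodePlex 1.2.7 drop may name them differently
using Microsoft.Diagnostics.Tracing.Session;

class EtwSketch
{
    static void Main()
    {
        string etlPath = "sketch.etl";   // hypothetical output file

        // Items 1 and 2: create a session that logs to a file and turn on
        // the .NET Runtime provider, restricted here to GC events.
        using (var session = new TraceEventSession("SketchSession", etlPath))
        {
            session.EnableProvider(
                ClrTraceEventParser.ProviderGuid,
                TraceEventLevel.Informational,
                (ulong)ClrTraceEventParser.Keywords.GC);

            Thread.Sleep(5000);          // collect for five seconds
        }                                // disposing the session stops it

        // Items 3 and 4: read the file back and let the built-in .NET Runtime
        // parser hand us strongly typed events.
        int gcStarts = 0;
        using (var source = new ETWTraceEventSource(etlPath))
        {
            source.Clr.GCStart += data =>
            {
                gcStarts++;
                Console.WriteLine("{0,10:F3} ms  GC start in process {1}",
                    data.TimeStampRelativeMSec, data.ProcessID);
            };
            source.Process();            // dispatches every event, then returns
        }
        Console.WriteLine("GC starts seen: " + gcStarts);
    }
}
```

The same callback pattern should apply to the other parsers mentioned above; only the parser you subscribe to changes.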

In short, most of what you see PerfView do with an ETL file, you can do yourself.   Now if you just need a 'tweak' of what PerfView does, I recommend using PerfView's extensibility model to do the job (see the Help -> Extending PerfView menu entry in PerfView).  However, if you want something 'stand alone', TraceEvent is a good choice.

If you have used TraceEvent in the past, there are some small breaking changes, but the port should be easy (I updated PerfMonitor in a couple of hours).   There have been numerous fixes, as well as the addition of the RegisteredTraceEventParser and much better support for symbolic resolution (the DiaLib stuff is new).   It is worth upgrading to.

The PerfMonitor utility can be thought of as a 'command line' version of PerfView.   However, I don't recommend using it as an actual tool; in almost all cases, PerfView is a better tool than PerfMonitor.   PerfMonitor is more of a sample of how to use TraceEvent in non-trivial ways (it is easier to understand without all the GUI goo cluttering up the logic).

I will probably post more detailed 'how to's of using ETW with TraceEvent in future blog posts.

Vance

Comments

  • Anonymous
    January 08, 2013
    The comment has been removed

  • Anonymous
    January 09, 2013
    You have found a bug I introduced porting this to the CodePlex version. I believe I have fixed this and have uploaded a new version of both PerfView (V2.01) and TraceEvent (V1.2.7.1). Try it with these updates. Sorry about the inconvenience.

  • Anonymous
    February 07, 2013
    PerfView gives me different thread times compared to WPA. On my Windows 7 x64 machine (.NET 4.0) all code is NGenned, and I get reports of 30% or higher broken stacks. With PerfView I get in the CPU Stacks View
      Thread (4988) CPU=8388ms
    With WPA
      NewThreadId 4988 CPU Usage (ms) 7407,889075
    Which numbers should I believe? I analyzed the same ETL file, which was generated with WPRUI.exe. The broken stacks are an artifact of code generated by DataContractSerializer. I also noticed that the original assemblies from the GAC seem to be loaded, although only NGenned assemblies should be used. Is this due to emitting code via DynamicMethod, which causes a load of the raw IL assemblies from the GAC again?

  • Anonymous
    February 11, 2013
    I can't comment on the discrepancy between PerfView and WPA you give above without seeing the detailed data.    WPA has two views of CPU data, one using CPU samples (like PerfView's CPU view), and another using context switches (which is like PerfView's 'Thread Time view' filtered to only include 'CPU' stacks).   Certainly these can be different.    It is also true that WPA does try to 'subtract out' CPU that is on a particular thread but was doing OS specific operations (what are called DPCs and interrupts).  This MAY be the cause, but there are plenty of other possibilities.   I do note, however, that you have the ability to dig into it.   Both are viewers of the same ETW data which allow you to isolate where the CPU came from (both in stacks and in time).  Thus you can compare the two and see exactly what samples are contributing to the discrepancy.

    On your question about loading assemblies, please keep in mind that 'loading' the IL and NGEN image is not as 'expensive' as you may be thinking.   The .NET Runtime tries to avoid loading both, but there are a number of cases where it does in fact load both.  Typically this only matters for 32 bit processes that load GBs of images/heap, where you literally run out of address space.   For 64 bit processes, 'loading' both IL and NGEN images almost never matters from a performance perspective.     Exactly why it is loading both in your case would also need deeper analysis, which is probably not worth it (since it should not affect perf in most cases).

  • Anonymous
    February 18, 2013
    In my trace of a 64-bit .NET 2.0 process, AeXSvc.exe, I see over 36% broken frames, and method names like: "Process64 AeXSvc (1480)", which isn't a method name. Could I be doing something wrong? Thanks!

  • Anonymous
    February 18, 2013
    I assume you are using the PerfView tool.  As mentioned in the popup and in the 'TroubleShooting' link to the documentation, BROKEN frames are expected on 64 bit processes before Win8.    See the docs for more on the work-arounds (NGEN or looking at the stack parts that are there).     Note that the names in stacks may be things besides methods, and in particular can be processes, or threads (or ROOT, or other PSEUDO frames).

  • Anonymous
    February 20, 2013
    Thanks for the reply, Vance. I was using PerfMonitor. I tried PerfView and got the same results. It looks like my only chance will be NGEN. I'll give it a try.