Find Kernel driver memory leak using Windows Performance Analyzer

Ma K 5 Reputation points
2023-04-05T15:41:34.9533333+00:00

Hi, my paged pool keeps growing, and it's not a user-mode process causing the issue. User's image

I used poolmon.exe and figured out that the corresponding tag is "Fstr", but I could not find out what this tag is used for: User's image

Next, I created a trace file with xperf. In WPA it looks like this: User's image

OK, but what now? The "n/a" entries do not help much in figuring out exactly which driver causes the issue. Any help is highly appreciated. Regards, Manuel

Windows Performance Toolkit
A collection of Microsoft performance monitoring tools that produce in-depth performance profiles of Windows operating systems and applications.

3 answers

  1. Gary Nebbett 5,846 Reputation points
    2023-04-06T09:00:15.0366667+00:00

    Hello Manuel,

    I think that the pool trace information consists of two parts: pool allocation and free events (with stack traces) that occur during the trace, and a snapshot (state/summary) of pool allocations at the beginning and end of the trace. The snapshots only record per-tag statistics, with no stack traces for past events, and that is probably the source of the n/a (not available) entries in the Stack column.

    Just investigate the stacks that you do have for that tag; they should lead you to the driver that is using it.
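
    To illustrate the idea of working only from the events that carry stacks, here is a minimal sketch in Python. It assumes the allocation and free events have somehow been exported into a list of records; the field names (`type`, `tag`, `address`, `size`, `stack`) and the stack strings are hypothetical and are not the columns that WPA or xperf actually produce. It pairs frees with allocations by address and sums the bytes still outstanding per stack.

```python
from collections import defaultdict

# Hypothetical event records. Field names and stack strings are assumptions
# made for illustration; they are not real WPA/xperf output.
events = [
    {"type": "alloc", "tag": "Fstr", "address": 0x1000, "size": 256, "stack": "example.sys!AllocateBuffer"},
    {"type": "alloc", "tag": "Fstr", "address": 0x2000, "size": 128, "stack": "other.sys!MakeString"},
    {"type": "free",  "tag": "Fstr", "address": 0x2000, "size": 128, "stack": "other.sys!ReleaseString"},
    {"type": "alloc", "tag": "Fstr", "address": 0x3000, "size": 256, "stack": "example.sys!AllocateBuffer"},
    # Free of a block allocated before the trace started: no matching alloc.
    {"type": "free",  "tag": "Fstr", "address": 0x9000, "size": 64,  "stack": "other.sys!ReleaseString"},
    {"type": "alloc", "tag": "NSpg", "address": 0x4000, "size": 64,  "stack": "third.sys!Init"},
]


def outstanding_by_stack(events, tag):
    """Return {allocating stack: bytes still allocated} for one pool tag."""
    live = {}  # address -> (size, allocating stack)
    for ev in events:
        if ev["tag"] != tag:
            continue
        if ev["type"] == "alloc":
            live[ev["address"]] = (ev["size"], ev["stack"])
        elif ev["type"] == "free":
            # Ignore frees whose allocation happened before the trace began.
            live.pop(ev["address"], None)
    totals = defaultdict(int)
    for size, stack in live.values():
        totals[stack] += size
    return dict(totals)


leaks = outstanding_by_stack(events, "Fstr")
```

    In this made-up data only the `example.sys!AllocateBuffer` stack has memory that was never freed during the trace, which is the kind of stack worth looking at first.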

    Gary


  2. Gary Nebbett 5,846 Reputation points
    2023-04-07T06:01:15.5333333+00:00

    Hello Manuel,

    Analysing trace data in WPA does require quite a bit of experience. In this case, whenever pool is allocated or freed (which can happen in DPC routines and similar and be unrelated to the user-mode code that was interrupted), a stack trace (a sequence of pointers) is collected.

    The interpretation of kernel-mode addresses remains fairly constant (unless drivers are loaded/unloaded/re-loaded), but the interpretation of user-mode addresses depends on which process was "current" at the time the event occurred.

    My guess is that WPA is not really trying hard to determine how to interpret the user-mode addresses - it is not even trying to be consistent.

    Your main task was to determine which kernel-mode driver uses the Fstr pool tag, and the stack data was adequate for that.
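
    As a cross-check on what the stacks suggest, one can also search driver binaries for the four tag characters, which is the same idea as running `findstr` over the `.sys` files. The following Python sketch is only an illustration: the function name `files_containing_tag` is made up here, and a match is merely a hint, because the byte sequence can occur in a file by coincidence and drivers stored in other directories are not covered.

```python
from pathlib import Path


def files_containing_tag(directory, tag, pattern="*.sys"):
    """List file names in `directory` whose contents include the tag bytes.

    Hypothetical helper for illustration: a match means the four ASCII
    characters appear somewhere in the file, not that the driver is
    proven to allocate pool with this tag.
    """
    needle = tag.encode("ascii")
    matches = []
    for path in sorted(Path(directory).glob(pattern)):
        try:
            data = path.read_bytes()
        except OSError:
            # Skip files that cannot be read (locked or access denied).
            continue
        if needle in data:
            matches.append(path.name)
    return matches
```

    On a typical installation it would be called as `files_containing_tag(r"C:\Windows\System32\drivers", "Fstr")`; that directory is an assumption about where the driver files live.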

    Gary


  3. Gary Nebbett 5,846 Reputation points
    2023-04-11T08:32:26.0766667+00:00

    Hello Manuel,

    At an individual event level, pool allocation events look like this: User's image

    User's image

    User's image

    The top few frames of the stack show the path from the pool allocation routine that was called (e.g. ExAllocatePool2) to the internal pool routine that calls the ETW trace function. Immediately below that is the module that probably "used" the pool tag (0x6770534E, "NSpg"). The frames below this normally trace back to one of a few sources (a kernel/user mode transition, ntdll!RtlUserThreadStart, ntoskrnl!KiStartSystemThread, etc.); WPA displays stacks as "inverted trees", typically with one of these routines at its root.
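
    The relationship between the numeric tag value and its four-character form can be checked with a few lines of Python. This is a small sketch whose helper names are made up for illustration: the 32-bit value is read as four bytes in little-endian order, so 0x6770534E becomes "NSpg". It assumes all four bytes are printable ASCII; anything else is shown as a replacement character.

```python
import struct


def decode_pool_tag(value):
    """Turn a 32-bit pool tag value into its four-character string."""
    # Little-endian: the lowest byte of the value is the first character.
    return struct.pack("<I", value).decode("ascii", errors="replace")


def encode_pool_tag(tag):
    """Turn a four-character tag string into its 32-bit value."""
    return struct.unpack("<I", tag.encode("ascii"))[0]


tag_text = decode_pool_tag(0x6770534E)
tag_value = encode_pool_tag("Fstr")
```

    Here `tag_text` is "NSpg", and `tag_value` is 0x72747346, the value to look for if the Fstr tag is shown numerically in an event.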

    You will need to drill right down through the trees, to within a few steps of the leaves, to find the module using the pool tag.

    You could just share your trace file with me and I can help you look for the module.

    Gary
