Cracking Profile-Guided Optimization profile data with PGOMGR

In a previous post I updated some of the information in my MSDN whitepaper on PGO.  In this entry I'm going to go into more depth on one of the tools that is useful for PGO: PGOMGR, which stands for PGO Manager.

What the heck is this tool?  Well, it helps manage the .pgc and .pgd files that get generated when you build and run instrumented PGO code.  Most PGO usage is well documented, but one thing that is not is the summary report that PGOMGR generates.  These reports are pretty cool, as they give some interesting information about the profile data.

You can run PGOMGR on the .pgd file generated by either an instrumented build or an optimized PGO build.

>pgomgr /summary codegen.pgd
Microsoft (R) Profile Guided Optimization Manager 8.00.50727.15
Copyright (C) Microsoft Corporation. All rights reserved.

PGD File: codegen.pgd 08/23/2005 08:26:00
Module Count: 1 Function Count: 9 Arc Count: 16 Value Count: 4

Static instructions: 264 Basic blocks: 63 Average BB size: 4.2
Dynamic instructions: 727851972

                   entry  static     dynamic       %     run
Function Name      count   instr       instr   total   total
mangle2              200      60   661679600    90.9    90.9
muldiv2         13233500       5    66167500     9.1   100.0
docalc2                1      35        4817     0.0   100.0
_main                  1      62          55     0.0   100.0
docalc                 0      35           0     0.0   100.0
doubleit               0       1           0     0.0   100.0
muldiv                 0       5           0     0.0   100.0
tripleit               0       1           0     0.0   100.0
mangle                 0      60           0     0.0   100.0

Let's look at this report in more depth.
Module Count is the number of modules in this .pgd file.  This is pretty much always 1.

Function Count is the number of functions represented in the .pgd file. 

Arc Count and Value Count are the numbers of probes inserted in non-Multiple Data Set functions.  You're probably wondering, "What are Multiple Data Set functions?"  Effectively, these are functions that are inline candidates.  So what do these Arc Counts and Value Counts really mean to the end user?  Unfortunately, not a whole lot.  At some future date we need to fix this -- today, simply ignore them.

Static instructions is the number of instructions in the code.  This counts neither source statements nor assembly language instructions, but rather instructions in our intermediate representation.  These actually map pretty closely to assembly language instructions though.

Basic blocks is the number of basic blocks in the code.  A basic block is a maximal block of code with a single entry point and a single exit point.

Average BB size is the average number of instructions per basic block. 
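As a quick sanity check, the average can be recomputed from the two counts printed on the same header line. The snippet below is only a sketch in Python and not part of PGOMGR; the regular expression and the name `parse_static_header` are assumptions of mine, based on the line layout shown in the report above.

```python
import re

# Header line copied from the /summary output above.
HEADER = "Static instructions: 264 Basic blocks: 63 Average BB size: 4.2"


def parse_static_header(line):
    """Return (static_instructions, basic_blocks, reported_average)."""
    m = re.match(
        r"Static instructions:\s*(\d+)\s+Basic blocks:\s*(\d+)\s+"
        r"Average BB size:\s*([\d.]+)",
        line,
    )
    if m is None:
        raise ValueError("unrecognized header line: %r" % line)
    return int(m.group(1)), int(m.group(2)), float(m.group(3))


static_instr, blocks, reported = parse_static_header(HEADER)
average = static_instr / blocks  # instructions per basic block
print("%d / %d = %.1f (reported %.1f)" % (static_instr, blocks, average, reported))
```

For this .pgd file, 264 static instructions over 63 basic blocks rounds to the 4.2 that PGOMGR prints.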

Dynamic instructions is the number of instructions executed while running the scenarios.

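Entry count is the number of times each function was called.  Static instr and dynamic instr are the per-function versions of the counts above.  Judging from the numbers, % total is each function's share of all dynamic instructions, and run total is the running sum of those shares down the list.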
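The two percentage columns can be reproduced from the dynamic instruction counts alone. The sketch below is in Python; the formulas are inferred from the numbers in this report rather than taken from any PGOMGR documentation, and `with_totals` is a hypothetical helper name.

```python
# (function name, entry count, static instr, dynamic instr), copied from
# the /summary report above in the order PGOMGR printed them.
ROWS = [
    ("mangle2", 200, 60, 661679600),
    ("muldiv2", 13233500, 5, 66167500),
    ("docalc2", 1, 35, 4817),
    ("_main", 1, 62, 55),
    ("docalc", 0, 35, 0),
    ("doubleit", 0, 1, 0),
    ("muldiv", 0, 5, 0),
    ("tripleit", 0, 1, 0),
    ("mangle", 0, 60, 0),
]


def with_totals(rows):
    """Return (name, % total, run total) per row, rounded to one decimal."""
    total_dynamic = sum(dynamic for _, _, _, dynamic in rows)
    running = 0
    result = []
    for name, _entry, _static, dynamic in rows:
        running += dynamic
        # Assumption: both columns are percentages of all dynamic instructions.
        pct = 100.0 * dynamic / total_dynamic if total_dynamic else 0.0
        run = 100.0 * running / total_dynamic if total_dynamic else 0.0
        result.append((name, round(pct, 1), round(run, 1)))
    return result


for name, pct, run in with_totals(ROWS):
    print("%-10s %6.1f %6.1f" % (name, pct, run))
```

The per-function dynamic counts add up to the 727851972 shown in the header, and the first two rows come out as 90.9/90.9 and 9.1/100.0, matching the report.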

You may want even finer granularity in the profile data.  To get it, use the /detail switch in conjunction with the /summary switch.  Here is the output for the same .pgd file:

>pgomgr /summary /detail codegen.pgd
Microsoft (R) Profile Guided Optimization Manager 8.00.50727.15
Copyright (C) Microsoft Corporation. All rights reserved.

PGD File: codegen.pgd 08/23/2005 22:02:42
Module Count: 1 Function Count: 9 Arc Count: 16 Value Count: 4

Static instructions: 264 Basic blocks: 63 Average BB size: 4.2
Dynamic instructions: 727851972

                   entry  static     dynamic       %     run
Function Name      count   instr       instr   total   total
tripleit               0       1           0     0.0     0.0
  Blk 1: 62- 63 1 (100.0%)s 0 (-1.$%)d
doubleit               0       1           0     0.0     0.0
  Blk 1: 57- 58 1 (100.0%)s 0 (-1.$%)d
mangle2              200      60   661679600    90.9    90.9
  Blk 1: 31- 35 4 ( 6.7%)s 800 ( 0.0%)d
  Blk 2: 35- 35 3 ( 5.0%)s 39701100 ( 6.0%)d
      taken ( 4) 200, not-taken 13233500
  Blk 3: 35- 36 8 (13.3%)s 105868000 (16.0%)d
  Blk 4: 38- 38 3 ( 5.0%)s 600 ( 0.0%)d
  Blk 5: 38- 38 3 ( 5.0%)s 39701100 ( 6.0%)d
      taken ( 7) 200, not-taken 13233500
  Blk 6: 38- 46 16 (26.7%)s 211736000 (32.0%)d
  Blk 7: 48- 48 3 ( 5.0%)s 600 ( 0.0%)d
  Blk 8: 48- 48 3 ( 5.0%)s 79401600 (12.0%)d
      taken ( 13) 200, not-taken 26467000
  Blk 9: 49- 49 5 ( 8.3%)s 132335000 (20.0%)d
      taken ( 12) 26466900, not-taken 100
  Blk 10: 49- 49 2 ( 3.3%)s 200 ( 0.0%)d
      taken ( 12) 100, not-taken 0
  Blk 11: 50- 50 5 ( 8.3%)s 0 ( 0.0%)d
  Blk 12: 48- 48 2 ( 3.3%)s 52934000 ( 8.0%)d
  Blk 13: 52- 53 3 ( 5.0%)s 600 ( 0.0%)d
mangle                 0      60           0     0.0    90.9
  Blk 1: 5- 9 4 ( 6.7%)s 0 (-1.$%)d
  Blk 2: 9- 9 3 ( 5.0%)s 0 (-1.$%)d
      taken ( 4) 0, not-taken 0
  Blk 3: 9- 10 8 (13.3%)s 0 (-1.$%)d
  Blk 4: 12- 12 3 ( 5.0%)s 0 (-1.$%)d
  Blk 5: 12- 12 3 ( 5.0%)s 0 (-1.$%)d
      taken ( 7) 0, not-taken 0
  Blk 6: 12- 20 16 (26.7%)s 0 (-1.$%)d
  Blk 7: 22- 22 3 ( 5.0%)s 0 (-1.$%)d
  Blk 8: 22- 22 3 ( 5.0%)s 0 (-1.$%)d
      taken ( 13) 0, not-taken 0
  Blk 9: 23- 23 5 ( 8.3%)s 0 (-1.$%)d
      taken ( 12) 0, not-taken 0
  Blk 10: 23- 23 2 ( 3.3%)s 0 (-1.$%)d
      taken ( 12) 0, not-taken 0
  Blk 11: 24- 24 5 ( 8.3%)s 0 (-1.$%)d
  Blk 12: 22- 22 2 ( 3.3%)s 0 (-1.$%)d
  Blk 13: 26- 27 3 ( 5.0%)s 0 (-1.$%)d
docalc2                1      35        4817     0.0    90.9
  Blk 1: 28- 33 1 ( 2.9%)s 1 ( 0.0%)d
  Blk 2: 33- 33 2 ( 5.7%)s 402 ( 8.3%)d
      taken ( 4) 1, not-taken 200
  Blk 3: 33- 34 3 ( 8.6%)s 600 (12.5%)d
  Blk 4: 36- 37 5 (14.3%)s 5 ( 0.1%)d
  Blk 5: 37- 37 2 ( 5.7%)s 402 ( 8.3%)d
      taken ( 7) 1, not-taken 200
  Blk 6: 37- 40 12 (34.3%)s 2400 (49.8%)d
  Blk 7: 42- 42 1 ( 2.9%)s 1 ( 0.0%)d
  Blk 8: 42- 42 2 ( 5.7%)s 402 ( 8.3%)d
      taken ( 10) 1, not-taken 200
  Blk 9: 42- 43 3 ( 8.6%)s 600 (12.5%)d
  Blk 10: 45- 46 4 (11.4%)s 4 ( 0.1%)d
docalc                 0      35           0     0.0    90.9
  Blk 1: 9- 13 1 ( 2.9%)s 0 (-1.$%)d
  Blk 2: 13- 13 2 ( 5.7%)s 0 (-1.$%)d
      taken ( 4) 0, not-taken 0
  Blk 3: 13- 14 3 ( 8.6%)s 0 (-1.$%)d
  Blk 4: 16- 17 5 (14.3%)s 0 (-1.$%)d
  Blk 5: 17- 17 2 ( 5.7%)s 0 (-1.$%)d
      taken ( 7) 0, not-taken 0
  Blk 6: 17- 20 12 (34.3%)s 0 (-1.$%)d
  Blk 7: 22- 22 1 ( 2.9%)s 0 (-1.$%)d
  Blk 8: 22- 22 2 ( 5.7%)s 0 (-1.$%)d
      taken ( 10) 0, not-taken 0
  Blk 9: 22- 23 3 ( 8.6%)s 0 (-1.$%)d
  Blk 10: 25- 26 4 (11.4%)s 0 (-1.$%)d
muldiv2         13233500       5    66167500     9.1   100.0
  Blk 1: 16- 17 5 (100.0%)s 66167500 (100.0%)d
muldiv                 0       5           0     0.0   100.0
  Blk 1: 11- 12 5 (100.0%)s 0 (-1.$%)d
_main                  1      62          55     0.0   100.0
  Blk 1: 22- 29 9 (14.5%)s 9 (16.4%)d
      taken ( 3) 0, not-taken 1
  Blk 2: 30- 32 6 ( 9.7%)s 6 (10.9%)d
  Blk 3: 32- 32 1 ( 1.6%)s 0 ( 0.0%)d
  Blk 4: 33- 33 2 ( 3.2%)s 2 ( 3.6%)d
  Blk 5: 33- 33 3 ( 4.8%)s 3 ( 5.5%)d
  Blk 6: 33- 33 1 ( 1.6%)s 1 ( 1.8%)d
      taken ( 8) 1, not-taken 0
  Blk 7: 34- 37 6 ( 9.7%)s 0 ( 0.0%)d
  Blk 8: 38- 39 5 ( 8.1%)s 5 ( 9.1%)d
  Blk 9: 41- 43 13 (21.0%)s 13 (23.6%)d
  Blk 10: 43- 43 1 ( 1.6%)s 1 ( 1.8%)d
  Blk 11: 43- 43 4 ( 6.5%)s 4 ( 7.3%)d
  Blk 12: 43- 43 1 ( 1.6%)s 1 ( 1.8%)d
  Blk 13: 43- 45 10 (16.1%)s 10 (18.2%)d

Module name                     Function name   Block %   Arc %  Instr %
c:\documents and settings\kang  tripleit          Never executed
c:\documents and settings\kang  doubleit          Never executed
c:\documents and settings\kang  mangle2            92.3    88.2     91.7
    block 10: branch never falls through
    block 11: never executed
c:\documents and settings\kang  mangle            Never executed
c:\documents and settings\kang  docalc2           100.0   100.0    100.0
c:\documents and settings\kang  docalc            Never executed
c:\documents and settings\kang  muldiv2           100.0   100.0    100.0
c:\documents and settings\kang  muldiv            Never executed
c:\documents and settings\kang  _main              84.6    71.4     88.7
    block 1: branch never taken
    block 3: never executed
    block 6: branch never falls through
    block 7: never executed

Overall block: 54.0% arc: 51.4% inst: 56.8%
functions called: 44.4%

/// Done with PGOMGR Output

You'll notice that one difference is that each function now has statements like the following below its name:

                   entry  static     dynamic       %     run
Function Name      count   instr       instr   total   total
_main                  1      62          55     0.0   100.0
  Blk 1: 22- 29 9 (14.5%)s 9 (16.4%)d
      taken ( 3) 0, not-taken 1
  Blk 2: 30- 32 6 ( 9.7%)s 6 (10.9%)d
  Blk 3: 32- 32 1 ( 1.6%)s 0 ( 0.0%)d

This says that basic block 1 spans lines 22 through 29 of the source file, and basic block 2 spans lines 30 through 32.  Next, it says that there are 9 static instructions in block 1, which is 14.5% of the static instructions in _main.  It additionally says that 9 dynamic instructions were executed in block 1, which accounts for 16.4% of the dynamic instructions run in _main (block 2 executed 6 dynamic instructions, and block 3 executed none).
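If you want to pull these fields out of the dump yourself, a regular expression is enough. The following is a minimal Python sketch; the pattern is my own guess from the lines shown in this post rather than a documented format, and `parse_block` and `share` are hypothetical names. The percentages are kept as strings because blocks in functions that were never called print "-1.$" instead of a number.

```python
import re

# Assumed layout: Blk <n>: <first>- <last> <static> (<pct>%)s <dynamic> (<pct>%)d
BLK_RE = re.compile(
    r"Blk\s+(\d+):\s*(\d+)-\s*(\d+)\s+"   # block number, first and last source line
    r"(\d+)\s+\(\s*([^%]+)%\)s\s+"        # static instruction count and percentage
    r"(\d+)\s+\(\s*([^%]+)%\)d"           # dynamic instruction count and percentage
)


def parse_block(line):
    """Parse one 'Blk' line into a dict, or return None if it does not match."""
    m = BLK_RE.search(line)
    if m is None:
        return None
    blk, first, last, static, static_pct, dynamic, dynamic_pct = m.groups()
    return {
        "block": int(blk),
        "first_line": int(first),
        "last_line": int(last),
        "static": int(static),
        "static_pct": static_pct.strip(),
        "dynamic": int(dynamic),
        "dynamic_pct": dynamic_pct.strip(),
    }


def share(part, whole):
    """Format part/whole as a one-decimal percentage string."""
    return "%.1f" % (100.0 * part / whole)


block = parse_block("Blk 1: 22- 29 9 (14.5%)s 9 (16.4%)d")
print(block)
# _main has 62 static and 55 dynamic instructions in total.
print(share(block["static"], 62), share(block["dynamic"], 55))
```

Recomputing the shares for block 1 of _main gives 14.5 and 16.4, the same values PGOMGR printed.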

The "taken ( 3)" statement means that there was a user conditional in the code and the taken part of the conditional branches to block #3, while the not-taken path simply falls through (in this case to block #2).  The 0 on that line indicates that the taken branch was never executed, while the not-taken branch was executed once. 
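The two counts on a "taken" line are enough to estimate how biased a branch was during the training run. Here is a small Python sketch under the same caveat as before: the line format is assumed from the output in this post, and `parse_branch` and `taken_probability` are made-up helper names.

```python
import re

# Assumed layout: taken ( <target block>) <taken count>, not-taken <count>
TAKEN_RE = re.compile(r"taken\s*\(\s*(\d+)\)\s*(\d+),\s*not-taken\s+(\d+)")


def parse_branch(line):
    """Return (target_block, taken_count, not_taken_count), or None."""
    m = TAKEN_RE.search(line)
    if m is None:
        return None
    target, taken, not_taken = (int(g) for g in m.groups())
    return target, taken, not_taken


def taken_probability(taken, not_taken):
    """Fraction of executions that took the branch; None if never executed."""
    total = taken + not_taken
    return taken / total if total else None


# Block 1 of _main: the branch to block 3 was never taken.
print(parse_branch("taken ( 3) 0, not-taken 1"))
print(taken_probability(0, 1))

# Block 9 of mangle2: the branch to block 12 was taken almost every time.
target, taken, not_taken = parse_branch("taken ( 12) 26466900, not-taken 100")
print(target, taken_probability(taken, not_taken))
```

A branch whose counts are both zero, as in the functions that were never called, has no meaningful probability, so the helper returns None for it.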

Another thing that is different in this /detail dump is the summary information at the end.  Let's examine this now:

Module name                     Function name   Block %   Arc %  Instr %
c:\documents and settings\kang  _main              84.6    71.4     88.7
    block 1: branch never taken
    block 3: never executed
    block 6: branch never falls through
    block 7: never executed

What this means is that 84.6% of the basic blocks in _main were executed in the code, 71.4% of the arcs were traversed, and 88.7% of the instructions were executed.  Additionally we get specific information when a branch is never taken or a block never executed.  This is kind of cool I think... it's almost like getting code coverage information for your application!
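The Block % and Instr % figures can be reproduced from the per-block lines of the /detail dump. The Python sketch below reflects my reading of the numbers, not a documented formula: a block counts as executed if its dynamic count is nonzero, and Instr % is the share of static instructions that live in executed blocks. Arc % is left out because the dump does not list every arc. The name `coverage` is hypothetical.

```python
# (static instr, dynamic instr) for blocks 1 through 13 of _main,
# copied from the /detail dump above.
MAIN_BLOCKS = [
    (9, 9), (6, 6), (1, 0), (2, 2), (3, 3), (1, 1), (6, 0),
    (5, 5), (13, 13), (1, 1), (4, 4), (1, 1), (10, 10),
]


def coverage(blocks):
    """Return (block %, instr %) rounded to one decimal place."""
    # Assumption: a block was executed if it ran at least one instruction.
    executed = [(static, dynamic) for static, dynamic in blocks if dynamic > 0]
    block_pct = 100.0 * len(executed) / len(blocks)
    static_total = sum(static for static, _ in blocks)
    static_executed = sum(static for static, _ in executed)
    instr_pct = 100.0 * static_executed / static_total
    return round(block_pct, 1), round(instr_pct, 1)


print(coverage(MAIN_BLOCKS))
```

For _main, 11 of the 13 blocks ran (blocks 3 and 7 did not), and 55 of the 62 static instructions sit in those blocks, which gives the 84.6 and 88.7 in the report.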

Hopefully all this helps in understanding the information dumped by this tool.  Admittedly, it's not the best-formatted or most readable dump, and we'll work on that in the future, but there's some nice info in it.
