Walk Through: Adding a Function Call to a Program

Here is the scenario: you have compiled and linked a big program, and you may even have shipped it to customers. After it was built, you realize that in order to find a bug or gather some necessary information, you need to instrument a certain function in the program. With Phoenix you don’t need to rebuild the program; you can simply use this Phoenix tool to instrument the binary directly. We do this by inserting into the original program a call to a function from a DLL that you have written. (Note that the inserted function call takes no arguments in this sample. We will handle passing arguments later, as that is a more complex task.)

As I promised, I will switch between C# and C++/CLI. This program is taken from the Phoenix RDK and is written in C++/CLI. If you aren’t familiar with the syntax, it is actually quite similar to C#, but if you want more details then the language specification is located here.

Things Covered in this Article

· Reading/Writing a PE file.

· Creating imports.

· Creating a memory operand (MemOpnd).

· Loading functions.

· Adding function calls to an instruction stream.

 

There is one part of the program that this article will NOT cover: controls, which is Phoenix lingo for the command-line arguments. We will cover controls in depth in a later article. They are fairly apparent from reading the code, so their presence shouldn’t confuse you; you’ve probably written code to parse the command line a million times yourself.

Also, I haven’t talked about the details of the various Units in Phoenix. This is something I’ll have to do in a future posting. If any of this blog is unclear due to this omission, let me know, and I’ll make sure to make this clarification.

The Main Function

Like StaticGlobalDump, we start with main(), which is given below (it’s just after Code Point 9). The bolded code consists of function calls that have user-defined functionality behind them, whereas the non-bold code calls directly into supplied framework code (the CRT, STL, CLR, or Phoenix).

 

Code Point 1: Looking at the code, we see that the first thing we do is initialize the Phoenix targets. In the StaticGlobalDump walkthrough I explained this code, so I’ll skip discussion of it here. The code is identical (except that in StaticGlobalDump it was in a separate function).

 

Code point 2: This is where we begin initialization of the infrastructure. This is the second time we have seen the BeginInit method, as we also saw it in the StaticGlobalDump program.

When you call BeginInit, a LOT of things get initialized under the covers: everything from Phoenix’s threading and memory management infrastructure, to the symbol and type tables, to the controls infrastructure. BeginInit is just something you need to do to get Phoenix started.

Code point 3: Phoenix has a very rich set of command-line parsing capabilities. The various command-line arguments one can pass to a Phoenix client are called controls, and are accessed in the Phx::Ctrls namespace. InitCmdLineParser is a user-defined function that sets up the controls for this program’s command-line arguments using the routines in the Phx::Ctrls namespace (the parsing itself happens at EndInit, as code point 4 explains). We will cover this capability in a future article.

Code point 4: One reasonable question is “Why is InitCmdLineParser in-between BeginInit and EndInit, whereas in StaticGlobalDump there was nothing in-between those two calls?”

The reason is that Phoenix parses the command line for the controls at EndInit, so you need to have the controls set up before EndInit is called, and naturally you can’t set up the Phoenix controls before you start initializing Phoenix. Therefore the InitCmdLineParser call must sit between BeginInit and EndInit.

EndInit actually does more than just parse the command-line, but for this particular piece of code that’s the only thing that is relevant. We will dive into some of the other things that need to happen in-between BeginInit/EndInit during another article where it is relevant.
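To make the ordering argument concrete, here is a minimal sketch in plain standard C++ (not C++/CLI, and not the Phoenix API). It models a registry in which a parse step can only fill in controls that were registered beforehand. The names ToyControls, registerControl, parse, and value are all hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a controls registry. None of these names are Phoenix APIs.
class ToyControls {
public:
    // Plays the role of InitCmdLineParser: declare a control by name.
    void registerControl(const std::string& name) { values_[name] = ""; }

    // Plays the role of the parsing done at EndInit: walk "/name value"
    // pairs and fill in only the controls that were registered beforehand.
    // Returns the number of arguments that were recognized.
    int parse(const std::vector<std::string>& args) {
        int recognized = 0;
        for (std::size_t i = 0; i + 1 < args.size(); i += 2) {
            const std::string& arg = args[i];
            if (arg.empty() || arg[0] != '/') continue;
            auto it = values_.find(arg.substr(1));
            if (it == values_.end()) continue;  // never registered: ignored
            it->second = args[i + 1];
            ++recognized;
        }
        return recognized;
    }

    // Returns the parsed value, or an empty string if there is none.
    std::string value(const std::string& name) const {
        auto it = values_.find(name);
        return it == values_.end() ? std::string() : it->second;
    }

private:
    std::map<std::string, std::string> values_;
};
```

In this toy, calling parse before registerControl recognizes nothing, which is the same reason the real program has to register its controls before EndInit runs.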

Code point 5: This is a call to CheckCmdLine. This is a simple user-defined function that checks to make sure that each of the required command-line arguments is supplied. If not, it exits the program with an error. Again, we will cover this capability in a future article.

Code point 6: This is where we open a PE file, in the same way we did with StaticGlobalDump. The main difference is that we get the file name from a class called “GlobalPlaceHolder”. GlobalPlaceHolder holds a set of static controls, each one mapping to one of the command-line arguments:

public ref class GlobalPlaceHolder {

public:

   static Phx::Ctrls::StringCtrl ^ in;

   static Phx::Ctrls::StringCtrl ^ out;

   static Phx::Ctrls::StringCtrl ^ pdbout;

   static Phx::Ctrls::StringCtrl ^ importdll;

   static Phx::Ctrls::StringCtrl ^ importmethod;

   static Phx::Ctrls::StringCtrl ^ localmethod;

};

GlobalPlaceHolder::in->GetValue(nullptr) gets the string out of the “in” field, which corresponds to the name of the input PE file for this program. For this walkthrough, ignore the nullptr argument. We will cover that in the future when I discuss controls.

Code point 7: These two lines simply copy command-line arguments out of the controls into the corresponding fields of the PEModuleUnit. The first is the path for the resulting PE image, and the second is the output PDB filename.

Code point 8: LoadGlobalSyms loads all of the global symbol data for the PEModuleUnit out of the associated PDB file. After this call you will have a symbol table populated with all of the global/static variables, the symbols for the types, and the methods associated with those types.

We will go into more depth about LoadGlobalSyms in a future posting, but if you’re curious, you can dump the Symbol Table for the PEModuleUnit after you load it (PEModuleUnit->SymTable->Dump(dumpOptions)).

Code point 9: DoAddInstrumentation is where the user logic to add the new calls to the existing function takes place. See later in this article for the section on DoAddInstrumentation.

After that comes the call to moduleUnit->Close(). It does more than simply close the PEModuleUnit: it also checks whether OutputImagePath is non-null and, if so, writes the PEModuleUnit out to disk at that path. We set OutputImagePath in code point 7, so when this program ends it generates a new binary. It also generates a new PDB file at OutputPdbPath, which we also set in code point 7.

int main(array<String ^> ^ args) {

   // Initialize the target architectures.

   // 1

   Phx::Targets::Archs::Arch ^ arch =

      Phx::Targets::Archs::X86::Arch::New();

   Phx::Targets::Runtimes::Runtime ^ runtime =

      Phx::Targets::Runtimes::VCCRT::Win32::X86::Runtime::New(arch);

   Phx::GlobalData::RegisterTargetArch(arch);

   Phx::GlobalData::RegisterTargetRuntime(runtime);

   // Initialize the infrastructure.

   // 2

   Phx::Init::BeginInit();

   // Init the cmd line stuff.

   // 3

   ::InitCmdLineParser();

   // Check for Phoenix wide options like "-assertbreak".

  

   // 4

   Phx::Init::EndInit(L"PHX|*|_PHX_", args);

   // Check the command line.

   // 5

   ::CheckCmdLine();

   // Open the module and read it in.

   Phx::PEModuleUnit ^ moduleUnit;

   // 6

   moduleUnit =

      Phx::PEModuleUnit::Open(GlobalPlaceHolder::in->GetValue(nullptr));

   // Setup output file name and PDB.

   // 7

   moduleUnit->OutputImagePath = GlobalPlaceHolder::out->GetValue(nullptr);

   moduleUnit->OutputPdbPath = GlobalPlaceHolder::pdbout->GetValue(nullptr);

   // Iterator will load symbols implicitly.

   // However Load Global Symbols upfront and print total.

   // 8

   moduleUnit->LoadGlobalSyms();

   Phx::Output::WriteLine(L"Total Global Symbols Count - {0} ",

      moduleUnit->SymTable->SymCount.ToString());

   // Do some useful work on the tool front here:

   // 9

   ::DoAddInstrumentation(moduleUnit);

   // Close the ModuleUnit.

   moduleUnit->Close();

   // If this was not the end of the application it would be best to

   // delete the ModuleUnit.

   // moduleUnit->Delete();

   return 0;

}

The DoAddInstrumentation Function

This function is where we do the meat of the work to instrument the PE image. We instrument the PE image with a call to an imported function at entry to the function which is specified on the command-line.

Let’s step back and think about the steps that are required to add a call to a function, even outside of a framework such as Phoenix.

1. Import the DLL that contains the function, F, that we wish to call from the specified function, S, in the PE file.

2. Get the import symbol of F from within the import module.

3. Get S and find its first instruction.

4. Inject a call to F from the imported DLL before this first instruction in S.

Those are the basic steps that we need to do. DoAddInstrumentation will do these four steps using Phoenix.

Code point 10: The function takes one argument, which is the PEModuleUnit that is going to be instrumented. The following two lines get the names of the imported DLL and imported method from the command-line arguments.

Code point 11: With the import DLL name and the method name, the AddImport method will create the import symbol (or find the import symbol, if the method is already being imported by the program).

Code point 12: The way that import functions are called is through an Import Address Table, aka IAT. The import address table is an array of addresses that map to the various virtual addresses of the imports.

The import symbol for the function, as constructed in Code Point 11, has a field which contains the symbol for the IAT entry of the function. It is this symbol that we will use to construct the call target operand.

The next line gets the name of the function that is going to be the injectee (the local method) from the command-line arguments.
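The indirect call through an IAT slot described in code point 12 can be modeled in a few lines of plain standard C++. This is only a toy, not Phoenix code and not how the Windows loader is implemented: the table is an ordinary array of function pointers, and toyIat, bindImport, callThroughSlot, and instrumentationHook are hypothetical names made up for the illustration.

```cpp
#include <array>
#include <cstddef>

using ImportedFn = void (*)();

// Toy import address table: one slot per imported function.
// In a real PE image the loader fills these slots at load time.
std::array<ImportedFn, 2> toyIat = {nullptr, nullptr};

// A stand-in for the function exported by the instrumentation DLL.
int hitCount = 0;
void instrumentationHook() { ++hitCount; }

// Stand-in for the loader resolving an import into its slot.
void bindImport(std::size_t slot, ImportedFn target) { toyIat[slot] = target; }

// Stand-in for the inserted call: it does not name the target directly,
// it reads the address out of the slot and calls through it.
void callThroughSlot(std::size_t slot) { toyIat[slot](); }
```

The point of the model is that the call site only knows about the slot. That is why the sample needs the symbol for the IAT entry, rather than a symbol for the imported function itself, to build the call target.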

Code point 13: This “for each” loop iterates over every contribution unit in the module. A contribution unit is any unit in the module that contributes code or data (for example, FuncUnit or DataUnit). This is necessary because we need to raise every FuncUnit to IR in order to be able to write out the image back to a PE file. In theory this shouldn’t be a requirement, but today it is a requirement with Phoenix.

The first line inside of the “for each” loop does a dynamic cast of unit to a FuncUnit, and if unit is not a FuncUnit then the loop continues.

Code point 14: funcUnit->DisassembleToBeforeLayout is where the FuncUnit is raised to low-level IR (LIR). (I’ll be perfectly honest, the name DisassembleToBeforeLayout leaves something to be desired. We are working on the naming of our API.) In case you were wondering what raising means: raising a FuncUnit (or a PE file) moves it from one level of representation to a higher level of representation. For example, going from PE format on disk to EIR to LIR to MIR to HIR to source code.

After that we use a string comparison to see if the FuncUnit that we just raised is the FuncUnit that we wish to instrument. We do that by comparing the targetname to the name of the function symbol for the FuncUnit that is being processed.

Code point 15: We first create a firstinstr local variable, which we will use to hold the first instruction of the function, because we want to put the call to the imported method before the first instruction.

Then we iterate over the instructions in the function unit, in order from the beginning to the end of the function, until we find a “real instruction”, which we test for with the property instr->IsReal. Phoenix defines a “real instruction” as one that is not a #pragma, a data instruction, or a label (and by the way, IsReal == !IsPseudo). That said, there shouldn’t be any harm in putting your instruction before one of those pseudo-instructions; you have the flexibility to do it either way.
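The search in code point 15, together with the insertion that follows later in code point 18, can be sketched with a plain std::list in standard C++. This is a toy model under stated assumptions: ToyInstr and insertBeforeFirstReal are hypothetical names, and a real Phoenix instruction stream carries far more than a string and a flag.

```cpp
#include <list>
#include <string>

// Toy instruction: only a name and a flag. Not the Phoenix Instr class.
struct ToyInstr {
    std::string text;
    bool isReal;  // false for labels, pragmas and data
};

// Insert `call` before the first real instruction in the stream.
// Returns false and leaves the stream untouched if the stream
// contains no real instruction at all.
bool insertBeforeFirstReal(std::list<ToyInstr>& stream, const ToyInstr& call) {
    for (auto it = stream.begin(); it != stream.end(); ++it) {
        if (it->isReal) {
            stream.insert(it, call);  // std::list inserts before `it`
            return true;
        }
    }
    return false;
}
```

One design difference is worth noting: the toy reports the case where no real instruction exists, whereas the sample leaves firstinstr as nullptr in that case and assumes the target function has at least one real instruction.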

 

Code point 16: The five lines in code point 16 build up an operand that will be used as the target of the call instruction to our imported method. The operand will be a MemOpnd, as the call will be an indirect call through the IAT.

The New constructor for MemOpnd takes the following arguments: FuncUnit, Type, Sym, VarOpnd, ByteOffset, Align, Tag. That’s a lot of arguments, but we can work through them one by one.

· FuncUnit: In code point 14 we found the FuncUnit that we plan to instrument, and we put that in funcUnit. So the FuncUnit where this operand will reside has already been taken care of for us.

· Type: The type of a memory operand refers to the type that is being referenced by the memory operand. Since this memory operand represents a call through an IAT, and the IAT is a table with ordinal values, we will use an integer. Phoenix has a set of built-in representations for types in the TypeTable, which is pointed to by each FuncUnit. Thus the following line of code gets the integer type that we will use as an offset into the IAT: funcUnit->TypeTable->Int32Type.

· Sym: This is the symbol for the location in memory that we’re going to call through. Since we’re calling through an IAT we use the IATSym off of the import symbol.

· BaseOpnd (listed as VarOpnd in the argument list above): This is the base register for the operand. Since we’re going through the IAT, there is no base register, hence the nullptr.

· ByteOffset: Is the offset from the base register. This value is zero since we’re indirecting through the IAT.

· Align: This is the alignment of the location that is being referenced by this operand. We know the type that is being referenced (ptrType), so we can call Phx::Align::NaturalAlign(ptrType) and this static method will return back the natural alignment of this type.

· Tag: The tag is more specifically an alias tag. What’s an alias tag? Every register/memory operand has an alias tag on it, which describes the alias relationship between itself and all other operands.

This tag argument is created earlier in the code point with “tag = funcUnit->AliasInfo->NonAliasedMemTag”. What this means is that the tag is non-aliased memory, which makes sense since this operand represents the IAT entry.

We then take all of this information and make the call:

      Phx::IR::MemOpnd ^ opnd = Phx::IR::MemOpnd::New(funcUnit,

         ptrType, symIAT, nullptr, 0, align, tag);

with each piece of information in its corresponding argument slot.

Code point 17: Now that we have the operand that is the target of the call, we need to actually construct the call instruction. The hard work was in constructing the operand, and this part is straightforward. We create a new CallInstr, with the current FuncUnit being the one that the call is bound to. We get the opcode that we wish to use for this call. We’re generating x86 unmanaged code, so we will use (Phx::Targets::Archs::X86::Opcode::call), and then pass it the MemOpnd that we created in code point 16.

Note that the CallInstr is target specific, in that we specifically created an x86 call instruction (hence the X86 in the middle of the namespaces). Obviously, this code would need to change if you were targeting x64, MSIL, or any other architecture with a different ISA.

Code point 18: We now have the instruction, but it’s not actually in the instruction stream. Earlier we figured out where the first instruction in this function was, so we can use that instruction as a guide to help us place our newly created instruction into the instruction stream.

Code point 19: The final step of DoAddInstrumentation is to legalize the instruction that was inserted into the instruction stream.

What does “legalize” mean? To legalize (in this particular instance) is to take an instruction with operands and put the operands into a legal form for the target architecture. Note that we selected the opcode from a set of x86-specific opcodes, but the operands are just generic HIR. Since different processor architectures have different ways to build an instruction with operands, there is this special Legalize method that converts the HIR operand into a target-specific set of operands automatically. This saves you from having to know about the various addressing modes of the different processor architectures when building a legal instruction stream.

// 10

void DoAddInstrumentation (Phx::PEModuleUnit ^ moduleUnit) {

   String ^ dll = GlobalPlaceHolder::importdll->GetValue(nullptr);

   String ^ method = GlobalPlaceHolder::importmethod->GetValue(nullptr);

   // Create the import

   // 11

   Phx::Syms::ImportSym ^ importsym = ::AddImport(moduleUnit, dll, method);

   // Get the symbol for the IAT slot to use as the call destination

   // 12

   Phx::Syms::Sym ^ symIAT = importsym->IATSym;

   // Method we want to instrument

   String ^ targetname = GlobalPlaceHolder::localmethod->GetValue(nullptr);

   // Iterate over each unit that actually has a contribution in the module.

   // 13

   for each (Phx::Unit ^unit in moduleUnit->GetEnumerableContribUnit())

   {

      Phx::FuncUnit ^ funcUnit = unit->AsFuncUnit;

      if (funcUnit == nullptr) {

         continue;

      }

      // Currently we must raise all FuncUnits when using

      // Phx::ContribUnitIterator otherwise resultant binary will not run.

      // Convert from Encoded IR to Low-Level IR

   // 14

      funcUnit->DisassembleToBeforeLayout();

      // Only instrument the method we are interested in

      if (!String::Equals(targetname, funcUnit->FuncSym->NameString)) {

         continue;

      }

      // Get the first "real" instruction in the func.

   // 15

      Phx::IR::Instr ^ firstinstr = nullptr;

  

      for each (Phx::IR::Instr ^ instr in Phx::IR::Instr::Iter(funcUnit)) {

         // Don't want a LabelInstr, PragmaInstr or DataInstr

         if (instr->IsReal) {

            firstinstr = instr;

            break;

         }

      }

      // Now create the call to the import

      // As this is via an import we use the intrinsic Int32Type

   // 16

      Phx::Types::Type ^ ptrType = funcUnit->TypeTable->Int32Type;

      // Create an alignment for the MemOpnd

      Phx::Align align = Phx::Align::NaturalAlign(ptrType);

      // This is unaliased memory so we can use either NonAliasedMemTag

      // or IndirAliasedMemTag

      Phx::Alias::Tag tag = funcUnit->AliasInfo->NonAliasedMemTag;

      // Create the MemOpnd for an indirect call through the IAT

      Phx::IR::MemOpnd ^ opnd = Phx::IR::MemOpnd::New(funcUnit,

         ptrType, symIAT, nullptr, 0, align, tag);

      // Create a new call instruction. This sample only supports x86, so

      // we can use the platform specific instruction.

    // 17

      Phx::IR::CallInstr ^ call = Phx::IR::CallInstr::New(

         funcUnit, Phx::Targets::Archs::X86::Opcode::call,

         opnd);

      // Insert it before the instruction

   // 18

      firstinstr->InsertBefore(call);

      // Now legalize the instruction form

   // 19

      funcUnit->Legalize->Instr(call);

   }

}

The AddImport Function

On to the last method for this program -- AddImport. Recall from code point 11, we used AddImport to get the import symbol for the method that we are importing. This method takes three arguments:

· module – The module that we’re currently instrumenting, that will call the import method.

· dll – The name of the DLL with the import method.

· method – The name of the method that we plan to import.

Code point 20: This simply allocates a Phx::Name for the name of DLL and the method.

Notice the use of module->Lifetime as the first argument for creating a new Phx::Name. Each Unit has a lifetime, and when this lifetime is done then it cleans up all resources associated with it (like a destructor with a class). In this case, we’ve created these Phx::Names and associated them with the module->Lifetime so the memory associated with the Phx::Name’s gets cleaned up when the PEModuleUnit is done being processed.

You may be wondering why we have this construct when the CLR’s garbage collection should handle cleaning up memory once there are no more references to it. Unfortunately, this is actually a remnant of some internal process that we have at Microsoft. It is one of the things that we’d like to remove from the API in the future, but for the time being it’s in there.
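The lifetime idea from code point 20 can be modeled as an owner object that frees everything allocated against it in one go. The following is a minimal sketch in plain standard C++, assuming nothing about how Phoenix actually implements its lifetimes. ToyLifetime, newName, and liveCount are hypothetical names.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Toy lifetime: owns every name allocated against it and releases them
// all together when the lifetime object itself is destroyed.
class ToyLifetime {
public:
    // Allocate a name that lives exactly as long as this lifetime.
    // The returned pointer stays valid across later allocations because
    // each string lives in its own heap allocation.
    const std::string* newName(const std::string& text) {
        names_.push_back(std::make_unique<std::string>(text));
        return names_.back().get();
    }

    // Number of names currently owned by this lifetime.
    std::size_t liveCount() const { return names_.size(); }

private:
    std::vector<std::unique_ptr<std::string>> names_;
};
```

In the toy, callers never free a name individually; they just let the owning lifetime go away, which mirrors how the Phx::Name values in AddImport are tied to module->Lifetime.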

Code point 21: We need an import module symbol for the DLL that we’re going to import the method from. First we call module->FindImportModule, which returns the import module symbol for the DLL if the DLL was previously imported from, and nullptr otherwise.

The following if-statement checks if we got back an import module symbol, and if not then that means we need to create a new import module symbol inside of the current PEModuleUnit. This is done with the AddImportModule call. Note the arguments to this call are (0, dllname, false). The reason for the dllname argument is apparent, but the other two are less so. The ‘0’ is passed as the ExternId for the DLL.

Alternatively, one could combine the call to FindImportModule and AddImportModule into a single call into AddImportModule, since AddImportModule will search for the import module and return an existing one, if one exists, and create one if one does not. The FindImportModule call that we did in the beginning of code point 21 would be technically unnecessary in this case.
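The find-then-add pattern, and the reason the explicit find is redundant when the add already returns an existing entry, can be shown with a small model in plain standard C++. ToyModule, ToyImportModule, findImportModule, and addImportModule are hypothetical stand-ins that only mirror the behavior described above, not the Phoenix implementation.

```cpp
#include <map>
#include <string>

// Toy import module: a DLL name plus the methods imported from it.
struct ToyImportModule {
    std::string dllName;
    std::map<std::string, int> imports;  // method name -> toy slot index
};

class ToyModule {
public:
    // Returns nullptr when the DLL has not been imported from yet.
    ToyImportModule* findImportModule(const std::string& dll) {
        auto it = modules_.find(dll);
        return it == modules_.end() ? nullptr : &it->second;
    }

    // Returns the existing entry if there is one, otherwise creates it.
    // std::map::emplace does not overwrite an existing key.
    ToyImportModule* addImportModule(const std::string& dll) {
        auto result = modules_.emplace(dll, ToyImportModule{dll, {}});
        return &result.first->second;
    }

private:
    std::map<std::string, ToyImportModule> modules_;
};
```

Because addImportModule in the toy is idempotent, calling it twice for the same DLL yields the same entry, so a caller can skip the find step entirely.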

Code point 22: This call to FindImport returns the ImportSym if it has already been created for the import module. Otherwise it returns a nullptr. If the returned importsym is a nullptr then we need to create an import symbol. This import symbol will be the IAT slot that we will jump from.

Code point 23: This is where we begin the creation of an import symbol, if necessary. First we construct the name of the imported method, which is the original name of the method with “_imp_” prepended to it. This indicates that the method is an import.

We then take this string and convert it into a Phx::Name. The reason we use a Phx::Name here rather than String is twofold:

1. In creating a new GlobalVarSym the argument for the name of the symbol is of type Phx::Name.

2. Phx::Name exists as a way to reduce the cost of using strings. I’ll blog about this aspect more in the future.

The short of it is that you need to pass to GlobalVarSym::New an argument of type Phx::Name, and thus need to convert the String to a Phx::Name.

Code point 24: We make the type of the GlobalVarSym that we’re going to create a 32-bit integer. This is because each entry in the IAT is simply a 32-bit integer value. We should point out, though, that on 64-bit platforms the type would be a 64-bit integer, as each entry in the IAT would be a 64-bit value.

On the next line we create the GlobalVarSym. What some may find unusual is the last argument, Visibility. This argument is where we identify the visibility of the symbol that is being created. As this symbol represents a DLL import we use the DLLImport visibility value. Here we give the enum definition for Visibility, so you can see the set of visibilities.

enum Visibility
{
_Illegal = 0,
Func = 1,
File = 2,
GlobalRef = 3,
GlobalDef = 4,
GlobalCom = 5,
WeakRef = 6,
ClrTokenDef = 7,
ClrTokenRef = 8,
DLLImport = 9,
DLLExport = 10,
DLLExportCom = 11,
ForwardReferenced = 12,
}

Code Point 25: The last thing we do is to actually create the Import Symbol that we are going to return from this function. We are going to use the import module symbol (created in code point 21), and add an actual import to it. We do this with the AddImport method.

The arguments to this method are as follows. The first 0 is the external ID value, which effectively means the import does not have an external ID. methodname is the name of the import. The next 0 is the “hint” for the import symbol (used when importing by ordinal, as a hint for what the ordinal value is; I’m not sure many people find it to be of much use). symIAT is the IAT entry symbol that we created in the previous code point. Last is the nullptr, which is the symbol for the delay loader code, which we’re not going to use at all here.

The AddImport method first checks if the import module already has an import named methodname. If so, it simply returns that import symbol. Otherwise, it creates a new import symbol, adds it to the list of import symbols for the import module, and then returns the newly created import symbol.

Phx::Syms::ImportSym ^AddImport

(

   Phx::PEModuleUnit ^ module, // Module being instrumented

   String ^ dll, // Name of the DLL

   String ^ method // Name of the imported method

) {

   // 20

   Phx::Name dllname = Phx::Name::New(module->Lifetime, dll);

   Phx::Name methodname = Phx::Name::New(module->Lifetime, method);

   // Do we have an import module for this DLL already?

   // 21

   Phx::Syms::ImportModuleSym ^ importmodulesym =

      module->FindImportModule(dllname);

   // If not create a new one

   if (importmodulesym == nullptr) {

      importmodulesym = module->AddImportModule(0, dllname,

         false);

   }

   // Check for the existence of this import

   // 22

   Phx::Syms::ImportSym ^ importsym = importmodulesym->FindImport(methodname);

   // If not then create it

   if (importsym == nullptr) {

      // 1st we need to create a symbol for the IAT slot that we will use

      // as the call target

      // Prepend "_imp_" to the import name

   // 23

      String ^ importString = String::Concat(L"_imp_", method);

      // Create the name for the symbol

      Phx::Name nameIAT = Phx::Name::New(module->Lifetime, importString);

      // As this is via an import we use the intrinsic Int32Type

   // 24

      Phx::Types::Type ^ globalVarType = module->TypeTable->Int32Type;

      // Now create the sym

      Phx::Syms::GlobalVarSym ^ symIAT =

         Phx::Syms::GlobalVarSym::New(module->SymTable, 0, nameIAT, globalVarType,

            Phx::Syms::Visibility::DLLImport);

      // Now we can create the import

   // 25

      importsym = importmodulesym->AddImport(0, methodname, 0, symIAT, nullptr);

   }

   // Return the symbol.

   return importsym;

}

The end result is a surprisingly straightforward program that does a rather sophisticated task. How you extend this program is limited only by your imagination.

 

Full Code Here (also part of the RDK samples):

//------------------------------------------------------------------------------

//

// Phoenix

// Copyright (C) Microsoft Corporation. All Rights Reserved.

//

// Description:

//

// Simple Phoenix based application to add a call to an external method

// to the start of a method within a binary

//

// Input file must be compiled with a phoenix based compiler and -Zi

//

// Usage:

//

// Usage: addcall /out <filename> /pdbout <filename> /importdll <filename>

// /importmethod <exported method> /localmethod <local method>

// /in <image-name>

//

//------------------------------------------------------------------------------

#pragma region Using namespace declarations

using namespace System;

// Not using Phx namespace to show explicit class hierarchy

// using namespace Phx;

#pragma endregion

// Include additional definitions from samples.h

#include "..\..\common\samples.h"

//------------------------------------------------------------------------------

//

// Description:

//

// Place holder class for Global static data

//

// Remarks:

//

// Create a placeholder class for the global StringCtrl objects as this won't

// compile managed with global variables of this type.

//

//------------------------------------------------------------------------------

public

ref class GlobalPlaceHolder

{

public:

   static Phx::Ctrls::StringCtrl ^ in;

   static Phx::Ctrls::StringCtrl ^ out;

   static Phx::Ctrls::StringCtrl ^ pdbout;

   static Phx::Ctrls::StringCtrl ^ importdll;

   static Phx::Ctrls::StringCtrl ^ importmethod;

   static Phx::Ctrls::StringCtrl ^ localmethod;

};

// Method declarations

Phx::Syms::ImportSym ^

AddImport

(

   Phx::PEModuleUnit ^ module,

   String ^ dllname,

   String ^ methodname

);

void                  DoAddInstrumentation(Phx::PEModuleUnit ^ module);

void                  InitCmdLineParser();

Phx::PEModuleUnit ^ OpenModule(String ^ input);

void                  CheckCmdLine();

void                  Usage();

//-----------------------------------------------------------------------------

//

// Description:

//

// Entry point for application

//

// Arguments:

//

// args - Command line arguments

//

// Returns:

//

// 0 on success

//

//-----------------------------------------------------------------------------

int

main

(

   array<String ^> ^ args

)

{

   // Initialize the target architectures.

  

   Phx::Targets::Archs::Arch ^ arch =

      Phx::Targets::Archs::X86::Arch::New();

   Phx::Targets::Runtimes::Runtime ^ runtime =

      Phx::Targets::Runtimes::VCCRT::Win32::X86::Runtime::New(arch);

   Phx::GlobalData::RegisterTargetArch(arch);

   Phx::GlobalData::RegisterTargetRuntime(runtime);

   // Initialize the infrastructure.

   Phx::Init::BeginInit();

   // Init the cmd line stuff.

   ::InitCmdLineParser();

   // Check for Phoenix wide options like "-assertbreak".

   Phx::Init::EndInit(L"PHX|*|_PHX_", args);

   // Check the command line.

   ::CheckCmdLine();

   // Open the module and read it in.

   Phx::PEModuleUnit ^ moduleUnit;

   moduleUnit =

      Phx::PEModuleUnit::Open(GlobalPlaceHolder::in->GetValue(nullptr));

   // Setup output file name and PDB.

   moduleUnit->OutputImagePath = GlobalPlaceHolder::out->GetValue(nullptr);

   moduleUnit->OutputPdbPath = GlobalPlaceHolder::pdbout->GetValue(nullptr);

   // Iterator will load symbols implicitly.

   // However Load Global Symbols upfront and print total.

   moduleUnit->LoadGlobalSyms();

   Phx::Output::WriteLine(L"Total Global Symbols Count - {0} ",

      moduleUnit->SymTable->SymCount.ToString());

   // Do some useful work on the tool front here:

   ::DoAddInstrumentation(moduleUnit);

   // Close the ModuleUnit.

   moduleUnit->Close();

   // If this was not the end of the application it would be best to

   // delete the ModuleUnit.

   // moduleUnit->Delete();

   return 0;

}

//-----------------------------------------------------------------------------

//

// Description:

//

// Check the commandline variables

//

//-----------------------------------------------------------------------------

void

CheckCmdLine()

{

   if (String::IsNullOrEmpty(

         GlobalPlaceHolder::in->GetValue(nullptr))

      || String::IsNullOrEmpty(

         GlobalPlaceHolder::out->GetValue(nullptr))

      || String::IsNullOrEmpty(

         GlobalPlaceHolder::pdbout->GetValue(nullptr))

      || String::IsNullOrEmpty(

         GlobalPlaceHolder::importdll->GetValue(nullptr))

      || String::IsNullOrEmpty(

         GlobalPlaceHolder::importmethod->GetValue(nullptr))

      || String::IsNullOrEmpty(

         GlobalPlaceHolder::localmethod->GetValue(nullptr)))

   {

      ::Usage();

      Environment::Exit(1);

   }

}

//-----------------------------------------------------------------------------

//

// Description:

//

// Initialise the parser variables

//

// Returns:

//

// Nothing

//

//-----------------------------------------------------------------------------

void

InitCmdLineParser()

{

   GlobalPlaceHolder::in =

      Phx::Ctrls::StringCtrl::New(L"in", L"input file",

         Phx::Ctrls::Ctrl::MakeFileLineLocString(L"addcall.cpp", __LINE__));

   GlobalPlaceHolder::out =

      Phx::Ctrls::StringCtrl::New(L"out", L"output file",

         Phx::Ctrls::Ctrl::MakeFileLineLocString(L"addcall.cpp", __LINE__));

   GlobalPlaceHolder::pdbout =

      Phx::Ctrls::StringCtrl::New(L"pdbout", L"output pdb file",

         Phx::Ctrls::Ctrl::MakeFileLineLocString(L"addcall.cpp", __LINE__));

   GlobalPlaceHolder::importdll =

      Phx::Ctrls::StringCtrl::New(L"importdll",

         L"Dll containing the method to call",

         Phx::Ctrls::Ctrl::MakeFileLineLocString(L"addcall.cpp", __LINE__));

   GlobalPlaceHolder::importmethod =

      Phx::Ctrls::StringCtrl::New(L"importmethod",

         L"Name of the method to call as it appears in the export list",

         Phx::Ctrls::Ctrl::MakeFileLineLocString(L"addcall.cpp", __LINE__));

   GlobalPlaceHolder::localmethod =

      Phx::Ctrls::StringCtrl::New(L"localmethod",

         L"Method at the beginning of which to insert the call",

         Phx::Ctrls::Ctrl::MakeFileLineLocString(L"addcall.cpp", __LINE__));

}

//-----------------------------------------------------------------------------

//

// Description:

//

// Print the usage string to the console

//

// Returns:

//

// Nothing

//

//-----------------------------------------------------------------------------

void

Usage()

{

   Phx::Output::WriteLine(L"Usage: AddCall /out <filename> "

      L"/pdbout <filename> /importdll <filename> "

      L"/importmethod <exported method> /localmethod <local method> "

      L"/in <image-name>");

}

//-----------------------------------------------------------------------------

//

// Description:

//

// This method does the work of adding the instrumentation to the function

//

// Remarks:

//

// Iterates all Units in the IR and Raises them

// Finds the method specified on the command line and inserts a call

// instruction to the new import at the start of that method

// Encodes the Units ready for writing

//

// Returns:

//

// Nothing

//

//-----------------------------------------------------------------------------

void

DoAddInstrumentation

(

   Phx::PEModuleUnit ^ moduleUnit // Module being instrumented

)

{

   String ^ dll = GlobalPlaceHolder::importdll->GetValue(nullptr);

   String ^ method = GlobalPlaceHolder::importmethod->GetValue(nullptr);

   // Create the import

   Phx::Syms::ImportSym ^ importsym = ::AddImport(moduleUnit, dll, method);

   // Get the symbol for the IAT slot to use as the call destination

   Phx::Syms::Sym ^ symIAT = importsym->IATSym;

   // Method we want to instrument

   String ^ targetname = GlobalPlaceHolder::localmethod->GetValue(nullptr);

   // Iterate over each function that actually has a contribution in the module.

   for (Phx::ContribUnitIterator contribUnitIterator =

         moduleUnit->GetEnumerableContribUnit().GetEnumerator();

        contribUnitIterator.MoveNext();)

   {

      Phx::FuncUnit ^ funcUnit = contribUnitIterator.Current->AsFuncUnit;

      if (funcUnit == nullptr)

      {

         continue;

      }

      // Currently we must raise all FuncUnits when using

      // Phx::ContribUnitIterator; otherwise the resultant binary will not run.

      // Convert from Encoded IR to Low-Level IR

      funcUnit->DisassembleToBeforeLayout();

      // Only instrument the method we are interested in

      if (!String::Equals(targetname, funcUnit->FuncSym->NameString))

      {

         continue;

      }

      // Get the first "real" instruction in the func.

      Phx::IR::Instr ^ firstinstr = nullptr;

      for each (Phx::IR::Instr ^ instr in Phx::IR::Instr::Iter(funcUnit))

      {

         // Don't want a LabelInstr, PragmaInstr or DataInstr

         if (instr->IsReal)

         {

            firstinstr = instr;

            break;

         }

      }

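      // Skip the function if it has no real instruction to insert before

      if (firstinstr == nullptr)

      {

         continue;

      }

      // Now create the call to the import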

      // As this is via an import, we use the intrinsic Int32Type

      Phx::Types::Type ^ ptrType = funcUnit->TypeTable->Int32Type;

      // Create an alignment for the MemOpnd

      Phx::Align align = Phx::Align::NaturalAlign(ptrType);

      // This is unaliased memory so we can use either NonAliasedMemTag

      // or IndirAliasedMemTag

      Phx::Alias::Tag tag = funcUnit->AliasInfo->NonAliasedMemTag;

      // Create the MemOpnd for an indirect call through the IAT

      Phx::IR::MemOpnd ^ opnd = Phx::IR::MemOpnd::New(funcUnit,

         ptrType, symIAT, nullptr, 0, align, tag);

      // Create a new call instruction. This sample only supports x86, so

      // we can use the platform specific instruction.

      Phx::IR::CallInstr ^ call = Phx::IR::CallInstr::New(

         funcUnit, Phx::Targets::Archs::X86::Opcode::call,

         opnd);

      // Insert it before the first real instruction

      firstinstr->InsertBefore(call);

      // Now legalize the instruction form

      funcUnit->Legalize->Instr(call);

   }

}

//-----------------------------------------------------------------------------

//

// Description:

//

// Find or Create the ImportSym for a method in a dll

//

// Remarks:

//

// Searches the import list for a matching import

// If this is not found create the IAT entry and symbol for the import

//

// Returns:

//

// Phx::Syms::ImportSym of the matching import

//

//-----------------------------------------------------------------------------

Phx::Syms::ImportSym ^

AddImport

(

   Phx::PEModuleUnit ^ module, // Module being instrumented

   String ^ dll, // Name of the DLL

   String ^ method // Name of the imported method

)

{

   Phx::Name dllname = Phx::Name::New(module->Lifetime, dll);

   Phx::Name methodname = Phx::Name::New(module->Lifetime, method);

   // Do we have an import module for this DLL already?

   Phx::Syms::ImportModuleSym ^ importmodulesym =

      module->FindImportModule(dllname);

   // If not, create a new one

   if (importmodulesym == nullptr)

   {

      importmodulesym = module->AddImportModule(0, dllname,

         false);

   }

   // Check for the existence of this import

   Phx::Syms::ImportSym ^ importsym =

      importmodulesym->FindImport(methodname);

   // If not then create it

   if (importsym == nullptr)

   {

      // First we need to create a symbol for the IAT slot that we will use

      // as the call target

      // Prepend "_imp_" to the import name

      String ^ importString = String::Concat(L"_imp_", method);

      // Create the name for the symbol

      Phx::Name nameIAT = Phx::Name::New(module->Lifetime, importString);

      // As this is via an import, we use the intrinsic Int32Type

      Phx::Types::Type ^ ptrType = module->TypeTable->Int32Type;

      // Now create the sym

      Phx::Syms::GlobalVarSym ^ symIAT =

         Phx::Syms::GlobalVarSym::New(module->SymTable, 0, nameIAT, ptrType,

            Phx::Syms::Visibility::DLLImport);

      // Now we can create the import

      importsym = importmodulesym->AddImport(0, methodname,

         0, symIAT, nullptr);

   }

   // Return the symbol.

   return importsym;

}
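
The find-or-create shape of AddImport may be easier to see with the Phoenix types stripped away. What follows is only a sketch in standard C++, not Phoenix code: ImportTable, ImportEntry and FindOrAddImport are hypothetical names made up for illustration, and a std::map stands in for the module's import list. The one detail carried over from the sample is the "_imp_" prefix on the IAT slot name.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for an import symbol: the exported method name plus
// the name of the IAT slot symbol that a call would go through.
struct ImportEntry
{
   std::string method;
   std::string iatName;
};

// Hypothetical stand-in for a module's import list, keyed by (dll, method).
class ImportTable
{
public:
   // Return the existing entry for (dll, method), or create one.
   // The bool is true only when a new entry had to be created.
   std::pair<const ImportEntry *, bool>
   FindOrAddImport(const std::string & dll, const std::string & method)
   {
      const auto key = std::make_pair(dll, method);
      auto found = entries.find(key);
      if (found != entries.end())
      {
         return { &found->second, false };
      }

      // Mirror the sample: the IAT slot name is "_imp_" + method.
      ImportEntry entry{ method, "_imp_" + method };
      auto inserted = entries.emplace(key, entry).first;
      return { &inserted->second, true };
   }

   std::size_t Count() const { return entries.size(); }

private:
   std::map<std::pair<std::string, std::string>, ImportEntry> entries;
};
```

Calling FindOrAddImport twice with the same DLL and method name hands back the same entry the second time, which is the property AddImport relies on so that running the tool against an image that already imports the method does not add a duplicate.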

 


[1] In the RDK sample the code here is slightly different in that it does not use “for each”. We find it hard to justify not using “for each” syntax in this case, so we’ve made a slight modification to the code. The semantics are completely identical.
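
To make the footnote's point concrete, here is a rough analogue in standard C++ rather than C++/CLI. ToyInstr and the three functions are hypothetical names invented for this sketch; none of them are Phoenix APIs. The explicit-iterator loop plays the role of driving an enumerator by hand, the range-based loop plays the role of "for each", and both find the same first "real" instruction.

```cpp
#include <list>
#include <string>

// Hypothetical stand-in for an IR instruction; not a Phoenix type.
struct ToyInstr
{
   std::string text;
   bool isReal; // false for labels, pragmas and data
};

// Explicit-iterator form, analogous to driving an enumerator by hand.
std::list<ToyInstr>::iterator
FirstRealExplicit(std::list<ToyInstr> & instrs)
{
   for (auto it = instrs.begin(); it != instrs.end(); ++it)
   {
      if (it->isReal)
      {
         return it;
      }
   }
   return instrs.end();
}

// Range-based form, analogous to "for each".
const ToyInstr *
FirstRealForEach(const std::list<ToyInstr> & instrs)
{
   for (const ToyInstr & instr : instrs)
   {
      if (instr.isReal)
      {
         return &instr;
      }
   }
   return nullptr;
}

// Insert a call before the first real instruction.
// Returns false, leaving the list untouched, if there is none.
bool
InsertCallAtStart(std::list<ToyInstr> & instrs, const std::string & callee)
{
   auto first = FirstRealExplicit(instrs);
   if (first == instrs.end())
   {
      return false;
   }
   instrs.insert(first, ToyInstr{ "call " + callee, true });
   return true;
}
```

The two search functions differ only in how the loop is spelled, which is why swapping one for the other leaves the semantics unchanged. InsertCallAtStart shows the same "insert before the first real instruction" step as the sample, with the empty case handled explicitly.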
