JIT Optimizations: Inlining (I)
Inlining is an optimization that applies to method calls: the JIT replaces the call with the code of the callee.
What do you gain from doing this?
- Speed: it reduces the overhead of the call. Without inlining, every call goes through the following steps, and only one of them is the actual work:
  - Set up the arguments (in registers and on the stack).
  - Execute the call instruction.
  - Callee prolog.
  - Callee work.
  - Callee epilog.
  - Return.
- Size: sometimes you get smaller code, because there are situations (property getters, for example) where inlining generates fewer instructions than an actual call would.
- Expose more optimizations (for both speed and size), because the code you got from substituting the call can now be the target of other optimizations.
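To make the size point concrete, here is a minimal sketch of the property getter case. The Counter class and both Sum methods are hypothetical names made up for illustration, and whether the JIT really inlines the getter depends on the runtime version and its heuristics.

```csharp
public class Counter
{
    private int count;

    public Counter(int start)
    {
        count = start;
    }

    // Trivial getter: the body is a single field load, so setting up and
    // making a call to it can take more instructions than the load itself.
    public int Count
    {
        get { return count; }
    }

    // As written in source: each read goes through the getter.
    public static int SumViaGetter(Counter a, Counter b)
    {
        return a.Count + b.Count;
    }

    // Hand-inlined equivalent: the getter calls are replaced by their body.
    // This is roughly the shape the JIT can work with once it inlines the getter.
    public static int SumInlinedByHand(Counter a, Counter b)
    {
        return a.count + b.count;
    }
}
```

Both methods compute the same value; the second one only spells out what is left once the calls are gone.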
Let's take a look at a simple example:
    class Test
    {
        static int And(int i1, int i2)
        {
            return i1 & i2;
        }

        static int i;

        static public void Main()
        {
            i = And(i, 0);
        }
    }
If we don't inline And(), the generated code for Main() is:
    mov ECX, dword ptr [classVar[0x2ca0d24]]    ; setup first argument (i)
    xor EDX, EDX                                ; setup second argument (0)
    call [Test.And(int,int):int]                ; do the call
    mov dword ptr [classVar[0x2ca0d24]], EAX    ; assign result to static
    ret                                         ; return
And the code for And() is:
    and ECX, EDX    ; perform AND
    mov EAX, ECX    ; setup return register
    ret             ; return to caller
Note that And() is really simple, so we're not paying anything for a prolog or an epilog. Even so, we still get a win from inlining.
Main() with inlining turned on:
    xor EAX, EAX                                ; generate final result ;)
    mov dword ptr [classVar[0x2ca0d24]], EAX    ; move result to static
    ret                                         ; return
For this really simple code we got the following from the inlining optimization:
- The code path we have to execute went from 8 instructions (5 in Main() plus 3 in And()) to 3, so it will be much faster.
- For Main(), we went from 20 bytes to 8 bytes of code.
Note how inlining enabled a further optimization: once And() was inlined, the JIT could fold away the AND operation entirely, because it realized that one side of the AND was 0.
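If you want to compare the two versions of the generated code yourself, one possible approach is sketched below. It relies on MethodImplOptions.NoInlining, which tells the JIT never to inline the marked method. The InliningDemo class and its method names are hypothetical, and the code you actually get depends on the JIT version and on running an optimized (release) build.

```csharp
using System.Runtime.CompilerServices;

public static class InliningDemo
{
    // NoInlining forces a real call, which reproduces the "not inlined" case.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int AndNoInline(int i1, int i2)
    {
        return i1 & i2;
    }

    // No attribute: the JIT is free to inline this small method.
    public static int And(int i1, int i2)
    {
        return i1 & i2;
    }

    // The call to AndNoInline always stays a call.
    public static int CallNotInlined(int i)
    {
        return AndNoInline(i, 0);
    }

    // If And is inlined here, the body becomes "i & 0", which folds to 0.
    public static int CallMaybeInlined(int i)
    {
        return And(i, 0);
    }

    // What the folded code computes, written out by hand.
    public static int FoldedByHand(int i)
    {
        return 0;
    }
}
```

All three callers return 0 for any input. The difference is only in the machine code the JIT produces for them, which you can inspect in a debugger's disassembly view.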
In my next post, I will go into more detail about our current implementation, including some of the limitations it currently has.
PS: I'm saving all the questions you are asking, and I'll eventually answer them (most of them, I hope) on the blog. Just have a bit of patience; we're really busy right now trying to get Beta 2 done!
Comments
Anonymous
October 27, 2004
I'm looking forward to seeing how the implementation works, since JIT compilation is done per method! I thought of a lot of possibilities, each with some drawback.
Anonymous
October 27, 2004
Can you expand on the limits of inlining?
There is this story about 32 bytes of IL as the maximum limit for inlining. Many ask for manual settings to upper this limit (at the expense of some memory). Would this be something useful?