Comparing Continuations in F# and C#
Lately I’ve been playing quite a bit with F#. I have several hobby projects I’m working on that take up a bit of my time. But when I’m not playing around with F# I’m exploring ways to apply certain functional patterns to actual coding on the job and/or porting to my functional library: RantPack.
Recently I’ve been playing around with continuations in F#. I thought this was a great topic to do a F# comparison with other languages. In this case C#.
Let’s examine a classic use of continuations: a right fold on a list. For a detailed explanation of fold right and the use of a continuation I suggest taking a look at Brian's discussion. If you’re unfamiliar with continuations I highly suggest that you take a look at this post as Brian gives a great breakdown of continuations and their uses.
Here's a quick refresher on continuations by example. Fold right is an operation which reduces a sequence of elements into a single element by processing the list from right to left. It’s similar to the LINQ Aggregate method except Aggregate operations left to right.
For this post we’ll be writing FoldRight against a sequence. I chose sequence vs. the traditional list because it’s present in both languages (F# = seq<’a>, C# = IEnumerable<T>). It is possible to other F# data structures in C# but the comparison is cleaner when using a type that is native to both languages.
Sequences are a left to right data structure so processing it right to left is not natural. After all, with a sequence all the developer has is current and whether or not there is a next element. Processing the list in a right to left fashion can be done by such acts as reversing the list, or doing a head recursive call. Both have their detractors.
Continuations are a different way to process the list. Instead of processing the list directly we process the list a single element at a time building up a continuation along the way. For each element a lambda expression is generated representing the work needed to be done for that element. The value calculated within the lambda will then be passed to the lambda calculated for the previous element. Once we hit the end of the list, we essentially have a chain of lambda expressions which process each element in the list in reverse order. All that is needed is to call the final lambda with the starting value and we will effectively process the list in reverse order.
Simple enough? Lets take a look at the code.
F# Code
let FoldRight combine (sequence:seq<'a>) acc =
use e = sequence.GetEnumerator()
let rec inner cont =
match e.MoveNext() with
| true ->
let cur = e.Current
inner (fun racc -> cont (combine cur racc))
| false -> cont acc
inner (fun x -> x )
C# Code
public static TAcc FoldRight<TSource, TAcc>(
this IEnumerable<TSource> enumerable,
Func<TAcc, TSource, TAcc> combine,
TAcc start) {
using (var e = enumerable.GetEnumerator()) {
Func<Func<TAcc, TAcc>, TAcc> inner = null;
inner = (cont) => {
if (e.MoveNext()) {
var cur = e.Current;
Func<TAcc, TAcc> innerCont = (x) => cont(combine(x, cur));
return inner(innerCont);
} else {
return cont(start);
}
};
return inner(x => x);
}
}
My immediate reaction to the two samples is the conciseness of the F# code. This is a not a criticism of C# though. F# is designed to be a concise language and it’s delivery on that goal is evident in this sample.
What makes the big difference here is the type inference power of F#. In the C# sample there are 6 explicit types listed in the code sample. The F# sample only has a single type listed. The compiler is able to infer and/or generate the rest of the signatures. F# also requires less explicit generic parameters 1 vs. 2 in C#.
The next big difference I see is the awkward way in which the inner lambda expression must be declared in C#. The lambda expression is called recursively in order to setup the continuation. In order to do that in C# the lambda must be declared and defined in separate expressions. Otherwise, a self reference of “inner” inside the body of “inner” will generate a used before defined warning from the compiler.
The IL
Examining the full IL of both functions would take several blog posts. Not to mention that trying to read disassembled F# much less IL, is like trying to read disassembled C++. An interesting exercise but a bit time consuming.
I did want to focus a bit on one portion of the generated IL. There is a very significant difference in the way the recursive call to the “inner” lambda is made.
F#
L_0032: ldarg.1
L_0033: ldarg.0
L_0034: ldfld !0 Test/clo@6T::acc
L_0039: tail
L_003b: callvirt instance !1 [FSharp.Core]Microsoft.FSharp.Core.FastFunc`2::Invoke(!0)
C#
L_0077: ldarg.0
L_0078: ldfld class [System.Core]System.Func`2, !1> ConsoleApplication1.Extensions/<>c__DisplayClass8::inner
L_007d: ldloc.0
L_007e: callvirt instance !1 [System.Core]System.Func`2, !TAcc>::Invoke(!0)
L_0083: stloc.3
In both cases the first 3 lines are building up the 2 parameters necessary for the recursive lambda call. The closures are structured somewhat differently but the same basic operation is being done.
The key difference between the languages is F#’s use of the tail opcode. This opcode tells the CLR the to call the next method in a tail recursive fashion. This causes the CLR to remove the current method frame from the stack before the next method is called. Because the method is removed from the stack, the recursive call takes up no additional stack space. This is true no matter how many times the function is called.
The C# IL does not have this opcode. So the recursive call will happen with the current method on the frame. With a big enough sequence this will cause the process to run out of stack space and generate a StackOverflowException. This creates a limitation on the number of elements the C# sample can process.
Limits
I explored the limits of both samples on my home laptop. I generated a simple example to sum the sequence with the fold right. Note: For a sum of ints, a fold left is just as good, but it serves fine for this sample.
F#
let sum = FoldRight (fun x y -> x + y) [1..1000000] 0
printfn "%d" sum
C#
var source = Enumerable.Range(1, 9397);
var result = source.FoldRight((x, y) => x + y, 0);
Console.WriteLine("{0}", result);
The C# sample can process a maximum size of 9397 elements. After that I encounter a stackoverflow exception. The F# sample however can easily process 1,000,000 elements.
Closing note. This not meant to be a post criticizing C#. It’s meant to be a general comparison of the same technique in two managed languages. This is a scenario that is far less likely to occur in a C# program. In an F# program it’s quite simply an expectation and hence the F# compiler is optimized for this scenario.
Comments
Anonymous
November 10, 2008
PingBack from http://www.tmao.info/comparing-continuations-in-f-and-c/Anonymous
November 10, 2008
Nice! I looked twice to understand the recursive call to inner. Then I decided to rewrite it to inner (combine cur >> cont) and it immediately became clear what was going on! BTW: you probably wrote (fun x->x) to avoid confusion, but the standard function "id" is of course quite handy for this :-) BTW2: I would prefer to take the sequence as the final argument, which seems consistent with the Seq module.Anonymous
November 10, 2008
Hi, I like write less and less code however it will be difficult to learn a new language and new FrameWork every 1 or 2 years. And I'm wondering when we could use F# in our project ?Anonymous
November 11, 2008
In my last post I went over the differences between using a continuation in F# and C# . As it turns out