When does it make sense to use Win32 Fibers?

This has been discussed fairly frequently on the Web. 

Chris Brumme discusses this here: https://blogs.msdn.com/cbrumme/archive/2003/04/15/51351.aspx

Raymond Chen discusses this here: https://blogs.msdn.com/oldnewthing/archive/2004/12/31/344799.aspx.

There’s some discussion about the pros and cons of Fibers in SQL Server here: https://msdn.microsoft.com/en-us/library/aa175385.aspx

Those are just the first three relevant links from a simple search. But still, the subject comes up.  

Most people think of threads vs. fibers in terms of the "expense" of each option. Fibers are often thought to be "cheaper" than threads (they're often referred to as "lightweight threads") and are thought to perform better, or at least to let your app do things that make it perform better. So what are the expenses involved here?

For most apps, the most important expense of threads is the virtual address space reserved for the stack. Fibers don’t fix this; they carry just as much stack as a thread.

Another expense of threads is that there is overhead to preemptive scheduling. If you have more threads than CPUs, and they routinely use up all of their scheduling quanta, you have to pay the expense of the kernel periodically forcing a (possibly unnecessary) context switch. This is a fairly rare problem for real apps, which almost never use their full quanta. Usually an app has blocked on some kernel object, waiting for input from the user, the network, etc., long before the full quantum is used. And even if preemption does occur, it’s rarely the most significant expense an app incurs.

If preemptive context switches are your biggest problem, the only solution is to reduce the number of threads. Fibers don’t solve that problem. However, reducing the number of threads introduces a new problem: you’ve still got the same amount of work to do, and now it’s up to you to multiplex all that work onto just a few threads.

What fibers actually do (try to) solve is the problem of manually multiplexing lots of work onto a few threads. There are lots of ways to do this – asynchronous I/O being the prime example – but fibers try to make it really easy by letting you keep your stack, so you can just write code the way you’re used to.
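To make "multiplexing lots of work onto a few threads" concrete, here is a minimal sketch in Python rather than Win32. It uses generators as a rough stand-in for fibers; the `worker` and `run_round_robin` names are made up for illustration, and the bare `yield` plays the role that a call like SwitchToFiber would play in real fiber code. Each task keeps its loop counter in an ordinary local variable across switches, which is the convenience fibers are after.

```python
from collections import deque


def worker(name, steps, log):
    """A task whose state lives in plain local variables (hypothetical example)."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # cooperative switch point, loosely analogous to SwitchToFiber


def run_round_robin(tasks):
    """Multiplex every task onto the calling thread until all have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run the task up to its next switch point
        except StopIteration:
            continue            # task finished; do not reschedule it
        ready.append(task)      # still has work; go to the back of the queue


log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
print(log)
```

Nothing preempts a task here: a task that never yields starves all the others, and that scheduling responsibility is exactly what you take on when you go this route.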

Another common use of fibers is to implement coroutines. This is what Raymond Chen’s doing in his Enumerator example. The idea here, again, is to make switching contexts easier by allowing you to do everything on the stack. But here there’s no pretense of a performance benefit; it’s quite obvious to everyone that allocating two or more 1MB stacks, and switching between them every few instructions, is going to suck, perf-wise. It’s just (supposed to be) easier to write code this way than the alternatives (allocating everything on the heap, splitting your logic across lots of methods, etc.).

So really fibers aren’t about performance at all, but about making manual context-switching easier.

The downside of fibers is that they’re not a general-purpose mechanism like threads. Any code that depends on thread-local storage (which is almost all code that does anything) doesn’t work with fibers. By going with fibers, you’re taking a lot of matters into your own hands; this is obvious in the case of scheduling, but it’s not so obvious that you’re also taking responsibility for most of what your language’s runtime library does, or vast swaths of Win32, etc. You also can’t write managed code. And you’ll never be able to call out to 3rd party code.
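The thread-local storage problem is easy to reproduce in miniature. The sketch below is an analogy in Python, not Win32 code: two generator-based tasks (hypothetical names) share one thread, and `threading.local()` stands in for a TLS slot. Each task assumes the value it stored is still there when it resumes, which is the assumption that library code written for threads tends to make.

```python
import threading

tls = threading.local()  # stands in for a per-thread TLS slot


def task(name, seen):
    """Store 'per-thread' state, switch away, then read it back."""
    tls.owner = name        # this task believes the slot is its own
    yield                   # switch away; another task runs on the SAME thread
    seen[name] = tls.owner  # after resuming, read the value back


seen = {}
a, b = task("a", seen), task("b", seen)
next(a)                 # a stores "a" and switches away
next(b)                 # b overwrites the slot with "b" on the same thread
for t in (a, b):
    next(t, None)       # resume each task so it runs to completion
print(seen)
```

Task `a` reads back `"b"`, because the storage followed the thread rather than the task. Code that caches anything per-thread has the same failure mode when it is run on fibers that share a thread.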

So in exchange for easy context-switching, you get increased memory usage, and you lose the ability to use virtually any basic software building block ever written.

For something like Microsoft SQL Server, which is one of the most highly-optimized apps on the planet, and which pretty much lives in its own world where memory is “cheap” and library code is suspect anyway, it probably made sense to go with fibers to help reduce context-switch overhead without having to completely re-write SQL Server. For almost literally everyone else, it just doesn’t make sense.

And if all you want is coroutines, ask your favorite language vendor to provide proper support for them. :) C# has iterators, which beat the pants off of Raymond’s enumeration example; that’s a good start, but iterators are awkward for anything other than, well, enumerating things.
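For a feel of what language-level support buys you, here is a small sketch of a recursive tree enumerator. It is written with Python generators as an analogy to C# iterators, not as a translation of Raymond's code, and the `walk` function and the sample `tree` are invented for illustration. The recursion state stays in the generator frames, so there are no extra stacks to allocate and no hand-written state machine.

```python
def walk(tree, path=""):
    """Yield the path of every leaf in a nested dict, depth first."""
    for name, child in tree.items():
        full = f"{path}/{name}"
        if isinstance(child, dict):
            # Descend into the subtree; its results are passed straight through.
            yield from walk(child, full)
        else:
            yield full


# Hypothetical directory-like structure: dicts are folders, None marks a file.
tree = {"src": {"main.c": None, "util": {"str.c": None}}, "README": None}
paths = list(walk(tree))
print(paths)
```

The caller pulls one item at a time and the enumerator picks up where it left off, which is the behavior the fiber-based version gets from switching stacks.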

Comments

  • Anonymous
    August 17, 2008
    Is it possible to use COM with fibers?

  • Anonymous
    August 25, 2008
    It may well be possible, with great care - but COM certainly has a bunch of thread-affinitized concepts.  For example, every thread needs to be CoInitialized, which sets up per-thread data structures including the all-important "apartment."  These certainly will not follow fibers, but rather will be associated with the underlying thread.  I don't know exactly what implications this would have for the COM programmer, but I'd be surprised if it wasn't at least a major undertaking to make it work.

  • Anonymous
    October 18, 2010
    Is the fiber API only a public API to stop any anti-trust problem from other database vendors claiming that the OS team is helping the Microsoft SQL Server team too much?

  • Anonymous
    October 19, 2010
    Ian: I don't know all of the history here, but there are certainly other reasons to want a public fiber API.  It's not that fibers are only useful to SQL Server, it's just that they're not often useful to very many applications.  That describes quite a few public Win32 APIs, actually. :)