The Stack Is An Implementation Detail, Part One

I blogged a while back about how “references” are often described as “addresses” when describing the semantics of the C# memory model. Though that’s arguably correct, it’s also arguably an implementation detail rather than an important eternal truth. Another memory-model implementation detail I often see presented as a fact is “value types are allocated on the stack”. I often see it because, of course, that’s what our documentation says.

Almost every article I see that describes the difference between value types and reference types explains in (frequently incorrect) detail about what “the stack” is and how the major difference between value types and reference types is that value types go on the stack. I’m sure you can find dozens of examples by searching the web.

I find this characterization of a value type based on its implementation details rather than its observable characteristics to be both confusing and unfortunate. Surely the most relevant fact about value types is not the implementation detail of how they are allocated, but rather the by-design semantic meaning of “value type”, namely that they are always copied “by value”. If the relevant thing were their allocation details then we’d have called them “heap types” and “stack types”. But that’s not relevant most of the time. Most of the time the relevant thing is their copying and identity semantics.

I regret that the documentation does not focus on what is most relevant; by focusing on a largely irrelevant implementation detail, we enlarge the importance of that implementation detail and obscure the importance of what makes a value type semantically useful. I dearly wish that all those articles explaining what “the stack” is would instead spend time explaining what exactly “copied by value” means and how misunderstanding or misusing “copy by value” can cause bugs.
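
To make “copied by value” concrete, here is a minimal sketch. The types PointValue and PointRef and the class CopySemanticsDemo are hypothetical names invented for this illustration. The first two methods contrast assigning a struct with assigning a class instance; the third shows one common way that misunderstanding copy-by-value causes a bug, namely mutating a copy fetched from a collection and expecting the stored element to change.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration only.
struct PointValue
{
    public int X;
    public PointValue(int x) { X = x; }
}

class PointRef
{
    public int X;
    public PointRef(int x) { X = x; }
}

static class CopySemanticsDemo
{
    // Assigning a struct copies the value: changing the copy
    // leaves the original untouched.
    public static int MutateStructCopy()
    {
        var original = new PointValue(1);
        var copy = original;   // copies the whole value
        copy.X = 42;
        return original.X;     // still 1
    }

    // Assigning a class instance copies the reference: both
    // variables refer to the same object.
    public static int MutateClassAlias()
    {
        var original = new PointRef(1);
        var alias = original;  // copies the reference only
        alias.X = 42;
        return original.X;     // now 42
    }

    // The list indexer returns a copy of the struct, so mutating
    // that copy never updates the element stored in the list.
    public static int MutateStructFromList()
    {
        var points = new List<PointValue> { new PointValue(1) };
        var p = points[0];     // p is a copy of the element
        p.X = 42;              // modifies only the local copy
        return points[0].X;    // still 1
    }

    static void Main()
    {
        Console.WriteLine(MutateStructCopy());     // prints 1
        Console.WriteLine(MutateClassAlias());     // prints 42
        Console.WriteLine(MutateStructFromList()); // prints 1
    }
}
```

Nothing in this sketch depends on where the storage for any of these variables lives; the differences in behaviour follow entirely from whether a value or a reference is copied.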

Of course, the simplistic statement I described is not even true. As the MSDN documentation correctly notes, value types are allocated on the stack sometimes. For example, the memory for an integer field in a class type is part of the class instance’s memory, which is allocated on the heap. A local variable is hoisted to be implemented as a field of a hidden class if the local is an outer variable used by an anonymous method(*) so again, the storage associated with that local variable will be on the heap if it is of value type.
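
Both cases can be sketched in a few lines. The names Counter, HoistedLocalDemo and MakeAccumulator are hypothetical, made up for this illustration. The int field lives inside the Counter instance, and the int local “total” is captured by a lambda, so it has to survive after the method that declared it has returned. How the compiler arranges that is its business; the observable fact is that the variable outlives the method activation.

```csharp
using System;

// Hypothetical types for illustration only.
class Counter
{
    // A value-type field: its storage is part of the Counter instance.
    public int Count;
}

static class HoistedLocalDemo
{
    // 'total' is a local variable of value type, but the lambda
    // captures it, so it must outlive this method's activation.
    public static Func<int> MakeAccumulator()
    {
        int total = 0;
        return () => { total += 10; return total; };
    }

    // The int field is read and written through the object that contains it.
    public static int IncrementField(Counter counter)
    {
        counter.Count += 1;
        return counter.Count;
    }

    static void Main()
    {
        Func<int> next = MakeAccumulator();
        Console.WriteLine(next()); // prints 10
        Console.WriteLine(next()); // prints 20: 'total' survived the first call

        var counter = new Counter();
        Console.WriteLine(IncrementField(counter)); // prints 1
    }
}
```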

But more generally, again we have an explanation that doesn’t actually explain anything. Leaving performance considerations aside, what possible difference does it make to the developer whether the CLR’s jitter happens to allocate memory for a particular local variable by adding some integer to the pointer that we call “the stack pointer” or adding the same integer to the pointer that we call “the top of the GC heap”? As long as the implementation maintains the semantics guaranteed by the specification, it can choose any strategy it likes for generating efficient code.

Heck, there’s no requirement that the operating system that the CLI is implemented on top of provide a per-thread one-meg array called “the stack”. That Windows typically does so, and that this one-meg array is an efficient place to store small amounts of short-lived data is great, but it’s not a requirement that an operating system provide such a structure, or that the jitter use it. The jitter could choose to put every local “on the heap” and live with the performance cost of doing so, as long as the value type semantics were maintained.

Even worse though is the frequently-seen characterization that value types are “small and fast” and reference types are “big and slow”. Indeed, value types that can be jitted to code that allocates off the stack are extremely fast to both allocate and deallocate. Large heap-allocated structures like arrays of value type are also pretty fast, particularly if you need them initialized to the default state of the value type. And there is some memory overhead to ref types. And there are some high-profile cases where value types give a big perf win. But in the vast majority of programs out there, local variable allocations and deallocations are not going to be the performance bottleneck.
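
As a small sketch of the “initialized to the default state” point about arrays: the types Sample and SampleRef and the class ArrayDefaultsDemo are hypothetical names for this illustration. A newly created array of a value type already holds usable default values, whereas a newly created array of a reference type holds only null references until each object is allocated separately. The sketch demonstrates the behaviour only and measures nothing about speed.

```csharp
using System;

// Hypothetical types for illustration only.
struct Sample
{
    public int A;
    public double B;
}

class SampleRef
{
    public int A;
    public double B;
}

static class ArrayDefaultsDemo
{
    // Every element of a new struct array is already the default value.
    public static bool StructArrayIsDefaulted(int length)
    {
        var items = new Sample[length];
        foreach (var item in items)
        {
            if (item.A != 0 || item.B != 0.0) return false;
        }
        return true;
    }

    // Every element of a new class array is a null reference; the
    // objects themselves would have to be allocated one by one.
    public static bool ClassArrayIsAllNull(int length)
    {
        var items = new SampleRef[length];
        foreach (var item in items)
        {
            if (item != null) return false;
        }
        return true;
    }

    static void Main()
    {
        Console.WriteLine(StructArrayIsDefaulted(1000)); // prints True
        Console.WriteLine(ClassArrayIsAllNull(1000));    // prints True
    }
}
```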

Making the nano-optimization of making a type that really should be a ref type into a value type for a few nanoseconds of perf gain is probably not worth it. I would only be making that choice if profiling data showed that there was a large, real-world-customer-impacting performance problem directly mitigated by using value types. Absent such data, I’d always make the choice of value type vs reference type based on whether the type is semantically representing a value or semantically a reference to something.

UPDATE: Part two is here

*******

(*) Or in an iterator block.

Comments

  • Anonymous
    April 27, 2009
    Excellent article - you're totally correct - the emphasis on value types stored on the stack is bizarre and confusing.

  • Anonymous
    April 27, 2009
    So... an improvement to the .NET documentation and its explorer could be to separate the semantic usage from the implementation details (show just when needed).

  • Anonymous
    April 27, 2009
    "...a simple image which may be wrong under certain circumstances is often better than no image at all." Unless you're making technical decisions based on that image. And most people do; for many, that's the entire purpose of having a mental image: to make decisions easier. One of the purposes of managed code is to make implementation details a much lower priority for developers than they used to be. It's reasonable, in many cases, to simply say that what's going on behind the scenes just doesn't matter. In some cases, it's even a requirement; different implementations of the CLR do things differently, and moving code (or even algorithms) between runtimes can be a wake-up call if you're depending on some implementation detail. And there are more implementations out there than you might think. If knowing the implementation details is something of a hobby for you, then by all means enjoy. If you've identified an implementation-dependent bit of code as a major bottleneck in your application, then feel free to fit it precisely to a specific implementation. But for the vast majority of the managed code that most people write, they should keep the implementation details pretty far from their minds and stick to the behavioural and semantic details -- the ones that are essentially guaranteed by the specifications.

  • Anonymous
    April 27, 2009
    I'm sure I'm not the only reader looking for the upvote button at this point... Having said that, the stack/heap problem doesn't feel as bad to me as the "reference by value" vs "object by reference" issue, where developers still insist that "objects are passed by reference by default" (in C#). Avoiding using the word "reference" for two definitely-related-but-distinct concepts would have been handy - but arguably this is computer science's fault more than C#'s. Jon

  • Anonymous
    April 27, 2009
    I often complain about a similar problem with the way people usually describe reorderings in memory models.  Everyone seems to start with a discussion of cache coherence, write buffers, memory hierarchies, etc., when what is really important is just that things can be reordered.  The student is often left with the impression that this stuff is a lot harder to understand than it really is.  (I'm not saying memory models are easy - just that you really don't need an EE degree to understand how to use them).  Describing things this way also makes people think it's just a hardware issue, when in fact the most surprising reorderings are done by software (compilers, runtimes, etc.).  The only time a programmer should be thinking about cache lines is when something isn't going fast enough.  And even then they should try thinking about everything else first. :)

  • Anonymous
    April 27, 2009
    "Making the nano-optimization of making a type that really should be a ref type into a value type for a few nanoseconds of perf gain is probably not worth it. I would only be making that choice if profiling data showed that there was a large, real-world-customer-impacting performance problem directly mitigated by using value types. Absent such data, I’d always make the choice of value type vs reference type based on whether the type is semantically representing a value or semantically a reference to something." I find that the problem is more commonly doing the other way around: I'd like to make all my no-inherent-identity, value semantic types value types, but it seems that it places an undue performance burden for large aggregates for no good reason. Hence the MSDN (and, if I recall correctly, Framework Design Guideline) recommendations to avoid structs over 4 ints large, and make them into classes past that point instead. Which is a pity, because I really shouldn't have to be concerned about such things at all - as you yourself say, I should care about semantics, not about implementation details (and how they affect performance). Unfortunately, I'm forced to. It would be really nice if CLR would indeed "do what's best" in any given case, and allocate some value types on the heap, cloning as needed when passing around copies (and possibly doing some mutability analysis to avoid copying at all!). As you say, it has a mandate to do just that already, so I'm surprised it's not used yet. On a side note, sticking to this definition of value types - don't you think it would be much less confusing to have the value-semantic types in FCL also be structs? I mean things like System.String and System.Uri. In fact, I've never understood the design decision of making String a reference type - it looks like one of those unfortunate things done following the Java path, like array covariance. 
    IMO, the fact that C# overloads == and != for String is already a dead giveaway that it shouldn't be a reference type in the first place (and the same goes for any other type, by the way). And, of course, the pure evilness that is "null"...
    Structs in the CLI have the pleasant-for-the-implementor property that they are always of a size known at compile time. Strings and URIs do not have that property. So we can either (1) abandon the nice property that structs are of known size, or (2) make strings an immutable reference type and overload equality so that they act like value types. We chose (2). Design is the art of making reasonable compromises when faced with conflicting goals. -- Eric

  • Anonymous
    April 27, 2009
    Even odder about the "light and fast" idea is that up until recently (and even now, I think), the JIT doesn't do all the optimization it could with value types. Before 3.5 SP1, I think tail calls wouldn't be performed if the type was a value type. In running mini benchmarks here and there, sometimes passing around a struct results in far worse performance.

  • Anonymous
    April 27, 2009
    Since I work in C++ mainly, I can only comment from the C++ viewpoint. I try to use as much value semantics, with objects allocated on the stack, as possible. Especially when combined with templates. Littering your code with 'new' statements when it's not necessary and trivial to avoid, is basically a time sink. Stack allocation is an order of magnitude faster, and it does not need to lock anything. So advising people to not pay attention to this, will create a codebase that is slower than it should be.

  • Anonymous
    April 27, 2009
    But...C# is intended to run on CLI, right? Even though ECMA-334 states that C# doesn't have to rely on CLI, but any implementation of C# should still support minimal CLI features required by C# standard. And ECMA-335 CLI does talk about "the invocation stack" and "the heap", even though this stack doesn't have to be related to the C-stack. It's pretty vague to just say "the stack" without pointing out "which stack", for implementations of CLI can choose to put invocation stack frames on the heap, or whatsoever.

  • Anonymous
    April 27, 2009
    I agree that the storage location of a type is not the most important property to describe when explaining the difference between value and reference types. Neither is light and fast vs big and slow. But as an alternative, could you give a good explanation based on the "observable characteristics", in such a way that developers will get a good understanding of the difference and be able to make the right choice?

  • Anonymous
    April 28, 2009
    Interesting Finds: April 28, 2009

  • Anonymous
    April 28, 2009
    I like this article.  I have seen interview questions like "what is the difference between a value type and a reference type?" and this stack vs heap comes up as an answer a lot. The "copy by value" is a much more valuable concept to understand.  Pardon the pun.

  • Anonymous
    April 28, 2009
    > Structs in the CLI have the pleasant-for-the-implementor property that they are always of a size known at compile time. Strings and URIs do not have that property. So we can either (1) abandon the nice property that structs are of known size, or (2) make strings an immutable reference type and overload equality so that they act like value types. There's also (3), implement as an immutable reference type, but expose to API clients wrapped into a value type. I.e.: internal class StringData { ... } public struct String {  private StringData data;  public static bool operator== (String a, String b) { ... }  ... }

  • Anonymous
    April 28, 2009
    | Leaving performance considerations aside, Wait WHAT? I always thought that performance considerations were the only reason to include user-defined value types in the language. Copy semantics are totally useless since we don't have any control over them, we can't implement our own copy/assignment constructors, nor guaranteed destructors (and I can't comprehend why). It's always bit-by-bit copy. Why would you need that, leaving performance considerations aside? So the only thing value types are useful for is performance. An array of structs is laid out contiguously in memory, which means that when performance is an issue and you can convert some of your objects to structs, you can ditch List<T> in favour of Array.Resize (wrapped in a ref-parametered extension method that hopefully would be inlined sometimes) and get your 10x speed increase for that part of the code thanks to the prefetch. Oh, well, there is this p/invoke stuff, too. Got it almost right in 2.0, IIRC, with embeddable arrays.

  • Anonymous
    April 28, 2009
    I mean, it looks like you've mistaken "value type semantics" for the "immutable type semantics":
        i = 10
        j = i
        j += 1
        assert i != j
    Is that what you meant? Well, firstly, Python lacks value types (and that was valid Python), even ints are heap-allocated, yet they are immutable, because immutability is a property of operations on a type. Second, C# value types are not immutable (except when stored in a List<T>, and that's why one has to use custom wrappers, sadly). No, these two things are profoundly different, and the value of the value types lies entirely in the performance characteristics. Or am I wrong?

  • Anonymous
    April 28, 2009
    Great article! For reference (haha, a pun), complex numbers using a value type for storage is probably the single most compelling example for value types in the context of performance. For example, computing the FFT of an array of complex numbers is ~5.5x faster with a value type than a reference type. Also, value types can be essential for interop, e.g. in XNA.

  • Anonymous
    April 28, 2009
    I think one reason the stack/heap trope lingers is that stack vs heap is often part of the bumper-sticker explanation of boxing.  To wit: reference objects are "in the heap", so to be treated as a reference object, the value type must be copied there, which highlights the fact that it wasn't there to begin with. Furthermore, on the spectrum of .NET knowledge, boxing shades toward the arcane and spooky.  Arcane in that 90% of developers don't knowingly encounter it during 90% of their workday; spooky in that, among those that do think about it, most of the encounters are negative, i.e., cleaning up a bug due to accidental boxing, or at least including boxing in The List of Things That Could Go Subtly Wrong. The brain often transmutes arcane and spooky knowledge into some proxied form.  Mental categories for things-that-might-hurt are pretty powerful.  When trying to account for bizarre behavior, it's at least an interesting place to start.  Interesting enough for Freud to upend the study of psychology, anyway.

  • Anonymous
    April 29, 2009
    What's lost in the stack allocation discussion is that higher level program structure changes will often have much better performance results. Low level optimizations are typically done last in the development cycle and require the use of a code instruction level profiler.  It would help if the instruction level profiler was included in the basic or professional Visual Studio.  The much more expensive Team Foundation System is beyond the reach of many development shops.

  • Anonymous
    April 29, 2009
    I would say that it is important to define Value Types vs Object types both in terms of how they behave AND in how they are implemented. If you don't realize that the stack, by its very nature, limits the number of value types you can declare within a function, you risk a StackOverflowException without knowing why or how to fix it, as today's post on my blog points out. If you don't realize that values are copied while objects are pointed to by a reference, you risk thinking that you've copied an object when in fact all you've done is moved a pointer around, and then you wonder why the original value got changed when you changed the copy. WHAT these types are doing in memory is in fact WHY a Value Type or an Object Type behaves the way it does.  Not that it has to be that way, but understanding the WHAT will take you a long way to understanding the behavior you can expect. It's not an "either/or" issue.  It's a "both/and" issue.

  • Anonymous
    April 30, 2009
    Stack vs. heap is still important to understand, for the simple reason that there are other languages out there that do things differently (notably C++). Obviously, items on the stack are gone when the current function returns; objects on the heap may live on.  In C++, a class can be instantiated on the stack, and things like CriticalSection classes depend on that behavior.  People coming from that background need to understand the very important difference in .NET. When .NET came out I thought this change was one of the more profound things they did.  In C++ the developer chose the lifetime semantics, regardless of type (struct or class).  In C#, the type determines the lifetime semantics.  Essential knowledge, even if "the stack" isn't actually on the CPU stack.

  • Anonymous
    April 30, 2009
    > WHAT these types are doing in memory is in fact WHY a Value Type or an Object Type behaves the way it does. I think your logic is reversed here. As Eric has pointed out above, it's the observe behavior of value types that is primary, and their implementation (including "what they are doing in memory") is dictated by that.

  • Anonymous
    May 03, 2009
    This leads to a question I've had about the .NET memory model. We use the terms "stack" & "heap" because back in the 1980s, a common C compiler/library organized local data using a stack data structure, while it stored malloc()ed blocks using a heap data structure.   So, the question is, in the current .NET implementation, is the Stack still a stack, and is the Heap still a heap?

  • Anonymous
    May 04, 2009
    In the last post , we’ve seen that one of the overloads used for creating a thread receives a parameter

  • Anonymous
    July 12, 2009
    One thing I never understood about Microsoft is why you have employees (like yourself) who blog about problems, but never fix those problems. In this case, you highlighted a significant shortcoming in the MSDN documentation of System.ValueType – if this shortcoming is known, why doesn't it get fixed?
