Every public change is a breaking change

Here's an inconvenient truth: just about every "public surface area" change you make to your code is a potential breaking change.

First off, I should clarify what I mean by a "breaking change" for the purposes of this article. If you provide a component to a third party, then a "breaking change" is a change such that the third party's code compiled correctly with the previous version, but the change causes a recompilation to fail. (A stricter definition would also count a change where the code recompiles successfully but has a different meaning; for today let's just consider actual "build breaks".) A "potential" breaking change is a change which might cause a break, if the third party happens to have consumed your component in a particular way. By a "public surface area" change, I mean a change to the "public metadata" surface of a component, like adding a new method, rather than changing the behaviour of an existing method by editing its body. (Editing a method body would typically cause a difference in runtime behaviour, rather than a build break.)
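To make the stricter definition concrete, here is a minimal sketch of a change that still compiles but changes meaning. The names Widget, Describe and WidgetExtensions are hypothetical and exist only for illustration. The consumer wrote an extension method; the component later gained an instance method with the same name, and the consumer's call quietly binds to the new method.

```csharp
using System;

// Hypothetical component type, as it looks AFTER the vendor adds Describe().
public class Widget
{
    // Newly added instance method. Before it existed, calls to
    // widget.Describe() bound to the consumer's extension method below.
    public string Describe() => "component's Describe";
}

// Hypothetical consumer code, unchanged between the two component versions.
public static class WidgetExtensions
{
    public static string Describe(this Widget w) => "consumer's extension Describe";
}

public static class Program
{
    public static void Main()
    {
        // An applicable instance method is preferred over an extension method,
        // so this call now binds to Widget.Describe() without a compile error.
        Console.WriteLine(new Widget().Describe());
    }
}
```

The rest of this article sets this kind of change aside and looks only at changes that stop the consumer's code from compiling.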

Some public surface area breaking changes are obvious: making a public method into a private method, sealing an unsealed class, and so on. Third-party code that called the method, or extended the class, will break. But a lot of changes seem a lot safer; adding a new public method, for example, or making a read-only property into a read-write property. As it turns out, almost any change you make to the public surface area of a component is a potential breaking change. Let's look at some examples. Suppose you add a new overload:

    // old component code:
    public interface IFoo { ... }
    public interface IBar { ... }
    public class Component
    {
        public void M(IFoo x) { ... }
    }

Suppose you then later add

    public void M(IBar x) {...}

to Component. Suppose the consumer code is:

    // consumer code:
    class Consumer : IFoo, IBar
    {
        ...
        component.M(this);
        ...
    }

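The consumer code compiles successfully against the original component, but when it is recompiled against the new component the build suddenly breaks with an overload resolution ambiguity error: "this" converts equally well to IFoo and to IBar, so neither overload of M is better than the other. Oops.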
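For what it's worth, the consumer can usually repair this kind of break with an explicit cast. The following is only a sketch under some assumptions: the method bodies are placeholders that return a string naming the chosen overload (the original methods return void), and the component field and the Call method are hypothetical names added so that the example is self-contained.

```csharp
using System;

public interface IFoo { }
public interface IBar { }

public class Component
{
    // Placeholder bodies that report which overload was chosen.
    public string M(IFoo x) => "M(IFoo)";
    public string M(IBar x) => "M(IBar)";   // the newly added overload
}

public class Consumer : IFoo, IBar
{
    private readonly Component component = new Component();

    public string Call()
    {
        // component.M(this) would be ambiguous here, because 'this' converts
        // equally well to IFoo and to IBar. The cast picks one explicitly.
        return component.M((IFoo)this);
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(new Consumer().Call());
    }
}
```

The fix is trivial once the consumer sees the error; the point is that the consumer has to edit and rebuild code that used to compile.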

What about adding an entirely new method?

    // old component code:
    ...
    public class Component
    {
        public void MFoo(IFoo x) { ... }
    }

and now you add

    public void MBar(IBar x) { ... }

No problem now, right? The consumer could not possibly have been consuming MBar. Surely adding it could not be a build break on the consumer, right?

    class Consumer
    {
        class Blah
        {
            public void MBar(IBar x) {}
        }
        static void N(Action<Blah> a) {}
        static void N(Action<Component> a) {}
        static void D(IBar bar)
        {
            N(x => { x.MBar(bar); });
        }
    }

Oh, the pain. In the original version, overload resolution has two overloads of N to choose from. The lambda is not convertible to Action<Component> because typing formal parameter x as Component causes the body of the lambda to have an error: Component has no MBar method. That overload is therefore discarded. The remaining overload is the sole applicable candidate; the lambda body binds without error with x typed as Blah.

In the new version of Component the body of the lambda does not have an error; therefore overload resolution has two candidates to choose from and neither is better than the other; this produces an ambiguity error.
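Here too the consumer has a way out: giving the lambda parameter an explicit type. This is a sketch, assuming placeholder return values that name the chosen overload of N (the original methods return void) and a hypothetical Bar class that exists only so the example can run.

```csharp
using System;

public interface IBar { }

public class Component
{
    public void MBar(IBar x) { }   // the newly added method
}

public class Consumer
{
    public class Blah
    {
        public void MBar(IBar x) { }
    }

    // Placeholder return values that report which overload was chosen.
    public static string N(Action<Blah> a) => "N(Action<Blah>)";
    public static string N(Action<Component> a) => "N(Action<Component>)";

    public static string D(IBar bar)
    {
        // With x explicitly typed as Blah, the lambda cannot convert to
        // Action<Component>, so only one overload of N is applicable.
        return N((Blah x) => { x.MBar(bar); });
    }
}

public static class Program
{
    // Hypothetical IBar implementation, used only to call D.
    private sealed class Bar : IBar { }

    public static void Main()
    {
        Console.WriteLine(Consumer.D(new Bar()));
    }
}
```

An explicitly typed lambda takes the body-binding trick out of the picture, because the parameter type alone decides which delegate types the lambda can convert to.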

This particular "flavour" of breaking change is an odd one in that it makes almost every possible change to the surface area of a type into a potential breaking change, while at the same time being such an obviously contrived and unlikely scenario that no "real world" developers are likely to run into it. When we are evaluating the impact of potential breaking changes on our customers, we now explicitly discount this flavour of breaking change as so unlikely as to be unimportant. Still, I think it's important to make that decision with eyes open, rather than being unaware of the problem.

Comments

  • Anonymous
    January 09, 2012
    You pretty much have to ignore these possible problems, the same way you can't guess what extension methods we've made that may be silently overridden by new methods in a class.

  • Anonymous
    January 09, 2012
    Thanks for the great analysis. I love the suspense building until you finally admit at the end that the examples are "such an obviously contrived and unlikely scenario that no "real world" developers are likely to run into it". I had been wondering if that would be mentioned, and the article does not disappoint!  :) As usual, more interesting insight into the world of language design. (By the way, I find that if I have waited too long after navigating to the blog article page, then when I attempt to post a comment, the submit button results in the page reloading and anything I've typed being discarded. While bugs like these have led me to make a habit of copying typed text before clicking anything, it's still pretty annoying.)

  • Anonymous
    January 09, 2012
    These problems probably aren't very common. They haven't happened to me even a single time. It is good to know, and the post itself is great, 5 stars.

  • Anonymous
    January 09, 2012
    Thomas: See stackoverflow.com/.../8581029 for an example by Eric Lippert.  I was expecting this article to be about that question, actually.
    I've been meaning to write this post for years; we identified this as a problem when we added lambdas to C# 3.0. The SO question in December motivated me to actually do it. -- Eric

  • Anonymous
    January 09, 2012
    Hmm, as long as the consuming code actually fails to compile I would agree this is not a big problem. But what if client code suddenly binds to a different member but still compiles? This is actually not a breaking change by your definition, but I would think this is a bigger problem as there is no warning whatsoever. I can only think of extension methods that could cause this kind of situation, or are there also more subtle situations?

  • Anonymous
    January 09, 2012
    I'm curious to see how unsealing a class can cause code to stop compiling.
    Same trick. This trick is surprisingly versatile.
        class B {}
        sealed class C {}
        class D : B, I {}
        interface I {}
        class P {
          static void M(Func<C, I> fci){}
          static void M(Func<B, I> fbi){} // special agent!
          static void Main() { M(q=>q as I); }
        }
    That compiles successfully because "q as I" is illegal if q is of type C. The compiler knows that C does not implement I, and because it is sealed, it knows that no possible type that is compatible with C could implement I. Therefore overload resolution chooses the func from B to I, because a type derived from B, say D, could implement I. When C is unsealed then overload resolution has no basis to choose one over the other, so it becomes an ambiguity error. -- Eric

  • Anonymous
    January 09, 2012
    Seems to me you would be much better off in this regard avoiding the new functional features and sticking to the C# 2.0 specifications. That way you're pretty much protected against breakages like these right?
    Sure, but at what cost? Is it worthwhile to eschew the benefits of LINQ in order to avoid the drawbacks of some obscure, unlikely and contrived breaking changes? The benefits of LINQ outweigh the costs by a huge margin, in my opinion. -- Eric

  • Anonymous
    January 09, 2012
    You know, all these examples make me think overload resolution is just too clever for its own good -- these kinds of algorithms are more the kind you'd expect in the optimizing part, where cleverness abounds, not in the basic semantic part. Clearly, the solution is to abolish overload resolution, assign every method an unambiguous canonical name (that's stable in the face of changes) and force the developer to specify this name on every call. Alternatively, make overload resolution unnecessary by forbidding overloads. Not sure how to handle generics in such a world -- abolishing them seems too harsh. (Mandatory Internet disclaimer: the above is facetious, but slightly ha-ha-only-serious.)
    There are languages that do not have overload resolution. A .NET language could, for instance, specify the unique metadata token associated with the method definition. But in a world without overload resolution, how would you do query comprehensions? -- Eric

  • Anonymous
    January 09, 2012
    I believe I made a comment before, but it seems to have disappeared. However, it seems to me that the breaking changes tradeoff always comes up. In an agile world this just doesn't seem to fit my view of how things should be done. We promote constant refactoring and agile methods everywhere else. Isn't it time for us developers to change perspective? Instead of trying to provide non-breaking changes, shouldn't we deliver a tool that lets the consumer convert his code to the new specifications? Obviously this tool could be pedagogic, combine presentation of warnings with errors, and so on. We only need a good abstraction over C# code now! Oh wait...yeah Roslyn is on its way. However, this will naturally require time and effort. You can't help but think that a programming language that includes a "transaction log" of the program would be beneficial, i.e. information on how methods have come and gone and changed name or so. But I know of no such language. Opinions, Eric?

  • Anonymous
    January 09, 2012
    I agree that this probably isn't something most developers need to worry about, but it is absolutely something most architects should think about. The unfortunate fact is that people who consume library code don't want to have to update their code just because you changed something. Library architects have to go out of their way to prevent breaking changes. Obviously the example here is a little contrived, but there are applications. For example, these problems are a lot more likely with fluent-style APIs due to the common method names. Architects should balance that risk with the potential improvement in usability before picking a pattern. Extension methods are another example: core, critical functionality should be placed directly on the class whenever possible to prevent accidental overrides by consumers. I'm sure there are other examples where these sorts of issues can drive important decisions in the real world. They shouldn't just be ignored.

  • Anonymous
    January 11, 2012
    I would like to add that I've had code break simply from the creation of a public class!

  • Anonymous
    January 11, 2012
    @Staffan:

  1.  I really appreciate that when I upgraded my server from .Net 2.0 to .Net 4.0, I only had to fix a few things (aka, breaking changes) before my site would work.  I would have been hesitant to update if I believed this would not be the case.
  2.  Breaking changes that are easy to fix automatically can usually be made in a way that prevents them from being breaking changes (ignoring edge cases like this one), and such automation will usually fail in the same places as the change is breaking.
  3.  If you're using a library you don't own, having it break is horrible.

  • Anonymous
    January 15, 2012
    Sorry - I wrote @Tim - I meant @Brian

  • Anonymous
    February 13, 2012
    There's a handful of places where reflection to access certain private members is so common that changing them qualifies as breaking.

  • Anonymous
    August 23, 2012
    @Tim Copenhaver, I can't help wondering what these two distinct "developer" and "architect" roles are?