Not as easy as it looks, Part Two

Holy goodness, did you guys ever find a lot of additional ways in which an "eliminate variable" refactoring can go wrong. Just a few of your observations: (again, in every case, "x" is eliminated.)

Any situation in which x is being treated as a variable rather than a value will pose a problem. Some obvious ones, like if x is on the left side of an assignment, or the target of a ++, or used as a "ref" or "out" parameter are straightforward. But there are some situations in which it is a bit surprising where x is being used as a variable. Suppose S is a mutable struct with field F:

S x = new S();
x.F = 123;

Here we cannot simply say (new S()).F = 123 because new S() is not a variable. The mutable field requires a storage location to mutate but we don't have a storage location here.

A number of people pointed out that the C# rules about how simple names may be used potentially pose problems:

int y = 2;
void M()
{
Action x = () => { Console.WriteLine(y); }; // refers to this.y
{
string y = "hello";
x();
}

The two inconsistent uses of the simple name y are legal here because the declaration spaces in which both are first used do not overlap. But the refactoring causes them to overlap; this needs to be refactored to

void M()
{
{
string y = "hello";
((Action)(()=>{Console.WriteLine(this.y);})))();
}

It gets even worse if the inconsistent simple names cannot be fully qualified:

Action x = () => {int z = 123;};
{
string z = "hello";
x();
}

Now one of the local zeds must be renamed if the refactoring is to succeed.

You can also get up to mischief with anonymous types:

int x = 42;
var a = new { x };

This must be refactored to

var a = new { x = 42 };

Similarly,

int x = k.y( );
var a = new { x };

cannot be refactored to

var a = new { k.y() };

because that changes the name of the anonymous type property.

Moving an expression around can also muck with the definite assignment rules; this would have to be illegal to refactor:

void Foo(out int b)
{
int x = b = R();
if (Q()) return;
doSomething(x);
}

because it becomes

void Foo(out int b)
{
if (Q()) return;
doSomething(b = R());
}

And now we have a method that returns without assigning the out parameter.

I already mentioned that moving an expression around can change the semantics of a program because the order of observed side effects can change. A number of horrid ways in which methods become dependent on side effects were noted, the nastiest of which was:

int x = LocksFoo();
lock (Bar) { return M(x); }

The refactoring changes the order in which Foo and Bar are locked, and therefore might cause a deadlock if other code depends on Foo always being locked before Bar.

Array initializers are a kind of expression that is only legal in a small number of situations; the refactoring has to take that into account:

int[] x = {0,1};
Console.WriteLine(x);

needs to be refactored into

Console.WriteLine(new int[] {0, 1});

And finally, moving an expression around can move it from a checked to an unchecked context, or vice versa:

int zero = 0;
int one = 1;
int x = int.MaxValue + one;
Console.WriteLine(checked(x + zero));

The naive refactoring turns a program which succeeds into one which fails at runtime:

Console.WriteLine(checked(int.MaxValue + one + zero));

To be correct, the refactoring would have to introduce an unchecked block around the first addition!

Thanks all, that was quite amusing.

Comments

  • Anonymous
    January 17, 2011
    So much pain for so little benefit... Probably not worth including this refactoring in the next version of VS !

  • Anonymous
    January 17, 2011
    @Thomas - I assume you're not a Resharper user? The funny thing about refactoring tools is how hard is it to see why they're useful until you have them.

  • Anonymous
    January 17, 2011
    I think you mean new { x = 42 }

  • Anonymous
    January 17, 2011
    Having so much edge cases increases the value as well.  If refactoring manually I have to notice all these issues and account for them myself every time.

  • Anonymous
    January 17, 2011
    I think there's one too many closing parenthesis in ((Action)(()=>{Console.WriteLine(this.y);})))(); I thought at first this was a new case where we need to add more than ((Type)(Value)), but it isn't.

  • Anonymous
    January 17, 2011
    Not just amusing but it makes you wonder why you'd ever try to erase a variable.  Even with one use point you'd be asking for the above headache, but in the case where there's multiple use-sites you'd end up with a clear answer: the variable must remain because: the solution to introduce a variable free alternative would ... introduce a variable to ensure the side-effects maintain consistency where applicable. Example: int x = StateChanger(); int y = x + 1, z = x - 2; becomes int y = (StateChanger() + 1), z = (StateChanger() - 2); The issue with that is obvious. In the case of: var x = new S(); x.F = 123; You'd end up using: new S() { F = 123 }; Which is the same, but ... object initializers introduce a background variable to do their work.  Ironic.

  • Anonymous
    January 17, 2011
    The comment has been removed

  • Anonymous
    January 17, 2011
    The comment has been removed

  • Anonymous
    January 17, 2011
    I find this feature really nice, but what I would really like to have is a class reordering feature, like the "Organize using --> Sort using"  but for a class, to reorder it's fields, properties, events, methods and constructors and at the end have the class organized with the same structure of the class diagram.

  • Anonymous
    January 17, 2011
    I guess I'm a bit late to the party. Did anybody mention closing over a loop variable? Func<int>[] fs = new Func<int>[10]; for (int i=0; i!= 10; ++i) { int x = i; fs[i] = ()=>x; } Console.WriteLine(fs7); This prints "7", but if you apply the naive refactoring it prints "10".

  • Anonymous
    January 17, 2011
    One other variant of the "simple names" scenario dealing with static variables: class C {  static int y = 2;  void M()  {    Action x = () => { Console.WriteLine(y); };  // refers to C.y    {      string y = "hello";      x();    } Refactor x: class C {  static int y = 2;  void M()  {    {      string y = "hello";      ((Action)(()=>{Console.WriteLine(C.y);}))();    }

  • Anonymous
    January 18, 2011
    DRBlaise: Static variables can still be disambiguated although not as easily - B.y here would disambiguate just fine.

  • Anonymous
    January 18, 2011
    Weeble: That's a good point! Nobody mentioned that...

  • Anonymous
    January 18, 2011
    Reordering: But that would destroy the natural order of a class... Take this example, to make things clear. Class RectangleCalculator has a public interface which accept a Rectangle, two Points or four integers public int Area(Rectangle r); public int Area(Point p1, Point p2); public int Area(int x1, int y1, int x2, int y2); public int Perimeter(Rectangle r); public int Perimeter(Point p1, Point p2); public int Perimeter(int x1, int y1, int x2, int y2); Each of these does the argument validation, and then they delegate to the private CalculateArea and CalculatePermiter. Wouldn't it make sense to order the class like this? public int Area(Rectangle r); public int Area(Point p1, Point p2); public int Area(int x1, int y1, int x2, int y2); private int CalculateArea(Whatever parameters); public int Perimeter(Rectangle r); public int Perimeter(Point p1, Point p2); public int Perimeter(int x1, int y1, int x2, int y2); private int CalculatePerimeter(Whatever parameters);

  • Anonymous
    January 18, 2011
    The comment has been removed

  • Anonymous
    January 18, 2011
    The comment has been removed

  • Anonymous
    January 18, 2011
    Reordering: if that happens to you, you're doing it wrong. That said, if you do have a method to reorder a class automatically to an order that doesn't make sense, it still wouldn't help you...

  • Anonymous
    January 19, 2011
    Perhaps, still it could be a starting point, I stopped to use resharper because of the useless comments of the type "make a method static" or "don't use the "this" keyword" and moreover it had some bugs which make it worst than the built in functions of visual studio, like the find all references, it was replaced with a resharper method which for some reason didn't find all references. But at this point I would suggest to remove also the "sort and remove the using" option from VS because it could order or remove them in an way that somebody doesn't like, so because somebody doesn't like it nobody should use it, correct? And for me make more sense a class ordered like a in the class diagram than a spaghetti class: public class MySpaghettiClass {   public void OnSpathettiAreUsed(){TheSpaghettiAreUsed(this,EventArgs.Empty);}    public void GetSomeSause() {....}   public event EventHandler TheSpaghettiAreThere;  private int howManySpaghetti=0;  public string TheName {get; set;}  pubclic void IsSomeSpaghettiThere(){ ...}  public MySpaghettiClass() : this(11) {} public int HowManySpaghettiAreLeft(){return _spaghetti;} public event EventHandler TheSpaghettiAreUsed;  public int HowManySpaghetti{ger{return howManySpaghetti;} set{ howManySpaghetti=value;}} public MySpaghettiClass(istring name) {TheName=name;} public void UseSpaghetti(int spaghetti){howManySpaghetti-=spaghetti; OnSpaghettiAreUsed();} .... }

  • Anonymous
    January 21, 2011
    @Reordering: How would a class get that way anyway? Unless you just drop code off at the end of the class when you need to add anything, which is a very bad practice, the class should be organized according to whatever order it is that you like them to be, which would normally not be alphabetic or according to accessibility.

  • Anonymous
    January 23, 2011
    The comment has been removed

  • Anonymous
    January 23, 2011
    The comment has been removed