Many C# Questions: Switching on non-constant values.

I finally decided to play with the style settings on my blog. As you may have guessed, I'm a bit of a newbie when it comes to websites and blogging. Let me know what you think of the new look. Last week's post generated some great comments. Tzagotta asks:

Why are constant expressions required in case labels? I see a switch statement conceptually the same as a series of chained ifs/else ifs.

This is an interesting question. It seems reasonable that you should be able to switch on an expression of any type, and have case labels which can take any value – even values which are not compile-time constants. A simple example would look something like this:

class Program {
    static void Main() {
        object myFirstObject = ...;
        object mySecondObject = ...;
        object obj = ...;

        switch (obj) {
        case myFirstObject: ...; break;
        case mySecondObject: ...; break;
        default: ...; break;
        }
    }
}

which would be equivalent to this code:

class Program {
    static void Main() {
        object myFirstObject = ...;
        object mySecondObject = ...;
        object obj = ...;

        if (obj == myFirstObject) {
            ...
        } else if (obj == mySecondObject) {
            ...
        } else {
            ...
        }
    }
}

At first glance this seems like a great idea – the switch is slightly more compact and readable than the chained ifs, both desirable characteristics of a programming language. If the construct could easily be limited to this simple example it might be a reasonable language extension to consider. Unfortunately, things are not as simple as they appear…

Firstly, what happens when myFirstObject is equal to mySecondObject? Well that depends on which case label came first in your switch. The order of the case labels becomes significant in determining which block of code gets executed. Because the case label expressions are not constant the compiler cannot verify that the values of the case labels are distinct, so this is a possibility which must be catered to. This runs counter to most programmers’ intuition about the switch statement in a couple of ways. Most programmers would be surprised to learn that changing the order of their case blocks changed the meaning of their program. To turn it around, it would be surprising if the expression being switched on was equal to an expression in a case label, but control didn’t go to that label.
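To make the ordering problem concrete, here is a minimal sketch that assumes the proposed switch would behave like a top-to-bottom chain of == tests. The CaseOrderDemo class and its FirstMatch helper are hypothetical names invented for this illustration; they are not part of C# or of any proposal.

```csharp
using System;

class CaseOrderDemo
{
    // Hypothetical helper: emulates a switch with non-constant labels by
    // testing each label in the order it is written and returning the name
    // of the first block whose label is reference-equal to obj.
    public static string FirstMatch(object obj, object[] labels, string[] names)
    {
        for (int i = 0; i < labels.Length; i++)
        {
            if (obj == labels[i]) return names[i];
        }
        return "default";
    }

    static void Main()
    {
        object myFirstObject = new object();
        object mySecondObject = myFirstObject; // both labels hold the same reference
        object obj = myFirstObject;

        // Labels in their original order: the first block wins.
        Console.WriteLine(FirstMatch(obj,
            new[] { myFirstObject, mySecondObject },
            new[] { "first block", "second block" }));   // prints "first block"

        // Same labels, blocks reordered: now the second block wins.
        Console.WriteLine(FirstMatch(obj,
            new[] { mySecondObject, myFirstObject },
            new[] { "second block", "first block" }));   // prints "second block"
    }
}
```

Nothing about obj or the labels changed between the two calls; only the order in which the blocks were written did.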

Secondly, allowing non-constant expressions as case labels lets in some significantly less desirable code as well. For example, this code would also be legal:

        switch (obj) {
        case myFirstObject: ...; break;
        case mySecondObject: ...; break;
        case EraseMyHardDrive(): ...; break;
        default: ...; break;
        }

In C#, any non-constant expression can yield side effects. There is currently no notion of a non-constant expression which does not have side effects in C#. Seeing code like this will leave the reader scratching their head wondering when their hard drive will get erased.
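One way to see why this is confusing is to rewrite the switch as the chained ifs it would presumably be equivalent to, and count how often the side-effecting label actually runs. This is only a sketch under that assumption. SideEffectDemo, NoisyLabel and Calls are hypothetical names made up for the illustration, with NoisyLabel standing in for EraseMyHardDrive.

```csharp
using System;

class SideEffectDemo
{
    // Counts how many times the side-effecting "label" has been evaluated.
    public static int Calls = 0;

    // Hypothetical stand-in for EraseMyHardDrive(): its side effect is
    // merely bumping a counter.
    public static object NoisyLabel()
    {
        Calls++;
        return new object();
    }

    // The switch rewritten as chained ifs. NoisyLabel() is evaluated
    // only when neither of the earlier labels matched.
    public static string Match(object obj, object myFirstObject, object mySecondObject)
    {
        if (obj == myFirstObject) return "first";
        else if (obj == mySecondObject) return "second";
        else if (obj == NoisyLabel()) return "noisy";
        else return "default";
    }

    static void Main()
    {
        object a = new object();
        object b = new object();

        Match(a, a, b);              // matches the first label; NoisyLabel never runs
        Console.WriteLine(Calls);    // prints 0

        Match(new object(), a, b);   // falls past both labels; NoisyLabel runs once
        Console.WriteLine(Calls);    // prints 1
    }
}
```

Under this reading the side effect happens for some inputs and not for others, and nothing in the switch syntax tells the reader which.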

And lastly, programmers coming from a C/C++ background will expect that the execution of a switch statement will be fast and that it will take about the same time to reach any particular case label. When the case labels are constant integral values (or strings), the compiler can do some great optimizations to make the execution speed of the construct match the coder’s expectation. With non-constant case labels, getting to the 100th case label would take significantly more time than getting to the 10th case label. Again, this would be surprising to the programmer.
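The cost difference can be sketched by counting comparisons. With constant integral labels the compiler is free to emit a jump table, so every label can be reached in roughly the same time. Assuming non-constant labels had to be tested one by one from the top, the work would grow with the label's position. LinearScanDemo and ComparisonsToReach are hypothetical names used only for this illustration.

```csharp
using System;

class LinearScanDemo
{
    // Returns how many == tests a top-to-bottom scan performs before it
    // finds a label reference-equal to obj (or labels.Length if none match).
    public static int ComparisonsToReach(object obj, object[] labels)
    {
        int comparisons = 0;
        foreach (object label in labels)
        {
            comparisons++;
            if (obj == label) break;
        }
        return comparisons;
    }

    static void Main()
    {
        // One hundred distinct, non-constant "case labels".
        object[] labels = new object[100];
        for (int i = 0; i < labels.Length; i++) labels[i] = new object();

        Console.WriteLine(ComparisonsToReach(labels[9], labels));    // prints 10
        Console.WriteLine(ComparisonsToReach(labels[99], labels));   // prints 100
    }
}
```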

Peter

C# Guy

Comments

  • Anonymous
    August 15, 2005
    Peter,

    While I understand your comments, I don't see any of them as "deal breakers".

    First, if run-time evaluation is included in the logic of a switch statement, then the order of the case statements will have an effect on the overall code flow. Languages which include dynamic branching (such as VB's Select) have known this for years. If a "lowly VB programmer" can understand this, I don't think a C# developer will have problems with it (for the record, I'm also a VB6 developer). Even if this behavior causes confusion, since when is that a reason not to include a language feature? You only have to look at any message board to see there are any number of developers confused about something in the C# language.

    Second, you mention undesirable side effects. I fail to see how this is any different than what is possible now. To use your own example:

    if (obj == myFirstObject)
    {
    ...
    }
    else if (obj == mySecondObject)
    {
    ...
    }
    else if (obj == EraseMyHardDrive())
    {
    ...
    }
    else
    {
    ...
    }

    This is no better, yet entirely possible and full of the same unintended consequences you decry in the switch example.

    Lastly, I haven't seen any suggestion that the switch statement change how it is executed for all logic. For example, if the switch is constructed with literal/const case statements, the same optimizations would be applied as is the case now. Only when a non-literal value is introduced would the logic need to cascade down the list of case statements looking for a match. Conceptually, this is what happens now. It just so happens the cascade takes place at compile time. Sure, some performance is lost compared to the current switch statement. But at worst, I would expect the performance to be as good as a series of if...else if statements. And some of those same optimizations can still be kept if there are literal case statements within the switch block.

    It is analogous to foreach. Some performance may be lost to provide a "cleaner" picture of the code.