Graph Colouring with Simple Backtracking, Part Three

OK, we've got our basic data structures in place.

Graph colouring is a very well-studied problem. It's known to be NP-complete for arbitrary graphs, so (assuming that P!=NP) we're not going to find an always-fast algorithm for colouring an arbitrary graph. However, for typical graphs that we encounter in the wild, the following simple algorithm is pretty good. Start by saying that every node can be every possible colour. Then:

1) Do you have a single possible colouring for every node in the graph? Success!
2) Do you have any node that cannot possibly have any colour? Failure!
3) Otherwise, you must have a node that still has at least two possible colours.
4) Some of the nodes might have a single possible colour and a neighbour that has that possible colour. Remove the possibility from the neighbour.
5) That might have caused some of the neighbours to themselves be reduced to having just one possible colour. If so, remove those colours from the possibilities of the neighbours of those nodes.
6) And so on; keep on reducing the possibilities until you can do so no longer.
7) Did you eliminate a possibility? Then maybe we've got a solution, or maybe we're now busted. Go back to step 1.
8) Otherwise, you eliminated no possibilities and there was a node that had two or more possible colours. Hypothesize that the node is one of those colours. 
9) Recurse; if the recursion produced success, then you're done. If the recursion failed, then your hypothesis was faulty.
10) If there is an untried hypothesis for the node, go back to step 8 and try a different hypothesis.
11) Otherwise none of the possible values for one of the undetermined nodes are good; fail.

Notice that this is essentially a "backtracking" algorithm. We make a hypothesis, and if that hypothesis pans out, great, we're done. If not, we backtrack and make a different hypothesis. If none of those hypotheses work out then either the problem is not solvable (for example, we are trying to colour South America with three colours) or a previous hypothesis was wrong, so we backtrack even further. We keep doing that until we've tried all possible combinations.

That last bit is what makes this potentially exponential in time. Fortunately, in realistic graphs the "reduction" step quickly eliminates so many possibilities that we don't have to explore anywhere near the entire space of possibilities before either finding a solution or concluding that there is none.

How are we going to do this? Remember, my goals here are to make the program easier to reason about by keeping things immutable where possible, but not so immutable that doing so produces bad performance or weird code.

What we'll do is we'll make a class called a "Solver" which represents a bunch of possible colours for each node. The solver is immutable; when we must make a hypothesis or a deduction that changes what the possible colours are for a node, we'll make a new solver rather than mutating the old one. Let's make a solver that attempts to give a solution in n colours for a particular graph:

internal sealed class Solver
{
  private enum Result { Solved, Unsolved, Busted }
  private readonly Graph graph;
  private readonly BitSet[] possibilities;
  private readonly int colours;
  public Solver(Graph graph, int colours)
  {
    this.colours = colours;
    this.graph = graph;
    this.possibilities = new BitSet[graph.Size];
    BitSet b = BitSet.Empty;
    for (int i = 0; i < colours; ++i)
        b = b.Add(i);
    for (int p = 0; p < this.possibilities.Length; ++p)
        this.possibilities[p] = b;
  }

Pretty straightforward so far. The class is internal and sealed; we don't expect anyone to derive from this or use it outside of this assembly. Later on we're going to implement the first few lines of the algorithm sketch by asking "are we solved, are we busted, or do we still have more work to do?" so I'll represent that by an enum. The solver keeps a graph, the number of colours it is going to try to use to colour the graph, and a bitset of possibilities for each node. Those possibility sets are stored in an array, one per node; as we'll see later, using a mutable data structure for this task has pros and cons.

We initialize the array of possibilities to have one set per node, and say that each node can be every one of the possible colours.

Again, there's no error handling in here. Were this a production-grade library I'd have logic that verified that the number of colours was between 1 and 32.
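
For instance, the start of the constructor might then look something like this sketch; the upper bound of 32 comes from BitSet being a thin wrapper over a single 32-bit integer:

// A sketch of the missing validation only; not part of the series' code.
if (graph == null)
    throw new ArgumentNullException("graph");
if (colours < 1 || colours > 32)
    throw new ArgumentOutOfRangeException("colours");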

Later on I'm going to need a private constructor that can set all of these fields directly, so let's just build that now:

    private Solver(Graph graph, int colours, BitSet[] possibilities)
    {
        this.graph = graph;
        this.colours = colours;
        this.possibilities = possibilities;
    }

We'll need to be able to make a new solver with one of the node's possible colours reduced to a single possibility.

// Make a new solver with one of the colours known.
public Solver SetColour(int node, int colour)
{
    BitSet[] newPossibilities = (BitSet[])this.possibilities.Clone();
    newPossibilities[node] = BitSet.Empty.Add(colour);
    return new Solver(this.graph, this.colours, newPossibilities);
}

Here we now see the pros and cons of using an array of bitsets. The bitsets were designed to be small and lightweight value types; they're just a wrapper over an int. Copying a bitset is trivial; we just copy the int. However, copying the array and then mutating it is more expensive; we duplicate the whole thing. If these arrays were going to be huge then this could get very inefficient; every time we made a new solver out of an old one we'd have a huge amount of duplication between the two, and the difference between them would in some cases be a single bit, which seems wasteful. If that were the case then instead of using an array of bitsets, I'd use some sort of immutable collection of bitsets that enabled me to re-use most of the state upon making an edit. That would add a lot of complexity, and frankly, copying a small array is really fast. So, with the assumption that the graphs we are colouring are small, we'll stick with the array solution.
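
To illustrate what that alternative might look like, here is a sketch using the ImmutableDictionary type from today's System.Collections.Immutable package, which did not exist when this series was written; a persistent map shares structure between versions instead of copying:

// Sketch only: store the possibilities in a persistent map instead of an array.
private readonly ImmutableDictionary<int, BitSet> possibilities;

public Solver SetColour(int node, int colour)
{
    // SetItem returns a new dictionary that shares almost all of its
    // structure with the old one; no wholesale copying takes place.
    var newPossibilities = this.possibilities.SetItem(node, BitSet.Empty.Add(colour));
    return new Solver(this.graph, this.colours, newPossibilities);
}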

Note that our choice to make the sets immutable is paying off. Suppose we'd chosen a mutable set. If we mutated the set by removing all the items but one, we'd have to remember to put them back if this hypothesis doesn't pan out!

 I've made this method public. It seems reasonable that the caller of the solver might want to impose certain restrictions of their own, like "I already know that I want Brazil to be green." And as we'll see in a couple more posts, this will come in handy.
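
For example, a caller might write something like this; the node and colour numbers are hypothetical, say node 3 represents Brazil and colour 1 is green:

// Hypothetical usage: pin Brazil (node 3) to green (colour 1) up front.
Solver solver = new Solver(graph, 4).SetColour(3, 1);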

Anyway, getting back to the algorithm. The first thing we're going to need to know is "are we done?"

private Result Status
{
    get
    {
        if (possibilities.Any(p => !p.Any()))
            return Result.Busted;
        if (possibilities.All(p => p.Count() == 1))
            return Result.Solved;
        return Result.Unsolved;
    }
}

Do not be fooled by its commonplace appearance; there are several interesting design choices here. First, we're saying that if there is any node that has no possibilities, then we're busted. The way I've chosen to do so is clever, and that's maybe bad. I don't like clever code, I like code that is easy to read, and this is getting a bit cute. It seems likely that the nested use of "Any" to determine "is there any member of this collection of possibilities that does not have any possibility?" is hard to read. I'm trying to eschew loops here, to make the code more declarative and less imperative, but this might have gone too far.

Second, we're saying that if we're not busted, then maybe we have a solution: exactly one colour per node. Notice that I am counting each possibility set to determine if there is exactly one. That could do up to 32 bit operations, when we could stop after two.

Notice also that we are potentially iterating over the entire array of possibilities twice, looking for what is essentially the same information. I could have written this code like this:

private Result Status
{
    get
    {
        foreach (var p in possibilities)
        {
            int primitiveCount = p.ZeroOneOrMany();
            if (primitiveCount == 0) return Result.Busted;
            if (primitiveCount > 1) return Result.Unsolved;
        }
        return Result.Solved;
    }
}

And then written an extension method ZeroOneOrMany() that counts the first two items of a sequence only.
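
The post doesn't show that method, but something along these lines would do; a minimal sketch that uses Take to bound the work:

public static int ZeroOneOrMany<T>(this IEnumerable<T> items)
{
    // Examine at most two items, so this is cheap even for long sequences.
    // Returns 0, 1 or 2, where 2 means "two or more".
    return items.Take(2).Count();
}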

Here I'm choosing to make the code be more declarative and read more like the algorithm at the cost of a little efficiency. I'm not taking on the burden of writing an unnecessary special-purpose counting algorithm. I know that counting a bitset is going to be pretty cheap; I can afford to have a few extra loops to calculate counts I don't actually need.

Notice also that I am not caching the result; it is entirely possible that this solver will be asked multiple times what its status is, and we recompute it every time. Caching would trade extra space and redundancy in exchange for better speed. But do we actually have a scenario where one solver is going to be asked multiple times what its status is? I don't think we do, so again, YAGNI. Let's not add extra complexity that we're not going to use, not without a clear benefit.

The basic operation of this algorithm has two phases: first we make all the deductions we can trivially make without forming any hypothesis, and then we make a hypothesis. How are we going to do that first part? Again, there is a tension between an easy solution that involves mutating an array and a more pure but somewhat more complex solution that creates an immutable map from nodes to possible colours and then gradually edits the map by producing new maps from it. We'll encapsulate the mutation solution by making it an implementation detail of the reduction method. The idea here is that the reduction method returns either null, if no reductions could be performed, or a new immutable solver with all the reductions performed.

private Solver Reduce()
{
    BitSet[] newPossibilities = (BitSet[])this.possibilities.Clone();
    // The query answers the question "What colour possibilities should I remove?"
    var reductions =
        from node in Enumerable.Range(0, newPossibilities.Length)
        where newPossibilities[node].Count() == 1
        let colour = newPossibilities[node].Single()
        from neighbour in graph.Neighbours(node)
        where newPossibilities[neighbour].Contains(colour)
        select new { neighbour, colour };
    bool progress = false;
    while(true)
    {
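        // Note that the "reductions" query is re-evaluated on every pass
        // through this loop; it reads the current state of newPossibilities.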
        var list = reductions.ToList();
        if (list.Count == 0)
            break;
        progress = true;
        foreach (var reduction in list)
            newPossibilities[reduction.neighbour] = newPossibilities[reduction.neighbour].Remove(reduction.colour);
        // Doing so might have created a new node that has a single possibility,
        // which we can then use to make further reductions. Keep looping until
        // there are no more reductions to be made.
    }
    return progress ? new Solver(graph, colours, newPossibilities) : null;
}

A few things to note here. 

I want to avoid any situation where we are mutating an object while iterating over it. We could in fact get away with that in this case; we could run the query and mutate the array while the query was running. But that's really bad form, and would make it difficult to refactor this code, should we later decide to use a different mutable collection, or use an immutable collection for the possibilities. Instead I run the query once, cache the results in a list, and then iterate over the list.

Note that again I'm trying to use queries to represent operations that ask questions, and loops to represent operations that induce mutations. And note that again I am using Count() rather than writing a custom method that checks "does this sequence have exactly one element", which could then bail out if there were two or more.

On the other hand, I'm using List.Count rather than the Any() extension method because I know that List.Count is actually cheaper than Any(). Normally the Any() extension method is cheaper than the Count() extension method because counting a sequence might involve iterating the whole thing, whereas the Any predicate simply calls MoveNext once.

Notice that we could take an early out along the way here. We could detect whether or not we were in a "busted" configuration and return that result, rather than continuing to do calculations. The question is whether the cost of doing the test every time is worth the benefit of taking the early out. I'm going to assume not.
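
Were we to take the early out, it might look something like this inside the removal loop; a sketch of the road not taken:

// Sketch only: if a removal just emptied a neighbour's set then we are
// busted and could stop reducing immediately. Returning the new solver
// works because Solve() will detect the bustedness via Status.
if (!newPossibilities[reduction.neighbour].Any())
    return new Solver(graph, colours, newPossibilities);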

I'm not normally a huge fan of using null references to signal conditions, but here it seems justified. What's the alternative? Return a tuple of (bool, Solver) that says whether progress was made? That seems weird. This question is much more interesting in our next chunk of code.
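
For comparison, the conventional C# alternative to a null result is the Try pattern; a sketch of what that might look like here, though it is not used in this series:

// Hypothetical Try-pattern wrapper around the null-returning Reduce().
private bool TryReduce(out Solver reduced)
{
    reduced = this.Reduce();
    return reduced != null;
}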

Which brings us to the solver algorithm itself:

public IEnumerable<int> Solve()
{
    // Base case: we are already solved or busted.
    var status = this.Status;
    if (status == Result.Solved)
        return this.possibilities.Select(x => x.Single());
    if (status == Result.Busted)
        return null;
    // Easy inductive case: do simple reductions and then solve again.
    var reduced = Reduce();
    if (reduced != null)
        return reduced.Solve();
    // Difficult inductive case: there were no simple reductions.
    // Make a hypothesis about the colouring of a node and see if
    // that introduces a contradiction or a solution.
    int node = this.possibilities.FirstIndex(p => p.Count() > 1);
    var solutions =
        from colour in this.possibilities[node]
        let solution = this.SetColour(node, colour).Solve()
        where solution != null
        select solution;
    return solutions.FirstOrDefault();
}


I've used an extension method here that gets you the index of the first item that matches a predicate:

public static int FirstIndex<T>(this IEnumerable<T> items, Func<T, bool> predicate)
{
    int i = 0;
    foreach (T item in items)
    {
        if (predicate(item))
            return i;
        i++;
    }
    throw new InvalidOperationException("No item matched the predicate.");
}

Note that it requires that at least one item match the predicate! This might not be a good general-purpose design but it meets my needs. A more general-purpose design might return a nullable int instead.
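
Such a design might look like this; a sketch only, since the series doesn't need it:

// More general-purpose sketch: returns null when no item matches.
public static int? FirstIndexOrNull<T>(this IEnumerable<T> items, Func<T, bool> predicate)
{
    int i = 0;
    foreach (T item in items)
    {
        if (predicate(item))
            return i;
        i++;
    }
    return null;
}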

Anyway, now we have the public interface to the solver; it returns a sequence of colourings, one for each node, if there is one, or null, if there is no such sequence. This illustrates an important principle about sequences: the null sequence is not the empty sequence. The null sequence is "there is no such sequence", not "there is such a sequence and it is empty".

I'm not sure that a sequence is really the right thing to return here though. What we want to express is that there is a mapping from node number to colour number. But how would the user say "what's the colour of node number twelve?" You can't index into the sequence! The caller is almost certainly going to call ToArray or ToList or ToLookup on this thing, and maybe it would be sensible for us to do that for them. On the other hand, it's the caller that knows whether they want an array or a list or a dictionary or what, so maybe we shouldn't make that decision for them. I'll leave it like this for now.
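
To be concrete, a hypothetical caller that wants random access might write something like this:

// Hypothetical caller code: materialize the solution so it can be indexed.
var solution = solver.Solve();
if (solution != null)
{
    int[] colouring = solution.ToArray();
    Console.WriteLine("Node twelve is colour {0}", colouring[12]);
}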

I note that for the first recursive case we don't actually need to use recursion. We could have written the first part like this:

public IEnumerable<int> Solve()
{
    var current = this;
    while (true)
    {
        var status = current.Status;
        if (status == Result.Solved)
            return current.possibilities.Select(x => x.Single());
        if (status == Result.Busted)
            return null;
        // Reduce the current solver, not the original "this".
        var reduced = current.Reduce();
        if (reduced == null)
            break;
        current = reduced;
    }
    // ... followed by the hypothesis code as before, using current rather than this.
}

Which has the nice property of not recursing unnecessarily. But it is harder to reason about (if you reason about recursion easily; not everyone does) and reads less like our English description of the algorithm.

The major recursive step has an interesting use of a query expression. We produce a query that will create a sequence of all solutions for all possible hypotheses for the first node that can have more than one possible colouring. But all we care about is "is there any non-null solution or not?" FirstOrDefault will fetch the first solution, if there is one, and return null otherwise. We're taking advantage of the fact that queries are lazy; we don't calculate all the possible solutions unless we actually iterate over the whole query.

Note that if we needed to we could modify this so that we asked for the first two solutions, if there are two. That way we could tell whether the solution is unique or not! But right now we are looking only for any solution.
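
Inside Solve, that check might look something like this sketch:

// Sketch: fetch at most two solutions rather than just the first.
// Zero solutions means no colouring; one means it is unique; two means
// there are at least two distinct colourings.
var firstTwo = solutions.Take(2).ToList();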

Next time: let's take this algorithm out for a test drive. South America, here we come.

(Eric is out camping in the beautiful Pacific Northwest; this posting was prerecorded.)

Comments

  • Anonymous
    July 21, 2010
    Wow, this is a lot to digest. Regarding "if (possibilities.Any(p => !p.Any()))" You could do "if (possibilities.Any(p => p == BitSet.Empty))" which is certainly cheaper and perhaps is a little bit easier to read, because it avoids the nested use of Any(). Can you elaborate on why you prefer to not use null results to signal conditions? When the condition being signaled is "there is no result", it seems like the most natural way to do it... Because it is using a "magic value" as a signal; I also dislike using -1 to mean "not found" in methods that return integers. It means you have to check the return value for whether it is one of the special values, rather than using the return value as a valid value of its type. And what happens when you want to have multiple special values to communicate additional information? (For example: the null that means "we know that there is no solution" vs the null that means "there might be a solution but we bailed out before we could find one.") You cannot do that with reference types; there is only one null. - Eric

  • Anonymous
    July 22, 2010
    In your Solve() method, I see a call to a FirstIndex method. I don't see that defined anywhere, so I transformed that to int node = this.possibilities.Select((p, index) => new { p, index }).First(a => a.p.Count() > 1).index; But I wonder if that could be converted to an extension method like so    public static int FirstIndex<T>(this IEnumerable<T> list, Func<T, bool> filter)
       {
           return list.Select((item, index) => new { item, index }).First(x => filter(x.item)).index;
       } Whoops, I forgot to include that. Thanks! - Eric

  • Anonymous
    July 22, 2010
    The following construct is commonly seen in LINQ code .Count() == x but it should trigger comments during code review. Compare to this: .Take(x + 1).Count() == x The test is equivalent, but the performance characteristics are very different when there is a large difference between x and the total length of the sequence.  You know this, but since you don't show the implementation of ZeroOneOrMany and this is a generalization of that issue, I thought it might be useful for your readers to show this. 


  • Anonymous
    July 22, 2010
    "Can you elaborate on why you prefer to not use null results to signal conditions? When the condition being signaled is "there is no result", it seems like the most natural way to do it..." My view on this is that it is hard to make it clear in the code that this is the case. And sometimes it forces unpleasant contortions in calling code. If the case is so common that it would need to be handled alot then (in c#) I prefer a method of the form: bool TrySolve(out IEnumerable<T> solution); since this forces the user of your API to think about it. (in f# this will be magicompiled into (bool,result) but f#'s provides the Option type so you change your method to return Option IEnumerable<T>. That makes it clear to users but they can very easily say "I don't care about failing on failures" and just do solver.Solve().Value.WhateverTheyWantedToDoToTheResult() and they'll get a nice runtime exception (as they would with the null reference exception). I tend to feel that using null is reasonable if the lack of a result is likely to be a terminal failure and I am not writing library / externally visible code) so, for example private methods are more likely to get null return values for lack of result if that is simple to write. Those are my opinions, Eric of course will have different ones (and his are likely to be more clearly expressed ;))

  • Anonymous
    July 22, 2010
    @Alex note that the performance characteristics of that are somewhat subtle. You are trading some constants factors in there as well as an additional allocation. I'd also say that writing the following is much better overall: public static bool CountIs<TSource>(this IEnumerable<TSource> source, int entries)
    {
      if (source == null)
        throw new ArgumentNullException("source");
      IList<TSource> list = source as IList<TSource>;
      if (list != null)
      {
        return list.Count == entries;
      }
      else
      {
        using (IEnumerator<TSource> enumerator = source.GetEnumerator())
        {
          while (enumerator.MoveNext())
          {
            if (entries-- < 0)
              break;
          }
        }
      }
      return entries == 0;
    } The IList shortcut is the fastest you can reasonably achieve for arrays/lists with pretty-ish code. The fallback case is optimal. Extending it to do less than/greater strict/unstrict is not hard either. The name is not something I particularly like. I think we could do better there. But so long as the name is right the intent is very clear.

  • Anonymous
    July 22, 2010
    I notice that the two implementations of Status could return different results for different values of Solver. I don't know if it could arise in practice, but if the Busted and Unsolved conditions can occur in the same Solver instance, then the first version will return Results.Busted while the second will return whichever appears first. I haven't tried it, but it seems this could potentially cause a lot of extra work to be done solving an already Busted coloring. I note this only because these sorts of puzzles are fun, which is, I guess, one reason I got into programming. Anyway, thanks for this excellent series. Excellent point! - Eric

  • Anonymous
    July 22, 2010
    @Matt Good point - Common operations are worth providing as a single operation and optimizing (even better to look for ICollection<T> rather than IList<T>) It is also worth saying that if the item passed in to Take is an ICollection<T> and the collection and x itself are both large, Take(x).Count() could perform rather worse than Count() using current implementations of LINQ to Objects since LINQ does optimize Count() when passed an ICollection<T>, but does not coalesce operations nor does it provide an ICollection-based optimization for Take (which is to say that if, for example, Take returned an ICollection<T> when passed an ICollection<T>, we wouldn't be concerned about this at all). On the other hand, Take(x + 1).Count() == x behaves rather more reasonably with infinite lists than does Count() == x and since x is typically small and there are many parts of LINQ (not just Take) that do not propagate the ICollection-ness of the source list, Take().Count() is usually preferable to Count(). The main points I wanted to make are:

  1. One productive way of optimizing the performance of LINQ queries is to limit the number of items examined. Since, in the general case, Count() will enumerate all items in the list, uses of Count() should be examined closely in code review.
  2. One simple way of limiting the number of items being examined is to use Take(). There's plenty to discuss around LINQ optimization, but I'll leave it there for now.
  • Anonymous
    July 22, 2010
    When you said you won't implement Count() on BitSet it made sense because it wasn't much of a performance boost. But implementing: public bool IsEmpty() {
       return this == Empty;
       // or
       return bits == 0;
    } is a major boost. So is public bool IsSingle() {
       // true if exactly one bit is set, false otherwise
       return !IsEmpty() && (bits & (bits - 1)) == 0;
    } It's not the most readable code in the world, but it's used in a lot of places and replaces 32 shifts, 32 ands, and 32 comparisons with two comparisons and three operations. Yep. What I'd probably do if I cared to was make that sort of improvement based on profiling. If it turned out that the count really was the bottleneck then I'd seek to eliminate it. - Eric Another comment I'd like to make is that while you say Solver is immutable, you don't put readonly on fields. How odd that you find the need to explicitly make the class sealed so you don't accidentally inherit it, but you don't seem to care that you might set the immutable fields to different values by accident. Yeah, that was just an omission, one that I have made several times in this series. - Eric Jeffrey makes a good point - the second Status you've shown would have a far worse complexity since it would return Unsolved for a lot of Busted cases.

  • Anonymous
    July 22, 2010
    I would've changed Reduce() a bit to make it clearer that reductions is evaluated for each iteration in the loop. At least I was confused why you added "if (list.Count == 0)" when there was obviously nothing that would make the list smaller. It wasn't until I read the comment at the end of the loop (and a 'huh? wait a ...') that I realized the multiple evaluations of reductions. But I believe most of my colleagues would not be able to reason about Reduce(). My most preferred option would be to create a method GetReductions() that would return an enumerable with anonymous classes. Since that's not possible, and tuples wouldn't be clear enough, I would probably rewrite the last comment to something like:        // Doing so might have created a new node that has a single possibility,
           // which we can then use to make further reductions. Keep re-evaluating
            //  reductions, until no more can be found. Or else I would add a comment above "var list = reductions.ToList()" that says something like:        // re-evaluates reductions with every iteration I'm no big fan of commenting the what of a piece of code, but every rule has its exceptions. Good points. However, the problem with producing a sequence of reductions is that you then lose the nice property that you're mutating the same mutable state until it cannot be mutated further by this operation. - Eric

  • Anonymous
    July 25, 2010
    Eric, I understand that the purpose of this series is to illustrate design decisions, so the algorithm you chose to solve the problems is really not that important... however, I'm wondering why you chose a "backtracking" algorithm instead of a "greedy" algorithm; it seems to me the latter would have been much easier to implement. -Esteban The greedy algorithm is for finding a colouring with any number of colours, and you hope that the graph happens to be one such that the greedy algorithm finds a small number of colours. What I am interested in here is finding a graph colouring with no more than n colours. That's a different problem. - Eric

  • Anonymous
    July 25, 2010
    Esteban, From what I know of regexes there's a choice of greedy backtracking vs lazy backtracking. Could you explain for me what you mean with "backtracking instead of greedy"? Patrick

  • Anonymous
    July 25, 2010
    @Esteban: The problem with a greedy algorithm is that, in this case, it's not correct. With graph coloring, any greedy algorithm will always run into cases where it makes the wrong choice and runs into a Busted case, even when there is a possible coloring. Eric's backtracking algorithm starts out greedily, but once it hits a Busted case, it "backtracks" and tries a different choice. Eventually, it will either have tried everything unsuccessfully, proving that it's not possible to color, or it will find a solution and stop.

  • Anonymous
    July 26, 2010
    When I saw "var list = reductions.ToList();" inside of a loop, the first thing I did was look to see if reductions or list were mutated within the loop. Since neither were, I started wondering why you called ToList inside of a loop. Of course the answer is that reductions is essentially a closure that has to be re-evaluated whenever any of its captured variables changes, meaning that I then had to figure out what variables it captures and see which of them are mutated within the loop. I agree with others who think this deserves a comment, because it could prevent the need for all that analysis on the part of everybody who reads the code. I was also a bit confused by the "Doing so might..." comment. It's not clear what the "so" refers to, and makes me suspect that it refers to a previous comment that was deleted (along the lines of "Now do X...").

  • Anonymous
    July 28, 2010
    I don't see why the reductions query wasn't defined inside the loop, it would make the loop much clearer.

  • Anonymous
    August 03, 2010
    In Reduce(), wouldn't the loop run infinitely?

  • Anonymous
    August 03, 2010
    Oh, sorry, I missed the point with reevaluating the expression.

  • Anonymous
    September 23, 2010
    This may be a side-effect of implementing this in C++, but to prevent an infinite loop I had to move while(true) to be before var reductions = ....  Presumably linq is re-querying based on newPossibilities changing?