Fast Switching with LINQ
Jomo Fisher—I’ve run into a performance problem several times in my career that I’ve never found a truly satisfying answer for until now. The problem can be distilled down to this:
Look up a value based on a string key. The set of strings is fixed and no unknown key values can occur.
Consider my personal rankings of the Seven Dwarves on a scale of 1-10:
happy=9, sneezy=2, doc=7, sleepy=8, dopey=9, grumpy=2, bashful=6
I need a function that takes a name and returns a ranking. The default solution for this problem should be to use a Dictionary or Hashtable; the implementations that ship with the .NET Framework are extremely fast.
However, these container classes offer more features than I need for this problem. New keys can be added but my set is fixed. They can tell you whether a key is valid or not but I know a priori that the keys are limited to a fixed set. They can enumerate the set of keys or the set of values but I don’t need this capability.
Any time I see a data structure with a capability I’m not using it makes me wonder whether I can trade that capability for something I do need—in this case a speed boost.
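For reference, here is roughly what that baseline looks like (a minimal sketch; the variable names are mine, not from the attachment):

var rankings = new Dictionary<string, int> {
    {"happy", 9}, {"sneezy", 2}, {"doc", 7}, {"sleepy", 8},
    {"dopey", 9}, {"grumpy", 2}, {"bashful", 6}
};
// Capabilities I'm paying for but not using here:
rankings.Add("snow white", 10);                 // adding new keys
bool known = rankings.ContainsKey("happy");     // validating keys
foreach (var pair in rankings) { /* ... */ }    // enumerating keys and values
// The only capability I actually need:
int rank = rankings["doc"];                     // 7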
If I were really desperate for a performance gain, I could write a function like this:
static int Lookup(string s) {
    if (s.Length < 6) {
        if (s.Length < 5) return 7;
        else return 9;
    } else {
        if (s[0] < 's') {
            if (s[0] < 'g') return 6;
            else return 2;
        } else {
            if (s[1] < 'n') return 8;
            else return 2;
        }
    }
}
Notice that we only have to look at the length and the first two characters of the string to make the decision. This function is fast but nearly impossible to maintain or debug. Also, I have to know the set of strings at compile time. All of these drawbacks make this approach significantly less useful.
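For comparison, the maintainable way to hard-code the same table is an ordinary switch statement (a sketch for illustration, not anything SwitchCompiler emits). It reads well, but the keys still have to be known when the source is compiled, which is exactly the restriction I want to remove:

static int LookupSwitch(string s) {
    switch (s) {
        case "happy":   return 9;
        case "sneezy":  return 2;
        case "doc":     return 7;
        case "sleepy":  return 8;
        case "dopey":   return 9;
        case "grumpy":  return 2;
        case "bashful": return 6;
        default: throw new ArgumentException("unknown key");
    }
}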
Enter the LINQ Expression Compiler, which will be available in .NET 3.5. With the new expression support it's possible to build an expression at runtime and then compile it into straight IL (which will be jitted into machine code). It turned out to be pretty easy to write a switch compiler that generates an appropriate lookup expression on the fly. I wrote a class called SwitchCompiler to do the heavy lifting. Here's how it's called:
var dict = new Dictionary<string, int> {
    {"happy", 9}, {"sneezy", 2},
    {"doc", 7}, {"sleepy", 8},
    {"dopey", 9}, {"grumpy", 2},
    {"bashful", 6}
};
var Lookup = SwitchCompiler.CreateSwitch(dict);
var v = Lookup("doc");
Debug.Assert(v == 7);
SwitchCompiler first builds up an expression that looks like this:
s.Length < 6 ? (s.Length < 5 ? 7 : 9) :
(s[0] < 's' ? (s[0] < 'g' ? 6 : 2) : (s[1] < 'n' ? 8 : 2));
SwitchCompiler then compiles this expression into a delegate, which is returned as Lookup. (It should also be possible to generate a perfect minimal hash in the Compile method. I haven't tried this.)
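As an aside on that parenthetical: for a key set this small a perfect hash is easy to find by hand. The sketch below is only an illustration for these seven names (the table has 11 slots, so it's perfect but not minimal, and the constants would have to be re-derived for any other key set):

// (s[0] + s[1] + s.Length) % 11 happens to map the seven names to distinct slots:
// sneezy->0, grumpy->3, bashful->4, doc->5, dopey->7, happy->8, sleepy->9
static readonly int[] HashTable = { 2, 0, 0, 2, 6, 7, 0, 9, 9, 8, 0 };

static int LookupByHash(string s) {
    return HashTable[(s[0] + s[1] + s.Length) % 11];
}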
The actual implementation of SwitchCompiler is about 65 lines of code. You can see it in the attachment.
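The attachment isn't reproduced in this post, but to give a flavor of the expression-tree plumbing involved, here is a stripped-down sketch. It compiles the dictionary into a simple chain of string-equality tests instead of the length-and-character tree shown above, so it won't match SwitchCompiler's speed; the Expression.Condition/Expression.Lambda/Compile mechanics are the part that carries over:

using System;
using System.Collections.Generic;
using System.Linq.Expressions;

static class NaiveSwitchCompiler {
    // Builds s == "key1" ? value1 : s == "key2" ? value2 : ... : default(TR)
    // and compiles it to IL. The real SwitchCompiler instead emits nested
    // conditionals over s.Length and individual characters.
    public static Func<string, TR> CreateSwitch<TR>(IDictionary<string, TR> cases) {
        ParameterExpression s = Expression.Parameter(typeof(string), "s");
        Expression body = Expression.Constant(default(TR), typeof(TR));
        foreach (KeyValuePair<string, TR> pair in cases) {
            body = Expression.Condition(
                Expression.Equal(s, Expression.Constant(pair.Key)),   // string.op_Equality
                Expression.Constant(pair.Value, typeof(TR)),
                body);
        }
        return Expression.Lambda<Func<string, TR>>(body, s).Compile();
    }
}

Swapping that equality chain for the nested length-and-character comparisons is where the real 65 lines go.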
On my machine, looking up each value in a tight loop 1 million times takes 70ms while the corresponding loop using Dictionary takes 697ms. That’s close to a 900% performance improvement.
This posting is provided "AS IS" with no warranties, and confers no rights.
Comments
Anonymous
March 28, 2007
That's pretty impressive performance for a VM-based language. The C++ equivalent, 1 million iterations, runs in about 55 ms, on a Pentium 4, 3.4 GHz. Here's the code:

struct KeyValue {
    const char* key;
    int value;
    inline bool operator < (const KeyValue& that) const { return strcmp(this->key, that.key) < 0; }
};
static const KeyValue kvs[] = {
    {"bashful", 6}, {"doc", 7}, {"dopey", 9}, {"grumpy", 2},
    {"happy", 9}, {"sleepy", 8}, {"sneezy", 2},
}, * const kvs_end = kvs + (sizeof(kvs) / sizeof(0[kvs]));
inline int Lookup(const char* s) {
    KeyValue key = { s };
    return std::lower_bound(kvs, kvs_end, key)->value;
}
int _tmain(int argc, _TCHAR* argv[]) {
    LARGE_INTEGER begin, end, frequency;
    QueryPerformanceCounter(&begin);
    const KeyValue* k = kvs;
    int total = 0; // Just so the compiler doesn't completely optimize out the for loop.
    for (int i = 0; i < 1000000; ++i) {
        total += Lookup(k->key);
        if (++k == kvs_end) k = kvs;
    }
    QueryPerformanceCounter(&end);
    QueryPerformanceFrequency(&frequency);
    __int64 ms = (end.QuadPart - begin.QuadPart) * 1000 / frequency.QuadPart;
    cout << "Total:\t" << total << endl;
    cout << "Milliseconds:\t" << ms << endl;
}

One minor benefit of the C++ code is that there is no overhead of creating a lookup table before the first call to Lookup().
Anonymous
March 29, 2007
Hi Jeff, I tried your code on my machine and amazingly got exactly 55ms. That makes your orange look a little more like my apple. Your comment made me realize my implementation has one more feature that I could possibly throw away: there's an array bounds check on the string index operations. I think I could get rid of this by emitting 'unsafe' code. Unfortunately, I think I'd have to bypass the expression compiler to do this. I seem to be within spitting distance of optimized C++, which is pretty cool considering how amazing the VS C++ optimizer is.
Jomo
Anonymous
March 29, 2007
Here's my test code in case anyone wants to reproduce my result:

/* Use of included script samples are subject to the terms specified at
   http://www.microsoft.com/resources/sharedsource/licensingbasics/permissivelicense.mspx
   Written by Jomo Fisher */
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Reflection;
using FastSwitch;

namespace ConsoleApplication1 {
    class Program {
        static void Main(string[] args) {
            var dict = new Dictionary<string, int> {
                {"happy", 9}, {"sneezy", 2}, {"doc", 7}, {"sleepy", 8},
                {"dopey", 9}, {"grumpy", 2}, {"bashful", 6}
            };
            const int count = 1000000;
            DateTime start = DateTime.Now;
            for (int i = 0; i < count; ++i) {
                var v = dict["happy"];
                v = dict["sneezy"];
                v = dict["doc"];
                v = dict["sleepy"];
                v = dict["dopey"];
                v = dict["grumpy"];
                v = dict["bashful"];
            }
            TimeSpan elapsed = DateTime.Now - start;
            Console.WriteLine("Dictionary: {0} ms", elapsed.TotalMilliseconds);
            var compiled = SwitchCompiler.CreateSwitch(dict);
            start = DateTime.Now;
            for (int i = 0; i < count; ++i) {
                var v = compiled("doc");
                System.Diagnostics.Debug.Assert(v == 7);
                v = compiled("happy");
                System.Diagnostics.Debug.Assert(v == 9);
                v = compiled("sneezy");
                System.Diagnostics.Debug.Assert(v == 2);
                v = compiled("sleepy");
                System.Diagnostics.Debug.Assert(v == 8);
                v = compiled("dopey");
                System.Diagnostics.Debug.Assert(v == 9);
                v = compiled("grumpy");
                System.Diagnostics.Debug.Assert(v == 2);
                v = compiled("bashful");
                System.Diagnostics.Debug.Assert(v == 6);
            }
            elapsed = DateTime.Now - start;
            Console.WriteLine("Compiled: {0} ms", elapsed.TotalMilliseconds);
        }
    }
}
Anonymous
March 30, 2007
You've been kicked (a good thing) - Trackback from DotNetKicks.com
Anonymous
March 31, 2007
The comment has been removed
Anonymous
March 31, 2007
The C++ code and the C# code are timing two different things. The C++ code is cycling through each key for a total of 1,000,000 lookups. The C# code is looking each key up 1,000,000 times for a total of 7,000,000 lookups.
Anonymous
April 02, 2007
Justin, you're right. I should have noticed the C++ code was doing 1/7th as many lookups as the C# code. When I adjust the C++ code, it dutifully reports 385ms, which is pretty close to what the plain old .NET dictionary was doing. So I guess the LINQ algorithm is in the catbird seat, at least for the time being. I also noticed the C++ answer requires that the set of keys be sorted alphabetically before lookups can commence, so there is a little preparation time.
Anonymous
April 02, 2007
The comment has been removed
Anonymous
April 03, 2007
The C++ compiler optimized out the entire loop in my first version. I added the total variable, and it still optimized out the whole loop. I added the cout statement at the bottom and it finally executed the code. Any sign the C# compiler is optimizing out the whole loop in the C# code?
Anonymous
April 03, 2007
The C# compiler is definitely not removing the loop--for one, it has no way to determine whether the delegate has side-effects that need to be preserved. This can be seen by increasing the number of iterations--2M loops run in twice the time as 1M.
Anonymous
April 03, 2007
Wow, your switch compiler and the .NET VM are very impressive indeed. My fastest run of the following C++ code was 88 ms. Note also that I kind of cheated and used std::find instead of std::lower_bound, because when there are only 1 or 2 elements in the vector, std::find runs much faster.

#include "stdafx.h"
struct KeyValue { const char* key; int value; };
struct CharValue {
    int char_;
    bool terminal_;
    union {
        int value_;
        size_t next_index_;
        vector<CharValue>* next_;
    };
    ptrdiff_t char_step_;
};
inline bool operator < (const CharValue& value, int c) { return value.char_ < c; }
inline bool operator < (int c, const CharValue& value) { return c < value.char_; }
inline bool operator == (const CharValue& value, int c) { return value.char_ == c; }
static const KeyValue kvs[] = {
    {"bashful", 6}, {"doc", 7}, {"dopey", 9}, {"grumpy", 2},
    {"happy", 9}, {"sleepy", 8}, {"sneezy", 2},
}, * const kvs_end = kvs + (sizeof(kvs) / sizeof(0[kvs]));
class KnownSetTrie {
public:
    typedef vector<CharValue> CharMap;
    vector<CharMap> tree_;
    ptrdiff_t first_step_;
    template <typename Iter> KnownSetTrie(Iter begin, Iter end) {
        first_step_ = Compile(begin, end, 0) - 1;
        // Now that the tree_ is stable, convert indices to node pointers.
        for (vector<CharMap>::iterator i = tree_.begin(); i != tree_.end(); ++i) {
            for (CharMap::iterator j = i->begin(); j != i->end(); ++j)
                if (!j->terminal_) j->next_ = &tree_[j->next_index_];
        }
    }
    template <typename Iter> ptrdiff_t Compile(Iter begin, Iter end, ptrdiff_t char_index) {
        CharMap map;
        Iter start = begin, i = begin + 1;
        while (true) {
            CharValue value;
            value.char_ = start->key[char_index];
            if (i != end && i->key[char_index] == start->key[char_index]) {
                do { ++i; } while (i != end && i->key[char_index] == start->key[char_index]);
                value.terminal_ = false;
                value.char_step_ = Compile(start, i, char_index + 1);
                value.next_index_ = tree_.size() - 1;
            } else {
                value.terminal_ = true;
                value.value_ = start->value;
            }
            map.push_back(value);
            if (i == end) break;
            start = i++;
        }
        if (1 == map.size() && !map.front().terminal_) {
            // Pass through node:
            return map.front().char_step_ + 1;
        }
        tree_.push_back(map);
        return 1;
    }
    inline int Lookup(const char* s) const {
        const CharMap* node = &tree_.back();
        s += first_step_;
        while (true) {
            const CharMap::const_iterator value = find(node->begin(), node->end(), *s);
            if (value->terminal_) return value->value_;
            s += value->char_step_;
            node = value->next_;
        }
    }
};
int _tmain(int argc, _TCHAR* argv[]) {
    KnownSetTrie trie(kvs, kvs_end);
    LARGE_INTEGER begin, end, frequency;
    QueryPerformanceCounter(&begin);
    int total = 0; // Just so the compiler doesn't completely optimize out the for loop.
    for (int i = 0; i < 1000000; ++i) {
        for (const KeyValue* k = kvs; k != kvs_end; ++k)
            total += trie.Lookup(k->key);
    }
    QueryPerformanceCounter(&end);
    QueryPerformanceFrequency(&frequency);
    __int64 ms = (end.QuadPart - begin.QuadPart) * 1000 / frequency.QuadPart;
    cout << "Total:\t" << total << endl;
    cout << "Milliseconds:\t" << ms << endl;
}
Anonymous
April 03, 2007
It seems the LINQ guys have done a great job of optimization inside of the LINQ namespace (System.Linq).
Anonymous
April 04, 2007
I got the C++ code down to 70 ms. It took a fastcall and some fun struct packing, though. I can't wait to use C# .NET!

#include "stdafx.h"
struct KeyValue { const char* key; int value; };
struct CharValue {
    int char_ : 8;
    int terminal_ : 1;
    int char_step_ : 23;
    union {
        int value_;
        size_t next_index_;
        vector<CharValue>* next_;
    };
};
inline bool operator < (const CharValue& value, int c) { return value.char_ < c; }
inline bool operator < (int c, const CharValue& value) { return c < value.char_; }
inline bool operator == (const CharValue& value, int c) { return value.char_ == c; }
static const KeyValue kvs[] = {
    {"bashful", 6}, {"doc", 7}, {"dopey", 9}, {"grumpy", 2},
    {"happy", 9}, {"sleepy", 8}, {"sneezy", 2},
}, * const kvs_end = kvs + (sizeof(kvs) / sizeof(0[kvs]));
class KnownSetTrie {
public:
    typedef vector<CharValue> CharMap;
    vector<CharMap> tree_;
    ptrdiff_t first_step_;
    CharMap* root_;
    template <typename Iter> KnownSetTrie(Iter begin, Iter end) {
        first_step_ = Compile(begin, end, 0) - 1;
        // Now that the tree_ is stable, convert indices to node pointers.
        for (vector<CharMap>::iterator i = tree_.begin(); i != tree_.end(); ++i) {
            for (CharMap::iterator j = i->begin(); j != i->end(); ++j)
                if (!j->terminal_) j->next_ = &tree_[j->next_index_];
            root_ = &tree_.back();
        }
    }
    template <typename Iter> ptrdiff_t Compile(Iter begin, Iter end, ptrdiff_t char_index) {
        CharMap map;
        Iter start = begin, i = begin + 1;
        while (true) {
            CharValue value;
            value.char_ = start->key[char_index];
            if (i != end && i->key[char_index] == start->key[char_index]) {
                do { ++i; } while (i != end && i->key[char_index] == start->key[char_index]);
                value.terminal_ = false;
                value.char_step_ = Compile(start, i, char_index + 1);
                value.next_index_ = tree_.size() - 1;
            } else {
                value.terminal_ = true;
                value.value_ = start->value;
            }
            map.push_back(value);
            if (i == end) break;
            start = i++;
        }
        if (1 == map.size() && !map.front().terminal_) {
            // Pass through node:
            return map.front().char_step_ + 1;
        }
        tree_.push_back(map);
        return 1;
    }
    int __fastcall Lookup(const char* s) const {
        const CharMap* node = root_;
        s += first_step_;
        while (true) {
            const CharMap::const_iterator value = find(node->begin(), node->end(), *s);
            if (value->terminal_) return value->value_;
            s += value->char_step_;
            node = value->next_;
        }
    }
};
int _tmain(int argc, _TCHAR* argv[]) {
    KnownSetTrie trie(kvs, kvs_end);
    LARGE_INTEGER begin, end, frequency;
    QueryPerformanceCounter(&begin);
    int total = 0; // Just so the compiler doesn't completely optimize out the for loop.
    for (int i = 0; i < 1000000; ++i) {
        for (const KeyValue* k = kvs; k != kvs_end; ++k)
            total += trie.Lookup(k->key);
    }
    QueryPerformanceCounter(&end);
    QueryPerformanceFrequency(&frequency);
    __int64 ms = (end.QuadPart - begin.QuadPart) * 1000 / frequency.QuadPart;
    cout << "Total:\t" << total << endl;
    cout << "Milliseconds:\t" << ms << endl;
}
Anonymous
April 04, 2007
Hey Jeff, That's a really nice solution. I was wondering whether a trie could help here.
Anonymous
April 04, 2007
The comment has been removed
Anonymous
April 08, 2007
Recently my time has been taken up with a series of internal issues involving Beta1, C# Samples and various
Anonymous
April 09, 2007
A faster solution than a trie is a hand-built binary search tree, which is what I imagine the case-statement compiler does. This code runs around 55 ms, and doesn't cheat like my earlier code with fastcalls or std::find() in place of std::lower_bound(). Sorry for turning this into a language shoot-out, but this looked like such an interesting case to play with. And again the .NET runtime performs astoundingly well.

#include "stdafx.h"
struct KeyValue { const char* key; int value; };
static const KeyValue kvs[] = {
    {"bashful", 6}, {"doc", 7}, {"dopey", 9}, {"grumpy", 2},
    {"happy", 9}, {"sleepy", 8}, {"sneezy", 2},
}, * const kvs_end = kvs + (sizeof(kvs) / sizeof(0[kvs]));
struct DTreeNode;
struct DTreeStep {
    bool terminal;
    union { int retval; int char_step; };
    union { DTreeNode* next; size_t next_index; };
};
struct DTreeNode {
    char c;
    DTreeStep less_step, ge_step;
};
class DTree {
public:
    vector<DTreeNode> nodes_;
    DTreeStep root_step_;
    template <typename Iter> DTree(Iter begin, Iter end) {
        root_step_ = Compile(begin, end, 0);
        // Convert indices into pointers.
        for (vector<DTreeNode>::iterator i = nodes_.begin(); i != nodes_.end(); ++i) {
            if (!i->less_step.terminal) i->less_step.next = &*(nodes_.begin() + i->less_step.next_index);
            if (!i->ge_step.terminal) i->ge_step.next = &*(nodes_.begin() + i->ge_step.next_index);
        }
        root_step_.next = &*(nodes_.begin() + root_step_.next_index);
    }
    template <typename Iter> DTreeStep Compile(Iter begin, Iter end, ptrdiff_t char_index) {
        ptrdiff_t count = end - begin;
        DTreeStep step;
        if (1 == count) {
            step.terminal = true;
            step.retval = begin->value;
            return step;
        }
        Iter middle = begin + (end - begin) / 2, middle_begin = middle, middle_end = middle + 1;
        while (true) {
            if (middle_begin == begin) break;
            Iter prior = middle_begin - 1;
            if (prior->key[char_index] == middle->key[char_index]) middle_begin = prior;
            else break;
        }
        while (true) {
            if (middle_end == end) break;
            if (middle_end->key[char_index] == middle->key[char_index]) ++middle_end;
            else break;
        }
        const ptrdiff_t less_count = middle_begin - begin,
                        equal_count = middle_end - middle_begin,
                        greater_count = end - middle_end;
        if (0 == less_count) {
            if (0 == greater_count) {
                // All middles, pass through
                step = Compile(begin, end, char_index + 1);
                step.char_step += 1;
                return step;
            } else {
                // middles and greaters
                middle_begin = middle_end;
            }
        }
        DTreeNode node;
        node.c = middle_end->key[char_index];
        node.less_step = Compile(begin, middle_begin, char_index);
        node.ge_step = Compile(middle_begin, end, char_index);
        step.terminal = false;
        step.char_step = 0;
        step.next_index = nodes_.size();
        nodes_.push_back(node);
        return step;
    }
    int Lookup(const char* s) const {
        const DTreeStep* step = &root_step_;
        while (true) {
            const DTreeNode* node = step->next;
            step = *s < node->c ? &node->ge_step : &node->less_step;
            if (step->terminal) return step->retval;
            s += step->char_step;
        }
    }
};
int _tmain(int argc, _TCHAR* argv[]) {
    DTree dtree(kvs, kvs_end);
    LARGE_INTEGER begin, end, frequency;
    QueryPerformanceCounter(&begin);
    int total = 0; // Just so the compiler doesn't completely optimize out the for loop.
    for (int i = 0; i < 1000000; ++i) {
        for (const KeyValue* k = kvs; k != kvs_end; ++k)
            total += dtree.Lookup(k->key);
    }
    QueryPerformanceCounter(&end);
    QueryPerformanceFrequency(&frequency);
    __int64 ms = (end.QuadPart - begin.QuadPart) * 1000 / frequency.QuadPart;
    cout << "Total:\t" << total << endl;
    cout << "Milliseconds:\t" << ms << endl;
}
Anonymous
April 12, 2007
Ahh, I have no idea how you lot managed to get such results on 3.4 GHz machines for the standard dictionary tests. Here's my code, using the standard generic dictionary, and I get without fail 92-93ms lookup times for 1 million loops. I haven't tried the LINQ version on my machine as I'm only running .NET 2.0, but even comparing against the results for LINQ posted here (even though this is distorted due to the difference in machines used) the difference is a max of 20%, nothing like the 900% claimed. By the way, my tests were all run on a P4 2.4 GHz machine; I'm sure using a CPU that clocks 30% faster will also reduce the performance gap between my results and those posted in the original statement. PS. Don't use DateTime for monitoring performance, as DateTime is not updated every tick; my experience is that DateTime has a +/- difference of 30-40 ms on a good day vs the Stopwatch class. Anyway, here's the code I ran my tests with:

Dictionary<string, int> index = new Dictionary<string, int>();
index.Add("happy", 9);
index.Add("sneezy", 2);
index.Add("doc", 7);
index.Add("sleepy", 8);
index.Add("dopey", 9);
index.Add("grumpy", 2);
int found = 0;
Stopwatch sw = new Stopwatch();
sw.Start();
for (int i = 0; i < 1000000; i++)
    found = index["doc"];
long elapsed = sw.ElapsedMilliseconds;
sw.Stop();
sw = null;
Anonymous
April 16, 2007
Zan, Your test is only looking up one value in a loop. Our tests are looking up seven values in a loop. You can see the actual test code in the comments above. If you adjust your test to match you'll get ~650ms.
Anonymous
May 22, 2007
I found that using KeyValuePair<string,TR> instead of Case<TR> allows exposing the function CreateSwitch with an IEnumerable<KeyValuePair<string,TR>> parameter. This allows these uses:

public static Func<string, TR> CreateSwitch<TR>(this IEnumerable<string> strings, Func<string, TR> func) {
    return CreateSwitch(from s in strings select new KeyValuePair<string, TR>(s, func(s)));
}

public static Func<string, IEnumerable<TR>> EnumerableInvert<TR>(this IEnumerable<TR> values, Func<TR, string> name) {
    return CreateSwitch(from v in values
                        group v by name(v) into byname
                        select new KeyValuePair<string, IEnumerable<TR>>(byname.Key, byname));
}

public static Func<string, TR> Invert<TR>(this IEnumerable<TR> values, Func<TR, string> name) {
    return CreateSwitch(from v in values
                        group v by name(v) into byname
                        select new KeyValuePair<string, TR>(byname.Key, byname.First()));
}

public static Func<string, TR> ParseEnum<TR>() // where TR : Enum
{
    return CreateSwitch(from v in (TR[])Enum.GetValues(typeof(TR)).OfType<TR>()
                        select new KeyValuePair<string, TR>(Enum.GetName(typeof(TR), v), v));
}

I added the following check in the function:

public static Func<string, TR> CreateSwitch<TR>(this IEnumerable<KeyValuePair<string, TR>> caseDict) {
    if (caseDict.Count() != caseDict.Distinct().Count())
        throw new ArgumentException("duplicate names");
    var cases = caseDict.ToArray();
    var p = Expression.Parameter(typeof(string), "p");
    var expr = Expression.Lambda<Func<string, TR>>(
        BuildStringLength(p, cases.ToOrderedArray(s => s.Key.Length), 0, cases.Length - 1),
        new ParameterExpression[] { p });
    var del = expr.Compile();
    return del;
}
Anonymous
July 06, 2007
Rico Mariani has been posting about LINQ to SQL performance and has finally posted the performance numbers
Anonymous
July 23, 2007
Jomo Fisher—A lot of people (myself included) have written about LINQ in the next version of C#. LINQ
Anonymous
August 05, 2007
Great stuff Jomo! Unfortunately it doesn't work with this dictionary:

{"Ado", i++}, {"A2o", i++}, {"A2i", i++}, {"B2o", i++}, {"AdoX", i++}, {"AdPX", i++}

It crashes with an IndexOutOfRangeException in BuildStringChar() because it gets a set of strings that are of different lengths. It seems to me that both your if statements that deal with middle are incorrect. If I replace middle with upper in those and in both calls to FirstDifferentDown() it almost works. Another problem I had to fix is that ToOrderedArray() orders the strings in a way that is not consistent with the Char comparisons in the resulting delegate. In this case it breaks down because 'o' < 'P' is false while "AdoX".CompareTo("AdPX") is -1. My solution was to make a new ToOrderedArray() that takes an IComparer<string> and passes it on to OrderBy(). I then pass it the Instance member from this class:

internal class StringCharComparer : IComparer<string> {
    public static StringCharComparer Instance = new StringCharComparer();
    public int Compare(string x, string y) {
        for (int i = 0; i < Math.Min(x.Length, y.Length); i++) {
            if (x[i] > y[i]) return 1;
            else if (x[i] < y[i]) return -1;
        }
        if (x.Length == y.Length) return 0;
        else if (x.Length > y.Length) return 1;
        else return -1;
    }
}

By the way, it was pure luck that the dictionaries I tried exposed these bugs. Thoughts?
Anonymous
August 09, 2007
Raptor, I think you're right on both counts. Thanks for pointing this out.
Anonymous
August 28, 2007
Nice. One thing I did was add the hand-optimized Lookup into the testing for comparison. This is where one decides if the lookup is a dynamic list and if the dictionary will change over time. Even though the optimized Lookup is hard to maintain and debug, that would be OK if the list were static and did not change over time. But if it does change, then one has the choice of building a compiled expression versus doing it the conventional way. The new idea is to take a conventional way of doing something and turn it into an expression that can be compiled and may have performance gains. I like how you take something as simple as a lookup that one might think is already optimized and show that by building an optimized expression the performance can be even faster.

Dictionary: 700.07 ms ~ 7.8x slower
Compiled: 170.017 ms ~ 1.89 - 2x slower
Lookup: 90.009 ms
Dictionary: 700.07 ms ~ Compiled: 170.017 ms ~ 4.12x faster

Thanks for the expression-building logic that shows how the various static methods in System.Linq.Expressions are used to build a dynamic function.
Anonymous
November 06, 2007
Nothing really to add of value, but kind of interesting. If you define the following class:

class EqualityCompareStrings : IEqualityComparer<string> {
    public bool Equals(string x, string y) { return string.Equals(x, y); }
    public int GetHashCode(string obj) { return obj.GetHashCode(); }
}

And then create the dictionary with an instance of that class like

Dictionary<string, int> dict = new Dictionary<string, int>(new EqualityCompareStrings());

Then you get (from my test!) a time on the non-SwitchCompiler code of about 80%. So as I said, nothing really to do with anything much, although considering how often strings are used as keys it might be a trivial optimization to put into System.Collections.Generic.EqualityComparer<T>.CreateComparer() to handle it as a special case... Assuming that I haven't done anything wrong :-) (But then you might as well use the Jomo special SwitchCompiler - but if you want to keep using Dictionary<string, ...>)
Anonymous
December 23, 2007
Hello, I've also hit on the bug that Raptor-75 pointed out. Using his StringCharComparer didn't solve it all. If you have a complete fix can you please post it? Best Regards, Gustavo Guerra
Anonymous
April 10, 2008
On my current project, we're in a 'make it faster' mode right now. We've already knocked out a
Anonymous
August 28, 2008
StaticStringDictionary - Fast Switching with LINQ revisited
Anonymous
August 28, 2008
Hi, I posted a version that fixed the bugs found by Raptor-75 in my blog: http://blog.codebeside.org/archive/2008/08/28/staticstringdictionary-fast-switching-with-linq-revisited.aspx
Anonymous
August 29, 2008
Jomo Fisher—A while back I wrote about using LINQ to get close-to-the-metal performance in a particular