Performance issues with "String Concatenation" in JScript.

Hello Friends,
I am Jaiprakash and work as a Developer in the Jscript team. There is not much to tell about me except one fact - I can’t recall the last time when any of my teammates called me by my name, they just call me JPJ.

I know, you have seen pretty cool blogs in past from my team mates and I promise this one also not going to disappoint you. So coming to the point, I am going to talk about the famous ‘string append’ problem in Internet Explorer (or Jscript to be more accurate) and how are we going to fix it.

The Problem
let’s do a small experiment to get to it. Copy paste following code in an html file and run in Internet Explorer. Observe the time reported.

<script language="javascript">
var before = new Date().getTime();
var s = "jscript";
x = "";
count = 20000;
for (var i = 0; i < count; i++) {
x += s;
}
var after = new Date().getTime();
var time = after - before;
alert("Time taken to append 'jscript' " + count + " times : " + time + " ms");
</script>

On my machine the time reported is 1922 ms (averaged out three times). As you see this code is doing nothing other than appending a 7 character string 20K times. And for such a trivial operation it is consuming almost 2 seconds. Phewww….
Impressed? Not yet. It still doesn’t appeal you? OK. Try increasing the no of appends and enjoy the show. Here is what I get on my box…

Appends

Time (ms)

10K

260

20K

1922

25K

6125

30K

11578

Well, I am sure you would be impressed by now. And I can feel your curiosity to know how Jscript makes such a trivial looking operation such a big one.

Not much, it uses a very trivial algorithm to append strings to get you to such a great show. Whenever it has to append two strings of size ‘m’ and ‘n’ each, it allocates a new buffer of size (m + n) , copies the strings one by one, and returns this buffer back. Hmmm. This logic looks fine. What’s the deal then?

Let’s say you want to append 3 strings like following...

resultStr = “Jscript” + “String” + “Append”;

To give you the final result, Jscript first appends “Jscript” and “String”. It allocates a buffer of size 13, copies them one by one to the buffer and assigns the buffer to a temporary, let’s say temp1. Next it appends temp1 and “Append”, allocates a buffer of size 19, copies temp1 and “Append”, and assigns the result to resultStr. As you see, to generate final string for you it copied “Jscript” and “String” twice (temp1 and resultStr) and allocated one intermediate buffer. Imagine what happens when you append ‘n’ such strings. It will allocate n-1 buffers, copy each string (n-i) times where ‘i’ is the index at which the string appeared in the operation.

You might be thinking why Jscript was shipped with such an algorithm. Well, Jscript was meant to be used just as glue, not as full-fledge programming language as used in modern WebApps.

 How it affects Real world Scenarios
Gone are the days when web apps used to be static text which was just retrieved from the server and displayed in your browser. Today’s world is AJAX, involves lot more interaction with server, and data received is not just the plain HTML. A modern WebApp fetches data from server; builds the page on your machine using Jscript and DOM APIs, and displays on your browser. And to build the page it has to generate HTML, which is nothing but text. And to generate HTML out of the pieces retrieved from server, it has to append them as strings. And as the applications are becoming smarter and smarter, they tend to minimize the interaction with server, and do as much processing as possible in your browser.

Let me give you some real world examples. You all must have used some or the other AJAX mail client like msn.com, outlook web access or yahoo mail beta. In your inbox, if you scroll down or up, they just update your mail list not the entire page. How it happens? It gets the information about mails from server, builds the table of your mails on the fly and displays in the browser.

Another good example would be Redfin which is actually AJAX based real estate portal. If you search a locality, it shows a map with location of all the results and below the map it shows details of each result. This table it constructs by appending several hundreds of strings.

These are just two. There are many more such. Basically what I want to convey is that string append is a pretty common and basic operation and probability is very high that large number of web apps get affected by this.

So we are fixing it!!!
We are trying an algorithm which avoids those intermediate buffer allocations and repeated copy operations. With this algorithm, I just ran the same test case and here are the numbers for you. Please note that the fix is still under progress so these numbers might change a bit here and there, however they should be enough to give you an idea of improvements.

Appends

Time (ms)

10K

23

20K

47

25K

62

30K

78

Shhhh. Numbers speak for themselvesJ.

JP, I Need this fix on my box, now!!!
I wish I could help you. You will have to wait till next release of Internet Explorer.

At last…
Please help us in helping you. If you have encountered other such script “hairballs”, do leave a note on any of the blogs published here. I want to publish more such blogs for youJ.

 

- JP

Comments

  • Anonymous
    October 17, 2007
    PingBack from http://www.artofbam.com/wordpress/?p=9573

  • Anonymous
    October 17, 2007
    > Numbers speak for themselves. Very nice! :-) > You will have to wait till next release of Internet Explorer. Waiting is hard... :-(

  • Anonymous
    October 17, 2007
    Nice one JP! But after reading Eric's post that you linked to, I'm really interested to know what algorithm you've used to fix it. Any chance of a behind-the-scenes look? Cheers Matt

  • Anonymous
    October 17, 2007
    Seems like a vaste of time for me. Even in JScript there are a much better way to concatenate strings than this by using a string builder approach. And we can use that with todays browsers (and of course we all do right?) This is no different from what we should be doing in all the other programming languages we use, because they all have this problem with string copying. Having that said, The JScript performance is one of IE's biggest achilles heals these days, and it's good news that you are optimizing it. VML rendering and mousemove needs some clean up too.

  • Anonymous
    October 17, 2007
    Oh, just great. Listen me. guys, we have an almighty algorithm, but we can not describe it.

  • Anonymous
    October 17, 2007
    That's why I've been using arrays to concatenate strings for, oh, about ten years now. To quote JPJ's examples: var x = []; for (var i = 0; i < count; i++) {     x.push(s); } x = x.join(''); Output: Time taken to += 'jscript' 20000 times : 875 ms Time taken to push 'jscript' 20000 times : 78 ms And the other one is simply rewritten: resultStr = ['Jscript', 'String', 'Append'].join('');

  • Anonymous
    October 17, 2007
    Martijn Coppoolse, yor approach is excellent for IE, but if you want to have a cross-broser decision, you should mention that Array.join approach is the worst in Firefox and not the best in Opera.

  • Anonymous
    October 23, 2007
    Thanks for the feedback guys. The algorithm we have used is somewhat similar to what Eric mentioned in his blog. As I said it is still under testing and may need some changes. As Martin and SharpGIS mentioned, folks have been using Array.push() and Array.join() approach to concatenate strings in IE, but still not everyone knows about it. New developers still fall back to using direct string concatenation approach, which hurts perf and causes lot of extra work. So why not have all of them performant and let developers concentrate on 'task in hand' rather than 'best practices' :).

  •  JP
  • Anonymous
    October 24, 2007
    The comment has been removed

  • Anonymous
    October 25, 2007
    The comment has been removed

  • Anonymous
    October 29, 2007
    Should be possible to get a nice speedup by avoiding some of the long string copies. Something like: var blocksize = 10; for (var i = 0; i < count / blocksize; i++) {    var tmp = "";    for (var j = 0; j < blocksize; j++) {        tmp += s;    }    x += tmp; } This way the number of string copies are the same, but a lot of them are copying short strings instead of long strings.

  • Anonymous
    October 29, 2007
    Absolutely right. The current implmentation gets slower ans slower as string size increases. It's nice work around to append small strings, and then generate a big string out of them. -JP

  • Anonymous
    October 31, 2007
    What is holding you from putting this fix in a next update of current versions? It's not a crime to release performance updates after a products release.

  • Anonymous
    November 05, 2007
    i suggest mentioning array.join in the article, such that less new programmers have to be lightly bruised with sticks for their crimes.  regardless of ie's performance, the problem is not string concatenation, its the people who use it.

  • Anonymous
    December 20, 2007
    I use this all the time in HTA's, simple, easy to read and pretty efficient (i lookup the length of the array so I can add stuff to the output without having to increase a counter). No extra functions / classes needed! :) var output = new Array(); for (var i = 0; i < 1000; i++) {   output[output.length] = i; } alert(output.join(''));

  • Anonymous
    December 20, 2007
    This is not a waste of time, SharpGIS. Whether you can improve string performance is not relevant to the thousands of web pages already existing that use string builders.

  • Anonymous
    December 20, 2007
    This article would have been more accurately titled: "Performance issues with "Extremely Long String Concatenation" in JScript." Because, unfortunately I think it's a flawed test that will mislead lots of readers of that thread can have them use a less attractive coding pattern that can be up to 3x slower, to do a simple thing like concatenate two strings. I will post my tests to the thread. I think that "fixing" our code according to that article would actually reduce our performance. The problem with the example given is that it creates a very specific, non-realistic test that tickles a particular known code path. Specifically, it repeatedly adds a small string to an extremely long one. That is not a real world use case, and if it is, then yes, people should use one of the supplied alternative patterns (and, of course, MS should fix their implementation in any case). That IE's JS performs particularly poorly in this regard is evidenced by the non-linear performance times (and perhaps that's why JP didn't show us, e.g. 1k iterations, or 1 or 20 iterations, which are quite good). This is also mentioned by a follow-up poster. Normally one loops a performance test many times and then divides by the number of iterations to see what the single-iteration cost is, but that is not accurate in JP's test. Essentially, JP's test tells us nothing about how long it takes to do a single string concatenation or even 1000, it only tells us how long it takes to create a very long string via 20k iterations. What, then, if we did a more realistic test, which actually tested what real code is doing? I did that and you'll see that the results are far better and far more linear. I've included two such tests here, as well as my timings for JP's tests, Martin's tests, and my tests. You'll see that in the normal case, IE has stellar performance. I also included a "1000" iteration number so you can see the linearity of the various tests. JP’s code: 20k iterations (1000 iterations): IE: 34000ms (300ms) (yes, you read that right) FF: 83ms (4ms) Martijn Coppoolse ArrayTest (super long strings): 20k iterations (1000 iterations): IE: 63ms (2ms) FF:120ms (5ms) Keith ArrayTest: 20k iterations (1000 iterations) IE: 164ms (6ms) FF: 312ms (15ms) Keith ConcatTest1 (+): 20k iterations (1000 iterations) IE: 43ms (1ms) FF: 76ms (1ms) Keith ConcatTest2 (+=): 20k iterations (1000 iterations) IE: 51ms (1ms) FF: 109ms (6ms) **** THE CODE **** ARRAYTEST var before = new Date().getTime(); var s1 = "11111"; var s2 = "22222"; count = 20000; for (var i = 0; i < count; i++) {     x = [s1, s2];     x.join(''); } var after = new Date().getTime(); var time = after - before; alert("ARRAY TEST(.join): Time taken to append '"+s1+"'+'"+s2+"'(="+x+") " + count + " times : " + time + " ms"); CONCAT TEST1 (+) var before = new Date().getTime(); var s1 = "11111"; var s2 = "22222"; count = 20000; for (var i = 0; i < count; i++) {     x = s1+s2; } var after = new Date().getTime(); var time = after - before; alert("CONCAT1 TEST(+): Time taken to append '"+s1+"'+'"+s2+"'(="+x+") " + count + " times : " + time + " ms"); CONCAT TEST2 (+=) var before = new Date().getTime(); var s1 = "11111"; var s2 = "22222"; count = 20000; for (var i = 0; i < count; i++) {    x = "AAAAA";    x += s2; } var after = new Date().getTime(); var time = after - before; alert("CONCAT2 TEST(+=): Time taken to append '"+s1+"'+'"+s2+"'(="+x+") " + count + " times : " + time + " ms");

  • Anonymous
    December 27, 2007
    Thanks Keith for such detailed example and numbers. I really appreciate it. I agree that I should have stated that problem lies in creating big strings through multiple string concatenation, not small strings. However I feel that contents in the blog convey this. Pls. check out the section "How it affects Real world Scenarios" and the paragraph just above it where I have explained the reason behind IE's slowness. One more thing I would like to mention is that most of the Jscript benchmarks floating around do such concatenations to test the string concatenation performance.

  • JP
  • Anonymous
    December 31, 2007
    While it was overall great to read this, it is horrible, pathetic, and gut-wrenching to hear is that we must wait for the next release of Internet Explorer before we can BEGIN to utilize the fix! I don't understand why you can't patch IE7...? Go ahead and take a look and see the stats on how many people currently use IE6. http://www.w3schools.com/browsers/browsers_stats.asp Now consider how long it will be before the majority of IE users will be running the next release (scheduled for 2009?). RIDICULOUS.

  • Anonymous
    January 07, 2008
    Is this algorithm available in IE7 patch? Ricky.

  • Anonymous
    January 07, 2008
    I believe you mean script 5.7. Sorry, it is not there. -JP

  • Anonymous
    January 19, 2008
    Seems like a vaste of time for me. Even in JScript there are a much better way to concatenate strings than this by using a string builder approach. And we can use that with todays browsers (and of course we all do right?) This is no different from what we should be doing in all the other programming languages we use, because they all have this problem with string copying.

  • Anonymous
    January 19, 2008
    i suggest mentioning array.join in the article, such that less new programmers have to be lightly bruised with sticks for their crimes.  regardless of ie's performance, the problem is not string concatenation, its the people who use it.

  • Anonymous
    January 19, 2008
    Nice one JP! But after reading Eric's post that you linked to, I'm really interested to know what algorithm you've used to fix it. Any chance of a behind-the-scenes look?

  • Anonymous
    January 19, 2008
    > Numbers speak for themselves. Very nice! :-) > You will have to wait till next release of Internet Explorer. Waiting is hard... :-(

  • Anonymous
    February 16, 2008
    En JavaScript, comme en .net ou ava, les strings sont immutables cela veut dire que l'objet ne peut pas

  • Anonymous
    February 27, 2008
    please note that "abc"[1] is not required by ECMA. JScript is not doing anything wrong. It's technically an extension.

  • Anonymous
    March 05, 2008
    Making developers more productive through the design, development, and debug phases of web application

  • Anonymous
    March 06, 2008
    IE8 Beta1 is now available here . It has remarkable script improvements in JScript performance. It also

  • Anonymous
    March 18, 2008
    Hello Friends, Have you read my post on the String Concatenation issue? If yes, then I can sense your

  • Anonymous
    August 05, 2008
    Good to see you are working on this, but IE is still way slower than other browsers.

  • Anonymous
    January 13, 2009
    The comment has been removed