Inside String Concatenation

In this post we will discuss how concatenation happens when you call String.Concat() and how it happens when you call StringBuilder.Append().

The analysis here is based on the Rotor (SSCLI) .NET 2.0 code, so some of it may not apply to .NET 1.1.

The general opinion is that for 2-4 strings you should use String.Concat(), and for more than 4 strings StringBuilder.Append(). The main difference that StringBuilder brings is that it reduces the number of string allocations during the append operation, thereby reducing the GC effort.

How does a string concat work:-

If two strings are passed as arguments, it will find out the length of the resultant string, allocate memory for it and then copy these strings one by one.

If more than 4 strings are passed, they are treated as an array of strings to be concatenated: it spins through the array elements to find the length of the resultant string, then allocates the memory for the resultant string and copies the strings one by one into that memory location.
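The "measure, allocate once, copy" steps described above can be sketched in ordinary managed code. This is only an illustration, not the runtime's implementation: the class name ConcatSketch is hypothetical, and the sketch copies into a char[] and then builds a string from it, whereas the real code writes directly into the newly allocated string.

```csharp
using System;

// Hypothetical illustration of the array-based concatenation steps.
public static class ConcatSketch
{
    public static string Concat(params string[] values)
    {
        // Pass 1: spin through the array to find the total length.
        // Null entries are treated as empty strings in this sketch.
        int totalLength = 0;
        foreach (string s in values)
        {
            if (s != null) totalLength += s.Length;
        }

        // One allocation sized for the whole result.
        char[] buffer = new char[totalLength];

        // Pass 2: copy each string in at the running offset.
        int position = 0;
        foreach (string s in values)
        {
            if (s == null) continue;
            s.CopyTo(0, buffer, position, s.Length);
            position += s.Length;
        }

        return new string(buffer);
    }
}
```

Calling ConcatSketch.Concat("hello", "world", " My", " Program") produces the same text as joining the four strings with +.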

For the C# code below:-

string s1 = "hello";
string s2 = "world";
string s3 = " My";
string s4 = " Program";

string result = s1 + s2 + s3 + s4;

it generates the IL code as

IL_0001: ldstr "hello"
IL_0006: stloc.0
IL_0007: ldstr "world"
IL_000c: stloc.1
IL_000d: ldstr " My"
IL_0012: stloc.2
IL_0013: ldstr " Program"
IL_0018: stloc.3
IL_0019: ldloc.0
IL_001a: ldloc.1
IL_001b: ldloc.2
IL_001c: ldloc.3
IL_001d: call string [mscorlib]System.String::Concat(string, string, string, string)

Now when the concatenation changes to

string result = s1+s2+s3+s4+s1+s2;

The IL now calls the overload of Concat that takes a string array:

call string [mscorlib]System.String::Concat(string[])
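To make the relation between the + operator and the array overload concrete, here is a small sketch that builds the six-operand result both ways. The class and method names are hypothetical, and the exact overload a compiler picks for + may differ between compiler versions; the sketch only shows that the two forms give the same string.

```csharp
using System;

// Hypothetical helper comparing the + operator with an explicit Concat call.
public static class ConcatOverloads
{
    // Six operands joined with +, as in the example above.
    public static string ViaOperator(string s1, string s2, string s3, string s4)
    {
        return s1 + s2 + s3 + s4 + s1 + s2;
    }

    // The same result through an explicit call to the string[] overload.
    public static string ViaArrayOverload(string s1, string s2, string s3, string s4)
    {
        return String.Concat(new string[] { s1, s2, s3, s4, s1, s2 });
    }
}
```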

This is a fast way of concatenating strings, because the result is allocated only once, and it works perfectly well for a small number of strings. So when do we go for StringBuilder?

You should go for StringBuilder when you are not sure how many concatenations are going to happen, i.e., when the string concatenation is going to happen in a loop.
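A minimal sketch of the loop case, assuming a hypothetical helper that joins an array of items whose count is known only at run time. The second method gives the same result with +=, but creates a new intermediate string on every pass through the loop.

```csharp
using System;
using System.Text;

// Hypothetical helpers showing concatenation inside a loop.
public static class LoopConcat
{
    // One growing buffer is reused for every append.
    public static string JoinWithBuilder(string[] items, string separator)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < items.Length; i++)
        {
            if (i > 0) sb.Append(separator);
            sb.Append(items[i]);
        }
        return sb.ToString();
    }

    // Same output, but each += allocates a new string for the partial result.
    public static string JoinWithOperator(string[] items, string separator)
    {
        string result = "";
        for (int i = 0; i < items.Length; i++)
        {
            if (i > 0) result += separator;
            result += items[i];
        }
        return result;
    }
}
```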

String Concatenation:-

String.Concat(string,string)

  1. The inputs are string_A, string_B;

  2. Total length of the result string = Length(string_A) + Length(string_B);

  3. Call FastAllocateString. It takes the length, allocates that much memory on the heap and returns a string object located in that memory.

  4. Then it calls the FillStringChecked function. It takes 3 parameters.

    1. The end result string

    2. Destination position

    3. Source string

  5. First string_A is copied into the result. Then string_B is copied into the result starting at position Length(string_A).

  6. The copy happens in the following way:

String.cs

fixed (char *pDest = &dest.m_firstChar)
fixed (char *pSrc = &src.m_firstChar) {
    wstrcpy(pDest + destPos, pSrc, length);
}

pDest contains the address of the first character of the string ‘dest’.

pSrc contains the address of the first character of the string ‘src’.

  7. The fixed keyword is used here so that the garbage collector does not move the string objects while the concat operation is half way through. This is called pinning, and it can be used only when the method is marked unsafe.
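The pinning step can be tried out in user code with a small sketch. Everything here is hypothetical (the class PinnedCopySketch and its methods are not part of the framework), it copies into a char[] rather than into a string, and it has to be compiled with unsafe code enabled (the /unsafe compiler option).

```csharp
using System;

// Hypothetical sketch of a checked copy through pinned pointers.
// Requires compiling with unsafe code enabled.
public static class PinnedCopySketch
{
    public static unsafe void FillChecked(char[] dest, int destPos, string src)
    {
        // Check bounds first: writes through raw pointers are not range-checked.
        if (destPos < 0 || src.Length > dest.Length - destPos)
            throw new ArgumentOutOfRangeException("destPos");

        // fixed pins both objects so the GC cannot move them during the copy.
        fixed (char* pDest = dest)
        fixed (char* pSrc = src)
        {
            for (int i = 0; i < src.Length; i++)
                pDest[destPos + i] = pSrc[i];
        }
    }

    // Two-string concatenation following the steps listed above.
    public static string ConcatTwo(string a, string b)
    {
        char[] result = new char[a.Length + b.Length];
        FillChecked(result, 0, a);
        FillChecked(result, a.Length, b);
        return new string(result);
    }
}
```

The explicit loop stands in for wstrcpy, which is internal to the runtime and not callable from user code.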

StringBuilder:-

The StringBuilder class represents a mutable string. It is convenient for situations in which it is desirable to modify a string, perhaps by removing, replacing or inserting characters, without creating a new string after each modification.

What happens when you call the constructor of StringBuilder:-

StringBuilder strbConcatenation = new StringBuilder(string);

It allocates memory space for 16 characters by default. However, you can specify the initial capacity yourself. The maximum capacity that can be allocated is 2147483647 (this is nothing but Int32.MaxValue).

Now if the length of the string passed in is greater than the capacity, the capacity is doubled (capacity = capacity * 2) until the string fits. This ensures a predictable growth rate. Memory of that capacity is then allocated, the input string is copied to the start of the newly allocated memory, and the reference to that memory space is returned.
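The capacity rule above can be written out as a small pure function. This is a sketch of the rule as described in this post, not the framework's code: the class and method names are hypothetical, the overflow guard is my own addition for safety, and other runtime versions may size the initial buffer differently.

```csharp
using System;

// Hypothetical sketch of the initial-capacity rule described above.
public static class CapacitySketch
{
    public const int DefaultCapacity = 16;

    // requestedCapacity == 0 means "no capacity given, use the default".
    public static int InitialCapacity(int requestedCapacity, int valueLength)
    {
        int capacity = requestedCapacity == 0 ? DefaultCapacity : requestedCapacity;

        // Keep doubling until the initial string fits.
        while (capacity < valueLength)
        {
            capacity *= 2;

            // Assumed guard: if doubling overflows, fall back to the exact length.
            if (capacity < 0)
            {
                capacity = valueLength;
                break;
            }
        }
        return capacity;
    }
}
```

For example, a 5-character string stays at the default of 16, a 17-character string gets 32, and a 40-character string gets 64.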

Now what happens when you call StringBuilder.Append(value):-

StringBuilder maintains the reference to the allocated memory space. It first finds out whether the value can be appended in place in the same memory space. If it fits, the wstrcpy happens directly into that space.

What if the StringBuilder does not have enough space to accommodate the newly concatenated string?

In that case, it has to allocate a new memory space and copy both the old string and the new string into it. All this happens in fixed (pinned) regions of memory. This is the reason why providing a suitable capacity when instantiating the StringBuilder reduces the number of new allocations. This can noticeably improve performance, as there will be fewer garbage collections.
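As a usage sketch of the capacity advice, the hypothetical helper below estimates the final length first and passes it to the StringBuilder(int) constructor, so the appends that follow should fit in the space reserved up front.

```csharp
using System;
using System.Text;

// Hypothetical helper that sizes the builder before appending.
public static class PresizedBuilder
{
    public static string BuildLine(string[] words)
    {
        // Estimate the final length: every character plus one space per word.
        int estimated = 0;
        foreach (string w in words) estimated += w.Length + 1;

        // Reserve the estimated space once, up front.
        StringBuilder sb = new StringBuilder(estimated);
        foreach (string w in words)
        {
            sb.Append(w);
            sb.Append(' ');
        }
        return sb.ToString();
    }
}
```

A rough estimate is enough here; if it turns out too small, the builder still grows as described above.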

This is clearer from the SSCLI code:

if (NeedsAllocation(currentString, requiredLength)) {
    // Not enough room: allocate more memory and get a new string.
    String newString = GetNewString(currentString, requiredLength);
    newString.AppendInPlace(value, currentLength);
    ReplaceString(tid, newString);
} else {
    // Enough room: the append is carried out in place.
    currentString.AppendInPlace(value, currentLength);
    ReplaceString(tid, currentString);
}
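The same branch can be mimicked in plain managed code with a char[] buffer. MiniBuilder below is a hypothetical toy class, not the framework's StringBuilder, and the doubling growth policy is an assumption made for the sketch; it only shows the "append in place if it fits, otherwise allocate and copy" decision.

```csharp
using System;

// Hypothetical toy builder illustrating the append decision above.
public sealed class MiniBuilder
{
    private char[] buffer = new char[16];
    private int length;

    public int Capacity { get { return buffer.Length; } }

    public MiniBuilder Append(string value)
    {
        if (string.IsNullOrEmpty(value)) return this;

        int requiredLength = length + value.Length;
        if (requiredLength > buffer.Length)
        {
            // Not enough room: allocate a larger buffer and copy the old contents.
            // Doubling is an assumed growth policy for this sketch.
            int newCapacity = Math.Max(buffer.Length * 2, requiredLength);
            char[] newBuffer = new char[newCapacity];
            Array.Copy(buffer, newBuffer, length);
            buffer = newBuffer;
        }

        // Append in place at the current end of the text.
        value.CopyTo(0, buffer, length, value.Length);
        length = requiredLength;
        return this;
    }

    public override string ToString()
    {
        return new string(buffer, 0, length);
    }
}
```

Appending "hello" and "world" stays within the initial 16 characters, so no new buffer is allocated; a further append that pushes the total past 16 takes the allocate-and-copy branch.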

Conclusion:-

  1. StringBuilder gives an edge in performance because it makes fewer allocations and deallocations.

  2. String.Concat is good for a small number of strings because the copy itself is fast compared to StringBuilder; its cost is the memory allocation for each new result string and the time that allocation takes.

 

Disclaimer: All the information in this post is my own and does not reflect any opinion of my company.
