[Azure PowerShell] How to improve the performance of blob copy by placing inline C#/.NET code
Performance issues are always hard and tricky to fix. Recently I was asked to investigate why a given PowerShell script took so long and got slower over heavy iteration. In the beginning, each copy operation took only a few minutes, but as the iterations piled up we saw the performance degrade by 2x, 3x and more.
A sample loop around the copy command:
foreach ($SrcBlob in $SrcBlobs)
{
    $DestBlob = "root/" + $SrcBlob.Name
    Start-AzureStorageBlobCopy -SrcBlob $SrcBlob.Name -SrcContainer $SrcContainerName -Context $SrcContext -DestBlob $DestBlob -DestContainer $DestContainerName -DestContext $DestContext -Force
}
We also used Measure-Command to capture the execution duration of each iteration, printing the value per iteration and the sum at the end of the loop to confirm. We ran the loop 5,000+ times, copying between storage accounts, and found that the PowerShell cmdlet execution got slower with each iteration. On further investigation, we confirmed this was due to a known limitation in PowerShell.
$CopyTime = Measure-Command {
    Start-AzureStorageBlobCopy -SrcBlob $SrcBlob.Name -SrcContainer $SrcContainerName -Context $SrcContext -DestBlob $DestBlob -DestContainer $DestContainerName -DestContext $DestContext -Force
}
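For reference, this is the shape of the measurement we used, expanded to log every iteration so the slowdown becomes visible over time. It is a sketch that reuses the same variables ($SrcBlobs, $SrcContext, $DestContext, and so on) from the loop above:

```powershell
# Record the duration of every iteration so any gradual slowdown is visible.
$timings = @()
$i = 0
foreach ($SrcBlob in $SrcBlobs) {
    $DestBlob = "root/" + $SrcBlob.Name
    $t = Measure-Command {
        Start-AzureStorageBlobCopy -SrcBlob $SrcBlob.Name -SrcContainer $SrcContainerName -Context $SrcContext `
            -DestBlob $DestBlob -DestContainer $DestContainerName -DestContext $DestContext -Force
    }
    $i++
    $timings += $t.TotalMilliseconds
    Write-Host ("Iteration {0}: {1:N0} ms" -f $i, $t.TotalMilliseconds)
}
# Sum at the end of the loop to confirm the overall duration.
Write-Host ("Total: {0:N0} ms" -f (($timings | Measure-Object -Sum).Sum))
```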
Yes, there is a known issue with PowerShell running slowly over large loops due to the way PowerShell works: on the 16th execution of the loop, the contents of the loop body are compiled dynamically, and .NET then has to run some extra security checks. Thanks to some of our internal folks for providing clarity on this. To overcome it, there is a suggested workaround, which is what we are here for: we replaced the cmdlet call with .NET code carrying the copy logic inside the PowerShell script to improve performance. This way, the security check runs only once instead of from the 16th iteration onward. You will find the detailed information here –> Why can't PowerShell run loops fast? https://blogs.msdn.microsoft.com/anantd/2014/07/25/why-cant-powershell-run-loops-fast/
How to place the C# as inline code within PowerShell:
$reflib = (Get-Item "C:\temp\Microsoft.WindowsAzure.Storage.dll").FullName
[void][Reflection.Assembly]::LoadFrom($reflib)
$Source = @"
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Auth;
using Microsoft.WindowsAzure.Storage.Blob;
using System;

namespace ns
{
    public static class copyfn1
    {
        public static void Copy_bw_SA_Blobs_Test(string sourceAccountKey, string destAcKey, string SrcSAName, string DestSAName, string SrcContainerName, string DestContainerName, string SrcPrefix, string DestPrefix)
        {
            // Source account - build the container reference from the account key
            StorageCredentials scSrc = new StorageCredentials(SrcSAName, sourceAccountKey);
            CloudStorageAccount srcAc = new CloudStorageAccount(scSrc, true);
            CloudBlobClient cbcSrc = srcAc.CreateCloudBlobClient();
            CloudBlobContainer contSrc = cbcSrc.GetContainerReference(SrcContainerName);

            // Generate a SAS key and use it for delegated access
            SharedAccessBlobPolicy sasConstraints = new SharedAccessBlobPolicy();
            sasConstraints.SharedAccessExpiryTime = DateTime.UtcNow.AddHours(24);
            sasConstraints.Permissions = SharedAccessBlobPermissions.Write
                | SharedAccessBlobPermissions.List | SharedAccessBlobPermissions.Add
                | SharedAccessBlobPermissions.Delete | SharedAccessBlobPermissions.Create
                | SharedAccessBlobPermissions.Read;
            string sasContainerToken = contSrc.GetSharedAccessSignature(sasConstraints);

            // The URI string for the source container, including the SAS token
            string containersas = contSrc.Uri + sasContainerToken;
            CloudBlobContainer container = new CloudBlobContainer(new Uri(containersas));

            // Destination account - account key credentials, no SAS required
            StorageCredentials scDst = new StorageCredentials(DestSAName, destAcKey);
            CloudStorageAccount DstAc = new CloudStorageAccount(scDst, true);
            CloudBlobClient cbcDst = DstAc.CreateCloudBlobClient();
            CloudBlobContainer contDst = cbcDst.GetContainerReference(DestContainerName);

            // Flat-list the source blobs under the prefix and start a server-side copy for each
            foreach (var eachblob in container.ListBlobs(SrcPrefix, true, BlobListingDetails.Copy))
            {
                CloudBlob srcBlob = (CloudBlob)eachblob;
                string srcpath = srcBlob.Name;
                string dstpath = (DestPrefix != "") ? srcpath.Replace(SrcPrefix, DestPrefix) : srcpath;
                Console.WriteLine("Copying blob - " + dstpath);
                if (srcBlob.BlobType == BlobType.BlockBlob)
                {
                    CloudBlockBlob dstblob = contDst.GetBlockBlobReference(dstpath);
                    dstblob.StartCopy((CloudBlockBlob)srcBlob);
                }
                else if (srcBlob.BlobType == BlobType.AppendBlob)
                {
                    CloudAppendBlob dstblob = contDst.GetAppendBlobReference(dstpath);
                    dstblob.StartCopy((CloudAppendBlob)srcBlob);
                }
                else if (srcBlob.BlobType == BlobType.PageBlob)
                {
                    CloudPageBlob dstblob = contDst.GetPageBlobReference(dstpath);
                    dstblob.StartCopy((CloudPageBlob)srcBlob);
                }
            }
        }
    }
}
"@
Add-Type -ReferencedAssemblies $reflib -TypeDefinition $Source -Language CSharp -PassThru
[ns.copyfn1]::Copy_bw_SA_Blobs_Test("acc_key1", "acc_key2", "storage_acc1", "storage_acc2",
    "src_container_name", "dest_container_name", "sales/2017/Jan", "sales/2017/backup/")
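One practical note when iterating on the script: within a single PowerShell session, Add-Type cannot redefine a type that already exists, so re-running the script after editing $Source throws an error. A small guard (a sketch, using the same ns.copyfn1 type name as above) skips the recompile attempt:

```powershell
# Skip compilation if the type from a previous run is still loaded in this session.
# Note: if you have edited $Source, start a fresh session to pick up the change.
if (-not ("ns.copyfn1" -as [type])) {
    Add-Type -ReferencedAssemblies $reflib -TypeDefinition $Source -Language CSharp
}
```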
With this inline C# code, we were able to optimize the copy duration and mitigate the performance degradation over heavy loops.
Let me know if this helps in some way, or if you run into any issue with it.