Error when using AutoPoolSpecification with an AutoScaleFormula in .NET

James Thurley 171 Reputation points
2021-05-10T14:41:20.483+00:00

When creating a job using an AutoPoolSpecification where PoolSpecification.AutoScaleFormula is set, I get the following error:

HttpStatusCode:BadRequest
Error Details key=PropertyName value=targetLowPriorityNodes
Error Details key=PropertyValue value=0
Error Details key=Reason value=TargetDedicatedNodes and TargetLowPriorityNodes cannot be specified when EnableAutoScale is true

The error indicates that I have incorrectly set targetLowPriorityNodes to 0, when I haven't set it at all. Explicitly setting it to null doesn't help either. Is the Azure Batch .NET client library perhaps incorrectly defaulting it to zero?

To reproduce, create a console application referencing <PackageReference Include="Microsoft.Azure.Batch" Version="14.0.0" /> with the following code:

using System;
using System.Threading.Tasks;
using Microsoft.Azure.Batch;
using Microsoft.Azure.Batch.Auth;
using Microsoft.Azure.Batch.Common;

namespace BatchAutoPoolSpecificationError
{
    class Program
    {
        static async Task Main(string[] args)
        {
            BatchSharedKeyCredentials credentials = new(
                Environment.GetEnvironmentVariable("BATCH_ACCOUNT_URL"),
                Environment.GetEnvironmentVariable("BATCH_ACCOUNT_NAME"),
                Environment.GetEnvironmentVariable("BATCH_ACCOUNT_KEY"));

            var batchClient = BatchClient.Open(credentials);

            const int NumberOfPoolComputeNodes = 1;
            const int AutoscaleSamplePeriodSeconds = 30;

            PoolSpecification poolSpecification = new()
            {
                AutoScaleEnabled = true,
                AutoScaleEvaluationInterval = TimeSpan.FromMinutes(5),
                AutoScaleFormula = $@"
                    startingNumberOfVMs = {NumberOfPoolComputeNodes};
                    maxNumberofVMs = {NumberOfPoolComputeNodes};
                    pendingTaskSamplePercent = $PendingTasks.GetSamplePercent({AutoscaleSamplePeriodSeconds} * TimeInterval_Second);
                    pendingTaskSamples = pendingTaskSamplePercent < 70 ? startingNumberOfVMs : avg($PendingTasks.GetSample({AutoscaleSamplePeriodSeconds} * TimeInterval_Second));
                    $TargetLowPriorityNodes=min(maxNumberofVMs, pendingTaskSamples);
                    $NodeDeallocationOption = taskcompletion;
                ",
                // TargetLowPriorityComputeNodes = null, // This doesn't help

                VirtualMachineSize = "Standard_D2a_v4",

                VirtualMachineConfiguration = new VirtualMachineConfiguration(
                    new ImageReference(
                        "ubuntu-server-container",
                        "microsoft-azure-batch",
                        "20-04-lts"),
                    "batch.node.ubuntu 20.04")
                {
                    ContainerConfiguration = new ContainerConfiguration()
                },
            };

            var autoPoolSpecification = new AutoPoolSpecification()
            {
                AutoPoolIdPrefix = "temp",
                KeepAlive = false,
                PoolLifetimeOption = PoolLifetimeOption.Job,
                PoolSpecification = poolSpecification
            };

            PoolInformation poolInformation = new()
            {
                AutoPoolSpecification = autoPoolSpecification,
            };

            string jobId = "job-" + Guid.NewGuid().ToString();

            CloudJob unboundJob = batchClient.JobOperations.CreateJob(jobId, poolInformation);

            try
            {
                Console.WriteLine("Comitting job...");
                await unboundJob.CommitAsync();

                CloudTask task = new("t-0", "echo hello")
                {
                    ContainerSettings = new TaskContainerSettings("ubuntu", "--rm")
                };

                Console.WriteLine("Adding task...");
                await batchClient.JobOperations.AddTaskAsync(jobId, new[] { task });

                CloudJob boundJob = await batchClient.JobOperations.GetJobAsync(jobId);

                boundJob.OnAllTasksComplete = OnAllTasksComplete.TerminateJob;
                Console.WriteLine("Comitting...");
                await boundJob.CommitAsync();
                await boundJob.RefreshAsync();
                Console.WriteLine("Done.");
            }
            catch (Exception t)
            {
                Console.WriteLine(t);
                await batchClient.JobOperations.DeleteJobAsync(jobId);
            }
        }
    }
}
Accepted answer
  1. vipullag-MSFT 25,616 Reputation points
    2021-05-11T12:52:37.29+00:00

    @James Thurley

    Firstly, apologies for the delay in responding here and any inconvenience this issue may have caused.

    I checked with the internal team on this; it appears to be a bug that the team is working on fixing.

    However, from your code, the error occurs when committing the update to the job (the "OnAllTasksComplete" change, i.e. the second CommitAsync on the bound job). This is because GetJob returns 0 for the targetLowPriorityNodes field instead of the null it was originally set to. That behavior is harmless when enableAutoScale is false, but it causes an error in your case because autoscale is enabled.

    For now, until the bug is fixed, the workaround is to set targetLowPriorityNodes to null after calling:

    CloudJob boundJob = await batchClient.JobOperations.GetJobAsync(jobId);  
    
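    Something along these lines should work as a temporary fix. This is only a sketch, assuming the bound job exposes the auto-pool settings via PoolInformation.AutoPoolSpecification.PoolSpecification; TargetLowPriorityComputeNodes is the property name from the Microsoft.Azure.Batch object model (the same one commented out in your PoolSpecification):

    CloudJob boundJob = await batchClient.JobOperations.GetJobAsync(jobId);

    // Workaround (sketch): clear the 0 that GetJob echoed back so the next commit
    // does not send targetLowPriorityNodes alongside enableAutoScale = true.
    boundJob.PoolInformation.AutoPoolSpecification.PoolSpecification.TargetLowPriorityComputeNodes = null;

    boundJob.OnAllTasksComplete = OnAllTasksComplete.TerminateJob;
    await boundJob.CommitAsync();
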

    Hope this helps.

    Please 'Accept as answer' if it helped, so that it can help others in the community looking for help on similar topics.

