Monitor Managed DevOps Pools

Managed DevOps Pools provides several options for monitoring your pool instances. The Overview page provides predefined metrics charts, and you can configure custom charts on the Metrics page. Use these tools to monitor the health of your Managed DevOps Pools instances.

Available metrics

Managed DevOps Pools provides the following metrics:

Metric | Description | Unit | Aggregations | Dimensions
AllocationDurationMS | Average pool request duration | Milliseconds | Average | Image, PoolId, ResourceRequestType, Type
Allocated | Number of Azure DevOps Agents with jobs currently running | Count | Average, Min, Max | Images, PoolId, ProviderName, SKU
NotReady | Number of Azure DevOps Agents that are not set up for testing | Count | Average, Min, Max | Images, PoolId, ProviderName, SKU
PendingReimage | Number of Azure DevOps Agents in the process of being reimaged | Count | Average, Min, Max | Images, PoolId, ProviderName, SKU
PendingReturn | [Azure only] Number of Azure DevOps Agents that are post-cleanup, waiting to be deleted (which occurs in batches) | Count | Average, Min, Max | Images, PoolId, ProviderName, SKU
Provisioned | Number of Azure DevOps Agents currently up | Count | Average, Min, Max | Images, PoolId, ProviderName, SKU
Ready | Number of Azure DevOps Agents present that are prepared to accept a job | Count | Average, Min, Max | Images, PoolId, ProviderName, SKU
Starting | Number of Azure DevOps Agents being prepared | Count | Average, Min, Max | Images, PoolId, ProviderName, SKU
Total | Total number of Azure DevOps Agents | Count | Average, Min, Max | Images, PoolId, ProviderName, SKU
Count | Total number of agents provisioned, grouped by status | Count | Count | ErrorCode, FailureStage, PoolId, RequestType, Status, Type
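
As a quick way to check which dimensions apply to a metric before building a chart, the metrics table above can be written as a lookup. This is a minimal sketch: the metric and dimension names are copied from the table, while the helper names `supports_dimension` and `metrics_with_dimension` are hypothetical and not part of any Azure SDK.

```python
# Dimension set shared by the agent-state metrics in the table above.
AGENT_STATE_DIMENSIONS = frozenset({"Images", "PoolId", "ProviderName", "SKU"})

# Metric name -> dimensions listed for it in the table above.
METRIC_DIMENSIONS = {
    "AllocationDurationMS": frozenset(
        {"Image", "PoolId", "ResourceRequestType", "Type"}
    ),
    "Allocated": AGENT_STATE_DIMENSIONS,
    "NotReady": AGENT_STATE_DIMENSIONS,
    "PendingReimage": AGENT_STATE_DIMENSIONS,
    "PendingReturn": AGENT_STATE_DIMENSIONS,
    "Provisioned": AGENT_STATE_DIMENSIONS,
    "Ready": AGENT_STATE_DIMENSIONS,
    "Starting": AGENT_STATE_DIMENSIONS,
    "Total": AGENT_STATE_DIMENSIONS,
    "Count": frozenset(
        {"ErrorCode", "FailureStage", "PoolId", "RequestType", "Status", "Type"}
    ),
}


def supports_dimension(metric: str, dimension: str) -> bool:
    """Return True if the table lists `dimension` for `metric`."""
    return dimension in METRIC_DIMENSIONS.get(metric, frozenset())


def metrics_with_dimension(dimension: str) -> list[str]:
    """Return the sorted names of all metrics that list `dimension`."""
    return sorted(
        name for name, dims in METRIC_DIMENSIONS.items() if dimension in dims
    )
```

For example, `supports_dimension("Count", "Status")` is true, while `supports_dimension("Ready", "Status")` is false, because Status is listed only for the Count metric.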

Filtering and splitting

Azure Monitor supports filtering and splitting for metrics that have dimensions. Managed DevOps Pools provides the following dimensions. See the previous table for a list of which dimensions apply to a particular metric.

Dimension | Description
Image | Image name
Images | List of images
PoolId | Name of Managed DevOps Pool
ProviderName | CI/CD provider (AzureProvider is currently the only provider)
ResourceRequestType |
SKU | VM size
Type |
ErrorCode | One of the error codes listed in Error codes
FailureStage |
RequestType |
Status | Agent status

Filtering lets you choose which dimension values are included in the chart. For example, you might want to show only successful requests when you chart the Count metric (total number of agents provisioned). To do so, apply a filter on the Status dimension.

Splitting controls whether the chart displays separate lines for each value of a dimension or aggregates the values into a single line. Splitting allows you to visualize how different segments of the metric compare with each other. For example, you can see one line for the average AllocationDurationMS across all pools, or you can see separate lines for each pool.
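
The difference between filtering and splitting can be illustrated on a handful of in-memory data points. This is only a sketch of the concept, not how Azure Monitor is implemented: the sample values, the pool names `pool-a` and `pool-b`, and the helpers `filter_points` and `split_average` are all hypothetical. The Status values `Completed` and `Failed` are the ones shown on the Pool Provisioning Health chart.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical data points; real values come from Azure Monitor.
SAMPLES = [
    {"metric": "Count", "PoolId": "pool-a", "Status": "Completed", "value": 5},
    {"metric": "Count", "PoolId": "pool-a", "Status": "Failed", "value": 1},
    {"metric": "Count", "PoolId": "pool-b", "Status": "Completed", "value": 3},
    {"metric": "AllocationDurationMS", "PoolId": "pool-a", "value": 1200},
    {"metric": "AllocationDurationMS", "PoolId": "pool-a", "value": 800},
    {"metric": "AllocationDurationMS", "PoolId": "pool-b", "value": 400},
]


def filter_points(points, metric, **dimension_values):
    """Filtering: keep points of `metric` whose dimensions match the given values."""
    return [
        p
        for p in points
        if p["metric"] == metric
        and all(p.get(dim) == val for dim, val in dimension_values.items())
    ]


def split_average(points, dimension):
    """Splitting: average the values separately for each value of `dimension`."""
    groups = defaultdict(list)
    for p in points:
        groups[p[dimension]].append(p["value"])
    return {key: mean(values) for key, values in groups.items()}


# Filter: only successful provisioning requests for the Count metric.
completed = filter_points(SAMPLES, "Count", Status="Completed")
completed_total = sum(p["value"] for p in completed)

# Split: one aggregated line versus one line per pool.
durations = filter_points(SAMPLES, "AllocationDurationMS")
overall_average = mean(p["value"] for p in durations)
per_pool_average = split_average(durations, "PoolId")
```

With this sample data, the single aggregated line hides the fact that `pool-a` allocates more slowly than `pool-b`, which is the kind of comparison that splitting by PoolId makes visible.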

For more information, see Analyze Metrics, Use dimension filters and splitting.

View metrics on the Managed DevOps Pool Overview

The Overview page for your Managed DevOps Pool contains the following predefined metrics charts, which can be set to display metrics for the past hour, day, 7 days, or 30 days.

You can customize the charts or create your own. For more information, see Analyze metrics, Create a metric chart.

Pool Usage chart

The Pool Usage chart displays the following metrics.

  • Starting: Count of agents starting up and preparing to accept jobs.
  • Ready: Count of agents online and ready to accept jobs.
  • Allocated: Count of agents currently running jobs.
  • NotReady: Count of stateful agents that have completed a job but are not yet ready to accept a new job.
  • PendingReimage: Count of agents that have completed a job and are preparing to be reimaged. This status is typical if you have your pool configured for stateless agents with standby agent mode enabled.
  • PendingReturn: [Azure only] Count of agents that are post-cleanup and waiting to be deleted (deletion occurs in batches).
  • Provisioned: Count of online agents.
  • Total: Total number of agents.

Pool Provisioning Health chart

The Pool Provisioning Health chart displays the following metrics.

  • Count - Total number of agents provisioned, grouped by status (Completed/Failed)

Request Durations chart

The Request Durations chart displays the following metrics.

  • AllocationDurationMS - Average pool request duration

Failure Stages chart

The Failure Stages chart displays the following metrics.

  • Count - Total number of agents that failed to provision, grouped by FailureStage

Error Codes chart

The Error Codes chart displays the following metrics.

  • Count - Total number of agents that failed to provision, grouped by ErrorCode

For a list of error codes, see the following Error codes section.

Error codes

Error code | Error message
AzureInternalServerError | The VM allocation failed due to an internal error. Retry later or try deploying to a different location.
ClusterOutOfCapacity | Allocation failed. Note that allocation for this subscription is constrained to a set of clusters, which may be out of capacity. To remove the cluster constraint, contact the subscription administrator or Microsoft Support. Read more about improving likelihood of allocation success at https://aka.ms/allocation-guidance.
CustomScriptError | VM reported a failure when processing extension 'customScript' (publisher 'Microsoft.Compute' and type 'CustomScriptExtension'). Error message: 'Finished executing command'. More information on troubleshooting is available at https://aka.ms/VMExtensionCSEWindowsTroubleshoot.
DiskProcessingTimeout | The processing of VM '...' is halted because of one or more disk processing errors encountered by VM '...' in the same Availability Set. Resolve the error with VM '...' before retrying the operation. For more information, refer to https://aka.ms/activitylog.
EndpointNotFound | 404 - There are no listeners connected for the endpoint. TrackingId:00000000-0000-0000-0000-0000000000, SystemTracker:tipresourceprovider.servicebus.windows.net:tipresourceproviderconnection/pools/es_tap_prime_cus_d4ds, Timestamp:2024-02-15T21:15:57
ExceedingQuota | Quota exceeded.
FailedToRetrieveUserPassword | Failed to retrieve user password ... from Key Vault
ForbiddenByFirewall | Forbidden
HTTPResponseBodyNotAvailable | HTTP response body isn't available
ImageNotFound | The image could not be found. Check the image and the version exists
ImageRemovedFromPool | The given key was not present in the dictionary
ImageThrottling | Too many simultaneous copy requests from a snapshot or image resource. Retry later.
InstallationOfWindowsUndeployable | OS provisioning for VM failed. Error details: This installation of Windows is undeployable. Make sure the image is properly prepared (generalized). Instructions for Windows: https://azure.microsoft.com/documentation/articles/virtual-machines-windows-upload-image/
InsufficientCapacity | Allocation failed. We do not have sufficient capacity for the requested VM size in this region. Read more about improving likelihood of allocation success at https://aka.ms/allocation-guidance
InvalidSubnetDelegation | Subnet /subscriptions/resourceGroups/SqlClientDrivers/providers/Microsoft.Network/virtualNetworks/SqlClientDrivers-vNet/subnets/Managed-Instance-pool referenced by /subscriptions/resourceGroups/Managed-Instance-pool/providers/Microsoft.Compute/virtualMachineScaleSets//updateGroups/version1/networkInterfaceConfigurations/nic/ipConfigurations/ipconfig can't be used because it contains external resources.
NetworkProfileProcessingTimeout | An unexpected error occurred while processing the network profile of the VM. Retry later.
ProvisioningTimeOut | Resource /subscriptions//resourceGroups//providers/Microsoft.Network/networkInterfaces/providers/Microsoft.Compute/virtualMachineScaleSets//virtualMachines/networkInterfaces/nic not found. OS Provisioning for VM did not finish in the allotted time. The VM may still finish provisioning successfully. Check provisioning state later. Also, make sure the image has been properly prepared (generalized). Instructions for Windows: https://azure.microsoft.com/documentation/articles/virtual-machines-windows-upload-image/ Instructions for Linux: https://azure.microsoft.com/documentation/articles/virtual-machines-linux-capture-image/ If you are deploying more than 20 Virtual Machines concurrently, consider moving your custom image to shared image gallery. Refer to https://aka.ms/movetosig for the same.
RemoteNameCantBeResolved |
ResourceGroupBeingDeleted | The resource group ... is in deprovisioning state and can't perform this operation.
SecretDisabled | Operation get isn't allowed on a disabled secret. Status: 403 (Forbidden) ErrorCode: Forbidden
ServiceUnavailable | The service is unavailable now. Retry the request later.
SkuNotAvailable | The requested VM size for resource 'Following SKUs failed for Capacity Restrictions:' is currently not available in location. Try another size or deploy to a different location or different zone. See https://aka.ms/azureskunotavailable for details.
TaskCanceled | The request was canceled due to the configured HttpClient.Timeout of 100 seconds elapsing.
VirtualNetworkIsNotFound | The Virtual Network might be deleted.
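
When reviewing the Error Codes chart, it can help to separate failures whose message in the table above explicitly suggests retrying later from those that likely need a configuration change. The following is a rough sketch under that assumption: the grouping is only a reading of the message text, not an official classification, and the helper `summarize_failures` and the sample input are hypothetical.

```python
from collections import Counter

# Error codes whose message in the table above suggests retrying later.
# This grouping is an assumption based on the message wording only.
RETRY_SUGGESTED = frozenset(
    {
        "AzureInternalServerError",
        "ImageThrottling",
        "NetworkProfileProcessingTimeout",
        "ServiceUnavailable",
    }
)


def summarize_failures(error_codes):
    """Count failures per error code, split by whether a retry is suggested."""
    counts = Counter(error_codes)
    return {
        "retry_suggested": {
            code: n for code, n in counts.items() if code in RETRY_SUGGESTED
        },
        "needs_investigation": {
            code: n for code, n in counts.items() if code not in RETRY_SUGGESTED
        },
    }


# Hypothetical list of ErrorCode values observed for failed provisioning requests.
observed = ["ImageThrottling", "ExceedingQuota", "ImageThrottling", "ServiceUnavailable"]
summary = summarize_failures(observed)
```

Here `summary["retry_suggested"]` counts two ImageThrottling failures and one ServiceUnavailable failure, while the ExceedingQuota failure lands in `summary["needs_investigation"]`.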
