Calculate blob count and total size per container using Azure Storage inventory

This article uses the Azure Blob Storage inventory feature and Azure Synapse to calculate the blob count and total size of blobs per container. These values are useful when optimizing blob usage per container.

Prerequisites

None

Enable inventory reports

The first step in this method is to enable inventory reports on your storage account. You may have to wait up to 24 hours after enabling inventory reports for your first report to be generated.

When you have an inventory report to analyze, grant yourself read access to the container where the report CSV file resides by assigning yourself the Storage Blob Data Reader role. Be sure to use the email address of the account you're using to run the report. To learn how to assign an Azure role to a user with Azure role-based access control (Azure RBAC), follow the instructions provided in Assign Azure roles using the Azure portal.

Note

To calculate the blob size from the inventory report, make sure to include the Content-Length schema field in your rule definition.

Create an Azure Synapse workspace

Next, create an Azure Synapse workspace where you will execute a SQL query to report the inventory results.

Create the SQL query

After you create your Azure Synapse workspace, do the following steps.

  1. Navigate to https://web.azuresynapse.net.

  2. Select the Develop tab on the left edge.

  3. Select the large plus sign (+) to add an item.

  4. Select SQL script.

    Screenshot of select SQL script to create a new query.

Run the SQL query

Follow the below steps:

  1. Add the following SQL query in your Azure Synapse workspace to read the inventory CSV file.

    For the bulk parameter, use the URL of the inventory report CSV file that you want to analyze.

    SELECT LEFT([Name], CHARINDEX('/', [Name]) - 1) AS Container,
            COUNT(*) As TotalBlobCount,
            SUM([Content-Length]) As TotalBlobSize
    FROM OPENROWSET(
        bulk '<URL to your inventory CSV file>',
        format='csv', parser_version='2.0', header_row=true
    ) AS Source
    GROUP BY LEFT([Name], CHARINDEX('/', [Name]) - 1)
    
  2. Name your SQL query in the properties pane on the right.

  3. Publish your SQL query by pressing CTRL+S or selecting the Publish all button.

  4. Select the Run button to execute the SQL query. The blob count and total size per container are reported in the Results pane.

    Screenshot of output from running the script to calculate blob count and total size.