Estimate the cost of archiving data
The archive tier is an offline tier for storing data that is rarely accessed. The archive access tier has the lowest storage cost. However, retrieving data from this tier costs more and takes longer than retrieving data from the hot, cool, and cold tiers.
This article explains how to calculate the cost of using archive storage and then presents a few example scenarios.
Calculate costs
The cost to archive data is derived from these three components:
- Cost to write data to the archive tier
- Cost to store data in the archive tier
- Cost to rehydrate data from the archive tier
The following sections show you how to calculate each component.
This article uses fictitious prices in all calculations. You can find these sample prices in the Sample prices section at the end of this article. These prices are meant only as examples, and shouldn't be used to calculate your costs.
For official prices, see Azure Blob Storage pricing or Azure Data Lake Storage pricing. For more information about how to choose the correct pricing page, see Understand the full billing model for Azure Blob Storage.
The cost to write
You can calculate the cost of writing to the archive tier by multiplying the number of write operations by the price of each operation. The price of each operation depends on which operation you use to write data to the archive tier.
Put Blob
If you use the Put Blob operation, then the number of operations is the same as the number of blobs. For example, if you plan to write 30,000 blobs to the archive tier, then that requires 30,000 operations. Each operation is charged the price of an archive write operation.
Tip
Operations are billed per 10,000. Therefore, if the price per 10,000 operations is $0.10, then the price of a single operation is $0.10 / 10,000 = $0.00001.
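The per-operation arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not an official calculator: the function names are hypothetical, and the price is the fictitious archive write price ($0.11 per 10,000) from the Sample prices section.

```python
# Fictitious sample price from this article: archive write operations, per 10,000.
ARCHIVE_WRITE_PER_10K = 0.11


def price_per_operation(price_per_10k: float) -> float:
    """Convert a price quoted per 10,000 operations to a single-operation price."""
    return price_per_10k / 10_000


def put_blob_write_cost(blob_count: int,
                        price_per_10k: float = ARCHIVE_WRITE_PER_10K) -> float:
    """Put Blob uses one archive write operation per blob."""
    return blob_count * price_per_operation(price_per_10k)


cost_30k = put_blob_write_cost(30_000)
print(f"${cost_30k:.2f}")  # $0.33
```

Replace the default price with the official price for your region and redundancy option before using the result for planning.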
Put Block and Put Block List
If you upload a blob by using the Put Block and Put Block List operations, then an upload requires multiple operations, and each of those operations is charged separately. Each Put Block operation is charged at the price of a write operation for the account's default access tier. The number of Put Block operations that you need depends on the block size that you specify when you upload the data. For example, if the blob size is 100 MiB and you set the block size to 10 MiB when you upload that blob, you would use 10 Put Block operations. Blocks are written (committed) to the archive tier by using the Put Block List operation. That operation is charged the price of an archive write operation. Therefore, assuming that the account's default access tier is hot, the cost to upload a single blob is (number of blocks * price of a hot write operation) + price of an archive write operation.
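The block upload formula can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, the account's default access tier is assumed to be hot, and both prices are the fictitious values from the Sample prices section.

```python
# Fictitious sample prices from this article (per 10,000 operations).
HOT_WRITE_PER_10K = 0.055      # assumes the account's default access tier is hot
ARCHIVE_WRITE_PER_10K = 0.11


def block_upload_cost(blob_size_mib: int, block_size_mib: int,
                      default_write_per_10k: float = HOT_WRITE_PER_10K,
                      archive_write_per_10k: float = ARCHIVE_WRITE_PER_10K):
    """Return (number of Put Block operations, total write cost) for one blob."""
    # Ceiling division: a partial final block still needs its own Put Block call.
    blocks = (blob_size_mib + block_size_mib - 1) // block_size_mib
    put_block_cost = blocks * default_write_per_10k / 10_000
    # A single Put Block List call commits the blocks to the archive tier.
    put_block_list_cost = archive_write_per_10k / 10_000
    return blocks, put_block_cost + put_block_list_cost


blocks, cost = block_upload_cost(blob_size_mib=100, block_size_mib=10)
print(blocks, f"{cost:.6f}")  # 10 0.000066
```

A larger block size reduces the number of Put Block operations, and with it the write cost per blob.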
Note
If you're not using an SDK or the REST API directly, you might have to investigate which operations your data transfer tool uses to upload files. You might be able to determine this by reaching out to the tool provider or by using storage logs.
Set Blob Tier
If you use the Set Blob Tier operation to move a blob from the cool, cold, or hot tier to the archive tier, you're charged the price of an archive write operation.
The cost to store
You can calculate the storage costs by multiplying the size of the data in GB by the price of archive storage.
For example (assuming the sample pricing), if you plan to store 10 TB (10 * 1,024 GB) in the archive tier, the capacity cost is $0.002 * 10 * 1,024 = $20.48 per month.
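As a quick check of the storage arithmetic, here is a minimal Python sketch. The function name is hypothetical, and the price is the fictitious archive storage price per GB per month from the Sample prices section.

```python
# Fictitious sample price from this article: archive storage, per GB per month.
ARCHIVE_STORAGE_PER_GB = 0.002


def monthly_storage_cost(size_gb: float,
                         price_per_gb: float = ARCHIVE_STORAGE_PER_GB) -> float:
    """Capacity cost for one month of storage."""
    return size_gb * price_per_gb


size_gb = 10 * 1024  # 10 TB expressed in GB
print(f"${monthly_storage_cost(size_gb):.2f} per month")  # $20.48 per month
```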
The cost to rehydrate
Blobs in the archive tier are offline and can't be read or modified. To read or modify data in an archived blob, you must first rehydrate the blob to an online tier (either the hot, cool, or cold tier).
You can calculate the cost to rehydrate data by adding the cost to retrieve data to the cost of reading the data.
Assuming sample pricing, the cost of retrieving 1 GB of data from the archive tier would be 1 * $0.022 = $0.022.
Read operations are billed per 10,000. Therefore, if the cost per 10,000 operations is $5.50, then the cost of a single operation is $5.50 / 10,000 = $0.00055. The cost of reading 1,000 blobs at standard priority is 1,000 * $0.00055 = $0.55.
In this example, the total cost to rehydrate (retrieving + reading) would be $0.022 + $0.55 = $0.572, or about $0.57.
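The rehydration calculation (retrieval cost plus read cost) can be sketched as follows. The class and function names are hypothetical. The standard and high priority prices are the fictitious values from the Sample prices section, and the inputs (1,024 GB and 20,000 read operations) are borrowed from the one-time backup scenario later in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RehydrationPrices:
    retrieval_per_gb: float   # price of data retrieval, per GB
    read_per_10k: float       # price of read operations, per 10,000


# Fictitious sample prices from this article.
STANDARD = RehydrationPrices(retrieval_per_gb=0.022, read_per_10k=5.50)
HIGH = RehydrationPrices(retrieval_per_gb=0.13, read_per_10k=65.00)


def rehydration_cost(retrieved_gb: float, read_operations: int,
                     prices: RehydrationPrices = STANDARD) -> float:
    """Cost to rehydrate = cost to retrieve the data + cost to read it."""
    retrieve = retrieved_gb * prices.retrieval_per_gb
    read = read_operations * prices.read_per_10k / 10_000
    return retrieve + read


standard = rehydration_cost(1_024, 20_000)
high = rehydration_cost(1_024, 20_000, HIGH)
print(f"standard: ${standard:.2f}, high priority: ${high:.2f}")
# standard: $33.53, high priority: $263.12
```

With these sample prices, high priority rehydration costs several times more than standard priority for the same data.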
Note
If you set the rehydration priority to high, then the prices of data retrieval and read operations increase.
If you plan to rehydrate data, you should try to avoid an early deletion fee. To review your options, see Blob rehydration from the archive tier.
Scenario: One-time data backup
This scenario assumes that you plan to remove on-premises tapes or file servers by migrating backup data to cloud storage. If you don't expect users to access that data often, then it might make sense to migrate that data directly to the archive tier. In the first month, you'd assume the cost of writing data to the archive tier. In the remaining months, you'd pay only for the cost to store the data and the cost to rehydrate data as needed for the occasional read operation.
Using the Sample prices that appear in this article, the following table demonstrates three months of spending.
This scenario assumes an initial ingest of 2,000,000 files totaling 102,400 GB in size to archive. It also assumes a one-time read each month of about 1% of archived capacity. The operation used in this scenario is the Put Blob operation. This scenario also assumes that blobs are rehydrated by copying blobs instead of changing the blob's access tier.
Cost factor | January | February | March | Projected annual |
---|---|---|---|---|
Write operations | 2,000,000 | 0 | 0 | 2,000,000 |
Price of a single write operation | $0.000011 | $0.000011 | $0.000011 | $0.000011 |
Cost to write (operations * price of a write operation) | $22.00 | $0.00 | $0.00 | $22.00 |
Total file size (GB) | 102,400 | 102,400 | 102,400 | 1,228,800 |
Data prices (pay-as-you-go) | $0.002 | $0.002 | $0.002 | $0.002 |
Cost to store (file size * data price) | $204.80 | $204.80 | $204.80 | $2,457.60 |
Data retrieval size (1% of file size) | 1,024 | 1,024 | 1,024 | 12,288 |
Price of data retrieval | $0.022 | $0.022 | $0.022 | $0.022 |
Cost to retrieve (data retrieval size * price of retrieval) | $22.53 | $22.53 | $22.53 | $270.34 |
Number of read operations (File count * 1%) | 20,000 | 20,000 | 20,000 | 240,000 |
Price of a single read operation | $0.00055 | $0.00055 | $0.00055 | $0.00055 |
Cost to read (operations * price of a read operation) | $11.00 | $11.00 | $11.00 | $132.00 |
Cost to rehydrate (cost to retrieve + cost to read) | $33.53 | $33.53 | $33.53 | $402.34 |
Total cost (write + storage + rehydrate) | $260.33 | $238.33 | $238.33 | $2,881.94 |
Tip
To model costs over 12 months, open the One-Time Backup tab of this workbook. You can update the prices and values in that worksheet to estimate your costs.
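As an alternative to a worksheet, the one-time backup scenario can be modeled in Python. This is a sketch under the scenario's stated assumptions (2,000,000 files, 102,400 GB, 1% read each month, Put Blob writes in the first month only). The constant and function names are hypothetical, and the prices are the fictitious sample prices.

```python
FILES = 2_000_000
SIZE_GB = 102_400
READ_FRACTION = 0.01            # about 1% of archived capacity read each month

# Fictitious sample prices from this article.
WRITE_PRICE = 0.11 / 10_000     # single archive write operation
STORAGE_PRICE = 0.002           # per GB per month
RETRIEVAL_PRICE = 0.022         # per GB retrieved
READ_PRICE = 5.50 / 10_000      # single archive read operation


def month_cost(month_index: int) -> float:
    """Total cost for a month; month_index 0 is the month of the initial ingest."""
    write = FILES * WRITE_PRICE if month_index == 0 else 0.0
    store = SIZE_GB * STORAGE_PRICE
    retrieve = SIZE_GB * READ_FRACTION * RETRIEVAL_PRICE
    read = FILES * READ_FRACTION * READ_PRICE
    return write + store + retrieve + read


monthly = [month_cost(m) for m in range(12)]
annual = sum(monthly)
print(f"January: ${monthly[0]:.2f}")   # January: $260.33
print(f"February: ${monthly[1]:.2f}")  # February: $238.33
print(f"Annual: ${annual:,.2f}")       # Annual: $2,881.94
```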
Scenario: Continuous tiering
This scenario assumes that you plan to periodically move data to the archive tier. Perhaps you're using Blob Storage inventory reports to gauge which blobs are accessed less frequently, and then using lifecycle management policies to automate the archival process.
Each month, you'd assume the cost of writing to the archive tier. The cost to store and then rehydrate data would increase over time as you archive more blobs.
Using the Sample prices that appear in this article, the following table demonstrates three months of spending.
This scenario assumes a monthly ingest of 200,000 files totaling 10,240 GB in size to archive. It also assumes a one-time read each month of about 1% of archived capacity. The operation used in this scenario is the Put Blob operation.
Cost factor | January | February | March | Projected annual |
---|---|---|---|---|
Write operations | 200,000 | 200,000 | 200,000 | 2,400,000 |
Price of a single write operation | $0.000011 | $0.000011 | $0.000011 | |
Cost to write (operations * price of a write operation) | $2.20 | $2.20 | $2.20 | $26.40 |
Number of files | 200,000 | 400,000 | 600,000 | 2,400,000 |
Total file size (GB) | 10,240 | 20,480 | 30,720 | 122,880 |
Data prices (pay-as-you-go) | $0.002 | $0.002 | $0.002 | |
Cost to store (file size * data price) | $20.48 | $40.96 | $61.44 | $1,597.44 |
Data retrieval size (1% of file size) | 102 | 205 | 307 | 7,987 |
Price of data retrieval | $0.022 | $0.022 | $0.022 | |
Cost to retrieve (data retrieval size * price of retrieval) | $2.25 | $4.51 | $6.76 | $175.72 |
Number of read operations (File count * 1% storage read) | 2,000 | 4,000 | 6,000 | 156,000 |
Price of a single read operation | $0.00055 | $0.00055 | $0.00055 | |
Cost to read (operations * price to read) | $1.10 | $2.20 | $3.30 | $85.80 |
Cost to rehydrate (cost to retrieve + cost to read) | $3.35 | $6.71 | $10.06 | $261.52 |
Total cost | $26.03 | $49.87 | $73.70 | $1,885.36 |
Tip
To model costs over 12 months, open the Continuous Tiering tab of this workbook. You can update the prices and values in that worksheet to estimate your costs.
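The continuous tiering scenario can be modeled the same way in Python. This sketch assumes that the archived volume grows by 200,000 files and 10,240 GB each month and that 1% of the accumulated capacity is read each month. The names are hypothetical, and the prices are the fictitious sample prices.

```python
FILES_PER_MONTH = 200_000
GB_PER_MONTH = 10_240
READ_FRACTION = 0.01            # about 1% of archived capacity read each month

# Fictitious sample prices from this article.
WRITE_PRICE = 0.11 / 10_000     # single archive write operation
STORAGE_PRICE = 0.002           # per GB per month
RETRIEVAL_PRICE = 0.022         # per GB retrieved
READ_PRICE = 5.50 / 10_000      # single archive read operation


def month_cost(month_number: int) -> float:
    """Total cost for a month; month_number starts at 1 (January)."""
    files = FILES_PER_MONTH * month_number      # files archived so far
    size_gb = GB_PER_MONTH * month_number       # capacity archived so far
    write = FILES_PER_MONTH * WRITE_PRICE       # only the new files are written
    store = size_gb * STORAGE_PRICE
    retrieve = size_gb * READ_FRACTION * RETRIEVAL_PRICE
    read = files * READ_FRACTION * READ_PRICE
    return write + store + retrieve + read


monthly = [month_cost(m) for m in range(1, 13)]
for name, cost in zip(("January", "February", "March"), monthly):
    print(f"{name}: ${cost:.2f}")
# January: $26.03
# February: $49.87
# March: $73.70
print(f"Annual: ${sum(monthly):,.2f}")  # Annual: $1,885.36
```

Because storage and rehydration scale with the accumulated volume, the monthly cost grows roughly linearly while the write cost stays flat.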
Archive versus cold and cool
Archive storage is the lowest cost tier. However, it can take up to 15 hours to rehydrate 10-GiB files. To learn more, see Blob rehydration from the archive tier. The archive tier might not be the best fit if your workloads must read data quickly. The cool tier offers near real-time read latency at a lower price than that of the hot tier. Understanding your access requirements helps you to choose between the cool, cold, and archive tiers.
The following table compares the cost of archive storage with the cost of cool and cold storage by using the Sample prices that appear in this article. This scenario assumes a monthly ingest of 200,000 files totaling 10,240 GB in size to archive. It also assumes one read each month of about 10% of stored capacity (1,024 GB) and 10% of total operations (20,000).
Cost factor | Archive | Cold | Cool |
---|---|---|---|
Write operations | 200,000 | 200,000 | 200,000 |
Price of a single write operation | $0.000011 | $0.000018 | $0.00001 |
Cost to write (operations * price of a write operation) | $2.20 | $3.60 | $2.00 |
Total number of files | 200,000 | 200,000 | 200,000 |
Total file size (GB) | 10,240 | 10,240 | 10,240 |
Data prices (pay-as-you-go) | $0.0020 | $0.0045 | $0.0115 |
Cost to store (file size * data price) | $20.48 | $46.08 | $117.76 |
Data retrieval size (10% of file size) | 1,024 | 1,024 | 1,024 |
Price of data retrieval per GB | $0.022 | $0.03 | $0.01 |
Number of read operations (file count * 10% storage read) | 20,000 | 20,000 | 20,000 |
Price of a single read operation | $0.00055 | $0.00001 | $0.000001 |
Cost to read (operations * price to read) | $11.00 | $0.20 | $0.02 |
Cost to rehydrate (cost to retrieve + cost to read) | $33.53 | $30.92 | $10.26 |
Monthly cost (write + storage + rehydrate) | $56.21 | $80.60 | $130.02 |
Tip
To model your costs, open the Choose Tiers tab of this workbook. You can update the prices and values in that worksheet to estimate your costs.
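The tier comparison can also be sketched in Python by summing the write, storage, and rehydration components for each tier. The class and function names are hypothetical, the prices are the fictitious sample prices, and the inputs follow the scenario above (200,000 files, 10,240 GB, 10% read each month).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierPrices:
    write_per_10k: float      # write operations, per 10,000
    read_per_10k: float       # read operations, per 10,000
    retrieval_per_gb: float   # data retrieval, per GB
    storage_per_gb: float     # storage, per GB per month


# Fictitious sample prices from this article.
SAMPLE_PRICES = {
    "archive": TierPrices(0.11, 5.50, 0.022, 0.002),
    "cold": TierPrices(0.18, 0.10, 0.03, 0.0045),
    "cool": TierPrices(0.10, 0.01, 0.01, 0.0115),
}


def monthly_cost(prices: TierPrices, files: int, size_gb: float,
                 read_fraction: float) -> dict:
    """Break one month of cost into write, storage, and rehydration components."""
    write = files * prices.write_per_10k / 10_000
    store = size_gb * prices.storage_per_gb
    retrieve = size_gb * read_fraction * prices.retrieval_per_gb
    read = files * read_fraction * prices.read_per_10k / 10_000
    rehydrate = retrieve + read
    return {"write": write, "store": store, "rehydrate": rehydrate,
            "total": write + store + rehydrate}


costs = {tier: monthly_cost(p, files=200_000, size_gb=10_240, read_fraction=0.10)
         for tier, p in SAMPLE_PRICES.items()}
for tier, parts in costs.items():
    print(f"{tier}: " + ", ".join(f"{k}=${v:.2f}" for k, v in parts.items()))
```

Changing `read_fraction` shows how quickly the archive tier's rehydration cost grows compared with the cool and cold tiers as more data is read.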
The following chart shows the impact on monthly spending given various read percentages. This chart assumes a monthly ingest of 1,000,000 files totaling 10,240 GB in size. Assuming sample pricing, this chart shows a break-even point at or around the 25% read level. After that level, the cost of archive storage begins to rise relative to the cost of cool storage.
Sample prices
The following table includes sample (fictitious) prices for each request to the Blob Service endpoint (blob.core.windows.net).
Important
These prices are meant only as examples, and shouldn't be used to calculate your costs. For official prices, see the Azure Blob Storage pricing or Azure Data Lake Storage pricing pages. For more information about how to choose the correct pricing page, see Understand the full billing model for Azure Blob Storage.
Price factor | Hot | Cool | Cold | Archive |
---|---|---|---|---|
Price of write operations (per 10,000) | $0.055 | $0.10 | $0.18 | $0.11 |
Price of read operations (per 10,000) | $0.0044 | $0.01 | $0.10 | $5.50 |
List and container operations (per 10,000) | $0.055 | $0.055 | $0.065 | $0.055 |
All other operations (per 10,000) | $0.0044 | $0.0044 | $0.0052 | $0.0044 |
Price of data retrieval (per GB) | Free | $0.01 | $0.03 | $0.022 |
Price of Data storage first 50 TB (pay-as-you-go) | $0.0208 | $0.0115 | $0.0045 | $0.002 |
Price of Data storage next 450 TB (pay-as-you-go) | $0.020 | $0.0115 | $0.0045 | $0.002 |
Price of 100 TB (One-year reserved capacity) | $1,747 | $966 | Not available | $183 |
Price of 100 TB (Three-year reserved capacity) | $1,406 | $872 | Not available | $168 |
Network bandwidth between regions within North America (per GB) | $0.02 | $0.02 | $0.02 | $0.02 |
Price of high priority read operations (per 10,000) | Not applicable | Not applicable | Not applicable | $65.00 |
Price of high priority data retrieval (per GB) | Not applicable | Not applicable | Not applicable | $0.13 |
The following table includes sample (fictitious) prices for each request to the Data Lake Storage endpoint (dfs.core.windows.net). For official prices, see Azure Data Lake Storage pricing.
Price factor | Hot | Cool | Cold | Archive |
---|---|---|---|---|
Price of write operations (every 4 MiB, per 10,000) | $0.0715 | $0.13 | $0.234 | $0.143 |
Price of read operations (every 4 MiB, per 10,000) | $0.0057 | $0.013 | $0.13 | $7.15 |
Iterative write operations (per 100) | $0.0715 | $0.0715 | $0.0715 | $0.0715 |
Iterative read operations (per 10,000) | $0.0715 | $0.0715 | $0.0845 | $0.0715 |
Price of data retrieval (per GB) | Free | $0.01 | $0.03 | $0.022 |
Network bandwidth between regions within North America (per GB) | $0.02 | $0.02 | $0.02 | $0.02 |
Data storage prices first 50 TB (pay-as-you-go) | $0.021 | $0.012 | $0.0045 | $0.002 |
Data storage prices next 450 TB (pay-as-you-go) | $0.020 | $0.012 | $0.0045 | $0.002 |
Price of 100 TB (One-year reserved capacity) | $1,747 | $966 | Not available | $183 |
Price of 100 TB (Three-year reserved capacity) | $1,406 | $872 | Not available | $168 |
Price of high priority read operations (per 10,000) | Not applicable | Not applicable | Not applicable | $84.50 |
Price of high priority data retrieval (per GB) | Not applicable | Not applicable | Not applicable | $0.13 |
Index (GB / month) | $0.0297 | Not applicable | Not applicable | Not applicable |