What is the capacity for storing data in an Azure Data Explorer database?

tevin.sales 20 Reputation points
2024-02-15T10:58:53.3833333+00:00

How does storage work with Azure Data Explorer in terms of capacity?


For example: In the following document it is described as separating computing and storage spaces. https://video2.skills-academy.com/en-us/azure/data-explorer/how-it-works#data-storage


The Data Explorer pricing page mentions compute and storage SKUs as well. However, what confuses me is that the SKUs mention cached storage. Does this mean that the maximum capacity of the cluster itself is only "X" GB of storage? https://azure.microsoft.com/en-us/pricing/details/data-explorer/

This thread mentions that the storage is limitless, because the database in the cluster is kept as persistent data in blob storage. https://stackoverflow.com/questions/70984159/what-is-the-storage-limit-in-azure-data-explorer-and-what-does-it-depend-on


How is data actually saved/ingested with ADX? And how is this priced? For example, what if I need to save 40 TB of data in ADX, but the highest "Storage" SKU is 7 TB?

Azure Data Explorer
An Azure data analytics service for real-time analysis on large volumes of data streaming from sources including applications, websites, and internet of things devices.

Accepted answer
  1. Sander van de Velde | MVP 30,711 Reputation points MVP
    2024-02-16T12:47:36.99+00:00

    Hello @tevin.sales ,

    welcome to this moderated Azure community forum.

    I followed the discussion and I understand the different 'storage' solutions are a bit confusing.

    I will try to make it a bit more insightful by oversimplifying Azure Data Explorer (ADX).

    Most importantly, ADX is a database optimized for time-series data. You can store 'time-related facts and observations' as rows in tables, on an insert-only basis.

    ADX is a PaaS service. This means a lot of technical implications are managed for you. So you only need to know the concept and start using it.

    The actual storage is in an Azure Storage account. You cannot access it; it is managed by Microsoft. The data in there is compressed (zipped) into separate files in a very smart way, so it is both efficient and very fast to access for querying. This storage is also very reliable, and you can store gigabytes, terabytes, etc. of data.

    To ingest data, a service is added to ADX. Again, it's managed by Microsoft. You only get the endpoint, an API and some query tooling to get that going.

    To access and query the ingested data, you need to manage the number of 'VMs' in the ADX cluster. You can specify both the size and the number of VMs. Again, these VMs are managed by Microsoft. You just get some tooling, like a query editor, access rights management, a monitoring portal, a chart dashboard, etc.

    Each VM has a combination of disk space and (memory) cache, so you can optimize the configuration for query speed (compute) or for storage.

    If you query your older data, say data that is five years old, it will be read from the storage account and put on the disks. This is the normal way to use the data, and access is still very fast.

    But if you only want to query data that is a few months old, you can make use of the cache. A copy of recent incoming data is kept accessible in the cache of the VMs (the hot cache), and this makes querying extremely fast.
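    The difference between everything that is stored and the part that is kept in the hot cache can be sketched with some simple arithmetic. This is only an illustration: the function name, the constant daily ingest volume, and the retention and cache periods are assumptions chosen for the example, not values taken from ADX.

```python
def estimate_volumes_gb(daily_ingest_gb: float, retention_days: int, hot_days: int) -> dict:
    """Rough estimate, assuming a constant (compressed) ingest volume per day.

    All names and numbers here are hypothetical and only illustrate the idea
    that the hot cache holds a recent subset of the persistent data.
    """
    # The hot window can never be longer than the retention period.
    hot_days = min(hot_days, retention_days)
    persistent_gb = daily_ingest_gb * retention_days  # everything kept in managed storage
    hot_cache_gb = daily_ingest_gb * hot_days         # recent copy kept on the cluster VMs
    return {
        "persistent_gb": persistent_gb,
        "hot_cache_gb": hot_cache_gb,
        "cold_only_gb": persistent_gb - hot_cache_gb,
    }


# Example: 10 GB/day, kept for five years, with the last 90 days in the hot cache.
estimate = estimate_volumes_gb(daily_ingest_gb=10, retention_days=5 * 365, hot_days=90)
print(estimate)
```

    In this example only a small part of the data needs to fit on the VMs; the rest lives only in the managed storage account and is read on demand.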

    Please check the documentation of ADX and start using it for free with the free tier of Azure Data Explorer so you learn about the possibilities.


    If the response helped, do "Accept Answer". If it doesn't work, please let us know the progress. All community members with similar issues will benefit by doing so. Your contribution is highly appreciated.

    1 person found this answer helpful.

1 additional answer

  1. BhargavaGunnam-MSFT 28,526 Reputation points Microsoft Employee
    2024-02-15T20:50:07.0266667+00:00

    Hello tevin.sales,

    ADX charges an hourly price based on the VM's size and operating system. The VM size determines factors such as processing power, memory, and storage capacity.

    ADX uses Azure Blob Storage as its underlying storage. This means that the data is stored as blobs in Azure Blob Storage, and ADX provides a layer of abstraction on top of the storage layer to enable fast querying and analysis of the data.

    The storage capacity is virtually limitless, as it is based on the amount of storage available in Azure Blob Storage.

    The cached storage mentioned in the SKU refers to the amount of data that can be kept in the cluster's hot cache (local SSD and memory on the nodes) for faster query performance. This is separate from the amount of data that can be stored in the underlying storage layer.

    When data is ingested into ADX, it is stored as blobs in Azure Blob Storage. The data is partitioned and distributed across multiple nodes in the cluster for faster query performance. The data is compressed and indexed to enable fast querying and analysis.

    For example, L16asv3/L16sv3 clusters have a limit of 1000 nodes per cluster.

    The maximum SSD size per VM is 4 TB, so you can have up to 4000 TB of (compressed) data available in the hot cache.
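    The arithmetic behind that figure can be written out as a small sketch. The function names are hypothetical, and the calculation assumes, as a simplification, that the whole SSD of every node is usable for the hot cache.

```python
import math


def max_hot_cache_tb(node_count: int, ssd_tb_per_node: float) -> float:
    """Upper bound on hot cache size, assuming the full SSD of each node is usable."""
    return node_count * ssd_tb_per_node


def nodes_for_hot_cache(data_tb: float, ssd_tb_per_node: float) -> int:
    """Smallest number of nodes whose combined SSD can hold data_tb in the hot cache."""
    return math.ceil(data_tb / ssd_tb_per_node)


# Figures quoted in this answer: 1000 nodes, 4 TB of SSD per node.
print(max_hot_cache_tb(1000, 4))   # 4000 TB
# The 40 TB from the question, if all of it had to be in the hot cache:
print(nodes_for_hot_cache(40, 4))  # 10 nodes
```

    Note that this only concerns the hot cache. Data that does not need to be cached stays in the underlying blob storage and does not count against the SSD size.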

    If you need to store 40 TB of data in ADX, you can use multiple clusters to store the data. You can also use Azure Blob Storage directly to store the data, and then use ADX to query and analyze the data.

    The pricing model for ADX involves separate charges for compute and storage. The storage pricing is based on the amount of data stored in Azure Blob Storage, while the compute pricing is based on the resources allocated to the compute nodes.
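    As a rough illustration of how the two charges combine, here is a minimal sketch. The function name is hypothetical and every price in it is a placeholder, not an actual Azure price; use the pricing calculator for real numbers.

```python
def estimate_monthly_cost(
    node_count: int,
    price_per_node_hour: float,
    stored_tb: float,
    price_per_tb_month: float,
    hours_per_month: int = 730,
) -> dict:
    """Combine the compute and storage charges into one monthly estimate.

    All prices are caller-supplied placeholders (assumptions), not Azure list prices.
    """
    compute = node_count * price_per_node_hour * hours_per_month  # billed per node-hour
    storage = stored_tb * price_per_tb_month                      # billed per TB stored
    return {"compute": compute, "storage": storage, "total": compute + storage}


# Hypothetical example: 2 nodes at 1.00 per hour, 40 TB stored at 20.00 per TB per month.
cost = estimate_monthly_cost(
    node_count=2, price_per_node_hour=1.0, stored_tb=40, price_per_tb_month=20.0
)
print(cost)
```

    The point of the sketch is that the amount of stored data and the cluster size are billed independently, so storing more data does not by itself require more or larger nodes.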

    Compute SKU types:

    https://video2.skills-academy.com/en-us/azure/data-explorer/manage-cluster-choose-sku?WT.mc_id=Portal-Microsoft_Azure_Kusto#compute-sku-types

    https://azure.microsoft.com/en-us/pricing/calculator/?service=data-explorer

    https://stackoverflow.com/questions/70984159/what-is-the-storage-limit-in-azure-data-explorer-and-what-does-it-depend-on

    I hope this answers your question. Please let me know if you have any further questions.
