Capacity estimations

Jay D Zimmerman 21 Reputation points
2020-09-08T13:05:59.057+00:00

I recently completed a POC to demonstrate using Cosmos DB in lieu of other data stores currently utilized by my company. As I prepare information to present, I need to be able to speak to cost, part of which involves storage requirements. At the moment I am having a terribly difficult time finding documentation on properly estimating at-rest storage size. Is there a general rule of thumb that can be applied (for example, a 1 KB JSON document requires x KB of storage space in Cosmos DB)? My investigation so far shows numbers all over the place, but here is an example of rather startling storage overhead:

I generated 50,000 tiny JSON documents and inserted them into a container with an exclusionary indexing policy.

  • Prior to insertion, the total in-memory size of the documents was 2.15 MB
  • After insertion, the total in-memory size of the documents was 11.92 MB (which is accounted for by the additional metadata fields added by Cosmos DB)
  • The reported documents size (from x-ms-resource-usage) was 25.30 MB
  • The reported container size (from x-ms-resource-usage) was 38.18 MB

Accepted answer
  1. Mark Brown - MSFT 2,766 Reputation points Microsoft Employee
    2020-09-16T19:41:16.78+00:00

    Hello and thanks again for your patience.

    The total storage numbers you are seeing here are correct and accurate. Let me explain why.

    Within Cosmos DB we store the following 4 things when you write data:

    1. The data itself with customer defined and system defined properties.
    2. Any customer defined indexes.
    3. A system managed index that manages the partition key (in your case, /id).
    4. A system managed index that ensures /id is unique per partition key.

    Given that you have no user defined indexes, a collection with just /id will take up roughly 3x the storage of the data itself (the customer and system defined properties). This multiple of course goes down as you increase the number of properties.
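
    As a rough check of that 3x figure against the numbers in the question, here is a minimal Python sketch. The sizes are copied from the question; the helper name estimate_container_mb and its default multiplier are hypothetical and only restate the rule of thumb above, so treat the result as a ballpark for tiny documents with no user defined indexes, not a general formula.

```python
# Sizes in MB, copied from the question.
raw_mb = 2.15            # documents in memory before insertion
with_system_mb = 11.92   # documents after Cosmos DB adds its system properties
documents_mb = 25.30     # documents size reported by x-ms-resource-usage
container_mb = 38.18     # container size reported by x-ms-resource-usage

# Observed multiples of the reported container size.
container_vs_stored = container_mb / with_system_mb  # about 3.2
container_vs_raw = container_mb / raw_mb             # about 17.8


def estimate_container_mb(stored_mb, multiplier=3.0):
    """Hypothetical helper: apply the rough 3x rule of thumb.

    stored_mb is the document size including system properties.
    The multiplier is an assumption taken from the answer above and
    should shrink as documents gain more properties.
    """
    return stored_mb * multiplier


print(f"container / stored documents: {container_vs_stored:.2f}x")
print(f"container / raw documents:    {container_vs_raw:.2f}x")
print(f"3x estimate: {estimate_container_mb(with_system_mb):.2f} MB "
      f"(reported: {container_mb} MB)")
```

    The observed multiple of about 3.2x over the documents with system properties lines up with the rough 3x above. The much larger multiple over the raw documents comes mostly from the system properties dominating such tiny documents.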

    Thanks again for your patience while I confirmed this information. I hope this answers your question.

