Columnstore indexes - Query performance

Applies to: SQL Server Azure SQL Database Azure SQL Managed Instance Azure Synapse Analytics Analytics Platform System (PDW)

This article includes recommendations for achieving the fast query performance with columnstore indexes.

Columnstore indexes can achieve up to 100 times better performance on analytics and data warehousing workloads, and up to 10 times better data compression than traditional rowstore indexes. These recommendations help your queries achieve the fast query performance that columnstore indexes are designed to provide.

Recommendations for improving query performance

Here are some recommendations for achieving the high-performance columnstore indexes are designed to provide.

1. Organize data to eliminate more rowgroups from a full table scan

  • Carefully choose insert order. In common case in traditional data warehouse, the data is indeed inserted in time order and analytics is done in time dimension. For example, analyzing sales by quarter. For this kind of workload, the rowgroup elimination happens automatically. In SQL Server 2016 (13.x), you can find out number rowgroups skipped as part of query processing.

  • Use a rowstore clustered index. If the common query predicate is on a column (for example, C1) unrelated to the insert order, create a rowstore clustered index on column C1. Then, drop the rowstore clustered index and create a clustered columnstore index. If you create the clustered columnstore index explicitly using MAXDOP = 1, the resulting clustered columnstore index is perfectly ordered on column C1. If you specify MAXDOP = 8, then you see overlap of values across eight rowgroups. For a nonclustered columnstore index (NCCI), if the table has a rowstore clustered index, the rows are already ordered by the clustered index key. In this case, the nonclustered columnstore index is also automatically ordered. A columnstore index does not inherently maintain the order of rows. As new rows are inserted or older rows are updated, you might need to repeat the process as the analytics query performance might deteriorate.

  • Implement table partitioning. You can partition the columnstore index, then use partition elimination to reduce number of rowgroups to scan. For example, a fact table stores purchases made by customers. A common query pattern is to find quarterly purchases by customer. In this case, combine the insert order column with partitioning on customer column. Each partition contains rows for each customer, ordered upon insertion. Also, consider using table partitioning if there's a need to remove older data from the columnstore. Switching out and truncating partitions that are not needed is an efficient strategy to delete data without generating fragmentation.

  • Avoid deleting large amounts of data. Removing compressed rows from a rowgroup is not a synchronous operation. It would be expensive to decompress a rowgroup, delete the row, and then recompress it. Therefore, when you delete data from compressed rowgroups these rowgroups are still scanned, even though they return fewer rows. If the number of deleted rows for several rowgroups is large enough to be merged into fewer rowgroups, reorganizing the columnstore increases the quality of the index and query performance improves. If your data deletion process usually empties entire rowgroups, consider using table partitioning. Switch out partitions that are not needed anymore and truncate them, instead of deleting rows.

    Note

    Starting with SQL Server 2019 (15.x), the tuple-mover is helped by a background merge task. This task automatically compresses smaller OPEN delta rowgroups that have existed for some time, as determined by an internal threshold, or merges COMPRESSED rowgroups from where a large number of rows has been deleted. This improves the columnstore index quality over time. If deleting large amounts of data from the columnstore index is required, consider splitting that operation into smaller delete batches over time. Batching allows the background merge task to handle the task of merging smaller rowgroups, and improves index quality. Then, there's no need to schedule index reorganization maintenance windows after data deletion. For more information about columnstore terms and concepts, see Columnstore indexes: Overview.

2. Plan for enough memory to create columnstore indexes in parallel

Creating a columnstore index is by default a parallel operation unless memory is constrained. Creating the index in parallel requires more memory than creating the index serially. When there is ample memory, creating a columnstore index takes on the order of 1.5 times as long as building a B-tree on the same columns.

The memory required for creating a columnstore index depends on the number of columns, the number of string columns, the degree of parallelism (DOP), and the characteristics of the data. For example, if your table has fewer than one million rows, SQL Server uses only one thread to create the columnstore index.

If your table has more than one million rows, but SQL Server cannot get a large enough memory grant to create the index using MAXDOP, SQL Server automatically decreases MAXDOP as needed. In some cases, DOP must be decreased to one in order to build the index under constrained memory in the available memory grant.

Since SQL Server 2016 (13.x), the query always operates in batch mode. In previous releases, batch execution is only used when DOP is greater than one.

Columnstore Performance Explained

Columnstore indexes achieve high query performance by combining high-speed in-memory batch mode processing with techniques that greatly reduce I/O requirements. Since analytics queries scan large numbers of rows, they are typically I/O-bound, and therefore reducing I/O during query execution is critical to the design of columnstore indexes. Once data is read into memory, it is critical to reduce the number of in-memory operations.

Columnstore indexes reduce I/O and optimize in-memory operations through high data compression, columnstore elimination, rowgroup elimination, and batch processing.

Data compression

Columnstore indexes achieve up to 10 times greater data compression than rowstore indexes. This greatly reduces the I/O required to execute analytics queries and therefore improves query performance.

  • Columnstore indexes read compressed data from disk, which means fewer bytes of data need to be read into memory.

  • Columnstore indexes store data in compressed form in memory, reducing I/O by avoiding reading the same data into memory. For example, with 10 times compression, columnstore indexes can keep 10 times more data in memory, compared to storing the data in uncompressed form. With more data in memory, it is more likely that the columnstore index finds the data it needs in memory without incurring unnecessary reads from disk.

  • Columnstore indexes compress data by columns instead of by rows, achieving high compression rates and reducing the size of the data stored on disk. Each column is compressed and stored independently. Data within a column always has the same data type, and tends to have similar values. Columnstore data compression techniques are great at achieving higher compression rates when values are similar.

For example, a fact table stores customer addresses and has a column for country-region. The total number of possible values is fewer than 200. Some of those values are repeated many times. If the fact table has 100 million rows, the country-region column compresses easily and requires little storage. Row-by-row compression is not able to capitalize on the similarity of column values in this way, and must use more bytes to compress the values in the country-region column.

Column elimination

Columnstore indexes skip reading in columns that are not required for the query result. Column elimination further reduces I/O for query execution and therefore improves query performance.

  • Column elimination is possible because the data is organized and compressed column by column. In contrast, when data is stored row-by-row, the column values in each row are physically stored together and cannot be easily separated. The Query Processor needs to read in an entire row to retrieve specific column values, increasing I/O because extra data is unnecessarily read into memory.

For example, if a table has 50 columns and the query only uses 5 of those columns, the columnstore index only fetches the 5 columns from disk. It skips reading in the other 45 columns, reducing I/O by another 90%, assuming all columns are of similar size. If the same data are stored in a rowstore, the query processor needs to read the remaining 45 columns.

Rowgroup elimination

For full table scans, a large percentage of the data usually does not match the query predicate criteria. By using metadata, the columnstore index is able to skip reading in the rowgroups that do not contain data required for the query result, all without actual I/O. This ability, called rowgroup elimination, reduces I/O for full table scans and therefore improves query performance.

When does a columnstore index need to perform a full table scan?

Starting with SQL Server 2016 (13.x), you can create one or more regular nonclustered rowstore, or B-tree, indexes on a clustered columnstore index. The nonclustered B-tree indexes can speed up a query that has an equality predicate or a predicate with a small range of values. For more complicated predicates, the query optimizer might choose a full table scan. Without the ability to skip rowgroups, a full table scan can be time-consuming, especially for large tables.

When does an analytics query benefit from rowgroup elimination for a full-table scan?

For example, a retail business models their sales data using a fact table with clustered columnstore index. Each new sale stores various attributes of the transaction, including the date a product is sold. Interestingly, even though columnstore indexes do not guarantee a sorted order, rows in this table are loaded in a date-sorted order. Over time this table grows. Although the retail business might keep sales data for the last 10 years, an analytics query might only need to compute an aggregate for last quarter. Columnstore indexes can eliminate accessing the data for the previous 39 quarters by just looking at the metadata for the date column. This is a 97% reduction in the amount of data that is read into memory and processed.

Which rowgroups are skipped in a full table scan?

To determine which rows groups to eliminate, the columnstore index uses metadata to store the minimum and maximum values of each column segment for each rowgroup. When none of the column segment ranges meet the query predicate criteria, the entire rowgroup is skipped without doing any actual I/O. This works because the data is usually loaded in a sorted order. Although row sorting is not guaranteed, similar data values are often located within the same rowgroup or a neighboring rowgroup.

For more information about rowgroups, see Columnstore Index Design Guidelines.

Batch Mode Execution

Batch mode execution refers to processing a set of rows, typically up to 900 rows, together for execution efficiency. For example, the query SELECT SUM (Sales) FROM SalesData aggregates the total sales from the table SalesData. In batch mode execution, the query execution engine computes the aggregate in group of 900 values. This spreads metadata the access costs and other types of overhead over all the rows in a batch, rather than paying the cost for each row thereby significantly reducing the code path. Batch mode processing operates on compressed data when possible and eliminates some of the exchange operators used by row mode processing, speeding up analytics queries by orders of magnitude.

Not all query execution operators can be executed in batch mode. For example, data manipulation language (DML) operations such as insert, delete, or update are executed one row at a time. Batch mode operator such as Scan, Join, Aggregate, Sort, and others can improve query performance. Since the columnstore index was introduced in SQL Server 2012 (11.x), there is a sustained effort to expand the operators that can be executed in the batch mode. The following table shows the operators that run in batch mode according to the product version.

Batch Mode Operators When used SQL Server 2012 (11.x) SQL Server 2014 (12.x) SQL Server 2016 (13.x) and SQL Database1 Comments
DML operations (insert, delete, update, merge) no no no DML is not a batch mode operation because it is not parallel. Even when we enable serial mode batch processing, we don't see significant gains by allowing DML to be processed in batch mode.
columnstore index scan SCAN Not available yes yes For columnstore indexes, we can push the predicate to the SCAN node.
columnstore index Scan (nonclustered) SCAN yes yes yes yes
index seek Not available Not available no We perform a seek operation through a nonclustered B-tree index in row mode.
compute scalar Expression that evaluates to a scalar value. yes yes yes Like all batch mode operators, there are some restrictions on data type.
concatenation UNION and UNION ALL no yes yes
filter Applying predicates yes yes yes
hash match Hash-based aggregate functions, outer hash join, right hash join, left hash join, right inner join, left inner join yes yes yes Restrictions for aggregation: no min/max for strings. Aggregation functions available are sum/count/avg/min/max.
Restrictions for join: no mismatched type joins on non-integer types.
merge join no no no
multi-threaded queries yes yes yes
nested loops no no no
single-threaded queries running under MAXDOP 1 no no yes
single-threaded queries with a serial query plan no no yes
sort Order by clause on SCAN with columnstore index. no no yes
top sort no no yes
window aggregates Not available Not available yes New operator in SQL Server 2016 (13.x).

1 Applies to SQL Server 2016 (13.x), SQL Database Premium tiers, Standard tiers - S3 and higher, and all vCore tiers, and Analytics Platform System (PDW)

For more information, see the Query Processing Architecture Guide.

Aggregate pushdown

A normal execution path for aggregate computation to fetch the qualifying rows from the SCAN node and aggregate the values in Batch Mode. While this delivers good performance, starting with SQL Server 2016 (13.x), the aggregate operation can be pushed to the SCAN node. Aggregate pushdown improves the performance of aggregate computations by orders of magnitude on top of Batch Mode execution, provided the following conditions are met:

  • The aggregates are MIN, MAX, SUM, COUNT and COUNT(*).
  • Aggregate operator must be on top of SCAN node or SCAN node with GROUP BY.
  • This aggregate is not a distinct aggregate.
  • The aggregate column is not a string column.
  • The aggregate column is not a virtual column.
  • The input and output data type must be one of the following and must fit within 64 bits:
    • tinyint, int, bigint, smallint, bit
    • smallmoney, money, decimal and numeric with precision <= 18
    • smalldate, date, datetime, datetime2, time

For example, aggregate pushdown is done in both of the following queries:

SELECT  productkey, SUM(TotalProductCost)
FROM FactResellerSalesXL_CCI
GROUP BY productkey;
    
SELECT  SUM(TotalProductCost)
FROM FactResellerSalesXL_CCI;

String predicate pushdown

When designing a data warehouse schema, the recommended schema modeling is to use star-schema or snowflake schema consisting of one or more fact tables and many dimension tables.

Tip

The fact table stores the business measurements or transactions and dimension table store the dimensions across which facts need to be analyzed. For more information on dimensional modeling, see Dimensional modeling in Microsoft Fabric.

For example, a fact can be a record representing a sale of a particular product in a specific region while the dimension represents a set of regions, products and so on. The fact and dimension tables are connected through a primary/foreign key relationship. Most commonly used analytics queries join one or more dimension tables with the fact table.

Let us consider a dimension table Products. A typical primary key is ProductCode, commonly represented as string. For performance of queries, it is a best practice to create surrogate key, typically an integer column, to refer to the row in the dimension table from the fact table.

The columnstore index runs analytics queries with joins and predicates involving numeric or integer-based keys efficiently. SQL Server 2016 (13.x) improved the performance of analytics queries with string-based columns significantly, by pushing down the predicates with string columns to the SCAN node.

String predicate pushdown leverages the primary/secondary dictionary created for columns to improve the query performance. For example, consider a string column segment within a rowgroup consisting of 100 distinct string values. Each distinct string value is referenced 10,000 times on average, assuming one million rows. With string predicate pushdown, the query execution computes the predicate against the values in the dictionary. If the predicate qualifies, all rows referring to the dictionary value are automatically qualified. This improves the performance in two ways:

  • Only the qualified row is returned reducing number of the rows that need to flow out of scan node.
  • The number of string comparisons are reduced. In this example, only 100 string comparisons are required as against 1 million comparisons. There are some limitations:
    • No string predicate pushdown for delta rowgroups. There is no dictionary for columns in delta rowgroups.
    • No string predicate pushdown if dictionary exceeds 64-KB entries.
    • Expression evaluating nulls are not supported.

Segment elimination

Data type choices might have a significant impact on query performance based common filter predicates for queries on the columnstore index.

In columnstore data, row groups are made up of column segments. There is metadata with each segment to allow for fast elimination of segments without reading them. This segment elimination applies to numeric, date, and time data types, and the datetimeoffset data type with scale less than or equal to two. Beginning in SQL Server 2022 (16.x), segment elimination capabilities extend to string, binary, guid data types, and the datetimeoffset data type for scale greater than two.

After upgrading to a version of SQL Server that supports string min/max segment elimination (SQL Server 2022 (16.x) and later), the columnstore index will not benefit this feature until it is rebuilt using a REBUILD or DROP/CREATE.

Segment elimination does not apply to LOB data types, such as the (max) data type lengths.

Currently, only SQL Server 2022 (16.x) and later supports clustered columnstore rowgroup elimination for the prefix of LIKE predicates, for example column LIKE 'string%'. Segment elimination is not supported for non-prefix use of LIKE, such as column LIKE '%string'.

Ordered clustered columnstore indexes also benefit from segment elimination, especially for string columns. In ordered clustered columnstore indexes, segment elimination on the first column in the index key is most effective, because it is sorted. Performance gains due to segment elimination on other columns in the table will be less predictable. For more on ordered clustered columnstore indexes, see Use an ordered clustered columnstore index for large data warehouse tables. For ordered columnstore index availability, see Ordered column index availability.

Using the query connection option SET STATISTICS IO, you can view segment elimination in action. Look for output such as the following to indicate that segment elimination has occurred. Row groups are made up of column segments, so this might indicate segment elimination. The following SET STATISTICS IO output example of a query, roughly 83% data was skipped by the query:

...
Table 'FactResellerSalesPartCategoryFull'. Segment reads 16, segment skipped 83.
...