SQL Server: Minimize Disk I/O
Query tuning and indexing is an effective way to reduce physical and logical disk I/O.
Excerpted from “SQL Server DMV Starter Pack,” published by Red Gate Books (2010).
Glenn Berry, Louis Davidson and Tim Ford
There is a persistent need to minimize logical and physical I/O. The collection of I/O-related database management objects (DMOs) helps investigate, specifically, physical I/O taking place on your system, when data is written to and read from disk.
The DMOs in this category provide an explicit picture of disk I/O from the point of view of the disk subsystem. They show us, for example, how the I/O is distributed across various files on the disk, places where I/O is becoming a bottleneck and resulting in I/O stalls, and so on. You can use this information to optimize the disk subsystem architecture. You could also gather data and use it to support requests to business unit leaders for more storage capacity.
Naturally, some physical I/O is unavoidable. SQL Server must write application data to disk. It also has to write to the transaction log for every insert, update and delete, and even for bulk operations. However, before jumping to the conclusion that you simply require more disk power, remember there’s much you can do in terms of query tuning and indexing to minimize unnecessary logical and physical I/O.
You should consider the I/O information derived from the DMOs covered here (all of which start with “sys.dm_io_”), as well as data from other Dynamic Management Views (DMVs) that reference I/O performance in some manner, including:
- sys.dm_exec_query_stats – I/O that a given query has cost over the times it has been executed
- sys.dm_exec_connections – I/O that has taken place on that connection
- sys.dm_exec_sessions – I/O that has taken place during that session
- sys.dm_os_workers – I/O pending for a given worker thread
All of the queries in this section work with SQL Server 2005, 2008 and 2008 R2, and all require View Server State permission.
Investigate Disk Bottlenecks via I/O Stalls
The DMV we’ll use here is sys.dm_io_virtual_file_stats, which SQL Server Books Online describes as: “Returns I/O statistics for data and log files. This dynamic management view replaces the fn_virtualfilestats function.”
This DMV accepts two arguments: database_id and file_id. You can specify NULL for either one. In that case, it will return information on all of the databases or all of the files.
Note that this DMV is accumulative. In other words, the values in the data columns increment continuously from the point when the server was last restarted. This means you need to take a baseline measurement, followed by the actual measurement. Then subtract the two, so that you can see where I/O is accumulating.
This script lets you see the number of reads and writes on each data and log file for every database running on an instance of SQL Server. It’s sorted by average I/O stall time, in milliseconds:
-- Calculates average stalls per read, per write, and per total input/output -- for each database file. SELECT DB_NAME(database_id) AS [Database Name] , file_id , io_stall_read_ms , num_of_reads , CAST(io_stall_read_ms / ( 1.0 + num_of_reads ) AS NUMERIC(10, 1)) AS [avg_read_stall_ms] , io_stall_write_ms , num_of_writes , CAST(io_stall_write_ms / ( 1.0 + num_of_writes ) AS NUMERIC(10, 1)) AS [avg_write_stall_ms] , io_stall_read_ms + io_stall_write_ms AS [io_stalls] , num_of_reads + num_of_writes AS [total_io] , CAST(( io_stall_read_ms + io_stall_write_ms ) / ( 1.0 + num_of_reads + num_of_writes) AS NUMERIC(10,1)) AS [avg_io_stall_ms]FROM sys.dm_io_virtual_file_stats(NULL, NULL)ORDER BY avg_io_stall_ms DESC ;
This query will show you the files waiting the longest for disk I/O. It can help you to decide where to locate individual files based on your available disk resources. You can also use it to help persuade someone like a SAN engineer that SQL Server is seeing disk bottlenecks for certain files.
Investigate Disk Bottlenecks via Pending I/O
This takes a slightly different approach to investigating disk I/O bottlenecks. Use the sys.dm_io_pending_io_requests DMV, which SQL Server Books Online describes as: “Returns a row for each pending I/O request in SQL Server.”
The data in the DMV provides a “point in time” snapshot of I/O requests pending on your system, right at the moment you execute the script:
-- Look at pending I/O requests by file SELECT DB_NAME(mf.database_id) AS [Database] , mf.physical_name ,r.io_pending , r.io_pending_ms_ticks , r.io_type , fs.num_of_reads , fs.num_of_writesFROM sys.dm_io_pending_io_requests AS r INNER JOIN sys.dm_io_virtual_file_stats(NULL, NULL) AS fs ON r.io_handle = fs.file_handle INNER JOIN sys.master_files AS mf ON fs.database_id = mf.database_id AND fs.file_id = mf.file_idORDER BY r.io_pending , r.io_pending_ms_ticks DESC ;
Because this data represents a point-in-time snapshot of activity, you’ll want to run this query multiple times to see if the same files (and the same drive letters) show up consistently at the top of the list. If that happens, it’s evidence of I/O bottlenecks for that particular file or drive letter. You could use this to help convince your SAN engineer the system was experiencing I/O issues for a particular LUN.
The last two columns in the query return the cumulative number of reads and writes for the file since SQL Server was started (or since the file was created—whichever was shorter). This information is helpful when trying to decide which RAID level to use for a particular drive letter. For example, files with more write activity will usually perform better on a RAID 10 LUN than they will on a RAID 5 LUN.
Knowing the relative read/write ratio for each file can help you place your database files on an appropriate LUN. This, in turn, will help you tune your queries for greater efficiency.
Glenn Berry works as a database architect at NewsGator Technologies in Denver, Colo. He’s a SQL Server MVP, and has a whole collection of Microsoft certifications, including MCITP, MCDBA, MCSE, MCSD, MCAD and MCTS, which proves that he likes to take tests.
Louis Davidson has been in the IT industry for 16 years as a corporate database developer and architect. He has been a SQL Server Microsoft MVP for six years and has written four books on database design. Currently he’s the data architect and sometimes DBA for the Christian Broadcasting Network, supporting offices in Virginia Beach, Va., and Nashville, Tenn.
Timothy Ford is a SQL Server MVP and has been working with SQL Server for more than 10 years. He’s the primary DBA and subject-matter expert for the SQL Server platform for Spectrum Health. He’s been writing about technology since 2007 for a variety of Web sites and maintains his own blog at thesqlagentman.com, covering SQL as well as telecommuting and professional development topics.**
Learn more about “SQL Server DMV Starter Pack” at red-gate.com/our-company/about/book-store.