DiskSpd, PowerShell and storage performance: measuring IOPS, throughput and latency for both local disks and SMB file shares

1. Introduction

I have been doing storage-related demos and publishing blog posts with storage performance numbers for a while, and I commonly get questions such as “How do you run these tests?” or “What tools do you use to generate IOs for your demos?”. While it’s always best to use a real workload to test storage, sometimes that is not convenient. In the past, I frequently used and recommended SQLIO, a free tool from Microsoft that simulates IOs. However, Microsoft recently released a better tool called DiskSpd. It is a flexible tool that can simulate many different types of workloads, and you can apply it to several configurations: on a physical host or a virtual machine, using all kinds of storage, including local disks, LUNs on a SAN, Storage Spaces or SMB file shares.

2. Download the tool

To get started, you need to download and install DiskSpd. You can get the tool from https://aka.ms/DiskSpd. It comes as a ZIP file that you can open and copy to a local folder. The ZIP file actually includes 3 subfolders with different versions of the tool: amd64fre (for 64-bit systems), x86fre (for 32-bit systems) and armfre (for ARM systems). This allows you to run it on pretty much every Windows version, client or server.

In the end, you really only need one of the DiskSpd.EXE files included in the ZIP (the one that best fits your platform). If you’re using a recent version of Windows Server, you probably want the version in the amd64fre folder. In this blog post, I assume that you copied the correct version of DiskSpd.EXE to the C:\DiskSpd local folder.
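
If you script the copy step, picking the subfolder can be sketched as a small lookup. This is a minimal Python sketch, not part of DiskSpd: the folder names come from the ZIP layout described above, while the architecture strings (the kind of value Python's platform.machine() returns) and the helper name diskspd_folder are assumptions for illustration.

```python
# Folder names are the ones listed in the ZIP file; the architecture strings
# on the left are assumed spellings, not something DiskSpd defines.
FOLDER_BY_MACHINE = {
    "amd64": "amd64fre",
    "x86_64": "amd64fre",
    "x86": "x86fre",
    "i386": "x86fre",
    "i686": "x86fre",
    "arm": "armfre",
}


def diskspd_folder(machine):
    """Return the ZIP subfolder that matches an architecture string."""
    key = machine.strip().lower()
    if key not in FOLDER_BY_MACHINE:
        raise ValueError("no DiskSpd build listed for architecture %r" % machine)
    return FOLDER_BY_MACHINE[key]


print(diskspd_folder("AMD64"))  # a typical value on 64-bit Windows
```

Raising an error for an unknown architecture is a deliberate choice here: silently falling back to one of the builds would only postpone the failure to the moment you try to run the wrong executable.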

If you're a developer, you might also want to take a look at the source code for DiskSpd. You can find that at https://github.com/microsoft/diskspd.

3. Run the tool

When you’re ready to start running DiskSpd, you want to make sure there’s nothing else running on the computer. Other running processes can interfere with your results by putting additional load on the CPU, network or storage. If the disk you are using is shared in any way (like a LUN on a SAN), you want to make sure that nothing else is competing with your testing. If you’re using any form of IP storage (iSCSI LUN, SMB file share), you want to make sure that you’re not running on a network congested with other kinds of traffic.

WARNING: You could be generating a whole lot of disk IO, network traffic and/or CPU load when you run DiskSpd. If you’re in a shared environment, you might want to talk to your administrator and ask permission first. That load could disturb anyone else using other VMs on the same host, other LUNs on the same SAN or other traffic on the same network.

WARNING: If you use DiskSpd to write data to a physical disk, you might destroy the data on that disk. DiskSpd does not ask for confirmation. It assumes you know what you are doing. Be careful when using physical disks (as opposed to files) with DiskSpd.

NOTE: You should run DiskSpd from an elevated command prompt. This will make sure file creation is fast. Otherwise, DiskSpd will fall back to a slower method of creating files. In the example below, when you're using a 1TB file, that might take a long time.

From a classic command prompt (cmd.exe) or a PowerShell prompt, issue a single command line to start getting some performance results. Here is your first example, using 8 threads of execution, each generating 8 outstanding random 8KB unbuffered read IOs:

PS C:\DiskSpd> C:\DiskSpd\diskspd.exe -c1000G -d10 -r -w0 -t8 -o8 -b8K -h -L X:\testfile.dat

Command Line: C:\DiskSpd\diskspd.exe -c1000G -d10 -r -w0 -t8 -o8 -b8K -h -L X:\testfile.dat

Input parameters:

        timespan: 1
-------------
duration: 10s
warm up time: 5s
cool down time: 0s
measuring latency
random seed: 0
path: 'X:\testfile.dat'
think time: 0ms
burst size: 0
software and hardware cache disabled
performing read test
block size: 8192
using random I/O (alignment: 8192)
number of outstanding I/O operations: 8
stride size: 8192
thread stride size: 0
threads per file: 8
using I/O Completion Ports
IO priority: normal

Results for timespan 1:
*******************************************************************************

actual test time: 10.01s
thread count: 8
proc count: 4

CPU | Usage | User | Kernel | Idle
-------------------------------------------
0| 5.31%| 0.16%| 5.15%| 94.76%
1| 1.87%| 0.47%| 1.40%| 98.19%
2| 1.25%| 0.16%| 1.09%| 98.82%
3| 2.97%| 0.47%| 2.50%| 97.10%
-------------------------------------------
avg.| 2.85%| 0.31%| 2.54%| 97.22%

Total IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 20480000 | 2500 | 1.95 | 249.77 | 32.502 | 55.200 | X:\testfile.dat (1000GB)
1 | 20635648 | 2519 | 1.97 | 251.67 | 32.146 | 54.405 | X:\testfile.dat (1000GB)
2 | 21094400 | 2575 | 2.01 | 257.26 | 31.412 | 53.410 | X:\testfile.dat (1000GB)
3 | 20553728 | 2509 | 1.96 | 250.67 | 32.343 | 56.548 | X:\testfile.dat (1000GB)
4 | 20365312 | 2486 | 1.94 | 248.37 | 32.599 | 54.448 | X:\testfile.dat (1000GB)
5 | 20160512 | 2461 | 1.92 | 245.87 | 32.982 | 54.838 | X:\testfile.dat (1000GB)
6 | 19972096 | 2438 | 1.90 | 243.58 | 33.293 | 55.178 | X:\testfile.dat (1000GB)
7 | 19578880 | 2390 | 1.87 | 238.78 | 33.848 | 58.472 | X:\testfile.dat (1000GB)
-----------------------------------------------------------------------------------------------------
total: 162840576 | 19878 | 15.52 | 1985.97 | 32.626 | 55.312

Read IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 20480000 | 2500 | 1.95 | 249.77 | 32.502 | 55.200 | X:\testfile.dat (1000GB)
1 | 20635648 | 2519 | 1.97 | 251.67 | 32.146 | 54.405 | X:\testfile.dat (1000GB)
2 | 21094400 | 2575 | 2.01 | 257.26 | 31.412 | 53.410 | X:\testfile.dat (1000GB)
3 | 20553728 | 2509 | 1.96 | 250.67 | 32.343 | 56.548 | X:\testfile.dat (1000GB)
4 | 20365312 | 2486 | 1.94 | 248.37 | 32.599 | 54.448 | X:\testfile.dat (1000GB)
5 | 20160512 | 2461 | 1.92 | 245.87 | 32.982 | 54.838 | X:\testfile.dat (1000GB)
6 | 19972096 | 2438 | 1.90 | 243.58 | 33.293 | 55.178 | X:\testfile.dat (1000GB)
7 | 19578880 | 2390 | 1.87 | 238.78 | 33.848 | 58.472 | X:\testfile.dat (1000GB)
-----------------------------------------------------------------------------------------------------
total: 162840576 | 19878 | 15.52 | 1985.97 | 32.626 | 55.312

Write IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 0 | 0 | 0.00 | 0.00 | 0.000 | N/A | X:testfile.dat (1000GB)
1 | 0 | 0 | 0.00 | 0.00 | 0.000 | N/A | X:testfile.dat (1000GB)
2 | 0 | 0 | 0.00 | 0.00 | 0.000 | N/A | X:testfile.dat (1000GB)
3 | 0 | 0 | 0.00 | 0.00 | 0.000 | N/A | X:testfile.dat (1000GB)
4 | 0 | 0 | 0.00 | 0.00 | 0.000 | N/A | X:testfile.dat (1000GB)
5 | 0 | 0 | 0.00 | 0.00 | 0.000 | N/A | X:testfile.dat (1000GB)
6 | 0 | 0 | 0.00 | 0.00 | 0.000 | N/A | X:testfile.dat (1000GB)
7 | 0 | 0 | 0.00 | 0.00 | 0.000 | N/A | X:testfile.dat (1000GB)
-----------------------------------------------------------------------------------------------------
total: 0 | 0 | 0.00 | 0.00 | 0.000 | N/A

  %-ile | Read (ms) | Write (ms) | Total (ms)
----------------------------------------------
min | 3.360 | N/A | 3.360
25th | 5.031 | N/A | 5.031
50th | 8.309 | N/A | 8.309
75th | 12.630 | N/A | 12.630
90th | 148.845 | N/A | 148.845
95th | 160.892 | N/A | 160.892
99th | 172.259 | N/A | 172.259
3-nines | 254.020 | N/A | 254.020
4-nines | 613.602 | N/A | 613.602
5-nines | 823.760 | N/A | 823.760
6-nines | 823.760 | N/A | 823.760
7-nines | 823.760 | N/A | 823.760
8-nines | 823.760 | N/A | 823.760
max | 823.760 | N/A | 823.760

NOTE: The -w0 is the default, so you could skip it. I'm keeping it here to be explicit about the fact we're doing all reads.

For this specific disk, I am getting about 1,986 IOPS, 15.52 MB/sec of average throughput and 32.626 milliseconds of average latency. I’m getting all that information from the “total:” line of the Total IO table above.
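
The IOPS and throughput figures on that line are two views of the same measurement, so one can be checked against the other. Here is a minimal Python sketch of that cross-check. The function name is made up for illustration, and it assumes the MB/s column counts 1,048,576 bytes per MB, which is the assumption that makes the numbers in this output agree.

```python
def throughput_mb_per_s(iops, block_bytes):
    """Convert an I/O rate into throughput, assuming 1 MB = 1024 * 1024 bytes."""
    return iops * block_bytes / (1024 * 1024)


# Values copied from the "total:" line of the first run (8KB blocks).
total_iops = 1985.97
block_bytes = 8 * 1024

print("%.2f MB/s" % throughput_mb_per_s(total_iops, block_bytes))  # prints 15.52 MB/s
```

If the two columns ever disagree by more than rounding, that usually means the block size you have in mind is not the one the test actually used.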

That average latency looks high for small IOs (even though this is coming from a set of HDDs), but we’ll examine that later.

Now let’s try another command, using sequential 512KB reads on that same file. I’ll use 2 threads with 8 outstanding IOs per thread this time:

PS C:\DiskSpd> C:\DiskSpd\diskspd.exe -c1000G -d10 -w0 -t2 -o8 -b512K -h -L X:\testfile.dat

Command Line: C:\DiskSpd\diskspd.exe -c1000G -d10 -w0 -t2 -o8 -b512K -h -L X:\testfile.dat

Input parameters:

        timespan: 1
-------------
duration: 10s
warm up time: 5s
cool down time: 0s
measuring latency
random seed: 0
path: 'X:\testfile.dat'
think time: 0ms
burst size: 0
software and hardware cache disabled
performing read test
block size: 524288
number of outstanding I/O operations: 8
stride size: 524288
thread stride size: 0
threads per file: 2
using I/O Completion Ports
IO priority: normal

Results for timespan 1:
*******************************************************************************

actual test time: 10.00s
thread count: 2
proc count: 4

CPU | Usage | User | Kernel | Idle
-------------------------------------------
0| 4.53%| 0.31%| 4.22%| 95.44%
1| 1.25%| 0.16%| 1.09%| 98.72%
2| 0.00%| 0.00%| 0.00%| 99.97%
3| 0.00%| 0.00%| 0.00%| 99.97%
-------------------------------------------
avg.| 1.44%| 0.12%| 1.33%| 98.52%

Total IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 886046720 | 1690 | 84.47 | 168.95 | 46.749 | 47.545 | X:\testfile.dat (1000GB)
1 | 851443712 | 1624 | 81.17 | 162.35 | 49.497 | 54.084 | X:\testfile.dat (1000GB)
-----------------------------------------------------------------------------------------------------
total: 1737490432 | 3314 | 165.65 | 331.29 | 48.095 | 50.873

Read IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 886046720 | 1690 | 84.47 | 168.95 | 46.749 | 47.545 | X:\testfile.dat (1000GB)
1 | 851443712 | 1624 | 81.17 | 162.35 | 49.497 | 54.084 | X:\testfile.dat (1000GB)
-----------------------------------------------------------------------------------------------------
total: 1737490432 | 3314 | 165.65 | 331.29 | 48.095 | 50.873

Write IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 0 | 0 | 0.00 | 0.00 | 0.000 | N/A | X:\testfile.dat (1000GB)
1 | 0 | 0 | 0.00 | 0.00 | 0.000 | N/A | X:\testfile.dat (1000GB)
-----------------------------------------------------------------------------------------------------
total: 0 | 0 | 0.00 | 0.00 | 0.000 | N/A

  %-ile | Read (ms) | Write (ms) | Total (ms)
----------------------------------------------
min | 9.406 | N/A | 9.406
25th | 31.087 | N/A | 31.087
50th | 38.397 | N/A | 38.397
75th | 47.216 | N/A | 47.216
90th | 64.783 | N/A | 64.783
95th | 90.786 | N/A | 90.786
99th | 356.669 | N/A | 356.669
3-nines | 452.198 | N/A | 452.198
4-nines | 686.307 | N/A | 686.307
5-nines | 686.307 | N/A | 686.307
6-nines | 686.307 | N/A | 686.307
7-nines | 686.307 | N/A | 686.307
8-nines | 686.307 | N/A | 686.307
max | 686.307 | N/A | 686.307

With that configuration and parameters, I got about 165.65 MB/sec of throughput with an average latency of 48.095 milliseconds per IO. Again, that latency sounds high even for 512KB IOs and we’ll dive into that topic later on.
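
One way to see where those average latencies come from is Little's Law, a general queueing result (my own sanity check here, not something DiskSpd reports): when the queue is kept full, average latency is roughly the number of outstanding IOs divided by the IOPS. The Python sketch below applies it to both runs, with a made-up function name and the values copied from the outputs above.

```python
def estimated_latency_ms(threads, outstanding_per_thread, iops):
    """Little's Law estimate: average latency = IOs in flight / IOs per second."""
    in_flight = threads * outstanding_per_thread
    return in_flight / iops * 1000.0


# First run: -t8 -o8, 1985.97 IOPS (DiskSpd reported 32.626 ms on average).
print("%.1f ms" % estimated_latency_ms(8, 8, 1985.97))  # prints 32.2 ms

# Second run: -t2 -o8, 331.29 IOPS (DiskSpd reported 48.095 ms on average).
print("%.1f ms" % estimated_latency_ms(2, 8, 331.29))   # prints 48.3 ms
```

The estimates land close to the reported averages, which suggests that much of the latency in these runs is time spent waiting behind the other IOs already queued, so the thread count and queue depth you pick have a direct effect on the latency you measure.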

5. Understand the parameters used

Now let’s inspect the parameters on those DiskSpd command lines. I know it’s a bit overwhelming at first, but you will get used to it. And keep in mind that, for DiskSpd parameters, lowercase and uppercase mean different things, so be very careful.

Here is the explanation for the parameters used above:

PS C:\> C:\DiskSpd\diskspd.exe -c1000G -d10 -r -w0 -t8 -o8 -b8K -h -L X:\testfile.dat

Parameter | Description | Notes
-c | Size of file used. | Specify the number of bytes or use suffixes like K, M or G (KB, MB, or GB). You should use a large size (all of the disk) for HDDs, since small files will show unrealistically high performance (short stroking).
-d | The duration of the test, in seconds. | You can use 10 seconds for a quick test. For any serious work, use at least 60 seconds.
-w | Percentage of writes. | 0 means all reads, 100 means all writes, 30 means 30% w
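
To read a command line like the one above switch by switch, it can help to split it apart mechanically. The Python sketch below does that for the first example. It is only an illustration: the descriptions for -c, -d and -w paraphrase the table, the others paraphrase the "Input parameters" section of the output shown earlier, and the names FLAG_MEANINGS and explain are made up. The lookup is case-sensitive on purpose, since lowercase and uppercase switches mean different things.

```python
import re

# Descriptions paraphrased from this post; this is not DiskSpd's own help text.
FLAG_MEANINGS = {
    "c": "size of the test file",
    "d": "duration of the test, in seconds",
    "r": "random I/O",
    "w": "percentage of writes",
    "t": "threads per file",
    "o": "outstanding I/Os per thread",
    "b": "block size",
    "h": "software and hardware cache disabled",
    "L": "measure latency",
}


def explain(arguments):
    """Split DiskSpd-style switches into (flag, value, meaning) tuples."""
    result = []
    for token in arguments.split():
        match = re.fullmatch(r"-([A-Za-z])(\S*)", token)
        if match is None:
            continue  # not a switch, for example the target file path
        flag, value = match.groups()
        result.append((flag, value, FLAG_MEANINGS.get(flag, "not covered here")))
    return result


for flag, value, meaning in explain("-c1000G -d10 -r -w0 -t8 -o8 -b8K -h -L X:\\testfile.dat"):
    print("-%s %-6s %s" % (flag, value, meaning))
```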

Comments

  • Anonymous
    January 01, 2003
    Wow. Great post. I am curious how much memory you could test with -h and a 'low' amount of file size. Thoughts here?
  • Anonymous
    January 01, 2003
    Thanks.
  • Anonymous
    January 01, 2003
    I'm thinking that in SQLIO we can have more than 1 data file in the parameter file which could potentially have different sizes. If SQLIO (or DISKSPD) does the same number of IO on each data file we might be able to have the different density. Would you be able to confirm that?
    • Anonymous
      November 06, 2016
      @Mathieu – DiskSpd supports complex profile files (in XML) that allow different targets with different IO sizes and rates, so you could set up a complex IO pattern. I'm not sure exactly what you want here, but DiskSpd will probably do it.
  • Anonymous
    January 01, 2003
    When we get ready to bench test a disk system we start by creating a baseline.

    That is we run a series of tests against one disk of each type to get our base.

    We can then accurately assess a group of the same disks as we would know optimal queue depths, write sizes, and thread counts.

    Our primary testing platform is SOFS (Scale-Out File Server) and Storage Spaces.

    IMNSHO if one does not know the baseline performance characteristics of one disk then results via any form of multi-disk testing would be highly suspect.
  • Anonymous
    January 01, 2003
    The comment has been removed
    • Anonymous
      July 04, 2017
      Hi Dan, Looks like a broken link to your article - I'd love to see what you found.
  • Anonymous
    January 01, 2003
    great article, I'm now using this tool to measure SAN performance.. very helpful.. thank you!
  • Anonymous
    October 16, 2014
    This is a very nice post. Sounds more interesting than SQLIO. However, I'm using IOMeter, a free, rich and powerful tool.
    One question: To simulate the OLTP SQL load, I use 64K Random not 8k Random since SQL Server reads and writes to data files in 64K blocks.
  • Anonymous
    October 22, 2014
    Do you have any recommendations for simulating Hyper-V workloads?
  • Anonymous
    January 04, 2016
    IMPORTANT NOTE: SQLIO has been deprecated, as shown at http://blogs.msdn.com/b/sql_server_team/archive
  • Anonymous
    March 04, 2016
    The x86 version does not work on Windows XP or 2003; it says it is not a valid Win32 image. It does work on the 32-bit version of Windows 7.
    • Anonymous
      September 16, 2016
      It does not work for windows 2008 either. Any ideas how to fix that?
  • Anonymous
    March 16, 2016
    Can i set timeout value while running DISKSPD?
  • Anonymous
    March 22, 2016
    I've noticed that DiskSpd isn't reliable when run inside a Hyper-V VM (I've also heard anecdotally that the same issue is apparent on VMware). When testing against dedicated dynamic .vhdx files, no matter what size the test file was (using -c up to 79GB) the .vhdx file didn't grow beyond 100MB. I also ran using the raw (80GB) disk number as target, with the same result. This inaccuracy in file size translated into inaccurate test results (>40,000 IOPS, >4000MB/s) due to Hyper-V and SAN caches both being able to hold the entire <100MB test file. Has anyone else seen this? Is it a known bug with a planned resolution?
    • Anonymous
      March 22, 2016
      I should add, -h was set to bypass caches within the VM.
    • Anonymous
      October 16, 2016
      FYI still seeing the same fault/bug testing within virtual servers. Managed 8000MB/s read throughput, which is impressive (i.e. impossible) over a 2x 10Gb network! Tested on both 3Par / FC and Compellent / iSCSI + SOFS / RoCE.
      • Anonymous
        October 24, 2016
        Same issue here, did you manage to get it to work properly? I'm investigating an older version of the tool, since in CrystalDiskMark 4.1.0 it's working fine.
  • Anonymous
    March 24, 2016
    Hi all, how can I simulate a simple disk-to-disk copy of a big file, like cutting my file in C: and pasting it in D:? Thanks
  • Anonymous
    June 07, 2016
    The comment has been removed
    • Anonymous
      October 16, 2016
      Probably a dumb question, but are you running the right .exe? There are 3 versions for different CPU architectures. On 64-bit Windows you want the one in the amd64fre folder.
  • Anonymous
    July 29, 2016
    The comment has been removed
    • Anonymous
      October 16, 2016
      Dave, there's no way to disable storage array write caching from diskspd (or any other app). The OS can request a synchronous write, which means that the storage must ensure the write is complete prior to acknowledging, but any SAN storage (with dual controllers and mirrored cache) will provide that confirmation as soon as the write is in cache.
  • Anonymous
    September 17, 2016
    We are in the process of upgrading the OS from Windows 2008 to Windows 2012 R2. Therefore, we want to benchmark Windows 2008 using DiskSpd. Unfortunately, it will not run. We keep getting an error saying this is not a valid Win32 application. Does anyone have any ideas for a workaround?
  • Anonymous
    October 04, 2016
    can somebody tell me how -g works?
  • Anonymous
    March 21, 2017
    Hello, I am a beginner at using this tool, DiskSpd. I would like to know what you mean by "different workloads".
  • Anonymous
    April 09, 2017
    About -w switch a (maybe stupid) question. It says in help text: "IMPORTANT: a write test will destroy existing data without a warning". Does this concern only the specified test file, or other data on the disk as well?