Storage Spaces Direct - Good Hyper-V host performance, poor guest VM Performance

Rob Cowlthorp 20 Reputation points
2024-02-08T00:04:35.7766667+00:00

Hey Everyone,

I am building a 4-node cluster with Windows Server 2022, Hyper-V, and Storage Spaces Direct. Everything is going well except for guest VM storage performance, which is much slower than expected. In general, the Hyper-V hosts see ~10x the performance of the Hyper-V guest.

Hyper-V Hosts:

  • Model: Dell 740XD2
  • RAM: 768 GB
  • CPU: 64 cores (2 sockets x 32 cores)
  • Storage: 2x 256GB SAS SSD (OS), 4x 1.6TB NVMe SSD, 16x 3.2TB NVMe SSD
  • Network: 2x Broadcom 25Gb, Cisco ACI

Performance Numbers:

Computer   ReadMiBSec   WriteMiBSec   ReadIOPS   WriteIOPS   Hardware   Storage

HVGuest21          60            18      7,654       2,297      HV VM   S2D - volume21    <The Problem>

HVHost21          636           190     81,353      24,338       Dell   S2D - volume21
HVHost22          616           184     78,899      23,603       Dell   S2D - volume22
HVHost23          509           153     65,215      19,524       Dell   S2D - volume23
HVHost24          606           181     77,575      23,210       Dell   S2D - volume24



  • Performance data was generated using DiskSpd (diskspd.exe)
  • Command: diskspd.exe -b8k -d30 -o4 -t8 -h -r -w23 -L -Z1G -c20G C:\ClusterStorage\<hvhostxx>\DiskSpd.dat
  • Host storage is aligned - there is one virtual disk per HV host, each virtual disk is assigned to the corresponding host, and each host uses the corresponding disk for testing
  • The HV guest VM's storage is also aligned
  • If the host and storage are unaligned, performance metrics drop to ~50% for both the HV hosts and the HV guest
  • There is minimal load on the HV hosts, with 2 idle test VMs per host.
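As a sanity check on the table above, here is a minimal Python sketch (not part of the original test run) that recomputes throughput from the reported IOPS at the 8 KiB block size set by -b8k, and derives the host-to-guest ratio and write share. The numbers are copied from the table; the function and variable names are only illustrative.

```python
# Figures copied from the results table in the post (DiskSpd run with -b8k).
BLOCK_KIB = 8

results = {
    # name: (read_mib_s, write_mib_s, read_iops, write_iops)
    "HVGuest21": (60, 18, 7_654, 2_297),
    "HVHost21": (636, 190, 81_353, 24_338),
    "HVHost22": (616, 184, 78_899, 23_603),
    "HVHost23": (509, 153, 65_215, 19_524),
    "HVHost24": (606, 181, 77_575, 23_210),
}


def mib_per_sec(iops, block_kib=BLOCK_KIB):
    """Throughput in MiB/s implied by an IOPS figure at a fixed block size."""
    return iops * block_kib / 1024


def summarize(name):
    """Recompute throughput, total IOPS, and write share for one machine."""
    _read_mib, _write_mib, read_iops, write_iops = results[name]
    total_iops = read_iops + write_iops
    return {
        "read_mib_implied": round(mib_per_sec(read_iops)),
        "write_mib_implied": round(mib_per_sec(write_iops)),
        "total_iops": total_iops,
        "write_share": write_iops / total_iops,
    }


guest = summarize("HVGuest21")
host = summarize("HVHost21")
ratio = host["total_iops"] / guest["total_iops"]

for name in results:
    print(name, summarize(name))
print(f"HVHost21 / HVGuest21 total IOPS ratio: {ratio:.1f}x")
```

The implied MiB/s values match the reported ones, and the write share comes out close to the 23% requested with -w23, so the table looks internally consistent and the ~10x gap is not a reporting artifact.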

I know it's not an apples-to-apples comparison, but in our VMware environment (using vSAN or NetApp/NFS), guest VM throughput and IOPS are ~3x what we see with Hyper-V / S2D. The underlying servers and networking are identical, and the VMware servers are hosting ~60 VMs each.

Questions

  • Is it typical to see this much difference in performance between the host and the guest?
  • Can I do something to improve the storage performance of my guest VMs?

I hope I'm just missing something. Any suggestions would be appreciated.

Thanks,
Rob


Accepted answer
  1. Net Runner 600 Reputation points
    2024-02-13T18:22:13.6866667+00:00

    Indeed, it is typical to see such a difference in performance between the host and the guest running on Storage Spaces Direct. That is one of the reasons we prefer using StarWind on Hyper-V for clustering purposes.

    As for your issue, I recommend you try disabling RSS and VMQ on all your network adapters (for simplicity) and rerun benchmarks to see if that changes anything.

    Try benchmarking the storage inside guests using a deeper queue. For NVMe storage like yours, I would recommend going with at least -o32 to see if that makes any difference as well.
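One way to act on the queue-depth suggestion is to sweep -o while holding every other flag from the original command constant. The Python sketch below only builds the command lines and does not run them; the target path is a hypothetical placeholder for a test file inside the guest, and the sweep values other than 4 and 32 are arbitrary picks.

```python
def diskspd_command(target, outstanding, threads=8):
    """Build a DiskSpd command line; only -o (and optionally -t) differ
    from the command quoted in the question."""
    flags = [
        "-b8k", "-d30", f"-o{outstanding}", f"-t{threads}",
        "-h", "-r", "-w23", "-L", "-Z1G", "-c20G",
    ]
    return " ".join(["diskspd.exe", *flags, target])


# Hypothetical test file path inside the guest; replace with your own.
TARGET = r"D:\DiskSpd.dat"

# 4 is the original setting, 32 is the suggested minimum for NVMe.
commands = [diskspd_command(TARGET, o) for o in (4, 8, 16, 32)]

for command in commands:
    print(command)
```

Comparing IOPS and the latency reported by -L across the sweep should show whether the guest is limited by per-I/O latency (IOPS keep scaling with depth) or by something else (IOPS flatten early).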

    Make sure your Dell BIOS is configured with a performance power profile. Power-saving profiles are one of the common root causes of slow virtual machine performance in general.

    You didn't provide the latency and CPU load you get while benchmarking. Make sure diskspd does not max out your CPU inside the virtual machine when running tests, just in case.
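Until measured latency is available, a rough figure can be estimated from the posted IOPS with Little's law (average latency = I/Os in flight / IOPS). This is only a sketch under two assumptions: the test kept all -t8 x -o4 = 32 I/Os outstanding the whole time, and a single target file was used. The latency column that DiskSpd prints with -L is the number to trust.

```python
def implied_latency_ms(total_iops, threads=8, outstanding=4):
    """Average per-I/O latency implied by Little's law, in milliseconds.

    Assumes the queue stayed full for the whole run:
    in-flight I/Os = threads * outstanding I/Os per thread (one target).
    """
    in_flight = threads * outstanding
    return in_flight / total_iops * 1000


# Read + write IOPS from the results table in the question.
guest_ms = implied_latency_ms(7_654 + 2_297)    # HVGuest21
host_ms = implied_latency_ms(81_353 + 24_338)   # HVHost21

print(f"guest: ~{guest_ms:.2f} ms per I/O")
print(f"host:  ~{host_ms:.2f} ms per I/O")
```

Under those assumptions the guest works out to roughly 3.2 ms per I/O against roughly 0.3 ms on the host, which suggests added per-I/O latency in the guest path rather than a bandwidth ceiling.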


1 additional answer

  1. Ian Xue (Shanghai Wicresoft Co., Ltd.) 33,376 Reputation points Microsoft Vendor
    2024-02-09T04:48:34.5433333+00:00

    Hi Rob,

    Thanks for your post. In general, start with these steps:

    1. Confirm the make and model of SSD is certified for Windows Server 2016 and Windows Server 2019 by using the Windows Server Catalog. Confirm with the vendor that the drives are supported for Storage Spaces Direct.
    2. Inspect the storage for any faulty drives. Use storage management software to check the status of the drives. If any of the drives are faulty, work with your vendor.
    3. Update the storage and drive firmware if necessary. Ensure that the latest Windows Updates are installed on all nodes. You can get the latest updates for Windows Server 2016 from Windows 10 and Windows Server 2016 update history. Get the latest updates for Windows Server 2019 from Windows 10 and Windows Server 2019 update history.
    4. Update the network adapter drivers and firmware.
    5. Run cluster validation and review the Storage Spaces Direct section. Ensure that the drives you use for the cache are reported correctly and have no errors.

    If you're still having problems, review the troubleshooting information for each of the specific issues in this article: "Storage Spaces Direct troubleshooting" (Microsoft Learn).

    Best Regards,

    Ian Xue

