We have a 5-node Windows Server 2016 failover cluster that uses an HPE Nimble array as shared storage. We're using the cluster for Hyper-V. All virtual machine VHDXs are stored on the cluster shared volume (CSV).
We're having problems with disk performance inside VMs when the VM is running on a node that does not own the CSV storage.
When we transfer files via SMB between two VMs that are both running on the node that owns the CSV, speeds are between 1.5GB/s and 2GB/s. If you take storage ownership away from that node, speeds drop to ~100MB/s.
It looks as though the storage traffic is going over the 1Gb network, through the owner node, and then into the SAN. From what I understand, this shouldn't happen unless the CSV has been set to redirected mode. (I haven't confirmed this with Wireshark or similar yet; I'm working on that.)
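As a rough sanity check on that suspicion, the arithmetic below compares the observed speeds with the theoretical ceiling of a gigabit link. It is only a sketch: it assumes the slow network is 1 gigabit per second Ethernet, it ignores protocol overhead, and the function names are made up for illustration.

```python
def link_ceiling_mb_per_s(gigabits_per_second: float) -> float:
    """Theoretical ceiling of a link in MB/s, ignoring protocol overhead."""
    bits_per_second = gigabits_per_second * 1_000_000_000
    return bits_per_second / 8 / 1_000_000


def min_link_gbps(megabytes_per_second: float) -> float:
    """Smallest link speed in Gb/s that could carry the given throughput."""
    return megabytes_per_second * 1_000_000 * 8 / 1_000_000_000


if __name__ == "__main__":
    # Assumed 1 Gb/s cluster network versus the speeds quoted in the post.
    print("1 Gb/s ceiling (MB/s):", link_ceiling_mb_per_s(1.0))
    print("Link needed for 100 MB/s (Gb/s):", min_link_gbps(100))
    print("Link needed for 1500 MB/s (Gb/s):", min_link_gbps(1500))
```

Under those assumptions, ~100MB/s is close to what a saturated gigabit link can deliver, while 1.5GB/s needs at least a 12 Gb/s path, so the fast transfers cannot be travelling over that network.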
I ran Get-ClusterSharedVolumeState, which returned the following:
BlockRedirectedIOReason : NotBlockRedirected
FileSystemRedirectedIOReason : NotFileSystemRedirected
Name : Cluster Disk 1
Node : HyperV03
StateInfo : Direct
VolumeFriendlyName : VM-CSV
VolumeName : \\?\Volume{9323278e-8374-474c-b9e7-1097305c0d1f}\

BlockRedirectedIOReason : NotBlockRedirected
FileSystemRedirectedIOReason : NotFileSystemRedirected
Name : Cluster Disk 1
Node : Hyperv06
StateInfo : Direct
VolumeFriendlyName : VM-CSV
VolumeName : \\?\Volume{9323278e-8374-474c-b9e7-1097305c0d1f}\

BlockRedirectedIOReason : NotBlockRedirected
FileSystemRedirectedIOReason : NotFileSystemRedirected
Name : Cluster Disk 1
Node : hyperv05
StateInfo : Direct
VolumeFriendlyName : VM-CSV
VolumeName : \\?\Volume{9323278e-8374-474c-b9e7-1097305c0d1f}\

BlockRedirectedIOReason : NotBlockRedirected
FileSystemRedirectedIOReason : NotFileSystemRedirected
Name : Cluster Disk 1
Node : Hyperv04
StateInfo : Direct
VolumeFriendlyName : VM-CSV
VolumeName : \\?\Volume{9323278e-8374-474c-b9e7-1097305c0d1f}\

BlockRedirectedIOReason : NotBlockRedirected
FileSystemRedirectedIOReason : NotFileSystemRedirected
Name : Cluster Disk 1
Node : Hyperv02
StateInfo : Direct
VolumeFriendlyName : VM-CSV
VolumeName : \\?\Volume{9323278e-8374-474c-b9e7-1097305c0d1f}\
According to this output, all five nodes report direct I/O to the CSV, so redirection doesn't appear to be the cause of the issue.
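For anyone who wants to repeat that check without reading every record by eye, here is a minimal sketch that parses the "Key : Value" list output and reports any node whose StateInfo is not Direct. It assumes the output has been saved as plain text in the list format shown in this post; the function names and the shortened sample are illustrative, not part of any cluster tooling.

```python
def parse_records(text: str) -> list[dict[str, str]]:
    """Split 'Key : Value' list output into one dict per record.

    A new record starts whenever a key repeats, so blank lines
    between records are optional.
    """
    records: list[dict[str, str]] = []
    current: dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip blank lines and anything that is not key/value
        key, value = key.strip(), value.strip()
        if key in current:
            records.append(current)
            current = {}
        current[key] = value
    if current:
        records.append(current)
    return records


def nodes_not_direct(records: list[dict[str, str]]) -> list[str]:
    """Return the nodes whose StateInfo is anything other than 'Direct'."""
    return [r.get("Node", "?") for r in records if r.get("StateInfo") != "Direct"]


# Shortened sample taken from the first two records in this post.
SAMPLE = """\
BlockRedirectedIOReason : NotBlockRedirected
FileSystemRedirectedIOReason : NotFileSystemRedirected
Name : Cluster Disk 1
Node : HyperV03
StateInfo : Direct
BlockRedirectedIOReason : NotBlockRedirected
FileSystemRedirectedIOReason : NotFileSystemRedirected
Name : Cluster Disk 1
Node : Hyperv06
StateInfo : Direct
"""

if __name__ == "__main__":
    recs = parse_records(SAMPLE)
    print("Records parsed:", len(recs))
    print("Nodes not in Direct mode:", nodes_not_direct(recs))
```

An empty list here only reflects the state at the moment the command was run, so it is worth capturing the output again while a slow transfer is actually in progress.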
Can anyone think of another reason why this might be happening?
Connections to the SAN have all been set up using HPE Windows Toolkit, which configures the MPIO settings and various other bits for you. We've confirmed that every node can hit the expected transfer speeds of 1GB/s+, but only when that node takes ownership of the CSV.