Configuring Windows Failover Cluster Networks

In this blog, I will discuss the general practices to consider when configuring networks in Failover Clusters.

Avoid single points of failure:

Identifying single points of failure and configuring redundancy at every point in the network is critical to maintaining high availability. Redundancy can be maintained by using multiple independent networks or by using NIC Teaming. Several ways of achieving this are:

· Use multiple physical network adapter cards. Using multiple ports of the same multiport card, or of the same backplane, introduces a single point of failure.

· Connect network adapter cards to different independent switches. Multiple VLANs patched into a single switch introduce a single point of failure.

· Use NIC teaming for networks that are not otherwise redundant, such as client connectivity, intra-cluster communication, CSV, and Live Migration. If the active network card fails, communication moves over to the other card in the team.

· Use different types of network adapters so that an issue with one NIC driver cannot affect connectivity across all network adapters at the same time.

· Ensure upstream network resiliency to eliminate a single point of failure between multiple networks.

· The Failover Clustering network driver detects networks on the system by their logical subnet. Assigning more than one network adapter per subnet, including IPv6 link-local, is not recommended, as only one card would be used by the Cluster and the other ignored.
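
To see how the cluster has mapped each node's adapters to the detected networks, the Failover Clustering PowerShell module can be queried. A minimal read-only sketch:

Import-Module FailoverClusters
# Each adapter is listed under the cluster network (logical subnet) it was
# detected on; two adapters from one node under the same network is the
# duplicate-subnet case described above.
Get-ClusterNetworkInterface | Format-Table Node, Network, Name, Adapter -AutoSize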

Network Binding Order:

The Adapters and Bindings tab lists the connections in the order in which they are accessed by network services. The order of these connections reflects the order in which generic TCP/IP calls/packets are sent on to the wire.

How to change the binding order of network adapters

  1. Click Start, click Run, type ncpa.cpl, and then click OK. You can see the available connections in the LAN and High-Speed Internet section of the Network Connections window.
  2. Press the <ALT><N> keys on the keyboard to bring up the Advanced Menu.
  3. On the Advanced menu, click Advanced Settings, and then click the Adapters and Bindings tab.
  4. In the Connections area, select the connection that you want to move higher in the list. Use the arrow buttons to move the connection. As a general rule, the card that talks to the network (domain connectivity, routing to other networks, etc.) should be the first bound card (top of the list).

Cluster nodes are multi-homed systems.  Network priority affects the DNS Client for outbound network connectivity.  Network adapters used for client communication should be at the top of the binding order.  Non-routed networks can be placed at a lower priority.  In Windows Server 2012/2012 R2, the Cluster Network Driver (NETFT.SYS) adapter is automatically placed at the bottom of the binding order list.
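
On installations without the full desktop (such as Server Core), the current binding order can at least be inspected from the registry. A minimal read-only sketch, using the standard TCP/IP Linkage key:

# Each entry in the Bind value is a device binding, listed in binding order
Get-ItemProperty "HKLM:\SYSTEM\CurrentControlSet\Services\Tcpip\Linkage" |
    Select-Object -ExpandProperty Bind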

Cluster Network Roles:

Cluster networks are automatically created for all logical subnets connected to all nodes in the Cluster.  Each network adapter card connected to a common subnet will be listed in Failover Cluster Manager.  Cluster networks can be configured for different uses.

Name                                            Value   Description
Disabled for Cluster Communication              0       No cluster communication of any kind is sent over this network
Enabled for Cluster Communication only          1       Internal cluster communication and CSV traffic can be sent over this network
Enabled for client and cluster communication    3       Cluster IP Address resources can be created on this network for clients to connect to; internal and CSV traffic can also be sent over this network

Automatic configuration

The network roles are automatically configured during cluster creation. The table above describes the roles that can be assigned to the networks in a cluster.

Networks used for iSCSI communication with iSCSI software initiators are automatically disabled for Cluster communication (Do not allow cluster network communication on this network).

Networks configured without a default gateway are automatically enabled for cluster communication only (Allow cluster network communication on this network).

Networks configured with a default gateway are automatically enabled for client and cluster communication (Allow cluster network communication on this network, Allow clients to connect through this network).
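
After cluster creation, the automatically assigned roles can be verified with PowerShell. A minimal sketch:

# Role values: 0 = disabled, 1 = cluster only, 3 = client and cluster
Get-ClusterNetwork | Format-Table Name, Address, Role -AutoSize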

Manual configuration

Though the cluster networks are automatically configured while creating the cluster as described above, they can also be manually configured based on the requirements in the environment.

To modify the network settings for a Failover Cluster:

· Open Failover Cluster Manager

· Expand Networks.

· Right-click the network that you want to modify settings for, and then click Properties.

· If needed, change the name of the network.

· Select one of the following options:

o Allow cluster network communication on this network.  If you select this option and you want the network to be used by the nodes only (not clients), clear Allow clients to connect through this network. Otherwise, make sure it is selected.

o Do not allow cluster network communication on this network.  Select this option if you are using a network only for iSCSI (communication with storage) or only for backup. (These are among the most common reasons for selecting this option.)

Cluster network roles can also be changed by using the Get-ClusterNetwork PowerShell cmdlet.

For example:

(Get-ClusterNetwork "Cluster Network 1").Role = 3

This configures “Cluster Network 1” to be enabled for client and cluster communication.

Configuring Quality of Service Policies in Windows Server 2012/2012 R2:

To achieve Quality of Service, we can either use multiple network cards or create QoS policies with multiple VLANs.

Configuring QoS prioritization is recommended on all cluster deployments. Heartbeats and intra-cluster communication are sensitive to latency, and configuring a QoS Priority Flow Control policy helps reduce that latency.

An example of setting cluster heartbeating and intra-node communication to be the highest priority traffic would be:

New-NetQosPolicy "Cluster" -Cluster -Priority 6
New-NetQosPolicy "SMB" -SMB -Priority 5
New-NetQosPolicy "Live Migration" -LiveMigration -Priority 3

Note:

· Available priority values are 0 through 6.

· The policy must be enabled on all the nodes in the cluster and on the physical network switch.

· Undefined traffic is of priority 0.
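
As an illustration of the "enable on every node" step in the note above, Priority Flow Control for the tagged priority can be turned on with the Data Center Bridging cmdlets. A sketch, assuming DCB-capable NICs and switches:

# Install the Data Center Bridging feature, then enable Priority Flow Control
# for priority 6 (the value assigned to the "Cluster" policy above)
Install-WindowsFeature Data-Center-Bridging
Enable-NetQosFlowControl -Priority 6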

Bandwidth Allocation:

It is recommended to configure a Relative Minimum Bandwidth SMB policy on CSV deployments.

Example of setting a minimum bandwidth policy of 30% for cluster traffic, 20% for Live Migration, and 50% for SMB traffic out of the total bandwidth:

New-NetQosPolicy "Cluster" -Cluster -MinBandwidthWeightAction 30
New-NetQosPolicy "Live Migration" -LiveMigration -MinBandwidthWeightAction 20
New-NetQosPolicy "SMB" -SMB -MinBandwidthWeightAction 50
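
The resulting policies can be reviewed afterwards. A minimal sketch:

# Confirm the policies were created with the intended settings
Get-NetQosPolicy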

Multi-Subnet Clusters:

Failover Clustering supports having nodes reside in different IP Subnets. Cluster Shared Volumes (CSV) in Windows Server 2012 as well as SQL Server 2012 support multi-subnet Clusters.

Typically, the general rule has been to have one network per role it provides. Cluster networks would be configured with the following in mind.

Client connectivity

Client connectivity is used by the applications running on the cluster nodes to communicate with client systems. This network can be configured with statically assigned IPv4 or IPv6 addresses, or with DHCP-assigned IP addresses. APIPA addresses should not be used, as those networks will be ignored: the Cluster Virtual Network Adapter is on that address scheme. IPv6 stateless address autoconfiguration can be used, but keep in mind that DHCPv6 addresses are not supported for clustered IP address resources. These networks are also typically routable networks with a default gateway.

CSV Network for Storage I/O Redirection

You would want this network if you are running a Hyper-V cluster with highly available virtual machines. This network is used for the NTFS metadata updates to a Cluster Shared Volume (CSV) file system. These updates should be lightweight and infrequent unless there are communication issues reaching the storage.

In the case of CSV I/O redirection, latency on this network can slow down storage I/O performance. Quality of Service is important for this network. If a storage path fails between any node and the storage, all I/O is redirected over the network to a node that still has connectivity so it can commit the data. All I/O is forwarded via SMB over the network, which is why network bandwidth is important.

Client for Microsoft Networks and File and Printer Sharing for Microsoft Networks need to be enabled to support Server Message Block (SMB), which is required for CSV. Configuring this network not to register with DNS is recommended, as it will not use any name resolution. The CSV network uses NTLM authentication for its connectivity between the nodes.

CSV communication takes advantage of SMB 3.0 features such as SMB Multichannel and SMB Direct to stream traffic across multiple networks and deliver improved I/O performance for its I/O redirection.
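
Which connections SMB Multichannel has actually established between nodes can be inspected with a minimal read-only sketch:

# List the active SMB Multichannel connections and the interfaces in use
Get-SmbMultichannelConnection | Format-Table -AutoSize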

By default, the cluster will automatically choose the NIC to be used for CSV. For manual configuration, refer to the following article.

Designating a Preferred Network for Cluster Shared Volumes Communication
https://technet.microsoft.com/en-us/library/ff182335(WS.10).aspx

This network should be configured for cluster communications.
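
The designation in the article above comes down to giving the CSV network the lowest cluster network Metric. A minimal sketch, assuming a cluster network named "CSV Network":

# The network with the lowest Metric is preferred for CSV traffic;
# setting Metric explicitly also disables AutoMetric for that network
(Get-ClusterNetwork "CSV Network").Metric = 900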

Live Migration Network

As with the CSV network, you would want this network if you are running a Hyper-V cluster with highly available virtual machines. The Live Migration network is used for live migrating virtual machines between cluster nodes. Configure this network as a cluster-communications-only network. By default, the Cluster will automatically choose the NIC for live migration.

Multiple networks can be selected for live migration depending on the workload and performance. Live migration takes advantage of the SMB 3.0 feature SMB Direct to allow migrations of virtual machines to be completed at a much quicker pace.
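
Networks can also be excluded from live migration with PowerShell. A minimal sketch, assuming a cluster network named "Management" that should not carry live migration traffic:

# Exclude a network by placing its ID in MigrationExcludeNetworks on the
# Virtual Machine resource type (multiple IDs are separated by semicolons)
$mgmt = Get-ClusterNetwork "Management"
Get-ClusterResourceType "Virtual Machine" |
    Set-ClusterParameter -Name MigrationExcludeNetworks -Value $mgmt.Id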

iSCSI Network:

If you are using iSCSI storage and using the network to reach it, it is recommended that the iSCSI storage fabric have a dedicated and isolated network. This network should be disabled for cluster communications so that it is dedicated to storage-related traffic only.

This prevents intra-cluster communication as well as CSV traffic from flowing over the same network. During the creation of the Cluster, iSCSI traffic will be detected and the network will be disabled from Cluster use. This network should be set lowest in the binding order.
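
If the automatic detection does not catch it, the network can be disabled for cluster use manually. A minimal sketch, assuming the network appears as "iSCSI" in Failover Cluster Manager:

# Role 0 = disabled for cluster communication, leaving the network to storage traffic
(Get-ClusterNetwork "iSCSI").Role = 0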

As with all storage networks, you should configure multiple cards to allow redundancy with MPIO. With the Microsoft-provided in-box teaming drivers, network card teaming is now supported with iSCSI in Windows Server 2012.

Heartbeat communication and Intra-Cluster communication

Heartbeat communication is used for health monitoring between the nodes to detect node failures. Heartbeat packets are lightweight (134 bytes) and sensitive to latency. If the cluster heartbeats are delayed by a saturated NIC, blocked by firewalls, etc., it could cause the cluster node to be removed from Cluster membership.

Intra-cluster communication updates the cluster database across all the nodes whenever the cluster state changes. Clustering is a distributed, synchronous system, so latency on this network can slow down cluster state changes.

IPv6 is the preferred protocol, as it is more reliable and faster than IPv4. IPv6 link-local (fe80::) addresses work for this network.

In Windows Clusters, heartbeat thresholds are increased by default for Hyper-V clusters. The default values change when the first VM is clustered.

Cluster Property        Default   Hyper-V Default
SameSubnetThreshold     5         10
CrossSubnetThreshold    5         20

Generally, heartbeat thresholds are modified after cluster creation. If there is a requirement to increase the threshold values, this can be done during production hours and will take effect immediately.
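
For example, to raise the thresholds to the Hyper-V defaults from the table above, a minimal sketch:

# Number of missed heartbeats tolerated before a node is removed from membership
(Get-Cluster).SameSubnetThreshold = 10
(Get-Cluster).CrossSubnetThreshold = 20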

Configuring full mesh heartbeat

The Cluster Virtual Network Driver (NetFT.SYS) builds routes between the nodes based on the Cluster property PlumbAllCrossSubnetRoutes.

Value   Description
0       Do not attempt to find cross-subnet routes if local routes are found
1       Always attempt to find routes that cross subnets
2       Disable the cluster service from attempting to discover cross-subnet routes after a node successfully joins

To make a change to this property, you can use the command:

(Get-Cluster).PlumbAllCrossSubnetRoutes = 1

References for configuring networks for Exchange 2013 and SQL 2012 on Failover Clusters:

Exchange Server 2013: Configuring DAG Networks
https://technet.microsoft.com/en-us/library/dd298065(v=exchg.150).aspx

Before Installing Failover Clustering for SQL Server 2012
https://msdn.microsoft.com/en-us/library/ms189910.aspx

At TechEd North America 2013, Elden Christensen (Failover Cluster Program Manager) delivered a session entitled Failover Cluster Networking Essentials that goes over a lot of configurations, best practices, etc.

Failover Cluster Networking Essentials
https://channel9.msdn.com/Events/TechEd/NorthAmerica/2013/MDC-B337#fbid=ZpvM0cLRvyX

S. Jayaprakash
Senior Support Escalation Engineer
Microsoft India GTSC

Comments

  • Anonymous
    January 01, 2003
    Lost me on the Network Binding Order sections. Where on earth is this??? Not on my OS. Why don't you use a screen shot to back up your descriptions, because there is NO Advanced Menu. Do you right click on the interface????? If so where is this Advanced Menu?
  • Anonymous
    January 01, 2003
    @Hackee
    Yes, that is perfectly fine. You can have multiple networks in a Cluster.
  • Anonymous
    January 01, 2003
    @Leonard Hopkins:
    Article is fixed to reflect this now. Sorry, the author did not think to add this step.

    @Naomi:
    When copying files and CSV is involved, you are talking about MetaData. MetaData is the actual data on the NTFS file system of the CSV drive that you are directly accessing. When MetaData is involved, it is going to go to the "coordinator" node to do the copies. The "coordinator" is the owner as shown in Failover Cluster Manager / Storage / Disks. If you are not sitting on the "coordinator", then everything is redirected over the network to the node you are sitting on. For example, you are sitting on Node1 and Node2 is the "coordinator". When you copy something from the CSV to Node1's local drive, we are going to go over the network to Node2 and copy it back over the network. What you are seeing is a functionality of SMB3 (SMB Multichannel to be precise). What SMB Multichannel will do is select all the networks that are basically the same (speed, settings, metric, etc) and copy over multiple networks to make things faster than relying on copying over a single network.

    @Lars:
    There used to be a utility called NVSPBIND.EXE that would allow you to change the binding order of network cards on a Core Server. Unfortunately, this utility is no longer available from Microsoft.

    @Diane Schaefer:
    Try to keep things separate. Failover Cluster deals with "networks", not network cards. So if your two storage connected cards are 1.1.1.1 and 1.1.1.2, from a Windows and MPIO perspective, that is fine and Windows handles it like it is supposed to when going to and from the storage. Failover Cluster on the other hand, uses the "networks" for its Cluster communication (heartbeats, joins, registry replication, etc) between the nodes. If it sees two cards on the same "network", it is only going to use one of the cards for its communication between the nodes. Cluster Validation will flag this setup, but you can ignore it as a possible misconfiguration because it is not. Since these are cards going out to your storage, you should have them disabled for Cluster use so that we do not use it for anything Cluster communication related. Right mouse click the network (Failover Cluster Manager / Networks) and choose Properties and you will see the setting. Set this network to be disabled for Cluster use. There is no restart needed as it is a dynamic change.
  • Anonymous
    January 01, 2003
    @Rajib
    Unfortunately, no. The Cluster network driver is always detecting networks and any changes.
  • Anonymous
    January 01, 2003
    @Robert. Fixed. Thank you kind sir.
  • Anonymous
    January 01, 2003
    Regarding the Network Binding procedure. You should refer to http://windows.microsoft.com/en-us/windows/change-network-protocol-bindings-order#1TC=windows-7 if you are going to give instructions. You left out the part about pressing the "ALT" key to bring up said menu. Do you think everyone Knows this????

    If you are going to give instructions, please, please, please, don't ASSUME or LEAVE out steps. You will lose people and really do yourself an injustice.
  • Anonymous
    February 20, 2014
    Under "Example of setting minimum policy of cluster for 30%, Live migration for 20%, and SMB Traffic for 50% of the total bandwidth" you list the commands used to set priority from the previous section. These should have the parameter MinBandwidthWeight, right?
  • Anonymous
    February 21, 2014
    Thank you Scott. the example information is updated.
  • Anonymous
    April 08, 2014
    well done John!
    att,
    Nicolas
    MS tier 2 support.
  • Anonymous
    June 10, 2014
    Useful article! Thanks. One question, in Exchange 2010 on Windows 2008 R2 we had issues with configuring the DAG with a management network IP address. In our cloud infrastructure our backup infrastructure is on this separate, isolated management network and not on the production network, as in many environments. Does SQL 2012 always on availability groups on server 2012 allow for the DAG to be presented on more than one network so that your backup infrastructure can talk to it? Microsoft Exchange blogs previously stated that service packs and powershell commands would reset the network configuration and remove this additional configuration so could not be relied upon. This causes us huge issues with duplication of backup infrastructure, storage and replication channels. Thanks
  • Anonymous
    July 07, 2014
    Hi,

    When a network interface is configured over a USB device, i.e. it is Ethernet over USB, it is meant for an isolated point-to-point network with the USB device. But I notice this kind of network is also getting added automatically to the failover cluster networks. What is the way to permanently disable clustering on this kind of network?

    Thanks,
    Rajib.
  • Anonymous
    August 20, 2014
    When defining the Live Migration Network we're specifying a subnet of a point-to-point network between Hyper-V hosts for testing, we're seeing that this value is ignored and that traffic is just passing over the default management network. Is there a better way for force this traffic over a specific interface?
  • Anonymous
    September 26, 2014
    Can this be implemented for 2 Sites? PR and DR. If PR goes down all the applications should catch by DR.
  • Anonymous
    November 12, 2014
    I am setting up a lab for Hyper-V failover clustering. I have 2 standalone PC with 1 NIC on each and does not have any space to install 2nd NIC. My question is that Can I use a Ethernet to USB adapter to use as a 2nd NIC.??

    Thanks in advance
  • Anonymous
    January 06, 2015
    Hi John,

    We are suffering big performance issues since 4 months. We have a Failover Cluster with Hyper-v and ISCSI storage connected. When we copy a large file from a CSV to local storage on one of the hosts in the cluster, the performance is dramatically, we also see freezes during the file copy.

    As we look in to this further we see during the copy strange behavior. The cluster uses the Storage network for the file copy, but it also uses for example the live migration network which is strange. We set up the Failover Cluster setting as mentioned in this article. Is there anyone who has a clue where this behavior is coming from or if it is default behavior in a cluster environment?

    Regards,
    Naomi
  • Anonymous
    January 16, 2015
    Hi John,
    what's about setting the network binding order in core installation like Hyper-V Server 2012? Any hints?
    Regards, Lars
  • Anonymous
    February 18, 2015
    I'm having trouble understanding the requirement that Failover clustering will use only one network adapter per subnet. I have two Infiniband NIC ports assigned to the same subnet with different IP addresses. (NIC teaming not yet supported by my Mellanox drivers) The network adapter is connected to two different switches to an iSCSI target on the back end. I set up MPIO between the two paths so a single drive can be seen by my failover cluster. Although I am using the same subnet, why is the setup being flagged in the cluster validation?