DHCP Failover Load Balance Mode

As described in the blog on DHCP failover, there are two types of DHCP failover relationships: Load Balance, which provides an Active-Active configuration, and Hot Standby, which provides an Active-Passive configuration. This blog article elaborates on the Load Balance failover relationship.

As is evident from the name, in the load balance mode of operation both servers respond to client requests. Here’s how the servers distribute client requests between themselves:

On receiving a client request, each DHCP server calculates a hash of the MAC address in the request using the hashing algorithm specified in RFC 3074. Each server hashes any MAC address to a value between 1 and 256. If the load distribution ratio between the two servers is left at the default of 50:50, the first server responds to the client request when the hash of the MAC address falls between 1 and 128, and the other server responds when the hash falls between 129 and 256. This ensures that only one server responds to a specific client. If the admin has changed the load distribution ratio to a different value, the hash buckets are distributed in that proportion. The admin does not need to configure MAC addresses on either server in advance.
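The bucket ownership rule can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the RFC 3074 hash itself is not reproduced (the bucket number is taken as an input), and the way the boundary is rounded for ratios other than 50:50 is an assumption rather than documented server behavior.

```python
def buckets_for_first_server(first_server_percent: int, total_buckets: int = 256) -> int:
    """Number of hash buckets, counted from 1, owned by the first server.

    Rounding to the nearest bucket is an assumption made for this sketch.
    """
    return round(total_buckets * first_server_percent / 100)


def responding_server(mac_hash: int, first_server_percent: int = 50) -> str:
    """Return which server answers a client whose MAC address hashes to mac_hash."""
    if not 1 <= mac_hash <= 256:
        raise ValueError("hash bucket must be between 1 and 256")
    boundary = buckets_for_first_server(first_server_percent)
    # Buckets 1..boundary belong to the first server, the rest to the second.
    return "first" if mac_hash <= boundary else "second"


print(responding_server(128))  # first
print(responding_server(129))  # second
```

Because both servers compute the same hash and apply the same boundary, exactly one of them considers itself the owner of any given client.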

Figure 1: Load Balance Ratio in a Failover Relationship

Handling of IP address pool in load balance failover

The free IP addresses of each failover scope are also distributed in the same proportion as the load balancing ratio. For example, take a failover scope 10.10.10.0/24 with an IP address range of 10.10.10.1 through 10.10.10.250. Suppose all IP addresses from 10.10.10.1 to 10.10.10.50 in this scope are leased out and all IP addresses from 10.10.10.51 onward are free. In this situation, assuming the load balance ratio is 50:50, the IP addresses 10.10.10.51 through 10.10.10.150 are apportioned to the first server and 10.10.10.151 through 10.10.10.250 to the second server. A client requesting a new lease whose MAC address hash falls within the hash buckets of the first server would get the IP address 10.10.10.51, and so on. If the client’s MAC address hash falls within the hash buckets of the second server, the client gets the IP address 10.10.10.151.

As you can see from this example, when a scope is configured for failover, the two failover servers grant new IP address leases from two different portions of the scope’s IP address range. This is in contrast to a standalone server, which proceeds sequentially through the free IP address pool of a scope, starting with the first free IP address, to give out new leases.
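The apportioning in the example above can be reproduced with a short Python sketch. It assumes the free addresses form one contiguous run that is split into a lower and an upper block, as in the example; the function name is hypothetical and the real server's internal bookkeeping is not described in this article.

```python
from ipaddress import IPv4Address


def split_free_pool(first_free: str, last_free: str, first_server_percent: int = 50):
    """Split a contiguous run of free addresses between two servers by ratio.

    Assumes each server ends up with at least one address.
    """
    start = int(IPv4Address(first_free))
    end = int(IPv4Address(last_free))
    total = end - start + 1
    first_count = total * first_server_percent // 100
    # The lower block goes to the first server, the upper block to the second.
    first_pool = (str(IPv4Address(start)), str(IPv4Address(start + first_count - 1)))
    second_pool = (str(IPv4Address(start + first_count)), str(IPv4Address(end)))
    return first_pool, second_pool


first_pool, second_pool = split_free_pool("10.10.10.51", "10.10.10.250")
print(first_pool)   # ('10.10.10.51', '10.10.10.150')
print(second_pool)  # ('10.10.10.151', '10.10.10.250')
```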

Figure 2: IP Address Lease view of a Failover Scope 

As clients request new leases, the free IP address pool of one server may get depleted faster than the other’s, depending on the MAC addresses of the clients. To ensure that the free IP address pool stays apportioned as per the load balancing ratio, the primary server checks the distribution of the free IP pool every 5 minutes and transfers ownership of IP addresses from itself to the partner server, or vice versa, using server-to-server failover protocol messages (binding updates). This is referred to as periodic rebalancing of the free IP address pool.
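As a rough illustration of the arithmetic behind a rebalancing pass, the sketch below computes how many free addresses would have to change owner to restore the configured ratio. The function name and the rounding are assumptions; the actual transfer is done by the servers through binding update messages and is not modeled here.

```python
def addresses_to_transfer(free_first: int, free_second: int, first_server_percent: int = 50) -> int:
    """Free addresses to move so the pools match the load balance ratio.

    A positive result means the first server hands that many addresses to the
    second server; a negative result means the transfer goes the other way.
    """
    total_free = free_first + free_second
    target_first = total_free * first_server_percent // 100
    return free_first - target_first


# First server has 20 free addresses left, second server still has 100.
print(addresses_to_transfer(20, 100))  # -40: the second server gives 40 to the first
```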

You can get the number of free IP addresses (and the percentage of the free IP pool) on each server for a failover scope by viewing the scope statistics. The fields Addresses Available (this Server's Pool) and Addresses Available (Partner Pool) indicate the number of free IP addresses owned by each server for the specific scope. You can view the scope statistics in the DHCP MMC by right-clicking the failover scope and clicking Display Statistics. You can also use the PowerShell cmdlet Get-DhcpServerv4ScopeStatistics with the -Failover switch to get the same information in PowerShell. The two additional fields shown in the statistics, Addresses granted (this Server's Pool) and Addresses granted (Partner Pool), show the number of IP addresses leased out by each server.

Figure 3: Statistics for a scope in Failover Relationship

Load balancing operation in various failover states

When the failover relationship is in Normal state, the hash bucket algorithm is applied for serving every DHCP client request. In communication-interrupted and partner-down states (i.e. when the partner server is unreachable or has gone down), the hash bucket algorithm is not employed for servicing client requests, and the server responds to all clients to ensure service continuity.

Even in Normal state, a server responds to a client outside its own hash buckets if the client has been retransmitting the same request for a while. The server determines that a client has been retransmitting based on the secs field in the DHCP client request. RFC 2131 defines the secs field as “seconds elapsed since client began address acquisition or renewal process”. If the secs field in the client request is greater than 6, the DHCP server responds to the client even if the hash of the client MAC address does not fall within the hash buckets of the server. This approach caters to the scenario where the server which actually owns the hash bucket for that client is down but the relationship state is still Normal (there is a lag of 30 seconds between the network connection or the server going down and the partner server detecting it).
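Putting the rules of this section together, the decision whether a server answers a given request can be sketched as follows. The enum, the constant, and the function name are hypothetical names chosen for this illustration; only the states and the 6 second threshold come from the description above.

```python
from enum import Enum


class FailoverState(Enum):
    NORMAL = "normal"
    COMMUNICATION_INTERRUPTED = "communication-interrupted"
    PARTNER_DOWN = "partner-down"


# Threshold on the secs field of the client request, as described above.
SECS_THRESHOLD = 6


def should_respond(state: FailoverState, owns_hash_bucket: bool, secs: int) -> bool:
    """Decide whether this server answers a client request."""
    if state is not FailoverState.NORMAL:
        # Partner unreachable or down: serve every client for continuity.
        return True
    if owns_hash_bucket:
        # Normal state and the client's MAC hash falls in this server's buckets.
        return True
    # Normal state, but the client has been retransmitting for a while.
    return secs > SECS_THRESHOLD


print(should_respond(FailoverState.NORMAL, owns_hash_bucket=False, secs=2))  # False
print(should_respond(FailoverState.NORMAL, owns_hash_bucket=False, secs=8))  # True
```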

Most of the details shared in this article are not something that a DHCP administrator has to worry about. However, if you ever wondered how failover works under the hood (and most people do!), now you know!

Team DHCP

Comments

  • Anonymous
    January 01, 2003
    Jobish, please look at the description of MCLT as well as the DHCP examples section in the "Understand and deployment guide" for DHCP failover: http://technet.microsoft.com/en-us/library/dn338985.aspx

  • Anonymous
    January 01, 2003
    When a client sends a request for a new lease, it gets a lease for the MCLT duration. When the client attempts to renew the lease at half the lease period, i.e. MCLT/2, it is given the scope lease duration if DHCP failover is in NORMAL state. This is as per the DHCP failover protocol. This does increase the traffic from new clients, but given the scalability of Windows DHCP server, it should not pose any deployment problem.

  • Anonymous
    January 01, 2003
    Joe, yes, client 1 will get the same IP address again. There are two possible cases:

    1. DHCP1 synced the IP address 10.2.3.4 to DHCP2 before it went down. In this case, the client will be given the full lease duration configured for the scope.
    2. DHCP1 went down before syncing the IP address 10.2.3.4 to DHCP2. In this case too, the client will be able to renew, but the lease duration will be shorter: the value configured for MCLT.

    There is another client behavior to be aware of here. At half the lease period, the client attempts to renew the lease. This is normal DHCP client behavior as per the DHCP protocol. The renew message is unicast, so in the scenario above it is directed to DHCP1, which is down, and there is no response. At 7/8th of the lease period, the client broadcasts the renew request message. DHCP2 sees this message and responds to the renew request.

  • Anonymous
    January 01, 2003
    John, when one server goes down, the second server (in communication interrupted state) will renew all existing clients, including clients which were earlier served by the server which went down. I think this addresses your concern?
    There is a different aspect: the second server will be serving "new leases" from only its 50% of the "free" IP addresses in the scope. However, once the second server moves to "Partner down" state, it will take over 100% of the "free" IP addresses in the scope.

  • Anonymous
    January 01, 2003
    Thomas, please see the blog article "DHCP Failover using PowerShell" at - blogs.technet.com/.../dhcp-failover-using-powershell.aspx

  • Anonymous
    January 01, 2003
    Hello DHCP Team, I had issues with DHCP Snooping as described in http://support.microsoft.com/kb/2978225, but instead of reducing the number of servers on the switches, I changed the mode from Load Balance to Hot Standby, and so far I have not detected packets being dropped by DHCP Snooping. If this is actually a valid configuration that keeps both features fully functional, you could consider including it in the article as an additional solution.

  • Anonymous
    January 01, 2003
    yongfoo, you can configure the address rebalancing interval using the registry value DhcpFailoverAddrRebalacingTimeInt. You can create this under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\DHCPServer\Configuration. You need to specify the value in seconds.

  • Anonymous
    January 01, 2003
    Andrew, it's not clear what exact behavior you are referring to. A DHCP failover scope can be updated on either of the servers. However, after any update, you need to invoke "Replicate scope" from the server on which the update was performed.
    If, say, you want to remove server 2, perform "deconfigure failover" from the server 1 MMC. The failover scopes will then be removed from server 2 and retained only on server 1.

  • Anonymous
    January 01, 2003
    Hi Andrew, once the DHCP server service on the second server starts, it will automatically sync up with the first server and make its lease database up to date. After that, it will enter NORMAL failover state and start servicing clients. At that point, they will start sharing the load 50:50. There is no admin intervention required. It is expected to just work!

  • Anonymous
    September 17, 2012
    Question: If client1 gets an IP address of 10.2.3.4 from DHCP1 and then DHCP1 goes down but is in a load balance failover pair with DHCP2 and client1 does a renew - will client1 get the same IP address again? In other words - Does DHCP2 have IP to MAC address mapping for both pairs in the load balance?

  • Anonymous
    April 21, 2013
    I have a question here. If the failover pair stays in Normal state, which lease time will the client obtain: MCLT or the time defined in the scope configuration? My test result is MCLT, which I think is unreasonable as it doubles the DHCP request traffic. According to most failover documents, MCLT should ONLY be active when failover enters COMMUNICATION-INTERRUPTED or PARTNER-DOWN state. I'm confused.

  • Anonymous
    July 31, 2013
    It would be useful to show the PowerShell Cmdlets used to set this up.

  • Anonymous
    November 04, 2013
    Could you please provide more explanation of MCLT? Does it have any relation to the DHCP lease period? I am confused.

  • Anonymous
    November 06, 2014
    If one server goes down for an extended period, is there a setting to make 100% of the scope available to the good server instead of 50% so that all 200 of our hosts get an ip address instead of only 100? Thanks.

  • Anonymous
    January 28, 2015
    Assuming one DHCP server in a load-balanced pair has crashed and the second server is supporting 100% of the clients, what is the process to recover the second server and re-establish the 50%/50% split? Is this process documented somewhere on TechNet?

  • Anonymous
    February 10, 2015
    Thanks for the quick reply - I will double check how the customer has configured the load-balancing relationship and make sure they invoke the 'replicate scope' function after any changes.

  • Anonymous
    February 10, 2015
    Can the "5 minutes" interval for the periodic rebalancing of the free IP address pool be configured to a different value?

  • Anonymous
    May 28, 2015
    Hi - I was wondering: if you run a 50:50 failover with only 1 available address, which server would get it? And if, as in my case, both DHCP servers state that the other one has it, what should I do?

  • Anonymous
    August 24, 2015
    Hey,

    I've got a failover configuration between 2 Server 2012 R2 DC's setup exactly like this.
    Firstly, am I correct in saying that the scope options, lease information, and everything else is automatically replicated between the 2 servers, EXCEPT for manual reservations? Secondly, is it only possible to automate reservation replication by using a script (as outlined in this article - http://blogs.technet.com/b/teamdhcp/archive/2012/11/27/automatic-syncing-of-scope-configuration-changes-between-2-dhcp-failover-servers.aspx), or is there another option?

  • Anonymous
    August 24, 2015
    ZA_Lad_84,
    Lease information is replicated between the 2 DHCP failover servers using the DHCP failover synchronization protocol. Scope options, reservations and other "configuration" are replicated using one of the following:
    - using IPAM 2012R2 to manage DHCP failover which performs the option, reservation update on both DHCP servers
    - using the auto sync script in the blog mentioned in your comment above
    - "Replicate" option in DHCP MMC/PowerShell (Manual action by admin)

  • Anonymous
    August 24, 2015
    teamdhcp
    Thanks for the quick response. I hadn't heard of the first option you listed - IPAM. I'll look into that and see how that works for us, thanks!

  • Anonymous
    September 15, 2015
    Hello,

    I recently experienced an issue with DHCP LB environment responsible for approximately 27 scopes in 50/50 mode. All clients started receiving lease times equivalent to MCLT, things did not return to normal until we disabled failover on all scopes. I suspect there was a communication error between DHCP1 and DHCP2 that invoked "communication interrupted" state but both were online continuing to service client computers. Could it be that while in this state, all clients receive MCLT lease times when they renew from their respective servers causing this situation to occur? Should I focus on ensuring good communication between these servers over port 647?

  • Anonymous
    November 26, 2015
    I have a quick question here. In communication interrupted state, where one peer has gone down and the client comes with an elapsed time greater than 6, will the working peer provide a lease from its own pool?

    The RFC says that in communication interrupted state, each server will provide new leases from its own share of IP addresses.

  • Anonymous
    November 26, 2015
    Sundar, when you say "the client comes with an elapsed time ...", that seems to imply a client which already has a lease and is trying to renew. It's not a case of a new lease. In this case, the DHCP server which is running in communication interrupted state will renew the lease for the client with a lease duration of MCLT.