DHCP Failover fixes in KB 2919393 for Windows Server 2012 and KB 2919355 for Windows Server 2012 R2

The following DHCP Server fixes have been released as part of rollup update KB 2919393 for Windows Server 2012 and KB 2919355 for Windows Server 2012 R2.

 

Issue no 1: DHCP Failover server issues reserved IP address to a client with a different MAC address

This issue occurs when a DHCP scope that is part of a failover relationship has an exclusion range, and a reservation exists in the scope for one of the IP addresses within that exclusion range. The sequence of events leading to this issue is as follows:

  • The reserved IP address is leased out to the reserved client.
  • That client releases the IP address by sending a DHCP RELEASE message. This causes the DHCP server to mark the IP address as available.
  • A different client sends a DISCOVER packet to the DHCP server. The DHCP server OFFERs the reserved IP address to this client even though the MAC address of the client does not match the one in the reservation.
  • Now the client sends a REQUEST packet to the server.
  • The DHCP server now sends back a NAK message because the IP address is reserved for a different client.
  • The client starts the DORA (DISCOVER, OFFER, REQUEST, ACK) sequence again, with the same result.
  • The client is perpetually stuck in a DISCOVER-OFFER-REQUEST-NAK cycle and never gets an IP address.

Over time, this may lead to many clients being stuck in the DISCOVER-OFFER-REQUEST-NAK cycle and losing network connectivity.
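The cycle described above can be illustrated with a toy model. This is a minimal sketch and not the Windows DHCP Server implementation: the `ToyServer` class, its fields, the `fixed` flag, and the addresses and MAC strings are all hypothetical, introduced only to show why offering a released reserved address to a non-matching client can never end in an ACK.

```python
from dataclasses import dataclass


@dataclass
class ToyServer:
    """Hypothetical model of the offer/ack decision; not the real server logic."""
    reservations: dict   # ip -> MAC the address is reserved for
    free: list           # addresses currently marked as available
    fixed: bool = False  # assumption: True models a server that checks reservations at OFFER time

    def offer(self, mac):
        for ip in self.free:
            owner = self.reservations.get(ip)
            # Only the "fixed" model skips addresses reserved for another MAC.
            if self.fixed and owner not in (None, mac):
                continue
            return ip
        return None

    def request(self, mac, ip):
        owner = self.reservations.get(ip)
        if owner not in (None, mac):
            return "NAK"  # the reservation is honoured at REQUEST time
        self.free.remove(ip)
        return "ACK"


def try_lease(server, mac, max_attempts=5):
    """Repeat DISCOVER/OFFER/REQUEST and return the list of server replies."""
    replies = []
    for _ in range(max_attempts):
        ip = server.offer(mac)
        if ip is None:
            replies.append("NO_OFFER")
            break
        reply = server.request(mac, ip)
        replies.append(reply)
        if reply == "ACK":
            break
    return replies


reserved = {"10.0.0.5": "AA"}  # 10.0.0.5 is reserved for client "AA" and has been released
buggy = ToyServer(reservations=reserved, free=["10.0.0.5", "10.0.0.20"])
patched = ToyServer(reservations=reserved, free=["10.0.0.5", "10.0.0.20"], fixed=True)

print(try_lease(buggy, "BB"))    # every attempt ends in NAK
print(try_lease(patched, "BB"))  # the reserved address is skipped and the client gets an ACK
```

In the buggy model, client "BB" is offered 10.0.0.5 on every attempt and is refused every time, which is the loop the list above describes.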

Issue no 2: Some IP addresses are perpetually stuck in BAD_ADDRESS state on one of the DHCP failover servers while in Active state on the other server

The DHCP Server admin channel contains BINDING ACK reject events 20291 and 20292 for these IP addresses. The sequence of events that leads to this issue is as follows:

  • A DHCP server is migrated to Windows Server 2012 or Windows Server 2012 R2 without migrating the lease records. The new DHCP server does not have any leases in its database.
  • The server is configured for failover, leading to a state where neither failover partner has any lease records.
  • A new client requests an IP address. One of the failover partners leases out the first free IP address in its database. This IP address is already in use by another client, which obtained it from the DHCP server before the migration.
  • The client performs a duplicate address detection test which fails. The client declines the lease (DHCP DECLINE) and hence the address is marked as BAD_ADDRESS on the DHCP server.
  • The same update is sent to the partner server and that server also marks that IP address as a BAD_ADDRESS.
  • When the client to which the IP address was originally issued sends a RENEW request to one of the failover partners, the server sends an ACK and marks the IP address as Active.

Now, when the update from BAD_ADDRESS to Active state is sent to the partner server, the BINDING update is rejected, leading to an inconsistency between the two DHCP servers.
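One way to spot this kind of inconsistency is to compare the lease state each partner reports for every address. The sketch below assumes the lease states have already been exported from each server into a plain mapping of IP address to state string; the function name, the state labels, and the sample data are hypothetical.

```python
import ipaddress


def find_divergent_leases(server_a, server_b):
    """Return {ip: (state_on_a, state_on_b)} for addresses whose state differs.

    Both arguments map an IP address string to a state string.
    An address that is missing on one side is reported with state None.
    """
    divergent = {}
    all_ips = sorted(set(server_a) | set(server_b), key=ipaddress.ip_address)
    for ip in all_ips:
        state_a, state_b = server_a.get(ip), server_b.get(ip)
        if state_a != state_b:
            divergent[ip] = (state_a, state_b)
    return divergent


# Hypothetical export: the first server accepted the RENEW, the second still holds BAD_ADDRESS.
partner_1 = {"10.0.0.10": "Active", "10.0.0.11": "Active"}
partner_2 = {"10.0.0.10": "BAD_ADDRESS", "10.0.0.11": "Active"}

print(find_divergent_leases(partner_1, partner_2))
```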

Both issues have been fixed in rollup update KB 2919393 for Windows Server 2012 and KB 2919355 for Windows Server 2012 R2.

The DHCP failover relationship does not need to be recreated to apply this patch. A DHCP server restart is required.

UPDATE: Some customers reported event 20291 being logged even after applying KB 2919355. KB 2955135 (https://support.microsoft.com/kb/2955135) has been released to address this issue with DHCP failover. Customers who have DHCP failover deployments on Windows Server 2012 R2 and are experiencing event 20291 should apply this patch.

UPDATE:

We are aware of a remote scenario that is not addressed by KB 2919355. It occurs when a client sends multiple request packets within the same second and the second request is a release. You will notice the following pattern in the DHCP audit logs:

 

10,09/05/14,17:49:29,Assign,10.218.2.76, Test,547F544A05CC,,362119506,0,,,,,,,,,0

12,09/05/14,17:49:29,Release,10.218.2.76,Test,547F544A05CC,,377053407,0,,,,,,,,,0
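A short script can scan audit log lines for this pattern. This is a sketch under stated assumptions: the field order (ID, date, time, description, IP address, host name, MAC address) is inferred from the two sample lines above, the IDs 10 and 12 are taken from those same lines, and the function name is hypothetical.

```python
import csv

ASSIGN, RELEASE = "10", "12"  # event IDs as they appear in the sample lines


def same_second_assign_release(lines):
    """Return (ip, mac, date, time) keys where an Assign is followed by a
    Release for the same address and MAC with an identical timestamp."""
    seen_assign = set()
    hits = []
    for row in csv.reader(lines):
        if len(row) < 7:
            continue  # skip blank lines and anything that is not a log record
        event_id, date, time, _desc, ip, _host, mac = (f.strip() for f in row[:7])
        key = (ip, mac, date, time)
        if event_id == ASSIGN:
            seen_assign.add(key)
        elif event_id == RELEASE and key in seen_assign:
            hits.append(key)
    return hits


sample = [
    "10,09/05/14,17:49:29,Assign,10.218.2.76, Test,547F544A05CC,,362119506,0,,,,,,,,,0",
    "12,09/05/14,17:49:29,Release,10.218.2.76,Test,547F544A05CC,,377053407,0,,,,,,,,,0",
]
for ip, mac, date, time in same_second_assign_release(sample):
    print(f"{ip} ({mac}) was assigned and released at {date} {time}")
```

To run it against a real log, pass an open file object instead of the `sample` list.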

 

In DHCP failover, whenever one of the servers responds with an ACK to a client request, the change is written to the database of that server with a timestamp and is sent to the partner for synchronization. This timestamp has a granularity of one second. If the partner server receives two synchronization updates from the other DHCP server with the same timestamp, it rejects the second update. This is as per the DHCP failover IETF draft (https://tools.ietf.org/html/draft-ietf-dhc-failover-12). If the second request from the client is a release, the failover servers go out of sync: the partner server rejects the synchronization update, but the first server (which received the client request) applies the update to its own database.

This causes the failover servers to generate events 20291 and 20292.
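The effect of one-second timestamps can be shown with a toy model. It is only a sketch of the rule described above (a second update carrying the same timestamp is rejected), not the failover protocol itself: the `ToyPartner` class, the state names, and the times are hypothetical.

```python
class ToyPartner:
    """Hypothetical partner that rejects an update whose whole-second
    timestamp equals the one it last accepted for the same address."""

    def __init__(self):
        self.bindings = {}  # ip -> (state, timestamp in whole seconds)

    def apply_update(self, ip, state, when):
        stamp = int(when)  # truncate to one-second granularity
        current = self.bindings.get(ip)
        if current is not None and current[1] == stamp:
            return False  # same timestamp as the previous update: rejected
        self.bindings[ip] = (state, stamp)
        return True


ip = "10.218.2.76"
primary = {}            # the server that received the client requests
partner = ToyPartner()

# Two client requests arrive within the same second; the second is a release.
for state, when in [("ACTIVE", 1000.10), ("RELEASED", 1000.80)]:
    primary[ip] = state  # the first server always applies its own change
    accepted = partner.apply_update(ip, state, when)
    print(state, "accepted by partner" if accepted else "rejected by partner")

print("primary:", primary[ip], "| partner:", partner.bindings[ip][0])
```

After the loop the two models disagree about the address, which is the out-of-sync condition described above.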

This is considered a remote scenario because a client sending two requests within the same second seems unlikely. However, we would like to understand whether customers are seeing this scenario in their deployments. Customers seeing this issue should contact Microsoft support, which can collect the logs required to diagnose the issue further.

Comments

  • Anonymous
    January 01, 2003
    Thanks for the details, Tom. We are actively investigating this issue and will get back to you shortly with our findings.
  • Anonymous
    January 01, 2003
    The comment has been removed
  • Anonymous
    January 01, 2003
    Hi Volk, the patch for 2012R2 has been released as part of KB 2919355. See http://support.microsoft.com/kb/2919355
    I assume #2 is a non issue now that the fixes are released? Let us know.
  • Anonymous
    January 01, 2003
    Thanks David! :-) Appreciate the note!
  • Anonymous
    January 01, 2003
    Hi Mike, the fixes for 2012 R2 have been released. See http://support.microsoft.com/kb/2919355.
    The workaround for issue #2 is to migrate the scopes with leases. But that would require you to redo the migration and recreate the failover relationship. Now that the fixes are released, you should not be needing the workaround.
  • Anonymous
    January 01, 2003
    Hi Raj
    The configuration looks alright to us from the DHCP relay agent perspective. Did you get it reviewed by a Cisco expert? If not, we would recommend that you do so.
    Thanks
  • Anonymous
    January 01, 2003
    Hi Tom
    Can you provide some more details about your setup and the conditions under which you are facing this issue? Ideally, the choice of Windows or non-Windows devices should not make any difference with respect to this issue.
    Thanks
  • Anonymous
    January 01, 2003
    Hi Peter, the KB 2919393 will be updated with details of the DHCP failover fixes. There are no other DHCP server fixes in 2919393.
  • Anonymous
    January 01, 2003
    I have the same issues as John. After applying KB2919355 on both 2012 R2 machines there are still several events with ID 20292 (Reject Reason Unknown) on both sides. Is there already a solution or a workaround in place?
  • Anonymous
    January 01, 2003
    Hi Mike, the BAD_ADDRESS entries will take some time to resolve themselves. When the client which has this IP address leased renews its lease, the BAD_ADDRESS entry for that specific IP address will get resolved.
  • Anonymous
    January 01, 2003
    Hi David, no, it's not considered a bad practice and the blog does not say it's a bad practice either. It's just that there was a bug with this kind of configuration and failover, which has now been fixed for Windows Server 2012.
  • Anonymous
    January 01, 2003
    Georg, we are analyzing the issue that Mike reported. Will post an update once the investigation completes.
  • Anonymous
    January 01, 2003
    Updated the article with details for patch update for Windows Server 2012 R2.
  • Anonymous
    January 01, 2003
    Hello,

    Is there any list of ALL released DHCP server updates/hotfixes for Windows Server 2012 RTM?
  • Anonymous
    January 01, 2003
    Hi LE2Strat, DHCP failover is supported since Windows Server 2012. These fixes are applicable to DHCP failover. So, Windows Server 2008 R2 SP1 is not affected.
  • Anonymous
    January 01, 2003
    John, Heyko and others reporting event 20291 in their DHCP failover deployments:

    KB 2955135 (http://support.microsoft.com/kb/2955135) has been released to address events 20291. Please apply this patch.
  • Anonymous
    January 01, 2003
    Hi Mike, the plan for release of these fixes for DHCP server in 2012 R2 is in the works. We will post an update once it is finalized.
  • Anonymous
    January 01, 2003
    thanks
  • Anonymous
    February 26, 2014
    Thanks, I noticed a million of these events on my setup, mostly from the WIFI scopes, will test the R2 patch once is out
    Thx
    Martin
  • Anonymous
    February 27, 2014
    Why can't I find anything regarding these two fixes in KB2919393?

    Is there other fixes regarding the DHCP server 2012 HA in KB 2919393?

    The two problems mentioned above have been a problem since the switch to DHCP server 2012 HA, so they are very welcome.

    Regards,

    Peter Jakobsen
    phj@aalborg.dk

  • Anonymous
    March 06, 2014
    Any timeframe as to when the hotfix will be available for DHCP server 2012 R2?
  • Anonymous
    March 25, 2014
    So is Issue #1 considered "bad practice"? ie - reserving IPs in an exclusion range? I was thinking about doing that just now and found this blog about it. I'm trying to find a way to reserve a "block" of addresses in my pool for two computer labs, without having the addresses scattered throughout the existing pool. Reservations in an exclusion seemed the logical way to do it, but maybe not...
  • Anonymous
    March 26, 2014
    Sweet. Thanks for the info. You guys have done a great job on the Windows DHCP server. I just switched from ISC's dhcpd to Windows last year, and can't say that I miss it much. Failover is totally cool...
  • Anonymous
    April 07, 2014
    Is there any available workaround to fix the issue 2? thanks
  • Anonymous
    April 08, 2014
    The comment has been removed
  • Anonymous
    April 10, 2014
    Great news, thanks.
  • Anonymous
    April 23, 2014
    Hi, after applying the hotfix KB2919355 on windows server 2012 R2 DHCP servers, both of primary and partner. i still can see a lot ip with bad_address names from the DHCP console. i assume that the IPs with bad_address are the previous ones before appling the hotfix. could you tell me what is the best practice to resolve it? remove those bad ones from address lease spaces or other ideas. thanks a lot
  • Anonymous
    April 28, 2014
    is it normal that I still got Event 20291 and 20292 after applying the hotfix KB2919355 on windows server 2012 R2 DHCP servers?
    Thanks for your helps very much.
  • Anonymous
    May 06, 2014
    What if we are seeing issue #1 with our 2008 R2 SP1 DHCP server? Is there a hotfix coming for that, or is that version not affected?
  • Anonymous
    May 12, 2014
    Dear Sir or Madam,
    as Mike already asked about three posts ago:
    We applied the patch in KB2919355 on both partner servers running Windows Server 2012 R2 and we still get 20291/20292 errors.
    Further, the leases mentioned in those errors won't be synced to the partner server.
    From my Point of view this is something to take care about.
    @teamdhcp: Would you be so Kind to give a hint on this issue?
  • Anonymous
    May 19, 2014
    Same issue as Mike and Georg.
  • Anonymous
    May 28, 2014
    Hi DHCP team,

    Thx. for the fix. We have been running this exact configuration in an environment for more than one year and never noticed the issue until today. Found the fix and it seems to resolve the issue.

    Cheers!
  • Anonymous
    June 23, 2014
    The comment has been removed
  • Anonymous
    June 26, 2014
    Hi again,

    Thanks for the response.

    We've been doing some digging into our problems here. Using your network monitor software, i've found that the DHCP discovery is reaching our server, but when it responds with an offer of an IP address, the PC on the other end does not receive the offer and it keeps sending discovery packets.

    We are looking at our network to see if we can find something there now, we run a Cisco environment. We started looking at the network side because we found that on some switches, the PC will get an IP address just fine. We've compared configs on the 2 switches and have found nothing of note. It's very strange.

    Do you folks have any "recommended" settings on the Cisco switches that you would know of? We have Nexus 5500 cores and C2960X edge devices. Also, here is the config we got for the interfaces on the Nexus cores:

    interface VLAN
    no shutdown
    ip access-group ****** in
    no ip redirects
    ip address *************/24
    hsrp version 2
    hsrp 25
    preempt
    priority 250
    ip ********************
    ip dhcp relay address ******* : This is the IP address of one of the dhcp servers
    ip dhcp relay address ******* : IP address of the SCCM PXE server
    ip dhcp relay address ******* : This is the IP address of the other dhcp server which is a partner of it.
  • Anonymous
    July 17, 2014
    I just received a complaint that we are experiencing issue #2. I checked and we already have this patch installed on both 2012 servers. Both servers reside on the same subnet and I am load balancing many zones. This issue mostly happens I was told when we do firmware upgrades to cisco wireless controllers where all the APs reboot or mass resets of ShoreTel phones. Networks with just Windows PCs seem to hum along fine.
  • Anonymous
    July 17, 2014
    The comment has been removed
  • Anonymous
    August 15, 2014
    Hello,
    still getting a lot of 20291 and 20292 events with the 2919355 patch applied on 2012 R2 - reject reason unknown, some leases do not replicate. Any new insights into the case? Looks like a similar problem to the one Mike, Georg and Byron are/were experiencing.
  • Anonymous
    October 12, 2015
    The comment has been removed
  • Anonymous
    October 12, 2015
    Hello Rajk, no, this is not expected behavior. The primary should continue to lease IP addresses even when 10% of IP addresses are available. What is the number of IP addresses in this scope?
  • Anonymous
    October 12, 2015
    The scope statistics shows 24 Ips are available . However, it is not responding to the DHCP request of clients.
  • Anonymous
    October 13, 2015
    Hi Teamdhcp, we have KB2919355 installed already on our DHCP servers, but we still see the error Event ID 20292. Is there any fix for this error?
  • Anonymous
    October 13, 2015
    Hi Teamdhcp, Reply to your question on number of IP addresses in the scope ... we have around 22 IPs free out of 22x
  • Anonymous
    October 14, 2015
    Hi Raj, if your scope has 220 IPs and you are using 5% as the reserve address percentage, 11 IP addresses will be reserved for the standby server. So, the primary should be able to lease up to 209 IP addresses, i.e. until only 11 free IP addresses are left in the scope. If your observation is different from this, please contact Microsoft support so that they can look into why the behavior is not as expected.
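
The arithmetic in the reply above can be checked with a few lines of Python. This is a sketch only: the function name is hypothetical, and rounding the reserved count down is an assumption, since the thread only gives a case where the percentage divides evenly.

```python
def primary_lease_capacity(total_addresses, reserve_percent):
    """Return (reserved_for_standby, leasable_by_primary).

    Assumption: the reserved count is rounded down to a whole address.
    """
    reserved = total_addresses * reserve_percent // 100
    return reserved, total_addresses - reserved


reserved, leasable = primary_lease_capacity(220, 5)
print(f"reserved for standby: {reserved}, leasable by primary: {leasable}")
```

For a scope of 220 addresses with a 5% reserve, this gives 11 reserved addresses and 209 addresses that the primary can lease, matching the figures in the reply.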