Why Exchange 2013 CU6+ uses out-of-site DCs/GCs

This no longer applies to on-premises environments post Exchange 2013 CU11, and it is not valid for Exchange 2016!

We have had a few escalations from customers who noticed heavy traffic between Exchange 2013 CU6+ servers and out-of-site DCs/GCs.

When we run Get-ExchangeServer -Status, we can see that Exchange uses out-of-site DCs, while at the same time event 2080 shows that other in-site DCs are available.

Here is how it looks in Exchange 2010 and Exchange 2013 RTM – CU5:

 

I used a topology with 4 in-site DCs and 1 out-of-site DC.

From event 2080:

Process Microsoft.Exchange.Directory.TopologyService.exe (PID=2276). Exchange Active Directory Provider has discovered the following servers with the following characteristics:

(Server name | Roles | Enabled | Reachability | Synchronized | GC capable | PDC | SACL right | Critical Data | Netlogon | OS Version)

In-site:

DC001.CU1.com CDG 1 7 7 1 0 1 1 7 1

dc2.CU1.com CDG 1 7 7 1 0 1 1 7 1

DC3.CU1.com CDG 1 7 7 1 0 1 1 7 1

dc4.CU1.com CDG 1 7 7 1 0 1 1 7 1

Out-of-site:

dc5.CU1.com CDG 1 7 7 1 0 1 1 7 1
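If you need to compare many of these rows, a small parser can help. The following is only a sketch in Python: it maps each value to the column names from the event 2080 header above. The names DcRow, parse_2080_row and looks_reachable are hypothetical helpers, and treating a non-zero Reachability value as "reachable" is an assumption for illustration, not the rule Exchange itself applies.

```python
from dataclasses import dataclass


# Field order follows the event 2080 header:
# Server name | Roles | Enabled | Reachability | Synchronized | GC capable |
# PDC | SACL right | Critical Data | Netlogon | OS Version
@dataclass
class DcRow:
    name: str
    roles: str
    enabled: int
    reachability: int
    synchronized: int
    gc_capable: int
    pdc: int
    sacl_right: int
    critical_data: int
    netlogon: int
    os_version: int


def parse_2080_row(line: str) -> DcRow:
    """Parse one server row such as 'dc2.CU1.com CDG 1 7 7 1 0 1 1 7 1'."""
    parts = line.split()
    if len(parts) != 11:
        raise ValueError(f"expected 11 fields, got {len(parts)}: {line!r}")
    name, roles, *flags = parts
    return DcRow(name, roles, *(int(flag) for flag in flags))


def looks_reachable(row: DcRow) -> bool:
    # Assumption for illustration only: non-zero Reachability means the DC answered.
    return row.reachability != 0
```

For example, parse_2080_row("DC001.CU1.com CDG 1 7 7 1 0 1 1 7 1") gives a row whose reachability and netlogon fields are both 7.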

 

Get-ExchangeServer exch5-cu1 -Status

CurrentDomainControllers        : {dc2.CU1.com, DC001.CU1.com, dc4.CU1.com, DC3.CU1.com}

CurrentGlobalCatalogs           : {dc2.CU1.com, DC001.CU1.com, dc4.CU1.com, DC3.CU1.com}

CurrentConfigDomainController   : DC001.CU1.com

 

>netstat -n | findstr 3268

We can see established connections with all 4 GCs
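Counting the GC connections by eye gets tedious when the list is long. Here is a minimal sketch in Python that counts established connections per remote address from saved netstat -n output. The function name established_gc_peers is hypothetical, and the sketch assumes the usual four-column Windows TCP row layout (Proto, Local Address, Foreign Address, State).

```python
from collections import Counter


def established_gc_peers(netstat_output: str, port: int = 3268) -> Counter:
    """Count ESTABLISHED TCP connections per remote host on the given remote port."""
    peers = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed row layout: Proto, Local Address, Foreign Address, State
        if len(fields) != 4 or fields[0] != "TCP" or fields[3] != "ESTABLISHED":
            continue
        host, _, remote_port = fields[2].rpartition(":")
        if remote_port == str(port):
            peers[host] += 1
    return peers
```

Unlike a plain findstr 3268, this only matches the remote port, so a local ephemeral port that happens to contain 3268 is not counted.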

 

Turn off DC4

Information MSExchange ADAccess 2070 Topology:

Process MSExchangeHMWorker.exe (ExHMWorker) (PID=3116).  Exchange Active Directory Provider lost contact with domain controller dc4.CU1.com.  Error was 0x34 (Unavailable) (Active directory response: The server is unavailable.).  Exchange Active Directory Provider will attempt to reconnect with this domain controller when it is reachable.

 

Get-ExchangeServer exch5-cu1 -Status

CurrentDomainControllers        : {DC001.CU1.com, DC3.CU1.com, dc2.CU1.com }

CurrentGlobalCatalogs           : {DC001.CU1.com, DC3.CU1.com, dc2.CU1.com}

CurrentConfigDomainController   : DC001.CU1.com

 

>netstat -n | findstr 3268

We can see established connections with 3 GCs

 

Turn off DC3

 

CurrentDomainControllers        : {DC001.CU1.com, dc2.CU1.com}

CurrentGlobalCatalogs           : {DC001.CU1.com, dc2.CU1.com}

 

>netstat -n | findstr 3268

We can see established connections with 2 In-Site GCs

 

Turn off DC2

 

CurrentDomainControllers        : {DC001.CU1.com}

CurrentGlobalCatalogs           : {DC001.CU1.com}

CurrentConfigDomainController   : DC001.CU1.com

 

>netstat -n | findstr 3268

We can see connections only to DC001

 

From event 2080:

Process Microsoft.Exchange.Directory.TopologyService.exe (PID=2276). Exchange Active Directory Provider has discovered the following servers with the following characteristics:

(Server name | Roles | Enabled | Reachability | Synchronized | GC capable | PDC | SACL right | Critical Data | Netlogon | OS Version)

In-site:

DC001.CU1.com CDG 1 7 7 1 0 1 1 7 1

dc2.CU1.com CDG 1 0 0 0 0 0 0 0 0

DC3.CU1.com CDG 1 0 0 0 0 0 0 0 0

dc4.CU1.com CDG 1 0 0 0 0 0 0 0 0

Out-of-site:

dc5.CU1.com CDG 1 7 7 1 0 1 1 7 1

 

In other words: we do not try to establish a connection to an out-of-site DC while at least one in-site DC is available.

 

What happens as soon as you update your servers to CU6+:

Get-ExchangeServer exch5-cu1 -Status

CurrentDomainControllers        : {dc2.CU1.com, DC001.CU1.com, dc4.CU1.com, DC3.CU1.com}

CurrentGlobalCatalogs           : {dc2.CU1.com, DC001.CU1.com, dc4.CU1.com, DC3.CU1.com}

CurrentConfigDomainController   : DC001.CU1.com

 

>netstat -n | findstr 3268

We can see established connections with all 4 GCs

Same as in RTM

 

Turn off DC4

Get-ExchangeServer exch5-cu1 -Status

CurrentDomainControllers        : {DC001.CU1.com, DC3.CU1.com, dc2.CU1.com }

CurrentGlobalCatalogs           : {DC001.CU1.com, DC3.CU1.com, dc2.CU1.com}

CurrentConfigDomainController   : DC001.CU1.com

 

>netstat -n | findstr 3268

We can see established connections with 3 GCs

Same as RTM

 

Turn off DC3

CurrentDomainControllers      : {DC001.CU1.com, dc2.CU1.com, dc5.CU1.com}

CurrentGlobalCatalogs         : {DC001.CU1.com, dc2.CU1.com, dc5.CU1.com}

CurrentConfigDomainController : DC001.CU1.com

NEW!!!

We established a connection to the out-of-site DC dc5.CU1.com.

This is by design. If the number of suitable in-site DCs is less than MinSuitableServer, which is 3 by default, out-of-site DCs will be used. Once the number of in-site DCs is at least MinSuitableServer again, out-of-site DCs should no longer be used.

Previously, when an Exchange process asked for domain controllers, the topology service returned servers from either the in-site list or the out-of-site list, never both. That is, as long as there was a single suitable DC in the in-site list, the topology service returned it and did not search the out-of-site list, no matter how many servers the client requested.

This could cause load imbalance, especially during site failover: the healthy domain controllers left in the site being failed out took much more load than the out-of-site DCs.

To fix this, a new configurable setting, MinSuitableServer, was introduced. The topology service first checks whether there are enough suitable servers in the in-site list. If not, it adds servers from the out-of-site list. A similar change was made in topology discovery, too.
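To make the difference concrete, here is a simplified model of the two behaviors in Python. It is not the topology service code: the function names are hypothetical, both lists are assumed to contain only DCs already judged suitable, and adding every out-of-site DC (rather than only enough to reach the minimum) is an assumption, since the lab above has just one out-of-site DC.

```python
def pick_servers_before_cu6(in_site, out_of_site):
    """One list or the other: out-of-site only when no in-site DC is suitable."""
    return list(in_site) if in_site else list(out_of_site)


def pick_servers_cu6(in_site, out_of_site, min_suitable_server=3):
    """In-site first; add out-of-site DCs when fewer than the minimum are suitable."""
    chosen = list(in_site)
    if len(chosen) < min_suitable_server:
        # Assumption: all out-of-site DCs are added, not just enough to reach the minimum.
        chosen += list(out_of_site)
    return chosen
```

With two in-site DCs left and the default of 3, pick_servers_cu6(["DC001", "dc2"], ["dc5"]) includes dc5, which matches what we saw after turning off DC3.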

 

How can we revert or configure this behavior?

 

If we really want to use in-site DCs only, even though there is just 1 available (as it was in 2010 or 2013 RTM-CU5), we can add an entry:

MinSuitableServer = "1"

in Microsoft.Exchange.Directory.TopologyService.exe.config:

In the <Topology> element:

<Topology MinimumPrefixMatch = "2"
          EnableWholeForestDiscovery = "true"
          MinSuitableServer = "1"   <----------ADD THIS VALUE
          ForestWideAffinityRequested = "true"/>
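If you have to make this change on several servers, it could be scripted. Below is a rough sketch in Python using the standard xml.etree.ElementTree module; it works on the config text, not on the file itself. The function name set_min_suitable_server is hypothetical, and the sketch assumes only that the file contains a <Topology> element as shown above. Back up the original file first: ElementTree does not preserve comments or the exact attribute layout when it writes the XML back out.

```python
import xml.etree.ElementTree as ET


def set_min_suitable_server(config_xml: str, value: int) -> str:
    """Return the config text with MinSuitableServer set on the <Topology> element."""
    root = ET.fromstring(config_xml)
    # iter() also yields the root itself, so this works wherever <Topology> sits.
    topology = next(root.iter("Topology"), None)
    if topology is None:
        raise ValueError("no <Topology> element found")
    topology.set("MinSuitableServer", str(value))
    return ET.tostring(root, encoding="unicode")
```

After writing the result back, the Microsoft Exchange Active Directory Topology service still has to be restarted for the new value to be picked up.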

 

I turned DC4 off, as we do not need it.

I also set MinSuitableServer = "2" and restarted the Microsoft Exchange Active Directory Topology service (MSExchangeADTopology); restarting the whole server works as well.

CurrentDomainControllers      : {DC3.CU1.com, dc2.CU1.com, DC001.CU1.com}

CurrentGlobalCatalogs         : {DC3.CU1.com, dc2.CU1.com, DC001.CU1.com}

CurrentConfigDomainController : dc2.CU1.com

 

Turn DC3 off

From event 2080:

Process Microsoft.Exchange.Directory.TopologyService.exe (PID=2504). Exchange Active Directory Provider has discovered the following servers with the following characteristics:

(Server name | Roles | Enabled | Reachability | Synchronized | GC capable | PDC | SACL right | Critical Data | Netlogon | OS Version)

In-site:

DC001.CU1.com CDG 1 7 7 1 0 1 1 7 1

dc2.CU1.com CDG 1 7 7 1 0 1 1 7 1

DC3.CU1.com CDG 1 0 0 0 0 0 0 0 0

dc4.CU1.com CDG 1 0 0 0 0 0 0 0 0

Out-of-site:

dc5.CU1.com CDG 1 7 7 1 0 1 1 7 1

 

[PS] C:\Windows\system32>Get-ExchangeServer Exch5-cu1 -Status | fl Current*

 

CurrentDomainControllers      : {DC001.CU1.com, dc2.CU1.com}

CurrentGlobalCatalogs         : {DC001.CU1.com, dc2.CU1.com}

CurrentConfigDomainController : dc2.CU1.com

 

Turn off DC2

 

CurrentDomainControllers      : {DC001.CU1.com, dc5.CU1.com}

CurrentGlobalCatalogs         : {DC001.CU1.com, dc5.CU1.com}

CurrentConfigDomainController : DC001.CU1.com

 

Start DC3

 

CurrentDomainControllers      : {DC001.CU1.com, DC3.CU1.com}

CurrentGlobalCatalogs         : {DC001.CU1.com, DC3.CU1.com}

CurrentConfigDomainController : DC001.CU1.com

 

So we returned to the in-site DC as soon as it became available.

 

Now set MinSuitableServer = "1"

 

CurrentDomainControllers      : {dc2.CU1.com, DC3.CU1.com, DC001.CU1.com}

CurrentGlobalCatalogs         : {dc2.CU1.com, DC3.CU1.com, DC001.CU1.com}

CurrentConfigDomainController : DC001.CU1.com

 

Turn off DC2

CurrentDomainControllers      : {DC3.CU1.com, DC001.CU1.com}

CurrentGlobalCatalogs         : {DC3.CU1.com, DC001.CU1.com}

CurrentConfigDomainController : DC001.CU1.com

 

Turn off DC3

CurrentDomainControllers      : {DC001.CU1.com}

CurrentGlobalCatalogs         : {DC001.CU1.com}

CurrentConfigDomainController : DC001.CU1.com

 

In other words: the same as it was in 2010 and 2013 RTM-CU5.

Comments

  • Anonymous
    August 12, 2015
    Thanks a lot for making this available, very useful for a current issue my customer has.
  • Anonymous
    August 12, 2015
    Thanks a lot for sharing that info
    very valuable information
  • Anonymous
    August 12, 2015
    Thanks a lot, very useful information!
  • Anonymous
    August 28, 2015
    Now we have a KB for it: https://support.microsoft.com/en-us/kb/3088777
  • Anonymous
    February 26, 2016
    Hi, we had an issue that was "fixed" by this setting but ours is a single AD site environment.
    We started off with 2 W2k3 DC with Exchange 2013 CU9. Everything was working fine. To upgrade the AD, we introduced 2 new W2k12R2 DC. Exchange was still working at this point.

    We shutdown the 2 old DC and only the 2 new DC remained. Now Exchange services cannot start and we are getting eventid 4027, 2142,2193. Seems like the Exchange servers are still trying to contact the old DCs. Exchange services started when the old DCs were booted up.

    We changed these 2 settings in the Microsoft.Exchange.Directory.TopologyService.exe.config after doing some Google search that found this link

    https://social.technet.microsoft.com/Forums/exchange/en-US/34b1c301-ad12-4655-aeea-772e70c654bc/event-id-2142-2077-2069-msexchangeadtopology-exchange-2013?forum=exchangesvradmin

    We changed the value below
    MinPercentageOfHealthyDC = "50" to "10"
    And we added MinSuitableServer = "1"

    After this, with only the 2 new DC started and Exchange servers rebooted, Exchange services starts fine.

    My questions
    1) Do the above settings, at their default values, really mean we need at least 50% of DCs available before Exchange services will start?
    2) Why does Microsoft choose such a default value? It seems pretty silly not to let Exchange use the surviving DC even if it is less than 50%.
    3) Should we "fine tune" these value based on the number of DC we have in a site? If so how and where is the official guidance as I cannot find it.
    4) Is this a bug or a feature?

    Thank you.
  • Anonymous
    March 04, 2016
    Very useful post. Thanks!