Exchange 2007 Site Resiliency

I just had a discussion with the Exchange 2007 product team about when to use stretched CCR vs. SCR. They shared some of their upcoming general rules of thumb for which direction to take when planning Exchange 2007 for site resilience. I have tried to net it out here with my interpretation:

 

If the latency between the primary datacenter and standby datacenter is greater than a 50ms Round Trip Time (RTT) then:

 

Use Cluster Continuous Replication (CCR) for availability within your local datacenter and leverage Standby Continuous Replication (SCR) in the remote location for site resiliency.

 

This essentially requires dedicated standby cluster hardware as well as dedicated Exchange role servers in the remote datacenter. It is a warm failover to the remote site, with some manual intervention required using /recovercms. For more on the manual SCR steps required, see here.

 

If the latency between the primary datacenter and standby datacenter is less than or equal to a 50ms Round Trip Time (RTT) and a large amount of bandwidth is available, then either:

 

Use stretched Cluster Continuous Replication (CCR) between your local datacenter and remote datacenter for both availability and site resiliency.

 

This option can leverage non-dedicated CCR standby cluster servers and non-dedicated Exchange role servers in the remote datacenter. Additionally, Windows Server 2008 clustering is recommended since it supports failover clusters that span multiple subnets. You could use Windows Server 2003 clustering for this; however, the network must be able to accommodate a stretched subnet, since 2003 clustering will only work within a single subnet.

 

When considering a stretched CCR deployment, you also need to consider how maintenance such as service packs would work, since applying them would require you to fail over to the alternate datacenter. I think once you go down this path, this option may not be as attractive as the option below.

 

Or

 

Use CCR for local datacenter availability with SCR for site resiliency with warm failover.

 

This option is similar to the SCR option above, except you can leverage non-dedicated standby hardware (for example, the Exchange hardware in the remote datacenter could also serve as local redundant Exchange roles, lowering the cost of site resiliency). Another key difference here is that the AD sites span both datacenters.
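To make the rule of thumb above easier to apply, here is a minimal Python sketch of the decision as I have described it. The function name, the option labels, and the `has_large_bandwidth` flag are my own illustrative choices, not anything from the product team. The guidance also does not cover a low-latency link without large bandwidth, so the sketch assumes that case falls back to the dedicated SCR design.

```python
from typing import List

RTT_THRESHOLD_MS = 50.0  # round-trip threshold quoted in this post

# Illustrative labels for the three designs discussed above.
LOCAL_CCR_DEDICATED_SCR = "Local CCR + remote SCR (dedicated standby hardware)"
STRETCHED_CCR = "Stretched CCR across both datacenters"
LOCAL_CCR_SHARED_SCR = "Local CCR + remote SCR (non-dedicated standby hardware)"


def candidate_designs(rtt_ms: float, has_large_bandwidth: bool) -> List[str]:
    """Return the designs this post suggests for a given inter-site link."""
    if rtt_ms < 0:
        raise ValueError("rtt_ms must be non-negative")
    if rtt_ms <= RTT_THRESHOLD_MS and has_large_bandwidth:
        # Low latency and large bandwidth: either design is a candidate.
        return [STRETCHED_CCR, LOCAL_CCR_SHARED_SCR]
    # RTT above the threshold: local CCR plus dedicated SCR.
    # Assumption: a low-latency link WITHOUT large bandwidth is treated
    # the same way, since the guidance does not address that case.
    return [LOCAL_CCR_DEDICATED_SCR]


if __name__ == "__main__":
    for rtt, big_pipe in [(35.0, True), (80.0, True)]:
        print(rtt, big_pipe, candidate_designs(rtt, big_pipe))
```

Treat the output as a starting list of candidates only; the maintenance and hardware trade-offs discussed above still decide which one fits.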

 

Additional Exchange 2007 site resiliency notes:

 

The Exchange product team also just provided a new recommendation for stretched Exchange 2007 clusters: stop using a CNAME record 'switch' for the File Share Witness (FSW) server and instead leverage the built-in Force Quorum option, which is a more reliable means of bringing up a new FSW server in the standby datacenter. Read more on this here.

 

For more information on bandwidth, hot and warm Exchange 2007 datacenter sites, and dedicated versus non-dedicated Exchange hardware, check here.

 

I hope this helps you in your Exchange 2007 site resiliency planning.

Comments

  • Anonymous
    January 01, 2003
    Brian, of course there is no set answer, since it depends on the amount of logs generated, etc. In some cases I have seen 100Mb links work with SCR or CCR. There is a nice calculator from the Exchange team that helps with exactly this question: which network links (10Mb, 100Mb, 1000Mb, etc.) can handle the log replication traffic, based on your specific environmental data. You can download it here: http://msexchangeteam.com/files/12/attachments/entry438481.aspx Enter your inputs, including the network link. Click on the "Log Replication Requirements" tab. Toggle the network link to see whether SCR or CCR can handle the log traffic at the given link speeds.

  • Anonymous
    May 01, 2008
    Please define "large" in reference to bandwidth. I know the Exchange team recommends gigabit, but what can be changed to allow this to work with less than gigabit? The Exchange documentation says that the maximum database size is 200 GB with gigabit, or 100 GB with 100 Mbit. So can you implement a stretched CCR cluster with less than 100 Mbit? How about 10 Mbit, would you be limited to 10 GB databases?