Exchange 2010 Troubleshooting: Event ID 4051 - Quorum Group Health Check failed

Scenario:

As an Exchange admin, there may be times when you change the FSW (File Share Witness) for a DAG (Database Availability Group) and everything appears to work just fine.

Then one day you are sifting through the event logs and you notice Event ID 4051. It indicates that the DAG cannot communicate with an FSW that has since been decommissioned.

https://collaborationpro.com/wp-content/uploads/2017/02/FSW1.jpg

In Exchange you run the following command and verify that the correct FSW is in use:

  • Get-DatabaseAvailabilityGroup -Identity <DAG NAME> | fl
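To narrow that output down to the witness-related settings, something along these lines can be run in the Exchange Management Shell. This is only a sketch: DAG1 is a placeholder name, and the -Status switch is an addition (it is not part of the command above) so that real-time values such as WitnessShareInUse are populated.

```powershell
# DAG1 is a placeholder; substitute the name of your own DAG.
# -Status asks Exchange to query live cluster state, which fills in WitnessShareInUse.
Get-DatabaseAvailabilityGroup -Identity DAG1 -Status |
    Format-List Name, WitnessServer, WitnessDirectory, AlternateWitnessServer, AlternateWitnessDirectory, WitnessShareInUse
```

If WitnessServer and WitnessDirectory point at the new witness here, Exchange's own configuration is correct and any leftover reference lives on the cluster side.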

You open up Failover Cluster Manager and verify that the summary at the top also shows the correct information.

Solution:

The first step is to check from the command line what is actually going on. Open an elevated PowerShell window and run the following command; it lists the cluster resources, including any FSW resources:

  • cluster <FQDN of DAG> res

https://collaborationpro.com/wp-content/uploads/2017/02/FSW2.jpg
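The same information can probably be pulled with the FailoverClusters PowerShell module instead of cluster.exe. The sketch below assumes the DAG members run Windows Server 2008 R2 or later (where that module is available) and uses dag1.test.local as a placeholder cluster name; it is an alternative view, not the command used in this article.

```powershell
# Assumption: the FailoverClusters module is installed (Windows Server 2008 R2 or later).
Import-Module FailoverClusters

# dag1.test.local is a placeholder for the FQDN of your DAG.
$witnesses = Get-ClusterResource -Cluster dag1.test.local |
    Where-Object { $_.ResourceType.Name -eq "File Share Witness" }

# List every File Share Witness resource and its current state.
$witnesses | Format-Table Name, State, OwnerGroup -AutoSize

# Show the UNC path each witness resource points at.
$witnesses | Get-ClusterParameter -Name SharePath
```

More than one File Share Witness resource in this list, with one of them pointing at the decommissioned server, matches the situation described here.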

As shown above, the old File Share Witness is still listed as a cluster resource. The next step is to remove it.

Take note that the cluster resource must be offline before you can delete it, or you will get the following error when you try to run the command:

  • System Error 5019 has occurred (0x0000139b). The operation could not be completed because the cluster resource is online.

https://collaborationpro.com/wp-content/uploads/2017/02/FSW3.jpg

To take it offline, open up Failover Cluster Manager:

https://collaborationpro.com/wp-content/uploads/2017/02/FSW4.jpg

As shown above, click on the cluster name at the top left and then, on the right-hand side (highlighted above), click "Take this resource offline".

A confirmation box appears; click the option confirming that you want to take the resource offline.

The final step is to remove the old information by running the following command in the same PowerShell window you had open:

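  • cluster res "File Share Witness (\\Server FQDN\DAG FQDN)" /delete

An example of the above would be:

  • cluster res "File Share Witness (\\server1.test.local\dag1.test.local)" /delete

The resource name contains the UNC path of the old share, so copy it exactly as it appears in the output of the cluster res command.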
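For those who prefer cmdlets, a roughly equivalent removal could look like the sketch below. It is not the method used in this article, it assumes the FailoverClusters module is available (Windows Server 2008 R2 or later), and server1.test.local and dag1.test.local are the example names from above. The stale resource is selected by matching the old server name rather than by typing the full resource name.

```powershell
# Assumption: FailoverClusters module is installed; names below are examples only.
Import-Module FailoverClusters

# Find the File Share Witness resource that still references the old server.
$stale = Get-ClusterResource -Cluster dag1.test.local |
    Where-Object {
        $_.ResourceType.Name -eq "File Share Witness" -and
        $_.Name -like "*server1.test.local*"
    }

# Review what was matched before deleting anything.
$stale | Format-Table Name, State -AutoSize

# Remove the matched resource; without -Force the cmdlet asks for confirmation.
$stale | Remove-ClusterResource
```

Checking the Format-Table output first matters: if the filter matches the current witness as well, stop and tighten the -like pattern before running the removal.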

After that, bring the resources back online and run the first command (cluster <FQDN of DAG> res) again. The old File Share Witness should no longer be listed.
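A quick way to confirm the end state is sketched below, again with dag1.test.local as a placeholder and assuming the FailoverClusters module is available on the DAG members.

```powershell
# Assumption: FailoverClusters module is installed; dag1.test.local is a placeholder.
Import-Module FailoverClusters

# Show which resource the cluster is actually using for quorum.
Get-ClusterQuorum -Cluster dag1.test.local |
    Format-List Cluster, QuorumResource, QuorumType

# Re-run the original check; only the current witness should be listed.
cluster dag1.test.local res
```

QuorumResource should name the File Share Witness that points at the new server, and Event ID 4051 should stop being logged from this point on.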