SCOM Troubleshooting: Repairing Agent fails - The client has been disconnected from the server.

Issue

Repairing a SCOM agent fails with the exception "The client has been disconnected from the server. Please call ManagementGroup.Reconnect() to reestablish the connection."

Exception

Error message

Date: 3/4/2018 2:07:22 AM

Application: Operations Manager

Application Version: 7.2.11938.0

Severity: Warning

Message: 

Microsoft.EnterpriseManagement.Common.ServerDisconnectedException:

The client has been disconnected from the server. Please call ManagementGroup.Reconnect() to reestablish the connection.

---> System.ServiceModel.CommunicationObjectFaultedException: The communication object, System.ServiceModel.Channels.ServiceChannel, cannot be used for communication because it is in the Faulted state.

Server stack trace: 

at System.ServiceModel.Channels.CommunicationObject.ThrowIfFaulted()

at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)

at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)

at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)

Exception rethrown at [0]: 

at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)

at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)

at Microsoft.EnterpriseManagement.Common.Internal.IDispatcherService.DispatchUnknownMessage(Message message)

at Microsoft.EnterpriseManagement.Common.Internal.AdministrationServiceProxy.InsertAgentPendingActions(IList`1 agentPrincipalNames, IList`1 pendingActionDataXml, String managementServerPrincipalName, Int32 pendingActionType)

--- End of inner exception stack trace ---

at Microsoft.EnterpriseManagement.Common.Internal.ExceptionHandlers.HandleChannelExceptions(Exception ex)

at Microsoft.EnterpriseManagement.Common.Internal.AdministrationServiceProxy.InsertAgentPendingActions(IList`1 agentPrincipalNames, IList`1 pendingActionDataXml, String managementServerPrincipalName, Int32 pendingActionType)

at Microsoft.EnterpriseManagement.Administration.ManagementServer.SubmitRepairAgentsInternal(IList`1 agentServerNamePairsList, RepairAgentConfiguration repairAgentConfiguration, TaskStatusChangeCallback callback)

at Microsoft.EnterpriseManagement.Mom.Internal.UI.Administration.ModifyAgent.<>c__DisplayClass4.<OnOk>b__0(Object param0, ConsoleJobEventArgs param1)

at Microsoft.EnterpriseManagement.Mom.Internal.UI.Console.ConsoleJobExceptionHandler.ExecuteJob(IComponent component, EventHandler`1 job, Object sender, ConsoleJobEventArgs args)

System.ServiceModel.CommunicationObjectFaultedException: The communication object, System.ServiceModel.Channels.ServiceChannel, cannot be used for communication because it is in the Faulted state.

Server stack trace: 

at System.ServiceModel.Channels.CommunicationObject.ThrowIfFaulted()

at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)

at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)

at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)

Exception rethrown at [0]: 

at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)

at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)

at Microsoft.EnterpriseManagement.Common.Internal.IDispatcherService.DispatchUnknownMessage(Message message)

at Microsoft.EnterpriseManagement.Common.Internal.AdministrationServiceProxy.InsertAgentPendingActions(IList`1 agentPrincipalNames, IList`1 pendingActionDataXml, String managementServerPrincipalName, Int32 pendingActionType)

Event viewer

In the Operations Manager event log, you may see events like the following:

Event ID 33333

Event id: 33333

Event Source: DataAccessLayer

Data Access Layer rejected retry on SqlError:

Request: AgentPendingActionProcessChange -- (AgentName=SCVMM2016SRV.SystemCenterInfra.Net), (PendingActionType=3), (AgentPendingActionId=148b5c64-4b52-c028-63a0-7da61fab6cdb), (ManagementServerName=SCOM16MS02.SystemCenterInfra.Net), (PendingActionData=<PendingActionInformation xmlns="urn:Microsoft.EnterpriseManagement.Mom.PendingActionInformation"><AgentDnsName /><AgentOperatio...), (RETURN_VALUE=1)

Class: 16 

Number: 18054

Message: Error 777980450, severity 16, state 1 was raised, but no message with that error number was found in sys.messages. If error is larger than 50000, make sure the user-defined message is added using sp_addmessage.

Event ID 26319

Event id: 26319

Event Source: OpsMgr SDK Service

An exception was thrown while processing InsertAgentPendingActions for session ID uuid:fc0dd016-9e92-4d32-bc37-1c46ac987172;id=185.

 Exception message: Error 777980450, severity 16, state 1 was raised, but no message with that error number was found in sys.messages. If error is larger than 50000, make sure the user-defined message is added using sp_addmessage.

 Full Exception: System.Data.SqlClient.SqlException (0x80131904): Error 777980450, severity 16, state 1 was raised, but no message with that error number was found in sys.messages. If error is larger than 50000, make sure the user-defined message is added using sp_addmessage.

at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)

at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)

at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)

at System.Data.SqlClient.SqlDataReader.TryConsumeMetaData()

at System.Data.SqlClient.SqlDataReader.get_MetaData()

at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption)

at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest)

at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry)

at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method)

at System.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior, String method)

at System.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior)

at Microsoft.EnterpriseManagement.DataAccessLayer.SqlRetryHandler.ExecuteReader(ExecuteArguments executeArguments, QueryResults& queryResults)

at Microsoft.EnterpriseManagement.DataAccessLayer.SqlRetryHandler.Execute[T](ExecuteArguments executeArguments, RetryPolicy retryPolicy, GenericExecute`1 genericExecute)

at Microsoft.EnterpriseManagement.DataAccessLayer.SqlRetryHandler.ExecuteReader(SqlCommand sqlCommand, IList`1 prologEpilogList, IList`1 projection, QueryDefinition queryDefinition, RetryPolicy retryPolicy)

at Microsoft.EnterpriseManagement.DataAccessLayer.QueryRequest.Execute(SqlNotificationRequest sqlNotificationRequest)

at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.AgentManagement.InsertAgentPendingActions(IList`1 agentPrincipalNames, IList`1 pendingActionDataXmls, String managementServername, Int32 pendingActionType)

at Microsoft.EnterpriseManagement.ServiceDataLayer.AdministrationService.InsertAgentPendingActions(IList`1 agentPrincipalNames, IList`1 pendingActionDataXml, String managementServerPrincipalName, Int32 pendingActionType)

ClientConnectionId:d7acaf4d-ecbd-4b39-81b8-f6e69635c111

Error Number:18054,State:1,Class:16

Cause

The SQL error indicates that the pending action for the agent you are repairing could not be inserted into the database.

This appears to happen because an entry for the same agent already exists in the table the insert targets, and a SQL constraint does not allow it to be inserted again.

Typically, the agent you are trying to repair is already listed under Pending Management, either as "Agent Available for Upgrade" or as "Repair in Progress".
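When several repairs have failed, it can help to pull the agent and management server names out of the Event 33333 text so you know which entries to look for under Pending Management. The following is only a minimal sketch: the `parse_request` helper and the `PAIR` pattern are hypothetical names introduced for illustration, and the sample line is the Request line from the event above with the truncated PendingActionData field left out.

```python
import re

# Request line copied from Event 33333 above.
# Assumption: the truncated PendingActionData field is omitted for brevity.
REQUEST_LINE = (
    "Request: AgentPendingActionProcessChange -- "
    "(AgentName=SCVMM2016SRV.SystemCenterInfra.Net), (PendingActionType=3), "
    "(AgentPendingActionId=148b5c64-4b52-c028-63a0-7da61fab6cdb), "
    "(ManagementServerName=SCOM16MS02.SystemCenterInfra.Net), (RETURN_VALUE=1)"
)

# Matches "(Key=Value)" pairs; assumes values contain no closing parenthesis.
PAIR = re.compile(r"\((\w+)=([^)]*)\)")


def parse_request(line):
    """Return the (Key=Value) pairs found in a request line as a dict."""
    return dict(PAIR.findall(line))


fields = parse_request(REQUEST_LINE)
print(fields["AgentName"], fields["ManagementServerName"])
```

The agent name printed here is the one to look up in the Pending Management view.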

Solution

In this case, the agent we were trying to repair was already in Pending Management because an Update Rollup (UR) had been installed, and it was listed under "Agent Available for Upgrade".

Select the agent under Pending Management, right-click it, and select Reject Agent.

After that, the repair completes successfully.