FailoverClustering event ID 1795 error code 1168

ChanSeon 5 Reputation points
2023-07-24T03:03:49.5966667+00:00

Hi team,

My team deployed hyper-converged storage for business use about a month ago.

We configured one converged storage cluster with 6 physical servers and created around 20 virtual machines in Hyper-V.

About once a week, one of the Hyper-V hosts (a physical server) drops out of the converged storage cluster. When that happened, I checked Event Viewer and found the following error.

[Cluster node [node name] was removed from the active failover cluster membership....

Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.]

Source: FailoverClustering / Event ID: 1135 / Level: Critical

While we were still trying to find the root cause of this event, we ran into a different event, shown below.


Cluster physical disk resource encountered an error while attempting to terminate.

Physical Disk Resource Name: Cluster Virtual Disk (Volume01)
Device Number: 3
Device Guid: {5ac7bb9b-fbff-4c62-be57-af1af4f56ad7}
Error Code: 1168
Reason: ReleaseDiskPRFailure


Can somebody give me advice on these two issues?

Thank you.


2 answers

  1. Ian Xue (Shanghai Wicresoft Co., Ltd.) 33,376 Reputation points Microsoft Vendor
    2023-07-25T07:32:03.2933333+00:00

    Hi,

    The cluster uses heartbeats to detect whether a cluster node is still alive. If heartbeats are missed, the cluster treats the node as unreachable and logs event ID 1135, which means the node has been removed from the active cluster membership.

    Possible causes for 1135:

    • Network latency
    • Network outages
    • Faulty drivers or network cards, including TCP offload issues
    • Misconfigured firewall rules
    • Security software such as anti-virus, intrusion detection, etc.
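
The heartbeat check described above can be illustrated with a minimal Python sketch. This is a simplified model and not the cluster service's actual implementation: the class and function names are hypothetical, and the default values (a 1000 ms heartbeat interval and 10 tolerated consecutive misses) are assumptions modeled on the cluster's SameSubnetDelay and SameSubnetThreshold properties, whose real values depend on the Windows Server version and configuration.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Simplified model of heartbeat tracking for one cluster node (hypothetical)."""
    delay_ms: int = 1000   # assumed interval between heartbeats
    threshold: int = 10    # assumed consecutive misses before removal
    missed: int = 0        # current run of consecutive missed heartbeats
    active: bool = True    # whether the node is still in the membership

    @property
    def tolerance_ms(self) -> int:
        """Approximate time a node can stay silent before it is removed."""
        return self.delay_ms * self.threshold

    def record(self, heartbeat_received: bool) -> bool:
        """Process one heartbeat interval; return True while the node stays active."""
        if not self.active:
            return False
        if heartbeat_received:
            self.missed = 0          # any successful heartbeat resets the count
        else:
            self.missed += 1
            if self.missed >= self.threshold:
                self.active = False  # this is the point where event 1135 would be logged
        return self.active


def first_removal_interval(heartbeats, monitor):
    """Return the 1-based interval at which the node is removed, or None."""
    for index, received in enumerate(heartbeats, start=1):
        if not monitor.record(received):
            return index
    return None


if __name__ == "__main__":
    monitor = HeartbeatMonitor()
    # Nine misses are tolerated, a single heartbeat resets the count,
    # and only ten misses in a row remove the node.
    trace = [True] * 5 + [False] * 9 + [True] + [False] * 10
    print("tolerance (ms):", monitor.tolerance_ms)
    print("removed at interval:", first_removal_interval(trace, monitor))
```

In this model a short burst of packet loss does not remove a node; only an uninterrupted run of misses does. That is why intermittent latency or a flapping network path can produce a removal roughly once a week rather than constantly.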

    Best Regards,

    Ian Xue



  2. Alex Bykovskyi 1,841 Reputation points
    2023-08-22T19:12:29.9233333+00:00

    Hey,

    The issue you are facing might be related to network configuration. Check that every layer of your network works as expected. In addition, you can try alternative shared storage to see whether the issue persists. StarWind VSAN is one example: https://www.starwindsoftware.com/starwind-virtual-san

    Cheers,    
    Alex Bykovskyi    
    StarWind Software    
    Note: Posts are provided “AS IS” without warranty of any kind, either expressed or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.
