How To
Summary
In a Windows Server 2012 R2 failover cluster, unstable network connectivity between cluster nodes, or between the nodes and the File Share Witness (FSW) host, can lead to quorum loss. When that happens, the Cluster Service may fail over unexpectedly between nodes (for example, from Node A to Node B) or shut down clustered resources entirely. The FSW is a quorum resource: if it fails to come online because of network issues or an unavailable share, the cluster may be unable to maintain quorum.
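As a rough illustration of why losing the witness matters, the majority rule behind quorum can be sketched in Python. This is a simplified model, not Microsoft tooling: it assumes each node and the witness carries exactly one vote, and it ignores the dynamic quorum adjustments that Windows Server 2012 R2 can apply. The function name `has_quorum` is hypothetical.

```python
def has_quorum(total_votes: int, reachable_votes: int) -> bool:
    """Return True when the reachable voters form a strict majority.

    Simplified model: one vote per node plus one for the witness,
    with no dynamic quorum adjustments (an assumption of this sketch).
    """
    if not 0 <= reachable_votes <= total_votes:
        raise ValueError("reachable_votes must be between 0 and total_votes")
    return 2 * reachable_votes > total_votes


# Two nodes plus a File Share Witness: three votes in total.
TOTAL_VOTES = 3

# Node A still reaches the witness but not Node B: 2 of 3 votes.
print(has_quorum(TOTAL_VOTES, 2))

# Node A has lost both Node B and the witness: 1 of 3 votes.
print(has_quorum(TOTAL_VOTES, 1))
```

In this model, a node that can reach the witness keeps a majority when its partner disappears, while a node that loses both the partner and the witness does not.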
Objective
How to troubleshoot the Cluster Service when it frequently fails over between nodes, which disrupts high availability and can lead to downtime or degraded performance.
Environment
Windows Server 2012 R2 failover cluster that uses a File Share Witness (FSW)
Steps
- Review Logs:
  - Check the System, Application, and cluster logs for errors such as “Remote endpoint unreachable,” “quorum loss,” or “File Share Witness failed.”
  - Note the timestamps so that you can correlate them with failover events.
- Run Validation Tools:
  - Use the Validate a Configuration wizard in Failover Cluster Manager to check network, storage, and node health.
  - Look for warnings about network adapters or witness access.
- Update Drivers:
  - Ensure that network adapter drivers and hypervisor agents (for VMs) are current on all nodes and the witness host.
- Check Antivirus:
  - Verify that antivirus software isn’t blocking port 3343, which the Cluster Service uses for communication between nodes. To test, temporarily disable the antivirus software (in a non-production environment).
- Inspect Network Infrastructure:
  - Confirm stable connectivity between the nodes, and between each node and the witness host.
  - Check for firewall rules that block cluster traffic and for faults in switches or hubs.
- Monitor Packet Loss:
  - Use Performance Monitor to track the “Network Interface\Packets Received Discarded” counter on the nodes and the witness host.
  - If the counter shows a high or steadily rising number of discarded packets, adjust the network adapter settings (for example, receive buffer sizes) on the affected host.
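The connectivity checks in the steps above can be partly scripted. The following Python sketch, run from a cluster node, tests whether a TCP connection can be opened to another node on port 3343 and to the witness host on port 445 (SMB, which a file share witness relies on). It is a minimal illustration under stated assumptions: the function names and host names are hypothetical placeholders, and because cluster heartbeats also use UDP 3343, a successful TCP test does not prove that the heartbeat path is healthy.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_cluster_paths(targets, timeout: float = 3.0) -> dict:
    """Probe each (label, host, port) target and return {label: reachable}."""
    return {
        label: tcp_port_open(host, port, timeout)
        for label, host, port in targets
    }


# Example usage (placeholder host names; replace with your own):
#
# targets = [
#     ("Node B cluster service (TCP 3343)", "node-b.example.local", 3343),
#     ("Witness host SMB (TCP 445)", "fsw-host.example.local", 445),
# ]
# for label, ok in check_cluster_paths(targets).items():
#     print(f"{label}: {'reachable' if ok else 'NOT reachable'}")
```

Running the check repeatedly around the time of a failover, and comparing the results with the log timestamps noted earlier, may help show whether an intermittent block (firewall, antivirus, or switch fault) lines up with the events.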
Additional Information
- For additional information, search online for the following Microsoft articles:
"Failover Cluster Troubleshooting"
"Unexpected cluster failover troubleshooting guidance"
Document Location
Worldwide
Document Information
Modified date: 29 August 2025
UID: ibm17243573