Troubleshooting
Problem
Encrypted connections (DSE internode encryption and SSH) hang during SSL/TLS handshake when communicating between datacenters, specifically between physical hosts and VMware virtual hosts with mismatched MTU settings.
Symptom
- DSE Symptoms:
- Internode encryption fails between datacenters
- Unencrypted traffic works normally
- Local datacenter communication works fine
- SSL handshake initiates but hangs during key exchange
- SSH Symptoms:
- SSH connections hang after key exchange algorithm negotiation
- Telnet to the SSH port (22) shows the SSH-2.0-OpenSSH_X.X banner successfully
- SSH verbose mode shows the connection stuck at:
debug1: kex: algorithm: curve25519-sha256
debug1: kex: host key algorithm: ssh-rsa
debug1: kex: server->client cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: kex: client->server cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug3: send packet: type 30
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
- Network Symptoms:
- tcpdump shows incoming packets as "IP11 (invalid)"
- Packets appear corrupted or malformed
- Only affects encrypted traffic, not plaintext
Cause
MTU (Maximum Transmission Unit) mismatch between network interfaces combined with the "Don't Fragment" (DF) bit set on encrypted packets.
Technical Details
- MTU Configuration Mismatch:
- Physical hosts: MTU = 9000 (Jumbo frames)
- VMware hypervisor: MTU = 1500 (Standard Ethernet)
- Virtual hosts inherit 1500 MTU from hypervisor
- Encrypted Traffic Behavior:
- Encrypted packets (SSL/TLS, SSH) set the "Don't Fragment" (DF) bit in IP header
- This prevents packet fragmentation along the network path
- When a packet larger than the receiving interface's MTU arrives with DF bit set, the packet is dropped
- Why It Only Affects Encrypted Traffic:
- Unencrypted traffic typically doesn't set DF bit, allowing fragmentation
- Encrypted protocols set DF bit for security and performance reasons
- Key exchange packets in SSL/TLS and SSH are often larger than 1500 bytes
- Why Local DC Works:
- All hosts in same DC have matching MTU (9000)
- No MTU mismatch = no packet drops
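The drop condition described above can be sketched as a small shell function. This is an illustration only: `packet_fate` and its arguments are hypothetical names, not part of DSE or any networking tool, and the function models a single hop rather than a real network path.

```shell
# Hypothetical helper: models what happens to one IP packet at one hop.
# Arguments: packet size in bytes, MTU of the hop, DF flag (1 = set, 0 = clear).
packet_fate() {
    local size=$1 mtu=$2 df=$3
    if [ "$size" -le "$mtu" ]; then
        echo "forwarded"      # fits within the MTU, DF is irrelevant
    elif [ "$df" -eq 1 ]; then
        echo "dropped"        # too large and fragmentation is forbidden
    else
        echo "fragmented"     # too large but fragmentation is allowed
    fi
}

packet_fate 1400 1500 1   # small packet crossing the MTU 1500 side
packet_fate 4000 1500 1   # large handshake packet with DF set
packet_fate 4000 1500 0   # same size without DF
```

The second call corresponds to the failure in this article: a packet sized for the MTU 9000 side reaches the MTU 1500 side with DF set and is discarded.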
Environment
- Affected Datacenter: DC1 (VMware virtual hosts, MTU 1500)
- Working Datacenter: DC2 (Physical hosts, MTU 9000)
- Affected Services: DSE with internode encryption, SSH
- Network Path: Cross-datacenter communication
Diagnosing The Problem
1. Check MTU Settings
# On each host, check MTU
ip link show | grep mtu
# Or for specific interface
ip link show eth0 | grep mtu
# Expected output shows MTU value:
# eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP
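When many interfaces need checking, the same values can be read from sysfs instead of parsing `ip link` output. The following is a minimal sketch, assuming a Linux host where each interface exposes its MTU at /sys/class/net/<interface>/mtu; the function name `list_mtus` and its optional directory argument are hypothetical and exist only to make the example self-contained.

```shell
# Hypothetical helper: print "<interface> <mtu>" for every interface.
# Optional argument: alternate root directory (defaults to /sys/class/net).
list_mtus() {
    local root=${1:-/sys/class/net} dir
    for dir in "$root"/*; do
        # Skip entries that do not expose an mtu file
        [ -r "$dir/mtu" ] || continue
        printf '%s %s\n' "$(basename "$dir")" "$(cat "$dir/mtu")"
    done
}

# Usage: list_mtus
```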
2. Test Path MTU Discovery
# Test with different packet sizes (Don't Fragment bit set)
ping -M do -s 1472 <remote_host> # Should work (1472 + 28 header = 1500)
ping -M do -s 8972 <remote_host> # Will fail if path MTU < 9000
# -M do: Set Don't Fragment bit
# -s: Packet size (add 28 bytes for IP+ICMP headers)
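Rather than trying sizes by hand, the largest working size can be found with a binary search over the ping payload. The sketch below assumes the iputils `ping` used above; `probe` and `find_path_mtu` are hypothetical names, and the default bounds (548 and 8972 payload bytes) are assumptions chosen to bracket the 1500 and 9000 MTU values in this article.

```shell
# probe SIZE HOST: succeeds if one DF-flagged echo with SIZE payload bytes is answered.
probe() {
    ping -M do -c 1 -W 2 -s "$1" "$2" >/dev/null 2>&1
}

# find_path_mtu HOST [LOW] [HIGH]: binary search over the ICMP payload size.
# Prints the estimated path MTU (largest working payload + 28 header bytes).
find_path_mtu() {
    local host=$1 lo=${2:-548} hi=${3:-8972} mid
    if ! probe "$lo" "$host"; then
        echo "unreachable"
        return 1
    fi
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi + 1) / 2 ))   # round up so the loop always makes progress
        if probe "$mid" "$host"; then
            lo=$mid                    # mid works, search larger sizes
        else
            hi=$(( mid - 1 ))          # mid fails, search smaller sizes
        fi
    done
    echo $(( lo + 28 ))
}

# Usage: find_path_mtu <remote_host>
```

A result of 1500 from a host configured with MTU 9000 would be consistent with the mismatch described in this article. Lost replies for unrelated reasons (for example, ICMP filtering) would also make a probe fail, so treat the result as an estimate.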
3. Capture Network Traffic
# Capture encrypted connection attempt
sudo tcpdump -i any -nn host <remote_host> and port 7001 -w /tmp/ssl-test.pcap
# Analyze capture
tcpdump -tttt -nn -r /tmp/ssl-test.pcap
# Look for "IP11 (invalid)" or dropped packets
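It can also help to check whether any ICMP "Fragmentation Needed" messages (ICMP type 3, code 4) reach the sending host during the test. The sketch below builds a capture filter for that purpose; `frag_needed_filter` is a hypothetical helper name, and the filter uses standard pcap-filter syntax.

```shell
# Hypothetical helper: print a tcpdump filter matching ICMP
# "destination unreachable / fragmentation needed" (type 3, code 4).
# Optional argument: restrict the filter to one remote host.
frag_needed_filter() {
    local filter='icmp[icmptype] == icmp-unreach and icmp[icmpcode] == 4'
    if [ -n "${1:-}" ]; then
        filter="host $1 and $filter"
    fi
    printf '%s\n' "$filter"
}

# Usage (requires root):
#   sudo tcpdump -i any -nn "$(frag_needed_filter <remote_host>)"
```

If no such messages appear while large packets are being lost, the sender has no signal to reduce its packet size.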
4. Test SSH with Smaller Key Exchange
# Force SSH to use smaller KEX algorithm
ssh -o KexAlgorithms=ecdh-sha2-nistp521 user@remote_host
# If this works, confirms MTU issue (smaller packets succeed)
5. Verify VMware MTU Settings
# On VMware host, check virtual switch MTU
esxcli network vswitch standard list
# Check physical NIC MTU
esxcli network nic list
Resolving The Problem
Option 1: Standardize MTU to 1500 (Recommended for Mixed Environments)
On all DSE hosts with MTU 9000:
# Temporary change (lost on reboot)
sudo ip link set <interface> mtu 1500
# Permanent change - RHEL/CentOS 7/8/9
sudo vi /etc/sysconfig/network-scripts/ifcfg-<interface>
# Add or modify:
MTU=1500
# Restart network
sudo systemctl restart NetworkManager
# or
sudo nmcli connection down <connection> && sudo nmcli connection up <connection>
# Verify
ip link show <interface> | grep mtu
For Ubuntu/Debian:
# Edit netplan configuration
sudo vi /etc/netplan/01-netcfg.yaml
# Add MTU setting:
network:
  version: 2
  ethernets:
    ens192:
      mtu: 1500
      # ... other settings
# Apply
sudo netplan apply
Option 2: Increase VMware MTU to 9000 (If Infrastructure Supports Jumbo Frames)
Requirements:
- All switches and routers in path must support jumbo frames
- Physical NICs must support MTU 9000
- May require network team coordination
On VMware ESXi:
# Set vSwitch MTU
esxcli network vswitch standard set -v vSwitch0 -m 9000
# Set VMkernel interface MTU
esxcli network ip interface set -i vmk0 -m 9000
# Verify
esxcli network vswitch standard list
On Virtual Machines:
- Follow same steps as Option 1, but set MTU to 9000
- Reboot VM or restart network service
Option 3: Configure Path MTU Discovery (Workaround)
# Enable PMTU discovery on Linux hosts
sudo sysctl -w net.ipv4.ip_no_pmtu_disc=0
# Make permanent
echo "net.ipv4.ip_no_pmtu_disc=0" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
Note: This is a workaround and may not fully resolve the issue if intermediate devices don't properly handle ICMP "Fragmentation Needed" messages.
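Before changing the setting, it may be worth checking its current value, since 0 is normally the Linux default. The following is a minimal sketch, assuming the standard procfs location of the sysctl; `pmtud_state` and its optional file argument are hypothetical names used only for illustration.

```shell
# Hypothetical helper: report whether Path MTU Discovery is enabled.
# Optional argument: path to the sysctl file (defaults to the procfs location).
pmtud_state() {
    local file=${1:-/proc/sys/net/ipv4/ip_no_pmtu_disc} value
    if [ ! -r "$file" ]; then
        echo "unknown"
        return 2
    fi
    value=$(cat "$file")
    if [ "$value" = "0" ]; then
        echo "enabled"                                  # 0 = PMTU discovery active
    else
        echo "disabled-or-restricted (mode $value)"     # any non-zero mode
        return 1
    fi
}

# Usage: pmtud_state
```

If this already reports "enabled", setting the sysctl to 0 changes nothing, and the remaining options address the mismatch itself.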
Verification
1. Test DSE Internode Encryption
# Enable encryption in cassandra.yaml
server_encryption_options:
    internode_encryption: all
# Restart DSE
sudo systemctl restart dse
# Check logs for successful connections
grep -i "ssl\|handshake" /var/log/cassandra/system.log
# Verify with nodetool
nodetool status
2. Test SSH Connection
# SSH with verbose output
ssh -vvv user@remote_host
# Should complete without hanging at KEX_ECDH_REPLY
3. Test with OpenSSL
# Test SSL handshake
openssl s_client -connect <remote_host>:7001 -showcerts
# Should complete handshake and show certificate chain
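Because the original failure was a hang rather than an error, the checks above are easier to script with a time limit. The sketch below wraps any command in GNU coreutils `timeout`, which exits with status 124 when the limit is reached; `check_handshake` is a hypothetical name and the 10-second limit in the usage lines is an arbitrary assumption.

```shell
# check_handshake SECONDS COMMAND...: prints "ok", "hang", or "fail".
check_handshake() {
    local limit=$1 rc=0
    shift
    # stdin from /dev/null so interactive clients exit once connected
    timeout "$limit" "$@" </dev/null >/dev/null 2>&1 || rc=$?
    case $rc in
        0)   echo "ok" ;;      # command finished successfully
        124) echo "hang" ;;    # timeout expired: the symptom from this article
        *)   echo "fail" ;;    # command ended with an error before the limit
    esac
}

# Usage:
#   check_handshake 10 openssl s_client -connect <remote_host>:7001
#   check_handshake 10 ssh -o BatchMode=yes user@remote_host true
```

A "fail" result (for example, an authentication or certificate error) still indicates that the connection got past the point where it previously stalled.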
Prevention
- Standardize MTU across all datacenters
- Document MTU requirements in infrastructure standards
- Use consistent MTU (1500 or 9000) across physical and virtual environments
- Validate MTU during host provisioning
- Add MTU check to deployment automation
- Include in pre-production validation checklist
- Monitor MTU settings
- Add MTU monitoring to infrastructure monitoring tools
- Alert on MTU mismatches between connected hosts
- Document network topology
- Maintain documentation of MTU settings per datacenter
- Include in network diagrams and runbooks
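The provisioning and monitoring checks above could start from something as simple as the following sketch. It assumes the MTU values have already been collected into "host mtu" lines (how they are gathered is left open); `check_mtu_consistency` is a hypothetical name, not an existing tool.

```shell
# Hypothetical helper: reads "host mtu" lines on stdin.
# Prints "OK" if all hosts share one MTU; otherwise prints "MISMATCH"
# followed by the hosts sorted by MTU, and returns non-zero.
check_mtu_consistency() {
    local input distinct
    input=$(cat)
    # Count the distinct MTU values in the second column
    distinct=$(printf '%s\n' "$input" | awk 'NF >= 2 { print $2 }' | sort -un | wc -l)
    if [ "$distinct" -le 1 ]; then
        echo "OK"
        return 0
    fi
    echo "MISMATCH"
    printf '%s\n' "$input" | sort -k2,2n -k1,1
    return 1
}

# Usage:
#   printf 'dc1-vm1 1500\ndc2-phys1 9000\n' | check_mtu_consistency
```

The non-zero return code on a mismatch lets the check fail a deployment pipeline or trigger an alert.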
Document Location
Worldwide
Product Synonym
dse; tls; ssl; server_encryption; mtu
Document Information
Modified date:
11 May 2026
UID
ibm17272588