NSX-T troubleshooting
Use this information to troubleshoot NSX-T network issues in IBM Cloud Private.
Known issues
- Intermittent failure when you log in to the management console in HA clusters that use NSX-T 2.2.
In HA clusters that use NSX-T 2.2, you might not be able to log in to the management console. After you enter your login credentials, you are redirected back to the login page, and you might have to try several times before you can log in. The issue is intermittent and is caused by a known VMware limitation: client IP-based session affinity is not supported for Kubernetes services of type ClusterIP. For more information, see [NSX Container Plugin 2.4.1 Release Notes].
- In an NSX-T environment, when you restart a master node, the management console becomes inaccessible.
In an NSX-T environment, after you restart a master node, the management console is inaccessible even though all the service pods are in a good state. The issue occurs because the iptables NAT rules that enable host port to pod communication through the host IP are not persistent. NSX-T does not support host ports, and IBM Cloud Private uses a host port for the management console.
To resolve the issue, run the following commands on all the master nodes. Use the network CIDR that you specified in the /<installation_directory>/cluster/config.yaml file.
iptables -t nat -N ICP-NSXT
iptables -t nat -A POSTROUTING -j ICP-NSXT
iptables -t nat -A ICP-NSXT ! -s <network_cidr> -d <network_cidr> -j MASQUERADE
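Because the rules are not persistent, you must run these commands again after every reboot of a master node. The following check is a minimal sketch, assuming that the chain was created with the commands above; it only verifies that the chain is referenced from POSTROUTING and lists its rules so you can confirm the MASQUERADE entry for your network CIDR.
# Confirm that POSTROUTING jumps to the ICP-NSXT chain.
iptables -t nat -L POSTROUTING -n -v | grep ICP-NSXT
# List the rules in the ICP-NSXT chain, including the MASQUERADE rule.
iptables -t nat -L ICP-NSXT -n -v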
MustGather
- Get the pod status of both the NCP controller and node-agent pods.
  kubectl -n kube-system get pods -o wide -l tier=nsx-networking
- Get the logs of the NCP controller and node-agent pods that are not in ready state.
  kubectl -n kube-system logs <pod_name>
- When the node-agent pod is failing on a node, get the kubelet logs and the ovs output from that node.
  journalctl -u kubelet
  ovs-vsctl show
- Get the configmap values of nsx-ncp-config and nsx-node-agent-config.
  kubectl -n kube-system get cm nsx-ncp-config -o yaml
  kubectl -n kube-system get cm nsx-node-agent-config -o yaml
- Get the secret value of nsx-secrets.
  kubectl -n kube-system get secrets nsx-secrets -o yaml
You can also collect all of these outputs in one pass; see the sketch after this list.
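The following is a minimal collection sketch, assuming kubectl access with cluster administrator rights; the nsxt-mustgather output directory is an illustrative name. The kubelet and ovs commands must be run directly on the failing node, so they are not included here.
# Collect the NSX-T MustGather data into a local directory (illustrative name).
OUT=./nsxt-mustgather
mkdir -p "$OUT"
# Pod status of the NCP controller and node-agent pods.
kubectl -n kube-system get pods -o wide -l tier=nsx-networking > "$OUT/nsx-pods.txt"
# Logs of every NSX networking pod, including all containers in each pod.
for pod in $(kubectl -n kube-system get pods -l tier=nsx-networking -o jsonpath='{.items[*].metadata.name}'); do
  kubectl -n kube-system logs "$pod" --all-containers > "$OUT/$pod.log" 2>&1
done
# Configmaps and secret that hold the NCP and node-agent configuration.
kubectl -n kube-system get cm nsx-ncp-config -o yaml > "$OUT/nsx-ncp-config.yaml"
kubectl -n kube-system get cm nsx-node-agent-config -o yaml > "$OUT/nsx-node-agent-config.yaml"
kubectl -n kube-system get secrets nsx-secrets -o yaml > "$OUT/nsx-secrets.yaml"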
Troubleshooting
To avoid NSX-T network issues during installation, ensure that the following settings are correctly configured.
- Install the NSX-T CNI plug-in on all the nodes. The installation includes a validation check that fails with a clear error message if the plug-in is missing.
- Install and configure Open vSwitch on all the nodes. The installation includes a validation check that fails with a clear error message if Open vSwitch is not set up.
- Configure the mandatory NSX-T resources that are specified in config.yaml in the NSX-T Manager. If they are not configured, the NCP controller does not go to ready state and the installation waits at the following task:
  TASK [waitfor : Waiting for kube-dns to start]
- Tag the logical switch port of each node with the node name and the cluster name in the NSX-T Manager. If the tags are not correct, the node-agent pod on that node does not go to ready state. You can inspect the tags with the sketch that follows this list.
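One way to check the tags is to list the logical switch ports and their tags through the NSX-T Manager REST API. The following is a minimal sketch, assuming the GET /api/v1/logical-ports endpoint, that jq is installed, and that <nsx_manager>, <user>, and <password> are placeholders for your environment. The exact tag scopes that NCP expects depend on the NCP version, so compare the output with the NCP documentation for your release.
# List each logical switch port with its tags (scope and tag pairs).
curl -k -u '<user>:<password>' https://<nsx_manager>/api/v1/logical-ports | jq '.results[] | {name: .display_name, tags: .tags}'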