Troubleshooting
Problem
If two EN4093 switches are connected to separate upstream switches via a Link Aggregation Control Protocol (LACP) or static connection, or even a single link, and the EN4093 switches are configured with 25 or more Spanning Tree Protocol (STP) groups with one Virtual Local Area Network (VLAN) each, then any network change on the link between an EN4093 and its upstream partner can cause the EN4093 Central Processing Unit (CPU) utilization to spike to 100 percent. This can be triggered when the link between those switches is physically made, if the link flaps, if a root bridge flaps, or if a port channel flaps. This condition persists for a maximum of 20 to 25 minutes.
Resolving The Problem
Source
RETAIN tip: H21431
Symptom
If two EN4093 switches are connected to separate upstream switches via a Link Aggregation Control Protocol (LACP) or static connection, or even a single link, and the EN4093 switches are configured with 25 or more Spanning Tree Protocol (STP) groups with one Virtual Local Area Network (VLAN) each, then any network change on the link between an EN4093 and its upstream partner can cause the EN4093 Central Processing Unit (CPU) utilization to spike to 100 percent.
This can be triggered when the link between those switches is physically made, if the link flaps, if a root bridge flaps, or if a port channel flaps.
This condition persists for a maximum of 20 to 25 minutes.
Affected configurations
The system is configured with one or more of the following IBM Option part numbers:
- IBM Flex System Fabric EN4093 10 Gb Scalable Switch, any model
This tip is not system specific.
This tip is not software specific.
The system has the symptom described above.
Solution
This behavior will be corrected in the GA4 release in late August 2013, and in a 7.5.5 release expected to exit test in early July 2013.
The target date for this release is the third quarter of 2013.
The file is or will be available by selecting the appropriate Product Group, type of System, Product name, Product machine type, and operating system on IBM Support's Fix Central web page, at the following URL:
Workaround
The workaround is to use the command "no logging log spanning-tree-group" on both EN4093s.
This stops the CPU from writing STP change messages to the log. Once this command is applied, Bridge Protocol Data Units (BPDUs) and Link Aggregation Control Protocol Data Units (LACPDUs) are processed in a timely manner, and the issue is not observed.
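As a rough illustration, the workaround might be entered as follows on each EN4093. This is only a sketch: it assumes the switch is running the industry-standard CLI (ISCLI) and that the command is issued from global configuration mode. The "enable", "configure terminal" and "exit" steps and the "!" comment lines are assumptions added for context; only the "no logging log spanning-tree-group" command comes from this tip.

```
! Assumed: enter privileged mode, then global configuration mode
enable
configure terminal
! From this tip: stop logging of STP group events
no logging log spanning-tree-group
! Assumed: leave configuration mode
exit
```

Repeat the same steps on the second EN4093, since the tip calls for the command on both switches.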
Additional information
When the situation described previously occurs, the EN4093s are swamped with multiple STP 'new root bridge' and 'topology change' notification syslog messages for each VLAN.
As a result, the CPU utilization spikes to 100 percent. The CPU can no longer process LACPDUs and BPDUs in a timely manner, causing LACP, STP, and other control plane protocols to flap.
This condition persists for a maximum of 20 to 25 minutes.
Document Location
Worldwide
Document Information
Modified date:
30 January 2019
UID
ibm1MIGR-5093222