Troubleshooting
Problem
The customer receives multiple alerts about failed or unreachable components within a single SPA while the system remains online.
Symptom
Monitoring commands return output similar to the following:
nzhw -issues
Description HW ID Location    Role   State
----------- ----- ----------- ------ -----------
MM          1234  spa1.mm1    Active Warning
PowerSupply 1235  spa1.pwr3   Active Missing
EthSw       1236  spa1.ethsw1 Active Unreachable
EthSw       1237  spa1.ethsw2 Active Unreachable
ssh mm001 health -l a
system> health -l a -f
system: Critical
mm[1] : OK
mm[2] : Critical
Media Tray 1 hardware failure.
Power module 1 or 2 is required to power blades in power domain 1.
Insufficient chassis power to support redundancy
Power module 3 or 4 is required to power blades in power domain 2.
Chassis temperature device is unavailable. Cooling capacity set to maximum.
blade[1] : Non-Critical
(SN#YK11509CW2SA) Blade incompatible with I/O module configuration
blade[3] : Non-Critical
(SN#YK115001Y1YN) Blade incompatible with I/O module configuration
blade[5] : Non-Critical
(SN#YK11509CW2P5) Blade incompatible with I/O module configuration
blade[7] : Non-Critical
(SN#YK11509CW2GW) Blade incompatible with I/O module configuration
blade[9] : Non-Critical
(SN#YK105002GFM1) Blade incompatible with I/O module configuration
blade[11] : Non-Critical
(SN#YK115001Y1SR) Blade incompatible with I/O module configuration
power[1] : Critical
Power module 1 communication failure
power[2] : Critical
Power module 2 communication failure
power[3] : Critical
Power module 3 communication failure
power[4] : Critical
Power module 4 communication failure
blower[1] : OK
blower[2] : OK
switch[1] : Critical
I/O module 1 fault
I/O module 1 incompatible with blade configuration
switch[2] : Critical
I/O module 2 fault
I/O module 2 incompatible with blade configuration
switch[3] : Critical
I/O module 3 fault
I/O module 3 incompatible with blade configuration
switch[4] : Critical
I/O module 4 fault
I/O module 4 incompatible with blade configuration
Replacing or reseating the AMM or media tray, or replacing the midplane, does not improve the situation.
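To confirm that the reported issues are confined to a single SPA, the nzhw -issues output can be grouped by the SPA prefix of the Location column. The following is a minimal sketch, not a product tool: issues_per_spa is a hypothetical helper name, and it assumes that locations always start with spaN. as in the sample output above.

```shell
#!/usr/bin/env bash
# issues_per_spa: hypothetical helper, not part of the nzhw tooling.
# Reads "nzhw -issues" output on stdin and prints one line per SPA
# with the number of reported components in that SPA.
# Assumption: the Location value starts with "spaN." (e.g. spa1.ethsw1).
issues_per_spa() {
    awk '
        {
            # Find the first field that looks like a location (spaN.something).
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^spa[0-9]+\./) {
                    split($i, loc, ".")   # loc[1] is the SPA prefix, e.g. "spa1"
                    count[loc[1]]++
                    break
                }
            }
        }
        END { for (s in count) print s, count[s] }
    ' | sort
}
```

Usage, assuming nzhw is available on the host: nzhw -issues | issues_per_spa. A single output line means that every reported component sits in the same SPA, which matches the problem described here.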
Cause
A possible cause of this behaviour is the presence of two different power supply unit models in the same chassis:
Model 1)
Part no.: 69Y5815
FRU no.: 69Y5816
Model 2)
Part no.: 39Y7408
FRU no.: 39Y7409
Diagnosing The Problem
Verify the type of the power supply units in the H-Chassis by running the following command:
for i in {1..4}; do ssh mm0XX info -T "power[$i]" | grep -i "FRU no"; done
Where XX in mm0XX is the number of the affected SPA (for example, mm001 for SPA 1).
Sample command and output:
for i in {1..4}; do ssh mm001 info -T "power[$i]" | grep -i "FRU no"; done
FRU no.: 39Y7409
FRU no.: 39Y7409
FRU no.: 39Y7409
FRU no.: 39Y7409
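The check above can be summarized automatically. The following is a minimal sketch under stated assumptions: summarize_fru is a hypothetical helper name, and it assumes that each relevant line has the form "FRU no.: <number>" as in the sample output. It reports whether the chassis holds one or several FRU numbers and how many units carry FRU 39Y7409.

```shell
#!/usr/bin/env bash
# summarize_fru: hypothetical helper, not part of the AMM CLI.
# Reads the output of the diagnostic loop on stdin and prints a verdict.
# Assumption: relevant lines look like "FRU no.: 39Y7409".
summarize_fru() {
    awk -F':[ \t]*' '
        tolower($0) ~ /fru no/ {
            fru = $2
            sub(/[ \t\r]+$/, "", fru)          # trim trailing whitespace
            if (!(fru in seen)) { seen[fru] = 1; kinds++ }
            if (fru == "39Y7409") old++        # FRU named in this document
        }
        END {
            if (kinds == 0)     print "unknown: no FRU lines found"
            else if (kinds > 1) print "mixed: " kinds " different FRU numbers"
            else                print "uniform: 1 FRU number"
            if (old > 0) print "found " old " unit(s) with FRU 39Y7409"
        }'
}

# Example (requires access to the management module, so it is left commented out):
# for i in {1..4}; do ssh mm001 info -T "power[$i]"; done | summarize_fru
```

The helper only classifies what it reads; whether the units need replacing still depends on the symptoms and the steps described in this document.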
Resolving The Problem
If communication errors still occur after replacing the AMM and/or media tray, replace the power supply units that have FRU 39Y7409 with units of FRU 69Y5816.
Document Information
Modified date:
17 October 2019
UID
swg21686304