Fault isolation methodology
This topic describes the basic methodology for locating faults within a storage system and identifying the affected customer-replaceable units (CRUs).
Overview
- Gather fault information, including using system LEDs.
- Determine where in the system the fault is occurring.
- Review logs from the ClevOS Manager event console.
- If required, isolate the fault to a data path component or configuration as described in Isolate the fault.
Gather fault information
When a fault occurs, gather as much information as possible. Doing so helps you determine the correct action to remedy the fault. Begin by answering the following questions:
- Is the fault related to an internal data path or an external data path?
- Is the fault related to a hardware component such as a disk drive module, controller module, or power supply unit?
By isolating the fault to one of the components within the storage system, you will be able to determine the necessary corrective action more quickly.
Determine where the fault is occurring
The LEDs help you identify the location of a CRU that is reporting a fault. For LED locations and meanings:
- See Rear panel LEDs.
- See Top panel LEDs.
Isolate the fault
Occasionally, it might become necessary to isolate a fault. This is particularly true of data paths because of the number of components they contain. For example, a host-side data error could be caused by any component in the data path: the controller node HBA, a cable, an IOM, or a disk enclosure.
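The isolation approach described above can be sketched as a simple walk along the data path, checking one component at a time until one fails. This is only an illustrative sketch: the function name `isolate_fault`, the `is_healthy` callback, and the `observed` results are hypothetical and are not part of ClevOS Manager or any product tool. In practice, each check would be a manual step such as inspecting LEDs or swapping in a known-good part.

```python
from typing import Callable, Optional, Sequence

# Data path components named in the text, ordered from host side to disk side.
DATA_PATH = ("controller node HBA", "cable", "IOM", "disk enclosure")


def isolate_fault(
    components: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first component that fails its check, or None if all pass."""
    for component in components:
        if not is_healthy(component):
            return component
    return None


# Hypothetical check results (assumed for illustration only), for example
# recorded after inspecting LEDs or substituting a known-good part.
observed = {
    "controller node HBA": True,
    "cable": False,
    "IOM": True,
    "disk enclosure": True,
}

suspect = isolate_fault(DATA_PATH, lambda name: observed[name])
print(suspect)  # prints "cable"
```

Checking components in a fixed order keeps the procedure repeatable. A result of `None` suggests the fault lies outside the listed components, for example in the configuration.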
If the enclosure does not initialize
- Power cycle the system.
- Make sure the power cord is properly connected, and check the power source to which it is connected.
- Check the ClevOS Manager event console for errors.