Troubleshooting
Problem
The appliance is incorrectly reported as having a failed RAID drive. Investigation verifies that no RAID drive has failed on that appliance.
Symptom
A hardware error that resembles the following message is reported to the Console:
Monitoring: [ERROR] [NOT:0500000100] (host IP X.X.X.X) Disk Failure - Hardware Monitoring has determined that a disk is in failed state - (Number of Failed Disks: 1) Slot Number. X -Failed;
This message is the disk failure error that the Console receives when a RAID drive fails. In this case, however, checking the appliance shows no failed RAID drives.
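To decide which IP address and slot to check, it can help to pull those two fields out of the notification text. The following is a minimal sketch, assuming the message keeps the "(host IP ...)" and "Slot Number. N -Failed" wording shown above; the function names are hypothetical and the exact message format might differ between QRadar versions.

```shell
# Hypothetical helpers: read a notification message on stdin and print
# the reported host IP or the reported slot number.
# Assumes the wording matches the sample message in this document.
notification_host() {
  sed -nE 's/.*\(host IP ([0-9.]+)\).*/\1/p'
}

notification_slot() {
  sed -nE 's/.*Slot Number\. *([0-9]+) *-Failed.*/\1/p'
}

# Example usage (the message text would be pasted from the Console):
#   echo "$message" | notification_host
#   echo "$message" | notification_slot
```

The extracted IP address identifies the address to investigate, which in this scenario can be answered by more than one physical appliance.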
Cause
The appliance is reported as having a failed RAID drive when it has none. This situation commonly occurs on a newly refreshed appliance, typically an M6, but can occur on any appliance that was recently replaced or refreshed.
The failed or predictive-failure drive is actually in the previous appliance, whose IP address was reused for the new M6. The previous appliance was removed from the deployment but left running in the same subnet, so its drive issue, or any other hardware issue that can be advertised on the network, is reported under the IP address that now belongs to the new appliance.
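One way to check whether two machines are answering for the same IP address is to send ARP requests from another host on the subnet and count the distinct MAC addresses that reply. The following is a rough sketch, assuming the iputils arping utility is available and prints replies in the form "Unicast reply from IP [MAC]"; the interface name eth0, the address 192.0.2.10, and the function name are placeholders, not values from this document.

```shell
# Hypothetical helper: read arping output on stdin and print each
# distinct MAC address that replied, one per line.
# Assumes reply lines contain the MAC address in square brackets.
responding_macs() {
  { grep -oE '\[([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\]' || true; } \
    | tr -d '[]' \
    | tr 'a-f' 'A-F' \
    | sort -u
}

# Example usage from another host on the same subnet
# (-b keeps sending broadcasts so that every owner of the IP can reply):
#   arping -b -c 3 -I eth0 192.0.2.10 | responding_macs
```

If more than one MAC address is printed, more than one machine is likely using the IP address, which is consistent with the old appliance still running in the subnet.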
Environment
Any live customer QRadar environment. The issue sometimes appears on recently refreshed appliances.
Diagnosing The Problem
A hardware error is reported to the Console stating that a certain IP address has a problem, usually a failed or predictive-failure RAID drive. The user calls to have the issue diagnosed, or tries to diagnose it themselves, but finds no failed drives: all drives show Online when checked with command-line commands such as:
/opt/MegaRAID/MegaCli/MegaCli64 -ShowSummary -a0
Here are some examples of the command-line verification:
Command-line verification on the live appliance confirms that it has no failed RAID drives; all drives show Online:
[root@ ~]# /opt/MegaRAID/MegaCli/MegaCli64 -ShowSummary -a0
System
Operating System: Linux version 3.10.0-1160.45.1.el7.x86_64
Driver Version: 07.714.04.00-rh1
CLI Version: 8.04.10
Hardware
Controller
ProductName : RAID 930-16i 4GB Flash(Bus 0, Dev 0)
SAS Address : 500062b207ec8e00
FW Package Version: 51.17.0-4094
Status : Optimal
BBU
BBU Type :
Status : Healthy
Enclosure
Product ID : VirtualSES
Type : SES
Status : OK
PD
Connector : Port 0 - 3<Internal>: Slot 0
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Online
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Active
Connector : Port 0 - 3<Internal>: Slot 2
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Online
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Active
Connector : Port 0 - 3<Internal>: Slot 3
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Online
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Active
Connector : Port 4 - 7<Internal>: Slot 7
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Online
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Active
Connector : Port 4 - 7<Internal>: Slot 6
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Online
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Active
Connector : Port 4 - 7<Internal>: Slot 5
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Online
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Active
Connector : Port 8 - 11<Internal>: Slot 9
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Online
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Active
Connector : Port 0 - 3<Internal>: Slot 1
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Online
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Active
Connector : Port 8 - 11<Internal>: Slot 10
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Online
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Active
Connector : Port 8 - 11<Internal>: Slot 11
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Online
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Active
Connector : Port 4 - 7<Internal>: Slot 4
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Online
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Active
Connector : Port 8 - 11<Internal>: Slot 8
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Online
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Active
Storage
Virtual Drives
Virtual drive : Target ID 0 ,VD name
Size : 72.768 TB
State : Optimal
RAID Level : 6
Exit Code: 0x00
Command-line verification on the old appliance with the same IP address shows where the failed RAID drive is located; in this case, slot 8 shows Failed:
[root@ ~]# /opt/MegaRAID/MegaCli/MegaCli64 -ShowSummary -a0
System
Operating System: Linux version 3.10.0-1160.45.1.el7.x86_64
Driver Version: 07.714.04.00-rh1
CLI Version: 8.04.10
Hardware
Controller
ProductName : RAID 930-16i 4GB Flash(Bus 0, Dev 0)
SAS Address : 500062b207ec8e00
FW Package Version: 51.17.0-4094
Status : Needs Attention
BBU
BBU Type :
Status : Healthy
Enclosure
Product ID : VirtualSES
Type : SES
Status : OK
PD
Connector : Port 0 - 3<Internal>: Slot 0
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Online
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Active
Connector : Port 0 - 3<Internal>: Slot 2
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Online
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Active
Connector : Port 0 - 3<Internal>: Slot 3
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Online
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Active
Connector : Port 4 - 7<Internal>: Slot 7
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Online
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Active
Connector : Port 4 - 7<Internal>: Slot 6
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Online
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Active
Connector : Port 4 - 7<Internal>: Slot 5
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Online
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Active
Connector : Port 8 - 11<Internal>: Slot 9
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Online
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Active
Connector : Port 0 - 3<Internal>: Slot 1
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Online
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Active
Connector : Port 8 - 11<Internal>: Slot 10
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Online
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Active
Connector : Port 8 - 11<Internal>: Slot 11
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Online
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Active
Connector : Port 4 - 7<Internal>: Slot 4
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Online
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Active
Connector : Port 8 - 11<Internal>: Slot 8
Vendor ID : LENOVO
Product ID : MG06SCA800E
State : Failed
Disk Type : SAS hard disk Device
Capacity : 7.276 TB
Power State : Inactive
Storage
Virtual Drives
Virtual drive : Target ID 0 ,VD name
Size : 72.768 TB
State : Partially Degraded
RAID Level : 6
Exit Code: 0x00
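The two summaries above differ only in the controller status, the state of slot 8, and the virtual drive state, which is easy to miss when reading 12 drive entries by eye. The following is a minimal sketch that lists only the physical drives whose state is not Online, assuming the "Connector : ... Slot N" and "State : ..." line layout shown in the output above; the function name is hypothetical.

```shell
# Hypothetical helper: read MegaCli -ShowSummary output on stdin and
# print "Slot N: <state>" for each physical drive that is not Online.
# Assumes each drive entry starts with a "Connector" line that ends in
# "Slot N" and is followed by a "State" line, as in the sample output.
non_online_slots() {
  awk '
    /^[ \t]*Connector[ \t]*:/ {
      slot = $0
      sub(/.*Slot[ \t]*/, "", slot)   # keep only the slot number
      sub(/[ \t\r]+$/, "", slot)
      in_pd = 1
      next
    }
    # "Power State" lines do not match because they do not start with "State".
    in_pd && /^[ \t]*State[ \t]*:/ {
      state = $0
      sub(/^[ \t]*State[ \t]*:[ \t]*/, "", state)
      sub(/[ \t\r]+$/, "", state)
      if (state != "Online") print "Slot " slot ": " state
      in_pd = 0                       # ignore later State lines (virtual drive)
    }
  '
}

# Example usage on an appliance:
#   /opt/MegaRAID/MegaCli/MegaCli64 -ShowSummary -a0 | non_online_slots
```

Run against the two summaries above, the first would be expected to print nothing and the second to report slot 8. Empty output on the live appliance, together with an active disk failure notification for its IP address, points to another machine using that address.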
Resolving The Problem
To resolve the problem, take one of the following three actions on the old appliance:
- Shut down the old appliance.
- Reinstall the old appliance with an operating system that does not have QRadar installed.
- Replace the failed drive in the old appliance.
Document Location
Worldwide
Document Information
Modified date:
29 November 2022
UID
ibm16612045