Troubleshooting
Problem
You log in to a Netezza system and nzhw -issues reports a failed host disk, as in the following output:
[nz@nzhost ~]$ nzhw -issues
Description   HW ID Location                   Role   State
------------- ----- -------------------------- ------ -------
HostDisk      1080  rack1.host2.hostDisk1      Failed Down
SASController 1085  rack1.host2.SASController0 Active Warning
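If this check needs to be repeated across several hosts, the output above can be scanned with a short script. The following is a minimal Python sketch, not an IBM-supplied tool. It assumes the five-column layout shown above and that no field contains embedded spaces. The function names parse_issues and failed_host_disks are hypothetical, and the sample text is copied from the output above.

```python
# Sample text copied from the nzhw -issues output shown above.
SAMPLE_ISSUES = """\
Description   HW ID Location                   Role   State
------------- ----- -------------------------- ------ -------
HostDisk      1080  rack1.host2.hostDisk1      Failed Down
SASController 1085  rack1.host2.SASController0 Active Warning
"""

COLUMNS = ("description", "hw_id", "location", "role", "state")


def parse_issues(text):
    """Return one dict per data row of nzhw -issues output.

    Assumption: data rows have exactly five whitespace-separated
    fields and a numeric HW ID. The header row splits into six
    fields ("HW ID" is two words) and the dashed rule has no
    numeric ID, so both are skipped.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 5 and fields[1].isdigit():
            rows.append(dict(zip(COLUMNS, fields)))
    return rows


def failed_host_disks(rows):
    """Select host disks whose Role column reads Failed."""
    return [r for r in rows
            if r["description"] == "HostDisk" and r["role"] == "Failed"]


if __name__ == "__main__":
    for disk in failed_host_disks(parse_issues(SAMPLE_ISSUES)):
        print(disk["hw_id"], disk["location"], disk["state"])
```

In practice the text would come from the saved output of nzhw -issues rather than from the embedded sample.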
Symptom
The corresponding SASController also shows a Warning state.
Cause
A host disk has failed.
Diagnosing The Problem
As the Linux 'root' user, verify that the drive has actually failed by running /opt/MegaRAID/MegaCli/MegaCli64 pdlist a0 | more
- The output is long. Look through it for a drive that reports Firmware state: Unconfigured(bad), as in this excerpt:
Slot Number: 1
Enclosure position: N/A
Device Id: 16
WWN: 5000CCA00AE01D6B
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS
Raw Size: 136.731 GB [0x11176d60 Sectors]
Non Coerced Size: 136.231 GB [0x11076d60 Sectors]
Coerced Size: 135.972 GB [0x10ff2000 Sectors]
Emulated Drive: No
Firmware state: Unconfigured(bad)
Device Firmware Level: C610
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x5000cca00ae01d69
SAS Address(1): 0x0
Connected Port Number: 6(path0)
Inquiry Data: IBM-ESXSCBRCA146C3ETS0 NC610PCYZ7XAECCXSA610
IBM FRU/CRU: 42D0422
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: Foreign
Foreign Secure: Drive is not secured by a foreign lock key
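Because the pdlist output is long, it can be easier to filter it than to page through it. The following is a minimal Python sketch under stated assumptions: each drive record is taken to begin at its "Slot Number" line, as in the excerpt above (real output may carry extra lines before it), and the function names parse_pdlist and bad_drives are hypothetical. The slot 1 sample lines are copied from the excerpt. The slot 0 record is invented purely for illustration.

```python
# Slot 1 lines are copied from the excerpt above.
# The slot 0 record is hypothetical, added only so the filter
# has a healthy drive to skip.
SAMPLE_PDLIST = """\
Slot Number: 0
Device Id: 15
Firmware state: Online, Spun Up
Slot Number: 1
Device Id: 16
WWN: 5000CCA00AE01D6B
Media Error Count: 0
Firmware state: Unconfigured(bad)
Foreign State: Foreign
"""


def parse_pdlist(text):
    """Split pdlist text into one dict of fields per physical drive.

    Assumption: a new drive record starts at each "Slot Number" line.
    Lines without a colon are ignored.
    """
    drives = []
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Slot Number":
            current = {}
            drives.append(current)
        if current is not None:
            current[key] = value
    return drives


def bad_drives(drives):
    """Select drives whose firmware state is Unconfigured(bad)."""
    return [d for d in drives
            if d.get("Firmware state", "").startswith("Unconfigured(bad)")]


if __name__ == "__main__":
    for drive in bad_drives(parse_pdlist(SAMPLE_PDLIST)):
        print("Slot", drive["Slot Number"], "Device Id", drive["Device Id"])
```

The sketch only reports the slot and device ID. It deliberately ignores the IBM FRU/CRU field, since the FRU must be taken from the replacement guide.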
Resolving The Problem
- Determine whether the system is a Skimmer (100), TwinFin (N1001), Striper (N2001), or Cruiser (100, N1001, N2001/N2002, or C1000).
- Get the Replacement Procedures Guide for the corresponding machine type from the Netezza Documentation Page.
- From the Replacement Procedures Guide, get the correct FRU number that you will need:
- *** Do not trust the FRU number in the MegaCli output from the diagnosis step (the IBM FRU/CRU field); it is a generic FRU and may not be correct ***
- The FRU is listed at the beginning of the "Replacing a Host Disk Drive" chapter.
- The replacement requires no outage. The system must remain online during the replacement.
- Ask the customer when they would like to perform the host disk drive replacement.
- *** Note that we need AT LEAST 4 hours' notice for US customers to order the part and get an SSR on-site ***
- Once you have the date and time for the replacement, fill out the TSS Ticket (Netezza Work Flow) Insert and ask for the SSR to come on-site with the needed FRU number. (Specify the quantity if you need more than one.)
- Generate and requeue a secondary to NZOPS,387 so that they may assign an SSR and open a service request for the host disk drive replacement.
- Once an SSR has been assigned, call the SSR and confirm that they have the correct FRU.
- Follow the steps in the Replacement Procedures Guide for the Host Disk Drive Replacement.
Related Information
Product: IBM PureData System. Component: Host. Platform: Platform Independent. Version: 1.0.0.
Document Information
Modified date:
17 October 2019
UID
swg21693785