Verifying a repair

Learn how to verify hardware operation after you make repairs to the system.

  1. Power on the system.
  2. Did you replace a graphics processing unit (GPU), PCIe adapter, disk drive, or solid-state drive?
    If Then
    Yes: Go to step 5.
    No: Continue with the next step.
  3. Scan the system event logs (SELs) for serviceable events that occurred after system hardware was replaced. For information about SELs that require a service action, see Identifying a service action by using system event logs.
  4. Did any serviceable SEL events occur after hardware was replaced?
    If Then
    Yes: The problem is not resolved. Go to Identifying a service action by using system event logs and complete the service actions indicated. This ends the procedure.
    No: The problem is resolved. This ends the procedure.
  5. Use the following table to determine the verification action to complete:
    Table 1. Determining a verification action for GPUs, PCIe adapters, and devices
    Adapter type: Devices that are controlled by a RAID adapter
    Verification action: Complete the following steps:
      1. Install the arcconf utility for the RAID adapter.
      2. Type arcconf getsmartstats 1 at the command prompt, where 1 is the controller number, and press Enter.
      3. Verify that the Self-Monitoring, Analysis, and Reporting Technology (SMART) health assessment for the device passed.

    Adapter type: Devices that are not controlled by a RAID adapter
    Verification action: Complete the following steps:
      1. Install the smartmontools utility: type apt-get install smartmontools at the command prompt and press Enter.
      2. Type smartctl --all /dev/sdx at the command prompt, where x is the letter that is associated with the drive, and press Enter.
      3. Verify that the SMART health assessment passed.

    Adapter type: GPU
    Verification action: Complete the following steps:
      1. Type nvidia-smi -L at the command prompt and press Enter. Verify that the GPU is listed.
      2. Type nvidia-smi -q at the command prompt and press Enter. Verify that no errors are listed.

    Adapter type: Network adapter
    Verification action: Complete the following steps:
      1. Type ethtool ethx at the command prompt, where x is the number of the physical port that you are testing, and press Enter. Verify that the connection speed that is indicated in the output is correct.
      2. Perform a ping test to verify the network connectivity.

    Adapter type: RAID adapter
    Verification action: Complete the following steps:
      1. Install the arcconf utility for the RAID adapter.
      2. Type arcconf getlogs 1 stats at the command prompt, where 1 is the controller number, and press Enter.
      3. Verify that usage statistics are returned. The presence of usage statistics indicates that the adapter is functioning properly.
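
The drive, GPU, and network adapter checks in Table 1 can be partly scripted. The following is a minimal sketch and is not part of the original procedure. The function names smart_health_passed, link_speed, and gpu_count are hypothetical, and the matched strings assume the usual output wording of smartctl, ethtool, and nvidia-smi -L, which might differ by tool version or device type. Each function reads command output from standard input, so nothing runs against hardware until you pipe a command into it.

```sh
#!/bin/sh
# Hypothetical helpers for the checks in Table 1. The names are illustrative only.

# Succeeds (exit status 0) if smartctl output on stdin reports a passing
# health assessment. Assumption: ATA devices print
# "SMART overall-health self-assessment test result: PASSED" and
# SCSI/SAS devices print "SMART Health Status: OK".
smart_health_passed() {
    grep -Eq 'SMART overall-health self-assessment test result: PASSED|SMART Health Status: OK'
}

# Prints the value of the "Speed:" line from ethtool output on stdin,
# for example "10000Mb/s". Prints nothing if no such line is present.
link_speed() {
    sed -n 's/^[[:space:]]*Speed:[[:space:]]*//p'
}

# Prints the number of lines in `nvidia-smi -L` output on stdin that
# start with "GPU <index>", that is, the number of GPUs listed.
gpu_count() {
    grep -c '^GPU [0-9]'
}
```

For example, smartctl --all /dev/sda | smart_health_passed && echo "SMART health: passed" reports a passing drive, ethtool eth0 | link_speed prints the negotiated speed, and nvidia-smi -L | gpu_count prints how many GPUs are listed. The device names /dev/sda and eth0 are placeholders. A script of this kind supplements the manual review of the full command output and does not replace it.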



Last updated: Thu, December 02, 2021