Resolving a graphics processing unit problem

Learn about the possible problems and service actions that you can perform to resolve a graphics processing unit (GPU) problem.

About this task

Note: To determine the location of the GPU, see Identifying the location of the GPU.
Table 1. GPU problems and service actions for the 8335-GTC, 8335-GTG, 8335-GTH, 8335-GTW, or 8335-GTX
Problem Service action
System unable to find GPU
  1. Verify that the GPU is properly seated.
  2. Verify that the drivers for the GPU are installed.
  3. Verify that the most recent firmware is installed on the system. If it is not, install it.
  4. Restart the system.
  5. If the GPU is still missing, replace the following items, one at a time, until the problem is resolved:
    Note: Go to 8335-GTC, 8335-GTG, 8335-GTH, 8335-GTW, or 8335-GTX locations to identify the physical location and the removal and replacement procedure.
    1. GPU
    2. System processor modules
    3. System backplane
Fence errors in the operating system log
  1. Restart the system. Do fence errors continue to be logged in the operating system log?
    • Yes: Continue with the next step.
    • No: This ends the procedure.
  2. Does NPU chip 0 appear in the fence error log entry?
    • Yes: Continue with the next step.
    • No: Go to step 4.
  3. Replace the following items, one at a time, until the problem is resolved:
    Note: Go to 8335-GTC, 8335-GTG, 8335-GTH, 8335-GTW, or 8335-GTX locations to identify the physical location and the removal and replacement procedure.
    1. CPU 0
    2. GPU 2
    3. GPU 1
    4. GPU 0
    5. System backplane
    This ends the procedure.
  4. Does NPU chip 1 appear in the fence error log entry?
  5. Replace the following items, one at a time, until the problem is resolved:
    Note: Go to 8335-GTC, 8335-GTG, 8335-GTH, 8335-GTW, or 8335-GTX locations to identify the physical location and the removal and replacement procedure.
    1. CPU 1
    2. GPU 5
    3. GPU 4
    4. GPU 3
    5. System backplane
    This ends the procedure.
GPU stops working suddenly
  1. If the system was recently installed, moved, serviced, or upgraded, verify that the GPU is seated properly.
  2. Inspect the GPU and verify that it is not physically damaged.
  3. If the GPU is still not working, replace the following items, one at a time, until the problem is resolved:
    Note: Go to 8335-GTC, 8335-GTG, 8335-GTH, 8335-GTW, or 8335-GTX locations to identify the physical location and the removal and replacement procedure.
    1. GPU
    2. System processor modules
    3. System backplane
Other problems
  For information about adapter diagnostics, see Supporting diagnostics. For adapter user information, see User guides for GPUs and PCIe adapters.
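
Example: The fence error row in Table 1 can be read as a small decision procedure. The following Python sketch is illustrative only and is not part of the service procedure. The function name, its parameters, and the REPLACEMENT_ORDER mapping are hypothetical names chosen for this example. It assumes that you have already restarted the system, read the operating system log, and noted which NPU chip numbers appear in the fence error entries. The sketch does not parse the log, because the log format is not described here. The table does not state an action for the case where neither NPU chip 0 nor NPU chip 1 appears, so the sketch returns an empty list in that case.

```python
from typing import List, Set

# Replacement orders copied from Table 1 (fence errors row).
# Items are replaced one at a time until the problem is resolved.
REPLACEMENT_ORDER = {
    0: ["CPU 0", "GPU 2", "GPU 1", "GPU 0", "System backplane"],
    1: ["CPU 1", "GPU 5", "GPU 4", "GPU 3", "System backplane"],
}


def fence_error_replacement_order(
    errors_after_restart: bool, npu_chips: Set[int]
) -> List[str]:
    """Return the parts to replace, in order, for a fence error.

    errors_after_restart: True if fence errors are still logged after
        the system was restarted (step 1).
    npu_chips: NPU chip numbers seen in the fence error log entries
        (hypothetical input; collected by reading the log manually).
    """
    # Step 1: if the restart cleared the errors, the procedure ends.
    if not errors_after_restart:
        return []
    # Steps 2 and 3: NPU chip 0 is checked first, and the procedure
    # ends after its replacement list.
    if 0 in npu_chips:
        return list(REPLACEMENT_ORDER[0])
    # Steps 4 and 5: NPU chip 1.
    if 1 in npu_chips:
        return list(REPLACEMENT_ORDER[1])
    # Assumption: the table gives no action for this case.
    return []


if __name__ == "__main__":
    for part in fence_error_replacement_order(True, {1}):
        print(part)
```

Run as a script, the example prints the replacement order for NPU chip 1, one part per line, starting with CPU 1 and ending with the system backplane. To identify the physical location of each part, go to 8335-GTC, 8335-GTG, 8335-GTH, 8335-GTW, or 8335-GTX locations.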