Running system-level diagnostics

Use the system-level diagnostics procedure to run diagnostics on N6200 and N7x50T series platforms. To run diagnostics on earlier platforms, use the SYSDIAG tool.

About this task

This procedure applies to N6200 series and N7x50T platforms. To run diagnostics on earlier platforms, you must use the SYSDIAG tool instead (see Running SYSDIAG tool diagnostics).

You can stop the diagnostics test at any time without harm to the system by entering the sldiag device stop command.

For additional information about system-level diagnostics, see the IBM® System Storage® N series System-Level Diagnostics Guide on the N series support website at www.ibm.com/storage/support/nseries/.

Complete the following steps to run diagnostics on N6200 series and N7x50T platforms.

Procedure

  1. Complete the applicable action, depending on where the node halted during the boot process.
    If the node halted at the...    Then...
    Loader prompt                   Go to step 2.
    Boot menu                       1. Select the Maintenance mode option from the menu.
                                    2. Enter the following command at the prompt:
                                         halt
                                       After you issue the command, wait until the system stops at the Loader prompt.
                                    3. Go to step 2.
  2. Enter the following command at the Loader prompt: boot_diags
    Note: You must run this command from the Loader prompt for system-level diagnostics to function properly. The boot_diags command starts drivers that are designed specially for system-level diagnostics.
    Attention: On an N6200 series system, the following warning message might be displayed during the boot_diags process. Enter y at the prompt so that the system can continue booting to Maintenance mode.
    WARNING: System id mismatch. This usually occurs when replacing CF or NVRAM cards!
    Override system ID? {y/n} [n]y
    The Maintenance mode prompt (*>) is displayed.

    Enter the remaining commands in this procedure in Maintenance mode unless otherwise specified.

  3. Enter the following command: sldiag

    For details about the sldiag command, see the sldiag man page.

  4. Clear the status logs by entering the following command: sldiag device clearstatus
  5. Verify that the log is cleared by entering the following command: sldiag device status
    The following default response is displayed:
    SLDIAG: No log messages are present
  6. Run the diagnostics test by entering the applicable command in the following table.

    If your system has only Flash Cache 2 modules, the test should complete within two minutes. If your system has only Flash Cache modules or a mix of Flash Cache and Flash Cache 2 modules, the test should complete in approximately 10 minutes.

    Note: Best practice is to run the diagnostics test on all caching modules in the system. The test runs in parallel on all caching modules. However, you can run the test on the specific module you installed.

    If you run the test on a specific module, the N used in the command corresponds to the slot number. For example, to test the caching module in slot 3 (running Data ONTAP 8.0.2 or later), you enter the sldiag device run -dev fcache -name fcache_slot3 command.

    If your system has...            Then enter...                    To run the test on a specific caching module, enter...
    Data ONTAP 8.0.2 or later        sldiag device run -dev fcache    sldiag device run -dev fcache -name fcache_slotN
    Data ONTAP earlier than 8.0.2    sldiag device run -dev pam2      sldiag device run -dev pam2 -name pam2_slotN
  7. View the status of the test by entering the following command:
    sldiag device status
    Your storage system provides the following output while the tests are still running:
    There are still test(s) being processed
    After all the tests are completed, the following response displays by default:
    *> <SLDIAG:_ALL_TESTS_COMPLETED>
  8. Verify that no hardware problems resulted from the addition or replacement of hardware components on your system by entering the applicable command:
    If your system has...            Then enter...
    Data ONTAP 8.0.2 or later        sldiag device status -dev fcache -long -state failed
    Data ONTAP earlier than 8.0.2    sldiag device status -dev pam2 -long -state failed

    If there are hardware problems, the prompt is followed by the status of the test failures. If there are no hardware problems, only the prompt is displayed.

  9. Proceed based on the results of the tests:
    If system-level diagnostics tests...    Then...
    Were completed without any failure      Return your system to normal operation.
                                            1. Clear the status logs by entering the following command:
                                                 sldiag device clearstatus
                                            2. Verify that the log is cleared by entering the following command:
                                                 sldiag device status
                                               The following response is displayed:
                                                 SLDIAG: No log messages are present
                                            3. Exit Maintenance mode by entering the following command:
                                                 halt
                                            4. Reboot the storage system by entering the following command at the Loader prompt:
                                                 boot_ontap
                                            5. If your system is in an HA pair, run the cf giveback command (for 7-Mode) or the storage failover giveback command (for Clustered Data ONTAP) from the partner node console.
    Failed                                  Determine the cause of the problem.
                                            1. Exit Maintenance mode by entering the following command:
                                                 halt
                                            2. Perform a clean shutdown and disconnect the power supplies.
                                            3. Verify that cables are securely connected, and that hardware components are properly installed in the storage system.
                                            4. Reconnect the power supplies and power on the storage system.
                                            5. Rerun the system-level diagnostics.
  10. After the system-level diagnostics tests complete successfully, go to Completing the replacement process to finish.