Running system panic diagnostics

Running diagnostics after your storage system suffers a system panic can help you identify the possible cause of the panic.

Procedure

  1. At the storage system prompt, enter the following command to get to the LOADER prompt: halt
  2. At the LOADER prompt, enter the following command: boot_diags
    Note: You must enter this command from the LOADER prompt for system-level diagnostics to function properly. The boot_diags command starts special drivers designed specifically for system-level diagnostics.
    Important: During the boot_diags process, you might see the following prompts:
    • A prompt warning of a system ID mismatch and asking to override the system ID.
    • A prompt warning that when entering Maintenance mode in an HA configuration you must ensure that the partner remains down.

    You can safely respond y to these prompts.

    The Maintenance mode prompt (*>) appears.
  3. Run diagnostics on all the devices by entering the following command: sldiag device run
  4. View the status of the test by entering the following command: sldiag device status
    Your storage system provides the following output while the tests are still running:
    There are still test(s) being processed.
    After all the tests are complete, you receive the following default response:
    *> <SLDIAG:_ALL_TESTS_COMPLETED>
  5. Identify the cause of the system panic by entering the following command: sldiag device status -long -state failed
    Proceed according to the result of the system-level diagnostics tests:

    If the system-level diagnostics tests were completed without any failures:
    There are no hardware problems and your storage system returns to the prompt.
    1. Clear the status logs by entering the following command: sldiag device clearstatus
    2. Verify that the log is cleared by entering the following command: sldiag device status

      The following default response is displayed:

      SLDIAG: No log messages are present.
    3. Exit Maintenance mode by entering the following command: halt
    4. Enter the following command at the firmware prompt to reboot the storage system: boot

    You have completed system-level diagnostics.

    If the system-level diagnostics tests resulted in some test failures:
    Determine the cause of the problem.
    1. Exit Maintenance mode by entering the following command: halt
    2. Perform a clean shutdown and disconnect the power supplies.
    3. Verify that you have observed all the considerations identified for running system-level diagnostics, that cables are securely connected, and that hardware components are properly installed in the storage system.
    4. Reconnect the power supplies and power on the storage system.
    5. Repeat Steps 1 through 5 of this procedure.

    The following example displays the full status of the failures that occurred. It shows that the tests were run without the appropriate hardware:
    *> sldiag device status -long -state failed
    
    TEST START ------------------------------------------
    DEVTYPE: nvram_ib
    NAME: external  loopback test
    START DATE: Sat Jan  3 23:10:55 GMT 2009
    
    STATUS: Completed
    ib3a: could not set loopback mode, test failed
    END DATE: Sat Jan  3 23:11:04 GMT 2009
    
    LOOP: 1/1
    TEST END --------------------------------------------
    
    TEST START ------------------------------------------
    DEVTYPE: fcal
    NAME: Fcal Loopback Test
    START DATE: Sat Jan  3 23:10:56 GMT 2009
    
    STATUS: Completed
    Starting test on Fcal Adapter: 0b
    Started gathering adapter info.
    Adapter get adapter info OK 
    Adapter fc_data_link_rate: 1Gib
    Adapter name: QLogic 2532
    Adapter firmware rev: 4.5.2
    Adapter hardware rev: 2
     
    Started adapter get WWN string test.
    Adapter get WWN string OK wwn_str: 5:00a:098300:035309
     
    Started adapter interrupt test
    Adapter interrupt test OK
     
    Started adapter reset test.
    Adapter reset OK
     
    Started Adapter Get Connection State Test.
    Connection State: 5
    Loop on FC Adapter 0b is OPEN 
     
    Started adapter Retry LIP test
    Adapter Retry LIP OK
     
    ERROR: failed to init adaptor port for IOCTL call
    
    ioctl_status.class_type = 0x1
    
    ioctl_status.subclass = 0x3
    
    ioctl_status.info = 0x0
     Started INTERNAL LOOPBACK: 
    INTERNAL LOOPBACK   OK
    Error Count: 2  Run Time: 70 secs
    >>>>> ERROR, please ensure the port has a shelf or plug.
    END DATE: Sat Jan  3 23:12:07 GMT 2009
    
    LOOP: 1/1
    TEST END --------------------------------------------
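
If you capture the long-format failure output to a text file (for example, from a console log), a short script can condense it into one summary per test. The following Python code is a minimal sketch, not a supported tool: the function name parse_sldiag_failures and the SAMPLE text are hypothetical, the sample is abbreviated from the example above, and the parser assumes the TEST START / TEST END layout shown there. It treats any line that contains "error" or "failed" as a failure message, which is a heuristic and can also pick up counter lines such as "Error Count".

```python
import re

# Abbreviated copy of the example output above (illustrative only).
SAMPLE = """\
TEST START ------------------------------------------
DEVTYPE: nvram_ib
NAME: external  loopback test
STATUS: Completed
ib3a: could not set loopback mode, test failed
LOOP: 1/1
TEST END --------------------------------------------

TEST START ------------------------------------------
DEVTYPE: fcal
NAME: Fcal Loopback Test
STATUS: Completed
ERROR: failed to init adaptor port for IOCTL call
Error Count: 2  Run Time: 70 secs
>>>>> ERROR, please ensure the port has a shelf or plug.
TEST END --------------------------------------------
"""


def parse_sldiag_failures(text):
    """Return one dict per TEST START/TEST END block found in the text."""
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("TEST START"):
            # Begin a new record for this test block.
            current = {"devtype": None, "name": None, "status": None, "errors": []}
        elif line.startswith("TEST END"):
            if current is not None:
                records.append(current)
            current = None
        elif current is not None:
            field = re.match(r"(DEVTYPE|NAME|STATUS):\s*(.*)", line)
            if field:
                # Collapse repeated spaces in the field value.
                current[field.group(1).lower()] = " ".join(field.group(2).split())
            elif re.search(r"error|failed", line, re.IGNORECASE):
                # Heuristic: keep lines that look like failure messages.
                current["errors"].append(line)
    return records


failures = parse_sldiag_failures(SAMPLE)
for failure in failures:
    print(f"{failure['devtype']}: {failure['name']} "
          f"({len(failure['errors'])} matching line(s))")
    for message in failure["errors"]:
        print(f"    {message}")
```

A summary like this only helps you locate the relevant blocks; read the full output before you decide whether a failure reflects missing test hardware, such as a loopback plug or shelf, or a genuine fault.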

What to do next

If the failures persist after you repeat the steps, you must replace the hardware.