Obtaining error information from an IBM System p

IBM® device drivers for the System p® operating system log error information when an error occurs on a tape drive or library.

The error information includes the following.

  • Device VPD
  • SCSI command parameters
  • SCSI sense data (if available)

The AIX® Tape and Media Changer Device Driver for System p provides logging to the system error log for various errors. You can view the error log by following this procedure.

  1. At the AIX command line, type errpt |pg to display a summary report, or type errpt -a |pg to display a detailed report. Press [Enter].
    Note: Use the summary report to find the date and time of any errors that are related to library devices. Then, use the detail report to obtain the sense data that is needed to identify the cause of the error.
  2. Press [Enter] to scroll through the error log.
  3. Type q and press [Enter] to quit the error log at any time.
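
When many entries must be reviewed, the detailed report can also be read by a script instead of a pager. The following Python sketch is one possible approach, not part of the device driver. It assumes that each sense record appears as a line reading SENSE DATA followed by lines of four-digit hex groups, as in the examples in this topic. The function names sense_data_blocks and read_detailed_report are hypothetical.

```python
import re
import subprocess

# One line of a hex dump: four-digit hex groups separated by single spaces.
HEX_LINE = re.compile(r"[0-9A-Fa-f]{4}(?: [0-9A-Fa-f]{4})*")


def sense_data_blocks(report_text):
    """Return one bytes object per SENSE DATA section found in the report text."""
    blocks = []
    current = None  # hex lines of the section being collected, or None
    for line in report_text.splitlines():
        stripped = line.strip()
        if stripped == "SENSE DATA":
            current = []
            blocks.append(current)
        elif current is not None and HEX_LINE.fullmatch(stripped):
            current.append(stripped)
        elif current is not None and stripped:
            current = None  # any other non-blank line ends the section
    # bytes.fromhex() skips the spaces between the hex groups.
    return [bytes.fromhex(" ".join(lines)) for lines in blocks]


def read_detailed_report():
    """Run errpt -a (the detailed report from step 1) and return its text."""
    result = subprocess.run(["errpt", "-a"], capture_output=True,
                            text=True, check=True)
    return result.stdout
```

On an AIX host, sense_data_blocks(read_detailed_report()) would return the raw bytes of every sense record in the log, in report order.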

To correct a problem you noticed in the errpt report, determine the type of error by using the examples that follow:

SCSI sense data definition

The following example is of a tape drive communication failure while attached to an Open Systems host through a SAS link, with SCSI protocol. When the host detected the failure, it built the following SCSI Sense Data record. An explanation of the SCSI Sense Data breakout in this example follows.

SENSE DATA
aabb xxxx ccdd eeee eeee eeee eeee eeee ffgg hhxx ssss ssss ssss ssss ssss ....

0600 0000 1200 0000 0000 0000 0000 0000 0200 0300 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 
Note: The bytes that precede the "ssss" positions are the SCSI Sense Data that is presented by the host. The bytes designated by "ssss" (in this case many bytes of zero) would normally contain device sense data. However, with the kind of failure in this example (COMMAND TIMEOUT), the host cannot collect valid device sense data, so the zeros are the result and are ignored. If the host were able to collect valid sense data from the drive, the first byte "ss" would be "70", "71", "F0", or "F1", and valid device sense data would be listed.
Detail Data

aabb xxxx ccdd eeee eeee eeee eeee eeee ffgg hhxx ssss ssss ssss ssss ssss ....

aa Length of the Command Descriptor Block (CDB) sent by the host. In this case, “06” bytes.
bb SCSI target address. In this example, SCSI address “00”.
xx Unused or reserved.
cc Start of CDB, cc is the operation code (byte 0). In this case, “12” which was an “Inquiry”.
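
The breakout above can be applied mechanically. The following Python sketch decodes only the fields that are described here (aa, bb, and cc) and checks whether the device sense area starts with one of the valid first bytes named in the note. The 20-byte header length is inferred from the layout line (ten four-digit groups precede the first "ssss"), and the function names are hypothetical.

```python
# Byte offsets follow the "aabb xxxx ccdd ... hhxx ssss" layout line.
# HOST_HEADER_LEN is an inference from that layout, not a documented constant.
HOST_HEADER_LEN = 20
VALID_SENSE_FIRST_BYTES = {0x70, 0x71, 0xF0, 0xF1}


def split_detail_data(hex_dump):
    """Split a detail data hex dump into host-built fields and device sense."""
    raw = bytes.fromhex(hex_dump)  # spaces between the groups are skipped
    if len(raw) < HOST_HEADER_LEN:
        raise ValueError("detail data is shorter than the host-built header")
    return {
        "cdb_length": raw[0],        # aa
        "target_address": raw[1],    # bb
        "opcode": raw[4],            # cc, byte 0 of the CDB
        "device_sense": raw[HOST_HEADER_LEN:],  # ssss ...
    }


def has_valid_device_sense(device_sense):
    """True if the first sense byte is 70, 71, F0, or F1."""
    return bool(device_sense) and device_sense[0] in VALID_SENSE_FIRST_BYTES


# Start of the command timeout example shown above.
record = split_detail_data(
    "0600 0000 1200 0000 0000 0000 0000 0000 0200 0300 0000 0000 0000 0000"
)
print(f"CDB length {record['cdb_length']}, opcode {record['opcode']:02X}, "
      f"valid device sense: {has_valid_device_sense(record['device_sense'])}")
```

For the timeout example, the check reports that no valid device sense is present, which matches the note: the zeros in the "ssss" area are ignored.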

SCSI sense data - library error

The following example of SCSI Sense Data was received from a System p Open Systems host and shows what the sense data looks like for an error that is logged as a Tape Drive Failure. Unlike the previous example, this data contains valid sense data, as indicated by the hex "70" in the first sense byte position. Therefore, instead of all zeros, there is valid data to rely on. Although the entry is labeled TAPE_ERR2, the error might be caused by a library failure: the failing command was a Move Medium ("A5"), and the ASC/ASCQ points to a "Mechanical Positioning Error". For more information about sense data, see the IBM LTO Ultrium Tape Drive SCSI Reference.

LABEL:          TAPE_ERR2
IDENTIFIER:     476B351D

Date/Time:       Fri May 04 42:26 DFT
Sequence Number: 1665
Machine Id:      0046083B4C00
Node Id:         risc4
Class:           H
Type:            PERM
Resource Name:   smc0
Resource Class:  tape
Resource Type:   3572
Location:        P1.1-I3/Q1-W5003013D38321011-L1000000000000
VPD:
        Manufacturer................IBM
        Machine Type and Model......3572-TL
        Serial Number...............X2U78B0384
        Device Specific.(FW)........4.09 (Firmware Level)

Description
TAPE DRIVE FAILURE

Probable Causes
TAPE DRIVE

Failure Causes
TAPE
TAPE DRIVE

        Recommended Actions
        PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data

aabb xxxx ccdd eeee eeee eeee eeee eeee ffgg hhxx ssss ssss ssss ssss ssss ....

aa Length of the Command Descriptor Block (CDB) sent by the host. In this case, “0C” bytes.
bb SCSI target address. In this example, SCSI address “00”.
xx Unused or reserved.
cc Start of CDB, cc is the operation code (byte 0). In this case, “A5” which was a “Move Medium”.

SENSE DATA
0C00 0000 A500 0000 100F 1010 0000 0000 0102 0000 7000 0400 0000 000A 0000 0000
818F 0000 BE00 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
Table 1. Library sense data example
Hex   Description
A5    SCSI Command (in this case Move Medium).
70    Byte 0 of Library Sense Data (Valid Data).
04    Sense Key (in this case Hardware Error).
818F  ASC/ASCQ (additional sense code/additional sense code qualifier), in this case a “Cannot Find Slider Block” error.
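
The values in Table 1 sit at fixed positions in the sense area. The following Python sketch shows one way to pull them out. It assumes that the device sense begins at byte 20 of the detail data (inferred from the layout line) and uses the standard SCSI fixed-format sense layout: sense key in the low four bits of sense byte 2, ASC in byte 12, ASCQ in byte 13. The function name and the abbreviated sense key table are illustrative only.

```python
SENSE_OFFSET = 20  # assumption: device sense follows the 20-byte host header

# A few standard SCSI sense key names; the list is not complete.
SENSE_KEY_NAMES = {
    0x0: "No Sense",
    0x2: "Not Ready",
    0x3: "Medium Error",
    0x4: "Hardware Error",
    0x5: "Illegal Request",
    0x6: "Unit Attention",
}


def decode_fixed_sense(detail_hex):
    """Return sense key, ASC, and ASCQ, or None if the sense is not valid."""
    raw = bytes.fromhex(detail_hex)
    sense = raw[SENSE_OFFSET:]
    # 70/71 are the fixed-format response codes; F0/F1 add the high "valid" bit.
    if len(sense) < 14 or (sense[0] & 0x7F) not in (0x70, 0x71):
        return None
    key = sense[2] & 0x0F
    return {
        "opcode": raw[4],
        "sense_key": key,
        "sense_key_name": SENSE_KEY_NAMES.get(key, "Other"),
        "asc": sense[12],
        "ascq": sense[13],
    }


# First 40 bytes of the library example above.
library = decode_fixed_sense(
    "0C00 0000 A500 0000 100F 1010 0000 0000 0102 0000 "
    "7000 0400 0000 000A 0000 0000 818F 0000 BE00 0000"
)
print(f"opcode {library['opcode']:02X}, {library['sense_key_name']}, "
      f"ASC/ASCQ {library['asc']:02X}{library['ascq']:02X}")
```

The meaning of a specific ASC/ASCQ pair is device-dependent, so the sketch reports the raw hex value and leaves the lookup to the SCSI reference for the device.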

SCSI sense data - tape drive error

The following example of SCSI Sense Data was received from a System p Open Systems host and shows what the sense data looks like for a Tape Drive Failure. Like the previous example, this data contains valid sense data, as indicated by the hex “71” in the first sense byte position. Therefore, there is valid data to rely on. Although the entry is labeled TAPE_ERR2, further review of the ASC/ASCQ (Media Load or Eject Failed) points to a problem with the media or the drive. The FID that is listed ("86") defines the failure as "The drive detected a drive hardware or media fault". In this case, follow the FID to make a repair. For more information about Sense Key and ASC/ASCQ fields, see the IBM LTO Ultrium Tape Drive SCSI Reference.

LABEL:          TAPE_ERR2
IDENTIFIER:     476B351D

Date/Time:       Wed May 09 07:51:42 DFT
Sequence Number: 1669
Machine Id:      0046083B4C00
Node Id:         risc4
Class:           H
Type:            PERM
Resource Name:   rmt0
Resource Class:  tape
Resource Type:   LTO
Location:        P1.1-I3/Q1-W5003013D38321011-L0

VPD:
        Manufacturer................IBM        
        Machine Type and Model......ULT3573-TD4        
        Serial Number...............1300000680
        Device Specific.(FW)........74H4 (Firmware Level)
        Loadable Microcode Level....A1700D5C

Description
TAPE DRIVE FAILURE

Probable Causes
TAPE DRIVE

Failure Causes
TAPE
TAPE DRIVE

Recommended Actions
        PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
SENSE DATA
0600 0000 0000 0000 0000 0000 0000 0000 0102 0000 7100 0400 0000 0058 0000 0000
5300 8602 8E07 0000 0001 0110 0001 0000 0000 0000 0200 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 2800 01E0 0000 0000 0000 4133 3820
2020 2000 0000 0000 0000 0000 0000 8000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
Table 2. Drive sense data example
Hex   Description
71    Byte 0 of Drive Sense Data (Valid Data).
04    Sense Key (in this case Hardware Error).
5300  ASC/ASCQ (additional sense code/additional sense code qualifier).
86    FID (FRU identification number). In this case, a Drive Hardware or Media problem.
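
In the drive example, the FID appears in the sense byte that immediately follows the ASC/ASCQ pair. The following Python sketch extracts it along with the other Table 2 values. It assumes that the device sense begins at byte 20 of the detail data and that the FID is sense byte 14, which is where "86" sits in this example. The function name drive_sense_summary is hypothetical.

```python
SENSE_OFFSET = 20  # assumption: device sense follows the 20-byte host header
FID_INDEX = 14     # assumption: position of "86" in the example above


def drive_sense_summary(detail_hex):
    """Return the Table 2 values, or None if no valid sense is present."""
    sense = bytes.fromhex(detail_hex)[SENSE_OFFSET:]
    if len(sense) <= FID_INDEX or sense[0] not in (0x70, 0x71, 0xF0, 0xF1):
        return None
    return {
        "first_byte": sense[0],
        "sense_key": sense[2] & 0x0F,
        "asc_ascq": f"{sense[12]:02X}{sense[13]:02X}",
        "fid": sense[FID_INDEX],
    }


# First 40 bytes of the drive example above.
drive = drive_sense_summary(
    "0600 0000 0000 0000 0000 0000 0000 0000 0102 0000 "
    "7100 0400 0000 0058 0000 0000 5300 8602 8E07 0000"
)
print(f"sense key {drive['sense_key']:02X}, ASC/ASCQ {drive['asc_ascq']}, "
      f"FID {drive['fid']:02X}")
```

A FID of 00 would indicate that no FID was reported in this position, in which case the Sense Key and ASC/ASCQ are the only values to work from.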