Tasks and service aids

The diagnostic package contains programs called tasks and service aids. Use tasks and service aids to have the diagnostics perform specific functions on resources in a system.

Notes:
  • Many of these programs work on all system model architectures. Some programs are only accessible from online diagnostics in service or concurrent mode, while others might be accessible only from stand-alone diagnostics.
  • The specific tasks available depend on the hardware attributes or capabilities of the system you are servicing. Not all service aids or tasks are available on all systems.
  • If the system is logically partitioned, the following tasks can be run only in a partition with service authority:
    • Configure scan dump policy
    • Enable platform automatic power restart
    • Configure platform processor diagnostics

For more information about Linux tasks and service aids, see the Service Aids topic in the Linux Knowledge Center.

To perform these tasks, use the Task Selection option from the FUNCTION SELECTION menu.

After a task is selected, a resource menu might be displayed showing all resources supported by the task.

You can use a fast path method to perform a task by using the diag command with the -T flag. By using the fast path method, you can bypass most of the introductory menus to access a particular task. You are presented with a list of resources available to support the specified task. The fast path tasks include the following options:
certify
Certifies media
chkspares
Checks for the availability of spare sectors
download
Downloads microcode to an adapter or device
disp_mcode
Displays current level of microcode
format
Formats media
identify
Identifies the PCI RAID physical disks
identifyRemove
Identifies and removes devices (hot plug)
pdiskfg
Displays the fuel gauge for pdisk read intensive solid-state drives

To run these tasks directly from the command line, specify the resource and other task-unique flags. Use the descriptions in this topic to understand which flags are needed for each task.
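As a small illustration of the fast path, the following shell sketch checks a task name against the list above and prints the matching diag invocation. The function names are hypothetical, and the sketch only prints the command; the resource and task-unique flags that each task needs are not covered here.

```shell
# Fast path task names, copied from the list in this topic.
FAST_PATH_TASKS="certify chkspares download disp_mcode format identify identifyRemove pdiskfg"

# is_fast_path_task NAME: succeed if NAME appears in the list above.
is_fast_path_task() {
    for task in $FAST_PATH_TASKS; do
        if [ "$task" = "$1" ]; then
            return 0
        fi
    done
    return 1
}

# fast_path_cmd TASK: print (do not run) the fast path command for TASK.
# Hypothetical helper; it adds no resource or task-unique flags.
fast_path_cmd() {
    is_fast_path_task "$1" || { echo "unknown fast path task: $1" >&2; return 1; }
    printf 'diag -T "%s"\n' "$1"
}
```

For example, fast_path_cmd format prints diag -T "format", while a name that is not in the list is rejected before any command is printed.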

Add resources to the resource list

Use this task to add resources back to the resource list.

Note: Only resources that were previously detected by the diagnostics and deleted from the diagnostic test list are listed. If no resources are available to be added, then none are listed.

Shell prompt

Note: Use this service aid in online service mode only.

This service aid allows access to the AIX® command line. To use this service aid, you must know the root password (if a root password is set).

Note: Do not use this task to install code or to change the configuration of the system. This task is intended to view files, configuration records, and data. Using this service aid to change the system configuration or install code can produce unexplained system problems after exiting the diagnostics.

Analyze the adapter internal log

Note: Use this service aid in online mode only.

The PCI RAID adapter has an internal log that logs information about the adapter and the disk drives attached to the adapter. Whenever data is logged in the internal log, the device driver copies the entries to the system error log and clears the internal log.

The analyze adapter internal log service aid analyzes these entries in the system error log. The service aid displays the errors and the associated service actions. Entries that do not require any service actions are ignored.

When you run this service aid, a menu prompts you for the start time, the end time, and the file name. The start time and end time have the following format: [mmddHHMMyy], where mm is the month (1-12), dd is the date (1-31), HH is the hour (00-23), MM is the minute (00-59), and yy is the last two digits of the year (00-99). The file name is the location where you want to store the output data.

To start the service aid task from the command line, type:
diag -c -d devicename -T "adapela [-s start date -e end date]"
Flag
Description
-c
Specifies no console mode.
-d device name
Specifies the device whose internal log you want to analyze (for example, SCRAID0)
-s start date
Specifies all errors after this date are analyzed.
-e end date
Specifies all errors before this date are analyzed.
-T
Specifies the Analyze Adapter Internal Log task
Note: To specify a file name from the command line, use the redirection operator at the end of the command to specify where the output of the command is to be sent. For example, > filename, where filename is the name and location where you want to store the output data (for example, /tmp/adaptlog).
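The timestamp format and the redirection described above can be sketched in shell as follows. The helper names are hypothetical, the shape check is deliberately loose, and the last function only prints the command instead of running diag.

```shell
# adapela_now: print the current time as mmddHHMMyy.
adapela_now() {
    date +%m%d%H%M%y
}

# is_adapela_stamp STAMP: loose shape check (exactly ten digits, with the
# first digit of each field restricted). It does not reject every invalid
# date, for example month 19.
is_adapela_stamp() {
    case "$1" in
        [01][0-9][0-3][0-9][0-2][0-9][0-5][0-9][0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# adapela_cmd DEVICE START END OUTFILE: print (do not run) the command,
# with the output redirected to OUTFILE.
adapela_cmd() {
    is_adapela_stamp "$2" && is_adapela_stamp "$3" || {
        echo "timestamps must look like mmddHHMMyy" >&2
        return 1
    }
    printf 'diag -c -d %s -T "adapela -s %s -e %s" > %s\n' "$1" "$2" "$3" "$4"
}
```

For example, adapela_cmd SCRAID0 0101000024 0131235924 /tmp/adaptlog prints a command that covers January 2024 and stores the report in /tmp/adaptlog.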

Back up and restore media

This service aid allows verification of backup media and devices. It presents a menu of tape and diskette devices available for testing and prompts for selecting the wanted device. It then presents a menu of available backup formats and prompts for selecting the wanted format. The supported formats are tar, backup, and cpio. After the device and format are selected, the service aid backs up a known file to the selected device, restores that file to /tmp, and compares the original file to the restored file. The restored file remains in /tmp to allow for visual comparison. All errors are reported.
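The back up, restore, and compare cycle can be approximated by hand. The sketch below is not the service aid itself: it assumes the tar format, uses an archive file in a temporary directory as a stand-in for the tape or diskette device, and removes the restored copy afterward instead of leaving it in /tmp. The function name is hypothetical.

```shell
# verify_roundtrip: back up a known file, restore it to another directory,
# and compare the two copies. Returns 0 only if they are identical.
verify_roundtrip() {
    work=$(mktemp -d) || return 1
    printf 'known test data\n' > "$work/known.txt"
    mkdir "$work/restore"
    rc=1
    # media.tar stands in for the backup device in this sketch.
    if tar -cf "$work/media.tar" -C "$work" known.txt &&
       tar -xf "$work/media.tar" -C "$work/restore" &&
       cmp -s "$work/known.txt" "$work/restore/known.txt"; then
        rc=0
    fi
    rm -rf "$work"
    return "$rc"
}
```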

Certify media

This task allows the selection of diskette, DVD-RAM media, or hard disk files to be certified. Normally, this task is done under the following conditions:
  • To determine the condition of the drive and media
  • To verify that the media is error-free after a format service aid is run on the media

Normally, run Certify if diagnostics on a drive and its media find no problem, but you suspect that a problem still exists.

Hard disk files can be connected either to a SCSI adapter (non-RAID) or a PCI SCSI RAID adapter. The usage and criteria for a hard disk file connected to a non-RAID SCSI adapter are different from the usage and criteria for a hard disk file connected to a PCI SCSI RAID adapter.

Certify media can be used with the following options:
Certify Diskette
Use this selection to verify the data written on a diskette. When you select this service aid, the menu prompts you for a diskette type that you want to verify. The program then reads all of the ID and data fields on the diskette one time and displays the total number of bad sectors found.
Certify DVD-RAM media
This selection reads all of the ID and data fields. It checks for bad data and counts all errors encountered. If an unrecovered data error occurs, the data on the media must be transferred to another media and the original media must be discarded. If an unrecovered equipment error occurs or recovered errors exceed the threshold value, the original media must be discarded.
The certify service aid displays the following information:
  • Capacity in bytes
  • Number of data errors recovered
  • Number of data errors not recovered
  • Number of equipment check errors
  • Number of equipment checks not recovered
If the drive is reset during a certify operation, the operation is restarted.
If the drive is reset again, the certify operation is terminated, and you are asked to run diagnostics on the drive.
If you are running the AIX operating system in online diagnostic mode, this task can be run directly from the command line. The command-line syntax is: diag -c -d deviceName -T "certify"
The following flags can be used:
Flag
Description
-c
No console mode
-d
Specifies a device
-T
Specifies the certify task
Certify Hard Disk File Attached to a Non-RAID and PCI-X RAID SCSI adapter
For pdisks and hdisks, this selection reads all of the ID and data fields on the hard disk file. If bad-data errors are encountered, the certify operation counts the errors.
If there are non-recovered data errors that do not exceed the threshold value, do one of the following tasks:
  • For hdisk hard disk files, format the hard disk file and certify again.
  • For pdisk hard disk files, run diagnostics on the parent adapter.
If the non-recovered data errors, recovered data errors, or recovered and non-recovered equipment errors exceed the threshold values, the hard disk file must be replaced.
After the read certify of the disk surface completes for hdisk hard disk files, the certify operation performs 2000 random-seek operations. Errors are also counted during the random-seek operations. If a disk timeout occurs before the random seeks are finished, the disk needs to be replaced.
The Certify service aid displays the following information:
  • For hdisks:
    • Drive capacity in megabytes.
    • Number of data errors recovered.
    • Number of data errors not recovered.
    • Number of equipment checks recovered.
    • Number of equipment checks not recovered.
  • For pdisks:
    • Drive capacity in megabytes.
    • Number of data errors not recovered.
    • Number of LBA reassignments
    • Number of equipment checks not recovered.
If you are running the AIX operating system in online diagnostic mode, this task can be run directly from the command line. The command-line syntax is: diag -c -d deviceName -T "certify"
Flag
Description
-c
No console mode
-d
Specifies a device
-T
Specifies the certify task
Certify Hard Disk File Attached to a PCI SCSI RAID adapter
This selection is used to certify physical disks attached to a PCI SCSI RAID adapter. Certify reads the entire disk and checks for recovered errors, unrecovered errors, and reassigned errors. If these errors exceed the threshold values, you are prompted to replace the physical disk.
If you are running the AIX operating system in online diagnostic mode, this task can be run directly from the command line. The command-line syntax is: diag -c -d RAIDadapterName -T "certify {-l chID | -A}"
Flag
Description
-c
No console mode
-d
Specifies the RAID adapter to which the disk is attached
-T
Specifies the certify task and its parameters
-l
Specifies physical disk channel/ID (for example: -l 27)
-A
All disks

Change hardware vital product data

Use this service aid to display the selection menu for displaying or altering vital product data (VPD). The menu lists all resources installed on the system. When a resource is selected, a menu is displayed that lists all the VPD for that resource.

Note: You can alter the VPD for a specific resource only if the VPD is not machine readable.

Configure dials and LPF keys

Note: The dials and LPF keys service aid is not supported in stand-alone mode (CD/DVD-ROM and NIM) on systems with 32 MB or less memory. If you have problems in stand-alone mode, use the hard disk-based diagnostics.

This service aid provides a tool for configuring dials and LPF keys on the asynchronous system ports and for removing them.

This selection starts the System Management Interface Tool (SMIT), which allows dial and LPF key configuration. A TTY must be in the available state on the async port before the dials and LPF keys can be configured on the port. The task allows an async adapter to be configured, then a TTY port defined on the adapter. Dials and LPF keys can then be defined on the port.

Before configuring dials or LPF keys on a system port, you must remove all defined TTYs. To determine whether there are any defined TTYs, select List All Defined TTYs. After all defined TTYs are removed, add a TTY and configure the dials or LPF keys.

Configure reboot policy (CHRP)

This service aid controls how the system tries to recover when power is restored after a power outage.

Use this service aid to display and change the following settings for the reboot policy.

Enable platform automatic power restart

When enabled, Platform auto power restart allows the platform firmware to restart a system after power is restored following a power outage. If the system is partitioned, each partition that was running when the power outage occurred is restarted as indicated by the SMIT option: Automatically reboot operating system after a crash. This setting must be set for each partition.

This service aid can be accessed directly from the command line, by entering:
/usr/lpp/diagnostics/bin/uspchrp -b
The parameter setting can be read and set directly from the command line. To read the parameter, use the command:
/usr/lpp/diagnostics/bin/uspchrp -q platform-auto-power-restart
To set the parameter, use the command:
/usr/lpp/diagnostics/bin/uspchrp -e platform-auto-power-restart=[0|1] 
where:

1 = Enables Platform Automatic Power Restart
0 = Disables Platform Automatic Power Restart

The Platform Boot Speed system parameter can be read or set from the command line only. To read the Platform Boot Speed system parameter, use the command: /usr/lpp/diagnostics/bin/uspchrp -q PlatformBootSpeed

To set the Platform Boot Speed system parameter, use the command:
/usr/lpp/diagnostics/bin/uspchrp -e PlatformBootSpeed=[fast|slow]

With a fast platform speed, the platform firmware performs a minimal set of hardware tests before loading the operating system. With a slow platform speed, the platform firmware performs a comprehensive set of hardware tests before loading the operating system.

For the command:

/usr/lpp/diagnostics/bin/uspchrp -q <variable name> | -e <variable name>=value
The return codes are:
0 = command successful  
1 = command not successful  
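A script can act on these return codes. The following sketch wraps the set command for platform-auto-power-restart; the function name, the USPCHRP override variable, and the extra return value 2 for a rejected argument are all assumptions made for illustration, not part of the uspchrp command.

```shell
# Path from this topic; the USPCHRP override is a hypothetical convenience
# so the function can be exercised on a system without the command.
USPCHRP=${USPCHRP:-/usr/lpp/diagnostics/bin/uspchrp}

# set_auto_power_restart VALUE: validate VALUE, run the set command, and
# report the result based on the documented return codes (0 or 1).
# Returns 2 (this sketch's own convention) when VALUE is not 0 or 1.
set_auto_power_restart() {
    case "$1" in
        0|1) ;;
        *) echo "value must be 0 or 1" >&2; return 2 ;;
    esac
    if "$USPCHRP" -e "platform-auto-power-restart=$1"; then
        echo "command successful"
    else
        echo "command not successful" >&2
        return 1
    fi
}
```

Checking the argument first means that a mistyped value never reaches the firmware setting.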

Configure platform processor diagnostics

This service aid provides the user interface for specifying the platform processor diagnostics system parameter, which is used by the firmware. The firmware uses the parameter setting to determine when a series of processor diagnostics tests are run. Errors from the processor diagnostics tests are logged in the error log and are analyzed by the sysplanar0 diagnostics. Otherwise, there is no notification to the operating system when the tests are run. The possible values of the system parameter and their descriptions are as follows:
disabled
No processor diagnostics.
staggered
Processor diagnostics are run periodically. All processors are tested but are not scheduled at the same time.
immediate
When setting this value, processor diagnostics are run immediately. When querying this value, processor diagnostics are currently running.
periodic
Processor diagnostics are run periodically, all at the same time.

The periodic setting cannot be set by using this service aid, although it can be read. The management console is used to set the periodic setting.

The Configure platform processor diagnostics setting is accessed by using the diag command, and then selecting the appropriate topic from the diagnostics task menus.

It can also be accessed directly from the AIX command line, by entering:
/usr/lpp/diagnostics/bin/uspchrp -p
To query the platform processor diagnostics parameter, enter:
/usr/lpp/diagnostics/bin/uspchrp -q PlatformProcessorDiagnostics
Note: The output of the query operation might be disabled, staggered, immediate, or periodic.
To set the platform processor diagnostics parameter, enter:
/usr/lpp/diagnostics/bin/uspchrp -e PlatformProcessorDiagnostics=[disabled|staggered|immediate]
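Because periodic can be returned by a query but cannot be set with this command, a script might filter the value before building the set command. The following is a minimal sketch with a hypothetical function name; it prints the command rather than running it.

```shell
# ppd_set_cmd VALUE: print (do not run) the set command for a value that
# this service aid accepts. Hypothetical helper for illustration.
ppd_set_cmd() {
    case "$1" in
        disabled|staggered|immediate)
            printf '/usr/lpp/diagnostics/bin/uspchrp -e PlatformProcessorDiagnostics=%s\n' "$1"
            ;;
        periodic)
            echo "periodic can be read here but is set from the management console" >&2
            return 1
            ;;
        *)
            echo "unknown value: $1" >&2
            return 1
            ;;
    esac
}
```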

Configure scan dump policy

Configure scan dump policy allows the user to set or view the scan dump policy (scan dump control and size) in NVRAM. Scan dump data is a set of chip data that the service processor gathers after a system malfunction. It consists of chip scan rings, chip trace arrays, and scan COM (SCOM) registers. This data is stored in the scan-log partition in the nonvolatile random access memory (NVRAM) on the system.

Use this service aid to display and change the following settings for the scan dump policy at run time:
  • Scan Dump Control (how often the dump is taken)
  • Scan Dump Size (size and content of the dump)
The Scan Dump Control (SDC) settings include the following options:
As needed
This setting allows the platform firmware to determine whether a scan dump is performed. This setting is the default setting for the dump policy.
Always
This setting overrides the firmware recommendations and always performs a dump after a system failure.
The Scan Dump Size (SDS) settings include the following options:
As Requested
Dump content is determined by the platform firmware.
Minimum
Dump content collected provides the minimum debug information, enabling the platform to reboot as quickly as possible.
Optimum
Dump content collected provides a moderate amount of debug information.
Complete
Dump data provides the most complete error coverage at the expense of reboot speed.
You can access this service aid directly from the AIX command line by typing:
/usr/lpp/diagnostics/bin/uspchrp -d

Delete resource from resource list

Use this task to delete resources from the resource list.
Note: Only resources that were previously detected by the diagnostics and were not deleted from the diagnostic test list are listed. If no resources are available to be deleted, then none are listed.

Disk maintenance

This service aid provides the following options for the hard disk maintenance:
  • Disk to Disk Copy
  • Display/Alter Sector

Disk-to-disk copy

Notes:
  1. This service aid cannot be used to copy to a drive of a different size. The service aid only supports copying from a SCSI drive to another SCSI drive of the same size.
  2. Use the migratepv command when copying the contents to other disk drive types. This command also works when copying SCSI disk drives or when copying to a SCSI disk drive that is not the same size.

Use this selection to recover data from an old drive when replacing it with a new drive. The service aid recovers all logical volume manager (LVM) software-reassigned blocks. To prevent corrupted data from being copied to the new drive, the service aid stops if an unrecoverable read error is detected. To help prevent possible problems with the new drive, the service aid stops if the number of bad blocks to be reassigned reaches a threshold.

To use this service aid, both the old and new disks must be installed in, or attached to, the system with unique SCSI addresses. The new disk drive's SCSI address must be set to an address that is not currently in use, and the drive must be installed in an empty location. If there are no empty locations, then one of the other drives must be removed. When the copy is complete, only one drive can remain installed. Either remove the target drive to return to the original configuration, or perform the following procedure to complete the replacement of the old drive with the new drive:
  1. Remove both drives.
  2. Set the SCSI address of the new drive to the SCSI address of the old drive.
  3. Install the new drive in the location of the old drive.
  4. Install any other drives (that were removed) into their original location.

To prevent problems that can occur when running this service aid from disk, run this service aid from the diagnostics that are loaded from removable media when possible.

Display/alter sector

Attention: Use caution when you use this service aid. Inappropriate modification to some disk sectors can result in the total loss of all data on the disk.

This selection allows the user to display and alter information about a disk sector. Sectors are addressed by their decimal sector number. Data is displayed both in hex and in ASCII. To prevent corrupted data from being incorrectly corrected, the service aid does not display information that cannot be read correctly.
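To get a feel for the hex and ASCII view of a sector addressed by its decimal number, the following read-only sketch dumps one sector of an ordinary file or disk image. It is not the service aid and never writes data. The function name is hypothetical, and the 512-byte sector size is an assumption.

```shell
# show_sector FILE N: dump sector N (decimal, counting from 0) of FILE,
# first in hex and then as characters. Read-only.
# A 512-byte sector is assumed for this sketch.
show_sector() {
    dd if="$1" bs=512 skip="$2" count=1 2>/dev/null | od -A d -t x1
    dd if="$1" bs=512 skip="$2" count=1 2>/dev/null | od -A d -c
}
```

For example, show_sector disk.img 1 displays the second 512-byte block of a file named disk.img (a hypothetical image file).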

Display configuration and resource list

If a device is not included in the test list or if you think a diagnostic package for a device is not loaded, check by using the display configuration and resource list task. If the device you want to test has a plus (+) sign or a minus (-) sign preceding its name, the diagnostic package is loaded. If the device has an asterisk (*) preceding its name, the diagnostic package for the device is not loaded or is not available.

This service aid displays the item header only for all installed resources. Use this service aid when there is no need to see the vital product data (VPD). (No VPD is displayed.)

Display firmware device node information

This task displays the firmware device node information. This service aid is intended to gather more information about individual or particular devices on the system. The format of the output data might differ depending on which level of the operating system is installed.

Display hardware error report

This service aid uses the errpt command to view the hardware error log.

The display error summary and display error detail selections provide the same type of report as the errpt command. The display error analysis summary and display error analysis detail selections provide additional analysis.

Display hardware vital product data

This service aid displays all installed resources, along with any VPD for those resources. Use this service aid when you want to look at the VPD for a specific resource.

Display machine check error log

Note: The display machine check error log service aid is available only on stand-alone diagnostics.

When a machine check occurs, information is collected and logged in an NVRAM error log before the system unit shuts down. This information is logged in the error log and cleared from NVRAM when the system is rebooted from the hard disk, LAN, or stand-alone media. When booting from stand-alone diagnostics, this service aid converts the logged information into a readable format that can be used to isolate the problem. When booting from the hard disk or LAN, the information can be viewed from the AIX error log by using the hardware error report service aid. In either case, the information is analyzed when the sysplanar0 diagnostics are running in problem determination mode.

Display microcode level

Note: Display microcode level is a subtask that can be accessed after selecting Microcode Tasks. See Microcode tasks.

This task provides a way to display the microcode level on a device or adapter. When the sys0 resource is selected, the task displays the levels of both the system firmware and service processor firmware. sys0 might not be available in all cases.

You can display the current level of the microcode on an adapter, the system, or a device by using the diag command. See the following command syntax: diag -c -d device -T "disp_mcode"
-c
No console mode.
-d
Used to specify a device.
-T
Use the disp_mcode option to display microcode.

The lsmcode command serves as a command-line interface to the display microcode level task.

Display MultiPath I/O (MPIO) device configuration

Note: Use this service aid in online mode only.

This service aid displays the status of MPIO devices and their connections to their parent devices.

Use this service aid to send SCSI commands on each available path regardless of the default MPIO path algorithm. Therefore, it is useful for testing the unused path for integrity.

Run this service aid if you suspect a problem with the path between MPIO devices and their parent devices.

Use this service aid for:
  • Listing MPIO devices
  • Listing the parents of MPIO devices
  • Displaying the status and location of specified MPIO devices
  • Displaying the hierarchy of MPIO adapters and devices.

If there are no devices with multiple paths, this service aid is not shown on the Task Selection menu.

Access this service aid directly from the command line by typing:
/usr/lpp/diagnostics/bin/umpio

Display or change bootlist

This service aid allows the bootlist to be displayed, altered, or erased.

The system attempts to perform an IPL from the first device in the list. If the device is not a valid IPL device or if the IPL fails, the system proceeds in turn to the other devices in the list to attempt an IPL.
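The fallback order described above can be modeled as a simple loop. The sketch below is a simulation only: try_ipl is a hypothetical placeholder that succeeds for device names listed in the BOOTABLE variable, and no real IPL is attempted.

```shell
# try_ipl DEVICE: hypothetical placeholder for "attempt an IPL from DEVICE".
# It succeeds only for names listed in BOOTABLE, so the loop can be tested.
BOOTABLE=${BOOTABLE:-}
try_ipl() {
    for ok in $BOOTABLE; do
        if [ "$ok" = "$1" ]; then
            return 0
        fi
    done
    return 1
}

# first_ipl_device DEV...: walk the list in order and print the first
# device from which the simulated IPL succeeds.
first_ipl_device() {
    for dev in "$@"; do
        if try_ipl "$dev"; then
            echo "$dev"
            return 0
        fi
    done
    return 1
}
```

With BOOTABLE set to "hdisk1 ent0", the call first_ipl_device cd0 hdisk1 ent0 skips cd0 and prints hdisk1.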

Display or change diagnostic runtime options

The display or change diagnostic runtime options task allows the diagnostic runtime options to be set.
Note: The runtime options are used only when selecting the run diagnostic task.
The runtime options are:
Display Diagnostic Mode Selection menus
This option allows the user to turn on or off displaying the DIAGNOSTIC MODE SELECTION MENU (the default is on).
Run Tests Multiple Times
This option allows the user to turn on or off, or specify a loop count, for diagnostic loop mode (the default is off).
Note: This option is only displayed when you run the online diagnostics in service mode.
Include Advanced Diagnostics
This option allows the user to turn on or off including the advanced diagnostics (the default is off).
Number of Days Used to Search Error Log
This option allows the user to select the number of days for which to search the AIX error log for errors when running the error log analysis. The default is seven days, but it can be changed from one to 60 days.
Display Progress Indicators
This option allows the user to turn on or off the progress indicators when running the diagnostic applications. The progress indicators, in a box at the bottom of the screen, indicate that the test is being run (the default is on).
Diagnostic Event Logging
This option allows the user to turn on or off logging information to the diagnostic event log (the default is on).
Diagnostic Event Log File Size
This option allows the user to select the maximum size of the diagnostic event log. The default size for the diagnostic event log is 100 KB. The size can be increased in increments of 100 KB to a maximum of 1 MB.
Use the diaggetrto command to display one or more diagnostic runtime options. Use the following AIX command syntax:
/usr/lpp/diagnostics/bin/diaggetrto [-a] [-d] [-l] [-m] [-n] [-p] [-s] 
Use the diagsetrto command to change one or more diagnostic runtime options. Use the following AIX command syntax:
/usr/lpp/diagnostics/bin/diagsetrto [-a on|off] [-d on|off] [-l size] 
[-m on|off] [-n days] [-p on|off]
Flag descriptions for the diaggetrto and diagsetrto commands are as follows:
Flag
Description
-a
Displays or changes the value of the advanced diagnostics option.
-d
Displays or changes the value of the diagnostic event logging option.
-l
Displays or changes the value of the diagnostic event log file size. Allowable size is between 100K and 1000K in increments of 100K. The size cannot be decreased.
-m
Displays or changes the value of the display diagnostic mode selection menu option.
-n
Displays or changes the value of the number of days used to search the error log option. Allowable values are 1 - 60 days. Seven days is the default.
-p
Displays or changes the value of the display progress indicators option.
-s
Displays all of the diagnostic runtime options.
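The limits on the -l and -n values can be checked before diagsetrto is called. The following sketch uses hypothetical function names and treats sizes as plain numbers of kilobytes; how the size is written on the diagsetrto command line (with or without a K suffix) is not stated in this topic, so the sketch does not build the command.

```shell
# is_plain_number VALUE: digits only, no leading zero.
is_plain_number() {
    case "$1" in
        ''|*[!0-9]*|0*) return 1 ;;
        *) return 0 ;;
    esac
}

# valid_log_size SIZE: 100 to 1000 in steps of 100 (the -l limits).
valid_log_size() {
    is_plain_number "$1" || return 1
    [ "$1" -ge 100 ] && [ "$1" -le 1000 ] && [ $(( $1 % 100 )) -eq 0 ]
}

# valid_search_days DAYS: 1 to 60 (the -n limits).
valid_search_days() {
    is_plain_number "$1" || return 1
    [ "$1" -ge 1 ] && [ "$1" -le 60 ]
}

# can_grow_log CURRENT NEW: the size cannot be decreased.
can_grow_log() {
    valid_log_size "$1" && valid_log_size "$2" && [ "$2" -ge "$1" ]
}
```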

Display previous diagnostic results

Note: This service aid is not available when using stand-alone diagnostics.

This service aid allows a service representative to display results from a previous diagnostic session. When the display previous diagnostic results option is selected, the user can view up to 25 no trouble found (NTF) and service request number (SRN) results.

This service aid displays diagnostic event log information. You can display the diagnostic event log in a short version or a long version. The diagnostic event log contains information about events logged by a diagnostic session.

This service aid displays the information in reverse chronological order.

This information is not from the operating system error log. This information is stored in the /var/adm/ras directory.

You can run the command from the command line by typing:
/usr/lpp/diagnostics/bin/diagrpt [[-o] | [-s mmddyy] | [-a] | [-r]]
-o
Displays the last diagnostic results file stored in the /etc/lpp/diagnostics/data directory
-s mmddyy
Displays all diagnostic result files logged since the date specified
-a
Displays the long version of the diagnostic event log
-r
Displays the short version of the diagnostic event log

Display resource attributes

Note: Use this service aid in online mode only.

This task displays the customized device attributes associated with a selected resource. This task is similar to running the lsattr -E -l resource command.

Display software product data

This task uses SMIT to display information about the installed software and provides the following functions:
  • List Installed Software
  • List Applied but Not Committed Software Updates
  • Show Software Installation History
  • Show Fix (APAR) Installation Status
  • List Fileset Requisites
  • List Fileset Dependents
  • List Files Included in a Fileset
  • List File Owner by Fileset

Display test patterns

This service aid provides a means of adjusting system display units by providing test patterns that can be displayed. The user uses a series of menus to select the display type and test pattern. After the selections are made, the test pattern displays.

Display USB devices

The following are the main functions of this service aid:
  • Display a list of USB controllers on an adapter.
  • Display a list of USB devices that are connected to the selected controller.

To run the USB devices service aid, go to the diagnostics TASKS SELECTION menu, and select Display USB Devices. From the controller list that is displayed on the screen, select one of the items that begins with OHCDX, where X is a number. A list of devices attached to the controller is displayed.

Download microcode

Note: Download microcode is a subtask that can be accessed after selecting Microcode Tasks. See Microcode tasks.

This service aid provides a way to copy microcode to an adapter or device. The service aid presents a list of adapters and devices that use microcode. After the adapter or device is selected, the service aid provides menus to guide you in checking the current level and installing the needed microcode.

This task can be run directly from the AIX command line. Most adapters and devices use a common syntax as identified in the Microcode installation to adapters and devices section. Information for adapters and devices that do not use the common syntax can be found following this section.

Microcode installation to adapters and devices

For many adapters and devices, microcode installation occurs and becomes effective while the adapters and devices are in use. Ensure that a current backup is available and the installation is scheduled during a non-peak production period.

Notes:
  1. If the source is /etc/microcode, the image must be stored in the /etc/microcode directory on the system. If the system is booted from a NIM server, the image must be stored in the usr/lib/microcode directory of the SPOT the client is booted from.
  2. If the source is CD (cdX), the CD must be in ISO 9660 format. There are no restrictions on the directory in which the image is stored.
  3. If the source is diskette (fdX), the diskette must be in backup format and the image stored in the /etc/microcode directory.
If you are running the AIX operating system with online diagnostics, the common command syntax is: diag [-c] -d device -T "download [-s {/etc/microcode|source}] [-l {latest|previous}] [-f]"
-c
No console mode. Run without user interaction.
-d device
Run the task on the device or adapter specified.
-T download
Install microcode.
-s /etc/microcode
The microcode image is in the /etc/microcode directory. This directory is the default.
-s source
Microcode image is on specified source. For example, fd0, cd0.
-l latest
Install latest level of microcode. This setting is the default.
-l previous
Install previous level of microcode.
-f
Install microcode even if the current level is not on the source.
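The defaults in the common syntax can be made explicit when a command is assembled. The sketch below uses a hypothetical function name and a sample device name, fills in /etc/microcode and latest when the arguments are omitted, and prints the command instead of running it. The -f flag is left out deliberately.

```shell
# download_cmd DEVICE [SOURCE] [LEVEL]: print (do not run) a no-console
# microcode download command. SOURCE defaults to /etc/microcode and LEVEL
# to latest, matching the defaults described above.
download_cmd() {
    device=$1
    src=${2:-/etc/microcode}
    level=${3:-latest}
    case "$level" in
        latest|previous) ;;
        *) echo "level must be latest or previous" >&2; return 1 ;;
    esac
    printf 'diag -c -d %s -T "download -s %s -l %s"\n' "$device" "$src" "$level"
}
```

For example, download_cmd fscsi0 cd0 previous prints a command that installs the previous level from cd0 on a device named fscsi0 (a sample name).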

Microcode installation to an SES device

Notes:
  1. If the source is /etc/microcode, the image must be stored in the /etc/microcode directory on the system. If the system is booted from a NIM server, the image must be stored in the usr/lib/microcode directory of the SPOT the client is booted from.
  2. If the source is CD (cdX), the CD must be in ISO 9660 format. There are no restrictions on the directory in which the image is stored.
  3. If the source is diskette (fdX), the diskette must be in backup format and the image stored in the /etc/microcode directory.
The following is the common syntax command:
diag [-c] -d device -T "download [-s {/etc/microcode|source}]" 
-c
No console mode. Run without user interaction.
-d device
Run the task on the device or adapter specified.
-T download
Install microcode.
-s /etc/microcode
Microcode image is in /etc/microcode.
-s source
Microcode image is on specified source. For example, fd0, cd0.

Microcode installation to PCI SCSI RAID adapters

PCI SCSI RAID adapters that support this type of installation are:
  • Type 4-H, PCI SCSI-2 Fast/Wide RAID adapter (Feature Code 2493)
  • Type 4-T, PCI 3-Channel Ultra2 SCSI RAID adapter (Feature Code 2494)
  • Type 4-X, PCI 4-Channel Ultra3 SCSI RAID adapter (Feature Code 2498)
Notes:
  1. If the image is on the hard disk drive, it must be stored in the /etc/microcode directory on the system. If the system is booted from a NIM server, the image must be stored in the usr/lib/microcode directory of the SPOT the client is booted from.
  2. If the image is on a diskette, the diskette must be in backup format and the image stored in the /etc/microcode directory.
Syntax: diag [-c] -d RAIDadapterName -T "download [-B][-D][-P]"
-c
No console mode. Run without user interaction.
-d RAIDadapterName
Run the task on the RAID adapter specified.
-T download
Install microcode.
-B
Install boot block microcode. Default is functional microcode.
-D
Microcode image is on diskette. Default is /etc/microcode.
-P
Install the previous level of microcode. Default is latest level.

Microcode installation to disk drive attached to PCI SCSI RAID adapters

Microcode for a disk drive attached to a PCI SCSI RAID adapter is installed through the adapter to the drive. PCI SCSI RAID adapters that support this type of installation are:
  • Type 4-H, PCI SCSI-2 Fast/Wide RAID adapter (Feature Code 2493)
  • Type 4-T, PCI 3-Channel Ultra2 SCSI RAID adapter (Feature Code 2494)
  • Type 4-X, PCI 4-Channel Ultra3 SCSI RAID adapter (Feature Code 2498)
Notes:
  1. If the image is on the hard disk drive, it must be stored in the /etc/microcode directory on the system. If the system is booted from a NIM server, the image must be stored in the usr/lib/microcode directory of the SPOT the client is booted from.
  2. If the image is on a diskette, the diskette must be in backup format and the image stored in the /etc/microcode directory.
Syntax: diag [-c] -d RAIDadapterName -T "download {-l chID | -A} [-D][-P]"
-c
No console mode. Run without user interaction.
-d RAIDadapterName
Name of the RAID adapter the disk is attached to.
-T download
Install microcode.
-l
Physical disk channel/ID of RAID disk drive (example: 27).
-A
All disk drives attached to specified RAID adapter.
-D
Microcode image is on diskette. Default is /etc/microcode.
-P
Install the previous level of microcode. Default is the latest level.
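
The syntax above requires exactly one of -l chID or -A. The following minimal Python sketch shows one way to enforce that rule when composing the quoted -T argument. The function name raid_disk_download_task is hypothetical; only the flags listed above are used.

```python
def raid_disk_download_task(ch_id=None, all_disks=False, from_diskette=False, previous=False):
    """Return the -T argument for a microcode download to RAID-attached disks."""
    # Exactly one of "-l chID" or "-A" must be chosen.
    if (ch_id is not None) == all_disks:
        raise ValueError("specify exactly one of ch_id or all_disks")
    parts = ["download", "-A" if all_disks else f"-l {ch_id}"]
    if from_diskette:
        parts.append("-D")   # image is on diskette instead of /etc/microcode
    if previous:
        parts.append("-P")   # previous level instead of the latest level
    return " ".join(parts)

task = raid_disk_download_task(ch_id="27")
# scraid0 is the example adapter name used elsewhere in this topic
print('diag -c -d scraid0 -T "%s"' % task)
```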

Fault indicators

This task is only available through a command-line interface. It is not available from the diagnostic menu or from stand-alone diagnostics.

The fault indicators are used to identify a fault with the system. These indicators might be set automatically by hardware, firmware, or diagnostics when a fault is detected in the system.

The System Attention Indicator is turned off when a Log Repair Action is performed. All other Fault Indicators are turned off when the failing unit is repaired or replaced. After a serviceable event is complete, do a System Verification to verify the fix. Also, do a Log Repair Action if the test on the resource was good, and that resource had an entry in the error log.

For more information about the use of these indicators, see the service information for the system unit you are using.
Note: The AIX command does not allow you to set the fault indicators to the fault state.
Use the following command syntax:
/usr/lpp/diagnostics/bin/usysfault [-s normal] [-l location code | -d devicename]
/usr/lpp/diagnostics/bin/usysfault [-t]
-s normal
Sets the fault indicator to the normal state.
-l location code
Identifies the resource by physical location code.
-d device name
Identifies the resource by device name.
-t
Displays a list of all supported fault indicators by physical location codes.

When the command is used without the -s flag, the current state of the indicator is displayed as normal or fault.

When the command is used without the -l or -d flag, the System Attention Indicator is used.

Use the -l or -d flags only in systems that have more than one fault indicator.
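
The defaults described above can be summarized with a small Python sketch that interprets a usysfault argument list. The function describe_usysfault is a hypothetical helper written for this illustration, and the device name hdisk0 is an assumed example; the sketch does not call the command.

```python
def describe_usysfault(args):
    """Summarize what a usysfault argument list requests, per the documented defaults."""
    if "-t" in args:
        return "list all supported fault indicators by physical location code"
    # Without -s, the command only displays the current state (normal or fault).
    action = "set to normal" if "-s" in args else "display current state"
    if "-l" in args:
        target = "indicator at location " + args[args.index("-l") + 1]
    elif "-d" in args:
        target = "indicator for device " + args[args.index("-d") + 1]
    else:
        # Without -l or -d, the System Attention Indicator is used.
        target = "System Attention Indicator"
    return f"{target}: {action}"

print(describe_usysfault([]))
print(describe_usysfault(["-s", "normal", "-d", "hdisk0"]))
```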

Fibre Channel RAID service aids

The Fibre Channel RAID service aids contain the following functions:
Certify LUN
This selection reads and checks each block of data in the logical unit number (LUN). If excessive errors are encountered, you are notified.
You can run this task from the AIX command line. Use the following AIX fast path command:
diag -T "certify"
Certify spare physical disk
This selection certifies (checks the integrity of the data on) drives that are designated as spares.
You can run this task from the AIX command line. Use the following fast path command:
diag -T "certify"
Format physical disk
This selection formats a selected disk drive.
You can run this task from the AIX command line. Use the following fast path command:
diag -T "format"
Array controller microcode download
This selection updates the microcode on the Fibre Channel RAID controller when required.
You can run this task from the AIX command line. Use the following fast path command:
diag -T "download"
Physical disk microcode download
This selection updates the microcode on any of the disk drives in the array.
You can run this task from the AIX command line. Use the following fast path command:
diag -T "download"
Update EEPROM
This selection updates the contents of the electronically erasable programmable read-only memory (EEPROM) on a selected controller.
Replace controller
Use this selection when it is necessary to replace a controller in the array.

Flash drive (USB)

Use this command to update microcode images or boot images for stand-alone diagnostics from a flash memory device.

You must first load an ISO 9660 or later image onto a supported USB flash drive. You are prompted to connect a flash drive, select a flash drive from a list of available flash drives, and select a source ISO image. The source image might be on the file system or on removable media.

This service aid is also used to copy the contents of optical media and other flash drives to a flash drive.

Note: There is no command-line interface for this task.

Flash SK-NET FDDI firmware

This task allows the flash firmware on the SysKonnect SK-NET FDDI adapter to be updated.

Format media

This task allows the selection of diskettes, hard disks, or optical media to be formatted.

Hard disk attached to SCSI adapter (non-RAID)

This service aid includes the following options:

Hard disk format
Writes to the entire disk. The pattern written on the disk is device-dependent; for example, some drives might write all zeros, while others might write the hexadecimal number 5F. No bad block reassignment occurs.
Hard disk Format and Certify
Performs the same function as hard disk format. After the format is completed, Certify is run. Certify then reassigns all bad blocks encountered.
Hard disk Erase Disk
This option can be used to overwrite (remove) all data currently stored in user-accessible blocks of the disk. The erase disk option writes one or more patterns to the disk. An additional option allows data in a selectable block to be read and displayed on the system console.
To use the erase disk option, specify the number (0-3) of patterns to be written. The patterns are written serially; that is, the first pattern is written to all blocks. The next pattern is written to all blocks, overlaying the previous pattern. A random pattern is written by selecting the Write Random Pattern? option.
Note: The erase disk service aid is not certified as meeting the Department of Defense or any other security organization guidelines.
To overwrite the data on the drive, use the following steps:
  1. Select Erase Disk.
  2. Do a format without certify.
  3. Select Erase Disk to run it a second time.
For a newly installed drive, you can ensure that all blocks on the drive are overwritten with your pattern by using the following procedure:
  1. Format the drive.
  2. Check the defect MAP by running the erase disk option.
    Note: If you use the format and certify option, there might be some blocks which get placed into the grown defect MAP.
  3. If there are bad blocks in the defect MAP, record the information presented and ensure that this information is kept with the drive. This data is used later when the drive is to be overwritten.
  4. Use the drive as you would normally.
  5. When the drive is no longer needed and is to be erased, run the same version of the erase disk option which was used in step 2.
    Note: Using the same version of the service aid is only critical if any bad blocks were found in step 3.
  6. Compare the bad blocks which were recorded for the drive in step 3 with the bad blocks that now appear in the grown defect MAP.
    Note: If there are differences between the saved data and the newly obtained data, all sectors on this drive cannot be overwritten. The new bad blocks are not overwritten.
  7. If the bad block list is the same, continue running the service aid to overwrite the disk with the chosen pattern or patterns.
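
The comparison in steps 6 and 7 can be sketched in Python as follows. This is an illustration only: the function name compare_defect_maps is hypothetical, and it assumes that the bad blocks recorded in step 3 and those shown in the grown defect map have been written down as lists of block numbers, which is an assumption about how you keep the records rather than something the service aid produces.

```python
def compare_defect_maps(recorded_bad_blocks, current_bad_blocks):
    """Compare the bad blocks recorded earlier with the current grown defect map.

    Returns (same, new_blocks). When same is False, the blocks in new_blocks
    were added since the record was made and are not overwritten by the erase.
    """
    recorded = set(recorded_bad_blocks)
    current = set(current_bad_blocks)
    new_blocks = sorted(current - recorded)
    return recorded == current, new_blocks

# Hypothetical block numbers, for illustration only
same, new_blocks = compare_defect_maps([100, 2048], [100, 2048, 5000])
if same:
    print("Lists match: continue the service aid to overwrite the disk.")
else:
    print("Lists differ: not all sectors can be overwritten. New bad blocks:", new_blocks)
```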
This task can be run directly from the command line. The command syntax is:
diag -c -d deviceName -T "format [-s* fmtcert | erase -a {read | write} -P {comma separated list of patterns}] [-F]*"
The following flags are not available for pdisk devices.
Flag
Description
fmtcert
Formats and certifies the disk.
erase
Overwrites the data on the disk.
*
Available in no-console mode only.
-F
Forces the disk erasure even if all blocks cannot be erased because of errors when accessing the grown defect map.
-P
Comma-separated list of hexadecimal patterns to be written to the drive serially. Up to eight patterns can be specified by using a single command. The patterns must be 1, 2, or 4 bytes long without a leading 0x or 0X. Example of using five patterns: -P ff, a5c0, 00, fdb97531, 02468ace
Note: If no patterns are specified for the erase disk option in command-line mode, then the default pattern of 00 is used.
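
The rules for the -P flag (up to eight patterns, each 1, 2, or 4 bytes of hexadecimal with no leading 0x or 0X) can be checked before the command is run. The following Python sketch is one way to do that; the function name validate_erase_patterns is hypothetical, and the check reflects only the rules stated above.

```python
import re

_HEX_DIGITS = re.compile(r"[0-9a-fA-F]+")

def validate_erase_patterns(patterns):
    """Check a list of pattern strings against the documented -P rules.

    Returns the stripped patterns, or raises ValueError on the first violation.
    """
    cleaned = [p.strip() for p in patterns]
    if len(cleaned) > 8:
        raise ValueError("at most eight patterns can be specified")
    for p in cleaned:
        # A leading 0x or 0X fails this test because 'x' is not a hex digit.
        if not _HEX_DIGITS.fullmatch(p):
            raise ValueError(f"not a bare hexadecimal value: {p!r}")
        # 1, 2, or 4 bytes correspond to 2, 4, or 8 hex digits.
        if len(p) not in (2, 4, 8):
            raise ValueError(f"pattern must be 1, 2, or 4 bytes long: {p!r}")
    return cleaned

# The five-pattern example from the flag description above
print(validate_erase_patterns(["ff", "a5c0", "00", "fdb97531", "02468ace"]))
```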

Hard disk attached to PCI SCSI RAID adapter

This function formats the physical disks attached to a PCI SCSI RAID adapter. This task can be run directly from the AIX command line. The command-line syntax is:
diag -c -d RAIDadapterName -T "format {-l chId | -A }"
-l
Physical disk channel/ID (An example of a physical disk channel/ID is 27, where the channel is 2 and the ID is 7.)
-A
All disks
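
The channel/ID value can be read as two concatenated numbers. The following Python sketch splits it, assuming (as in the example 27) that the first digit is the channel and the remaining digits are the SCSI ID. That single-digit channel assumption and the function name split_ch_id are not taken from the command documentation.

```python
def split_ch_id(ch_id):
    """Split a channel/ID string such as '27' into (channel, scsi_id).

    Assumes the channel is the first digit and the SCSI ID is the rest.
    """
    if not ch_id.isdigit() or len(ch_id) < 2:
        raise ValueError("channel/ID must be at least two decimal digits")
    return int(ch_id[0]), int(ch_id[1:])

channel, scsi_id = split_ch_id("27")
print(channel, scsi_id)  # prints: 2 7
```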

Optical media

Use the following functions to check and verify optical media:
Optical Media Initialize
Formats the media without certifying. This function does not reassign the defective blocks or erase the data on the media. This option provides a quick way of formatting the media and cleaning the disk.
Note: It takes approximately 1 minute to format the media.
Optical Media Format and Certify
Formats and certifies the media. This function reassigns the defective blocks and erases all data on the media.
This task can be run directly from the command line. The command-line syntax is:
diag -c -d deviceName -T "format [-s {initialize | fmtcert} ]"
initialize
Formats media without certifying
fmtcert
Formats and certifies the media

DVD-RAM media

Initialize
Formats the media without certifying. This function does not reassign the defective blocks or erase the data on the media. This format type can be used only with previously formatted media.
Format and Certify
Formats and certifies the media. This function reassigns the defective blocks and erases the data on the media by writing an initialization pattern to the entire media.
This task can be run directly from the command line. The command-line syntax is:
diag -c -d deviceName -T "format [-s{initialize|fmtcert}]"
-c
No console mode
-d
Used to specify a device
-s initialize
Initialize the media (quick format). This setting is the default.
-s fmtcert
Formats and certifies the media.
-T
Used to specify the format task

Diskette format

This selection formats a diskette by writing patterns to it.

Gather system information

If you are using the Linux operating system, the gather system information option does not apply. This service aid uses the snap command to collect configuration information about networks, file systems, security, the kernel, the ODM, and other system components. You can also collect SSA adapter and disk drive configuration data, or trace information for software debugging.

The output of the SNAP service aid can be used by field service personnel. The output can also be put on removable media and transferred to remote locations for more extensive analysis.

To use the SNAP task, select Gather system information from the task list. You can select which components you want to collect information for, and where to store the data (hard disk or removable media).

Generic microcode download

Note: Generic microcode download is a subtask that can be accessed after selecting Microcode Tasks. See Microcode tasks.

The generic microcode download service aid provides a means of executing a genucode script from a diskette or tape. The purpose of this generic script is to load microcode to a supported resource.

The genucode program must be downloaded onto diskette or tape in the tar format. The microcode image itself goes onto a separate diskette or tape in restore format. Running the generic microcode download task searches for the genucode script on diskette or tape and runs it. You are prompted to insert the genucode media into the drive. The service aid moves the genucode script file to the /tmp directory and runs the program that downloads the microcode to the adapter or device.

This service aid is supported in both concurrent and stand-alone modes from disk, LAN, or loadable media.

Hot plug task

Attention: Some systems do not support hot pluggable procedures. These systems must be shut down and powered off before replacing any PCI adapter or device. Follow the non-hot pluggable adapter or device procedures when replacing a PCI adapter or device on any of these systems.

The hot plug task provides software function for those devices that support hot plug or hot swap capability. These devices include PCI adapters, SCSI devices, and some RAID devices. This task was previously known as SCSI Device Identification and Removal or Identify and Remove Resource.

If you are running the AIX operating system, the hot plug task has a restriction when running in stand-alone or online service mode. New devices cannot be added to the system unless there is already a device with the same FRU part number installed in the system. This restriction is in place because the device software package for the new device cannot be installed in stand-alone or online service mode.

Depending on the environment and the software packages installed, selecting this task displays the following subtasks:
  • PCI hot plug manager
  • SCSI hot plug manager
  • RAID hot plug devices

To run the hot plug task directly from the AIX command line, type the following command: diag -T "identifyRemove"

If you are running the diagnostics in online concurrent mode, run the missing options resolution procedure immediately after removing any device.

If the missing options resolution procedure runs with no menus or prompts, device configuration is complete. Otherwise, select the device that has an uppercase M in front of it in the resource list so that missing options processing can be done on that resource.

PCI hot plug manager

The PCI hot plug manager task is a SMIT menu that enables you to identify, add, remove, or replace PCI adapters that are hot pluggable. The following functions are available under this task:
List PCI hot plug slots
Lists all PCI hot plug slots. Empty slots and populated slots are listed. Populated slot information includes the connected logical device. The slot name consists of the physical location code and the description of the physical characteristics for the slot.
Add a PCI hot plug adapter
Prepares a slot for the addition of a new adapter. The function lists all the empty slots that support hot plug. When a slot is selected, the visual indicator for the slot flashes at the identify rate. After the slot location is confirmed, the visual indicator for the specified PCI slot is set to the action state. This means that the power for the PCI slot is off and the new adapter can be plugged in.
Replace/remove a PCI hot plug adapter
Prepares a slot for adapter exchange. The function lists all the PCI slots that support hot plug and are occupied. The list includes the physical location code of the slot and the device name of the resource installed in the slot. The adapter must be in the defined state before it can be prepared for hot plug removal. When a slot is selected, the visual indicator for the slot is set to the identify state. After the slot location is confirmed, the visual indicator for the specified PCI slot is set to the action state. This means that the power for the PCI slot is off, and the adapter can be removed or replaced.
Identify a PCI hot plug slot
Helps identify the location of a PCI hot plug adapter. The function lists all the PCI slots that are occupied or empty and support hot plug. When a slot is selected for identification, the visual indicator for the slot is set to the identify state.
Unconfigure devices
Attempts to put the selected device, in the PCI hot plug slot, into the defined state. This action must be done before any attempted hot plug function. If the unconfigure function fails, it is possible that the device is still in use by another application. In this case, the customer or system administrator must be notified to quiesce the device.
Configure devices
Allows a newly added adapter to be configured into the system for use. This function must be used when a new adapter is added to the system.
Install/configure devices added after IPL
Attempts to install the necessary software packages for any newly added devices. The software installation media or packages are required for this function.
The stand-alone diagnostics have restrictions on using the PCI hot plug manager. For example:
  • Adapters that are replaced must be the same FRU part number as the adapter that is being replaced.
  • New adapters cannot be added unless a device of the same FRU part number exists in the system. This restriction exists because the configuration information for the new adapter is not known after the stand-alone diagnostics are booted.
  • The following functions are not available from the stand-alone diagnostics and are not displayed in the list:
    • Add a PCI hot plug adapter
    • Configure devices
    • Install/configure devices added after IPL
You can run this task directly from the AIX command line by typing the following command:
diag -d device -T "identifyRemove"

However, some devices support both the PCI hot plug task and the RAID hot plug devices task. If this is the case for the device specified, then the hot plug task displays instead of the PCI hot plug manager menu.

SCSI hot plug manager

This task was previously known as SCSI Device Identification and Removal or Identify and Remove Resources. This task allows you to identify, add, remove, and replace a SCSI device in a system unit that uses a SCSI Enclosure Services (SES) device. The following functions are available:
List the SES Devices
Lists all the SCSI hot plug slots and their contents. Status information about each slot is also available. The status information available includes the slot number, device name, whether the slot is populated and configured, and location.
Identify a Device Attached to an SES Device
Identifies the location of a device attached to an SES device. This function lists all the slots, occupied or empty, that support hot plug. When a slot is selected for identification, the visual indicator for the slot is set to the Identify state.
Attach a Device to an SES Device
Lists all empty hot plug slots that are available for the insertion of a new device. After a slot is selected, the power is removed. If available, the visual indicator for the selected slot is set to the remove state. After the device is added, the visual indicator for the selected slot is set to the normal state, and power is restored.
Replace/Remove a Device Attached to an SES Device
Lists all populated hot plug slots that are available for removal or replacement of the devices. After a slot is selected, the device that is populating that slot is unconfigured; then the power is removed from that slot. If the unconfigure operation fails, it is possible that the device is in use by another application. In this case, the customer or system administrator must be notified to quiesce the device. If the unconfigure operation is successful, the visual indicator for the selected slot is set to the remove state. After the device is removed or replaced, the visual indicator, if available for the selected slot, is set to the normal state, and power is restored.
Note: Before you remove the device, be sure that no other host is using it.
Configure Added/Replaced Devices
Runs the configuration manager on the parent adapters that had child devices added or removed. This function ensures that the devices in the configuration database are configured correctly.
The stand-alone diagnostics have restrictions on using the SCSI hot plug manager. For example:
  • Devices being used as replacement devices must be the same type of device as the device that is being replaced.
  • New devices cannot be added unless a device of the same FRU part number exists in the system. This restriction exists because the configuration information for the new device is not known after the stand-alone diagnostics are booted.
You can run this task directly from the AIX command line. The command-line syntax is:
diag -d device -T "identifyRemove"
OR
diag [-c] -d device -T "identifyRemove -a [identify|remove]"
-a
Specifies the option under the task.
-c
Run the task without displaying menus. Only command-line prompts are used. This flag is only applicable when running an option such as identify or remove.
-d
Indicates the SCSI device.
-T
Specifies the task to run.

SCSI and SCSI RAID hot plug manager

This task was previously called SCSI hot-swap manager, SCSI device identification and removal, or Identify and remove resources. This task allows the user to identify, add, remove, and replace a SCSI device in a system unit that uses a SCSI hot plug enclosure device. This task also performs these functions on a SCSI RAID device attached to a PCI-X RAID controller. The following functions are available:
List the SCSI hot plug enclosure devices
Lists all the SCSI hot plug slots and their contents. Status information about each slot is also available. The status information available includes the slot number, device name, whether the slot is populated and configured, and location.
Identify a device attached to a SCSI hot plug enclosure device
Helps identify the location of a device attached to a SCSI hot plug enclosure device. This function lists all the slots, occupied or empty, that support hot plug. When a slot is selected for identification, the visual indicator for the slot is set to the identify state.
Attach a device to a SCSI hot plug enclosure device
Lists all empty hot plug slots that are available for the insertion of a new device. After a slot is selected, the power is removed. If available, the visual indicator for the selected slot is set to the remove state. After the device is added, the visual indicator for the selected slot is set to the normal state, and power is restored.
Replace/remove a device attached to a SCSI hot plug enclosure device
Lists all populated hot plug slots that are available for removal or replacement of the devices. After a slot is selected, the device that is populating that slot is unconfigured; then the power is removed from that slot. If the unconfigure operation fails, it is possible that the device is in use by another application. In this case, the customer or system administrator must be notified to quiesce the device. If the unconfigure operation is successful, the visual indicator for the selected slot is set to the remove state. After the device is removed or replaced, the visual indicator, if available for the selected slot, is set to the normal state, and power is restored.
Note: Before you remove the device, be sure that no other host is using it.
Configure added/replaced devices
Runs the configuration manager on the parent adapters that had child devices added or removed. This function ensures that the devices in the configuration database are configured correctly.
The stand-alone diagnostics have restrictions on using the SCSI hot plug manager. For example:
  • Devices being used as replacement devices must be the same type of device as the device that is being replaced.
  • New devices cannot be added unless a device of the same FRU part number exists in the system. This restriction exists because the configuration information for the new device is not known after the stand-alone diagnostics are booted.
You can run this task directly from the AIX command line. The command syntax is:
diag -d device -T "identifyRemove"
OR
diag -d device -T "identifyRemove -a [identify|remove]"
-a
Specifies the option under the task.
-d
Indicates the SCSI device.
-T
Specifies the task to run.

RAID hot plug devices

This task allows the user to identify or remove a RAID device in a system unit that uses a SCSI Enclosure Services (SES) device. The following subtasks are available:
  • Normal
  • Identify
  • Remove

The normal subtask is used to return a RAID hot plug device to its normal state. This subtask is used after a device is identified or replaced. This subtask lists all channel/IDs of the RAID and the status of the devices that are connected. A device in its normal state has power and the check light is off.

The identify subtask is used to identify the physical location of a device or an empty position in the RAID enclosure. This subtask lists all channel/IDs of the RAID and the status of the devices that are connected to the RAID enclosure. If a device is attached to the selected channel/ID, the check light on the device will begin to flash. If the channel/ID does not have a device attached, the light associated with the empty position on the enclosure will begin to flash.

The remove subtask is used to put the RAID hot plug device in a state where it can be removed or replaced. This subtask lists all channel/IDs of the RAID adapter that have devices that can be removed. Only devices with a status of Failed, Spare, Warning, or Non Existent can be removed. The status of a device can be changed with the AIX smitty pdam command. After a device is selected for removal, the check light on the device will begin to flash, indicating that you can physically remove that device.

The stand-alone diagnostics have restrictions on using the RAID hot plug manager:
  • Devices being used as replacement devices must be the same type of device as the device that is being replaced.
  • New devices cannot be added unless a device of the same FRU part number exists in the system. This restriction exists because the configuration information for the new device is not known after the stand-alone diagnostics are booted.

You can run this task directly from the AIX command line. The command-line syntax is:

diag -c -d devicename -T "identifyRemove -l ChId -s {identify|remove|normal}"
-c
Run the task without displaying menus. Only command-line prompts are used.
-d
RAID adapter device name (for example, scraid0).
-s
Subtask to start, such as identify, remove, or normal.
-l
ChId is the channel number of the RAID adapter and SCSI ID number of the position in the enclosure concatenated together (for example, 27 for channel 2, device 7).
-T
Task to run.
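
To show how the ChId value and the subtask fit into the command, the following Python sketch concatenates a channel and SCSI ID and assembles the command line as a string. The function name raid_hot_plug_command is hypothetical; scraid0 and the 27 example come from the flag descriptions above. The sketch only prints the command and does not run diag.

```python
import shlex

def raid_hot_plug_command(adapter, channel, scsi_id, subtask):
    """Assemble the diag command for a RAID hot plug subtask."""
    if subtask not in ("identify", "remove", "normal"):
        raise ValueError("subtask must be identify, remove, or normal")
    ch_id = f"{channel}{scsi_id}"   # channel and SCSI ID concatenated, e.g. 2 and 7 -> 27
    task = f"identifyRemove -l {ch_id} -s {subtask}"
    return shlex.join(["diag", "-c", "-d", adapter, "-T", task])

print(raid_hot_plug_command("scraid0", 2, 7, "identify"))
# After the device is identified or replaced, return it to its normal state.
print(raid_hot_plug_command("scraid0", 2, 7, "normal"))
```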

Identify indicators

The component and attention LEDs assist in identifying failing components in your server.

Identify and system attention indicators

This task is used to display or set the identify indicators and the single system attention indicator on the systems that support this function.

Some systems might support only the identify indicators or only the attention indicator. The identify indicators are used to help physically identify the system, enclosure, or FRU in a large equipment room. The attention indicator is used to alert a user that the system needs attention and might have a hardware problem. In most cases, setting an identify indicator to the Identify state produces a flashing LED, and setting an attention indicator to the Attention state produces a solid LED.

When a hardware problem is detected on a system that supports the attention indicator, the indicator is set to an attention state. After the failure is identified, repaired, and a repair action is logged, the attention indicator is reset to the normal state.

This task can also be run directly from the AIX command line by typing:
/usr/lpp/diagnostics/bin/usysident [-s {normal | identify}][-l location code | -d device name]
/usr/lpp/diagnostics/bin/usysident [-t] 
-s {normal | identify}
Sets the state of the system identify indicator to either normal or identify.
-l location code
Identifies the resource by physical location code.
-d device name
Identifies the resource by device name.
-t
Displays a list of all supported identify indicators by physical location codes.

When this command is used without the -l or the -d flags, the primary enclosure resource is used.

Use the -l flag only in systems that have more than one identify indicator. Use of the -d flag is preferred over use of the -l flag.

When this command is used without the -s flag, the current state of the identify indicator is displayed.
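
The following Python sketch assembles a usysident command line from the flags described above, preferring -d over -l as recommended. The function name usysident_command and the device name hdisk0 are hypothetical; the sketch builds the string only and does not run the command.

```python
import shlex

USYSIDENT = "/usr/lpp/diagnostics/bin/usysident"

def usysident_command(state=None, device=None, location=None):
    """Assemble a usysident command line.

    With no state, the command displays the current indicator state.
    With neither device nor location, the primary enclosure resource is used.
    """
    if state not in (None, "normal", "identify"):
        raise ValueError("state must be 'normal' or 'identify'")
    if device and location:
        raise ValueError("use either device or location, not both")
    argv = [USYSIDENT]
    if state:
        argv += ["-s", state]
    if device:
        argv += ["-d", device]      # preferred over -l
    elif location:
        argv += ["-l", location]
    return shlex.join(argv)

print(usysident_command("identify", device="hdisk0"))
print(usysident_command("normal", device="hdisk0"))
```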

Local area network analyzer

This selection is used to exercise the LAN communications adapters (token ring, Ethernet, and Fiber Distributed Data Interface (FDDI)). The following services are available:
  • Connectivity testing between two network stations. Data is transferred between the two stations, requiring the user to provide the IP addresses of both stations.
  • Monitoring ring (token ring only). The ring is monitored for a specified time. Soft and hard errors are analyzed.

Log repair action

The log repair action task logs a repair action in the AIX operating system error log. A repair action log indicates that a FRU has been replaced, and error log analysis should not be done for any errors logged before the repair action. The log repair action task lists all resources. Replaced resources can be selected from the list, and when commit (F7 key) is selected, a repair action is logged for each selected resource.

To locate the failing part in a system or partition, do the following steps:

  1. Log in as root user.
  2. At the command line, enter diag.
  3. Select the Diagnostics Routines option.
  4. When the DIAGNOSTIC MODE SELECTION menu displays, select Problem Determination.
  5. When the ADVANCED DIAGNOSTIC SELECTION menu displays, do one of the following options:
    • To test a single resource, select the resource from the list.
    • To test all the resources available to the operating system, select All Resources.
  6. Press Enter, and wait until the diagnostic programs run to completion, responding to any prompts that appear on the console.
  7. Use the location information for the failing part to activate the indicator light that identifies the failing part. For instructions, see Activate the indicator light for the failing part.

Microcode tasks

Similar microcode tasks are combined under a single task topic, while providing a way to access the microcode and flashing features. The combined tasks that are included under Microcode tasks are:
  • Display microcode level
  • Download microcode
  • Generic microcode download
  • Update system or service processor flash
  • Update and manage system flash

PCI RAID physical disk identify

For a description of the PCI RAID physical disk identify task, see SCSI RAID Physical Disk Status and Vital Product Data.

PCI-X SCSI disk array manager

Restriction:
  • If you are using the AIX operating system, note the following restrictions:
    • There are limits to the amount of disk drive capacity allowed in a single RAID array. For example, when using the 32-bit kernel, there is a capacity limitation of 1 TB for each RAID array. When using the 64-bit kernel, there is a capacity limitation of 2 TB for each RAID array. For RAID adapters and RAID enablement cards, this limitation is enforced by the operating system when the RAID arrays are created using the PCI-X SCSI disk array manager.
    • When creating a RAID array of up to 2 TB by using stand-alone diagnostics, ensure version 5.3.0.40 or higher is used. Previous versions of the stand-alone diagnostics have a capacity limitation of 1 TB for each RAID array.

This service aid calls the smitty pdam fast path, and is used to manage a RAID array connected to a SCSI RAID adapter. It might also be run from stand-alone diagnostics on systems or logical partitions that are running the AIX operating system. If you are running the Linux operating system, use the iprconfig tool for disk array management.

Some of the tasks performed by using this service aid include:
  • Check device status for the disk array on your system.
  • Display information of physical drives and disk arrays.
  • Run recovery options on the RAID. (This action needs to be done at the end of a service call in which you replaced the RAID adapter cache card or changed the RAID configuration.)

Other RAID functions are available by using this service aid; they must be used only by the system administrator who is familiar with the RAID configuration. These functions are normally done when booting AIX by running smitty pdam from the command line.

Attention: Without knowledge of how the RAID was set up, these functions can cause loss of data stored on the RAID.

Process supplemental media

Diagnostic supplemental media contains all the necessary diagnostic programs and files required to test a particular resource. The supplemental media is normally released and shipped with the resource as indicated on the diskette label. Diagnostic supplemental media must be used when the device support has not been incorporated into the latest diagnostic CD/DVD-ROM.

This task processes the diagnostic supplemental media. Insert the supplemental media when you are prompted; then press Enter. After processing has completed, go to the resource selection list to find the resource to test.

Notes:
  1. This task is supported in stand-alone diagnostics only.
  2. Process and test one resource at a time. Run diagnostics after each supplemental media is processed. (For example, if you need to process two supplemental media, run diagnostics twice, once after each supplemental media is processed.)

Read intensive SSD fuel gauge

Use this service aid to display the life expectancy status of pdisk read intensive solid-state drives (SSDs).

This task can be run directly from the AIX command line. To display the status of all supported read intensive SSDs, type the following command and press Enter:

/usr/lpp/diagnostics/bin/pdiskfg

To display the status of a specific SSD, type the following command and press Enter:

/usr/lpp/diagnostics/bin/pdiskfg -d pdiskX

In this command, X is the pdisk number.

If you are running the AIX operating system and are using the online diagnostics, you can run this task directly from the command line. Use the following command syntax:
diag -d pdiskX -T pdiskfg
-d
Device name (for example, pdisk0)
-T
Task to run (pdiskfg is the fuel gauge task for read intensive SSDs)
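
The invocation styles above can be wrapped in a small dry-run helper. The following is a minimal sketch, not part of the diagnostic package: the function names (is_pdisk, pdiskfg_cmd, pdiskfg_diag_cmd) and the DIAG_BIN variable are hypothetical, and the functions only print the command lines so that they can be reviewed before they are run on an AIX system.

```shell
#!/bin/sh
# Dry-run sketch: print the fuel gauge command lines instead of running them.
# The function names are hypothetical; the path and flags come from this topic.
DIAG_BIN=/usr/lpp/diagnostics/bin

is_pdisk() {
    # Accept names that start with "pdisk" followed by a digit (for example pdisk0)
    case "${1:-}" in
        pdisk[0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

pdiskfg_cmd() {
    # No argument: status of all supported read intensive SSDs
    if [ $# -eq 0 ]; then
        echo "$DIAG_BIN/pdiskfg"
    elif is_pdisk "$1"; then
        echo "$DIAG_BIN/pdiskfg -d $1"
    else
        echo "not a pdisk name: $1" >&2
        return 1
    fi
}

pdiskfg_diag_cmd() {
    # Same task through the diag fast path (online diagnostics)
    if is_pdisk "${1:-}"; then
        echo "diag -d $1 -T pdiskfg"
    else
        echo "not a pdisk name: ${1:-}" >&2
        return 1
    fi
}

pdiskfg_cmd
pdiskfg_cmd pdisk0
pdiskfg_diag_cmd pdisk0
```

To run a printed command, copy it to the AIX command line, or pass the output of the helper to eval after you have checked it.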

Run diagnostics

If you are using the AIX operating system or the stand-alone diagnostics, the run diagnostics task starts the resource selection list menu. When the commit key is pressed, diagnostics are run on all selected resources.

The procedures for running the diagnostics depend on the state of the diagnostics runtime options. See Display or change diagnostic run time options.

Run error log analysis

The run error log analysis task starts the resource selection list menu. When the commit key is pressed, error log analysis is run on all selected resources.

SCSI bus analyzer

Use this service aid to diagnose a SCSI bus problem in a freelance mode.

To use this service aid, you must understand how a SCSI bus works. Use this service aid when the diagnostics cannot communicate with anything on the SCSI bus and cannot isolate the problem. To find a problem on the SCSI bus with this service aid, start with a single device attached, ensure that it is working, then start adding devices and cables to the bus. After each addition, ensure that each one works. This service aid works with any valid SCSI bus configuration.

The SCSI bus service aid transmits a SCSI inquiry command to a selectable SCSI address. The service aid then waits for a response. If no response is received within a defined amount of time, the service aid displays a timeout message. If an error occurs or a response is received, the service aid then displays one of the following messages:
  • The service aid transmitted a SCSI Inquiry Command and received a valid response back without any errors being detected.
  • The service aid transmitted a SCSI Inquiry Command and did not receive any response or error status back.
  • The service aid transmitted a SCSI Inquiry Command and the adapter indicated a SCSI bus error.
  • The service aid transmitted a SCSI Inquiry Command and an adapter error occurred.
  • The service aid transmitted a SCSI Inquiry Command and a check condition occurred.

When the SCSI bus service aid is started, a description of the service aid displays.

Pressing Enter displays the adapter selection menu. Use this menu to select the adapter that transmits the SCSI inquiry command.

When the adapter is selected, the SCSI bus address selection menu displays. Use this menu to enter the address to transmit the SCSI inquiry command.

After the address is selected, the SCSI bus test run menu displays. Use this menu to transmit the SCSI inquiry command by pressing Enter. The service aid then indicates the status of the transmission. When the transmission is completed, the results of the transmission are displayed.

Notes:
  1. A check condition can be returned when the bus or device is working correctly.
  2. If the device is in use by another process, the command is not sent.

SCSI RAID physical disk status and vital product data

Note: This task was previously known as the PCI RAID physical disk identify task.

Use this service aid when you want to look at the vital product data for a specific disk attached to a RAID adapter. This service aid displays all disks that are recognized by the PCI RAID adapter, along with their status, physical location, microcode level, and other vital product data. The physical location of a disk consists of the channel number of the RAID adapter and the SCSI ID number of the position in the enclosure. The microcode level is listed next to the physical location of the disk.

If you are running the AIX operating system and are using the online diagnostics, you can run this task directly from the command line. Use the following command syntax:
diag -c -d devicename -T "identify"
-c
Run the task without displaying menus. Only command-line prompts are used.
-d
RAID adapter device name (for example, scraid0).
-T
Task to run.

SCSD tape drive service aid

Use this service aid to obtain the status or maintenance information from an SCSD tape drive. Not all models of SCSD tape drive are supported.

The service aid provides the following options:
Display time since a tape drive was last cleaned.
The time since the drive was last cleaned displays on the screen. A message also indicates whether cleaning the drive is recommended.
Copy a trace table for a tape drive.
The trace table of the tape drive is written to diskettes or a file. The diskettes must be formatted for DOS. Writing the trace table might require several diskettes. The actual number of diskettes is determined by the size of the trace table. Label the diskettes as follows:

TRACEx.DAT (where x is a sequential diskette number). The complete trace table consists of the sequential concatenation of all the diskette data files.

When the trace table is written to a disk file, the service aid prompts for a file name. The default name is /tmp/TRACE.x, where x is the name of the SCSD tape drive that is being tested.

Display or copy log sense information for a tape drive.
The service aid provides options to display the log sense information on the screen, to copy it to a DOS-formatted diskette, or to copy it to a file. The file name LOGSENSE.DAT is used when the log sense data is written to the diskette. If you choose to copy the log sense data to a file, you are prompted for a file name.
This service aid can be run directly from the AIX command line. See the following command syntax (the path is /usr/lpp/diagnostics/bin/utape):
utape [-h | -?] [-d device] [-n | -l | -t]
OR
utape -c -d device [-v] {-n | {-l | -t} { -D | -f [ filename]}}
Flag
Description
-c
Run the service aid without displaying menus. The return code indicates success or failure. The output is suppressed except for the usage statement and the numeric value for hours since cleaning (if -n and -D flags are used).
-D
Copy data to diskette.
-f
Copy data to the file name given after this flag or to a default file name if no name is specified.
-h, -?
Display a usage statement or return code. If the -c flag is present, only the return code displays to indicate that the service aid did not run. If the -c is not used, a usage statement displays and the service aid exits.
-l
Display or copy log sense information.
-n
Display time since drive was last cleaned.
-t
Copy trace table.
-v
Verbose mode. If the -c flag is present, the information displays on the screen. If the -n flag is present, the information about tape-head cleaning is printed.
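
The non-menu (-c) syntax allows only certain flag combinations: -n stands alone, while -l and -t each need a destination (-D or -f). The following dry-run sketch shows one way to encode those rules. The function name utape_cmd, its argument order, and the device name rmt0 and file name used in the sample calls are assumptions for illustration; the function prints a command line and does not run utape.

```shell
#!/bin/sh
# Hypothetical dry-run builder for the non-menu (-c) form of utape.
# It prints the command line only; nothing is executed.
UTAPE=/usr/lpp/diagnostics/bin/utape

utape_cmd() {
    # $1 = device, $2 = action (clean | logsense | trace),
    # $3 = destination (diskette | file), $4 = optional file name
    dev=$1; action=$2; dest=${3:-}; file=${4:-}
    case "$action" in
        clean)
            # -n takes no destination; -v asks for the cleaning information
            echo "$UTAPE -c -d $dev -v -n"
            return 0 ;;
        logsense) flag=-l ;;
        trace)    flag=-t ;;
        *) echo "unknown action: $action" >&2; return 1 ;;
    esac
    case "$dest" in
        diskette) echo "$UTAPE -c -d $dev $flag -D" ;;
        # With no file name, -f falls back to the default file name
        file)     echo "$UTAPE -c -d $dev $flag -f${file:+ $file}" ;;
        *) echo "-l and -t need a destination: diskette or file" >&2; return 1 ;;
    esac
}

utape_cmd rmt0 clean
utape_cmd rmt0 logsense file /tmp/logsense.out
utape_cmd rmt0 trace diskette
```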

Spare sector availability

This selection checks the number of spare sectors available on the optical disk. Spare sectors are used for reassignment when defective sectors are encountered during normal usage or during a format and certify operation. Low availability of spare sectors indicates that the disk must be backed up and replaced. Formatting the disk does not improve the availability of spare sectors.

You can run this task directly from the AIX command line. The command syntax is:
diag -c -d deviceName -T chkspares
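
When several optical disks need checking, the same command can be generated for each device. This is a minimal sketch: the function name chkspares_cmds is hypothetical, and the device names omd0 and omd1 are assumed examples that you replace with the names reported on your system. The function prints the commands and does not run them.

```shell
#!/bin/sh
# Print the spare-sector check command for each device name given.
# Device names are placeholders; nothing is executed.
chkspares_cmds() {
    for dev in "$@"; do
        echo "diag -c -d $dev -T chkspares"
    done
}

chkspares_cmds omd0 omd1
```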

SSA service aid

If you are using the Linux operating system, the SSA service aid option does not apply. This service aid provides tools for diagnosing and resolving problems on SSA-attached devices. The following tools are provided:
  • Set Service Mode
  • Link Verification
  • Configuration Verification
  • Format and Certify Disk

System fault indicator

If a failing component is detected in your system, an amber-colored attention LED on the front of the system unit is turned on solid (not flashing).

System identify indicator

To help you identify a system in a group of systems, an amber-colored attention LED on the front of the system unit flashes.

Update disk-based diagnostics

This service aid allows fixes (APARs) to be applied.

This task starts the SMIT update software by fix (APAR) task. The task allows the input device and APARs to be selected. You can install any APAR by using this task.

Update system or service processor flash

Notes:
  • Update system or service processor flash is a subtask that can be accessed after selecting Microcode Tasks. See Microcode tasks.
  • This task has been replaced with the Update and Manage System Flash task. See Update and manage system flash.
Attention: If the system is running on a logically partitioned system, ask the customer or system administrator if a service partition has been designated.
  • If a service partition has been designated, ask the customer or system administrator to shut down all of the partitions except the one with service authority. The firmware update can then be done by using the service aid or the command line in that partition.
  • If a service partition has not been designated, the system must be shut down. If the firmware update image is available on backup diskettes or optical media, the firmware update can then be done from the service processor menus as a privileged user. If the firmware update image is in a file on the system, reboot the system in a full system partition and use the following normal firmware update procedures.

If the system is already in a full system partition, use the following normal firmware update procedures.

This selection updates the system or service processor flash. Some systems might have separate images for system and service processor firmware; newer systems have a combined image that contains both in one image.

Look for additional update and recovery instructions with the update kit. You need to know the fully qualified path and file name of the flash update image file provided in the kit. If the update image file is on a diskette or optical media, the service aid can list the files on the diskette or optical media for selection. The diskette must be a valid backup format diskette.

See the update instructions with the kit, or the service information for the system unit to determine the current level of the system unit or service processor flash memory.

When this service aid is run from online diagnostics, the flash update image file is copied to the /var file system. Put the source of the microcode that you want to download into the /etc/microcode directory on the system. If there is not enough space in the /var file system for the new flash update image file, an error is reported. If this error occurs, exit the service aid, increase the size of the /var file system, and try the service aid again. After the file is copied, a screen requests confirmation before continuing with the flash update. When you continue with the flash update, the system reboots by using the shutdown -u command. The system does not return to the diagnostics, and the current flash image is not saved. After the reboot, you can remove the /var/update_flash_image file.

When this service aid is run from the stand-alone diagnostics, the flash update image file is copied to the file system from diskette, optical media, or from the Network Installation Management (NIM) server. If you use a diskette, you must provide the image on backup format diskette because you will not have access to remote file systems or any other files that are on the system. Before you can boot diagnostics from the NIM server, you must ensure that the microcode image is copied to the /usr/lib/microcode directory on the NIM server. Then point to the NIM SPOT (from which you plan to have the NIM client boot stand-alone diagnostics). Next, a NIM check operation must be run on the SPOT containing the microcode image on the NIM server. After performing the NIM boot of diagnostics, you can use this service aid to update the microcode from the NIM server. Choose the /usr/lib/microcode directory when prompted for the source of the microcode that you want to update. If there is not enough space available, an error is reported, stating additional system memory is needed. After the file is copied, a screen requests confirmation before continuing with the flash update. When you continue with the update, the system reboots by using the reboot -u command. You might receive a Caution: some processes would not die message during the reboot process. You can ignore this message. The current flash image is not saved.

You can use the update_flash command in place of this service aid. The command is in the /usr/lpp/diagnostics/bin directory. The command syntax is as follows:
update_flash [-q] -f file_name
update_flash [-q] -D device_name -f file_name
update_flash [-q] -D device_name -l
Attention: The update_flash command reboots the entire system. Do not use this command if more than one user is logged in to the system.
Flag
Description
-D
Specifies that the flash update image file is on diskette. The device_name variable specifies the device. The default device_name is /dev/fd0.
-f
Flash update image file source. The file_name variable specifies the fully qualified path of the flash update image file.
-l
Lists the files on a diskette, from which the user can choose a flash update image file.
-q
Forces the update_flash command to update the flash EPROM and reboot the system without asking for confirmation.

Update and manage system flash

Note: Update and manage system flash is a subtask that can be accessed after selecting Microcode Tasks. See Microcode tasks.
Attention: If the system is managed by a management console, the firmware update must be done through the management console. If the system is not managed by a management console, the firmware update can be done by using the service aid or the AIX command line.

This selection validates a new system firmware flash image and uses it to update the system temporary flash image. This selection can also be used to validate a new system firmware flash image without performing an update, commit the temporary flash image, and reject the temporary flash image.

When this service aid is run from online diagnostics, the flash update image file is copied to the /var file system. If there is not enough space in the /var file system for the new flash update image file, an error is reported. If this error occurs, exit the service aid, increase the size of the /var file system, and try the service aid again. After the file is copied, a screen requests confirmation before continuing with the flash update. When you continue with the flash update, the system reboots by using the shutdown -u command. The system does not return to the diagnostics, and the current flash image is not saved. After the reboot, you can remove the /var/update_flash_image file.
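
Because the image is copied to /var, it can help to compare the image size with the free space in /var before you start the service aid. The following is a rough sketch under stated assumptions: it assumes that the df command on your system accepts the POSIX -P and -k flags, the function names (free_kb, file_kb, fits) are hypothetical, and the image path in the usage comment is a placeholder. The check is only an estimate and does not replace the error reporting of the service aid.

```shell
#!/bin/sh
# Sketch: compare the image size with the free space in /var before an update.
# Assumes a POSIX-style df (-P -k); function names are illustrative.

free_kb() {
    # Available KB on the file system that holds $1 (4th column of df -P -k)
    df -P -k "$1" | awk 'NR == 2 { print $4 }'
}

file_kb() {
    # Size of file $1 in KB, rounded up
    bytes=$(wc -c < "$1" | tr -d ' ')
    echo $(( (bytes + 1023) / 1024 ))
}

fits() {
    # $1 = needed KB, $2 = available KB; succeeds when the image fits
    [ "$1" -le "$2" ]
}

# Usage (placeholder image path):
#   image=/tmp/flash_image
#   if fits "$(file_kb "$image")" "$(free_kb /var)"; then
#       echo "enough space in /var"
#   else
#       echo "increase the size of /var first"
#   fi
```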

When this service aid is run from stand-alone diagnostics, the flash update image file is copied to the file system from optical media, or from the NIM server. Before performing the NIM boot of diagnostics, the server firmware image must first be copied onto the NIM server in the /usr/lib/microcode directory. Then you must point to the NIM SPOT (from which you plan to have the NIM client boot stand-alone diagnostics). Next, a NIM check operation must be run on the SPOT containing the microcode image on the NIM server. After performing the NIM boot of diagnostics, you can use this service aid to update the microcode from the NIM server. Choose the /usr/lib/microcode directory when prompted for the source of the microcode that you want to update. If enough space is not available, an error is reported, stating additional system memory is needed. After the file is copied, a screen requests confirmation before continuing with the flash update. When you continue with the update, the system reboots by using the reboot -u command. You might receive a message that says: "Caution: some processes would not die" during the reboot process; you can ignore this message. The current flash image is not saved.

If you are using online diagnostics, you can use the update_flash command in place of this service aid. The command is in the /usr/lpp/diagnostics/bin directory. The command syntax is as follows:
update_flash [-q | -v] -f file_name
update_flash [-q | -v] -D device_name -f file_name
update_flash [-q | -v] -D device_name -l
update_flash -c
update_flash -r
Attention: The update_flash command reboots the entire system. Do not use this command if more than one user is logged in to the system.
Flag
Description
-D
Specifies that the flash update image file is on diskette. The device_name variable specifies the device. The default device_name is /dev/fd0.
-f
Flash update image file source. The file_name variable specifies the fully qualified path of the flash update image file.
-l
Lists the files on a diskette, from which the user can choose a flash update image file.
-q
Forces the update_flash command to update the flash EPROM and reboot the system without asking for confirmation.
-v
Validates the flash update image. No update will occur. This flag is not supported on all systems.
-c
Commits the temporary flash image when booted from the temporary image. This action overwrites the permanent image with the temporary image. This flag is not supported on all systems.
-r
Rejects the temporary image when booted from the permanent image. This action overwrites the temporary image with the permanent image. This flag is not supported on all systems.
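
Because update_flash reboots the entire system, it can be useful to assemble and review the command line before running it. The following dry-run sketch maps four actions to the flags described above. The function name flash_cmd and the action keywords are hypothetical, and the image path in the sample calls is a placeholder. The function only prints the command line; the -v, -c, and -r flags are not supported on all systems.

```shell
#!/bin/sh
# Dry-run sketch: print update_flash command lines, never run them.
UPDATE_FLASH=/usr/lpp/diagnostics/bin/update_flash

flash_cmd() {
    # $1 = validate | update | commit | reject
    # $2 = fully qualified image path (validate and update only)
    case "$1" in
        validate|update)
            case "${2:-}" in
                /*) ;;
                *) echo "image path must be fully qualified" >&2; return 1 ;;
            esac
            if [ "$1" = validate ]; then
                # -v validates the image; no update occurs
                echo "$UPDATE_FLASH -v -f $2"
            else
                # Without -q, the command asks for confirmation first
                echo "$UPDATE_FLASH -f $2"
            fi ;;
        commit) echo "$UPDATE_FLASH -c" ;;
        reject) echo "$UPDATE_FLASH -r" ;;
        *) echo "unknown action: ${1:-}" >&2; return 1 ;;
    esac
}

flash_cmd validate /tmp/flash_image
flash_cmd update /tmp/flash_image
flash_cmd commit
```

Validating first and updating second keeps the reboot as the last step, after the image has been checked.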

Examples: Commands

To download the adapter microcode, use this command syntax: diag -c -d deviceName -T "download [-B][-D][-P]"
Flag
Description
-B
Download boot block microcode (default to functional microcode)
-D
Microcode is on diskette (default to /etc/microcode directory)
-P
Download the previous level of microcode (default to latest level)
To download physical disk microcode, use this command syntax: diag -c -d deviceName -T "download -l ChId [-D][-P]"
Flag
Description
-D
Microcode is on diskette (default to the /etc/microcode directory)
-l
Physical disk channel/ID (for example, 27)
-P
Download the previous level of microcode (default to latest level)
To format a physical disk, use this command syntax: diag -c -d deviceName -T "format -l ChId"
Flag
Description
-l
Physical disk channel/ID (for example, 27)
To certify a physical disk, use this command syntax: diag -c -d deviceName -T "certify -l ChId"
Flag
Description
-l
Physical disk channel/ID (for example, 23)

To identify a physical disk, use this command syntax: diag -c -d deviceName -T "identify"
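
The command forms in this section differ mainly in the quoted task string that follows the -T flag. The following dry-run sketch builds that string for the identify, format, certify, and download tasks. The function name raid_task_cmd and its argument order are hypothetical; the adapter name scraid0 and the channel/ID value 27 are the examples used in this topic. Optional flags such as -B, -D, and -P are left out for brevity, and the function prints the command without running it.

```shell
#!/bin/sh
# Sketch: build the quoted -T task string for the PCI RAID fast path tasks.
# Prints the command line only; nothing is executed.
raid_task_cmd() {
    # $1 = adapter (for example scraid0), $2 = task,
    # $3 = physical disk channel/ID for disk tasks (for example 27)
    adapter=$1; task=$2; chid=${3:-}
    case "$task" in
        identify)
            echo "diag -c -d $adapter -T \"identify\"" ;;
        format|certify)
            [ -n "$chid" ] || { echo "$task needs a channel/ID" >&2; return 1; }
            echo "diag -c -d $adapter -T \"$task -l $chid\"" ;;
        download)
            # With a channel/ID the physical disk is targeted, otherwise the adapter
            echo "diag -c -d $adapter -T \"download${chid:+ -l $chid}\"" ;;
        *) echo "unsupported task: $task" >&2; return 1 ;;
    esac
}

raid_task_cmd scraid0 identify
raid_task_cmd scraid0 certify 27
raid_task_cmd scraid0 download 27
```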




Last updated: Mon, March 23, 2020