Troubleshooting
Problem
This document describes how to gather diagnostics for debugging SR-IOV problems.
Resolving The Problem
1. Description of the problem
Provide a description of the error including:
- The error message ID and text, if applicable.
- The adapter location code.
- The approximate time of the error (both HMC clock time and partition clock time).
- Background, including the activity performed at the time of the error and any recent changes.
2. HMC pedbg
Collect pedbg from the HMC managing the server with the adapter. For recent failures (for example, within approximately 24 hours), use the command pedbg -c -q 3. For failures that occurred longer ago than that, use the command pedbg -c -q 4. For further information, refer to IBM Support document 688045, HMC Enhanced View: Collecting PEDBG from the HMC.
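The choice between the two pedbg commands can be sketched as a small shell helper. This is only an illustration: the function name pedbg_command_for_age is hypothetical, the hard 24-hour cutoff stands in for the "approximately 24 hours" guidance above, and it assumes that -q 4 is the option intended for older failures. The helper only prints the command so that it can be reviewed before it is run on the HMC.

```shell
# Print the pedbg command suggested for a failure of the given age.
# ASSUMPTION: the helper name and the exact 24-hour cutoff are illustrative;
# the document only says "approximately 24 hours".
pedbg_command_for_age() {
    hours_since_failure=$1
    if [ "$hours_since_failure" -le 24 ]; then
        echo "pedbg -c -q 3"    # recent failure
    else
        echo "pedbg -c -q 4"    # older failure
    fi
}

# Example: a failure that happened about 6 hours ago.
pedbg_command_for_age 6
```

The helper does not run pedbg itself; copy the printed command to the HMC command line.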
3. Client Partition Data (if applicable)
For issues involving VNICs, collect this from the partition with the VNIC client adapter.
For issues impacting only one partition's SR-IOV logical port, collect this from the partition with the failing logical SR-IOV port.
a. Version, release, and fix level of the partition.
b. Partition specific diagnostics
- AIX
See document N1015043, Gathering a SNAP.
- VIOS
See document 643361, Collecting Snap Data on VIOS Server Partition.
- IBM i
i) Licensed Internal Code logs (LIC logs)
From the Start a Service Tool panel (STRSST, Option 1), select these menu options in order:
5. Licensed Internal Code log.
2. Dump entries to printer from the Licensed Internal Code log.
Set dump option to 3=Header and entire entry.
Set start/end date to the timeframe around the error.
Press Enter.
ii) Advanced Analysis macro "viofr -vlan -all -fr"
* Adapter firmware level requires APAR MA43674 (v7r1m0f: MF58375; v7r2m0f: MF58373).
After running STRSST, select these menu options in order:
1. Start a service tool.
4. Display/Alter/Dump.
2. Dump to printer.
2. Licensed Internal Code (LIC) data.
14. Advanced analysis.
Type 1 next to the blank command line, type viofr on the blank line, and press Enter.
Type -vlan -all -fr on the options line, and press Enter.
The Specify Dump Title panel appears. Enter a title, and press Enter.
The message "Dump to printer successfully submitted" appears.
4. (VNICs only) Hosting VIOS partition data
This applies only when using VNICs. Collect snap from each VIOS with a server VNIC adapter.
For further information on collecting a snap, see document 643361, Collecting Snap Data on VIOS Server Partition.
5. (VNICs only) Hypervisor Resource Dump
For problems involving VNIC adapters, gather a non-disruptive Hypervisor Resource Dump of the server:
https://www.ibm.com/support/pages/node/667943 - How to Initiate a Resource Dump from the HMC
Use selector: system
6. SR-IOV Dump
For issues involving an SR-IOV physical adapter, collect an SR-IOV dump for the failing adapter. An SR-IOV dump generates a dump file whose name starts with "LPADUMP". First check for existing LPADUMPs in the /dump/ directory on the HMC(s). If you find dumps that correspond to the failure times, send those to IBM. If there are no dumps, collect an SR-IOV dump as close to the time of failure as possible. Start with the non-disruptive dump option, which includes SR-IOV firmware data but not an adapter microcode dump. IBM will request a disruptive dump, which includes an adapter microcode dump, if one is needed.
https://www.ibm.com/support/pages/node/667943 - How to Initiate a Resource Dump from the HMC
Use selector: sriov <adapter_location_code>
Example selector: sriov U2C4E.001.DBJY198-P2-C8
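The two selector forms used in this document ("system" for the Hypervisor Resource Dump and "sriov <adapter_location_code>" for the SR-IOV dump) can be sketched as a small shell helper that builds the string to enter. The function name resource_dump_selector and its argument layout are hypothetical and are not part of any HMC tool. The helper only formats the selector; initiating the dump itself is covered by the linked document.

```shell
# Build the resource dump selector string.
# ASSUMPTION: the function name and argument layout are illustrative only.
#   resource_dump_selector system
#   resource_dump_selector sriov <adapter_location_code>
resource_dump_selector() {
    case "${1:-}" in
        system)
            echo "system"
            ;;
        sriov)
            if [ -z "${2:-}" ]; then
                echo "error: sriov selector needs an adapter location code" >&2
                return 1
            fi
            echo "sriov $2"
            ;;
        *)
            echo "error: unknown selector type: ${1:-}" >&2
            return 1
            ;;
    esac
}

# Example with the adapter location code shown above.
resource_dump_selector sriov U2C4E.001.DBJY198-P2-C8
```

Rejecting an empty location code up front avoids submitting a bare "sriov" selector by mistake.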
Collect the new dump and any existing LPADUMPs from the time of the error.
For servers managed by dual HMCs, check both HMCs for any existing LPADUMP files.
- At an HMC command line, run ls -ltr /dump
Example:
- HMCUser@ATSHMC3:~> ls -ltr /dump
...
-rw-rw-r-- 1 ccfw ccfw 31745504 Apr 25 17:12 LPADUMP.1026D2P.2E000000.20140425161138
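To find the dumps that correspond to the failure time, a saved copy of the ls -ltr /dump listing can be filtered on a workstation. The sketch below assumes that the last dot-separated field of an LPADUMP file name is a YYYYMMDDHHMMSS timestamp, which is inferred from the example above and is not confirmed by this document. The function name lpadumps_in_window is hypothetical.

```shell
# Print LPADUMP file names from an `ls -ltr /dump` listing (read on stdin)
# whose name timestamp falls inside a time window.
# ASSUMPTION: the last dot-separated field of the file name is a
# YYYYMMDDHHMMSS timestamp; this is inferred from the example listing.
# Usage: lpadumps_in_window START END < listing.txt
lpadumps_in_window() {
    awk -v start="$1" -v end="$2" '
        {
            name = $NF                        # file name is the last column
            if (name !~ /^LPADUMP\./) next    # skip non-LPADUMP entries
            n = split(name, part, ".")
            stamp = part[n]                   # trailing field of the name
            if (stamp !~ /^[0-9]+$/ || length(stamp) != 14) next
            if (stamp >= start && stamp <= end) print name
        }'
}

# Example: dumps taken on 25 April 2014.
printf '%s\n' \
    '-rw-rw-r-- 1 ccfw ccfw 31745504 Apr 25 17:12 LPADUMP.1026D2P.2E000000.20140425161138' |
    lpadumps_in_window 20140425000000 20140425235959
```

If the name timestamp and the file modification time disagree, treat the listing's modification time as the reference and widen the window accordingly.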
7. Sending data to IBM
Upload the data at: https://www.ecurep.ibm.com/app/upload
For further information on sending data to IBM, see IBM Technical Support Document N1019224, MustGather: Instructions for Sending Data to IBM i Support.
Document Information
Modified date: 19 April 2024
UID: nas8N1019958