IBM Support

AIX MustGather: System Performance Analysis

Product Documentation


Abstract

This MustGather document helps AIX administrators collect the AIX performance data needed when opening a support case with the AIX System Performance team.

Content

AIX System Performance Support requires the person opening the case to have some insight into the reported performance issue.
Gathering information before calling IBM support might shorten the time it takes to resolve performance issues. The AIX System Performance Analysis team requires at least two pieces of information when diagnosing an AIX system performance issue:
  • Problem description
  • The perfpmr data collection
 1. A detailed problem description of the performance issue is required.
Provide answers to the following questions:
  • What is the exact nature of the performance problem? (For example, slow system response times, longer than normal batch job completion times, slow backups, performance metrics reported by the OS or applications.)
  • When did the performance issue first appear? 
  • Is only one partition impacted by this slowdown? If not, how many other partitions are impacted? Are the impacted partitions on the same frame?
  • Were there any changes made to the hardware, application, operating system, or network before the problem first appeared? If so, provide details of the changes that were made.
  • Is the performance issue constant or intermittent? If intermittent, how often does it occur?
  • How long does the slowdown last (minutes, hours)?
  • Can the slowdown issue be reproduced on demand?
  • Is recovery possible? Does system or application performance return to normal without user input? How long does it take to recover? 
  • Have you used any monitoring tools, such as vmstat, iostat, or lparstat, to help identify the bottleneck? If you ran these commands during the slowdown, provide the timestamps of the slowdown period and the output from those commands. You can upload this data as a test case.
  • Have you engaged other vendors (for example, EMC or Oracle) to help resolve this issue?
  • If the issue is related to a batch job, how long does the batch job take to complete? Provide both the normal and the slow run times, and specify the difference in seconds, minutes, or hours.
  • How often does the batch job in question run (for example, once per day or once per week)?
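One way to capture timestamped output from the monitoring commands mentioned above is a small wrapper that prefixes every output line with the local time. The following is a minimal sketch, assuming Python 3.7 or later is installed on the partition; the helper name run_with_timestamps is hypothetical and is not part of perfpmr or AIX.

```python
import subprocess
import sys
from datetime import datetime


def run_with_timestamps(argv, out=sys.stdout):
    """Run a command and copy its output to `out`, one timestamp per line.

    `argv` is the command as a list, for example ["vmstat", "5", "12"].
    Returns the number of lines written.
    """
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep error messages in the same log
        text=True,
    )
    written = 0
    for line in proc.stdout:
        # Stamp each line with the time it was received by this wrapper.
        stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        out.write(f"{stamp} {line.rstrip()}\n")
        out.flush()
        written += 1
    proc.wait()
    return written
```

For example, run_with_timestamps(["vmstat", "5", "12"]) would log 12 samples at 5-second intervals; the interval and count are arbitrary choices for illustration. Timestamps mark when each line reaches the wrapper, so a command that buffers its output when writing to a pipe can appear delayed.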
*There are more questions in the PROBLEM.INFO file included with the perfpmr tool download script. Answer those questions, if applicable, and place the updated version in the data directory used to store output files. Add these answers to your CSP case to ensure timely technical analysis.
 
2. Collect performance data by using the perfpmr.sh script.
      *You must collect perfpmr data while the system is experiencing the slowdown.*
      *perfpmr data collected before or after the slowdown is not useful.*

Download the perfpmr data collection script from the following link:
The readme file provides detailed information about running perfpmr, and uploading the output to IBM: 
It is recommended to collect perfpmr data on the VIOS if virtual devices are configured:
  • Collect the VIOS perfpmr data at the same time you collect perfpmr on the client partition.
    • Note: Collect the perfpmr data during the slowdown.
  • Run oem_setup_env for root access on the VIOS, then use the same perfpmr steps described for the LPAR client.
Download the VIOS perfpmr data collection scripts from the following links:
VIOS 2.2 running AIX 6.1  
VIOS 3.1 running AIX 7.2:
https://public.dhe.ibm.com/aix/tools/perftools/perfpmr/perf72/perf72.tar.Z
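The download and unpack step can be sketched as follows. This is an illustration only, assuming Python 3 is available, the host can reach the download server directly, and the zcat and tar commands are on the PATH; the function names and the choice of working directory are hypothetical. The readme file remains the authoritative source for the actual procedure.

```python
import subprocess
import urllib.request
from pathlib import Path
from urllib.parse import urlparse

PERF72_URL = (
    "https://public.dhe.ibm.com/aix/tools/perftools/perfpmr/perf72/perf72.tar.Z"
)


def archive_name(url):
    """Return the file name portion of a download URL."""
    return Path(urlparse(url).path).name


def fetch_and_unpack(url, workdir):
    """Download a .tar.Z archive into `workdir` and extract it there."""
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    archive = workdir / archive_name(url)
    urllib.request.urlretrieve(url, str(archive))
    # .Z is the classic compress format, which the Python standard
    # library cannot decode, so the work is handed to zcat and tar.
    zcat = subprocess.Popen(["zcat", str(archive)], stdout=subprocess.PIPE)
    subprocess.run(["tar", "-xf", "-"], stdin=zcat.stdout, cwd=workdir, check=True)
    zcat.stdout.close()
    if zcat.wait() != 0:
        raise RuntimeError(f"zcat failed on {archive}")
    return archive
```

A call such as fetch_and_unpack(PERF72_URL, "/tmp/perfpmr_tool") would place the archive and its extracted files in that directory; the path is an arbitrary example. On hosts without outbound network access, download the archive elsewhere and transfer it manually.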

*For hosts that use shared processors, it is recommended to enable CEC-wide performance information through the HMC before collecting perfpmr data.
      -Log on to the HMC
      -Right-click the specific LPAR
      -Select 'Properties'
      -Check the 'Allow performance information collection' box
  • The change to "Allow performance information collection" is dynamic and does NOT require a reboot of the system.  
  • The change enables the lparstat command to display the number of processors in the shared pool. 
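A quick way to confirm that the setting took effect is to check whether lparstat output now includes the shared-pool column. The sketch below assumes that column is labelled app in the lparstat header row; the sample header strings are illustrative and were not captured from a real system, and the function name is hypothetical.

```python
def shared_pool_column_present(lparstat_output):
    """Return True or False once a header row is found, or None if not found."""
    for line in lparstat_output.splitlines():
        fields = line.split()
        # Assumption: the header row can be recognized by the physc column.
        if "physc" in fields:
            return "app" in fields
    return None


# Illustrative header rows only; real output contains additional lines.
with_pool = "%user  %sys  %wait  %idle physc %entc lbusy   app  vcsw phint"
without_pool = "%user  %sys  %wait  %idle physc %entc lbusy  vcsw phint"

print(shared_pool_column_present(with_pool))     # True
print(shared_pool_column_present(without_pool))  # False
```

In practice the text passed to the function would be the saved output of an lparstat run on the partition in question.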
*Special considerations for hosts running PowerHA SystemMirror (formerly HACMP) version 7.2 or later:
Before running the perfpmr tool on an active PowerHA cluster node, modify certain tunables to reduce the likelihood of a false takeover caused by the additional load. Open a case with the AIX HA team for instructions.
 
TESTCASE UPLOADS
Upload details are provided in the readme files. For convenience, these steps are summarized in this section.

Upload the yourcase#.pax.gz file created during the perfpmr collection by using one of the following options (a, b, or c):

     a) Attach to your case
     https://www.ibm.com/mysupport/s/my-cases

     b) Upload to the Enhanced Customer Data Repository (ECuRep)
     https://www.secure.ecurep.ibm.com/app/upload_sf

     c) Upload to the Blue Diamond FTP server (Blue Diamond Customers Only)
     https://msciportal.im-ies.ibm.com

* Note: For more information about doing a Blue Diamond upload, see:

     http://www.ibm.com/support/docview.wss?uid=nas8N1020947

Document Location

Worldwide

Document Information

Modified date:
15 September 2022

UID

ibm10875894