This MustGather document was written to help AIX administrators collect the data needed when opening a support case with the AIX System Performance Team.
AIX System Performance Support requires that the person opening the case have some insight into the issue being reported.
- Problem description
- Perfpmr data collection
- What is the exact nature of the performance problem? (Such as slow system response times, longer than normal batch job completion times, slow backups, performance metrics reported by the OS or applications.)
- When did the performance issue first appear?
- Is there just one partition impacted by this slowdown? If not, how many other partitions are impacted? Are the impacted partitions located on the same frame?
- Were there any changes made to the hardware, application, operating system, or network before the problem first appeared? If so, provide details of the changes that were made.
- Is the performance problem chronic (happening constantly) or is it intermittent? If intermittent, how often are you seeing it happen?
- How long does the slowdown last (hours, minutes)?
- Can the slowdown be reproduced on demand?
- How does recovery from this issue occur? Does system or application performance return to normal without user input? How long does it take to recover?
- Are any monitoring tools such as vmstat, iostat, or lparstat being used to help identify the resource bottleneck behind the slowdown? If so, provide the output of those commands from the period when the slowdown occurred as a testcase, along with the timestamp of when the problem occurred.
- Are there any other vendors involved in resolving this issue, such as EMC or Oracle?
- If the issue is related to a batch job, how long does it take for the batch job to complete? Provide both the normal and the slow batch job run times, and specify the time difference in seconds, minutes, or hours.
- How often does the batch job in question run (once per day, once a week)?
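For the monitoring-tool question above, one way to make sure every sample carries a timestamp is to prefix each output line with the current time as it is captured. The following is a minimal sketch, not an IBM-supplied tool: the function names stamp_lines and capture_stats, the output directory, and the 5-second interval and 12-sample count are all illustrative assumptions.

```shell
#!/bin/sh
# Hypothetical helper: prefix every line read from stdin with a timestamp.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}

# Hypothetical helper: run the three monitoring commands in parallel and
# save their timestamped output, one file per tool.
# Arguments (all optional): output directory, interval in seconds, sample count.
capture_stats() {
    outdir=${1:-/tmp/slowdown_stats}
    interval=${2:-5}
    count=${3:-12}
    mkdir -p "$outdir" || return 1
    for tool in vmstat iostat lparstat; do
        "$tool" "$interval" "$count" | stamp_lines > "$outdir/$tool.out" &
    done
    wait
}

# Example (run on the affected partition while the slowdown is occurring):
#   capture_stats /tmp/slowdown_stats 5 12
```

The timestamp is added at capture time rather than reconstructed afterward, so the output files can be lined up against the time the slowdown was reported.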
*You must collect perfpmr data at the time the system is experiencing slowness.*
Here are links to download the perfpmr.sh script and README files. It may be necessary to copy and paste the links into your browser.
- AIX 6.1 http://ftp.software.ibm.com/aix/tools/perftools/perfpmr/perf61/README
- AIX 7.1 http://ftp.software.ibm.com/aix/tools/perftools/perfpmr/perf71/README
- AIX 7.2 http://ftp.software.ibm.com/aix/tools/perftools/perfpmr/perf72/README
- Collect perfpmr data on the VIOS at the same time you are collecting perfpmr data on the client partition, again while the slowness is occurring.
- The steps used to collect perfpmr data on the VIOS are the same as on the client partition; however, it will be necessary to run oem_setup_env for root access:
VIOS 3.1 running AIX 7.2 ftp://ftp.software.ibm.com/aix/tools/perftools/perfpmr/perf72/perf72.tar.Z
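Picking the right README for a partition can be sketched as a small lookup on the AIX level. This is a sketch under assumptions, not a replacement for the README: the helper names perfpmr_dir and perfpmr_readme_url are hypothetical, and only the three AIX levels listed above are covered.

```shell
#!/bin/sh
# Hypothetical helper: map an AIX level string (as printed by `oslevel`,
# for example 7.2.0.0) to the perfpmr directory name used in the links above.
perfpmr_dir() {
    case $1 in
        6.1*) echo perf61 ;;
        7.1*) echo perf71 ;;
        7.2*) echo perf72 ;;
        *)    echo "no perfpmr link listed here for AIX level: $1" >&2
              return 1 ;;
    esac
}

# Hypothetical helper: print the README URL for a given AIX level.
perfpmr_readme_url() {
    dir=$(perfpmr_dir "$1") || return 1
    echo "http://ftp.software.ibm.com/aix/tools/perftools/perfpmr/$dir/README"
}

# Example, as root on the client partition
# (on a VIOS, run oem_setup_env first to get a root shell):
#   perfpmr_readme_url "$(oslevel)"
```

The download, unpack, and run steps themselves are left to the README for the matching release.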
- Right-click the specific LPAR
- Check the box for 'Allow performance information collection'
2. Extend the values. The maximum is 600 seconds for HEARTBEAT_FREQUENCY, and NETWORK_FAILURE_DETECTION_TIME must be at least 10 seconds less than HEARTBEAT_FREQUENCY. A cluster sync is required, and can be done while cluster services are running. These values will be propagated to all nodes.
clmgr modify cluster NETWORK_FAILURE_DETECTION_TIME=90
clmgr sync cluster
3. Run perfpmr. When the perfpmr data collection completes, revert the tunables to their previous values and synchronize.
clmgr modify cluster HEARTBEAT_FREQUENCY=[previous_value]
clmgr sync cluster
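The two constraints stated in step 2 can be checked before any clmgr command is run. The sketch below is illustrative only: is_whole_number and check_hb_tunables are hypothetical helpers, and they encode just the limits quoted above (a 600-second maximum for HEARTBEAT_FREQUENCY and a gap of at least 10 seconds). They do not query or change the cluster.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the argument is a non-empty string of digits.
is_whole_number() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
    esac
    return 0
}

# Hypothetical helper: validate proposed values against the limits in step 2.
# Arguments: HEARTBEAT_FREQUENCY, NETWORK_FAILURE_DETECTION_TIME (both in seconds).
check_hb_tunables() {
    hb=$1
    nfdt=$2
    is_whole_number "$hb" && is_whole_number "$nfdt" || {
        echo "both values must be whole numbers of seconds" >&2
        return 1
    }
    if [ "$hb" -gt 600 ]; then
        echo "HEARTBEAT_FREQUENCY exceeds the 600-second maximum" >&2
        return 1
    fi
    # NETWORK_FAILURE_DETECTION_TIME must be at least 10 seconds lower.
    if [ "$nfdt" -gt $((hb - 10)) ]; then
        echo "NETWORK_FAILURE_DETECTION_TIME must be at least 10 seconds less than HEARTBEAT_FREQUENCY" >&2
        return 1
    fi
    return 0
}

# Example (the value 100 is illustrative, not a recommendation):
#   check_hb_tunables 100 90 && echo "values are consistent"
```

Recording the current values before changing anything makes the revert in step 3 straightforward.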
Upload details are provided in the README files listed above. For convenience, these steps are summarized below.
Upload the yourcase#.pax.gz file created during the perfpmr collection using one of the following options (a, b, or c):
a) Attach to your case
b) Upload to the Enhanced Customer Data Repository (ECuRep)
c) Upload to the Blue Diamond FTP server (Blue Diamond Customers Only)
* Note: For information about doing a Blue Diamond upload see:
20 January 2021