Question & Answer
Section 1. Answer the following questions about the problem:
- Has the device's FFDC (First Failure Data Capture) feature been configured or enabled? See the Best Practices: Most Detailed Error Report, and ensure that error reports are generated 'Always on Startup'.
- Is this occurring on a single device or on a group of devices? What are the device names, which devices are impacted, and are there others that are not impacted?
- Is the problem behavior something that can be replicated? What service(s) and domain(s) are involved?
- Is the probe enabled? Is debug logging enabled?
- Identify any recent changes made to domains and/or services on the device.
- After collecting the diagnostics below, if the issue persists, try removing all non-management traffic from the device(s) and note whether the issue still occurs.
Section 2. An 'admin' CLI/SSH session to DataPower is required to investigate the problem. Collect the following diagnostics to help establish the root cause of the issue:
- Set up DPMon so that it is running before you replicate the problem:
top
diag
dpmon on
dpmon show
- Set up LLDiag so that it is running before you replicate the problem: https://www.ibm.com/support/docview.wss?uid=ibm10719605
- Run the following commands before, during, and after the problem is recreated (the more iterations over time, the better). A sample script (sample_cli_script.sh) is included at the bottom for periodic captures on a Linux operating system using bash.
show memory details
show activity 100
- Ideally, include at least 5 sets of CLI output. The collection interval depends on how readily the issue can be recreated. The default is 5-minute intervals, but if the issue occurs over shorter periods, collect the CLI outputs more frequently. In production, do not collect CLI data more often than once every 30 seconds.
- In production environments, if there is concern about impact, omit the 'show tcp-table', 'show gateway-transactions' and 'show handles' commands.
- Collect a debug log from the default domain and from the domain where the active service is running. If it is unclear which other domain the debug log should be collected from, the default domain alone is a good start.
- Save an Error Report, either through the CLI ('co; save error-report') or through the WebGUI (Troubleshooting->Generate Error Report).
- Collect a device backup. With this, DataPower Support will have all domains including the default to work from as needed.
Keep in mind that the sample script (sample_cli_script.sh) is only an aid for this collection and is not a supported script. The 'admin' user is required because the script needs access to the 'diag' (diagnostics) prompt. If the script is run with the admin ID and an incorrect password, you could lock yourself out of the device. Adjust the "COUNT" and "sleep" values to suit the problem behavior.
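The periodic capture described above can be sketched as a dry run. This is not the original sample_cli_script.sh and is not a supported script. The function names (`safe_interval`, `capture_commands`, `send_to_device`, `run_captures`) are hypothetical. `send_to_device` is a placeholder that only prints each batch, so no device connection is made. Entering `top` and `diag` before the show commands is an assumption based on the note that the 'diag' prompt is needed.

```shell
#!/usr/bin/env bash
# Dry-run sketch only: not the original sample_cli_script.sh, not a supported script.

MIN_INTERVAL=30       # from the text: no faster than every 30 seconds in production
DEFAULT_INTERVAL=300  # from the text: default of 5-minute segments

# Echo the requested interval in seconds, raised to MIN_INTERVAL if it is too short.
safe_interval() {
  local requested="${1:-$DEFAULT_INTERVAL}"
  if [ "$requested" -lt "$MIN_INTERVAL" ]; then
    echo "$MIN_INTERVAL"
  else
    echo "$requested"
  fi
}

# Print the CLI lines for one capture. Mode "full" adds the commands that the
# text says to omit in production when impact is a concern.
capture_commands() {
  local mode="${1:-light}"
  echo "top"    # assumption: return to the top-level prompt first
  echo "diag"   # assumption: the captures are taken from the 'diag' prompt
  echo "show memory details"
  echo "show activity 100"
  if [ "$mode" = "full" ]; then
    echo "show tcp-table"
    echo "show gateway-transactions"
    echo "show handles"
  fi
}

# Hypothetical placeholder transport: prints the batch instead of sending it.
send_to_device() {
  cat
}

# Run $1 captures, sleeping between them (never after the last one).
# $2 is the requested interval in seconds, $3 is "light" or "full".
run_captures() {
  local count="$1" mode="${3:-light}" interval i=1
  interval="$(safe_interval "$2")"
  while [ "$i" -le "$count" ]; do
    echo "=== capture $i of $count: $(date -u +%Y-%m-%dT%H:%M:%SZ) ==="
    capture_commands "$mode" | send_to_device
    if [ "$i" -lt "$count" ]; then
      sleep "$interval"
    fi
    i=$((i + 1))
  done
}
```

For example, `run_captures 5 300 light` would print five capture batches at 5-minute intervals, while a request such as `run_captures 5 10` is raised to the 30-second floor. To use this against a device, `send_to_device` would have to be replaced with whatever admin CLI/SSH transport is already in use.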
Section 3. When the issue is replicated, collect the following for IBM Support:
- Answers to the questions in Section 1.
- The DPMon data directory (check the output of 'top; diag; dpmon show'), by default temporary:///dpmon, containing dpmon, dpmon.1, dpmon.2 through dpmon.x (all iterations) and also dpmon.errlog. Set up via Section 2, Step 1.
- LLDiag output files (lldiag.txt, lldiag.txt.1, etc.) generated via Section 2, Step 2.
- CLI output collection generated via Section 2, Step 3.
- Error report generated via Section 2, Step 7.
- Device backup generated via Section 2, Step 8.
- Sample messages of client input and/or server response messages that can trigger the condition.
- All logs stored in the logtemp:// directory.
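Before sending the collection, the DPMon and LLDiag file names listed above can be checked against a local copy. The sketch below assumes the files have already been downloaded from the device into a single flat local directory. The function name `check_bundle` and that directory layout are hypothetical. A "missing" line only means the file was not found locally, not that the device failed to produce it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: inspects a LOCAL folder of files already copied off the device.

# Report which expected DPMon/LLDiag files are absent from directory $1.
# Returns 0 when the base files are all present, 1 otherwise.
check_bundle() {
  local dir="$1" missing=0 rotated=0 name path
  for name in dpmon dpmon.errlog lldiag.txt; do
    if [ ! -e "$dir/$name" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  # Count rotated DPMon iterations (dpmon.1, dpmon.2, and so on).
  for path in "$dir"/dpmon.[0-9]*; do
    if [ -e "$path" ]; then
      rotated=$((rotated + 1))
    fi
  done
  echo "rotated dpmon files: $rotated"
  return "$missing"
}
```

A call such as `check_bundle ./dp-collection` prints one line per missing base file followed by the number of rotated DPMon files found.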
Diagnostic commands are not publicly documented and are intended for IBM Support diagnosis only. These diagnostic commands can be intrusive, but they are necessary to diagnose the problem correctly. To prevent any known issues from causing additional complications or problems, it is highly recommended that you run the latest firmware. To confirm that your firmware will not cause a problem during debugging, always check the release notes for your firmware level, accessible from this document.
04 September 2019