Troubleshooting
Problem
Resolving The Problem
Overview of Business Automation Workflow diagnostic information
General diagnostic information
As needed diagnostic information
Detailed diagnostic collection steps
The following steps describe how to gather the different types of data for BAW. Run the diagnostic commands from an empty collection directory so that the files are easy to package. Run the commands from the project or namespace that contains BAW, or add the -n <namespace> flag to every oc command.
Note: oc commands are interchangeable with kubectl.
On platforms other than OCP, use kubectl and the -n parameter with Kubernetes commands.
Important: If your issue is with workflow authoring on version 22.x or later, see the BA Studio mustgather for diagnostic collection.
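The namespace and oc/kubectl notes above can be wrapped in a small helper so that every later command targets the right namespace. This is a minimal sketch, not part of the product: the function names kube_cli and k and the BAW_NAMESPACE variable are hypothetical names chosen for illustration.

```shell
# Hypothetical helper: use oc when it is installed, otherwise fall back to kubectl.
kube_cli() {
  if command -v oc >/dev/null 2>&1; then
    echo oc
  else
    echo kubectl
  fi
}

# Run any command against the namespace that contains BAW.
# BAW_NAMESPACE is an assumed variable name; the call fails if it is not set.
k() {
  "$(kube_cli)" -n "${BAW_NAMESPACE:?set BAW_NAMESPACE first}" "$@"
}
```

For example, after setting BAW_NAMESPACE to your Cloud Pak namespace, k get pods runs the get pods command in that namespace with whichever CLI is available.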
1: Provide a detailed description of the problem and your environment
- Provide a detailed description of your issue. Include screen captures and re-create steps if possible.
- Is it an intermittent or recreatable issue? Has this issue always been a problem, or did it start only after a change occurred?
- What is the business impact? Are any deadlines impacted by the issue?
- Provide a reference to the documentation being followed for the failing operation.
- Which platform are you using (Red Hat OpenShift, managed Red Hat OpenShift, other Kubernetes platform)?
- What is the database type and version?
2: Gather the configuration information
oc get icp4acluster -oyaml > Cp4aCR.yaml
oc get content -oyaml > ContentCR.yaml
oc adm must-gather --image=icr.io/cpopen/cpfs/must-gather:latest -- gather -m automationfoundation -n <cloud pak namespace>
The -n parameter is required and must be a single namespace. If you are using an air gap setup, ensure that you push the latest version of the must-gather image into your local repository. The command requires cluster admin access. Generally, this collection takes 5 - 10 minutes and produces a 25 - 50 MB gzip file.
Additionally, new mustgather command options were added in 23.0.1. See Gathering deployment information and logs from Cloud Pak for Business Automation for more details. If the issue is specific to workflow, the following command is an alternative that gathers more targeted configuration data and logs, where cp4baNS is the Cloud Pak namespace and 23.0.2 is the appropriate version tag.
oc adm must-gather --image=icr.io/cpopen/cp4ba/icp4a-must-gather:23.0.2 -- gather -m cp4ba -p workflow_runtime -n cp4baNS
If you cannot use the must-gather command, see item 2, option 2 of the main Cloud Pak MustGather to gather some basic information. The oc adm must-gather command is not valid for non-OCP environments. Remember to replace oc with kubectl for Kubernetes commands.
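The two custom resource exports at the start of this step can be run together and checked for obviously failed output. The following is a sketch only; the function name gather_config is hypothetical, and it assumes the current project or namespace is the one that contains BAW. On non-OCP platforms, replace oc with kubectl.

```shell
# Save the custom resources named in this step into the current directory.
gather_config() {
  oc get icp4acluster -oyaml > Cp4aCR.yaml
  oc get content -oyaml > ContentCR.yaml
  # An empty file suggests the command failed (for example, wrong namespace or no access).
  for f in Cp4aCR.yaml ContentCR.yaml; do
    [ -s "$f" ] || echo "warning: $f is empty" >&2
  done
}
```

Run gather_config from the empty collection directory so that both files land next to the other diagnostics.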
3: Log and Tracing data for WebSphere Liberty
- Edit the icp4acluster CR yaml that the operator uses to create the BAW pods.
The same steps can be used for either baw_configuration or workflow_authoring_configuration.
Modify the trace_specification property in the BAW logs section of the yaml and set the following trace string or a trace that fits your problem.
spec:
  ...
  baw_configuration:
    ...
    logs:
      trace_specification: '*=info:WLE.*=all:com.ibm.bpm.*=all:com.ibm.workflow.*=all'
Update the CR with the new configuration by using your preferred method. For example, the edit command can be used.
oc edit icp4acluster
Note: It can take a long time (the length of an operator reconcile) for the change to be recognized and the configuration to be updated. You can grep the log file for traceSpecification to see when the trace settings change.
- Optional: To apply the changes immediately, also modify the configmap whose name ends with baw-server-configmap-custom. This configmap contains a trace-specification.xml file. Edit the settings in this file to match what was used in the CR file.
- Use the following command to gather the BAW logs, where <pod-name> is one of the BAW pods.
oc cp <pod-name>:/logs/application/ ./BAW
- Disable the trace by setting trace_specification back to "*=info" and applying the changes again.
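When there are several BAW pods, the log copy in this step can be repeated for each one. The sketch below assumes that the BAW pod names contain baw-server (the pattern used with grep in step 7); the function names baw_pods and collect_baw_logs are hypothetical.

```shell
# List the names of pods that contain "baw-server".
baw_pods() {
  oc get pods -o name | grep baw-server | sed 's|^pod/||'
}

# Copy /logs/application/ from each BAW pod into its own ./BAW/<pod-name> directory.
collect_baw_logs() {
  for pod in $(baw_pods); do
    mkdir -p "./BAW/$pod"
    oc cp "$pod:/logs/application/" "./BAW/$pod"
  done
}
```

Keeping one directory per pod avoids overwriting log files that have the same name in different pods.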
4: Export of your application
5: Collect Operator logs
oc cp $operator_pod_name:/tmp/ansible-operator/runner/ ./operator_logs/
For recent versions of BAW, you generally need to provide these from both the cp4a and content operators.
For more information, see the installation troubleshooting page.
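The oc cp command in this step expects $operator_pod_name to be set already. One possible way to fill it in is to match pod names against a pattern, as sketched below. The function name collect_operator_logs is hypothetical, and the pattern is supplied by you; check the output of oc get pods for the actual operator pod names in your environment.

```shell
# Copy the operator runner logs from every pod whose name matches the given pattern.
collect_operator_logs() {
  pattern=$1
  for operator_pod_name in $(oc get pods -o name | grep "$pattern" | sed 's|^pod/||'); do
    mkdir -p "./operator_logs/$operator_pod_name"
    oc cp "$operator_pod_name:/tmp/ansible-operator/runner/" "./operator_logs/$operator_pod_name/"
  done
}
```

Call the function once per operator, for example once with a pattern that matches the cp4a operator pod and once with a pattern that matches the content operator pod.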
6: Collect Browser data for UI issues
- Network traffic capture export in .har or .saz format. See Collect a HTTP traffic capture with Fiddler or your web browser.
- Export the browser console log.
Open the browser developer tools (can be accessed by pressing F12), and copy the contents of the console tab.
7: Gathering javacores and heap dumps
- Determine the names of the BAW server pods by using the get pods command.
oc get pods | grep baw-server
- If dumps need to be generated, you can use the Liberty server dump commands to create them. Use the javadump command to generate javacores for each BAW server pod. Include the option --include=heap or --include=system to generate heap dumps or system core dumps. For example, the following command generates a javacore and heap dump for the pod.
oc exec <podname> -- bash -c "server javadump --include=heap"
Note: If a BAW Liberty server JVM crashes, dumps might also be generated.
- Use the following command to gather the BAW dumps, where <pod-name> is one of the BAW pods.
oc cp <pod-name>:/opt/ibm/wlp/output/defaultServer/dump ./BAW/dumps/
Note: The dumps can also be gathered directly from the associated BAW dump (baw-dumpstore-pvc) persistent volume (PV).
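The generate and copy commands above can be combined and repeated for every BAW server pod. This is a sketch under the same assumption as the grep in this step, namely that the pod names contain baw-server; the function names dump_baw_pod and dump_all_baw_pods are hypothetical.

```shell
# Generate a javacore and heap dump in one pod, then copy its dump directory locally.
dump_baw_pod() {
  pod=$1
  oc exec "$pod" -- bash -c "server javadump --include=heap"
  mkdir -p "./BAW/dumps/$pod"
  oc cp "$pod:/opt/ibm/wlp/output/defaultServer/dump" "./BAW/dumps/$pod/"
}

# Repeat for every pod whose name contains "baw-server".
dump_all_baw_pods() {
  for pod in $(oc get pods -o name | grep baw-server | sed 's|^pod/||'); do
    dump_baw_pod "$pod"
  done
}
```

Heap dumps can be large, so it may be preferable to call dump_baw_pod only for the pod that shows the problem.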
Enabling verbose:gc and other JVM dump options.
Update the CR to include the needed JVM options and point the logs at an appropriate location.
baw_configuration:
  jvm_customize_options: -verbose:gc -Xverbosegclog:/logs/application/verbosegc/verbosegc.%Y%m%d.%H%M%S.%pid.txt,20,10000 -Xdump:stack:events=allocation,filter=#25m
These options enable verbose:gc, send the log files to the logging PVC under the /verbosegc directory, and enable stack dumps for object allocations larger than 25 MB.
The operator rolls out the changes. To confirm the change or speed up the process, you can view or edit the configmap whose name ends with baw-server-configmap-liberty. The jvm.options key in this configmap contains the settings. If you change the configmap, the pods must be restarted to pick up the new settings. After the options are enabled, the logs can be gathered from the logging PV as described in item 3 of this mustgather.
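To confirm that the JVM options reached the configmap, the jvm.options key can be printed directly. The following sketch looks up the configmap by the name suffix given above; the function name show_jvm_options is hypothetical, and if several configmaps match, only the first one is shown.

```shell
# Print the jvm.options key of the configmap whose name ends with baw-server-configmap-liberty.
show_jvm_options() {
  cm=$(oc get configmap -o name | grep 'baw-server-configmap-liberty$' | head -n 1)
  [ -n "$cm" ] || { echo "configmap not found" >&2; return 1; }
  # The dot in the key name is escaped so that jsonpath treats jvm.options as one key.
  oc get "$cm" -o jsonpath='{.data.jvm\.options}'
}
```

If the output does not yet contain -verbose:gc, the operator has probably not finished rolling out the CR change.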
8: Gathering resource registry data
- Get a dump of the resource registry contents. Run this command from one of the resource registry pods.
etcdctl --cacert=/shared/resources/tls/ca-cert.pem --user=root:<root password> --insecure-skip-tls-verify get "" --from-key
- Enable this trace string in addition to any other needed tracing when recreating the issue:
com.ibm.bpm.dbaregistry.*=all:com.ibm.bpm.resourceregistry.*=all:com.ibm.bpm.serviceregistry.*=all:com.ibm.bpm.bas.registry.*=all
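Trace specifications are combined by joining them with a colon, as in the trace string in step 3. A small sketch of building the combined value for the CR follows; join_trace and the variable names are hypothetical, and the strings are quoted so that the shell does not expand the asterisks.

```shell
# Join trace specifications with ":" in the order given.
join_trace() {
  ( IFS=:; printf '%s\n' "$*" )
}

# Registry trace string from this step, kept in single quotes to prevent globbing.
registry_trace='com.ibm.bpm.dbaregistry.*=all:com.ibm.bpm.resourceregistry.*=all:com.ibm.bpm.serviceregistry.*=all:com.ibm.bpm.bas.registry.*=all'
```

For example, join_trace '*=info' "$registry_trace" prints a single string that starts with *=info: and can be pasted into the trace specification of the CR.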
What to do next
- Review the log files and traces at the time of the problem to try to determine the source of the problem.
- Check these locations for known issues:
  - Search in the Cloud Pak for Automation Support Page.
  - Review the Business Automation Workflow documentation.
- After you gather all the needed information and diagnostics, you can add them to your case. Alternatively, you can upload files to ECuRep. For more information, see Enhanced Customer Data Repository (ECuRep) - Overview.
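Because the commands were run from a dedicated collection directory, the whole directory can be packaged into one archive before it is added to the case. This is a minimal sketch; the function name package_collection and the default archive name baw-mustgather are hypothetical.

```shell
# Create a gzip-compressed tar archive of a collection directory.
# The archive is written next to the directory, not inside it, and its path is printed.
package_collection() {
  dir=${1:-.}
  name=${2:-baw-mustgather-$(date +%Y%m%d-%H%M%S)}
  parent=$(cd "$dir/.." && pwd)
  base=$(cd "$dir" && basename "$(pwd)")
  tar -czf "$parent/$name.tar.gz" -C "$parent" "$base"
  echo "$parent/$name.tar.gz"
}
```

Running package_collection with no arguments from inside the collection directory archives that directory under a timestamped name in its parent directory.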
Related Information
Document Location
Worldwide
Document Information
Modified date:
06 February 2024
UID
ibm16259483