Troubleshooting
Problem
Diagnosing The Problem
- CPU Architecture
- Database Type
- Version
- Fix Pack
- DB Install Time zone
- IBM MQ Version
- External Service (Cassandra, Kafka, Elastic Search etc.) Versions if applicable
- Cloud Vendor Details
- Version and Fix Pack
- Version and Fix Pack
- OMS Version Deployed
- Operator Version
Version and Pod Status
Objective: To check the operational status and versions of Catalog Sources and associated Pods.
Command:
oc get catalogsource,pods -n <namespace>
Note: Replace <namespace> with the specific namespace where the catalog source is deployed. On OpenShift, this is typically openshift-marketplace. On other Kubernetes platforms, the namespace might vary based on your setup.
Catalog Source Details
Objective: To collect detailed YAML configuration data of the Catalog Sources for in-depth troubleshooting.
Commands:
General Catalog Source Information:
oc get catalogsource -n <namespace> -o yaml > operator_troubleshooting_csinfo.yaml
Specific Catalog Source Configuration:
oc get catalogsource <catalogsource-name> -n <namespace> -o yaml > specific_catalogsource_details.yaml
This command retrieves the YAML configuration of the OMS catalog source (ibm-oms-catalog), which gives the granular view needed for deeper troubleshooting.
Notes on Namespace Specification: Ensure that <namespace> is the namespace where the catalog source is deployed. On OpenShift, the default is often openshift-marketplace. Adjust it to match your Kubernetes or OpenShift configuration.
Operator Group Details
Objective: Retrieve details about the Operator Group in the specified namespace to ensure correct configuration and alignment with the deployed operators.
Command:
oc get OperatorGroup -n <namespace>
Note: Replace <namespace> with the specific namespace where the subscription or operator is deployed. The Operator Group details must come from the same namespace as the operational environment to give accurate diagnostics.
Subscription Details
Retrieve Subscription Names
Objective: Gather the names of all subscriptions in the specified namespace to check their current status and configuration.
Setup: Set the namespace variable to ensure all commands are consistently applied to the correct namespace:
ns=<namespace> # Replace <namespace> with the actual namespace
Command:
subs=$(oc get sub -n $ns --no-headers -o custom-columns=:metadata.name)
This command saves the names of all subscriptions in the specified namespace into a variable, making it easier to use in subsequent commands.
Subscription Details
Objective: After obtaining the names of the subscriptions, fetch detailed information about each subscription.
Command:
oc get sub $subs -n $ns -o yaml > operator_troubleshooting_subinfo.yaml
Note: Ensure that the $ns variable is set to the namespace where the subscription or operator is deployed. Reusing the variable keeps every command pointed at the same namespace, which reduces the risk of collecting data from the wrong one.
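The two subscription commands above can be combined into one guarded helper. The following is a minimal sketch, not part of the documented procedure: the function name collect_subscriptions, the CLI variable, and the optional output-file argument are assumptions introduced for illustration. It stops early when the namespace has no subscriptions, so that an empty name list does not silently produce a file with no useful content.

```shell
# Hypothetical helper: dump every Subscription in a namespace to one YAML file.
# CLI defaults to oc; set CLI=kubectl on other Kubernetes platforms.
collect_subscriptions() {
  local ns="$1" out="${2:-operator_troubleshooting_subinfo.yaml}" cli="${CLI:-oc}" subs

  # Same listing command as above: one subscription name per line.
  subs=$("$cli" get sub -n "$ns" --no-headers -o custom-columns=:metadata.name)

  if [ -z "$subs" ]; then
    echo "No subscriptions found in namespace $ns." >&2
    return 1
  fi

  # $subs is intentionally unquoted so that each name becomes its own argument.
  "$cli" get sub $subs -n "$ns" -o yaml > "$out"
  echo "Saved subscription details to $out"
}
```

Example call, assuming the operator is deployed in a namespace named oms: collect_subscriptions oms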
Logs Collection
IBM-OMS-operator Logs
Objective: List and save logs for IBM-OMS-operator pods.
Setup: Set the namespace variable to ensure all commands are consistently applied to the correct namespace:
ns=<namespace> # Replace <namespace> with the actual namespace
Commands:
Get the IBM-OMS-operator Pods:
pods_OMS=$(oc get pods -n $ns -l app.kubernetes.io/name=ibm-oms-operator -o custom-columns=NAME:.metadata.name --no-headers)
Export logs if any pods are found:
if [ -z "$pods_OMS" ]; then
  echo "No IBM-OMS-pods found in namespace $ns."
else
  for pod in $pods_OMS; do
    oc logs $pod -n $ns > ${pod}_logs.txt
    echo "Logs for pod $pod have been saved to ${pod}_logs.txt"
  done
fi
Notes: Ensure that the $ns variable is set to the namespace where the IBM-OMS-operator is deployed.
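If the operator pod has restarted, the logs of the previous container instance are often the more useful ones. The sketch below extends the loop above under a few assumptions that are not in the original procedure: the function name collect_operator_logs, the CLI variable, the output-directory argument, and the use of the --timestamps and --previous options of the logs command.

```shell
# Hypothetical helper: save current and, where available, previous logs
# for every IBM-OMS-operator pod in a namespace.
collect_operator_logs() {
  local ns="$1" outdir="${2:-.}" cli="${CLI:-oc}" pods pod

  pods=$("$cli" get pods -n "$ns" -l app.kubernetes.io/name=ibm-oms-operator \
    -o custom-columns=NAME:.metadata.name --no-headers)
  if [ -z "$pods" ]; then
    echo "No IBM-OMS-operator pods found in namespace $ns." >&2
    return 1
  fi

  mkdir -p "$outdir"
  for pod in $pods; do
    "$cli" logs "$pod" -n "$ns" --timestamps > "$outdir/${pod}_logs.txt"

    # Previous-instance logs exist only after a container restart;
    # remove the empty file when the request fails.
    if ! "$cli" logs "$pod" -n "$ns" --previous \
        > "$outdir/${pod}_previous_logs.txt" 2>/dev/null; then
      rm -f "$outdir/${pod}_previous_logs.txt"
    fi
    echo "Logs for pod $pod have been saved under $outdir"
  done
}
```

Example call, assuming a namespace named oms and a local directory named operator_logs: collect_operator_logs oms operator_logs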
Deployment and Service Group Status
1. Integrated OM Environment Deployment Status
Objective: Gather the status of the Integrated OM Environment deployment.
Command:
oc describe omenvironments.apps.oms.ibm.com -n <namespace> > oms_operator_troubleshooting_deploymentstatusinfo.txt
Note: Replace <namespace> with the actual namespace where the OM environment is deployed (for example, oms). This command writes a detailed description of the OM environment deployment status to a file for troubleshooting.
Service Group Status
Objective: Retrieve the status of various service groups in YAML format.
Setup: Set the namespace variable to ensure all commands are consistently applied to the correct environment:
ns=<namespace> # Replace <namespace> with the actual namespace
Notes:
Namespace and Instance Reference: Ensure that the $ns variable is set to the namespace where the OM environment is deployed, and the $instance variable is set to the specific instance name.
2. Standalone OM Environment Deployment Status
Objective: Gather the status of the Standalone OM Environment deployment.
List All Instances:
OM Environment and OMServer:
kubectl get omenvironments.apps.oms.ibm.com -n $ns
kubectl get omservers.apps.oms.ibm.com -n $ns
Order Service:
kubectl get orderservices.apps.oms.ibm.com -n $ns
Order Hub:
kubectl get orderhubs.apps.oms.ibm.com -n $ns
Set the Instance Variable:
instance=<instance> # Replace <instance> with the actual instance name returned by the preceding commands (for example, dev, prod, or oms-ohub)
Retrieve Specific Instance Details:
OM Environment and OMServer Groups:
kubectl get OMEnvironment $instance -n $ns -o yaml > ${instance}_OMEnvironment.yaml
kubectl get OMServer $instance -n $ns -o yaml > ${instance}_OMServer.yaml
Order Service:
kubectl get OrderService $instance -n $ns -o yaml > ${instance}_OrderService.yaml
Order Hub:
kubectl get OrderHub $instance -n $ns -o yaml > ${instance}_OrderHub.yaml
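When a namespace holds several instances, the list-then-retrieve steps above can be looped instead of repeated by hand. This is only a sketch: the function name dump_oms_instances, the CLI variable, and the output-directory argument are assumptions for illustration, and the resource kinds are the four used in the commands above. Kinds that are not installed or have no instances are skipped.

```shell
# Hypothetical helper: save the YAML of every OMEnvironment, OMServer,
# OrderService, and OrderHub instance found in a namespace.
dump_oms_instances() {
  local ns="$1" outdir="${2:-.}" cli="${CLI:-kubectl}" kind names name

  mkdir -p "$outdir"
  for kind in OMEnvironment OMServer OrderService OrderHub; do
    # One instance name per line; errors (for example, a kind that is not
    # installed) are suppressed so the loop continues with the next kind.
    names=$("$cli" get "$kind" -n "$ns" --no-headers \
      -o custom-columns=:metadata.name 2>/dev/null)

    for name in $names; do
      "$cli" get "$kind" "$name" -n "$ns" -o yaml > "$outdir/${name}_${kind}.yaml"
      echo "Saved $outdir/${name}_${kind}.yaml"
    done
  done
}
```

The file names follow the same <instance>_<Kind>.yaml pattern as the individual commands, so the output can be attached to a case in the same way.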
Stateful Sets and CRDs
Check the Status of Stateful Sets and Custom Resource Definitions (CRDs)
Objective: Check the status of Stateful Sets and Custom Resource Definitions (CRDs) associated with OMS.
Setup: Set the namespace variable to ensure all commands are consistently applied to the correct environment:
ns=<namespace> # Replace <namespace> with the actual namespace
Commands:
kubectl get statefulset,crd -n $ns | grep oms
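Custom Resource Definitions are cluster-scoped, so the -n flag in the command above affects only the StatefulSet part of the listing. The following sketch separates the two queries to make that explicit and prints a placeholder line when nothing matches. The function name check_oms_workloads and the CLI variable are assumptions for illustration, not part of the original procedure.

```shell
# Hypothetical helper: list OMS-related StatefulSets in one namespace and
# OMS-related CRDs across the cluster, as two separate queries.
check_oms_workloads() {
  local ns="$1" cli="${CLI:-kubectl}"

  echo "StatefulSets in namespace $ns:"
  "$cli" get statefulset -n "$ns" --no-headers 2>/dev/null | grep oms \
    || echo "  (none matching 'oms')"

  echo "CRDs (cluster-scoped):"
  "$cli" get crd --no-headers 2>/dev/null | grep oms \
    || echo "  (none matching 'oms')"
}
```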
If Catalog Source Pod Is Not Being Created During OMS Deployment Due to PodSecurityStandards (PSS) Enforcement
Adjusting Pod Security Standards
Objective: Change the namespace to a less restrictive PSS profile to allow the creation of Catalog Source pods.
Setup: Set the namespace variable to ensure all commands are consistently applied to the correct environment:
ns=<namespace> # Replace <namespace> with the actual namespace where the catalog source is deployed
Commands:
Adjust PSS Profile:
kubectl label --overwrite ns $ns pod-security.kubernetes.io/enforce=baseline
Confirm Execution:
Objective: Ensure that the label was applied to the namespace.
Command:
kubectl get ns $ns --show-labels | grep pod-security.kubernetes.io/enforce=baseline
This command verifies that the namespace label has been successfully updated.
Notes:
Namespace Reference: Ensure that the $ns variable is set to the namespace where the Catalog Source is deployed. Setting the namespace to a less restrictive PSS profile removes the restrictions that prevent the Catalog Source pods from being created.
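Before overwriting the label, it can help to record the current enforcement level, both for the case notes and to avoid tightening a namespace that is already set to privileged. The sketch below does this under stated assumptions: the function name relax_pss_enforce and the CLI variable are hypothetical, and the decision to leave baseline and privileged namespaces unchanged is a design choice of this example rather than part of the original steps.

```shell
# Hypothetical helper: report the current PSS enforce level and relax it to
# baseline only when the namespace is not already at baseline or privileged.
relax_pss_enforce() {
  local ns="$1" cli="${CLI:-kubectl}" current

  # Dots inside the label key are escaped so JSONPath reads it as one field.
  current=$("$cli" get ns "$ns" \
    -o jsonpath='{.metadata.labels.pod-security\.kubernetes\.io/enforce}')
  echo "Current enforce level for $ns: ${current:-not set}"

  case "$current" in
    baseline|privileged)
      echo "No change needed."
      ;;
    *)
      "$cli" label --overwrite ns "$ns" pod-security.kubernetes.io/enforce=baseline
      ;;
  esac
}
```

Keeping the printed "Current enforce level" line makes it straightforward to restore the earlier value once troubleshooting is complete.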
Persistent Volume (PV) Access Permissions
Ensure Correct Access Permissions for Persistent Volumes
Objective: Verify and adjust the access permissions for Persistent Volumes used by Operators.
Setup: Set the namespace variable:
ns=<namespace> # Replace <namespace> with the actual namespace
Requirements:
- Access Mode: ReadWriteMany (RWX)
- Storage: Minimum of 10 GB
- Accessibility: Accessible by all containers across the cluster
- Write Access: Owner group must have write access
- Security Context: Set the fsGroup parameter in the OMEnvironment custom resource
Commands:
List Persistent Volumes:
kubectl get pv > pv_list.txt
Action: Save the list to a file. Persistent Volumes are cluster-scoped, so no namespace flag is needed.
Describe Specific Persistent Volume:
kubectl describe pv <pv-name> > <pv-name>_description.txt
Note: Replace <pv-name> with the PV name. Action: Save the description to a file.
List Persistent Volume Claims (PVCs):
kubectl get pvc -n $ns > pvc_list.txt
Action: Save the list to a file.
Describe Specific Persistent Volume Claim:
kubectl describe pvc <pvc-name> -n $ns > <pvc-name>_description.txt
Note: Replace <pvc-name> with the PVC name. Action: Save the description to a file.
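The describe steps above can be run for every claim in the namespace, together with the volume each claim is bound to. This is a minimal sketch under assumptions: the function name describe_bound_volumes, the CLI variable, the output-directory argument, and the pvc_ and pv_ file-name prefixes (added so that a PVC and a PV with the same name do not overwrite each other) are all hypothetical. The bound volume is read from the claim's spec.volumeName field.

```shell
# Hypothetical helper: describe every PVC in a namespace and the PV that
# each claim is bound to, one file per object.
describe_bound_volumes() {
  local ns="$1" outdir="${2:-.}" cli="${CLI:-kubectl}" pvc pv

  mkdir -p "$outdir"
  for pvc in $("$cli" get pvc -n "$ns" --no-headers -o custom-columns=:metadata.name); do
    "$cli" describe pvc "$pvc" -n "$ns" > "$outdir/pvc_${pvc}_description.txt"

    # spec.volumeName is empty until the claim is bound.
    pv=$("$cli" get pvc "$pvc" -n "$ns" -o jsonpath='{.spec.volumeName}')
    if [ -n "$pv" ]; then
      "$cli" describe pv "$pv" > "$outdir/pv_${pv}_description.txt"
    else
      echo "PVC $pvc is not bound to a PV yet." >&2
    fi
  done
}
```

The access mode and capacity listed in the requirements appear in each description file, so the output can be checked against ReadWriteMany and the 10 GB minimum.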
Purpose
This script is designed to collect comprehensive diagnostic data from your Kubernetes namespace. It gathers information about Pods, Network Policies, Custom Resources, Persistent Storage, Nodes, and various other components. The script compiles the data into an HTML report and compresses everything into a single archive file for easy sharing.
Usage Instructions
1. Download the Script:
Save the provided script as diagnostics_script.sh on your local machine.
2. Make the Script Executable:
Open a terminal and navigate to the directory where the script is saved. Run the following command to make the script executable:
chmod +x diagnostics_script.sh
3. Run the Script:
Execute the script by specifying the required arguments.
If you are using OpenShift, use oc:
To gather diagnostics for a specific pod in a namespace:
./diagnostics_script.sh -c oc -p <PodName> -n <NameSpace>
Replace <PodName> with the name of your pod and <NameSpace> with your namespace.
To gather diagnostics for all pods in a namespace:
./diagnostics_script.sh -c oc -a <NameSpace>
Replace <NameSpace> with your namespace.
If you are using vanilla Kubernetes, use kubectl:
To gather diagnostics for a specific pod in a namespace:
./diagnostics_script.sh -c kubectl -p <PodName> -n <NameSpace>
Replace <PodName> with the name of your pod and <NameSpace> with your namespace.
To gather diagnostics for all pods in a namespace:
./diagnostics_script.sh -c kubectl -a <NameSpace>
Replace <NameSpace> with your namespace.
4. Collect the Output:
After the script completes, it generates a compressed file named Diagnostics.tgz that contains all the collected data and the HTML report.
5. Share the Output:
Attach the Diagnostics.tgz file and share it with us for further analysis.
What the Script Does
Collects Detailed Diagnostics: Gathers detailed information about various Kubernetes components in the specified namespace.
Generates an HTML Report: Creates an intuitive and user-friendly HTML report summarizing the collected data.
Compresses Data: Compiles all the collected data and the HTML report into a single compressed file (Diagnostics.tgz).
Note: This script does not capture any secret or configmap data, ensuring that sensitive information is not included in the diagnostic package.
Example Commands
To gather diagnostics for the pod oms-server1-657775b7f4-5ddfp in the oms namespace using oc:
./diagnostics_script.sh -c oc -p oms-server1-657775b7f4-5ddfp -n oms
To gather diagnostics for the pod oms-server1-657775b7f4-5ddfp in the oms namespace using kubectl:
./diagnostics_script.sh -c kubectl -p oms-server1-657775b7f4-5ddfp -n oms
To gather diagnostics for all pods in the oms namespace using oc:
./diagnostics_script.sh -c oc -a oms
To gather diagnostics for all pods in the oms namespace using kubectl:
./diagnostics_script.sh -c kubectl -a oms
After the script runs, you will find the Diagnostics.tgz file in the same directory. Attach this file and share it with us for troubleshooting and support.
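Before attaching the archive, it can be worth confirming that it is readable and reviewing what it contains. The sketch below uses only standard tar and du options. The function name inspect_archive is an assumption for illustration, and the internal layout of Diagnostics.tgz is not documented here, so the listing is simply whatever the script packed.

```shell
# Hypothetical helper: print the size and file listing of a diagnostics
# archive without extracting it.
inspect_archive() {
  local archive="${1:-Diagnostics.tgz}"

  if [ ! -f "$archive" ]; then
    echo "Archive $archive not found." >&2
    return 1
  fi

  echo "Size: $(du -h "$archive" | cut -f1)"
  echo "Contents:"
  # -t lists entries, -z reads gzip compression, -f names the archive file.
  tar -tzf "$archive"
}
```

Running inspect_archive with no argument checks Diagnostics.tgz in the current directory.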
| How to submit diagnostic data to IBM Support |
|---|
| After you have collected the preceding information and the case is opened, see Submit diagnostic data to IBM (ECuRep) and Enhanced Customer Data Repository (ECuRep) secure upload for details. |
Document Location
Worldwide
Document Information
Modified date:
09 September 2024
UID
ibm16482237