IBM Support

Collecting IBM Cloud Pak foundational services information for problem determination

How To


Summary

The Cloud Pak MustGather tool collects information about your cluster that is crucial for troubleshooting problems for support. The document also includes instructions to collect must-gather information for foundational services (earlier known as common services) and Cloud Pak for a disconnected (AirGap) environment with no access to an external registry (quay.io or icr.io).

Objective

Gathering the required documentation before you contact support expedites the troubleshooting process and saves time.

Environment

Describe the Environment:
Provide the following information if applicable to your environment. 
Cloud Pak Name and Version:          # (CP4Auto: 19.0.x, CP4App: 4.2.x, CP4I: 21.0.2, N/A) 
OpenShift Cluster Version:           # (OCP 4.6,4.7) 
Foundational Services Platform:            # (Reference [1])  
ClusterID:                           # (Reference[2])  
Customer:                            # (MyCompany) 
Architecture:                        # (x86_64, s390x, ppc64le) 
Platform:                            # (IBM Cloud, AWS, Azure, GCP, BareMetal, VMware) 
Business Impact:                     # (Reference [3] )
Problem Summary:   
Problem detail:
 # (Clearly articulate the current issue's symptoms and the request for opening the new case. )
  • Is this a new Installation?
  • Any recent changes to the environment? 
  • When does the behavior occur? Frequency? Repeatedly? At certain times?
  • MustGather Tool Output: When appropriate, collect the logs from the environment using one of the scripts provided in the next section 

Steps

Collecting MustGather 

It is often helpful to provide debug information about your cluster when you open a case. Depending on your environment and the cluster status, choose one of the following methods to collect the Cloud Pak must-gather.

Cloud Pak Must-Gather from an OCP 4.x cluster with access to the internet:

If you have internet access to icr.io, run the following command to collect the must-gather for the Cloud Pak cluster. The default "oc adm must-gather" collects only the openshift-* namespaces and does not contain the logs from the Cloud Pak namespaces.

The image icr.io/cpopen/cpfs/must-gather:latest is an enhanced quay.io/openshift/origin-must-gather: image. The enhanced must-gather collects the overview and failure information of the OCP environment and Cloud Pak Foundational Services related components. 

cat > cp-must-gather-CS.sh << 'EOT'
#!/bin/bash
export MY_CLOUDPAK_NAMESPACES=_cloudpak-operators,_cloudpak-instance,other-namespace-with-issues
export MUST_GATHER_IMAGE=icr.io/cpopen/cpfs/must-gather:latest
export CLOUDPAK_NAMESPACES=common-service,ibm-common-services,openshift-operators,openshift-operator-lifecycle-manager,openshift-marketplace,ibm-licensing,ibm-cert-manager,cs-control,$MY_CLOUDPAK_NAMESPACES
export MUST_GATHER_MODULES=overview,failure,cloudpak
oc adm must-gather --image=$MUST_GATHER_IMAGE -- gather -m $MUST_GATHER_MODULES -n $CLOUDPAK_NAMESPACES
EOT
MY_CLOUDPAK_NAMESPACES: Replace the value of the MY_CLOUDPAK_NAMESPACES variable with the namespaces that have issues, separated by commas with no spaces between the namespaces.
Example: MY_CLOUDPAK_NAMESPACES=cpd-operator,cpd-instance,cp4ba,rook-ceph,API
MUST_GATHER_MODULES: The modules can be set to collect Cloud Pak and foundational services related information. Add the "ocp,system,route" modules if the problem determination requires the openshift-* namespaces. Note that adding "ocp" collects all openshift-* namespaces and increases the time needed to complete the must-gather. Avoid using this module if the problem is not related to OpenShift operators.
-- Change the file permission of the shell script created above and run the script to collect the support data.
   
chmod +x cp-must-gather-CS.sh
./cp-must-gather-CS.sh
--  cloudpak-must-gather-xxx.tar.gz is generated under the must-gather.local.xxx/quay-io-opencloudio-must-gather-xxxxxx directory. Do not compress the directory again.
-- Upload the generated cloudpak-must-gather-xxx.tar.gz file.
Also upload the output of the following commands:
  
oc get authentication.operator.ibm.com -A -o yaml > example-authentication.yaml
oc get commonservice -A -o yaml > commonserviceCR-all.yaml
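
The namespace list is passed to the must-gather as a single comma-separated value, so a stray space or an empty entry may change what is collected. The following is a minimal sketch of a format check that you could run before starting the collection. It is not part of the IBM tooling, and the function name validate_ns_list is hypothetical. It checks only the list format described above (commas, no spaces, no empty entries), not whether the namespaces exist.

```shell
#!/bin/bash
# Hypothetical helper (not part of the must-gather tooling).
# Returns 0 when $1 is a non-empty, comma-separated list with no
# whitespace and no empty entries; otherwise prints a reason and returns 1.
validate_ns_list() {
  case "$1" in
    '')            echo "namespace list is empty" >&2; return 1 ;;
    *[[:space:]]*) echo "namespace list contains whitespace" >&2; return 1 ;;
    ,*|*,|*,,*)    echo "namespace list contains an empty entry" >&2; return 1 ;;
  esac
  return 0
}
```

For example, validate_ns_list "$MY_CLOUDPAK_NAMESPACES" || exit 1 could be placed at the top of the collection script.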

Cloud Pak Must-Gather from an OCP 4.x cluster in a disconnected environment (AirGap):

For an offline installation, the mirrored Cloud Pak images in your local repository include the "opencloudio/must-gather" image. Replace [LOCAL_REGISTRY:5000] with your local mirror.
cat > cp-must-gather-CS-Airgap.sh << 'EOT'
#!/bin/bash
export MY_CLOUDPAK_NAMESPACES=_cloudpak-operators,_cloudpak-instance,other-namespace-with-issues
export MUST_GATHER_IMAGE=[LOCAL_REGISTRY:5000]/cpopen/cpfs/must-gather:latest
export CLOUDPAK_NAMESPACES=common-service,ibm-common-services,openshift-operators,openshift-operator-lifecycle-manager,openshift-marketplace,ibm-licensing,ibm-cert-manager,$MY_CLOUDPAK_NAMESPACES
export MUST_GATHER_MODULES=overview,system,failure,cloudpak,route
oc adm must-gather --image=$MUST_GATHER_IMAGE -- gather -m $MUST_GATHER_MODULES -n $CLOUDPAK_NAMESPACES
EOT

MY_CLOUDPAK_NAMESPACES: Replace the value of the MY_CLOUDPAK_NAMESPACES variable with the namespaces that have issues, separated by commas with no spaces between the namespaces.
Example: MY_CLOUDPAK_NAMESPACES=cpd-operator,cpd-instance,cp4ba,rook-ceph,API
MUST_GATHER_IMAGE: Replace [LOCAL_REGISTRY:5000] with the local repository where the Cloud Pak images are mirrored.
You can check the latest version available by running the following command:
    
skopeo list-tags docker://[LOCAL_REGISTRY:5000]/cpopen/cpfs/must-gather
{
    "Repository": "[LOCAL_REGISTRY:5000]/cpopen/cpfs/must-gather",
    "Tags": [
        "4.5.16"
         .....
        "4.6.7",
        "4.6.8",
        "4.6.9",
        "latest"
    ]
}
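
If you prefer to reference a numbered tag rather than latest, the tag list above can be reduced to its highest version. The following is a minimal sketch under stated assumptions: the function name latest_version_tag is hypothetical, the input is the JSON printed by skopeo list-tags, the version tags are plain dotted numbers, and GNU sort (for the -V option) is available.

```shell
#!/bin/bash
# Hypothetical helper: read "skopeo list-tags" JSON on stdin and print
# the highest dotted-number tag. Non-numeric tags such as "latest" are ignored.
latest_version_tag() {
  grep -oE '"[0-9]+(\.[0-9]+)+"' |  # keep only quoted dotted-number strings
    tr -d '"' |                      # strip the surrounding quotes
    sort -V |                        # version-aware sort (GNU sort)
    tail -n 1                        # the last entry is the highest version
}
```

A possible usage is: skopeo list-tags docker://[LOCAL_REGISTRY:5000]/cpopen/cpfs/must-gather | latest_version_tag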

MUST_GATHER_MODULES: If resolving the problem requires the OpenShift namespace configuration and logs, add the "ocp,etcd,route" modules to collect all openshift-* namespace logs and configuration. Review the referenced documentation for the available modules and what each module collects.
-- Change the file permission of the shell script created above and run the script to collect the support data.
   
	chmod +x cp-must-gather-CS-Airgap.sh
	./cp-must-gather-CS-Airgap.sh
--  cloudpak-must-gather-xxx.tar.gz is generated under the must-gather.local.xxx/quay-io-opencloudio-must-gather-xxxx directory. You do not need to compress the directory again.
  • Upload the cloudpak-must-gather-xxx.tar.gz file.
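
Because the generated archive sits inside a long, generated directory name, it can be convenient to locate it with find rather than typing the path. The following is a minimal sketch; the function name find_must_gather_archive is hypothetical, and it assumes the must-gather.local.xxx directory was created under the directory that you pass in (the current directory by default).

```shell
#!/bin/bash
# Hypothetical helper: list every cloudpak-must-gather-*.tar.gz file found
# below a must-gather.local.* directory under $1 (default: current directory).
find_must_gather_archive() {
  find "${1:-.}" -type f \
    -path '*/must-gather.local.*/*' \
    -name 'cloudpak-must-gather-*.tar.gz' | sort
}
```

Running find_must_gather_archive from the directory where you ran the collection script prints the path of each archive to upload.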

Gather debugging information by using "inspect" (AirGap):

If you cannot gather debugging information by using "oc adm must-gather", use the following script, which collects the information with the "oc adm inspect" command for specific resources. This method does not require internet access to download the must-gather image.

1- Create the gathering script. Copy and paste the following commands on a bastion host (the host where you run oc commands).
 

cat > cs-mg-inspect.sh << 'EOF'
#!/bin/bash
# NOTE: Update MY_CLOUDPAK_NAMESPACES (for example, ns/cloudpak1) with the namespace where the problem occurs and any relevant namespaces, separated by spaces.
# NOTE: The collection does not include the actual certificates. Add certificaterequests to the CRS variable if certificates need to be reviewed.

export MGDIR=cs-mg-inspect-$(date '+%y%b%dT%H-%M-%S')
mkdir -p $MGDIR

export MY_CLOUDPAK_NAMESPACES="ns/cloudpak1 ns/cloudpak2"   # << update and customize for your environment <<<
oc adm inspect $MY_CLOUDPAK_NAMESPACES --dest-dir=$MGDIR

CRS="OperandRequests OperandConfigs OperandRegistries Issuers Certificate Certmanagers \
 CommonServices NamespaceScopes OperandBindInfos MongoDBs  Routes Ingresses managementingresses NetworkPolicies Clients ZenServices \
 businessteamsservices Clusters analyticsproxies analyticsproxieswithsubmodules PostgresClusters pgupgrades \
 flinkclusters AutomationBases Cartridges CartridgeRequirements EventProcessors PlatformNavigators AssetRepositories \
 OperationsDashboards Dashboards EventStreams ElasticSearches Kafkas kafkaclaims kafkausers KafkaComposite certificaterequests \
 DesignerAuthorings DataPowerservices APIConnectClusters IntegrationServers QueueManagers ICP4AClusters AutomationUIConfigs CP4IServicesBindings"
RESOURCES="olm"

for i in $(oc api-resources --verbs=list | awk '{print $1}' | sort | uniq); do
   echo $CRS | grep -w -i -q ${i}
   if [ $? -eq 0 ]; then
     RESOURCES+=",${i}"
   fi
done
echo "CRs:" $RESOURCES | tee $MGDIR/CRs.txt
oc adm inspect $RESOURCES -A --dest-dir=$MGDIR


OLM_NS="ns/openshift-marketplace ns/openshift-operator-lifecycle-manager ns/openshift-operators "
oc adm inspect $OLM_NS --dest-dir=$MGDIR

# Test the exit status of "oc get project" directly; wrapping it in $(...) -eq 0 is always true.
if oc get project ibm-common-services > /dev/null 2>&1; then
   oc adm inspect  ns/ibm-common-services --dest-dir=$MGDIR
fi
if oc get project cs-control > /dev/null 2>&1; then
   oc adm inspect  ns/cs-control --dest-dir=$MGDIR
fi


MGDIROV=$MGDIR/overview
mkdir -p $MGDIROV
oc get clusterversion -oyaml > $MGDIROV/ocp-cluster-version.txt
oc get co > $MGDIROV/clusterOperators.txt
oc adm top nodes   >  $MGDIROV/node-list.txt
oc get node -owide >> $MGDIROV/node-list.txt
oc describe nodes  >> $MGDIROV/node-list.txt
oc get pods -A -owide  > $MGDIR/pods-list.txt
oc get crd  > $MGDIROV/crd-list.txt
oc get Certificaterequests -A > $MGDIROV/certreq-list.txt
oc get certs -A -owide > $MGDIROV/certs-list.txt
oc -n kube-public get cm ibm-common-services-status -oyaml >  $MGDIROV/cm_kube-public-ibm-common-services-status.txt
oc -n kube-public get cm ibmcloud-cluster-info -oyaml >  $MGDIROV/cm_kube-public-ibmcloud-cluster-info.txt
oc -n kube-public get cm  common-service-maps -oyaml >  $MGDIROV/cm_kube-public-common-service-maps.txt
oc get ImageContentSourcePolicy -n openshift-marketplace -oyaml > $MGDIROV/ImageContentSourcePolicy.txt
oc get clients -A > $MGDIR/client-list.txt
oc get oauthclient -A > $MGDIR/oauthclient-list.txt
oc get authentication.operator.ibm.com -A -o yaml > $MGDIR/example-authentication.yaml

tar aczf $MGDIR.tar.gz ./$MGDIR
echo "Done. upload $MGDIR.tar.gz file to the case."
#----------------------------------------------------
EOF

2- Change the file permission of the shell script created above, run the script to collect the support data, and upload the resulting cs-mg-inspect-DATE.tar.gz file to the case:
    
	chmod +x cs-mg-inspect.sh
	./cs-mg-inspect.sh
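
The script builds its RESOURCES list by keeping only those custom resource kinds from CRS that the cluster actually serves, using a case-insensitive whole-word match. The following is a minimal sketch of that filtering step in isolation, so that you can see which names would be passed to "oc adm inspect". The function name filter_resources is hypothetical, and the list of available resources is read from standard input rather than from the cluster.

```shell
#!/bin/bash
# Hypothetical helper illustrating the filter used in cs-mg-inspect.sh.
# $1    = space-separated list of wanted resource kinds (like the CRS variable)
# stdin = one available resource name per line (like the first column
#         of "oc api-resources --verbs=list")
# Prints the matching names joined by commas.
filter_resources() {
  local wanted="$1" result="" name
  while IFS= read -r name; do
    [ -n "$name" ] || continue
    # -q quiet, -w whole word, -i ignore case, -F fixed string
    if printf '%s\n' "$wanted" | grep -qwiF -- "$name"; then
      result="${result:+$result,}$name"
    fi
  done
  printf '%s\n' "$result"
}
```

A possible usage is: oc api-resources --verbs=list | awk '{print $1}' | sort -u | filter_resources "$CRS"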

Cloud Pak Must-Gather using Scripts for Red Hat OpenShift 4.x, EKS (K8S):

The must-gather code deploys a pod on the cluster to collect the cluster information. If you cannot run the must-gather tool, use the following scripts to gather cluster information. 
Note: For EKS and K8S environments, create an "alias oc=kubectl" before you run the following script.


# CSNAMESPACE: set this to the namespace where the Cloud Pak problem is occurring
export CSNAMESPACE=<CS-NAMESPACE>   
export MGDIR=cp-MG-Script-$(date '+%y%b%dT%H-%M-%S')
export LOGLIMIT="--tail=1000"
mkdir -p $MGDIR
oc get node,hostsubnet -o wide > $MGDIR/node-list.txt
oc adm top nodes > $MGDIR/node-detail-list.txt
oc get all,events  -o wide -n default > $MGDIR/all-event.txt

oc describe nodes > $MGDIR/node-describe.txt
oc get namespaces > $MGDIR/namespaces.txt

oc get clusteroperators > $MGDIR/cluster-operators.txt
oc adm top pod --all-namespaces  > $MGDIR/TopNamespace.txt
oc get pods --all-namespaces -owide --show-labels > $MGDIR/pods.txt 
oc get po --all-namespaces -o wide| grep -Ev '([[:digit:]])/\1.*R' | grep -v "Completed" > $MGDIR/podsNotRunning-list.txt 

#ocp upgrade related
oc get clusterversion -o jsonpath='{.items[].spec.clusterID}{"\n"}' > $MGDIR/clusterID.txt
oc get clusterversion -o yaml > $MGDIR/ocpclusterversion.txt
oc logs $(oc get pod -n openshift-cluster-version -l k8s-app=cluster-version-operator -oname) -n openshift-cluster-version > $MGDIR/clusterVersionOperator-Upgrade.log
oc get mcp > $MGDIR/machineConfigPool.txt
oc describe mcp >> $MGDIR/machineConfigPool.txt
oc get co/machine-config > $MGDIR/co-machineConfig.txt
oc describe co/machine-config >> $MGDIR/co-machineConfig.txt
oc get catalogsource -A > $MGDIR/catalogsource.txt
oc get catalogsource -A -o yaml > $MGDIR/catalogsourcedetail.yaml


oc get cm  ibmcloud-cluster-info -o yaml > $MGDIR/ibmcloud-cluster-info-ConfigMap.txt
oc get installplan -A > $MGDIR/installplan.txt  


oc get certificates.certmanager.k8s.io --all-namespaces -owide --show-labels > $MGDIR/certificates.txt
oc get challenges.certmanager.k8s.io --all-namespaces -owide --show-labels > $MGDIR/challengesCert.txt
oc get clusterissuers.certmanager.k8s.io --all-namespaces -owide --show-labels > $MGDIR/clusterissuers.txt

oc get configmap --all-namespaces -owide --show-labels > $MGDIR/configmap.txt
oc get crd --all-namespaces -owide --show-labels > $MGDIR/crd.txt
oc get cronjob --all-namespaces -owide --show-labels > $MGDIR/cronjob.txt
oc get csv --all-namespaces -owide --show-labels > $MGDIR/csv.txt
oc get ds --all-namespaces -owide --show-labels > $MGDIR/ds.txt
oc get endpoints --all-namespaces -owide --show-labels > $MGDIR/endpoints.txt
oc get event --all-namespaces -owide --show-labels > $MGDIR/event.txt
oc get hpa --all-namespaces -owide --show-labels > $MGDIR/hpa.txt
oc get ingress --all-namespaces -owide --show-labels > $MGDIR/ingress.txt
oc get issuers.certmanager.k8s.io --all-namespaces -owide --show-labels > $MGDIR/issuers.txt
oc get job --all-namespaces -owide --show-labels > $MGDIR/job.txt
oc get namespace --all-namespaces -owide --show-labels > $MGDIR/namespace.txt
oc get networkpolicy --all-namespaces -owide --show-labels > $MGDIR/networkpolicy.txt
oc get authentications.operator.ibm.com --all-namespaces > $MGDIR/authentications.txt
oc get orders.certmanager.k8s.io --all-namespaces -owide --show-labels > $MGDIR/orders.certmanager.txt
oc get pvc --all-namespaces -owide --show-labels > $MGDIR/pvc.txt
oc get pv --all-namespaces -owide --show-labels > $MGDIR/pv.txt

oc get resourcequota --all-namespaces -owide --show-labels > $MGDIR/resourcequota.txt
oc get route --all-namespaces -owide --show-labels > $MGDIR/route.txt
oc get secret --all-namespaces -owide --show-labels > $MGDIR/secret.txt
oc get svc --all-namespaces -owide --show-labels > $MGDIR/svc.txt
oc get sts --all-namespaces -owide --show-labels > $MGDIR/sts.txt
oc status --all-namespaces > $MGDIR/status.txt
oc get storageclass --all-namespaces -owide --show-labels > $MGDIR/storageclass.txt

oc -n kube-public get cm ibm-common-services-status -oyaml >  $MGDIR/cm_kube-public-ibm-common-services-status.txt
oc -n kube-public get cm ibmcloud-cluster-info -oyaml >  $MGDIR/cm_kube-public-ibmcloud-cluster-info.txt
oc -n kube-public get cm  common-service-maps -oyaml >  $MGDIR/cm_kube-public-common-service-maps.txt
oc get ImageContentSourcePolicy -A  > $MGDIR/ImageContentSourcePolicy.txt
oc get ImageContentSourcePolicy -A -oyaml > $MGDIR/ImageContentSourcePolicy.yaml
oc get clients -A > $MGDIR/client-list.txt
oc get oauthclient -A > $MGDIR/oauthclient-list.txt




# If you have a large number of projects and namespaces, you can reduce the data collected by limiting the namespaces in the for loop.
# The awk variable csnamespace is matched with "$1 ~ csnamespace"; inside /.../ it would be treated as literal text.

for NS in `oc get ns | awk -v csnamespace="$CSNAMESPACE" 'NR>1 && (/openshift-marketplace/ || /openshift-operator-lifecycle-manager/ || /common/ || /kube/ || /cert/ || (csnamespace != "" && $1 ~ csnamespace)){ORS=" "; print $1}'` default; do
 export NS=$NS; mkdir $MGDIR/$NS; echo gathering info from namespace $NS
 oc get all,secrets,cm,events -n $NS -o wide &> $MGDIR/$NS/all-list.txt
 oc get pods -n $NS | awk 'NR>1{print "oc -n $NS describe pod "$1" > $MGDIR/$NS/"$1"-describe.txt && echo described "$1}' | bash
 oc get pods -n $NS -o go-template='{{range $i := .items}}{{range $c := $i.spec.containers}}{{println $i.metadata.name $c.name}}{{end}}{{end}}' > $MGDIR/$NS/container-list.txt
 awk '{print "oc -n $NS logs "$1" -c "$2" $LOGLIMIT -p > $MGDIR/$NS/"$1"_"$2"_previous.log && echo gathered previous logs of "$1"_"$2}' $MGDIR/$NS/container-list.txt | bash
 awk '{print "oc -n $NS logs "$1" -c "$2" $LOGLIMIT > $MGDIR/$NS/"$1"_"$2".log && echo gathered logs of "$1"_"$2}' $MGDIR/$NS/container-list.txt | bash
done

tar czf CaseTS123456-$MGDIR.tgz $MGDIR/ # replace case number TS123456 
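
The for loop in the script selects namespaces with awk, combining fixed patterns with the namespace held in CSNAMESPACE. The following is a minimal sketch of that selection in isolation, which you could use to preview the namespaces before collecting any logs. The function name select_namespaces is hypothetical, and the input is assumed to have the same layout as "oc get ns" output (a header line, then the namespace name in the first column).

```shell
#!/bin/bash
# Hypothetical helper, not part of the IBM script.
# stdin = "oc get ns" style output (header first, namespace name in column 1)
# $1    = optional extra pattern for the Cloud Pak namespace (awk regular expression)
# A shell value is passed into awk with -v and matched with "~"; writing the
# variable name between slashes would match the literal text instead.
select_namespaces() {
  awk -v extra="${1:-}" '
    NR > 1 && ($1 ~ /openshift-marketplace|openshift-operator-lifecycle-manager|common|kube|cert/ || (extra != "" && $1 ~ extra)) { print $1 }
  '
}
```

A possible usage is: oc get ns | select_namespaces "$CSNAMESPACE"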
 

Additional Information

Reference: 

[1] Foundational Services version: You can find the foundational services version by using the following command 
# oc get csv --all-namespaces | grep ibm-common-service-operator
ibm-common-services ibm-common-service-operator.v3.6.4 IBM Cloud Platform Common Services     3.6.4     ibm-common-service-operator.v3.6.3       Succeeded
 
[2] ClusterID: This is a unique ID for the cluster, which helps track previous cases related to the environment and identify the cluster. You can find the cluster ID by using the following command
# oc get clusterversion -o jsonpath='{.items[].spec.clusterID}{"\n"}'
 
[3] Business Impact: Business impact is a clear and specific description of the impact the issue has on your business (such as timeline, stakeholder commitments, revenue, regulatory requirements, or user impact). Providing a detailed impact allows us to better understand the urgency of the issue you are experiencing and whether you need an immediate workaround or a full root cause analysis and correction. Supply the specific impact this issue is having on your company.

Document Location

Worldwide

Operating System

Cross Brand:All operating systems listed


Product Synonym

foundational services; common services

Document Information

Modified date:
21 April 2025

UID

ibm16398264