How To
Summary
The Cloud Pak MustGather tool collects information about your cluster that is crucial for support to troubleshoot problems. This document also includes instructions to collect must-gather information for foundational services (formerly known as common services) and for a Cloud Pak in a disconnected (AirGap) environment with no access to an external registry (quay.io or icr.io).
Objective
Environment
- Is this a new Installation?
- Is this an IPI or UPI installation?
- Any recent changes to the environment?
- When does the behavior occur? Frequency? Repeatedly? At certain times?
- MustGather Tool Output: When appropriate, collect the logs from the environment.
Steps
Collecting MustGather
It is often helpful to provide debug information about your cluster when you open a case. Depending on your environment and the cluster status, choose one of the following methods to collect the Cloud Pak must-gather.
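If you are unsure whether your environment can reach the IBM registry (icr.io), a quick check such as the following can help decide between the connected and AirGap methods (a minimal sketch; it assumes curl is available on the bastion host):
# An HTTP response (for example 401 Unauthorized) means icr.io is reachable; a timeout or connection error suggests a disconnected environment.
curl -sI --connect-timeout 10 https://icr.io/v2/ | head -1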
Cloud Pak Must-Gather from an OCP 4.x cluster with access to the internet:
If you have internet access to icr.io, run the following commands to collect the must-gather for the Cloud Pak cluster. The default "oc adm must-gather" collects only the openshift-* namespaces and does not contain the logs from the Cloud Pak namespaces.
cat > cp-must-gather-CS.sh << 'EOT'
#!/bin/bash
# Update MY_CLOUDPAK_NAMESPACES with your Cloud Pak namespaces (cp4i and apic are examples).
export MY_CLOUDPAK_NAMESPACES=cp4i,apic
export MUST_GATHER_IMAGE=icr.io/cpopen/cpfs/must-gather:latest
export CLOUDPAK_NAMESPACES=common-service,ibm-common-services,openshift-operators,openshift-operator-lifecycle-manager,openshift-marketplace,$MY_CLOUDPAK_NAMESPACES
export MUST_GATHER_MODULES=overview,system,failure,cloudpak,route
oc adm must-gather --image=$MUST_GATHER_IMAGE -- gather -m $MUST_GATHER_MODULES -n $CLOUDPAK_NAMESPACES
EOT
chmod +x cp-must-gather-CS.sh
./cp-must-gather-CS.sh
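The collected data is written to a must-gather.local.* directory under the current working directory. If the run does not leave an archive behind, the directory can be packaged for upload, for example (a sketch; include your case number in the file name):
tar czf cloudpak-must-gather-$(date '+%y%b%dT%H-%M-%S').tar.gz must-gather.local.*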
Cloud Pak Must-Gather from an OCP 4.x cluster in a disconnected environment (AirGap):
cat > cp-must-gather-CS-Airgap.sh << 'EOT'
#!/bin/bash
# Update MY_CLOUDPAK_NAMESPACES with your Cloud Pak namespaces; replace [LOCAL_REGISTRY:5000] with your local registry host and port.
export MY_CLOUDPAK_NAMESPACES=cp4i,apic
export MUST_GATHER_IMAGE=[LOCAL_REGISTRY:5000]/cpopen/cpfs/must-gather:latest
export CLOUDPAK_NAMESPACES=common-service,ibm-common-services,openshift-operators,openshift-operator-lifecycle-manager,openshift-marketplace,$MY_CLOUDPAK_NAMESPACES
export MUST_GATHER_MODULES=overview,system,failure,cloudpak,route
oc adm must-gather --image=$MUST_GATHER_IMAGE -- gather -m $MUST_GATHER_MODULES -n $CLOUDPAK_NAMESPACES
EOT
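The must-gather image must already be present in your local (AirGap) registry. If it is not mirrored yet, one way to copy it from a connected host is the following (a sketch; LOCAL_REGISTRY:5000 is a placeholder, and credential or TLS options may be needed for your registries):
# Copy the must-gather image from the IBM registry into the local registry.
skopeo copy --all docker://icr.io/cpopen/cpfs/must-gather:latest docker://LOCAL_REGISTRY:5000/cpopen/cpfs/must-gather:latest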
Verify that the must-gather image and the expected tag are available in the local registry:
skopeo list-tags docker://[LOCAL_REGISTRY:5000]/cpopen/cpfs/must-gather
{
"Repository": "[LOCAL_REGISTRY:5000]/cpopen/cpfs/must-gather",
"Tags": [
"4.5.16"
.....
"4.6.7",
"4.6.8",
"4.6.9",
"latest"
]
}
MUST_GATHER_MODULES: If resolving the problem requires OpenShift namespace configuration and logs, add the ocp, etcd, and route modules to collect all openshift-* namespace logs and configuration (see the example after the script invocation below). Review the referenced documentation for the available modules and what each one collects.
chmod +x cp-must-gather-CS-Airgap.sh
./cp-must-gather-CS-Airgap.sh
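For example, to also collect the OpenShift namespace configuration and etcd data, the MUST_GATHER_MODULES line in the script can be changed as follows (a sketch; confirm the module names against the referenced must-gather documentation):
export MUST_GATHER_MODULES=overview,system,failure,cloudpak,route,ocp,etcd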
- Upload the cloudpak-must-gather-xxx.tar.gz file to the support case.
Gather debugging information by using "inspect" (AirGap):
If you cannot gather debugging information by using "oc adm must-gather", use the following script to collect the information with the "oc adm inspect" command for specific resources. This method does not require internet access to download the must-gather image.
1- Create the gathering script. Copy and paste the following commands on a bastion host (where you normally run the oc command).
cat > cs-mg-inspect.sh << 'EOF'
#!/bin/bash
#NOTE: Update MY_CLOUDPAK_NAMESPACES (ns/cp4i) with the namespace where the problem occurs and any other relevant namespaces, separated with a space.
#NOTE: The collection does not include the actual certificates. Add certificaterequests to the CRS variable if certificates need to be reviewed.
export MGDIR=cs-mg-inspect-$(date '+%y%b%dT%H-%M-%S')
mkdir -p $MGDIR
export MY_CLOUDPAK_NAMESPACES="ns/cp4i ns/apic"   # << update and customize for your environment <<<
oc adm inspect $MY_CLOUDPAK_NAMESPACES --dest-dir=$MGDIR
CRS="OperandRequests OperandConfigs OperandRegistries Issuers Certificate Certmanagers \
CommonServices NamespaceScopes OperandBindInfos MongoDBs Routes Ingresses managementingresses NetworkPolicies Clients ZenServices \
businessteamsservices Clusters analyticsproxies analyticsproxieswithsubmodules PostgresClusters pgupgrades \
flinkclusters AutomationBases Cartridges CartridgeRequirements EventProcessors PlatformNavigators AssetRepositories \
OperationsDashboards Dashboards EventStreams ElasticSearches Kafkas kafkaclaims kafkausers KafkaComposite \
DesignerAuthorings DataPowerservices APIConnectClusters IntegrationServers QueueManagers ICP4AClusters AutomationUIConfigs CP4IServicesBindings"
RESOURCES="olm"
for i in $(oc api-resources --verbs=list | awk '{print $1}' | sort | uniq); do
  echo $CRS | grep -w -i -q ${i}
  if [ $? -eq 0 ]; then
    RESOURCES+=",${i}"
  fi
done
echo "CRs:" $RESOURCES | tee $MGDIR/CRs.txt
oc adm inspect $RESOURCES -A --dest-dir=$MGDIR
OLM_NS="ns/openshift-marketplace ns/openshift-operator-lifecycle-manager ns/openshift-operators"
oc adm inspect $OLM_NS --dest-dir=$MGDIR
if oc get project ibm-common-services > /dev/null 2>&1; then
  oc adm inspect ns/ibm-common-services --dest-dir=$MGDIR
fi
if oc get project cs-control > /dev/null 2>&1; then
  oc adm inspect ns/cs-control --dest-dir=$MGDIR
fi
MGDIROV=$MGDIR/overview
mkdir -p $MGDIROV
oc get clusterversion -oyaml > $MGDIROV/ocp-cluster-version.txt
oc get co > $MGDIROV/clusterOperators.txt
oc adm top nodes > $MGDIROV/node-list.txt
oc get node -owide >> $MGDIROV/node-list.txt
oc describe nodes > $MGDIROV/node-describe.txt
oc get pods -A -owide > $MGDIR/pods-list.txt
oc get crd > $MGDIROV/crd-list.txt
oc get Certificaterequests -A > $MGDIROV/certreq-list.txt
oc get certs -A -owide > $MGDIROV/certs-list.txt
oc -n kube-public get cm ibm-common-services-status -oyaml > $MGDIROV/cm_kube-public-ibm-common-services-status.txt
oc -n kube-public get cm ibmcloud-cluster-info -oyaml > $MGDIROV/cm_kube-public-ibmcloud-cluster-info.txt
oc -n kube-public get cm common-service-maps -oyaml > $MGDIROV/cm_kube-public-common-service-maps.txt
oc get ImageContentSourcePolicy -n openshift-marketplace -oyaml > $MGDIROV/ImageContentSourcePolicy.txt
oc get clients,oauthclient -A > $MGDIROV/client-list.txt
tar aczf $MGDIR.tar.gz ./$MGDIR
echo "Done. Upload the $MGDIR.tar.gz file to the case."
#----------------------------------------------------
EOF
chmod +x cs-mg-inspect.sh
./cs-mg-inspect.sh
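Before uploading, the generated archive can be checked quickly, for example (a sketch, assuming a single cs-mg-inspect-*.tar.gz archive in the current directory):
# List the first entries of the archive to confirm that data was collected.
tar tzf cs-mg-inspect-*.tar.gz | head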
----
Cloud Pak Must-Gather for Red Hat OpenShift 4.x using Scripts:
export MGDIR=cp-MG-Script-$(date '+%y%b%dT%H-%M-%S')
export LOGLIMIT="--tail=1000"
mkdir -p $MGDIR
oc get node,hostsubnet -o wide > $MGDIR/node-list.txt
oc adm top nodes > $MGDIR/node-detail-list.txt
oc get all,events -o wide -n default > $MGDIR/all-event.txt
oc describe nodes > $MGDIR/node-describe.txt
oc get namespaces > $MGDIR/namespaces.txt
oc get clusteroperators > $MGDIR/cluster-operators.txt
oc adm top pod --all-namespaces > $MGDIR/TopNameSpace.txt
oc get pods --all-namespaces -owide --show-labels > $MGDIR/pods.txt
oc get po --all-namespaces -o wide| grep -Ev '([[:digit:]])/\1.*R' | egrep -v "Completed" > $MGDIR/podsNotRunning-list.txt
#ocp upgrade related
oc get clusterversion -o jsonpath='{.items[].spec.clusterID}{"\n"}' > $MGDIR/clusterID.txt
oc get clusterversion -o yaml > $MGDIR/ocpclusterversion.txt
oc logs $(oc get pod -n openshift-cluster-version -l k8s-app=cluster-version-operator -oname) -n openshift-cluster-version > $MGDIR/clusterVersionOperator-Upgrade.log
oc get mcp > $MGDIR/machineConfigPool.txt
oc describe mcp >> $MGDIR/machineConfigPool.txt
oc get co/machine-config > $MGDIR/co-machineConfig.txt
oc describe co/machine-config >> $MGDIR/co-machineConfig.txt
oc get catalogsource -A > $MGDIR/catalogsource.txt
oc get catalogsource -n openshift-marketplace -o yaml > $MGDIR/catalogsourcedetail.yaml
oc get cm ibmcloud-cluster-info -o yaml > $MGDIR/ibmcloud-cluster-info-ConfigMap.txt
oc get installplan -A > $MGDIR/installplan.txt
oc get certificates.certmanager.k8s.io --all-namespaces -owide --show-labels > $MGDIR/certificates.txt
oc get challenges.certmanager.k8s.io --all-namespaces -owide --show-labels > $MGDIR/challengesCert.txt
oc get clusterissuers.certmanager.k8s.io --all-namespaces -owide --show-labels > $MGDIR/clusterissuers.txt
oc get configmap --all-namespaces -owide --show-labels > $MGDIR/configmap.txt
oc get crd --all-namespaces -owide --show-labels > $MGDIR/crd.txt
oc get cronjob --all-namespaces -owide --show-labels > $MGDIR/cronjob.txt
oc get csv --all-namespaces -owide --show-labels > $MGDIR/csv.txt
oc get ds --all-namespaces -owide --show-labels > $MGDIR/ds.txt
oc get endpoints --all-namespaces -owide --show-labels > $MGDIR/endpoints.txt
oc get event --all-namespaces -owide --show-labels > $MGDIR/event.txt
oc get hpa --all-namespaces -owide --show-labels > $MGDIR/hpa.txt
oc get ingress --all-namespaces -owide --show-labels > $MGDIR/ingress.txt
oc get issuers.certmanager.k8s.io --all-namespaces -owide --show-labels > $MGDIR/issuers.txt
oc get job --all-namespaces -owide --show-labels > $MGDIR/job.txt
oc get namespace --all-namespaces -owide --show-labels > $MGDIR/namespace.txt
oc get networkpolicy --all-namespaces -owide --show-labels > $MGDIR/networkpolicy.txt
oc get authentications.operator.ibm.com --all-namespaces > $MGDIR/authentications.txt
oc get orders.certmanager.k8s.io --all-namespaces -owide --show-labels > $MGDIR/orders.certmanager.txt
oc get pvc --all-namespaces -owide --show-labels > $MGDIR/pvc.txt
oc get pv --all-namespaces -owide --show-labels > $MGDIR/pv.txt
oc get resourcequota --all-namespaces -owide --show-labels > $MGDIR/resourcequota.txt
oc get route --all-namespaces -owide --show-labels > $MGDIR/route.txt
oc get secret --all-namespaces -owide --show-labels > $MGDIR/secret.txt
oc get svc --all-namespaces -owide --show-labels > $MGDIR/svc.txt
oc get sts --all-namespaces -owide --show-labels > $MGDIR/sts.txt
oc status --all-namespaces > $MGDIR/status.txt
oc get storageclass --all-namespaces -owide --show-labels > $MGDIR/storageclass.txt
#If you have a large number of projects and namespaces, you can reduce the data collected by limiting the namespaces in the for loop (see the example after the script)
for NS in `oc get ns | awk 'NR>1 && (/openshift-marketplace/ || /openshift-operator-lifecycle-manager/ ||/common/ ||/kube/ || /infra/){ORS=" "; print $1}'` default; do
export NS=$NS; mkdir $MGDIR/$NS; echo gathering info from namespace $NS
oc get all,secrets,cm,events -n $NS -o wide &> $MGDIR/$NS/all-list.txt
oc get pods -n $NS | awk 'NR>1{print "oc -n $NS describe pod "$1" > $MGDIR/$NS/"$1"-describe.txt && echo described "$1}' | bash
oc get pods -n $NS -o go-template='{{range $i := .items}}{{range $c := $i.spec.containers}}{{println $i.metadata.name $c.name}}{{end}}{{end}}' > $MGDIR/$NS/container-list.txt
awk '{print "oc -n $NS logs "$1" -c "$2" $LOGLIMIT -p > $MGDIR/$NS/"$1"_"$2"_previous.log && echo gathered previous logs of "$1"_"$2}' $MGDIR/$NS/container-list.txt | bash
awk '{print "oc -n $NS logs "$1" -c "$2" $LOGLIMIT > $MGDIR/$NS/"$1"_"$2".log && echo gathered logs of "$1"_"$2}' $MGDIR/$NS/container-list.txt | bash
done
tar czf CaseTS123456-$MGDIR.tgz $MGDIR/ # replace case number TS123456
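For example, to gather data only from specific namespaces instead of matching on the patterns above, the for loop header in the script can be replaced with an explicit list (a sketch; cp4i and apic are placeholder Cloud Pak namespaces):
for NS in cp4i apic ibm-common-services openshift-marketplace default; do
export NS=$NS; mkdir -p $MGDIR/$NS; echo gathering info from namespace $NS
# ... keep the remaining collection commands from the loop above unchanged ...
done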
Cloud Pak Must-Gather for Red Hat OpenShift 3.11 using diagnostic Scripts:
export MGDIR=openshift3.11-diag-$(date '+%y%b%dT%H-%M-%S')
export LOGLIMIT="--tail=1000"
mkdir -p $MGDIR
oc get nodes > $MGDIR/node-list.txt
oc describe nodes > $MGDIR/node-describe.txt
oc get namespaces > $MGDIR/namespaces.txt
oc get pods --all-namespaces -o wide > $MGDIR/all-pods-list.txt
for NS in `oc get ns | awk 'NR>1 && (/openshift/ || /common/ ||/kube/ || /infra/){ORS=" "; print $1}'` default; do
export NS=$NS; mkdir $MGDIR/$NS; echo gathering info from namespace $NS
oc get pods,svc,route,ing,secrets,cm,events -n $NS -o wide &> $MGDIR/$NS/all-list.txt
oc get pods -n $NS | awk 'NR>1{print "oc -n $NS describe pod "$1" > $MGDIR/$NS/"$1"-describe.txt && echo described "$1}' | bash
oc get pods -n $NS -o go-template='{{range $i := .items}}{{range $c := $i.spec.containers}}{{println $i.metadata.name $c.name}}{{end}}{{end}}' > $MGDIR/$NS/container-list.txt
awk '{print "oc -n $NS logs "$1" -c "$2" $LOGLIMIT -p > $MGDIR/$NS/"$1"_"$2"_previous.log && echo gathered previous
logs of "$1"_"$2}' $MGDIR/$NS/container-list.txt | bash
awk '{print "oc -n $NS logs "$1" -c "$2" $LOGLIMIT > $MGDIR/$NS/"$1"_"$2".log && echo gathered logs of "$1"_"$2}'
$MGDIR/$NS/container-list.txt | bash
oc get svc -n $NS | awk 'NR>1{print "oc -n $NS describe svc "$1" > $MGDIR/$NS/svc-describe-"$1".txt && echo
described service "$1}' | bash
done
tar czf CaseTS123456-$MGDIR.tgz $MGDIR/ # replace case number TS123456
SOS report from an RHEL CoreOS node:
In some situations, support might request a sosreport taken from one or more Red Hat OpenShift (RHCOS) nodes.
It is not recommended to connect to an RHCOS node via SSH. The following steps provide instructions on how to get the sosreport by using a debug pod.
$ oc get nodes
$ oc debug -t node/<node_name>
Change root to /host, then run the toolbox command and sosreport:
# chroot /host
# toolbox
# sosreport -k crio.all=on -k crio.logs=on
The sosreport archive is saved in the /var/tmp folder. From your local machine, use scp to copy the file:
scp core@nodename:/var/tmp/sosreport-XXXXX.tar.xz .
Upload the sosreport to the IBM support case, and then delete the sosreport from the node.
Reference: How to provide a sosreport from an RHEL CoreOS node
Additional Information
To check the installed version of the foundational services (common services) operator:
# oc get csv --all-namespaces | grep ibm-common-service-operator
ibm-common-services ibm-common-service-operator.v3.6.4 IBM Cloud Platform Common Services 3.6.4 ibm-common-service-operator.v3.6.3 Succeeded
To retrieve the cluster ID:
oc get clusterversion -o jsonpath='{.items[].spec.clusterID}{"\n"}'
When you open the support case, also describe the business impact, including:
- Upcoming deadlines and dates
- Realized or potential effect on customers or your business
- For guidelines on how this impact relates to recommended severity, refer to the IBM® Enterprise Support and Preferred Care Severity Definition page.
Related Information
Document Location
Worldwide
Product Synonym
foundational services; common services
Document Information
Modified date:
05 October 2023
UID
ibm16398264