Troubleshooting certificates

Learn how to isolate and resolve problems with certificates.

Expired certificate secrets fail to rotate

When a cluster is restored from a snapshot or restarted after being shut down for a long period, the certificate manager might need to rotate many certificates at the same time. When the IBM Cert Manager is used, it can then fail to rotate some of the secrets that are associated with those certificates.

Solution: Use the following commands to identify if any certificate secrets contain expired certificate data:

  1. Set the project (namespace):

    oc project <namespace>
    

    Where <namespace> is the project (namespace) where IBM Cloud Pak for AIOps is installed.

  2. Run the following script to check for expired certificate secrets:

    Note: The script requires Bash version 4 or later because it uses an associative array (declare -A).

    # Convert "<day> <month name> <year>" (for example, 14 Mar 2025) to YYYY-MM-DD.
    convert_date () {
      local months=( Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec )
      local i
      for (( i=0; i<12; i++ )); do
        [[ $2 = "${months[$i]}" ]] && break
      done
      printf "%4d-%02d-%02d\n" "$3" "$(( i+1 ))" "$1"
    }
    
    # Maps each secret name to the "Not After" date of its ca.crt entry.
    declare -A CERT_TO_EXPIRY
    
    for secret in $(oc get certificate -o jsonpath='{.items[*].spec.secretName}'); do
      name=$(oc get secret "$secret" -o jsonpath='{.metadata.name}')
      expiry=$(oc get secret "$secret" -o go-template='{{index .data "ca.crt" | base64decode}}' \
        | openssl x509 -text -noout | grep "Not After" | awk '{print $5 " " $4 " " $7}')
      CERT_TO_EXPIRY["$name"]="$expiry"
    done
    
    NOW=$(date +%s)
    
    # Print every secret whose certificate expiry date is in the past.
    for secret in "${!CERT_TO_EXPIRY[@]}"; do
      CERT_DATE=${CERT_TO_EXPIRY[$secret]}
      # $CERT_DATE is intentionally unquoted so that it splits into day, month, and year.
      if [[ "$OSTYPE" == "linux-gnu"* ]]; then
        CERT_DATE_SEC=$(date -d "$(convert_date $CERT_DATE)" +%s)
      elif [[ "$OSTYPE" == "darwin"* ]]; then
        CERT_DATE_SEC=$(date -j -f "%Y-%m-%d" "$(convert_date $CERT_DATE)" +%s)
      fi
      if [ "$CERT_DATE_SEC" -lt "$NOW" ]; then echo "$secret expired on $CERT_DATE"; fi
    done
    
  3. If no secret names are printed from running the preceding script, no action is required. If secret names are printed, check the state of the owning certificates using the following command:

    oc get certificates.cert-manager.io
    

    The output resembles the following sample:

    NAME                                            READY   SECRET                                          AGE
    aiops-ibm-elasticsearch-tls-secret              True    aiops-ibm-elasticsearch-tls-secret              4h2m
    aiops-installation-redis-client-cert            True    aiops-installation-redis-client-cert            4h7m
    aiops-installation-redis-server-cert            True    aiops-installation-redis-server-cert            4h7m
    

    If any certificates show as False in the READY column, contact IBM Support to resolve the Cert Manager failure. If all certificates show as True in the READY column, continue with the resolution steps.

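  4. Use the following script to force the expired certificate secrets to be rotated. The script deletes each secret that contains expired certificate data so that the Cert Manager can re-create it.

    Note: The script requires Bash version 4 or later because it uses an associative array (declare -A).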

    # Convert "<day> <month name> <year>" (for example, 14 Mar 2025) to YYYY-MM-DD.
    convert_date () {
      local months=( Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec )
      local i
      for (( i=0; i<12; i++ )); do
        [[ $2 = "${months[$i]}" ]] && break
      done
      printf "%4d-%02d-%02d\n" "$3" "$(( i+1 ))" "$1"
    }
    
    # Maps each secret name to the "Not After" date of its ca.crt entry.
    declare -A CERT_TO_EXPIRY
    
    for secret in $(oc get certificate -o jsonpath='{.items[*].spec.secretName}'); do
      name=$(oc get secret "$secret" -o jsonpath='{.metadata.name}')
      expiry=$(oc get secret "$secret" -o go-template='{{index .data "ca.crt" | base64decode}}' \
        | openssl x509 -text -noout | grep "Not After" | awk '{print $5 " " $4 " " $7}')
      CERT_TO_EXPIRY["$name"]="$expiry"
    done
    
    NOW=$(date +%s)
    
    # Delete every secret whose certificate expiry date is in the past.
    for secret in "${!CERT_TO_EXPIRY[@]}"; do
      CERT_DATE=${CERT_TO_EXPIRY[$secret]}
      # $CERT_DATE is intentionally unquoted so that it splits into day, month, and year.
      if [[ "$OSTYPE" == "linux-gnu"* ]]; then
        CERT_DATE_SEC=$(date -d "$(convert_date $CERT_DATE)" +%s)
      elif [[ "$OSTYPE" == "darwin"* ]]; then
        CERT_DATE_SEC=$(date -j -f "%Y-%m-%d" "$(convert_date $CERT_DATE)" +%s)
      fi
      if [ "$CERT_DATE_SEC" -lt "$NOW" ]; then oc delete secret "$secret"; fi
    done
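
Both scripts depend on turning the "Not After" line from the openssl output into a date that the date command can parse. The following minimal sketch walks through that conversion without a cluster. The sample line is hypothetical and only assumed to match the layout that `openssl x509 -text -noout` prints. The `fields` and `iso_date` variables are illustrative names that do not appear in the scripts.

```shell
#!/usr/bin/env bash
# Sketch: convert an openssl "Not After" line into YYYY-MM-DD.

# Same helper idea as in the scripts: arguments are day, month name, year.
convert_date () {
  local months=( Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec )
  local i
  for (( i=0; i<12; i++ )); do
    [[ $2 = "${months[$i]}" ]] && break
  done
  printf "%4d-%02d-%02d\n" "$3" "$(( i+1 ))" "$1"
}

# Hypothetical sample line (assumed format, not taken from a real certificate).
sample_line="            Not After : Mar  4 12:00:00 2020 GMT"

# awk splits on runs of whitespace, so field 4 is the month, field 5 the day,
# and field 7 the year, even when the day is padded with an extra space.
fields=$(echo "$sample_line" | awk '{print $5 " " $4 " " $7}')

# Left unquoted on purpose so that the three words become three arguments.
iso_date=$(convert_date $fields)

echo "$fields -> $iso_date"
```

Because the time of day is dropped during the conversion, the comparison in the scripts works at day granularity rather than to the second.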