EDB Postgres OCP Upgrade

After OCP upgrade 4.14 -> 4.15, the cloud-native-postgresql.v1.18.7 CSV (ClusterServiceVersion) is in failed state

Symptoms

After the OCP upgrade, the CSV cycles between the Failed, InstallReady, and Pending phases. The CSV status contains a message like the following:

message: 'install strategy failed: rolebindings.rbac.authorization.k8s.io "postgresql-operator-controller-manager-1-18-7-service-auth-reader"

Diagnosing the problem

To confirm the problem, complete the following steps:

  1. Identify the affected resources:
    oc get csv -A | grep postgre
    oc describe csv <failed cloud-native-postgresql.v1.18.7> -n <operator-namespace> | tail -n 10
  2. Look for a message like the following in the command output:
     message: 'install strategy failed: rolebindings.rbac.authorization.k8s.io "postgresql-operator-controller-manager-1-18-7-service-auth-reader"

    If you see the preceding message in the output, proceed to the steps in Resolving the problem.
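
The role binding name can be extracted from the failure message instead of being copied by hand. The following is a minimal sketch, assuming the message has the form shown above, with the resource name in double quotes. The function name extract_rolebinding is hypothetical and is not part of oc.

```shell
#!/bin/sh
# Hypothetical helper: read text on standard input and print each distinct
# role binding name quoted in an "install strategy failed" message.
extract_rolebinding() {
  sed -n 's/.*rolebindings\.rbac\.authorization\.k8s\.io "\([^"]*\)".*/\1/p' | sort -u
}

# Example usage (placeholders as in step 1):
#   oc describe csv <csv-name> -n <operator-namespace> | extract_rolebinding
```

The printed name is the resource to delete from the kube-system namespace in the next section.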

Resolving the problem

  1. Delete the affected role binding:
    oc delete rolebinding -n kube-system postgresql-operator-controller-manager-1-18-7-service-auth-reader
  2. Verify that the role binding is re-created with labels that indicate it is managed by Operator Lifecycle Manager (OLM):

    $ oc get rolebinding -n kube-system postgresql-operator-controller-manager-1-18-7-service-auth-reader -oyaml
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      labels:
        olm.owner: cloud-native-postgresql.v1.18.7
        olm.owner.kind: ClusterServiceVersion
        olm.owner.namespace: cpd-operator
        operators.coreos.com/cloud-native-postgresql.cpd-operator: ""
      ...
  3. Review the failed CSV:

    The resolution is complete when the CSV no longer reports the "already exists" error for the role binding. The installation is complete when the deployment is in the ready state.
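
The checks in steps 2 and 3 can also be scripted. The following is a sketch under stated assumptions, not a definitive procedure: the function names rolebinding_owned_by and csv_phase are hypothetical, and the example assumes that the recreated role binding carries the olm.owner label shown in step 2. A healthy CSV normally reports the Succeeded phase in status.phase.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the role binding in kube-system
# carries an olm.owner label equal to the expected CSV name.
# $1 = role binding name, $2 = expected CSV name
rolebinding_owned_by() {
  owner=$(oc get rolebinding "$1" -n kube-system \
    -o jsonpath='{.metadata.labels.olm\.owner}')
  [ "$owner" = "$2" ]
}

# Hypothetical helper: print the current phase of a CSV.
# $1 = CSV name, $2 = operator namespace
csv_phase() {
  oc get csv "$1" -n "$2" -o jsonpath='{.status.phase}'
}

# Example usage, with the names from this document:
#   rolebinding_owned_by \
#     postgresql-operator-controller-manager-1-18-7-service-auth-reader \
#     cloud-native-postgresql.v1.18.7 && echo "role binding is managed by OLM"
#   csv_phase cloud-native-postgresql.v1.18.7 cpd-operator
```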