Migrating data from Zalando to CloudNativePG

You can transfer data from Zalando to CloudNativePG by using the pg_basebackup bootstrap mode in a cluster that runs in replica mode. To transfer the data, create a CloudNativePG replica cluster (target) that replicates from the Zalando data store (source).
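
The following fragment is a minimal sketch of how these two settings relate in the CloudNativePG Cluster resource. The source name zalando-postgres refers to an externalClusters entry; the complete manifest is shown later in this procedure.

  spec:
    bootstrap:
      pg_basebackup:
        source: zalando-postgres    # take the initial copy from the Zalando cluster
    replica:
      enabled: true                 # keep streaming from the source until cutover
      source: zalando-postgres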

Prerequisites

To bootstrap from a live cluster, ensure that the following prerequisites are met:

  • Ensure that the target and source have the same major PostgreSQL version.
  • Set up the streaming_replica user with the REPLICATION and LOGIN attributes in the Zalando PostgreSQL database. You can verify the role attributes with the query that follows this list.
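
For example, you can check on the source whether the streaming_replica role already exists with the required attributes by running a query such as the following in psql (the role is created in the migration procedure if it is missing):

  SELECT rolname, rolcanlogin, rolreplication
  FROM pg_roles
  WHERE rolname = 'streaming_replica';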

Modifying the Zalando Postgres data store for data migration

To modify the Zalando Postgres data store for data migration, complete the following steps:

  1. Modify the existing Zalando yaml file:

    .........
    postgresql:
      parameters:
        listen_addresses: "*"
        max_wal_senders: 5
        unix_socket_directories: "/controller/run"
    .........
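
    After you change the parameters, reapply the manifest so that the Zalando operator picks up the new configuration. For example, if the cluster is managed through a manifest file (the file name is a placeholder):

    kubectl apply -f <zalando_postgres_manifest>.yaml -n instana-postgres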
    
  2. Connect to the Zalando pod:

    1. To get the primary pod, run the following command:

      kubectl get pods -o jsonpath={.items..metadata.name} -l application=spilo,spilo-role=master -n instana-postgres
      
    2. To run commands directly in the pod, run the following command:

      kubectl exec -it <primary_pod_name> -n instana-postgres -- bash
      
    3. To connect to the Postgres database, run the following command:

      psql -U postgres
      
    4. To list the existing roles and to create the streaming_replica user with the REPLICATION and LOGIN attributes in the Zalando database, run the following commands:

      \du
      CREATE ROLE streaming_replica WITH REPLICATION;
      ALTER ROLE streaming_replica WITH LOGIN PASSWORD '<password_retrieved_from_zalando>';
      
    5. To exit the psql prompt, run the following command:

      \q
      
  3. Create two empty files, custom.conf and override.conf, in the pgdata directory on all pods, alongside the postgresql.conf file.

    1. To list the pods, run the following command:

      kubectl get pods -n instana-postgres
      
    2. To run commands directly within a pod, use the following command, and then create the files. Repeat these steps on all pods:

      kubectl exec -it <pod_name> -n instana-postgres -- bash

      cd /var/lib/postgresql/data/pgdata
      touch custom.conf
      touch override.conf
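
      Optionally, confirm that the files were created next to postgresql.conf:

      ls -l postgresql.conf custom.conf override.conf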
      
  4. Exit from the pod terminal.

    exit
    

Creating a Postgres data store by using the CloudNativePG Postgres Operator for data migration

Installing the Postgres Operator online

  1. To deploy the CloudNativePG Postgres Operator online, complete the following steps:

    1. Create the instana-postgres-01 namespace:

      kubectl create namespace instana-postgres-01
      
    2. Determine the file system group ID on Red Hat OpenShift.

      Red Hat OpenShift requires that file system groups be within a range of values specific to the namespace. On the cluster where the CloudNativePG Kubernetes Operator is deployed, run the following command:

      kubectl get namespace instana-postgres-01 -o yaml
      

      The command returns output similar to the following example:

      apiVersion: v1
      kind: Namespace
      metadata:
        annotations:
          .......
          openshift.io/sa.scc.supplemental-groups: 1000750000/10000
        creationTimestamp: "2024-01-14T07:04:59Z"
        labels:
          kubernetes.io/metadata.name: instana-postgres-01
          .......
        name: instana-postgres-01
      

      The openshift.io/sa.scc.supplemental-groups annotation contains the range of allowed IDs. The range 1000750000/10000 indicates 10,000 values that start at ID 1000750000, that is, IDs 1000750000 through 1000759999. In this example, the value 1000750000 can be used as the file system group ID.
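
      If you only need the annotation value, you can filter the output of the previous command, for example:

      kubectl get namespace instana-postgres-01 -o yaml | grep supplemental-groups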

    3. Install the CloudNativePG Postgres Operator by running the following Helm commands:

      helm repo add instana https://artifact-public.instana.io/artifactory/rel-helm-customer-virtual --username=_ --password=<AGENT_KEY>
      
      helm repo update
      
      helm install cnpg instana/cloudnative-pg \
        --version=0.20.0 \
        --set image.repository=artifact-public.instana.io/self-hosted-images/3rd-party/operator/cloudnative-pg \
        --set image.tag=v1.21.1_v0.6.0 \
        --set imagePullSecrets[0].name=instana-registry \
        --set containerSecurityContext.runAsUser=<UID from namespace> \
        --set containerSecurityContext.runAsGroup=<UID from namespace> \
        -n instana-postgres-01
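
      To verify the operator deployment, you can list the Helm release and check the deployments in the namespace, for example:

      helm list -n instana-postgres-01

      kubectl get deployments -n instana-postgres-01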
      
  2. Create image pull secrets for the instana-postgres-01 namespace:

    kubectl create secret docker-registry instana-registry -n instana-postgres-01 \
    --docker-username=_ \
    --docker-password=<AGENT_KEY> \
    --docker-server=artifact-public.instana.io
    

    Note: Before you create the secret, update the <AGENT_KEY> value with your own agent key.

  3. Create a file, such as postgres-secret.yaml, for external cluster access:

    kind: Secret
    apiVersion: v1
    metadata:
      name: instanaadmin
    type: Opaque
    stringData:
      username: instanaadmin
      password: <user_generated_password_from_zalando>
    
  4. Apply the postgres-secret.yaml file by running the following command:

    kubectl apply -f postgres-secret.yaml -n instana-postgres-01
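
    Optionally, verify that the secret exists:

    kubectl get secret instanaadmin -n instana-postgres-01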
    
  5. Create a CloudNativePG Cluster resource in replica mode:

    1. Create a file, such as cnpg-postgres.yaml, as follows:

      apiVersion: postgresql.cnpg.io/v1
      kind: Cluster
      metadata:
        name: postgres
      spec:
        instances: 3
        imageName: artifact-public.instana.io/self-hosted-images/3rd-party/cnpg-containers:15_v0.8.0
        imagePullPolicy: IfNotPresent
        imagePullSecrets:
          - name: instana-registry
        enableSuperuserAccess: true
        replicationSlots:
          highAvailability:
            enabled: true
        managed:
          roles:
          - name: instanaadmin
            login: true
            superuser: true
            createdb: true
            createrole: true
            replication: true
            passwordSecret:
              name: instanaadmin
        postgresql:
          pg_hba:
            - local     all          all                            trust
            - host      replication  postgres          all          trust
            - host      replication  streaming_replica 0.0.0.0/0    trust
            - host      all          all               0.0.0.0/0    trust
            - local     replication  standby                        trust
            - hostssl   replication  standby      all               md5
            - hostnossl all          all          all               reject
            - hostssl   all          all          all               md5
        bootstrap:
          pg_basebackup:
            source: zalando-postgres
        replica:
          enabled: true
          source: zalando-postgres
      
        externalClusters:
        - name: zalando-postgres
          connectionParameters:
            host: postgres.instana-postgres.svc
            user: postgres
          password:
            name: instanaadmin
            key: password
      
        superuserSecret:
          name: instanaadmin
      
        storage:
          size: 20Gi
          storageClass: nfs-client
      
    2. Apply the cnpg-postgres.yaml file by running the following command:

      kubectl apply -f cnpg-postgres.yaml -n instana-postgres-01
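
      To monitor the bootstrap, you can watch the pods and the Cluster resource, for example:

      kubectl get pods -n instana-postgres-01 -w

      kubectl get clusters.postgresql.cnpg.io -n instana-postgres-01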
      
  6. Open a debug shell in the first CloudNativePG pod to modify postgresql.conf.

    After the cluster is initialized in replica mode, the status of the initial pod (postgres-1-pgbasebackup) is Completed. Subsequent attempts to start the first CloudNativePG pod (postgres-1) fail. This is expected behavior.

    To ensure successful initialization of the Cluster and subsequent starting of the Pod, complete the following steps:

    1. Run the following commands to open a debug shell in the first pod and go to the directory that contains the pgdata volume:

      oc debug pod/postgres-1 --as-root -n instana-postgres-01
      
      cd /var/lib/postgresql/data/pgdata/
      
    2. Modify the pg_hba and pg_ident paths inside the postgresql.conf file in each pod:

      • Change the pg_hba path from /var/lib/postgresql/15/main/pg_hba.conf to the following path:

        /var/lib/postgresql/data/pgdata/pg_hba.conf
        
      • Change the pg_ident path from /var/lib/postgresql/15/main/pg_ident.conf to the following path:

        /var/lib/postgresql/data/pgdata/pg_ident.conf
        
    3. Add include 'custom.conf' and include 'override.conf' at the end of the file:

      echo "include 'custom.config'" >> postgresql.conf
      echo "include 'override.config'" >> postgresql.conf
      
    4. Connect to the database by running the following commands:

      psql -U postgres
      
    5. Update the collation version:

      ALTER DATABASE template1 REFRESH COLLATION VERSION;
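
      PostgreSQL 15 records a collation version for each database in the pg_database catalog; you can inspect it with a query such as the following:

      SELECT datname, datcollversion FROM pg_database;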
      
  7. Disable the replica cluster:

    1. Modify the cnpg-postgres.yaml file:

      ........
      replica:
        enabled: false
        source: zalando-postgres
      ..........
      
    2. Reapply cnpg-postgres.yaml as follows:

      kubectl apply -f cnpg-postgres.yaml -n instana-postgres-01
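
      Disabling replica mode promotes the CloudNativePG cluster so that it starts accepting writes. You can check the cluster status, for example:

      kubectl get clusters.postgresql.cnpg.io postgres -n instana-postgres-01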
      
  8. Update the core spec configuration:

    1. In your Instana Core file (core.yaml), update the postgresConfigs configuration as follows:

      .....................
      postgresConfigs:
        - authEnabled: true
          hosts:
            - postgres-rw.instana-postgres-01
      .....................
      
    2. Reapply the core.yaml file:

      kubectl apply -f core.yaml -n instana-core