IBM Support

Global Data Platform (GDP) configuration and upgrade issues

Fix Readme


Abstract

In a standalone cluster that contains both storage nodes and non-storage nodes, the Global Data Platform (GDP) upgrade is stuck at 95% for more than 15 minutes, or the GDP storage configuration is stuck at 60% for more than 20 minutes.

Content

How to identify the problem: 

  1. Retrieve daemon CR details using the following command to get the status of roles and versions in the cluster. 

    oc get daemon ibm-spectrum-scale -n ibm-spectrum-scale -oyaml

  2. Check the role status in the daemon CR and ensure that the runningCount and podCount of the storage role match the nodeCount.


    Sample output:

    roles:
    - name: storage
      nodeCount: "6"
      nodes: compute-1-ru5.hciocp1.ssclab.ibm.si, compute-1-ru6.hciocp1.ssclab.ibm.si, compute-1-ru7.hciocp1.ssclab.ibm.si, control-1-ru2.hciocp1.ssclab.ibm.si, control-1-ru3.hciocp1.ssclab.ibm.si, control-1-ru4.hciocp1.ssclab.ibm.si
      podCount: "6"
      pods: compute-1-ru5, compute-1-ru6, compute-1-ru7, control-1-ru2, control-1-ru3, control-1-ru4
      runningCount: "6"
    - name: client
      nodeCount: "2"
      nodes: compute-1-ru8.hciocp1.ssclab.ibm.si, compute-1-ru9.hciocp1.ssclab.ibm.si
      podCount: "2"
      pods: compute-1-ru8, compute-1-ru9
      runningCount: "2"
    - name: afm
      nodeCount: "2"
      nodes: compute-1-ru23.hciocp1.ssclab.ibm.si, compute-1-ru24.hciocp1.ssclab.ibm.si
      podCount: "2"
      pods: compute-1-ru23, compute-1-ru24
      runningCount: "2"

  3. Check the versions in the daemon CR. 

    Sample output:

    versions:
        - count: "10"
          pods: control-1-ru2, control-1-ru3, control-1-ru4, compute-1-ru23, compute-1-ru8, compute-1-ru9, compute-1-ru5, compute-1-ru6, compute-1-ru7, compute-1-ru24
          version: 5.2.1.1
  4. For a standalone cluster, confirm that the versions field contains a single entry (that is, versions[0]) and that the following conditions are met:
    • The versions array has only one set of values, which confirms that all pods are updated and running the same version.
    • status.versions[0].count is greater than the runningCount of the storage role.
    • status.versions[0].pods contains both storage and non-storage pods, that is, every pod listed under roles.
  5. If status.versions[0].count is greater than the runningCount of the storage role (10 versus 6 in the sample output), the following symptoms can occur:
    • The GDP upgrade process gets stuck at 95%.
    • The GDP configuration process stalls at 60%.
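
The checks in steps 2, 4, and 5 can be scripted. The following is a minimal sketch, not an IBM-provided tool: the function names are hypothetical, and it assumes that the roles are reported under .status.roles in the daemon CR, in the same way that the versions are reported under .status.versions.

```shell
# Hypothetical helpers; the names are illustrative and not part of any IBM tooling.

# Read one field (for example runningCount) of the storage role from the daemon CR.
# Assumption: roles are reported under .status.roles.
storage_role_field() {
  oc get daemon ibm-spectrum-scale -n ibm-spectrum-scale \
    -o jsonpath="{.status.roles[?(@.name==\"storage\")].$1}"
}

# Step 2: succeeds when runningCount, podCount, and nodeCount of a role agree.
# Arguments: runningCount podCount nodeCount
role_counts_match() {
  [ "$1" -eq "$2" ] && [ "$2" -eq "$3" ]
}

# Steps 4 and 5: succeeds when there is exactly one versions entry and its
# count is greater than the runningCount of the storage role.
# Arguments: number-of-versions-entries versions[0].count storage-runningCount
matches_stuck_pattern() {
  [ "$1" -eq 1 ] && [ "$2" -gt "$3" ]
}
```

With the values from the sample output, `role_counts_match 6 6 6 && matches_stuck_pattern 1 10 6` succeeds, which means that the cluster matches the pattern described in this document. On a live cluster, the arguments could be filled in with calls such as `storage_role_field runningCount`.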

Resolution:

  1. Replace the isf-storage-operator-controller-manager image with a new image in the installed operator CSV.

    isf-storage-operator - cp.icr.io/cp/fusion-hci/isf-storage-operator@sha256:d83031d644a7830d3b37e10542f200e3a353a23b411ae6751af5d83fc95d22d6
  2. Run the following command to make sure that all the isf-storage-operator-controller-manager pods are running.

    oc get pods -n ibm-spectrum-fusion-ns | grep isf-storage-operator-controller-manager

    Sample output:

    oc get pods -n ibm-spectrum-fusion-ns | grep isf-storage-operator-controller-manager
    isf-storage-operator-controller-manager-86ccc69c4d-hjh65 2/2 Running 0 87m
  3. If the issue is with configuring the GDP, then monitor the GDP configuration percentage using the following command.

    oc get Scale storagemanager -n ibm-spectrum-fusion-ns -o jsonpath='{.status.installProgressStatus}'
  4. If the issue is with upgrading the GDP, then monitor the GDP upgrade percentage in the IBM Fusion user interface.
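
Steps 2 and 3 can be wrapped in small helper functions. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the pod check relies on the default column layout of `oc get pods` (NAME, READY, STATUS, ...), and the progress value is printed as returned, because its exact format is not described in this document.

```shell
# Hypothetical helpers; the names are illustrative and not part of any IBM tooling.

# Step 2: succeeds when one line of `oc get pods` output shows a pod in the
# Running state with all containers ready (for example "2/2").
# Argument: a single output line.
pod_line_ready() {
  echo "$1" | awk '{ split($2, c, "/"); exit !($3 == "Running" && c[1] == c[2]) }'
}

# Step 3: fetch the current GDP configuration progress.
get_progress() {
  oc get Scale storagemanager -n ibm-spectrum-fusion-ns \
    -o jsonpath='{.status.installProgressStatus}'
}

# Print the progress a fixed number of times, pausing between reads.
# Arguments: number-of-reads seconds-between-reads
poll_progress() {
  i=1
  while [ "$i" -le "$1" ]; do
    echo "$(date +%H:%M:%S) progress: $(get_progress)"
    if [ "$i" -lt "$1" ]; then sleep "$2"; fi
    i=$((i + 1))
  done
}
```

For example, `poll_progress 10 60` reads the configuration progress once a minute for ten minutes, which makes it easy to see whether the value moves past 60%.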

Product: IBM Fusion HCI Appliance Software
Version: 2.9.0
Platform: Platform Independent

Document Information

Modified date:
03 March 2025

UID

ibm17184374