preload_data.sh does not complete successfully
Symptom
During the isolated migration process, if the preload_data.sh script fails or does not run, the environment ends up in an error state because the default admin user is different than expected. In foundational services 3.x, the default admin user is named admin. In foundational services 4.x, the default admin username changed to cpadmin. However, when you upgrade from foundational services 3.x, the default remains admin so that you do not have to reset the default admin user on upgrade. This carry-over is done by the preload_data.sh script.
If the preload_data.sh script does not complete successfully or does not run for any reason, the admin user is likely not carried over, and the new installation of foundational services 4.x defaults to cpadmin instead. This causes a mismatch with Zen specifically, because Zen still expects admin to be the default. The error typically manifests during the zenservice reconcile that follows the foundational services 4.x upgrade. The zenservice reconcile might hang at 37% with the following error:
5.1.0/roles/0020-core has failed with error: Timeout of waiting for zen-audit, zen-core, zen-core-api, and zen-watcher deployments to be ready
However, a more specific error can be found in the logs of the usermgmt pod:
12/18/2023, 20:07:15 PM - error: GET /v1/internal/user/cpadmin: The asset is not found: user details not found: 404
12/18/2023, 20:07:19 PM - error: GET /v1/user/cpadmin: The asset is not found: user details not found: 404
The zen watcher pod might also be in an error state:
time="2023-12-18 19:57:23" level=error msg=GetUserDetailsByNameInternal func=zen-core-api/source/apis/commonutils.GetUserDetailsByNameInternal file="/go/src/zen-core-api/source/apis/commonutils/usermgmt-util.go:128" body="{\"exception\":\"user details not found\",\"_messageCode_\":\"not_found\",\"message\":\"The asset is not found\",\"_statusCode_\":404}" event="read usermgmt response"
panic: runtime error: invalid memory address or nil pointer dereference
It is likely that all three of the preceding errors are present in their respective locations.
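As a rough aid for confirming the symptom, the sketch below checks log text for the cpadmin 404 signature quoted above. The function name has_cpadmin_404 is hypothetical, and the pod name and namespace in the final comment are placeholders that are not given in this document.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads log text on stdin and succeeds if it contains
# the "user details not found" 404 for the cpadmin user.
has_cpadmin_404() {
  grep -Eq 'GET /v1/(internal/)?user/cpadmin: .*user details not found: 404'
}

# Sample input: the two usermgmt log lines quoted in this section.
sample_log='12/18/2023, 20:07:15 PM - error: GET /v1/internal/user/cpadmin: The asset is not found: user details not found: 404
12/18/2023, 20:07:19 PM - error: GET /v1/user/cpadmin: The asset is not found: user details not found: 404'

if printf '%s\n' "$sample_log" | has_cpadmin_404; then
  echo "cpadmin lookup is failing: the admin user was probably not carried over"
fi

# Against a live cluster (pod name and namespace are placeholders):
#   oc logs <usermgmt-pod> -n <zen-namespace> | has_cpadmin_404 && echo "mismatch"
```

A match only indicates that something is looking up cpadmin and not finding it; it does not by itself prove that preload_data.sh was skipped.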
Solution
To reset the default admin user to use admin instead of cpadmin:
- Update the platform-auth-idp-credentials Secret to use admin as the admin_username value (base64 encoded).
- Restart the pods for the IM v4 Operand Deployments: platform-auth-service, platform-identity-provider, and platform-identity-management (for example, run oc delete po -l app=${DEPLOYMENT_NAME}).
- If there is a Client that has failed to register properly from the iam-config-job that Zen runs, delete the Client CR and the iam-config-job; the ibm-zen-operator should re-create this Job and the Client if it is still progressing.
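The first two steps above could be scripted roughly as follows. This is a minimal sketch, not an official procedure: the NAMESPACE default is a hypothetical placeholder, the run wrapper and DRY_RUN switch are illustrative additions, and the script only prints the oc commands unless DRY_RUN=0 is set. The Client CR and iam-config-job cleanup is left out because the resource names depend on the environment.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumption: namespace where the IM v4 operands run; replace with your own.
NAMESPACE="${NAMESPACE:-my-foundational-services-ns}"
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Illustrative wrapper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# base64-encode the desired username without a trailing newline.
ADMIN_B64="$(printf '%s' 'admin' | base64)"

# Step 1: set admin_username in the platform-auth-idp-credentials Secret.
run oc patch secret platform-auth-idp-credentials -n "$NAMESPACE" \
  --type merge -p "{\"data\":{\"admin_username\":\"${ADMIN_B64}\"}}"

# Step 2: restart the pods of the three IM v4 Operand Deployments.
for DEPLOYMENT_NAME in platform-auth-service platform-identity-provider platform-identity-management; do
  run oc delete po -n "$NAMESPACE" -l "app=${DEPLOYMENT_NAME}"
done
```

Using printf rather than echo avoids encoding a trailing newline into the username. After a real run, the stored value can be checked with oc get secret platform-auth-idp-credentials -n "$NAMESPACE" -o jsonpath='{.data.admin_username}' piped through base64 -d.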
Note: The following step might also be required if the preload script did not run ahead of time. The backup_restore_mongo.sh script should run the same mongodump and mongorestore processes as preload.
To ensure that the mongo data is properly carried over, go to the case bundle and run the following command:
/backup_restore_mongo/backup_restore_mongo.sh --bns ibm-common-services --rns <Bedrock 4.x mongo namespace> -b -r -c