Account, user, and entitlements troubleshooting

User, account, and entitlements issues are associated with the Entitlements service. The IBM Security QRadar® Suite Software Entitlements service manages accounts, users, and roles, along with the change log and connection to IBM Cloud Pak® foundational services.

Account error 500 due to identity provider removal

After an identity provider is removed from IBM Security QRadar Suite Software, when you attempt to access an account, an error message is displayed.

When you try to manage users for an account in the System Administration account, you see the following error: 500 Error: Internal server error

500 error cause

An identity provider that is removed from an account in QRadar Suite Software causes a 500 error.

To confirm the diagnosis, complete the following steps.
  1. Log in as a system administrator.
  2. From the home page, in the System Administration account, under Quick navigation, click Account management.
  3. Note the identity providers that are listed for the System Administration account.
  4. Switch to the account where the error occurred.
  5. Click Account management.
  6. Check the identity providers that are listed for the account and note any identity provider in the list that is not displayed for the System Administration account.

    Any identity provider that is listed for the account but not for the System Administration account was removed from QRadar Suite Software. This removal causes the 500 error.
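
The comparison in steps 3 and 6 can also be done on two plain-text lists. The following sketch assumes that you copied the identity provider names from each Account management page into two files, one name per line. The function name and the file names are hypothetical and are not part of QRadar Suite Software.

```shell
# Print the identity providers that are listed for the affected account
# but not for the System Administration account.
# $1: file with the providers of the System Administration account
# $2: file with the providers of the account where the error occurred
removed_identity_providers() {
  # -F: fixed strings, -x: match whole lines, -v: invert the match,
  # -f: read the patterns from the file
  grep -Fxv -f "$1" "$2"
}

# Example (hypothetical file names):
#   removed_identity_providers sysadmin-providers.txt account-providers.txt
```

Any name that the function prints is a candidate for the removed identity provider.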

Resolving the 500 error

You can recover an account after an identity provider is removed. To resolve the problem, as an account admin, you must complete the following steps for the account where the error occurred.

  1. From the home page, under Quick navigation, click Account management.
  2. Note the identity providers that are listed in the "Account settings" box.
  3. Click Edit account settings. Observe that the number of identity providers in the "Edit account settings" pane does not match the list that you saw in step 2.
  4. Click Save changes.

    You see an Account updated message and the identity providers value is updated. When you return to the System Administration account, the accounts list is updated and the account that caused the issue now has the correct value for identity providers.

To avoid this error in future, use the following procedures to remove an identity provider.
  • Remove a Lightweight Directory Access Protocol directory (LDAP) identity provider by using the IBM Cloud Pak foundational services console.
  • Red Hat® OpenShift® Kubernetes Service (ROKS) authentication can only be disabled during an upgrade of QRadar Suite Software.

Cannot add users from a modified LDAP directory

It is not possible to add users from one or more modified LDAP directories.

It is not possible to add users from one or more modified LDAP directories after significant changes to the LDAP configuration of the cluster where IBM Security QRadar Suite Software is installed.

For example, this problem might occur following the restoration of a QRadar Suite Software backup to a different cluster.

Resolving the inability to add users from a modified LDAP directory

QRadar Suite Software provides an action to synchronize significant changes to the IBM Cloud Pak foundational services LDAP directories with the Entitlements service.

Install the command-line interface (CLI) utility cpctl from the cp-serviceability pod. For more information, see Installing the cpctl utility.

The sync_ldap action triggers an internal routine in the Entitlements service that queries foundational services for the latest directories. If changes are detected, these changes are applied to all the affected users and accounts that are stored in the Entitlements service. When the action is completed, the Entitlements service deployment is updated and all the Entitlements service pods are restarted.

Run the following command. No parameters are required.
cpctl remediation sync_ldap --token "$(oc whoami -t)"
The following output shows an example of what you see when the action is completed.
Executing playbook sync_ldap.yaml

- localhost on hosts: localhost -
Gathering Facts...
  localhost ok
Update LDAP directories stored in Entitlements service...
  localhost ok | msg: OK (unknown bytes)
Patch Deployment to Restart Entitlements Pods...
  localhost ok
cp4s namespace...
  localhost ok
Patch Deployment to Restart Pods...
  localhost done
Show LDAP sync status...
  localhost ok: {
    "changed": false,
    "msg": "LDAP directories are in sync"
}
If the action fails, check the status of the Entitlements service by running the following command.
cpctl diagnostics check_deployment --only entitle --token "$(oc whoami -t)"
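
The two commands can be combined so that the deployment check runs only when the synchronization fails. The following sketch uses the same cpctl commands as above. The function name is hypothetical, and the sketch assumes that cpctl exits with a nonzero status when the action fails, which this topic does not confirm.

```shell
# Run the sync_ldap action and, if it fails, check the Entitlements service.
# Assumption: cpctl returns a nonzero exit status when the action fails.
sync_ldap_with_check() {
  token=$(oc whoami -t) || return 1
  if cpctl remediation sync_ldap --token "$token"; then
    echo "sync_ldap completed"
  else
    echo "sync_ldap failed; checking the Entitlements deployment" >&2
    cpctl diagnostics check_deployment --only entitle --token "$token"
    return 1
  fi
}
```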

System Administration account admin person is unavailable

When the System Administration account is configured with only one user who has the "Accounts management" Admin role and that person is unavailable to access the cluster, no one can do accounts management for IBM Security QRadar Suite Software.

The System Administration account has no valid users with the "Accounts management" Admin role.

When you change the System Administration account through the user management page, QRadar Suite Software ensures that there is always at least one user with the accounts management access. Where the sole user who has this access is unavailable, you must give the accounts management Admin permission to a different user on the System Administration account or add a user to the account to do accounts management tasks.

Resolving account admin unavailable

Before you begin, you must have cluster administrator access with the Kubernetes command-line interface tool to the cluster where QRadar Suite Software is installed.

  1. To find the namespace where QRadar Suite Software is installed, run the following commands.

    NAMESPACE=$(oc get pod -lname=cp4s-extension --all-namespaces --no-headers | awk '{print $1}')
    echo $NAMESPACE
  2. To identify a running isc-entitlements pod, run the following commands:

    POD=$(oc get pods -lname=isc-entitlements -n=$NAMESPACE --no-headers | grep -i running | head -1 | awk '{print $1}')
    echo $POD
  3. Confirm that the result of step 2 is similar to the following output.

    isc-entitlements-c5bc499ff-2qwb5
    1. If the command in step 2 returns no output, run the oc get pods -lname=isc-entitlements command and verify that pods are present and are in the Running state.
    2. If no pods are in the Running state, see the instructions for running MustGather.
  4. To find the ID for the System Administration account, run the following command:

    oc exec $POD -n=$NAMESPACE -- node ./utilities/listAccounts.js

    Obtain the ID from the output as shown in the following example:

    ┌─────────┬────────────────────────────────────────┬──────────────────────────────────┐
    │ (index) │               Account ID               │           Account Name           │
    ├─────────┼────────────────────────────────────────┼──────────────────────────────────┤
    │    0    │ '88bb81d6-2e5a-4ca2-b5ee-f5d2f391d549' │     'System Administration'      │
    │    1    │ '9b0a1cb6-97b3-42aa-88e7-11e428eee301' │     'Test account 1'             │
    │    2    │ 'a6be0459-5a69-40fd-aea6-847a6792a781' │     'Test account 2'             │
    │    3    │ 'a8bb651d-5f9b-4447-802e-a1d3a514d988' │     'Test account 3'             │
    └─────────┴────────────────────────────────────────┴──────────────────────────────────┘
    
  5. Run the setAdminUser command with the user ID of the new Admin and the account ID of the System Administration account.

    oc exec $POD -n=$NAMESPACE -- node ./utilities/setAdminUser.js <username> <account_ID>

    If the user does not exist on the account, the command searches the Lightweight Directory Access Protocol (LDAP) directories that are connected to IBM Cloud Pak foundational services. If the specified user is found, that user is added to the System Administration account and given the accounts management Admin role on that account.

    The following log is an example of the output from the command:

    {"level":"info","message":"Getting connected LDAP directories...","ibm_datetime":"2021-09-14T12:04:22.635Z"}
    {"level":"info","message":"Searching LDAP directories for test.user@company.com...","ibm_datetime":"2021-09-14T12:04:23.411Z"}
    {"level":"info","message":"Found test.user@company.com: b6736880-eba5-11eb-934d-9d3351e73cde#uid=I00206754,c=ie,ou=bluepages,o=ibm.com, adding user to account","ibm_datetime":"2021-09-14T12:04:24.299Z"}
    {"level":"info","message":"User 'test.user@company.com' has been provisioned in account '88bb81d6-2e5a-4ca2-b5ee-f5d2f391d549'. Current subscription: a7a63e2e-d6cb-4ced-be7b-af7473d2138b","ibm_datetime":"2021-09-14T12:04:27.204Z"}
    {"level":"info","message":"Attempting to give test.user@company.com Admin role on all applications in account 88bb81d6-2e5a-4ca2-b5ee-f5d2f391d549","ibm_datetime":"2021-09-14T12:04:27.205Z"}
    {"level":"info","label":"system-account","message":"Attempt to set: \"test.user@company.com\" as Admin on 3 applications","ibm_datetime":"2021-09-14T12:04:27.425Z"}
    {"level":"info","message":"Complete","ibm_datetime":"2021-09-14T12:04:27.632Z"}

Validate the solution by verifying that the administrator who was added in the preceding procedure can access the System Administration account with accounts management Admin privileges.
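
Steps 1, 2, 4, and 5 can be chained in a small script. The following sketch reuses the oc commands from the procedure. The function names are hypothetical, and the sketch assumes that account IDs are UUIDs and that the account name is printed as 'System Administration', as in the sample output.

```shell
# Read the listAccounts.js table on stdin and print the ID of the
# 'System Administration' account.
# Assumption: account IDs are UUIDs, as in the sample output.
system_admin_account_id() {
  grep "'System Administration'" |
    grep -oE '[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}' |
    head -1
}

# $1: user name of the new accounts management Admin
grant_accounts_admin() {
  username=$1

  # Step 1: namespace where QRadar Suite Software is installed
  # (only the first match is used).
  NAMESPACE=$(oc get pod -lname=cp4s-extension --all-namespaces --no-headers |
    awk 'NR == 1 {print $1}')
  [ -n "$NAMESPACE" ] || { echo "namespace not found" >&2; return 1; }

  # Step 2: first isc-entitlements pod that is in the Running state.
  POD=$(oc get pods -lname=isc-entitlements -n="$NAMESPACE" --no-headers |
    grep -i running | head -1 | awk '{print $1}')
  [ -n "$POD" ] || { echo "no running isc-entitlements pod" >&2; return 1; }

  # Step 4: ID of the System Administration account.
  account_id=$(oc exec "$POD" -n="$NAMESPACE" -- node ./utilities/listAccounts.js |
    system_admin_account_id)
  [ -n "$account_id" ] || { echo "account ID not found" >&2; return 1; }

  # Step 5: give the user the accounts management Admin role.
  oc exec "$POD" -n="$NAMESPACE" -- node ./utilities/setAdminUser.js "$username" "$account_id"
}

# Example: grant_accounts_admin test.user@company.com
```

The function stops with an error message as soon as one step returns nothing, so that setAdminUser.js is never run with an empty pod name or account ID.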

Provider accounts remain in the Provisioning state

After IBM Security QRadar Suite Software is restored from a backup, Provider accounts remain in the Provisioning state.

The Account Management page shows that a Provider account and the Standard accounts in the Provider account are in the Provisioning state for more than an hour.

Resolving the provider accounts in the Provisioning state

Before you begin, you must have cluster administrator access to the cluster where QRadar Suite Software is installed. The cpctl remediation restore_account_crs command creates any missing custom resources (CR) that are used by the Entitlements service for Provider accounts and Standard accounts in Provider accounts.

  1. Install the command-line interface (CLI) utility cpctl from the cp-serviceability pod. For more information, see Installing the cpctl utility.

  2. Type the following command.
    cpctl remediation restore_account_crs

    The following output is an example of what you see when you run the restore_account_crs command.

    Execute the endpoint to restore all CR's for provider and client accounts...
      localhost ok | msg: OK (unknown bytes)
    Patch Deployment to Restart Entitlements Pods...
      localhost ok
    cp4s namespace...
    Patch Deployment to Restart Pods...
      localhost done
    Patch Deployment to Restart Entitlements Operator...
      localhost ok
    cp4s namespace...
    Patch Deployment to Restart Pods...
      localhost done
    - Play recap -
      localhost                  : ok=19   changed=2    unreachable=0    failed=0    rescued=0    ignored=0

The restore_account_crs action causes the Entitlements service to query the entitlements database for any Provider or client accounts. If the AppEntitlements and Offerings CRs for any of the accounts do not exist in Kubernetes, they are created, and the Entitlements service pods restart.
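
To check the result of the action in a script, you can inspect the Play recap line of the output. The following sketch assumes that the recap line has the format that is shown in the sample output. The function name and the restore.log file name are hypothetical.

```shell
# Read cpctl output on stdin. Succeed only if at least one recap line is
# present and every recap line reports unreachable=0 and failed=0.
# Assumption: recap lines look like the "Play recap" sample output.
recap_ok() {
  awk '
    /ok=[0-9]+/ {
      seen = 1
      if ($0 !~ /unreachable=0([^0-9]|$)/ || $0 !~ /failed=0([^0-9]|$)/) bad = 1
    }
    END { if (seen && !bad) exit 0; exit 1 }
  '
}

# Example:
#   cpctl remediation restore_account_crs | tee restore.log
#   recap_ok < restore.log && echo "restore_account_crs completed"
```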

Resource issues with the Entitlements service

Various symptoms can indicate resource depletion. Resource limitations affect overall system stability and performance.

The following symptoms can indicate resource depletion.
  • Users cannot log in after a cluster restart.
  • Pods that host deployments fail to reach the "Ready" state.
  • Error logs from the Entitlements service indicate operations that cannot be completed (upstream timeouts).
  • Error logs from other services indicate failure to communicate with the Entitlements service (request timeouts).

Causes of Entitlements service resource issues

The default configurations for the Entitlements service are sufficient to support a wide range of system workloads. Although you can change the default configuration, the remediation is complex. Ensure that you diagnose the issue carefully before you modify the configuration.

The following basic system resources can become depleted:
  • Data capacity for the persistent storage of entitlements data (CouchDB capacity).
  • Performance when data is retrieved from the persistent storage (CouchDB performance).
  • Transactional performance for application queries for user entitlements.
  • Transactional performance of the IBM Cloud Pak foundational service that provides username resolution and access to the Identity and Management services.

Data capacity and performance diagnosis

The Entitlements service uses a small portion of the common CouchDB instance that is shared across IBM Security QRadar Suite Software. If either the transactional or the storage capacities are exceeded, you might see various indicators across many application logs. Because the Entitlements service is central to the functioning of all other applications, these capacity problems are likely to manifest as unexpected errors in the Entitlements service logs.

The Entitlements service writes new records in the following situations:
  • Creation of new accounts.
  • Creation of new user IDs added to new or existing accounts.
  • Deletion of user IDs.
  • Deletion of accounts.

Modified records are not deleted; they are retired from use and retained to support replication. This retention means that the data store necessary to support the system can increase rapidly. Because the CouchDB resource is shared, excessive use by other components of the system can exhaust the storage capacity.

The Entitlements service also maintains a log that captures any changes. With this log, QRadar Suite Software can monitor and react to changes in the Entitlements system. As with the data records, these changes are never deleted, so this change log can grow indefinitely. For more information, see Checking the change log processing status.

Runtime queries to the Entitlements service use computational resources in the pods that run the CouchDB services. Investigate error messages that are received during any of these administrative actions to determine whether the error is caused by CPU capacity issues.

It is unlikely that the Entitlements service is the cause of CouchDB resource depletion. Of the 60 GB of data that is allocated to CouchDB, the Entitlements service uses only a few megabytes. However, the Entitlements service might be the first to show signs of CouchDB resource depletion.

Whenever the Entitlements service restarts, it reconciles and checks the integrity of the persistent information. If the Entitlements service cannot restart, users cannot log in. In this case, enter the following command.
oc get pods -l name=isc-entitlements
This command produces output similar to the following example to indicate startup failure:
NAME                                READY   STATUS    RESTARTS   AGE
isc-entitlements-66c5d46dcd-2pqnm   1/1     Running   0          3d14h
isc-entitlements-66c5d46dcd-pk8gn   0/1     Error     0          84s
Use the output from this report to inspect the logs from the Entitlements service for start errors and their cause, as shown in the following sample log report.
{"level":"error","label":"http-client.request-error","message":"GET /idmgmt/identity/api/v1/teams/9b749de3-a663-41c2-af78-3b736266d8ac +122ms","ibm_datetime":"2021-11-01T23:59:01.079Z"}

{"level":"error","message":"An error occurred trying to 'Retrieve Team' in Common Services. Reason: Internal Server Error","ibm_datetime":"2021-11-01T23:59:01.080Z"}

{"level":"error","label":"http-client.request-error","message":"GET /idmgmt/identity/api/v1/teams/c993d4dd-a6bc-4a83-857f-7255a9d9ec79 +85ms","ibm_datetime":"2021-11-01T23:59:01.219Z"}

{"level":"error","message":"An error occurred trying to 'Retrieve Team' in Common Services. Reason: Internal Server Error","ibm_datetime":"2021-11-01T23:59:01.219Z"}

This sample log report contains clear references to errors in communicating with the identity management service of Common Services (IBM Cloud Pak foundational services).
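
The pod check and the log inspection can be sketched as two small helpers. The function names are hypothetical. The sketch assumes that the current project is the namespace where QRadar Suite Software is installed and that log entries are single-line JSON objects, as in the sample log report.

```shell
# Read "oc get pods" output on stdin and print the names of pods that are
# not fully ready or are not in the Running state.
unhealthy_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")   # READY column, for example 0/1
    if (ready[1] != ready[2] || $3 != "Running") print $1
  }'
}

# Print only the error-level entries from the log of the pod named in $1.
error_entries() {
  oc logs "$1" | grep '"level":"error"'
}

# Example:
#   oc get pods -l name=isc-entitlements | unhealthy_pods |
#     while read -r pod; do error_entries "$pod"; done
```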

By default, the Entitlements service is configured to log only messages with the "error" severity. In some situations, more detail might be needed to diagnose a problem. To turn on an "info" level of logging, update the Entitlements service deployment environment by using the following command and then restart the Entitlements service.
oc set env deployment/isc-entitlements DEBUG_LEVEL=info
To restart the service, scale to zero replicas first, then scale to two replicas. After the pods restart, the logs look similar to the following example.
{"level":"info","message":"Version:1.9.0. Production mode:true. Log level:info. ROKS enabled:false","ibm_datetime":"2021-11-02T05:04:36.833Z"}
{"level":"info","message":"App started","ibm_datetime":"2021-11-02T05:04:36.844Z"}
{"level":"info","label":"couchdb.initialize","message":"All required Offerings present in BoM","ibm_datetime":"2021-11-02T05:04:36.939Z"}
{"level":"info","label":"couchdb","message":"Couch url: https://GYGs4iXe:mBJ4UsDgif@default-couchdbcluster.cp4s.svc.cluster.local","ibm_datetime":"2021-11-02T05:04:36.939Z"}
{"level":"info","label":"couchdb","message":"Successfully instantiate EntitlementsDB","ibm_datetime":"2021-11-02T05:04:37.029Z"}
{"level":"info","message":"Couch url: https://GYGs4iXe:mBJ4UsDgif@default-couchdbcluster.cp4s.svc.cluster.local","ibm_datetime":"2021-11-02T05:04:37.029Z"}
{"level":"info","message":"Successfully instantiate ChangeLogDB","ibm_datetime":"2021-11-02T05:04:37.030Z"}
{"level":"info","label":"redis-client","message":"Connection established to the Redis server","ibm_datetime":"2021-11-02T05:04:37.134Z"}
{"level":"info","label":"redis-client","message":"Ready to communicate with the Redis server","ibm_datetime":"2021-11-02T05:04:37.137Z"}
{"level":"info","label":"couchdb","message":"Successfully tested connection to database","ibm_datetime":"2021-11-02T05:04:37.164Z"}

For more information about extracting logs from the system, see Extracting logs from the EFK stack.
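
The logging change and the restart can be sketched as one function. The function name is hypothetical, and the sketch assumes that the current project is the namespace where QRadar Suite Software is installed. The replica count of two is the default for the Entitlements service deployment.

```shell
# Switch the Entitlements service to info-level logging and restart it.
# Assumption: the current project is the QRadar Suite Software namespace.
enable_info_logging() {
  oc set env deployment/isc-entitlements DEBUG_LEVEL=info || return 1
  # Restart: scale to zero replicas first, then back to two replicas.
  oc scale --replicas=0 deployment/isc-entitlements || return 1
  oc scale --replicas=2 deployment/isc-entitlements || return 1
  # Wait until the pods are rolled out before you read the logs.
  oc rollout status deployment/isc-entitlements
}
```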

Transactional capacity and performance diagnosis

Transactional performance and capacity of the Entitlements service are divided into two major categories.
  1. Transactions that are related to onboarding accounts, applications, and users
  2. Transactions that are related to retrieving user entitlements and account information from the persistent storage that underlies the Entitlements service

The design of the Entitlements service emphasizes the performance of entitlements retrieval over onboarding and user entitlements changes.

Overloading the Entitlements service with queries from applications might result in slow performance. Critically slow performance is indicated by Upstream Time Outs messages that are reported by functional applications. These timeouts indicate transactional capacity issues and are attributed to the Entitlements Service by the application's error logs.

Important: Upstream Time Outs reports in the Entitlements service logs might also be symptoms of overall transaction limitations. However, these reports are more likely to be a hard failure of the upstream system.

Depending on the diagnosis, use the following instructions to resolve the problem.

Increasing the capacity of CouchDB

The capacity of the backing store for the Entitlements service is dictated by the size of the Persistent Volume Claim (PVC) for the deployment of CouchDB. For more information, see Extracting logs from the EFK stack.

Important: Before you adjust the PVC, you must determine the cause of the resource depletion.

Improving the performance of CouchDB

QRadar Suite Software is configured to load-balance CouchDB queries and updates across three replicas of the database service. Increasing the number of replicas does not result in substantial performance improvement; changing this configuration is not supported. If your system has this kind of performance issue, contact IBM Support.

Improving the transactional capacity of the Entitlements service

The default configuration for QRadar Suite Software allocates two replicas for the Entitlements service deployment. Increasing the replica count provides more system resources for processing user and account management operations. However, increasing to more than three replicas results in only a small performance improvement, and is not recommended. To temporarily increase the allocation of the Entitlements service replicas, run the following command.
oc scale --replicas=3 deployment/isc-entitlements
To permanently increase the allocation, use the oc patch command on the deployment of the Entitlements service as shown in the following example:
oc patch deployment isc-entitlements -p '{"spec":{"replicas":3}}'
This command increases the number of replicas from the default value of 2 to a new value of 3.
Important: After a system reinstall or upgrade, you must reapply any change to the replica count because the reinstall or upgrade reverts the replica count to the current default.
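
Because the change must be reapplied after a reinstall or upgrade, it can help to script it. The following sketch patches the deployment only when the current replica count is lower than the requested count, and it caps the request at three replicas. The function name is hypothetical, and the sketch assumes that the current project is the namespace where QRadar Suite Software is installed.

```shell
# Increase the Entitlements service replica count, but never above three,
# because more than three replicas is not recommended.
# $1: requested replica count (defaults to 3)
increase_entitlements_replicas() {
  wanted=${1:-3}
  if [ "$wanted" -gt 3 ]; then
    echo "more than three replicas is not recommended; using 3" >&2
    wanted=3
  fi
  current=$(oc get deployment isc-entitlements -o jsonpath='{.spec.replicas}') || return 1
  if [ "$current" -ge "$wanted" ]; then
    echo "already at $current replicas"
    return 0
  fi
  oc patch deployment isc-entitlements -p "{\"spec\":{\"replicas\":$wanted}}"
}
```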

If the increased replica count for the service does not result in sufficiently improved performance, contact IBM Support.

Entitlement licensing cronjob pod is in CreateContainerConfigError state

On a Red Hat OpenShift Container Platform cluster with multiple IBM Cloud® Paks and multiple versions of foundational services installed, the pod for the Entitlement licensing cronjob in the QRadar Suite Software namespace is not in the running state and shows a CreateContainerConfigError error.