Training does not complete after upgrade

Training fails after upgrade because the policyset table has extraneous rows that slow processing.

Problem

When Netcool® Operations Insight® on OpenShift® is upgraded, training sometimes fails with the following error in the policy registry service logs:
Server timeout during read query at consistency LOCAL_ONE (0 replica(s) responded over 1 required)

Cause

Training cannot complete because the policyset table has too many rows. The policy registry service deduplicates entries on insertion, and the queries that it runs to do so slow down as the number of rows in the policyset table grows.

Resolution

Clean up the data in the policyset and policies tables with the following procedure.
  1. As an administrator user, log in to one of the Cassandra pods.
    oc exec -ti <release_name>-cassandra-0 -- bash
    Where <release_name> is the name of your deployment, as specified by the value used for name (Operator Lifecycle Manager UI Form view), or name in the metadata section of the noi.ibm.com_noihybrids_cr.yaml or noi.ibm.com_nois_cr.yaml files (YAML view).
  2. Start Cassandra query language.
    CASSANDRA_HOME/bin>./cqlsh
  3. Get the policyset names with the following query.
    select policyset from ea_policies.policyset where tenantid='cfd95b7e-3bc7-4006-a4a8-a73a79c71255' and groupid='analytics.temporal-patterns';
  4. Delete unwanted policy entries from the policies table.
    
    delete from ea_policies.policies where tenantid='cfd95b7e-3bc7-4006-a4a8-a73a79c71255' and partitionid in (0,1,2,3,4,5,6,7,8,9) and policyset in ('name1','name2',...,'nameN');
    Where 'name1','name2',...,'nameN' are the policyset names that were returned by the previous step.
  5. Delete unwanted rows from the policyset table.
    
    delete from ea_policies.policyset where tenantid='cfd95b7e-3bc7-4006-a4a8-a73a79c71255' and groupid='analytics.temporal-patterns';
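
Typing the quoted, comma-separated name list for step 4 by hand is error-prone when step 3 returns many rows. The following is a minimal sketch of a shell helper that builds the statement. The function name build_policy_delete is hypothetical and not part of the product. It assumes that the policyset names are supplied one per line on standard input and that no name contains a single quote. It only prints the statement. It does not run it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the product): builds the step 4 DELETE
# statement from policyset names read one per line on standard input.
# Assumption: no policyset name contains a single quote character.
build_policy_delete() {
  local tenant="$1" names="" name
  # 'read' without a custom IFS trims the leading and trailing whitespace
  # that cqlsh adds around values; the '||' handles a final line that has
  # no trailing newline.
  while read -r name || [ -n "$name" ]; do
    # Skip blank lines.
    [ -z "$name" ] && continue
    # Append the name in single quotes, comma-separated.
    names="${names:+$names,}'$name'"
  done
  # Refuse to emit a statement with an empty IN list.
  [ -n "$names" ] || { echo "no policyset names given" >&2; return 1; }
  printf "delete from ea_policies.policies where tenantid='%s' and partitionid in (0,1,2,3,4,5,6,7,8,9) and policyset in (%s);\n" "$tenant" "$names"
}

# Example use, with names.txt holding the names that step 3 returned:
#   build_policy_delete 'cfd95b7e-3bc7-4006-a4a8-a73a79c71255' < names.txt
```

Review the printed statement before you paste it into cqlsh, because the deletion cannot be undone.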