Training does not complete after upgrade
Training fails after upgrade because the policyset table has extraneous rows, which slow processing.
Problem
When Netcool® Operations Insight® on OpenShift® is upgraded, training sometimes fails with the following error in the policy registry service logs:
Server timeout during read query at consistency LOCAL_ONE (0 replica(s) responded over 1 required)
Cause
Training cannot complete because the policyset table has too many rows. The policy registry service deduplicates entries on insertion, and its queries are slowed by the number of rows in the policyset table.
Resolution
Clean up the data in the policyset and policies tables with the following procedure.
- As an administrator user, log in to one of the Cassandra pods.
oc exec -ti <release_name>-cassandra-0 bash
Where <release_name> is the name of your deployment, as specified by the value used for name (Operator Lifecycle Manager UI Form view), or name in the metadata section of the noi.ibm.com_noihybrids_cr.yaml or noi.ibm.com_nois_cr.yaml files (YAML view).
- Start the Cassandra Query Language shell (cqlsh).
CASSANDRA_HOME/bin>./cqlsh
- Get the policyset names with the following query.
select policyset from ea_policies.policyset where tenantid='cfd95b7e-3bc7-4006-a4a8-a73a79c71255' and groupid='analytics.temporal-patterns';
- Delete unwanted policy entries from the policies table.
delete from ea_policies.policies where tenantid='cfd95b7e-3bc7-4006-a4a8-a73a79c71255' and partitionid in (0,1,2,3,4,5,6,7,8,9) and policyset in ('name1','name2',...,'nameN');
Where 'name1','name2',...,'nameN' are the policyset names that were returned by the previous step.
- Delete unwanted rows from the policyset table.
delete from ea_policies.policyset where tenantid='cfd95b7e-3bc7-4006-a4a8-a73a79c71255' and groupid='analytics.temporal-patterns';
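When the policyset query returns many names, typing the IN list for the policies delete by hand is error-prone. The following is a minimal sketch, not part of the product, that assembles that statement from a list of names so that it can be pasted into cqlsh. The helper names cql_literal and build_policies_delete are hypothetical. The sketch assumes the tenant ID and the partition IDs 0 to 9 shown in the procedure above.

```python
from typing import Iterable, Sequence


def cql_literal(value: str) -> str:
    """Quote a string as a CQL text literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_policies_delete(
    tenant_id: str,
    policyset_names: Sequence[str],
    partition_ids: Iterable[int] = range(10),
) -> str:
    """Build the delete statement for ea_policies.policies.

    Hypothetical helper: it only formats text and does not connect to Cassandra.
    """
    if not policyset_names:
        raise ValueError("no policyset names supplied")
    names = ",".join(cql_literal(name) for name in policyset_names)
    partitions = ",".join(str(p) for p in partition_ids)
    return (
        "delete from ea_policies.policies "
        f"where tenantid={cql_literal(tenant_id)} "
        f"and partitionid in ({partitions}) "
        f"and policyset in ({names});"
    )


if __name__ == "__main__":
    # 'name1' and 'name2' stand in for the names returned by the select query.
    print(build_policies_delete(
        "cfd95b7e-3bc7-4006-a4a8-a73a79c71255",
        ["name1", "name2"],
    ))
```

Review the generated statement before you run it, because the delete cannot be undone.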