IBM Support

Use of incident creation hold-off can cause alert/incident processing to fail after installing or upgrading to IBM Cloud Pak for AIOps 4.9.0

Flashes (Alerts)


Abstract

When a hold-off time is set for one or more incident creation policies, alert and incident processing can fail after some time.

Content

Symptoms

The symptoms are:

  1. New alerts and incidents are not processed, or the same set of older alerts and incidents is processed repeatedly with no forward progress.
  2. One or more incident creation policies have been created with a hold-off time. The issue is more likely with longer hold-off times, or with many policies that have hold-off times.
  3. The `aiops-lifecycle-flink-taskmanager-` pods are in a restart loop.
  4. The following error is repeated in the logs of the `aiops-lifecycle-flink-taskmanager-` pods:
Caused by: com.esotericsoftware.kryo.KryoException: com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Unable to create serializer "com.esotericsoftware.kryo.serializers.FieldSerializer" for class: com.ibm.aiops.lifecycle.sdk.policies.models.ActionAddress
Serialization trace:
actionsToExecute (com.ibm.aiops.lifecycle.policy.models.v2.PolicyExecutionRequest)
	at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:82)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
	at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:599)
	at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:82)
	at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:22)
	at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:599)
	at org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.serialize(KryoSerializer.java:356)
	at org.apache.flink.api.common.typeutils.base.MapSerializer.serialize(MapSerializer.java:142)
	at org.apache.flink.api.common.typeutils.base.MapSerializer.serialize(MapSerializer.java:44)
	at org.apache.flink.runtime.state.heap.CopyOnWriteStateMapSnapshot.writeState(CopyOnWriteStateMapSnapshot.java:147)
	at org.apache.flink.runtime.state.heap.AbstractStateTableSnapshot.writeStateInKeyGroup(AbstractStateTableSnapshot.java:116)
	at org.apache.flink.runtime.state.heap.CopyOnWriteStateTableSnapshot.writeStateInKeyGroup(CopyOnWriteStateTableSnapshot.java:38)
	at org.apache.flink.runtime.state.heap.HeapSnapshotStrategy.lambda$asyncSnapshot$3(HeapSnapshotStrategy.java:172)
	at org.apache.flink.runtime.state.SnapshotStrategyRunner$1.callInternal(SnapshotStrategyRunner.java:91)
	at org.apache.flink.runtime.state.SnapshotStrategyRunner$1.callInternal(SnapshotStrategyRunner.java:88)
	at org.apache.flink.runtime.state.AsyncSnapshotCallable.call(AsyncSnapshotCallable.java:78)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at org.apache.flink.util.concurrent.FutureUtils.runIfNotDoneAndGet(FutureUtils.java:508)
	... 7 more
Caused by: java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Unable to create serializer "com.esotericsoftware.kryo.serializers.FieldSerializer" for class: com.ibm.aiops.lifecycle.sdk.policies.models.ActionAddress
	at com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:48)
	at com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:26)
	at com.esotericsoftware.kryo.Kryo.newDefaultSerializer(Kryo.java:351)
	at com.twitter.chill.KryoBase.newDefaultSerializer(KryoBase.scala:58)
	at com.esotericsoftware.kryo.Kryo.getDefaultSerializer(Kryo.java:344)
	at com.esotericsoftware.kryo.util.DefaultClassResolver.registerImplicit(DefaultClassResolver.java:56)
	at com.esotericsoftware.kryo.Kryo.getRegistration(Kryo.java:461)
	at com.twitter.chill.KryoBase.getRegistration(KryoBase.scala:52)
	at com.esotericsoftware.kryo.Kryo.getSerializer(Kryo.java:476)
	at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:69)
	at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:22)
	at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
	at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
	... 24 more
Caused by: java.lang.reflect.InvocationTargetException: java.lang.reflect.InvocationTargetException
	at jdk.internal.reflect.GeneratedConstructorAccessor674.newInstance(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:500)
	at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:481)
	at com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:35)
	... 36 more
Caused by: java.lang.UnsupportedOperationException: java.lang.UnsupportedOperationException: can't get field offset on a record class: private final int com.ibm.aiops.lifecycle.sdk.policies.models.ActionAddress.address
	at jdk.unsupported/sun.misc.Unsafe.objectFieldOffset(Unsafe.java:648)
	at com.esotericsoftware.kryo.serializers.UnsafeCacheFields$UnsafeIntField.<init>(UnsafeCacheFields.java:34)
	at com.esotericsoftware.kryo.serializers.UnsafeCachedFieldFactory.createCachedField(UnsafeCachedFieldFactory.java:23)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.newMatchingCachedField(FieldSerializer.java:375)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.newCachedField(FieldSerializer.java:343)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.createCachedFields(FieldSerializer.java:307)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.rebuildCachedFields(FieldSerializer.java:239)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.rebuildCachedFields(FieldSerializer.java:156)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.<init>(FieldSerializer.java:133)
	... 41 more
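
The trace above contains a few distinctive lines that can be searched for in saved log output. The following is a minimal sketch, not an IBM-provided tool: the function name and the choice of marker strings are illustrative assumptions, and the markers are copied from the trace shown in this document.

```python
# Substrings copied from the stack trace in this document.
# Which lines to treat as the "signature" is an assumption of this sketch.
SIGNATURE_MARKERS = (
    "com.esotericsoftware.kryo.KryoException",
    'Unable to create serializer "com.esotericsoftware.kryo.serializers.FieldSerializer"',
    "com.ibm.aiops.lifecycle.sdk.policies.models.ActionAddress",
    "can't get field offset on a record class",
)


def matches_holdoff_signature(log_text: str) -> bool:
    """Return True only if every marker string appears in the log text."""
    return all(marker in log_text for marker in SIGNATURE_MARKERS)
```

The log text can be collected in any way, for example by saving the output of `oc logs` for one of the `aiops-lifecycle-flink-taskmanager-` pods to a file and reading that file into a string. The namespace to use is not stated in this document and depends on your installation.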
Environment
IBM Cloud Pak for AIOps 4.9.0 when using a hold-off time for incident creation.
Diagnosing The Problem
If you see the following symptoms, you are experiencing this issue:
  1. New alerts and incidents are not processed, or the same set of older alerts and incidents is processed repeatedly with no forward progress.
  2. One or more incident creation policies have been created with a hold-off time. The issue is more likely with longer hold-off times, or with many policies that have hold-off times.
  3. The `aiops-lifecycle-flink-taskmanager-` pods are in a restart loop.
  4. There is an error repeated in the logs of the `aiops-lifecycle-flink-taskmanager-` pods.
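
The restart-loop symptom can be checked from a pod listing in JSON form. The sketch below is illustrative only: it assumes the input is the standard Kubernetes pod list produced by `oc get pods -o json`, and the function name and the `min_restarts` threshold of 3 are arbitrary assumptions, not values from IBM.

```python
import json

# Pod name prefix taken from this document.
POD_PREFIX = "aiops-lifecycle-flink-taskmanager-"


def restarting_taskmanagers(pods_json: str, min_restarts: int = 3) -> dict[str, int]:
    """Map each matching pod name to its highest container restart count.

    Only pods whose name starts with POD_PREFIX and whose restart count
    is at least min_restarts (an assumed threshold) are returned.
    """
    suspects: dict[str, int] = {}
    for pod in json.loads(pods_json).get("items", []):
        name = pod.get("metadata", {}).get("name", "")
        if not name.startswith(POD_PREFIX):
            continue
        # containerStatuses can be absent for pods that have not started yet.
        statuses = pod.get("status", {}).get("containerStatuses") or []
        restarts = max((s.get("restartCount", 0) for s in statuses), default=0)
        if restarts >= min_restarts:
            suspects[name] = restarts
    return suspects
```

A non-empty result only indicates that the task manager pods are restarting; the error in the pod logs is still needed to confirm that this specific issue is the cause.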
Resolving the Problem
This issue has been resolved in IBM Cloud Pak for AIOps 4.9.1.

We recommend that clients currently on IBM Cloud Pak for AIOps 4.9.0 do one of the following:

  1. Upgrade to IBM Cloud Pak for AIOps 4.9.1, or
  2. Contact Support to request the hotfix.

If you are planning an upgrade to IBM Cloud Pak for AIOps 4.9.x, we suggest upgrading directly to IBM Cloud Pak for AIOps 4.9.1.

To upgrade an online deployment of IBM Cloud Pak® for AIOps 4.8.0 or later to 4.9.1, use the instructions at the following link:

https://www.ibm.com/docs/en/cloud-paks/cloud-pak-aiops/4.9.1?topic=upgrading-online-console


 


Document Information

Modified date:
16 May 2025

UID

ibm17233730