IBM Support

IBM ESS Technote: IBM Elastic Storage System 3000 may see reboot while running heavy I/O workload

Troubleshooting


Problem

An IBM ESS 3000 canister may reboot while running a heavy I/O workload.

Symptom

In an IBM ESS 3000 system, the pemsmod module may hit a hard lockup, which can lead to a kernel crash and a system reboot.
The vmcore-dmesg.txt file will contain the following messages:
Kernel panic - not syncing: Hard LOCKUP
CPU: 27 PID: 14563 Comm: pemsRollUpQueue Kdump: loaded Tainted: G
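
As an illustration only, the signature above could be checked for with a short script. This is a minimal sketch, not an IBM-provided tool: the function name matches_pems_lockup and the three-line search window are assumptions made for the example, and the location of vmcore-dmesg.txt depends on how kdump is configured on the canister.

```python
import re

# Panic line exactly as shown in this technote.
PANIC_RE = re.compile(r"Kernel panic - not syncing: Hard LOCKUP")
# Comm field of the crashing thread. Matched case-insensitively because the
# thread name is written with different capitalisation in this technote.
COMM_RE = re.compile(r"Comm:\s*pemsRollUpQueue", re.IGNORECASE)


def matches_pems_lockup(dmesg_text: str) -> bool:
    """Return True if a hard-lockup panic is closely followed by the pemsRollUpQueue Comm line."""
    lines = dmesg_text.splitlines()
    for i, line in enumerate(lines):
        if PANIC_RE.search(line):
            # Assumption: the CPU/PID/Comm line appears within the next
            # three lines after the panic line.
            window = lines[i + 1 : i + 4]
            if any(COMM_RE.search(nxt) for nxt in window):
                return True
    return False
```

To use it, read the contents of vmcore-dmesg.txt from the crash directory into a string and pass that string to matches_pems_lockup. A True result only indicates that the log resembles the symptom described here; it does not replace vmcore analysis by IBM Service.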

Cause

A system reboot can happen while the system is running a heavy I/O workload. Vmcore analysis will show a hard lockup in a pemsmod thread named pemsRollUpQueue.

Environment

All versions of IBM ESS 3000 earlier than V6.1.1.2.

Resolving The Problem

Users running IBM ESS 3000 V6.0.0.0 through V6.1.1.1 should apply IBM ESS 3000 V6.1.1.2 or later, available from IBM Fix Central.
If you cannot apply one of the above PTF levels, contact IBM Service to obtain and apply an efix for your level of code:
     - For IBM ESS 3000 V6.0.0.0 through V6.0.2.2, reference APAR IJ34813
     - For IBM ESS 3000 V6.1.0.0 through V6.1.1.1, reference APAR IJ34393
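
The version ranges above can be sketched as a small lookup, for example when checking many systems. This is a hypothetical helper written for illustration: the function names parse_version and remediation and the wording of the returned strings are assumptions, and only the version boundaries and APAR numbers come from this technote. Levels that fall outside both efix ranges are reported without an APAR rather than guessed.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn 'V6.1.1.1' or '6.1.1.1' into the tuple (6, 1, 1, 1)."""
    return tuple(int(part) for part in text.strip().lstrip("Vv").split("."))


# First level that contains the fix, per this technote.
FIXED_LEVEL = (6, 1, 1, 2)

# Inclusive version ranges and the APAR to reference for an efix.
EFIX_RANGES = [
    ((6, 0, 0, 0), (6, 0, 2, 2), "IJ34813"),
    ((6, 1, 0, 0), (6, 1, 1, 1), "IJ34393"),
]


def remediation(version: str) -> str:
    """Describe the action this technote gives for a given ESS 3000 level."""
    v = parse_version(version)
    if v >= FIXED_LEVEL:
        return "not affected: already at V6.1.1.2 or later"
    for low, high, apar in EFIX_RANGES:
        if low <= v <= high:
            return f"apply V6.1.1.2 or later, or request an efix under APAR {apar}"
    # Level is below the fixed level but not in a listed efix range.
    return "apply V6.1.1.2 or later; no efix APAR is listed for this level"
```

For example, remediation("V6.0.2.2") points to APAR IJ34813, while remediation("6.1.1.2") reports that the level already contains the fix.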

Document Location

Worldwide

Document Information

Modified date:
07 October 2021

UID

ibm16493365