Error 'BUG: soft lockup - CPU#1 stuck for 67s! [swapper:1]' when trying to reboot Guardium VM



When trying to reboot the Central Manager appliance hosted on VMware (running CentOS), the reboot fails during the OS boot phase with the error below.
BUG: soft lockup - CPU#1 stuck for 67s! [swapper:1]


The message is more of a warning about a kernel lockup, raised when something has gone wrong. It indicates that the CPU did not respond to the soft lockup timer (interrupt) within the timer window.

A 'softlockup' is defined as a bug that causes the kernel to loop in kernel mode for more than 20 seconds without giving other tasks a chance to run.

A soft lockup can occur when the machine goes into a loop with interrupts turned off. This commonly happens if a device driver uses spinlocks improperly, but the error can also be caused by faulty hardware.

Diagnosing The Problem

A root login is required.
Check /var/log/messages for the error.
Depending on the CentOS version, check the value of the kernel setting softlockup_thresh or watchdog_thresh (found in /proc/sys/kernel).
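
The checks above can be scripted. The following is a minimal Python sketch, not part of the original procedure: the function names are hypothetical, and the message pattern is assumed from the single error line quoted in this document. It extracts soft lockup entries from log lines and reports whichever threshold file is present under /proc/sys/kernel.

```python
import re
from pathlib import Path
from typing import Iterable, List, NamedTuple, Optional, Tuple

# Pattern assumed from the message quoted above:
#   BUG: soft lockup - CPU#1 stuck for 67s! [swapper:1]
SOFT_LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<task>.+):(?P<pid>\d+)\]"
)


class SoftLockup(NamedTuple):
    cpu: int
    seconds: int
    task: str
    pid: int


def find_soft_lockups(lines: Iterable[str]) -> List[SoftLockup]:
    """Return one record per log line that contains a soft lockup message."""
    hits = []
    for line in lines:
        m = SOFT_LOCKUP_RE.search(line)
        if m:
            hits.append(
                SoftLockup(int(m["cpu"]), int(m["secs"]), m["task"], int(m["pid"]))
            )
    return hits


def read_lockup_threshold(
    proc_dir: str = "/proc/sys/kernel",
) -> Optional[Tuple[str, int]]:
    """Return (file name, value) for the first threshold file that exists.

    watchdog_thresh is tried first, then softlockup_thresh; which one is
    present depends on the kernel version. Returns None if neither exists.
    """
    for name in ("watchdog_thresh", "softlockup_thresh"):
        path = Path(proc_dir) / name
        if path.is_file():
            return name, int(path.read_text().strip())
    return None
```

Run as root, `find_soft_lockups(open("/var/log/messages"))` would list the affected CPU, stall duration, and task for each logged occurrence, and `read_lockup_threshold()` shows the value currently in effect.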

Resolving The Problem

If the cause is not faulty hardware, tune /proc/sys/kernel/softlockup_thresh on older kernels.

On newer CentOS kernels, tune /proc/sys/kernel/watchdog_thresh.

Note: The setting takes effect on the next reboot.


To resolve the issue, add this line to /etc/sysctl.conf and then reboot:





# Controls the maximum size of a message, in bytes

kernel.msgmax = 65536

# Controls the maximum shared segment size, in bytes

kernel.shmmax = 68719476736

# Controls interval between generating an NMI perf monitoring interrupt that kernel uses to check for soft-lockup errors.
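
Before rebooting, it can help to confirm what /etc/sysctl.conf actually sets. The following is a minimal Python sketch, not part of the original instructions: the function names are hypothetical, and it only reads the file contents passed to it. It assumes the usual sysctl.conf layout of `key = value` lines, with comment lines starting with `#` or `;`, and it reports any lockup threshold entry it finds without assuming a particular value.

```python
from typing import Dict, Iterable

# sysctl keys that correspond to the two /proc/sys/kernel files named above
LOCKUP_KEYS = ("kernel.watchdog_thresh", "kernel.softlockup_thresh")


def parse_sysctl(lines: Iterable[str]) -> Dict[str, str]:
    """Parse sysctl.conf-style lines into a {key: value} dictionary."""
    settings: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and comment lines
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key = value line
        settings[key.strip()] = value.strip()
    return settings


def lockup_threshold_entries(settings: Dict[str, str]) -> Dict[str, str]:
    """Return only the lockup threshold entries, if any are configured."""
    return {k: v for k, v in settings.items() if k in LOCKUP_KEYS}
```

For example, `lockup_threshold_entries(parse_sysctl(open("/etc/sysctl.conf")))` returns an empty dictionary when no threshold line has been added yet.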





