Flashes (Alerts)
Abstract
QEMU guests on PPC64 might hang during startup with soft-lockups if the NUMA topology creates partially overlapping CPU masks.
Content
Linux Releases Affected
SLES-16 PPC64LE Kernels 6.12.0-160000.16-default or later.
IBM Systems Affected
Any IBM Power System LPAR that supports and runs a KVM guest.
Symptoms
Guest system hangs during early boot.
In these cases, the guest might stop responding during startup and show the following lockup messages:
```
Quiescing Open Firmware ...
Booting Linux via __start() @ 0x0000000000250000 ...
[ 24.403004][ C0] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1]
[ 48.502675][ C0] watchdog: BUG: soft lockup - CPU#0 stuck for 45s! [swapper/0:1]
[ 60.770656][ C0] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 60.771038][ C0] rcu: (detected by 0, t=6002 jiffies, g=-1171, q=2 ncpus=8)
[ 60.771257][ C0] rcu: All QSes seen, last rcu_sched kthread activity 6002 (4294943345-4294937343), jiffies_till_next_fqs=1, root ->qsmask 0x0
[ 60.771673][ C0] rcu: rcu_sched kthread starved for 6002 jiffies! g-1171 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
[ 60.771818][ C0] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
```
Root cause
Before the fix in commit f55dac1dafb3334be1, the kernel correctly identified partial overlaps between CPU masks, treated the topology as invalid, and returned early from `build_sched_domains()` so that no invalid scheduling domains were created.
After that fix, the scheduler can compare the wrong CPU masks, which lets partial overlaps pass the check. The result is an invalid domain hierarchy, which leads to soft lockups.
A partial overlap occurs when two domains share some CPUs but are not fully nested or fully separate.
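The definition can be expressed as a small set comparison. The following is a minimal sketch, not kernel code: the function name `classify_masks` is hypothetical, and CPU masks are modelled as plain Python sets of CPU ids.

```python
def classify_masks(a, b):
    """Classify the relationship between two CPU masks given as sets of CPU ids."""
    a, b = set(a), set(b)
    if a.isdisjoint(b):
        return "disjoint"          # fully separate: no CPU in common
    if a <= b or b <= a:
        return "nested"            # one mask fully contains the other
    return "partial overlap"       # some CPUs shared, but neither contains the other


print(classify_masks({0, 1}, {2, 3}))
print(classify_masks({0, 1}, {0, 1, 2, 3}))
print(classify_masks({0, 1, 2}, {0, 1, 3}))
```

Only the last pair is a partial overlap. The first two relationships are the ones that a valid domain hierarchy is expected to contain.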
Example
```
-numa dist,src=0,dst=0,val=10 -numa dist,src=0,dst=1,val=10 -numa dist,src=0,dst=2,val=10 -numa dist,src=0,dst=3,val=10
-numa dist,src=1,dst=0,val=10 -numa dist,src=1,dst=1,val=10 -numa dist,src=1,dst=2,val=10 -numa dist,src=1,dst=3,val=10
-numa dist,src=2,dst=0,val=10 -numa dist,src=2,dst=1,val=10 -numa dist,src=2,dst=2,val=10 -numa dist,src=2,dst=3,val=11
-numa dist,src=3,dst=0,val=10 -numa dist,src=3,dst=1,val=10 -numa dist,src=3,dst=2,val=11 -numa dist,src=3,dst=3,val=10
```
Node 2 and Node 3 partially overlap:
Node 2: CPU mask {0,1,2}
Node 3: CPU mask {0,1,3}
This topology should be rejected, but after the fix it is treated as valid, which leads to soft lockups.
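The overlap in this example can be reproduced from the distance values alone. The sketch below is illustrative and makes two assumptions that are not stated in this document: a node's mask is taken to be the set of nodes within a given distance of it, and each node is assumed to hold a single CPU with the same id as the node. The helper names `masks_at_distance` and `partial_overlaps` are hypothetical.

```python
from itertools import combinations

# Distance matrix from the example above: DIST[src][dst]
DIST = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 11],
    [10, 10, 11, 10],
]


def masks_at_distance(dist, limit):
    """For each node, return the set of nodes whose distance is <= limit."""
    return [frozenset(j for j, d in enumerate(row) if d <= limit) for row in dist]


def partial_overlaps(masks):
    """Return index pairs whose masks intersect without one containing the other."""
    bad = []
    for (i, a), (j, b) in combinations(enumerate(masks), 2):
        if a & b and not (a <= b or b <= a):
            bad.append((i, j))
    return bad


masks = masks_at_distance(DIST, 10)
print(sorted(masks[2]), sorted(masks[3]))  # masks for Node 2 and Node 3
print(partial_overlaps(masks))             # pairs that partially overlap
```

Under these assumptions, the only pair reported is Node 2 and Node 3, which matches the masks listed above.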
Note: This is a synthetic scenario, uncommon on real hardware. Physical NUMA systems usually have hierarchical or symmetric distances.
Workaround
Using symmetric NUMA distances between nodes, along with NUMA topologies where CPU masks either fully overlap or are disjoint, can help prevent this issue.
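One way to follow this guidance is to generate the distance options programmatically rather than writing them by hand. The sketch below is a hypothetical helper, not part of QEMU: it emits `-numa dist` options in the same form as the example above, using a single local distance and a single remote distance. The values 10 and 20 are assumptions chosen for illustration. With such a uniform matrix, the nodes reachable at the local distance are disjoint, and the nodes reachable at the remote distance cover every node.

```python
def numa_dist_args(num_nodes, local=10, remote=20):
    """Build symmetric '-numa dist' options: 'local' on the diagonal, 'remote' elsewhere."""
    args = []
    for src in range(num_nodes):
        for dst in range(num_nodes):
            val = local if src == dst else remote
            args.append(f"-numa dist,src={src},dst={dst},val={val}")
    return args


print(" ".join(numa_dist_args(2)))
```

The generated options still need to be combined with the rest of the guest's NUMA configuration, which this sketch does not cover.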
Fix Outlook
SUSE mirrored bug number: 1246843
You can see the fix in the upstream master kernel:
https://github.com/torvalds/linux/commit/661f951e371cc134ea31c84238dbdc9a898b8403
The fix will be provided in a future release.
I/O device impacted
None.
Document Information
Modified date:
10 November 2025
UID
ibm17244965