IBM Support

Kernel panic with different CPU stepping - IBM System x

Troubleshooting


Problem

When attempting to install the operating system or boot via the uniprocessor kernel (RHEL 4 U3 or earlier, x86_64 only), a kernel panic occurs.

Resolving The Problem

Source

RETAIN tip: H192962

Symptom

When attempting to install the operating system (RHEL 4 U3 or earlier, x86_64 only) or boot via the uniprocessor kernel (RHEL 4 U3 or earlier, x86_64 only), the following kernel panic occurs:

  Capability LSM initialized as secondary
Mount-cache hash table entries: 256 (order: 0, 4096 bytes)
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 4096K using mwait in idle threads.

MCE: warning: using only 6 banks
CPU: Intel Xeon CPU 5150 @ 2.66GHz stepping 06
ACPI: System reset via FADT Reset Register is supported

Kernel BUG at apic:305 invalid operand: 0000 [1]
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.9-34.EL
RIP: 0010:[<ffffffff8054596d>] <ffffffff8054596d>{setup_local_APIC+27}
RSP: 0000:000001007fb13ef8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000050014 RCX: ffffffff8042d710
RDX: 0000000000000000 RSI: ffffffff8042d710 RDI: ffffffff8036bdae
RBP: 0000000000000014 R08: 0000000100000246 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffffffff80538a80(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000000101000 CR4: 00000000000006e0
Process swapper (pid: 1, threadinfo 000001007fb12000, task 000001007fb11110)
Stack: 0000000000000006 0000000006020800 0000000000000000 ffffffff80545eb2
0000000000000001 0000000000000000 0000000100000246 ffffffff8010c4aa
ffffffff8010c3d4 0000000006020800
Call Trace:<ffffffff80545eb2>{APIC_init_uniprocessor+133} <ffffffff8010c4aa>{init+214}
<ffffffff8010c3d4>{init+0} <ffffffff801114bf>{child_rip+8}
<ffffffff8010c3d4>{init+0} <ffffffff801114b7>{child_rip+0}

Code: 0f 0b fc c0 36 80 ff ff ff ff 31 01 48 8b 05 98 0e ee ff ff
RIP <ffffffff8054596d>{setup_local_APIC+27} RSP <000001007fb13ef8>
<0>Kernel panic - not syncing: Oops

Affected configurations

This tip is not machine specific.

The system is configured with multiple CPUs.

This tip is not option specific.

The system is configured with at least one of the following:

  • Red Hat Enterprise Linux 4, Update 2, Update 3

Note: This does not imply that the network operating system will work under all combinations of hardware and software.

Please see the compatibility page for more information:

Solution

Update to RHEL 4 Update 4. This file is available from the following URL:

Workaround

Select one of the following three workarounds for this issue:

  1. Modify and recompile the kernel.
  2. Use RHEL 4 Update 4 x86_64.
  3. Place the lowest stepping level CPU in the primary socket.

Note: The workaround selected is to be determined by the user.

Additional information

This is a kernel bug where the kernel assumes that the boot CPU is APIC ID 0. When a system contains multiple CPUs of different stepping levels, the BIOS is required to boot from the lowest stepping CPU to avoid incompatible capabilities.

If the lowest stepping level CPU is not in the primary boot socket, the system will not boot from the CPU with APIC ID 0, but from some other APIC ID associated with the lowest stepping CPU.
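To confirm that a system really mixes stepping levels, the stepping of each CPU can be compared from a Linux environment that does boot on the same hardware. The following is a minimal sketch, not part of the original tip. It assumes the text comes from /proc/cpuinfo, which on x86 Linux reports one "stepping" line per logical CPU. The function name has_mixed_stepping is hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if the cpuinfo-style text contains
 * "stepping" lines with more than one distinct value.
 * Assumes lines of the form "stepping<whitespace>: <number>".
 */
static bool has_mixed_stepping(const char *cpuinfo_text)
{
    int first = -1;                 /* first stepping value seen, -1 = none yet */
    const char *line = cpuinfo_text;

    while (line != NULL && *line != '\0') {
        int value;

        /* Only lines that start with "stepping" match this format. */
        if (sscanf(line, "stepping : %d", &value) == 1) {
            if (first < 0)
                first = value;
            else if (value != first)
                return true;        /* a second, different stepping was found */
        }

        /* Advance to the start of the next line, if any. */
        const char *newline = strchr(line, '\n');
        line = newline ? newline + 1 : NULL;
    }
    return false;
}
```

A caller would read the contents of /proc/cpuinfo into a NUL-terminated buffer and pass it to the function. A true result suggests the configuration described in this tip may apply.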

Because this bug is isolated to a single line of code in the function APIC_init_uniprocessor in the file apic.c, changing that line and recompiling the kernel will fix the issue.

Below is the original code, which hard-codes APIC ID 0. The fix for this line is already incorporated into Update 4:

  int __init APIC_init_uniprocessor (void)...
phys_cpu_present_map = physid_mask_of_physid(0);

Below is what the code should look like after modification:

  int __init APIC_init_uniprocessor (void)...
phys_cpu_present_map = physid_mask_of_physid(boot_cpu_id);

Note: The final workaround (placing the lowest stepping level CPU in the primary socket) simply moves the boot CPU to APIC ID 0 so that the kernel's assumption is correct.
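The effect of the one-line change can be illustrated with a small user-space model. This is a sketch under stated assumptions, not kernel code. It assumes the BUG in setup_local_APIC fires because the APIC ID of the running CPU is not set in phys_cpu_present_map; the trace in this tip does not show that check itself. The type physid_mask and all function names below are hypothetical stand-ins, and the model only covers APIC IDs 0 to 63.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's physical-ID mask: one bit per APIC ID. */
typedef uint64_t physid_mask;

/* Model of physid_mask_of_physid(): a mask with only the bit for `id` set. */
static physid_mask mask_of_physid(unsigned id)
{
    assert(id < 64);                /* this model only covers APIC IDs 0..63 */
    return (physid_mask)1 << id;
}

/* Assumed check: the running CPU's APIC ID must be present in the map. */
static bool running_cpu_in_map(physid_mask present_map, unsigned running_apic_id)
{
    return (present_map & mask_of_physid(running_apic_id)) != 0;
}

/* Original line: the map is built from a hard-coded APIC ID 0. */
static bool boots_with_hardcoded_zero(unsigned boot_cpu_id)
{
    physid_mask present_map = mask_of_physid(0);
    return running_cpu_in_map(present_map, boot_cpu_id);
}

/* Modified line: the map is built from the boot CPU's actual APIC ID. */
static bool boots_with_boot_cpu_id(unsigned boot_cpu_id)
{
    physid_mask present_map = mask_of_physid(boot_cpu_id);
    return running_cpu_in_map(present_map, boot_cpu_id);
}
```

In this model, the hard-coded variant passes the check only when the boot CPU has APIC ID 0, which is the situation the socket-swap workaround restores. The boot_cpu_id variant passes for any boot CPU.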

Document Location

Worldwide

Operating System

System x:Red Hat Linux


Document Information

Modified date:
29 January 2019

UID

ibm1MIGR-5074918