Boot hangs with error 0551 or "Starting kernel"
AnthonyEnglish
After restoring a mksysb to a new logical partition, the reboot hung with the AIX IPL progress code "0551" ("IPL Vary-on is running"). After I tried slight variations in the profile, the boot process hung at "Starting kernel" instead. The rootvg couldn't be varied on, and I didn't know why. The solution was simple, as you'll see.
Troubleshooting the boot
As part of the troubleshooting process, I booted the logical partition off the AIX Product Media. I was able to access the root volume group, recreate the boot image using bosboot, and set the bootlist. But still the system wouldn't boot off the hard disk.
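For readers who haven't done this before, the boot image and bootlist step usually comes down to a few commands run from the maintenance shell once the root volume group has been accessed. The sketch below only prints those commands rather than running them, so it is safe to try anywhere. The disk name hdisk0 and the function name print_boot_repair_cmds are assumptions for illustration; the disk holding rootvg on your partition may well be different.

```shell
#!/bin/sh
# Dry-run sketch: prints the AIX commands typically used to rebuild the
# boot image and reset the bootlist. Nothing is executed against the system.
# "hdisk0" is an assumed disk name; substitute the disk that holds rootvg.

print_boot_repair_cmds() {
    disk="$1"
    # Recreate the complete boot image on the given disk
    echo "bosboot -ad /dev/${disk}"
    # Set the normal-mode boot list to that disk
    echo "bootlist -m normal ${disk}"
    # Display the normal-mode boot list to confirm the change
    echo "bootlist -m normal -o"
}

print_boot_repair_cmds "${DISK:-hdisk0}"
```

In my case these steps completed without complaint, which is part of what made the hang so puzzling.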
I tried booting off a dedicated disk, in case the virtual SCSI disk or drivers were an issue. No luck there.
I tried restoring the mksysb onto a different managed system. It worked.
I also tried restoring mksysbs from other AIX logical partitions using the same target logical partition profile as the one which was giving me grief. They all worked fine.
Perhaps the problem was the source logical partition, from which the mksysb was made. The catch was that I didn't want to make any changes to the original source system until I could be confident that a mksysb restore would work in the event we needed it.
After escalating the problem to IBM and doing what seemed to be an endless amount of (pretty interesting, but time-consuming) troubleshooting, we were about to capture boot debugging, but before we did, the IBM support person suggested a solution which was remarkably simple:
Add memory!
Even though the 4 GB was sufficient for the other mksysbs which I restored, for this one for some reason it wasn't enough. I assigned an extra 2 GB and the system booted perfectly.
In defence of iron casting
In a sense I suppose I was too clever by half; a victim of my own experience. A beginner might have suggested we throw some iron at the system: "Why don't you try adding some more memory?"
None of the error codes or troubleshooting I did explicitly showed that was where the problem lay, so the beginner's approach wouldn't have been very scientific. But it would have worked.