Getting bosboot errors, don’t reboot just yet
In this demonstration, I will be using hdisk0 in the examples provided. The
checks I provide are not an exhaustive list, but rather common issues I
have come across over the years. When getting
bosboot errors, I always find it best to go by
the numbers, that is, a checklist I tick off until the error is resolved.
bosboot errors are not a show stopper. That is easy to say, of course, and
not so easy to believe when you are by yourself late in the evening doing
some IBM® AIX® maintenance. The errors typically occur due to some recent
change on the disks, such as a migration or an accidental user error. I
have always been able to resolve bosboot issues and reboot a system, even
if AIX shouts at me not to reboot.
The most common errors issued by the
bosboot command that I have come across are:
- Invalid boot device specified
- hd5 does not exist on this hdiskX
- Boot disk is part of rootvg, but not according to the Boot list history
The following tasks are in no particular order as a
bosboot error can crop up in different
situations. The first port of call is to check your current bootlist and
find whether you are booting from the correct disk. You can check this
using the following commands.
# lspv |grep rootvg
hdisk0 00cd94a60f01c745 rootvg active
# bootlist -m normal -o
hdisk0 blv=hd5
hdisk0 blv=hd5
hdisk0 blv=hd5
hdisk0 blv=hd5
What was the disk you last booted off? Is it what you think it was? Both of the following commands will return the information you require.
# bootinfo -b
hdisk0
# getconf BOOT_DEVICE
hdisk0
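The comparison between the last boot device and the bootlist can be scripted. The following is a minimal sketch, not a definitive tool: check_boot_device is a hypothetical helper name, and it assumes the bootlist output has the disk name as the first field on each line, as in the example above.

```shell
# Hypothetical helper: is the last boot device present in the normal bootlist?
#   $1 = output of "bootlist -m normal -o"
#   $2 = output of "getconf BOOT_DEVICE" (or "bootinfo -b")
check_boot_device() {
    bootlist_out=$1
    boot_dev=$2
    # An empty boot device is expected after a recent rootvg migration
    if [ -z "$boot_dev" ]; then
        echo "UNKNOWN: no boot device reported"
        return 2
    fi
    # Assumption: the first field of each bootlist line is the disk name
    if printf '%s\n' "$bootlist_out" | awk '{ print $1 }' | grep -Fqx "$boot_dev"; then
        echo "OK: $boot_dev is in the bootlist"
        return 0
    fi
    echo "MISMATCH: $boot_dev is not in the bootlist"
    return 1
}

# On AIX this could be called as:
#   check_boot_device "$(bootlist -m normal -o)" "$(getconf BOOT_DEVICE)"
```

The function only reads text passed to it, so it changes nothing on the system.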
If no output is returned in the above example, this is typically due to a recent rootvg migration, and this in itself is not a real problem. So let's carry on with the tasks to check.
As a point of interest, cross check the date of the last hardware migration with the last time it was rebooted using the following command.
# who -b
. system boot Dec 04 10:57
Check the disks
Confirm whether the bootable disks that AIX knows about are the disks listed and are contained in your current bootlist.
# ipl_varyon -i
PVNAME   BOOT DEVICE   PVID                               VOLUME GROUP ID
hdisk0   YES           00cd94a60f01c7450000000000000000   00cd94a600004c00
hdisk1   NO            00cd94a6e0bd72af0000000000000000   00cd94a600004c00
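To pick out only the bootable disks from that listing, a small filter can help. This is a sketch under the assumption that the output keeps the column layout shown above, with YES or NO as the second field of each disk row. bootable_disks is a hypothetical helper name.

```shell
# Hypothetical helper: print the PV names flagged YES in "ipl_varyon -i"
# output, read from standard input.
bootable_disks() {
    # Assumption: field 1 is the PV name, field 2 is the BOOT DEVICE flag.
    # The header row has "BOOT" in field 2, so it never matches.
    awk '$2 == "YES" { print $1 }'
}

# On AIX this could be called as:
#   ipl_varyon -i | bootable_disks
```

Each disk name printed can then be compared with the entries in the bootlist.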
If you find that your disk is not a bootable device in the output from the
ipl_varyon command, ask yourself whether you ran the following command by
mistake:
# chpv -c hdisk0
If you did clear the boot record of the hdisk, it will not be displayed
as bootable. If this is the case, rerun a
bosboot command on that disk with the following command:
# bosboot -ad /dev/hdisk0
Check that your boot logical volume, which is typically hd5, resides on the bootable disk.
# lslv -m hd5
hd5:N/A
LP    PP1  PV1       PP2  PV2       PP3  PV3
0001  0181 hdisk0
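The disks that hold hd5 can also be extracted from the lslv -m output. The sketch below assumes the column order shown above (LP, then PP and PV pairs for up to three copies). blv_disks is a hypothetical helper name.

```shell
# Hypothetical helper: list the physical volumes that hold copies of a
# logical volume, given "lslv -m <lv>" output on standard input.
blv_disks() {
    awk '$1 ~ /^[0-9]+$/ {
        # Data rows start with a numeric LP; columns are
        # LP PP1 PV1 PP2 PV2 PP3 PV3, so PV names sit in fields 3, 5 and 7
        for (i = 3; i <= NF; i += 2) print $i
    }' | sort -u
}

# On AIX this could be called as:
#   lslv -m hd5 | blv_disks
```

If the bootable disk is missing from the result, hd5 does not reside on it.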
If it is not, then migrate it across to the bootable disk. If hd5 is damaged and you cannot migrate it, then simply remove and re-create it afresh.
# rmlv hd5
# mklv -y hd5 -t boot -a e rootvg 1 hdisk0
Very rarely when running a
bosboot command, AIX
might complain about hd5 not being contiguous across the partitions it has
been allocated. I have experienced this only once. If this is the
case, you have no choice but to remove hd5 and re-create it, as noted
above. Then run
bosboot against the bootable
ipldevice, as described further below.
If AIX states that it cannot run a
bosboot command because hd5 existed on a previous hdisk or it cannot find
ipldevice, then the following should fix it:
Confirm whether the bootable disk has the same major and minor number as
the ipldevice and the ipldevice is a hard link to the bootable disk.
# ls -l *hdisk0*
brw------- 1 root system 16, 2 Jan 15 11:23 hdisk0
crw------- 2 root system 16, 2 Oct 31 14:58 rhdisk0
# ls -l ipldevice
crw------- 2 root system 16, 2 Oct 31 14:58 ipldevice
In the above output, the major and minor numbers 16, 2 on hdisk0 match
those of the ipldevice, so there are no problems here. Issue
bosboot on both the ipldevice and the bootable disk:
# bosboot -ad /dev/ipldevice
# bosboot -ad /dev/hdisk0
The host can then be rebooted.
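The major and minor number comparison can be done by eye, or with a small script. The following is a sketch only: dev_numbers and same_device are hypothetical helper names, and the code assumes the ls -l layout shown above, where the major number (with a trailing comma) is the fifth field and the minor number is the sixth.

```shell
# Hypothetical helper: print "major,minor" from one "ls -l" line of a
# device file.
dev_numbers() {
    # Assumption: field 5 is "major," and field 6 is the minor number
    printf '%s\n' "$1" | awk '{ sub(/,$/, "", $5); print $5 "," $6 }'
}

# Hypothetical helper: succeed when two "ls -l" lines show the same
# major and minor numbers.
same_device() {
    [ "$(dev_numbers "$1")" = "$(dev_numbers "$2")" ]
}

# On AIX this could be called as:
#   same_device "$(ls -l /dev/rhdisk0)" "$(ls -l /dev/ipldevice)" \
#       && echo "ipldevice matches rhdisk0"
```

This checks only the device numbers. The link count in the second column of the ls -l output still needs a look to confirm the hard link.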
If the major and minor numbers of the bootable disk differ from those of ipldevice, then fix it by removing the ipldevice and relinking.
# rm /dev/ipldevice
# ln /dev/rhdisk0 /dev/ipldevice
Now that ipldevice has been relinked to the bootable disk, run
bosboot on both the ipldevice and the boot disk:
# bosboot -ad /dev/ipldevice
# bosboot -ad /dev/hdisk0
If your ipldevice is not present at all, re-create the link as described above.
In my experience, once these conditions have been met, that is, you can run a
bosboot command on both the ipldevice and the hdisk,
and the major and minor numbers of the hdisk match those of the ipldevice,
the host can be rebooted. When the host comes back up, you will find that the
bosboot -a command works fine.
savebase is not saving me, it is
The savebase command saves base customized device information from the
Object Data Manager (ODM) to the boot device and is closely linked with the
bosboot command. Typically,
savebase errors occur after a hardware
migration or an alt_disk migration. A common error is when
lspv reports that the bootable disk is not part
of rootvg, but you can see that it is when issuing a
lsvg -p rootvg command. The
savebase errors can be confusing sometimes
because the errors just pop up and can mask another issue. However, for
this demonstration, let's assume it is a straightforward
savebase problem. First, check that the
error is not due to space issues by running a verbose
savebase command.
# savebase -v
Next, let's assume it is a mirrored volume group and confirm that the volume group is indeed synchronized correctly using the following command.
# syncvg -v rootvg
Then force a rebuild of the logical volume control blocks so that they are in sync with the volume group descriptor area on the disks.
# synclvodm -Pv rootvg
That should be sufficient and the
savebase command should now work. Finally, run the
savebase -v command, and all should be good.
Then, to complete the process, run
bosboot on both the ipldevice and the bootable disk, as noted earlier.
I have found that running the checks described in this article is
sufficient to resolve the common
bosboot errors. Getting bosboot errors is an inconvenience for sure.
But you can overcome the issue by following the checks discussed in this
article. This can also give you the confidence that a reboot is good to go.