Can you recover rootvg without media or NIM?
AnthonyEnglish
A colleague in another country had an AIX logical partition that went down and couldn't boot. He asked how he could boot it without media. It was a development logical partition.
I knew nothing at all of the environment, and had no possibility of access to it myself, so I found out the following pertinent facts:
So how would they boot it? Maybe you've already come up with your own answer:
Before you ask
I had a few other questions swirling around my head.
Remember, the NIM server was at a lower level than the AIX host, which was (I expect) a NIM client. That's a big no-no. You're supposed to keep the NIM server at the same level as, or a higher level than, its clients, and then use it to update them, but obviously they had bypassed NIM in building and/or updating the AIX host.
"So, update NIM already!"
Easier said than done. In this environment, the NIM server is a production host and it can't just be updated without notice. Yes, it should have been up to date all along before they worked on updating their dev host, but these things happen.
What about the VIOS Virtual Media Library?
Obvious solution, isn't it? You can load an ISO image of the AIX installation media and/or the mksysb (produced by the mkdvd command) and boot off that. With luck, the boot problem is just a bootlist that needs to be updated, or something equally simple. But they didn't have a VIOS, and I didn't know if they had the resources to build one, particularly an adapter to connect to a disk for the VIOS rootvg.
Restoring mksysb to another disk
There was another approach they could have used. Since they had a good mksysb backup, they could build an alternate disk with that backup (they'd have to do it from a running AIX system, not from the dead one!), and then reassign the logical partition to boot off that disk. I've never tried this myself, but it would be done with the
A new NIM
I later heard that they built a new, temporary NIM server and booted the logical partition from that. I didn't get a full briefing but perhaps they restored from the mksysb.
After the scramble to get a system up again, it's always worth doing a post mortem when there are problems like this, even if you don't get to the root cause.
This little account has probably got you thinking about what you would or wouldn't do. Some things in the environment could have been done better, particularly keeping the NIM server up to date.
Sometimes shortcuts are the slowest way to get to your destination. No doubt my colleague and his team will revisit this environment and consider updating NIM, then keeping it updated before they do any patching or installation of other partitions. They may also virtualise a little more by installing a VIOS. And they could look at a regular test restoration from a mksysb.
In the end, they were able to recover the dev host that had gone down. I'm sure they'll consider new approaches to the AIX environment which will make the process smoother next time.
Over to you ...
If you've got some ideas about how to address this situation, you can add a comment to this post. Comments are moderated, mainly to stop people trying to put in spurious links that sell handbags. You'll need to log in to IBM developerWorks to add a comment. I'd be interested in your feedback.