AIX Down Under
Missing file systems after a reboot
A customer did a migration from AIX 5.3 to 6.1 and then called me to report a strange set of symptoms. Some file systems didn't mount following a reboot. When the file systems in the volume group (let's call it datavg) tried to mount, they returned an error that there was no such device. If the customer ran an exportvg and an importvg, all the datavg file systems became available. But then another reboot was done and the datavg file systems again didn't mount.
Update: I have an idea of what might have happened. The customer (as I should have mentioned) did the migration to AIX 6.1 after restoring from an AIX 5.3 mksysb. I suspect the bosinst.data file had the option to import user volume groups (such as datavg) set to No. The system had been migrated from 5.3 to 7.1; then, when they ran into difficulties, they chose to restore the 5.3 mksysb and upgrade just to 6.1. Perhaps I should have mentioned that background in the original post. Pretty poor omission if this was a detective story that you had paid cold, hard cash for. But it isn't. And you didn't. So read on.
As I only had support over the phone, I had to do some detective work without being able to run any commands or look at screens for myself. (Technology is still pretty primitive down here in the antipodes).
Nested file systems - not the culprit
I immediately thought it was a case of nested file systems that were attempting to mount before their parent file systems. This typically happens when the sub-file system (e.g. /tsm/log) appears in the file /etc/filesystems before the parent file system (/tsm). As /tsm hasn't been mounted, there is no mount point directory called /tsm/log, and so the /tsm/log file system fails to mount. This is usually as a result of someone manually editing /etc/filesystems.
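To picture it, here's the sort of /etc/filesystems ordering that triggers the problem (stanzas abbreviated, device names made up):

/tsm/log:
        dev   = /dev/fslv01
        vfs   = jfs2
        log   = /dev/loglv00
        mount = true

/tsm:
        dev   = /dev/fslv00
        vfs   = jfs2
        log   = /dev/loglv00
        mount = true

At boot, the file is worked through in order, so /tsm/log is attempted first and fails for want of a mount point.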
As the customer pointed out to me, the missing device was the logical volume /dev/fslv00, not the file system's mount point directory. So I buried my nested file systems theory to be pulled out for next time a customer rings me with missing file systems.
Just the facts, Sir
With my nested file systems theory soundly demolished, it was time to take a good, hard look at the facts so far:
- Fact #1: exportvg worked without any errors. That indicated the volume group's file systems were not mounted before the exportvg ran; if they had been mounted, we would have seen warnings that the file systems were in use. So a successful export actually hinted at a problem.
- Fact #2: importvg worked, which indicated the volume group's disks were available to the OS. So there was no issue with SAN connectivity, and no volume group corruption.
- Fact #3: the volume group didn't come up after a reboot. So perhaps the problem was with the volume group itself, rather than with any one of the file systems or logical volumes belonging to it.
- Fact #4: the VM had been rebooted recently, prior to the AIX migration, with no problems. So if the volume group had a problem, it seems to have picked it up during the migration.
Patterns of Expertise
We looked for a pattern, and noticed that all the missing file systems belonged to the one volume group (datavg). The volume group could be imported / varied on from the command line, but didn't come up after a reboot. Then it twigged: the volume group had somehow been set not to varyon automatically following a reboot.
Sure enough, when we ran lsvg datavg we saw that datavg was set not to auto varyon. Nothing wrong with the volume group itself. It's just that somewhere along the way, someone, or perhaps a bug, had set it to varyon=no. Easily fixed, as the chvg man page shows:
chvg -a y vg03
Which is exactly what we did for datavg.
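For the record, the auto-varyon setting is easy to check, as it shows up in the lsvg output:

lsvg datavg | grep "AUTO ON"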
Getting to the root cause
We rebooted and confirmed the problem was fixed. So what caused the volume group not to varyon automatically as part of a reboot? A bit hard to know. Perhaps someone ran the importvg with the -n option, which "causes the volume not to be varied at the completion of the volume group import into the system".
Maybe there was a bug in AIX, where the migration didn't auto-varyon volume groups.
We may never get to the root cause. It's not a big concern, as we've set the varyon correctly now, and everything comes up fine after a reboot.
Copying a volume group with recreatevg
It's pretty easy to move a volume group from one AIX system to another. You unmount all the file systems from the source volume group, varyoff the VG, export the volume group (exportvg), and then remove the disks from the source system (rmdev -dl hdiskN). Then you assign the LUNs to the target host, import the volume group, mount the file systems, and check permissions.
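In commands, the whole move looks something like this (hdisk numbers and mount points are illustrative):

# on the source host
umount /data                 # repeat for each file system in the VG
varyoffvg datavg
exportvg datavg
rmdev -dl hdisk4             # repeat for each disk in the VG

# on the target host, once the LUNs are mapped
cfgmgr
importvg -y datavg hdisk7    # naming any one disk of the VG is enough
mount /data                  # then check permissions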
But what if you want to copy a volume group? You might want to replicate a volume group by doing a flash copy across the SAN. Then, on the remote site, you'd present the SAN LUNs to the target host and run cfgmgr to get the host to see the new disks. The disks may well be named differently on the target host, because it just assigns the next available hdisk numbers when cfgmgr runs. The hdisk numbers may differ between the source and target hosts, but the Physical Volume IDs (PVIDs) are the same. After all, the target LUN is a replica of the source LUN.
The problem: duplicate PVIDs and LVs
But that brings up a problem: duplicate PVIDs. AIX identifies a disk by its PVID, so a host that can see both the original LUN and its replica ends up with two disks claiming to be the same one.
Enter the recreatevg command. Just like the move you can do with importvg and varyonvg, it makes the volume group available on the target host, but it generates new PVIDs, a new volume group ID and new logical volume names as it goes. That overcomes the issue of duplicate PVIDs and Logical Volume IDs.
Now when you run the importvg command yourself, you only specify one physical volume. For example, if the volume group consists of hdisk1, hdisk2 and hdisk3, then the command
importvg -y datavg hdisk2
will import the entire volume group, since the Volume Group Descriptor Area (VGDA) is on all three disks, and all the disks in the volume group know the PVIDs of all the other disks in that volume group.
When you run recreatevg, the rules are different. As the documentation puts it:
The recreatevg command removes all logical volumes that fully or partially exist on the physical volumes that are not specified on the command line. Mirrored logical volumes can be an exception (see the -f flag).
So if you're wondering why the logical volumes on the disks you forgot to mention didn't get included, the LVM will reply: "you never asked." I expect that's because recreatevg (unlike importvg), isn't relying on PVIDs, since it's creating new ones.
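If you haven't met recreatevg before, here's a minimal sketch (volume group and disk names made up): -y names the new volume group, -Y puts a prefix on the logical volume names, and -L puts a prefix on the mount points, so nothing collides with the originals. And, as we've just seen, you list every disk:

recreatevg -y copyvg -Y cp -L /copy hdisk10 hdisk11 hdisk12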
Update: Hoarders and chuckers
There are two kinds of people in the world. Some like to keep their old junk, just in case they'll need it some day (they won't!). Others like to toss it out, just in case they don't need it (they will!). The first are the hoarders; the second are the chuckers (with apologies to our cultured readership for using such an expression). Well, importvg is a hoarder: you nominate one disk and it assumes you want the others in the volume group. recreatevg, on the other hand, is a chucker.
Perhaps people with more experience using recreatevg will have comments about how this all works in the real world (the recreatevg command, not how to keep the peace between hoarders and chuckers).
An important warning about importing a VG
I've been playing with LVM tunables, specifically to do with pbufs, to see if changes to the parameters stay with a volume group when it gets moved to a new LPAR.
First, some background
A pbuf is a pinned memory buffer. As a developerWorks article on LVM explains, pbufs are used to hold I/O requests at the logical volume manager layer; if a volume group runs out of them, I/Os are held up until a pbuf is freed.
The lvmo command is used to manage pbuf tuning parameters. It allows you to view or set the pbuf count by volume group rather than doing it globally. You can see the number of blocked I/Os for a volume group using the lvmo command. You can also see the global blocked count (total for all volume groups) using vmstat -v, identified as pending disk I/Os blocked with no pbuf.
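For example, to pull out just that global counter:

vmstat -v | grep pbuf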
For now, the question is about setting the pv_pbuf_count.
I was curious. If the pv_pbuf_count (a pbuf count for each physical volume) is set on a volume group basis, where does it get stored? Is it in /etc/tunables/nextboot, or in the volume group's own VGDA? Let's find out.
Setting the pbuf count
First, I'll change a volume group's pv_pbuf_count from the default value of 512 to 2048 using lvmo:
lvmo -v datavg -o pv_pbuf_count=2048
Now, display the current settings and statistics using
lvmo -v datavg -a
vgname = datavg
pv_pbuf_count = 2048
total_vg_pbufs = 2048
pervg_blocked_io_count = 0
global_blocked_io_count = 59

(I've trimmed the output to the fields that matter here.) The new pv_pbuf_count is set to 2048: we're allowed 2048 pbufs for each PV in the volume group. The total pbufs for the volume group (total_vg_pbufs) is also 2048, because the volume group only has one PV in it. The global_blocked_io_count is 59, but that's not from this vg, as the pervg_blocked_io_count (blocked I/Os for this volume group) is 0.
This tunable parameter (pv_pbuf_count) survives a reboot, so where is this parameter change recorded? In /etc/tunables/nextboot? No, that file was unchanged. (If we'd used the old, system-wide way of changing the pbufs using ioo, we'd see it in nextboot, but that would change it for all volume groups, not just for the one we want).
So is the setting in the VGDA? I'll export the volume group using exportvg and import it and see what happens.
Ordinarily, you'd be doing the export from one LPAR, map the LUN to another LPAR and then import the volume group there, but doing the export and import on the same LPAR will prove the point for this exercise.
Before exporting the volume group, I need to unmount any file systems in it. You can list file systems in a volume group using the lsvgfs command. Having done that, you can deactivate and export the volume group:
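Something like this (the mount point is illustrative):

umount /data
varyoffvg datavg
exportvg datavg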
And then import it again, to see what happens to our beloved pv_pbuf_count parameter.
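With the disk still visible to the same LPAR, that's simply:

importvg -y datavg hdiskN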
Now, let's see what happened to the pv_pbuf_count:
lvmo -v datavg -a
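Again trimmed to the interesting fields, the output now shows:

vgname = datavg
pv_pbuf_count = 512
total_vg_pbufs = 512
pervg_blocked_io_count = 0
global_blocked_io_count = 59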
Aha! The export and import have reset the pv_pbuf_count back to the default of 512.
When you do a volume group export and import - a great way of moving all of a volume group's data to a new location, rather than copying it or restoring it - the logical volumes and file systems get moved across to the target system, but tuning parameters don't come for the ride.
Give bootinfo the boot
For years the quick way to get a disk's size was bootinfo -s hdiskN, but bootinfo is deprecated. The supported way is getconf:

getconf DISK_SIZE /dev/hdisk0

The size is reported in MB: a 140 GB disk, for example, comes back as 143360.
If you like this tip, get some practical, actionable steps in your inbox weekly. http://tiny.cc/AIXquicktips
SERVING UP LVs, PVs and PAVs
Since we've got redundant arrays on SANs these days, it may seem almost quaint to speak about software mirroring using the AIX Logical Volume Manager. Even so, LVM is very useful when you want to move data around. If you need to move to a new storage subsystem or just to a new LUN, and you're not able to do it on the backend, the LVM may be just the ticket.
For example, supposing you are using a LUN that's a whole lot bigger than you need. There could be plenty of reasons for that, but the most common one is that you slightly overestimated the amount of disk you needed when you first went with your begging bowl to the storage team. Admit it. You asked for thirty times more disk than you needed, just in case. And the reason you did that is because you never listened to your mum when you couldn't finish your pavlova ("pav" in Aussie-land). Don't you remember her telling you:
"Your eyes are bigger than your stomach"
Well now's your chance to
Make IT history!
and hand back some storage you don't need.
LV to new PV in same VG
First you allocate a new, leaner (smaller!) LUN to AIX, then add it to the volume group using extendvg. (You may need to change its queue depth, preferred path, health check interval and so on.) Once it's a member of the VG, you can mirror at the logical volume level using mklvcopy.
You can mirror a whole volume group (mirrorvg), and that's really the best way to do it with rootvg, because rootvg has boot disks and dump devices which need special treatment. For other volume groups I often use mklvcopy, because it allows me to mirror one logical volume at a time, as the sketch below shows. You don't have to synchronise the two copies immediately (the -k flag of mklvcopy does that on the spot); until you do, the lslv command (and the lsvg command) will show some partitions as stale. You can create the copies and wait for a quieter time to run the synchronisation. It's a lot faster if the disks aren't busy, but it's perfectly legitimate to synchronise them while they're in use.
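A sketch, with made-up names (the LV datalv gets a second copy on the new member disk hdisk5):

extendvg datavg hdisk5        # make the new LUN a member of the VG
mklvcopy datalv 2 hdisk5      # second copy of datalv, not yet synchronised
# or, to synchronise on the spot:
# mklvcopy -k datalv 2 hdisk5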
If you want to synchronise the two copies, use syncvg. You can synchronise the whole volume group using
syncvg -v VGNAME
The varyonvg command (which activates a volume group) will do the same thing, and you can run that command - varyonvg - even if the volume group is already active. With both varyonvg and syncvg, if there are no stale partitions to be synchronised, the shell prompt will come back in a jiffy.
If you want to synchronise a single logical volume, use
syncvg -l LVNAME
Not seven years' bad luck
Once you've synchronised the mirror to the new LUN, you can break the mirror to the old one, by using
rmlvcopy LVNAME 1 hdiskN
You'll be stopped from running rmlvcopy if it would leave only stale partitions behind; you're not allowed to remove the last good copy of a physical partition. That's nice, isn't it? You also get a big warning if you try to remove a PV from a volume group while it still has any partitions on it.
The oft-quoted Chris Gibson has an article showing how he migrated to a new SAN using LVM. The same principle applies for a single LUN. There is also the migratepv command which is a simple way of moving everything off one pv to another. As with mirrorvg and mklvcopy, the target pv has to be a member of the volume group first.
These commands are so much fun that it's a shame to use SMIT, but you can do it that way if you want to.
Shock the SAN team
Once you've run rmlvcopy (or unmirrorvg which will remove your seven years of bad luck), you can remove the PV from the volume group (reducevg) before taking it out of the ODM using
rmdev -dl hdiskN
If you've removed the old giant LUN from the configuration all along the way (VG, ODM, VIO server), you can hand it back to your SAN team for them to recycle. Once they realise you're not playing some sort of practical joke, they'll be grateful. Shocked, but grateful. They may need the disk for someone else who didn't listen to his mum when she served up the pav.
Increasing rootvg on the fly
There are lots of good reasons for having spare disk for rootvg, as I looked at in the post make way for rootvg. With virtual disks you can resize your volume group on the fly:
Note: this is supported for rootvg and concurrent vgs from AIX 6.1 TL 4. See IBM technote IZ80021 http://bit.ly/cmHjmy
Resizing the rootvg disk
I tried to increase rootvg on an LPAR running AIX 5.3 TL 11 and hit an error: the new disk size wouldn't be picked up until the volume group had been varied off and varied on again. For rootvg, that means a reboot.
No reboot on AIX 6.1
In AIX 6.1 (from TL 4 - use oslevel -s to check your AIX level), you can increase rootvg on the fly.
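Once the underlying LUN or VIOS logical volume has been grown, the command is:

chvg -g rootvg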
Sounds like yet another reason to migrate to AIX 6.1.
df by volume group
As you know, you can display all the file systems using the AIX df command. But what if you want to run df just for the file systems in a single volume group? You may want to identify the culprits filling up a LUN, for example.
It's surprisingly simple to do df by vg. It's a matter of combining two commands: lsvgfs and df.
Command 1: lsvgfs
You can use lsvgfs to list all the file systems belonging to a volume group.
# lsvgfs rootvg

Command 2: df with arguments
And df can take arguments. You can use the df command to display a specific file system or set of file systems:
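df -m / /usr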
Combine these two commands and you can do a df for all file systems in a volume group.
df -m $(lsvgfs rootvg)

I have an LPAR with other volume groups, but I'm only interested in the rootvg file systems for the moment. Here's how it looks:
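(The numbers here are illustrative.)

Filesystem    MB blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4         512.00    180.25   65%     9560    18% /
/dev/hd2        3072.00    249.50   92%    64873    45% /usr
/dev/hd9var     1024.00    601.22   42%     8054     6% /var
/dev/hd3        1024.00    912.80   11%      310     1% /tmp
/dev/hd1         512.00    498.60    3%      121     1% /home
/dev/hd10opt    2048.00     61.44   97%    31230    52% /opt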
Looks like I've got some cleaning up to do in /opt.
This combination of two commands, df and lsvgfs, has got to be easier than the usual ways of narrowing down the file systems: running df on its own and scanning for the ones you care about, or typing out each mount point by hand.
Your fingers, my nightmares
df and lsvgfs. Please start using this. It will save me waking up in a cold sweat worrying about your fingers wearing out, especially if they're outside their warranty period. I don't want your precious little fingertips making a special guest appearance in my nightmares.
Diamond in the SAN
I recently cloned an LPAR from a P6 to a less-busy P5 server via a mksysb backup. Unfortunately, after starting up the new LPAR I found that there was an essential directory missing because it had been excluded via the /etc/exclude.rootvg file (-e option on mksysb). Let's call the directory /diamond (its name has been changed to protect the guilty who didn't back it up).
I considered my options, but each of them had a drawback. Option 1 was to restore the directory's contents from TSM. The option I went with, though, was to dig the treasured directory's contents straight out of the old rootvg LUN on the SAN. It did have the drawback that it would rename logical volumes on the original SAN LUN, but I was willing to take the risk. It wasn't likely I would need to boot off that LUN again, and even if I did have to, I could always rename them via SMS in single user mode before mounting file systems.
I had to get the old rootvg onto a live AIX LPAR. Here's how I did it.
An import-ant command
After mapping the old SAN LUN to the new AIX LPAR, I used importvg to import the old LPAR's rootvg under another name: oldrootvg.

newhost:/ # importvg -y oldrootvg hdisk3

I got some warnings which I had to deal with before getting access to the contents of the directory I needed. The first was to do with LV names, the second with mount points. Here are the details.
LVM's good manners
When you're importing a rootvg (under a different name) onto a running LPAR with its own active rootvg, there are bound to be duplicate file systems and logical volumes. Fortunately, the AIX Logical Volume Manager (LVM) shows a bit of respect: the active LPAR's file systems and LVs don't get overwritten. The new guest LUN isn't allowed to push in. Instead, if there are any conflicts, the imported VG gets new LV names. So when I ran importvg, hd4 was renamed to fslv02:
0516-530 synclvodm: Logical volume name hd4 changed to fslv02.
0516-530 synclvodm: Logical volume name hd2 changed to fslv03.
0516-530 synclvodm: Logical volume name hd9var changed to fslv04.
0516-530 synclvodm: Logical volume name hd3 changed to fslv05.
0516-530 synclvodm: Logical volume name hd1 changed to fslv06.
0516-530 synclvodm: Logical volume name hd10opt changed to fslv07.
Here's where the LVM showed itself to be a perfect gentleman. Good guests don't push people out of their own homes.
As for file systems, I got a warning about duplicate mount points that needed to be fixed before mounting the old rootvg's file systems:
imfs: Warning: mount point / already exists in /etc/filesystems.
imfs: Warning: mount point /usr already exists in /etc/filesystems.
imfs: Warning: mount point /var already exists in /etc/filesystems.
imfs: Warning: mount point /tmp already exists in /etc/filesystems.
imfs: Warning: mount point /home already exists in /etc/filesystems.
imfs: Warning: mount point /opt already exists in /etc/filesystems.

The logical volume for the file system I needed had been renamed from hd4 to fslv02, but the mount point was still set to /, so I had to change that for the newly named logical volume:

chlv -L '/mnt/oldroot' fslv02

At this point I created a new jfs2 log in smit lv (type jfs2log). I then set the jfs2log for my "new" root file system:
chfs -a log=/dev/loglv02 /mnt/oldroot
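(The smit lv step could just as easily be done from the command line; a sketch, with the partition count made up:)

mklv -t jfs2log -y loglv02 oldrootvg 1
logform /dev/loglv02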
Finding that mount
The mount point /mnt/oldroot didn't exist, so I created the directory:

mkdir -p /mnt/oldroot

I could then mount the file system /mnt/oldroot, and recover the contents of /mnt/oldroot/diamond.
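That recovery, sketched (the destination is illustrative):

mount /mnt/oldroot
cp -Rp /mnt/oldroot/diamond /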
I then cleaned up the oldrootvg using umount, varyoffvg and exportvg, and an rmdev of the disk, as below. After that I unmapped the SAN LUN, which was no longer needed on the new host, and was able to start the database.
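umount /mnt/oldroot
varyoffvg oldrootvg
exportvg oldrootvg
rmdev -dl hdisk3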
I fixed the /etc/exclude.rootvg (and the TSM inclexcl file, just to be safe) and gave a quiet thanks that when importing the rootvg into an existing running LPAR, the LVM had acted as a perfect gentleman.
TRACING YOUR ROOTS
Every AIX system has a volume group (a logical group of disks) called rootvg, which is created automatically at the time you install AIX. The rootvg may be made up of one or several physical disks or virtual disks (SAN LUNs, logical volumes on the VIO Server). Whatever "disks" make up rootvg, it's important to keep the rootvg file system footprint small, but give the total disk assigned to rootvg enough space to take giant steps. Let me explain.
After you've done an initial base AIX install, it's usually best to create your own new logical volumes and file systems in a separate volume group from rootvg, provided you've got the disk available. A separate data volume group (datavg) is better, unless your data and application file systems (and LVs) are relatively small.
It's important to keep your rootvg fairly lean. That will speed up mksysb backups, recovery of the operating system, migrations to new releases and cloning of systems. Even though those upgrade and recovery activities are fairly rare, when they do happen you want to lose as little time as possible.
So the rule I (generally) follow is: rootvg for the OS, data file systems in their own volume group(s).
… large steps
Although it's sensible to keep your non-AIX data away from rootvg, it's even more important to have lots of unused space within rootvg, or the ability to increase the disk allocated to rootvg quickly. Among other things, it's handy to have some head room in rootvg for:

- file system growth
- shrinking file systems
- AIX patching and installing new software
- system dump devices
- system updates with multibos
- paging space
- temporary file systems for mksysb, mkcd and mkdvd backups
Digging around the rootvg
To check how much available space you have in rootvg, use the lsvg command and run lsvg rootvg. Look at Free PPs to see how much rootvg disk is available for expansion.
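A one-liner to pull out just that field:

lsvg rootvg | grep -i "free pps"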
Increase rootvg dynamically
If your rootvg “disk” is actually virtual, such as a SAN LUN or a logical volume on the VIO server, then it usually can be expanded on the SAN (or using extendlv on the VIOS) and then recognised on the AIX LPAR using the -g flag of the chvg command:
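chvg -g rootvg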
That "brief" introduction was a high level view of why rootvg can be hungry for disk. The rest is details, so you can
STOP READING NOW!
Unless you want a little more explanation on those steps above or are curious about something you won't find in a Redbook.
STEP BY STEP
File system growth
This should be obvious enough. If you keep your apps and data in datavg or other volume groups, rootvg should remain fairly stable. The /usr file system may even be sitting at 98% and only get increased automatically when you install software using AIX utilities such as smitty or installp. Still, it's good to have some room for log files and temporary files. With disk space, prevention is better than cure.
Shrinking file systems
On recent releases of AIX, reducing a file system can be done using chfs without unmounting it. For example, this command will reduce /usr by 10 GB:
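chfs -a size=-10G /usr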
You need to have enough wriggle room to do this, though, so that's where the Free PPs are important.
AIX patching and installing new software
When you install patches such as AIX fix packs from the IBM Support Portal, you can either Apply or Commit the software updates. If you apply them, you can reject them, effectively rolling them back. Applying (rather than committing) software updates is a smart interim measure, but it does take a little more space. You'll obviously also need some room for when you install other software that gets written into the rootvg file systems.
System dump devices
The system dump device is normally a logical volume in rootvg, and the space it needs grows with the memory assigned to the LPAR. The sysdumpdev -e command gives an estimate of the dump space required, so it pays to keep enough free partitions to act on it.
System updates with multibos
If you haven't realised how easily you can have a dual boot on the same disk using multibos, you really are missing out on a very simple way of patching AIX while minimising downtime. My compatriot Chris Gibson has two excellent articles on AIX Updates with Multibos and Working With multibos. Easy to learn, and worth knowing, and in some ways an easier alternative to cloning the entire rootvg using alt_disk_install.
Paging space
Nothing can bring an AIX system to its knees as effectively as running out of paging space. Although paging space (swap space) doesn't have to be in the rootvg, it's always worth having some room for expansion of paging space. You will need to reassess your paging space requirements if you increase your memory allocation using Dynamic LPAR or by changing a partition profile's properties.
mksysb, mkcd and mkdvd backups
The mksysb command can do a backup of the rootvg by writing it to a file. Similarly, the mkcd and mkdvd commands back up the rootvg, but they need to create temporary file systems first. You can use flags to indicate an alternate volume group to use, but if you don't, then guess which one needs the extra space?
Learning more about LVM
If you want to learn more about LVM, have a look at a Wiki by the indefatigable Nigel Griffiths, where he deals with LVM theory and practice.
Not in the Redbooks
If you got this far, congratulations. Since you've obviously got nothing better to do with your time, let me explain why I couldn't resist including a graphic of baby feet.
I delivered a baby for the first time on Friday (you won't find a Redbook on that) after standing by over the years for the births of our five other children. Couldn't have done it without my wife. An obstetrician was there. He suggested I add "delivering babies" to the hobbies section of my resume.