Live Partition Mobility is an exciting new feature of the IBM® POWER6™-based System p® servers. Since its announcement, I'd been eager to test this new technology in a real-world scenario. This article focuses on performing live partition mobility using IBM JS22 Blade servers. It describes what you can expect before, during, and after a live migration with SAP/Oracle on AIX. Some pointers on configuring the environment for mobility will also be covered.
I assume the reader is already familiar with AIX, Logical Partitioning (LPAR), and Virtual I/O server (VIOS) concepts and technology, so I won't go into detail regarding AIX installation and VIOS configuration. Several documents have been published in relation to all aspects of live partition mobility, VIO, and the Integrated Virtualization Manager (IVM). I have included links to these documents under the resources section. They go into great detail on how to install the IVM/VIOS in a BladeCenter® environment. I encourage you to review these documents.
What is Live Partition Mobility and what are the benefits?
Before getting started, let's quickly review what all the fuss is about. Live Partition Mobility is available on POWER6™-based System p® servers. It enables the migration of an active (or inactive) LPAR from one physical system to another. Mobility uses a simple procedure that transfers the LPAR from the source to the target system without disrupting the hosted application or the operating system. It allows an administrator to perform hardware maintenance, such as disruptive firmware updates, without requiring system downtime. LPARs can be temporarily moved to different physical servers during the maintenance window. They can be easily moved back once the outage is complete. It provides an administrator greater control over the usage of System p resources as workload can be moved dynamically between systems.
Live Partition Mobility is targeted for planned activity. It does not protect you from system failures, so it does not replace high-availability software like the IBM HACMP high-availability cluster technology.
In my environment, I wanted to be able to move an active AIX LPAR, running SAP and Oracle, from one JS22 Blade to another physical Blade. This would allow me to perform disruptive hardware or software maintenance on a Blade without an outage to my SAP applications. For example, if I needed to upgrade a VIOS on a Blade, I could move the workload to another Blade (without an outage to SAP), perform the VIOS update, reboot the Blade, and then move the LPAR back once the activity was successfully completed. Likewise, I could take the same action if I needed to update the Blade's firmware.
Before discussing performing partition mobility, here is a quick overview of my JS22 environment and configuration. Within my IBM BladeCenter H chassis, I have two JS22 Blades in slots 13 and 14, respectively. Both Blades have 16GB of memory installed, 4 x 4GHz POWER6 processors, and the 'PowerVM Enterprise Edition' enabled (required for mobility). Each Blade was installed with a Virtual I/O server (VIOS, version 1.5) and Integrated Virtualization Manager (IVM). The SAN disk storage for these systems was an IBM DS8100.
Each VIOS was given a hostname: bvio82 (slot 13) and bvio83 (slot 14). The Blade in slot 13 had one AIX LPAR (bxaix85) configured and active. It was running AIX V5.3 TL7 SP3. The application hosted on this system was a single SAP R3 v4.7 instance with Oracle 10G. SAP was installed and configured by our SAP Basis team. It is important to note that they did not have to do anything special with their SAP (or Oracle) installation to support the mobility feature.
There are several prerequisites for partition mobility. One of the most important is that all network connectivity from the LPAR must be virtualized, meaning it must communicate using a VIOS. This implies that the VIOS must have a Shared Ethernet Adapter (SEA) configured and operational. Both of my VIOS were configured with an SEA, on the same physical VLAN. I used one of the Logical Host Ethernet (LHE) ports to configure the SEA. All of the SEA configuration was performed using the IVM and was very straightforward (which you will see in a moment). The Virtual I/O Client (VIOC), bxaix85, was given a virtual Ethernet interface with the appropriate VLAN ID so that it could communicate with the outside world using the SEA in the VIOS.
Another important prerequisite for partition mobility is that all storage connected to the mobile LPAR must be on the SAN, even the operating system, which in the case of AIX lives within the root volume group (rootvg). This SAN disk must be assigned to both Blades and be detected by both VIOS. This is to allow the target VIOS the ability to "take over" the storage during a migration. I allocated two SAN (DS8100) disks to both VIOS. One disk was for the OS (AIX rootvg) and the other was for the SAP/Oracle software and database (sapvg).
Figure 1 shows the JS22 environment and the high-level configuration.
Figure 1. JS22 environment
Configuring the JS22 environment for partition mobility
The first step in configuring the environment is to install a Virtual I/O Server (VIOS) on each JS22. This is accomplished by installing a VIOS mksysb image using NIM. The internal disk within the Blade can be used to house the VIO server, or you may choose to boot the Blade from the SAN, as this is also supported. I chose the internal disk for my Blades.
Once the VIOS is installed on each Blade, you can then connect to the Web-based IVM. This interface provides an "HMC-like" GUI that allows an administrator to configure LPARs, virtual network, and virtual storage on the Blade and VIOS. As this is a Web-based tool, you can simply point your Web browser at the VIOS hostname (for instance, http://bvio82), and you will be presented with the IVM login page. To log in, use the VIOS padmin userid and password.
Before you can test mobility with the JS22, first you should ensure that the environment is prepared appropriately to support it. The first step is to update the firmware levels of the JS22 and associated components such as the Fibre Channel (FC) adapters. Download the latest firmware images for the JS22 and the FC adapters from the JS22 support site (refer to the resources section for a link to the site) and apply them to each Blade. You should also install the latest VIOS fixpacks. During the build of my VIOS, the latest fixpack was 1.5.1.1-FP-10.1.
The final component to install (and update) is the multipath I/O (MPIO) device driver. When connecting to an IBM DS8100 storage device, the supported MPIO software is SDDPCM v2.2.0.
With the correct software and firmware levels installed, you should now prepare the Blade, the VIOS, and the LPAR for partition mobility. What follows is a brief checklist of the tasks performed with the IVM:
- Enter the PowerVM Enterprise Edition APV key on both Blades. This key is required to enable the mobility feature on the JS22 Blade.
- Confirm that the memory region size is the same on both Blades. This information can be found under "View/Modify System Properties," in the "Memory" tab.
- Configure an SEA on both VIOS. Enable the Host Ethernet Adapter for Ethernet "bridging". This is required in order for the virtual Ethernet devices to access the physical Ethernet adapter and the external network. This is performed under the "View/Modify Host Ethernet Adapter", "Properties" tab. Select Allow virtual Ethernet bridging. Then, under "View/Modify Virtual Ethernet", open the "Virtual Ethernet Bridge" tab and select the physical adapter to be used as the SEA. A message will appear stating the operation was successful. The SEA is now configured.
- Create an LPAR (in my case, this was bxaix85) on the source Blade. Select View/Modify Partition, Create Partition. Enter the LPAR name, memory, and processor requirements. Ensure that none of the physical HEA ports are selected. Under "Virtual Ethernet," select the SEA to use (for instance, ent0). Under "Storage Type", select Assign existing virtual disks and physical volumes. Select the SAN disks assigned to the VIOS, which in my environment are the DS8100 disks.
- Click Finish to create the LPAR. The next step is to install AIX. This can be achieved using a NIM mksysb (or rte) install.
- With the AIX installation and configuration complete, you can now configure SAP and Oracle.
At this point, you are ready to perform a live partition migration. Do one final review and check. On each VIOS an SEA has been configured. Confirm this by viewing the device configuration on both VIOS.
Figure 2. SEA configuration on the VIOS
From the output above, you can confirm that the SEA, ent6, on both VIOS, is configured using the first LHE port, ent0.
Verify that the same SAN disk can be seen by both VIOS. Using the lspv command, check that both VIOS have the same PVID associated with the SAN storage (hdisk1, 2, and 3), as shown in Figure 3.
Figure 3. lspv output from both VIOS
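The PVID comparison can also be scripted rather than checked by eye. The following is a minimal Python sketch, assuming the lspv output from each VIOS has been saved as text. The function names are my own, and the sample output (including every PVID) is invented for illustration; it is not the output from my systems.

```python
def parse_lspv(text):
    """Map disk name -> PVID from lspv-style output (NAME PVID VG STATUS)."""
    disks = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and disks that have no PVID.
        if len(fields) < 2 or fields[0] == "NAME" or fields[1].lower() == "none":
            continue
        disks[fields[0]] = fields[1]
    return disks


def shared_pvids(source_text, target_text):
    """Return the set of PVIDs that both VIOS can see."""
    source = set(parse_lspv(source_text).values())
    target = set(parse_lspv(target_text).values())
    return source & target


# Hypothetical sample output; all PVIDs below are made up.
BVIO82 = """\
NAME    PVID              VG      STATUS
hdisk0  000aaaa111110000  rootvg  active
hdisk1  000bbbb222220001  None
hdisk2  000bbbb222220002  None
"""
BVIO83 = """\
NAME    PVID              VG      STATUS
hdisk0  000cccc333330000  rootvg  active
hdisk1  000bbbb222220001  None
hdisk2  000bbbb222220002  None
"""

if __name__ == "__main__":
    for pvid in sorted(shared_pvids(BVIO82, BVIO83)):
        print("PVID seen by both VIOS:", pvid)
```

Any SAN disk intended for the mobile LPAR whose PVID is missing from the shared set would need attention before attempting a migration.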
Ensure that MPIO for the disks is configured and functioning appropriately. Run the pcmpath command (from oem_setup_env) and verify that all paths are operating normally on both VIOS.
One last check. Confirm that the AIX LPAR, bxaix85, is configured with only virtual devices (meaning no physical adapters, another prerequisite for mobility). Figure 4 shows that the LPAR is configured with virtual Ethernet and virtual SCSI adapters.
Figure 4. Virtual Ethernet and SCSI adapters configured
Performing Live Partition Mobility
At this point, two VIOS have been configured, bvio82 and bvio83, one per Blade, and one active AIX LPAR (bxaix85) is running on the first Blade as a VIO client (VIOC). You are now ready to perform a live partition migration. During the migration, the first Blade (in slot 13, bvio82) will be known as the source system (refer to Figure 5) and the second Blade (slot 14, bvio83) will be the target system (refer to Figure 6).
Figure 5. The source Blade view from the IVM Web interface
Figure 6. The target Blade view from the IVM Web interface
The objective here is to move the LPAR, bxaix85, from the Blade in slot 13 to the Blade in slot 14. At the end of the migration, bxaix85 will be running as a VIOC from bvio83 on the other physical Blade. AIX, SAP, and Oracle will continue to function throughout the entire migration.
Prior to the migration, run the lsconf command from AIX, and note the system serial number:
Figure 7. lsconf output prior to the migration
During the migration, there are SAP jobs running on the LPAR, as shown in Figure 8. Monitor the system using the topas command and observe that SAP (disp+work) and Oracle processes are consuming processor time during the migration.
Figure 8. topas session showing SAP workload on bxaix85
All tasks to perform partition mobility will be executed from the IVM, on the source Blade. To start the migration, check the box next to the LPAR (bxaix85) and choose Migrate from the "More Tasks" drop-down menu. Refer to Figure 9.
Figure 9. Starting the migration process for bxaix85
You'll be presented with a screen to enter the target system details. Enter the details and then click on Validate. Refer to Figure 10.
Figure 10. Validating the migration
During the validation phase, several configuration checks are performed. Some of the checks include:
- Ensuring the target system has sufficient memory and processor resources to meet the LPAR's current entitlements.
- Checking there are no dedicated physical adapters assigned to the LPAR.
- Verifying that the LPAR does not have any virtual SCSI disks defined as logical volumes on any VIOS. All virtual SCSI disks must be mapped to whole LUNs on the SAN.
- Confirming that RMC connections to the LPAR and to the source and target VIOS are established.
- Confirming that the partition state is active, meaning Running.
- Confirming that the LPAR's name is not already in use on the target system.
- A virtual adapter map is generated that maps the source virtual adapter/devices on to the target VIOS. This map will be used during the actual migration.
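To make the checks above concrete, here is a minimal Python sketch of the same list expressed as a function. It is only an illustration: the Lpar and Target classes, their fields, and the processor and free-resource figures are all hypothetical, and the real validation is carried out by the IVM and the POWER Hypervisor, not by code like this.

```python
from dataclasses import dataclass, field


@dataclass
class Lpar:
    """Hypothetical description of the mobile partition."""
    name: str
    state: str
    memory_mb: int
    processing_units: float
    physical_adapters: list = field(default_factory=list)
    lv_backed_disks: list = field(default_factory=list)
    rmc_active: bool = True


@dataclass
class Target:
    """Hypothetical description of the target Blade."""
    free_memory_mb: int
    free_processing_units: float
    lpar_names: set = field(default_factory=set)
    rmc_active: bool = True


def validate(lpar, target):
    """Return the reasons a migration would be refused (empty list if none)."""
    problems = []
    if target.free_memory_mb < lpar.memory_mb:
        problems.append("target has too little free memory")
    if target.free_processing_units < lpar.processing_units:
        problems.append("target has too little free processor capacity")
    if lpar.physical_adapters:
        problems.append("LPAR owns dedicated physical adapters")
    if lpar.lv_backed_disks:
        problems.append("virtual SCSI disks are backed by logical volumes")
    if not (lpar.rmc_active and target.rmc_active):
        problems.append("RMC connection is not established")
    if lpar.state != "Running":
        problems.append("LPAR is not in the Running state")
    if lpar.name in target.lpar_names:
        problems.append("LPAR name already in use on target")
    return problems


if __name__ == "__main__":
    # 4096 MB matches the LPAR described in this article; other values are invented.
    bxaix85 = Lpar("bxaix85", "Running", memory_mb=4096, processing_units=1.0)
    blade14 = Target(free_memory_mb=8192, free_processing_units=2.0)
    print(validate(bxaix85, blade14) or "validation passed")
```

Returning every failed check, rather than stopping at the first, mirrors how a validation report is most useful to an administrator: all problems can be fixed in one pass.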
Once the validation completes successfully, a message stating it "might be possible" to migrate the LPAR appears (Figure 11). Click Migrate and the migration to the other Blade begins. Monitor the status of the migration by clicking the Refresh icon regularly (Figure 12).
Figure 11. Migrating the LPAR
Figure 12. Monitoring the status of the migration
On the target Blade, observe that a new LPAR has been created with the same name as the LPAR on the source Blade. It has a state of Migrating – Running, as shown in Figure 13.
Figure 13. The shell LPAR on the target, migrating
What happens during the partition migration phase?
During the active migration of the LPAR, state information is transferred from the source to the target system. This "state information" includes such things as partition memory, processor state, virtual adapter state, NVRAM, and the LPAR configuration. The following are just some of the events and actions that occur during the migration:
- A partition shell is created on the target system. This shell partition is used to reserve the resources required to create the inbound LPAR, namely its processor entitlements, memory configuration, and virtual adapter configuration.
- A connection between the source and target systems and their respective POWER Hypervisor is established through a device called the Virtual Asynchronous Service Interface (VASI) on the VIOS. The source and target VIOS use this new virtual device to communicate with the POWER Hypervisor to gain access to the LPAR's state and to coordinate the migration. You can confirm the existence of this device with the lsdev command on the VIOS.
Figure 14. The VASI device on the VIOS
The vasistat command displays the statistics for the VASI
device. Run this command on the source VIOS during the migration. You'll observe that
"Total Bytes to Transfer" indicates the size of the memory copy and that
"Bytes Left to Transfer" indicates how far the transfer has progressed, as shown in Figure 15.
Figure 15. The vasistat command
- The virtual target devices and virtual SCSI adapters are created on the target system. Using the lsmap command on the target VIOS, before the migration, you'll notice that there are no virtual SCSI or virtual target device mappings, as shown in the following figure.
Figure 16. lsmap on the target VIOS prior to migration
Running the same command after the migration shows that the virtual disk mappings have been automatically created, as part of the migration process.
Figure 17. lsmap on the target VIOS after migration
- The LPAR's physical memory pages are copied to the shell LPAR on the target system. Using the topas command on the source VIOS, you may observe some network traffic on the SEA (ent6) as a result of the memory copy.
Figure 18. topas on the source VIOS displaying SEA traffic
- Since the LPAR is still active, with SAP still running, its state continues to change while the memory is copied. Memory pages that are modified during the transfer are marked as dirty and must be sent again. This process is repeated until the number of pages marked as dirty is no longer decreasing. At this point, the target system instructs the Hypervisor on the source system to suspend the LPAR.
- The LPAR confirms the suspension by quiescing all its running threads. The LPAR is now suspended.
- During the LPAR suspension, the source LPAR continues to send partition state information to the target server. The LPAR is then resumed.
- The LPAR resumes execution on the target system. If the LPAR requires a page that has not yet been migrated, then it will be "demand-paged" from the source system.
- The LPAR recovers its I/O operations. A gratuitous ARP request is sent on all virtual Ethernet adapters to update the ARP caches on all external switches and systems on the network. The LPAR is now active again.
- When the target system receives the last dirty page from the source system, the migration is complete. The period between the suspension and resumption of the LPAR lasts just a few seconds. During my tests, I did not notice any disruption to the LPAR as a result of this operation.
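The repeated copying of dirty pages described above can be illustrated with a small simulation. This is a toy Python model under assumptions of my own, not the Hypervisor's actual algorithm: it assumes a fixed set of "hot" pages that are rewritten during every round, plus a fixed percentage of the pages just sent being dirtied again. The function name and all parameter values are hypothetical.

```python
def simulate_precopy(total_pages, hot_pages, redirty_percent, max_rounds=30):
    """Return (pages sent per round, pages still dirty at suspension).

    Round 1 sends every page. While a round is in flight, the running LPAR
    dirties `hot_pages` plus `redirty_percent` of the pages just sent, and
    those pages must be sent again. Copying stops as soon as the number of
    dirty pages is no longer decreasing.
    """
    rounds = []
    to_send = total_pages
    dirtied = to_send
    for _ in range(max_rounds):
        rounds.append(to_send)
        # Integer arithmetic keeps the page counts exact.
        dirtied = min(total_pages, hot_pages + to_send * redirty_percent // 100)
        if dirtied >= to_send:
            # No further progress: this is the point at which the LPAR
            # would be suspended and the remaining state transferred.
            break
        to_send = dirtied
    return rounds, dirtied


if __name__ == "__main__":
    # 4GB of memory in 4KB pages; the dirty-rate figures are invented.
    rounds, remaining = simulate_precopy(1_048_576, hot_pages=2000, redirty_percent=5)
    for number, pages in enumerate(rounds, start=1):
        print(f"round {number}: sent {pages} pages")
    print(f"pages still dirty at suspension: {remaining}")
```

In this model the first round dominates the total, and the later rounds shrink quickly until only the hot pages remain. That is consistent with what I observed: most of the elapsed time goes to the bulk memory copy, while the suspension itself is brief.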
With the memory copy complete, the Virtual I/O Server on the source system removes the virtual SCSI server adapters associated with the LPAR and removes any device to LUN mapping that existed previously.
The LPAR is then automatically deleted from the source Blade (Figure 19).
Figure 19. Source Blade view from the IVM after the migration. LPAR removed
The LPAR is now in a Running state on the target Blade, as shown in Figure 20.
Figure 20. Target Blade view from the IVM after migration. LPAR in running state
The migration is 100 percent complete, as shown in Figure 21, below.
Figure 21. Migration 100% complete
Now that the LPAR is running on the other Blade, run the lsconf command again to confirm that the serial number has changed with the physical hardware:
Figure 22. lsconf after the migration
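If you capture the lsconf output before and after the migration, the serial number comparison is easy to automate. The following is a minimal Python sketch, assuming the output has been saved as text and contains a "Machine Serial Number:" line. The function name is my own and the sample serial numbers are invented.

```python
import re


def machine_serial(lsconf_text):
    """Return the value of the 'Machine Serial Number' line, or None if absent."""
    match = re.search(r"^Machine Serial Number:\s*(\S+)", lsconf_text, re.MULTILINE)
    return match.group(1) if match else None


# Hypothetical excerpts; the serial numbers are made up.
BEFORE = "System Model: IBM,7998-61X\nMachine Serial Number: 06AAAAA\n"
AFTER = "System Model: IBM,7998-61X\nMachine Serial Number: 06BBBBB\n"

if __name__ == "__main__":
    old, new = machine_serial(BEFORE), machine_serial(AFTER)
    if old and new and old != new:
        print(f"LPAR moved: serial number changed from {old} to {new}")
    else:
        print("Serial number unchanged or not found")
```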
In order to confirm and verify that SAP and Oracle are not impacted by the migration, check the Oracle alert log for any errors. No errors are found, as shown in Figure 23.
Figure 23. Oracle alert log showing no errors during the migration
From within SAP, run the lsconf command before and after the migration to confirm that the physical server has changed:
Figure 24. lsconf from SAP, before and after the migration
My ssh login sessions on bxaix85 remained active, meaning I did not suffer any connectivity issues as a result of the live migration. The SAP team also did not notice any disruption to their SAP GUI client sessions or jobs running on the LPAR.
Mobility activity is logged on the LPAR and the source and target VIOS. Review the logs
with the errpt (AIX) and errlog (VIOS) commands. On AIX you'll notice messages similar
to CLIENT_PMIG_STARTED and CLIENT_PMIG_DONE. Additional information from DRMGR, on AIX, is also logged to syslog, for instance, Starting CHECK phase for partition migration. On the VIOS you'll find
messages relating to the suspension of the LPAR and the migration status (Client partition suspend issued and Migration
completed successfully).
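A quick way to confirm that a migration both started and finished is to search the detailed error report on AIX for the two labels mentioned above. The following is a minimal Python sketch, assuming the detailed report (as produced by errpt -a) has been saved as text and that each entry carries a "LABEL:" line. The function name is my own, and the sample text, including the identifiers, is invented.

```python
def migration_labels(errpt_detail_text):
    """Return the CLIENT_PMIG_* labels found, in the order they appear."""
    labels = []
    for line in errpt_detail_text.splitlines():
        if line.startswith("LABEL:"):
            label = line.split(":", 1)[1].strip()
            if label.startswith("CLIENT_PMIG_"):
                labels.append(label)
    return labels


# Hypothetical excerpt; the identifiers below are made up.
SAMPLE = """\
LABEL:          CLIENT_PMIG_DONE
IDENTIFIER:     00000001

LABEL:          CLIENT_PMIG_STARTED
IDENTIFIER:     00000002
"""

if __name__ == "__main__":
    found = set(migration_labels(SAMPLE))
    complete = {"CLIENT_PMIG_STARTED", "CLIENT_PMIG_DONE"} <= found
    print("migration started and completed" if complete else "migration incomplete")
```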
The final objective has been achieved. The LPAR is now running on a different physical server (Figure 25). You are now able to perform scheduled maintenance activities on the Blade and SAP will not suffer any down time as a result of this activity.
Figure 25. JS22 configuration after Live Partition Mobility
The migration took roughly two minutes to complete. The LPAR being moved was configured with 4GB of memory. Most of the time required for the migration was spent copying the LPAR's memory from the source to the target. The "suspend" of the LPAR itself lasted no more than two seconds. Consider using a high-performance network between the source and target systems. Also, prior to the migration, I'd recommend reducing the LPAR's memory update activity. Taking these steps will help to improve the overall performance of the migration. We used a 1Gb (gigabit) network within our Blade environment. For larger System p servers (570 and 595), we are considering using a 10Gb network when we start moving systems with a large amount of memory (80GB or more).
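Some rough arithmetic shows why the network matters. The following Python sketch computes a lower bound on the time needed to send an LPAR's memory across the link once. It assumes the network speeds quoted above are in gigabits per second, and it ignores protocol overhead and the re-sending of dirty pages, so real migrations will take longer. The function name is my own.

```python
def min_copy_seconds(memory_gib, link_gbit_per_s):
    """Lower bound, in seconds, to send `memory_gib` of memory over the link once."""
    bits = memory_gib * 1024**3 * 8
    return bits / (link_gbit_per_s * 1e9)


if __name__ == "__main__":
    for memory_gib, link in [(4, 1), (80, 1), (80, 10)]:
        seconds = min_copy_seconds(memory_gib, link)
        print(f"{memory_gib}GB over {link}Gb/s: at least {seconds:.0f} seconds")
```

For the 4GB LPAR in this article, the bound is a little over half a minute on a 1 gigabit link, which is consistent with an observed migration time of roughly two minutes once overhead and dirty pages are included. For an 80GB LPAR the same bound is over eleven minutes at 1 gigabit per second and a little over one minute at 10 gigabits per second.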
Live Partition Mobility has enormous potential for dramatically reducing scheduled downtime for system maintenance activities. Being able to perform scheduled activities, like preventative hardware maintenance or firmware updates, without disruption to user applications and services is a significant enhancement to any System p environment. Additionally, this technology can assist in managing workloads within a System p landscape. It gives administrators the power to adjust resource usage across an entire farm of System p servers. LPARs can be moved to different physical servers to help balance workload demands.
Future System p hardware migrations (POWER6 to the next generation of POWER servers) and consolidation efforts may also be improved greatly, as existing LPARs could simply be moved to a new platform with little effort on the part of the administrator.
I encourage the reader to review the material listed under the resources section in order to learn a great deal more about Live Partition Mobility and the JS22 Blade.
Resources
Learn
- PowerVM Live Partition Mobility on IBM System p: This Redbook assists you in understanding, planning, preparing for, and performing partition migration on IBM System p servers running AIX.
- Live Partition Mobility Support for POWER Systems: Provides installation and configuration information to help guide you through the process of installing and configuring Live Partition Mobility.
- Get Support for BladeCenter JS22
- Integrated Virtualization Manager on IBM System p5: This Redpaper provides an introduction to IVM by describing its architecture and showing how to install and configure a partitioned server using its capabilities.