Configuration of Enterprise Pool for Capacity on Demand with IBM Geographically Dispersed Resiliency for Power Systems solution

In this article, we introduce you to the newly released disaster recovery (DR) solution, IBM® Geographically Dispersed Resiliency for Power Systems™, and explain how to use it seamlessly with Power Enterprise Pools. We also explain the concepts involved in Capacity on Demand (CoD) and then walk you through the steps involved in configuring it from both the Hardware Management Console (HMC) GUI and the command line.

What are Power Enterprise Pools?

You can learn about Power Enterprise Pools in the IBM Knowledge Center and IBM Power Systems – Capacity on Demand Redbooks.

Brief excerpt:

"Power Enterprise Pool provides flexibility and value for Power Systems. A Power Enterprise Pool is a group of systems that can share Mobile Capacity on Demand (CoD) processor resources and memory resources. You can move Mobile CoD resource activations among the systems in a pool with Hardware Management Console (HMC) commands. These operations provide flexibility when you manage large workloads in a pool of systems and helps to rebalance the resources to respond to business needs. This feature is useful for providing continuous application availability during maintenance. Not only can the workloads be easily moved to alternate systems but the processor activations and memory activations can be moved. Disaster recovery planning also is more manageable with the ability to move activations where and when they are required."

What is IBM Geographically Dispersed Resiliency for Power Systems?

The IBM Geographically Dispersed Resiliency for Power Systems solution is a disaster recovery solution that is easy to deploy and provides an automated process to recover production-site virtual machines (VMs) at the disaster recovery site. Because disaster recovery of applications and services is a key component of business continuity, the IBM Geographically Dispersed Resiliency solution helps customers automate the disaster recovery process during a failure. Disaster recovery solutions are mainly based on either cluster-based technology or virtual machine restart-based technology. This solution provides an easy deployment model that uses a controller system (called KSYS) to monitor the entire virtual machine environment. It also provides flexible failover policies and storage replication management.

You can learn more about Geographically Dispersed Resiliency for Power Systems in the IBM developerWorks® wiki documents: Why Geographically Dispersed Resiliency is the ideal DR solution for Power Systems and FAQ.

How can enterprise pools and Geographically Dispersed Resiliency work together to optimize resources?

When you have a production (active) site and a recovery (backup) site, you should plan to provision the majority of your resources in the production site and keep only a fraction of the resources in the recovery site. Keeping minimal resources in the recovery site results in considerable savings on hardware resource costs. However, if a disaster hits the production site, you should be able to move the hardware resources to the recovery site so that all the workloads that were running on the production site can be brought online on the recovery site. This is where enterprise pools come into the picture. Using enterprise pools, you can pool the majority of your resources into a resource pool and have them allocated to the production site during normal times, but move them from the production site to the recovery site when a disaster occurs.

Let's look at the detailed steps to configure capacity on demand enterprise pool along with the Geographically Dispersed Resiliency solution.

What are some of the key Geographically Dispersed Resiliency terminologies required to understand this article?

KSYS: KSYS stands for C(K)ontroller System logical partition (LPAR). It is the LPAR (currently an IBM AIX® LPAR) where the Geographically Dispersed Resiliency software is deployed. KSYS acts as the orchestrator that monitors, manages, and moves VMs from one site to another.

ksysrppmgr command: To manage the resource allocations in a disaster recovery environment, the Geographically Dispersed Resiliency for Power Systems solution provides a resource pool provisioning (RPP) command, ksysrppmgr. The ksysrppmgr command adjusts available resources on the managed hosts; you need not check the current available resources.

Figure 1. Reference environment to illustrate the steps involved in configuring CoD and using Geographically Dispersed Resiliency
  • Site1: Production site is the site where the workloads are running on numerous VMs (LPARs).
    • HMC_1: It is the HMC for managed systems in the production site. (vmhmc1 is the production site HMC in the hardware used in this article.)
    • Host_1: It is the managed system that is used by the production site to host the VMs. (kumquat_9179-MHD-105E67P is the production site managed system in the hardware used in this article.)
  • Site2: Recovery site is the site that acts as the backup for the production workload in case of a disaster event or planned maintenance of the production site.
    • HMC_2: It is the HMC for managed systems in the recovery site. (vmhmc3 is the recovery site HMC in the hardware used in this article.)
    • Host_2: It is the managed system that is used to host the VMs in the recovery site when the workloads are switched over. (orange-9179-MHD-SN107895P is the recovery site managed system in the hardware used in this article.)

Configuring the Enterprise Pool Capacity on Demand pool

Configuration steps detailed in this document minimally require the following versions:

  • KSYS LPAR should have at least AIX 7.2 TL01 SP01
  • HMC version 8.6

Step 1:

Log in to the HMC command line as the root user and copy the enterprise pool configuration file obtained from IBM. The configuration file contains the Power Enterprise Pool membership activation code for each of the systems in the pool along with the mobile processor activation code and mobile memory activation code for the pool. Refer to Ordering Power Enterprise Pools on how to obtain the file.

Step 2:

Generate the public and the private key on the HMC to be configured using the following command:

/opt/hsc/bin/hscSignal 373 <private key path> <public key path>
Figure 2. Generating public and private keys

Step 3:

Generate the signed XML file from the text XML file you copied in step 1 using the following command.

 /opt/hsc/bin/hscSignal 374 <unsigned pool config file path> <private key path> <public key path> <signed pool config file path>
Figure 3. Generating a signed XML file
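
Steps 2 and 3 can be chained in a small wrapper. The following is a minimal sketch that only prints the two hscSignal commands shown above so that they can be reviewed before being run on the HMC. The function name print_signing_cmds and the file paths are hypothetical and are not part of the HMC tooling.

```shell
#!/bin/sh
# Dry-run sketch: print the key-generation and signing commands from steps 2 and 3.
# The function name and the example paths are hypothetical; nothing is executed on the HMC.
print_signing_cmds() (
    unsigned=$1   # unsigned pool configuration file obtained from IBM
    priv=$2       # path where the private key is written
    pub=$3        # path where the public key is written
    signed=$4     # path where the signed XML file is written

    # Step 2: generate the key pair
    printf '%s\n' "/opt/hsc/bin/hscSignal 373 $priv $pub"
    # Step 3: sign the configuration file with that key pair
    printf '%s\n' "/opt/hsc/bin/hscSignal 374 $unsigned $priv $pub $signed"
)

print_signing_cmds /tmp/pool.xml /tmp/pool.key /tmp/pool.pub /tmp/pool_signed.xml
```

Printing the commands first keeps the key paths consistent between the two steps, because the same private and public key paths must be passed to both invocations.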

Step 4:

Set up the master HMC. The first HMC becomes the master of the EPCoD pool. Provide the signed XML generated in step 3 to create the pool. Host_1 gets added to HMC_1 and HMC_1 acts as the master HMC.
Create a pool and set it as the master HMC using the following command:

mkcodpool -p EPCOD_NAME -f <signed pool config file>
Figure 4. Creating a pool

Step 5:

Set the backup-site HMC by first adding the backup-site HMC to the pool and then adding the host belonging to backup site. Here, add HMC_2 as the backup site HMC followed by adding Host_2 to HMC_2.

Add the backup-site HMC using the following command:

        chcodpool -o add -p <poolname> --mc <hmcname/hmcip> -u <username> --passwd <password> --force

Add the backup-site host using the following command:

        chcodpool -o update -p <poolname> -f <signed pool config file>
Figure 5. Adding backup-site HMC to the pool
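
The order of the commands in steps 4 and 5 matters: the pool is created first, then the backup-site HMC is added, and only then is the pool updated to bring in the backup-site host. The following sketch prints that sequence for review. The function name, the user name, and the file path are hypothetical examples, and the password is deliberately left as a placeholder.

```shell
#!/bin/sh
# Dry-run sketch: print the pool setup commands from steps 4 and 5 in order.
# Function name, user name, and file path are hypothetical; the password stays a placeholder.
print_pool_setup() (
    pool=$1         # enterprise pool name
    signed=$2       # signed pool configuration file from step 3
    backup_hmc=$3   # name or IP address of the backup-site HMC
    user=$4         # user on the backup-site HMC

    # Step 4: create the pool on the master HMC
    printf '%s\n' "mkcodpool -p $pool -f $signed"
    # Step 5: add the backup-site HMC to the pool
    printf '%s\n' "chcodpool -o add -p $pool --mc $backup_hmc -u $user --passwd <password> --force"
    # Step 5: update the pool so that the backup-site host is added
    printf '%s\n' "chcodpool -o update -p $pool -f $signed"
)

print_pool_setup EPCOD_NAME /tmp/pool_signed.xml vmhmc3 exampleuser
```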

Step 6:

Verify that the pool is created with both the master HMC and the backup-site HMC and their respective hosts:
List the pool information at the managed system level using the following command:

 lscodpool -p <poolname/poolid> --level sys
Figure 6. Listing the pool information at different levels

List the pool information at the management console (HMC) level using the following command:

 lscodpool -p <poolname/poolid> --level mc
Figure 7. Listing the detailed pool information
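
When the verification is scripted, individual values can be pulled out of the listing. The following sketch assumes that lscodpool prints each record as comma-separated name=value pairs; the sample record and its attribute names are made up for illustration and are not captured output. Quoted values that contain commas are not handled.

```shell
#!/bin/sh
# Sketch: extract one attribute from a record of comma-separated name=value pairs.
# Assumption: lscodpool prints records in this form. Quoted values containing
# commas are not handled by this simple split.
get_attr() (
    key=$1      # attribute name to look up
    record=$2   # one output line
    printf '%s\n' "$record" | tr ',' '\n' | sed -n "s/^${key}=//p"
)

# Made-up sample record; the attribute names are assumptions, not captured output.
sample='name=kumquat_9179-MHD-105E67P,mobile_procs=8,unreturned_mobile_procs=0'
get_attr mobile_procs "$sample"
get_attr unreturned_mobile_procs "$sample"
```

The pattern is anchored at the start of each pair, so looking up mobile_procs does not also match unreturned_mobile_procs.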

To verify the pool configuration from GUI:

In the Hardware Management Console GUI, on the left pane, expand Systems Management -> Power Enterprise Pool. On the right pane, in the Power Enterprise Pool Management section, click Managing HMCs.

Figure 8. Verifying pool information through GUI

Using Geographically Dispersed Resiliency with Enterprise Pools CoD

Enterprise Pools CoD can be used with both planned and unplanned Geographically Dispersed Resiliency recoveries.

Planned DR: A planned move is an operation in which an administrator initiates a move when there is no disaster event and the resources in the active site can be shut down gracefully. These types of operations are initiated mainly to perform a DR test drill, move from one site to another, or when one of the sites needs to be taken offline for maintenance.

Unplanned DR: In an unplanned DR scenario, a disaster such as a power failure has brought down the active site, and it can no longer be reached from the backup site. In such a situation, the VMs need to be started on the backup site and the software stack brought back online to resume the business applications. Because the disaster has taken down the active site, the resources in the active site are no longer reachable and cannot be automatically released back into the enterprise pool by Geographically Dispersed Resiliency (KSYS). After the active site is up again, the admin can use KSYS to manually initiate a cleanup of the VMs of the active site.

The following steps need to be performed by the admin to recover from DR:

  1. Return the resources from the active site to the enterprise pool.
  2. Allocate the resources to the backup site.
  3. Initiate a move from active to backup site.
  4. In case the move was unplanned, clean up the resources in the active site after it has recovered from the disaster.

Let us now take a detailed look at the steps needed to perform the above activities. In the following steps, we refer to processor units as the resources; however, the same steps apply to memory as well.

Step 1: Return the resources from the active site to the Enterprise Pool

Let's assume that the recovery site is short of eight processor units. We need to remove eight processor units from the managed system (kumquat_9179-MHD-105E67P) belonging to the production site, so that they can be added to the managed system in the recovery site (orange-9179-MHD-SN107895P) in the next step.


In the Hardware Management Console GUI, on the left pane, expand Systems Management -> Power Enterprise Pool. On the right pane, in the Power Enterprise Pool Management section, click Processor Resources.

When you open the enterprise pool page to manage resources, you will observe that the resources are currently being used by the hosts in the production site.

Figure 9. Reducing the resources from active site

When the number of resources is reduced at the production site, the enterprise pool tracks these resources as "unreturned resources" against the production site.

Figure 10. Tracking unreturned resources

Using the command line:

The ksysrppmgr command can be used to add resources to or remove resources from the pool. Set the action to "e" or "execute" to carry out the resource requests, or to "c" or "check" to simulate whether the resource requests would be satisfied.

Figure 11. CLI for reducing and adding resources to and from the pool

Authenticate the HMCs before using the command line:

 hmcauth -u <username> -p <password> -a <HMCname>
Figure 12. HMC authorization

Use the ksysrppmgr command to allocate resources:

 ksysrppmgr -o c -h [<HMCname>]:<hmcuri>:<username> -m <managedsystem>:<action>::<proc amount> -e <poolname> -v -r
Figure 13. Command to allocate resources using ksysrppmgr
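
A common pattern is to run the request in check mode first and switch to execute mode only after the check succeeds. The following sketch assembles the ksysrppmgr command line from the template shown above and prints it. The function name rpp_cmd is hypothetical, the field layout simply mirrors the template in this article, and the values for the HMC URI, user name, managed-system action, and pool name are left as placeholders because they depend on your environment.

```shell
#!/bin/sh
# Dry-run sketch: build a ksysrppmgr command line from the template in this article.
# mode selects check (c) or execute (e); all other values are passed through unchanged.
# The function name is hypothetical and nothing is executed.
rpp_cmd() (
    mode=$1; hmc=$2; uri=$3; user=$4; ms=$5; action=$6; procs=$7; pool=$8
    case $mode in
        c|check|e|execute) ;;
        *) echo "mode must be c, check, e, or execute" >&2; exit 1 ;;
    esac
    printf '%s\n' "ksysrppmgr -o $mode -h $hmc:$uri:$user -m $ms:$action::$procs -e $pool -v -r"
)

# Print the check-mode request first, then the same request in execute mode.
rpp_cmd c vmhmc3 '<hmcuri>' '<username>' orange-9179-MHD-SN107895P '<action>' 8 '<poolname>'
rpp_cmd e vmhmc3 '<hmcuri>' '<username>' orange-9179-MHD-SN107895P '<action>' 8 '<poolname>'
```

Building both command lines from the same arguments ensures that the request that is executed is the one that was checked.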

Step 2: Allocate the resources to the backup site

In this step, we will allocate the resources released in the previous step to the managed system (orange-9179-MHD-SN107895P) in the recovery site.

The resources allocated to the backup site will be over-committed licensed resources.

In the Hardware Management Console GUI, on the left pane, expand Systems Management -> Power Enterprise Pool. On the right pane, in the Power Enterprise Pool Management section, click Processor Resources.

Figure 14. Allocating resources to backup site from GUI
Figure 15. Allocating resources to backup site using command line

Step 3: Initiate a move from active to backup site

After initiating a DR from the active site, the VMs migrate from the production site to the recovery site.


When the migration completes successfully in a planned DR, cleanup of the VMs on the active site is done automatically, and the resources tracked as unreturned are returned to the pool. When we invoke an unplanned DR, the cleanup of the active site VMs must be done manually, after which the resources are assigned back to the pool.

Use the following command to invoke the cleanup of the site manually:

ksysmgr cleanup site <backup sitename>

Use the following command to invoke a site move from the active site to the backup site:

ksysmgr move site from=<active site name> to=<backup sitename>
Figure 16. Pool configuration after the migration and cleanup
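
The difference between the planned and unplanned cases can be sketched as follows. The function prints the ksysmgr commands in the order in which they would be run. It assumes that, after an unplanned move, the site to clean up is the former active site, which acts as the backup site once the move has completed. The function name and the site names are hypothetical.

```shell
#!/bin/sh
# Dry-run sketch: print the ksysmgr commands for a planned or unplanned site move.
# Assumption: after an unplanned move, the site to clean up is the former active
# site, which acts as the backup site once the move has completed.
print_dr_cmds() (
    kind=$1     # "planned" or "unplanned"
    active=$2   # site that currently runs the workloads
    backup=$3   # site that takes over the workloads

    printf '%s\n' "ksysmgr move site from=$active to=$backup"
    if [ "$kind" = "unplanned" ]; then
        # Run this only after the failed site is reachable again
        printf '%s\n' "ksysmgr cleanup site $active"
    fi
)

print_dr_cmds unplanned Site1 Site2
```

For a planned move, only the move command is printed, because the cleanup is done automatically.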

Helper scripts: resource_allocation.sh, resource_reallocation.sh

As you may have observed, steps 1 and 2 are manual activities that you must perform to reallocate resources from the production site to the recovery site. We have written some simple scripts that automate these activities: allocating the resources from the active site to the backup site and reallocating them. These scripts can also be registered with KSYS, which then calls them during the pre-verification and post-verification operations and during the pre-move operation of DR.

You can use these scripts in the following ways:

  • Use the scripts manually:
    • Run the resource allocation script before verification, as it assigns the resources from the active site to the backup site.
    • Invoke a verification process. Make sure that the verification process succeeds without any capacity check errors. This can be done using the following command:
                ksysmgr verify site <active sitename>
    • Run the resource re-allocation script after verification, as it releases the resources from the backup site to the active site.
    • Run the resource allocation script again, if the admin needs to proceed with the DR operation.
  • Use the resource allocation and reallocation scripts by registering them with KSYS.

These scripts can be registered with KSYS with the following commands:

  • Registering for pre-verify:
                ksysmgr add script entity=site pre_verify="/opt/IBM/ksys/samples/resouce_allocation.sh"
  • Registering for post-verify:
                ksysmgr add script entity=site post_verify="/opt/IBM/ksys/samples/resouce_reallocation.sh"
  • Registering for pre-offline:
                ksysmgr add script entity=site pre_offline="/opt/IBM/ksys/samples/resouce_allocation.sh"
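
The three registrations differ only in the hook name and the script that is attached to it, so they can be generated from one directory argument. The following sketch prints the registration commands for review. The function name is hypothetical, and the script file names are taken verbatim from the paths shown in this article.

```shell
#!/bin/sh
# Dry-run sketch: print the three ksysmgr registration commands for a script directory.
# The function name is hypothetical; file names are copied from the paths in this article.
print_registrations() (
    dir=$1   # directory that holds the helper scripts
    printf '%s\n' "ksysmgr add script entity=site pre_verify=\"$dir/resouce_allocation.sh\""
    printf '%s\n' "ksysmgr add script entity=site post_verify=\"$dir/resouce_reallocation.sh\""
    printf '%s\n' "ksysmgr add script entity=site pre_offline=\"$dir/resouce_allocation.sh\""
)

print_registrations /opt/IBM/ksys/samples
```

The allocation script is attached to both the pre-verify and pre-offline hooks, while the reallocation script runs only after verification.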

Note:

  • The scripts must be run on the KSYS node, if used manually.
  • The script queries the production and recovery site HMCs to fetch resource information. The script requires user name and password of the production and recovery site HMCs. These (SOURCEHMCUSER, TARGETHMCUSER, SOURCEPASSWD, TARGETPASSWD) can be set as variables in the beginning of the script.
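
Because the scripts depend on all four credential variables, it can help to confirm that they are set before a script contacts the HMCs. The following sketch is not part of the shipped scripts; the function name require_hmc_credentials and the example values are hypothetical.

```shell
#!/bin/sh
# Sketch: fail early if any of the four HMC credential variables is empty or unset.
# The function name is hypothetical and is not part of the shipped helper scripts.
require_hmc_credentials() (
    for name in SOURCEHMCUSER TARGETHMCUSER SOURCEPASSWD TARGETPASSWD; do
        # Indirectly read the variable whose name is stored in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            exit 1
        fi
    done
)

# Example with placeholder values
SOURCEHMCUSER=user1
TARGETHMCUSER=user2
SOURCEPASSWD=secret1
TARGETPASSWD=secret2
require_hmc_credentials && echo "credentials present"
```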

Conclusion

To summarize, Enterprise Pool for Capacity on Demand and Geographically Dispersed Resiliency work together seamlessly to optimize hardware resources between production and recovery sites by enabling capacity checks and smoothly migrating VMs between sites.


Published: January 31, 2017