IBM PowerHA SystemMirror HyperSwap with Metro Mirror

IBM® HyperSwap® with Metro Mirror is a new feature in IBM PowerHA® SystemMirror 7.1.2 Enterprise Edition. It provides continuous availability against storage errors and prevents an unnecessary application fallover to a disaster recovery site when only the storage fails. This article introduces HyperSwap with PowerHA and explains in detail how to plan for and configure a PowerHA Enterprise Edition cluster with HyperSwap and Metro Mirror.

Kunal Langer (kunal.langer@in.ibm.com), Technical Consultant, IBM

Kunal Langer works as a Power Systems Technical Consultant in Systems and Technology Group Lab Based Services (LBS) based out of India. He has more than six years of experience in AIX and PowerHA development, testing, and support and demonstrated expertise in PowerHA SystemMirror installation, configuration, administration, testing, and development. He has experience in interacting with customers and handling customer-critical situations. You can contact Kunal at kunal.langer@in.ibm.com.



19 August 2013

Introduction

Data center and service availability is one of the most important topics in IT infrastructure and draws more attention each day. Replicating data between sites is a good way to minimize business disruption, because backup and restore operations can take too long to meet business requirements, and the equipment might be damaged and, depending on the extent of the disaster, might not be available for restoring data. Recovery options vary in cost, from the least expensive (with the longest recovery time) to the most expensive (with the shortest recovery time and the closest to zero data loss).

PowerHA SystemMirror 7.1.2 Enterprise Edition provides one such disaster recovery and high availability solution. It helps automate the response to node failures and application events, and it automates recovery actions on storage failures for selected storage. It controls storage replication between sites (separate data centers) and enables recovery from entire site failures by ensuring that the copies are in a consistent state for the failover, which lets you build a disaster recovery solution.

HyperSwap is one of the offerings in the PowerHA SystemMirror 7.1.2 Enterprise Edition portfolio. It provides continuous availability against storage errors and is based on storage-based synchronous replication [Peer-to-Peer Remote Copy (PPRC), or Metro Mirror]. When directed (or upon disk failure), the IBM AIX® host accessing the primary disk subsystem can transparently switch over to the backup copy of the data so that consumers of the disks (such as middleware) are not affected.


Background

HyperSwap was originally introduced in GDPS a few years ago for Metro Mirror (synchronous PPRC) environments, where it enhances the resilience of Parallel Sysplex by facilitating the immediate switching of PPRC-mirrored disk subsystems.

The HyperSwap technology enables the host to transparently switch an application's I/O operations to the secondary Metro Mirror volumes, provided physical connectivity exists between the host and the secondary storage subsystem. This affords the ability to provide continuous operations from a single site or from multiple locations within metro distances. By implementing HyperSwap, disk failures and maintenance functions can be endured without incurring any interruption to the application service.

This solution can offer better disaster recovery solutions for the customer and can demonstrate close integration of PowerHA with IBM storage.

The HyperSwap technology can enable PowerHA SystemMirror to support the following capabilities for the customers:

  • Eliminate primary disk subsystems as the single point of failure to provide the next level of continuous operations support within metro distances.
  • Enable storage maintenance without any application downtime.
  • Enable migration from old to new storage.

All of these use cases fall into one of the two types of HyperSwap activity:

  • Unplanned HyperSwap: When the primary storage fails, the OS that hosts the application detects and reacts to the event by performing a PPRC failover such that the application's I/O activities are transparently redirected to the secondary storage subsystem, thereby allowing the application to continue running without any interruption. In this case, errors are detected by the operating system's Small Computer System Interface (SCSI) disk drivers, and a decision is made across multiple hosts to switch over to the secondary storage subsystem as a whole. For the duration of the swap, I/O activity is temporarily frozen. During this time, the applications do not experience failures, but instead experience non-lethal delays.
  • Planned HyperSwap: In this case, the administrator deliberately initiates a HyperSwap from the primary to the secondary storage subsystem. When the administrator requests a planned HyperSwap, I/O activity is frozen after coordination across the hosts in the cluster. The swap is performed, and then I/O operations are allowed to continue. Planned HyperSwap is helpful for performing maintenance tasks on the primary storage and also for migrating data from old storage to a newly purchased storage subsystem.
Figure 1: PowerHA SystemMirror HyperSwap configuration example

AIX support for HyperSwap

Figure 2 shows the components supporting HyperSwap.

Figure 2: AIX components for HyperSwap

HyperSwap-related components of AIX include:

  • Cluster Aware AIX (CAA)
    • Orchestrate cluster-wide actions
  • PowerHA HyperSwap kernel extension
    • Work with CAA to coordinate actions with other nodes
    • Analyze the messages from PowerHA framework and AIX storage framework and take proper actions
    • Determine the swap action
  • AIX storage framework
    • Work with the AIX interface with storage
    • Work closely with PowerHA HyperSwap kernel extension
    • Manage the status of the storage
    • Inform PowerHA HyperSwap kernel extension about I/O errors
    • Get swap decision from PowerHA HyperSwap kernel extension and send order to AIX PCM (MPIO)

Benefits

HyperSwap support with PowerHA provides the following benefits:

  • Provides continuous availability against storage failures
  • Substitutes secondary storage to take the place of a failed primary device
  • Is non-disruptive and keeps the application running
    Figure 3: PowerHA cluster HyperSwap support
  • Is transparent to the application
    Figure 4: HyperSwap disk representation
  • Enables consistency group management across IBM System Storage® DS8000® systems
  • Provides HyperSwap support for critical system disks, including:
    • Rootvg
    • Paging devices
    • Dump devices
    • Repository disks
  • Provides disk grouping support
  • Provides support for both AIX Logical Volume Manager (LVM) and raw disks
    • Disk or VG replication
    • Disk error handling
    • Oracle can be deployed with LVM or Automatic Storage Management (ASM) disks
  • Provides support for multisite deployments
    • Compute node outages
      • Active-active workload provides continuous availability
    • Storage outages
      • HyperSwap provides continuous availability
    • Active-passive sites
      • Active-active workload within a site
      • Active-passive across sites
      • Continuous availability for site storage outages
Figure 5: Active-passive HyperSwap

Requirements

This section lists the hardware and software requirements for PowerHA SystemMirror Enterprise Metro Mirror HyperSwap support.

Hardware requirement

Hardware requirements for PowerHA SystemMirror HyperSwap include:

  • DS8800 storage devices
  • Power firmware level: IBM POWER7® or above
  • DS8800 firmware: 6.3 or higher, microcode: 86.30.49.0 or higher
  • Metro Mirror license
  • Storage area network (SAN) connectivity between storage subsystems
  • Fibre Channel (FC) connection between the storage subsystems for Metro Mirror

Software requirement

Software requirements for configuring HyperSwap with PowerHA include:

  • PowerHA Enterprise Edition version 7.1.2
  • AIX 6.1 TL8 or AIX 7.1 TL2
  • IBM DSCLI 7 or later

Considerations

While planning for HyperSwap with PowerHA, you need to make a note of the following considerations.

  • HyperSwap for PowerHA is supported only on IBM DS8800 devices
  • Concurrent workloads across sites, such as Oracle RAC, are currently not supported. Note that this might change in future releases.
  • SAN connectivity is required between the storage subsystems.
  • DS8800 Metro Mirror (in-band) functions including HyperSwap are not supported on virtual SCSI (VSCSI).
  • To use Live Partition Mobility (LPM), you must disable HyperSwap for all online mirror groups. After LPM is complete, enable HyperSwap.
  • Disk replication relationships must adhere to a one-to-one relationship between the underlying logical subsystems (LSSs).
  • SCSI reservations are not supported in mirror groups that use the HyperSwap functions.
  • Swap time must be calculated. This is the amount of I/O delay time, in seconds, that PowerHA causes while performing a HyperSwap operation on a mirror group. The swap timeout value is specific to each mirror group in a cluster. The swap timeout for planned HyperSwap is 120 seconds, and it cannot be changed. The swap timeout for unplanned HyperSwap is between 0 and 180 seconds. Factors to consider when determining the swap timeout for unplanned HyperSwap are:
    • Number of nodes where the application is hosted. A greater number of nodes means that more information is shared.
    • Network latency and application network usage.
    • Number of disks that are used by the application.
    • I/O response time requirements for the application.
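
The timeout limits above are easy to check before a value is entered for a mirror group. The following is a minimal sketch, not part of PowerHA: the function name and variable names are hypothetical, and the limits are simply the ones stated in this article (120 seconds fixed for planned swaps, 0 to 180 seconds for unplanned swaps).

```shell
# Limits as stated in this article; they are not read from PowerHA itself.
PLANNED_SWAP_TIMEOUT=120          # fixed, cannot be changed
UNPLANNED_SWAP_TIMEOUT_MAX=180    # upper bound for unplanned swaps

# Hypothetical helper: succeeds only if $1 is a whole number of
# seconds in the range 0..180.
is_valid_unplanned_timeout() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
    esac
    [ "$1" -le "$UNPLANNED_SWAP_TIMEOUT_MAX" ]
}

# Usage:
# is_valid_unplanned_timeout 30 && echo "30 seconds is within range"
```

Choosing the actual value within that range still depends on the factors listed above; the check only guards against values that the configuration would reject.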

Performance considerations

It is important to make a note of the following performance considerations.

  • HyperSwap processing is performed in a time-bound manner. The timing characteristics are tunable and are enforced by the AIX storage framework.
  • While processing a planned or unplanned HyperSwap, network communication to other cluster nodes is done as a best-effort operation. These communications are expected to be lightweight, and every effort is made to keep the response time low.
  • Planned failover for resource groups that use DS8800 in-band Metro Mirror is expected to complete substantially faster because of in-band communication. Out-of-band performance has traditionally been suboptimal (due to DSCLI performance issues).

Implementation considerations

Keep the following considerations in mind while planning the implementation:

  • I/O freeze operation on DS8800 operates on the whole LSS. If a single DS8800 LSS contains PPRC volumes from more than one application and if one of the replication links goes down, all the PPRC paths will get destroyed. And, if some of these applications are not managed by PowerHA, then some PPRC paths will have to be manually recreated by the customer.
  • The PowerHA rediscovery utility must be run after any storage-level PPRC configuration change, that is, whenever PPRC paths are added, removed, or changed. Furthermore, a HyperSwap function performed (or automatically triggered) during this time window can cause unexpected behavior.
  • Disk replication relationships must adhere to a one-to-one relationship between the underlying LSSs.
  • Enabling HyperSwap for the repository disk requires an alternate disk to be specified.
  • Applications using raw disks are expected to open all the disks up front to enable the HyperSwap capability.
  • HyperSwap does not automatically transfer the SCSI reservations (if any) from the primary to the secondary disks.

Restrictions and limitations

While planning for HyperSwap with PowerHA, keep the following limitations and current restrictions in mind:

  • HyperSwap with PowerHA is not supported on VSCSI.
  • It is supported on IBM DS8800 systems and higher only.
  • Storage-level PPRC relationships and PPRC paths must be predefined (before PowerHA configuration).
  • Freeze operation on DS8800 operates on the whole LSS. If a single DS8800 LSS contains PPRC volumes from more than one application and if one of the replication links goes down, all the PPRC paths will get destroyed. If some of these applications are not managed by PowerHA, then some PPRC paths will have to be manually recreated by the customer.
  • The PowerHA Rediscovery utility must be run after any storage-level PPRC configuration change (such as add, remove, or change of new PPRC paths) is performed. Furthermore, the HyperSwap function performed (or automatically triggered) during this time window can cause unexpected or undesired behavior.
  • Support for Live Partition Mobility (LPM) is available subject to the limitations of the base AIX device driver support.
  • A HyperSwap enable or disable operation on the rootvg mirror group applies to all nodes of the cluster. Similarly, a HyperSwap enable or disable operation on the repository disk mirror group applies to both sites.
  • Disk replication relationships must adhere to a one-to-one relationship between the underlying LSSs.
  • Enabling HyperSwap for the repository disk requires an alternate disk to be specified.
  • Applications using raw disks are expected to open all the disks up front, and the HyperSwap capability might not be available until this condition is met.
  • For disks managed by PowerHA in-band/HyperSwap support, PPRC operations performed outside of PowerHA are not supported and might cause undefined or unexpected results.
  • Workloads that concurrently access the same (primary) PPRC disk but span multiple sites are not supported. This might change in the future.
  • Active-active workloads that perform I/O to both PPRC primary and PPRC secondary volumes are not supported.

Initial disk configuration

Before you start, make a note of the following points:

  • The AIX Path Control Module (PCM) driver is used. Enter the following command to configure all disks that are part of the storage system to use the AIX_AAPCM driver. A restart is required.
    manage_disk_drivers -d device -o AIX_AAPCM
  • SCSI reservations are not supported for disks that are used in a HyperSwap mirror group. Verify that no disk reservations are set.
    devrsrv -c query -l hdisk_name

    The command returns the following data:
    ODM Reservation Policy : NO RESERVE
    Device Reservation Policy : NO RESERVE
  • To create HyperSwap disks, prepare the disk pairs in the storage subsystems and AIX before configuring them in PowerHA.
    1. Select two disks, one from each DS8800 storage subsystem, to make the PPRC pair (for example, hdiskA and hdiskB).

      Disks that are already in use can be used for HyperSwap; however, special care must be taken to ensure data integrity.

      The lshostvol.sh command located in /opt/ibm/dscli/bin displays the disk attributes, including the storage system LSS ID. The volume ID contains the following data:

      <vendor_name>.<storage_type>-<serial_number>/<LSS_ID><volume_ID>

      Example: IBM.2107-75TL771/BC00

      To create a PPRC pair, we need the WWNN of both storage systems, which can be obtained with the lssi command on each storage system.

      We also need to know the I/O ports available to connect this pair of disks. They can be obtained by using the lsavailpprcport command.

    2. Establish the connection path from hdiskA to hdiskB using the mkpprcpath DSCLI command, and check the status using the lspprcpath command.

      Syntax:

      /opt/ibm/dscli/dscli mkpprcpath -dev <Local Storage ID> -srclss <Source LSS ID>
      -tgtlss <Target LSS ID> -remotewwnn <Remote Storage WWNN> <IO Port1>:<IO Port2>

      Example:

      /opt/ibm/dscli/dscli mkpprcpath -dev IBM.2107-75TL771
      -srclss 9A -tgtlss BC -remotewwnn 50050763081B06D4 I0102:I0334
    3. Establish the connection path from hdiskB to hdiskA using the mkpprcpath DSCLI command, and check the status using the lspprcpath command.

      Syntax:

      /opt/ibm/dscli/dscli mkpprcpath -dev <Local Storage ID> -srclss <Source LSS ID>
      -tgtlss <Target LSS ID> -remotewwnn <Remote Storage WWNN> <IO Port1>:<IO Port2>

      Example:

      /opt/ibm/dscli/dscli mkpprcpath -dev IBM.2107-75LY981
      -srclss BC -tgtlss 9A -remotewwnn 500507630AFFC16B I0334:I0102
    4. Establish the volume pair of hdiskA and hdiskB in one direction using the mkpprc command.

      Syntax:

      /opt/ibm/dscli/dscli mkpprc -dev <Local Storage ID> -remotedev
      <Remote Storage ID> -mode <value> -type <mmir/gcp> <Source Volume ID>:<Target Volume ID>

      Example:

      /opt/ibm/dscli/dscli mkpprc -dev IBM.2107-75TL771
      -remotedev IBM.2107-75LY981 -mode full -type mmir BC00:9A00
    5. Enable HyperSwap for hdiskA on all the nodes using the chdev command.

      Next, we need to enable the HyperSwap capability for the PPRC pair. Make the disk HyperSwap-capable using the chdev command.

      Syntax:

      $ chdev -a san_rep_cfg=migrate_disk -l hdiskX -U

      Example:

      $ chdev -a san_rep_cfg=migrate_disk -l hdisk25 -U

      After the command succeeds, the secondary disk becomes unavailable and changes to the Defined state. Repeat this step on all the other nodes.
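
Steps 2 to 5 can be sketched as a dry run that only prints the commands, so that they can be reviewed before anything is changed on the storage. This is a sketch under stated assumptions rather than a finished script: the helper names are hypothetical, the DSCLI executable is assumed to be /opt/ibm/dscli/dscli, DSCLI connection options are omitted, and only the options shown in the steps above are used.

```shell
# Dry-run sketch: prints the commands for steps 2 to 5 instead of running them.
# The DSCLI location is an assumption; connection options are omitted.
DSCLI=${DSCLI:-/opt/ibm/dscli/dscli}

# Hypothetical helpers that split a volume ID such as
# IBM.2107-75TL771/BC00 into the pieces the commands need.
storage_of() { echo "${1%%/*}"; }              # IBM.2107-75TL771
volume_of()  { echo "${1##*/}"; }              # BC00 (LSS ID + volume number)
lss_of()     ( v=${1##*/}; echo "${v%??}" )    # BC

# mkpprcpath line for one direction.
# Args: local volume ID, remote volume ID, remote WWNN, local_port:remote_port
pprc_path_cmd() {
    echo "$DSCLI mkpprcpath -dev $(storage_of "$1") -srclss $(lss_of "$1")" \
         "-tgtlss $(lss_of "$2") -remotewwnn $3 $4"
}

# mkpprc line for a Metro Mirror (mmir) pair with a full initial copy.
# Args: source volume ID, target volume ID
pprc_pair_cmd() {
    echo "$DSCLI mkpprc -dev $(storage_of "$1") -remotedev $(storage_of "$2")" \
         "-mode full -type mmir $(volume_of "$1"):$(volume_of "$2")"
}

# chdev line that makes the primary hdisk HyperSwap-capable (run on every node).
hyperswap_cmd() { echo "chdev -a san_rep_cfg=migrate_disk -l $1 -U"; }

# Illustrative calls; take the WWNN values from lssi and the ports
# from the available-port listing for your own storage systems:
# pprc_path_cmd IBM.2107-75TL771/BC00 IBM.2107-75LY981/9A00 REMOTE_WWNN I0102:I0334
# pprc_path_cmd IBM.2107-75LY981/9A00 IBM.2107-75TL771/BC00 LOCAL_WWNN  I0334:I0102
# pprc_pair_cmd IBM.2107-75TL771/BC00 IBM.2107-75LY981/9A00
# hyperswap_cmd hdisk25
```

Deriving the storage ID and LSS from the volume ID reported by lshostvol.sh keeps the two path definitions and the pair definition consistent with each other, which is where manual typing errors tend to creep in.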


AIX tunables related to HyperSwap

This section covers the tunables to be set for HyperSwap.

Table 1: HyperSwap relevant configuration settings

| Name | Component | Value | Description |
|------|-----------|-------|-------------|
| dyntrk | Protocol driver (fscsi) | Enabled | Provides transparent I/O recovery if the N_port ID for a device is changed (for example, due to movement of the Fibre Channel from one switch port to another). This option is handled at the host bus adapter (HBA) level. |
| fc_err_recov | Protocol driver (fscsi) | fast_fail | Allows you to detect Fibre Channel problems between the switch and the storage device. |
| hcheck_interval | Disk driver (hdisk) | 60 | The time interval when a health check request is sent to the storage device. The default setting is 60 seconds. |
| rw_timeout | Disk driver (hdisk) | 30 (DS8000) | DS8000 read/write timeout value is set to 30 seconds |
| timeout_policy | Disk driver (hdisk) | fail_path (DS8000) | Default value for DS8000 |

Most of the values quoted in the table are the default settings in AIX 7.1. However, they might need to be set to the values shown in Table 1 on AIX 6.1.
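
To see whether a node already matches Table 1, the current attribute values can be read with lsattr and compared with the expected ones. The sketch below is only an illustration: the function names are hypothetical, the device names in the usage lines are examples, and the string yes for dyntrk is an assumption about how the "Enabled" value in the table is stored as a device attribute.

```shell
# Expected values from Table 1, written as attribute=value pairs.
# dyntrk=yes is an assumed spelling of "Enabled".
FSCSI_EXPECTED="dyntrk=yes fc_err_recov=fast_fail"
HDISK_EXPECTED="hcheck_interval=60 rw_timeout=30 timeout_policy=fail_path"

# Compare one attribute; prints a line and returns non-zero on a mismatch.
# Args: device, attribute, expected value, actual value
compare_attr() {
    if [ "$3" = "$4" ]; then
        echo "OK   $1 $2=$4"
    else
        echo "DIFF $1 $2=$4 (expected $3)"
        return 1
    fi
}

# Query a device with lsattr and compare every attribute in the list.
# Args: device, "attr=value attr=value ..."
check_device() {
    _rc=0
    for _pair in $2; do
        _attr=${_pair%%=*}
        _want=${_pair#*=}
        _have=$(lsattr -El "$1" -a "$_attr" -F value)
        compare_attr "$1" "$_attr" "$_want" "$_have" || _rc=1
    done
    return $_rc
}

# Usage on an AIX node (device names are examples):
# check_device fscsi0 "$FSCSI_EXPECTED"
# check_device hdisk25 "$HDISK_EXPECTED"
```

A DIFF line only reports the difference; changing the value is still done with chdev, ideally before the disks are put into use.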

HyperSwap-related activities depend on two time components: failure detection time and actual swap time.

  • Failure detection time depends on the environment and the circumstances of the failure, and is related to many of the timing parameters listed in Table 1. Timeouts and retries form a significant part of the failure detection time.
  • Actual swap time depends on the AIX device driver and the PowerHA cluster components working together to establish the need for a HyperSwap, coordinating across the cluster, and then performing the actual swap. Network issues can delay the cluster-wide coordination and can result in timeouts, delays, or failure of the HyperSwap. The swap operation time itself depends on the number of disks and hence on the time taken by the DS8800 to perform the swap. Additionally, the actual swap time depends on any timeouts and retries caused by soft recoverable errors encountered while issuing the swap operations.

Table 2 provides the metrics related to swap time and its effect on the application for various scenarios.

Table 2: Swap time metrics

| Swap type | Swap time | Transparent to application |
|-----------|-----------|----------------------------|
| Planned user mirror group | 4 seconds | Yes, if there are no read operations and a reasonable amount of write operations; otherwise almost transparent |
| Planned system mirror group | Less than 1 second | Yes |
| Planned repository mirror group | Less than 1 second | Yes |
| Unplanned swap for pure write application | 30 seconds (tunable) | Yes, if write operations are not too high |
| Unplanned swap for pure read application | 30 seconds (tunable) | No, the application hangs for the entire duration of the swap |

PowerHA configuration

After all the prerequisites are met and the initial disk configuration is done, we need to configure the PowerHA SystemMirror cluster and add the HyperSwap disks to the resource group. Perform the following steps to configure PowerHA.

  1. Ensure that all the necessary file sets (including cluster.es.genxd) are installed.
  2. Populate on all nodes the CAA rhosts file (/etc/cluster/rhosts) with the IP labels to be used for the communication path. Restart the clcomd daemon.
  3. Configure the cluster through smitty sysmirror > Cluster Nodes and Networks > Multi Site Cluster Deployment > Setup a Cluster, Nodes and Networks. Select the cluster type as Stretched or Linked. For our testing, we used Stretched.
  4. Then, choose the repository disks and the multicast address to be used by CAA. This can be done through smitty sysmirror > Cluster Nodes and Networks > Multi Site Cluster Deployment > Define Repository Disk and Cluster IP Address.
  5. Verify and synchronize the cluster. The CAA cluster gets configured on successful completion of the verification process.
  6. Create a volume group with HyperSwap capable disks for all nodes.
  7. Define the storage and the site association. Use path smitty cm_add_strg_system or smitty sysmirror > Cluster Applications and Resources > Resources > Configure DS8800 Metro Mirror (In-Band) Resources > Configure Storage Systems > Add a Storage System. Repeat the step for the secondary site storage as well.
  8. Set up the following mirror groups for the HyperSwap disks.
    • User mirror group
    • Cluster_repository mirror group
    • System mirror group

    Use path smitty cm_cfg_mirror_grps or smitty sysmirror > Cluster Applications and Resources > Resources > Configure DS8800 Metro Mirror (In-Band) Resources > Configure Mirror Groups > Add a Mirror Group. You can choose to configure the user, system or cluster_repository mirror group. For more details on various mirror groups, you can refer to PowerHA SystemMirror 7.1.2 Enterprise Edition for AIX from IBM Redbooks®.

  9. Create a resource group with a site policy. Select Prefer Primary Site or Online on Either Site as the intersite management policy.
  10. Add the mirror group and the volume group to the resource group. Use smitty sysmirror > Cluster Applications and Resources > Resource Groups.
  11. Verify and synchronize.
  12. Start the cluster services.
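
Steps 1 and 2 have to be repeated on every node, so they are worth scripting. The following is a minimal sketch, not an IBM-provided procedure: the function names are hypothetical, the IP labels in the usage line are placeholders, and clcomd is restarted here with the System Resource Controller commands stopsrc and startsrc.

```shell
# Path of the CAA rhosts file, as named in step 2.
RHOSTS=${RHOSTS:-/etc/cluster/rhosts}

# Print one IP label per line, dropping blanks and duplicates
# while keeping the original order.
rhosts_content() {
    for _label in "$@"; do
        echo "$_label"
    done | awk 'NF && !seen[$0]++'
}

# Run on each node. Args: IP labels of all cluster nodes.
prepare_node() {
    # Step 1: fail early if the cluster.es.genxd fileset is missing.
    if ! lslpp -l "cluster.es.genxd*" >/dev/null 2>&1; then
        echo "cluster.es.genxd is not installed" >&2
        return 1
    fi
    # Step 2: write the rhosts file, then restart clcomd so that it rereads it.
    rhosts_content "$@" > "$RHOSTS" || return 1
    stopsrc -s clcomd
    sleep 2
    startsrc -s clcomd
}

# Usage (labels are placeholders):
# prepare_node nodeA_label nodeB_label
```

The remaining steps are done through the smitty panels listed above, which is also where the cluster type, repository disk, and mirror groups are chosen.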

Tips

The following tips can help during the configuration and also help achieve high availability.

  • Use more than one controller, per DS8800, in production for high availability.
  • In a production environment, it is recommended either to put the repository disk on both sites (by using a linked-cluster setup), so that a site failure does not cause multiple impacts to the system, or to use the HyperSwap capability to support repository mirror groups.
  • It is recommended to adhere to the following rules to achieve better performance for HyperSwap:
    • Keep all the HyperSwap disks of the same application in the same LSS whenever possible.
    • Avoid mixing HyperSwap disks of different applications in the same LSS whenever possible.
  • As a best practice, modify the tunables before using the disks. You can change a tunable using the chdev command:
    # chdev -l hdiskX -a rw_timeout=60
  • You can use the PowerHA tools for planned HyperSwap. This can be used for the maintenance of the storage subsystem so that there is no interruption to the application.
  • User and group definitions must be the same on all cluster nodes for applications or databases.
  • For Oracle 11g on AIX, in order to allow the operating system to use 16 MB pages or pinned memory when allocating shared memory, the Oracle user ID must have the following capabilities set: CAP_BYPASS_RAC_VMM and CAP_PROPAGATE.

