IBM Support

Policy-based replication is replacing Remote Copy (Metro Mirror, Global Mirror, Global Mirror with Change Volumes and HyperSwap)

News


Abstract

IBM Storage Virtualize (FlashSystem and SAN Volume Controller) released a brand new replication technology called policy-based replication (PBR) in 2022, with the goal of simplifying management and significantly improving throughput and latency characteristics. Since the first release, PBR has continued to be enhanced with new features and interoperability.

To help simplify the user experience when configuring systems, 8.7.0 will be the last software release to support Remote Copy.

This document aims to help answer the question of what to do with existing systems that are using Remote Copy.

Content

What is Policy-based Replication (PBR)

Policy-based replication, first released in Storage Virtualize 8.5.2, provides asynchronous replication between two storage systems. It significantly simplifies configuring and managing replication through the use of replication policies and volume groups with simple reporting of the replication status and recovery point. It delivers an asynchronous replication solution that can dynamically adapt to the available link bandwidth and write throughput to maintain the disaster recovery copy without impacting application performance.
This replaces both Global Mirror and Global Mirror with Change Volumes. More details on policy-based replication can be found in the product documentation: Asynchronous disaster recovery replication.

What is Policy-based High Availability (PBHA)

Policy-based high availability is the preferred high availability solution for Storage Virtualize. It was first released in 8.6.1 and provides a high-performance HA storage solution between two independent storage systems. From version 8.7.1 onwards, it can be combined with policy-based replication to deliver HA between two systems with asynchronous replication to a third independent system for disaster recovery.
Further details on policy-based high availability can be found in the product documentation: High Availability.

What are the options for systems currently using Remote Copy

There is no single solution that is correct for all users, so a variety of options are listed below.

A. For all configurations - Do nothing

IBM will continue to support 8.7.0 with security patches and defect fixes until all hardware that is capable of running 8.7.0 is out of hardware support.

Therefore, there is no requirement to do anything. You can leave your system configured as it is today, until the hardware is decommissioned.

B. Global Mirror, Global Mirror with Change Volumes - Migration with temporary loss of DR

If your business requirements allow it, the simplest migration procedure will be to delete the Remote Copy relationships, then configure policy-based replication.

Remote Copy and PBR can co-exist on the same pair of systems, so the migration can be done in batches (one consistency group at a time).

C. Global Mirror - Migration without loss of DR

There is a non-disruptive procedure available that will allow a Global Mirror relationship to be converted to policy-based replication, by running GM and PBR in parallel.

Full details about how to perform the migration can be found in the product documentation: Converting Global Mirror legacy replication to policy-based asynchronous disaster recovery replication.

The migration procedure temporarily requires 2 copies of data at the DR site and will involve a full re-sync of the data between the two systems, but it can be done in batches (one consistency group at a time).

D. Global Mirror with Change Volumes - Migration without loss of DR

If your system and your link are capable of running Global Mirror for a single consistency group, convert the consistency group from GMCV to GM, then use the Global Mirror procedure in Option C.

For systems that are not capable of sustaining Global Mirror, use Option A or Option B (above).

E. Metro Mirror

IBM has released a Statement of Direction that policy-based replication will support synchronous replication for disaster recovery in a future release. More details about migration options will be made available when synchronous replication is released.
Some Metro Mirror customers might wish to consider switching to PBHA.

F. HyperSwap

Due to the different architectures of HyperSwap and PBHA, no in-place migration is available to convert HyperSwap to policy-based HA.
Please contact your account team to discuss alternative options.

Tips for migrating from Remote Copy to Policy-based replication

Review the partnership settings

In policy-based asynchronous replication, all replication traffic is controlled by the partnership bandwidth limit. This differs from Metro Mirror, Global Mirror and HyperSwap, where the partnership bandwidth limit was only used to control the rate of the synchronization. Therefore, you may need to change your settings as part of the conversion process.

There are 2 settings which need to be configured on the partnership for policy-based replication. No other Remote Copy settings are used by policy-based replication:

  • link_bandwidth_mbits - bandwidth setting in Mebibits per second  (not Megabytes per second)
  • background_copy_rate - percentage between 0 and 100

The link_bandwidth_mbits is multiplied by the background_copy_rate (a percentage) to produce a single value, which is called the Replication Rate Limit in this document for ease of reference. The Replication Rate Limit is used by both Remote Copy and PBR to control the maximum amount of data to send to the remote cluster (for workloads that are subject to rate limiting). More details about which types of traffic are controlled by the Replication Rate Limit are listed per replication type below.

Note: In 8.7.1, the background_copy_rate is no longer used to calculate the Replication Rate Limit. For systems running 8.7.0 or earlier, the recommendation is therefore to set the background_copy_rate to 100 and to set the link_bandwidth_mbits directly, so that the effective limit does not change during later upgrades.
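As an illustration of the calculation described above, the following Python sketch computes the Replication Rate Limit for a system running 8.7.0 or earlier. The function name is hypothetical and is not part of any IBM tool, and any rounding that the system itself applies is not modelled here.

```python
def replication_rate_limit(link_bandwidth_mbits, background_copy_rate):
    """Return link_bandwidth_mbits scaled by background_copy_rate (a percentage).

    Hypothetical helper for illustration only. The result is in the same
    unit as link_bandwidth_mbits.
    """
    if link_bandwidth_mbits < 0:
        raise ValueError("link_bandwidth_mbits must not be negative")
    if not 0 <= background_copy_rate <= 100:
        raise ValueError("background_copy_rate must be between 0 and 100")
    return link_bandwidth_mbits * background_copy_rate / 100


# A partnership set to 1000 with a background copy rate of 25 percent.
print(replication_rate_limit(1000, 25))   # 250.0

# With background_copy_rate at 100, the limit equals link_bandwidth_mbits,
# which is why that combination is unaffected when the percentage stops
# being used in the calculation.
print(replication_rate_limit(1000, 100))  # 1000.0
```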

Replication type: Global Mirror, Metro Mirror, HyperSwap

  • What is controlled by the replication rate limit: The Replication Rate Limit will control the synchronization rate when replicating new volumes, or catching up after the replication has been stopped. Host IO will generate additional traffic which is not limited by the Replication Rate Limit.
  • Is the replication rate limit a global limit, or a per-I/O group limit: The setting applies to traffic between the two systems.
  • What is the impact of setting the replication rate limit too low: Synchronization takes longer when replicating new volumes, or when starting replication after it has stopped.
  • What is the impact of setting the replication rate limit too high: The host traffic may be impacted if the synchronization traffic is using too much of the link.
  • Guidance for setting the replication rate limit: The Replication Rate Limit should be a small percentage (for example 10% - 25%) of the bandwidth that this system is permitted to use on the replication links. This leaves plenty of headroom for the Host IO traffic.

Replication type: Global Mirror with Change Volumes

  • What is controlled by the replication rate limit: The Replication Rate Limit will control all replication traffic.
  • Is the replication rate limit a global limit, or a per-I/O group limit: The setting applies to traffic between the two systems.
  • What is the impact of setting the replication rate limit too low: The RPO (age of the data at the remote site) is higher than necessary.
  • What is the impact of setting the replication rate limit too high: Congestion on the replication link may cause issues to other users of the same replication links.
  • Guidance for setting the replication rate limit: The Replication Rate Limit should be the amount of bandwidth that this system is permitted to use on the replication links.

Replication type: Policy-based Asynchronous Replication (including the asynchronous link of the 3 site replication)

  • What is controlled by the replication rate limit: The Replication Rate Limit will control all replication traffic.
  • Is the replication rate limit a global limit, or a per-I/O group limit: The setting applies to all traffic between each pair of I/O groups in the local and partner system, referred to as an I/O group connection. Each I/O group connection is limited by the Replication Rate Limit. For example: I/O group 0 in system A to I/O group 0 in system B is one I/O group connection, and I/O group 1 to I/O group 1 is another I/O group connection.
  • What is the impact of setting the replication rate limit too low: The RPO (age of the data at the remote site) is higher than necessary.
  • What is the impact of setting the replication rate limit too high: Congestion on the replication link may cause issues to other users of the same replication links.
  • Guidance for setting the replication rate limit: For single I/O group systems, the Replication Rate Limit should be the amount of bandwidth that this system is permitted to use on the replication links. For multi I/O group systems, divide the single I/O group number by the number of I/O groups.

Replication type: Policy-based High Availability

  • Policy-based HA does not use the Replication Rate Limit.

The replication rate limit can be changed at any time, so if the limits are too high or too low they can be adjusted as needed. If you are using multiple different replication types (for example during migration from Global Mirror to PBR) then it may be necessary to pick a value between the two different guidelines and tune based on real-world behaviour.
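The per-type guidance above can be turned into a starting value with a small amount of arithmetic. The Python sketch below is one way to express it. The function name, the replication type labels and the sync_fraction parameter are assumptions made for this illustration, and the result is only a starting point to be tuned against real-world behaviour.

```python
def suggested_rate_limit(replication_type, permitted_bandwidth,
                         io_groups=1, sync_fraction=0.25):
    """Suggest a starting Replication Rate Limit from the guidance above.

    permitted_bandwidth: bandwidth this system may use on the replication
        links, in the same unit as link_bandwidth_mbits.
    io_groups: number of I/O groups (policy-based replication only).
    sync_fraction: share reserved for synchronization traffic; the
        guidance gives 10% - 25% as an example range.
    Returns None where the limit is not used. All names are hypothetical.
    """
    if permitted_bandwidth < 0 or io_groups < 1:
        raise ValueError("bandwidth must be >= 0 and io_groups >= 1")
    if not 0 < sync_fraction <= 1:
        raise ValueError("sync_fraction must be in (0, 1]")

    if replication_type in ("global_mirror", "metro_mirror", "hyperswap"):
        # Only synchronization is limited; leave headroom for host IO.
        return permitted_bandwidth * sync_fraction
    if replication_type == "gmcv":
        # All replication traffic is limited.
        return permitted_bandwidth
    if replication_type == "pbr_async":
        # The limit applies per I/O group connection, so divide it up.
        return permitted_bandwidth / io_groups
    if replication_type == "pbha":
        # Policy-based HA does not use the Replication Rate Limit.
        return None
    raise ValueError("unknown replication type: " + replication_type)


print(suggested_rate_limit("global_mirror", 1000))           # 250.0
print(suggested_rate_limit("gmcv", 1000))                    # 1000
print(suggested_rate_limit("pbr_async", 1000, io_groups=4))  # 250.0
```

During a migration that runs Global Mirror and PBR side by side, the two suggestions for the same link can differ, which is the case where a value between them may be needed.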

Volumes may need to be moved to different I/O Groups as part of the conversion to Policy-based replication

In Remote Copy, there is the concept of a consistency group, which indicates that a set of volumes must be kept synchronized with each other.

Policy-based replication uses the concept of Volume Groups to provide the same functionality. However, policy-based replication has an additional requirement that all volumes in a Volume Group must be in the same I/O Group. Therefore, some volumes may need to be moved to alternative I/O groups before they can be replicated using PBR.

The ability to move a volume between I/O Groups without deleting the Remote Copy relationships first was added in 8.6.0 to make this conversion process easier.
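When planning a conversion, it can help to list which volumes of a consistency group sit outside the I/O group that the future Volume Group will use. The Python sketch below assumes that you have already collected a mapping of volume name to I/O group name for one consistency group. The function name, the sample volume names and the default of picking the I/O group that already holds the most volumes are all assumptions for illustration, and the right target in practice also depends on capacity and workload balance.

```python
from collections import Counter


def volumes_to_move(volume_io_groups, target_io_group=None):
    """Return (target_io_group, sorted list of volumes outside it).

    volume_io_groups: dict mapping volume name to I/O group name for the
        volumes of a single consistency group.
    target_io_group: I/O group for the future Volume Group. If omitted,
        the I/O group holding the most volumes is chosen (ties are broken
        by name). This default is an assumption of the sketch.
    """
    if not volume_io_groups:
        return target_io_group, []
    if target_io_group is None:
        counts = Counter(volume_io_groups.values())
        target_io_group = min(counts, key=lambda g: (-counts[g], g))
    movers = sorted(
        name for name, group in volume_io_groups.items()
        if group != target_io_group
    )
    return target_io_group, movers


# Hypothetical consistency group with three volumes.
group = {"vol0": "io_grp0", "vol1": "io_grp0", "vol2": "io_grp1"}
print(volumes_to_move(group))  # ('io_grp0', ['vol2'])
```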

Additional management connectivity required

Policy-based replication vastly reduces the user configuration overhead by removing the requirement for users to make configuration changes on both systems when configuring replication.

To achieve this simplification, the two systems require IP connectivity between them to allow one system to execute commands on the other system.
Specifically, management IP 1 on System A requires the ability to make an HTTPS connection to port 7443 on management IP 1 on System B, and the same applies in the opposite direction. This traffic will not use the HTTP proxy (if configured).
Replication setup using the GUI requires access to both systems from the host where the web browser is running. Ensure that any firewalls between the web browser and storage systems allow traffic to ports 7443 and 443 on the system management IP addresses.
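A quick way to check the browser-side requirement is to test whether the two ports accept TCP connections from the workstation where the web browser runs. The Python sketch below uses only the standard library, and the function names are hypothetical. It confirms TCP reachability only, not a working HTTPS session, and it does not test the system-to-system path, which has to be verified from the network that the management IP addresses use.

```python
import socket

# Ports named in the text above for the system management IP addresses.
REQUIRED_PORTS = (7443, 443)


def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


def check_management_access(hosts, ports=REQUIRED_PORTS, timeout=3.0):
    """Return a dict mapping (host, port) to True or False."""
    return {
        (host, port): port_reachable(host, port, timeout)
        for host in hosts
        for port in ports
    }
```

To use it, call check_management_access with the management IP addresses of both systems and review any entries that come back False with the firewall administrator.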


Document Information

Modified date:
24 September 2024

UID

ibm17166967