Replication of data in retention sets

If you use retention rules to meet your long-term data retention requirements, you can replicate the data that is stored in retention sets. This data might be expired according to policy settings for data retention, but it is protected from expiration if it is contained in an active retention set. The data can be replicated from a source replication server to a target replication server.

While it is possible to define retention rules and retention sets on the source or target replication servers, it is preferable to do so on the source replication server only and to schedule the creation of the retention set when the retention rule itself is created. This helps to ensure consistency between the source and target replication servers and to ensure that all files are available for inclusion in the retention set. For information about replicating client data to another server, see Replicating client data to another server.

Retention rules and retention sets that are defined on the source replication server are not replicated to the target server. Only the data in the retention sets is replicated. Replication metadata is also updated when the state of the replicated data object changes to RETAINED. A data object changes to a RETAINED state when the data expires or is deleted but is kept because it is contained in an active retention set. The retained data is kept until the retention set itself expires or is deleted.
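
The rule for the RETAINED state can be sketched in a few lines of Python. This is an illustrative model only, not server code: the ObjectState enum and the state_after_expiration function are hypothetical names, and the DELETED value is an assumed marker for an object that expiration processing removed.

```python
from enum import Enum


class ObjectState(Enum):
    ACTIVE = "ACTIVE"
    INACTIVE = "INACTIVE"
    RETAINED = "RETAINED"
    DELETED = "DELETED"  # hypothetical marker: the object was removed


def state_after_expiration(current, expired_or_deleted, in_active_retention_set):
    """Return the state of a data object after expiration processing.

    An object that expires or is deleted is kept in a RETAINED state
    only if it is contained in an active retention set.
    """
    if not expired_or_deleted:
        return current
    if in_active_retention_set:
        return ObjectState.RETAINED
    return ObjectState.DELETED
```

In this model, an object leaves the RETAINED state only when no active retention set contains it any longer, which corresponds to the retention set expiring or being deleted.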

To enable node replication, follow the instructions in Enabling node replication.

Prerequisites

Before you run a replication operation, ensure that the following prerequisites are met:

  • Verify that the target replication server supports retention. Retention rule features are supported in IBM Spectrum® Protect Version 8.1.7 and later.
  • Ensure that data ingestion is complete on the source replication server for the nodes or virtual machines that contain the retained data.
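
The version prerequisite can be checked in a script before replication is scheduled. The following sketch assumes that the server level is available as a dot-separated string such as 8.1.7; the supports_retention function is a hypothetical helper and does not call any server API.

```python
MINIMUM_RETENTION_LEVEL = (8, 1, 7)


def supports_retention(version):
    """Return True if a dot-separated server level is 8.1.7 or later.

    Assumption: every component of the version string is an integer.
    """
    parts = tuple(int(part) for part in version.split("."))
    return parts >= MINIMUM_RETENTION_LEVEL
```

Tuple comparison is used so that a level such as 8.1.10 is correctly treated as later than 8.1.7, which a plain string comparison would get wrong.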

Restrictions that pertain to configuration:

  • You can define retention rules and retention sets on the source or target replication server. However, you cannot define retention rules and retention sets for the same data on both the source and target replication servers at the same time. If a retention set exists for a node that is being replicated to another server, that same node cannot be a member of a retention set on the target replication server. Similarly, if a retention set exists for a node that is a target node for replication from another server, the source node cannot be a member of a retention set on the source replication server. If retention sets for the same node exist on both servers, replication processing fails until the retention set on one of the servers expires or is deleted.
  • If you have retention sets on either a source or target replication server, you cannot reverse the roles of the replication servers; that is, you cannot make the source server the target server and vice versa. If you do so, data in retention sets might be damaged or lost.
  • If a node on a server is the target for a replication operation from another server, you cannot create one-time-only retention sets that are dated in the past for that node. For more information, see DEFINE RETRULE (Define a retention rule).
  • In the Operations Center, you cannot define retention rules to protect nodes on a target replication server if the nodes were replicated from another server.
  • If you are replicating nodes to a target server that is earlier than V8.1.7, retention is not supported on the target server and you cannot add the nodes to a retention rule on the source replication server.
  • If you issue the REMOVE REPLNODE command to remove a node from replication, previously replicated backup data objects that are in a RETAINED state might be deleted from the target server due to expiration processing if they are not part of a retention set on the target server. To avoid deleting the data, ensure that the expire inventory process is stopped and disabled on the affected target servers. For more information, see REMOVE REPLNODE (Remove a client node from replication).
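
The restriction on defining retention sets for the same node on both servers can be expressed as a simple set intersection. The sketch below assumes that you have already collected the node names yourself, for example from query output; the function and parameter names are hypothetical.

```python
def conflicting_nodes(replicated_nodes, source_retention_nodes, target_retention_nodes):
    """Return replicated nodes that are in retention sets on both servers.

    According to the restrictions above, replication processing fails for
    such nodes until the retention set on one of the servers expires or
    is deleted.
    """
    return set(replicated_nodes) & set(source_retention_nodes) & set(target_retention_nodes)
```

An empty result means that no replicated node violates this particular restriction; the other restrictions in the list still have to be checked separately.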

Guidelines about expiration and deletion of retained data

Typically, data objects expire when they exceed the retention criteria that are specified in the policy settings for data retention on the server. For more information, see Policy concepts.

Expiration processing on the server removes expired files from the server database and deletes the files from server storage. However, files that are expired according to policy settings for data retention but that are contained in an active retention set are kept and remain available for replication across servers. Expiration of this retained data is processed in the following ways:

  • If a retained data object expires or is deleted on the source replication server, the replica data object on the target replication server is deleted during the next replication operation.
  • Data on a target replication server does not expire until all the retention sets containing the data on the source replication server expire. When the retention set expires on the target server, only the retained data objects that it contains are deleted.
  • If you set dissimilar policies on servers that contain retention sets, expiration is handled in the following way:
    • If the retention set is defined on the source replication server, retained data objects are replicated and marked as INACTIVE on the target server. The replicated data either expires according to the target server’s own policy settings or is deleted.
    • If the retention set is defined on the target replication server, the target server’s own policies handle the expiration of the retained data. If the retained data belongs to an active retention set, it is kept until that retention set expires.
    Important: When you use dissimilar policies, ensure that you are aware of the effect of the policies on retained data so that the data does not expire until you want it to expire. If you enable dissimilar policies for your retention sets on the source and target servers, replicated backup objects that are in a RETAINED state but that are not in a retention set on the target server are deleted according to policy settings.

    To set dissimilar policies, issue the SET DISSIMILARPOLICIES command. For more information, see SET DISSIMILARPOLICIES (Enable the policies on the target replication server to manage replicated data).
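
The effect of dissimilar policies on a replicated object can be summarized as a single condition. The following is a simplified sketch of the behavior described in this section, not the server's actual logic; the function name and its two boolean inputs are assumptions made for illustration.

```python
def kept_on_target_with_dissimilar_policies(in_active_target_retention_set,
                                            expired_by_target_policy):
    """Decide whether a replicated object stays on the target server.

    Assumes that dissimilar policies are enabled. An object in an active
    retention set on the target server is kept; otherwise the policy
    settings of the target server decide.
    """
    return in_active_target_retention_set or not expired_by_target_policy
```

The case to watch is the one where both inputs work against you: an object that is retained only by a retention set on the source server and that the target policy considers expired is not kept on the target server.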

Scenario: Retention rules and retention sets are defined on the source replication server

If the retention rule and retention set are created on the source replication server, the following considerations apply:

  • Retention rules and retention sets that are defined on the source replication server are not replicated to the target server. Only the data in the retention sets is replicated. For information about replicating client data to another server, see Replicating client data to another server.
  • If you replicate an active or inactive data object and then the data object changes to a RETAINED state, the replicated data object on the target replication server changes to a RETAINED state during the next replication operation.
  • If a data object is in a RETAINED state on the source replication server and was not replicated before changing to this state, the data object is replicated to the target replication server during the next replication operation and is automatically marked as RETAINED.

Scenario: Retention rules and retention sets are defined on the target replication server

If you define a retention rule or retention set on the target replication server, the following considerations apply:

  • Ensure that files are available for inclusion in the retention set by issuing the DEFINE COPYGROUP command and setting the VEREXISTS parameter to a value of 3 or greater.
  • When a replicated file on the target replication server is subject to deletion due to deletion processing on the source server, the replicated file changes to a RETAINED state if it belongs to an active retention set on the target replication server.
  • When the data on the source replication server expires, the relationship between the affected files on the source and target servers is removed and the files on the target server are no longer considered to be replicas.
  • When a retention set expires on the target server, only the retained data objects that it contains are deleted.
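
The considerations for this scenario can be modeled as two small transitions. This is an illustrative sketch under stated assumptions: the TargetObject class and both functions are hypothetical, None stands for an object that is deleted from the target server, and each object is assumed to belong to at most one retention set.

```python
from dataclasses import dataclass


@dataclass
class TargetObject:
    state: str               # "ACTIVE", "INACTIVE", or "RETAINED"
    is_replica: bool = True  # still linked to an object on the source server


def on_source_deletion(obj, in_active_target_retention_set):
    """Apply a source-side expiration or deletion to a target object."""
    if in_active_target_retention_set:
        # The object is kept as RETAINED and is no longer a replica.
        return TargetObject(state="RETAINED", is_replica=False)
    return None  # assumed outcome: the replica is deleted


def expire_retention_set(members):
    """Return the members that remain after the retention set expires.

    Only objects in a RETAINED state are deleted; objects that policy
    still keeps as ACTIVE or INACTIVE are unaffected.
    """
    return [obj for obj in members if obj.state != "RETAINED"]
```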