Synchronization of source and target VSAM data sets

Because Data Replication for VSAM sends only change data to the target data set, you must begin replication with matching copies of your source and target VSAM spheres.

Data Replication for VSAM does not provide support for initially populating your target sphere. Two basic approaches are possible:

You can use various IBM® or third-party utilities to unload the source VSAM data and reload its contents at the target site after you transfer the data from the source site to the target site.
You can organize your VSAM spheres so that you can flash copy the source VSAM data and transfer the copies to the secondary site, where they are restored. In this type of environment, the source and target sites are mirrors of each other.

Additional setup steps might also be required to define the VSAM spheres at the target site depending upon whether you already have a mechanism in place to support a secondary site or you are starting from scratch.

Data Replication for VSAM reads the data capture log records that VSAM loggers such as CICS® TS or CICSVR produce at your source site. It sends only the changed data to a target server, where the changes are applied to your target VSAM sphere. These data capture log records identify:

The source VSAM cluster that was updated by one of your source applications. The log record contains the file name, and the log block that holds the record contains the ID of the application where the change originated. The application ID and file name can be related to the VSAM base cluster or path data set name through the tieup record that is generated.
The operation that the application performed (insert, update, or delete).
Information about the record and key that was updated by the application.

In addition to using data capture log records, Data Replication for VSAM captures and analyzes other kinds of log records that are produced by VSAM loggers to determine whether a unit of recovery (UOR) was committed or rolled back (for recoverable changes). Using this information, Data Replication for VSAM keeps your source and target VSAM spheres synchronized and mirrors the actions that are performed by the source VSAM applications. For example, if your source online transaction abnormally terminates, the source CICS TS region automatically backs out recoverable changes and produces additional log records that indicate that the UOR failed. Data Replication for VSAM intercepts those log records and does not forward the associated changes to the target, because the rollback reversed those updates and they were never permanently hardened to the source VSAM sphere.
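The commit and rollback handling described above can be pictured as buffering changes per UOR until the outcome is known. The following Python sketch is illustrative only: the `LogRecord` shape, the record kinds ("change", "commit", "backout"), and the function name are assumptions made for this example and do not reflect the actual log record formats or the product's internals.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class LogRecord:
    """Hypothetical, simplified log record; real logger formats differ."""
    kind: str          # "change", "commit", or "backout"
    uor_id: str        # unit of recovery that the record belongs to
    payload: str = ""  # for example "insert key=A1"


def changes_to_forward(log: Iterable[LogRecord]) -> Iterator[str]:
    """Yield only the changes whose UOR was committed."""
    pending = defaultdict(list)  # uor_id -> changes awaiting an outcome
    for rec in log:
        if rec.kind == "change":
            pending[rec.uor_id].append(rec.payload)
        elif rec.kind == "commit":
            # UOR committed: its buffered changes go to the target in order.
            yield from pending.pop(rec.uor_id, [])
        elif rec.kind == "backout":
            # UOR rolled back: its changes are discarded, never forwarded.
            pending.pop(rec.uor_id, None)


log = [
    LogRecord("change", "U1", "insert key=A1"),
    LogRecord("change", "U2", "update key=B7"),
    LogRecord("change", "U1", "delete key=A0"),
    LogRecord("backout", "U2"),
    LogRecord("commit", "U1"),
]
forwarded = list(changes_to_forward(log))
print(forwarded)
```

In this sample log, the update that belongs to U2 never reaches the target because U2 was backed out, while both U1 changes are forwarded in their original order after the commit record is seen.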

Non-recoverable changes are always sent because they are not under the control of a transaction manager. Data Replication for VSAM groups non-recoverable changes so that they move through the replication pathway efficiently. Grouping is based on the number of changes in the group, the number of replication log blocks that are read while a group is being held, the number of times the end of the replication log is reached, and other controls that are designed to maximize grouping efficiency and minimize the latency that grouping introduces.
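As a rough illustration of how the three named grouping criteria might interact, the following Python sketch holds changes in a group and releases the group as soon as any one limit is reached. The class name, the limit names, and the default values are hypothetical; the actual controls and their defaults are not described in this topic.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NonRecoverableGrouper:
    """Hypothetical grouping buffer; limits and names are illustrative."""
    max_changes: int = 3      # release after this many changes
    max_blocks_held: int = 2  # or after this many log blocks are read
    max_end_of_log: int = 1   # or after end of log is reached this often
    _group: List[str] = field(default_factory=list, init=False)
    _blocks: int = field(default=0, init=False)
    _eol: int = field(default=0, init=False)

    def _release_if_due(self) -> List[str]:
        due = bool(self._group) and (
            len(self._group) >= self.max_changes
            or self._blocks >= self.max_blocks_held
            or self._eol >= self.max_end_of_log
        )
        if not due:
            return []
        released, self._group = self._group, []
        self._blocks = self._eol = 0
        return released

    def add_change(self, change: str) -> List[str]:
        self._group.append(change)
        return self._release_if_due()

    def block_read(self) -> List[str]:
        if self._group:           # count blocks only while a group is held
            self._blocks += 1
        return self._release_if_due()

    def end_of_log(self) -> List[str]:
        if self._group:           # an idle log should not hold changes back
            self._eol += 1
        return self._release_if_due()


grouper = NonRecoverableGrouper()
by_count = [grouper.add_change(c) for c in ("c1", "c2", "c3")]
held = grouper.add_change("c4")
by_end_of_log = grouper.end_of_log()
print(by_count, held, by_end_of_log)
```

The count limit favors throughput when changes arrive quickly, while the block and end-of-log limits bound how long a partly filled group can wait, which is the latency trade-off that the paragraph above describes.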

To ensure that replication is successful, Data Replication for VSAM requires, by default, that the source and target VSAM spheres have identical attributes defined in their respective ICF catalogs. These attributes are identified in the topic VSAM sphere validation. However, you can relax this requirement and replicate data provided that the source and target VSAM spheres are compatible. This capability gives you flexibility while you upgrade or change your VSAM spheres and applications to support new business requirements.

By default, both the structure and data of your source and target VSAM spheres must match. To achieve this match when you start replication for the first time, you must define the source and target VSAM clusters with matching attributes and load the source data into the target. You must also define CICS file definitions on the target server with the access privileges that replication needs (Read, Add, Update, Delete).
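The default structural requirement amounts to a field-by-field comparison of cataloged attributes. The following Python sketch shows the idea with a small, hypothetical subset of attributes (organization, key length, key offset, maximum record length); the authoritative list is the one in the topic VSAM sphere validation, and the names used here are assumptions for illustration.

```python
from dataclasses import dataclass, fields
from typing import Dict, Tuple


@dataclass(frozen=True)
class SphereAttributes:
    """Hypothetical subset of cataloged attributes, for illustration."""
    organization: str        # for example "KSDS", "ESDS", or "RRDS"
    key_length: int
    key_offset: int
    max_record_length: int


def attribute_mismatches(
    source: SphereAttributes, target: SphereAttributes
) -> Dict[str, Tuple[object, object]]:
    """Return {attribute: (source value, target value)} for each difference."""
    return {
        f.name: (getattr(source, f.name), getattr(target, f.name))
        for f in fields(source)
        if getattr(source, f.name) != getattr(target, f.name)
    }


source = SphereAttributes("KSDS", 8, 0, 200)
target = SphereAttributes("KSDS", 8, 0, 256)
mismatches = attribute_mismatches(source, target)
print(mismatches or "attributes match")
```

An empty result corresponds to the default "identical attributes" case. Whether a non-empty result is still acceptable under the relaxed, compatible mode depends on rules that this sketch does not attempt to model.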

In most situations, you do not have to reinstall or reload a target data set after replication begins. If errors occur that are not related to data consistency (for example, a network outage), or if replication is stopped, your replication environment can catch up to current processing by reading the logs in the source LPAR when replication resumes.

1. Defining or redefining the target VSAM data sets

To define or redefine VSAM data sets, you close the file in the target CICS region, delete the cluster, define the cluster, and copy (REPRO) the source contents from a source quiesce point into the target cluster. You then start or restart replication from the quiesce point.
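The delete and define portion of this sequence is typically expressed as IDCAMS control statements. The following Python sketch only renders such statements as text for a key-sequenced data set; it does not run them. The data set name, the attribute values, and the space allocation are hypothetical placeholders, and a real definition must carry whatever attributes the source cluster uses. Closing the file in the target CICS region comes before these statements, and the REPRO copy comes after them.

```python
def redefine_statements(cluster: str, key_length: int, key_offset: int,
                        avg_record: int, max_record: int) -> str:
    """Render IDCAMS statements that delete and redefine a KSDS.

    CYLINDERS(10 5) is a placeholder. Real definitions usually also carry
    volume or SMS class parameters that must mirror the source cluster.
    """
    lines = [
        f"  DELETE {cluster} CLUSTER PURGE",
        f"  DEFINE CLUSTER (NAME({cluster}) -",
        "         INDEXED -",
        f"         KEYS({key_length} {key_offset}) -",
        f"         RECORDSIZE({avg_record} {max_record}) -",
        "         CYLINDERS(10 5))",
    ]
    return "\n".join(lines)


# Hypothetical data set name and attributes, for illustration only.
statements = redefine_statements("TGT.CUSTOMER.KSDS", 8, 0, 120, 200)
print(statements)
```

Generating the statements from the same attribute values that describe the source cluster is one way to keep the two definitions from drifting apart.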

Conditions under which you install or reinstall a target data set include the following ones:

You set up Data Replication for VSAM for the first time.
You add a new data set to a subscription.
You change the key, record length, or organization of the source data set. To change the definition of a data set, see the procedure in Changing the definition of a replicated VSAM data set (schema change).

If you only reclaim unused space or physically reorganize data in the source data set, you do not need to redefine the target data set. However, to improve access to the target data set for Data Replication for VSAM, you might want to perform the same maintenance on the target data set.

2. Loading or reloading the data

You load the target data set by using the IDCAMS REPRO command to copy the source VSAM data set to a non-VSAM data set (flat file). You then transfer the non-VSAM copy to the target and use the IDCAMS REPRO command to load the non-VSAM data set copy into the target VSAM data set.
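The unload and load steps each map to one REPRO statement. The following Python sketch renders the two statements as text, using the INDATASET and OUTDATASET parameters of REPRO. All data set names are hypothetical, the transfer between sites is outside the sketch, and it assumes that the flat file is already allocated with a record format and length that fit the VSAM records.

```python
def repro_statement(input_name: str, output_name: str) -> str:
    """Render one IDCAMS REPRO statement that copies input to output."""
    return f"  REPRO INDATASET({input_name}) OUTDATASET({output_name})"


# Hypothetical names. The flat copy is transferred between the two steps.
unload = repro_statement("SRC.CUSTOMER.KSDS", "SRC.CUSTOMER.UNLOAD")
load = repro_statement("TGT.CUSTOMER.UNLOAD", "TGT.CUSTOMER.KSDS")

print("Run at the source site:")
print(unload)
print("Run at the target site after the transfer:")
print(load)
```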

Follow these steps to ensure that source and target data sets match:

Allocate the source data sets.
Define a log stream for the source data sets. See Defining replication log streams.
Allocate data sets on the target server.
(Optional) Define a log stream for the target data sets. Perform this step if data sets are defined with LOG(ALL) or the target is set up for LOGREPLICATE. Otherwise, you do not need to perform this step. See Defining replication log streams.
Copy data from the source data sets to the target data sets.
(Optional) Verify that the source and target data sets are equivalent. The verification itself is optional, but the source and target data sets must begin as exact replicas.
Define the bookmark data. This step is needed only in a new deployment. After the bookmark data set is created, do not alter it in any way. Redefining the bookmark data set discards any existing bookmark information. See Creating a bookmark database.
Set the log position. See Activating replication mappings.
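For the optional equivalence check in the steps above, one simple approach is to compare the two data sets record by record on their keys. The following Python sketch assumes that both data sets were already unloaded into in-memory mappings of key to record; that representation and the function name are assumptions made for this example, and the sketch is not a utility that the product provides.

```python
from typing import Dict, List, Tuple


def compare_by_key(
    source: Dict[bytes, bytes], target: Dict[bytes, bytes]
) -> Tuple[List[bytes], List[bytes], List[bytes]]:
    """Return keys missing from target, extra in target, and differing."""
    missing = sorted(k for k in source if k not in target)
    extra = sorted(k for k in target if k not in source)
    differing = sorted(
        k for k in source if k in target and source[k] != target[k]
    )
    return missing, extra, differing


# Hypothetical unloaded contents: key -> full record.
source_records = {b"A1": b"A1 alpha", b"B7": b"B7 bravo", b"C3": b"C3 charlie"}
target_records = {b"A1": b"A1 alpha", b"B7": b"B7 BRAVO", b"D9": b"D9 delta"}

missing, extra, differing = compare_by_key(source_records, target_records)
print("missing:", missing, "extra:", extra, "differing:", differing)
```

Three empty lists indicate that the copies are exact replicas and that replication can start from the chosen log position.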

Conditions under which you load or reload a target data set include the following ones:

You encounter errors or inconsistencies in the target data set, such as missing records.
You restore a source data set to a prior version.
A VSAM batch job abends and you do not run the CICS VR batch backout utility (DWWBACK). For detailed information, see Recovering data sets after CICS VR batch job failures.
Mass updates occurred at the source that changed most or all records in the data set.
Replication has been inactive for a subscription, and reloading the data will take less time than replicating the historical changes.
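The last condition is a time comparison between replaying the backlog and reloading the data set. The following Python sketch makes the arithmetic explicit. Every input (backlog size, apply rate, record count, load rate, fixed overhead) is a hypothetical estimate that you would need to measure in your own environment; none of these figures comes from the product.

```python
def reload_is_faster(backlog_changes: int, apply_rate_per_s: float,
                     record_count: int, load_rate_per_s: float,
                     reload_overhead_s: float = 0.0) -> bool:
    """Compare estimated catch-up time with estimated reload time."""
    catch_up_s = backlog_changes / apply_rate_per_s
    reload_s = reload_overhead_s + record_count / load_rate_per_s
    return reload_s < catch_up_s


# Long outage: 50 million queued changes against a 10 million record file.
long_outage = reload_is_faster(50_000_000, 2_000, 10_000_000, 20_000, 1_800)

# Short outage: 100,000 queued changes against the same file.
short_outage = reload_is_faster(100_000, 2_000, 10_000_000, 20_000, 1_800)

print(long_outage, short_outage)
```

With these illustrative figures, the long outage gives 25,000 seconds of catch-up against 2,300 seconds of reload, so reloading wins. The short outage gives 50 seconds of catch-up, so resuming replication wins.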