IBM Tivoli Storage Manager, Version 7.1

Replication of deduplicated data

Data deduplication is a method for eliminating redundant data that is stored in sequential-access disk (FILE) primary storage pools, copy storage pools, and active-data storage pools. Before the data is replicated, the source replication server determines whether storage pools are set up for data deduplication.

Restriction: During replication processing, the simultaneous-write function is disabled on the target replication server when you store data to a primary storage pool that is enabled for data deduplication. Data that is replicated consists of only files or extents of data that do not exist on the target replication server.

The following table shows the results for each combination of data deduplication settings on the storage pools of the source and target replication servers. The destination storage pool is specified in the backup or archive copy-group definition of the management class for each file. If the destination storage pool does not have enough space and data is migrated to the next storage pool, the entire file is sent, regardless of whether the next storage pool is set up for data deduplication.

If the storage pool on the source replication server is	And the destination storage pool on the target replication server is	The result is
Enabled for data deduplication	Enabled for data deduplication	Only extents that are not stored in the destination storage pool on the target replication server are transferred.
Enabled for data deduplication	Not enabled for data deduplication	Files are reassembled by the source replication server and replicated in their entirety to the destination storage pool.
Not enabled for data deduplication	Enabled for data deduplication	The source replication server determines whether any extents were identified for files that were previously stored in deduplicated storage pools. Any files that were never in a deduplicated storage pool are replicated in their entirety. For files that had extents that were previously identified, only extents that do not exist in the destination storage pool are transferred.
Not enabled for data deduplication	Not enabled for data deduplication	Files are replicated in their entirety to the destination storage pool.
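
The four cases in the table can be summarized as a small lookup. The following Python sketch is illustrative only and is not part of the product: the function name replication_result and the summary strings are hypothetical, and the function only restates the table rows.

```python
def replication_result(source_dedup: bool, target_dedup: bool) -> str:
    """Summarize what is transferred for one row of the table above.

    Hypothetical helper: it restates the documented outcomes and does
    not query or control a server.
    """
    if source_dedup and target_dedup:
        # Row 1: both pools are enabled for data deduplication.
        return "only extents missing from the destination pool"
    if source_dedup and not target_dedup:
        # Row 2: the source server reassembles files before sending.
        return "entire files, reassembled by the source server"
    if not source_dedup and target_dedup:
        # Row 3: depends on whether extents were previously identified.
        return ("missing extents for files with previously identified extents; "
                "entire files otherwise")
    # Row 4: neither pool is enabled for data deduplication.
    return "entire files"


if __name__ == "__main__":
    for source in (True, False):
        for target in (True, False):
            print(source, target, "->", replication_result(source, target))
```

The sketch does not model the migration case that is described before the table, in which the entire file is sent when data moves to the next storage pool.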

Tip: If you have a primary storage pool that is enabled for deduplication on a source replication server, you can estimate a size for a new deduplicated storage pool on the target replication server. Issue the QUERY STGPOOL command for the primary deduplicated storage pool on the source replication server. Obtain the value for the amount of storage space that was saved in the storage pool as a result of server-side data deduplication. This value is represented by the field Duplicate Data Not Stored in the command output. Subtract this value from the estimated capacity of the source storage pool. The result is the estimated size for the new storage pool on the target replication server.
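
The subtraction in the tip can be sketched as follows. This is a minimal illustration: the function name estimate_target_pool_size and the sample figures are hypothetical, and the two input values are assumed to have been read manually from the QUERY STGPOOL output and converted to the same unit.

```python
def estimate_target_pool_size(estimated_capacity_gb: float,
                              duplicate_data_not_stored_gb: float) -> float:
    """Estimate a size for the new deduplicated pool on the target server.

    Both arguments are assumed to be in gigabytes and to come from the
    QUERY STGPOOL output for the source pool. Hypothetical helper.
    """
    if estimated_capacity_gb < 0 or duplicate_data_not_stored_gb < 0:
        raise ValueError("values must be non-negative")
    if duplicate_data_not_stored_gb > estimated_capacity_gb:
        raise ValueError("saved space cannot exceed the estimated capacity")
    # Subtract the space saved by deduplication from the estimated capacity.
    return estimated_capacity_gb - duplicate_data_not_stored_gb


if __name__ == "__main__":
    # Hypothetical figures: 10240 GB estimated capacity, 3072 GB saved.
    print(estimate_target_pool_size(10240, 3072))
```

Treat the result as a rough starting point for planning, not as a guaranteed capacity requirement.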
