Reclamation of WORM-protected file volumes

Reclamation processing helps to manage the storage capacity that is used by storage pool volumes. To help ensure that data is always protected, you can set the default retention period of the NetApp SnapLock feature or IBM Spectrum® Scale immutable fileset to 30 days to match the default reclamation period of a Write Once Read Many (WORM) FILE volume. IBM Spectrum Protect reclaims any remaining data on a WORM FILE volume just before the retention date expiration.

The reclamation of a WORM FILE volume to another WORM FILE volume before the retention date expiration helps to ensure that data is always protected.

Because this protection is at an IBM Spectrum Protect volume level, the data on the volumes can be managed by IBM Spectrum Protect policy without consideration of where the data is stored. Data that is stored on WORM FILE volumes is protected both by data retention protection and by the retention period that is stored with the physical file on the SnapLock volume or volume in the IBM Spectrum Scale immutable fileset.

If an IBM Spectrum Protect administrator issues a command to delete the data, the command fails. If someone attempts to delete the file by using a series of network file system calls, the SnapLock or immutable fileset features prevent the data from being deleted. The storage pool volume cannot be deleted from the file system until its retention period expires. Even if most of the archived objects in the storage pool volume have expired, the reclamation process does not start until the retention period of all data in the volume expires or the data is moved to another SnapLock volume or to a volume in the immutable fileset.

During reclamation processing, if the IBM Spectrum Protect server cannot move data from an expiring SnapLock volume (or volume in an immutable fileset) to a new one, a warning message is issued.

How reclamation processing works

For each volume in a SNAPLOCK storage pool, an IBM Spectrum Protect reclamation period is created. The IBM Spectrum Protect reclamation period has a start date, BEGIN RECLAIM PERIOD, and an end date, END RECLAIM PERIOD. You can view these dates by issuing the QUERY VOLUME command with the FORMAT=DETAILED parameter. The output is similar to this example:
                       Begin Reclaim Period: 09/05/2017
                         End Reclaim Period: 10/06/2017
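
If you capture the detailed output in a script, the two dates can be extracted with a few lines of Python. This is a minimal sketch, not part of the product: the function name parse_reclaim_period is hypothetical, and it assumes that dates are displayed as MM/DD/YYYY, as in the example above. Your server's date format might differ.

```python
import re
from datetime import date, datetime

# Assumption: dates appear as MM/DD/YYYY, as in the sample output.
DATE_FORMAT = "%m/%d/%Y"
FIELD_PATTERN = re.compile(r"^\s*(Begin|End) Reclaim Period:\s*(\S+)\s*$")


def parse_reclaim_period(detailed_output):
    """Return a dict with 'begin' and 'end' dates for the fields that are found."""
    period = {}
    for line in detailed_output.splitlines():
        match = FIELD_PATTERN.match(line)
        if match:
            key = match.group(1).lower()  # 'begin' or 'end'
            period[key] = datetime.strptime(match.group(2), DATE_FORMAT).date()
    return period


sample = """
                       Begin Reclaim Period: 09/05/2017
                         End Reclaim Period: 10/06/2017
"""
period = parse_reclaim_period(sample)
# Print both dates and the length of the reclamation period in days.
print(period["begin"], period["end"], (period["end"] - period["begin"]).days)
```

Lines that do not match either field name are ignored, so the full QUERY VOLUME output can be passed in unchanged.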

When IBM Spectrum Protect archives files to a SnapLock volume or to a volume in an immutable fileset, the server tracks the latest expiration date of those files and sets the BEGIN RECLAIM PERIOD value to that date. When more files are added to the volume, the start date moves to a later date only if a new file has a later expiration date than any file that is currently on the volume. The start date is therefore always the latest expiration date of any file on that volume. The expectation is that, on the start date, all files on the volume are either already expired or are expiring on that day. On the following day, no valid data remains on that volume.

The END RECLAIM PERIOD is set to a month later than the BEGIN RECLAIM PERIOD. The retention date set in the NetApp file server for that volume is set to the END RECLAIM PERIOD date. The NetApp file server prevents deletion of that volume until the END RECLAIM PERIOD date is reached. This date is approximately a month after the data has expired in the IBM Spectrum Protect server. When the IBM Spectrum Protect server calculates an END RECLAIM PERIOD date for a volume, and the date is later than the current END RECLAIM PERIOD, the date is reset in the NetApp file server for that volume to the later date. Resetting the date to a later value guarantees that the IBM Spectrum Protect WORM FILE volume is not deleted until all data on the volume expires, or the data is moved to another WORM FILE volume.
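
The date rules that are described above can be sketched in Python. This is an illustration of the rules only, not the server's implementation. The function name update_reclaim_period is hypothetical, and "a month" is modeled as 30 days to match the default reclamation period that is mentioned earlier, which is an assumption: the server might compute the interval differently.

```python
from datetime import date, timedelta

# Assumption: "a month" is modeled as 30 days for illustration only.
RECLAIM_PERIOD_DAYS = 30


def update_reclaim_period(file_expirations, current_end=None,
                          period_days=RECLAIM_PERIOD_DAYS):
    """Return (begin, end) for a volume that holds files with these expiration dates.

    begin is the latest expiration date of any file on the volume.
    end is begin plus the period, but it never moves earlier than current_end.
    """
    begin = max(file_expirations)
    computed_end = begin + timedelta(days=period_days)
    if current_end is None:
        end = computed_end
    else:
        end = max(current_end, computed_end)
    return begin, end


# First file archived: expires on 5 September 2017.
begin, end = update_reclaim_period([date(2017, 9, 5)])
print(begin, end)  # 2017-09-05 2017-10-05

# A file with a later expiration date is added: both dates move later.
begin, end = update_reclaim_period(
    [date(2017, 9, 5), date(2017, 9, 20)], current_end=end)
print(begin, end)  # 2017-09-20 2017-10-20
```

Because the end date is taken as the later of the current and the newly computed value, adding a file that expires earlier than the existing files leaves both dates unchanged.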

The IBM Spectrum Protect reclamation period is the amount of time between the begin date and the end date. During the reclamation period, the IBM Spectrum Protect server deletes volumes on which all the data is expired, or moves files that are not expired on expiring SnapLock volumes or volumes in an immutable fileset to new volumes with new dates.

Data on a SnapLock volume or on a volume in an immutable fileset typically expires by the beginning date. Therefore, the volume is usually empty. When the end date arrives, the volume can be safely deleted from the IBM Spectrum Protect inventory and the SnapLock file server or immutable fileset.

However, some events might cause valid data to remain on the SnapLock volume or on the volume in the IBM Spectrum Scale immutable fileset:

  • Expiration processing in the IBM Spectrum Protect server for that volume might be delayed or incomplete.
  • The retention parameters on the copy group or associated management classes might be altered for a file after it was archived. As a result, file expiration is scheduled for a later date.
  • A deletion hold might be placed on one or more of the files on the volume.
  • Reclamation processing is either disabled or is encountering errors during data movement to new volumes on a SnapLock storage pool.
  • A file is waiting for an event to occur before the IBM Spectrum Protect server can begin file expiration.

When the beginning date arrives and files are not expired on a SnapLock volume or on a volume in an immutable fileset, the files must be moved to a new SnapLock volume or to a new volume in an immutable fileset with new begin and end dates. However, if expiration processing is merely delayed on the IBM Spectrum Protect server, and those files expire as soon as expiration processing runs, it is inefficient to move those files to a new SnapLock volume or immutable fileset.

To ensure that unnecessary data movement does not occur for files that are due to expire, movement of files on expiring SnapLock volumes or on volumes in immutable filesets is delayed by a specified number of days after the BEGIN RECLAIM PERIOD date. The delay gives IBM Spectrum Protect expiration processing time to finish. Because the data is protected in the SnapLock file server or in the immutable fileset until the END RECLAIM PERIOD date, there is no risk to the data in delaying this movement. After the specified number of days, if valid data is still on an expiring SnapLock volume or on a volume in an immutable fileset, the data is moved to a new SnapLock volume or immutable fileset, thus continuing the protection of the data.
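
The timing that is described above can be summarized as a small decision function. This is a sketch of one reading of the text, not the server's actual logic. The function name reclamation_action, the delay_days parameter, and the returned action names are hypothetical.

```python
from datetime import date, timedelta


def reclamation_action(today, begin, end, has_valid_data, delay_days):
    """Return the action for a volume on the given day.

    'none'   : the reclamation period has not started.
    'wait'   : nothing to do yet (retention still active, or within the delay).
    'delete' : the volume is empty and its retention date was reached.
    'move'   : valid data remains after the delay; move it to a new volume.
    """
    if today < begin:
        return "none"
    if not has_valid_data:
        # An empty volume cannot be removed until the retention date (end).
        return "delete" if today >= end else "wait"
    if today < begin + timedelta(days=delay_days):
        # Give expiration processing time to finish before moving data.
        return "wait"
    return "move"


begin, end = date(2017, 9, 5), date(2017, 10, 6)
print(reclamation_action(date(2017, 9, 6), begin, end, True, 4))    # wait
print(reclamation_action(date(2017, 9, 10), begin, end, True, 4))   # move
print(reclamation_action(date(2017, 10, 6), begin, end, False, 4))  # delete
```

In this sketch the data is never at risk during the wait, because the move always happens inside the reclamation period, provided that the delay is shorter than the period itself.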

After the data is initially archived, the retention parameters for that data might change (for example, through changes in the management class or copy group parameters), or a deletion hold might be placed on that data. However, the data on that volume is protected by SnapLock or immutable fileset features only until the END RECLAIM PERIOD date. Data that is not expired is moved to new SnapLock storage pool volumes during the IBM Spectrum Protect reclamation period. If errors occur when data is moved to a new SnapLock volume or immutable fileset, a warning message indicates that the data will soon be unprotected. If the error persists, issue a MOVE DATA command for the problem volume.

Attention: Do not disable reclamation processing on a SnapLock storage pool. After the processing is disabled, the IBM Spectrum Protect server cannot issue warning messages that data will become unprotected. This situation can also occur if reclamation and migration are disabled for the entire server.