Performing an unplanned maintenance using IW cache

You can perform an unplanned maintenance by using the IW cache.

For an unplanned outage, some changes might not be synchronized with the home. IW can handle requests that are lost from the cache due to a disaster, while allowing application updates at the home to the same data.

In this scenario, a disaster at the IW cache site (Side A) makes the cache unavailable while updates are still pending. Follow these steps for an unplanned outage:

  1. Start applications at home (Side B). Record the time.
  2. When the IW cache site (Side A) is ready to take over again, stop applications at the home site (Side B).

    The following figure illustrates the IW cache site (Side A) and the home site (Side B).

    Figure 1. IW cache site (Side A) and home site (Side B)
  3. Link the cache site (Side A) back and ensure that the gateways are up. Do not access the fileset directories.

    If the cache directories are accessed, recovery is triggered. Old contents from the cache, which were pending, are synchronized with the home at the end of recovery, overwriting the latest application changes at home. Perform the following steps so that the latest data is preserved and old or stale data is discarded. Data staleness is determined based on the file mtime and the failover time. The failover time is the time when applications were moved to the home, and it is given as input to the failback command.

  4. Run the following command from the cache site to synchronize the latest data, specifying the time when the home site became functional and including the complete time zone, in case home and cache are in different time zones:
    mmafmctl FileSystem failback -j Fileset --start --failover-time 'TimeIncludingTimezone'

    The failback command resolves conflicts between the updates that were pending in the cache at the time of failover and the data changes at home. While the synchronization is in progress, the fileset state is FailbackInProgress and the fileset is read-only.

    After failback completes, the fileset state is FailbackCompleted. The failback process resolves conflicts in the following way:

    1. Dirty data that was not synchronized is identified.
    2. If the changed files or directories were not modified at the home while the applications were connected to the home, the cache pushes the change to the home to avoid any data loss.
    3. If the files or directories were modified at the home after the failover, the cache discards its earlier updates. The next lookup on the cache brings the latest metadata from the home.

    If the conflicted dirty data is a directory, it is moved to .ptrash. If the conflicted dirty data is a file, it is moved to .pconflict. The administrator must clean the .pconflict directory regularly. If the IW fileset is converted to another mode, the .ptrash and .pconflict directories remain.

  5. After the fileset reaches the FailbackCompleted state, run the following command to move the fileset from the Read-Only state to the Active state, where it is ready for use. If the command does not run successfully, run it again.

    mmafmctl FileSystem failback -j Fileset --stop

    Note: If failback is not complete or if the fileset moves to NeedFailback, run the failback command again.
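The conflict-resolution rules in step 4 can be pictured as a small decision function. The following is only an illustrative sketch, not the actual failback implementation: the function name classify_dirty_entry, its arguments, and the exact comparison (home mtime later than the failover time counts as a conflict) are assumptions made for this example.

```shell
#!/bin/sh
# Illustrative sketch only; this is NOT how mmafmctl failback is implemented.
# classify_dirty_entry is a hypothetical helper that mirrors the rules above.
#   $1: mtime of the entry at home, in seconds since the epoch
#   $2: failover time, in seconds since the epoch
#   $3: entry type, "file" or "dir"
classify_dirty_entry() {
    home_mtime=$1
    failover=$2
    entry_type=$3
    if [ "$home_mtime" -le "$failover" ]; then
        # Not modified at home after failover: the pending cache change wins.
        echo "push-to-home"
    elif [ "$entry_type" = "dir" ]; then
        # Conflicted directory: the stale cache copy is set aside in .ptrash.
        echo "move-to-.ptrash"
    else
        # Conflicted file: the stale cache copy is set aside in .pconflict.
        echo "move-to-.pconflict"
    fi
}

# Example calls with made-up epoch values (failover at 2000).
classify_dirty_entry 1000 2000 file
classify_dirty_entry 3000 2000 file
classify_dirty_entry 3000 2000 dir
```

The epoch values here are placeholders. On a real system the mtime and failover time would come from the file systems and from the time recorded in step 1.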

The cache site is ready for use, and all applications can start functioning from the cache site. New files that are created at home are reflected in the cache on the next access, based on the revalidation interval. Failback does not pull in the data of uncached files from home. The administrator must fetch that data explicitly by using the mmafmctl prefetch command.

If failback might be used, the ctime of RENAME operations in the home file system must be updated. To enable this behavior, set setCtimeOnFileRename at home:

mmchconfig setCtimeOnFileRename=yes -i

Note: Failback does not work if revalidation is disabled in the IW cache site.
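
The failback commands from steps 4 and 5 can be collected into a dry-run helper that prints them for review instead of running them. This is a minimal sketch under stated assumptions: the function names record_failover_time and print_failback_plan, the file system name fs1, the fileset name iw1, and the timestamp layout are all placeholders chosen for illustration. Check the mmafmctl reference for the time formats that --failover-time accepts.

```shell
#!/bin/sh
# Dry-run sketch: prints the failback commands, does not execute them.

# Hypothetical helper for step 1: print the current time with a numeric
# time zone offset so that it can be recorded when applications move to home.
# The layout is an assumption, not a format mandated by mmafmctl.
record_failover_time() {
    date '+%Y-%m-%d %H:%M:%S %z'
}

# Hypothetical helper for steps 4 and 5: print the start and stop commands.
#   $1: file system name, $2: fileset name, $3: recorded failover time
print_failback_plan() {
    fs=$1
    fileset=$2
    failover_time=$3
    echo "mmafmctl $fs failback -j $fileset --start --failover-time '$failover_time'"
    echo "mmafmctl $fs failback -j $fileset --stop"
}

# Example with placeholder names and a made-up recorded time.
print_failback_plan fs1 iw1 '2024-05-01 10:15:00 +0000'
```

The time passed to print_failback_plan should be the value recorded in step 1, when applications were started at home, not the time at which the failback is run.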