Several tasks must be completed before you use the system.
The recovery procedure re-creates the old system from the quorum data. However, some things
cannot be restored, such as cached data or the system data that manages in-flight I/O. This latter
loss of state affects RAID arrays that manage internal storage. The detailed map of where data is
out of synchronization is lost, which means that all parity information must be restored and
mirrored pairs must be brought back into synchronization. Normally, this action results in the use
of old or stale data, so only in-flight writes are affected. However, if the array lost redundancy
(such as a syncing, degraded, or critical RAID status) before the error that requires system
recovery, the situation is more severe. In this situation, you must check the internal storage (see
the example command after this list):
- Parity arrays are likely syncing to restore parity; they do not have redundancy while this
operation proceeds.
- Because there is no redundancy during this process, bad blocks might be created where data is not
accessible.
- Parity arrays might be marked as corrupted. This identification indicates that the extent of the
lost data is wider than in-flight I/O; to bring the array online, the data loss must be
acknowledged.
- RAID 6 arrays that were degraded before the system recovery might require a full restore from
backup. For this reason, it is important to have at least one spare of matching capacity
available.
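As a starting point for these checks, you can review the array status from the CLI. A minimal
sketch; mdisk3 is an example array name, and the fields to inspect include status and redundancy:

  lsarray            # concise view; check the status and redundancy of each array
  lsarray mdisk3     # detailed view of a single array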
Be aware of these differences about the recovered configuration:
- FlashCopy® mappings are restored as idle_or_copied with 0% progress. Both volumes are restored
to their original I/O groups.
- The system ID is different. Any scripts or associated programs that refer to the
system-management ID of the system must be changed (see the lssystem example after this list).
- Any FlashCopy mappings that were not in the idle_or_copied state with 100% progress at the point
of the disaster have inconsistent data on their target disks. These mappings must be restarted (see
the startfcmap example after this list).
- Intersystem partnerships and relationships are not restored and must be
re-created manually.
- Consistency groups are not restored and must be re-created manually.
- Intrasystem Metro Mirror relationships are
restored if all dependencies were successfully restored to their original I/O groups.
- If hardware was replaced before the recovery, the SSL certificate might not be restored. If it is not restored, then a new self-signed certificate is generated with a
validity of 30 days. Follow the associated Directed Maintenance Procedures (DMP) for a permanent
resolution.
- The system time zone might not be restored.
- Any Global Mirror secondary volumes on the recovered system might have inconsistent data if
replication I/O from the primary volume was cached on the secondary system at the point of the
disaster. A full synchronization is required when these relationships are re-created and restarted
(see the mkrcrelationship example after this list).
- Any volumes that were being formatted when the system failure occurred are set to the
"formatting_corrupt" state by a system recovery and are taken offline. The recovervdisk CLI command
must be used to recover the volume, synchronize it with a synchronized copy, and bring it back
online (see the recovervdisk example after this list).
- After the system recovery process completes, the disks initially report their entire real
capacity. When I/O resumes, the capacity is determined and adjusted to reflect the correct value.
Similar behavior occurs when you use the -autoexpand option on volumes: the real capacity of a disk
might increase slightly, caused by the same kind of behavior that affects compressed volumes.
Again, the capacity shrinks as I/O to the disk resumes.
- Distributed RAID 1 rebuild in place synchronizes data between data strip mirrors, where
possible. This synchronization can be observed through the lsarraymemberprogress command (see the
example after this list).
- If
the system recovery occurs during a nondisruptive system migration, recovery of system data is
dependent on the point in the migration process when the system recovery action occurred. For more
information, see Verifying migration volumes after a system recovery.
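For the system ID change, the new system-management ID can be read from the CLI so that scripts
can be updated. A minimal sketch:

  lssystem           # the id field reports the new system-management ID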
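For the FlashCopy mappings that must be restarted, a possible sequence follows; fcmap0 is an
example mapping name:

  lsfcmap                   # identify mappings that were not idle_or_copied with 100% progress
  startfcmap -prep fcmap0   # prepare and restart the mapping, which recopies data to the target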
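For the Global Mirror relationships that must be re-created with a full synchronization, the
following sketch assumes a volume named vdisk0 on both systems, a partnership that was already
re-created, and example names remote_system and rcrel0:

  mkrcrelationship -master vdisk0 -aux vdisk0 -cluster remote_system -global -name rcrel0
  startrcrelationship rcrel0   # without -sync at creation, a full synchronization is performed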
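For volumes in the formatting_corrupt state, a possible recovery sequence follows; vdisk8 is an
example volume name:

  lsvdisk -filtervalue status=offline   # find volumes that the recovery took offline
  recovervdisk vdisk8                   # recover the volume and bring it back online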
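The distributed RAID 1 synchronization can be observed per array member; mdisk2 is an example
array name:

  lsarraymemberprogress mdisk2   # show the rebuild and synchronization progress of each member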
For Virtual Volumes (VVols), complete the following tasks.
- After you confirm that the T3 recovery completed successfully, restart the Spectrum Control Base
(SCB) services. Use the Spectrum Control Base command service ibm_spectrum_control start (see the
sketch after this list).
- Refresh the storage system information on the SCB GUI to ensure that the systems are in sync
after the recovery.
- To complete this task, log in to the SCB GUI.
- Hover over the affected storage system, select the menu launcher, and then select
Refresh. This step repopulates the system information.
- Repeat this step for all Spectrum Control Base instances.
- Rescan the storage providers from within the vSphere Web Client.
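For the service restart in the first task, a sketch follows; the status subcommand is a standard
Linux service check and is shown here as an assumption:

  service ibm_spectrum_control start    # restart the SCB services after the T3 recovery
  service ibm_spectrum_control status   # verify that the services are running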
For Virtual Volumes (VVols), also be aware of the following information.
- FlashCopy mappings are not restored for VVols. The implications are as follows.
- The mappings that describe the VM's snapshot relationships are lost. However, the Virtual
Volumes that are associated with these snapshots still exist, and the snapshots might still appear
on the vSphere Web Client. This outcome might have implications for your VMware backup solution.
- Do not attempt to revert to snapshots.
- Use the vSphere Web Client to delete any snapshots for VMs on a VVol data store to free up disk
space that is being used unnecessarily.
- The targets of any outstanding 'clone' FlashCopy relationships might not function as expected
(even if the vSphere Web Client recently reported clone operations as complete). For any VMs that
are targets of recent clone operations, complete the following tasks.
- Complete data integrity checks as recommended for conventional volumes.
- If clones do not function as expected or show signs of corrupted data, take a fresh clone of
the source VM to ensure that data integrity is maintained.