Checklist 4: Actions for load-source disk unit failure

This checklist shows the sequence of steps to follow to recover from a failure of a load-source disk unit with complete data loss, on a system where a user auxiliary storage pool is configured.

This checklist is used for the following problem situation:

Failed unit: Load source unit
Data loss: All
User ASP is configured: Yes
Basic user ASP overflowed: No
Attention: When you replace a disk unit in your system auxiliary storage pool (ASP), the system loses addressability to the objects in your basic user ASPs. To recover object ownership for objects other than document library objects (DLOs), you must manually assign ownership for every object in every basic user ASP. You might want to treat this situation as a total recovery and restore all your information from your save media if the following conditions are true:
  • You have many objects in your basic user ASPs
  • You backed up your system

If both conditions apply, run the steps that are described in the topic Checklist 20: Recovering your entire system after a complete system loss to recover your system.

Before you begin your recovery, make a copy of this checklist. Complete the appropriate areas as you and the service representative run the recovery steps. This checklist provides an important record of your recovery actions. It can help you to diagnose any problems that occur after the recovery. It might also be useful in evaluating your backup strategy.

Most steps in the checklist include references to other topics in this document. Refer to these topics if you need more information about how to run a particular step. You might not need to run some steps, such as restoring changed objects, if they do not apply in your situation.

Note: When your load source unit fails and you are recovering from distribution media, you might not be able to use the Operations Console (LAN) for the recovery.
Table 1. Recovering from a disk failure: Checklist 4
Task What to do Where to read more about it
Actions run by the service representative
___ Task 1 Attach the new disk unit.  
___ Task 2 Prepare to load the Licensed Internal Code¹ using the most recent Save System (SAVSYS) media. Preparing for loading the Licensed Internal Code.
___ Task 3 Install the Licensed Internal Code by selecting option 3 (Install Licensed Internal Code and Recover Configuration). Loading the Licensed Internal Code.
___ Task 4 Recover the disk configuration (assignment of disks to ASPs and protection). Recovering your disk configuration.
Actions run by the user
___ Task 5 If you are using an encrypting tape drive, verify that the Encryption Key Manager (EKM) is running and connected to the tape library before you begin the recovery operation. The EKM contains the encryption keys that are needed for the recovery operation. Skip this step if you are not using an encrypting tape drive. rzarmencryptrecovertape.htm
___ Task 6 Restore the operating system, beginning with Task 1: Starting to restore the operating system. You are doing a complete restore operation. Restoring the operating system, task 1 through task 6.
___ Task 7 If you are restoring from an encrypted backup, the save/restore master key on the target system must match the save/restore master key on the source system. If the keys do not match, set the save/restore master key on the target system so that all the master keys can be recovered. Recovering from an encrypted backup using software encryption, and Loading and setting save/restore master key in the Cryptography topic of the IBM i documentation.
___ Task 8 If you restored the operating system from distribution media, some system information was reset to default values, such as access path recovery times and the system reply list. Set these values correctly. Recovering system information.
___ Task 9 If necessary, change the QALWOBJRST system value by using the WRKSYSVAL command. Write the old value here: ___________ Controlling restoration of security-sensitive objects.
___ Task 10 If necessary, change the QVFYOBJRST system value by using the WRKSYSVAL command. Write the old value here: ___________ Controlling restoration of security-sensitive objects.
___ Task 11 If necessary, change the system value that controls whether the job log wraps when it is full. Use the Work with System Values command: WRKSYSVAL QJOBMSGQFL. Write down the current value here: ___________. Then change the value to *PRTWRAP. The System values topic in the IBM i documentation.
___ Task 12 After the system values are changed, sign off by using the command SIGNOFF *LIST. Then, using a newly created password, sign back on as QSECOFR for the new values to take effect. Describing the contents of your user auxiliary storage pools.
___ Task 13 Describe or diagram, as much as possible, the contents of your user ASPs before the failure. Describing the contents of your user auxiliary storage pools.
___ Task 14 Recover user profiles, configuration, libraries in the system ASP, and the contents of your basic user ASPs. If you choose not to restore all of your libraries now, verify that you restore the QGPL and QUSRSYS libraries along with the libraries that you are restoring. Recovering a basic user auxiliary storage pool (ASP) after recovering the system ASP, task 1 through task 11.
___ Task 15 Restore document library objects. Restoring documents and folders.
___ Task 16 Restore your last complete save of directories.¹ Restoring objects in directories.
___ Task 17 If you have user-defined file systems in user ASPs that do not restore correctly, you might need to run more recovery steps. Task 7: Restoring user-defined file systems to the user auxiliary storage pool.
___ Task 18 Restore changed objects and apply journaled changes. Restoring changed objects and apply journaled changes, task 1 through task 7.
___ Task 19 Update program temporary fix (PTF) information for all PTF save files in the QGPL library by typing: UPDPTFINF. Restoring changed objects and apply journaled changes.
___ Task 20 If the Save System Information (SAVSYSINF) command was used, run the Restore System Information (RSTSYSINF) command. RSTSYSINF restores a subset of the system data and objects that are saved by SAVSYSINF. Restoring system information.
___ Task 21 Restore authority by typing: RSTAUT. Restoring object authorities.
___ Task 22 Reapply any PTFs that were applied since your last SAVSYS operation. Restoring program temporary fixes (PTFs).
___ Task 23 If necessary, change the QALWOBJRST system value back to its original value by using the WRKSYSVAL command. Controlling restoration of security-sensitive objects.
___ Task 24 If necessary, change the QVFYOBJRST system value back to its original value by using the WRKSYSVAL command. Controlling restoration of security-sensitive objects.
___ Task 25 If necessary, change the QJOBMSGQFL system value back to its original value by using the WRKSYSVAL command. System values.
___ Task 26 Run one of the following commands:
SIGNOFF *LIST
DSPJOBLOG * *PRINT

Check the job log to verify that all objects were restored. The job log contains information about the restore operation. To verify that all objects were restored, spool the job log for printing along with the job's remaining spooled output, if any.

Message CPC3703 is sent to the job log for each library that was successfully restored. Message CPF3773 is sent to indicate how many objects were restored. Objects might not be restored for various reasons. Check for any error messages, correct the errors, and then restore those objects from the media.
___ Task 27 Do an IPL now. Performing a normal initial program load.
___ Task 28 If IBM Content Manager OnDemand for i is installed, complete journaling for Content Manager OnDemand by typing the following commands:
CALL QRDARS/QRLCSTRJ PARM('RLC')
CALL QRDARS/QRLCSTRJ PARM('RLR')
___ Task 29 If you have the Cryptographic Device Manager licensed program, 5733-CY2 or 5733-CY3, installed, run this command: CALL QCCADEV/QCCAELOAD
___ Task 30 Review job logs or output from your restore operations to verify that all objects were restored successfully. Verifying whether objects are restored successfully.
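
Tasks 26 and 30 ask you to check the job log for messages CPC3703 and CPF3773. When many libraries are restored, a small script can help with that review. The following Python sketch is one possible approach and is not part of the IBM i tooling. It assumes that the job log was copied to plain text, that each message ID appears on the same line as its message text, and that the CPF3773 text has the form "n objects restored. m not restored ...". The function name summarize_job_log and the sample log lines are hypothetical.

```python
import re
from collections import Counter

# Message IDs named in Task 26 of the checklist.
LIBRARY_RESTORED = "CPC3703"  # sent for each successfully restored library
RESTORE_SUMMARY = "CPF3773"   # reports how many objects were restored

# Assumed wording of the CPF3773 text: "<n> objects restored. <m> not restored ..."
NOT_RESTORED = re.compile(r"(\d+)\s+not restored", re.IGNORECASE)


def summarize_job_log(lines):
    """Count restore messages and collect summaries that need a manual review."""
    counts = Counter()
    needs_review = []
    for line in lines:
        if LIBRARY_RESTORED in line:
            counts[LIBRARY_RESTORED] += 1
        if RESTORE_SUMMARY in line:
            counts[RESTORE_SUMMARY] += 1
            match = NOT_RESTORED.search(line)
            # Flag the line if objects were skipped, or if the count cannot be read.
            if match is None or int(match.group(1)) > 0:
                needs_review.append(line.strip())
    return counts, needs_review


# Hypothetical job log excerpt, for illustration only.
sample_log = [
    "CPC3703  Completion  12 objects restored from PAYLIB to PAYLIB.",
    "CPF3773  Diagnostic  40 objects restored. 3 not restored to ORDLIB.",
    "CPC3703  Completion  7 objects restored from INVLIB to INVLIB.",
]

counts, needs_review = summarize_job_log(sample_log)
print("Libraries restored successfully:", counts[LIBRARY_RESTORED])
print("Restore summaries found:", counts[RESTORE_SUMMARY])
for entry in needs_review:
    print("Review:", entry)
```

Any line that is listed for review points to a library where objects might have been skipped. Check the surrounding error messages in the job log, correct the errors, and then restore those objects from the media.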