How a database can become corrupted

Human error is the most common cause of a corrupted database. The following table lists the ways that pointer errors are most commonly introduced into IMS databases, based on the errors that users report most often. The list is not all-inclusive.

Table 1. Ways pointers get damaged

Cause of damage                       When it happens
------------------------------------  ----------------------------------------------
Update with wrong DBD (ACB)           JCL error in batch job
Improper recovery procedures          Missing log
Improper reorganization procedures    Use of an incorrect reorganization procedure
Failure to use emergency restart      After cancel, abend, or power failure
Failure to run batch backout          After cancel or abend
Software errors
Hardware errors                       Unnoticed I/O error

Updating with the wrong DBD (or ACB)

The DBD is the road map that IMS uses to interpret the database block. If a database is updated with the wrong DBD, the results are unpredictable, and pointer errors might occur.

Using the wrong DBD is usually the result of a JCL error, and it usually occurs in batch jobs. For example, a test DBD library might be accidentally used in a production update job, or production databases might be accidentally used in a test job.
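A simple pre-submission check can catch this kind of JCL mix-up. The following Python sketch is not an IMS facility; it scans JCL text for DSN= values and flags any data set that has a TEST qualifier. The naming convention (a TEST qualifier marks test libraries), the function name find_test_datasets, and the sample data set names are all assumptions made for illustration.

```python
import re

# Assumption: test libraries carry a qualifier named TEST (hypothetical convention).
TEST_QUALIFIER = "TEST"

# Matches DSN=name or DSNAME=name on a JCL statement.
DSN_PATTERN = re.compile(r"\bDSN(?:AME)?=([A-Z0-9$#@.\-]+)", re.IGNORECASE)


def find_test_datasets(jcl_text, marker=TEST_QUALIFIER):
    """Return (line_number, dsn) pairs for data sets that have the marker qualifier."""
    hits = []
    for line_number, line in enumerate(jcl_text.splitlines(), start=1):
        if line.startswith("//*"):  # skip JCL comment statements
            continue
        for match in DSN_PATTERN.finditer(line):
            dsn = match.group(1).upper()
            if marker in dsn.split("."):
                hits.append((line_number, dsn))
    return hits


# Illustrative production job that accidentally picks up a test DBD library.
SAMPLE_JCL = """\
//UPDATE   EXEC PGM=DFSRRC00,PARM='DLI,MYPGM,MYPSB'
//IMS      DD DSN=PROD.IMS.PSBLIB,DISP=SHR
//         DD DSN=TEST.IMS.DBDLIB,DISP=SHR
//* DD DSN=TEST.IGNORED.COMMENT,DISP=SHR
//CUSTDB   DD DSN=PROD.CUSTOMER.DB,DISP=OLD
"""

if __name__ == "__main__":
    for line_number, dsn in find_test_datasets(SAMPLE_JCL):
        print(f"line {line_number}: suspicious data set {dsn}")
```

The check compares whole qualifiers rather than substrings, so a data set such as PROD.TESTING.LIB is not flagged. A real installation would substitute its own naming rules.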

Improper recovery procedures

Using improper recovery procedures is a common way that pointers get damaged. When you run the IMS Database Change Accumulation utility and the Database Recovery utility, it is important to include all of the log data sets. If a log data set is omitted from the recovery process, a corrupted database is the likely result.
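The idea of a complete set of logs can be illustrated with a small gap check. The following Python sketch is a toy model, not a description of how the IMS utilities validate their input: it assumes that each log data set covers a contiguous range of hypothetical sequence numbers and reports any range that no log covers. The LogDataSet type, the find_log_gaps function, and the data set names are invented for this example.

```python
from typing import List, NamedTuple, Tuple


class LogDataSet(NamedTuple):
    name: str   # hypothetical data set name
    first: int  # hypothetical first sequence number covered by the log
    last: int   # hypothetical last sequence number covered by the log


def find_log_gaps(logs: List[LogDataSet], start: int, end: int) -> List[Tuple[int, int]]:
    """Return the inclusive ranges within [start, end] that no log covers."""
    gaps = []
    next_needed = start
    for log in sorted(logs, key=lambda entry: entry.first):
        if log.last < next_needed:
            continue  # this log ends before the point that is still needed
        if log.first > next_needed:
            gaps.append((next_needed, min(log.first - 1, end)))
        next_needed = max(next_needed, log.last + 1)
        if next_needed > end:
            break
    if next_needed <= end:
        gaps.append((next_needed, end))
    return gaps


if __name__ == "__main__":
    # The log that covers 201-300 was left out of the recovery input.
    supplied = [
        LogDataSet("LOG.A", 1, 100),
        LogDataSet("LOG.B", 101, 200),
        LogDataSet("LOG.D", 301, 400),
    ]
    print(find_log_gaps(supplied, start=1, end=400))
```

An empty result means that the supplied logs cover the whole range. Any reported gap corresponds to a log that was left out of the recovery input.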

When recovering a database, IMS restores segments, pointers, and free space elements. If a log is left out of a recovery attempt, then a segment could be stored in the range of an incorrectly restored free space element.

Usually, IMS reclaims the free space and creates or updates a free space element when a segment is deleted. All pointers pointing to targets that are in the free space area are set to zero. If a log is left out of a recovery attempt, then these pointers might not get set to zero.
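The resulting damage, a pointer whose target lies inside free space, can be modeled in a few lines. The following Python sketch uses a deliberately simplified layout that is an assumption of this example and not the actual IMS block format: free space elements are (start, length) pairs, pointers are byte offsets, and a pointer value of zero means that there is no target.

```python
from typing import Dict, List, Tuple


def in_free_space(offset: int, free_space: List[Tuple[int, int]]) -> bool:
    """Return True if the offset lies inside any (start, length) free space element."""
    return any(start <= offset < start + length for start, length in free_space)


def find_bad_pointers(pointers: Dict[str, int], free_space: List[Tuple[int, int]]) -> List[str]:
    """Return the names of nonzero pointers whose target lies in free space."""
    return sorted(
        name
        for name, target in pointers.items()
        if target != 0 and in_free_space(target, free_space)
    )


if __name__ == "__main__":
    # Hypothetical block: bytes 200-299 were reclaimed as free space, but one
    # pointer was never reset to zero because the log that recorded the delete
    # was omitted from the recovery.
    free_space = [(200, 100)]
    pointers = {"root->child": 120, "child->twin": 240, "twin->next": 0}
    print(find_bad_pointers(pointers, free_space))
```

In this model, the pointer named child->twin is reported because its target, offset 240, falls inside the free space element that starts at offset 200.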

Improper reorganization procedures

Using improper reorganization procedures is a common way that databases get damaged. In this case, generally, you can recover the database from the image copy that was taken before the reorganization.

However, when a HALDB is reorganized by improper reorganization procedures, the HALDB partition reorganization numbers in the partition can become corrupted or ILKs can become incorrect. Reorganization numbers can become corrupted if the HALDB reorganization number verification function of IMS is not enabled and either of the following events occurs: a reorganization fails to increment the reorganization number of a partition correctly, or a segment that has a low reorganization number in its EPS is moved into a partition and lowers the reorganization number of the destination partition. In this case, you cannot repair the corrupted partition reorganization numbers by using the standard IMS recovery methods. For more information, see the description of the DUPILKCHK keyword in the PROC statement.

For information about HALDB partition reorganization numbers and how they can become corrupted, see the topic "HALDB partition reorganization numbers" in IMS Database Administration.

Failure to use emergency restart

Emergency restart is an extension of IMS’s normal restart process. It is initiated by a master terminal operator command whenever it is necessary to restart IMS after an IMS, z/OS®, hardware, or power failure. It should also be used whenever a prior execution of the IMS system was not terminated with a successful checkpoint.

If there is a power failure, the contents of memory, including the IMS buffers, are lost. If the system is brought back up without emergency restart, database changes that were made by in-flight transactions are not backed out, and the database is probably damaged.
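Backing out in-flight changes can be pictured with a generic undo log. The following Python sketch illustrates the general technique only and does not describe the IMS implementation: it assumes a dictionary as the database and a change log of (transaction, key, before-image) entries, where a before-image of None means that the key did not exist before the change. All names are hypothetical.

```python
def back_out_in_flight(database, change_log, committed):
    """Undo, newest first, every logged change whose transaction never committed."""
    for txn, key, before_image in reversed(change_log):
        if txn in committed:
            continue  # committed work is kept
        if before_image is None:
            database.pop(key, None)       # the change was an insert; remove it
        else:
            database[key] = before_image  # restore the value from before the change
    return database


if __name__ == "__main__":
    # State at the time of the failure: T1 committed, T2 was still in flight.
    database = {"A": 10, "B": 99, "C": 7}
    change_log = [("T1", "A", 5), ("T2", "B", 20), ("T2", "C", None)]
    print(back_out_in_flight(database, change_log, committed={"T1"}))
```

If this step is skipped, the partial updates from T2 stay in the database, which is the kind of inconsistency that leads to damage.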

If the operator cancels the IMS DB/DC or CICS®/IMS DB control region, the situation is similar to that of a power failure. If the system is brought back up without emergency restart, the database is probably damaged.

Failure to run batch backout

If the IMS Batch Backout utility is not run when it is necessary, a corrupted database is the likely result. A batch backout might be needed after a batch job is canceled or terminates abnormally. The circumstances under which this utility must be run are described in IMS Database Administration.

Software errors

Software errors in z/OS or IMS programs could result in pointer errors.

Hardware errors

Hardware failures could also result in pointer errors. If the operator does not notice an I/O error message and recover from the error appropriately, the database could be damaged.