Logical Units of Work

To support CMS work units and to make file pool server requests atomic (each request either completes or fails as a whole), a file pool server groups requests for commit or rollback into a logical unit of work (LUW). You can think of a logical unit of work as the server's representation of the work it is doing on behalf of a user. File pool server processing is designed so that all operations in a logical unit of work succeed or fail as a unit. A server starts a logical unit of work for a user when the user first sends a request to the server. It ends the logical unit of work when it receives a request to commit or roll back the changes. Many file pool server requests are defined to be atomic. That is, the CMS work unit completes with one file pool server request, and the file pool server logical unit of work is just a single request. BFS requests to a file pool server are normally atomic. One exception is BFS files opened by an application, as described in the note in section Work Units. Because multiple CMS work units can be active in a user machine, a user can have multiple corresponding logical units of work active in a server.
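The all-or-nothing behavior of a logical unit of work can be illustrated with a small sketch. This is a toy model for illustration only, not how a file pool server is implemented; the class name LogicalUnitOfWork and its methods are hypothetical.

```python
class LogicalUnitOfWork:
    """Toy model (hypothetical): stage changes, then commit or roll back all of them."""

    def __init__(self, files):
        self.files = files    # committed state: file name -> contents
        self.pending = {}     # changes made in this LUW, not yet committed

    def write(self, name, contents):
        # Changes are only staged; the committed state is untouched.
        self.pending[name] = contents

    def commit(self):
        # Every staged change becomes visible together.
        self.files.update(self.pending)
        self.pending = {}

    def rollback(self):
        # Every staged change is discarded together.
        self.pending = {}


files = {"A": "old"}

luw = LogicalUnitOfWork(files)
luw.write("A", "new")
luw.write("B", "created")
luw.rollback()
after_rollback = dict(files)   # nothing from the LUW survives

luw = LogicalUnitOfWork(files)
luw.write("A", "new")
luw.write("B", "created")
luw.commit()
after_commit = dict(files)     # both changes are present
```

An atomic request corresponds to a logical unit of work that contains a single write followed immediately by a commit.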

To help enforce the rule that all changes in a CMS work unit must succeed or fail as a unit, file pool server repositories use the file pool log minidisks and the CRR recovery server uses the CRR log minidisks. The file pool log minidisks are critical to the server's ability to recover from system errors. In the file pool logs, server processing records changes that occur during all logical units of work. It also records when logical units of work are committed or rolled back. In the CRR logs, server processing records sync point activity. If the system encounters a problem and stops, the server can determine what work is incomplete the next time it is started. During this process, known as restart recovery, the server rolls back any work that was neither committed nor rolled back at the time of the error. It also redoes work that was committed but not yet permanently written to the file pool itself. If the server is a CRR recovery server, restart recovery discards any coordinated transactions (logged sync points) that are not prepared, and the coordinated transactions that are prepared go into resynchronization processing. By doing so, the server eliminates any partial changes to the file pool and satisfies the definition of CMS work units.
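The idea behind restart recovery can be sketched as follows. This is a simplified illustration under stated assumptions, not the server's actual algorithm or log format: the tuple-based log records, the before and after values they carry, and the function name restart_recovery are all hypothetical.

```python
def restart_recovery(pool, log):
    """Simplified sketch: undo uncommitted changes, redo committed ones.

    Hypothetical log records:
      ("change", luw_id, key, before, after)   before is None if the key is new
      ("commit", luw_id)
      ("rollback", luw_id)
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # Undo, newest first, every change made by a LUW that never committed.
    for rec in reversed(log):
        if rec[0] == "change" and rec[1] not in committed:
            _, _, key, before, _ = rec
            if before is None:
                pool.pop(key, None)
            else:
                pool[key] = before

    # Redo, oldest first, every change made by a committed LUW.
    for rec in log:
        if rec[0] == "change" and rec[1] in committed:
            _, _, key, _, after = rec
            pool[key] = after
    return pool


# LUW 1 committed, but its change had not reached the pool when the system stopped.
# LUW 2 was still in progress, and its partial change had reached the pool.
log = [
    ("change", 1, "A", "a0", "a1"),
    ("change", 2, "B", None, "b1"),
    ("commit", 1),
]
pool_at_failure = {"A": "a0", "B": "b1"}
recovered = restart_recovery(dict(pool_at_failure), log)
```

After recovery, the committed change to A is present and the partial change to B is gone, which is the outcome the definition of a CMS work unit requires.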

Maintaining a file pool log is so important that the server maintains two copies of it, each on a separate minidisk. This lets the server protect the integrity of the file pool even if there is a damaged track on one of the file pool logs or if there is a media error (the device breaks) that makes one of the file pool logs useless. The server can detect a damaged spot on one of the file pool logs and automatically compensate for it by using the other. In the case of a DASD error, you still have to replace the damaged minidisk, but the other file pool log protects the integrity of the file pool. To protect against media errors, define your file pool log minidisks on different DASD volumes.
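The benefit of keeping two log copies can be shown with a minimal sketch. It assumes, purely for illustration, that each record carries a CRC-32 checksum so that damage can be detected; the source does not say how the server detects damage, and the function names here are hypothetical.

```python
import zlib


def make_record(payload: bytes):
    # Assumption: a checksum stored with each record lets a reader detect damage.
    return (payload, zlib.crc32(payload))


def is_intact(record):
    payload, checksum = record
    return zlib.crc32(payload) == checksum


def append(log1, log2, payload):
    # Every record is written to both copies of the log.
    record = make_record(payload)
    log1.append(record)
    log2.append(record)


def read(log1, log2, index):
    # Use the first copy whose record is intact.
    for log in (log1, log2):
        record = log[index]
        if is_intact(record):
            return record[0]
    raise IOError("record damaged on both log copies")


log1, log2 = [], []
append(log1, log2, b"LUW 1 change")
append(log1, log2, b"LUW 1 commit")

# Simulate a damaged spot on the first copy by corrupting the stored checksum.
payload, checksum = log1[0]
log1[0] = (payload, checksum ^ 1)

first = read(log1, log2, 0)    # served from the second copy
second = read(log1, log2, 1)   # served from the first copy
```

If both copies resided on the same DASD volume, a single device failure could damage the same record in each, which is why separate volumes are recommended.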

The file pool logs are initialized when the file pool is generated. They can be reconfigured (moved to a different device, made larger, made smaller) by running the FILESERV LOG command.

Each file pool log is limited in size to a single DASD volume. Because the file pool logs are mirrors of each other, both logs must be of identical size. Generating a File Pool and Server describes how to estimate the size needed and provides recommendations on where to place the file pool repository logs and CRR logs.

If the server is a CRR recovery server, there must also be two CRR logs associated with the CRR recovery server in addition to the two file pool repository logs. The CRR log minidisks are critical to the CRR recovery server's ability to complete coordinated transactions, by means of resynchronization processing, when there has been a system error. The CRR recovery server records the state of all protected resources and protected conversation partners that are participating in a coordinated transaction. The CRR resynchronization function uses the data on the CRR logs to commit or roll back all the incomplete updates within a CRR logical unit of work to recover from:
  • Application errors
  • Server (participating resource manager) errors
  • Communications errors
  • System errors
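How a CRR recovery server might sort logged sync points at restart, discarding those that are not prepared and passing prepared ones to resynchronization, can be sketched as below. The dictionary-based log entries, the state names, and the function name are hypothetical; the actual CRR log format is not described here.

```python
def classify_sync_points(crr_log):
    """Split logged sync points into those discarded and those needing resynchronization."""
    discarded, resync = [], []
    for entry in crr_log:
        if entry["state"] == "prepared":
            # Prepared coordinated transactions go into resynchronization processing.
            resync.append(entry["luw"])
        else:
            # Coordinated transactions that are not prepared are discarded.
            discarded.append(entry["luw"])
    return discarded, resync


# Hypothetical log contents at restart.
crr_log = [
    {"luw": "LUW-1", "state": "prepared"},
    {"luw": "LUW-2", "state": "in_progress"},
    {"luw": "LUW-3", "state": "prepared"},
]
discarded, resync = classify_sync_points(crr_log)
```

Resynchronization then commits or rolls back the incomplete updates for each prepared transaction, using the participant state recorded on the CRR logs.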

Maintaining a CRR log is so important that the CRR recovery server maintains two copies of it, each on a separate minidisk. This lets the CRR recovery server protect the integrity of the coordinated transaction even if there is a damaged track on one of the CRR logs or if there is a media error (the device breaks) that makes one of the CRR logs useless. The CRR recovery server can detect a damaged spot on one of the CRR logs and automatically compensate for it by using the other. In the case of a DASD error, you still have to replace the damaged minidisk, but the other CRR log protects the integrity of the coordinated transaction. To protect against media errors, define your CRR log minidisks on different DASD volumes.