Response time

Response time is measured from the point of view of an individual end user, who is not interested in resource utilization or in how many other users happen to be using the system. In practice, an end user usually wants a response at the very same time that many other users want theirs. That is, all the end users tend to be busy simultaneously, as can be the case for bank tellers at various branches throughout a city during lunch hour.

The line representing response time in Figure 1 shows the principal delays that contribute to the time needed to respond to a message. The figure represents a message that requires a total elapsed time of 3 seconds from the moment an end user makes a request at a terminal or workstation until the reply begins to appear at that terminal or workstation; in the z/TPF vernacular, this total elapsed time is called response time. The measurement of response time begins at the instant the end user enters a request (message). It includes the time that the message takes to travel over communication lines to a communication controller and to arrive through the channel subsystem at a CPU in a central processor complex (CPC). At the CPC, programs are invoked to process the message by accessing a large database, formatting a reply message, and sending the reply. The time required for this processing is also included in the response time.

Figure 1 shows a CPU occupancy of 1/2 second, or 500 milliseconds (ms). CPU occupancy includes the creation of a control block that identifies the programs and data necessary to perform the message processing. Although not all the programs and data need to reside in main storage during the processing of the message, the control block does. During this CPU occupancy, most of the time is usually spent waiting for I/O to complete, and very little is spent executing CPU instructions. The CPU occupancy of any given message consists of several intermixed processing intervals and I/O delays. Because the I/O gaps account for most of the delay while the message is in the CPU, the number of channels to secondary storage, the queueing disciplines, and the organization of data are all very important for maintaining fast response times at peak periods; each of them affects the length of the I/O gaps.

The processing intervals contribute very little to the response time delay of a single message. Exactly how little depends, of course, on the power of the CPU doing the processing. In the example response time line of Figure 1, the I/O delays can account for 494 milliseconds and the processing intervals for 6 milliseconds. Clearly, if the system responds to only one user, the CPU can be orders of magnitude slower without any dramatic change in that single user's response time. Multiprogramming is employed so that, during the I/O delays for a given message, a single CPU can work on other messages for other transactions. Multiprocessing is used to allow multiple CPUs to share the processing load of very high message rates.
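The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration, not part of z/TPF: the 3-second, 494 ms, and 6 ms figures come from the Figure 1 example, while the function names and the assumption that only the processing intervals scale with CPU speed are hypothetical simplifications introduced here.

```python
# Figures taken from the Figure 1 example in the text.
TOTAL_RESPONSE_MS = 3000  # end-to-end response time for one message
IO_DELAY_MS = 494         # I/O delays within the CPU occupancy
PROCESSING_MS = 6         # processing intervals within the CPU occupancy


def response_time_ms(cpu_slowdown):
    """Estimated response time if the CPU were `cpu_slowdown` times slower.

    Assumption (not from the source): only the processing intervals
    grow with a slower CPU; I/O and network delays stay unchanged.
    """
    extra_processing = PROCESSING_MS * (cpu_slowdown - 1)
    return TOTAL_RESPONSE_MS + extra_processing


def cpu_busy_fraction():
    """Fraction of the CPU occupancy spent executing instructions."""
    return PROCESSING_MS / (IO_DELAY_MS + PROCESSING_MS)


if __name__ == "__main__":
    for slowdown in (1, 10):
        print(f"CPU {slowdown}x slower: {response_time_ms(slowdown)} ms")
    print(f"CPU busy during occupancy: {cpu_busy_fraction():.1%}")
```

Under these assumptions, a CPU ten times slower adds only 54 ms to the 3-second response of a single message. The CPU is also executing instructions for only about 1.2% of the 500 ms occupancy, which is the idle capacity that multiprogramming puts to work on behalf of other messages.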

z/TPF systems are generally used in environments where economies are realized by bringing large volumes of data together with CPUs powerful enough to service thousands of users at peak periods. In some environments, the z/TPF system is also used to hold shared network data so that processing function can be distributed throughout a network of processing centers. For example, the system is used in credit verification applications to process a credit inquiry or to route the inquiry between an agent and an appropriate processing center.
Figure 1. Response time (per message)