Parallel Sysplex principles

A Parallel Sysplex® is a collection of z/OS® systems that cooperatively use certain hardware and software components to achieve a high-availability workload processing environment.

A Parallel Sysplex is a set of up to 32 z/OS systems that are connected to behave as a single, logical computing platform.
  • The base sysplex was introduced in 1990.
  • The Parallel Sysplex was introduced in 1994 with the addition of the Coupling Facility.
The underlying structure remains virtually transparent to users, networks, applications, and operations.
The main principle behind a Parallel Sysplex is to provide data sharing capabilities, which then provide the following benefits:
  • Reduction or removal of single points of failure within the server, LPARs, and subsystems
  • Improved application availability
  • A single system image
  • Dynamic session balancing
  • Dynamic transaction routing
  • Scalable capacity

Parallel Sysplex components

A Parallel Sysplex is a set of systems that all have access to the same one or more Coupling Facilities. While a base sysplex is an actual entity, with a defined name (the sysplex name), a Parallel Sysplex is more conceptual: there is no single place that maintains the name of the Parallel Sysplex and a list of the systems it contains. A Parallel Sysplex has a number of constituents:

Coupling Facility
A key component in any Parallel Sysplex is the Coupling Facility (CF) infrastructure. In a Parallel Sysplex, central processor complexes (CPCs) are connected through the CF. The CF consists of hardware and specialized microcode (control code) that provides services for the systems in a sysplex. A CF runs in its own LPAR. Areas of CF storage are allocated for the specific use of CF exploiters; these areas are called structures. There are three types of structure:
  • Lock:

    For serialization of data with high granularity. Locks are, for example, used by IRLM for IMS DB and Db2® databases, and by CICS® when using DFSMS VSAM RLS.

  • Cache:

    For storing data and maintaining local buffer pool coherency information. Caches are, for example, used by DFSMS for catalog sharing, RACF® databases, Db2 databases, VSAM and OSAM databases for IMS, and by CICS when using DFSMS VSAM RLS. Caches contain both directory entries and optional data entries.

  • List:

    For shared queues and shared status information. List structures used by CICS include coupling facility data tables (CFDTs), named counters, CICS shared temporary storage, and CICSPlex SM region status.

There is also an area in the storage of the CF called dump space. This is used by an SVC dump routine to quickly capture serialized structure control information. After a dump, the dump space is copied into z/OS CPC storage and then to the SVC dump dataset. The dump space can contain data for several structures. The definitions of CF structures are maintained in coupling facility resource management (CFRM) policies.
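
The general shape of a CFRM policy is shown in the following sketch, which uses the IXCMIAPU administrative data utility. The policy name, CF names and identifying values, and the structure name and sizes are illustrative assumptions only; a CICS coupling facility data table pool structure is used as the example structure.

    //DEFCFRM  EXEC PGM=IXCMIAPU
    //SYSPRINT DD SYSOUT=*
    //SYSIN    DD *
      DATA TYPE(CFRM) REPORT(YES)
      DEFINE POLICY NAME(CFRMPOL1) REPLACE(YES)
        CF NAME(CF01) TYPE(002964) MFG(IBM) PLANT(02)
           SEQUENCE(000000000001) PARTITION(1) CPCID(00)
           DUMPSPACE(2000)
        CF NAME(CF02) TYPE(002964) MFG(IBM) PLANT(02)
           SEQUENCE(000000000002) PARTITION(1) CPCID(00)
           DUMPSPACE(2000)
        STRUCTURE NAME(DFHCFLS_POOL1)
           INITSIZE(16384) SIZE(32768)
           PREFLIST(CF01,CF02)
           REBUILDPERCENT(1)
    /*

The PREFLIST names two CFs so that the structure can be allocated in, or rebuilt into, the alternate CF, and DUMPSPACE reserves the dump space that is described in the previous paragraph.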

The cross-system extended services (XES) component of z/OS enables applications and subsystems to take advantage of the coupling facility.

The high-availability characteristics of Parallel Sysplex rely on the ability to non-disruptively move the structures from one CF to another, allowing a CF to be taken offline for service without impacting the systems that use that CF. All Parallel Sysplexes should have at least two CFs and those CFs should be accessible to every member of the sysplex.
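
As an illustration of that non-disruptive movement, a single structure, or all the structures in a CF, can be rebuilt into another CF in the preference list by using operator commands such as the following (the structure and CF names are assumptions):

    SETXCF START,REBUILD,STRNAME=DFHCFLS_POOL1,LOCATION=OTHER
    SETXCF START,REBUILD,CFNAME=CF01,LOCATION=OTHER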

Couple datasets

z/OS requires a couple dataset that is shared by all systems in the Parallel Sysplex (an alternate dataset is also recommended for availability). z/OS stores information that is related to the Parallel Sysplex, the systems, XCF groups, and their members in the sysplex couple dataset.
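
As an illustrative sketch, the following job formats a primary and an alternate sysplex couple dataset with the IXCL1DSU format utility; the sysplex name, dataset names, volumes, and ITEM counts are assumptions. The COUPLExx parmlib member then identifies the datasets to the system (PCOUPLE and ACOUPLE keywords).

    //FMTCDS   EXEC PGM=IXCL1DSU
    //SYSPRINT DD SYSOUT=*
    //SYSIN    DD *
      DEFINEDS SYSPLEX(PLEX1)
        DSN(SYS1.XCF.CDS01) VOLSER(VOL001)
        DATA TYPE(SYSPLEX)
          ITEM NAME(GROUP) NUMBER(100)
          ITEM NAME(MEMBER) NUMBER(200)
      DEFINEDS SYSPLEX(PLEX1)
        DSN(SYS1.XCF.CDS02) VOLSER(VOL002)
        DATA TYPE(SYSPLEX)
          ITEM NAME(GROUP) NUMBER(100)
          ITEM NAME(MEMBER) NUMBER(200)
    /*

Function couple datasets, such as those for CFRM and LOGR, are formatted with the same utility by specifying the corresponding DATA TYPE.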

XCF

Cross-system coupling facility (XCF) services allow authorized programs on one system to communicate with programs on the same system or on other systems. If a system fails, XCF services also provide the capability for batch jobs and started tasks to be restarted on another eligible system in the sysplex.

Application components are defined to XCF and are aware of the existence of other components that support the application. If one component fails, XCF automatically informs the other components. For more information about XCF, see XCF Concepts in z/OS MVS Programming: Sysplex Services Guide.
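
For example, the groups that are currently defined to XCF, and the members of a particular group, can be displayed with operator commands such as the following (DFHIR000, shown here as an illustration, is the default XCF group name that CICS uses for cross-system MRO):

    D XCF,GROUP
    D XCF,GROUP,DFHIR000,ALL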

Nodes in a sysplex use XCF communication services to perform coordination among sysplex TCP/IP stacks, to discover when new TCP/IP stacks are started, and to learn when a TCP/IP stack is stopped or leaves the XCF group (following a failure). This information is essential for automating the movement of applications and for enabling Sysplex Distributor to make intelligent workload management decisions. XCF messages can be transported either through a coupling facility or directly through IBM Enterprise Systems Connection (ESCON®) or IBM Fibre Channel connection (FICON®) channels.
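
The TCP/IP stacks join their XCF group when sysplex functions are enabled in the TCP/IP profile. A minimal sketch, with an assumed XCF address, subnet mask, and cost metric, is:

    IPCONFIG SYSPLEXROUTING
             DYNAMICXCF 10.1.9.1 255.255.255.0 1

DYNAMICXCF causes the stack to create XCF connectivity to the other stacks in the sysplex dynamically as they start.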

System Logger

z/OS System Logger provides a generalized logging capability for saving and recalling log records, and can merge log data from all of the systems in the sysplex into a single log stream. Exploiters of System Logger include OPERLOG, which provides a sysplex-wide merged syslog; LOGREC, which provides sysplex-wide error information; and CICS, which writes its DFHLOG and DFHSHUNT system log streams to Logger.
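
Log streams and the CF structures that back them are defined in the LOGR policy, again by using IXCMIAPU. The following sketch defines a structure and a CICS system log stream; the structure name, log stream name, and sizes are assumptions.

    //DEFLOGR  EXEC PGM=IXCMIAPU
    //SYSPRINT DD SYSOUT=*
    //SYSIN    DD *
      DATA TYPE(LOGR) REPORT(YES)
      DEFINE STRUCTURE NAME(LOG_GENERAL_001)
             LOGSNUM(10) MAXBUFSIZE(64000) AVGBUFSIZE(4096)
      DEFINE LOGSTREAM NAME(CICSTS.CICSAOR1.DFHLOG)
             STRUCTNAME(LOG_GENERAL_001)
             STG_DUPLEX(NO) RETPD(0) AUTODELETE(NO)
    /*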

The z/OS workload manager

The z/OS workload manager (WLM) is a component of z/OS that makes it possible to run multiple workloads at the same time within one z/OS image or across multiple images, managing system resources according to installation-defined performance goals.

Parallel Sysplex networking

The overall objective of designing a Parallel Sysplex environment for high availability is to create an environment where the loss of any single component does not affect the availability of the application. To achieve high availability and automatically overcome system or subsystem failures within a Parallel Sysplex, you must avoid tying an application to a single, fixed network address. There are a number of ways of doing this:

VIPA (Virtual IP Addressing)

Traditionally, an IP address is associated with a physical link; although traffic can be rerouted around failures in intermediate links in the network, the endpoints remain points of failure. VIPA was introduced to provide fault-tolerant network connections to a z/OS TCP/IP stack. It enables the definition of a virtual interface that is not associated with any hardware component and is therefore always available. To the routing network, the VIPA appears to be a host address that is indirectly attached to z/OS. Name servers are configured to return the VIPA of the TCP/IP stack, not the address of a physical interface. If a physical interface fails, dynamic route updates are sent out to update IP routing tables to use an alternate path; IP connections are not broken but are non-disruptively recovered through the remaining physical interfaces.

There are two versions of VIPA:
  • Static VIPA
  • Dynamic VIPA (DVIPA)

A static VIPA is an IP address that is associated with a particular TCP/IP stack. Using either ARP takeover or a dynamic routing protocol, for example OSPF, static VIPAs enable mainframe application communications to continue unaffected by network interface failures. As long as a single network interface is operational on the host, communication with applications on the host persists. Static VIPA does not require a sysplex (or XCF communications) because it does not require coordination between TCP/IP stacks.
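
A static VIPA can be defined in the TCP/IP profile as a virtual device and link that is added to the HOME list; the device name, link name, and address in this sketch are assumptions.

    DEVICE VDEV1  VIRTUAL 0
    LINK   VLINK1 VIRTUAL 0 VDEV1
    HOME
      10.1.1.1  VLINK1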

Dynamic VIPAs (DVIPAs) can be defined on multiple stacks and moved from one TCP/IP stack in the sysplex to another automatically. One stack is defined as the primary or owning stack, and the others are defined as backup stacks. Only the primary stack is made known to the IP network.
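
In the TCP/IP profile, the owning stack defines the DVIPA and each backup stack names the same address on a VIPABACKUP statement with a rank; the address, mask, and rank in this sketch are assumptions.

    ; Owning (primary) stack
    VIPADYNAMIC
      VIPADEFINE MOVEABLE IMMEDIATE 255.255.255.0 10.1.2.1
    ENDVIPADYNAMIC

    ; Backup stack
    VIPADYNAMIC
      VIPABACKUP 100 10.1.2.1
    ENDVIPADYNAMIC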

TCP/IP stacks in a sysplex exchange information about DVIPAs and their existence and current location, and the stacks are continuously aware of whether the partner stacks are still functioning. If the owning stack leaves the XCF group, for example, as the result of some sort of failure, then one of the backup stacks automatically takes its place and assumes ownership of the DVIPA. The network just sees a change in the routing tables or in the adapter that responds to ARP requests. Applications that are associated with these DVIPAs are active on the backup systems, providing a hot standby and high availability for the services. DVIPA addresses identify applications independently of which images in the sysplex the server applications execute on and allow an application to retain its identity when moved between images in a sysplex.

Sysplex Distributor

Sysplex Distributor provides connection workload management within a sysplex. It balances workloads and logons among systems that implement multiple concurrent application instances, each sharing access to data. Sysplex Distributor enables an IP workload to be distributed to multiple server instances within the sysplex without requiring changes to clients or networking hardware, and without delays in connection setup. With Sysplex Distributor, you can implement a dynamic VIPA as a single network-visible IP address for a set of hosts that belong to the same sysplex cluster. A client on the IP network sees the sysplex cluster as one IP address, regardless of the number of hosts in the cluster.

Using internal workload management in a z/OS environment, when a TCP connection request is received for a given distributed DVIPA address, the decision as to which instance of the application serves that particular request is made by the Sysplex Distributor running in the TCP/IP stack that is configured to be the routing stack. The Sysplex Distributor has real-time capacity information available (from the sysplex workload manager) and can use QoS information from the Service Policy Agent. Consequently, internal workload management requires no special external hardware. However, all inbound messages to the distributed DVIPA must first transit the routing stack before they are forwarded to the appropriate application instance.
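
A distributed DVIPA is created by adding a VIPADISTRIBUTE statement on the routing stack; the address, port, and the choice of DISTMETHOD SERVERWLM (WLM server-specific recommendations) in this sketch are assumptions.

    VIPADYNAMIC
      VIPADEFINE MOVEABLE IMMEDIATE 255.255.255.0 10.1.2.1
      VIPADISTRIBUTE DEFINE DISTMETHOD SERVERWLM 10.1.2.1
                     PORT 3000 DESTIP ALL
    ENDVIPADYNAMIC

DESTIP ALL makes every stack in the sysplex a candidate target; specific dynamic XCF addresses can be listed instead.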

Port sharing

Port sharing is a method to distribute workload for IP applications within a z/OS LPAR. TCP/IP allows multiple listeners to listen on the same combination of port and interface. Workload that is destined for this application can be distributed among the group of servers that listen on the same port. Port sharing does not rely on an active Sysplex Distributor implementation; it works without Sysplex Distributor. However, you can use port sharing in addition to Sysplex Distributor operation.

z/OS supports two modes of port sharing:
  • SHAREPORT
  • SHAREPORTWLM

When SHAREPORT is specified on the PORT statement, incoming client connections for this port and interface are distributed by the TCP/IP stack across the listeners by using a weighted round-robin distribution method based on the Server accept Efficiency Fractions (SEFs) of the listeners that share the port. The SEF is a measure of the efficiency of the server application in accepting new connection requests and managing its backlog queue; it is calculated at intervals of approximately one minute.

SHAREPORTWLM can be used instead of SHAREPORT. Similar to SHAREPORT, SHAREPORTWLM causes incoming connections to be distributed among a set of TCP listeners. However, unlike SHAREPORT, the listener selection is based on WLM server-specific recommendations, which are modified by the SEF values for each listener. These recommendations are acquired at intervals of approximately one minute from WLM, and they reflect the listener’s capacity to handle additional work.
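
Both modes are selected on the PORT reservation statements in the TCP/IP profile; the port numbers and listener job names in this sketch are assumptions.

    PORT
    ; SHAREPORT: weighted round-robin based on SEF
      3000 TCP CICSAOR1 SHAREPORT
      3000 TCP CICSAOR2 SHAREPORT
    ; SHAREPORTWLM: WLM server-specific recommendations
      4000 TCP CICSAOR1 SHAREPORTWLM
      4000 TCP CICSAOR2 SHAREPORTWLM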

z/OS Communications Server generic resources

z/OS Communications Server generic resources provide a single system image of an application no matter where it runs in the sysplex. A user accesses the application by using the generic resource name of the application, and z/OS Communications Server determines the actual application instance based on performance and workload criteria. This allows application instances to be added, removed, and moved without impacting the user.
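
In CICS, for example, a region joins a generic resource group by specifying the generic resource name on the GRNAME system initialization parameter; the name shown here is an assumption. Each region keeps its own specific APPLID, and z/OS Communications Server chooses an instance when a user logs on with the generic name.

    GRNAME=CICSGRP1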

For more information about using z/OS Communications Server generic resources in CICS, see Configuring z/OS Communications Server generic resources.