Automating a High Available HUB TEMS
Added by obriend, last edited by obriend on May 29, 2009

Automating a High Available HUB TEMS on z/OS

This document describes how a HUB Tivoli Enterprise Monitoring Server (TEMS) address space running on z/OS can be made highly available within an IBM System z Parallel Sysplex using Tivoli System Automation for z/OS (SA z/OS). It is assumed that the configuration steps necessary to define a HUB monitoring server without any data collection functionality have already been performed as described in [ROY], which is also available on OPAL.

There can only ever be a single instance of a HUB TEMS in the enterprise. If the HUB TEMS address space fails, or in case of a planned or unplanned system outage, monitoring data can no longer be delivered to the end users. To achieve high availability, the focus must therefore be on restarting the HUB TEMS in place or moving it to another system in the sysplex as fast as possible.

The definitions shown below will provide the following automatic behavior:

  • If the HUB TEMS fails, SA z/OS attempts to restart it in place. If the HUB TEMS gets into a BROKEN status, SA z/OS moves the HUB TEMS to another LPAR. This is the case if the HUB TEMS terminates with a non-restartable ABEND code or if it fails multiple times within a certain period (the critical failure threshold is exceeded).
  • If TCP/IP fails or in case of a system failure, SA z/OS moves the HUB TEMS to another LPAR.
  • If system maintenance is required, operators can move the HUB TEMS manually from one system to another using the INGMOVE command. This includes the ability to move the HUB TEMS to a specific target system immediately or only when the current system is IPLed.
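
The automatic part of this behavior can be summarized as a small decision function. The following Python sketch only restates the rules listed above; it is not how SA z/OS is implemented, and the names Event, recovery_action, restartable_abend, and threshold_exceeded are hypothetical.

```python
from enum import Enum


class Event(Enum):
    HUB_FAILED = "HUB TEMS address space failed"
    TCPIP_FAILED = "TCP/IP failed"
    SYSTEM_FAILED = "system failure"


def recovery_action(event, restartable_abend=True, threshold_exceeded=False):
    """Return 'RESTART_IN_PLACE' or 'MOVE' for a failure event.

    Illustrative model of the rules in the text, not SA z/OS code.
    """
    # TCP/IP or system failure: the HUB TEMS is moved to another LPAR.
    if event in (Event.TCPIP_FAILED, Event.SYSTEM_FAILED):
        return "MOVE"
    # HUB TEMS failure: the resource becomes BROKEN if the abend code is
    # non-restartable or the critical failure threshold is exceeded.
    broken = (not restartable_abend) or threshold_exceeded
    return "MOVE" if broken else "RESTART_IN_PLACE"
```

A plain failure of the HUB TEMS therefore leads to a restart in place, while every other combination leads to a move.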

Relationships to other system components

Monitoring agents and remote TEMS address spaces are connected to the HUB TEMS through a fixed IP address, a so-called virtual IP address or VIPA. A VIPA stays the same even when the application using it is moved from one system to another. There are different ways to configure and activate a VIPA. The recommended approach for the HUB TEMS is to define a dynamic VIPA with a VIPARANGE statement in the TCP/IP configuration. The VIPA is activated automatically upon startup of the HUB TEMS address space. Upon termination, when the HUB TEMS closes the socket, the VIPA is automatically deactivated as well. With this configuration, the VIPA can be managed by the z/OS Communications Server product alone, without requiring any additional automation support.

To satisfy the dependency described above, two relationships from the HUB TEMS to the TCP/IP address space running on the same system are needed.

  1. HasParent relationship ensures that TCP/IP is indeed active before the HUB TEMS can be started.
  2. ForceDown/WhenObservedDownOrStopping relationship ensures that the HUB TEMS is automatically stopped if TCP/IP fails.
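
The effect of the two relationships can be pictured as two simple conditions on the status of TCP/IP. This is a minimal sketch for illustration only; the status labels AVAILABLE, STOPPING, and DOWN and the function names are assumptions and do not correspond to SA z/OS internals.

```python
def may_start_hub(tcpip_status):
    """HasParent: the HUB TEMS may be started only while TCP/IP is active."""
    return tcpip_status == "AVAILABLE"


def must_force_down_hub(tcpip_status):
    """ForceDown/WhenObservedDownOrStopping: the HUB TEMS is stopped as soon
    as TCP/IP is observed down or stopping."""
    return tcpip_status in ("DOWN", "STOPPING")
```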

System Automation for z/OS resources

The following sections guide you through the necessary definitions using the SA z/OS customization dialog. Along with this paper, a sample automation policy is provided that contains the definitions described below. You can use the policy import function in the customization dialog to import the whole sample policy (SA z/OS V3.2 only) or parts of it (SA z/OS V3.1 and earlier) into an existing automation policy. Either during policy import or afterwards, you may want to change the names of the resources in the sample policy to fit your naming conventions.

HUB TEMS application

The HUB TEMS is configured as an application of type MVS. In this document, the HUB TEMS is referred to by the name HUBTEMS as subsystem name and job name. You are free to choose your own names, however.

Application Information policy

Unless the procedure name of the HUB TEMS matches the job name, specify the procedure name in field JCL Procedure Name. SA z/OS will then use the following command to start the HUB TEMS address space: MVS S procname,JOBNAME=&SUBSJOB

Shutdown policy

To terminate the HUB TEMS address space, you need to define the commands for a normal shutdown.

  • On the first pass, specify MVS P &SUBSJOB.
  • On the 4th pass, if the HUB TEMS did not terminate even though a stop command was given already, cancel the address space with MVS C &SUBSJOB.

Note: Have a look at the SA z/OS *BASE sample policy. It includes an application class called C_APPL. This class contains stop commands for normal, immediate, and force mode that fit customer needs in most cases. If you import this class, you merely have to link HUBTEMS to C_APPL and you're done.

The shutdown passes are processed in intervals specified by the field Shutdown Pass Interval in the Application Information policy. The default interval is 1 minute.

Since the HUB TEMS should shut down faster than a remote TEMS running data collectors inside the address space, the passes chosen above will probably work in most cases. Alternatively, you can change the pass number of the CANCEL command from 4 to 3 or even 2, shorten the shutdown pass interval, or both.
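
To see what a given combination of pass numbers and pass interval means in elapsed time, the passes can be laid out on a timeline. The Python sketch below assumes that the first pass is issued at offset 0 and that each further pass follows one interval after the previous one, whether or not a command is defined for it; shutdown_schedule and its parameters are hypothetical names.

```python
def shutdown_schedule(commands_by_pass, pass_interval_seconds=60):
    """Return (offset_seconds, command) pairs for the defined shutdown passes.

    Assumption: pass 1 runs at offset 0, pass n runs (n - 1) intervals later.
    """
    schedule = []
    for pass_number in sorted(commands_by_pass):
        offset = (pass_number - 1) * pass_interval_seconds
        for command in commands_by_pass[pass_number]:
            schedule.append((offset, command))
    return schedule


# Normal shutdown as described above: STOP on pass 1, CANCEL on pass 4.
normal = {1: ["MVS P &SUBSJOB"], 4: ["MVS C &SUBSJOB"]}
for offset, command in shutdown_schedule(normal):
    print(f"+{offset}s {command}")
```

Under these assumptions and with the default interval of 1 minute, the CANCEL command is issued 180 seconds after the STOP command. Moving it to pass 2 shortens this to 60 seconds.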

Another option is to additionally specify STOP commands for immediate shutdown. In this case, for example, you could specify the following:

  • On the first pass, MVS P &SUBSJOB is issued twice in a row. Note that in this case you are likely to see abends during shutdown as the HUB TEMS doesn't wait for its tasks to properly terminate.
  • On the second pass, cancel the HUB TEMS using MVS C &SUBSJOB.

Messages and User Data policy

SA z/OS must be told when the HUB TEMS is up and running and also when it has stopped normally or abnormally. The message that signals the up status is KO4SRV032. The message that signals normal termination is IEF404I and abnormal termination is signaled by message IEF450I. These messages are already defined in the default NetView automation table delivered with SA z/OS and no further action is required on your side.

For the KO4SRV032 message to be issued to the console, ensure that the option KGL_WTO=YES is specified in RKANPARU-member KDSENV within the RTE configured through ICAT for the HUB TEMS.

You only have to specify a message ID and the associated commands in the Messages and User Data policy if you are interested in additional messages and want to react to them.
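
The three status messages named above can be thought of as a simple lookup from message ID to status. The following Python sketch illustrates that mapping only; it does not reproduce the NetView automation table. The status labels and the function classify are hypothetical, and because IEF404I and IEF450I are issued for any job, real automation also has to check the job name.

```python
# Message IDs are taken from the text; the status labels are illustrative.
STATUS_BY_MESSAGE = {
    "KO4SRV032": "UP",
    "IEF404I": "ENDED_NORMALLY",
    "IEF450I": "ENDED_ABNORMALLY",
}


def classify(console_line):
    """Return the status signalled by a console line, or None if the line
    carries none of the three message IDs as a separate token."""
    tokens = console_line.split()
    for message_id, status in STATUS_BY_MESSAGE.items():
        if message_id in tokens:
            return status
    return None
```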

Relationships policy

As discussed previously, the dependency on TCP/IP is modeled in the form of two relationship rules that have to be specified in the Relationships policy. In this document, TCP/IP is referred to by the subsystem and job name TCPIP. You need to adapt this name accordingly if TCP/IP is named differently in your installation.

The HasParent relationship to the supporting resource TCPIP/APL/= is needed to model the proper startup and shutdown sequence.

The ForceDown/WhenObservedDownOrStopping relationship to the supporting resource TCPIP/APL/= is needed to immediately stop the HUB TEMS in case TCP/IP becomes unavailable.

Note: If you are running SA z/OS V3.1, in order to implement a serial move, don't forget to also add a MakeAvailable/WhenObservedDown relationship to MOVXHUB (see section Sysplex high availability MOVE group below).

Sysplex high availability MOVE group

The aforementioned application HUBTEMS must be added to a sysplex group of nature MOVE. In this document, this group is referred to as MOVXHUB.

Application Group Information policy

The MOVE group is created with active behavior and an automation name of MOVXHUB. For simplicity, it is recommended to use the same identifier for the group's entry name and the group's automation name. Leave the default preference at *DEF, which gives each member in the MOVE group a preference value of 700.

If you are already on SA z/OS V3.2, specify a Move Mode of SERIAL.

Note: If you are running SA z/OS V3.1, in order to implement a serial move, add a MakeAvailable/WhenObservedDown relationship to MOVXHUB in the Relationships policy of the HUBTEMS application (see section Relationships policy above).

Applications policy

As group members, select the HUBTEMS application in the Applications policy. Having the preferences set as described above (700), there is no preferred member in the group. The HUB TEMS can be selected on any system solely based on system and application availability.

Resources policy

It is assumed that all systems involved in the sysplex MOVE group MOVXHUB have access to the HUB TEMS RTE on shared DASD. If this requirement is not met for any of the systems in the MOVE group, set the preference value for those systems to 1 or remove the preference completely. This prevents SA z/OS from considering such a member as a candidate for selection.
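
The effect of these preference settings on candidate selection can be sketched as a filter. The Python example below models only what is stated above: members with a preference of 1 or with no preference are never candidates, and the remaining members are eligible when their system is available. The full SA z/OS preference semantics are richer than this, and eligible_systems is a hypothetical name.

```python
def eligible_systems(preferences, available_systems):
    """Return the systems on which the HUB TEMS may be selected.

    preferences maps a system name to its preference value, or to None if
    the preference was removed.
    """
    return sorted(
        system
        for system, preference in preferences.items()
        if system in available_systems
        and preference is not None
        and preference > 1
    )


# SYS3 has no access to the shared RTE, so its preference is set to 1.
preferences = {"SYS1": 700, "SYS2": 700, "SYS3": 1}
print(eligible_systems(preferences, {"SYS1", "SYS2", "SYS3"}))
```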

Where Used policy

Finally, the group MOVXHUB must be linked to the sysplex group(s) where it should be used.

Customization summary

Once the SA z/OS resources have been defined and a new automation configuration has been built, it can be activated in the target sysplex. The following figure depicts the resources and their relationships for a 3-way sysplex with systems SYS1, SYS2, and SYS3:


Figure 1

The necessary relationships for application HUBTEMS in Figure 1 above are summarized in the following table:

| Relationship | Meaning |
|---|---|
| FD/WDoS | ForceDown/WhenObservedDownOrStopping, i.e. HUBTEMS is stopped when TCPIP fails or terminates. |
| HP | HasParent, i.e. HUBTEMS can be started only after TCPIP is up and running. |
| MA/WOD (passive) | SA z/OS V3.1 only: MakeAvailable/WhenObservedDown (passive), i.e. when the sysplex MOVE group MOVXHUB is observed unavailable because none of its members is available, let the group select a new candidate member based on the highest preference value. This 'trick' results in a serial move operation. |

Operating the high-availability HUB TEMS

Sometimes, it is necessary to manually move the HUB TEMS from one system to another, for example, to IPL a system after applying software service. This section describes how to use SA z/OS for the most typical scenarios.

SA z/OS commands can be issued from any NetView Command Facility (NCCF) 3270 console or from any system console. If the command is not issued from a system in the local sysplex, the TARGET-parameter must be specified to denote the target system or sysplex. When the system console is used, commands are passed to NetView via a MODIFY netvproc command, where netvproc is the name of your NetView procedure. Alternatively, a subsystem prefix character (default '%') can be defined to route the command directly to NetView via the NetView subsystem interface.

Scenario 1 - move immediately

The example assumes that the HUB TEMS is currently running on system SYS1. Operations wants to move it immediately to SYS2. To accomplish this, the INGMOVE command can be used as follows:

INGMOVE MOVXHUB/APG TO=SYS2 FDBK=(MSG,B,1) OUTMODE=LINE

With the feedback parameter specified above, you are informed about both success and failure of this operation. If the move completes successfully within 1 minute, a message like the following is issued:

DSI039I MSG FROM AUTMON   : ING300I COMMAND "INGMOVE MOVXHUB/APG TO=SYS2 FDBK=(MSG,B,1) OUTMODE=LINE" COMPLETED SUCCESSFULLY
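
If the feedback message is processed by a script, the completed command can be extracted with a regular expression. This Python sketch relies only on the success message shown above; the layout of a failure message is not given in this document, so anything else is reported as None. The function name completed_command is hypothetical.

```python
import re

# Pattern derived from the sample success message in this document.
ING300I = re.compile(r'ING300I COMMAND "(?P<command>[^"]*)" COMPLETED SUCCESSFULLY')


def completed_command(message):
    """Return the command text from an ING300I success message, else None."""
    match = ING300I.search(message)
    return match.group("command") if match else None
```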

Scenario 2 - move at next IPL

The example assumes that the HUB TEMS is currently running on system SYS2. Operations decides to move it back to SYS1 once SYS2 is IPLed or when the HUB TEMS is recycled. This could be accomplished with the INGGROUP command by setting the adjusted preference value for SYS1 to 900 and the adjusted preference value for SYS2 to 700. As long as the HUB TEMS runs on SYS2, however, the effective preference value of SYS2 is 950 because of the bonus points attributed to the active member. When SYS2 is shut down or when the HUB TEMS is recycled, the effective preference of SYS2 drops to 725 (only the sticky bonus remains), which is lower than 900. Therefore, SA z/OS moves the HUB TEMS back to SYS1, which has the highest preference value at this time.
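
The arithmetic of this scenario can be checked with a few lines of Python. The bonus values are inferred from the figures in the paragraph above (725 - 700 = 25 for the sticky bonus, 950 - 725 = 225 for the additional bonus of a running member). The names are hypothetical, and the sketch is not a full model of how SA z/OS computes preferences.

```python
STICKY_BONUS = 25   # remains after the member has stopped (725 - 700)
ACTIVE_BONUS = 225  # additional bonus while the member is running (950 - 725)


def effective_preference(adjusted, active=False, sticky=False):
    """Return the effective preference of a MOVE group member."""
    bonus = 0
    if active or sticky:
        bonus += STICKY_BONUS
    if active:
        bonus += ACTIVE_BONUS
    return adjusted + bonus


sys1 = effective_preference(900)                       # not running
sys2_running = effective_preference(700, active=True)  # HUB TEMS active on SYS2
sys2_stopped = effective_preference(700, sticky=True)  # after IPL or recycle
print(sys1, sys2_running, sys2_stopped)
```

While the HUB TEMS runs on SYS2, its effective preference is above that of SYS1, so nothing moves. After the IPL or recycle it falls below that of SYS1, and SYS1 is selected.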

As you can see, working with preference values directly is intended primarily for expert use. For daily operations, however, the INGMOVE dialog provides a more comfortable and safer means. Invoke INGMOVE as follows:

INGMOVE MOVXHUB/APG

This gets you to a panel shown in the following figure:


Figure 2

In the Cmd field, enter 'P' to tell SA z/OS to prepare a move upon the next IPL or recycle. Then specify the target system, here SYS1, in the rightmost input field. When you press Enter, you are asked for confirmation. When system SYS2 is then IPLed or the HUB TEMS is recycled, the HUB TEMS is moved to SYS1.

Scenario 3 - shutdown HUB TEMS

The example assumes that the HUB TEMS is currently running on system SYS1. Operations decides to shut down the HUB TEMS on SYS1 and to prevent SA z/OS from moving it to another system. To accomplish this, a stop request is issued against the MOVE group as follows:

INGREQ MOVXHUB/APG REQ=STOP

Later on, if operations decides to start the HUB TEMS again, the stop request must be cancelled using:

INGREQ MOVXHUB/APG REQ=CANCEL

SA z/OS then selects any startable member in the MOVE group to restart the HUB TEMS.

Using the sample policies

The definitions described above have been provided in a sample policy that you can include in your own automation policy. For detailed policy import instructions, please refer to the official System Automation for z/OS publication listed in the References section at the end of this document.

To upload the sample policy to the host, proceed as follows:

  • Clients based on SA z/OS V3.1 need to look at sample file hahub_policy_v310
  • Clients based on SA z/OS V3.2 need to look at sample file hahub_policy_v320

Use any convenient method to upload the file to your SA z/OS host. The FTP method is shown here. Enter the commands enclosed in asterisks and replace the variables enclosed in underscores accordingly:

D:\download-drive>*ftp* _hostname_
Connected to _hostname_.
220-FTP _time_ on _date_.
220 Connection will close if idle for more than 60 minutes.
User (_hostname_:(none)): _userid_
331 Send password please.
Password:
230 _userid_ is logged on.  Working directory is _user-prefix_.
ftp> *type image*
200 Representation type is Image
ftp> *quote site blk=0 file=seq lr=80 pri=4 sec=0 rec=fb tr u=sysda*
200 SITE command was accepted
ftp> *put hahub_policy_v3*_xx_*.fb80* _tso-seq-dataset_
200 Port request OK.
125 Storing data set _tso-seq-dataset_
250 Transfer completed successfully.
ftp: 92560 bytes sent in 0,02Seconds 5785,00Kbytes/sec.
ftp> *bye*
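
If you prefer to script the upload, the same FTP steps can be performed with the ftplib module of the Python standard library. The following is a minimal sketch, not part of the sample policy package. The function name upload_policy is hypothetical, and the host name, user ID, password, file name, and data set name in the commented usage are placeholders, just like the variables in the transcript.

```python
def upload_policy(ftp, fileobj, dataset):
    """Upload fileobj in binary mode to the sequential data set `dataset`.

    `ftp` is expected to be a logged-in ftplib.FTP instance (or any object
    offering the same sendcmd and storbinary methods).
    """
    # Equivalent of the "quote site ..." command in the transcript.
    ftp.sendcmd("SITE blk=0 file=seq lr=80 pri=4 sec=0 rec=fb tr u=sysda")
    # storbinary switches to image (binary) type before sending the data,
    # which corresponds to "type image" followed by "put".
    ftp.storbinary(f"STOR {dataset}", fileobj)


# Possible usage (placeholders must be replaced before running):
#
# from ftplib import FTP
# with FTP("hostname") as ftp:
#     ftp.login("userid", "password")
#     with open("hahub_policy_v3xx.fb80", "rb") as policy_file:
#         upload_policy(ftp, policy_file, "tso-seq-dataset")
```

Passing the connection in as a parameter keeps the function independent of how the login is handled.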

After uploading the binary file to TSO, you need to receive it. In ISPF 6, issue the following command:

RECEIVE INDS(_tso-seq-dataset_)

You'll see the following messages and either accept the dataset name or specify your own:

INMR901I Dataset BHOL.HUBTEMS.V3xx.PDB from BHOL on BOEKEYA  
INMR154I The incoming data set is a 'DATA LIBRARY'.     
INMR906A Enter restore parameters or 'DELETE' or 'END'

At this point, you are ready for policy import. If you are using SA z/OS V3.2, skip the next section and continue directly with section SA z/OS V3.2 import steps.

SA z/OS V3.1 import steps

Execute the following steps to import the sample policy that you have uploaded to the host and received as illustrated above.

  1. On the SA z/OS customization dialog primary menu, select option 4 Policies.

  2. Enter the command ADD HAHUB and press Enter.

  3. Specify the name of the dataset containing the sample automation policy. Leave the dialog with PF3.

  4. Select your target automation policy.

  5. Go back to the SA z/OS customization dialog primary menu and select option 5 Data Management.

  6. Select option 1 Import from PDB.

  7. Fill out the fields below as follows:



  8. On that panel, select option 1 Import Policy Data and press Enter.

  9. Select all groups as shown below and press Enter twice:



  10. On panel Selected Entry Names for Import, all entries that exist already in your target policy are marked as duplicate (column labeled D). You don't have to be concerned about these as they won't be imported. However, links to such entries will be imported. Therefore it is not recommended to remove them.

    1. BASE_APL is a passive group without an automation name. The sample TCPIP application is linked to that group. If you already have a TCPIP application, it is already linked to a group and you can remove this entry by typing 'M' into the Action column.

    2. TCPIP represents the TCPIP application. If TCPIP is not marked as duplicate, this application is stored under a different entry name in the target policy. In this case, overtype the name to match the target policy's entry name.

    3. MOVXHUB represents the sysplex MOVE group for the HUB TEMS. You can overtype the name if you want to store it under a different entry name.

    4. HUBTEMS represents the HUB TEMS application. You can overtype the name if you want to store it under a different entry name.

    5. C_APPL is a generic application class containing common shutdown commands. The HUBTEMS application uses C_APPL to inherit these commands. If C_APPL is not marked as duplicate, it is safe to simply import it. Otherwise, make sure that your C_APPL does not define common settings that shouldn't be inherited by HUBTEMS. When in doubt, remove that entry by typing 'M' into the Action column.

      The following snippet shows you the final selection:


  11. Press Enter to start with the import.

  12. You can now continue with the validation described in Complete the policy import.

SA z/OS V3.2 import steps

Execute the following steps to import the sample policy that you have uploaded to the host and received as illustrated above.

  1. On the SA z/OS customization dialog primary menu, select option 4 Policies.

  2. Enter the command ADD HAHUB and press Enter.

  3. Specify the name of the dataset containing the sample automation policy. Leave the dialog with PF3.

  4. Select your target automation policy.

  5. Go back to the SA z/OS customization dialog primary menu and select option 5 Data Management.

  6. Select option 1 Import from PDB.

  7. Fill out the fields below as follows:



  8. On that panel, select option 1 Import Policy Data and press Enter.

  9. Select the sysplex group as shown below and press Enter twice:


  10. On panel Selected Entry Names for Import, all entries that exist already in your target policy are marked as duplicate (column labeled D). You don't have to be concerned about these as they won't be imported. However, links to such entries will be imported. Therefore it is not recommended to remove them.

    1. SYSPLEX represents a sysplex GRP entry. If SYSPLEX is not marked as duplicate, the group is stored under a different entry name in the target policy. In this case, overtype the name to match the target policy's entry name.

    2. SYS1, SYS2, and SYS3 represent system entries. Overtype these names with the actual system names in your sysplex. If your sysplex consists of fewer than three systems, remove surplus entries by typing 'M' into the Action column.

    3. BASE_APL is a passive group without an automation name. The sample TCPIP application is linked to that group. If you already have a TCPIP application, it is already linked to a group and you can remove this entry by typing 'M' into the Action column.

    4. TCPIP represents the TCPIP application. If TCPIP is not marked as duplicate, this application is stored under a different entry name in the target policy. In this case, overtype the name to match the target policy's entry name.

    5. MOVXHUB represents the sysplex MOVE group for the HUB TEMS. You can overtype the name if you want to store it under a different entry name.

    6. HUBTEMS represents the HUB TEMS application. You can overtype the name if you want to store it under a different entry name.

    7. C_APPL is a generic application class containing common shutdown commands. The HUBTEMS application uses C_APPL to inherit these commands. If C_APPL is not marked as duplicate, it is safe to simply import it. Otherwise, make sure that your C_APPL does not define common settings that shouldn't be inherited by HUBTEMS. When in doubt, remove that entry by typing 'M' into the Action column.

      The following snippet shows you the final selection:


  11. Press Enter to start with the import.

  12. You can now continue with the validation described in Complete the policy import.

Complete the policy import

Before you build your updated policy including the HUB TEMS support, you need to perform a few validation and customization steps to finally adapt the definitions to your environment:

  1. On the SA z/OS customization dialog primary menu, select option 1 Open to open the updated policy.

  2. Select 2 GRP, select the sysplex entry for which the change was made, and open the Application Group policy:

    1. For SA z/OS V3.2, the MOVXHUB entry should be already selected to that sysplex.

    2. For SA z/OS V3.1, select your MOVXHUB entry to add it to that sysplex.

  3. Select 5 APG and open your MOVXHUB entry:

    Change the automation name, if necessary.

  4. Select 6 APL and open your HUBTEMS entry:

    1. Change the subsystem name and job name if necessary.

    2. If the procedure to start the HUB TEMS address space doesn't match the job name, specify the correct HUB TEMS procedure name.

    3. If you have not imported the class C_APPL, specify shutdown commands as described in section Shutdown policy.

    4. If you have renamed the automation name of your MOVXHUB or if the subsystem name used for your TCP/IP is not TCPIP, edit the relationships and use the corresponding names instead.

At this point, the import is completed and you can build the new automation policy.

References:
[ROY] R. Roy, "IBM Tivoli Monitoring, High-Availability HUB TEMS on z/OS", Tivoli Open Process Automation Library OPAL (2008)
http://www-01.ibm.com/software/brandcatalog/portal/opal/details?catalog.label=1TW10TM61

[IBM1] System Automation for z/OS Version 3.2 - Defining Automation Policy (SC33-8262), Tivoli Product Library (2007)

[IBM2] System Automation for z/OS Version 3.2 - Operator's Commands (SC33-8265), Tivoli Product Library (2007)

Posted by holtz at Dec 04, 2008 02:59