Using the IBM Tivoli Storage Manager Backup-Archive client in cluster environments on UNIX and Linux

Jason Basler (jbasler@us.ibm.com), Senior software engineer, IBM

Jason Basler is a Senior Software Engineer in the IBM Tivoli Software organization. He has spent the past twelve years testing various components and releases of the Tivoli Storage Manager product family.



Neeta Garimella (neeta@us.ibm.com), Staff Software Engineer, IBM

Neeta Garimella is a Senior Engineer and has been part of the TSM development organization for over six years. She has been a key contributor to the TSM Backup-Archive client and TSM for Data Protection products including operations within highly available environments. She was the lead developer for Tivoli Workload Scheduler. Before joining IBM, she worked at BEA Systems as a professional services consultant where she helped customers build and deploy solutions using BEA products.



Kevin Hoyt (hoytk@us.ibm.com), Staff software engineer, IBM

Kevin Hoyt is a Staff Software Engineer in the IBM Tivoli Software organization. He has been involved with Tivoli Storage Manager for over 8 years and is the lead developer for the Mac OS X client.



Giang T Pham (giangp@us.ibm.com), Staff software engineer, IBM

Giang Pham is a Staff Software Engineer in the IBM Tivoli Software organization. He currently works on the System Test team for the Tivoli Storage Manager product.



Jim Smith (smithjp@us.ibm.com), Backup-Archive client architect, IBM

Jim Smith has been involved with Tivoli Storage Manager for over ten years. He has been a part of the support and development teams and is currently serving as the architect for the Tivoli Storage Manager Backup-Archive clients. He has been involved in many Tivoli Storage Manager projects involving clustering solutions such as Microsoft Windows Cluster Services, High Availability Cluster Multi-Processing, Novell Cluster Services and VERITAS Cluster Services.



18 July 2007

About this tutorial

Introduction

Software products that provide a high availability of resources are becoming commonplace in today's computing environments. As the complexity of managing logical disks and their data increases in cluster environments, understanding and deploying an effective backup strategy for complete data protection is a top priority.

Our goal is to introduce a methodology for backing up and restoring UNIX and Linux data that resides in a cluster environment. Basic concepts of clustering and how Tivoli Storage Manager can be deployed in a cluster environment are introduced, and practical examples are given on how to configure the Tivoli Storage Manager Backup-Archive client and backup schedules. The migration procedure for Tivoli Storage Manager Backup-Archive client configurations that currently use the clusternode option in AIX HACMP environments is also described. The procedures outlined here depend on features introduced in Tivoli Storage Manager version 5.4.0.

Prerequisites

The term cluster can mean many things to many different vendors and users. It can mean "highly available," "high performance," "load balancing," "grid computing," or some combination thereof. There are currently several clustering products available for UNIX and Linux. Our intent is not to provide a backup solution for any specific clustering product, but to define those aspects of a clustering environment that need to exist in order for this backup methodology to work correctly.

The concept of cluster, in this paper, refers to a UNIX or a Linux environment which exhibits the following characteristics:

  • Disks are shared between physical machines, either in an exclusive fashion (only one host has access to the logical disk at any one time) or in a concurrent fashion.
  • Disks appear as local disks to the host and not as network resources. We suggest that the file systems be mounted locally to the system, not through a LAN-based file share protocol such as network file system (NFS).
  • Mount points of local disks are identical on each physical host in the environment (if file system /group1_disk1 fails over from NodeA to NodeB, it is mounted on NodeB as /group1_disk1).

If the cluster environment exhibits all of the above characteristics, the methodologies we outline are supported by version 5.4 and higher of Tivoli Storage Manager Backup-Archive client. If the cluster environment only exhibits a subset (or none) of the characteristics, the methodologies outlined are not applicable and are not supported by Tivoli Storage Manager.

It is assumed that the reader has a basic understanding of how the Tivoli Storage Manager product functions, and that they have a basic understanding of how their cluster software functions. This paper does not address any cluster software related activities such as the development of application start and stop scripts.


Overview of cluster environments

Cluster environments can be set up in many different configurations. This section describes the most popular cluster configurations.

Active/active: Pool cluster resources

In an active/active configuration, each node is actively managing at least one resource and is configured as a backup for one or more resources in the cluster. Active/active is the most common form of a cluster environment.

Figure 1. Active/active cluster environment: Normal operation

The configuration in Figure 1 is operating normally; each node is operational and managing only one resource.

Figure 2. Active/active cluster environment: NodeB fails over to NodeA

In the configuration shown in Figure 2, Node B has experienced a fault. Resource 2 has failed over to Node A.

Active/passive: Fault tolerant

In an active/passive configuration, one node actively manages the resource. The other node is only used if the primary node experiences a fault and the resource needs to failover. An active/passive cluster is a sub-type of an active/active cluster. For the examples described in this paper, the setup requirements are similar.

Figure 3. Active/passive cluster environment: Normal operation

The configuration in Figure 3 is operating normally. NodeA is managing Resource 1. If NodeA experiences a fault, Resource 1 will failover to NodeB.

Figure 4. Active/passive cluster environment: NodeA failover to NodeB

In the configuration shown in Figure 4, NodeA has experienced a fault. Resource 1 has failed over to NodeB.

Concurrent access

In a concurrent configuration, more than one node manages a resource. When a fault occurs, the resource continues to be managed by the other nodes. This type of cluster is not common and is beyond the scope of this paper.

Figure 5. Concurrent access cluster environment: Normal operation

The configuration shown in Figure 5 is operating normally. Every node is managing every resource. If any node experiences a fault, the other nodes will continue to manage the resources.

Figure 6. Concurrent access cluster environment: NodeB failover to NodeA

In the configuration shown in Figure 6, NodeB has experienced a fault. The remaining nodes continue to manage Resources 1, 2, and 3.

The following sections describe how to configure Tivoli Storage Manager in an active/active or an active/passive cluster environment. Tivoli Storage Manager can be used in a concurrent access cluster, but that configuration is not described in the upcoming sections.


Tivoli Storage Manager backup concepts relevant to cluster environments

How does Tivoli Storage Manager fit into the concept of a cluster environment?

One of the biggest challenges when providing backup and recovery in a cluster environment is determining where backup operations should be performed in the context of the cluster's resources. Clusters are designed to offer high-availability to applications and users. A user of a mail or database application does not need to know which host physically owns a disk resource, but instead relies upon a virtualized connection to the cluster's resources. For example, any movement of resources within a cluster between hosts due to failover or load balancing is hidden from the end-user. A backup product can be placed within or external to the cluster's resource groups.

If the backup product is placed outside of the context of the cluster and its resource groups (for example, on a host that is not part of the cluster), it can map or mount the file systems and perform the backups relative to the mapped or mounted drives. The advantage to this model has already been demonstrated: As the disk resources move between the physical cluster hosts, the backup product retains the mapping of the file system. The big disadvantage of this method is that backup of data over network protocols such as Network File System or Common Internet File System is slower than backup of local file systems. This becomes a factor as data volumes grow and backup windows shrink. Another disadvantage is that you have no protection if the host you have chosen for backups fails!

The Tivoli Storage Manager Backup-Archive client is designed to manage the backup of cluster drives by placing the backup-archive client within the context of the cluster's resource groups. This gives the advantage of backing up data from local resources (as opposed to accessing the data across the network) to maximize the performance of the backup operation and to manage the backup data relative to the resource group. Therefore, the backup-archive client can always back up data on cluster resources, as if the data were local data, and maximize backup performance. This ensures that critical data is getting backed up across system failures.

Cluster aware vs. highly available

The Tivoli Storage Manager Backup-Archive Client is not cluster aware, in that it does not use any API to determine the context under which it is running or provide any explicit mechanisms for failover or high-availability. The Tivoli Storage Manager Backup-Archive client does offer fault-tolerant scheduling capabilities which can be exploited to provide protection in cluster environments; this makes the Tivoli Storage Manager backup scheduling service highly available. For example, a schedule can be defined that performs an incremental backup of the /group1_disk1 file system. If the /group1_disk1 file system fails to another host in the cluster while the backup is being performed, the scheduled incremental operation is restarted from the other host, if the configuration steps outlined below are followed correctly. The progressive incremental backup methodology employed by Tivoli Storage Manager determines if files are already backed-up, and effectively continues processing the incremental backup from the point where the failover occurred.


Configuring the Tivoli Storage Manager Backup-Archive client in a cluster environment

Figure 7. Active/Active cluster environment: Normal operation

The sample configuration in Figure 7 shows an active/active cluster environment that has three physical hosts in the cluster named NodeA, NodeB, and NodeC. The nodes have the following qualities:

  • NodeA owns the cluster resource with file systems /A1 and /A2
  • NodeB owns the cluster resources with file systems /B1 and /B2
  • NodeC owns the cluster resources with file systems /C1 and /C2

Note: NodeA can also have two non-clustered volumes, /fs1 and /fs2, that must be backed up (not shown in the figure).

For best backup performance, you might want all nodes in the cluster to perform the backups of the shared file systems that they own. When a node failover occurs, the backup tasks of the failed node shift to the node to which the failover occurred. For example, when NodeA fails over to NodeB, the backup of /A1 and /A2 moves to NodeB.

The example below presumes that the Tivoli Storage Manager Backup-Archive client is installed and configured on all nodes in the cluster environment. Follow the steps below to configure the Tivoli Storage Manager Backup-Archive client to back up cluster and non-cluster volumes:

Setup prerequisites for a clustered environment

The following setup requirements must be observed when using the backup-archive client in a clustered environment:

  • A separate backup-archive client scheduler process must be run for each resource group being protected. Under normal conditions, each node runs two scheduler processes: one for the cluster resources it owns, and one for its local file systems. After a failure, additional scheduler processes are started on a node in order to protect the resources that have moved over from another node.
  • The backup-archive client password files must be stored on cluster disks so that after a failure, the generated backup-archive client password is available to the takeover node.
  • The file systems to be protected as part of a resource group are defined using the backup-archive client domain option. The domain option is specified in the dsm.sys file, which should also be stored on a cluster disk so that it can be accessed by the takeover node.

If Tivoli Storage Manager Web client access is desired during a failover condition, the Tivoli Storage Manager Web client acceptor daemon (CAD) must also be configured for each cluster resource. In this case, we suggest that you use the Tivoli Storage Manager CAD to manage the scheduler process to simplify administration and configuration. See the "Enabling Tivoli Storage Manager Web client access in a cluster environment" section of this document for more information.
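
To make the later steps concrete, the following sketch shows one possible layout of the Tivoli Storage Manager files for the resource group owned by NodeA on its shared /A1 file system. The paths match those used in the configuration steps that follow; the layout itself is only a suggestion, not a requirement:

/A1/tsm/
    dsm.opt           client user options file for the /A1 and /A2 resource group (Step 3)
    pwd/              directory referenced by the passworddir option (Step 2)
    dsmsched.log      schedule log referenced by the schedlogname option (Step 2)
    errorlog.log      error log referenced by the errorlogname option (Step 2)
    startsched        script that starts the client acceptor daemon for this resource group (Step 5)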

Perform the following steps to configure the Tivoli Storage Manager backup-archive client in a cluster environment:

Step 1: Register backup-archive client node definitions on the Tivoli Storage Manager Server

All nodes in the cluster must be defined on the Tivoli Storage Manager Server. If you are defining multiple cluster resources in a cluster environment to fail over independently, then unique node names must be defined per resource group.

For the three-way, active-active cluster configuration sample above, define three nodes (one per resource), as shown below:

tsm: CLINTON>register node nodeA nodeApw domain=standard
tsm: CLINTON>register node nodeB nodeBpw domain=standard
tsm: CLINTON>register node nodeC nodeCpw domain=standard

Step 2: Configure the backup-archive client system options file

Each node in the cluster must have, in its respective dsm.sys file, a separate server stanza for each cluster resource group to be backed up. You must ensure that the server stanzas are identical in the system options files on each node. Alternatively, you can place the dsm.sys file on a shared cluster location. The server stanzas defined to back up clustered volumes must have the following special characteristics:

  • The nodename option must refer to the client node name registered on the Tivoli Storage Manager Server. If the client node name is not defined, the nodename will default to the hostname of the node, which might conflict with other nodenames used for the same client system. We suggest that you use the nodename option to explicitly define the client node.
  • The tcpclientaddress option must refer to the service IP address of the cluster node.
  • The passworddir option must refer to a directory on the shared volumes that are part of the cluster resource group.
  • The errorlogname and schedlogname options must refer to files on the shared volumes that are part of the cluster resource group to maintain a single continuous log file.
  • All include-exclude statements must refer to files on the shared volumes that are part of the cluster resource group.
  • If you use the inclexcl option, it must refer to a file path on the shared volumes that are part of the cluster group.
  • The stanza names identified with the servername option must be identical on all systems.

Other backup-archive client options can be set as desired.

In the example, all three nodes, NodeA, NodeB, and NodeC, must have the following three server stanzas in their dsm.sys file:

Servername        clinton_nodeA
nodename          NodeA
commmethod        tcpip
tcpport           1500
tcpserveraddress  clinton.ibm.com
tcpclientaddress  nodeA.ibm.com
passwordaccess    generate
passworddir       /A1/tsm/pwd
managedservices   schedule
schedlogname      /A1/tsm/dsmsched.log
errorlogname      /A1/tsm/errorlog.log

Servername        clinton_nodeB
nodename          NodeB
commmethod        tcpip
tcpport           1500
tcpserveraddress  clinton.ibm.com
tcpclientaddress  nodeB.ibm.com
passwordaccess    generate
passworddir       /B1/tsm/pwd
managedservices   schedule
schedlogname      /B1/tsm/dsmsched.log
errorlogname      /B1/tsm/errorlog.log

Servername        clinton_nodeC
nodename          NodeC
commmethod        tcpip
tcpport           1500
tcpserveraddress  clinton.ibm.com
tcpclientaddress  nodeC.ibm.com
passwordaccess    generate
passworddir       /C1/tsm/pwd
managedservices   schedule
schedlogname      /C1/tsm/dsmsched.log
errorlogname      /C1/tsm/errorlog.log

Step 3: Configure the backup-archive client user options file

The backup-archive client user options file (dsm.opt) must reside on the shared volumes in the cluster resource group. Define the DSM_CONFIG environment variable to refer to this file. Ensure that the dsm.opt file contains the following settings:

  • The value of the servername option must be the server stanza in the dsm.sys file which defines parameters for backing up clustered volumes.

  • Define the clustered file systems to be backed up with the domain option.

    Note: Ensure that you define the domain option in the dsm.opt file or specify the option in the schedule or on the Tivoli Storage Manager command-line client. This is to restrict clustered operations to cluster resources and non-clustered operations to non-clustered resources.

In the example, nodes NodeA, NodeB, and NodeC set up their corresponding dsm.opt file and DSM_CONFIG environment variable as follows:

NodeA

  1. Set up the /A1/tsm/dsm.opt file:

    servername clinton_nodeA
    domain     /A1 /A2
  2. Issue the following command or include it in your user's profile:

    export DSM_CONFIG=/A1/tsm/dsm.opt

NodeB

  1. Set up the /B1/tsm/dsm.opt file:

    servername clinton_nodeB
    domain     /B1 /B2
  2. Issue the following command or include it in your user's profile:

    export DSM_CONFIG=/B1/tsm/dsm.opt

NodeC

  1. Set up the /C1/tsm/dsm.opt file:

    servername clinton_nodeC
    domain     /C1 /C2
  2. Issue the following command or include it in your user's profile:

    export DSM_CONFIG=/C1/tsm/dsm.opt

Step 4: Set up the schedule definitions for each cluster resource group

After the basic setup is complete, define the automated schedules to back up cluster resources to meet the backup requirements. The procedure described below illustrates the schedule setup by using the built-in Tivoli Storage Manager scheduler. If you are using a third-party scheduler, refer to the documentation provided by the vendor of that scheduler.

  • Define a schedule in the policy domain where cluster nodes are defined. Ensure that the schedule's startup window is large enough to restart the schedule on the failover node in case of a failure and fallback event. This means that the schedule's duration must be set to longer than the time it takes to complete the backup of the cluster data for that node, under normal conditions.

    If the reconnection occurs within the start window for that event, the scheduled command is restarted. This scheduled incremental backup reexamines files sent to the server before the failover. The backup will then "catch up" to where it terminated before the failover situation.

    In our example, the clus_backup schedule is defined in the standard domain to start the backup at 12:30 a.m. every day with the duration set to two hours (which is the normal backup time for each node's data). See below:

    tsm: CLINTON>define schedule standard clus_backup action=incr starttime=00:30 
    startdate=TODAY  Duration=2
  • Associate the schedule with all the backup-archive client nodes defined to back up cluster resources.

    tsm: CLINTON>define association standard clus_backup nodeA
    tsm: CLINTON>define association standard clus_backup nodeB
    tsm: CLINTON>define association standard clus_backup nodeC

Step 5: Set up the scheduler service for backup

On each client node, a scheduler service must be configured for each resource that the node is responsible for backing up, under normal conditions.

The DSM_CONFIG environment variable for each resource scheduler service must be set to refer to the corresponding dsm.opt file for that resource. For the sample configuration, the following three shell scripts must be created to allow dsmcad processes to be started, as needed, from any node in the cluster.

NodeA: /A1/tsm/startsched

#!/bin/ksh
export DSM_CONFIG=/A1/tsm/dsm.opt
dsmcad

NodeB: /B1/tsm/startsched

#!/bin/ksh
export DSM_CONFIG=/B1/tsm/dsm.opt
dsmcad

NodeC: /C1/tsm/startsched

#!/bin/ksh
export DSM_CONFIG=/C1/tsm/dsm.opt
dsmcad
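
Before handing these scripts over to the cluster software, you can verify them manually. The following commands, shown for NodeA, are only an illustrative check:

# Make the script executable and start it once by hand
chmod +x /A1/tsm/startsched
/A1/tsm/startsched

# Confirm that a client acceptor daemon process is running
ps -ef | grep dsmcad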

For more information, refer to the IBM Tivoli Storage Manager for UNIX and Linux Backup-Archive Clients Installation and User's Guide, Chapter 7, "Automating tasks" (see the Resources section for a link).

Step 6: Define the Tivoli Storage Manager Backup-Archive client to the cluster application

To continue the backup of a failed resource after a failover, the Tivoli Storage Manager scheduler service for each cluster client node must be defined as a resource to the cluster application so that it participates in failover processing. Otherwise, the backup of the failed resource cannot continue from the node that takes over the resource, and the backup remains incomplete.

The sample scripts in Step 5 can be associated with the cluster resources to ensure that they are started on nodes in the cluster while the disk resources being protected move from one node to another.

The actual steps required to set up the scheduler service as a cluster resource are specific to the cluster software. Refer to your cluster application documentation for additional information.

Step 7: Validate the setup

To validate the setup, perform the following test scenarios:

  • Ensure each node's password is generated and cached correctly in the location specified using the passworddir option. This can be validated by performing the following two steps.

    1. Validate that each node can connect to the Tivoli Storage Manager Server without the password prompt. You can do this by running the backup-archive command-line client and issuing the following command on each node:

      #dsmc query session

      If you are prompted for your password, enter it so that the command completes, and then re-run the command. The second time, the command should run without prompting for the password. If you are still prompted, check your configuration starting at Step 1.

    2. Validate that the other nodes in the cluster can start sessions to the Tivoli Storage Manager Server for the failed-over node. This can be done by running the same commands, as described in the step above, on the backup nodes.

      For example, to validate that NodeB and NodeC can start a session as NodeA in a failover event without being prompted for the password, perform the following steps.

      On NodeB and NodeC:

      #export DSM_CONFIG=/A1/tsm/dsm.opt
      #dsmc query session

      You should not be prompted for the password. If you are prompted, the password was not stored correctly in the shared location. Check the passworddir option setting used for NodeA and follow the configuration steps again.

  • Ensure that the schedules are run correctly by each node. You can trigger a schedule by setting the schedule's starttime to now. Remember to reset the starttime after testing is complete. (A sample verification query follows this list.)

    tsm: CLINTON>update sched standard clus_backup starttime=now
  • Perform a failover and fallback between nodeA and nodeB while nodeA is in the middle of a backup and the schedule's start window is still open. Verify that the incremental backup continues to run and finishes successfully after the failover and fallback.

  • Issue the command below to force nodeA's password to expire. Ensure that backups continue normally under normal cluster operations, as well as during failover and fallback.

    tsm: CLINTON>update node nodeA forcep=yes
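
One way to confirm that the scheduled backups completed on each node is to query the schedule events from a Tivoli Storage Manager administrative command-line session; this is only a suggested check, using the domain and schedule names from the example above:

tsm: CLINTON>query event standard clus_backup

You can also inspect the schedule log on the shared disk (for example, /A1/tsm/dsmsched.log for NodeA) to see which physical host actually ran the schedule.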

Step 8: Configure the backup-archive client to back up local resources

  • Define client nodes on the Tivoli Storage Manager Server. Local resources should never be backed up or archived using node names defined to back up cluster data. If local volumes that are not defined as cluster resources will be backed up, separate node names (and separate client instances) must be used for both non-clustered and clustered volumes.

    In the example, assume that only NodeA has local filesystems /fs1 and /fs2 to be backed up. In order to manage the local resources, register a node, NodeA_local, on the Tivoli Storage Manager server.

    tsm: CLINTON>register node nodeA_local nodeA_localpw domain=standard
  • On each node that must back up local resources, add a separate stanza with the following special characteristics to the node's system options file (dsm.sys):

    • The value of the tcpclientaddress option must be the local host name or IP address. This is the IP address used for primary traffic to and from the node.

    • If the client will back up and restore non-clustered volumes without being connected to the cluster, the value of the tcpclientaddress option must be the boot IP address. This is the IP address used to start the machine (node) before it rejoins the cluster.

      Example stanza for NodeA_local:

      Servername        clinton_nodeA_local
      nodename          nodeA_local
      commmethod        tcpip
      tcpport           1500
      tcpserveraddress  clinton.ibm.com
      tcpclientaddress  nodeA_host.ibm.com
      passwordaccess    generate
      managedservices   schedule
  • Define the user options file dsm.opt in a path that is on a non-clustered resource.

    • The value of the servername option must be the server stanza in the dsm.sys file which defines parameters for backing up non-clustered volumes.
    • Use the domain option to define the non-clustered file systems to be backed up.

    Note: Ensure that you define the domain option in the dsm.opt file or specify the option in the schedule or on the Tivoli Storage Manager command-line client, in order to restrict the backup-archive operations to non-clustered volumes.

    In the example, NodeA uses the following /home/admin/A1.dsm.opt file and sets the DSM_CONFIG environment variable to refer to /home/admin/A1.dsm.opt.

    Contents of /home/admin/A1.dsm.opt:

    servername clinton_nodeA_local
    domain     /fs1 /fs2
    
    
    Issue the following command or include it in your user's profile:

    export DSM_CONFIG=/home/admin/A1.dsm.opt
  • Define and set up a schedule to perform the incremental backup for non-clustered file systems as described in Step 4.

    tsm: CLINTON>define schedule standard local_backup action=incr starttime=00:30 
    startdate=TODAY  Duration=2

    Associate the schedule with all of the backup-archive client nodes that are defined to back up non-clustered resources.

    tsm: CLINTON>define association standard local_backup nodeA_local

Step 9: Restore cluster file system data

All volumes in a cluster resource are backed up under the target node defined for that cluster resource. If you need to restore the data that resides on a cluster volume, it can be restored from the client node that owns the cluster resource at the time of the restore. The backup-archive client must use the same user options file (dsm.opt) that was used during the backup to restore the data. There are no additional setup requirements necessary to restore data on cluster volumes. Refer to the IBM Tivoli Storage Manager for UNIX and Linux Backup-Archive Clients Installation and User's Guide Chapter 5: "Restoring your data" (see Resources for a link).
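
For example, to restore data that was backed up from NodeA's resource group, run the restore from whichever node currently owns the /A1 resource, using the same options file; the file specification below is purely hypothetical:

export DSM_CONFIG=/A1/tsm/dsm.opt
dsmc restore "/A1/somedir/*" -subdir=yes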

Step 10: Restore local file system data

The non-clustered volumes are backed up under the separate node name set up for non-clustered operations. In order to restore this data, the Tivoli Storage Manager Backup-Archive client must use the same user options file (dsm.opt) that was used during the backup.

In the example, set environment variable DSM_CONFIG to refer to /home/admin/A1.dsm.opt prior to performing a TSM client restore for the local node, nodeA_local.
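
A minimal sketch of that sequence follows; the file specification is hypothetical:

export DSM_CONFIG=/home/admin/A1.dsm.opt
dsmc restore "/fs1/somedir/*" -subdir=yes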

Refer to the IBM Tivoli Storage Manager for UNIX and Linux Backup-Archive Clients Installation and User's Guide Chapter 5: "Restoring your data" (see Resources for a link).


Enabling the Tivoli Storage Manager Storage Agent to perform LAN-free data movement

The version 5.4 enhancements to the Tivoli Storage Manager Backup-Archive client that remove the programming dependencies between the Backup-Archive client software and cluster software, such as AIX HACMP, do not prevent the Backup-Archive client from moving data over the storage area network (SAN). They also do not affect the failover of the Tivoli Storage Manager Storage Agent or Tivoli Storage Manager Server in a cluster environment.

While this paper does not offer guidance on setting up the Storage Agent and Server for LAN-free data movement, some general concepts for deploying them in a cluster environment should be considered if you intend to move data over the SAN.

The simplest way to deploy Storage Agents in a cluster environment is to install a single Storage Agent on each physical host in the cluster. If a Backup-Archive client instance fails over from one physical host (hostA) to another physical host (hostB), it is serviced by the Storage Agent on hostB. Remember the following concepts when deploying this type of configuration:

  • The Storage Agents should have unique names on each host, for example staA on hostA and staB on hostB.
  • The Storage Agents should have the same network address and communication protocol, for example, each Storage Agent can be configured to use the TCP/IP loopback address (127.0.0.1) and TCP/IP port 1500.

To enable LAN-free data movement in the example used by the previous section, add the following options to each stanza in the dsm.sys files described in the previous sections. For example, to modify the dsm.sys stanza for NodeA, add the following LAN-free options:

Servername        clinton_nodeA
nodename          NodeA
commmethod        tcpip
...
enablelanfree           yes
lanfreecommmethod       tcpip
lanfreetcpport          1500
lanfreetcpserveraddress 127.0.0.1

Other considerations must be taken into account to correctly manage how tape mounts are used during a failover of a Backup-Archive client instance in a clustered environment. For example, suppose an instance of the Backup-Archive client communicating with Storage Agent staA on hostA has one or more tapes mounted to satisfy a backup request, and the client then fails over to hostB and starts communicating with Storage Agent staB. How the tapes that were mounted on behalf of Storage Agent staA are handled depends on the configuration of the Tivoli Storage Manager Server. You must consider the following:

  • The Tivoli Storage Manager Server for Windows® and AIX® has support that correctly identifies that Storage Agent staA is no longer communicating with the server and, after a short period of time, releases the relevant drives. In order to enable this support, you must ensure that the shared=yes and resetdrives=yes parameters are set for the library on the Tivoli Storage Manager Server.

  • Although the Tivoli Storage Manager Server on other platforms does not have this Library Manager support, those servers can still be used as the target of LAN-free data movement in a clustered environment. In these cases, using the example above, the drives are orphaned and no longer available for use until a Tivoli Storage Manager administrator manually releases them. This may be satisfactory if there are sufficient drives, as it would allow the backup to finish normally.

  • Note that the Tivoli Storage Manager Backup-Archive client must have sufficient mount points to handle failover cases, which might orphan drives for some amount of time. This includes not only having physical mount points available but also ensuring that enough mount points are configured for the client node. For example, the Tivoli Storage Manager Server parameter for the maximum number of mount points, maxnummp, must be set to accommodate this scenario. (Sample commands follow this list.)
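
As an illustration of where these parameters are set, the following administrative commands show a library defined with drive-reset support and a node updated to allow two mount points; the library name sanlib1 and the value of 2 are assumptions for this example only:

tsm: CLINTON>define library sanlib1 libtype=scsi shared=yes resetdrives=yes
tsm: CLINTON>update node nodeA maxnummp=2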

For more detailed information and examples on configuration of the Tivoli Storage Manager Storage Agents and Servers in a cluster environment refer to the IBM Redbooks® title "IBM Tivoli Storage Manager in a Clustered Environment" (see Resources for a link).


Enabling Tivoli Storage Manager Web client access in a cluster environment

If Tivoli Storage Manager Web client access is desired during a failover condition, you must configure the Tivoli Storage Manager Web client acceptor daemon (CAD) associated with the cluster to fail over along with the cluster resource.

After you complete the configuration steps described in the "Configuring the Tivoli Storage Manager Backup-Archive Client in a cluster environment" section of this document, perform the additional steps described below to complete the Web client access setup.

Step 1: Set up the CAD to manage the Web client and scheduler

Tivoli Storage Manager CAD should be set up to manage schedulers as well as Web client access. This reduces the number of daemons that need to be configured as cluster applications and thus simplifies the configuration and administration. When a failover occurs, the Tivoli Storage Manager CAD starts on the node that is managing the takeover.

Update the managedservices option in the system options file dsm.sys on each node for each server stanza, as shown below for NodeA:

Servername        clinton_nodeA
nodename          NodeA
commmethod        tcpip
tcpport           1500
tcpserveraddress  clinton.sanjose.ibm.com
tcpclientaddress  nodeA.sanjose.ibm.com
passwordaccess    generate
passworddir       /A1/tsm/pwd
schedlogname      /A1/tsm/dsmsched.log
errorlogname      /A1/tsm/errorlog.log
managedservices   webclient schedule

Step 2: Set up the CAD to use a known HTTP port

By default, the CAD uses HTTP port 1581, when available, for Web client access. If this port is not available, the CAD finds the first available port, starting with 1581. In a failover condition in an active/active cluster configuration, the failover host is likely already running multiple instances of the CAD. If default settings are used for the HTTP port, the failover node assigns any available port to the CAD being failed over, because the default port is likely already in use by the failover host's existing CAD processes. This causes problems for the Web client associated with the CAD that failed over, as the new HTTP port is not known to the Web client users.

You can use the httpport option to specify a fixed Web client port for each resource. This allows you to always use the same port when connecting from a Web browser, independent of the node serving the cluster resource.

Add the httpport option in the system options file (dsm.sys) on each node for each server stanza as follows, making sure that each stanza uses a unique value:

Servername        clinton_nodeA
nodename          NodeA
commmethod        tcpip
tcpport           1500
tcpserveraddress  clinton.sanjose.ibm.com
tcpclientaddress  nodeA.sanjose.ibm.com
passwordaccess    generate
passworddir       /A1/tsm/pwd
managedservices   webclient schedule
schedlogname      /A1/tsm/dsmsched.log
errorlogname      /A1/tsm/errorlog.log
httpport          1510


Servername        clinton_nodeB
nodename          NodeB
commmethod        tcpip
tcpport           1500
tcpserveraddress  clinton.sanjose.ibm.com
tcpclientaddress  nodeB.sanjose.ibm.com
passwordaccess    generate
passworddir       /B1/tsm/pwd
managedservices   webclient schedule
schedlogname      /B1/tsm/dsmsched.log
errorlogname      /B1/tsm/errorlog.log
httpport          1511


Servername        clinton_nodeC
nodename          NodeC
commmethod        tcpip
tcpport           1500
tcpserveraddress  clinton.sanjose.ibm.com
tcpclientaddress  nodeC.sanjose.ibm.com
passwordaccess    generate
passworddir       /C1/tsm/pwd
managedservices   webclient schedule
schedlogname      /C1/tsm/dsmsched.log
errorlogname      /C1/tsm/errorlog.log
httpport          1512
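
With these settings, the Web client for a given resource group is always reachable on the same port, regardless of which physical host currently owns the resource. For example, the Web client for NodeA's resource group can be reached at a URL such as http://nodeA.sanjose.ibm.com:1510, using the service address and httpport value from the stanza above.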

Migrating legacy AIX HACMP setups

The Tivoli Storage Manager Backup-Archive client integrates with AIX HACMP using the backup-archive client clusternode option. When the clusternode option is set to yes, it allows the backup-archive client to obtain the cluster name using the HACMP API. The cluster name is used internally by the backup-archive client to provide failover management capabilities. The cluster name is also used as the default client node name if the nodename option is not specified explicitly in the system options file.

The clusternode option is no longer required in version 5.4 of the Tivoli Storage Manager Backup-Archive client to provide failover management and correct encryption of the stored password. This option will be phased out in a future release in favor of the generalized approach outlined in this paper. This generalized approach removes the need for certification of the HACMP version by Tivoli Storage Manager and removes compatibility requirements between HACMP and Tivoli Storage Manager, reducing the risk of unexpected or unwanted behavior between the two products.

If you are currently using the Tivoli Storage Manager Backup-Archive client in an HACMP environment with the clusternode option, we suggest that you update your current configuration to the one described in this paper by following the procedure outlined in the following section.

Step 1: Update the backup-archive client system options file

As with the clusternode option, each node in the cluster must continue to have, in its respective dsm.sys file, a separate server stanza for each cluster resource group to be backed up.

The existing dsm.sys file for NodeA might appear as follows:

Servername        clinton_nodeA
commmethod        tcpip
tcpport           1500
tcpserveraddress  clinton.sanjose.ibm.com
tcpclientaddress  nodeA.sanjose.ibm.com
passwordaccess    generate
passworddir       /A1
clusternode       yes
managedservices   schedule
schedlogname      /A1/dsmsched.log
errorlogname      /A1/errorlog.log

Notice that no nodename option is used in this sample.

Make the following changes to the existing dsm.sys file for NodeA.

  • Remove the clusternode option.
  • Specify a nodename option if you do not have one already specified. In order to avoid backing up the entire node's data again, use the existing cluster node name registered on the TSM server as the value for the nodename option.

The new dsm.sys file for NodeA should appear as follows:

Servername        clinton_nodeA
commmethod        tcpip
nodename          myclus
tcpport           1500
tcpserveraddress  clinton.sanjose.ibm.com
tcpclientaddress  nodeA.sanjose.ibm.com
passwordaccess    generate
passworddir       /A1
managedservices   schedule
schedlogname      /A1/dsmsched.log
errorlogname      /A1/errorlog.log

Note that myclus is the existing cluster name.

Step 2: Register backup-archive client nodes on the Tivoli Storage Manager Server

If new backup-archive client nodes are added in the first step to replace the current default value of the cluster node name, register those nodes on the Tivoli Storage Manager Server.
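
For example, if a new node name was introduced in Step 1 (here the hypothetical nodeA_clus), register it just as in the earlier configuration section:

tsm: CLINTON>register node nodeA_clus nodeA_cluspw domain=standard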

Step 3: Update schedule definitions

If new backup-archive client nodes are added in Step 2, ensure that the backup schedule definitions used earlier to back up this node's data are now associated with the new client node names.
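
Continuing the hypothetical node name from Step 2, and assuming the existing schedule is named clus_backup in the standard domain (substitute your own schedule name), the association could be moved as follows:

tsm: CLINTON>define association standard clus_backup nodeA_clus
tsm: CLINTON>delete association standard clus_backup myclus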

Step 4: Validate the setup

To validate the setup, follow Step 7 described in the "Configuring the Tivoli Storage Manager Backup-Archive client in a cluster environment" section.


Conclusion

Enhancements in version 5.4 of the Tivoli Storage Manager Backup-Archive client have removed the programming dependencies between the backup-archive client software and cluster software, such as AIX HACMP. These enhancements enable the backup-archive client to be deployed into cluster environments on UNIX and Linux, independent of the cluster software. If you are currently using the Tivoli Storage Manager Backup-Archive client in an AIX HACMP cluster environment, update your configuration as described in this document.

Resources
