Networking considerations for shared storage pools

Learn about the networking considerations and restrictions for shared storage pools (SSP).

Networking considerations

The networking considerations for shared storage pools (SSP) follow:

  • Uninterrupted network connectivity is required for SSP operations. The network interface that is used for the SSP configuration must be on a highly reliable, uncongested network.
  • Ensure that both the forward and the reverse lookup for the hostname that is used by the VIOS logical partition for clustering resolve to the same IP address.
  • In VIOS Version 2.2.2.0, or later, clusters support Internet Protocol version 6 (IPv6) addresses. Therefore, VIOS logical partitions in a cluster can have hostnames that resolve to an IPv6 address.
  • To set up clusters on an IPv6 network, IPv6 stateless auto-configuration is recommended. You can have a VIOS logical partition that is configured with either IPv6 static configuration or IPv6 stateless auto-configuration. A VIOS logical partition that has both IPv6 static configuration and IPv6 stateless auto-configuration is not supported in VIOS Version 2.2.2.0.
  • The hostnames of all VIOS logical partitions that belong to the same cluster must resolve to the same IP address family, which is either Internet Protocol version 4 (IPv4) or IPv6.
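
The lookup and address-family requirements above can be checked before a cluster is created. The following Python sketch is one possible way to do that and is not part of the VIOS tooling; the function names are hypothetical, and it assumes the checks run on a system whose resolver is configured like that of the VIOS logical partition.

```python
import ipaddress
import socket


def forward_lookup(hostname):
    """Return the first IP address the system resolver gives for hostname."""
    return socket.getaddrinfo(hostname, None)[0][4][0]


def reverse_lookup(ip):
    """Return the primary hostname the system resolver gives for ip."""
    return socket.gethostbyaddr(ip)[0]


def lookup_is_consistent(hostname, forward=forward_lookup, reverse=reverse_lookup):
    """True if hostname -> IP -> name -> IP ends at the same IP address."""
    ip = forward(hostname)
    name = reverse(ip)
    return ipaddress.ip_address(forward(name)) == ipaddress.ip_address(ip)


def cluster_address_family(node_ips):
    """Return 4 or 6 if all node addresses share one family; raise otherwise."""
    versions = {ipaddress.ip_address(ip).version for ip in node_ips}
    if len(versions) != 1:
        raise ValueError(f"expected one address family, found {sorted(versions)}")
    return versions.pop()
```

The resolver functions are passed as parameters so that the logic can be exercised with a fixed table of names and addresses instead of a live DNS server.
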
Restrictions:
  • In a cluster configuration, to change the hostname or the IP address of a VIOS logical partition, complete one of the following procedures depending on the number of VIOS logical partitions in the cluster:
    • If multiple VIOS logical partitions exist in the cluster, remove the VIOS logical partition from the cluster and change the hostname or the IP address. You can later add the VIOS logical partition to the cluster again with the new hostname or IP address.
    • If only one VIOS logical partition exists in the cluster, you must delete the cluster to change the hostname or the IP address. Before deleting the cluster, you must create a backup of the SSP configuration by using the viosbr command. You can restore the cluster after the hostname or the IP address is updated.
  • You must apply any hostname or IP address changes to the /etc/netsvc.conf file of the VIOS logical partition before creating the cluster. This file specifies the order of name resolution for networking routines and commands. If you later want to edit the /etc/netsvc.conf file, complete the following procedure on each VIOS logical partition:
    1. To stop cluster services on the VIOS logical partition, type the following command:
      clstartstop -stop -n clustername -m vios_hostname
    2. Make the required changes in the /etc/netsvc.conf file. Do not change the IP address that resolves to the hostname that is being used for the cluster.
    3. To restart cluster services on the VIOS logical partition, type the following command:
      clstartstop -start -n clustername -m vios_hostname
    Maintain the same order of name resolution for all the VIOS logical partitions that belong to the same cluster. You must not make any changes to the /etc/netsvc.conf file when you are migrating a cluster from IPv4 to IPv6.
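
The requirement to keep the same name resolution order on every VIOS logical partition can be verified by comparing the hosts entry of each node's /etc/netsvc.conf file. The following Python sketch illustrates the comparison. It assumes the usual "hosts = value, value" form of the entry and "#" comments, and it assumes that the file contents were already collected from each node; the function names are hypothetical.

```python
def hosts_order(netsvc_text):
    """Return the resolver order from a 'hosts = a, b' line, or None if absent."""
    for line in netsvc_text.splitlines():
        line = line.split("#", 1)[0].strip()      # ignore comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == "hosts":
            return [v.strip() for v in value.split(",") if v.strip()]
    return None


def mismatched_nodes(netsvc_by_node):
    """Return the nodes whose hosts order differs from the first node's order."""
    orders = {node: hosts_order(text) for node, text in netsvc_by_node.items()}
    reference = next(iter(orders.values()), None)
    return sorted(node for node, order in orders.items() if order != reference)
```

An empty result from mismatched_nodes means that all the collected files specify the same order.
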

Multiple Transmission Control Protocol/Internet Protocol (TCP/IP) network support

In VIOS versions earlier than Version 3.1.1.0, the shared storage pool (SSP) used only a single network interface or IP address for communication. A single network interface or IP address is a single point of failure: if it fails, the storage pool can be disrupted.

In VIOS Version 3.1.1.0, or later, the shared storage pool improves network resilience by supporting multiple TCP/IP network interfaces for LPAR client I/O specific communication. The SSP uses this communication only for pool file system metadata protocol exchanges. Some of the VIOS daemon communication is also enhanced to use multiple network interfaces.

Multiple network interfaces are used in an active/passive mode, without load balancing: only one network interface is active at a time, and all the other network interfaces are in standby mode. An active lease is maintained on all network interfaces for quick network interface switch-over. When the lease of the active network connection is at risk, the pool switches to another valid connection. The error log entries indicate the state of the network connection.
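
The active/passive behavior can be pictured with a small model. The following Python sketch is illustrative only and does not reflect the SSP implementation. The Interface class, its fields, and the numeric priority are assumptions; the model prefers the primary interface whenever its lease is valid.

```python
from dataclasses import dataclass


@dataclass
class Interface:
    name: str
    priority: int        # assumed convention: 0 is the primary, higher is standby
    lease_valid: bool


def pick_active(interfaces):
    """Return the most preferred interface with a valid lease, or None."""
    candidates = [i for i in interfaces if i.lease_valid]
    return min(candidates, key=lambda i: i.priority, default=None)
```

Because the selection is recomputed from the current lease states, the model fails over when the lease of the primary interface is lost and returns to the primary interface when its lease is valid again.
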

You can configure multiple TCP/IP network interfaces by using the -addips and -rmips options of the cluster command.

Best practices for using multiple TCP/IP networks:

  • To achieve true redundancy of multiple TCP/IP networks, you must avoid using a single network interface for multiple network connections and configure separate isolated subnets.
  • Network connection priority is supported for multiple network connections. In a multiple network interface environment, the primary network interface is used as much as possible. If the primary network interface fails, failover to the secondary network interface occurs. After the primary network is available again, the communication automatically returns to the primary network interface. If the network interfaces have different speeds, the network interface with the highest speed must be defined as the primary network interface. For example, if the speed of one network interface is 10 gigabit and the speed of another network interface is 1 gigabit, the 10 gigabit network interface must be defined as the primary network interface. The IP address of the primary network interface resolves to the hostname that is used with a cluster node.
  • Adding or removing the IP addresses when the node is online is not supported. You must stop the node to add or remove the network and then start the node again.
  • Ensure that all the IP addresses of the cluster nodes are stored in the /etc/hosts file on all the nodes to avoid hostname query failure when the TCP/IP network or DNS is down. Failure in the hostname query might cause a node to take the shared storage pool offline on that node.
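
The /etc/hosts recommendation can be checked by comparing the list of cluster node IP addresses with the entries in the file. The following Python sketch shows one way to do so. The function names are hypothetical, and addresses are compared as text, so an IPv6 address must be written the same way in both places.

```python
def hosts_file_ips(hosts_text):
    """Return the set of IP addresses that begin non-comment entries."""
    ips = set()
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2:           # an entry needs an address and a name
            ips.add(fields[0])
    return ips


def missing_from_hosts(hosts_text, node_ips):
    """Return the node IP addresses that have no entry in the hosts file text."""
    return sorted(set(node_ips) - hosts_file_ips(hosts_text))
```

Run the check against the /etc/hosts contents of every node, because each node resolves names from its own copy of the file.
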

Limitations of multiple TCP/IP networks:

  • Using the HMC to configure multiple IP addresses is not supported.
  • You must stop and start the node for adding or removing the IP addresses. If you change the primary IP address or the hostname, remove the node from the cluster and then add it after the changes are complete.
  • You can configure multiple network interfaces and create a backup of the network configuration by using the viosbr command. However, when you perform the complete cluster restore operation by using the backup file, the shared storage pool does not recognize any secondary interfaces. For the configured interfaces to be recognized, you must stop and start the node.
  • A virtual IP address (VIPA) cannot be used together with multiple network interfaces that are configured by using the cluster -addips command. These are mutually exclusive techniques for network redundancy. The cluster -addips command cannot recognize a virtual IP address because it uses IP addresses from the physical network interfaces.

Disk communication support

In VIOS Version 3.1.1.0, or later, you can configure disk communication for the shared storage pool LPAR client I/O specific communication. The shared storage pool keeps the disk connection active when all the TCP/IP networks are down. This allows you to manage a total network outage for a short period. Error log entries indicate when the node starts using disk communication and when network communication is resumed. When the TCP/IP network is back online, the shared storage pool automatically returns to communicating over the TCP/IP network.

A cluster is considered to be in a degraded mode when it is using disk communication:

  • The primary goal of disk communication is to ensure that application I/O on client logical partitions (LPARs) does not time out.
  • The VIOS CLI operations such as cluster -status might fail due to the network outage.
  • Communication-intensive shared storage pool operations such as PV remove might also fail.
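
The fallback to disk communication and the automatic return to TCP/IP can be summarized as a simple selection rule. The following Python sketch is a conceptual model only; the function name and the returned labels are hypothetical and do not correspond to any VIOS interface.

```python
def choose_transport(tcpip_networks_up, disk_configured):
    """Return the communication path that the model would use."""
    if tcpip_networks_up > 0:
        return "tcpip"      # TCP/IP is preferred whenever any network is up
    if disk_configured:
        return "disk"       # total network outage: the cluster is degraded
    return "none"           # no path is available


def is_degraded(transport):
    """The cluster is treated as degraded while it uses disk communication."""
    return transport == "disk"
```
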

The communication disk is managed by Cluster Aware AIX (CAA) and is separate from the repository disk. The size requirement for the disk is the same as for a repository disk. SSP supports only a single disk network for communication.

You can configure disk communication by using the -addcompvs and -rmcompvs options of the cluster command.

Best practices for disk communication:

  • Provide a high-speed disk for disk communication depending on the I/O workload and the number of virtual I/O servers in the cluster.
  • When an active TCP/IP network is not available, you cannot access the DNS. You must add the /etc/hosts entries for all the nodes to avoid a node getting expelled during the recovery operation and taking its pool offline.
  • Disk communication is suited for low I/O rate applications such as rootvg or middleware. Disk communication can scale up to the limit of the storage performance.
  • Reduce the application I/O operations during disk communication if the disk communication cannot handle the requests.
  • During disk communication, you might need a larger error logging space for the /var and the /home directories when the networks are down. You need to monitor the /var and the /home directory space.
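
Monitoring of the /var and /home directory space can be automated. The following Python sketch reports the directories whose free space falls below a threshold. The function name and the 20 percent default are assumptions chosen for illustration, not values from the product documentation.

```python
import shutil


def low_space_dirs(paths, min_free_fraction=0.2, usage=shutil.disk_usage):
    """Return the paths whose free-space fraction is below the threshold."""
    low = []
    for path in paths:
        u = usage(path)
        if u.total == 0:
            continue                      # skip file systems that report no size
        if u.free / u.total < min_free_fraction:
            low.append(path)
    return low
```

For example, low_space_dirs(["/var", "/home"]) returns the directories that need attention, and an empty list means that both are above the threshold.
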

Limitations of disk communication:

  • The database might not be accessible because it requires a TCP/IP network for connection.
  • Configuration operations might fail because the database is not accessible.
  • The cluster -status command might display that the shared storage pool is down because it does not use disk communication.
  • As with the repository disk, 4K sector size disks are not supported for disk communication.
  • Using the HMC to configure disk communication is not supported.



Last updated: Thu, October 15, 2020