mmnetverify command

Verifies network configuration and operation in a cluster.

Synopsis

mmnetverify [Operation[ Operation...]] [-N {Node[,Node...] | all}]
            [--target-nodes {Node[,Node...] | all}]
            [--configuration-file File] [--log-file File]
            [--verbose | -Y] [--min-bandwidth Number]
            [--max-threads Number] [--ces-override] [--ping-packet-size Number]
            [--subnets Addr[,Addr...]] [--cluster Name Node[,Node...]]
or
mmnetverify --remote-cluster-daemon [--remote-cluster-port PortNumber]

Availability

Available on all IBM Spectrum Scale editions.

Description

Note: The mmnetverify command is a diagnostic tool and is not intended to be issued continually during normal IBM Spectrum Scale cluster operations. Issue this command only to validate a cluster network before going into production or when you suspect network problems.

With the mmnetverify command, you can verify the network configuration and operation of a group of nodes before you organize them into an IBM Spectrum Scale cluster. You can also run the command to analyze network problems after you create a cluster.

If you have not created an IBM Spectrum Scale cluster yet, you must run the command with a configuration file. See the --configuration-file option in the Parameters section.

The command uses the concepts of local nodes and target nodes. A local node is a node from which a network test is run. The command can be started from one node and run from several separate local nodes. A target node is a node against which a test is run.

The command has the following requirements:
  • IBM Spectrum Scale must be installed on all the nodes that are involved in the test, including both local nodes and target nodes.
  • Each node must be able to issue remote shell commands to all nodes, including itself, without a password. (This requirement is tested in the shell check.)

The following table lists the types of output messages and where they are sent. Messages are not added to a log file unless you specify the log-file name on the command line:

Table 1. Information and error messages
Message type                  Printed to console                                               Added to the log file, if one is specified
Information messages          Yes (stdout)                                                     Yes
Verbose information messages  Yes (stdout), if --verbose is specified on the command line      Yes
Error messages                Yes (stderr)                                                     Yes

If sudo wrappers are enabled, the command uses a sudo wrapper when it communicates with remote nodes. Note the following restrictions:
  • If you are working with an existing cluster, the cluster must be at IBM Spectrum Scale v4.2.3 or later.
  • You must run the command on an administration node.
  • The -N option is not supported.
  • You must run the sudo command as the gpfsadmin user.

Parameters

Operation[ Operation...]
Specifies one or more operations, which are separated by blanks, that are to be verified against the target nodes. The operations are described in Table 3. Shortcut terms are described in Table 2.

If you do not specify any operations, the command does all the operations except data-large, flood-node, and flood-cluster against the target nodes.

-N {Node[,Node...] | all}
Specifies a list of nodes on which to run the command. If you specify more than one node, the command is run on all the specified nodes in parallel. Each node tests all the specified target nodes. If you do not include this parameter, the command runs the operations only from the node where you enter the command.
Note: The --max-threads parameter specifies how many nodes can run in parallel at the same time. If the limit is exceeded, the command still tests all the specified nodes, starting the surplus nodes as other nodes finish.
Node[,Node...]
Specifies a list of nodes in the local cluster. This parameter accepts node classes. You can specify system node classes, such as aixnodes, and node classes that you define with the mmcrnodeclass command.
all
Specifies all the nodes in the local cluster.

--target-nodes {Node[,Node...] | all}
Specifies a list of nodes that are the targets of the testing. If you do not specify this parameter, the command runs the operations against all the nodes in the cluster.
Node[,Node...]
Specifies a list of nodes in the local cluster. This parameter accepts node classes. You can specify system node classes, such as aixnodes, and node classes that you define with the mmcrnodeclass command.
all
Specifies all the nodes in the local cluster.

[--configuration-file File]
Specifies the path of a configuration file. You must use a configuration file if you have not created an IBM Spectrum Scale cluster yet. You can also specify a configuration file if you have created a cluster but you do not want the command to run with the IBM Spectrum Scale cluster configuration values. Only the node parameter is required. The other parameters revert to their default values if they are not specified.
Note: When you specify this parameter, the --ces-override parameter is automatically set.
The following code block shows the format of the file:
node Node [AdminName]
rshPath Path
rcpPath Path
tscTcpPort Port
mmsdrservPort Port
tscCmdPortRange Min-Max
subnets Addr[,Addr...] 
cluster Name Node[,Node...]
where:
node Node [AdminName]
Specifies a node name, followed optionally by the node's admin name. If you do not specify an admin name, the command uses the node name as the admin name.

You can have multiple node parameters. Add a node parameter for each node that you want to be included in the testing, either as a local node or as a target node. You must include the node from which you are running the command.

rshPath Path
Optional. Specifies the path of the remote shell command to be used. The default value is /usr/bin/ssh. Specify this parameter only if you want to use a different remote shell command.
rcpPath Path
Optional. Specifies the path of the remote file copy command to be used. The default value is /usr/bin/scp. Specify this parameter only if you want to use a different remote copy command.
tscTcpPort Port
Optional. Specifies the TCP port number to be used by the local GPFS daemon when it contacts a remote cluster. The default value is 1191. Specify this value only if you want to use a different port.
mmsdrservPort Port
Optional. Specifies the TCP port number to be used by the mmsdrserv service to provide access to configuration data to the rest of the nodes in the cluster. The default value is the value that is stored in mmfsdPort.
tscCmdPortRange Min-Max
Optional. Specifies the range of port numbers to be used for extra TCP/IP ports that some administration commands need for their processing. Defining a port range makes it easier for you to set firewall rules that allow incoming traffic on only those ports. For more information, see IBM Spectrum Scale port usage.

If you used the spectrumscale installation toolkit to install a version of IBM Spectrum Scale that is earlier than version 5.0.0, then this attribute is initialized to 60000-61000. Otherwise, this attribute is initially undefined and the port numbers are dynamically assigned from the range of ephemeral ports that are provided by the operating system.

subnets Addr[,Addr...]
Optional. Specifies a list of subnet addresses to be searched for a subnet that both the local node and the target node are connected to. If such a subnet is found, the command runs the specified network check between the connections of the local node and target node to that subnet. Otherwise, the command runs the network check across another connection that the local node and the target node have in common. The command goes through this process for each local node and target node that are specified for the command to process. The default value of this parameter is no subnets.

You must specify subnet addresses in dot-decimal format, such as 10.168.0.0. You cannot specify a cluster name as part of a subnet address. For more information about specifying a list of subnets, see the description of the parameter --subnets later in this topic.

cluster Name Node[,Node...]
Optional. Specifies the name of a remote cluster followed by at least one contact node identifier. The remote-cluster operation includes this cluster in its connectivity checks. You can specify this parameter multiple times in a configuration file. For more information, see the description of the --cluster parameter later in this topic.
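
As an illustration of the file format, the following shell sketch writes a small configuration file and then runs the connectivity checks with it. The node names, admin name, subnet address, and file path are hypothetical placeholders. Only the node entries are required, one of them must name the node from which you run the command, and the optional entries are shown here with their documented default values.

```shell
#!/bin/sh
# Sketch only: node names, admin names, and the file path are hypothetical.
conf_file="${TMPDIR:-/tmp}/mmnetverify.conf"

# One "node" entry per node that takes part in the test; the second field
# (the admin name) is optional.
cat > "$conf_file" <<'EOF'
node nodeA.example.com nodeA-admin.example.com
node nodeB.example.com
rshPath /usr/bin/ssh
rcpPath /usr/bin/scp
tscTcpPort 1191
subnets 10.168.0.0
EOF

# Run the checks only where the command is installed and on the PATH.
if command -v mmnetverify >/dev/null 2>&1; then
    mmnetverify connectivity --configuration-file "$conf_file"
fi
```

Because --ces-override is set automatically with --configuration-file, the sketch does not pass it explicitly.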

[--log-file File]
Specifies the path of a file to contain the output messages from the network checks. If you do not specify this parameter, messages are displayed only on the console. See Table 1.

--verbose
Causes the command to generate verbose output messages. See Table 1.

-Y
Displays the command output in a parseable format with a colon (:) as a field delimiter. Each column is described by a header.
Note: Fields that have a colon (:) are encoded to prevent confusion. For the set of characters that might be encoded, see the command documentation of mmclidecode. Use the mmclidecode command to decode the field.
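
The colon-delimited format lends itself to simple scripting. The following sketch pairs each data field with its column name, assuming that the header is the first line of output. The sample rows and the column names (check, target, result) are hypothetical and do not reproduce the real mmnetverify columns; the function names are also illustrative.

```shell
#!/bin/sh
# Sketch only: the sample lines are hypothetical stand-ins for the output
# of "mmnetverify ... -Y"; the real column names are not shown here.
sample_output() {
    printf '%s\n' \
        'check:target:result' \
        'ping:nodeB:pass' \
        'shell:nodeB:fail'
}

# Print every data row as name=value pairs, taking names from the header row.
parse_y() {
    awk -F: '
        NR == 1 { for (i = 1; i <= NF; i++) name[i] = $i; next }
        {
            line = ""
            for (i = 1; i <= NF; i++)
                line = line (i > 1 ? " " : "") name[i] "=" $i
            print line
        }'
}

parsed=$(sample_output | parse_y)
printf '%s\n' "$parsed"
```

Fields that contain an encoded colon would still need to be passed through mmclidecode after they are split.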
[--min-bandwidth Number]
Specifies the minimum acceptable bandwidth for the data bandwidth check.

--max-threads Number
Specifies how many nodes can run the command in parallel at the same time. The valid range is 1 - 64. The default value is 32. For more information, see the -N parameter.

--ces-override
Causes the command to consider all the nodes in the configuration to be CES-enabled. This parameter overrides the requirement of the protocol-ctdb and the protocol-object network checks that the local node and the target nodes must be CES-enabled with the mmchnode command. This parameter is automatically set when you specify the --configuration-file parameter.

[--ping-packet-size Number]
Specifies the size in bytes of the ICMP echo request packets that are sent between the local node and the target node during the ping test. The size must not be greater than the MTU of the network interface.

If the MTU size of the network interface changes, for example to support jumbo frames, you can specify this parameter to verify that all the nodes in the cluster can handle the new MTU size.
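
Before you pass a size to --ping-packet-size, it can help to compare it with the interface MTU. The following sketch applies only the rule stated above (the size must not be greater than the MTU). It assumes a Linux node that exposes the MTU under /sys/class/net; the interface name eth0, the fallback value, and the function name are hypothetical.

```shell
#!/bin/sh
# Return 0 if the requested ping packet size does not exceed the MTU.
size_fits_mtu() {
    size=$1
    mtu=$2
    [ "$size" -le "$mtu" ]
}

# Read the MTU of an interface on Linux; "eth0" is a hypothetical name.
iface="eth0"
mtu_file="/sys/class/net/$iface/mtu"
if [ -r "$mtu_file" ]; then
    mtu=$(cat "$mtu_file")
else
    mtu=1500   # assumed fallback for this sketch only
fi

# 9000 is a typical jumbo-frame MTU, used here as an example size.
if size_fits_mtu 9000 "$mtu"; then
    echo "size 9000 fits MTU $mtu"
else
    echo "size 9000 exceeds MTU $mtu"
fi
```

If the size fits, a command such as mmnetverify ping --ping-packet-size 9000 would then exercise the new MTU across the target nodes.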

[--subnets Addr[,Addr...]]
Specifies a list of subnets that the command searches in sequential order for a subnet that both the local node and the target node are connected to. If such a subnet is found, the command runs the network check between the connections of the local node and target node to that subnet. (The command runs the network check only for the first such subnet that it finds in the list.) If such a subnet is not found, the command runs the network check across another connection that the local node and the target node have in common. The command goes through this process for each local node and target node that are specified on the command line.

This parameter affects only the network checks that are included in the port, data, and bandwidth shortcuts. For a list of these network checks, see Table 2 following.

Before you use this parameter, ensure that all nodes in the cluster or group are running IBM Spectrum Scale 5.0.0 or later and that you have successfully run the command mmchconfig release=LATEST on all the nodes. For more information, see the chapter Upgrading.

Addr[,Addr...]
Specifies a list of subnet addresses to be searched. You can specify a subnet address either as a literal network address in dot-decimal format, such as 10.168.0.0, or as a shell-style regular expression that can match multiple subnet addresses.
Note: You cannot specify a cluster name as part of the subnet address.
The following types of regular expression are supported:
  • Character classes
    • A set of numerals enclosed in square brackets. For example, the expression 192.168.[23].0 matches 192.168.2.0 and 192.168.3.0.
    • A range of numerals enclosed in square brackets. For example, the expression 192.168.[2-16].0 matches the range 192.168.2.0 through 192.168.16.0.
  • Quantifiers
    • The pattern X* signifies X followed by 0 or more characters. For example, the expression 192.168.*.0 matches 192.168.0.0, 192.168.1.0, and so on up to 192.168.255.0.
    • The pattern X? signifies X followed by 0 or 1 characters. For example, the expression 192.168.?.0 matches 192.168.0.0, 192.168.1.0, and so on up to 192.168.9.0.
Tip: In all of IBM Spectrum Scale, you can specify a list of subnets for the mmnetverify command in three locations:
  • In the subnets attribute of the mmchconfig command. For more information, see mmchconfig command.
  • In the subnets entry of the mmnetverify configuration file. See the description of the --configuration-file parameter earlier in this topic.
  • In the --subnets parameter of the mmnetverify command.

If you specify a list of subnets in the second location (the subnets entry of the mmnetverify configuration file) the command ignores any subnets that are specified in the first location. Similarly, if you specify a list of subnets in the third location (the --subnets parameter of the mmnetverify command) the command ignores any subnets that are specified in the first two locations.
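
The precedence among the three locations can be sketched as a small shell function. This is an illustration of the rule described above, not the command's own logic; the function name and the sample addresses are hypothetical.

```shell
#!/bin/sh
# Sketch of the documented precedence; the function name is hypothetical.
# Arguments: $1 = --subnets value on the command line
#            $2 = subnets entry of the mmnetverify configuration file
#            $3 = subnets attribute set with mmchconfig
# An empty argument means "not specified in that location".
effective_subnets() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"
    elif [ -n "$2" ]; then
        printf '%s\n' "$2"
    else
        printf '%s\n' "$3"
    fi
}

# The configuration file entry overrides the mmchconfig attribute.
effective_subnets "" "10.168.0.0" "192.168.2.0"
# The command-line parameter overrides both other locations.
effective_subnets "192.168.[23].0" "10.168.0.0" "192.168.2.0"
```

When you type a pattern such as 192.168.[23].0 on the command line, quote it so that the shell does not try to expand it as a file name pattern.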

[--cluster Name Node[,Node...]]
Specifies a remote cluster for the remote-cluster operation to check. This parameter can occur multiple times on the command line, so that you can specify multiple remote clusters. By default the remote-cluster operation checks only the known remote clusters -- that is, the remote clusters that are listed in the mmsdrfs file, where they are put by the mmremotecluster command. With the --cluster parameter, you can specify other remote clusters to be checked. For more information about the remote-cluster operation, see the entry for the operation in Table 3 later in this topic.
Node[,Node...]
Specifies one or more contact nodes through which the remote-cluster operation can get information about the remote cluster. You must specify at least one contact node.
The remote-cluster operation evaluates a remote cluster in two stages. In the first stage it gathers information about the nodes in the remote cluster. In the second stage it checks the nodes for connectivity. The first stage has two phases:
  1. In the first phase the remote-cluster operation tries to get information through the contact nodes of the remote cluster. If this attempt succeeds, the first stage ends and the operation goes to the second stage, which is testing the remote nodes for connectivity. But if the attempt fails, the operation goes to the second phase.
  2. In the second phase the operation tries to get information through a special daemon called a remote-cluster daemon that can be running on a remote contact node. If this attempt succeeds, the first stage ends and the operation goes to the second stage, which is checking the nodes for connectivity. If the attempt fails, the operation displays an error message with instructions to start a remote-cluster daemon on a contact node and try the remote-cluster operation again.
To start a remote-cluster daemon, open a console window on a contact node and issue the following command:
mmnetverify --remote-cluster-daemon --remote-cluster-port <PortNumber>
Note: The --remote-cluster-port parameter is optional. The default port number is 61147. For more information, see the description of the --remote-cluster-daemon parameter later in this topic.
You can now issue the mmnetverify command again with the remote-cluster operation to run checks against the remote cluster. If you also specify the --remote-cluster-port parameter and the port number on which the remote-cluster daemon listens, the remote-cluster operation skips the first phase of the search and immediately starts the second phase. For example,
mmnetverify remote-cluster --remote-cluster-port <PortNumber> --cluster <clusterName> <nodeName>
For more information, see the entry for the remote-cluster operation in Table 3 later in this topic.
--remote-cluster-daemon [--remote-cluster-port PortNumber]
Causes the mmnetverify command to start a remote-cluster daemon on the node where the command is issued. The purpose of this daemon is to provide information about the nodes of the remote cluster to the mmnetverify command when it is running a remote-cluster operation from a node in a local cluster.
--remote-cluster-port PortNumber
Specifies the port that the remote-cluster daemon listens on. If you do not specify a port number, the daemon listens on the default port 61147.
For more information, see the description of the --cluster parameter earlier in this topic and the description of the remote-cluster operation in Table 3 later in this topic.

The network checks

The following table lists the shortcut terms that you can specify for the network checks that are listed in Table 3:
Note: These network checks might cause entries to the mmfs log that say that connections are being ended. These entries are expected and do not indicate problems in your system.
Table 2. Shortcut terms for network checks
Shortcut      Checks that are performed
local         interface
connectivity  resolution, ping, shell, and copy
port          daemon-port, sdrserv-port, and tsccmd-port
data          data-small, data-medium, and data-large
bandwidth     bandwidth-node and bandwidth-cluster
protocol      protocol-ctdb and protocol-object
flood         flood-node and flood-cluster
rdma          rdma-connectivity
all           All checks except flood-node and flood-cluster

The following table lists the parameters that you can specify for network checks. Separate these parameters with a blank on the command line. For example, the following command runs the interface and copy checks on the local node against all the nodes in the cluster:
mmnetverify interface copy
Table 3. Network checks
Command-line option Test description Test items
interface Network interface configuration The local node's daemon and admin network addresses are enabled.
resolution Host name resolution The command verifies the following conditions:
  • The target node's name and daemon node name can be resolved on the local node and they resolve to the same address.
  • The target node's IP address resolves to either the node name or the daemon node name.
ping Network connectivity via ping The local node can ping the target node with its name, daemon node name, rel_hostname, admin_shortname, and IP address entries in the mmsdrfs file.
shell Remote shell command The command verifies the following conditions:
  • The local node can issue remote shell commands to the target node's admin interface without requiring a password.
  • The target node's daemon and admin names refer to the same node.
copy Remote copy The local node can issue a remote copy command to the target node's admin interface without requiring a password.
time Date and time The time and date on the local node and target node do not differ by a wide margin.
daemon-port (1) GPFS daemon connectivity The target node can establish a TCP connection to the local node on the mmfsd daemon port of the local node:
  • The target node uses the port that is specified in the cluster configuration property tscTcpPort. The default value is 1191.
  • If mmfsd and mmsdrserv are not running on the local node, the command starts an echo server on the daemon port of the local node for this test.
sdrserv-port (1) GetObject daemon connectivity The target node can establish a TCP connection directed to the local node on the port that is specified in the cluster configuration property mmsdrservPort:
  • The default value of this port is the value that is specified in the cluster configuration property tscTcpPort.
  • If mmfsd and mmsdrserv are not running on the local node, the command starts an echo server on the daemon port of the local node for this test.
tsccmd-port (1) TS-command connectivity The target node can establish a TCP connection directed to the local node on a port in the range that is specified in the tscCmdPortRange property:
  • The command starts an echo server on the local node for this test.
  • If the tscCmdPortRange property is set, then the echo server listens on a port in the specified range.
  • If not, then the echo server listens on an ephemeral port that is provided by the operating system.
data-small (1) Small data exchange The target node can establish a TCP connection to the local node and exchange a series of small-sized data messages without network errors:
  • The command starts an echo server on the local node for this test.
  • If the tscCmdPortRange property is set, then the echo server listens on a port in the specified range.
  • If not, then the echo server listens on an ephemeral port that is provided by the operating system.
data-medium (1) Medium data exchange The target node can establish a TCP connection to the local node and exchange a series of medium-sized data messages without network errors:
  • The command starts an echo server on the local node for this test.
  • If the tscCmdPortRange property is set, then the echo server listens on a port in the specified range.
  • If not, then the echo server listens on an ephemeral port that is provided by the operating system.
data-large (1) Large data exchange The target node can establish a TCP connection to the local node and exchange a series of large-sized data messages without network errors:
  • The command starts an echo server on the local node for this test.
  • If the tscCmdPortRange property is set, then the echo server listens on a port in the specified range.
  • If not, then the echo server listens on an ephemeral port that is provided by the operating system.
bandwidth-node (1) Network bandwidth one-to-one The target node can establish a TCP connection to the local node and send a large amount of data with adequate bandwidth:
  • Bandwidth is measured on the target node.
  • If the min-bandwidth parameter was specified on the command line, the command verifies that the actual bandwidth exceeds the specified minimum bandwidth.
bandwidth-cluster (1) Network bandwidth many-to-one All target nodes can establish a TCP connection to the local node and send a large amount of data in parallel:
  • The bandwidth is measured on each target node.
  • If the min-bandwidth parameter was specified on the command line, the command verifies that the actual bandwidth exceeds the specified minimum bandwidth.
  • None of the target nodes has a significantly smaller bandwidth than the other target nodes.
gnr-bandwidth Overall bandwidth All target nodes can establish a TCP connection to the local node and send data to it.
  • The bandwidth is measured on each target node.
  • If the min-bandwidth parameter is specified on the command line, the command verifies that the actual bandwidth exceeds the specified minimum bandwidth.
Note the following differences between this test and the bandwidth-cluster test:
  • The total bandwidth is measured rather than the bandwidth per node.
  • All target nodes must be active and must participate in the test.
  • The bandwidth value that is reported does not include ramp-up time for TCP to reach full capability.
flood-node Flood one-to-one When the local node is flooded with datagrams, the target node can successfully send datagrams to the local node:
  • The target node tries to flood the local node with datagrams.
  • The command records packet loss.
  • The command verifies that some of the datagrams were received.
flood-cluster Flood many-to-one When the local node is flooded with datagrams from all the target nodes in parallel, each target node can successfully send datagrams to the local node:
  • The command records packet loss for each target node.
  • The command checks that each target node received some datagrams.
  • The command checks that none of the target nodes has a packet loss significantly higher than the other nodes.
remote-cluster [--remote-cluster-port PortNumber ] Remote cluster connectivity The local node can establish a connection with the nodes in the specified remote clusters. By default this operation tests each remote cluster that is listed in the mmsdrfs file on the local node. You can specify more clusters to test with the --cluster command line parameter. This operation runs the following tests against each node in a remote cluster:
  • Host name resolution:
    • The target node's name and daemon node name can be resolved on the local node and they resolve to the same address.
    • The target node's IP address resolves to either the node name or the daemon node name.
  • Network connectivity via ping:
    • The local node can ping the target node with its name, daemon node name, rel_hostname, admin_shortname, and IP address entries in the mmsdrfs file of the target node.

  • GPFS daemon connectivity:
    • The local node can establish a TCP connection to the target node on the mmfsd daemon port of the target node:
      • The local node uses the port that is specified in the cluster configuration property tscTcpPort. The default value is 1191.
      Note: Either the mmfsd daemon or the mmsdrserv daemon must be running on the remote node. (In normal operation one of these daemons is always running on a node so long as IBM Spectrum Scale is installed.) Otherwise the test fails.
If you specify the --remote-cluster-port parameter with a port number, the remote-cluster operation skips the first phase of its search for information about a remote cluster and immediately begins the second phase of its search. For more information see the description of the --cluster parameter earlier in this topic.
This network check is performed only if all of the following conditions are true:
  • The local cluster version is IBM Spectrum Scale 5.0.2 or later.
  • At least one remote cluster is defined either in the mmsdrfs file or in a --cluster command-line parameter.
  • One of the following conditions is true:
    • A contact node in the remote cluster can be reached via passwordless ssh.
    • A remote-cluster daemon is running on a contact node in the remote cluster and the port that the daemon listens on is not blocked by a firewall.
protocol-ctdb CTDB port connectivity The target node can establish a connection with the local node through the CTDB port. SMB uses CTDB for internode communications. The command does not run this test in the following situations:
  • The SMB protocol service is not enabled.
  • The local node or a target node is not a CES-enabled node. A node is considered to be CES-enabled if it is in an existing cluster and if it was enabled with the mmchnode --ces-enable command. This requirement is not enforced in the following situations:
    • You override the requirement by specifying the --ces-override option.
    • You specify the --configuration-file parameter.
protocol-object Object-protocol connectivity The target node can establish a connection with the local node through the ports that are used by the object server, the container server, the account server, the object-server-sof, and the keystone daemons. The command does not run this test in the following situations:
  • The Object protocol service is not enabled.
  • The local node or a target node is not a CES-enabled node. A node is considered to be CES-enabled if it is in an existing cluster and if it was enabled with the mmchnode --ces-enable command. This requirement is not enforced in the following situations:
    • You override the requirement by specifying the --ces-override option.
    • You specify the --configuration-file parameter.
rdma-connectivity RDMA connectivity The command verifies the following conditions on nodes on which RDMA is configured:
  • The configuration that is specified in the verbPorts cluster attribute matches the active InfiniBand interfaces on the nodes.
  • The nodes can connect to each other through the active InfiniBand interfaces.
For more information, see the verbPorts cluster attribute in the topic mmchconfig command.
Note: These checks require that the commands ibv_devinfo and ibtracert be installed on the nodes on which RDMA is configured.

(1) For this type of test, if a list of subnets is specified, the command searches the list sequentially for a subnet that both the local node and the target node are connected to. If the command finds such a subnet, the command runs the specified test across the subnet. You can specify a list of subnets in the subnets attribute in the mmchconfig command, in the subnets entry of the mmnetverify configuration file, or in the --subnets parameter of the mmnetverify command. For more information, see the description of the --subnets parameter earlier in this topic.

Exit status

0
The command completed successfully and all the tests were completed successfully.
1
The command encountered problems with options or with running tests.
2
The command completed successfully, but one or more tests were unsuccessful.
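
A calling script can branch on these values. The following sketch maps each documented exit status to a short message; the function name and the message wording are hypothetical, and the mmnetverify call runs only where the command is installed and on the PATH.

```shell
#!/bin/sh
# Map an mmnetverify exit status to a short message.
# The function name and the messages are illustrative only.
describe_status() {
    case "$1" in
        0) echo "all tests passed" ;;
        1) echo "problem with options or with running tests" ;;
        2) echo "one or more tests failed" ;;
        *) echo "unexpected exit status $1" ;;
    esac
}

if command -v mmnetverify >/dev/null 2>&1; then
    mmnetverify connectivity
    describe_status "$?"
fi
```

Distinguishing status 1 from status 2 lets a script separate a usage or execution problem from a network check that ran and failed.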

Security

You must have root authority to run the mmnetverify command.

The node on which you enter the command must be able to execute remote shell commands on any other administration node in the cluster. It must be able to do so without the use of a password and without producing any extraneous messages. For more information, see Requirements for administering a GPFS file system.

Examples

  1. The following command runs all checks, except data-large, flood-node, and flood-cluster, from the node where you enter the command against all the nodes in the cluster:
    mmnetverify
  2. The following command runs connectivity checks from the node where you enter the command against all the nodes in the cluster:
    mmnetverify connectivity
  3. The following command runs connectivity checks from nodes c49f04n11 and c49f04n12 against all the nodes in the cluster:
    mmnetverify connectivity -N c49f04n11,c49f04n12
  4. The following command runs all checks, except flood-node and flood-cluster, from the node where you enter the command against nodes c49f04n11 and c49f04n12:
    mmnetverify all --target-nodes c49f04n11,c49f04n12
  5. The following command runs network port checks on nodes c49f04n07 and c49f04n08, each checking against nodes c49f04n11 and c49f04n12:
    mmnetverify port -N c49f04n07,c49f04n08 --target-nodes c49f04n11,c49f04n12

Location

/usr/lpp/mmfs/bin