Requirements for attaching systems to hosts running the Linux operating system

Ensure that your system meets the requirements for attaching to a host that is running the Linux® operating system.

The following list provides the requirements for attaching the system to the host that is running the Linux operating system:

  • Check the LUN limitations for your host system.
  • Ensure that you have the documentation for your host and access to the hardware installation information for the correct model of your system. All system publications are available from the following website: www.ibm.com/support.
  • Ensure that you install the correct operating system and are running a supported Linux kernel.
  • When you attach the system to a BladeCenter blade server, see the BladeCenter documentation for SAN configuration details.
You must configure the following settings so that paths can be restored if they are lost during an upgrade or other hardware maintenance that takes a node offline:
  • SCSI INQUIRY TIMEOUT
  • SCSI COMMAND TIMEOUT
  • Multipath settings in multipath.conf
Note: If you configure your system to use NPIV, paths are less likely to be lost.
To set SCSI INQUIRY TIMEOUT, use these steps:
  • Add scsi_mod.inq_timeout=70 to the kernel boot command line through the GRUB configuration. Adding the scsi_mod.inq_timeout=70 parameter makes the change persistent across server reboots and allows Linux hosts to regain system node paths when they are lost. Make this change by completing the following steps.

  • For SLES12 and SLES15 servers, follow these steps:
    1. To make the change permanent, edit /etc/default/grub and add the following parameter to the GRUB_CMDLINE_LINUX_DEFAULT line:
      scsi_mod.inq_timeout=70
    2. Run the following command to rewrite the boot record:
      grub2-mkconfig -o /boot/grub2/grub.cfg
      
  • For RHEL6, RHEL7, and RHEL8 servers, follow these steps:
    1. To make the change permanent, edit /etc/sysconfig/grub and add the following parameter to the GRUB_CMDLINE_LINUX line:
      scsi_mod.inq_timeout=70
    2. Run the following command to rewrite the boot record:
      grub2-mkconfig -o /etc/grub2.cfg

    The previous steps take effect after the server is rebooted. In RHEL6, RHEL7, RHEL8, SLES12, and SLES15, you can also change the inq_timeout parameter temporarily without rebooting. A temporary change is lost at the next reboot unless you also edit the GRUB configuration as described in the previous steps.

    Use the following command to change the inq_timeout parameter temporarily without rebooting:
    echo 70 > /sys/module/scsi_mod/parameters/inq_timeout
    In RHEL6, RHEL7, RHEL8, SLES12, and SLES15, enter the following command to verify that the change was made:
    systool -m scsi_mod -A inq_timeout
    The output of the command shows that the value is changed to 70:
    Module = "scsi_mod"
    inq_timeout         = "70"
Note: It is best to make both the temporary and the permanent change so that the setting takes effect immediately and is retained if the server is rebooted in the future.
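The permanent change described above can be scripted. The following is a minimal sketch, not a supported tool: add_inq_timeout is a hypothetical helper name, and the sketch assumes that the GRUB variable is defined on a single line with a double-quoted value and no trailing whitespace. It appends the parameter only when it is not already present and leaves a .bak copy of the edited file.

```shell
# Sketch: append scsi_mod.inq_timeout=70 to a GRUB command-line variable.
# add_inq_timeout is a hypothetical helper, not part of any distribution.
# Assumption: the variable is written as VAR="..." on one line.
# Usage: add_inq_timeout FILE VARIABLE
add_inq_timeout() {
    file=$1
    var=$2
    param="scsi_mod.inq_timeout=70"

    # Nothing to do when the parameter is already on the variable's line.
    if grep -q "^${var}=.*${param}" "$file"; then
        return 0
    fi

    # Insert the parameter just before the closing quote of VAR="...".
    # A backup of the original file is kept as FILE.bak.
    sed -i.bak "s|^\(${var}=\".*\)\"\$|\1 ${param}\"|" "$file"
}
```

For example, add_inq_timeout /etc/default/grub GRUB_CMDLINE_LINUX_DEFAULT edits the SLES file named in the steps above. The boot configuration must still be regenerated with the grub2-mkconfig command shown for your distribution. The sketch does not replace an existing inq_timeout entry that has a different value.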
To set SCSI COMMAND TIMEOUT, use these steps:
  • Set the udev rules for SCSI command timeout to 120s. This is the recommended setting for all versions of Linux.
Udev rules file creation
  1. To increase the SCSI command timeout for the system, create the following udev rule:
    # Set SCSI command timeout to 120s (default == 30 or 60) for IBM 2145 devices
    SUBSYSTEM=="block", ACTION=="add", ENV{ID_VENDOR}=="IBM",ENV{ID_MODEL}=="2145", RUN+="/bin/sh -c 'echo 120 >/sys/block/%k/device/timeout'"
  2. After you set up your volumes, confirm that they are set for 120 seconds. Locate the block device paths by running multipath -ll | grep sd from the command prompt. Then, run cat /sys/block/sdX/device/timeout (where sdX is each 2145 block device).
  3. To reload the udev rules dynamically, without rebooting, run the following commands:
    udevadm control -R
    /sbin/udevadm trigger --type=devices --action=add
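The confirmation in step 2 can be repeated for many devices with a small loop. The following is a sketch under stated assumptions: check_scsi_timeouts is a hypothetical helper name, and the SYSFS_ROOT variable exists only so that the function can be tried against a test directory instead of the real /sys.

```shell
# Sketch: report block devices whose SCSI command timeout is not 120 seconds.
# check_scsi_timeouts and SYSFS_ROOT are hypothetical names for illustration.
# Usage: check_scsi_timeouts sdX [sdY ...]
check_scsi_timeouts() {
    root=${SYSFS_ROOT:-/sys}
    status=0
    for dev in "$@"; do
        f="$root/block/$dev/device/timeout"
        if [ ! -r "$f" ]; then
            echo "$dev: timeout file not found"
            status=1
            continue
        fi
        value=$(cat "$f")
        if [ "$value" != "120" ]; then
            echo "$dev: timeout is $value, expected 120"
            status=1
        fi
    done
    # Returns 0 only when every listed device reports 120.
    return $status
}
```

Pass the sd device names that multipath -ll reports for the 2145 volumes, for example check_scsi_timeouts sdem sdew. The function prints nothing when every device is already set to 120.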

Settings for Linux hosts

To ensure path recovery in failover scenarios, certain Device Mapper Multipath (DMMP) settings and udev rules for the attachment of Linux hosts to the system are recommended. These settings are valid for IBM® System x, all Intel or AMD-based servers, and Power® platforms.

You must restart your host after you complete the following two steps:
  • Editing the multipath settings in /etc/multipath.conf
  • Editing the udev rules for SCSI command timeout

For each Linux distribution, and for each release within a distribution, refer to the default settings under [/usr/share/doc/device-mapper-multipath.*] for Red Hat and [/usr/share/doc/packages/multipath-tools] for Novell SuSE. Ensure that the entries that you add to multipath.conf match the format and syntax that the Linux distribution requires. Use only the multipath.conf file from your own distribution and release. Do not copy the multipath.conf file from one distribution or release to another.

For some operating system levels, polling_interval must be located under defaults instead of under device settings. If polling_interval is present in the device section, comment it out by prefixing the line with a # character.

For example:
Under Device Section
# 		polling_interval 5

Under Defaults Section
defaults {
		user_friendly_names yes
		polling_interval 5
}

Multipath settings for specific Linux distributions and releases

Edit /etc/multipath.conf with the following parameters and confirm the changes by entering:
multipathd -k
multipathd> show config
Note: Oracle Linux versions use the same settings as the corresponding Red Hat Linux versions.
Red Hat Linux version 7.x, 8.x, and 9.x
     vendor "IBM"
     product "2145"
     path_grouping_policy "group_by_prio"
     path_selector "service-time 0" # Used by Red Hat 7.x
     prio "alua"
     path_checker "tur"
     failback "immediate"
     no_path_retry 5
     rr_weight uniform
     rr_min_io_rq "1"
     dev_loss_tmo 120	
SUSE Linux Versions 12 and 15
     vendor "IBM"
     product "2145"
     path_grouping_policy "group_by_prio"
     path_selector "service-time 0" 
     prio "alua"
     path_checker "tur"
     failback "immediate"
     retain_attached_hw_handler "yes"
     no_path_retry 5 # or no_path_retry "fail"
     fast_io_fail_tmo 5
     rr_min_io 1000
     rr_min_io_rq 1
     rr_weight "uniform"	
Ubuntu
     vendor "IBM"
     product "2145"
     path_grouping_policy "group_by_prio"
     path_selector "service-time 0" 
     prio "alua"
     path_checker "tur"
     failback "immediate"
     no_path_retry 5 # or no_path_retry "fail"
     retain_attached_hw_handler "yes"
     fast_io_fail_tmo 5
     rr_min_io 1000
     rr_min_io_rq 1
     rr_weight "uniform"	
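When you confirm the changes, the show config output lists every known device, so the IBM 2145 entry can be hard to find. The following sketch filters that output. It assumes that the output uses device { ... } blocks with one setting per line, and ibm_2145_device_block is a hypothetical helper name.

```shell
# Sketch: print only the device blocks for vendor IBM, product 2145 from
# multipath configuration text that is read on standard input.
# ibm_2145_device_block is a hypothetical helper for illustration.
ibm_2145_device_block() {
    awk '
        # A new device block starts: reset the collected text and flags.
        /^[ \t]*device[ \t]*[{]/ { inblock = 1; block = ""; ibm = 0; svc = 0 }
        inblock {
            block = block $0 "\n"
            if ($1 == "vendor"  && $0 ~ /IBM/)  ibm = 1
            if ($1 == "product" && $0 ~ /2145/) svc = 1
            # The closing brace ends the block; print it only on a match.
            if ($0 ~ /^[ \t]*[}]/) {
                if (ibm && svc) printf "%s", block
                inblock = 0
            }
        }
    '
}
```

The function reads standard input, so it works on any saved copy of the show config output, for example ibm_2145_device_block < saved-config.txt, where saved-config.txt is a placeholder file name.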

DM-MPIO for dev_loss_tmo

If dev_loss_tmo is set to infinity, then after a problem is detected on an FC port, the SCSI layer can wait up to 2147483647 seconds (about 68 years) before removing the port from the system. The default value is determined by the OS.

All Linux hosts should have a dev_loss_tmo setting. Its value, in seconds, is how long to wait before the device or its paths are pruned. The suggested duration is 120-150 seconds, but longer durations are also supported.

Take care not to set the value too low. Paths that are pruned must be rediscovered, and if the value is too low, a manual rescan might be required later. If the inquiry timeout is set correctly, the host should be able to re-add the paths when the SVC nodes are restored.

If the inquiry timeout is too short, such as 20 seconds, the inquiry might time out before the paths are ready.
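The values in effect can be read from sysfs. The following sketch assumes that the FC transport exposes each remote port as /sys/class/fc_remote_ports/rport-*/dev_loss_tmo. The helper names and the SYSFS_ROOT override are hypothetical and exist only for illustration and testing. The first helper reproduces the 68-year figure by using 365-day years.

```shell
# Sketch: convert a timeout in seconds to whole 365-day years.
tmo_in_years() {
    echo $(( $1 / (365 * 24 * 3600) ))
}

# Sketch: list dev_loss_tmo for each FC remote port and flag values that
# are below the suggested 120 seconds. Names here are hypothetical.
report_dev_loss_tmo() {
    root=${SYSFS_ROOT:-/sys}
    for f in "$root"/class/fc_remote_ports/rport-*/dev_loss_tmo; do
        # Skip the unexpanded pattern when no remote ports exist.
        [ -r "$f" ] || continue
        rport=$(basename "$(dirname "$f")")
        value=$(cat "$f")
        if [ "$value" -lt 120 ]; then
            echo "$rport: $value (below suggested minimum of 120)"
        else
            echo "$rport: $value"
        fi
    done
}
```

Running tmo_in_years 2147483647 prints 68. Running report_dev_loss_tmo with no arguments reads the live /sys tree on the host.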

Multipathing driver

If paths are lost and are not automatically restored, you can restore them manually with the following process.

  • If you are using Linux dm-multipath as the multipathing software on the host, and have NPIV disabled on the SVC, it may be necessary to rescan ports after each node restores its paths.
  • If you have NPIV disabled on the SVC, additional configuration is required. During the upgrade, an SVC with NPIV disabled will shut down the ports for an extended period of time, which may cause Linux to remove the ports.
  • The SCSI inquiry timeout setting may be applied to a running system; however, it must also be applied to the GRUB configuration to ensure that it persists across reboots.
  • It is possible that, even with this setting, an SVC upgrade may keep the ports down longer than the timeout setting allows. It may be necessary to rescan the ports once the SVC node has restored operation.
  • Use the multipath command to check path status once the SVC node has completed the upgrade, before updating the next node:
    # multipath -ll /dev/mapper/mpatha 
    (360050768028211d8b000000000000061) dm-11 IBM,2145
    size=10G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
    |-+- policy='service-time 0' prio=50 status=enabled
    | |- 4:0:1:0  sder    129:48  failed faulty running
    | `- 5:0:1:0  sdfb    129:208 failed faulty running
    `-+- policy='service-time 0' prio=10 status=active 
      |- 4:0:0:0  sdem    128:224 active ready running
      `- 5:0:0:0  sdew    129:128 active ready running

Instead of mpatha in the preceding example, choose a device name that matches a device served by the SVC that is being upgraded. The failed and faulty status indicates that the paths are still down for Linux. You can use the multipath -ll command to list all device names, and then scan the output for failed paths to confirm that the failed paths are on devices associated with the SVC upgrade.
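Scanning the multipath -ll output for failed paths can be automated. The following sketch assumes the path-line layout shown in the example above, where each path line ends with the major:minor number followed by three status words. failed_paths is a hypothetical helper name.

```shell
# Sketch: print "sdX H:C:T:L" for every path that is marked failed faulty
# in multipath -ll output read on standard input.
# failed_paths is a hypothetical helper for illustration.
failed_paths() {
    # Fields are counted from the end of the line because the tree-drawing
    # prefix ("| |-", "`-", and so on) varies in its number of fields.
    awk '/failed[ \t]+faulty/ && NF >= 6 { print $(NF-4), $(NF-5) }'
}
```

For example, multipath -ll | failed_paths lists the failed paths for all devices. An empty result means that no path is reported as failed faulty.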

To rescan all SCSI targets and the ports on Red Hat® Enterprise Linux or SUSE Linux Enterprise Server, use the rescan-scsi-bus.sh command, which is part of the sg3_utils package.

If your Linux distribution does not include the rescan-scsi-bus.sh command, use the SCSI-rescan command to rescan all the SCSI targets.
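A script can choose between the two commands at run time. The following sketch uses the standard command -v lookup. first_available is a hypothetical helper name, and the lowercase spelling scsi-rescan in the usage note is an assumption about how the second command is installed on your distribution.

```shell
# Sketch: print the first command from the argument list that is found
# on the PATH. first_available is a hypothetical helper for illustration.
first_available() {
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            return 0
        fi
    done
    # None of the candidates is installed.
    return 1
}
```

For example, rescan=$(first_available rescan-scsi-bus.sh scsi-rescan) && "$rescan" runs whichever command is installed and does nothing when neither is present.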