IBM Support

IT30518: VM BACKUP OR RESTORE STOPS WITH "FAILED TO CREATE SHARE: COMMAND TIMED OUT"

APAR status

  • Closed as program error.

Error description

  • A backup or restore job for VMware guests might stop with the
    following message:
    
    ERROR,<timestamp>,CTGGA2086,Storage exception Failed to create
    storage share on volume <VolumeName> message :"Failed to create
    share: Command timed out: zfs set sharenfs=\"insecure rw=<IP
    Address>\" vpool<x>/fs<yz>"type :"ShareCreateError"
    
    In the vSnap host log, the following related messages appear:
    
    [<timestamp>] INFO pid-12345 vsnap.share Adding new nfs share
    for volume id <yz>
    [<timestamp>] INFO pid-12345 vsnap.linux.system Executing
    command: sudo -n zfs set sharenfs="insecure,rw=<IP Address>"
    vpool<x>/fs<yz>
    ...
    [<timestamp>] ERROR pid-12345 vsnap.linux.system Timed out (480
    seconds) waiting for command to complete: sudo -n zfs set
    sharenfs="insecure,rw=<IP Address>" vpool<x>/fs<yz>
    
    On the vSnap host, the 'showmount -e' output displays an
    orphaned NFS share:
    
    Export list for <vSnapHostName>:
    /vsnap/vpool<x>/fs<yz> <IP_1>,<IP_2>,<IP_3>,...
    
    This happens when some of the IP addresses collected from the
    vSphere environment are not reachable from the vSnap host.
    Spectrum Protect Plus auto-discovers all IP addresses associated
    with the ESXi host to which the backup or restore needs to
    connect, and communicates that list to the vSnap host.
    The vSnap host then starts an NFS share create command with the
    list of IP addresses that might need to access the share.
    vSnap does not test access to these addresses itself; it passes
    them to the exportfs tool.
    If exportfs detects that one or more of these addresses is not
    reachable, it hangs during the share or unshare operation.
    vSnap times out the command but leaves the NFS share orphaned.
    Any further backup attempt involving a volume that has orphaned
    shares then stops with the message above.
    
    IBM Spectrum Protect Versions Affected:
    
    IBM Spectrum Protect Plus 10.1.x
    
    Initial Impact: High
    
    Additional Keywords: SPP, SPPLUS, TS002770661 vSnap hang nfs
    export share unshare ip restore backup
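
The orphaned-share symptom and the reachability cause described above can be checked from the vSnap host with a small script. The following is a minimal sketch and not part of the APAR: the function names list_export_clients and report_unreachable are hypothetical, the export path in the usage comment is a placeholder, and a single ICMP ping only approximates whatever checks exportfs performs.

```shell
#!/bin/sh
# Hypothetical helpers, not IBM-provided tools.

# Print the clients of one export, one per line.
# Reads 'showmount -e' style output on stdin; $1 is the export path.
list_export_clients() {
    awk -v path="$1" '$1 == path {
        n = split($2, c, ",")
        for (i = 1; i <= n; i++) print c[i]
    }'
}

# Read clients on stdin and report those that do not answer
# one ICMP echo request within 2 seconds.
report_unreachable() {
    while read -r client; do
        ping -c 1 -W 2 "$client" >/dev/null 2>&1 || echo "unreachable: $client"
    done
}

# Example, run on the vSnap host (placeholder export path):
# showmount -e | list_export_clients /vsnap/vpool1/fs12 | report_unreachable
```

Addresses reported as unreachable are candidates for the exclusion preference described under Local fix. A host that blocks ICMP can appear unreachable here even though NFS traffic would pass.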
    

Local fix

  • Clean up the current stale NFS mounts:
    
    1. Ensure that no restore, backup, or replication jobs are
       running. If there are active jobs, wait for them to complete.
    2. On the vSnap server, temporarily stop the NFS server:
     sudo systemctl stop nfs-server
    3. Run the following command to clean up the stuck volume that
       is no longer being used. The command may take a few minutes
       to complete.
     sudo zfs destroy -rf vpool<x>/fs<yz>
     NOTE: The syntax needs to be modified to fit
           your environment.
    4. Restart the NFS server:
     sudo systemctl start nfs-server
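
The cleanup sequence above can be sketched as a shell function that prints the commands by default and runs them only on request. This is an illustrative sketch, not an IBM-provided tool: the function name cleanup_stale_share and the dry-run convention are assumptions, the dataset name is a placeholder, and step 1 (no active jobs) still has to be confirmed manually before anything is run.

```shell
#!/bin/sh
# Hypothetical helper, not an IBM-provided tool.
# Usage: cleanup_stale_share DATASET [run]
#   DATASET  ZFS dataset of the stuck volume, for example vpool1/fs12
#   run      execute the commands; without it they are only printed
cleanup_stale_share() {
    dataset="$1"
    mode="${2:-dry-run}"
    if [ -z "$dataset" ]; then
        echo "usage: cleanup_stale_share DATASET [run]" >&2
        return 2
    fi
    # Print each command; execute it only when mode is "run".
    step() {
        echo "+ $*"
        [ "$mode" = "run" ] || return 0
        "$@"
    }
    step sudo systemctl stop nfs-server || return 1
    step sudo zfs destroy -rf "$dataset"
    status=$?
    # Restart NFS even if the destroy failed, so shares are not left offline.
    step sudo systemctl start nfs-server || status=$?
    return "$status"
}

# Dry run with a placeholder dataset name:
# cleanup_stale_share vpool1/fs12
```

Reviewing the printed commands before repeating the call with the second argument set to run is a precaution, because 'zfs destroy -rf' removes the dataset and its descendants.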
    
    To avoid recurrence:
    
    Ensure that no unexpected DNS entries appear in the file
    /etc/resolv.conf.
    If any are present, correct the vSnap network configuration
    with the network manager by running the command:
     sudo nmtui
    and correct the DNS settings for the affected network interfaces.
    
    If the problem persists after the DNS settings are verified or
    corrected:
    
    1. Determine which IP addresses and subnets the vSnap server
       uses (command "vsnap network show").
    2. Determine which IP addresses and subnets the ESXi servers use.
    3. If there are multiple subnets, determine which ones are
       reachable from the vSnap server and which are not.
    4. Once the subnets and any individual IP addresses that are
       unreachable from the vSnap host are identified, filter them
       out of the NFS share/unshare command.
       For example, to exclude all IP addresses 1.2.*.* and the
       individual IP address 3.4.5.6, the command is:
     vsnap system pref set --name excludeAllowedHostsPrefix --value "1.2,3.4.5.6"
    5. Verify that the preference is set:
     vsnap system pref get | grep excludeAllowedHostsPrefix
    
        excludeAllowedHostsPrefix | N/A | 1.2,3.4.5.6 | string
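
When the list of unreachable prefixes is long, the comma-separated value used in step 4 can be assembled mechanically. The following is a small sketch under stated assumptions: build_exclude_command is a hypothetical helper name, and it only prints the command in the form shown in the example above without running it. How vSnap matches the prefixes against addresses is not described in this APAR.

```shell
#!/bin/sh
# Hypothetical helper: join the given prefixes or addresses with commas
# and print the vsnap command from step 4. Nothing is executed.
build_exclude_command() {
    value=""
    for entry in "$@"; do
        # Add a comma separator only when value is already non-empty.
        value="${value:+$value,}$entry"
    done
    printf 'vsnap system pref set --name excludeAllowedHostsPrefix --value "%s"\n' "$value"
}

# Example with the values from step 4:
# build_exclude_command 1.2 3.4.5.6
```

The printed line can be reviewed and then pasted into the vSnap shell, followed by the verification command from step 5.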
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * IBM Spectrum Protect Plus levels 10.1.4 and 10.1.5.          *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See Error Description                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply fixing level when available. This problem is currently *
    * projected to be fixed in IBM Spectrum Protect Plus level     *
    * 10.1.6. Note that this is subject to change at the           *
    * discretion of IBM.                                           *
    ****************************************************************
    

Problem conclusion

  • In IBM Spectrum Protect Plus 10.1.4 and 10.1.5, vSnap servers
    that were deployed from OVA initially contained invalid DNS
    configuration left over from the OVA build process. Typically,
    when the user customizes the IP and DNS configuration during or
    after deployment, the invalid configuration is updated with the
    new customized configuration, so no problems occur. In rare
    cases, users may choose not to enter any DNS information at all.
    In these cases, the invalid configuration remains on the vSnap
    appliance. When creating, updating, or deleting NFS shares, the
    NFS server on the vSnap attempts to perform DNS lookups of share
    clients. These DNS lookups may fail/hang due to the invalid DNS
    configuration left behind on the server, thus resulting in hangs
    of the share operation.
    
    The issue has been resolved by removing the invalid DNS
    configuration during the OVA build process. The fix takes effect
    for new OVA deployments. For existing deployments, the invalid
    configuration can be corrected by either removing the file
    /etc/resolv.conf or by specifying custom DNS servers to ensure
    the default invalid configuration is overwritten by the custom
    configuration.
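
On an existing deployment, the DNS condition described above can be inspected before deciding whether to change the configuration. The following is a minimal sketch, not an IBM-provided procedure: the function names list_nameservers and check_lookup are hypothetical, the 5-second limit is an arbitrary choice, and the address in the usage comment is a placeholder for a share client such as an ESXi host.

```shell
#!/bin/sh
# Hypothetical helpers, not IBM-provided tools.

# Print the nameserver addresses configured in a resolv.conf style file.
# $1 is the file to read; it defaults to /etc/resolv.conf.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Look up one address through the system resolver with a 5-second limit
# and report whether the lookup resolved, found nothing, or timed out.
check_lookup() {
    timeout 5 getent hosts "$1" >/dev/null 2>&1
    case $? in
        0)   echo "lookup of $1 resolved" ;;
        124) echo "lookup of $1 timed out" ;;
        *)   echo "lookup of $1 returned no entry" ;;
    esac
}

# Example, run on the vSnap host (placeholder client address):
# list_nameservers
# check_lookup 192.0.2.10
```

Nameservers that the administrator did not configure, or a lookup that times out, would be consistent with the leftover configuration described in this section.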
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT30518

  • Reported component name

    SP PLUS

  • Reported component ID

    5737SPLUS

  • Reported release

    A14

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2019-10-07

  • Closed date

    2020-04-29

  • Last modified date

    2020-08-14

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    SP PLUS

  • Fixed component ID

    5737SPLUS

Applicable component levels

Document Information

Modified date:
30 January 2024