APAR status
Closed as program error.
Error description
The Backup or Restore job for VMware guests might stop with the following message:

ERROR,<timestamp>,CTGGA2086,Storage exception Failed to create storage share on volume <VolumeName> message :"Failed to create share: Command timed out: zfs set sharenfs=\"insecure rw=<IP Address>\" vpool<x>/fs<yz>"type :"ShareCreateError"

In the vSnap host log, the following related messages will be seen:

[<timestamp>] INFO pid-12345 vsnap.share Adding new nfs share for volume id <yz>
[<timestamp>] INFO pid-12345 vsnap.linux.system Executing command: sudo -n zfs set sharenfs="insecure,rw=<IP Address>" vpool<x>/fs<yz>
...
[<timestamp>] ERROR pid-12345 vsnap.linux.system Timed out (480 seconds) waiting for command to complete: sudo -n zfs set sharenfs="insecure,rw=<IP Address>" vpool<x>/fs<yz>

On the vSnap host, the 'showmount -e' output will display an orphaned NFS share:

Export list for <vSnapHostName>:
/vsnap/vpool<x>/fs<yz> <IP_1>,<IP_2>,<IP_3>,...

This happens if some of the IP addresses collected from the vSphere environment are not reachable by the vSnap host. Spectrum Protect Plus auto-discovers the list of all IPs associated with the ESXi host to which the backup or restore needs to connect. That list of IPs is communicated to the vSnap host. An NFS share create command is then started with the list of IPs that might need to access that share. The vSnap server does not directly test access to these IPs but passes them to the exportfs tool. If exportfs detects that one or more of these IPs is not reachable, it gets stuck during sharing or unsharing. The vSnap server times out the command but leaves the NFS share orphaned. Any further attempt to perform a backup involving a volume that has orphaned shares then stops with the above message.

IBM Spectrum Protect Versions Affected: IBM Spectrum Protect Plus 10.1.x

Initial Impact: High

Additional Keywords: SPP, SPPLUS, TS002770661 vSnap hang nfs export share unshare ip restore backup
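To see which volumes were hit by the timeout, the vSnap log can be filtered for the "Timed out" message shown above. The following is a minimal sketch, not an IBM-provided tool: the function name timed_out_datasets is hypothetical, and it assumes the log lines have the layout quoted in this APAR, with the dataset name as the last token of the line.

```shell
# Hypothetical helper: read vSnap log text on standard input and print
# each dataset named in a timed-out "zfs set sharenfs" command, once.
# Assumes the dataset (vpool<x>/fs<yz>) is the last field of the line,
# as in the log excerpt quoted in this APAR.
timed_out_datasets() {
    grep 'Timed out' | grep 'zfs set sharenfs' | awk '{ print $NF }' | sort -u
}
```

Feed the function the vSnap host log on standard input; the names it prints are the candidates to compare against the 'showmount -e' export list.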
Local fix
Clean up the current stale NFS mounts:

1. Ensure no restore/backup/replication jobs are running. If there are active jobs, wait for them to complete.
2. On the vSnap server, temporarily stop the NFS server:
   sudo systemctl stop nfs-server
3. Run the following to clean up the stuck volume that is no longer being used. This command may take a few minutes to complete.
   sudo zfs destroy -rf vpool<x>/fs<yz>
   NOTE: The syntax needs to be modified to fit your environment.
4. Restart the NFS server:
   sudo systemctl start nfs-server

To avoid recurrence:

Ensure no unexpected DNS entries are present in the file resolv.conf. If any are present, correct the vSnap network configuration by running the network manager with the command:
   sudo nmtui
and correct the DNS settings for the affected network interfaces.

If the problem is still present after the above has been verified or corrected:

1. Determine which IPs/subnets the vSnap server uses (command "vsnap network show").
2. Determine which IPs/subnets the ESX servers use.
3. If there are multiple subnets, determine which ones are reachable from the vSnap host and which are not.
4. Once the subnets and any individual IP addresses that are unreachable by the vSnap host are identified, filter them out from the NFS share/unshare command. For example, if all IP addresses 1.2.*.* and the individual IP address 3.4.5.6 need to be excluded, the command is:
   vsnap system pref set --name excludeAllowedHostsPrefix --value "1.2,3.4.5.6"
5. Verify that the preference is set:
   vsnap system pref get | grep excludeAllowedHostsPrefix
   excludeAllowedHostsPrefix | N/A | 1.2,3.4.5.6 | string
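The reachability check in step 3 can be roughly approximated from the vSnap host with a small loop. This is only a sketch under stated assumptions: the function names probe, list_unreachable and join_csv are hypothetical, the ping options are those of the Linux iputils ping, and an ICMP reply is only a rough proxy for what exportfs needs, so treat the result as a starting point for the exclusion list rather than a definitive answer.

```shell
# Hypothetical probe: succeeds if the address answers one ICMP echo
# within 2 seconds (Linux iputils ping options). Replace this function
# if ICMP is filtered in your environment.
probe() {
    ping -c 1 -W 2 "$1" >/dev/null 2>&1
}

# Print every address given as an argument that fails the probe.
list_unreachable() {
    for ip in "$@"; do
        probe "$ip" || printf '%s\n' "$ip"
    done
}

# Join lines from standard input with commas, the list format used for
# the --value argument in the example above.
join_csv() {
    paste -sd, -
}
```

For example, list_unreachable <IP_1> <IP_2> <IP_3> | join_csv prints a comma-separated candidate list. Review it by hand, and shorten entries to a common prefix where a whole subnet is unreachable, before passing it to the excludeAllowedHostsPrefix preference.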
Problem summary
****************************************************************
* USERS AFFECTED:                                              *
* IBM Spectrum Protect Plus level 10.1.4 and 10.1.5.           *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error Description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Apply fixing level when available. This problem is currently *
* projected to be fixed in IBM Spectrum Protect Plus level     *
* 10.1.6. Note that this is subject to change at the           *
* discretion of IBM.                                           *
****************************************************************
Problem conclusion
In IBM Spectrum Protect Plus 10.1.4 and 10.1.5, vSnap servers that were deployed from OVA initially contained invalid DNS configuration left over from the OVA build process. Typically, when the user customizes the IP and DNS configuration during or after deployment, the invalid configuration is updated with the new customized configuration, so no problems occur. In rare cases, users may choose not to enter any DNS information at all. In these cases, the invalid configuration remains on the vSnap appliance.

When creating, updating, or deleting NFS shares, the NFS server on the vSnap attempts to perform DNS lookups of share clients. These DNS lookups may fail or hang due to the invalid DNS configuration left behind on the server, resulting in hangs of the share operation.

The issue has been resolved by removing the invalid DNS configuration during the OVA build process. The fix takes effect for new OVA deployments. For existing deployments, the invalid configuration can be corrected either by removing the file /etc/resolv.conf or by specifying custom DNS servers to ensure the default invalid configuration is overwritten by the custom configuration.
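On an existing deployment, a quick way to check for leftover DNS configuration is to list the nameserver entries in /etc/resolv.conf and compare them with the DNS servers you expect. A minimal sketch follows; the function name list_nameservers is hypothetical, and it relies only on the standard resolv.conf layout in which each server appears on a line of the form "nameserver <address>".

```shell
# Hypothetical helper: print the address from every "nameserver" line
# of a resolv.conf-style file. Defaults to /etc/resolv.conf when no
# file is given; comments and other keywords are ignored.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}
```

Any address printed that you did not configure yourself is a candidate for the invalid entries described above.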
Temporary fix
Comments
APAR Information
APAR number
IT30518
Reported component name
SP PLUS
Reported component ID
5737SPLUS
Reported release
A14
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2019-10-07
Closed date
2020-04-29
Last modified date
2020-08-14
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
SP PLUS
Fixed component ID
5737SPLUS
Applicable component levels
[{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"SSNQFQ","label":"IBM Spectrum Protect Plus"},"Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"A14","Line of Business":{"code":"LOB26","label":"Storage"}}]
Document Information
Modified date:
30 January 2024