APAR status
Closed as program error.
Error description
Protecting databases that use iSCSI to access the target vSnap volume can stop. Example of a Microsoft Exchange database backup job log : SUMMARY,<timestamp>,,CTGGA2398,Starting job for policy <SLAName> (ID:<SLAId>). id -> <JobId>. IBM Spectrum Protect Plus version 10.1.8-4083. ... ERROR,<timestamp>,2,,[<DBServerAddress>] Exception caught: error: [Failed to login iscsi target iqn: [iqn.1994-05.com.redhat:vsnap.123456789] target host address: [<vSnapIP>] error [0xefff0013]] ERROR,<timestamp>,2,CTGGA3608,[<DBServerAddress>] "The command[mount] results for the agent [winosutil.exe] cannot be located". ERROR,<timestamp>,2,CTGGA2073,Failed to perform backup of db instance <DBServerInstanceName> on application server <DBServerAddress> : null In the application log, the same error is seen which happens when attempting the iSCSI login : <timestamp> [0x0000d24c] DEBUG winosutil - Property: SendTargets:*<vSnapIP> 0003260 ROOT\ISCSIPRT\0000_0 ; target address: <vSnapIP> <timestamp> [0x0000d24c] DEBUG winosutil - Start WQL query [associators of {MSFT_iSCSITarget.NodeAddress='iqn.1994-05.com.redhat:vsnap.1234 56789'} where resultclass = msft_iscsisession] <timestamp> [0x0000d24c] WARN winosutil - Failed to get iscsi target session info, error: WBEM Enumerator failed. Error: [(0x00000001, 00000207574390A0)]) ... <timestamp> [0x0000d24c] DEBUG winosutil - Target IP address [<vSnapIP>] was connected successfully, and will be used for iscsi connection <timestamp> [0x0000d24c] DEBUG winosutil - vSnap iscsi target IP address was updated to [<vSnapIP>] <timestamp> [0x0000d24c] DEBUG winosutil - Log onto iscsi target (iqn.1994-05.com.redhat:vsnap.123456789, <vSnapIP>) <timestamp> [0x0000d24c] DEBUG winosutil - To add iscsi target (iqn[iqn.1994-05.com.redhat:vsnap.123456789], IP[<vSnapIP>]) ... <timestamp> [0x0000d24c] DEBUG winosutil - Start iscsi login target API, [iqn.1994-05.com.redhat:vsnap.123456789] ... <timestamp> [0x0000d24c] ERROR winosutil - Exception caught: error: [Failed to login iscsi target iqn: [iqn.1994-05.com.redhat:vsnap.123456789], target host address: [<vSnapIP>], error [0xefff0013]] In the vSnap host only the following type of messages are seen in the /var/log/messages : Jun 24 00:05:22 <vSnapHost> kernel: rx_data returned 0, expecting 48. Jun 24 00:05:22 <vSnapHost> kernel: iSCSI Login negotiation failed. Once the above iSCSI login error is observed, all jobs accessing the vSnap iSCSI resource will display the same problem. The starting point of the iSCSI login problem will observed as the following from the /var/log/messages : .. kernel: iSCSI Login timeout on Network Portal 0.0.0.0:3260 .. kernel: tx_data returned -104, expecting 48. .. kernel: iSCSI Login negotiation failed. ... .. : Call Trace: .. : __schedule+0x2ab/0x880 .. : schedule+0x32/0x80 .. : schedule_timeout+0x1d8/0x300 .. : ? _cond_resched+0x15/0x30 .. : ? __kmalloc_track_caller+0x5c/0x220 .. : wait_for_completion+0x123/0x190 .. : ? wake_up_q+0x70/0x70 .. : iscsi_check_for_session_reinstatement+0x20e/0x270 [iscsi_target_mod] .. : iscsi_target_do_login+0xe8/0x5d0 [iscsi_target_mod] .. : iscsi_target_do_login_rx+0x266/0x340 [iscsi_target_mod] .. : ? iscsi_target_check_login_request+0x170/0x170 [iscsi_target_mod] .. : process_one_work+0x171/0x370 .. : worker_thread+0x49/0x3f0 .. : kthread+0xf8/0x130 .. : ? max_active_store+0x80/0x80 .. : ? kthread_bind+0x10/0x10 .. : ret_from_fork+0x1f/0x40 .. : INFO: task kworker/2:2:7618 blocked for more than 600 seconds. .. : Tainted: P OE 4.19.177-1c.el7.x86_64 #1 .. : "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. .. : kworker/2:2 D 0 7618 2 0x80000080 .. : Workqueue: events target_tmr_work [target_core_mod] .. : Call Trace: .. : __schedule+0x2ab/0x880 .. : schedule+0x32/0x80 .. : schedule_timeout+0x1d8/0x300 .. : ? __kmalloc_track_caller+0x5c/0x220 .. : wait_for_completion+0x123/0x190 .. : ? wake_up_q+0x70/0x70 .. : target_put_cmd_and_wait+0x66/0xa0 [target_core_mod] .. : core_tmr_lun_reset+0x540/0x760 [target_core_mod] .. : target_tmr_work+0xb0/0xf0 [target_core_mod] .. : process_one_work+0x171/0x370 .. : worker_thread+0x49/0x3f0 .. : kthread+0xf8/0x130 .. : ? max_active_store+0x80/0x80 .. : ? kthread_bind+0x10/0x10 .. : ret_from_fork+0x1f/0x40 Starting with IBM Spectrum Protect Plus 10.1.5, the vSnap uses by default the 4.19 Linux kernel which is affected by this defect. IBM Spectrum Protect Plus Versions Affected: IBM Spectrum Protect Plus 10.1.5 and later Additional Keywords: SPP, SPPLUS, TS005987845, iSCSI
Local fix
The immediate fix is to reboot the vSnap host. To avoid reoccurrence, use the available alternate 3.10 Linux kernel. To switch to that kernel, on the vSnap host : 1. sudo touch /etc/vsnap/skip_kernel_upgrade 2. Rerun the vSnap installer for the same build that is currently installed. It will switch to alternate kernel and will prompt for reboot at the end. 3. After reboot run "uname -r" and verify it shows kernel 3.10.xxx. 4. Run "vsnap_status" and verify all services started correctly.
Problem summary
**************************************************************** * USERS AFFECTED: * * IBM Spectrum Protect Plus level 10.1.8 * **************************************************************** * PROBLEM DESCRIPTION: * * See Error Description * **************************************************************** * RECOMMENDATION: * * Apply fixing level when available. This problem is currently * * projected to be fixed in IBM Spectrum Protect Plus level * * 10.1.9. Note that this is subject to change at the * * discretion of IBM. * ****************************************************************
Problem conclusion
The problem is caused by iSCSI handling bugs in the upstream Linux kernel version 4.19. vSnap uses upstream kernel version 4.19 as the default kernel and provides the standard RedHat/CentOS kernel version 3.10 as an alternate choice. The problem has been resolved by switching to the kernel version 3.10 as the default. Upon upgrade to the fix level, the vSnap installer installs the 3.10 as the default boot choice.
Temporary fix
Comments
APAR Information
APAR number
IT37459
Reported component name
SP PLUS
Reported component ID
5737SPLUS
Reported release
A18
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2021-06-30
Closed date
2021-09-29
Last modified date
2021-09-29
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Modules/Macros
vSnap iSCSI
Fix information
Fixed component name
SP PLUS
Fixed component ID
5737SPLUS
Applicable component levels
[{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"SSNQFQ","label":"IBM Spectrum Protect Plus"},"Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"A18","Line of Business":{"code":"LOB26","label":"Storage"}}]
Document Information
Modified date:
31 January 2024