IBM Support

Common causes of VIOS_VSCSI_HOST error in VIOS from vhost adapter

Question & Answer


Question

What is the cause of VIOS_VSCSI_HOST error logged in VIOS error log?

Cause

This document discusses some common causes.

Answer

VIOS_VSCSI_HOST error with "rc = 0x0000000000000005"

Multiple virtual SCSI server (vhost#) adapters are configured, but the VIOS_VSCSI_HOST error is not logged against any one of them in particular.
The error is logged continuously, about once per minute.
Sample error:
 Mar 18 07:55:53 vhost      T VIOS_VSCSI_HOST     Virtual SCSI Host Adapter detected an error
 Mar 18 07:54:53 vhost      T VIOS_VSCSI_HOST     Virtual SCSI Host Adapter detected an error
 Mar 18 07:53:53 vhost      T VIOS_VSCSI_HOST     Virtual SCSI Host Adapter detected an error
 Mar 18 07:52:53 vhost      T VIOS_VSCSI_HOST     Virtual SCSI Host Adapter detected an error
 Mar 18 07:51:53 vhost      T VIOS_VSCSI_HOST     Virtual SCSI Host Adapter detected an error
 Mar 18 07:50:53 vhost      T VIOS_VSCSI_HOST     Virtual SCSI Host Adapter detected an error
LABEL:		VIOS_VSCSI_HOST
IDENTIFIER:	B58C686F

Date/Time:       Wed Mar 18 07:55:53 2020
Sequence Number: 440784
Machine Id:      00F69BD64C00
Node Id:         <VIOS>
Class:           S
Type:            TEMP
WPAR:            Global
Resource Name:   vhost

Description
Virtual SCSI Host Adapter detected an error

Probable Causes
Virtual SCSI Host Adapter Driver has detected a possible problem

Failure Causes
Virtual SCSI Host Adapter Driver has detected a possible problem

	Recommended Actions
	Remove Virtual SCSI Host Adapter Instance, then Configure the same instance

Detail Data
ERNUM
1000 00DC                                           [....                            ]
ABSTRACT
IOCINFO ioctl failed
AREA
File System
BUILD INFO
BLD: 1708 16-09:53:45 m2017_33A7
LOCATION
Filename:target_dev.c Function:cap_thread Line:4214
DATA
rc = 0x0000000000000005

Duplicates
Number of duplicates
          11
Time of first duplicate
Wed Mar 18 07:55:43 2020
Time of last duplicate
Wed Mar 18 07:55:53 2020
This error is not logged against a particular vhost because it does not occur on a specific vhost.
It is generated while the target kproc on the Virtual I/O Server checks the capacity of a physical device mapped as a Virtual Target Device (VTD). The check runs every 60 seconds.
To check the capacity, the kproc sends an IOCINFO ioctl to the physical device. In this case, the ioctl failed with EIO, an I/O error (rc = 0x0000000000000005).
This return code indicates that the device is not ready or not accessible. Consequently, the cause of the error lies outside the VIOS, at the end device, and is commonly a storage problem.
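The rc value in the Detail Data is a hexadecimal errno. As a minimal sketch, the conversion to decimal can be done with shell arithmetic. The function name rc_to_errno is hypothetical, and only the two errno names mentioned in this document are mapped. It is intended for small errno-style values, not for long driver-specific codes such as 0xEEEE0000CC988024.

```shell
# Hypothetical helper: convert an "rc = 0x..." value to a decimal errno.
# Only the errno names cited in this document are mapped; any other value
# is left for lookup in /usr/include/sys/errno.h.
rc_to_errno() {
    hex=${1#0x}              # drop the 0x prefix
    dec=$((0x$hex))          # hexadecimal to decimal
    case $dec in
        5)  name=EIO ;;
        47) name=EWRPROTECT ;;
        *)  name="see /usr/include/sys/errno.h" ;;
    esac
    echo "$dec $name"
}

rc_to_errno 0x0000000000000005    # prints: 5 EIO
```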
 
Best Practice Recommendations
  1. Remove all unused vhost adapters.  For example, if there is a vhost adapter mapped to a virtual client but there are no Virtual Target Devices configured on the vhost (lsmap -vadapter vhost#), remove it (rmdev -dev vhost#).
  2. If there is any SAN storage configured but not in use, it is recommended to remove it.
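To find candidates for the first recommendation, the output of lsmap -all can be filtered for adapters that report no Virtual Target Device. The following is a sketch only. The function name unused_vhosts is hypothetical, the sample input is illustrative (vhost51 and its location code are invented), and it assumes lsmap prints "NO VIRTUAL TARGET DEVICE FOUND" on the VTD line of an empty adapter. Review each adapter before you remove it.

```shell
# Hypothetical helper: read lsmap output on stdin and print every vhost
# adapter whose VTD line reports that no virtual target device exists.
unused_vhosts() {
    awk '
        /^vhost[0-9]+/ { adapter = $1 }
        /^VTD/ && /NO VIRTUAL TARGET DEVICE FOUND/ { print adapter }
    '
}

# Illustrative input; on a VIOS the input would come from "lsmap -all".
unused_vhosts <<'EOF'
SVSA            Physloc                                      Client Partition ID
--------------- -------------------------------------------- ------------------
vhost50         U9117.MMC.0615D07-V3-C750                    0x0000000d

VTD                   my_vtd_name
Status                Available
Backing device        hdisk390

SVSA            Physloc                                      Client Partition ID
--------------- -------------------------------------------- ------------------
vhost51         U9117.MMC.0615D07-V3-C751                    0x00000000

VTD                   NO VIRTUAL TARGET DEVICE FOUND
EOF
```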
Recommended Action Plan
Contact your SAN administrator for further investigation.

VIOS_VSCSI_HOST error with "rc = 0xEEEE0000CC988024"

LABEL:        VIOS_VSCSI_HOST
IDENTIFIER:    B58C686F
Date/Time:       Wed Aug  2 16:51:03 2023
Sequence Number: 1743
Machine Id:      000491C47A00
Node Id:         apdvh678
Class:           S
Type:            TEMP
WPAR:            Global
Resource Name:   vhost11
Description
Virtual SCSI Host Adapter detected an error
Probable Causes
Virtual SCSI Host Adapter Driver has detected a possible problem
Failure Causes
Virtual SCSI Host Adapter Driver has detected a possible problem
    Recommended Actions
    Remove Virtual SCSI Host Adapter Instance, then Configure the same instance
Detail Data
ERNUM
1000 013F                                           [...?                            ]
ABSTRACT
Error while sending client information to vio daemon via VKE interface
AREA
VKE
BUILD INFO
BLD: 2207 28-10:32:29 z2022_30A6
LOCATION
Filename:target_ngdisk.c Function:ngdisk_send_client_info Line:13928
DATA
rc = 0xEEEE0000CC988024
This error decodes to:
Aug  2 16:51:03 vhost11    T VIOS_VSCSI_HOST     COMMON_ERR_013F Virtual SCSI Host Adapter detected an error Error while sending client information to vio daemon via VKE interface [target_ngdisk.c ngdisk_send_client_info()] [rc EEEE0000CC988024]
Probable Cause
rc EEEE0000CC988024 indicates that the client LPAR information could not be sent to VKE because of ENOTCONN_VKE_SND_CINFO. Either the vio_daemon was down when the driver tried to send the client information to it (via a socket), or the VKE daemon was not yet in the ready state, so the client LPAR information could not be passed to it. This temporary error is informational in nature.
If the VIOS is part of a Shared Storage Pool cluster and the vhost in question has Logical Units (LUs) configured as backing devices, review the VIOS error log for SSP cluster errors around the time the VIOS_VSCSI_HOST error was logged. 
If cluster errors are found around the time the VIOS_VSCSI_HOST error was reported, determine the current cluster state.  As padmin, run:
$ cluster -status -clustername my_cluster_name -verbose
If the cluster shows problems, contact your local IBM Support Representative for investigation.
If the cluster reported issues when the VIOS_VSCSI_HOST error was logged, and those issues have since been corrected (the cluster command reports everything "OK"), the temporary error can be considered informational.

VIOS_VSCSI_HOST error with "rc = 0x0000000000000016"

The VIO client is unable to access the virtual SCSI disk associated with the virtual target device name (vtd_name) specified in the Detail Data. In this example, the VTD name is "my_vtd_name".
LABEL:           VIOS_VSCSI_HOST
IDENTIFIER:      B58C686F
Date/Time:       Sat Jun 23 23:53:31 2018
Sequence Number: 4634961
Machine Id:      00C15D074C00
Node Id:         <VIOS_name>
Class:           S
Type:            TEMP
WPAR:            Global
Resource Name:   vhost50
Description
Virtual SCSI Host Adapter detected an error
Probable Causes
Virtual SCSI Host Adapter Driver has detected a possible problem
Failure Causes
Virtual SCSI Host Adapter Driver has detected a possible problem
    Recommended Actions
    Remove Virtual SCSI Host Adapter Instance, then Configure the same instance
Detail Data
ERNUM
1000 00C6 [....]
ABSTRACT
Error in configuration method for this virtual device
AREA
Config
BUILD INFO
BLD: 1504 10-09:39:00 d2015_15A7
LOCATION
Filename:target_dev.c Function:add_child Line:1448
DATA
rc = 0x0000000000000016     cfg_failed = 0x00000024     vtd_name = my_vtd_name     dev_name = vtd_
CAUSE
This error indicates a configuration issue for the vtd_name, my_vtd_name.
The 'rc = 0x0000000000000016' error indicates an I/O issue with the backing device associated with the vtd_name.
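The DATA line of the Detail Data is a sequence of "name = value" pairs. As a small sketch, a single field such as vtd_name can be pulled out of that line with awk. The function name detail_field is hypothetical and is not part of any VIOS tooling.

```shell
# Hypothetical helper: print the value of one "name = value" pair from a
# Detail Data line read on stdin.
detail_field() {
    awk -v key="$1" '{
        for (i = 1; i <= NF - 2; i++)
            if ($i == key && $(i + 1) == "=") { print $(i + 2); exit }
    }'
}

# DATA line copied from the sample error above.
data='rc = 0x0000000000000016     cfg_failed = 0x00000024     vtd_name = my_vtd_name     dev_name = vtd_'
printf '%s\n' "$data" | detail_field vtd_name    # prints: my_vtd_name
```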
RECOMMENDATION

1) Determine the backing device associated with the VTD name in the Detail Data by using the lsmap command for the vhost# the error was logged against. In this case, vtd_name is my_vtd_name on vhost50.

$ lsmap -vadapter vhost50
SVSA            Physloc                                      Client Partition ID
--------------- -------------------------------------------- ------------------
vhost50         U9117.MMC.0615D07-V3-C750                    0x0000000d
...
VTD                   013d11FCd20D4CF
Status                Defined
LUN                   0x9400000000000000
Backing device        hdisk390
Physloc               
Mirrored              N/A
...
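On an adapter with many mappings, the backing device for one VTD can be picked out of the lsmap output with awk. This is a sketch that assumes the "VTD" and "Backing device" line layout shown above. The function name backing_device_for_vtd is hypothetical.

```shell
# Hypothetical helper: read lsmap output on stdin and print the backing
# device listed under the VTD name given as the first argument.
backing_device_for_vtd() {
    awk -v want="$1" '
        /^VTD/            { vtd = $2 }
        /^Backing device/ { if (vtd == want) print $3 }
    '
}

# Input copied from the lsmap sample above.
backing_device_for_vtd 013d11FCd20D4CF <<'EOF'
vhost50         U9117.MMC.0615D07-V3-C750                    0x0000000d
VTD                   013d11FCd20D4CF
Status                Defined
LUN                   0x9400000000000000
Backing device        hdisk390
EOF
# prints: hdisk390
```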

2) Ensure the backing device is accessible.

If the backing device is an hdisk, determine whether the VIOS can access the device by testing disk I/O with the dd or lquerypv command from the oem_setup_env shell, as shown in the following examples.

Method 1

$ oem_setup_env
# dd if=/dev/hdisk20 of=/dev/null count=1


     The expected "OK" output should include messages like:
     1+0 records in.
     1+0 records out.

     If the dd command fails, correct the disk I/O problem and try again.

     Example failure:

dd: /dev/hdisk35: The requested resource is busy

Method 2

# lquerypv -h /dev/hdisk20 80 10


   Expected output is one line. For example,

00000080   00110010 00120011 00130012 00140013  |................|

If lquerypv returns to the prompt with no output, or returns an error, there is a disk I/O problem that must be corrected.
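When several backing devices need to be checked, the dd test from Method 1 can be wrapped in a small function. This is a sketch that assumes the exit status of dd reflects whether the read succeeded. The function name disk_readable is hypothetical, and it is intended to be run from the oem_setup_env shell.

```shell
# Hypothetical helper: try to read one 512-byte block from a device and
# report the result. Returns non-zero when dd fails (for example, when
# the device is busy or returns an I/O error).
disk_readable() {
    if dd if="$1" of=/dev/null bs=512 count=1 >/dev/null 2>&1; then
        echo "$1: read OK"
    else
        echo "$1: read FAILED"
        return 1
    fi
}

# Example use (device name taken from Method 1):
#   disk_readable /dev/hdisk20
```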

Note: From the VIOS/OS perspective, a reserve is placed on the disk whenever the disk reserve policy is set to single_path. For example, if the hdisk# (a SAN disk) is set to single_path on VIOS1, the device "busy" message is expected when you try to access the disk from VIOS2. To avoid the "busy" error, the SAN disk reserve_policy attribute must be set to no_reserve on VIOS1. This reserve policy value is required on both Virtual I/O Servers in a vSCSI MPIO configuration.

  • If the disk is set to no_reserve on both Virtual I/O Servers, yet you still get the device busy message or an I/O error, contact your local Storage Support Representative to investigate what is holding the device.
 

To verify the disk reserve policy value, use the lsdev command as follows:

$ lsdev -dev hdisk390 -attr reserve_policy
 
  reserve_policy no_reserve Reserve Policy True

3) Ensure the backing device is writable.

$ oem_setup_env
# alog -ot cfg | grep 'failed to open'
In the following sample output, errno 47 (EWRPROTECT) indicates that hdisk2 and hdisk6 are write-protected.
M0 2883734 cfg_vtdev_scdisk.c 454 open_head_driver : failed to open /dev/hdisk2 (Not client reserve) parent:vhost0 vtdev:vtscsi0 errno 47
M0 3145844 cfg_vtdev_scdisk.c 454 open_head_driver : failed to open /dev/hdisk6 (Not client reserve) parent:vhost0 vtdev:vtscsi4 errno 47
   This is a problem because VSCSI requires read-write access to the backing device.
   (Other errno values can be found in /usr/include/sys/errno.h)
   To determine the cause of write-protection, contact your Storage Support Representative for investigation.
   To determine the storage type, run:
   $ lsdev -type disk|grep hdisk#
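To summarize a long configuration log, the device name and errno can be extracted from each 'failed to open' line. This sketch assumes the line layout shown in the sample output above. The function name open_failures is hypothetical.

```shell
# Hypothetical helper: read alog output on stdin and print
# "<device> <errno>" for each 'failed to open' line.
open_failures() {
    awk '/failed to open/ {
        dev = ""; err = ""
        for (i = 2; i < NF; i++) {
            if ($i == "open" && $(i - 1) == "to") dev = $(i + 1)
            if ($i == "errno") err = $(i + 1)
        }
        print dev, err
    }'
}

# Input copied from the sample output above; on a VIOS the input would
# come from "alog -ot cfg" in the oem_setup_env shell.
open_failures <<'EOF'
M0 2883734 cfg_vtdev_scdisk.c 454 open_head_driver : failed to open /dev/hdisk2 (Not client reserve) parent:vhost0 vtdev:vtscsi0 errno 47
M0 3145844 cfg_vtdev_scdisk.c 454 open_head_driver : failed to open /dev/hdisk6 (Not client reserve) parent:vhost0 vtdev:vtscsi4 errno 47
EOF
```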
Once I/O to the backing device (hdisk) has been re-established, run the cfgdev command to make the VTD status Available.
Note: The "dev_name" in the error details normally contains the backing device name. In this example, the value was "dev_name = vtd_" because the backing device (hdisk#) had been removed from the VIOS. The error was therefore valid, and it was corrected by removing the VTD (rmvdev -vtd VTD_name). In such cases, where the VTD in question is no longer needed, ensure the corresponding virtual SCSI disk is also removed on the client partition.


Document Information

Modified date:
30 May 2024

UID

ibm16173373