On our customer's 9117-MMA (p570), a production client LPAR crashed some days ago. We opened a PMR, but they said that the VIO server behaved normally.
The situation is as follows:
We have two VIO servers (2.1, fix 20.1) that map LUNs from two DS3400s to the client LPARs. vio1 maps from DS1 and vio2 from DS2. Mirroring is performed by the client LPAR (redhat 2.6.18-181).
On DS1, a disk failed in a SATA array (6 disks, 5+1 RAID 5, disk size 1 TB). Based on the error log on vio1, this failure coincided with the following entry:
LABEL: CLIENT_FAILURE
Date/Time: Thu Aug 13 00:10:00 GMT+02:00 2009
Sequence Number: 266
Machine Id: 00C85B8D4C00
Node Id: learn-570-vio1
Resource Name: vhost3
Misbehaved Virtual SCSI Client
Bad IU, or SRP Violation
Bad IU, or SRP Violation
Remove Virtual SCSI Client, then Configure the same instance
module: trans_event rc: 00000000FFFFFFD8 location: 00000502
data: 2 2 0 0 0
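As a small aid for reading the entry above: the `rc` field is printed as 64-bit hex, but its low 32 bits look like a negative number in two's complement. The sketch below decodes it under that assumption. `decode_rc` is a hypothetical helper name, and what the resulting value means to the VIOS is not stated anywhere in this thread.

```shell
#!/bin/sh
# decode_rc: interpret the low 32 bits of a hex return code as a signed
# 32-bit integer. Assumption: the rc is a sign-extended 32-bit value.
decode_rc() {
    raw=$(( 0x$1 & 0xFFFFFFFF ))          # keep only the low 32 bits
    if [ "$raw" -ge 2147483648 ]; then    # high bit set: negative value
        echo $(( raw - 4294967296 ))      # subtract 2^32
    else
        echo "$raw"
    fi
}

decode_rc 00000000FFFFFFD8   # prints -40
```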
The client LPAR hung after this message (I/O hang); we could not log in any more, neither via TCP nor via the virtual console from the HMC. We had to reboot the client. The mirror is broken, so we will fix the problem with vhost3 tomorrow.
We do not have IBM Linux software support, so IBM closed the PMR again.
Does anybody have an idea why this failure can occur?
Health check interval (we use MPIO, not RDAC): IBM software support said we should change the health check interval on the VIO side from 80 to 0, i.e. deactivate the health check!?
Could it be that the DS3400 SATA array is too slow? We set the queue_depth of the VIOS disks to 16. Is this too high?
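For anyone comparing settings, here is a minimal sketch of how these two attributes could be inspected on the VIOS from the padmin shell. The helper only prints the commands so they can be reviewed before pasting. `hdisk4` is a placeholder disk name, `print_vios_checks` is a hypothetical helper, and the last command merely reproduces what support suggested; it is not a recommendation.

```shell
#!/bin/sh
# print_vios_checks: print (not run) the padmin commands for one VIOS disk.
# Assumption: the disk is an MPIO hdisk exposing queue_depth and
# hcheck_interval attributes; the disk name is passed as $1.
print_vios_checks() {
    disk=$1
    # Show the current values
    echo "lsdev -dev $disk -attr queue_depth"
    echo "lsdev -dev $disk -attr hcheck_interval"
    # Support's suggestion: disable the health check. -perm changes only
    # the device database, so it takes effect after the next restart.
    echo "chdev -dev $disk -attr hcheck_interval=0 -perm"
}

print_vios_checks hdisk4
```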
Has anyone already had a similar case?
I only found an old case, http://ozlabs.org/pipermail/linuxppc64-dev/2004-November/002699.html , where they say: The symptom is that some large I/Os will fail the adapter (putting it
Best regards, and thank you for any help!
vio 2.1: crash client lpar : misbehaved virt scsi client
Re: vio 2.1: crash client lpar : misbehaved virt scsi client
SystemAdmin, 2009-08-31T13:37:18Z, in response to walterchen_austria
Can you post the output of:
> uname -a
> cat /etc/redhat-release
The kernel version you quoted, redhat 2.6.18-181, doesn't look familiar. Thanks.