2006-09-07 kernel 2.6.16 bug fix patch 07 ("October 2005")

If you download any software from this web site please be aware of the Warranty Disclaimer and Limitation of Liabilities.

linux-2.6.16-s390-07-october2005.tar.gz / MD5 ... accumulated patch, recommended (2006-09-07)

linux-2.6.16-s390-07-october2005-patches.tar.gz / MD5 ... per-problem-patches, recommended (2006-09-07)

This patch contains the following linux kernel bug fixes:

Description:
cio: Disallow ccwgroup devices containing non-unique ccw devices.
Symptom:
After a user creates a ccwgroup device containing the same ccw device twice, oopses or VM abends occur.
Problem:
Treating the same device as two unique devices when performing operations leads to unpredictable results.
Solution:
Check for unique ccw devices in ccwgroup_create().
Problem-ID:
25802
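The uniqueness check described above can be illustrated in user space. This is only a sketch under stated assumptions: has_duplicate_bus_id() is a hypothetical helper that compares bus ID strings, and it is not the code that the patch adds to ccwgroup_create().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return true if any bus ID occurs more than once
 * in ids[0..n-1]. Groups are small, so a quadratic scan is sufficient. */
bool has_duplicate_bus_id(const char *const ids[], size_t n)
{
    assert(ids != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (strcmp(ids[i], ids[j]) == 0)
                return true;
    return false;
}
```

A caller would reject the group creation request when the helper returns true, before any per-device setup is done.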
Description:
cio: I/O stall due to lost interrupt after CHPID vary off/on cycle.
Symptom:
I/O on a CCW device may stall if a CHPID to that device is logically varied off/on. This may also apply to SE CHPID off/on cycles.
Problem:
A user I/O interrupt is misinterpreted as an interrupt for an internal path verification operation due to a missing check and is therefore never reported to the device driver.
Solution:
Correct check for pending interruptions before starting path verification.
Problem-ID:
26103
Description:
cio: Inconsistent values in channel measurement facility.
Symptom:
Values obtained from cmf do not form a consistent picture.
Problem:
Blocks copied from the hardware may not be consistent. avg_sample_interval was re-calculated on every read and did not contain values in nanoseconds.
Solution:
Copy the hardware block while in idle state and repeat the copy several times, if needed.
Store a timestamp when the last new block is received and use this value to calculate avg_sample_interval. Print avg_sample_interval in nanoseconds, as documented.
Problem-ID:
23427, 24855
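One common way to obtain a consistent copy of a block that hardware may update concurrently is to copy it twice and accept it only when both copies match. The sketch below assumes that technique; BLOCK_WORDS, copy_block_consistent() and avg_sample_interval_ns() are hypothetical names, and the interval formula (elapsed time up to the last received block divided by the sample count) is an assumption, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_WORDS 8   /* hypothetical size, not the real block layout */

/* Copy a block that may change underneath us. Two consecutive snapshots
 * that compare equal are treated as consistent. Returns 0 on success,
 * -1 if no stable snapshot was seen within max_tries attempts. */
int copy_block_consistent(const volatile uint32_t *hw, uint32_t *dst,
                          int max_tries)
{
    uint32_t first[BLOCK_WORDS], second[BLOCK_WORDS];

    assert(hw != NULL && dst != NULL);
    for (int t = 0; t < max_tries; t++) {
        for (int i = 0; i < BLOCK_WORDS; i++)
            first[i] = hw[i];
        for (int i = 0; i < BLOCK_WORDS; i++)
            second[i] = hw[i];
        if (memcmp(first, second, sizeof first) == 0) {
            memcpy(dst, second, sizeof second);
            return 0;
        }
    }
    return -1;
}

/* Assumed formula: time from measurement start to the last received
 * block, divided by the number of samples. All times in nanoseconds. */
uint64_t avg_sample_interval_ns(uint64_t start_ns, uint64_t last_update_ns,
                                uint64_t sample_count)
{
    if (sample_count == 0 || last_update_ns < start_ns)
        return 0;
    return (last_update_ns - start_ns) / sample_count;
}
```

Using the timestamp of the last received block, rather than the time of the read, keeps the reported interval stable between reads that see the same data.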
Description:
cio: module containing ccwgroup driver cannot be unloaded.
Symptom:
When trying to unload a module containing a ccwgroup driver, the rmmod process hangs.
Problem:
Driver core deadlocks when calling device_unregister() from driver_for_each_device().
Solution:
Use driver_find_device() instead of driver_for_each_device().
Problem-ID:
23575
Description:
cio: permanent subchannel busy conditions may cause I/O stall.
Symptom:
In special conditions where a subchannel rejects the HALT I/O instruction with a busy indication (cc 2), I/O may stall.
Problem:
I/O request termination logic retries HALT I/O indefinitely because it expects HALT I/O to alter the subchannel status, which does not happen when cc 2 is returned.
Solution:
In case of a busy indication, try CLEAR I/O instruction immediately.
Problem-ID:
25801
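The changed termination logic can be modelled with two callbacks. This is a simplified sketch: terminate_io(), io_op_fn and the two example callbacks are hypothetical stand-ins, and the real kernel interfaces and condition code handling differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type returning a condition code (0..3). */
typedef int (*io_op_fn)(void *subchannel);

enum { CC_OK = 0, CC_BUSY = 2 };

/* Try halt up to max_halt_retries times, but on a busy indication go to
 * clear immediately. Returns 0 if halt succeeded, 1 if clear succeeded,
 * -1 if neither did. */
int terminate_io(void *sch, io_op_fn halt, io_op_fn clear,
                 int max_halt_retries)
{
    assert(halt != NULL && clear != NULL);
    for (int i = 0; i < max_halt_retries; i++) {
        int cc = halt(sch);
        if (cc == CC_OK)
            return 0;
        if (cc == CC_BUSY)
            break;      /* status is unchanged, retrying halt is pointless */
    }
    return clear(sch) == CC_OK ? 1 : -1;
}

/* Example callbacks: a subchannel that is permanently busy for halt.
 * The subchannel pointer is used as a call counter here. */
int always_busy(void *sch) { int *calls = sch; (*calls)++; return CC_BUSY; }
int clear_ok(void *sch) { (void)sch; return CC_OK; }
```

With always_busy() as the halt callback, terminate_io() calls halt once and then falls through to clear instead of looping.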
Description:
dasd: kernel BUG when setting a DASD device offline.
Symptom:
The message 'kernel BUG at drivers/ll_rw_block.c:2xxx' is printed.
Problem:
A request that should be ended is still enqueued.
Solution:
First dequeue the request and then call end_request when flushing the request queue.
Problem-ID:
26118
Description:
dasd: Cleanup queue fails during offline processing.
Symptom:
The ioctl BIODASDDISABLE hangs.
Problem:
Cleanup of the internal cqr-queue did not work for several reasons.
Solution:
Fix clear_IO handling (wait for the interrupt) and introduce error handling in shutdown processing.
Problem-ID:
24511
Description:
iucv: multiple interfaces with same peer established.
Symptom:
"cat /sys/class/net/iucv*/device/user" may show the same peer more than once.
Problem:
Handling of 'echo <peer> > /sys/bus/iucv/drivers/netiucv/connection' does not check whether a connection with <peer> already exists.
Solution:
Add such a check in subroutines conn_write() and user_write() and reject creation of a second interface with the same peer.
Problem-ID:
25799
Description:
qeth: Set routing for IPv6 invalid on HiperSockets.
Symptom:
An error message is issued when a HiperSockets interface (hsi) is set online: 'Error 0001 while setting routing type on hsi.'
Problem:
During card setup for HiperSockets, qeth issues IPA SETRTG with IPv6.
Solution:
Omit IPA SETRTG with IPv6 when card does not support IPv6.
Problem-ID:
26144
Description:
qeth: kernel panic under heavy UDP workload.
Symptom:
Addressing exception in qeth_hard_start_xmit / skb_realloc_headroom.
Problem:
When sending an skb, qeth may need to re-allocate or copy this skb before turning it over to the network device (for instance with bonding setups). When this happens, qeth frees the original skb immediately and continues processing with its clone. Later on, qeth may find that all its output queue buffers are full due to high traffic load. qeth then gives up and returns an EBUSY condition to its caller. Typically, the caller retries sending the same skb in this case. If qeth has already freed this skb, a kernel panic occurs.
Solution:
Keep original skb until it has been successfully turned over to the network device.
Problem-ID:
26068
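The ownership rule in the solution above can be shown with a small user-space model. This is a sketch only: struct pkt and xmit_keep_original() are hypothetical stand-ins for an skb and the transmit path, and the queue-full condition is passed in as a flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an skb. */
struct pkt {
    size_t len;
    unsigned char data[64];
};

/* Transmit orig through a private copy. If the output queue is full,
 * only the copy is freed and -EBUSY is returned, so the caller can
 * safely retry with orig. orig is freed only after a successful
 * handover. */
int xmit_keep_original(struct pkt *orig, bool queue_full)
{
    struct pkt *copy;

    assert(orig != NULL);
    copy = malloc(sizeof *copy);
    if (copy == NULL)
        return -ENOMEM;         /* orig is still owned by the caller */
    memcpy(copy, orig, sizeof *copy);

    if (queue_full) {
        free(copy);
        return -EBUSY;          /* orig is still valid for a retry */
    }

    /* The handover to the device would happen here; the model just
     * releases both buffers. */
    free(copy);
    free(orig);
    return 0;
}
```

The point of the design is that a buffer is released only by the party that has definitely finished with it, so an error return never leaves the caller holding a freed pointer.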
Description:
qeth: race during setup of qeth device.
Symptom:
Kernel panic in qeth code (qeth_flush_buffers) when sending the first packet.
Problem:
The crash happens in qeth_main.c within qeth_flush_buffers() at the line
'queue->card->dev->trans_start = jiffies;'
because queue->card is still 0.
queue->card is initialized in qeth_init_qdio_queues(), but card->state is set to CARD_STATE_SOFTSETUP before that call (see __qeth_set_online()).
As a result, qeth_open() / qeth_hard_start_xmit() can already run while queue->card is not yet initialized.
Solution:
In __qeth_set_online() move setting of card->state to CARD_STATE_SOFTSETUP after invocation of qeth_init_qdio_queues().
Problem-ID:
25564
Description:
qeth: race when re-boot and recovery run concurrently.
Symptom:
Addressing exception: 0038 in qdio_shutdown.
Problem:
A qeth re-boot event may run on one CPU while a qeth recovery starts on another CPU for the same OSA card. This causes duplicate invocations of qeth_qdio_clear_card(), which, in a specific interleaving scenario, may result in a duplicate invocation of qdio_cleanup().
Solution:
Avoid duplicate invocation of qdio_cleanup() by changing card->qdio.state into an atomic variable.
Problem-ID:
26016
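A typical way to make sure that only one of two concurrent callers performs a cleanup is an atomic compare-and-exchange on the state variable. The sketch below uses C11 atomics in user space; claim_cleanup() and the state values are hypothetical, and the kernel uses its own atomic primitives rather than <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state values for illustration. */
enum { QDIO_ESTABLISHED = 1, QDIO_CLEANING = 2 };

/* Returns true for exactly one caller: the one that moves the state
 * from ESTABLISHED to CLEANING. Any other caller sees a different
 * state and must skip the cleanup. */
bool claim_cleanup(atomic_int *state)
{
    int expected = QDIO_ESTABLISHED;

    assert(state != NULL);
    return atomic_compare_exchange_strong(state, &expected, QDIO_CLEANING);
}
```

Only the caller for which claim_cleanup() returns true would go on to invoke the cleanup routine.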
Description:
qeth: stack trace with msg "inconsistent lock state".
Symptom:
message:
[ INFO: inconsistent lock state ]
inconsistent {in-softirq-W} -> {softirq-on-W} usage.
modprobe/758 [HC0[0]:SC0[0]:HE1:SE1] takes:
(&tbl->lock){-+-.}, at: [<000000008080c396>] qeth_init+0x396/0x4e4 [qeth]
Problem:
Kernel 2.6.17-1.2473.el5 activates a new service, the "Lock dependency validator", which detects this locking deficiency in the qeth code.
Solution:
Bottom halves (bh) must be disabled when accessing neighbour tables.
Problem-ID:
26014
Description:
zfcp: re-open adapter on do_QDIO error.
Symptom:
I/O stall during CHPID off/on cycles within the grace period.
Problem:
cio triggers shutdown of the qdio queues but does not call the no-path notifier for zfcp. Hence the qdio queues for an adapter are down, but zfcp does not re-open them; instead, zfcp receives -EBUSY when calling do_QDIO.
Solution:
Re-open qdio queues for an adapter if do_QDIO returns an error.
Problem-ID:
19628

Everybody should apply this patch.

To create the complete linux kernel sources, the following patches need to be applied in sequence:

linux-2.6.16.tar.gz (from http://www.kernel.org/pub/linux/kernel/v2.6)
+ linux-2.6.16-s390-base-october2005.diff (IBM)
+ linux-2.6.16-s390-01-october2005.diff (IBM)
+ linux-2.6.16-s390-02-october2005.diff (IBM)
+ linux-2.6.16-s390-03-october2005.diff (IBM)
+ linux-2.6.16-s390-04-october2005.diff (IBM)
+ linux-2.6.16-s390-05-october2005.diff (IBM)
+ linux-2.6.16-s390-06-october2005.diff (IBM)
