2006-04-28 kernel 2.6.16 bug fix patch 02 ("October 2005")

If you download any software from this web site please be aware of the Warranty Disclaimer and Limitation of Liabilities.

linux-2.6.16-s390-02-october2005.tar.gz / MD5 ... accumulated patch, recommended (2006-04-28)

linux-2.6.16-s390-02-october2005-patches.tar.gz / MD5 ... per problem patches, recommended (2006-04-28)

These patches contain the following linux kernel bug fixes:

Description:
cio: Enable interrupts on error path.
Symptom:
Interrupts could stay disabled if an error occurred in _chp_add().
Problem:
Use of spin_unlock() instead of spin_unlock_irq().
Solution:
Use spin_unlock_irq().
Problem-ID:
23146
Description:
cio: I/O failing after CHPID is offline despite remaining CHPIDs.
Symptom:
I/O on a DASD is failing with no retry left after turning one channel path to the device off and on multiple times (via HMC), even though other channel paths still remain operational.
Problem:
Sporadic unsuccessful termination of pending I/O when CHPID is set offline.
Solution:
Fix race condition in termination logic.
Problem-ID:
23146
Description:
ctcmpc: Unknown symbol reported for extern callback functions.
Symptom:
snaipx, which is part of CSL, is unable to load ctcmpc devices.
Problem:
EXPORT_SYMBOL was missing for the extern functions.
Solution:
Add EXPORT_SYMBOL statements for each extern callback function.
Problem-ID:
14424
Description:
dasd: BIODASDENABLE ioctl never returns on unformatted device.
Symptom:
A program calling BIODASDENABLE on an unformatted device never returns.
Problem:
The DASD state-machine is not designed to enable an unformatted device, since 'unformatted' is a final state.
However, this final state was not detected, so the state-machine waited forever.
Note: To get such a device online it has to be re-analyzed. This means that the device needs to be disabled prior to re-enablement.
Solution:
Handle 'unformatted' as a final state and return.
Problem-ID:
-
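The idea of treating 'unformatted' as a final state can be sketched in user space as follows. This is only a model: the state names, struct dev_model and enable_device() are hypothetical and are not taken from the DASD driver.

```c
/* Hypothetical device states; ST_UNFMT and ST_ONLINE are final states. */
enum dev_state { ST_NEW, ST_KNOWN, ST_BASIC, ST_UNFMT, ST_READY, ST_ONLINE };

struct dev_model {
    enum dev_state state;
    int formatted;          /* non-zero if the disk carries a valid format */
};

/* Move the device one step towards ST_ONLINE. */
static void advance(struct dev_model *d)
{
    switch (d->state) {
    case ST_NEW:    d->state = ST_KNOWN; break;
    case ST_KNOWN:  d->state = ST_BASIC; break;
    case ST_BASIC:  d->state = d->formatted ? ST_READY : ST_UNFMT; break;
    case ST_READY:  d->state = ST_ONLINE; break;
    case ST_UNFMT:  /* final state, nothing to do */
    case ST_ONLINE: /* final state, nothing to do */
        break;
    }
}

/*
 * Returns 0 once the device is online, or -1 if it ended up unformatted.
 * Without the ST_UNFMT check the loop would never terminate for an
 * unformatted device, which models the hang described above.
 */
int enable_device(struct dev_model *d)
{
    while (d->state != ST_ONLINE) {
        if (d->state == ST_UNFMT)
            return -1;
        advance(d);
    }
    return 0;
}
```

In this model a caller that receives -1 would have to reset the device to ST_NEW (disable it) before trying again, which mirrors the note about re-analyzing the device.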
Description:
dasd: Misleading message when trying to set device offline.
Symptom:
Message "Can't offline dasd device with open count = 0" generated.
Problem:
Race condition in counter usage. In addition, an internal opener might prevent the device from being set offline.
Solution:
Fix race and generate more precise message.
Problem-ID:
-
Description:
dasd: Fix kernel panic with fail-fast requests on quiesced devices.
Symptom:
Kernel panic caused by oops in dasd_int_handler when using fail-fast flag in I/O request on a dasd disk which has been quiesced.
Problem:
Incorrect check in the dasd request start function causes requests with pending interruptions to be freed. Once the interruption occurs, arbitrary memory regions are accessed.
Solution:
Modify request start function to only work on requests in valid state.
Problem-ID:
22871
Description:
dcss: Print z/VM error code for segment_load, segment_type and segment_save.
Symptom:
segment_save, segment_load and segment_type may fail without printing a z/VM error code.
Problem:
Complicates problem determination.
Solution:
Print z/VM error code.
Problem-ID:
-
Description:
Kernel code does not make use of gcc __builtin functions.
Symptom:
Performance degradation.
Problem:
When building the kernel, the compile option -ffreestanding is always passed to the compiler. This option implies -fno-builtin, which causes the compiler to generate code without recognizing built-in functions (e.g. memcpy) for which it could generate optimized code.
Solution:
Omit compile option -ffreestanding.
Problem-ID:
-
Description:
kernel: Instruction processing damage handling.
Symptom:
Kernel hangs.
Problem:
In case of an instruction processing damage (IPD) machine check in kernel mode the resulting action is always to stop the kernel.
This is not necessarily the best solution since a retry of the failing instruction might succeed.
Solution:
Allow retries in case of IPD machine checks.
Problem-ID:
-
Description:
kernel: Missing error check on signal frame setup.
Symptom:
User space application does not return to kernel after signal handling.
Problem:
The return value of __put_user(), which writes the syscall opcode to user space, is not checked. Even if this __put_user() fails, the user space application continues to run instead of being sent a SIGSEGV.
Solution:
Check return value of __put_user().
Problem-ID:
23074 - Note: If you apply this patch, please also apply the next patch (23074 "kernel: Bug in setup_rt_frame()") on top.
Description:
kernel: Bug in setup_rt_frame().
Symptom:
Process terminates unexpectedly.
Problem:
setup_rt_frame() writes the wrong system call number on the stackframe. When the signal handler returns via this system call the kernel terminates the process.
Solution:
Use correct system call number.
Problem-ID:
23074
Description:
kernel: Signal handling bug.
Symptom:
Process crashes on signal delivery.
Problem:
If a signal handler has been established with the SA_ONSTACK option but no alternate stack is provided with sigaltstack(), the kernel still tries to install the alternate stack. Likewise, when an alternate stack is set with sigaltstack() and the SS_DISABLE flag, the kernel tries to install the alternate stack on signal delivery.
Solution:
Use the correct condition sas_ss_flags() to check if the alternate stack has to be used.
Problem-ID:
23355
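The decision described above can be modelled in user space roughly as follows. This is a sketch, not the patched kernel code: struct task_model, the MODEL_* constants and pick_signal_sp() are hypothetical stand-ins, and the helper is assumed to behave like the kernel's sas_ss_flags() (disabled when no alternate stack is set, on-stack when the stack pointer already lies inside it).

```c
#include <stddef.h>

/* Model values standing in for SS_ONSTACK, SS_DISABLE and SA_ONSTACK. */
enum { MODEL_SS_ONSTACK = 1, MODEL_SS_DISABLE = 2 };
enum { MODEL_SA_ONSTACK = 0x08000000 };

/* Hypothetical stand-in for the per-task alternate stack settings. */
struct task_model {
    unsigned long sas_ss_sp;   /* base address of the alternate stack */
    size_t sas_ss_size;        /* 0 means no alternate stack is available */
};

/* True if sp already points into the alternate stack. */
static int on_sig_stack(const struct task_model *t, unsigned long sp)
{
    return sp - t->sas_ss_sp < t->sas_ss_size;
}

/* 0 only if an alternate stack exists and is not currently in use. */
static int sas_ss_flags(const struct task_model *t, unsigned long sp)
{
    if (t->sas_ss_size == 0)
        return MODEL_SS_DISABLE;
    return on_sig_stack(t, sp) ? MODEL_SS_ONSTACK : 0;
}

/* Choose the stack pointer on which the signal frame is built. */
unsigned long pick_signal_sp(const struct task_model *t, unsigned long sp,
                             unsigned long sa_flags)
{
    if ((sa_flags & MODEL_SA_ONSTACK) && sas_ss_flags(t, sp) == 0)
        return t->sas_ss_sp + t->sas_ss_size;  /* top of alternate stack */
    return sp;                                 /* stay on the current stack */
}
```

With this check, a handler registered with SA_ONSTACK simply runs on the current stack when no usable alternate stack exists.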
Description:
kernel: RCU handling delays.
Symptom:
Unexpected delays.
Problem:
The kernel handles RCU batches. If a batch is finished on CPU 1 but still in progress on CPU 2, CPU 1 could decide to enter a tickless wait state even though the next batch is already pending. Because CPU 2 schedules the next batch, CPU 1 would not notice when that batch becomes ready. This can result in delays.
Solution:
Check for pending RCU batches and do not enter tickless wait state if there is a batch pending.
Problem-ID:
-
Description:
net: initcall order.
Symptom:
Kernel hangs or is unstable if network code is compiled in.
Problem:
If the network code is compiled into the kernel the initcalls of all network drivers are done before inet_init(). This is the wrong order since the drivers rely on data structures that get initialized from inet_init().
Solution:
Raise priority of inet_init() initcall from device_initcall() to fs_initcall().
Problem-ID:
22969
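The effect of the initcall level can be illustrated with a small user-space model. All names here are hypothetical; the level numbers are assumed to follow the usual kernel ordering in which fs_initcall() runs before device_initcall(), and calls on the same level run in list order, standing in for link order.

```c
#include <stddef.h>

/* Assumed ordering: a lower level runs earlier. */
enum { LEVEL_FS = 5, LEVEL_DEVICE = 6, LEVEL_MAX = 7 };

struct initcall {
    int level;
    void (*fn)(void);
};

static int inet_ready;       /* set once the network core is initialized */
static int driver_saw_inet;  /* what the driver observed during its init */

static void inet_init_model(void)       { inet_ready = 1; }
static void net_driver_init_model(void) { driver_saw_inet = inet_ready; }

/* Run all calls level by level; within a level, keep the list order. */
static void run_initcalls(const struct initcall *calls, size_t count)
{
    int level;
    size_t i;

    for (level = 0; level <= LEVEL_MAX; level++)
        for (i = 0; i < count; i++)
            if (calls[i].level == level)
                calls[i].fn();
}

/* Returns 1 if the driver found the network core initialized, else 0. */
int boot_model(int inet_level)
{
    struct initcall calls[2];

    calls[0].level = LEVEL_DEVICE;      /* driver is listed first */
    calls[0].fn = net_driver_init_model;
    calls[1].level = inet_level;
    calls[1].fn = inet_init_model;

    inet_ready = 0;
    driver_saw_inet = 0;
    run_initcalls(calls, 2);
    return driver_saw_inet;
}
```

In this model, boot_model(LEVEL_DEVICE) lets the driver run before the network core, while boot_model(LEVEL_FS) guarantees the core is ready first.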
Description:
qdio: I/O stall with zfcp in low-memory situation.
Symptom:
SCSI I/O stall in low-memory situation.
Problem:
qdio allocates memory with GFP_KERNEL during qdio_establish and qdio_shutdown. zfcp must call these functions when performing error recovery for an adapter. This can lead to a situation where qdio waits for memory and zfcp (SCSI) waits for end of its error recovery. In case zfcp (SCSI) is needed to swap pages this can lead to an I/O stall.
Solution:
Avoid memory allocation with GFP_KERNEL in qdio_establish and qdio_shutdown and introduce memory pool to allow these qdio operations in low memory situations.
Problem-ID:
22223
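The memory pool idea can be sketched as a fixed set of elements reserved up front, so that a later allocation never has to wait for the page allocator. This is a simplified user-space model with hypothetical names and sizes; it does not reproduce the kernel's mempool interface or the actual qdio change.

```c
#include <stddef.h>
#include <string.h>

#define POOL_ELEMENTS 4     /* hypothetical pool depth */
#define ELEMENT_SIZE  256   /* hypothetical element size in bytes */

/* All storage is reserved when the pool is created. */
struct pool {
    unsigned char storage[POOL_ELEMENTS][ELEMENT_SIZE];
    unsigned char in_use[POOL_ELEMENTS];
};

void pool_init(struct pool *p)
{
    memset(p, 0, sizeof(*p));
}

/* Never blocks: returns a reserved element or NULL if all are taken. */
void *pool_alloc(struct pool *p)
{
    size_t i;

    for (i = 0; i < POOL_ELEMENTS; i++) {
        if (!p->in_use[i]) {
            p->in_use[i] = 1;
            return p->storage[i];
        }
    }
    return NULL;
}

/* Return an element to the pool; unknown pointers are ignored. */
void pool_free(struct pool *p, void *elem)
{
    size_t i;

    for (i = 0; i < POOL_ELEMENTS; i++) {
        if ((void *)p->storage[i] == elem) {
            p->in_use[i] = 0;
            return;
        }
    }
}
```

The trade-off is that the pool holds memory permanently, in exchange for establish and shutdown operations that can proceed while the system is short on memory.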
Description:
qeth: qethconf not adding IPv4 addresses.
Symptom:
/var/log/messages "kernel: qeth: Invalid IP address format!"
Problem:
Incorrect syntax checking of IPv4 addresses.
Solution:
Change result checking of sscanf invocation.
Problem-ID:
22637
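A minimal sketch of checking the sscanf result when parsing a dotted IPv4 address is shown below. parse_ipv4() is a hypothetical helper, not the function used by the qeth driver; it only illustrates that the number of converted fields must be compared against the expected count.

```c
#include <stdio.h>

/*
 * Parse "a.b.c.d" into four bytes.
 * Returns 0 on success and -1 on malformed input.
 */
int parse_ipv4(const char *text, unsigned char addr[4])
{
    unsigned int part[4];
    char trailing;
    int converted;
    int i;

    /* The extra %c only converts if something follows the fourth field. */
    converted = sscanf(text, "%u.%u.%u.%u%c",
                       &part[0], &part[1], &part[2], &part[3], &trailing);
    if (converted != 4)
        return -1;          /* too few fields or trailing characters */

    for (i = 0; i < 4; i++)
        if (part[i] > 255)
            return -1;      /* each field must fit into one byte */

    for (i = 0; i < 4; i++)
        addr[i] = (unsigned char)part[i];
    return 0;
}
```

Note that this sketch also rejects input with a trailing newline, so a caller reading from a sysfs-style attribute would need to strip it first.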
Description:
qeth: Race condition possible during device recovery.
Symptom:
Kernel panic while device is recovering.
Problem:
While the device is recovering, the network stack could still send packets and is even able to call qeth's hard_start_xmit routine. Using netif_stop_queue outside dev->hard_start_xmit is dangerous, because the network stack might hold the xmit_lock while still sending a packet.
In addition, netif_wake_queue is called while setting the qeth device online, although the device is not in UP state.
Solution:
Only call netif_wake_queue when the device is up. Use netif_tx_disable rather than netif_stop_queue outside of the qeth_hard_start_xmit routine.
Problem-ID:
23195
Description:
qeth: System crash during data transmission.
Symptom:
Addressing exception when sending data with qeth driver.
Problem:
qeth_hard_start_xmit() prepares the skb for the OSA-card and turns it over to the OSA-card. Afterwards it updates some statistical data. To do this it makes use of skb-values. But the skb might already be freed by the qeth_qdio_output_handler.
Solution:
Before handing the skb over to the OSA-card, save the skb-values that are still needed afterwards.
Problem-ID:
23458
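The pattern behind this fix can be sketched in user space as follows. struct packet, struct tx_stats, hand_over() and send_packet() are hypothetical stand-ins for the skb, the driver statistics and the transmit path; the point is only that every value needed later is copied before ownership of the buffer is given away.

```c
#include <stdlib.h>

struct packet {
    size_t len;             /* payload length in bytes */
};

struct tx_stats {
    unsigned long tx_packets;
    unsigned long tx_bytes;
};

/*
 * Stand-in for handing the buffer to the hardware. The completion
 * handler may free the packet before this function returns, which is
 * modelled here by freeing it immediately.
 */
static int hand_over(struct packet *p)
{
    free(p);
    return 0;
}

int send_packet(struct packet *p, struct tx_stats *stats)
{
    size_t len = p->len;    /* save while the packet is still valid */
    int rc = hand_over(p);  /* p must not be dereferenced after this */

    if (rc == 0) {
        stats->tx_packets++;
        stats->tx_bytes += len;   /* uses the saved copy, not p->len */
    }
    return rc;
}
```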
Description:
z90crypt: Analysis revealed a possible memory overlay.
Symptom:
Kernel oops.
Problem:
Incorrect size for an array.
Solution:
Use sizeof() as the array dimension.
Problem-ID:
22773
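As a general illustration of using sizeof() as an array dimension, the sketch below sizes a scratch buffer from the structure it has to hold, so the two cannot drift apart. struct msg_header and stage_header() are hypothetical and are not taken from the z90crypt driver.

```c
#include <string.h>

/* Hypothetical message header that must fit into the scratch buffer. */
struct msg_header {
    unsigned char type;
    unsigned char flags;
    unsigned short length;
    unsigned int id;
};

/*
 * Copy a header through a scratch buffer into out.
 * Returns the number of bytes written, or 0 if out is too small.
 */
size_t stage_header(const struct msg_header *hdr,
                    unsigned char *out, size_t out_len)
{
    /* Dimension derived from the type instead of a hard-coded number. */
    unsigned char scratch[sizeof(struct msg_header)];

    if (out_len < sizeof(scratch))
        return 0;

    memcpy(scratch, hdr, sizeof(scratch));
    memcpy(out, scratch, sizeof(scratch));
    return sizeof(scratch);
}
```

If a field is later added to the structure, the buffer grows with it, which avoids the kind of overlay that a stale hard-coded size can cause.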
Description:
z90crypt: Analysis revealed unreachable code.
Symptom:
Error messages never generated under certain conditions.
Problem:
Extraneous check of variable for non-zero value.
Solution:
Remove extraneous check.
Problem-ID:
22772

Everybody should apply this patch.

To create the complete linux kernel sources, the following patches need to be applied in sequence:

linux-2.6.16.tar.gz (from http://www.kernel.org/pub/linux/kernel/v2.6)
+ linux-2.6.16-s390-base-october2005.diff (IBM)
+ linux-2.6.16-s390-01-october2005.diff (IBM)
+ linux-2.6.16-s390-02-october2005.diff (IBM)

Contact the IBM team

If you want to contact the Linux on System z IBM team, refer to the Contact the Linux on System z IBM team page.