2006-04-28 kernel 2.6.16 bug fix patch 02 ("October 2005")
If you download any software from this web site please be aware of the Warranty Disclaimer and Limitation of Liabilities.
linux-2.6.16-s390-02-october2005.tar.gz / MD5 ... accumulated patch, recommended (2006-04-28)
linux-2.6.16-s390-02-october2005-patches.tar.gz / MD5 ... per problem patches, recommended (2006-04-28)
These patches contain the following linux kernel bug fixes:
- Description:
- cio: Enable interrupts on error path.
- Symptom:
- Interrupts could stay disabled if an error occurred in _chp_add().
- Problem:
- Use of spin_unlock() instead of spin_unlock_irq().
- Solution:
- Use spin_unlock_irq().
- Problem-ID:
- 23146
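The difference between the two unlock calls can be illustrated with a small user-space model. This is only a sketch: the struct, the model_* helpers and the error path are hypothetical stand-ins, not the cio code, and the model assumes the lock was taken with the interrupt-disabling variant.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU; the flags stand in for real kernel state. */
struct cpu_model {
    bool irqs_enabled;
    bool lock_held;
};

/* Models spin_lock_irq(): disable interrupts, then take the lock. */
static void model_spin_lock_irq(struct cpu_model *cpu)
{
    cpu->irqs_enabled = false;
    cpu->lock_held = true;
}

/* Models spin_unlock(): releases the lock only. */
static void model_spin_unlock(struct cpu_model *cpu)
{
    cpu->lock_held = false;
}

/* Models spin_unlock_irq(): releases the lock and re-enables interrupts. */
static void model_spin_unlock_irq(struct cpu_model *cpu)
{
    cpu->lock_held = false;
    cpu->irqs_enabled = true;
}

/* Hypothetical error path; reports whether interrupts end up enabled. */
bool error_path_leaves_irqs_enabled(bool use_irq_variant)
{
    struct cpu_model cpu = { .irqs_enabled = true, .lock_held = false };

    model_spin_lock_irq(&cpu);
    assert(cpu.lock_held);
    /* ... an error is detected here and the function bails out ... */
    if (use_irq_variant)
        model_spin_unlock_irq(&cpu);
    else
        model_spin_unlock(&cpu);
    return cpu.irqs_enabled;
}
```

With the plain unlock the model leaves interrupts disabled after the error path, which matches the symptom described above.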
- Description:
- cio: I/O failing after CHPID is offline despite remaining CHPIDs.
- Symptom:
- I/O on a DASD is failing with no retry left after turning one channel path to the device off and on multiple times (via HMC), even though other channel paths still remain operational.
- Problem:
- Sporadic unsuccessful termination of pending I/O when CHPID is set offline.
- Solution:
- Fix race condition in termination logic.
- Problem-ID:
- 23146
- Description:
- ctcmpc: Unknown symbol reported for extern callback functions.
- Symptom:
- snaipx, which is part of CSL, is unable to load ctcmpc devices.
- Problem:
- EXPORT_SYMBOL was missing for the extern functions.
- Solution:
- Add EXPORT_SYMBOL statements for each extern callback function.
- Problem-ID:
- 14424
- Description:
- dasd: BIODASDENABLE ioctl never returns on unformatted device.
- Symptom:
- A program calling BIODASDENABLE on an unformatted device never returns.
- Problem:
- The DASD state machine is not designed to enable an unformatted device, since 'unformatted' is a final state. This final state was not detected, so the state machine waited forever.
Note: To get such a device online, it has to be re-analyzed. This means that the device needs to be disabled before it is re-enabled.
- Solution:
- Handle 'unformatted' as a final state and return.
- Problem-ID:
- -
- Description:
- dasd: Misleading message when trying to set device offline.
- Symptom:
- Message "Can't offline dasd device with open count = 0" generated.
- Problem:
- Race condition in counter usage; an internal opener might also prevent setting the device offline.
- Solution:
- Fix race and generate more precise message.
- Problem-ID:
- -
- Description:
- dasd: Fix kernel panic with fail-fast requests on quiesced devices.
- Symptom:
- Kernel panic caused by oops in dasd_int_handler when using fail-fast flag in I/O request on a dasd disk which has been quiesced.
- Problem:
- Incorrect check in the dasd request start function causes requests with pending interruptions to be freed. Once the interruption occurs, arbitrary memory regions are accessed.
- Solution:
- Modify request start function to only work on requests in valid state.
- Problem-ID:
- 22871
- Description:
- dcss: Print z/VM error code for segment_load, segment_type and segment_save.
- Symptom:
- segment_save, segment_load and segment_type may fail without printing a z/VM error code.
- Problem:
- Complicates problem determination.
- Solution:
- Print z/VM error code.
- Problem-ID:
- -
- Description:
- Kernel code does not make use of gcc __builtin functions.
- Symptom:
- Performance degradation.
- Problem:
- When building the kernel, the compile option -ffreestanding is always passed to the compiler. This option implies -fno-builtin, which causes the compiler to generate code without recognizing built-in functions (e.g. memcpy) for which it could generate optimized code.
- Solution:
- Omit compile option -ffreestanding.
- Problem-ID:
- -
- Description:
- kernel: Instruction processing damage handling.
- Symptom:
- Kernel hangs.
- Problem:
- In case of an instruction processing damage (IPD) machine check in kernel mode the resulting action is always to stop the kernel. This is not necessarily the best solution since a retry of the failing instruction might succeed.
- Solution:
- Allow retries in case of IPD machine checks.
- Problem-ID:
- -
- Description:
- kernel: Missing error check on signal frame setup.
- Symptom:
- User space application does not return to kernel after signal handling.
- Problem:
- The return value of __put_user(), which writes the syscall opcode to user space, is not checked. Even if this __put_user() fails, the user space application runs again instead of being sent a SIGSEGV.
- Solution:
- Check return value of __put_user().
- Problem-ID:
- 23074
- Note: If you apply this patch, please also apply next patch (23074 "kernel: Bug in setup_rt_frame()") on top.
- Description:
- kernel: Bug in setup_rt_frame().
- Symptom:
- Process terminates unexpectedly.
- Problem:
- setup_rt_frame() writes the wrong system call number on the stackframe. When the signal handler returns via this system call the kernel terminates the process.
- Solution:
- Use correct system call number.
- Problem-ID:
- 23074
- Description:
- kernel: Signal handling bug.
- Symptom:
- Process crashes on signal delivery.
- Problem:
- If a signal handler has been established with the SA_ONSTACK option but no alternate stack is provided with sigaltstack(), the kernel still tries to install the alternate stack. Likewise, when an alternate stack is set with sigaltstack() and the SS_DISABLE flag, the kernel tries to install the alternate stack on signal delivery.
- Solution:
- Use the correct condition sas_ss_flags() to check if the alternate stack has to be used.
- Problem-ID:
- 23355
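The corrected stack selection can be sketched in user space as follows. This is a model under stated assumptions, not the s390 signal code: struct altstack, the MODEL_* constants and both function names are hypothetical, and the model assumes that a disabled or missing alternate stack is recorded with size 0.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the kernel's SS_ONSTACK / SS_DISABLE values. */
enum { MODEL_SS_ONSTACK = 1, MODEL_SS_DISABLE = 2 };

/* Hypothetical per-task record of what sigaltstack() registered. */
struct altstack {
    unsigned long sp;   /* base address of the alternate stack */
    size_t size;        /* 0: no alternate stack, or SS_DISABLE was set */
};

/* Modelled on sas_ss_flags(): classify the current stack pointer. */
int model_sas_ss_flags(const struct altstack *as, unsigned long cur_sp)
{
    assert(as != NULL);
    if (as->size == 0)
        return MODEL_SS_DISABLE;
    /* Unsigned wrap-around turns this into a single range check. */
    return (cur_sp - as->sp < as->size) ? MODEL_SS_ONSTACK : 0;
}

/* Pick the stack pointer on which the signal frame is built. */
unsigned long model_sigframe_sp(const struct altstack *as,
                                unsigned long cur_sp, int sa_onstack)
{
    /* Switch only if requested, registered, and not already on it. */
    if (sa_onstack && model_sas_ss_flags(as, cur_sp) == 0)
        return as->sp + as->size;   /* stack grows downwards */
    return cur_sp;
}
```

In this model a handler with SA_ONSTACK but without a usable alternate stack simply stays on the current stack, which is the behaviour the fix aims for.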
- Description:
- kernel: RCU handling delays.
- Symptom:
- Unexpected delays.
- Problem:
- The kernel handles RCU batches. If a batch is finished on CPU 1 but still in progress on CPU 2, CPU 1 could decide to enter a tickless wait state even though the next batch is already pending. Because CPU 2 schedules the next batch, CPU 1 would then miss the point at which that batch becomes ready. This can result in delays.
- Solution:
- Check for pending RCU batches and do not enter tickless wait state if there is a batch pending.
- Problem-ID:
- -
- Description:
- net: initcall order.
- Symptom:
- Kernel hangs or is unstable if network code is compiled in.
- Problem:
- If the network code is compiled into the kernel the initcalls of all network drivers are done before inet_init(). This is the wrong order since the drivers rely on data structures that get initialized from inet_init().
- Solution:
- Raise priority of inet_init() initcall from device_initcall() to fs_initcall().
- Problem-ID:
- 22969
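The effect of moving an initcall to an earlier level can be sketched with a small user-space model. The names driver_init and the struct layout are hypothetical, the level numbers are only meant to express that fs_initcall() runs before device_initcall(), and the model assumes that within one level the driver entry comes first in link order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Subset of the initcall levels; lower levels run earlier. */
enum initcall_level { LEVEL_FS = 5, LEVEL_DEVICE = 6 };

struct initcall {
    const char *name;
    enum initcall_level level;
    int link_order;     /* position within the level */
};

/* Order by level first, then by link order inside a level. */
static int cmp_initcall(const void *a, const void *b)
{
    const struct initcall *x = a, *y = b;

    if (x->level != y->level)
        return (x->level > y->level) - (x->level < y->level);
    return (x->link_order > y->link_order) - (x->link_order < y->link_order);
}

/* Sort into execution order and return the position of `name`, or -1. */
int run_position(struct initcall *calls, size_t n, const char *name)
{
    assert(calls != NULL && name != NULL);
    qsort(calls, n, sizeof(*calls), cmp_initcall);
    for (size_t i = 0; i < n; i++)
        if (strcmp(calls[i].name, name) == 0)
            return (int)i;
    return -1;
}
```

With both entries at LEVEL_DEVICE the driver runs first; once inet_init is placed at LEVEL_FS it runs before any device-level initcall regardless of link order.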
- Description:
- qdio: I/O stall with zfcp in low-memory situation.
- Symptom:
- SCSI I/O stall in low-memory situation.
- Problem:
- qdio allocates memory with GFP_KERNEL during qdio_establish and qdio_shutdown. zfcp must call these functions when performing error recovery for an adapter. This can lead to a situation where qdio waits for memory and zfcp (SCSI) waits for end of its error recovery. In case zfcp (SCSI) is needed to swap pages this can lead to an I/O stall.
- Solution:
- Avoid memory allocation with GFP_KERNEL in qdio_establish and qdio_shutdown and introduce memory pool to allow these qdio operations in low memory situations.
- Problem-ID:
- 22223
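The idea behind a memory pool can be sketched in user space: the memory is set aside while it is still plentiful, so the later recovery path never has to wait for the allocator. This is a simplified illustration only; the pool size, element size and all pool_* names are assumptions and do not reflect the actual qdio implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_SLOTS 4      /* hypothetical reserve size */
#define SLOT_SIZE  4096   /* hypothetical element size */

/* Reserve of pages set aside up front. */
struct page_pool {
    void *slot[POOL_SLOTS];
    size_t free_count;
};

/* Release all pages that are currently in the reserve. */
void pool_destroy(struct page_pool *p)
{
    while (p->free_count > 0)
        free(p->slot[--p->free_count]);
}

/* Fill the reserve; returns false if the initial allocation fails. */
bool pool_init(struct page_pool *p)
{
    p->free_count = 0;
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        void *page = malloc(SLOT_SIZE);

        if (page == NULL) {
            pool_destroy(p);
            return false;
        }
        p->slot[p->free_count++] = page;
    }
    return true;
}

/* Take a page from the reserve; never calls the allocator. */
void *pool_get(struct page_pool *p)
{
    return p->free_count > 0 ? p->slot[--p->free_count] : NULL;
}

/* Hand a page back to the reserve. */
void pool_put(struct page_pool *p, void *page)
{
    assert(p->free_count < POOL_SLOTS);   /* only pool pages come back */
    p->slot[p->free_count++] = page;
}
```

In this sketch pool_get() cannot block: it either returns a reserved page or NULL, which is the property that matters when the caller is itself needed to free memory.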
- Description:
- qeth: qethconf not adding IPv4 addresses.
- Symptom:
- The message "kernel: qeth: Invalid IP address format!" appears in /var/log/messages.
- Problem:
- Incorrect syntax checking of IPv4 addresses.
- Solution:
- Change result checking of sscanf invocation.
- Problem-ID:
- 22637
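Checking the result of sscanf for an IPv4 address can look like the following sketch. It is not the qeth code: parse_ipv4 is a hypothetical helper, and the exact check used in the driver is not given in this description. The sketch relies on sscanf returning the number of successful conversions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Parse dotted-quad IPv4 text into four bytes.
 * Returns 0 on success, -1 on malformed input.
 */
int parse_ipv4(const char *text, unsigned char out[4])
{
    int a, b, c, d;
    char extra;

    assert(text != NULL && out != NULL);

    /*
     * Exactly four conversions must succeed. Fewer means a field is
     * missing; a fifth (%c) means there is trailing junk.
     */
    if (sscanf(text, "%3d.%3d.%3d.%3d%c", &a, &b, &c, &d, &extra) != 4)
        return -1;

    /* Each field must fit into one byte. */
    if (a < 0 || a > 255 || b < 0 || b > 255 ||
        c < 0 || c > 255 || d < 0 || d > 255)
        return -1;

    out[0] = (unsigned char)a;
    out[1] = (unsigned char)b;
    out[2] = (unsigned char)c;
    out[3] = (unsigned char)d;
    return 0;
}
```

Comparing the return value against the exact number of expected fields, rather than testing it for non-zero, is what rejects truncated input such as "192.168.0".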
- Description:
- qeth: Race condition possible during device recovery.
- Symptom:
- Kernel panic while device is recovering.
- Problem:
- While the device is recovering, the network stack could send packets and is even able to call qeth's hard_start_xmit routine. Using netif_stop_queue outside dev->hard_start_xmit is dangerous, because the network stack may hold the xmit_lock while still sending a packet. In addition, netif_wake_queue is called while setting the qeth device online although the device is not in UP state.
- Solution:
- Only call netif_wake_queue when the device is up. Use netif_tx_disable instead of netif_stop_queue outside the qeth_hard_start_xmit routine.
- Problem-ID:
- 23195
- Description:
- qeth: System crash during data transmission.
- Symptom:
- Addressing exception when sending data with qeth driver.
- Problem:
- qeth_hard_start_xmit() prepares the skb for the OSA-card and turns it over to the OSA-card. Afterwards it updates some statistical data. To do this it makes use of skb-values. But the skb might already be freed by the qeth_qdio_output_handler.
- Solution:
- Before handing the skb over to the OSA-card, save the skb-values that are still needed after delivery.
- Problem-ID:
- 23458
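The pattern of saving values before giving up ownership of a buffer can be sketched in user space. struct packet, struct tx_stats and all function names are hypothetical stand-ins for the skb and the driver statistics; the hand-over is modelled as an immediate free to make the hazard obvious.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for an skb and the driver's statistics. */
struct packet {
    size_t len;
    unsigned char *data;
};

struct tx_stats {
    unsigned long packets;
    unsigned long bytes;
};

/* Models the hand-over: the completion side may free the packet at once. */
static void deliver_and_free(struct packet *pkt)
{
    free(pkt->data);
    free(pkt);
}

/* Transmit one packet; the caller must not touch `pkt` afterwards. */
void xmit(struct packet *pkt, struct tx_stats *stats)
{
    assert(pkt != NULL && stats != NULL);

    size_t len = pkt->len;      /* save what the statistics need first */

    deliver_and_free(pkt);      /* pkt is invalid from here on */

    stats->packets++;
    stats->bytes += len;        /* uses the saved copy, not pkt->len */
}

/* Helper for trying the sketch: allocate a packet of `len` bytes. */
struct packet *packet_alloc(size_t len)
{
    struct packet *pkt = malloc(sizeof(*pkt));

    if (pkt == NULL)
        return NULL;
    pkt->data = calloc(len ? len : 1, 1);
    if (pkt->data == NULL) {
        free(pkt);
        return NULL;
    }
    pkt->len = len;
    return pkt;
}
```

Reading pkt->len after deliver_and_free() would be a use-after-free in this model, which corresponds to the addressing exception described in the symptom.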
- Description:
- z90crypt: Analysis revealed a possible memory overlay.
- Symptom:
- Kernel oops.
- Problem:
- Incorrect size for an array.
- Solution:
- Use sizeof() as the array dimension.
- Problem-ID:
- 22773
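Deriving an array dimension from the type it has to hold can be sketched as follows. The structures are hypothetical and much simpler than the real z90crypt ones; the point is only that the buffer size follows the type automatically instead of being a hand-maintained constant.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical message header; the real z90crypt structures differ. */
struct msg_header {
    unsigned int type;
    unsigned int length;
    unsigned char flags[6];
};

/* Hypothetical container: the buffer dimension is taken from the type. */
struct request {
    unsigned char hdr_buf[sizeof(struct msg_header)];
    int status;
};

/* Copy a header into the request without writing past hdr_buf. */
void store_header(struct request *req, const struct msg_header *hdr)
{
    assert(req != NULL && hdr != NULL);
    memcpy(req->hdr_buf, hdr, sizeof(req->hdr_buf));
}
```

If struct msg_header grows later, hdr_buf grows with it, so the copy cannot overlay the status field that follows the buffer.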
- Description:
- z90crypt: Analysis revealed unreachable code.
- Symptom:
- Error messages never generated under certain conditions.
- Problem:
- Extraneous check of variable for non-zero value.
- Solution:
- Remove extraneous check.
- Problem-ID:
- 22772
Everybody should apply this patch.
To create the complete linux kernel sources, the following patches need to be applied in sequence:
linux-2.6.16.tar.gz (from http://www.kernel.org/pub/linux/kernel/v2.6)
+ linux-2.6.16-s390-base-october2005.diff (IBM)
+ linux-2.6.16-s390-01-october2005.diff (IBM)
+ linux-2.6.16-s390-02-october2005.diff (IBM)