This patch contains the following Linux kernel bug fixes:
Description:
cio: Handle invalid subchannel set id in stsch.
Symptom:
Operand exception while processing machine checks.
Problem:
When the common I/O layer is forced to re-scan on all possible
subchannels on a machine check, it looks at all subchannel ids
created by for_each_subchannel(). Those subchannel ids may contain
an invalid subchannel set id, which will cause an operand
exception on stsch.
Solution:
Use stsch_err which can handle operand exceptions.
Problem-ID:
31538
Description:
cio: I/O error after cable pulls.
Symptom:
After some cables to a DASD have been pulled, dasd unsuccessfully
tries to submit I/O. After the re-tries for the error recovery
have been exhausted, an I/O error is generated.
Problem:
When the common I/O layer tries to start delayed path
verification, it detects a pending status and remembers to delay
path verification further. However, the pending interrupt is an
unsolicited interrupt, and no action is triggered.
Solution:
Re-start path verification when an unsolicited interrupt occurs
if the doverify bit is set.
Problem-ID:
30958
Description:
cio: Use path verification for last path gone after vary off.
Symptom:
The dasd driver runs out of re-tries after the last path for a
device has been varied off.
Problem:
When the last path had been varied off, devices were being put on
the slow path workqueue for evaluation (which may take some time),
but the device driver was not prevented from submitting I/O in the
meantime (which failed because no path was left).
Solution:
Trigger path verification when the last path is varied off which
will prevent the device driver from submitting further I/O until
its notify function is called.
Problem-ID:
29856
Description:
dasd: Enhanced handling of failed termination requests.
Symptom:
Console is flooded with re-try messages and kernel is very slow.
Problem:
In case a request timed out and termination did not work,
the console was flooded with re-try messages (every 1/10 second).
Solution:
Use a 5-second delay per re-try and generate a more precise
message.
Problem-ID:
31429
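
The effect of the longer delay on console output can be sketched
as follows. This is a minimal illustration, not the dasd driver
code: the names OLD_RETRY_DELAY_MS, NEW_RETRY_DELAY_MS and
retry_messages() are hypothetical, and one message per re-try is
an assumption. Only the two intervals (1/10 second and 5 seconds)
come from the description above.

```c
#include <assert.h>

/* Intervals taken from the description; the names are hypothetical. */
enum {
    OLD_RETRY_DELAY_MS = 100,   /* one re-try every 1/10 second */
    NEW_RETRY_DELAY_MS = 5000   /* one re-try every 5 seconds */
};

/*
 * Number of re-try messages written during window_ms, assuming one
 * message is logged per re-try and re-tries are spaced delay_ms apart.
 */
static unsigned long retry_messages(unsigned long window_ms,
                                    unsigned long delay_ms)
{
    assert(delay_ms > 0);       /* a zero delay would divide by zero */
    return window_ms / delay_ms;
}
```

Under these assumptions, one minute of failed termination produces
retry_messages(60000, OLD_RETRY_DELAY_MS) messages with the old
interval and retry_messages(60000, NEW_RETRY_DELAY_MS) with the new
one.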
Description:
dasd: Fixed 'unconditional reserve' handling.
Symptom:
The reserve/release IOCTLs sometimes did not work.
A re-try was always successful.
Problem:
If a second system did a 'steal lock', the pending unit check
(Format 3 Msg F) was delivered. Since we disabled ERP for
reserve/release, the IOCTL call failed.
Solution:
Allow basic ERP (re-tries) for reserve/release IOCTLs.
Problem-ID:
25181
Description:
dasd: fix bug in dasd initialization cleanup.
Symptom:
When the attempt to load the dasd driver fails, e.g. due to a
broken parameter, it runs into a bug in the cleanup code.
Problem:
The initialization of the dasd_eer code is one of the last
steps of the dasd driver initialization. When initialization
fails in one of the earlier steps, the dasd_exit function is
called to clean up what has been done so far. As a result,
dasd_eer_exit may be called even though dasd_eer_init was never
called. dasd_eer_exit then tries to unregister a misc device that
was not registered, which results in a BUG.
Solution:
Make sure that dasd_eer_exit can be called without initialization.
Use a dynamically allocated struct miscdevice instead of
a static one, so we only try to unregister the device if it
exists and was actually registered.
Problem-ID:
30869
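
The guarded cleanup described in this fix can be sketched in user
space as follows. This is a minimal sketch, not the dasd_eer code:
struct fake_miscdevice, eer_dev, eer_init(), eer_exit() and
deregister_calls are hypothetical stand-ins, and the counter takes
the place of the kernel's misc_deregister() call.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct miscdevice. */
struct fake_miscdevice {
    int registered;
};

/* Stays NULL until init succeeds, so exit knows if there is work to undo. */
static struct fake_miscdevice *eer_dev;

/* Counts simulated deregistrations (stands in for misc_deregister()). */
static int deregister_calls;

/* Simulated init: allocate the device and mark it as registered. */
static int eer_init(void)
{
    assert(eer_dev == NULL);    /* this sketch does not support double init */
    eer_dev = calloc(1, sizeof(*eer_dev));
    if (!eer_dev)
        return -1;
    eer_dev->registered = 1;
    return 0;
}

/* Safe to call without a prior eer_init(), and safe to call twice. */
static void eer_exit(void)
{
    if (!eer_dev)
        return;                 /* nothing was allocated or registered */
    if (eer_dev->registered)
        deregister_calls++;     /* only undo what was actually done */
    free(eer_dev);
    eer_dev = NULL;
}
```

Because the pointer is NULL until initialization succeeds, a
cleanup path that runs after an early failure simply returns.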
Description:
kernel: Misaligned wait-PSW.
Symptom:
IPL hangs.
Problem:
On IPL a wait-PSW might be loaded. This PSW is misaligned and
the lpsw instruction will generate a specification exception.
This will lead to a program check loop and an unresponsive system.
cpcmd (diag8) fails with static module variables as
response buffer.
Problem:
The bounce buffer check in cpcmd did not check for memory whose
real address is not equal to its virtual kernel address.
Solution:
Change the bounce buffer logic to check whether the real address
is not equal to the virtual address.
Problem-ID:
31925
Description:
qeth: device functions are not callable in atomic
context.
Symptom:
Schedule while atomic (kernel dump) when removing a
slave from a bond device.
Problem:
When devices are added to or removed from a bond, the
bonding driver requests atomic context and calls qeth
device functions. The qeth device functions use wait
calls (schedule) and cause the kernel to fail.
Solution:
Make qeth device functions callable in atomic context.
Problem-ID:
28121
Description:
sclp: invalid handling of temporary 'not operational' status.
Symptom:
Kernel oops when sclp interface temporarily reports 'not
operational' state.
Problem:
Requests are aborted when the sclp interface reports 'not
operational' even though they may still be active at the sclp,
leading to concurrent writes to request memory by both the kernel
and the sclp interface.
Solution:
Do not abort requests for which the sclp interface reports
not operational status during request re-try.
Problem-ID:
30816
Description:
zfcp: Oops when trying to set FCP device back online.
Symptom:
Oops.
Problem:
Statistics functions were called by the zfcp environment without
acting on their return codes (failure). Operation therefore
continued without properly initialized structures, which caused
the oops.
Solution:
Check return code.
Problem-ID:
27944
Description:
zfcp: invalid locking order.
Symptom:
Kernel hangs.
Problem:
The code may try to take two locks that depend on each other.
Solution:
Introduce a temporary variable for freeing requests. Release the
lock after the requests have been copied.
Problem-ID:
31767
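
The pattern of copying requests to a temporary variable and freeing
them only after the lock is released can be sketched in user space
as follows. This is a minimal sketch under stated assumptions, not
the zfcp code: struct request, queue_lock, pending,
queue_request() and dismiss_requests() are hypothetical names, and
a pthread mutex stands in for the kernel lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical request type, linked into a singly linked list. */
struct request {
    struct request *next;
};

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static struct request *pending;     /* protected by queue_lock */
static int pending_count;           /* protected by queue_lock */

/* Add one request to the pending list. Returns 0 on success. */
static int queue_request(void)
{
    struct request *req = malloc(sizeof(*req));

    if (!req)
        return -1;
    pthread_mutex_lock(&queue_lock);
    req->next = pending;
    pending = req;
    pending_count++;
    pthread_mutex_unlock(&queue_lock);
    return 0;
}

/*
 * Detach the whole list into a temporary variable while holding the
 * lock, release the lock, and only then free the requests. Freeing
 * never runs with queue_lock held, so it cannot nest a second lock
 * inside it. Returns the number of requests freed.
 */
static int dismiss_requests(void)
{
    struct request *tmp;
    int expected;
    int freed = 0;

    pthread_mutex_lock(&queue_lock);
    tmp = pending;
    expected = pending_count;
    pending = NULL;
    pending_count = 0;
    pthread_mutex_unlock(&queue_lock);

    while (tmp) {
        struct request *next = tmp->next;

        free(tmp);
        tmp = next;
        freed++;
    }
    assert(freed == expected);      /* every detached request was freed */
    return freed;
}
```

The lock is held only for the pointer swap, which keeps the
critical section short and keeps the freeing work outside it.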
Everybody should apply this patch.
To create the complete Linux kernel sources, the following
patches need to be applied in sequence: