* [PATCHSET] block: fix merge of requests with different failfast settings
From: Tejun Heo @ 2009-07-03  8:48 UTC
  To: Jens Axboe, Linux Kernel, James Bottomley, linux-scsi,
	Niel Lambrechts, FUJITA Tomonori

Hello,

The block layer didn't consider failfast status when merging requests,
which led to premature failure of normal (non-failfast) IOs.  Niel
Lambrechts could trigger the problem semi-reliably on ext4 when
resuming from STR (suspend-to-RAM).  ext4 uses readahead when reading
inodes and, combined with the deterministic extra SATA PHY exception
cycle during resume on the specific configuration, a non-readahead
inode read would fail, causing ext4 errors.  Please read the following
thread for details.

  http://lkml.org/lkml/2009/5/23/21

This patchset contains the following four patches to fix the problem.

 0001-block-don-t-merge-requests-of-different-failfast-se.patch
 0002-block-use-the-same-failfast-bits-for-bio-and-reques.patch
 0003-block-implement-mixed-merge-of-different-failfast-r.patch
 0004-scsi-block-update-SCSI-to-handle-mixed-merge-failur.patch

0001 disallows merges between requests with different failfast
settings.  This one is the quick fix and should go into 2.6.31 and
later into -stable, as the bug is pretty serious and may lead to data
loss.
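
To make the rule in 0001 concrete, here is a minimal user-space sketch
of a merge check that refuses to combine requests whose failfast bits
differ.  It is not the code from the patch: the sk_ names, the flag
values and the request layout are all hypothetical and exist only for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for illustration; the kernel's values differ. */
enum {
	SK_FAILFAST_DEV       = 1u << 0,
	SK_FAILFAST_TRANSPORT = 1u << 1,
	SK_FAILFAST_DRIVER    = 1u << 2,
	SK_FAILFAST_MASK      = SK_FAILFAST_DEV | SK_FAILFAST_TRANSPORT |
				SK_FAILFAST_DRIVER,
};

/* Hypothetical, heavily simplified stand-in for a block request. */
struct sk_request {
	unsigned int flags;		/* may carry SK_FAILFAST_* bits */
	unsigned long long sector;	/* first sector of the request */
	unsigned int nr_sectors;	/* length in sectors */
};

/* True if both requests have exactly the same failfast bits set. */
static inline bool sk_same_failfast(const struct sk_request *a,
				    const struct sk_request *b)
{
	assert(a != NULL && b != NULL);
	return (a->flags & SK_FAILFAST_MASK) == (b->flags & SK_FAILFAST_MASK);
}

/*
 * Decide whether @next may be appended to @rq.  Besides being
 * contiguous, the two requests must agree on failfast so that a
 * normal IO never inherits the fail-early behaviour of a failfast one.
 */
static inline bool sk_may_back_merge(const struct sk_request *rq,
				     const struct sk_request *next)
{
	if (rq->sector + rq->nr_sectors != next->sector)
		return false;
	return sk_same_failfast(rq, next);
}
```

With a check like this, a normal inode read that happens to sit next
to a failfast readahead request is simply left unmerged, so an error
on the readahead cannot fail the normal read early.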

0002 preps for later changes.

0003-0004 implement and apply mixed merge.  Requests with different
failfast settings are merged as before, but failure handling is
updated such that the parts which shouldn't fail without a retry are
properly retried.
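
One way to picture the failure handling for a mixed merge is sketched
below.  This is an illustration under stated assumptions, not the
implementation in 0003-0004: a merged request is modelled as an array
of segments, each carrying its own failfast bits, and on a failfast
error only the leading run of segments that share the first segment's
setting is failed.  Whatever follows stays in the request and can be
retried.  The mm_ names, flag value and struct layout are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mask covering all failfast bits of a segment. */
#define MM_FAILFAST_MASK 0x7u

/* Hypothetical stand-in for one merged piece (a bio, roughly). */
struct mm_seg {
	unsigned int flags;	/* failfast bits of this piece */
	unsigned int bytes;	/* payload size of this piece */
};

/*
 * Return how many bytes at the front of the merged request share the
 * failfast setting of the first segment.  A caller handling a
 * failfast error would complete only this many bytes with the error
 * and leave the remainder queued for a normal retry.
 */
static inline unsigned int mm_err_bytes(const struct mm_seg *segs, size_t n)
{
	unsigned int ff, bytes = 0;
	size_t i;

	if (n == 0)
		return 0;

	ff = segs[0].flags & MM_FAILFAST_MASK;
	for (i = 0; i < n; i++) {
		/* Stop at the first segment with a different setting. */
		if ((segs[i].flags & MM_FAILFAST_MASK) != ff)
			break;
		bytes += segs[i].bytes;
	}
	return bytes;
}
```

For a request made of two failfast readahead segments followed by one
normal segment, only the bytes of the first two segments would be
failed early in this model; the normal segment is retried as usual.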

I spent quite some time thinking about and testing it but I'd really
like more pairs of eyes on this patchset as dangerous bugs can go
unnoticed for quite a while in this area (does anyone know when the
failfast bug was introduced?).

Jens, I think the best way to merge this is to first push 0001 to
Linus's tree, then pull it into for-next, and then apply the rest on
top of it.

This patchset contains the following changes.

 block/blk-core.c        |  118 +++++++++++++++++++++++++++++++++++++++++++-----
 block/blk-merge.c       |   43 +++++++++++++++++
 block/blk.h             |    1 
 drivers/scsi/scsi_lib.c |    6 +-
 include/linux/bio.h     |   43 +++++++++--------
 include/linux/blkdev.h  |   27 ++++++++--
 6 files changed, 199 insertions(+), 39 deletions(-)

Thanks.

--
tejun

Thread overview: 24+ messages
2009-07-03  8:48 [PATCHSET] block: fix merge of requests with different failfast settings Tejun Heo
2009-07-03  8:48 ` Tejun Heo
2009-07-03  8:48 ` [PATCH 1/4] block: don't merge requests of " Tejun Heo
2009-07-03  8:48   ` Tejun Heo
2009-07-03  8:48 ` [PATCH 2/4] block: use the same failfast bits for bio and request Tejun Heo
2009-07-03  8:48   ` Tejun Heo
2009-07-05  9:27   ` Boaz Harrosh
2009-07-09  0:45     ` Tejun Heo
2009-07-09  9:12       ` Boaz Harrosh
2009-07-09 13:37       ` Christoph Hellwig
2009-07-09 17:20         ` Jeff Garzik
2009-07-09 17:39           ` Jens Axboe
2009-07-10 13:18         ` Tejun Heo
2009-07-12 12:06           ` Boaz Harrosh
2009-07-15  9:27             ` Tejun Heo
2009-07-03  8:48 ` [PATCH 3/4] block: implement mixed merge of different failfast requests Tejun Heo
2009-07-03  8:48   ` Tejun Heo
2009-07-05  9:27   ` Boaz Harrosh
2009-07-09  0:47     ` Tejun Heo
2009-07-09  9:17       ` Boaz Harrosh
2009-07-15  9:41         ` Tejun Heo
2009-07-03  8:48 ` [PATCH 4/4] scsi,block: update SCSI to handle mixed merge failures Tejun Heo
2009-07-03  8:48   ` Tejun Heo
2009-07-03 10:54 ` [PATCHSET] block: fix merge of requests with different failfast settings Jens Axboe
