linux-block.vger.kernel.org archive mirror
* [PATCH] drbd: fix potential silent data corruption
From: Christoph Böhmwalder @ 2021-04-26 16:30 UTC
  To: drbd-dev
  Cc: Philipp Reisner, Jens Axboe, linux-block, linux-kernel,
	Lars Ellenberg, Christoph Böhmwalder, stable

From: Lars Ellenberg <lars.ellenberg@linbit.com>

Scenario:
---------

A bio chain is generated by blk_queue_split().
One of the split bios fails and propagates its error status to the
"parent" bio.
But then the (last part of the) parent bio itself completes without error.

complete_master_bio() would then clobber the already recorded error
status with BLK_STS_OK, causing silent data corruption.
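
As an illustration of the scenario above, here is a minimal userspace
sketch in C. It is not kernel code: struct model_bio, the MODEL_STS_*
values and every model_* function are hypothetical stand-ins invented
for this example, and the chaining logic is simplified to the one
property that matters here (a failed split records its error in the
parent). It only contrasts an unconditional status assignment with an
assignment guarded by the error value.

```c
/* Userspace model only; none of these names exist in the kernel. */
typedef int model_status_t;
#define MODEL_STS_OK    0
#define MODEL_STS_IOERR 10

struct model_bio {
	model_status_t status;	/* plays the role of bio->bi_status */
};

/* Hypothetical simplification of errno_to_blk_status(). */
static model_status_t model_errno_to_status(int error)
{
	return error ? MODEL_STS_IOERR : MODEL_STS_OK;
}

/* A failed split records its error in the parent; success leaves it alone. */
static void model_split_endio(struct model_bio *parent, model_status_t split_status)
{
	if (split_status != MODEL_STS_OK && parent->status == MODEL_STS_OK)
		parent->status = split_status;
}

/* Unconditional assignment: overwrites whatever was recorded before. */
static void model_complete_unconditional(struct model_bio *bio, int error)
{
	bio->status = model_errno_to_status(error);
}

/* Guarded assignment: only touches the status when there is an error. */
static void model_complete_guarded(struct model_bio *bio, int error)
{
	if (error)
		bio->status = model_errno_to_status(error);
}

/*
 * One split fails, then the last part of the parent completes with
 * error == 0. Returns the status the submitter would finally see.
 */
static model_status_t model_run(int guarded)
{
	struct model_bio parent = { MODEL_STS_OK };

	model_split_endio(&parent, MODEL_STS_IOERR);
	if (guarded)
		model_complete_guarded(&parent, 0);
	else
		model_complete_unconditional(&parent, 0);
	return parent.status;
}
```

In this model, model_run(0) ends with MODEL_STS_OK (the recorded error
is lost), while model_run(1) ends with MODEL_STS_IOERR (the recorded
error survives).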

Reproducer:
-----------

How to trigger this in the real world within seconds:

Run DRBD on top of a degraded parity raid,
with a small stripe_cache_size and a large read_ahead setting.
Drop the page cache (any of: sysctl vm.drop_caches=1, fadvise "DONTNEED",
umount and mount again, "reboot").

Cause significant read ahead.

Large read ahead request is split by blk_queue_split().
Parts of the read ahead that are already in the stripe cache,
or find an available stripe cache to use, can be serviced.
Parts of the read ahead that would need "too much work",
i.e. would have to wait for a "stripe_head" to become available,
are rejected immediately.

For larger read ahead requests that are split in many pieces, it is very
likely that some "splits" will be serviced, but then the stripe cache is
exhausted/busy, and the remaining ones will be rejected.

Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Cc: <stable@vger.kernel.org> # 4.13.x
---

Note: this will need to be backported to versions prior to 4.13 too, but
the API changed in the meantime (from the old bio->bi_error to the new
bio->bi_status). I will send a separate patch for these older versions.

In addition, the generic bio_endio/bio_chain_endio has to be fixed in
a similar way for versions before 4.6. This equates to a backport of
upstream commit af3e3a5259e3 ("block: don't unecessarily clobber bi_error
for chained bios").

 drivers/block/drbd/drbd_req.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
index 9398c2c2cb2d..a384a58de1fd 100644
--- a/drivers/block/drbd/drbd_req.c
+++ b/drivers/block/drbd/drbd_req.c
@@ -180,7 +180,8 @@ void start_new_tl_epoch(struct drbd_connection *connection)
 void complete_master_bio(struct drbd_device *device,
 		struct bio_and_error *m)
 {
-	m->bio->bi_status = errno_to_blk_status(m->error);
+	if (unlikely(m->error))
+		m->bio->bi_status = errno_to_blk_status(m->error);
 	bio_endio(m->bio);
 	dec_ap_bio(device);
 }
-- 
2.26.3

