All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] md: retry bailed bio
@ 2017-09-21 17:42 Shaohua Li
  2017-09-21 17:42 ` [PATCH 1/3] md: separate request handling Shaohua Li
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Shaohua Li @ 2017-09-21 17:42 UTC (permalink / raw)
  To: linux-raid; +Cc: Shaohua Li

From: Shaohua Li <shli@fb.com>

With cc27b0c78c79(md: fix deadlock between mddev_suspend() and
md_write_start()), pers->make_request could bail out without handling the bio.
If that happens, we should retry. There are two call sites we don't do the
retry and cause problems. This set fixes the issues.

Thanks,
Shaohua

Shaohua Li (3):
  md: separate request handling
  md: fix a race condition for flush request handling
  dm-raid: fix a race condition in request handling

 drivers/md/dm-raid.c |  2 +-
 drivers/md/md.c      | 72 +++++++++++++++++++++++++++++++---------------------
 drivers/md/md.h      |  1 +
 3 files changed, 45 insertions(+), 30 deletions(-)

-- 
2.11.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/3] md: separate request handling
  2017-09-21 17:42 [PATCH 0/3] md: retry bailed bio Shaohua Li
@ 2017-09-21 17:42 ` Shaohua Li
  2017-09-21 17:42 ` [PATCH 2/3] md: fix a race condition for flush " Shaohua Li
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Shaohua Li @ 2017-09-21 17:42 UTC (permalink / raw)
  To: linux-raid; +Cc: Shaohua Li, NeilBrown, stable

From: Shaohua Li <shli@fb.com>

With commit cc27b0c78c79, pers->make_request could bail out without handling
the bio. If that happens, we should retry.  The commit fixes md_make_request
but not other call sites. Separate the request handling part, so other call
sites can use it.

Reported-by: Nate Dailey <nate.dailey@stratus.com>
Fix: cc27b0c78c79(md: fix deadlock between mddev_suspend() and md_write_start())
Cc: NeilBrown <neilb@suse.com>
Cc: stable@vger.kernel.org
Signed-off-by: Shaohua Li <shli@fb.com>
---
 drivers/md/md.c | 58 ++++++++++++++++++++++++++++++++-------------------------
 drivers/md/md.h |  1 +
 2 files changed, 34 insertions(+), 25 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 08fcaebc61bd..1db1a22ed835 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -266,6 +266,37 @@ static DEFINE_SPINLOCK(all_mddevs_lock);
  * call has finished, the bio has been linked into some internal structure
  * and so is visible to ->quiesce(), so we don't need the refcount any more.
  */
+void md_handle_request(struct mddev *mddev, struct bio *bio)
+{
+check_suspended:
+	rcu_read_lock();
+	if (mddev->suspended) {
+		DEFINE_WAIT(__wait);
+		for (;;) {
+			prepare_to_wait(&mddev->sb_wait, &__wait,
+					TASK_UNINTERRUPTIBLE);
+			if (!mddev->suspended)
+				break;
+			rcu_read_unlock();
+			schedule();
+			rcu_read_lock();
+		}
+		finish_wait(&mddev->sb_wait, &__wait);
+	}
+	atomic_inc(&mddev->active_io);
+	rcu_read_unlock();
+
+	if (!mddev->pers->make_request(mddev, bio)) {
+		atomic_dec(&mddev->active_io);
+		wake_up(&mddev->sb_wait);
+		goto check_suspended;
+	}
+
+	if (atomic_dec_and_test(&mddev->active_io) && mddev->suspended)
+		wake_up(&mddev->sb_wait);
+}
+EXPORT_SYMBOL(md_handle_request);
+
 static blk_qc_t md_make_request(struct request_queue *q, struct bio *bio)
 {
 	const int rw = bio_data_dir(bio);
@@ -285,23 +316,6 @@ static blk_qc_t md_make_request(struct request_queue *q, struct bio *bio)
 		bio_endio(bio);
 		return BLK_QC_T_NONE;
 	}
-check_suspended:
-	rcu_read_lock();
-	if (mddev->suspended) {
-		DEFINE_WAIT(__wait);
-		for (;;) {
-			prepare_to_wait(&mddev->sb_wait, &__wait,
-					TASK_UNINTERRUPTIBLE);
-			if (!mddev->suspended)
-				break;
-			rcu_read_unlock();
-			schedule();
-			rcu_read_lock();
-		}
-		finish_wait(&mddev->sb_wait, &__wait);
-	}
-	atomic_inc(&mddev->active_io);
-	rcu_read_unlock();
 
 	/*
 	 * save the sectors now since our bio can
@@ -310,20 +324,14 @@ static blk_qc_t md_make_request(struct request_queue *q, struct bio *bio)
 	sectors = bio_sectors(bio);
 	/* bio could be mergeable after passing to underlayer */
 	bio->bi_opf &= ~REQ_NOMERGE;
-	if (!mddev->pers->make_request(mddev, bio)) {
-		atomic_dec(&mddev->active_io);
-		wake_up(&mddev->sb_wait);
-		goto check_suspended;
-	}
+
+	md_handle_request(mddev, bio);
 
 	cpu = part_stat_lock();
 	part_stat_inc(cpu, &mddev->gendisk->part0, ios[rw]);
 	part_stat_add(cpu, &mddev->gendisk->part0, sectors[rw], sectors);
 	part_stat_unlock();
 
-	if (atomic_dec_and_test(&mddev->active_io) && mddev->suspended)
-		wake_up(&mddev->sb_wait);
-
 	return BLK_QC_T_NONE;
 }
 
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 561d22b9a9a8..d8287d3cd1bf 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -692,6 +692,7 @@ extern void md_stop_writes(struct mddev *mddev);
 extern int md_rdev_init(struct md_rdev *rdev);
 extern void md_rdev_clear(struct md_rdev *rdev);
 
+extern void md_handle_request(struct mddev *mddev, struct bio *bio);
 extern void mddev_suspend(struct mddev *mddev);
 extern void mddev_resume(struct mddev *mddev);
 extern struct bio *bio_alloc_mddev(gfp_t gfp_mask, int nr_iovecs,
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/3] md: fix a race condition for flush request handling
  2017-09-21 17:42 [PATCH 0/3] md: retry bailed bio Shaohua Li
  2017-09-21 17:42 ` [PATCH 1/3] md: separate request handling Shaohua Li
@ 2017-09-21 17:42 ` Shaohua Li
  2017-09-21 17:42 ` [PATCH 3/3] dm-raid: fix a race condition in " Shaohua Li
  2017-09-28  2:47 ` [PATCH 0/3] md: retry bailed bio NeilBrown
  3 siblings, 0 replies; 5+ messages in thread
From: Shaohua Li @ 2017-09-21 17:42 UTC (permalink / raw)
  To: linux-raid; +Cc: Shaohua Li, NeilBrown, stable

From: Shaohua Li <shli@fb.com>

md_submit_flush_data calls pers->make_request, which missed the suspend check.
Fix it with the new md_handle_request API.

Reported-by: Nate Dailey <nate.dailey@stratus.com>
Tested-by: Nate Dailey <nate.dailey@stratus.com>
Fix: cc27b0c78c79(md: fix deadlock between mddev_suspend() and md_write_start())
Cc: NeilBrown <neilb@suse.com>
Cc: stable@vger.kernel.org
Signed-off-by: Shaohua Li <shli@fb.com>
---
 drivers/md/md.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 1db1a22ed835..0ff1bbf6c90e 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -447,16 +447,22 @@ static void md_submit_flush_data(struct work_struct *ws)
 	struct mddev *mddev = container_of(ws, struct mddev, flush_work);
 	struct bio *bio = mddev->flush_bio;
 
+	/*
+	 * must reset flush_bio before calling into md_handle_request to avoid a
+	 * deadlock, because other bios passed md_handle_request suspend check
+	 * could wait for this and below md_handle_request could wait for those
+	 * bios because of suspend check
+	 */
+	mddev->flush_bio = NULL;
+	wake_up(&mddev->sb_wait);
+
 	if (bio->bi_iter.bi_size == 0)
 		/* an empty barrier - all done */
 		bio_endio(bio);
 	else {
 		bio->bi_opf &= ~REQ_PREFLUSH;
-		mddev->pers->make_request(mddev, bio);
+		md_handle_request(mddev, bio);
 	}
-
-	mddev->flush_bio = NULL;
-	wake_up(&mddev->sb_wait);
 }
 
 void md_flush_request(struct mddev *mddev, struct bio *bio)
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 3/3] dm-raid: fix a race condition in request handling
  2017-09-21 17:42 [PATCH 0/3] md: retry bailed bio Shaohua Li
  2017-09-21 17:42 ` [PATCH 1/3] md: separate request handling Shaohua Li
  2017-09-21 17:42 ` [PATCH 2/3] md: fix a race condition for flush " Shaohua Li
@ 2017-09-21 17:42 ` Shaohua Li
  2017-09-28  2:47 ` [PATCH 0/3] md: retry bailed bio NeilBrown
  3 siblings, 0 replies; 5+ messages in thread
From: Shaohua Li @ 2017-09-21 17:42 UTC (permalink / raw)
  To: linux-raid; +Cc: Shaohua Li, NeilBrown, Heinz Mauelshagen, Mike Snitzer, stable

From: Shaohua Li <shli@fb.com>

raid_map calls pers->make_request, which missed the suspend check. Fix it with
the new md_handle_request API.

Fix: cc27b0c78c79(md: fix deadlock between mddev_suspend() and md_write_start())
Cc: NeilBrown <neilb@suse.com>
Cc: Heinz Mauelshagen <heinzm@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Shaohua Li <shli@fb.com>
---
 drivers/md/dm-raid.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index 5bfe285ea9d1..1ac58c5651b7 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -3238,7 +3238,7 @@ static int raid_map(struct dm_target *ti, struct bio *bio)
 	if (unlikely(bio_end_sector(bio) > mddev->array_sectors))
 		return DM_MAPIO_REQUEUE;
 
-	mddev->pers->make_request(mddev, bio);
+	md_handle_request(mddev, bio);
 
 	return DM_MAPIO_SUBMITTED;
 }
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/3] md: retry bailed bio
  2017-09-21 17:42 [PATCH 0/3] md: retry bailed bio Shaohua Li
                   ` (2 preceding siblings ...)
  2017-09-21 17:42 ` [PATCH 3/3] dm-raid: fix a race condition in " Shaohua Li
@ 2017-09-28  2:47 ` NeilBrown
  3 siblings, 0 replies; 5+ messages in thread
From: NeilBrown @ 2017-09-28  2:47 UTC (permalink / raw)
  To: Shaohua Li, linux-raid; +Cc: Shaohua Li

[-- Attachment #1: Type: text/plain, Size: 1078 bytes --]

On Thu, Sep 21 2017, Shaohua Li wrote:

> From: Shaohua Li <shli@fb.com>
>
> With cc27b0c78c79(md: fix deadlock between mddev_suspend() and
> md_write_start()), pers->make_request could bail out without handling the bio.
> If that happens, we should retry. There are two call sites we don't do the
> retry and cause problems. This set fixes the issues.

All look good to me, thanks.
Reviewed-by: NeilBrown <neilb@suse.com>

NeilBrown

>
> Thanks,
> Shaohua
>
> Shaohua Li (3):
>   md: separate request handling
>   md: fix a race condition for flush request handling
>   dm-raid: fix a race condition in request handling
>
>  drivers/md/dm-raid.c |  2 +-
>  drivers/md/md.c      | 72 +++++++++++++++++++++++++++++++---------------------
>  drivers/md/md.h      |  1 +
>  3 files changed, 45 insertions(+), 30 deletions(-)
>
> -- 
> 2.11.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-09-28  2:47 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-21 17:42 [PATCH 0/3] md: retry bailed bio Shaohua Li
2017-09-21 17:42 ` [PATCH 1/3] md: separate request handling Shaohua Li
2017-09-21 17:42 ` [PATCH 2/3] md: fix a race condition for flush " Shaohua Li
2017-09-21 17:42 ` [PATCH 3/3] dm-raid: fix a race condition in " Shaohua Li
2017-09-28  2:47 ` [PATCH 0/3] md: retry bailed bio NeilBrown

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.