linux-kernel.vger.kernel.org archive mirror
From: Logan Gunthorpe <logang@deltatee.com>
To: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
	Song Liu <song@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>,
	Guoqing Jiang <guoqing.jiang@linux.dev>,
	Stephen Bates <sbates@raithlin.com>,
	Martin Oliveira <Martin.Oliveira@eideticom.com>,
	David Sloan <David.Sloan@eideticom.com>,
	Logan Gunthorpe <logang@deltatee.com>
Subject: [PATCH v2 09/12] md/raid5: Keep a reference to last stripe_head for batch
Date: Wed, 20 Apr 2022 13:54:22 -0600
Message-ID: <20220420195425.34911-10-logang@deltatee.com>
In-Reply-To: <20220420195425.34911-1-logang@deltatee.com>

When batching, every stripe head has to find the previous stripe head to
add itself to the batch list. This involves taking the hash lock, which
is highly contended during IO.

Instead of looking up the previous stripe_head each time, keep a
reference to it in a pointer so that the contended lock does not have
to be taken again.
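
As an illustration only, the idea can be modeled in userspace. The sketch
below is not the raid5 code: the table, find_get(), get_prev() and
batch_walk() are hypothetical names, the lock is reduced to a counter, and
the reference count is a plain int instead of an atomic_t. It only shows
that remembering the last stripe lets consecutive stripes skip the locked
lookup.

```c
#include <assert.h>
#include <stddef.h>

#define STRIPE_SECTORS 8UL	/* assumed stripe size for this model */
#define NR_STRIPES 16

struct stripe {
	unsigned long sector;
	int count;		/* plain int here; atomic_t in the kernel */
};

static struct stripe table[NR_STRIPES];
static int lock_acquisitions;	/* times the "contended" lock was taken */

/* Slow path: model of a locked lookup that also takes a reference. */
static struct stripe *find_get(unsigned long sector)
{
	lock_acquisitions++;	/* stands in for taking the hash lock */
	for (size_t i = 0; i < NR_STRIPES; i++) {
		if (table[i].sector == sector) {
			table[i].count++;
			return &table[i];
		}
	}
	return NULL;
}

/* Get the stripe just before 'sector', preferring the cached pointer. */
static struct stripe *get_prev(unsigned long sector, struct stripe *last)
{
	unsigned long head_sector = sector - STRIPE_SECTORS;

	if (last && last->sector == head_sector) {
		last->count++;	/* fast path: no lock taken */
		return last;
	}
	return find_get(head_sector);
}

/* Walk stripes 1..n-1 in order; return the number of lock acquisitions. */
static int batch_walk(int n, int use_cache)
{
	struct stripe *last = NULL;

	lock_acquisitions = 0;
	for (int i = 0; i < NR_STRIPES; i++) {
		table[i].sector = (unsigned long)i * STRIPE_SECTORS;
		table[i].count = 0;
	}

	for (int i = 1; i < n && i < NR_STRIPES; i++) {
		struct stripe *sh = &table[i];
		struct stripe *head = get_prev(sh->sector, use_cache ? last : NULL);

		assert(head == &table[i - 1]);
		head->count--;		/* drop the reference from get_prev() */

		if (last)
			last->count--;	/* release the old cached stripe */
		sh->count++;		/* hold a reference on the new one */
		last = sh;
	}
	if (last)
		last->count--;		/* final release */
	return lock_acquisitions;
}
```

Under these assumptions, batch_walk(8, 0) takes the modeled lock 7 times,
while batch_walk(8, 1) takes it only once, for the first stripe.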

The reference to the previous stripe must be released before scheduling
and waiting for work to get done. Otherwise, it can hold up
raid5_activate_delayed() and cause a deadlock.
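
The ordering constraint can be sketched with a toy model. Everything here
is hypothetical (struct request_ctx, remember_batch_last(),
forget_batch_last(), background_can_progress()), and the real condition
that stalls raid5_activate_delayed() is more involved. The model assumes
only that the background work cannot progress while the stripe still has a
reference held by the task that is about to sleep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct stripe {
	int count;			/* reference count */
};

struct request_ctx {
	struct stripe *batch_last;	/* cached stripe, reference held */
};

static void release_stripe(struct stripe *sh)
{
	assert(sh->count > 0);
	sh->count--;
}

/* Cache 'sh' as the last batched stripe, replacing any previous one. */
static void remember_batch_last(struct request_ctx *ctx, struct stripe *sh)
{
	if (ctx->batch_last)
		release_stripe(ctx->batch_last);
	sh->count++;
	ctx->batch_last = sh;
}

/* Drop the cached reference; safe to call when nothing is cached. */
static void forget_batch_last(struct request_ctx *ctx)
{
	if (ctx->batch_last) {
		release_stripe(ctx->batch_last);
		ctx->batch_last = NULL;
	}
}

/* Assumption of this model: background work needs an idle stripe. */
static bool background_can_progress(const struct stripe *sh)
{
	return sh->count == 0;
}

/* Would the work we are about to wait for be able to complete? */
static bool wait_would_succeed(bool release_first)
{
	struct stripe sh = { .count = 0 };
	struct request_ctx ctx = { .batch_last = NULL };
	bool ok;

	remember_batch_last(&ctx, &sh);
	if (release_first)
		forget_batch_last(&ctx);	/* before "schedule()" */
	ok = background_can_progress(&sh);
	forget_batch_last(&ctx);		/* final cleanup, may be a no-op */
	return ok;
}
```

In this model, wait_would_succeed(true) returns true, whereas
wait_would_succeed(false) returns false because the sleeping task still
pins the stripe that the background work needs.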

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 drivers/md/raid5.c | 51 +++++++++++++++++++++++++++++++++++-----------
 1 file changed, 39 insertions(+), 12 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 0c250cc3bfff..28ea7b9b6ab6 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -843,7 +843,8 @@ static bool stripe_can_batch(struct stripe_head *sh)
 }
 
 /* we only do back search */
-static void stripe_add_to_batch_list(struct r5conf *conf, struct stripe_head *sh)
+static void stripe_add_to_batch_list(struct r5conf *conf,
+		struct stripe_head *sh, struct stripe_head *last_sh)
 {
 	struct stripe_head *head;
 	sector_t head_sector, tmp_sec;
@@ -856,15 +857,20 @@ static void stripe_add_to_batch_list(struct r5conf *conf, struct stripe_head *sh
 		return;
 	head_sector = sh->sector - RAID5_STRIPE_SECTORS(conf);
 
-	hash = stripe_hash_locks_hash(conf, head_sector);
-	spin_lock_irq(conf->hash_locks + hash);
-	head = find_get_stripe(conf, head_sector, conf->generation, hash);
-	spin_unlock_irq(conf->hash_locks + hash);
-
-	if (!head)
-		return;
-	if (!stripe_can_batch(head))
-		goto out;
+	if (last_sh && head_sector == last_sh->sector) {
+		head = last_sh;
+		atomic_inc(&head->count);
+	} else {
+		hash = stripe_hash_locks_hash(conf, head_sector);
+		spin_lock_irq(conf->hash_locks + hash);
+		head = find_get_stripe(conf, head_sector, conf->generation,
+				       hash);
+		spin_unlock_irq(conf->hash_locks + hash);
+		if (!head)
+			return;
+		if (!stripe_can_batch(head))
+			goto out;
+	}
 
 	lock_two_stripes(head, sh);
 	/* clear_batch_ready clear the flag */
@@ -5800,6 +5806,7 @@ enum stripe_result {
 
 struct stripe_request_ctx {
 	bool do_flush;
+	struct stripe_head *batch_last;
 };
 
 static enum stripe_result make_stripe_request(struct mddev *mddev,
@@ -5889,8 +5896,13 @@ static enum stripe_result make_stripe_request(struct mddev *mddev,
 		return STRIPE_SCHEDULE_AND_RETRY;
 	}
 
-	if (stripe_can_batch(sh))
-		stripe_add_to_batch_list(conf, sh);
+	if (stripe_can_batch(sh)) {
+		stripe_add_to_batch_list(conf, sh, ctx->batch_last);
+		if (ctx->batch_last)
+			raid5_release_stripe(ctx->batch_last);
+		atomic_inc(&sh->count);
+		ctx->batch_last = sh;
+	}
 
 	if (ctx->do_flush) {
 		set_bit(STRIPE_R5C_PREFLUSH, &sh->state);
@@ -5979,6 +5991,18 @@ static bool raid5_make_request(struct mddev *mddev, struct bio * bi)
 		} else if (res == STRIPE_RETRY) {
 			continue;
 		} else if (res == STRIPE_SCHEDULE_AND_RETRY) {
+			/*
+			 * Must release the reference to batch_last before
+			 * scheduling and waiting for work to be done,
+			 * otherwise the batch_last stripe head could prevent
+			 * raid5_activate_delayed() from making progress
+			 * and thus cause a deadlock.
+			 */
+			if (ctx.batch_last) {
+				raid5_release_stripe(ctx.batch_last);
+				ctx.batch_last = NULL;
+			}
+
 			schedule();
 			prepare_to_wait(&conf->wait_for_overlap, &w,
 					TASK_UNINTERRUPTIBLE);
@@ -5990,6 +6014,9 @@ static bool raid5_make_request(struct mddev *mddev, struct bio * bi)
 
 	finish_wait(&conf->wait_for_overlap, &w);
 
+	if (ctx.batch_last)
+		raid5_release_stripe(ctx.batch_last);
+
 	if (rw == WRITE)
 		md_write_end(mddev);
 	bio_endio(bi);
-- 
2.30.2

