linux-raid.vger.kernel.org archive mirror
* [PATCH 1/3] raid5: call clear_batch_ready before set STRIPE_ACTIVE
@ 2020-06-16  9:25 Guoqing Jiang
From: Guoqing Jiang @ 2020-06-16  9:25 UTC
  To: linux-raid; +Cc: song, Guoqing Jiang

We tried to put only the head sh of a batch list on handle_list, so that
handle_stripe would not handle the other members of the batch list.
However, we still got the call trace below from break_stripe_batch_list.

[593764.644269] stripe state: 2003
kernel: [593764.644299] ------------[ cut here ]------------
kernel: [593764.644308] WARNING: CPU: 12 PID: 856 at drivers/md/raid5.c:4625 break_stripe_batch_list+0x203/0x240 [raid456]
[...]
kernel: [593764.644363] Call Trace:
kernel: [593764.644370]  handle_stripe+0x907/0x20c0 [raid456]
kernel: [593764.644376]  ? __wake_up_common_lock+0x89/0xc0
kernel: [593764.644379]  handle_active_stripes.isra.57+0x35f/0x570 [raid456]
kernel: [593764.644382]  ? raid5_wakeup_stripe_thread+0x96/0x1f0 [raid456]
kernel: [593764.644385]  raid5d+0x480/0x6a0 [raid456]
kernel: [593764.644390]  ? md_thread+0x11f/0x160
kernel: [593764.644392]  md_thread+0x11f/0x160
kernel: [593764.644394]  ? wait_woken+0x80/0x80
kernel: [593764.644396]  kthread+0xfc/0x130
kernel: [593764.644398]  ? find_pers+0x70/0x70
kernel: [593764.644399]  ? kthread_create_on_node+0x70/0x70
kernel: [593764.644401]  ret_from_fork+0x1f/0x30

As we can see, the stripe had STRIPE_ACTIVE and STRIPE_HANDLE set, and
only handle_stripe could set those flags and then return. Since the
stripe was already in the batch list, handle_stripe needs to return
before setting the two flags.
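
To make the flag reading concrete, here is a minimal user-space sketch
that decodes the logged value. It assumes the log prints sh->state in
hex and that STRIPE_ACTIVE and STRIPE_HANDLE are bits 0 and 1, as at the
start of the state enum in drivers/md/raid5.h; please verify against
your tree. stripe_bit_set is a hypothetical helper, not a kernel
function.

```c
/* Assumed bit positions (illustration only); check drivers/md/raid5.h. */
enum { STRIPE_ACTIVE = 0, STRIPE_HANDLE = 1 };

/* Hypothetical helper: returns 1 if the given bit is set in state, else 0. */
static int stripe_bit_set(unsigned long state, int bit)
{
	return (int)((state >> bit) & 1UL);
}
```

Under those assumptions, stripe_bit_set(0x2003UL, STRIPE_ACTIVE) and
stripe_bit_set(0x2003UL, STRIPE_HANDLE) both return 1, which matches the
two flags named above.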

After digging a little into the git history, especially commit
3664847d95e6 ("md/raid5: fix a race condition in stripe batch"), it
seems a batched stripe can still be handled by handle_stripe, so
handle_stripe needs to return early if clear_batch_ready returns true.

Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
---
An alternative would be to just not warn, if it is valid for
STRIPE_ACTIVE to be set on a stripe in the batch list.

What do you think?

Thanks,
Guoqing
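
For reviewers, the effect of the reordering in the diff below can be
modelled in user space. This is only a single-threaded sketch under
stated assumptions: the M_* flags, struct model_stripe and all function
names are hypothetical, model_clear_batch_ready is a simplification of
clear_batch_ready (it just reports "member of a batch, not the head"),
and the model shows the window in which the active flag is set, not the
race itself.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this model; not the kernel's definitions. */
enum {
	M_ACTIVE = 1u << 0,
	M_HANDLE = 1u << 1,
	M_BATCH_READY = 1u << 2,
};

struct model_stripe {
	unsigned int state;
	bool in_batch;   /* true: non-head member of a batch list */
	bool saw_active; /* records whether M_ACTIVE was ever set */
};

/* Simplified stand-in: clear the ready bit, report batch membership. */
static bool model_clear_batch_ready(struct model_stripe *s)
{
	s->state &= ~M_BATCH_READY;
	return s->in_batch;
}

/* Take the "active" flag; returns false if it was already held. */
static bool model_try_activate(struct model_stripe *s)
{
	if (s->state & M_ACTIVE) {
		s->state |= M_HANDLE; /* ask for another pass later */
		return false;
	}
	s->state |= M_ACTIVE;
	s->saw_active = true;
	return true;
}

/* Order before the patch: set active first, then check the batch. */
static void handle_old_order(struct model_stripe *s)
{
	s->state &= ~M_HANDLE;
	if (!model_try_activate(s))
		return;
	if (model_clear_batch_ready(s)) {
		s->state &= ~M_ACTIVE; /* member was briefly active */
		return;
	}
	/* real stripe handling elided */
	s->state &= ~M_ACTIVE;
}

/* Order after the patch: batch members return before being marked active. */
static void handle_new_order(struct model_stripe *s)
{
	s->state &= ~M_HANDLE;
	if (model_clear_batch_ready(s))
		return;
	if (!model_try_activate(s))
		return;
	/* real stripe handling elided */
	s->state &= ~M_ACTIVE;
}
```

In this model a batch member passed to handle_old_order ends up with
saw_active set, while the same member passed to handle_new_order never
has M_ACTIVE set at all; a lone stripe is still marked active and
handled in both orders.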

 drivers/md/raid5.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index ab8067f9ce8c..a35332364f07 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -4682,6 +4682,16 @@ static void handle_stripe(struct stripe_head *sh)
 	struct r5dev *pdev, *qdev;
 
 	clear_bit(STRIPE_HANDLE, &sh->state);
+
+	/*
+	 * handle_stripe should not continue handling a batched stripe; only
+	 * the head of the batch list or a lone stripe can continue. Otherwise
+	 * break_stripe_batch_list could warn that STRIPE_ACTIVE is set for
+	 * the batched stripe.
+	 */
+	if (clear_batch_ready(sh))
+		return;
+
 	if (test_and_set_bit_lock(STRIPE_ACTIVE, &sh->state)) {
 		/* already being handled, ensure it gets handled
 		 * again when current action finishes */
@@ -4689,11 +4699,6 @@ static void handle_stripe(struct stripe_head *sh)
 		return;
 	}
 
-	if (clear_batch_ready(sh) ) {
-		clear_bit_unlock(STRIPE_ACTIVE, &sh->state);
-		return;
-	}
-
 	if (test_and_clear_bit(STRIPE_BATCH_ERR, &sh->state))
 		break_stripe_batch_list(sh, 0);
 
-- 
2.17.1


end of thread, other threads:[~2020-07-16 17:32 UTC | newest]

Thread overview: 9+ messages
2020-06-16  9:25 [PATCH 1/3] raid5: call clear_batch_ready before set STRIPE_ACTIVE Guoqing Jiang
2020-06-16  9:25 ` [PATCH 2/3] raid5: put the comment of clear_batch_ready to the right place Guoqing Jiang
2020-06-16  9:25 ` [PATCH 3/3] raid5: remove the meaningless check in raid5_make_request Guoqing Jiang
2020-06-19 14:16 ` [PATCH 1/3] raid5: call clear_batch_ready before set STRIPE_ACTIVE Guoqing Jiang
2020-06-23 23:58 ` Song Liu
2020-06-25  9:22   ` Guoqing Jiang
2020-06-26  0:16     ` Song Liu
2020-07-16  7:44       ` Guoqing Jiang
2020-07-16 17:32         ` Song Liu
