linux-raid.vger.kernel.org archive mirror
* [PATCH] Revert "Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d""
@ 2024-01-25  8:21 Song Liu
  2024-01-25 11:49 ` Yu Kuai
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Song Liu @ 2024-01-25  8:21 UTC
  To: linux-raid; +Cc: yukuai1, Song Liu, Dan Moulding, stable, Junxiao Bi, Yu Kuai

This reverts commit bed9e27baf52a09b7ba2a3714f1e24e17ced386d.

The original set [1][2] was expected to undo a suboptimal fix in [2], and
replace it with a better fix [1]. However, as reported by Dan Moulding, [2]
causes an issue with raid5 with a journal device.

Revert [2] for now to close the issue. We will follow up on another issue
reported by Junxiao Bi, as [2] is expected to fix it. We believe this is a
good trade-off, because the latter issue happens less frequently.

In the meantime, we will NOT revert [1], as it contains the right logic.

Reported-by: Dan Moulding <dan@danm.net>
Closes: https://lore.kernel.org/linux-raid/20240123005700.9302-1-dan@danm.net/
Fixes: bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"")
Cc: stable@vger.kernel.org # v5.19+
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>

[1] commit d6e035aad6c0 ("md: bypass block throttle for superblock update")
[2] commit bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"")
---
 drivers/md/raid5.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 8497880135ee..2b2f03705990 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -36,6 +36,7 @@
  */
 
 #include <linux/blkdev.h>
+#include <linux/delay.h>
 #include <linux/kthread.h>
 #include <linux/raid/pq.h>
 #include <linux/async_tx.h>
@@ -6773,7 +6774,18 @@ static void raid5d(struct md_thread *thread)
 			spin_unlock_irq(&conf->device_lock);
 			md_check_recovery(mddev);
 			spin_lock_irq(&conf->device_lock);
+
+			/*
+			 * Waiting on MD_SB_CHANGE_PENDING below may deadlock
+			 * seeing md_check_recovery() is needed to clear
+			 * the flag when using mdmon.
+			 */
+			continue;
 		}
+
+		wait_event_lock_irq(mddev->sb_wait,
+			!test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags),
+			conf->device_lock);
 	}
 	pr_debug("%d stripes handled\n", handled);
 
-- 
2.34.1
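
The control flow that the second hunk adds to raid5d() can be modeled in
plain userspace C. This is only a sketch under stated assumptions: the
condition guarding the md_check_recovery() branch is not visible in the
hunk, so the model assumes that branch is taken whenever any superblock
flag other than MD_SB_CHANGE_PENDING is set. The names SB_CHANGE_PENDING,
SB_OTHER_WORK, enum next_step and raid5d_model_step() are hypothetical and
do not exist in the kernel.

```c
/*
 * Userspace model of the decision made at the end of each raid5d() loop
 * iteration after this patch. Not kernel code; all names are hypothetical.
 */

/* Stand-ins for bits in mddev->sb_flags (bit positions are assumptions). */
#define SB_CHANGE_PENDING (1u << 0) /* models MD_SB_CHANGE_PENDING */
#define SB_OTHER_WORK     (1u << 1) /* models any other superblock flag */

enum next_step {
	STEP_RECHECK, /* md_check_recovery() ran; `continue` skips the wait */
	STEP_WAIT,    /* sleep on sb_wait until the pending flag is cleared */
	STEP_PROCEED  /* nothing pending; the wait condition is already true */
};

static enum next_step raid5d_model_step(unsigned int sb_flags)
{
	/*
	 * Assumed guard: some flag besides the pending bit is set, so
	 * md_check_recovery() is called and the loop restarts without
	 * waiting. This is the path that avoids the deadlock described
	 * in the patch comment.
	 */
	if (sb_flags & ~SB_CHANGE_PENDING)
		return STEP_RECHECK;

	/* Only the pending bit is set: block until it is cleared. */
	if (sb_flags & SB_CHANGE_PENDING)
		return STEP_WAIT;

	return STEP_PROCEED;
}
```

In this model, raid5d_model_step(SB_CHANGE_PENDING | SB_OTHER_WORK) yields
STEP_RECHECK rather than STEP_WAIT, which mirrors the `continue` added by
the patch: the wait is skipped on any iteration that has just called
md_check_recovery().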



* Re: [PATCH] Revert "Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d""
  2024-01-25  8:21 [PATCH] Revert "Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"" Song Liu
@ 2024-01-25 11:49 ` Yu Kuai
  2024-01-25 15:48 ` Dan Moulding
  2024-01-25 16:55 ` junxiao.bi
  2 siblings, 0 replies; 4+ messages in thread
From: Yu Kuai @ 2024-01-25 11:49 UTC
  To: Song Liu, linux-raid
  Cc: yukuai1, Dan Moulding, stable, Junxiao Bi, yukuai (C)

On 2024/01/25 16:21, Song Liu wrote:
> This reverts commit bed9e27baf52a09b7ba2a3714f1e24e17ced386d.
> 
> The original set [1][2] was expected to undo a suboptimal fix in [2], and
> replace it with a better fix [1]. However, as reported by Dan Moulding, [2]
> causes an issue with raid5 with a journal device.
> 
> Revert [2] for now to close the issue. We will follow up on another issue
> reported by Junxiao Bi, as [2] is expected to fix it. We believe this is a
> good trade-off, because the latter issue happens less frequently.
> 
> In the meantime, we will NOT revert [1], as it contains the right logic.
> 
> Reported-by: Dan Moulding <dan@danm.net>
> Closes: https://lore.kernel.org/linux-raid/20240123005700.9302-1-dan@danm.net/
> Fixes: bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"")
> Cc: stable@vger.kernel.org # v5.19+
> Cc: Junxiao Bi <junxiao.bi@oracle.com>
> Cc: Yu Kuai <yukuai3@huawei.com>
> Signed-off-by: Song Liu <song@kernel.org>
> 
> [1] commit d6e035aad6c0 ("md: bypass block throttle for superblock update")
> [2] commit bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"")

LGTM
Reviewed-by: Yu Kuai <yukuai3@huawei.com>



* [PATCH] Revert "Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d""
  2024-01-25  8:21 [PATCH] Revert "Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"" Song Liu
  2024-01-25 11:49 ` Yu Kuai
@ 2024-01-25 15:48 ` Dan Moulding
  2024-01-25 16:55 ` junxiao.bi
  2 siblings, 0 replies; 4+ messages in thread
From: Dan Moulding @ 2024-01-25 15:48 UTC
  To: song; +Cc: dan, junxiao.bi, linux-raid, stable, yukuai1, yukuai3

Thank you Song. Let me know if there is any more information I can
provide to help diagnose or reproduce this.

-- Dan


* Re: [PATCH] Revert "Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d""
  2024-01-25  8:21 [PATCH] Revert "Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"" Song Liu
  2024-01-25 11:49 ` Yu Kuai
  2024-01-25 15:48 ` Dan Moulding
@ 2024-01-25 16:55 ` junxiao.bi
  2 siblings, 0 replies; 4+ messages in thread
From: junxiao.bi @ 2024-01-25 16:55 UTC
  To: Song Liu, linux-raid; +Cc: yukuai1, Dan Moulding, stable, Yu Kuai

Should we first understand what the issue is before reverting the
commit? It is not clear to me what the issue is; I have already asked
Dan in another thread.

Thanks,

Junxiao.

On 1/25/24 12:21 AM, Song Liu wrote:
> This reverts commit bed9e27baf52a09b7ba2a3714f1e24e17ced386d.
>
> The original set [1][2] was expected to undo a suboptimal fix in [2], and
> replace it with a better fix [1]. However, as reported by Dan Moulding, [2]
> causes an issue with raid5 with a journal device.
>
> Revert [2] for now to close the issue. We will follow up on another issue
> reported by Junxiao Bi, as [2] is expected to fix it. We believe this is a
> good trade-off, because the latter issue happens less frequently.
>
> In the meantime, we will NOT revert [1], as it contains the right logic.
>
> Reported-by: Dan Moulding <dan@danm.net>
> Closes: https://lore.kernel.org/linux-raid/20240123005700.9302-1-dan@danm.net/
> Fixes: bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"")
> Cc: stable@vger.kernel.org # v5.19+
> Cc: Junxiao Bi <junxiao.bi@oracle.com>
> Cc: Yu Kuai <yukuai3@huawei.com>
> Signed-off-by: Song Liu <song@kernel.org>
>
> [1] commit d6e035aad6c0 ("md: bypass block throttle for superblock update")
> [2] commit bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"")
> ---
>   drivers/md/raid5.c | 12 ++++++++++++
>   1 file changed, 12 insertions(+)
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 8497880135ee..2b2f03705990 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -36,6 +36,7 @@
>    */
>   
>   #include <linux/blkdev.h>
> +#include <linux/delay.h>
>   #include <linux/kthread.h>
>   #include <linux/raid/pq.h>
>   #include <linux/async_tx.h>
> @@ -6773,7 +6774,18 @@ static void raid5d(struct md_thread *thread)
>   			spin_unlock_irq(&conf->device_lock);
>   			md_check_recovery(mddev);
>   			spin_lock_irq(&conf->device_lock);
> +
> +			/*
> +			 * Waiting on MD_SB_CHANGE_PENDING below may deadlock
> +			 * seeing md_check_recovery() is needed to clear
> +			 * the flag when using mdmon.
> +			 */
> +			continue;
>   		}
> +
> +		wait_event_lock_irq(mddev->sb_wait,
> +			!test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags),
> +			conf->device_lock);
>   	}
>   	pr_debug("%d stripes handled\n", handled);
>   

