* [PATCH V2] md: unlock mddev before reap sync_thread in action_store
@ 2022-06-21 3:11 Guoqing Jiang
2022-06-22 23:32 ` Song Liu
0 siblings, 1 reply; 4+ messages in thread
From: Guoqing Jiang @ 2022-06-21 3:11 UTC (permalink / raw)
To: song; +Cc: buczek, logang, linux-raid
Since the bug which commit 8b48ec23cc51a ("md: don't unregister sync_thread
with reconfig_mutex held") fixed is related with action_store path, other
callers which reap sync_thread didn't need to be changed.
Let's pull md_unregister_thread from md_reap_sync_thread, then fix previous
bug with belows.
1. unlock mddev before md_reap_sync_thread in action_store.
2. save reshape_position before unlock, then restore it to ensure position
not changed accidentally by others.
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
---
V2 changes:
1. add set_bit(MD_RECOVERY_INTR, &mddev->recovery) before unregister sync thread
And I didn't find regression with this version after run mdadm tests.
drivers/md/dm-raid.c | 1 +
drivers/md/md.c | 19 +++++++++++++++++--
2 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index 9526ccbedafb..d43b8075c055 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -3725,6 +3725,7 @@ static int raid_message(struct dm_target *ti, unsigned int argc, char **argv,
if (!strcasecmp(argv[0], "idle") || !strcasecmp(argv[0], "frozen")) {
if (mddev->sync_thread) {
set_bit(MD_RECOVERY_INTR, &mddev->recovery);
+ md_unregister_thread(&mddev->sync_thread);
md_reap_sync_thread(mddev);
}
} else if (decipher_sync_action(mddev, mddev->recovery) != st_idle)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index c7ecb0bffda0..04bab0511312 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -4830,6 +4830,19 @@ action_store(struct mddev *mddev, const char *page, size_t len)
if (work_pending(&mddev->del_work))
flush_workqueue(md_misc_wq);
if (mddev->sync_thread) {
+ sector_t save_rp = mddev->reshape_position;
+
+ mddev_unlock(mddev);
+ set_bit(MD_RECOVERY_INTR, &mddev->recovery);
+ md_unregister_thread(&mddev->sync_thread);
+ mddev_lock_nointr(mddev);
+ /*
+ * set RECOVERY_INTR again and restore reshape
+ * position in case others changed them after
+ * got lock, eg, reshape_position_store and
+ * md_check_recovery.
+ */
+ mddev->reshape_position = save_rp;
set_bit(MD_RECOVERY_INTR, &mddev->recovery);
md_reap_sync_thread(mddev);
}
@@ -6197,6 +6210,7 @@ static void __md_stop_writes(struct mddev *mddev)
flush_workqueue(md_misc_wq);
if (mddev->sync_thread) {
set_bit(MD_RECOVERY_INTR, &mddev->recovery);
+ md_unregister_thread(&mddev->sync_thread);
md_reap_sync_thread(mddev);
}
@@ -9303,6 +9317,7 @@ void md_check_recovery(struct mddev *mddev)
* ->spare_active and clear saved_raid_disk
*/
set_bit(MD_RECOVERY_INTR, &mddev->recovery);
+ md_unregister_thread(&mddev->sync_thread);
md_reap_sync_thread(mddev);
clear_bit(MD_RECOVERY_RECOVER, &mddev->recovery);
clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
@@ -9338,6 +9353,7 @@ void md_check_recovery(struct mddev *mddev)
goto unlock;
}
if (mddev->sync_thread) {
+ md_unregister_thread(&mddev->sync_thread);
md_reap_sync_thread(mddev);
goto unlock;
}
@@ -9417,8 +9433,7 @@ void md_reap_sync_thread(struct mddev *mddev)
sector_t old_dev_sectors = mddev->dev_sectors;
bool is_reshaped = false;
- /* resync has finished, collect result */
- md_unregister_thread(&mddev->sync_thread);
+ /* sync_thread should be unregistered, collect result */
if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery) &&
!test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery) &&
mddev->degraded != mddev->raid_disks) {
--
2.31.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH V2] md: unlock mddev before reap sync_thread in action_store
2022-06-21 3:11 [PATCH V2] md: unlock mddev before reap sync_thread in action_store Guoqing Jiang
@ 2022-06-22 23:32 ` Song Liu
2022-06-23 1:29 ` Guoqing Jiang
0 siblings, 1 reply; 4+ messages in thread
From: Song Liu @ 2022-06-22 23:32 UTC (permalink / raw)
To: Guoqing Jiang; +Cc: Donald Buczek, Logan Gunthorpe, linux-raid
On Mon, Jun 20, 2022 at 8:11 PM Guoqing Jiang <guoqing.jiang@linux.dev> wrote:
>
> Since the bug which commit 8b48ec23cc51a ("md: don't unregister sync_thread
> with reconfig_mutex held") fixed is related with action_store path, other
> callers which reap sync_thread didn't need to be changed.
>
> Let's pull md_unregister_thread from md_reap_sync_thread, then fix previous
> bug with belows.
>
> 1. unlock mddev before md_reap_sync_thread in action_store.
> 2. save reshape_position before unlock, then restore it to ensure position
> not changed accidentally by others.
>
> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
> ---
> V2 changes:
> 1. add set_bit(MD_RECOVERY_INTR, &mddev->recovery) before unregister sync thread
>
> And I didn't find regression with this version after run mdadm tests.
>
> drivers/md/dm-raid.c | 1 +
> drivers/md/md.c | 19 +++++++++++++++++--
> 2 files changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
> index 9526ccbedafb..d43b8075c055 100644
> --- a/drivers/md/dm-raid.c
> +++ b/drivers/md/dm-raid.c
> @@ -3725,6 +3725,7 @@ static int raid_message(struct dm_target *ti, unsigned int argc, char **argv,
> if (!strcasecmp(argv[0], "idle") || !strcasecmp(argv[0], "frozen")) {
> if (mddev->sync_thread) {
> set_bit(MD_RECOVERY_INTR, &mddev->recovery);
> + md_unregister_thread(&mddev->sync_thread);
> md_reap_sync_thread(mddev);
> }
> } else if (decipher_sync_action(mddev, mddev->recovery) != st_idle)
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index c7ecb0bffda0..04bab0511312 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -4830,6 +4830,19 @@ action_store(struct mddev *mddev, const char *page, size_t len)
> if (work_pending(&mddev->del_work))
> flush_workqueue(md_misc_wq);
> if (mddev->sync_thread) {
> + sector_t save_rp = mddev->reshape_position;
> +
> + mddev_unlock(mddev);
> + set_bit(MD_RECOVERY_INTR, &mddev->recovery);
> + md_unregister_thread(&mddev->sync_thread);
> + mddev_lock_nointr(mddev);
> + /*
> + * set RECOVERY_INTR again and restore reshape
> + * position in case others changed them after
> + * got lock, eg, reshape_position_store and
> + * md_check_recovery.
> + */
Hmm.. do we really need to handle reshape_position changed case? What would
break if we don't?
Thanks,
Song
> + mddev->reshape_position = save_rp;
> set_bit(MD_RECOVERY_INTR, &mddev->recovery);
> md_reap_sync_thread(mddev);
> }
> @@ -6197,6 +6210,7 @@ static void __md_stop_writes(struct mddev *mddev)
> flush_workqueue(md_misc_wq);
> if (mddev->sync_thread) {
> set_bit(MD_RECOVERY_INTR, &mddev->recovery);
> + md_unregister_thread(&mddev->sync_thread);
> md_reap_sync_thread(mddev);
> }
>
> @@ -9303,6 +9317,7 @@ void md_check_recovery(struct mddev *mddev)
> * ->spare_active and clear saved_raid_disk
> */
> set_bit(MD_RECOVERY_INTR, &mddev->recovery);
> + md_unregister_thread(&mddev->sync_thread);
> md_reap_sync_thread(mddev);
> clear_bit(MD_RECOVERY_RECOVER, &mddev->recovery);
> clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
> @@ -9338,6 +9353,7 @@ void md_check_recovery(struct mddev *mddev)
> goto unlock;
> }
> if (mddev->sync_thread) {
> + md_unregister_thread(&mddev->sync_thread);
> md_reap_sync_thread(mddev);
> goto unlock;
> }
> @@ -9417,8 +9433,7 @@ void md_reap_sync_thread(struct mddev *mddev)
> sector_t old_dev_sectors = mddev->dev_sectors;
> bool is_reshaped = false;
>
> - /* resync has finished, collect result */
> - md_unregister_thread(&mddev->sync_thread);
> + /* sync_thread should be unregistered, collect result */
> if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery) &&
> !test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery) &&
> mddev->degraded != mddev->raid_disks) {
> --
> 2.31.1
>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH V2] md: unlock mddev before reap sync_thread in action_store
2022-06-22 23:32 ` Song Liu
@ 2022-06-23 1:29 ` Guoqing Jiang
2022-06-23 4:12 ` Song Liu
0 siblings, 1 reply; 4+ messages in thread
From: Guoqing Jiang @ 2022-06-23 1:29 UTC (permalink / raw)
To: Song Liu; +Cc: Donald Buczek, Logan Gunthorpe, linux-raid
On 6/23/22 7:32 AM, Song Liu wrote:
> On Mon, Jun 20, 2022 at 8:11 PM Guoqing Jiang <guoqing.jiang@linux.dev> wrote:
>> Since the bug which commit 8b48ec23cc51a ("md: don't unregister sync_thread
>> with reconfig_mutex held") fixed is related with action_store path, other
>> callers which reap sync_thread didn't need to be changed.
>>
>> Let's pull md_unregister_thread from md_reap_sync_thread, then fix previous
>> bug with belows.
>>
>> 1. unlock mddev before md_reap_sync_thread in action_store.
>> 2. save reshape_position before unlock, then restore it to ensure position
>> not changed accidentally by others.
>>
>> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
>> ---
>> V2 changes:
>> 1. add set_bit(MD_RECOVERY_INTR, &mddev->recovery) before unregister sync thread
>>
>> And I didn't find regression with this version after run mdadm tests.
>>
>> drivers/md/dm-raid.c | 1 +
>> drivers/md/md.c | 19 +++++++++++++++++--
>> 2 files changed, 18 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
>> index 9526ccbedafb..d43b8075c055 100644
>> --- a/drivers/md/dm-raid.c
>> +++ b/drivers/md/dm-raid.c
>> @@ -3725,6 +3725,7 @@ static int raid_message(struct dm_target *ti, unsigned int argc, char **argv,
>> if (!strcasecmp(argv[0], "idle") || !strcasecmp(argv[0], "frozen")) {
>> if (mddev->sync_thread) {
>> set_bit(MD_RECOVERY_INTR, &mddev->recovery);
>> + md_unregister_thread(&mddev->sync_thread);
>> md_reap_sync_thread(mddev);
>> }
>> } else if (decipher_sync_action(mddev, mddev->recovery) != st_idle)
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index c7ecb0bffda0..04bab0511312 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -4830,6 +4830,19 @@ action_store(struct mddev *mddev, const char *page, size_t len)
>> if (work_pending(&mddev->del_work))
>> flush_workqueue(md_misc_wq);
>> if (mddev->sync_thread) {
>> + sector_t save_rp = mddev->reshape_position;
>> +
>> + mddev_unlock(mddev);
>> + set_bit(MD_RECOVERY_INTR, &mddev->recovery);
>> + md_unregister_thread(&mddev->sync_thread);
>> + mddev_lock_nointr(mddev);
>> + /*
>> + * set RECOVERY_INTR again and restore reshape
>> + * position in case others changed them after
>> + * got lock, eg, reshape_position_store and
>> + * md_check_recovery.
>> + */
> Hmm.. do we really need to handle reshape_position changed case? What would
> break if we don't?
I want to make the behavior consistent with previous code, and
reshape_position_store
can change it as said in comment.
Anyway, I didn't see regression with mdadm test with or without setting
reshape_position,
so it is a more conservative way to avoid potential issue. I will remove
it if you think it is
not necessary.
Thanks,
Guoqing
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH V2] md: unlock mddev before reap sync_thread in action_store
2022-06-23 1:29 ` Guoqing Jiang
@ 2022-06-23 4:12 ` Song Liu
0 siblings, 0 replies; 4+ messages in thread
From: Song Liu @ 2022-06-23 4:12 UTC (permalink / raw)
To: Guoqing Jiang; +Cc: Donald Buczek, Logan Gunthorpe, linux-raid
On Wed, Jun 22, 2022 at 6:30 PM Guoqing Jiang <guoqing.jiang@linux.dev> wrote:
>
>
>
> On 6/23/22 7:32 AM, Song Liu wrote:
> > On Mon, Jun 20, 2022 at 8:11 PM Guoqing Jiang <guoqing.jiang@linux.dev> wrote:
> >> Since the bug which commit 8b48ec23cc51a ("md: don't unregister sync_thread
> >> with reconfig_mutex held") fixed is related with action_store path, other
> >> callers which reap sync_thread didn't need to be changed.
> >>
> >> Let's pull md_unregister_thread from md_reap_sync_thread, then fix previous
> >> bug with belows.
> >>
> >> 1. unlock mddev before md_reap_sync_thread in action_store.
> >> 2. save reshape_position before unlock, then restore it to ensure position
> >> not changed accidentally by others.
> >>
> >> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
> >> ---
> >> V2 changes:
> >> 1. add set_bit(MD_RECOVERY_INTR, &mddev->recovery) before unregister sync thread
> >>
> >> And I didn't find regression with this version after run mdadm tests.
> >>
> >> drivers/md/dm-raid.c | 1 +
> >> drivers/md/md.c | 19 +++++++++++++++++--
> >> 2 files changed, 18 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
> >> index 9526ccbedafb..d43b8075c055 100644
> >> --- a/drivers/md/dm-raid.c
> >> +++ b/drivers/md/dm-raid.c
> >> @@ -3725,6 +3725,7 @@ static int raid_message(struct dm_target *ti, unsigned int argc, char **argv,
> >> if (!strcasecmp(argv[0], "idle") || !strcasecmp(argv[0], "frozen")) {
> >> if (mddev->sync_thread) {
> >> set_bit(MD_RECOVERY_INTR, &mddev->recovery);
> >> + md_unregister_thread(&mddev->sync_thread);
> >> md_reap_sync_thread(mddev);
> >> }
> >> } else if (decipher_sync_action(mddev, mddev->recovery) != st_idle)
> >> diff --git a/drivers/md/md.c b/drivers/md/md.c
> >> index c7ecb0bffda0..04bab0511312 100644
> >> --- a/drivers/md/md.c
> >> +++ b/drivers/md/md.c
> >> @@ -4830,6 +4830,19 @@ action_store(struct mddev *mddev, const char *page, size_t len)
> >> if (work_pending(&mddev->del_work))
> >> flush_workqueue(md_misc_wq);
> >> if (mddev->sync_thread) {
> >> + sector_t save_rp = mddev->reshape_position;
> >> +
> >> + mddev_unlock(mddev);
> >> + set_bit(MD_RECOVERY_INTR, &mddev->recovery);
> >> + md_unregister_thread(&mddev->sync_thread);
> >> + mddev_lock_nointr(mddev);
> >> + /*
> >> + * set RECOVERY_INTR again and restore reshape
> >> + * position in case others changed them after
> >> + * got lock, eg, reshape_position_store and
> >> + * md_check_recovery.
> >> + */
> > Hmm.. do we really need to handle reshape_position changed case? What would
> > break if we don't?
>
> I want to make the behavior consistent with previous code, and
> reshape_position_store
> can change it as said in comment.
I see. I will apply the patch as-is.
Thanks,
Song
>
> Anyway, I didn't see regression with mdadm test with or without setting
> reshape_position,
> so it is a more conservative way to avoid potential issue. I will remove
> it if you think it is
> not necessary.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2022-06-23 4:44 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-21 3:11 [PATCH V2] md: unlock mddev before reap sync_thread in action_store Guoqing Jiang
2022-06-22 23:32 ` Song Liu
2022-06-23 1:29 ` Guoqing Jiang
2022-06-23 4:12 ` Song Liu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).