* [PATCH] md: don't update recovery_cp when curr_resync is ACTIVE
@ 2023-01-31 7:07 Hou Tao
2023-01-31 17:06 ` Logan Gunthorpe
2023-02-01 16:35 ` Song Liu
0 siblings, 2 replies; 6+ messages in thread
From: Hou Tao @ 2023-01-31 7:07 UTC (permalink / raw)
To: linux-raid; +Cc: Song Liu, Jens Axboe, Logan Gunthorpe, houtao1, linan122
From: Hou Tao <houtao1@huawei.com>
Don't update recovery_cp when curr_resync is MD_RESYNC_ACTIVE, otherwise
md may skip the resync of the first 3 sectors if the resync procedure is
interrupted before the first calling of ->sync_request() as shown below:
md_do_sync thread control thread
// setup resync
mddev->recovery_cp = 0
j = 0
mddev->curr_resync = MD_RESYNC_ACTIVE
// e.g., set array as idle
set_bit(MD_RECOVERY_INTR, &&mddev_recovery)
// resync loop
// check INTR before calling sync_request
!test_bit(MD_RECOVERY_INTR, &mddev->recovery
// resync interrupted
// update recovery_cp from 0 to 3
// the resync of three 3 sectors will be skipped
mddev->recovery_cp = 3
Fixes: eac58d08d493 ("md: Use enum for overloaded magic numbers used by mddev->curr_resync")
Signed-off-by: Hou Tao <houtao1@huawei.com>
---
drivers/md/md.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 775f1dde190a..16de0ae65447 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9028,7 +9028,7 @@ void md_do_sync(struct md_thread *thread)
mddev->pers->sync_request(mddev, max_sectors, &skipped);
if (!test_bit(MD_RECOVERY_CHECK, &mddev->recovery) &&
- mddev->curr_resync >= MD_RESYNC_ACTIVE) {
+ mddev->curr_resync > MD_RESYNC_ACTIVE) {
if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
if (test_bit(MD_RECOVERY_INTR, &mddev->recovery)) {
if (mddev->curr_resync >= mddev->recovery_cp) {
--
2.29.2
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] md: don't update recovery_cp when curr_resync is ACTIVE
2023-01-31 7:07 [PATCH] md: don't update recovery_cp when curr_resync is ACTIVE Hou Tao
@ 2023-01-31 17:06 ` Logan Gunthorpe
2023-02-01 7:07 ` Song Liu
2023-02-01 16:35 ` Song Liu
1 sibling, 1 reply; 6+ messages in thread
From: Logan Gunthorpe @ 2023-01-31 17:06 UTC (permalink / raw)
To: Hou Tao, linux-raid; +Cc: Song Liu, Jens Axboe, houtao1, linan122
On 2023-01-31 12:07 a.m., Hou Tao wrote:
> From: Hou Tao <houtao1@huawei.com>
>
> Don't update recovery_cp when curr_resync is MD_RESYNC_ACTIVE, otherwise
> md may skip the resync of the first 3 sectors if the resync procedure is
> interrupted before the first calling of ->sync_request() as shown below:
>
> md_do_sync thread control thread
> // setup resync
> mddev->recovery_cp = 0
> j = 0
> mddev->curr_resync = MD_RESYNC_ACTIVE
>
> // e.g., set array as idle
> set_bit(MD_RECOVERY_INTR, &&mddev_recovery)
> // resync loop
> // check INTR before calling sync_request
> !test_bit(MD_RECOVERY_INTR, &mddev->recovery
>
> // resync interrupted
> // update recovery_cp from 0 to 3
> // the resync of three 3 sectors will be skipped
> mddev->recovery_cp = 3
>
> Fixes: eac58d08d493 ("md: Use enum for overloaded magic numbers used by mddev->curr_resync")
> Signed-off-by: Hou Tao <houtao1@huawei.com>
Nice Catch! Thanks.
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Logan
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] md: don't update recovery_cp when curr_resync is ACTIVE
2023-01-31 17:06 ` Logan Gunthorpe
@ 2023-02-01 7:07 ` Song Liu
0 siblings, 0 replies; 6+ messages in thread
From: Song Liu @ 2023-02-01 7:07 UTC (permalink / raw)
To: Logan Gunthorpe; +Cc: Hou Tao, linux-raid, Jens Axboe, houtao1, linan122
On Tue, Jan 31, 2023 at 9:06 AM Logan Gunthorpe <logang@deltatee.com> wrote:
>
>
>
> On 2023-01-31 12:07 a.m., Hou Tao wrote:
> > From: Hou Tao <houtao1@huawei.com>
> >
> > Don't update recovery_cp when curr_resync is MD_RESYNC_ACTIVE, otherwise
> > md may skip the resync of the first 3 sectors if the resync procedure is
> > interrupted before the first calling of ->sync_request() as shown below:
> >
> > md_do_sync thread control thread
> > // setup resync
> > mddev->recovery_cp = 0
> > j = 0
> > mddev->curr_resync = MD_RESYNC_ACTIVE
> >
> > // e.g., set array as idle
> > set_bit(MD_RECOVERY_INTR, &&mddev_recovery)
> > // resync loop
> > // check INTR before calling sync_request
> > !test_bit(MD_RECOVERY_INTR, &mddev->recovery
> >
> > // resync interrupted
> > // update recovery_cp from 0 to 3
> > // the resync of three 3 sectors will be skipped
> > mddev->recovery_cp = 3
> >
> > Fixes: eac58d08d493 ("md: Use enum for overloaded magic numbers used by mddev->curr_resync")
> > Signed-off-by: Hou Tao <houtao1@huawei.com>
>
> Nice Catch! Thanks.
>
> Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Applied to md-next. Thanks!
Song
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] md: don't update recovery_cp when curr_resync is ACTIVE
2023-01-31 7:07 [PATCH] md: don't update recovery_cp when curr_resync is ACTIVE Hou Tao
2023-01-31 17:06 ` Logan Gunthorpe
@ 2023-02-01 16:35 ` Song Liu
2023-02-02 0:51 ` Hou Tao
1 sibling, 1 reply; 6+ messages in thread
From: Song Liu @ 2023-02-01 16:35 UTC (permalink / raw)
To: Hou Tao; +Cc: linux-raid, Jens Axboe, Logan Gunthorpe, houtao1, linan122
On Mon, Jan 30, 2023 at 10:39 PM Hou Tao <houtao@huaweicloud.com> wrote:
>
> From: Hou Tao <houtao1@huawei.com>
>
> Don't update recovery_cp when curr_resync is MD_RESYNC_ACTIVE, otherwise
> md may skip the resync of the first 3 sectors if the resync procedure is
> interrupted before the first calling of ->sync_request() as shown below:
>
> md_do_sync thread control thread
> // setup resync
> mddev->recovery_cp = 0
> j = 0
> mddev->curr_resync = MD_RESYNC_ACTIVE
>
> // e.g., set array as idle
> set_bit(MD_RECOVERY_INTR, &&mddev_recovery)
> // resync loop
> // check INTR before calling sync_request
> !test_bit(MD_RECOVERY_INTR, &mddev->recovery
>
> // resync interrupted
> // update recovery_cp from 0 to 3
> // the resync of three 3 sectors will be skipped
> mddev->recovery_cp = 3
>
> Fixes: eac58d08d493 ("md: Use enum for overloaded magic numbers used by mddev->curr_resync")
> Signed-off-by: Hou Tao <houtao1@huawei.com>
By the way, how did you find this issue? Is it from users/production?
Or just from reading the code?
Thanks,
Song
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] md: don't update recovery_cp when curr_resync is ACTIVE
2023-02-01 16:35 ` Song Liu
@ 2023-02-02 0:51 ` Hou Tao
2023-02-02 6:53 ` Song Liu
0 siblings, 1 reply; 6+ messages in thread
From: Hou Tao @ 2023-02-02 0:51 UTC (permalink / raw)
To: Song Liu; +Cc: linux-raid, Jens Axboe, Logan Gunthorpe, houtao1, linan122
Hi,
On 2/2/2023 12:35 AM, Song Liu wrote:
> On Mon, Jan 30, 2023 at 10:39 PM Hou Tao <houtao@huaweicloud.com> wrote:
>> From: Hou Tao <houtao1@huawei.com>
>>
>> Don't update recovery_cp when curr_resync is MD_RESYNC_ACTIVE, otherwise
>> md may skip the resync of the first 3 sectors if the resync procedure is
>> interrupted before the first calling of ->sync_request() as shown below:
>>
>> md_do_sync thread control thread
>> // setup resync
>> mddev->recovery_cp = 0
>> j = 0
>> mddev->curr_resync = MD_RESYNC_ACTIVE
>>
>> // e.g., set array as idle
>> set_bit(MD_RECOVERY_INTR, &&mddev_recovery)
>> // resync loop
>> // check INTR before calling sync_request
>> !test_bit(MD_RECOVERY_INTR, &mddev->recovery
>>
>> // resync interrupted
>> // update recovery_cp from 0 to 3
>> // the resync of three 3 sectors will be skipped
>> mddev->recovery_cp = 3
>>
>> Fixes: eac58d08d493 ("md: Use enum for overloaded magic numbers used by mddev->curr_resync")
>> Signed-off-by: Hou Tao <houtao1@huawei.com>
> By the way, how did you find this issue? Is it from users/production?
> Or just from reading the code?
Found the issue when reading the code and reproduced the problem to confirm that.
>
> Thanks,
> Song
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] md: don't update recovery_cp when curr_resync is ACTIVE
2023-02-02 0:51 ` Hou Tao
@ 2023-02-02 6:53 ` Song Liu
0 siblings, 0 replies; 6+ messages in thread
From: Song Liu @ 2023-02-02 6:53 UTC (permalink / raw)
To: Hou Tao; +Cc: linux-raid, Jens Axboe, Logan Gunthorpe, houtao1, linan122
On Wed, Feb 1, 2023 at 4:51 PM Hou Tao <houtao@huaweicloud.com> wrote:
>
> Hi,
>
> On 2/2/2023 12:35 AM, Song Liu wrote:
> > On Mon, Jan 30, 2023 at 10:39 PM Hou Tao <houtao@huaweicloud.com> wrote:
> >> From: Hou Tao <houtao1@huawei.com>
> >>
> >> Don't update recovery_cp when curr_resync is MD_RESYNC_ACTIVE, otherwise
> >> md may skip the resync of the first 3 sectors if the resync procedure is
> >> interrupted before the first calling of ->sync_request() as shown below:
> >>
> >> md_do_sync thread control thread
> >> // setup resync
> >> mddev->recovery_cp = 0
> >> j = 0
> >> mddev->curr_resync = MD_RESYNC_ACTIVE
> >>
> >> // e.g., set array as idle
> >> set_bit(MD_RECOVERY_INTR, &&mddev_recovery)
> >> // resync loop
> >> // check INTR before calling sync_request
> >> !test_bit(MD_RECOVERY_INTR, &mddev->recovery
> >>
> >> // resync interrupted
> >> // update recovery_cp from 0 to 3
> >> // the resync of three 3 sectors will be skipped
> >> mddev->recovery_cp = 3
> >>
> >> Fixes: eac58d08d493 ("md: Use enum for overloaded magic numbers used by mddev->curr_resync")
> >> Signed-off-by: Hou Tao <houtao1@huawei.com>
> > By the way, how did you find this issue? Is it from users/production?
> > Or just from reading the code?
> Found the issue when reading the code and reproduced the problem to confirm that.
Thanks for the information!
Song
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2023-02-02 6:53 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-31 7:07 [PATCH] md: don't update recovery_cp when curr_resync is ACTIVE Hou Tao
2023-01-31 17:06 ` Logan Gunthorpe
2023-02-01 7:07 ` Song Liu
2023-02-01 16:35 ` Song Liu
2023-02-02 0:51 ` Hou Tao
2023-02-02 6:53 ` Song Liu
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.