* Re: [PATCH] raid5: avoid add sh->lru to different list
[not found] <20200108163023.9301-1-guoqing.jiang@cloud.ionos.com>
@ 2020-01-09 7:02 ` Xiao Ni
2020-01-09 10:32 ` Guoqing Jiang
2020-01-10 1:06 ` Song Liu
1 sibling, 1 reply; 4+ messages in thread
From: Xiao Ni @ 2020-01-09 7:02 UTC (permalink / raw)
To: jgq516, liu.song.a23; +Cc: linux-raid, Guoqing Jiang
On 01/09/2020 12:30 AM, jgq516@gmail.com wrote:
> From: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
>
> release_stripe_plug adds sh->lru to the unplug list, then raid5_unplug
> iterates the unplug list and releases the stripes in it. But
> sh->lru could end up on another list as well, since release_stripe_plug
> is lock free and nothing protects against the race.
>
> For example, the same sh could be handled by raid5_release_stripe,
> which is lock free too; it does one of two things when "sh->count == 1".
>
> 1. add sh to released_stripes.
>
> Or
>
> 2. go to slow path if sh is already set with ON_RELEASE_LIST.
>
> Either 1 or 2 eventually triggers do_release_stripe, and this
> function mainly moves sh->lru to a different list such as delayed_list,
> handle_list or temp_inactive_list.
Hi Guoqing

The sh->count should be >= 1 before calling release_stripe_plug, because
the stripe was obtained from raid5_get_active_stripe. In your example, if
the same sh is also handled by raid5_release_stripe, the sh->count should
be >= 2, right? Because it should be incremented when someone else gets
the same sh. So sh->count can't be zero and the stripe can't be released
before/in/after release_stripe_plug. Could you explain how sh->count can
be zero when release_stripe_plug is called?

When it moves sh->lru to the unplug list, it doesn't check sh->count or
take any lock. The unplug mechanism exists to avoid conf->device_lock.
Is it enough to check sh->lru here? If it's not empty, we can call
raid5_release_stripe directly.
>
> Then the same node could be in different lists, which causes
What's the meaning of "node" here? Is it the stripe?
> raid5_unplug sticks with "while (!list_empty(&cb->list))", and
> since spin_lock_irq(&conf->device_lock) is called before, it
> causes:
You mean an endless loop? In the loop, the stripe is removed from
cb->list, so the while loop can finish successfully, right?
>
> 1. hard lock up in [1], [2] and [3] since irq is disabled.
>
> 2. raid5_get_active_stripe can't get device_lock and calltrace
> happens.
>
> [<ffffffff81598060>] _raw_spin_lock+0x20/0x30
> [<ffffffffa0230b0a>] raid5_get_active_stripe+0x1da/0x5250 [raid456]
> [<ffffffff8112d165>] ? mempool_alloc_slab+0x15/0x20
> [<ffffffffa0231174>] raid5_get_active_stripe+0x844/0x5250 [raid456]
> [<ffffffff812d5574>] ? generic_make_request+0x24/0x2b0
> [<ffffffff810938b0>] ? wait_woken+0x90/0x90
> [<ffffffff814a2adc>] md_make_request+0xfc/0x250
> [<ffffffff812d5867>] submit_bio+0x67/0x150
>
> So, this commit tries to avoid the issue by making the changes below:
>
> 1. First, don't add sh->lru to cb->list if sh->count == 0.
>
> 2. Also check STRIPE_ON_UNPLUG_LIST in do_release_stripe to avoid
> list corruption, and use the locked versions of test_and_set_bit/clear_bit
> (though they should make no difference on x86).
>
> [1]. https://marc.info/?l=linux-raid&m=150348807422853&w=2
> [2]. https://marc.info/?l=linux-raid&m=146883211430999&w=2
> [3]. https://marc.info/?l=linux-raid&m=157434565331673&w=2
>
> Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
> ---
> As you can see, this version is different from the previous one.
>
> The previous version was made because I thought that ON_UNPLUG_LIST
> and ON_RELEASE_LIST are kind of exclusive, i.e. the sh should
> only be set with either flag but not both, but perhaps
> that is not true ...
Yes. The stripe may be handled in other cases which call
raid5_release_stripe after the job is done. Then the stripe may be in
conf->released_stripes while sh->count is 1.
>
> Instead, since the goal is to avoid putting the same sh->lru on
> different lists, we can fix it from the other side. Before
> do_release_stripe moves sh->lru to another list (not cb->list),
> check whether the sh is already on the unplug list; if so, wake
> up mddev->thread to trigger the plug path:
>
> raid5d -> blk_finish_plug -> raid5_unplug ->
> __release_stripe -> do_release_stripe
>
> Then raid5_unplug removes each sh from cb->list one by one, clears the
> ON_UNPLUG_LIST flag and releases the stripe. So we ensure that an sh on
> the unplug list is actually handled by the plug mechanism instead
> of by other paths.
>
> Comments and tests are welcomed.
>
> The new changes are tested with "./test --raidtype=raid456",
> seems good.
>
> Thanks,
> Guoqing
>
> drivers/md/raid5.c | 24 ++++++++++++++++++++++--
> 1 file changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 223e97ab27e6..808b0bd18c8c 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -218,6 +218,25 @@ static void do_release_stripe(struct r5conf *conf, struct stripe_head *sh,
> BUG_ON(!list_empty(&sh->lru));
> BUG_ON(atomic_read(&conf->active_stripes)==0);
>
> + /*
> + * If stripe is on unplug list then original caller of __release_stripe
> + * is not raid5_unplug, so sh->lru is still in cb->list, don't risk to
> + * add lru to another list in do_release_stripe.
> + */
> + if (!test_and_set_bit_lock(STRIPE_ON_UNPLUG_LIST, &sh->state))
> + clear_bit_unlock(STRIPE_ON_UNPLUG_LIST, &sh->state);
> + else {
> + /*
> + * The sh is on unplug list, so increase count (because count
> + * is decrease before enter do_release_stripe), then trigger
is decreased before entering
> + * raid5d -> plug -> raid5_unplug -> __release_stripe to handle
> + * this stripe.
> + */
> + atomic_inc(&sh->count);
> + md_wakeup_thread(conf->mddev->thread);
> + return;
> + }
> +
> if (r5c_is_writeback(conf->log))
> for (i = sh->disks; i--; )
> if (test_bit(R5_InJournal, &sh->dev[i].flags))
> @@ -5441,7 +5460,7 @@ static void raid5_unplug(struct blk_plug_cb *blk_cb, bool from_schedule)
> * is still in our list
> */
> smp_mb__before_atomic();
> - clear_bit(STRIPE_ON_UNPLUG_LIST, &sh->state);
> + clear_bit_unlock(STRIPE_ON_UNPLUG_LIST, &sh->state);
> /*
> * STRIPE_ON_RELEASE_LIST could be set here. In that
> * case, the count is always > 1 here
> @@ -5481,7 +5500,8 @@ static void release_stripe_plug(struct mddev *mddev,
> INIT_LIST_HEAD(cb->temp_inactive_list + i);
> }
>
> - if (!test_and_set_bit(STRIPE_ON_UNPLUG_LIST, &sh->state))
> + if (!test_and_set_bit_lock(STRIPE_ON_UNPLUG_LIST, &sh->state) &&
> + (atomic_read(&sh->count) != 0))
> list_add_tail(&sh->lru, &cb->list);
> else
> raid5_release_stripe(sh);
* Re: [PATCH] raid5: avoid add sh->lru to different list
2020-01-09 7:02 ` [PATCH] raid5: avoid add sh->lru to different list Xiao Ni
@ 2020-01-09 10:32 ` Guoqing Jiang
0 siblings, 0 replies; 4+ messages in thread
From: Guoqing Jiang @ 2020-01-09 10:32 UTC (permalink / raw)
To: Xiao Ni, jgq516, liu.song.a23; +Cc: linux-raid
Hi Xiao,
On 1/9/20 8:02 AM, Xiao Ni wrote:
>
>
> On 01/09/2020 12:30 AM, jgq516@gmail.com wrote:
>> From: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
>>
>> release_stripe_plug adds sh->lru to the unplug list, then raid5_unplug
>> iterates the unplug list and releases the stripes in it. But
>> sh->lru could end up on another list as well, since release_stripe_plug
>> is lock free and nothing protects against the race.
>>
>> For example, the same sh could be handled by raid5_release_stripe,
>> which is lock free too; it does one of two things when "sh->count == 1".
>>
>> 1. add sh to released_stripes.
>>
>> Or
>>
>> 2. go to slow path if sh is already set with ON_RELEASE_LIST.
>>
>> Either 1 or 2 eventually triggers do_release_stripe, and this
>> function mainly moves sh->lru to a different list such as delayed_list,
>> handle_list or temp_inactive_list.
>
> Hi Guoqing
>
> The sh->count should be >= 1 before calling release_stripe_plug,
> because the stripe was obtained from raid5_get_active_stripe. In your
> example, if the same sh is also handled by raid5_release_stripe,
> the sh->count should be >= 2, right?
I am not sure that is true, since sh->count could be decreased in both
raid5_release_stripe (lockless) and __release_stripe (protected by
device_lock), so there are quite a lot of paths that can decrease the
count.
> Because it should be incremented when someone else gets the same
> sh. So the sh->count can't be zero and the stripe can't be released
> before/in/after release_stripe_plug.
After sh->lru is added to cb->list, could it be possible that other
processes decrease sh->count and add the same sh->lru to another list?
I don't see what prevents that from happening.
> Could you explain how sh->count can be zero when
> release_stripe_plug is called?
Because raid5_get_active_stripe is lockless, and imagine there are
massive IOs against the same sh, especially when multiple r5workers are
activated. I am also not sure that sh->count can guarantee sh->lru lives
on only one list, so there could be a corner case which causes the
lockup.
>
> When it moves sh->lru to the unplug list, it doesn't check sh->count
> or take any lock. The unplug mechanism exists to avoid
> conf->device_lock. Is it enough to check sh->lru here? If
> it's not empty, we can call raid5_release_stripe directly.
>>
>> Then the same node could be in different lists, which causes
> What's the meaning of "node" here? Is it the stripe?
I mean the list node; maybe "entry" is more appropriate.
>> raid5_unplug sticks with "while (!list_empty(&cb->list))", and
>> since spin_lock_irq(&conf->device_lock) is called before, it
>> causes:
> You mean an endless loop? In the loop, the stripe is removed from
> cb->list, so the while loop can finish successfully, right?
Please see link [1].
>
>>
>> 1. hard lock up in [1], [2] and [3] since irq is disabled.
>>
>> 2. raid5_get_active_stripe can't get device_lock and calltrace
>> happens.
>>
>> [<ffffffff81598060>] _raw_spin_lock+0x20/0x30
>> [<ffffffffa0230b0a>] raid5_get_active_stripe+0x1da/0x5250 [raid456]
>> [<ffffffff8112d165>] ? mempool_alloc_slab+0x15/0x20
>> [<ffffffffa0231174>] raid5_get_active_stripe+0x844/0x5250 [raid456]
>> [<ffffffff812d5574>] ? generic_make_request+0x24/0x2b0
>> [<ffffffff810938b0>] ? wait_woken+0x90/0x90
>> [<ffffffff814a2adc>] md_make_request+0xfc/0x250
>> [<ffffffff812d5867>] submit_bio+0x67/0x150
>>
>> So, this commit tries to avoid the issue by making the changes below:
>>
>> 1. First, don't add sh->lru to cb->list if sh->count == 0.
>>
>> 2. Also check STRIPE_ON_UNPLUG_LIST in do_release_stripe to avoid
>> list corruption, and use the locked versions of test_and_set_bit/clear_bit
>> (though they should make no difference on x86).
>>
>> [1]. https://marc.info/?l=linux-raid&m=150348807422853&w=2
>> [2]. https://marc.info/?l=linux-raid&m=146883211430999&w=2
>> [3]. https://marc.info/?l=linux-raid&m=157434565331673&w=2
>>
>> Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
>> ---
>> As you can see, this version is different from the previous one.
>>
>> The previous version was made because I thought that ON_UNPLUG_LIST
>> and ON_RELEASE_LIST are kind of exclusive, i.e. the sh should
>> only be set with either flag but not both, but perhaps
>> that is not true ...
> Yes. The stripe may be handled in other cases which call
> raid5_release_stripe after the job is done.
> Then the stripe may be in conf->released_stripes while sh->count is 1.
So I changed my mind :-). If the sh is on the unplug list, then it
should be handled by raid5_unplug instead of by other callers; that
seems more natural from my point of view, regardless of whether one sh
could live on different lists simultaneously or not.
>>
>> Instead, since the goal is to avoid putting the same sh->lru on
>> different lists, we can fix it from the other side. Before
>> do_release_stripe moves sh->lru to another list (not cb->list),
>> check whether the sh is already on the unplug list; if so, wake
>> up mddev->thread to trigger the plug path:
>>
>> raid5d -> blk_finish_plug -> raid5_unplug ->
>> __release_stripe -> do_release_stripe
>>
>> Then raid5_unplug removes each sh from cb->list one by one, clears the
>> ON_UNPLUG_LIST flag and releases the stripe. So we ensure that an sh on
>> the unplug list is actually handled by the plug mechanism instead
>> of by other paths.
>>
>> Comments and tests are welcomed.
>>
>> The new changes are tested with "./test --raidtype=raid456",
>> seems good.
Could you help to test the change in your nvdimm environment (covering
more scenarios would be even better) to avoid regression?
>>
>> Thanks,
>> Guoqing
>>
>> drivers/md/raid5.c | 24 ++++++++++++++++++++++--
>> 1 file changed, 22 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
>> index 223e97ab27e6..808b0bd18c8c 100644
>> --- a/drivers/md/raid5.c
>> +++ b/drivers/md/raid5.c
>> @@ -218,6 +218,25 @@ static void do_release_stripe(struct r5conf *conf, struct stripe_head *sh,
>> BUG_ON(!list_empty(&sh->lru));
>> BUG_ON(atomic_read(&conf->active_stripes)==0);
>> + /*
>> + * If stripe is on unplug list then original caller of __release_stripe
>> + * is not raid5_unplug, so sh->lru is still in cb->list, don't risk to
>> + * add lru to another list in do_release_stripe.
>> + */
>> + if (!test_and_set_bit_lock(STRIPE_ON_UNPLUG_LIST, &sh->state))
>> + clear_bit_unlock(STRIPE_ON_UNPLUG_LIST, &sh->state);
>> + else {
>> + /*
>> + * The sh is on unplug list, so increase count (because count
>> + * is decrease before enter do_release_stripe), then trigger
> is decreased before entering
Thanks for checking.
Cheers,
Guoqing
* Re: [PATCH] raid5: avoid add sh->lru to different list
[not found] <20200108163023.9301-1-guoqing.jiang@cloud.ionos.com>
2020-01-09 7:02 ` [PATCH] raid5: avoid add sh->lru to different list Xiao Ni
@ 2020-01-10 1:06 ` Song Liu
2020-01-10 10:51 ` Guoqing Jiang
1 sibling, 1 reply; 4+ messages in thread
From: Song Liu @ 2020-01-10 1:06 UTC (permalink / raw)
To: Guoqing Jiang, Xiao Ni; +Cc: linux-raid, Guoqing Jiang
Hi Guoqing and Xiao,
Thanks for looking into this. I am still looking into this. Some
questions below...
On Wed, Jan 8, 2020 at 8:30 AM <jgq516@gmail.com> wrote:
[...]
>
> drivers/md/raid5.c | 24 ++++++++++++++++++++++--
> 1 file changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 223e97ab27e6..808b0bd18c8c 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -218,6 +218,25 @@ static void do_release_stripe(struct r5conf *conf, struct stripe_head *sh,
> BUG_ON(!list_empty(&sh->lru));
> BUG_ON(atomic_read(&conf->active_stripes)==0);
>
> + /*
> + * If stripe is on unplug list then original caller of __release_stripe
> + * is not raid5_unplug, so sh->lru is still in cb->list, don't risk to
> + * add lru to another list in do_release_stripe.
> + */
Can we do the check before calling into do_release_stripe()?
> + if (!test_and_set_bit_lock(STRIPE_ON_UNPLUG_LIST, &sh->state))
> + clear_bit_unlock(STRIPE_ON_UNPLUG_LIST, &sh->state);
> + else {
> + /*
> + * The sh is on unplug list, so increase count (because count
> + * is decrease before enter do_release_stripe), then trigger
> + * raid5d -> plug -> raid5_unplug -> __release_stripe to handle
> + * this stripe.
> + */
> + atomic_inc(&sh->count);
> + md_wakeup_thread(conf->mddev->thread);
> + return;
> + }
> +
> if (r5c_is_writeback(conf->log))
> for (i = sh->disks; i--; )
> if (test_bit(R5_InJournal, &sh->dev[i].flags))
> @@ -5441,7 +5460,7 @@ static void raid5_unplug(struct blk_plug_cb *blk_cb, bool from_schedule)
> * is still in our list
> */
> smp_mb__before_atomic();
> - clear_bit(STRIPE_ON_UNPLUG_LIST, &sh->state);
> + clear_bit_unlock(STRIPE_ON_UNPLUG_LIST, &sh->state);
> /*
> * STRIPE_ON_RELEASE_LIST could be set here. In that
> * case, the count is always > 1 here
> @@ -5481,7 +5500,8 @@ static void release_stripe_plug(struct mddev *mddev,
> INIT_LIST_HEAD(cb->temp_inactive_list + i);
> }
>
> - if (!test_and_set_bit(STRIPE_ON_UNPLUG_LIST, &sh->state))
> + if (!test_and_set_bit_lock(STRIPE_ON_UNPLUG_LIST, &sh->state) &&
> + (atomic_read(&sh->count) != 0))
Do we really see sh->count == 0 here? If yes, I guess we should check
sh->count first,
so that we reduce the case where STRIPE_ON_UNPLUG_LIST is set, but the sh is
not really on the unplug_list.
> list_add_tail(&sh->lru, &cb->list);
> else
> raid5_release_stripe(sh);
>
Overall, I think we should try our best to let STRIPE_ON_RELEASE_LIST
mean the stripe is on release_list, and STRIPE_ON_UNPLUG_LIST mean it is
on unplug_list. I think there will be some exceptions, but we should try
to minimize those cases.
Does this make sense?
Thanks,
Song
* Re: [PATCH] raid5: avoid add sh->lru to different list
2020-01-10 1:06 ` Song Liu
@ 2020-01-10 10:51 ` Guoqing Jiang
0 siblings, 0 replies; 4+ messages in thread
From: Guoqing Jiang @ 2020-01-10 10:51 UTC (permalink / raw)
To: Song Liu, Guoqing Jiang, Xiao Ni; +Cc: linux-raid
Hi Song,
On 1/10/20 2:06 AM, Song Liu wrote:
> Hi Guoqing and Xiao,
>
> Thanks for looking into this. I am still looking into this. Some
> questions below...
Thanks for looking into it.
> On Wed, Jan 8, 2020 at 8:30 AM <jgq516@gmail.com> wrote:
> [...]
>> drivers/md/raid5.c | 24 ++++++++++++++++++++++--
>> 1 file changed, 22 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
>> index 223e97ab27e6..808b0bd18c8c 100644
>> --- a/drivers/md/raid5.c
>> +++ b/drivers/md/raid5.c
>> @@ -218,6 +218,25 @@ static void do_release_stripe(struct r5conf *conf, struct stripe_head *sh,
>> BUG_ON(!list_empty(&sh->lru));
>> BUG_ON(atomic_read(&conf->active_stripes)==0);
>>
>> + /*
>> + * If stripe is on unplug list then original caller of __release_stripe
>> + * is not raid5_unplug, so sh->lru is still in cb->list, don't risk to
>> + * add lru to another list in do_release_stripe.
>> + */
> Can we do the check before calling into do_release_stripe()?
Then we need to add the check twice: once in raid5_release_stripe and
again in __release_stripe.
>
>> + if (!test_and_set_bit_lock(STRIPE_ON_UNPLUG_LIST, &sh->state))
>> + clear_bit_unlock(STRIPE_ON_UNPLUG_LIST, &sh->state);
>> + else {
>> + /*
>> + * The sh is on unplug list, so increase count (because count
>> + * is decrease before enter do_release_stripe), then trigger
>> + * raid5d -> plug -> raid5_unplug -> __release_stripe to handle
>> + * this stripe.
>> + */
>> + atomic_inc(&sh->count);
>> + md_wakeup_thread(conf->mddev->thread);
>> + return;
>> + }
>> +
>> if (r5c_is_writeback(conf->log))
>> for (i = sh->disks; i--; )
>> if (test_bit(R5_InJournal, &sh->dev[i].flags))
>> @@ -5441,7 +5460,7 @@ static void raid5_unplug(struct blk_plug_cb *blk_cb, bool from_schedule)
>> * is still in our list
>> */
>> smp_mb__before_atomic();
>> - clear_bit(STRIPE_ON_UNPLUG_LIST, &sh->state);
>> + clear_bit_unlock(STRIPE_ON_UNPLUG_LIST, &sh->state);
>> /*
>> * STRIPE_ON_RELEASE_LIST could be set here. In that
>> * case, the count is always > 1 here
>> @@ -5481,7 +5500,8 @@ static void release_stripe_plug(struct mddev *mddev,
>> INIT_LIST_HEAD(cb->temp_inactive_list + i);
>> }
>>
>> - if (!test_and_set_bit(STRIPE_ON_UNPLUG_LIST, &sh->state))
>> + if (!test_and_set_bit_lock(STRIPE_ON_UNPLUG_LIST, &sh->state) &&
>> + (atomic_read(&sh->count) != 0))
> Do we really see sh->count == 0 here? If yes, I guess we should check
> sh->count first,
> so that we reduce the case where STRIPE_ON_UNPLUG_LIST is set, but the sh is
> not really on the unplug_list.
Yes, it could be, thanks. But with checking the count first, I am
worried about a potential race window which could cause sh->lru to be
added to different lists as well:
1. release_stripe_plug -> atomic_read(&sh->count) != 0
2. do_release_stripe   -> atomic_dec_and_test(&sh->count)
                          !test_bit(STRIPE_ON_UNPLUG_LIST, &sh->state)
                          list_add_tail(&sh->lru, ***_list);
3. release_stripe_plug -> !test_and_set_bit(STRIPE_ON_UNPLUG_LIST, &sh->state)
                          list_add_tail(&sh->lru, &cb->list);
IMO, the difficult thing is that we need some kind of atomic operation
to check both sh->count and the ON_UNPLUG flag, but we want to stay
lockless as much as possible.

How about taking an extra count before adding sh to cb->list? I am
thinking that atomic_inc(&sh->count) and atomic_dec(&sh->count) should
be paired, right? Then do_release_stripe should not be triggered in
theory. Something like:
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index d4d3b67ffbba..225bd9ca4e42 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -222,6 +222,24 @@ static void do_release_stripe(struct r5conf *conf, struct stripe_head *sh,
 	for (i = sh->disks; i--; )
 		if (test_bit(R5_InJournal, &sh->dev[i].flags))
 			injournal++;
+
+	/*
+	 * If stripe is on unplug list then original caller of __release_stripe
+	 * is not raid5_unplug, so sh->lru is still in cb->list, don't risk to
+	 * add lru to another list in do_release_stripe.
+	 */
+	if (unlikely(test_bit(STRIPE_ON_UNPLUG_LIST, &sh->state))) {
+		/*
+		 * The sh is on unplug list, so increase count (because count
+		 * is decreased before entering do_release_stripe), then trigger
+		 * raid5d -> plug -> raid5_unplug -> __release_stripe to handle
+		 * this stripe.
+		 */
+		atomic_inc(&sh->count);
+		md_wakeup_thread(conf->mddev->thread);
+		return;
+	}
+
 	/*
 	 * In the following cases, the stripe cannot be released to cached
 	 * lists. Therefore, we make the stripe write out and set
@@ -5481,10 +5499,15 @@ static void release_stripe_plug(struct mddev *mddev,
 		INIT_LIST_HEAD(cb->temp_inactive_list + i);
 	}
 
-	if (!test_and_set_bit(STRIPE_ON_UNPLUG_LIST, &sh->state))
+	/* hold extra count to avoid the running of do_release_stripe */
+	atomic_inc(&sh->count);
+	if (!test_and_set_bit(STRIPE_ON_UNPLUG_LIST, &sh->state)) {
 		list_add_tail(&sh->lru, &cb->list);
-	else
+		atomic_dec(&sh->count);
+	} else {
+		atomic_dec(&sh->count);
 		raid5_release_stripe(sh);
+	}
 }
>> list_add_tail(&sh->lru, &cb->list);
>> else
>> raid5_release_stripe(sh);
>>
> Overall, I think we should try our best to let STRIPE_ON_RELEASE_LIST
> mean the stripe is on release_list, and STRIPE_ON_UNPLUG_LIST mean it
> is on unplug_list. I think there will be some exceptions, but we
> should try to minimize those cases.
>
> Does this make sense?
Agreed, though unfortunately I don't see code that does it yet ...
Thanks,
Guoqing