* [PATCH] md: fix raid5 livelock
From: Heinz Mauelshagen @ 2015-01-25 20:06 UTC
  To: NeilBrown <neilb@suse.de>
  Cc: device-mapper development <dm-devel@redhat.com>, linux-raid

From: Heinz Mauelshagen <heinzm@redhat.com>

Hi Neil,

the reconstruct-write optimization in raid5's fetch_block() causes
livelocks in LVM raid4/5 tests.

Test scenario: the tests wait for full initial array resynchronization
before making a filesystem on the raid4/5 logical volume, mounting it,
writing to the filesystem, and failing one physical volume holding a
raiddev.

In short, we're seeing livelocks on fully synchronized raid4/5 arrays 
with a failed device.

This patch fixes the issue, but likely in a suboptimal way.

Do you think there is a better solution to avoid livelocks on 
reconstruct writes?

Regards,
Heinz

Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Tested-by: Jon Brassow <jbrassow@redhat.com>
Tested-by: Heinz Mauelshagen <heinzm@redhat.com>

---
  drivers/md/raid5.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index c1b0d52..0fc8737 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2915,7 +2915,7 @@ static int fetch_block(struct stripe_head *sh, struct stripe_head_state *s,
 	     (s->failed >= 1 && fdev[0]->toread) ||
 	     (s->failed >= 2 && fdev[1]->toread) ||
 	     (sh->raid_conf->level <= 5 && s->failed && fdev[0]->towrite &&
-	      (!test_bit(R5_Insync, &dev->flags) || test_bit(STRIPE_PREREAD_ACTIVE, &sh->state)) &&
+	      (!test_bit(R5_Insync, &dev->flags) || test_bit(STRIPE_PREREAD_ACTIVE, &sh->state) || s->non_overwrite) &&
 	      !test_bit(R5_OVERWRITE, &fdev[0]->flags)) ||
 	     ((sh->raid_conf->level == 6 ||
 	       sh->sector >= sh->raid_conf->mddev->recovery_cp)
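
For context: s->non_overwrite counts the to-be-written blocks in the stripe
that do not cover their whole page.  It is gathered in analyse_stripe()
roughly like this (a paraphrased sketch, not a verbatim excerpt):

	if (dev->towrite) {
		s->to_write++;
		if (!test_bit(R5_OVERWRITE, &dev->flags))
			s->non_overwrite++;	/* partial-page write */
	}

So the one-liner above forces the pre-read path whenever any write in the
stripe is partial, which is why it may be heavier-handed than necessary.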
-- 
2.1.0



* Re: [PATCH] md: fix raid5 livelock
From: NeilBrown @ 2015-01-28  2:37 UTC
  To: Heinz Mauelshagen; +Cc: device-mapper development <dm-devel@redhat.com>, linux-raid


On Sun, 25 Jan 2015 21:06:20 +0100 Heinz Mauelshagen <heinzm@redhat.com>
wrote:

> Hi Neil,
> 
> the reconstruct-write optimization in raid5's fetch_block() causes
> livelocks in LVM raid4/5 tests.
> 
> [...]
> 
> In short, we're seeing livelocks on fully synchronized raid4/5 arrays
> with a failed device.
> 
> This patch fixes the issue, but likely in a suboptimal way.
> 
> Do you think there is a better solution to avoid livelocks on
> reconstruct writes?
> 
> [...]


That is a bit heavy handed, but knowing that it fixes the problem helps a lot.

I think the problem happens when processing a non-overwrite write to a failed
device.

fetch_block() should, in that case, pre-read all of the working devices, but
since

	      (!test_bit(R5_Insync, &dev->flags) || test_bit(STRIPE_PREREAD_ACTIVE, &sh->state)) &&

was added, it sometimes doesn't.  The root problem is that
handle_stripe_dirtying is getting confused because neither rmw nor rcw seems
to work, so it doesn't start the chain of events to set STRIPE_PREREAD_ACTIVE.
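
To make that concrete, here is a simplified sketch of the read-cost
accounting in handle_stripe_dirtying() (paraphrased from the raid5.c of this
era and trimmed for illustration; not a verbatim excerpt):

	int rmw = 0, rcw = 0, i;

	for (i = disks; i--; ) {
		struct r5dev *dev = &sh->dev[i];

		/* Would we need to read this block for read-modify-write? */
		if ((dev->towrite || i == sh->pd_idx) &&
		    !test_bit(R5_LOCKED, &dev->flags) &&
		    !(test_bit(R5_UPTODATE, &dev->flags) ||
		      test_bit(R5_Wantcompute, &dev->flags))) {
			if (test_bit(R5_Insync, &dev->flags))
				rmw++;
			else
				rmw += 2*disks;	/* cannot read it: rmw impossible */
		}
		/* Would we need to read this block for reconstruct-write? */
		if (!test_bit(R5_OVERWRITE, &dev->flags) && i != sh->pd_idx &&
		    !test_bit(R5_LOCKED, &dev->flags) &&
		    !(test_bit(R5_UPTODATE, &dev->flags) ||
		      test_bit(R5_Wantcompute, &dev->flags))) {
			if (test_bit(R5_Insync, &dev->flags))
				rcw++;
			else
				rcw += 2*disks;	/* cannot read it: rcw impossible */
		}
	}

With one failed device (rmw impossible) and a non-overwrite write landing on
it (rcw impossible), both counts end up above 'disks', neither read path is
started, nothing ever requests STRIPE_PREREAD_ACTIVE, and the stripe is
re-queued unchanged forever: the livelock.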

The following (which is against mainline) might fix it.  Can you test?

Thanks,
NeilBrown

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index c1b0d52bfcb0..793cf2861e97 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3195,6 +3195,10 @@ static void handle_stripe_dirtying(struct r5conf *conf,
 					  (unsigned long long)sh->sector,
 					  rcw, qread, test_bit(STRIPE_DELAYED, &sh->state));
 	}
+	if (rcw > disks && rmw > disks &&
+	    !test_bit(STRIPE_PREREAD_ACTIVE, &sh->state))
+		set_bit(STRIPE_DELAYED, &sh->state);
+
 	/* now if nothing is locked, and if we have enough data,
 	 * we can start a write request
 	 */
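
The reason STRIPE_DELAYED is sufficient: a delayed stripe is parked on
conf->delayed_list, and when raid5d later drains that list the stripe is
promoted and gains STRIPE_PREREAD_ACTIVE -- the very bit fetch_block() is
waiting for.  A sketch of that promotion step (again paraphrased from the
raid5.c of this era, not verbatim):

	static void raid5_activate_delayed(struct r5conf *conf)
	{
		if (atomic_read(&conf->preread_active_stripes) < IO_THRESHOLD) {
			while (!list_empty(&conf->delayed_list)) {
				struct stripe_head *sh =
					list_entry(conf->delayed_list.next,
						   struct stripe_head, lru);

				list_del_init(&sh->lru);
				clear_bit(STRIPE_DELAYED, &sh->state);
				/* grant the pre-read slot fetch_block() needs */
				if (!test_and_set_bit(STRIPE_PREREAD_ACTIVE,
						      &sh->state))
					atomic_inc(&conf->preread_active_stripes);
				list_add_tail(&sh->lru, &conf->hold_list);
			}
		}
	}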


This code really really needs to be tidied up and commented better!!!

Thanks,
NeilBrown



* Re: [PATCH] md: fix raid5 livelock
From: Heinz Mauelshagen @ 2015-01-28 12:03 UTC
  To: NeilBrown
  Cc: Marian Csontos, device-mapper development <dm-devel@redhat.com>,
	Mikulas Patocka, Alasdair G Kergon


Neil,

thanks for providing the patch.

Testing with it will take some hours before we can report success.

Regards,
Heinz

On 01/28/2015 03:37 AM, NeilBrown wrote:
> [...]
>
> The following (which is against mainline) might fix it.  Can you test?
>
> [...]


* Re: [dm-devel] [PATCH] md: fix raid5 livelock
From: Heinz Mauelshagen @ 2015-01-29 11:24 UTC
  To: NeilBrown
  Cc: device-mapper development, linux-raid, Brassow Jonathan,
	Alasdair G Kergon, Mikulas Patocka, Marian Csontos


Neil,

the patch worked fine in overnight test runs; the previous livelock no
longer occurs, and no regressions were triggered.

Yes, tidying up that optimization logic (e.g. in fetch_block()) would be
very much appreciated :-)

Thanks,
Heinz

On 01/28/2015 01:03 PM, Heinz Mauelshagen wrote:
> [...]


* Re: [PATCH] md: fix raid5 livelock
From: Jes Sorensen @ 2015-01-29 17:17 UTC
  To: NeilBrown
  Cc: Heinz Mauelshagen, device-mapper development <dm-devel@redhat.com>,
	linux-raid

NeilBrown <neilb@suse.de> writes:
> [...]
>
> The following (which is against mainline) might fix it.  Can you test?
>
> [...]
>
> This code really really needs to be tidied up and commented better!!!

Neil,

Since this one seems to do the trick, will you be pushing it into your
tree anytime soon?

Cheers,
Jes


* Re: [dm-devel] [PATCH] md: fix raid5 livelock
From: NeilBrown @ 2015-02-02  0:06 UTC
  To: Heinz Mauelshagen
  Cc: device-mapper development, linux-raid, Brassow Jonathan,
	Alasdair G Kergon, Mikulas Patocka, Marian Csontos


On Thu, 29 Jan 2015 12:24:00 +0100 Heinz Mauelshagen <heinzm@redhat.com>
wrote:

> 
> Neil,
> 
> the patch worked fine in overnight test runs; the previous livelock no
> longer occurs, and no regressions were triggered.
> 
> Yes, tidying up that optimization logic (e.g. in fetch_block()) would be
> very much appreciated :-)
> 

Thanks!
The following is what should appear in -next soonish.  If there are any *-by:
tags to be added or changed, please let me know.

NeilBrown


From: NeilBrown <neilb@suse.de>
Date: Mon, 2 Feb 2015 10:44:29 +1100
Subject: [PATCH] md/raid5: fix another livelock caused by non-aligned writes.

If a non-page-aligned write is destined for a device which
is missing/faulty, we can deadlock.

As the target device is missing, a read-modify-write cycle
is not possible.
As the write is not for a full page, a reconstruct-write cycle
is not possible.

This should be handled by logic in fetch_block() which notices
there is a non-R5_OVERWRITE write to a missing device, and so
loads all blocks.

However, since commit 67f455486d2ea2, that code requires
STRIPE_PREREAD_ACTIVE before it will act, and those circumstances
never set STRIPE_PREREAD_ACTIVE.

So: in handle_stripe_dirtying, if neither rmw nor rcw was possible,
set STRIPE_DELAYED, which will cause STRIPE_PREREAD_ACTIVE to be set
after a suitable delay.

Fixes: 67f455486d2ea20b2d94d6adf5b9b783d079e321
Cc: stable@vger.kernel.org (v3.16+)
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Tested-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 41494d904859..274db1834d43 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3192,6 +3192,11 @@ static void handle_stripe_dirtying(struct r5conf *conf,
 					  (unsigned long long)sh->sector,
 					  rcw, qread, test_bit(STRIPE_DELAYED, &sh->state));
 	}
+
+	if (rcw > disks && rmw > disks &&
+	    !test_bit(STRIPE_PREREAD_ACTIVE, &sh->state))
+		set_bit(STRIPE_DELAYED, &sh->state);
+
 	/* now if nothing is locked, and if we have enough data,
 	 * we can start a write request
 	 */
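
Putting the pieces together, here is a toy userspace model of the livelock
and of how the new STRIPE_DELAYED path breaks it (illustration only: the
names mirror raid5.c, but this is self-contained demo code, not kernel code):

	#include <stdbool.h>
	#include <stdio.h>

	int main(void)
	{
		int disks = 5;			/* e.g. 4 data devices + parity */
		bool preread_active = false;	/* STRIPE_PREREAD_ACTIVE */
		bool delayed = false;		/* STRIPE_DELAYED */

		for (int pass = 1; pass <= 3; pass++) {
			/* handle_stripe_dirtying(): with a failed device and
			 * a non-overwrite write to it, both strategies are
			 * marked impossible (count pushed above 'disks'). */
			int rmw = 2 * disks;
			int rcw = 2 * disks;

			/* fetch_block() only schedules the pre-reads once
			 * STRIPE_PREREAD_ACTIVE has been granted. */
			printf("pass %d: rmw=%d rcw=%d -> %s\n", pass, rmw, rcw,
			       preread_active ? "pre-reads issued, write proceeds"
					      : "nothing done, stripe re-queued");
			if (preread_active)
				break;

			/* The fix: both strategies impossible, so delay. */
			if (rcw > disks && rmw > disks && !preread_active)
				delayed = true;

			/* raid5d() -> raid5_activate_delayed() then promotes
			 * the stripe and grants STRIPE_PREREAD_ACTIVE.
			 * Without the fix, 'delayed' stays false and every
			 * pass repeats the first line: the livelock. */
			if (delayed) {
				delayed = false;
				preread_active = true;
			}
		}
		return 0;
	}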

