* Bad raid0 bio too large problem
@ 2015-09-22 15:30 Jes Sorensen
2015-09-22 16:07 ` Jes Sorensen
2015-09-23 2:25 ` Neil Brown
0 siblings, 2 replies; 10+ messages in thread
From: Jes Sorensen @ 2015-09-22 15:30 UTC (permalink / raw)
To: NeilBrown; +Cc: Xiao Ni, linux-raid, yizhan
Hi Neil,
I think we have some bad side effects with this patch:
commit 199dc6ed5179251fa6158a461499c24bdd99c836
Author: NeilBrown <neilb@suse.com>
Date: Mon Aug 3 13:11:47 2015 +1000
md/raid0: update queue parameter in a safer location.
When a (e.g.) RAID5 array is reshaped to RAID0, the updating
of queue parameters (e.g. max number of sectors per bio) is
done in the wrong place.
It should be part of ->run, but it is actually part of ->takeover.
This means it happens before level_store() calls:
blk_set_stacking_limits(&mddev->queue->limits);
Running the '03r0assem' test suite fills my kernel log with output like
below. Yi Zhang also had issues where writes failed too.
robably something we need to resolve for 4.2-final or revert the
offending patch.
Cheers,
Jes
md: bind<loop0>
md: bind<loop1>
md: bind<loop2>
md/raid0:md2: md_size is 116736 sectors.
md: RAID0 configuration for md2 - 1 zone
md: zone0=[loop0/loop1/loop2]
zone-offset= 0KB, device-offset= 0KB, size= 58368KB
md2: detected capacity change from 0 to 59768832
bio too big device loop0 (296 > 255)
bio too big device loop0 (272 > 255)
bio too big device loop1 (672 > 255)
bio too big device loop1 (352 > 255)
bio too big device loop2 (1024 > 255)
bio too big device loop0 (672 > 255)
bio too big device loop1 (256 > 255)
bio too big device loop2 (288 > 255)
bio too big device loop2 (736 > 255)
bio too big device loop0 (288 > 255)
bio too big device loop2 (256 > 255)
bio too big device loop2 (368 > 255)
bio too big device loop0 (488 > 255)
bio too big device loop0 (360 > 255)
bio too big device loop0 (256 > 255)
bio too big device loop1 (288 > 255)
bio too big device loop1 (736 > 255)
bio too big device loop2 (288 > 255)
bio too big device loop1 (256 > 255)
bio too big device loop1 (512 > 255)
bio too big device loop2 (856 > 255)
bio too big device loop2 (256 > 255)
bio too big device loop0 (288 > 255)
bio too big device loop0 (736 > 255)
bio too big device loop1 (288 > 255)
bio too big device loop0 (256 > 255)
bio too big device loop0 (512 > 255)
bio too big device loop1 (856 > 255)
bio too big device loop1 (256 > 255)
bio too big device loop2 (288 > 255)
bio too big device loop2 (736 > 255)
md2: detected capacity change from 59768832 to 0
md: md2 stopped.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bad raid0 bio too large problem
2015-09-22 15:30 Bad raid0 bio too large problem Jes Sorensen
@ 2015-09-22 16:07 ` Jes Sorensen
2015-09-23 2:25 ` Neil Brown
1 sibling, 0 replies; 10+ messages in thread
From: Jes Sorensen @ 2015-09-22 16:07 UTC (permalink / raw)
To: NeilBrown; +Cc: Xiao Ni, linux-raid, yizhan
Jes Sorensen <Jes.Sorensen@redhat.com> writes:
> Hi Neil,
>
> I think we have some bad side effects with this patch:
>
> commit 199dc6ed5179251fa6158a461499c24bdd99c836
> Author: NeilBrown <neilb@suse.com>
> Date: Mon Aug 3 13:11:47 2015 +1000
>
> md/raid0: update queue parameter in a safer location.
>
> When a (e.g.) RAID5 array is reshaped to RAID0, the updating
> of queue parameters (e.g. max number of sectors per bio) is
> done in the wrong place.
> It should be part of ->run, but it is actually part of ->takeover.
> This means it happens before level_store() calls:
>
> blk_set_stacking_limits(&mddev->queue->limits);
>
> Running the '03r0assem' test suite fills my kernel log with output like
> below. Yi Zhang also had issues where writes failed too.
>
> robably something we need to resolve for 4.2-final or revert the
> offending patch.
Obviously I meant 4.3 here :)
Jes
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bad raid0 bio too large problem
2015-09-22 15:30 Bad raid0 bio too large problem Jes Sorensen
2015-09-22 16:07 ` Jes Sorensen
@ 2015-09-23 2:25 ` Neil Brown
2015-09-23 11:18 ` Jes Sorensen
1 sibling, 1 reply; 10+ messages in thread
From: Neil Brown @ 2015-09-23 2:25 UTC (permalink / raw)
To: Jes Sorensen; +Cc: Xiao Ni, linux-raid, yizhan
[-- Attachment #1: Type: text/plain, Size: 1727 bytes --]
Jes Sorensen <Jes.Sorensen@redhat.com> writes:
> Hi Neil,
>
> I think we have some bad side effects with this patch:
>
> commit 199dc6ed5179251fa6158a461499c24bdd99c836
> Author: NeilBrown <neilb@suse.com>
> Date: Mon Aug 3 13:11:47 2015 +1000
>
> md/raid0: update queue parameter in a safer location.
>
> When a (e.g.) RAID5 array is reshaped to RAID0, the updating
> of queue parameters (e.g. max number of sectors per bio) is
> done in the wrong place.
> It should be part of ->run, but it is actually part of ->takeover.
> This means it happens before level_store() calls:
>
> blk_set_stacking_limits(&mddev->queue->limits);
>
> Running the '03r0assem' test suite fills my kernel log with output like
> below. Yi Zhang also had issues where writes failed too.
>
> robably something we need to resolve for 4.2-final or revert the
> offending patch.
>
> Cheers,
> Jes
>
> md: bind<loop0>
> md: bind<loop1>
> md: bind<loop2>
> md/raid0:md2: md_size is 116736 sectors.
> md: RAID0 configuration for md2 - 1 zone
> md: zone0=[loop0/loop1/loop2]
> zone-offset= 0KB, device-offset= 0KB, size= 58368KB
>
> md2: detected capacity change from 0 to 59768832
> bio too big device loop0 (296 > 255)
> bio too big device loop0 (272 > 255)
1/ Why do you blame that particular patch?
2/ Where is that error message coming from? I cannot find "bio too big"
in the kernel (except in a comment).
Commit: 54efd50bfd87 ("block: make generic_make_request handle arbitrarily sized bios")
removed the only instance of the error message that I know of.
Which kernel exactly are you testing?
Thanks,
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 818 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bad raid0 bio too large problem
2015-09-23 2:25 ` Neil Brown
@ 2015-09-23 11:18 ` Jes Sorensen
2015-09-23 11:20 ` Jes Sorensen
2015-09-24 2:53 ` Neil Brown
0 siblings, 2 replies; 10+ messages in thread
From: Jes Sorensen @ 2015-09-23 11:18 UTC (permalink / raw)
To: Neil Brown; +Cc: Xiao Ni, linux-raid, yizhan
Neil Brown <neilb@suse.de> writes:
> Jes Sorensen <Jes.Sorensen@redhat.com> writes:
>
>> Hi Neil,
>>
>> I think we have some bad side effects with this patch:
>>
>> commit 199dc6ed5179251fa6158a461499c24bdd99c836
>> Author: NeilBrown <neilb@suse.com>
>> Date: Mon Aug 3 13:11:47 2015 +1000
>>
>> md/raid0: update queue parameter in a safer location.
>>
>> When a (e.g.) RAID5 array is reshaped to RAID0, the updating
>> of queue parameters (e.g. max number of sectors per bio) is
>> done in the wrong place.
>> It should be part of ->run, but it is actually part of ->takeover.
>> This means it happens before level_store() calls:
>>
>> blk_set_stacking_limits(&mddev->queue->limits);
>>
>> Running the '03r0assem' test suite fills my kernel log with output like
>> below. Yi Zhang also had issues where writes failed too.
>>
>> robably something we need to resolve for 4.2-final or revert the
>> offending patch.
>>
>> Cheers,
>> Jes
>>
>> md: bind<loop0>
>> md: bind<loop1>
>> md: bind<loop2>
>> md/raid0:md2: md_size is 116736 sectors.
>> md: RAID0 configuration for md2 - 1 zone
>> md: zone0=[loop0/loop1/loop2]
>> zone-offset= 0KB, device-offset= 0KB, size= 58368KB
>>
>> md2: detected capacity change from 0 to 59768832
>> bio too big device loop0 (296 > 255)
>> bio too big device loop0 (272 > 255)
>
> 1/ Why do you blame that particular patch?
>
> 2/ Where is that error message coming from? I cannot find "bio too big"
> in the kernel (except in a comment).
> Commit: 54efd50bfd87 ("block: make generic_make_request handle
> arbitrarily sized bios")
> removed the only instance of the error message that I know of.
>
> Which kernel exactly are you testing?
I blame it because of bisect - I revert that patch and the issue goes
away.
I checked out 199dc6ed5179251fa6158a461499c24bdd99c836 in Linus' tree,
see the bio too large. I revert it and it goes away.
Cheers,
Jes
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bad raid0 bio too large problem
2015-09-23 11:18 ` Jes Sorensen
@ 2015-09-23 11:20 ` Jes Sorensen
2015-09-23 11:42 ` Jes Sorensen
2015-09-24 2:53 ` Neil Brown
1 sibling, 1 reply; 10+ messages in thread
From: Jes Sorensen @ 2015-09-23 11:20 UTC (permalink / raw)
To: Neil Brown; +Cc: Xiao Ni, linux-raid, yizhan
Jes Sorensen <Jes.Sorensen@redhat.com> writes:
> Neil Brown <neilb@suse.de> writes:
>> Jes Sorensen <Jes.Sorensen@redhat.com> writes:
>>
>>> Hi Neil,
>>>
>>> I think we have some bad side effects with this patch:
>>>
>>> commit 199dc6ed5179251fa6158a461499c24bdd99c836
>>> Author: NeilBrown <neilb@suse.com>
>>> Date: Mon Aug 3 13:11:47 2015 +1000
>>>
>>> md/raid0: update queue parameter in a safer location.
>>>
>>> When a (e.g.) RAID5 array is reshaped to RAID0, the updating
>>> of queue parameters (e.g. max number of sectors per bio) is
>>> done in the wrong place.
>>> It should be part of ->run, but it is actually part of ->takeover.
>>> This means it happens before level_store() calls:
>>>
>>> blk_set_stacking_limits(&mddev->queue->limits);
>>>
>>> Running the '03r0assem' test suite fills my kernel log with output like
>>> below. Yi Zhang also had issues where writes failed too.
>>>
>>> robably something we need to resolve for 4.2-final or revert the
>>> offending patch.
>>>
>>> Cheers,
>>> Jes
>>>
>>> md: bind<loop0>
>>> md: bind<loop1>
>>> md: bind<loop2>
>>> md/raid0:md2: md_size is 116736 sectors.
>>> md: RAID0 configuration for md2 - 1 zone
>>> md: zone0=[loop0/loop1/loop2]
>>> zone-offset= 0KB, device-offset= 0KB, size= 58368KB
>>>
>>> md2: detected capacity change from 0 to 59768832
>>> bio too big device loop0 (296 > 255)
>>> bio too big device loop0 (272 > 255)
>>
>> 1/ Why do you blame that particular patch?
>>
>> 2/ Where is that error message coming from? I cannot find "bio too big"
>> in the kernel (except in a comment).
>> Commit: 54efd50bfd87 ("block: make generic_make_request handle
>> arbitrarily sized bios")
>> removed the only instance of the error message that I know of.
>>
>> Which kernel exactly are you testing?
>
> I blame it because of bisect - I revert that patch and the issue goes
> away.
>
> I checked out 199dc6ed5179251fa6158a461499c24bdd99c836 in Linus' tree,
> see the bio too large. I revert it and it goes away.
Hmmm Xiao tells me that the warning message is not in upstream, but I
was sure I reproduced it in the upstream kernel as well. If I screwed
up, my apologies, I am going back to my cave and will investigate
further.
Jes
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bad raid0 bio too large problem
2015-09-23 11:20 ` Jes Sorensen
@ 2015-09-23 11:42 ` Jes Sorensen
0 siblings, 0 replies; 10+ messages in thread
From: Jes Sorensen @ 2015-09-23 11:42 UTC (permalink / raw)
To: Neil Brown; +Cc: Xiao Ni, linux-raid, yizhan
Jes Sorensen <Jes.Sorensen@redhat.com> writes:
> Jes Sorensen <Jes.Sorensen@redhat.com> writes:
>> Neil Brown <neilb@suse.de> writes:
>>> Jes Sorensen <Jes.Sorensen@redhat.com> writes:
>>>
>>>> Hi Neil,
>>>>
>>>> I think we have some bad side effects with this patch:
>>>>
>>>> commit 199dc6ed5179251fa6158a461499c24bdd99c836
>>>> Author: NeilBrown <neilb@suse.com>
>>>> Date: Mon Aug 3 13:11:47 2015 +1000
>>>>
>>>> md/raid0: update queue parameter in a safer location.
>>>>
>>>> When a (e.g.) RAID5 array is reshaped to RAID0, the updating
>>>> of queue parameters (e.g. max number of sectors per bio) is
>>>> done in the wrong place.
>>>> It should be part of ->run, but it is actually part of ->takeover.
>>>> This means it happens before level_store() calls:
>>>>
>>>> blk_set_stacking_limits(&mddev->queue->limits);
>>>>
>>>> Running the '03r0assem' test suite fills my kernel log with output like
>>>> below. Yi Zhang also had issues where writes failed too.
>>>>
>>>> robably something we need to resolve for 4.2-final or revert the
>>>> offending patch.
>>>>
>>>> Cheers,
>>>> Jes
>>>>
>>>> md: bind<loop0>
>>>> md: bind<loop1>
>>>> md: bind<loop2>
>>>> md/raid0:md2: md_size is 116736 sectors.
>>>> md: RAID0 configuration for md2 - 1 zone
>>>> md: zone0=[loop0/loop1/loop2]
>>>> zone-offset= 0KB, device-offset= 0KB, size= 58368KB
>>>>
>>>> md2: detected capacity change from 0 to 59768832
>>>> bio too big device loop0 (296 > 255)
>>>> bio too big device loop0 (272 > 255)
>>>
>>> 1/ Why do you blame that particular patch?
>>>
>>> 2/ Where is that error message coming from? I cannot find "bio too big"
>>> in the kernel (except in a comment).
>>> Commit: 54efd50bfd87 ("block: make generic_make_request handle
>>> arbitrarily sized bios")
>>> removed the only instance of the error message that I know of.
>>>
>>> Which kernel exactly are you testing?
>>
>> I blame it because of bisect - I revert that patch and the issue goes
>> away.
>>
>> I checked out 199dc6ed5179251fa6158a461499c24bdd99c836 in Linus' tree,
>> see the bio too large. I revert it and it goes away.
>
> Hmmm Xiao tells me that the warning message is not in upstream, but I
> was sure I reproduced it in the upstream kernel as well. If I screwed
> up, my apologies, I am going back to my cave and will investigate
> further.
Digging a bit further, I was testing on a 4.2.0-rc5+ where I saw the
issue. That does raise the question of what this patch does to older
kernels, since it was submitted it for 2.6.35+.
"bio too big" message went away with this commit:
commit 54efd50bfd873e2dbf784e0b21a8027ba4299a3e
Author: Kent Overstreet <kent.overstreet@gmail.com>
Date: Thu Apr 23 22:37:18 2015 -0700
block: make generic_make_request handle arbitrarily sized bios
Cheers,
Jes
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bad raid0 bio too large problem
2015-09-23 11:18 ` Jes Sorensen
2015-09-23 11:20 ` Jes Sorensen
@ 2015-09-24 2:53 ` Neil Brown
2015-09-24 8:48 ` Xiao Ni
2015-09-24 12:59 ` Jes Sorensen
1 sibling, 2 replies; 10+ messages in thread
From: Neil Brown @ 2015-09-24 2:53 UTC (permalink / raw)
To: Jes Sorensen; +Cc: Xiao Ni, linux-raid, yizhan
[-- Attachment #1: Type: text/plain, Size: 3671 bytes --]
Jes Sorensen <Jes.Sorensen@redhat.com> writes:
> Neil Brown <neilb@suse.de> writes:
>> Jes Sorensen <Jes.Sorensen@redhat.com> writes:
>>
>>> Hi Neil,
>>>
>>> I think we have some bad side effects with this patch:
>>>
>>> commit 199dc6ed5179251fa6158a461499c24bdd99c836
>>> Author: NeilBrown <neilb@suse.com>
>>> Date: Mon Aug 3 13:11:47 2015 +1000
>>>
>>> md/raid0: update queue parameter in a safer location.
>>>
>>> When a (e.g.) RAID5 array is reshaped to RAID0, the updating
>>> of queue parameters (e.g. max number of sectors per bio) is
>>> done in the wrong place.
>>> It should be part of ->run, but it is actually part of ->takeover.
>>> This means it happens before level_store() calls:
>>>
>>> blk_set_stacking_limits(&mddev->queue->limits);
>>>
>>> Running the '03r0assem' test suite fills my kernel log with output like
>>> below. Yi Zhang also had issues where writes failed too.
>>>
>>> robably something we need to resolve for 4.2-final or revert the
>>> offending patch.
>>>
>>> Cheers,
>>> Jes
>>>
>>> md: bind<loop0>
>>> md: bind<loop1>
>>> md: bind<loop2>
>>> md/raid0:md2: md_size is 116736 sectors.
>>> md: RAID0 configuration for md2 - 1 zone
>>> md: zone0=[loop0/loop1/loop2]
>>> zone-offset= 0KB, device-offset= 0KB, size= 58368KB
>>>
>>> md2: detected capacity change from 0 to 59768832
>>> bio too big device loop0 (296 > 255)
>>> bio too big device loop0 (272 > 255)
>>
>> 1/ Why do you blame that particular patch?
>>
>> 2/ Where is that error message coming from? I cannot find "bio too big"
>> in the kernel (except in a comment).
>> Commit: 54efd50bfd87 ("block: make generic_make_request handle
>> arbitrarily sized bios")
>> removed the only instance of the error message that I know of.
>>
>> Which kernel exactly are you testing?
>
> I blame it because of bisect - I revert that patch and the issue goes
> away.
>
> I checked out 199dc6ed5179251fa6158a461499c24bdd99c836 in Linus' tree,
> see the bio too large. I revert it and it goes away.
Well that's pretty convincing - thanks.
And as you say - it is tagged for -stable so really needs to be fixed.
Stares at the code again. And again.
Ahhh. that patch moved the
blk_queue_max_hw_sectors(mddev->queue, mddev->chunk_sectors);
to after
disk_stack_limits(...);
That is wrong.
Could you confirm that this fixes your test?
Thanks,
NeilBrown
diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index 4a13c3cb940b..0875e5e7e09a 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -431,12 +431,6 @@ static int raid0_run(struct mddev *mddev)
struct md_rdev *rdev;
bool discard_supported = false;
- rdev_for_each(rdev, mddev) {
- disk_stack_limits(mddev->gendisk, rdev->bdev,
- rdev->data_offset << 9);
- if (blk_queue_discard(bdev_get_queue(rdev->bdev)))
- discard_supported = true;
- }
blk_queue_max_hw_sectors(mddev->queue, mddev->chunk_sectors);
blk_queue_max_write_same_sectors(mddev->queue, mddev->chunk_sectors);
blk_queue_max_discard_sectors(mddev->queue, mddev->chunk_sectors);
@@ -445,6 +439,12 @@ static int raid0_run(struct mddev *mddev)
blk_queue_io_opt(mddev->queue,
(mddev->chunk_sectors << 9) * mddev->raid_disks);
+ rdev_for_each(rdev, mddev) {
+ disk_stack_limits(mddev->gendisk, rdev->bdev,
+ rdev->data_offset << 9);
+ if (blk_queue_discard(bdev_get_queue(rdev->bdev)))
+ discard_supported = true;
+ }
if (!discard_supported)
queue_flag_clear_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
else
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 818 bytes --]
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: Bad raid0 bio too large problem
2015-09-24 2:53 ` Neil Brown
@ 2015-09-24 8:48 ` Xiao Ni
2015-09-24 12:59 ` Jes Sorensen
1 sibling, 0 replies; 10+ messages in thread
From: Xiao Ni @ 2015-09-24 8:48 UTC (permalink / raw)
To: Neil Brown; +Cc: Jes Sorensen, linux-raid, yizhan
----- Original Message -----
> From: "Neil Brown" <neilb@suse.de>
> To: "Jes Sorensen" <Jes.Sorensen@redhat.com>
> Cc: "Xiao Ni" <xni@redhat.com>, "linux-raid" <linux-raid@vger.kernel.org>, yizhan@redhat.com
> Sent: Thursday, September 24, 2015 10:53:06 AM
> Subject: Re: Bad raid0 bio too large problem
>
> Jes Sorensen <Jes.Sorensen@redhat.com> writes:
>
> > Neil Brown <neilb@suse.de> writes:
> >> Jes Sorensen <Jes.Sorensen@redhat.com> writes:
> >>
> >>> Hi Neil,
> >>>
> >>> I think we have some bad side effects with this patch:
> >>>
> >>> commit 199dc6ed5179251fa6158a461499c24bdd99c836
> >>> Author: NeilBrown <neilb@suse.com>
> >>> Date: Mon Aug 3 13:11:47 2015 +1000
> >>>
> >>> md/raid0: update queue parameter in a safer location.
> >>>
> >>> When a (e.g.) RAID5 array is reshaped to RAID0, the updating
> >>> of queue parameters (e.g. max number of sectors per bio) is
> >>> done in the wrong place.
> >>> It should be part of ->run, but it is actually part of ->takeover.
> >>> This means it happens before level_store() calls:
> >>>
> >>> blk_set_stacking_limits(&mddev->queue->limits);
> >>>
> >>> Running the '03r0assem' test suite fills my kernel log with output like
> >>> below. Yi Zhang also had issues where writes failed too.
> >>>
> >>> robably something we need to resolve for 4.2-final or revert the
> >>> offending patch.
> >>>
> >>> Cheers,
> >>> Jes
> >>>
> >>> md: bind<loop0>
> >>> md: bind<loop1>
> >>> md: bind<loop2>
> >>> md/raid0:md2: md_size is 116736 sectors.
> >>> md: RAID0 configuration for md2 - 1 zone
> >>> md: zone0=[loop0/loop1/loop2]
> >>> zone-offset= 0KB, device-offset= 0KB, size=
> >>> 58368KB
> >>>
> >>> md2: detected capacity change from 0 to 59768832
> >>> bio too big device loop0 (296 > 255)
> >>> bio too big device loop0 (272 > 255)
> >>
> >> 1/ Why do you blame that particular patch?
> >>
> >> 2/ Where is that error message coming from? I cannot find "bio too big"
> >> in the kernel (except in a comment).
> >> Commit: 54efd50bfd87 ("block: make generic_make_request handle
> >> arbitrarily sized bios")
> >> removed the only instance of the error message that I know of.
> >>
> >> Which kernel exactly are you testing?
> >
> > I blame it because of bisect - I revert that patch and the issue goes
> > away.
> >
> > I checked out 199dc6ed5179251fa6158a461499c24bdd99c836 in Linus' tree,
> > see the bio too large. I revert it and it goes away.
>
> Well that's pretty convincing - thanks.
> And as you say - it is tagged for -stable so really needs to be fixed.
>
> Stares at the code again. And again.
>
> Ahhh. that patch moved the
> blk_queue_max_hw_sectors(mddev->queue, mddev->chunk_sectors);
> to after
> disk_stack_limits(...);
>
> That is wrong.
>
> Could you confirm that this fixes your test?
>
> Thanks,
> NeilBrown
>
> diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
> index 4a13c3cb940b..0875e5e7e09a 100644
> --- a/drivers/md/raid0.c
> +++ b/drivers/md/raid0.c
> @@ -431,12 +431,6 @@ static int raid0_run(struct mddev *mddev)
> struct md_rdev *rdev;
> bool discard_supported = false;
>
> - rdev_for_each(rdev, mddev) {
> - disk_stack_limits(mddev->gendisk, rdev->bdev,
> - rdev->data_offset << 9);
> - if (blk_queue_discard(bdev_get_queue(rdev->bdev)))
> - discard_supported = true;
> - }
> blk_queue_max_hw_sectors(mddev->queue, mddev->chunk_sectors);
> blk_queue_max_write_same_sectors(mddev->queue, mddev->chunk_sectors);
> blk_queue_max_discard_sectors(mddev->queue, mddev->chunk_sectors);
> @@ -445,6 +439,12 @@ static int raid0_run(struct mddev *mddev)
> blk_queue_io_opt(mddev->queue,
> (mddev->chunk_sectors << 9) * mddev->raid_disks);
>
> + rdev_for_each(rdev, mddev) {
> + disk_stack_limits(mddev->gendisk, rdev->bdev,
> + rdev->data_offset << 9);
> + if (blk_queue_discard(bdev_get_queue(rdev->bdev)))
> + discard_supported = true;
> + }
> if (!discard_supported)
> queue_flag_clear_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
> else
>
Hi Neil, Jes
The problem is fixed with this patch.
Best Regards
Xiao
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bad raid0 bio too large problem
2015-09-24 2:53 ` Neil Brown
2015-09-24 8:48 ` Xiao Ni
@ 2015-09-24 12:59 ` Jes Sorensen
2015-09-25 4:23 ` Neil Brown
1 sibling, 1 reply; 10+ messages in thread
From: Jes Sorensen @ 2015-09-24 12:59 UTC (permalink / raw)
To: Neil Brown; +Cc: Xiao Ni, linux-raid, yizhan
Neil Brown <neilb@suse.de> writes:
> Jes Sorensen <Jes.Sorensen@redhat.com> writes:
>
>> Neil Brown <neilb@suse.de> writes:
>>> Jes Sorensen <Jes.Sorensen@redhat.com> writes:
>>>
>>>> Hi Neil,
>>>>
>>>> I think we have some bad side effects with this patch:
>>>>
>>>> commit 199dc6ed5179251fa6158a461499c24bdd99c836
>>>> Author: NeilBrown <neilb@suse.com>
>>>> Date: Mon Aug 3 13:11:47 2015 +1000
>>>>
>>>> md/raid0: update queue parameter in a safer location.
>>>>
>>>> When a (e.g.) RAID5 array is reshaped to RAID0, the updating
>>>> of queue parameters (e.g. max number of sectors per bio) is
>>>> done in the wrong place.
>>>> It should be part of ->run, but it is actually part of ->takeover.
>>>> This means it happens before level_store() calls:
>>>>
>>>> blk_set_stacking_limits(&mddev->queue->limits);
>>>>
>>>> Running the '03r0assem' test suite fills my kernel log with output like
>>>> below. Yi Zhang also had issues where writes failed too.
>>>>
>>>> robably something we need to resolve for 4.2-final or revert the
>>>> offending patch.
>>>>
>>>> Cheers,
>>>> Jes
>>>>
>>>> md: bind<loop0>
>>>> md: bind<loop1>
>>>> md: bind<loop2>
>>>> md/raid0:md2: md_size is 116736 sectors.
>>>> md: RAID0 configuration for md2 - 1 zone
>>>> md: zone0=[loop0/loop1/loop2]
>>>> zone-offset= 0KB, device-offset= 0KB, size= 58368KB
>>>>
>>>> md2: detected capacity change from 0 to 59768832
>>>> bio too big device loop0 (296 > 255)
>>>> bio too big device loop0 (272 > 255)
>>>
>>> 1/ Why do you blame that particular patch?
>>>
>>> 2/ Where is that error message coming from? I cannot find "bio too big"
>>> in the kernel (except in a comment).
>>> Commit: 54efd50bfd87 ("block: make generic_make_request handle
>>> arbitrarily sized bios")
>>> removed the only instance of the error message that I know of.
>>>
>>> Which kernel exactly are you testing?
>>
>> I blame it because of bisect - I revert that patch and the issue goes
>> away.
>>
>> I checked out 199dc6ed5179251fa6158a461499c24bdd99c836 in Linus' tree,
>> see the bio too large. I revert it and it goes away.
>
> Well that's pretty convincing - thanks.
> And as you say - it is tagged for -stable so really needs to be fixed.
>
> Stares at the code again. And again.
>
> Ahhh. that patch moved the
> blk_queue_max_hw_sectors(mddev->queue, mddev->chunk_sectors);
> to after
> disk_stack_limits(...);
>
> That is wrong.
>
> Could you confirm that this fixes your test?
I refuse to do that! .... since Xiao beat me to it! Thanks!
I was half way bisecting my way through it last night. For some reason
the problem was reproducible in 4.2 if I applied the offending patch,
but not in 4.3-rc2.
Any chance you'll push these to your git tree in the near future?
Thanks!
Jes
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bad raid0 bio too large problem
2015-09-24 12:59 ` Jes Sorensen
@ 2015-09-25 4:23 ` Neil Brown
0 siblings, 0 replies; 10+ messages in thread
From: Neil Brown @ 2015-09-25 4:23 UTC (permalink / raw)
To: Jes Sorensen; +Cc: Xiao Ni, linux-raid, yizhan
[-- Attachment #1: Type: text/plain, Size: 3060 bytes --]
Jes Sorensen <Jes.Sorensen@redhat.com> writes:
> Neil Brown <neilb@suse.de> writes:
>> Jes Sorensen <Jes.Sorensen@redhat.com> writes:
>>
>>> Neil Brown <neilb@suse.de> writes:
>>>> Jes Sorensen <Jes.Sorensen@redhat.com> writes:
>>>>
>>>>> Hi Neil,
>>>>>
>>>>> I think we have some bad side effects with this patch:
>>>>>
>>>>> commit 199dc6ed5179251fa6158a461499c24bdd99c836
>>>>> Author: NeilBrown <neilb@suse.com>
>>>>> Date: Mon Aug 3 13:11:47 2015 +1000
>>>>>
>>>>> md/raid0: update queue parameter in a safer location.
>>>>>
>>>>> When a (e.g.) RAID5 array is reshaped to RAID0, the updating
>>>>> of queue parameters (e.g. max number of sectors per bio) is
>>>>> done in the wrong place.
>>>>> It should be part of ->run, but it is actually part of ->takeover.
>>>>> This means it happens before level_store() calls:
>>>>>
>>>>> blk_set_stacking_limits(&mddev->queue->limits);
>>>>>
>>>>> Running the '03r0assem' test suite fills my kernel log with output like
>>>>> below. Yi Zhang also had issues where writes failed too.
>>>>>
>>>>> robably something we need to resolve for 4.2-final or revert the
>>>>> offending patch.
>>>>>
>>>>> Cheers,
>>>>> Jes
>>>>>
>>>>> md: bind<loop0>
>>>>> md: bind<loop1>
>>>>> md: bind<loop2>
>>>>> md/raid0:md2: md_size is 116736 sectors.
>>>>> md: RAID0 configuration for md2 - 1 zone
>>>>> md: zone0=[loop0/loop1/loop2]
>>>>> zone-offset= 0KB, device-offset= 0KB, size= 58368KB
>>>>>
>>>>> md2: detected capacity change from 0 to 59768832
>>>>> bio too big device loop0 (296 > 255)
>>>>> bio too big device loop0 (272 > 255)
>>>>
>>>> 1/ Why do you blame that particular patch?
>>>>
>>>> 2/ Where is that error message coming from? I cannot find "bio too big"
>>>> in the kernel (except in a comment).
>>>> Commit: 54efd50bfd87 ("block: make generic_make_request handle
>>>> arbitrarily sized bios")
>>>> removed the only instance of the error message that I know of.
>>>>
>>>> Which kernel exactly are you testing?
>>>
>>> I blame it because of bisect - I revert that patch and the issue goes
>>> away.
>>>
>>> I checked out 199dc6ed5179251fa6158a461499c24bdd99c836 in Linus' tree,
>>> see the bio too large. I revert it and it goes away.
>>
>> Well that's pretty convincing - thanks.
>> And as you say - it is tagged for -stable so really needs to be fixed.
>>
>> Stares at the code again. And again.
>>
>> Ahhh. that patch moved the
>> blk_queue_max_hw_sectors(mddev->queue, mddev->chunk_sectors);
>> to after
>> disk_stack_limits(...);
>>
>> That is wrong.
>>
>> Could you confirm that this fixes your test?
>
> I refuse to do that! .... since Xiao beat me to it! Thanks!
>
> I was half way bisecting my way through it last night. For some reason
> the problem was reproducible in 4.2 if I applied the offending patch,
> but not in 4.3-rc2.
>
> Any chance you'll push these to your git tree in the near future?
Pushed.
Thanks,
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 818 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2015-09-25 4:23 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-22 15:30 Bad raid0 bio too large problem Jes Sorensen
2015-09-22 16:07 ` Jes Sorensen
2015-09-23 2:25 ` Neil Brown
2015-09-23 11:18 ` Jes Sorensen
2015-09-23 11:20 ` Jes Sorensen
2015-09-23 11:42 ` Jes Sorensen
2015-09-24 2:53 ` Neil Brown
2015-09-24 8:48 ` Xiao Ni
2015-09-24 12:59 ` Jes Sorensen
2015-09-25 4:23 ` Neil Brown
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.