linux-kernel.vger.kernel.org archive mirror
* [PATCH] md/raid5: limit request size according to implementation limits
@ 2016-11-27 16:32 Konstantin Khlebnikov
  2016-11-28  4:40 ` Coly Li
  2016-11-29 23:52 ` Shaohua Li
  0 siblings, 2 replies; 4+ messages in thread
From: Konstantin Khlebnikov @ 2016-11-27 16:32 UTC (permalink / raw)
  To: Shaohua Li, Neil Brown; +Cc: linux-raid, linux-kernel, stable

The current implementation keeps a 16-bit counter of active stripes in
the lower bits of bio->bi_phys_segments. If a request is big enough to
overflow this counter, the bio will be completed and freed too early.

Fortunately this does not happen in the default configuration because
several other limits prevent it: stripe_cache_size * nr_disks
effectively limits the count of active stripes, and a small
max_sectors_kb on the lower disks prevents it during normal read/write
operations.

Overflow easily happens in discard if it is enabled by the module
parameter "devices_handle_discard_safely" and stripe_cache_size is set
big enough.

This patch limits the request size to 256 MiB - 8 KiB to prevent
overflows.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Shaohua Li <shli@kernel.org>
Cc: Neil Brown <neilb@suse.com>
Cc: stable@vger.kernel.org
---
 drivers/md/raid5.c |    9 +++++++++
 1 file changed, 9 insertions(+)
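
As an illustration of the overflow described above, here is a minimal
sketch of a 16-bit count kept in the low bits of a 32-bit word. It is a
simplified model, not the raid5.c code: the names ACTIVE_STRIPES_MASK and
active_after_attach() are hypothetical, and the atomic operations the
kernel would need are left out.

```c
#include <stdint.h>

/* Hypothetical model: the low 16 bits of the word count active stripes. */
#define ACTIVE_STRIPES_MASK 0xffffu

/*
 * Return the active-stripe count as seen through the 16-bit mask after
 * attaching nr_stripes stripes to a bio whose counter starts at 1
 * (the initial reference held while the request is being split up).
 */
uint32_t active_after_attach(uint32_t nr_stripes)
{
	uint32_t word = 1;		/* initial reference */

	word += nr_stripes;		/* one increment per attached stripe */
	return word & ACTIVE_STRIPES_MASK;
}
```

With 0xfffe stripes the masked count is 0xffff, the largest value that
fits. With 0xffff stripes it reads 0, and with 0x10000 stripes it reads 1,
so dropping the initial reference would make the bio look finished while
every stripe is still outstanding.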

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 92ac251e91e6..cce6057b9aca 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -6984,6 +6984,15 @@ static int raid5_run(struct mddev *mddev)
 			stripe = (stripe | (stripe-1)) + 1;
 		mddev->queue->limits.discard_alignment = stripe;
 		mddev->queue->limits.discard_granularity = stripe;
+
+		/*
+		 * We use 16-bit counter of active stripes in bi_phys_segments
+		 * (minus one for over-loaded initialization)
+		 */
+		blk_queue_max_hw_sectors(mddev->queue, 0xfffe * STRIPE_SECTORS);
+		blk_queue_max_discard_sectors(mddev->queue,
+					      0xfffe * STRIPE_SECTORS);
+
 		/*
 		 * unaligned part of discard request will be ignored, so can't
 		 * guarantee discard_zeroes_data


* Re: [PATCH] md/raid5: limit request size according to implementation limits
From: Coly Li @ 2016-11-28  4:40 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Shaohua Li, Neil Brown, linux-raid, linux-kernel, stable

On 2016/11/28 12:32 AM, Konstantin Khlebnikov wrote:
> @@ -6984,6 +6984,15 @@ static int raid5_run(struct mddev *mddev)
>  			stripe = (stripe | (stripe-1)) + 1;
>  		mddev->queue->limits.discard_alignment = stripe;
>  		mddev->queue->limits.discard_granularity = stripe;
> +
> +		/*
> +		 * We use 16-bit counter of active stripes in bi_phys_segments
> +		 * (minus one for over-loaded initialization)
> +		 */
> +		blk_queue_max_hw_sectors(mddev->queue, 0xfffe * STRIPE_SECTORS);
> +		blk_queue_max_discard_sectors(mddev->queue,
> +					      0xfffe * STRIPE_SECTORS);
> +

Could you please explain why 0xfffe * STRIPE_SECTORS is used here?

Thanks.

Coly


* Re: [PATCH] md/raid5: limit request size according to implementation limits
From: Konstantin Khlebnikov @ 2016-11-28  6:06 UTC (permalink / raw)
  To: Coly Li
  Cc: Konstantin Khlebnikov, Shaohua Li, Neil Brown, linux-raid,
	Linux Kernel Mailing List, Stable

On Mon, Nov 28, 2016 at 7:40 AM, Coly Li <colyli@suse.de> wrote:
> On 2016/11/28 12:32 AM, Konstantin Khlebnikov wrote:
>> @@ -6984,6 +6984,15 @@ static int raid5_run(struct mddev *mddev)
>>                       stripe = (stripe | (stripe-1)) + 1;
>>               mddev->queue->limits.discard_alignment = stripe;
>>               mddev->queue->limits.discard_granularity = stripe;
>> +
>> +             /*
>> +              * We use 16-bit counter of active stripes in bi_phys_segments
>> +              * (minus one for over-loaded initialization)
>> +              */
>> +             blk_queue_max_hw_sectors(mddev->queue, 0xfffe * STRIPE_SECTORS);
>> +             blk_queue_max_discard_sectors(mddev->queue,
>> +                                           0xfffe * STRIPE_SECTORS);
>> +
>
> Could you please explain why 0xfffe * STRIPE_SECTORS is used here?

This code sends an individual bio to the lower device for each
STRIPE_SECTORS (8) and counts them in a 16-bit counter, so the maximum
is 0xffff (you can find this constant above in this file). However, the
counter is initialized to 1 to prevent it from hitting zero while the
bios are being generated, so the usable maximum is 0xfffe stripes, which
is 256 MiB - 8 KiB in bytes.
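
The arithmetic behind that figure can be checked with a short sketch. It
assumes 512-byte sectors and the STRIPE_SECTORS value of 8 quoted above;
the macro and function names are made up for this example and do not
come from raid5.c.

```c
#include <stdint.h>

#define MODEL_SECTOR_BYTES	512u	/* assumption: 512-byte sectors */
#define MODEL_STRIPE_SECTORS	8u	/* STRIPE_SECTORS as quoted in this thread */
#define MODEL_COUNTER_MAX	0xffffu	/* largest value of a 16-bit counter */

/* Largest request, in sectors, that keeps the counter from wrapping. */
uint64_t max_request_sectors(void)
{
	/* One count is reserved for the initial reference, hence the - 1. */
	return (uint64_t)(MODEL_COUNTER_MAX - 1) * MODEL_STRIPE_SECTORS;
}

/* The same limit expressed in bytes. */
uint64_t max_request_bytes(void)
{
	return max_request_sectors() * MODEL_SECTOR_BYTES;
}
```

Under these assumptions the limit is 0xfffe * 8 = 524272 sectors, or
268427264 bytes, which is 256 MiB (268435456) minus 8 KiB (8192).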

>
> Thanks.
>
> Coly
>


* Re: [PATCH] md/raid5: limit request size according to implementation limits
From: Shaohua Li @ 2016-11-29 23:52 UTC (permalink / raw)
  To: Konstantin Khlebnikov; +Cc: Neil Brown, linux-raid, linux-kernel, stable

On Sun, Nov 27, 2016 at 07:32:32PM +0300, Konstantin Khlebnikov wrote:
> @@ -6984,6 +6984,15 @@ static int raid5_run(struct mddev *mddev)
>  			stripe = (stripe | (stripe-1)) + 1;
>  		mddev->queue->limits.discard_alignment = stripe;
>  		mddev->queue->limits.discard_granularity = stripe;
> +
> +		/*
> +		 * We use 16-bit counter of active stripes in bi_phys_segments
> +		 * (minus one for over-loaded initialization)
> +		 */
> +		blk_queue_max_hw_sectors(mddev->queue, 0xfffe * STRIPE_SECTORS);
> +		blk_queue_max_discard_sectors(mddev->queue,
> +					      0xfffe * STRIPE_SECTORS);
> +
>  		/*
>  		 * unaligned part of discard request will be ignored, so can't
>  		 * guarantee discard_zeroes_data

Thanks! I applied this one; it is easy to backport to stable too. Once
Neil's patches that remove the limitation are in, we can drop this one.

Thanks,
Shaohua

