Linux-Block Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH] block: Improve io_opt limit stacking
@ 2020-05-14  6:58 Damien Le Moal
  2020-05-22  7:27 ` Damien Le Moal
  0 siblings, 1 reply; 5+ messages in thread
From: Damien Le Moal @ 2020-05-14  6:58 UTC (permalink / raw)
  To: linux-block, Jens Axboe, linux-nvme, Keith Busch, dm-devel, Mike Snitzer

When devices with different physical sector sizes are stacked, the
largest value is used as the stacked device physical sector size. For
the optimal IO size, the lowest common multiple (lcm) of the underlying
devices is used for the stacked device. In this scenario, if only one of
the underlying device reports an optimal IO size, that value is used as
is for the stacked device but that value may not be a multiple of the
stacked device physical sector size. In this case, blk_stack_limits()
returns an error resulting in warnings being printed on device mapper
startup (observed with dm-zoned dual drive setup combining a 512B
sector SSD with a 4K sector HDD).

To fix this, rather than returning an error, the optimal IO size limit
for the stacked device can be adjusted to the lowest common multiple
(lcm) of the stacked physical sector size and optimal IO size, resulting
in a value that is a multiple of the physical sector size while still
being an optimal size for the underlying devices.

This patch is complementary to the patch "nvme: Fix io_opt limit
setting" which prevents the nvme driver from reporting an optimal IO
size equal to a namespace sector size for a device that does not report
an optimal IO size.

Suggested-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 block/blk-settings.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 9a2c23cd9700..9a2b017ff681 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -561,11 +561,8 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
 	}
 
 	/* Optimal I/O a multiple of the physical block size? */
-	if (t->io_opt & (t->physical_block_size - 1)) {
-		t->io_opt = 0;
-		t->misaligned = 1;
-		ret = -1;
-	}
+	if (t->io_opt & (t->physical_block_size - 1))
+		t->io_opt = lcm(t->io_opt, t->physical_block_size);
 
 	t->raid_partial_stripes_expensive =
 		max(t->raid_partial_stripes_expensive,
-- 
2.25.4


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] block: Improve io_opt limit stacking
  2020-05-14  6:58 [PATCH] block: Improve io_opt limit stacking Damien Le Moal
@ 2020-05-22  7:27 ` Damien Le Moal
  2020-05-22 13:28   ` Martin K. Petersen
  0 siblings, 1 reply; 5+ messages in thread
From: Damien Le Moal @ 2020-05-22  7:27 UTC (permalink / raw)
  To: linux-block, Jens Axboe, linux-nvme, Keith Busch, dm-devel, Mike Snitzer

On 2020/05/14 15:58, Damien Le Moal wrote:
> When devices with different physical sector sizes are stacked, the
> largest value is used as the stacked device physical sector size. For
> the optimal IO size, the lowest common multiple (lcm) of the underlying
> devices is used for the stacked device. In this scenario, if only one of
> the underlying device reports an optimal IO size, that value is used as
> is for the stacked device but that value may not be a multiple of the
> stacked device physical sector size. In this case, blk_stack_limits()
> returns an error resulting in warnings being printed on device mapper
> startup (observed with dm-zoned dual drive setup combining a 512B
> sector SSD with a 4K sector HDD).
> 
> To fix this, rather than returning an error, the optimal IO size limit
> for the stacked device can be adjusted to the lowest common multiple
> (lcm) of the stacked physical sector size and optimal IO size, resulting
> in a value that is a multiple of the physical sector size while still
> being an optimal size for the underlying devices.
> 
> This patch is complementary to the patch "nvme: Fix io_opt limit
> setting" which prevents the nvme driver from reporting an optimal IO
> size equal to a namespace sector size for a device that does not report
> an optimal IO size.
> 
> Suggested-by: Keith Busch <kbusch@kernel.org>
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> ---
>  block/blk-settings.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index 9a2c23cd9700..9a2b017ff681 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -561,11 +561,8 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
>  	}
>  
>  	/* Optimal I/O a multiple of the physical block size? */
> -	if (t->io_opt & (t->physical_block_size - 1)) {
> -		t->io_opt = 0;
> -		t->misaligned = 1;
> -		ret = -1;
> -	}
> +	if (t->io_opt & (t->physical_block_size - 1))
> +		t->io_opt = lcm(t->io_opt, t->physical_block_size);
>  
>  	t->raid_partial_stripes_expensive =
>  		max(t->raid_partial_stripes_expensive,
> 

Jens,

Any comment on this patch ?
Note: the patch the patch "nvme: Fix io_opt limit setting" is already queued for
5.8.

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] block: Improve io_opt limit stacking
  2020-05-22  7:27 ` Damien Le Moal
@ 2020-05-22 13:28   ` Martin K. Petersen
  2020-05-22 13:36     ` Martin K. Petersen
  0 siblings, 1 reply; 5+ messages in thread
From: Martin K. Petersen @ 2020-05-22 13:28 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-block, Jens Axboe, linux-nvme, Keith Busch, dm-devel, Mike Snitzer


Damien,

>> +	if (t->io_opt & (t->physical_block_size - 1))
>> +		t->io_opt = lcm(t->io_opt, t->physical_block_size);

> Any comment on this patch ?  Note: the patch the patch "nvme: Fix
> io_opt limit setting" is already queued for 5.8.

Setting io_opt to the physical block size is not correct.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] block: Improve io_opt limit stacking
  2020-05-22 13:28   ` Martin K. Petersen
@ 2020-05-22 13:36     ` Martin K. Petersen
  2020-05-22 14:14       ` Keith Busch
  0 siblings, 1 reply; 5+ messages in thread
From: Martin K. Petersen @ 2020-05-22 13:36 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Damien Le Moal, linux-block, Jens Axboe, linux-nvme, Keith Busch,
	dm-devel, Mike Snitzer


>>> +	if (t->io_opt & (t->physical_block_size - 1))
>>> +		t->io_opt = lcm(t->io_opt, t->physical_block_size);
>
>> Any comment on this patch ?  Note: the patch the patch "nvme: Fix
>> io_opt limit setting" is already queued for 5.8.
>
> Setting io_opt to the physical block size is not correct.

Oh, missed the lcm(). But I'm still concerned about twiddling io_opt to
a value different than the one reported by an underlying device.

Setting io_opt to something that's less than a full stripe width in a
RAID, for instance, doesn't produce the expected result. So I think I'd
prefer not to set io_opt at all if it isn't consistent across all the
stacked devices.

Let me chew on it for a bit...

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] block: Improve io_opt limit stacking
  2020-05-22 13:36     ` Martin K. Petersen
@ 2020-05-22 14:14       ` Keith Busch
  0 siblings, 0 replies; 5+ messages in thread
From: Keith Busch @ 2020-05-22 14:14 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Damien Le Moal, linux-block, Jens Axboe, linux-nvme, dm-devel,
	Mike Snitzer

On Fri, May 22, 2020 at 09:36:18AM -0400, Martin K. Petersen wrote:
> 
> >>> +	if (t->io_opt & (t->physical_block_size - 1))
> >>> +		t->io_opt = lcm(t->io_opt, t->physical_block_size);
> >
> >> Any comment on this patch ?  Note: the patch the patch "nvme: Fix
> >> io_opt limit setting" is already queued for 5.8.
> >
> > Setting io_opt to the physical block size is not correct.
> 
> Oh, missed the lcm(). But I'm still concerned about twiddling io_opt to
> a value different than the one reported by an underlying device.
>
> Setting io_opt to something that's less than a full stripe width in a
> RAID, for instance, doesn't produce the expected result. So I think I'd
> prefer not to set io_opt at all if it isn't consistent across all the
> stacked devices.

We already modify the stacked io_opt value if the two request_queues
report different io_opt's. If, however, only one reports an io_opt value,
and it happens to not align with the other's physical block size, the
code currently rejects stacking those devices. Damien's patch should
just set a larger io_opt value that aligns with both, so if one io_opt
is a RAID stripe size, the stacked result will be multiple stripes.

I think that makes sense, but please do let us know if you think such
mismatched devices just shouldn't stack.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, back to index

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-14  6:58 [PATCH] block: Improve io_opt limit stacking Damien Le Moal
2020-05-22  7:27 ` Damien Le Moal
2020-05-22 13:28   ` Martin K. Petersen
2020-05-22 13:36     ` Martin K. Petersen
2020-05-22 14:14       ` Keith Busch

Linux-Block Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-block/0 linux-block/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-block linux-block/ https://lore.kernel.org/linux-block \
		linux-block@vger.kernel.org
	public-inbox-index linux-block

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-block


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git