linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] Two loop driver patches
@ 2021-08-03  0:01 Bart Van Assche
  2021-08-03  0:01 ` [PATCH 1/2] loop: Prevent that an I/O scheduler is assigned Bart Van Assche
  2021-08-03  0:02 ` [PATCH 2/2] loop: Add the default_queue_depth kernel module parameter Bart Van Assche
  0 siblings, 2 replies; 7+ messages in thread
From: Bart Van Assche @ 2021-08-03  0:01 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Christoph Hellwig, Jaegeuk Kim, Bart Van Assche

Hi Jens,

The two patches in this patch series are what I came up with while testing
Android software. Please consider these patches for inclusion in the upstream
kernel.

Thanks,

Bart.

Bart Van Assche (2):
  loop: Prevent that an I/O scheduler is assigned
  loop: Add the default_queue_depth kernel module parameter

 drivers/block/loop.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/2] loop: Prevent that an I/O scheduler is assigned
  2021-08-03  0:01 [PATCH 0/2] Two loop driver patches Bart Van Assche
@ 2021-08-03  0:01 ` Bart Van Assche
  2021-08-03  1:54   ` Ming Lei
  2021-08-03  0:02 ` [PATCH 2/2] loop: Add the default_queue_depth kernel module parameter Bart Van Assche
  1 sibling, 1 reply; 7+ messages in thread
From: Bart Van Assche @ 2021-08-03  0:01 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Jaegeuk Kim, Bart Van Assche,
	Ming Lei, Tetsuo Handa, Martijn Coenen

Loop devices have a single hardware queue. Hence, the block layer function
elevator_get_default() selects the mq-deadline scheduler for loop devices.
Using the mq-deadline scheduler or any other I/O scheduler for loop devices
incurs unnecessary overhead. Make the loop driver pass the flag
BLK_MQ_F_NOSCHED to the block layer core such that no I/O scheduler can be
associated with block devices. This approach has an advantage compared to
letting udevd change the loop I/O scheduler to none, namely that
synchronize_rcu() does not get called.

It is intentional that the flag BLK_MQ_F_SHOULD_MERGE is preserved.

This patch reduces the Android boot time on my test setup with 0.5 seconds.

Cc: Ming Lei <ming.lei@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Martijn Coenen <maco@android.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/block/loop.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index f8486d9b75a4..9fca3ab3988d 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -2333,7 +2333,8 @@ static int loop_add(int i)
 	lo->tag_set.queue_depth = 128;
 	lo->tag_set.numa_node = NUMA_NO_NODE;
 	lo->tag_set.cmd_size = sizeof(struct loop_cmd);
-	lo->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_STACKING;
+	lo->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_STACKING |
+		BLK_MQ_F_NO_SCHED;
 	lo->tag_set.driver_data = lo;
 
 	err = blk_mq_alloc_tag_set(&lo->tag_set);

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/2] loop: Add the default_queue_depth kernel module parameter
  2021-08-03  0:01 [PATCH 0/2] Two loop driver patches Bart Van Assche
  2021-08-03  0:01 ` [PATCH 1/2] loop: Prevent that an I/O scheduler is assigned Bart Van Assche
@ 2021-08-03  0:02 ` Bart Van Assche
  2021-08-03  1:57   ` Ming Lei
  1 sibling, 1 reply; 7+ messages in thread
From: Bart Van Assche @ 2021-08-03  0:02 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Jaegeuk Kim, Bart Van Assche,
	Ming Lei, Tetsuo Handa, Martijn Coenen

Recent versions of Android use the zram driver on top of the loop driver.
There is a mismatch between the default loop driver queue depth (128) and
the queue depth of the storage device in my test setup (32). That mismatch
results in write latencies that are higher than necessary. Address this
issue by making the default loop driver queue depth configurable. Compared
to configuring the queue depth by writing into the nr_requests sysfs
attribute, this approach does not involve calling synchronize_rcu() to
modify the queue depth.

Cc: Ming Lei <ming.lei@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Martijn Coenen <maco@android.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/block/loop.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 9fca3ab3988d..0f1f1ecd941a 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -2098,6 +2098,9 @@ module_param(max_loop, int, 0444);
 MODULE_PARM_DESC(max_loop, "Maximum number of loop devices");
 module_param(max_part, int, 0444);
 MODULE_PARM_DESC(max_part, "Maximum number of partitions per loop device");
+static uint32_t default_queue_depth = 128;
+module_param(default_queue_depth, uint, 0644);
+MODULE_PARM_DESC(default_queue_depth, "Default loop device queue depth");
 MODULE_LICENSE("GPL");
 MODULE_ALIAS_BLOCKDEV_MAJOR(LOOP_MAJOR);
 
@@ -2330,7 +2333,7 @@ static int loop_add(int i)
 	err = -ENOMEM;
 	lo->tag_set.ops = &loop_mq_ops;
 	lo->tag_set.nr_hw_queues = 1;
-	lo->tag_set.queue_depth = 128;
+	lo->tag_set.queue_depth = max(default_queue_depth, 2U);
 	lo->tag_set.numa_node = NUMA_NO_NODE;
 	lo->tag_set.cmd_size = sizeof(struct loop_cmd);
 	lo->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_STACKING |

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] loop: Prevent that an I/O scheduler is assigned
  2021-08-03  0:01 ` [PATCH 1/2] loop: Prevent that an I/O scheduler is assigned Bart Van Assche
@ 2021-08-03  1:54   ` Ming Lei
  2021-08-03  5:23     ` Bart Van Assche
  0 siblings, 1 reply; 7+ messages in thread
From: Ming Lei @ 2021-08-03  1:54 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Jaegeuk Kim,
	Tetsuo Handa, Martijn Coenen

On Mon, Aug 02, 2021 at 05:01:59PM -0700, Bart Van Assche wrote:
> Loop devices have a single hardware queue. Hence, the block layer function
> elevator_get_default() selects the mq-deadline scheduler for loop devices.
> Using the mq-deadline scheduler or any other I/O scheduler for loop devices
> incurs unnecessary overhead. Make the loop driver pass the flag
> BLK_MQ_F_NOSCHED to the block layer core such that no I/O scheduler can be
> associated with block devices. This approach has an advantage compared to
> letting udevd change the loop I/O scheduler to none, namely that
> synchronize_rcu() does not get called.
> 
> It is intentional that the flag BLK_MQ_F_SHOULD_MERGE is preserved.
> 
> This patch reduces the Android boot time on my test setup with 0.5 seconds.

Can you investigate why none reduces Android boot time? Or reproduce &
understand it by a fio simulation on your setting?

> 
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Martijn Coenen <maco@android.com>
> Cc: Jaegeuk Kim <jaegeuk@kernel.org>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  drivers/block/loop.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index f8486d9b75a4..9fca3ab3988d 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -2333,7 +2333,8 @@ static int loop_add(int i)
>  	lo->tag_set.queue_depth = 128;
>  	lo->tag_set.numa_node = NUMA_NO_NODE;
>  	lo->tag_set.cmd_size = sizeof(struct loop_cmd);
> -	lo->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_STACKING;
> +	lo->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_STACKING |
> +		BLK_MQ_F_NO_SCHED;

Loop directio needs io merge, so it isn't good to set NO_SCHED
unconditionally, see:

40326d8a33d5 ("block/loop: allow request merge for directio mode")

Thanks,
Ming


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] loop: Add the default_queue_depth kernel module parameter
  2021-08-03  0:02 ` [PATCH 2/2] loop: Add the default_queue_depth kernel module parameter Bart Van Assche
@ 2021-08-03  1:57   ` Ming Lei
  0 siblings, 0 replies; 7+ messages in thread
From: Ming Lei @ 2021-08-03  1:57 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Jaegeuk Kim,
	Tetsuo Handa, Martijn Coenen

On Mon, Aug 02, 2021 at 05:02:00PM -0700, Bart Van Assche wrote:
> Recent versions of Android use the zram driver on top of the loop driver.
> There is a mismatch between the default loop driver queue depth (128) and
> the queue depth of the storage device in my test setup (32). That mismatch
> results in write latencies that are higher than necessary. Address this
> issue by making the default loop driver queue depth configurable. Compared
> to configuring the queue depth by writing into the nr_requests sysfs
> attribute, this approach does not involve calling synchronize_rcu() to
> modify the queue depth.
> 
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Martijn Coenen <maco@android.com>
> Cc: Jaegeuk Kim <jaegeuk@kernel.org>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  drivers/block/loop.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index 9fca3ab3988d..0f1f1ecd941a 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -2098,6 +2098,9 @@ module_param(max_loop, int, 0444);
>  MODULE_PARM_DESC(max_loop, "Maximum number of loop devices");
>  module_param(max_part, int, 0444);
>  MODULE_PARM_DESC(max_part, "Maximum number of partitions per loop device");
> +static uint32_t default_queue_depth = 128;
> +module_param(default_queue_depth, uint, 0644);
> +MODULE_PARM_DESC(default_queue_depth, "Default loop device queue depth");
>  MODULE_LICENSE("GPL");
>  MODULE_ALIAS_BLOCKDEV_MAJOR(LOOP_MAJOR);
>  
> @@ -2330,7 +2333,7 @@ static int loop_add(int i)
>  	err = -ENOMEM;
>  	lo->tag_set.ops = &loop_mq_ops;
>  	lo->tag_set.nr_hw_queues = 1;
> -	lo->tag_set.queue_depth = 128;
> +	lo->tag_set.queue_depth = max(default_queue_depth, 2U);
>  	lo->tag_set.numa_node = NUMA_NO_NODE;
>  	lo->tag_set.cmd_size = sizeof(struct loop_cmd);
>  	lo->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_STACKING |
> 

Looks fine:

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] loop: Prevent that an I/O scheduler is assigned
  2021-08-03  1:54   ` Ming Lei
@ 2021-08-03  5:23     ` Bart Van Assche
  2021-08-03  7:18       ` Ming Lei
  0 siblings, 1 reply; 7+ messages in thread
From: Bart Van Assche @ 2021-08-03  5:23 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Jaegeuk Kim,
	Tetsuo Handa, Martijn Coenen

On 8/2/21 6:54 PM, Ming Lei wrote:
> On Mon, Aug 02, 2021 at 05:01:59PM -0700, Bart Van Assche wrote:
>> Loop devices have a single hardware queue. Hence, the block layer function
>> elevator_get_default() selects the mq-deadline scheduler for loop devices.
>> Using the mq-deadline scheduler or any other I/O scheduler for loop devices
>> incurs unnecessary overhead. Make the loop driver pass the flag
>> BLK_MQ_F_NOSCHED to the block layer core such that no I/O scheduler can be
>> associated with block devices. This approach has an advantage compared to
>> letting udevd change the loop I/O scheduler to none, namely that
>> synchronize_rcu() does not get called.
>>
>> It is intentional that the flag BLK_MQ_F_SHOULD_MERGE is preserved.
>>
>> This patch reduces the Android boot time on my test setup with 0.5 seconds.
> 
> Can you investigate why none reduces Android boot time? Or reproduce &
> understand it by a fio simulation on your setting?

Hi Ming,

The software process called apexd creates multiple loop devices while
the device is booting. Using BLK_MQ_F_NO_SCHED is faster than letting
apexd change the I/O scheduler from mq-deadline into 'none' since the
latter involves calling synchronize_rcu() once per loop device.

>> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
>> index f8486d9b75a4..9fca3ab3988d 100644
>> --- a/drivers/block/loop.c
>> +++ b/drivers/block/loop.c
>> @@ -2333,7 +2333,8 @@ static int loop_add(int i)
>>  	lo->tag_set.queue_depth = 128;
>>  	lo->tag_set.numa_node = NUMA_NO_NODE;
>>  	lo->tag_set.cmd_size = sizeof(struct loop_cmd);
>> -	lo->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_STACKING;
>> +	lo->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_STACKING |
>> +		BLK_MQ_F_NO_SCHED;
> 
> Loop directio needs io merge, so it isn't good to set NO_SCHED
> unconditionally, see:
> 
> 40326d8a33d5 ("block/loop: allow request merge for directio mode")

Setting BLK_MQ_F_NO_SCHED only for buffered I/O mode could be tricky
since the loop driver creates a request queue before the I/O mode is
configured. Anyway, I will look into this.

Bart.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] loop: Prevent that an I/O scheduler is assigned
  2021-08-03  5:23     ` Bart Van Assche
@ 2021-08-03  7:18       ` Ming Lei
  0 siblings, 0 replies; 7+ messages in thread
From: Ming Lei @ 2021-08-03  7:18 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Jaegeuk Kim,
	Tetsuo Handa, Martijn Coenen

On Mon, Aug 02, 2021 at 10:23:56PM -0700, Bart Van Assche wrote:
> On 8/2/21 6:54 PM, Ming Lei wrote:
> > On Mon, Aug 02, 2021 at 05:01:59PM -0700, Bart Van Assche wrote:
> >> Loop devices have a single hardware queue. Hence, the block layer function
> >> elevator_get_default() selects the mq-deadline scheduler for loop devices.
> >> Using the mq-deadline scheduler or any other I/O scheduler for loop devices
> >> incurs unnecessary overhead. Make the loop driver pass the flag
> >> BLK_MQ_F_NOSCHED to the block layer core such that no I/O scheduler can be
> >> associated with block devices. This approach has an advantage compared to
> >> letting udevd change the loop I/O scheduler to none, namely that
> >> synchronize_rcu() does not get called.
> >>
> >> It is intentional that the flag BLK_MQ_F_SHOULD_MERGE is preserved.
> >>
> >> This patch reduces the Android boot time on my test setup with 0.5 seconds.
> > 
> > Can you investigate why none reduces Android boot time? Or reproduce &
> > understand it by a fio simulation on your setting?
> 
> Hi Ming,
> 
> The software process called apexd creates multiple loop devices while
> the device is booting. Using BLK_MQ_F_NO_SCHED is faster than letting
> apexd change the I/O scheduler from mq-deadline into 'none' since the
> latter involves calling synchronize_rcu() once per loop device.

OK, but why does apexd switch to none during booting? Does none perform
better than deadline during booting in the Android setting?


Thanks,
Ming


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-08-03  7:18 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-03  0:01 [PATCH 0/2] Two loop driver patches Bart Van Assche
2021-08-03  0:01 ` [PATCH 1/2] loop: Prevent that an I/O scheduler is assigned Bart Van Assche
2021-08-03  1:54   ` Ming Lei
2021-08-03  5:23     ` Bart Van Assche
2021-08-03  7:18       ` Ming Lei
2021-08-03  0:02 ` [PATCH 2/2] loop: Add the default_queue_depth kernel module parameter Bart Van Assche
2021-08-03  1:57   ` Ming Lei

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).