All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] blk-mq: update hctx->dispatch_busy in case of real scheduler
@ 2021-05-28  3:20 Ming Lei
  2021-05-28 12:26 ` Jan Kara
  0 siblings, 1 reply; 5+ messages in thread
From: Ming Lei @ 2021-05-28  3:20 UTC (permalink / raw)
  To: Jens Axboe, Christoph Hellwig; +Cc: linux-block, Ming Lei, Jan Kara

Commit 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io")
starts to support io batching submission by using hctx->dispatch_busy.

However, blk_mq_update_dispatch_busy() isn't changed to update hctx->dispatch_busy
in that commit, so fix the issue by updating hctx->dispatch_busy in case
of real scheduler.

Reported-by: Jan Kara <jack@suse.cz>
Fixes: 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index f11d4018ce2e..4930f7119f22 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1224,9 +1224,6 @@ static void blk_mq_update_dispatch_busy(struct blk_mq_hw_ctx *hctx, bool busy)
 {
 	unsigned int ewma;
 
-	if (hctx->queue->elevator)
-		return;
-
 	ewma = hctx->dispatch_busy;
 
 	if (!ewma && !busy)
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] blk-mq: update hctx->dispatch_busy in case of real scheduler
  2021-05-28  3:20 [PATCH] blk-mq: update hctx->dispatch_busy in case of real scheduler Ming Lei
@ 2021-05-28 12:26 ` Jan Kara
  2021-05-31  1:37   ` Ming Lei
  0 siblings, 1 reply; 5+ messages in thread
From: Jan Kara @ 2021-05-28 12:26 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, Christoph Hellwig, linux-block, Jan Kara

On Fri 28-05-21 11:20:55, Ming Lei wrote:
> Commit 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io")
> starts to support io batching submission by using hctx->dispatch_busy.
> 
> However, blk_mq_update_dispatch_busy() isn't changed to update hctx->dispatch_busy
> in that commit, so fix the issue by updating hctx->dispatch_busy in case
> of real scheduler.
> 
> Reported-by: Jan Kara <jack@suse.cz>
> Fixes: 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io")
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  block/blk-mq.c | 3 ---
>  1 file changed, 3 deletions(-)

Looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

BTW: Do you plan to submit also your improvement to
__blk_mq_do_dispatch_sched() to update dispatch_busy during the fetching
requests from the scheduler to avoid draining all requests from the IO
scheduler?

								Honza
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index f11d4018ce2e..4930f7119f22 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1224,9 +1224,6 @@ static void blk_mq_update_dispatch_busy(struct blk_mq_hw_ctx *hctx, bool busy)
>  {
>  	unsigned int ewma;
>  
> -	if (hctx->queue->elevator)
> -		return;
> -
>  	ewma = hctx->dispatch_busy;
>  
>  	if (!ewma && !busy)
> -- 
> 2.29.2
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] blk-mq: update hctx->dispatch_busy in case of real scheduler
  2021-05-28 12:26 ` Jan Kara
@ 2021-05-31  1:37   ` Ming Lei
  2021-05-31 12:04     ` Jan Kara
  0 siblings, 1 reply; 5+ messages in thread
From: Ming Lei @ 2021-05-31  1:37 UTC (permalink / raw)
  To: Jan Kara; +Cc: Jens Axboe, Christoph Hellwig, linux-block

On Fri, May 28, 2021 at 02:26:31PM +0200, Jan Kara wrote:
> On Fri 28-05-21 11:20:55, Ming Lei wrote:
> > Commit 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io")
> > starts to support io batching submission by using hctx->dispatch_busy.
> > 
> > However, blk_mq_update_dispatch_busy() isn't changed to update hctx->dispatch_busy
> > in that commit, so fix the issue by updating hctx->dispatch_busy in case
> > of real scheduler.
> > 
> > Reported-by: Jan Kara <jack@suse.cz>
> > Fixes: 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io")
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >  block/blk-mq.c | 3 ---
> >  1 file changed, 3 deletions(-)
> 
> Looks good to me. You can add:
> 
> Reviewed-by: Jan Kara <jack@suse.cz>
> 
> BTW: Do you plan to submit also your improvement to
> __blk_mq_do_dispatch_sched() to update dispatch_busy during the fetching
> requests from the scheduler to avoid draining all requests from the IO
> scheduler?

I understand that kind of change isn't needed. When more requests are
dequeued, hctx->dispatch_busy will be updated, then __blk_mq_do_dispatch_sched()
won't dequeue at batch any more if either .queue_rq() returns
STS_RESOURCE or running out of driver tag/budget.

Or do you still see related issues after this patch is applied?

Thanks,
Ming


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] blk-mq: update hctx->dispatch_busy in case of real scheduler
  2021-05-31  1:37   ` Ming Lei
@ 2021-05-31 12:04     ` Jan Kara
  2021-06-01 10:01       ` Ming Lei
  0 siblings, 1 reply; 5+ messages in thread
From: Jan Kara @ 2021-05-31 12:04 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jan Kara, Jens Axboe, Christoph Hellwig, linux-block

On Mon 31-05-21 09:37:01, Ming Lei wrote:
> On Fri, May 28, 2021 at 02:26:31PM +0200, Jan Kara wrote:
> > On Fri 28-05-21 11:20:55, Ming Lei wrote:
> > > Commit 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io")
> > > starts to support io batching submission by using hctx->dispatch_busy.
> > > 
> > > However, blk_mq_update_dispatch_busy() isn't changed to update hctx->dispatch_busy
> > > in that commit, so fix the issue by updating hctx->dispatch_busy in case
> > > of real scheduler.
> > > 
> > > Reported-by: Jan Kara <jack@suse.cz>
> > > Fixes: 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io")
> > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > ---
> > >  block/blk-mq.c | 3 ---
> > >  1 file changed, 3 deletions(-)
> > 
> > Looks good to me. You can add:
> > 
> > Reviewed-by: Jan Kara <jack@suse.cz>
> > 
> > BTW: Do you plan to submit also your improvement to
> > __blk_mq_do_dispatch_sched() to update dispatch_busy during the fetching
> > requests from the scheduler to avoid draining all requests from the IO
> > scheduler?
> 
> I understand that kind of change isn't needed. When more requests are
> dequeued, hctx->dispatch_busy will be updated, then __blk_mq_do_dispatch_sched()
> won't dequeue at batch any more if either .queue_rq() returns
> STS_RESOURCE or running out of driver tag/budget.
> 
> Or do you still see related issues after this patch is applied?

I was suspicious that __blk_mq_do_dispatch_sched() would be still pulling
requests too aggressively from the IO scheduler (which effectively defeats
impact of cgroup IO weights on observed throughput). Now I did a few more
experiments with the workload doing multiple iterations for each kernel and
comparing ratios of achieved throughput when cgroup weights were in 2:1
ratio.

With this patch alone, I've got no significant distinction between IO from
two cgroups in 4 out of 5 test iterations. With your patch to update
max_dispatch in __blk_mq_do_dispatch_sched() applied on top the results
were not significantly different (my previous test result was likely a
lucky chance). With my original patch to allocate driver tags early in
__blk_mq_do_dispatch_sched() I get reliable distinction between cgroups -
the worst ratio from all the iterations is 1.4, average ratio is ~1.75.
This last result is btw very similar to ratios I can see when using
virtio-scsi instead of virtio-blk for the backing storage which is kind of
natural because virtio-scsi ends up using the dispatch-budget logic of SCSI
subsystem. I'm not saying my patch is the right way to do things but it
clearly shows that __blk_mq_do_dispatch_sched() is still too aggressive
pulling requests out of the IO scheduler.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] blk-mq: update hctx->dispatch_busy in case of real scheduler
  2021-05-31 12:04     ` Jan Kara
@ 2021-06-01 10:01       ` Ming Lei
  0 siblings, 0 replies; 5+ messages in thread
From: Ming Lei @ 2021-06-01 10:01 UTC (permalink / raw)
  To: Jan Kara; +Cc: Jens Axboe, Christoph Hellwig, linux-block

On Mon, May 31, 2021 at 02:04:34PM +0200, Jan Kara wrote:
> On Mon 31-05-21 09:37:01, Ming Lei wrote:
> > On Fri, May 28, 2021 at 02:26:31PM +0200, Jan Kara wrote:
> > > On Fri 28-05-21 11:20:55, Ming Lei wrote:
> > > > Commit 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io")
> > > > starts to support io batching submission by using hctx->dispatch_busy.
> > > > 
> > > > However, blk_mq_update_dispatch_busy() isn't changed to update hctx->dispatch_busy
> > > > in that commit, so fix the issue by updating hctx->dispatch_busy in case
> > > > of real scheduler.
> > > > 
> > > > Reported-by: Jan Kara <jack@suse.cz>
> > > > Fixes: 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io")
> > > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > > ---
> > > >  block/blk-mq.c | 3 ---
> > > >  1 file changed, 3 deletions(-)
> > > 
> > > Looks good to me. You can add:
> > > 
> > > Reviewed-by: Jan Kara <jack@suse.cz>
> > > 
> > > BTW: Do you plan to submit also your improvement to
> > > __blk_mq_do_dispatch_sched() to update dispatch_busy during the fetching
> > > requests from the scheduler to avoid draining all requests from the IO
> > > scheduler?
> > 
> > I understand that kind of change isn't needed. When more requests are
> > dequeued, hctx->dispatch_busy will be updated, then __blk_mq_do_dispatch_sched()
> > won't dequeue at batch any more if either .queue_rq() returns
> > STS_RESOURCE or running out of driver tag/budget.
> > 
> > Or do you still see related issues after this patch is applied?
> 
> I was suspicious that __blk_mq_do_dispatch_sched() would be still pulling
> requests too aggressively from the IO scheduler (which effectively defeats
> impact of cgroup IO weights on observed throughput). Now I did a few more
> experiments with the workload doing multiple iterations for each kernel and
> comparing ratios of achieved throughput when cgroup weights were in 2:1
> ratio.
> 
> With this patch alone, I've got no significant distinction between IO from
> two cgroups in 4 out of 5 test iterations. With your patch to update
> max_dispatch in __blk_mq_do_dispatch_sched() applied on top the results
> were not significantly different (my previous test result was likely a
> lucky chance). With my original patch to allocate driver tags early in
> __blk_mq_do_dispatch_sched() I get reliable distinction between cgroups -
> the worst ratio from all the iterations is 1.4, average ratio is ~1.75.
> This last result is btw very similar to ratios I can see when using
> virtio-scsi instead of virtio-blk for the backing storage which is kind of
> natural because virtio-scsi ends up using the dispatch-budget logic of SCSI
> subsystem. I'm not saying my patch is the right way to do things but it
> clearly shows that __blk_mq_do_dispatch_sched() is still too aggressive
> pulling requests out of the IO scheduler.

IMO getting driver tag before dequeue should be one good way for this
issue, and that is exactly what the legacy request IO code path did. Not
only address this issue, but also get better leverage between batching
dispatch and IO request merge.

But it can't work by simply adding blk_mq_get_driver_tag() in
__blk_mq_do_dispatch_sched(), and one big problem is that how to re-run
queue in case of running out of getting driver tag. That said we need to
refactor blk_mq_dispatch_rq_list(), such as moving handling of running
out of driver tag into __blk_mq_do_dispatch_sched(). I will investigate a
bit on this change.


Thanks,
Ming


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-06-01 10:01 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-28  3:20 [PATCH] blk-mq: update hctx->dispatch_busy in case of real scheduler Ming Lei
2021-05-28 12:26 ` Jan Kara
2021-05-31  1:37   ` Ming Lei
2021-05-31 12:04     ` Jan Kara
2021-06-01 10:01       ` Ming Lei

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.