* [PATCH] mq-deadline: Fix request accounting
@ 2021-08-24 17:05 Bart Van Assche
  2021-08-24 17:09 ` Jens Axboe
  2021-08-24 21:35 ` Niklas Cassel
  0 siblings, 2 replies; 7+ messages in thread
From: Bart Van Assche @ 2021-08-24 17:05 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Jaegeuk Kim, Bart Van Assche,
	Niklas Cassel, Damien Le Moal, Hannes Reinecke

The block layer may call the I/O scheduler .finish_request() callback
without having called the .insert_requests() callback. Make sure that the
mq-deadline I/O statistics are correct if the block layer inserts an I/O
request that bypasses the I/O scheduler. This patch prevents lower-priority
I/O from being delayed longer than necessary in mixed I/O priority
workloads.

Cc: Niklas Cassel <Niklas.Cassel@wdc.com>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Reported-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Fixes: 08a9ad8bf607 ("block/mq-deadline: Add cgroup support")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/mq-deadline.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index dbd74bae9f11..28f2e7655a5f 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -713,6 +713,7 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 
 	prio = ioprio_class_to_prio[ioprio_class];
 	dd_count(dd, inserted, prio);
+	rq->elv.priv[0] = (void *)(uintptr_t)1;
 
 	if (blk_mq_sched_try_insert_merge(q, rq, &free)) {
 		blk_mq_free_requests(&free);
@@ -761,12 +762,10 @@ static void dd_insert_requests(struct blk_mq_hw_ctx *hctx,
 	spin_unlock(&dd->lock);
 }
 
-/*
- * Nothing to do here. This is defined only to ensure that .finish_request
- * method is called upon request completion.
- */
+/* Callback from inside blk_mq_rq_ctx_init(). */
 static void dd_prepare_request(struct request *rq)
 {
+	rq->elv.priv[0] = NULL;
 }
 
 /*
@@ -793,7 +792,14 @@ static void dd_finish_request(struct request *rq)
 	const enum dd_prio prio = ioprio_class_to_prio[ioprio_class];
 	struct dd_per_prio *per_prio = &dd->per_prio[prio];
 
-	dd_count(dd, completed, prio);
+	/*
+	 * The block layer core may call dd_finish_request() without having
+	 * called dd_insert_requests(). Hence only update statistics for
+	 * requests for which dd_insert_requests() has been called. See also
+	 * blk_mq_request_bypass_insert().
+	 */
+	if (rq->elv.priv[0])
+		dd_count(dd, completed, prio);
 
 	if (blk_queue_is_zoned(q)) {
 		unsigned long flags;


* Re: [PATCH] mq-deadline: Fix request accounting
  2021-08-24 17:05 [PATCH] mq-deadline: Fix request accounting Bart Van Assche
@ 2021-08-24 17:09 ` Jens Axboe
  2021-08-24 17:13   ` Bart Van Assche
  2021-08-24 21:35 ` Niklas Cassel
  1 sibling, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2021-08-24 17:09 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-block, Christoph Hellwig, Jaegeuk Kim, Niklas Cassel,
	Damien Le Moal, Hannes Reinecke

On 8/24/21 11:05 AM, Bart Van Assche wrote:
> The block layer may call the I/O scheduler .finish_request() callback
> without having called the .insert_requests() callback. Make sure that the
> mq-deadline I/O statistics are correct if the block layer inserts an I/O
> request that bypasses the I/O scheduler. This patch prevents lower-priority
> I/O from being delayed longer than necessary in mixed I/O priority
> workloads.
> 
> Cc: Niklas Cassel <Niklas.Cassel@wdc.com>
> Cc: Damien Le Moal <damien.lemoal@wdc.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Reported-by: Niklas Cassel <Niklas.Cassel@wdc.com>
> Fixes: 08a9ad8bf607 ("block/mq-deadline: Add cgroup support")

This one is a little confusing, since that commit got reverted...
Yet the patch still applies.

Shouldn't it be:

Fixes: 38ba64d12d4c ("block/mq-deadline: Track I/O statistics")


-- 
Jens Axboe



* Re: [PATCH] mq-deadline: Fix request accounting
  2021-08-24 17:09 ` Jens Axboe
@ 2021-08-24 17:13   ` Bart Van Assche
  2021-08-24 17:18     ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: Bart Van Assche @ 2021-08-24 17:13 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Jaegeuk Kim, Niklas Cassel,
	Damien Le Moal, Hannes Reinecke

On 8/24/21 10:09 AM, Jens Axboe wrote:
> Shouldn't it be:
> 
> Fixes: 38ba64d12d4c ("block/mq-deadline: Track I/O statistics")

Right, that's what the Fixes line should look like. Do you want me to
resend this patch?

Thanks,

Bart.


* Re: [PATCH] mq-deadline: Fix request accounting
  2021-08-24 17:13   ` Bart Van Assche
@ 2021-08-24 17:18     ` Jens Axboe
  0 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2021-08-24 17:18 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-block, Christoph Hellwig, Jaegeuk Kim, Niklas Cassel,
	Damien Le Moal, Hannes Reinecke

On 8/24/21 11:13 AM, Bart Van Assche wrote:
> On 8/24/21 10:09 AM, Jens Axboe wrote:
>> Shouldn't it be:
>>
>> Fixes: 38ba64d12d4c ("block/mq-deadline: Track I/O statistics")
> 
> Right, that's what the Fixes line should look like. Do you want me to
> resend this patch?

Nah that's fine, I'll just correct it.

-- 
Jens Axboe



* Re: [PATCH] mq-deadline: Fix request accounting
  2021-08-24 17:05 [PATCH] mq-deadline: Fix request accounting Bart Van Assche
  2021-08-24 17:09 ` Jens Axboe
@ 2021-08-24 21:35 ` Niklas Cassel
  2021-08-27  2:22   ` Bart Van Assche
  1 sibling, 1 reply; 7+ messages in thread
From: Niklas Cassel @ 2021-08-24 21:35 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Jaegeuk Kim,
	Damien Le Moal, Hannes Reinecke

On Tue, Aug 24, 2021 at 10:05:20AM -0700, Bart Van Assche wrote:
> The block layer may call the I/O scheduler .finish_request() callback
> without having called the .insert_requests() callback. Make sure that the
> mq-deadline I/O statistics are correct if the block layer inserts an I/O
> request that bypasses the I/O scheduler. This patch prevents lower-priority
> I/O from being delayed longer than necessary in mixed I/O priority
> workloads.
> 
> Cc: Niklas Cassel <Niklas.Cassel@wdc.com>
> Cc: Damien Le Moal <damien.lemoal@wdc.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Reported-by: Niklas Cassel <Niklas.Cassel@wdc.com>
> Fixes: 08a9ad8bf607 ("block/mq-deadline: Add cgroup support")
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  block/mq-deadline.c | 16 +++++++++++-----
>  1 file changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
> index dbd74bae9f11..28f2e7655a5f 100644
> --- a/block/mq-deadline.c
> +++ b/block/mq-deadline.c
> @@ -713,6 +713,7 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
>  
>  	prio = ioprio_class_to_prio[ioprio_class];
>  	dd_count(dd, inserted, prio);
> +	rq->elv.priv[0] = (void *)(uintptr_t)1;
>  
>  	if (blk_mq_sched_try_insert_merge(q, rq, &free)) {
>  		blk_mq_free_requests(&free);
> @@ -761,12 +762,10 @@ static void dd_insert_requests(struct blk_mq_hw_ctx *hctx,
>  	spin_unlock(&dd->lock);
>  }
>  
> -/*
> - * Nothing to do here. This is defined only to ensure that .finish_request
> - * method is called upon request completion.
> - */
> +/* Callback from inside blk_mq_rq_ctx_init(). */
>  static void dd_prepare_request(struct request *rq)
>  {
> +	rq->elv.priv[0] = NULL;
>  }
>  
>  /*
> @@ -793,7 +792,14 @@ static void dd_finish_request(struct request *rq)
>  	const enum dd_prio prio = ioprio_class_to_prio[ioprio_class];
>  	struct dd_per_prio *per_prio = &dd->per_prio[prio];
>  
> -	dd_count(dd, completed, prio);
> +	/*
> +	 * The block layer core may call dd_finish_request() without having
> +	 * called dd_insert_requests(). Hence only update statistics for
> +	 * requests for which dd_insert_requests() has been called. See also
> +	 * blk_mq_request_bypass_insert().
> +	 */
> +	if (rq->elv.priv[0])
> +		dd_count(dd, completed, prio);
>  
>  	if (blk_queue_is_zoned(q)) {
>  		unsigned long flags;

Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Tested-by: Niklas Cassel <niklas.cassel@wdc.com>


* Re: [PATCH] mq-deadline: Fix request accounting
  2021-08-24 21:35 ` Niklas Cassel
@ 2021-08-27  2:22   ` Bart Van Assche
  2021-08-27  9:20     ` Niklas Cassel
  0 siblings, 1 reply; 7+ messages in thread
From: Bart Van Assche @ 2021-08-27  2:22 UTC (permalink / raw)
  To: Niklas Cassel
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Jaegeuk Kim,
	Damien Le Moal, Hannes Reinecke

On 8/24/21 2:35 PM, Niklas Cassel wrote:
> Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
> Tested-by: Niklas Cassel <niklas.cassel@wdc.com>

Hi Niklas,

Thank you for the help, that's really appreciated. Earlier today I discovered
that this patch does not handle requeuing correctly, so I have started testing
the patch below.

---
 block/mq-deadline.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index dbd74bae9f11..c4c3c2e3f446 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -712,7 +712,10 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 	blk_req_zone_write_unlock(rq);

 	prio = ioprio_class_to_prio[ioprio_class];
-	dd_count(dd, inserted, prio);
+	if (!rq->elv.priv[0]) {
+		dd_count(dd, inserted, prio);
+		rq->elv.priv[0] = (void *)(uintptr_t)1;
+	}

 	if (blk_mq_sched_try_insert_merge(q, rq, &free)) {
 		blk_mq_free_requests(&free);
@@ -761,12 +764,10 @@ static void dd_insert_requests(struct blk_mq_hw_ctx *hctx,
 	spin_unlock(&dd->lock);
 }

-/*
- * Nothing to do here. This is defined only to ensure that .finish_request
- * method is called upon request completion.
- */
+/* Callback from inside blk_mq_rq_ctx_init(). */
 static void dd_prepare_request(struct request *rq)
 {
+	rq->elv.priv[0] = NULL;
 }

 /*
@@ -793,6 +794,14 @@ static void dd_finish_request(struct request *rq)
 	const enum dd_prio prio = ioprio_class_to_prio[ioprio_class];
 	struct dd_per_prio *per_prio = &dd->per_prio[prio];

+	/*
+	 * The block layer core may call dd_finish_request() without having
+	 * called dd_insert_requests(). Skip requests that bypassed I/O
+	 * scheduling. See also blk_mq_request_bypass_insert().
+	 */
+	if (!rq->elv.priv[0])
+		return;
+
 	dd_count(dd, completed, prio);

 	if (blk_queue_is_zoned(q)) {


* Re: [PATCH] mq-deadline: Fix request accounting
  2021-08-27  2:22   ` Bart Van Assche
@ 2021-08-27  9:20     ` Niklas Cassel
  0 siblings, 0 replies; 7+ messages in thread
From: Niklas Cassel @ 2021-08-27  9:20 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Jaegeuk Kim,
	Damien Le Moal, Hannes Reinecke

On Thu, Aug 26, 2021 at 07:22:38PM -0700, Bart Van Assche wrote:
> On 8/24/21 2:35 PM, Niklas Cassel wrote:
> > Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
> > Tested-by: Niklas Cassel <niklas.cassel@wdc.com>
> 
> Hi Niklas,
> 
> Thank you for the help, that's really appreciated. Earlier today I discovered
> that this patch does not handle requeuing correctly so I have started testing
> the patch below.

Since Jens has reverted "block/mq-deadline: Prioritize high-priority requests",
which was the culprit for the 10-second slowdowns in the first place, there is
no longer any urgency for you to fix the counters to behave properly on
requeues: without that patch, the counters are not used for decision
making anyway.

Looking more closely at how it is solved in BFQ:
they have both .finish_request and .requeue_request pointing to the same
function: bfq_finish_requeue_request()

bfq_finish_requeue_request() also sets rq->elv.priv[0] = NULL; at the end
of the function.

Also see the big note about how this really should be solved in blk-mq:
https://github.com/torvalds/linux/blob/master/block/bfq-iosched.c#L6456-L6462

It seems wrong to clutter multiple callback functions in both BFQ and
mq-deadline because of shortcomings in blk-mq.


It seems like the way forward is to add a RQF_ELEVATOR request flag, which
only gets set in req->rq_flags if e->type->ops.insert_requests() gets called
(by blk_mq_sched_insert_request() or blk_mq_sched_insert_requests()).
Then blk_mq_free_request() only calls ops.finish_request if RQF_ELEVATOR is
set, and similarly blk_mq_sched_requeue_request() only calls
ops.requeue_request if RQF_ELEVATOR is set (and perhaps also
blk_mq_sched_completed_request() for ops.completed_request).

This should enable us to remove ugly if-statements from .insert_requests and
.finish_request for both mq-deadline and BFQ. (And BFQ can also remove ugly
if-statements and comments from their .requeue_request callback.)


And no, we cannot reuse RQF_ELVPRIV, since that flag gets set by
__blk_mq_alloc_request(), which is way too early a stage: at that
point, we don't know whether the request will bypass the scheduler or not;
the flag has to be set at insertion time.
(Right now, a request that has RQF_ELVPRIV, and has thus been prepared
by ops.prepare_request, might still not get inserted into the scheduler.)

We should probably just remove RQF_ELVPRIV, and as a first step, simply have
blk_mq_sched_insert_request() and blk_mq_sched_insert_requests() call
ops.prepare_request immediately before ops.insert_requests. Currently, it
seems that all elevators use .prepare_request only to clear private fields;
no elevator has any real logic in its .prepare_request callback.


Kind regards,
Niklas

