* [PATCH 0/2] don't call io scheduler callbacks for passthrough requests
@ 2021-09-07 14:21 Niklas Cassel
  2021-09-07 14:21 ` [PATCH 1/2] blk-mq: don't call callbacks for requests that bypassed the scheduler Niklas Cassel
  2021-09-07 14:21 ` [PATCH 2/2] Revert "mq-deadline: Fix request accounting" Niklas Cassel
  0 siblings, 2 replies; 11+ messages in thread
From: Niklas Cassel @ 2021-09-07 14:21 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Bart Van Assche, Damien Le Moal, Paolo Valente, Ming Lei,
	Niklas Cassel, linux-block, linux-kernel

From: Niklas Cassel <niklas.cassel@wdc.com>

Recently, some extra "dummy" data was added to mq-deadline's
rq->elv.priv, for the sole purpose of working around quirky behavior
in blk-mq (which called the I/O scheduler callbacks even for requests
that bypassed the scheduler).

Fix blk-mq to not call the I/O scheduler callbacks for passthrough requests,
since they will later bypass the I/O scheduler anyway.

This way, we can remove unnecessary logic in mq-deadline that was added to
deal with this quirky behavior.

BFQ should be able to perform a similar cleanup, if its maintainers wish,
and could then drop this comment from the code:
https://github.com/torvalds/linux/blob/v5.14/block/bfq-iosched.c#L6456-L6462

Since I'm quite unfamiliar with the BFQ code, I'd rather leave that
cleanup to someone who is more familiar with BFQ.


Kind regards,
Niklas

Niklas Cassel (2):
  blk-mq: don't call callbacks for requests that bypassed the scheduler
  Revert "mq-deadline: Fix request accounting"

 block/blk-mq.c      |  7 ++++++-
 block/mq-deadline.c | 16 +++++-----------
 2 files changed, 11 insertions(+), 12 deletions(-)

-- 
2.31.1

* [PATCH 1/2] blk-mq: don't call callbacks for requests that bypassed the scheduler
  2021-09-07 14:21 [PATCH 0/2] don't call io scheduler callbacks for passthrough requests Niklas Cassel
@ 2021-09-07 14:21 ` Niklas Cassel
  2021-09-07 14:29   ` Ming Lei
  2021-09-07 14:21 ` [PATCH 2/2] Revert "mq-deadline: Fix request accounting" Niklas Cassel
  1 sibling, 1 reply; 11+ messages in thread
From: Niklas Cassel @ 2021-09-07 14:21 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Bart Van Assche, Damien Le Moal, Paolo Valente, Ming Lei,
	Niklas Cassel, linux-block, linux-kernel

From: Niklas Cassel <niklas.cassel@wdc.com>

Currently, __blk_mq_alloc_request() (via blk_mq_rq_ctx_init()) calls the
I/O scheduler callback e->type->ops.prepare_request(), which will set
RQF_ELVPRIV, even though passthrough (and flush) requests will later
bypass the I/O scheduler in blk_mq_submit_bio().

Later, blk_mq_free_request() checks if the RQF_ELVPRIV flag is set;
if it is, the e->type->ops.finish_request() I/O scheduler callback
will be called.

i.e., the prepare_request and finish_request I/O scheduler callbacks
will be called for requests which were never inserted to the I/O
scheduler.
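
For reference, the gate in blk_mq_free_request() looks roughly like this
(paraphrased from the v5.14 code, details trimmed):

	struct elevator_queue *e = q->elevator;

	if (rq->rq_flags & RQF_ELVPRIV) {
		if (e && e->type->ops.finish_request)
			e->type->ops.finish_request(rq);
		if (rq->elv.icq) {
			put_io_context(rq->elv.icq->ioc);
			rq->elv.icq = NULL;
		}
	}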

Fix this by neither calling e->type->ops.prepare_request() nor setting
the RQF_ELVPRIV flag for passthrough requests.
Since the RQF_ELVPRIV flag will not get set for passthrough requests,
e->type->ops.finish_request() will no longer get called for
passthrough requests which were never inserted to the I/O scheduler.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
---
 block/blk-mq.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 65d3a63aecc6..0816af125059 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -328,7 +328,12 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
 	data->ctx->rq_dispatched[op_is_sync(data->cmd_flags)]++;
 	refcount_set(&rq->ref, 1);
 
-	if (!op_is_flush(data->cmd_flags)) {
+	/*
+	 * Flush/passthrough requests are special and go directly to the
+	 * dispatch list, bypassing the scheduler.
+	 */
+	if (!op_is_flush(data->cmd_flags) &&
+	    !blk_op_is_passthrough(data->cmd_flags)) {
 		struct elevator_queue *e = data->q->elevator;
 
 		rq->elv.icq = NULL;
-- 
2.31.1

* [PATCH 2/2] Revert "mq-deadline: Fix request accounting"
  2021-09-07 14:21 [PATCH 0/2] don't call io scheduler callbacks for passthrough requests Niklas Cassel
  2021-09-07 14:21 ` [PATCH 1/2] blk-mq: don't call callbacks for requests that bypassed the scheduler Niklas Cassel
@ 2021-09-07 14:21 ` Niklas Cassel
  2021-09-07 14:54   ` Bart Van Assche
  2021-09-07 15:15   ` Bart Van Assche
  1 sibling, 2 replies; 11+ messages in thread
From: Niklas Cassel @ 2021-09-07 14:21 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Bart Van Assche, Damien Le Moal, Paolo Valente, Ming Lei,
	Niklas Cassel, linux-block, linux-kernel

From: Niklas Cassel <niklas.cassel@wdc.com>

This reverts commit b6d2b054e8baaee53fd2d4854c63cbf0f2c6262a.

blk-mq will no longer call the I/O scheduler .finish_request() callback
for requests that were never inserted to the I/O scheduler.

Therefore, we can remove the logic inside mq-deadline that was added to
work around the (no longer existing) quirky behavior of blk-mq.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
---
 block/mq-deadline.c | 16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 7f3c3932b723..b2d1e3adcb39 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -678,7 +678,6 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 
 	prio = ioprio_class_to_prio[ioprio_class];
 	dd_count(dd, inserted, prio);
-	rq->elv.priv[0] = (void *)(uintptr_t)1;
 
 	if (blk_mq_sched_try_insert_merge(q, rq, &free)) {
 		blk_mq_free_requests(&free);
@@ -727,10 +726,12 @@ static void dd_insert_requests(struct blk_mq_hw_ctx *hctx,
 	spin_unlock(&dd->lock);
 }
 
-/* Callback from inside blk_mq_rq_ctx_init(). */
+/*
+ * Nothing to do here. This is defined only to ensure that .finish_request
+ * method is called upon request completion.
+ */
 static void dd_prepare_request(struct request *rq)
 {
-	rq->elv.priv[0] = NULL;
 }
 
 /*
@@ -757,14 +758,7 @@ static void dd_finish_request(struct request *rq)
 	const enum dd_prio prio = ioprio_class_to_prio[ioprio_class];
 	struct dd_per_prio *per_prio = &dd->per_prio[prio];
 
-	/*
-	 * The block layer core may call dd_finish_request() without having
-	 * called dd_insert_requests(). Hence only update statistics for
-	 * requests for which dd_insert_requests() has been called. See also
-	 * blk_mq_request_bypass_insert().
-	 */
-	if (rq->elv.priv[0])
-		dd_count(dd, completed, prio);
+	dd_count(dd, completed, prio);
 
 	if (blk_queue_is_zoned(q)) {
 		unsigned long flags;
-- 
2.31.1

* Re: [PATCH 1/2] blk-mq: don't call callbacks for requests that bypassed the scheduler
  2021-09-07 14:21 ` [PATCH 1/2] blk-mq: don't call callbacks for requests that bypassed the scheduler Niklas Cassel
@ 2021-09-07 14:29   ` Ming Lei
  0 siblings, 0 replies; 11+ messages in thread
From: Ming Lei @ 2021-09-07 14:29 UTC (permalink / raw)
  To: Niklas Cassel
  Cc: Jens Axboe, Bart Van Assche, Damien Le Moal, Paolo Valente,
	linux-block, linux-kernel

On Tue, Sep 07, 2021 at 02:21:55PM +0000, Niklas Cassel wrote:
> From: Niklas Cassel <niklas.cassel@wdc.com>
> 
> Currently, __blk_mq_alloc_request() (via blk_mq_rq_ctx_init()) calls the
> I/O scheduler callback e->type->ops.prepare_request(), which will set
> RQF_ELVPRIV, even though passthrough (and flush) requests will later
> bypass the I/O scheduler in blk_mq_submit_bio().
> 
> Later, blk_mq_free_request() checks if the RQF_ELVPRIV flag is set,
> if it is, the e->type->ops.finish_request() I/O scheduler callback
> will be called.
> 
> i.e., the prepare_request and finish_request I/O scheduler callbacks
> will be called for requests which were never inserted to the I/O
> scheduler.
> 
> Fix this by not calling e->type->ops.prepare_request(), nor setting
> the RQF_ELVPRIV flag for passthrough requests.
> Since the RQF_ELVPRIV flag will not get set for passthrough requests,
> e->type->ops.finish_request() will no longer get called for
> passthrough requests which were never inserted to the I/O scheduler.
> 
> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
> ---
>  block/blk-mq.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 65d3a63aecc6..0816af125059 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -328,7 +328,12 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
>  	data->ctx->rq_dispatched[op_is_sync(data->cmd_flags)]++;
>  	refcount_set(&rq->ref, 1);
>  
> -	if (!op_is_flush(data->cmd_flags)) {
> +	/*
> +	 * Flush/passthrough requests are special and go directly to the
> +	 * dispatch list, bypassing the scheduler.
> +	 */
> +	if (!op_is_flush(data->cmd_flags) &&
> +	    !blk_op_is_passthrough(data->cmd_flags)) {

Looks fine:

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming


* Re: [PATCH 2/2] Revert "mq-deadline: Fix request accounting"
  2021-09-07 14:21 ` [PATCH 2/2] Revert "mq-deadline: Fix request accounting" Niklas Cassel
@ 2021-09-07 14:54   ` Bart Van Assche
  2021-09-07 16:07     ` Niklas Cassel
  2021-09-07 15:15   ` Bart Van Assche
  1 sibling, 1 reply; 11+ messages in thread
From: Bart Van Assche @ 2021-09-07 14:54 UTC (permalink / raw)
  To: Niklas Cassel, Jens Axboe
  Cc: Damien Le Moal, Paolo Valente, Ming Lei, linux-block, linux-kernel

On 9/7/21 7:21 AM, Niklas Cassel wrote:
> From: Niklas Cassel <niklas.cassel@wdc.com>
> 
> This reverts commit b6d2b054e8baaee53fd2d4854c63cbf0f2c6262a.
> 
> blk-mq will no longer call the I/O scheduler .finish_request() callback
> for requests that were never inserted to the I/O scheduler.
> 
> Therefore, we can remove the logic inside mq-deadline that was added to
> workaround the (no longer existing) quirky behavior of blk-mq.
> 
> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
> ---
>   block/mq-deadline.c | 16 +++++-----------
>   1 file changed, 5 insertions(+), 11 deletions(-)
> 
> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
> index 7f3c3932b723..b2d1e3adcb39 100644
> --- a/block/mq-deadline.c
> +++ b/block/mq-deadline.c
> @@ -678,7 +678,6 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
>   
>   	prio = ioprio_class_to_prio[ioprio_class];
>   	dd_count(dd, inserted, prio);
> -	rq->elv.priv[0] = (void *)(uintptr_t)1;
>   
>   	if (blk_mq_sched_try_insert_merge(q, rq, &free)) {
>   		blk_mq_free_requests(&free);
> @@ -727,10 +726,12 @@ static void dd_insert_requests(struct blk_mq_hw_ctx *hctx,
>   	spin_unlock(&dd->lock);
>   }
>   
> -/* Callback from inside blk_mq_rq_ctx_init(). */
> +/*
> + * Nothing to do here. This is defined only to ensure that .finish_request
> + * method is called upon request completion.
> + */
>   static void dd_prepare_request(struct request *rq)
>   {
> -	rq->elv.priv[0] = NULL;
>   }

Please take a look at
https://lore.kernel.org/linux-block/18594aff-4a94-8a48-334c-f21ae32120c6@acm.org/
If dd_prepare_request() is removed I will have to reintroduce it.

Thanks,

Bart.

* Re: [PATCH 2/2] Revert "mq-deadline: Fix request accounting"
  2021-09-07 14:21 ` [PATCH 2/2] Revert "mq-deadline: Fix request accounting" Niklas Cassel
  2021-09-07 14:54   ` Bart Van Assche
@ 2021-09-07 15:15   ` Bart Van Assche
  2021-09-07 16:28     ` Niklas Cassel
  1 sibling, 1 reply; 11+ messages in thread
From: Bart Van Assche @ 2021-09-07 15:15 UTC (permalink / raw)
  To: Niklas Cassel, Jens Axboe
  Cc: Damien Le Moal, Paolo Valente, Ming Lei, linux-block, linux-kernel

On 9/7/21 7:21 AM, Niklas Cassel wrote:
> blk-mq will no longer call the I/O scheduler .finish_request() callback
> for requests that were never inserted to the I/O scheduler.

I do not agree. Even with patch 1/2 from this series applied, finish_request()
will still be called for requests inserted by blk_insert_cloned_request()
although these requests are never inserted to the I/O scheduler.

Bart.

* Re: [PATCH 2/2] Revert "mq-deadline: Fix request accounting"
  2021-09-07 14:54   ` Bart Van Assche
@ 2021-09-07 16:07     ` Niklas Cassel
  2021-09-07 16:49       ` Bart Van Assche
  0 siblings, 1 reply; 11+ messages in thread
From: Niklas Cassel @ 2021-09-07 16:07 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, Damien Le Moal, Paolo Valente, Ming Lei, linux-block,
	linux-kernel

On Tue, Sep 07, 2021 at 07:54:09AM -0700, Bart Van Assche wrote:
> On 9/7/21 7:21 AM, Niklas Cassel wrote:
> > From: Niklas Cassel <niklas.cassel@wdc.com>
> > 
> > This reverts commit b6d2b054e8baaee53fd2d4854c63cbf0f2c6262a.
> > 
> > blk-mq will no longer call the I/O scheduler .finish_request() callback
> > for requests that were never inserted to the I/O scheduler.
> > 
> > Therefore, we can remove the logic inside mq-deadline that was added to
> > workaround the (no longer existing) quirky behavior of blk-mq.
> > 
> > Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
> > ---
> >   block/mq-deadline.c | 16 +++++-----------
> >   1 file changed, 5 insertions(+), 11 deletions(-)
> > 
> > diff --git a/block/mq-deadline.c b/block/mq-deadline.c
> > index 7f3c3932b723..b2d1e3adcb39 100644
> > --- a/block/mq-deadline.c
> > +++ b/block/mq-deadline.c
> > @@ -678,7 +678,6 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
> >   	prio = ioprio_class_to_prio[ioprio_class];
> >   	dd_count(dd, inserted, prio);
> > -	rq->elv.priv[0] = (void *)(uintptr_t)1;
> >   	if (blk_mq_sched_try_insert_merge(q, rq, &free)) {
> >   		blk_mq_free_requests(&free);
> > @@ -727,10 +726,12 @@ static void dd_insert_requests(struct blk_mq_hw_ctx *hctx,
> >   	spin_unlock(&dd->lock);
> >   }
> > -/* Callback from inside blk_mq_rq_ctx_init(). */
> > +/*
> > + * Nothing to do here. This is defined only to ensure that .finish_request
> > + * method is called upon request completion.
> > + */
> >   static void dd_prepare_request(struct request *rq)
> >   {
> > -	rq->elv.priv[0] = NULL;
> >   }
> 
> Please take a look at
> https://lore.kernel.org/linux-block/18594aff-4a94-8a48-334c-f21ae32120c6@acm.org/
> If dd_prepare_request() is removed I will have to reintroduce it.

dd_prepare_request() itself is not removed, just the
rq->elv.priv[0] = NULL; inside dd_prepare_request().

If you need to modify dd_prepare_request() in a future
commit, that should be fine, no?

Without patch 1/2, e->type->ops.requeue_request() can get called
both for requests that bypassed the I/O scheduler, and for requests
that were inserted in the I/O scheduler.

See:
block/blk-mq-sched.h:blk_mq_sched_requeue_request()
If the RQF_ELVPRIV flag is not set, e->type->ops.requeue_request()
will not be called.
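
For reference, that helper is roughly the following (paraphrased from the
v5.14 code, so please double-check the exact source):

	static inline void blk_mq_sched_requeue_request(struct request *rq)
	{
		struct request_queue *q = rq->q;
		struct elevator_queue *e = q->elevator;

		/* only scheduler-owned requests are requeued via the elevator */
		if ((rq->rq_flags & RQF_ELVPRIV) && e &&
		    e->type->ops.requeue_request)
			e->type->ops.requeue_request(rq);
	}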

Perhaps you are having issues with requests that were inserted
in the scheduler, but later requeued?

If so, shouldn't these fixes help you, since you do not need to
worry about passthrough requests resulting in spurious calls to
the I/O scheduler callbacks?


Kind regards,
Niklas

* Re: [PATCH 2/2] Revert "mq-deadline: Fix request accounting"
  2021-09-07 15:15   ` Bart Van Assche
@ 2021-09-07 16:28     ` Niklas Cassel
  2021-09-07 17:12       ` Bart Van Assche
  0 siblings, 1 reply; 11+ messages in thread
From: Niklas Cassel @ 2021-09-07 16:28 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, Damien Le Moal, Paolo Valente, Ming Lei, linux-block,
	linux-kernel

On Tue, Sep 07, 2021 at 08:15:03AM -0700, Bart Van Assche wrote:
> On 9/7/21 7:21 AM, Niklas Cassel wrote:
> > blk-mq will no longer call the I/O scheduler .finish_request() callback
> > for requests that were never inserted to the I/O scheduler.
> 
> I do not agree. Even with patch 1/2 from this series applied, finish_request()
> will still be called for requests inserted by blk_insert_cloned_request()
> although these requests are never inserted to the I/O scheduler.
> 
> Bart.

Hello Bart,


Looking at blk_mq_free_request(),
e->type->ops.finish_request() will only be called if RQF_ELVPRIV
is set.

blk_insert_cloned_request() doesn't seem to allocate a request
itself, but instead takes an already cloned request.

So I guess it depends on how the supplied request was cloned.

I would assume if the original request doesn't have RQF_ELVPRIV set,
then neither will the cloned request?

I tried to look at blk_rq_prep_clone(), which seems to be a common
cloning function, but I don't see req->rq_flags being copied
(except for RQF_SPECIAL_PAYLOAD).
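
For what it's worth, the only rq_flags handling I can see in
__blk_rq_prep_clone() is roughly this (paraphrased, where dst is the clone):

	if (src->rq_flags & RQF_SPECIAL_PAYLOAD) {
		dst->rq_flags |= RQF_SPECIAL_PAYLOAD;
		dst->special_vec = src->special_vec;
	}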

Anyway, I don't see how .finish_request() will be called in relation
to blk_insert_cloned_request(). Could you please help me out and
give me an example of a call chain where this can happen?


Kind regards,
Niklas

* Re: [PATCH 2/2] Revert "mq-deadline: Fix request accounting"
  2021-09-07 16:07     ` Niklas Cassel
@ 2021-09-07 16:49       ` Bart Van Assche
  0 siblings, 0 replies; 11+ messages in thread
From: Bart Van Assche @ 2021-09-07 16:49 UTC (permalink / raw)
  To: Niklas Cassel
  Cc: Jens Axboe, Damien Le Moal, Paolo Valente, Ming Lei, linux-block,
	linux-kernel

On 9/7/21 9:07 AM, Niklas Cassel wrote:
> On Tue, Sep 07, 2021 at 07:54:09AM -0700, Bart Van Assche wrote:
>> Please take a look at
>> https://lore.kernel.org/linux-block/18594aff-4a94-8a48-334c-f21ae32120c6@acm.org/
>> If dd_prepare_request() is removed I will have to reintroduce it.
> 
> dd_prepare_request() itself is not removed, just the
> rq->elv.priv[0] = NULL; inside dd_prepare_request().
> 
> If you need to modify dd_prepare_request() in a future
> commit, that should be fine, no?
> 
> Without patch 1/2, e->type->ops.requeue_request() can get called
> both for requests that bypassed the I/O scheduler, and for requests
> that were inserted in the I/O scheduler.
> 
> See:
> block/blk-mq-sched.h:blk_mq_sched_requeue_request()
> If the RQF_ELVPRIV flag is not set, e->type->ops.requeue_request()
> will not be called.
> 
> Perhaps you are having issues with requests that were inserted
> in the scheduler, but later requeued?
> 
> If so, shouldn't these fixes help you, since you do not need to
> worry about passthrough requests resulting in spurious calls to
> the I/O scheduler callbacks?

In my comment I was indeed referring to requeued requests. If a request
is requeued the scheduler insert callback function is called multiple
times. From my point of view this patch series doesn't help much since
requeued requests are not addressed.

Thanks,

Bart.

* Re: [PATCH 2/2] Revert "mq-deadline: Fix request accounting"
  2021-09-07 16:28     ` Niklas Cassel
@ 2021-09-07 17:12       ` Bart Van Assche
  2021-09-08 11:57         ` Niklas Cassel
  0 siblings, 1 reply; 11+ messages in thread
From: Bart Van Assche @ 2021-09-07 17:12 UTC (permalink / raw)
  To: Niklas Cassel
  Cc: Jens Axboe, Damien Le Moal, Paolo Valente, Ming Lei, linux-block,
	linux-kernel

On 9/7/21 9:28 AM, Niklas Cassel wrote:
> On Tue, Sep 07, 2021 at 08:15:03AM -0700, Bart Van Assche wrote:
>> On 9/7/21 7:21 AM, Niklas Cassel wrote:
>>> blk-mq will no longer call the I/O scheduler .finish_request() callback
>>> for requests that were never inserted to the I/O scheduler.
>>
>> I do not agree. Even with patch 1/2 from this series applied, finish_request()
>> will still be called for requests inserted by blk_insert_cloned_request()
>> although these requests are never inserted to the I/O scheduler.
> 
> Looking at blk_mq_free_request(),
> e->type->ops.finish_request() will only be called if RQF_ELVPRIV
> is set.
> 
> blk_insert_cloned_request() doesn't seem to allocate a request
> itself, but instead takes an already cloned request.
> 
> So I guess it depends on how the supplied request was cloned.
> 
> I would assume if the original request doesn't have RQF_ELVPRIV set,
> then neither will the cloned request?
> 
> I tried to look at blk_rq_prep_clone(), which seems to be a common
> cloning function, but I don't see req->rq_flags being copied
> (except for RQF_SPECIAL_PAYLOAD).
> 
> Anyway, I don't see how .finish_request() will be called in relation
> to blk_insert_cloned_request(). Could you please help me out and
> give me an example of a call chain where this can happen?

Hi Niklas,

This is a bit outside my area of expertise. Anyway: map_request() calls
.clone_and_map_rq(). At least multipath_clone_and_map() calls
blk_get_request(). I think this shows that blk_insert_cloned_request()
may insert an entirely new request. Is my understanding correct that
blk_mq_rq_ctx_init() will set RQF_ELVPRIV for the cloned request if a
scheduler is associated with the request queue associated with the
cloned request?

Bart.

* Re: [PATCH 2/2] Revert "mq-deadline: Fix request accounting"
  2021-09-07 17:12       ` Bart Van Assche
@ 2021-09-08 11:57         ` Niklas Cassel
  0 siblings, 0 replies; 11+ messages in thread
From: Niklas Cassel @ 2021-09-08 11:57 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, Damien Le Moal, Paolo Valente, Ming Lei, linux-block,
	linux-kernel

On Tue, Sep 07, 2021 at 10:12:49AM -0700, Bart Van Assche wrote:
> On 9/7/21 9:28 AM, Niklas Cassel wrote:
> > On Tue, Sep 07, 2021 at 08:15:03AM -0700, Bart Van Assche wrote:
> > > On 9/7/21 7:21 AM, Niklas Cassel wrote:
> > > > blk-mq will no longer call the I/O scheduler .finish_request() callback
> > > > for requests that were never inserted to the I/O scheduler.
> > > 
> > > I do not agree. Even with patch 1/2 from this series applied, finish_request()
> > > will still be called for requests inserted by blk_insert_cloned_request()
> > > although these requests are never inserted to the I/O scheduler.
> > 
> > Looking at blk_mq_free_request(),
> > e->type->ops.finish_request() will only be called if RQF_ELVPRIV
> > is set.
> > 
> > blk_insert_cloned_request() doesn't seem to allocate a request
> > itself, but instead takes an already cloned request.
> > 
> > So I guess it depends on how the supplied request was cloned.
> > 
> > I would assume if the original request doesn't have RQF_ELVPRIV set,
> > then neither will the cloned request?
> > 
> > I tried to look at blk_rq_prep_clone(), which seems to be a common
> > cloning function, but I don't see req->rq_flags being copied
> > (except for RQF_SPECIAL_PAYLOAD).
> > 
> > Anyway, I don't see how .finish_request() will be called in relation
> > to blk_insert_cloned_request(). Could you please help me out and
> > give me an example of a call chain where this can happen?
> 
> Hi Niklas,
> 
> This is a bit outside my area of expertise. Anyway: map_request() calls
> .clone_and_map_rq(). At least multipath_clone_and_map() calls
> blk_get_request(). I think this shows that blk_insert_cloned_request()
> may insert an entirely new request. Is my understanding correct that
> blk_mq_rq_ctx_init() will set RQF_ELVPRIV for the cloned request if a
> scheduler is associated with the request queue associated with the
> cloned request?
> 
> Bart.

Thank you Bart.


dm-rq.c:map_request() calls .clone_and_map_rq()

one example of a .clone_and_map_rq() implementation is
multipath_clone_and_map().

multipath_clone_and_map(), to get a clone simply does blk_get_request(),
which does blk_mq_alloc_request(), which calls __blk_mq_alloc_request(),
which calls blk_mq_rq_ctx_init(), which will set RQF_ELVPRIV, as long as
the request is not a flush or a passthrough request.

dm-rq.c:dm_dispatch_clone_request() calls blk_insert_cloned_request(),
which will bypass the I/O scheduler.

So, a request cloned by e.g. drivers/md/dm-mpath.c will
bypass the I/O scheduler, but can still have the RQF_ELVPRIV flag set.



This just tells me that if we want to clean up this mess, we have to do
something similar to what I proposed in my initial RFC:
https://lore.kernel.org/linux-block/20210827124100.98112-2-Niklas.Cassel@wdc.com/

i.e., set RQF_ELVPRIV flag immediately before calling
e->type->ops.insert_requests(), and only then.

It seems logical to simply set the flag before actually inserting the request
to the scheduler, rather than finding all the corner cases where we don't.
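
Something along these lines, e.g. in blk_mq_sched_insert_request()
(just an untested sketch of the idea, not the actual RFC patch):

	LIST_HEAD(list);

	list_add(&rq->queuelist, &list);
	/* mark the request as scheduler-owned only when handing it over */
	rq->rq_flags |= RQF_ELVPRIV;
	e->type->ops.insert_requests(hctx, &list, at_head);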

Anyway, I'll leave this potential cleanup to people more familiar with the
blk-mq code for now.


Kind regards,
Niklas
