All of lore.kernel.org
 help / color / mirror / Atom feed
From: Niklas Cassel <Niklas.Cassel@wdc.com>
To: Bart Van Assche <bvanassche@acm.org>
Cc: Jens Axboe <axboe@kernel.dk>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	Christoph Hellwig <hch@lst.de>, Jaegeuk Kim <jaegeuk@kernel.org>,
	Adam Manzanares <a.manzanares@samsung.com>,
	Damien Le Moal <Damien.LeMoal@wdc.com>,
	Hannes Reinecke <hare@suse.de>, Ming Lei <ming.lei@redhat.com>,
	Johannes Thumshirn <Johannes.Thumshirn@wdc.com>,
	Himanshu Madhani <himanshu.madhani@oracle.com>
Subject: Re: [PATCH v3 16/16] block/mq-deadline: Prioritize high-priority requests
Date: Mon, 23 Aug 2021 07:36:36 +0000	[thread overview]
Message-ID: <YSNQAu9uXrmEteXc@x1-carbon> (raw)
In-Reply-To: <e94f62c4-a329-398d-5003-d369506d7f89@acm.org>

On Fri, Aug 20, 2021 at 04:38:24PM -0700, Bart Van Assche wrote:
> On 8/20/21 4:05 PM, Niklas Cassel wrote:
> > Thank you for your patch!
> > I tested it, and it does solve my problem.
> 
> That's quick. Thanks!

Thank you for the patch!

> 
> > I've been thinking more about this problem.
> > The problem is seen on a SATA zoned drive.
> > 
> > These drives have mq-deadline set as default by the
> > blk_queue_required_elevator_features(q, ELEVATOR_F_ZBD_SEQ_WRITE) call in
> > drivers/scsi/sd_zbc.c:sd_zbc_read_zones()
> > 
> > This triggers block/elevator.c:elevator_init_mq() to initialize
> > "mq-deadline" as default scheduler for these devices.
> > 
> > I think that the problem might because that drivers/scsi/sd_zbc.c
> > has created the request_queue and submitted requests, before the call
> > to elevator_init_mq() is done.
> > 
> > elevator_init_mq() will set q->elevator->type->ops, so once that is set,
> > blk_mq_free_request() will call e->type->ops.finish_request(rq),
> > regardless if the request was inserted through the recently initialized
> > scheduler or not.
> > 
> > While I'm perfectly happy with your fix, would it perhaps be possible
> > to do the fix in block/elevator.c instead, so that we don't need to
> > do the same type of check that you did, in each and every single
> > io scheduler?
> > 
> > Looking at block/elevator.c:elevator_init_mq(), it seems to do:
> > 
> > blk_mq_freeze_queue()
> > blk_mq_quiesce_queue()
> > 
> > blk_mq_init_sched()
> > 
> > blk_mq_unquiesce_queue()
> > blk_mq_unfreeze_queue()
> > 
> > This obviously isn't enough to avoid the bug that we are seeing,
> > but could perhaps a more general fix be to flush/wait until all
> > in-flight requests have completed, and then free them, and then
> > set q->elevator->type->ops. That way, all requests inserted after
> > the io scheduler has been initialized, will have gone through the
> > io scheduler. So all finish_request() calls should have a
> > matching insert_request() call. What do you think?
> 
> q->elevator is set from inside the I/O scheduler's init_sched callback and
> that callback is called with the request queue frozen. Freezing happens by
> calling blk_mq_freeze_queue() and that function waits until all previously
> submitted requests have finished. So I don't think that the race described
> above can happen.

I see.
I was mainly thinking that it should be possible to do a generic fix,
such that we eventually won't need a similar fix as yours in all the
different I/O schedulers.

However, looking at e.g. BFQ it does appear to have something similar
to your fix already:

#define RQ_BFQQ(rq)             ((rq)->elv.priv[1])

bfq_finish_requeue_request()
	struct bfq_queue *bfqq = RQ_BFQQ(rq);

	...

        if (!rq->elv.icq || !bfqq)
                return;



So your proposed fix should also be fine.

However, it does not apply on top of Torvalds master or Jens's for-next
branch because they both have reverted your cgroup support patch.

If you rebase your fix and send it out, I will be happy to send out
a Reviewed-by/Tested-by.


Kind regards,
Niklas

  reply	other threads:[~2021-08-23  7:36 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-18  0:44 [PATCH v3 00/16] Improve I/O priority support Bart Van Assche
2021-06-18  0:44 ` [PATCH v3 01/16] block/Kconfig: Make the BLK_WBT and BLK_WBT_MQ entries consecutive Bart Van Assche
2021-06-18 17:09   ` Adam Manzanares
2021-06-18 19:49   ` Himanshu Madhani
2021-06-18  0:44 ` [PATCH v3 02/16] block/blk-cgroup: Swap the blk_throtl_init() and blk_iolatency_init() calls Bart Van Assche
2021-06-18 17:15   ` Adam Manzanares
2021-06-18 19:49   ` Himanshu Madhani
2021-06-21 14:24   ` Tejun Heo
2021-06-18  0:44 ` [PATCH v3 03/16] block/blk-rq-qos: Move a function from a header file into a C file Bart Van Assche
2021-06-18 17:22   ` Adam Manzanares
2021-06-18 19:49   ` Himanshu Madhani
2021-06-18  0:44 ` [PATCH v3 04/16] block: Introduce the ioprio rq-qos policy Bart Van Assche
2021-06-18 22:02   ` Adam Manzanares
2021-06-18  0:44 ` [PATCH v3 05/16] block/mq-deadline: Add several comments Bart Van Assche
2021-06-18 22:06   ` Adam Manzanares
2021-06-18  0:44 ` [PATCH v3 06/16] block/mq-deadline: Add two lockdep_assert_held() statements Bart Van Assche
2021-06-18 22:09   ` Adam Manzanares
2021-06-18  0:44 ` [PATCH v3 07/16] block/mq-deadline: Remove two local variables Bart Van Assche
2021-06-18 22:16   ` Adam Manzanares
2021-06-18  0:44 ` [PATCH v3 08/16] block/mq-deadline: Rename dd_init_queue() and dd_exit_queue() Bart Van Assche
2021-06-18 22:18   ` Adam Manzanares
2021-06-18  0:44 ` [PATCH v3 09/16] block/mq-deadline: Improve compile-time argument checking Bart Van Assche
2021-06-18 22:30   ` Adam Manzanares
2021-06-18  0:44 ` [PATCH v3 10/16] block/mq-deadline: Improve the sysfs show and store macros Bart Van Assche
2021-06-18 23:07   ` Adam Manzanares
2021-06-18  0:44 ` [PATCH v3 11/16] block/mq-deadline: Reserve 25% of scheduler tags for synchronous requests Bart Van Assche
2021-06-18  0:44 ` [PATCH v3 12/16] block/mq-deadline: Micro-optimize the batching algorithm Bart Van Assche
2021-06-18  0:44 ` [PATCH v3 13/16] block/mq-deadline: Add I/O priority support Bart Van Assche
2021-06-18  0:44 ` [PATCH v3 14/16] block/mq-deadline: Track I/O statistics Bart Van Assche
2021-06-18  0:44 ` [PATCH v3 15/16] block/mq-deadline: Add cgroup support Bart Van Assche
2021-06-18  0:44 ` [PATCH v3 16/16] block/mq-deadline: Prioritize high-priority requests Bart Van Assche
2021-08-20  0:45   ` Niklas Cassel
2021-08-20 18:04     ` Bart Van Assche
2021-08-20 23:05       ` Niklas Cassel
2021-08-20 23:38         ` Bart Van Assche
2021-08-23  7:36           ` Niklas Cassel [this message]
2021-08-23 17:15             ` Bart Van Assche
2021-08-23 23:01               ` Damien Le Moal
2021-08-24 21:33               ` Niklas Cassel
2021-06-21 16:06 ` [PATCH v3 00/16] Improve I/O priority support Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YSNQAu9uXrmEteXc@x1-carbon \
    --to=niklas.cassel@wdc.com \
    --cc=Damien.LeMoal@wdc.com \
    --cc=Johannes.Thumshirn@wdc.com \
    --cc=a.manzanares@samsung.com \
    --cc=axboe@kernel.dk \
    --cc=bvanassche@acm.org \
    --cc=hare@suse.de \
    --cc=hch@lst.de \
    --cc=himanshu.madhani@oracle.com \
    --cc=jaegeuk@kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=ming.lei@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.