From: Ming Lei <ming.lei@redhat.com>
To: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@lst.de>,
	linux-block@vger.kernel.org, John Garry <john.garry@huawei.com>,
	Hannes Reinecke <hare@suse.com>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: blk-mq: improvement CPU hotplug (simplified version) v3
Date: Thu, 21 May 2020 10:57:44 +0800	[thread overview]
Message-ID: <20200521025744.GC735749@T590> (raw)
In-Reply-To: <0cbc37cf-5439-c68c-3581-b3c436932388@acm.org>

On Wed, May 20, 2020 at 02:46:52PM -0700, Bart Van Assche wrote:
> On 2020-05-20 10:06, Christoph Hellwig wrote:
> > this series ensures I/O is quiesced before a cpu and thus the managed
> > interrupt handler is shut down.
> > 
> > This patchset tries to address the issue by the following approach:
> > 
> >  - before the last cpu in hctx->cpumask is going to offline, mark this
> >    hctx as inactive
> > 
> >  - disable preempt during allocating tag for request, and after tag is
> >    allocated, check if this hctx is inactive. If yes, give up the
> >    allocation and try remote allocation from online CPUs
> > 
> >  - before hctx becomes inactive, drain all allocated requests on this
> >    hctx
> 
> What is not clear to me is which assumptions about the relationship
> between interrupts and hardware queues this patch series is based on.
> Does this patch series perhaps only support a 1:1 mapping between
> interrupts and hardware queues?

No, it supports any mapping, but the issue can't be triggered with a 1:N
mapping, since that kind of hctx never becomes inactive.

> What if there are more hardware queues
> than interrupts? An example of a block driver that allocates multiple

It doesn't matter; see the comment below.

> hardware queues is the NVMeOF initiator driver. From the NVMeOF
> initiator driver function nvme_rdma_alloc_tagset() and for the code that
> refers to I/O queues:
> 
> 	set->nr_hw_queues = nctrl->queue_count - 1;
> 
> From nvme_rdma_alloc_io_queues():
> 
> 	nr_read_queues = min_t(unsigned int, ibdev->num_comp_vectors,
> 				min(opts->nr_io_queues,
> 				    num_online_cpus()));
> 	nr_default_queues =  min_t(unsigned int,
> 	 			ibdev->num_comp_vectors,
> 				min(opts->nr_write_queues,
> 					 num_online_cpus()));
> 	nr_poll_queues = min(opts->nr_poll_queues, num_online_cpus());
> 	nr_io_queues = nr_read_queues + nr_default_queues +
> 			 nr_poll_queues;
> 	[ ... ]
> 	ctrl->ctrl.queue_count = nr_io_queues + 1;
> 
> From nvmf_parse_options():
> 
> 	/* Set defaults */
> 	opts->nr_io_queues = num_online_cpus();
> 
> Can this e.g. result in 16 hardware queues being allocated for I/O even
> if the underlying RDMA adapter only supports four interrupt vectors?
> Does that mean that four hardware queues will be associated with each
> interrupt vector?

The patchset doesn't actually bind hctxs to interrupt vectors, which is to
say we don't care about the actual interrupt allocation.

> If the CPU to which one of these interrupt vectors has
> been assigned is hotplugged, does that mean that four hardware queues
> have to be quiesced instead of only one as is done in patch 6/6?

No, a hctx only becomes inactive after every CPU in hctx->cpumask is offline.
No matter how interrupt vectors are assigned to the hctx, requests should no
longer be dispatched to it after that point.


Thanks,
Ming


