Linux-NVME Archive on lore.kernel.org
From: Keith Busch <kbusch@kernel.org>
To: Sagi Grimberg <sagi@grimberg.me>
Cc: bigeasy@linutronix.de, helgaas@kernel.org, hch@lst.de,
	linux-nvme@lists.infradead.org, ming.lei@redhat.com
Subject: Re: [PATCH 4/4] nvme/pci: Spin threaded interrupt completions
Date: Thu, 28 Nov 2019 12:28:43 +0900
Message-ID: <20191128032843.GA1738@redsun51.ssa.fujisawa.hgst.com> (raw)
In-Reply-To: <11325d8e-e9f8-408e-18c3-182c69e90eab@grimberg.me>

On Wed, Nov 27, 2019 at 06:46:55PM -0800, Sagi Grimberg wrote:
> > For deeply queued workloads, the nvme controller may be posting
> > new completions while the threaded interrupt handles previous
> > completions. Since the interrupts are masked, we can spin for these
> > completions for as long as new completions are being posted.
> > 
> > Signed-off-by: Keith Busch <kbusch@kernel.org>
> > ---
> >   drivers/nvme/host/pci.c | 10 ++++++++--
> >   1 file changed, 8 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> > index 571b33b69c5f..9ec0933eb120 100644
> > --- a/drivers/nvme/host/pci.c
> > +++ b/drivers/nvme/host/pci.c
> > @@ -1042,9 +1042,15 @@ static irqreturn_t nvme_irq(int irq, void *data)
> >   	return ret;
> >   }
> > +static void nvme_irq_spin(int irq, void *data)
> > +{
> > +	while (nvme_irq(irq, data) != IRQ_NONE)
> > +		cond_resched();
> 
> So the cond_resched should be fair to multiple devices mapped to the
> same cpu core I assume.. did you happen to test it?

It should, but I'm having difficulty testing that explicitly. Frequent
spinning here needs a single queue mapped to multiple CPUs, such that
one or more CPUs can constantly dispatch new requests. I have one test
where this spin never exits for the entire duration of an fio run, and
/proc/interrupts confirms only 1 interrupt occurred for many millions
of IOs.

When you have two or more devices with queues mapped to multiple CPUs,
their threaded interrupt handler affinities will not share the same CPU.

When we have per-cpu queues, all the devices' thread affinities will be
the same, but the while loop usually spins only a couple of times
because the submission side is sharing that same CPU. This naturally
throttles the number of completions the irq thread can observe, so the
thread ends up scheduling itself out without needing the cond_resched().
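
To make the intended flow concrete, here is a rough sketch of how the
masking from patches 2-3 and the spin from this patch are meant to fit
together. The handler names and the mask/unmask helpers below are
illustrative placeholders, not the literal driver code:

#include <linux/interrupt.h>
#include <linux/sched.h>

/* Placeholders standing in for the per-vector mask/unmask added in
 * patches 2-3 and for the existing completion handler; declarations
 * only, not the actual driver API.
 */
static void nvme_mask_vector(void *data);
static void nvme_unmask_vector(void *data);
static irqreturn_t nvme_irq(int irq, void *data);

static irqreturn_t nvme_irq_check(int irq, void *data)
{
	/* Hard handler: mask the vector, then hand off to the thread. */
	nvme_mask_vector(data);
	return IRQ_WAKE_THREAD;
}

static irqreturn_t nvme_irq_thread(int irq, void *data)
{
	/*
	 * Threaded handler: with the vector masked, keep reaping
	 * completions as long as the controller posts new ones, and
	 * yield between passes so other threads pinned to this CPU can
	 * still run.
	 */
	while (nvme_irq(irq, data) != IRQ_NONE)
		cond_resched();

	nvme_unmask_vector(data);
	return IRQ_HANDLED;
}

The two handlers would be registered together, e.g. via
request_threaded_irq(), so the thread always runs with the vector
already masked by the hard handler.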


Thread overview: 37+ messages
2019-11-27 17:58 [PATCH 0/4] nvme: Threaded interrupt handling improvements Keith Busch
2019-11-27 17:58 ` [PATCH 1/4] PCI/MSI: Export __pci_msix_desc_mask_irq Keith Busch
2019-11-28  2:42   ` Sagi Grimberg
2019-11-28  3:41     ` Keith Busch
2019-11-28  7:17   ` Christoph Hellwig
2019-11-27 17:58 ` [PATCH 2/4] nvme/pci: Mask legacy and MSI in threaded handler Keith Busch
2019-11-28  3:39   ` Ming Lei
2019-11-28  3:48     ` Keith Busch
2019-11-28  3:58       ` Ming Lei
2019-11-28  4:14         ` Keith Busch
2019-11-28  8:41           ` Ming Lei
2019-11-27 17:58 ` [PATCH 3/4] nvme/pci: Mask MSIx interrupts for threaded handling Keith Busch
2019-11-28  7:19   ` Christoph Hellwig
2019-11-27 17:58 ` [PATCH 4/4] nvme/pci: Spin threaded interrupt completions Keith Busch
2019-11-28  2:46   ` Sagi Grimberg
2019-11-28  3:28     ` Keith Busch [this message]
2019-11-28  3:51       ` Ming Lei
2019-11-28  3:58         ` Keith Busch
2019-11-28  7:22   ` Christoph Hellwig
2019-11-29  9:13   ` Sebastian Andrzej Siewior
2019-11-30 18:10     ` Keith Busch
2019-12-02  1:10       ` Ming Lei
2019-12-02  1:30         ` Keith Busch
2019-12-02 16:51       ` Sebastian Andrzej Siewior
2019-11-28  7:50 ` [PATCH 0/4] nvme: Threaded interrupt handling improvements Christoph Hellwig
2019-11-28 17:59   ` Keith Busch
2019-11-29  8:30     ` Christoph Hellwig
2019-11-29  9:46 ` Sebastian Andrzej Siewior
2019-11-29 16:27   ` Keith Busch
2019-11-29 17:05     ` Sebastian Andrzej Siewior
2019-11-30 17:02       ` Keith Busch
2019-12-02 17:05         ` Sebastian Andrzej Siewior
2019-12-02 17:12           ` Christoph Hellwig
2019-12-02 18:06             ` Keith Busch
2019-12-03  7:40               ` Christoph Hellwig
2019-12-02 19:57             ` Sebastian Andrzej Siewior
2019-12-03  7:42               ` Christoph Hellwig
