Linux-NVME Archive on lore.kernel.org
 help / color / Atom feed
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Christoph Hellwig <hch@lst.de>
Cc: sagi@grimberg.me, linux-nvme@lists.infradead.org,
	ming.lei@redhat.com, helgaas@kernel.org,
	Keith Busch <kbusch@kernel.org>,
	tglx@linutronix.de
Subject: Re: [PATCH 0/4] nvme: Threaded interrupt handling improvements
Date: Mon, 2 Dec 2019 20:57:30 +0100
Message-ID: <20191202195730.bzzldihtv37odsie@linutronix.de> (raw)
In-Reply-To: <20191202171239.GA8547@lst.de>

On 2019-12-02 18:12:39 [+0100], Christoph Hellwig wrote:
> On Mon, Dec 02, 2019 at 06:05:38PM +0100, Sebastian Andrzej Siewior wrote:
> > That might be a misunderstanding. I think if your threaded-IRQ handler
> > is running legitimately for longer period of time (and making progress)
> > and IRQ core's "nobody-care" detector shuts it down then the detector
> > might need a tweak.
> > The worst thing that could happen, is that the RT tasks run for too long
> > and the scheduler punishes them to protect against run-away-tasks (the
> > default limit is at 950ms RT task time within 1 second,
> > sched_rt_runtime_us).
> 
> The problem is that by doing the agressive polling we can keep one
> CPU busy just running the irq handler and starve processes on that
> CPU if an NVMe queue servers multiple CPUs.

and this is bad? The scheduler will move everything to other CPUs unless
it is for pinned to this CPU. You can offload even RCU these days :)
Performance wise it might be better to dedicate one CPU doing this work
instead spreading it over four CPUs each doing a fraction of it and
using same cache lines which bounce from one CPU to the next.
 
> That's why I had the previous idea of one irq thread per cpu that
> is assigned to the irq.  We'd have to encode a relative index into
> the hardirq handler return value which we get from bits encoded in
> the NVMe command ID, but that should be doable.  At that point we
> shouldn't need the cond_resched.  I can try to hack that up, but
> I'm not an expert on the irq thread code.

there is always one IRQ-thread per-CPU/Interrupt.
You could start a kthread_create_worker_on_cpu() on multiple CPUs and
feed them with work from your interrupt. And if you make it SCHED_FIFO
then you should be able to run your completion on multiple CPUs from one 
interrupt.

Sebastian

_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

  parent reply index

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-27 17:58 Keith Busch
2019-11-27 17:58 ` [PATCH 1/4] PCI/MSI: Export __pci_msix_desc_mask_irq Keith Busch
2019-11-28  2:42   ` Sagi Grimberg
2019-11-28  3:41     ` Keith Busch
2019-11-28  7:17   ` Christoph Hellwig
2019-11-27 17:58 ` [PATCH 2/4] nvme/pci: Mask legacy and MSI in threaded handler Keith Busch
2019-11-28  3:39   ` Ming Lei
2019-11-28  3:48     ` Keith Busch
2019-11-28  3:58       ` Ming Lei
2019-11-28  4:14         ` Keith Busch
2019-11-28  8:41           ` Ming Lei
2019-11-27 17:58 ` [PATCH 3/4] nvme/pci: Mask MSIx interrupts for threaded handling Keith Busch
2019-11-28  7:19   ` Christoph Hellwig
2019-11-27 17:58 ` [PATCH 4/4] nvme/pci: Spin threaded interrupt completions Keith Busch
2019-11-28  2:46   ` Sagi Grimberg
2019-11-28  3:28     ` Keith Busch
2019-11-28  3:51       ` Ming Lei
2019-11-28  3:58         ` Keith Busch
2019-11-28  7:22   ` Christoph Hellwig
2019-11-29  9:13   ` Sebastian Andrzej Siewior
2019-11-30 18:10     ` Keith Busch
2019-12-02  1:10       ` Ming Lei
2019-12-02  1:30         ` Keith Busch
2019-12-02 16:51       ` Sebastian Andrzej Siewior
2019-11-28  7:50 ` [PATCH 0/4] nvme: Threaded interrupt handling improvements Christoph Hellwig
2019-11-28 17:59   ` Keith Busch
2019-11-29  8:30     ` Christoph Hellwig
2019-11-29  9:46 ` Sebastian Andrzej Siewior
2019-11-29 16:27   ` Keith Busch
2019-11-29 17:05     ` Sebastian Andrzej Siewior
2019-11-30 17:02       ` Keith Busch
2019-12-02 17:05         ` Sebastian Andrzej Siewior
2019-12-02 17:12           ` Christoph Hellwig
2019-12-02 18:06             ` Keith Busch
2019-12-03  7:40               ` Christoph Hellwig
2019-12-02 19:57             ` Sebastian Andrzej Siewior [this message]
2019-12-03  7:42               ` Christoph Hellwig

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191202195730.bzzldihtv37odsie@linutronix.de \
    --to=bigeasy@linutronix.de \
    --cc=hch@lst.de \
    --cc=helgaas@kernel.org \
    --cc=kbusch@kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=ming.lei@redhat.com \
    --cc=sagi@grimberg.me \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-NVME Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-nvme/0 linux-nvme/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-nvme linux-nvme/ https://lore.kernel.org/linux-nvme \
		linux-nvme@lists.infradead.org
	public-inbox-index linux-nvme

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.infradead.lists.linux-nvme


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git