Linux-PCI Archive on lore.kernel.org
 help / color / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Jacob Keller <jacob.e.keller@intel.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Nitesh Narayan Lal <nitesh@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	helgaas@kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-pci@vger.kernel.org,
	intel-wired-lan@lists.osuosl.org, frederic@kernel.org,
	sassmann@redhat.com, jesse.brandeburg@intel.com,
	lihong.yang@intel.com, jeffrey.t.kirsher@intel.com,
	jlelli@redhat.com, hch@infradead.org, bhelgaas@google.com,
	mike.marciniszyn@intel.com, dennis.dalessandro@intel.com,
	thomas.lendacky@amd.com, jiri@nvidia.com, mingo@redhat.com,
	juri.lelli@redhat.com, vincent.guittot@linaro.org,
	lgoncalv@redhat.com
Subject: Re: [PATCH v4 4/4] PCI: Limit pci_alloc_irq_vectors() to housekeeping CPUs
Date: Mon, 26 Oct 2020 23:46:34 +0100
Message-ID: <878sbs38s5.fsf@nanos.tec.linutronix.de> (raw)
In-Reply-To: <20201026151306.4af991a5@kicinski-fedora-PC1C0HJN.hsd1.ca.comcast.net>

On Mon, Oct 26 2020 at 15:13, Jakub Kicinski wrote:
> On Mon, 26 Oct 2020 22:50:45 +0100 Thomas Gleixner wrote:
>> But I still think that for curing that isolation stuff we want at least
>> some information from the driver. Alternative solution would be to grant
>> the allocation of interrupts and queues and have some sysfs knob to shut
>> down queues at runtime. If that shutdown results in releasing the queue
>> interrupt (via free_irq()) then the vector exhaustion problem goes away.
>> 
>> Needs more thought and information (for network oblivious folks like
>> me).
>
> One piece of information that may be useful is that even tho the RX
> packets may be spread semi-randomly the user space can still control
> which queues are included in the mechanism. There is an indirection
> table in the HW which allows to weigh queues differently, or exclude
> selected queues from the spreading. Other mechanisms exist to filter
> flows onto specific queues.
>
> IOW just because a core has an queue/interrupt does not mean that
> interrupt will ever fire, provided its excluded from RSS.
>
> Another piece is that by default we suggest drivers allocate 8 RX
> queues, and online_cpus TX queues. The number of queues can be
> independently controlled via ethtool -L. Drivers which can't support
> separate queues will default to online_cpus queue pairs, and let
> ethtool -L only set the "combined" parameter.
>
> There are drivers which always allocate online_cpus interrupts, 
> and then some of them will go unused if #qs < #cpus.

Thanks for the enlightment.

> My unpopular opinion is that for networking devices all the heuristics
> we may come up with are going to be a dead end.

I agree. Heuristics suck.

> We need an explicit API to allow users placing queues on cores, and
> use managed IRQs for data queues. (I'm assuming that managed IRQs will
> let us reliably map a MSI-X vector to a core :))

Yes, they allow you to do that. That will need some tweaks to theway
they work today (coming from the strict block mq semantics). You also
need to be aware that managed irqs have also strict semantics vs. CPU
hotplug. If the last CPU in the managed affinity set goes down then the
interrupt is shut down by the irq core which means that you need to
quiesce the associated queue before that happens. When the first CPU
comes online again the interrupt is reenabled, so the queue should be
able to handle it or has ensured that the device does not raise one
before it is able to do so.

Thanks,

        tglx

  reply index

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-28 18:35 [PATCH v4 0/4] isolation: limit msix vectors " Nitesh Narayan Lal
2020-09-28 18:35 ` [PATCH v4 1/4] sched/isolation: API to get number of " Nitesh Narayan Lal
2020-09-28 18:35 ` [PATCH v4 2/4] sched/isolation: Extend nohz_full to isolate managed IRQs Nitesh Narayan Lal
2020-10-23 13:25   ` Peter Zijlstra
2020-10-23 13:29     ` Frederic Weisbecker
2020-10-23 13:57       ` Nitesh Narayan Lal
2020-10-23 13:45     ` Nitesh Narayan Lal
2020-09-28 18:35 ` [PATCH v4 3/4] i40e: Limit msix vectors to housekeeping CPUs Nitesh Narayan Lal
2020-09-28 18:35 ` [PATCH v4 4/4] PCI: Limit pci_alloc_irq_vectors() " Nitesh Narayan Lal
2020-09-28 21:59   ` Bjorn Helgaas
2020-09-29 17:46     ` Christoph Hellwig
2020-10-16 12:20   ` Peter Zijlstra
2020-10-18 18:14     ` Nitesh Narayan Lal
2020-10-19 11:11       ` Peter Zijlstra
2020-10-19 14:00         ` Marcelo Tosatti
2020-10-19 14:25           ` Nitesh Narayan Lal
2020-10-20  7:30           ` Peter Zijlstra
2020-10-20 13:00             ` Nitesh Narayan Lal
2020-10-20 13:41               ` Peter Zijlstra
2020-10-20 14:39                 ` Nitesh Narayan Lal
2020-10-22 17:47                   ` Nitesh Narayan Lal
2020-10-23  8:58                     ` Peter Zijlstra
2020-10-23 13:10                       ` Nitesh Narayan Lal
2020-10-23 21:00                         ` Thomas Gleixner
2020-10-26 13:35                           ` Nitesh Narayan Lal
2020-10-26 13:57                             ` Thomas Gleixner
2020-10-26 17:30                           ` Marcelo Tosatti
2020-10-26 19:00                             ` Thomas Gleixner
2020-10-26 19:11                               ` Marcelo Tosatti
2020-10-26 19:21                               ` Jacob Keller
2020-10-26 20:11                                 ` Thomas Gleixner
2020-10-26 21:11                                   ` Jacob Keller
2020-10-26 21:50                                     ` Thomas Gleixner
2020-10-26 22:13                                       ` Jakub Kicinski
2020-10-26 22:46                                         ` Thomas Gleixner [this message]
2020-10-26 22:52                                         ` Jacob Keller
2020-10-26 22:22                                       ` Nitesh Narayan Lal
2020-10-26 22:49                                         ` Thomas Gleixner
2020-10-26 23:08                                           ` Jacob Keller
2020-10-27 14:28                                             ` Thomas Gleixner
2020-10-27 11:47                                         ` Marcelo Tosatti
2020-10-27 14:43                                           ` Thomas Gleixner
2020-10-19 14:21         ` Frederic Weisbecker
2020-10-20 14:16   ` Thomas Gleixner
2020-10-20 16:18     ` Nitesh Narayan Lal
2020-10-20 18:07       ` Thomas Gleixner
2020-10-21 20:25         ` Thomas Gleixner
2020-10-21 21:04           ` Nitesh Narayan Lal
2020-10-22  0:02           ` Jakub Kicinski
2020-10-22  0:27             ` Jacob Keller
2020-10-22  8:28             ` Thomas Gleixner
2020-10-22 12:28           ` Marcelo Tosatti
2020-10-22 22:39             ` Thomas Gleixner
2020-10-01 15:49 ` [PATCH v4 0/4] isolation: limit msix vectors " Frederic Weisbecker
2020-10-08 21:40   ` Nitesh Narayan Lal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=878sbs38s5.fsf@nanos.tec.linutronix.de \
    --to=tglx@linutronix.de \
    --cc=bhelgaas@google.com \
    --cc=dennis.dalessandro@intel.com \
    --cc=frederic@kernel.org \
    --cc=hch@infradead.org \
    --cc=helgaas@kernel.org \
    --cc=intel-wired-lan@lists.osuosl.org \
    --cc=jacob.e.keller@intel.com \
    --cc=jeffrey.t.kirsher@intel.com \
    --cc=jesse.brandeburg@intel.com \
    --cc=jiri@nvidia.com \
    --cc=jlelli@redhat.com \
    --cc=juri.lelli@redhat.com \
    --cc=kuba@kernel.org \
    --cc=lgoncalv@redhat.com \
    --cc=lihong.yang@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=mike.marciniszyn@intel.com \
    --cc=mingo@redhat.com \
    --cc=mtosatti@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=nitesh@redhat.com \
    --cc=peterz@infradead.org \
    --cc=sassmann@redhat.com \
    --cc=thomas.lendacky@amd.com \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-PCI Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-pci/0 linux-pci/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-pci linux-pci/ https://lore.kernel.org/linux-pci \
		linux-pci@vger.kernel.org
	public-inbox-index linux-pci

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-pci


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git