linux-pci.vger.kernel.org archive mirror
From: Nitesh Narayan Lal <nitesh@redhat.com>
To: Bjorn Helgaas <helgaas@kernel.org>,
	Chris Friesen <chris.friesen@windriver.com>
Cc: linux-pci@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: PCI, isolcpus, and irq affinity
Date: Mon, 12 Oct 2020 13:42:11 -0400
Message-ID: <a23436f1-1d87-999f-e8fe-a9dd2f50f779@redhat.com>
In-Reply-To: <20201012165839.GA3732859@bjorn-Precision-5520>



On 10/12/20 12:58 PM, Bjorn Helgaas wrote:
> [+cc Christoph, Thomas, Nitesh]
>
> On Mon, Oct 12, 2020 at 09:49:37AM -0600, Chris Friesen wrote:
>> I've got a linux system running the RT kernel with threaded irqs.  On
>> startup we affine the various irq threads to the housekeeping CPUs, but I
>> recently hit a scenario where after some days of uptime we ended up with a
>> number of NVME irq threads affined to application cores instead (not good
>> when we're trying to run low-latency applications).
> pci_alloc_irq_vectors_affinity() basically just passes affinity
> information through to kernel/irq/affinity.c, and the PCI core doesn't
> change affinity after that.
>
>> Looking at the code, it appears that the NVME driver can in some scenarios
>> end up calling pci_alloc_irq_vectors_affinity() after initial system
>> startup, which seems to determine CPU affinity without any regard for things
>> like "isolcpus" or "cset shield".
>>
>> There seem to be other reports of similar issues:
>>
>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1831566
>>
>> It looks like some SCSI drivers and virtio_pci_common.c will also call
>> pci_alloc_irq_vectors_affinity(), though I'm not sure if they would ever do
>> it after system startup.
>>
>> How does it make sense for the PCI subsystem to affine interrupts to CPUs
>> which have explicitly been designated as "isolated"?
> This recent thread may be useful:
>
>   https://lore.kernel.org/linux-pci/20200928183529.471328-1-nitesh@redhat.com/
>
> It contains a patch to "Limit pci_alloc_irq_vectors() to housekeeping
> CPUs".  I'm not sure that patch summary is 100% accurate because IIUC
> that particular patch only reduces the *number* of vectors allocated
> and does not actually *limit* them to housekeeping CPUs.

That is correct; the above-mentioned patch only reduces the number of
vectors.

Based on the problem described here, I think the issue could be the use of
cpu_online_mask/cpu_possible_mask when creating the affinity mask or when
distributing the jobs. In these cases we should be using
housekeeping_cpumask() instead.
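
To illustrate the idea, here is a minimal userspace sketch, not kernel code.
CPU masks are modelled as 64-bit integers, and cpu_mask_t, mask_weight(),
nth_cpu() and spread_vectors() are hypothetical names made up for this
example. Falling back to the online mask when no housekeeping CPU is online
is also an assumption of the sketch, not something taken from the patches
mentioned in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model only: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpu_mask_t;

/* Count the CPUs set in mask. */
unsigned int mask_weight(cpu_mask_t mask)
{
	unsigned int weight = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		weight++;
	}
	return weight;
}

/* Return the n-th (0-based) CPU set in mask, or -1 if there is none. */
int nth_cpu(cpu_mask_t mask, unsigned int n)
{
	for (int cpu = 0; cpu < 64; cpu++) {
		if (mask & ((cpu_mask_t)1 << cpu)) {
			if (n == 0)
				return cpu;
			n--;
		}
	}
	return -1;
}

/*
 * Assign nvecs vectors round-robin to CPUs that are both online and
 * housekeeping, instead of spreading over every online CPU.
 * out[i] receives the CPU chosen for vector i (-1 if no CPU is available).
 * Returns the candidate mask that was actually used.
 */
cpu_mask_t spread_vectors(cpu_mask_t online, cpu_mask_t housekeeping,
			  unsigned int nvecs, int *out)
{
	cpu_mask_t candidates = online & housekeeping;
	unsigned int ncpus;

	assert(out);

	/* Assumed fallback: no housekeeping CPU online, use any online CPU. */
	if (!candidates)
		candidates = online;

	ncpus = mask_weight(candidates);
	for (unsigned int i = 0; i < nvecs; i++)
		out[i] = ncpus ? nth_cpu(candidates, i % ncpus) : -1;

	return candidates;
}
```

With CPUs 0-7 online (0xFF) and only CPUs 0-1 marked as housekeeping (0x03),
four vectors land on CPUs 0, 1, 0, 1 in this model, and none of them end up
on the isolated CPUs 2-7.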

A few months back, a similar issue was fixed for cpumask_local_spread()
and some other subsystems [1].

[1] https://lore.kernel.org/lkml/20200625223443.2684-1-nitesh@redhat.com/

-- 
Nitesh


