From: Stefan Hajnoczi <stefanha@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
linux-pci@vger.kernel.org, Thomas Gleixner <tglx@linutronix.de>,
Bjorn Helgaas <bhelgaas@google.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Stefan Hajnoczi <stefanha@redhat.com>
Subject: [RFC 0/2] genirq: take device NUMA node into account for managed IRQs
Date: Wed, 17 Jun 2020 10:37:23 +0100
Message-ID: <20200617093725.1725569-1-stefanha@redhat.com>
Devices with a small number of managed IRQs do not benefit from spreading
across all CPUs. Instead they benefit from NUMA node affinity so that IRQs are
handled on the device's NUMA node.
For example, here is a machine with a virtio-blk PCI device on NUMA node 1:
# lstopo-no-graphics
Machine (958MB total)
  Package L#0
    NUMANode L#0 (P#0 491MB)
    L3 L#0 (16MB) + L2 L#0 (4096KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
  Package L#1
    NUMANode L#1 (P#1 466MB)
    L3 L#1 (16MB) + L2 L#1 (4096KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)
    HostBridge
      PCIBridge
        PCI c9:00.0 (SCSI)
          Block "vdb"
  HostBridge
    PCIBridge
      PCI 02:00.0 (Ethernet)
        Net "enp2s0"
    PCIBridge
      PCI 05:00.0 (SCSI)
        Block "vda"
    PCI 00:1f.2 (SATA)
Currently the virtio5-req.0 IRQ for the vdb device gets assigned to CPU 0:
# cat /proc/interrupts
           CPU0       CPU1
...
 36:          0          0   PCI-MSI 105381888-edge      virtio5-config
 37:         81          0   PCI-MSI 105381889-edge      virtio5-req.0
If managed IRQ assignment takes the device's NUMA node into account then CPU 1
will be used instead:
# cat /proc/interrupts
           CPU0       CPU1
...
 36:          0          0   PCI-MSI 105381888-edge      virtio5-config
 37:          0         92   PCI-MSI 105381889-edge      virtio5-req.0
The fio benchmark with 4KB random read running on CPU 1 increases IOPS by 58%:
Name      IOPS        Error
Before    26720.59    ± 0.28%
After     42373.79    ± 0.54%
Most of this improvement is not due to NUMA; it comes from requests completing
on the same CPU where they were submitted. However, if the IRQ is on CPU 0 and
fio also runs on CPU 0, only 39600 IOPS are achieved, not the full 42373 IOPS
that we get when NUMA affinity is honored. So it is worth taking NUMA into
account to achieve maximum performance.
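The two comparisons above can be checked with a few lines of C. This is only
a sketch of the arithmetic: percent_gain() is a hypothetical helper name, and
the inputs are the IOPS figures quoted in this message.

```c
/*
 * Relative change from `before` to `after`, in percent.
 * Hypothetical helper for checking the figures in this cover letter;
 * it is not part of the patches.
 */
double percent_gain(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

percent_gain(26720.59, 42373.79) comes out at about 58.6, matching the
roughly 58% quoted for moving the IRQ to CPU 1. percent_gain(39600.0,
42373.79) comes out at about 7.0, which is the part of the gain that remains
when submission and completion already share a CPU.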
The following patches are a hack that uses the device's NUMA node when
assigning managed IRQs. They are not mergeable but I hope they will help start
the discussion. One bug is that they affect all managed IRQs, even for devices
with many IRQs where spreading across all CPUs is a good policy.
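To make the intended policy concrete, here is a minimal user-space model of
it. It is not the code from the patches. pick_managed_cpu(), the cpu_node[]
and managed[] arrays, and NO_NODE are names made up for this illustration
(NO_NODE stands in for the kernel's NUMA_NO_NODE). The model assumes that the
allocator prefers the CPU with the fewest managed vectors, and it simply
restricts that choice to the device's node when possible.

```c
#include <assert.h>
#include <stddef.h>

#define NO_NODE (-1) /* stand-in for the kernel's NUMA_NO_NODE */

/*
 * Return the index of the CPU with the fewest managed vectors among the
 * CPUs on `node`, or -1 if no CPU is on that node. NO_NODE matches every
 * CPU. Ties go to the lowest CPU index.
 */
static int least_loaded_cpu(const int *cpu_node, const unsigned int *managed,
                            size_t ncpus, int node)
{
    int best = -1;

    for (size_t cpu = 0; cpu < ncpus; cpu++) {
        if (node != NO_NODE && cpu_node[cpu] != node)
            continue;
        if (best < 0 || managed[cpu] < managed[best])
            best = (int)cpu;
    }
    return best;
}

/*
 * Prefer a CPU on the device's NUMA node. Fall back to all CPUs when the
 * device has no node or the node has no CPUs.
 */
int pick_managed_cpu(const int *cpu_node, const unsigned int *managed,
                     size_t ncpus, int dev_node)
{
    int cpu;

    assert(ncpus > 0);
    cpu = least_loaded_cpu(cpu_node, managed, ncpus, dev_node);
    if (cpu < 0)
        cpu = least_loaded_cpu(cpu_node, managed, ncpus, NO_NODE);
    return cpu;
}
```

With the two-CPU machine above (CPU 0 on node 0, CPU 1 on node 1) and a
device on node 1, the model picks CPU 1 even if CPU 0 carries fewer managed
vectors. The model does not address devices with many IRQs, where spreading
across all CPUs is preferable.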
Please let me know what you think:
1. Is there a reason why managed IRQs should *not* take NUMA into account that
I've missed?
2. Is there a better place to implement this logic? For example, in
pci_alloc_irq_vectors_affinity(), where the cpumasks are calculated?
Any suggestions on how to proceed would be appreciated. Thanks!
Stefan Hajnoczi (2):
genirq: honor device NUMA node when allocating descs
genirq/matrix: take NUMA into account for managed IRQs
include/linux/irq.h | 2 +-
arch/x86/kernel/apic/vector.c | 3 ++-
kernel/irq/irqdesc.c | 3 ++-
kernel/irq/matrix.c | 16 ++++++++++++----
4 files changed, 17 insertions(+), 7 deletions(-)
-- 
2.26.2