linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/2] powerpc/pseries: fix MSI/X IRQ affinity on pseries
@ 2020-11-26  8:28 Laurent Vivier
  2020-11-26  8:28 ` [PATCH v4 1/2] genirq/irqdomain: Add an irq_create_mapping_affinity() function Laurent Vivier
  2020-11-26  8:28 ` [PATCH v4 2/2] powerpc/pseries: Pass MSI affinity to irq_create_mapping() Laurent Vivier
  0 siblings, 2 replies; 3+ messages in thread
From: Laurent Vivier @ 2020-11-26  8:28 UTC (permalink / raw)
  To: linux-kernel
  Cc: Paul Mackerras, linux-pci, Thomas Gleixner, linux-block,
	Michael S . Tsirkin, Marc Zyngier, Christoph Hellwig, Greg Kurz,
	linuxppc-dev, Michael Ellerman, Benjamin Herrenschmidt,
	Laurent Vivier

With virtio, in multiqueue case, each queue IRQ is normally
bound to a different CPU using the affinity mask.

This works fine on x86_64 but totally ignored on pseries.

This is not obvious at first look because irqbalance is doing
some balancing to improve that.

It appears that the "managed" flag set in the MSI entry
is never copied to the system IRQ entry.

This series passes the affinity mask from rtas_setup_msi_irqs()
to irq_domain_alloc_descs() by adding an affinity parameter to
irq_create_mapping().

The first patch adds the parameter (no functional change), the
second patch passes the actual affinity mask to irq_create_mapping()
in rtas_setup_msi_irqs().

For instance, with 32 CPUs VM and 32 queues virtio-scsi interface:

... -smp 32 -device virtio-scsi-pci,id=virtio_scsi_pci0,num_queues=32

for IRQ in $(grep virtio2-request /proc/interrupts |cut -d: -f1); do
    for file in /proc/irq/$IRQ/ ; do
        echo -n "IRQ: $(basename $file) CPU: " ; cat $file/smp_affinity_list
    done
done

Without the patch (and without irqbalanced)

IRQ: 268 CPU: 0-31
IRQ: 269 CPU: 0-31
IRQ: 270 CPU: 0-31
IRQ: 271 CPU: 0-31
IRQ: 272 CPU: 0-31
IRQ: 273 CPU: 0-31
IRQ: 274 CPU: 0-31
IRQ: 275 CPU: 0-31
IRQ: 276 CPU: 0-31
IRQ: 277 CPU: 0-31
IRQ: 278 CPU: 0-31
IRQ: 279 CPU: 0-31
IRQ: 280 CPU: 0-31
IRQ: 281 CPU: 0-31
IRQ: 282 CPU: 0-31
IRQ: 283 CPU: 0-31
IRQ: 284 CPU: 0-31
IRQ: 285 CPU: 0-31
IRQ: 286 CPU: 0-31
IRQ: 287 CPU: 0-31
IRQ: 288 CPU: 0-31
IRQ: 289 CPU: 0-31
IRQ: 290 CPU: 0-31
IRQ: 291 CPU: 0-31
IRQ: 292 CPU: 0-31
IRQ: 293 CPU: 0-31
IRQ: 294 CPU: 0-31
IRQ: 295 CPU: 0-31
IRQ: 296 CPU: 0-31
IRQ: 297 CPU: 0-31
IRQ: 298 CPU: 0-31
IRQ: 299 CPU: 0-31

With the patch:

IRQ: 265 CPU: 0
IRQ: 266 CPU: 1
IRQ: 267 CPU: 2
IRQ: 268 CPU: 3
IRQ: 269 CPU: 4
IRQ: 270 CPU: 5
IRQ: 271 CPU: 6
IRQ: 272 CPU: 7
IRQ: 273 CPU: 8
IRQ: 274 CPU: 9
IRQ: 275 CPU: 10
IRQ: 276 CPU: 11
IRQ: 277 CPU: 12
IRQ: 278 CPU: 13
IRQ: 279 CPU: 14
IRQ: 280 CPU: 15
IRQ: 281 CPU: 16
IRQ: 282 CPU: 17
IRQ: 283 CPU: 18
IRQ: 284 CPU: 19
IRQ: 285 CPU: 20
IRQ: 286 CPU: 21
IRQ: 287 CPU: 22
IRQ: 288 CPU: 23
IRQ: 289 CPU: 24
IRQ: 290 CPU: 25
IRQ: 291 CPU: 26
IRQ: 292 CPU: 27
IRQ: 293 CPU: 28
IRQ: 294 CPU: 29
IRQ: 295 CPU: 30
IRQ: 299 CPU: 31

This matches what we have on an x86_64 system.

v4: udate changelog of PATCH 2, add Michael's Acked-by
v3: update changelog of PATCH 1 with comments from Thomas Gleixner and
    Marc Zyngier.
v2: add a wrapper around original irq_create_mapping() with the
    affinity parameter. Update comments

Laurent Vivier (2):
  genirq/irqdomain: Add an irq_create_mapping_affinity() function
  powerpc/pseries: Pass MSI affinity to irq_create_mapping()

 arch/powerpc/platforms/pseries/msi.c |  3 ++-
 include/linux/irqdomain.h            | 12 ++++++++++--
 kernel/irq/irqdomain.c               | 13 ++++++++-----
 3 files changed, 20 insertions(+), 8 deletions(-)

-- 
2.28.0



^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH v4 1/2] genirq/irqdomain: Add an irq_create_mapping_affinity() function
  2020-11-26  8:28 [PATCH v4 0/2] powerpc/pseries: fix MSI/X IRQ affinity on pseries Laurent Vivier
@ 2020-11-26  8:28 ` Laurent Vivier
  2020-11-26  8:28 ` [PATCH v4 2/2] powerpc/pseries: Pass MSI affinity to irq_create_mapping() Laurent Vivier
  1 sibling, 0 replies; 3+ messages in thread
From: Laurent Vivier @ 2020-11-26  8:28 UTC (permalink / raw)
  To: linux-kernel
  Cc: Paul Mackerras, linux-pci, Thomas Gleixner, linux-block,
	Michael S . Tsirkin, Marc Zyngier, Christoph Hellwig, Greg Kurz,
	linuxppc-dev, Michael Ellerman, Benjamin Herrenschmidt,
	Laurent Vivier

There is currently no way to convey the affinity of an interrupt
via irq_create_mapping(), which creates issues for devices that
expect that affinity to be managed by the kernel.

In order to sort this out, rename irq_create_mapping() to
irq_create_mapping_affinity() with an additional affinity parameter
that can conveniently passed down to irq_domain_alloc_descs().

irq_create_mapping() is then re-implemented as a wrapper around
irq_create_mapping_affinity().

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
---
 include/linux/irqdomain.h | 12 ++++++++++--
 kernel/irq/irqdomain.c    | 13 ++++++++-----
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index 71535e87109f..ea5a337e0f8b 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -384,11 +384,19 @@ extern void irq_domain_associate_many(struct irq_domain *domain,
 extern void irq_domain_disassociate(struct irq_domain *domain,
 				    unsigned int irq);
 
-extern unsigned int irq_create_mapping(struct irq_domain *host,
-				       irq_hw_number_t hwirq);
+extern unsigned int irq_create_mapping_affinity(struct irq_domain *host,
+				      irq_hw_number_t hwirq,
+				      const struct irq_affinity_desc *affinity);
 extern unsigned int irq_create_fwspec_mapping(struct irq_fwspec *fwspec);
 extern void irq_dispose_mapping(unsigned int virq);
 
+static inline unsigned int irq_create_mapping(struct irq_domain *host,
+					      irq_hw_number_t hwirq)
+{
+	return irq_create_mapping_affinity(host, hwirq, NULL);
+}
+
+
 /**
  * irq_linear_revmap() - Find a linux irq from a hw irq number.
  * @domain: domain owning this hardware interrupt
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index cf8b374b892d..e4ca69608f3b 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -624,17 +624,19 @@ unsigned int irq_create_direct_mapping(struct irq_domain *domain)
 EXPORT_SYMBOL_GPL(irq_create_direct_mapping);
 
 /**
- * irq_create_mapping() - Map a hardware interrupt into linux irq space
+ * irq_create_mapping_affinity() - Map a hardware interrupt into linux irq space
  * @domain: domain owning this hardware interrupt or NULL for default domain
  * @hwirq: hardware irq number in that domain space
+ * @affinity: irq affinity
  *
  * Only one mapping per hardware interrupt is permitted. Returns a linux
  * irq number.
  * If the sense/trigger is to be specified, set_irq_type() should be called
  * on the number returned from that call.
  */
-unsigned int irq_create_mapping(struct irq_domain *domain,
-				irq_hw_number_t hwirq)
+unsigned int irq_create_mapping_affinity(struct irq_domain *domain,
+				       irq_hw_number_t hwirq,
+				       const struct irq_affinity_desc *affinity)
 {
 	struct device_node *of_node;
 	int virq;
@@ -660,7 +662,8 @@ unsigned int irq_create_mapping(struct irq_domain *domain,
 	}
 
 	/* Allocate a virtual interrupt number */
-	virq = irq_domain_alloc_descs(-1, 1, hwirq, of_node_to_nid(of_node), NULL);
+	virq = irq_domain_alloc_descs(-1, 1, hwirq, of_node_to_nid(of_node),
+				      affinity);
 	if (virq <= 0) {
 		pr_debug("-> virq allocation failed\n");
 		return 0;
@@ -676,7 +679,7 @@ unsigned int irq_create_mapping(struct irq_domain *domain,
 
 	return virq;
 }
-EXPORT_SYMBOL_GPL(irq_create_mapping);
+EXPORT_SYMBOL_GPL(irq_create_mapping_affinity);
 
 /**
  * irq_create_strict_mappings() - Map a range of hw irqs to fixed linux irqs
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH v4 2/2] powerpc/pseries: Pass MSI affinity to irq_create_mapping()
  2020-11-26  8:28 [PATCH v4 0/2] powerpc/pseries: fix MSI/X IRQ affinity on pseries Laurent Vivier
  2020-11-26  8:28 ` [PATCH v4 1/2] genirq/irqdomain: Add an irq_create_mapping_affinity() function Laurent Vivier
@ 2020-11-26  8:28 ` Laurent Vivier
  1 sibling, 0 replies; 3+ messages in thread
From: Laurent Vivier @ 2020-11-26  8:28 UTC (permalink / raw)
  To: linux-kernel
  Cc: Paul Mackerras, linux-pci, Thomas Gleixner, linux-block,
	Michael S . Tsirkin, Marc Zyngier, Christoph Hellwig, Greg Kurz,
	linuxppc-dev, Michael Ellerman, Benjamin Herrenschmidt,
	Laurent Vivier

With virtio multiqueue, normally each queue IRQ is mapped to a CPU.

Commit 0d9f0a52c8b9f ("virtio_scsi: use virtio IRQ affinity") exposed
an existing shortcoming of the arch code by moving virtio_scsi to
the automatic IRQ affinity assignment.

The affinity is correctly computed in msi_desc but this is not applied
to the system IRQs.

It appears the affinity is correctly passed to rtas_setup_msi_irqs() but
lost at this point and never passed to irq_domain_alloc_descs()
(see commit 06ee6d571f0e ("genirq: Add affinity hint to irq allocation"))
because irq_create_mapping() doesn't take an affinity parameter.

As the previous patch has added the affinity parameter to
irq_create_mapping() we can forward the affinity from rtas_setup_msi_irqs()
to irq_domain_alloc_descs().

With this change, the virtqueues are correctly dispatched between the CPUs
on pseries.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/platforms/pseries/msi.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
index 133f6adcb39c..b3ac2455faad 100644
--- a/arch/powerpc/platforms/pseries/msi.c
+++ b/arch/powerpc/platforms/pseries/msi.c
@@ -458,7 +458,8 @@ static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
 			return hwirq;
 		}
 
-		virq = irq_create_mapping(NULL, hwirq);
+		virq = irq_create_mapping_affinity(NULL, hwirq,
+						   entry->affinity);
 
 		if (!virq) {
 			pr_debug("rtas_msi: Failed mapping hwirq %d\n", hwirq);
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-11-26  8:29 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-26  8:28 [PATCH v4 0/2] powerpc/pseries: fix MSI/X IRQ affinity on pseries Laurent Vivier
2020-11-26  8:28 ` [PATCH v4 1/2] genirq/irqdomain: Add an irq_create_mapping_affinity() function Laurent Vivier
2020-11-26  8:28 ` [PATCH v4 2/2] powerpc/pseries: Pass MSI affinity to irq_create_mapping() Laurent Vivier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).