iommu.lists.linux-foundation.org archive mirror
* [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI
@ 2020-08-21  0:24 Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 01/38] iommu/amd: Prevent NULL pointer dereference Thomas Gleixner
                   ` (38 more replies)
  0 siblings, 39 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

First of all, sorry for the horribly long Cc list, which was
unfortunately unavoidable as this touches the world and some more.

This patch series aims to provide a base to support device MSI (non
PCI based) in a halfway architecture-independent way.

It's a mixed bag of bug fixes, cleanups and general improvements which
are worthwhile independent of the device MSI stuff. Unfortunately this
also comes with an evil abuse of the irqdomain system to coerce XEN on
x86 into compliance without rewriting XEN from scratch.

As discussed at length in this mail thread:

  https://lore.kernel.org/r/87h7tcgbs2.fsf@nanos.tec.linutronix.de

the initial attempt of piggybacking device MSI support on platform MSI
is doomed for various reasons, but creating independent interrupt
domains for these upcoming magic PCI subdevices which are not PCI, but
might be exposed as PCI devices is not as trivial as it seems.

The initially suggested and evaluated approach of extending platform
MSI turned out to be the completely wrong direction and in fact
platform MSI should be rewritten on top of device MSI or completely
replaced by it.

One of the main issues is that x86 does not support the concept of irq
domain associations stored in device::msi_domain and still relies on
the arch_*_msi_irqs() fallback implementations, which have their own set
of problems as outlined in

  https://lore.kernel.org/r/87bljg7u4f.fsf@nanos.tec.linutronix.de/

in the very same thread.
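
The difference between the two lookup styles can be pictured with a toy
model. This is a minimal userspace sketch, not kernel code: every name in
it (toy_device, toy_irq_domain, toy_setup_msi) is hypothetical and only
mirrors the idea of a per-device domain pointer with an architecture
fallback.

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
struct toy_irq_domain { int id; };
struct toy_device { struct toy_irq_domain *msi_domain; };

enum toy_path { TOY_VIA_DOMAIN, TOY_VIA_ARCH_FALLBACK };

/* Prefer the per-device domain; use the arch hook only when none is set. */
static enum toy_path toy_setup_msi(const struct toy_device *dev)
{
	return dev->msi_domain ? TOY_VIA_DOMAIN : TOY_VIA_ARCH_FALLBACK;
}
```

In this model, a device without a stored domain pointer always ends up in
the fallback branch, which is the situation described for x86 above.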

The main obstacle to storing that pointer is XEN, which has its own
historical notion of handling PCI MSI interrupts.

This series tries to address these issues in several steps:

 1) Accidental bug fixes
	iommu/amd: Prevent NULL pointer dereference

 2) Janitoring
	x86/init: Remove unused init ops

 3) Simplification of the x86 specific interrupt allocation mechanism

	x86/irq: Rename X86_IRQ_ALLOC_TYPE_MSI* to reflect PCI dependency
	x86/irq: Add allocation type for parent domain retrieval
	iommu/vt-d: Consolidate irq domain getter
	iommu/amd: Consolidate irq domain getter
	iommu/irq_remapping: Consolidate irq domain lookup

 4) Consolidation of the X86 specific interrupt allocation mechanism to be as close
    as possible to the generic MSI allocation mechanism which allows to get rid
    of quite a bunch of x86'isms which are pointless

	x86/irq: Prepare consolidation of irq_alloc_info
	x86/msi: Consolidate HPET allocation
	x86/ioapic: Consolidate IOAPIC allocation
	x86/irq: Consolidate DMAR irq allocation
	x86/irq: Consolidate UV domain allocation
	PCI: MSI: Rework pci_msi_domain_calc_hwirq()
	x86/msi: Consolidate MSI allocation
	x86/msi: Use generic MSI domain ops

  5) x86 specific cleanups to remove the dependency on arch_*_msi_irqs()

	x86/irq: Move apic_post_init() invocation to one place
	x86/pci: Reduce #ifdeffery in PCI init code
	x86/irq: Initialize PCI/MSI domain at PCI init time
	irqdomain/msi: Provide DOMAIN_BUS_VMD_MSI
	PCI: vmd: Mark VMD irqdomain with DOMAIN_BUS_VMD_MSI
	PCI: MSI: Provide pci_dev_has_special_msi_domain() helper
	x86/xen: Make xen_msi_init() static and rename it to xen_hvm_msi_init()
	x86/xen: Rework MSI teardown
	x86/xen: Consolidate XEN-MSI init
	irqdomain/msi: Allow to override msi_domain_alloc/free_irqs()
	x86/xen: Wrap XEN MSI management into irqdomain
	iommu/vt-d: Store irq domain in struct device
	iommu/amd: Store irq domain in struct device
	x86/pci: Set default irq domain in pcibios_add_device()
	PCI/MSI: Allow to disable arch fallbacks
	x86/irq: Cleanup the arch_*_msi_irqs() leftovers
	x86/irq: Make most MSI ops XEN private

    This one is paving the way to device MSI support, but it comes
    with an ugly and evil hack. The ability of overriding the default
    allocation/free functions of an MSI irq domain is useful in general as
    (hopefully) demonstrated with the device MSI POC, but the abuse
    in context of XEN is evil. OTOH without enough XENology and without
    rewriting XEN from scratch wrapping XEN MSI handling into a pseudo
    irq domain is a reasonable step forward for mere mortals with severely
    limited XENology. One day the XEN folks might make it a real irq domain.
    Perhaps when they have to support the same mess on other architectures.
    Hope dies last...

    At least the mechanism to override alloc/free turned out to be useful
    for implementing the base infrastructure for device MSI. So it's not a
    completely lost case.

  6) X86 specific preparation for device MSI

       x86/irq: Add DEV_MSI allocation type
       x86/msi: Let pci_msi_prepare() handle non-PCI MSI

  7) Generic device MSI infrastructure

       platform-msi: Provide default irq_chip:ack
       platform-msi: Add device MSI infrastructure

  8) Infrastructure for and a POC of an IMS (Interrupt Message
     Store) irq domain and irqchip implementation

       irqdomain/msi: Provide msi_alloc/free_store() callbacks
       irqchip: Add IMS array driver - NOT FOR MERGING
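
The override mechanism mentioned under 5) can be sketched as an ops table
whose allocation and free callbacks default to the generic implementation
unless a domain supplies its own. This is a userspace illustration under
assumed names (toy_msi_ops, toy_domain_init, toy_single_alloc are all
hypothetical), not the interface added by the series.

```c
#include <stddef.h>

struct toy_domain;

/* Hypothetical ops table; a NULL member means "use the default". */
struct toy_msi_ops {
	int  (*alloc_irqs)(struct toy_domain *d, int nvec);
	void (*free_irqs)(struct toy_domain *d);
};

struct toy_domain {
	struct toy_msi_ops ops;
	int allocated;		/* vectors currently handed out */
};

/* Default behaviour: account for every requested vector. */
static int toy_default_alloc(struct toy_domain *d, int nvec)
{
	d->allocated += nvec;
	return 0;
}

static void toy_default_free(struct toy_domain *d)
{
	d->allocated = 0;
}

/* Fill in defaults once, so the entry points never test for NULL. */
static void toy_domain_init(struct toy_domain *d, const struct toy_msi_ops *ops)
{
	d->allocated = 0;
	d->ops.alloc_irqs = (ops && ops->alloc_irqs) ?
			    ops->alloc_irqs : toy_default_alloc;
	d->ops.free_irqs = (ops && ops->free_irqs) ?
			   ops->free_irqs : toy_default_free;
}

/* Generic entry points: always dispatch through the ops table. */
static int toy_alloc_irqs(struct toy_domain *d, int nvec)
{
	return d->ops.alloc_irqs(d, nvec);
}

static void toy_free_irqs(struct toy_domain *d)
{
	d->ops.free_irqs(d);
}

/* Example override: a made-up backend that grants one vector per call. */
static int toy_single_alloc(struct toy_domain *d, int nvec)
{
	if (nvec != 1)
		return -1;
	d->allocated += 1;
	return 0;
}
```

A domain created with toy_single_alloc as its alloc_irqs member keeps the
default free path, which shows that each callback can be overridden
independently.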

The whole lot is also available from git:

   git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git device-msi

This has been tested on Intel/AMD/KVM but lacks testing on:

    - HYPERV (-ENODEV)
    - VMD enabled systems (-ENODEV)
    - XEN (-ENOCLUE)

#1 and #2 should be applied unconditionally for obvious reasons
#3-5 are worthwhile cleanups which should be done independent of device MSI

#6-7 look promising to clean up the platform MSI implementation
     independent of #8, but I neither had cycles nor stomach to tackle that.

#8 is obviously just for the folks interested in IMS

And of course this all started with a 100 lines combo patch to figure
out whether this is possible at all with a reasonable effort. 38
patches later ...

Thanks,

	tglx
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


* [patch RFC 01/38] iommu/amd: Prevent NULL pointer dereference
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 02/38] x86/init: Remove unused init ops Thomas Gleixner
                   ` (37 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: iommu-amd--Prevent-NULL-pointer-dereference.patch --]
[-- Type: text/plain, Size: 829 bytes --]

Dereferencing irq_data before checking it for NULL is suboptimal.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
---
 drivers/iommu/amd/iommu.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3717,8 +3717,8 @@ static int irq_remapping_alloc(struct ir
 
 	for (i = 0; i < nr_irqs; i++) {
 		irq_data = irq_domain_get_irq_data(domain, virq + i);
-		cfg = irqd_cfg(irq_data);
-		if (!irq_data || !cfg) {
+		cfg = irq_data ? irqd_cfg(irq_data) : NULL;
+		if (!cfg) {
 			ret = -EINVAL;
 			goto out_free_data;
 		}
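
The ordering problem fixed above can be reproduced outside the kernel. The
following is a minimal userspace sketch, assuming an accessor that
dereferences its argument unconditionally; fake_irq_data, fake_cfg,
fake_irqd_cfg() and lookup_cfg() are hypothetical stand-ins, not the
kernel's types or functions.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's irq_data and irq_cfg. */
struct fake_cfg { int vector; };
struct fake_irq_data { struct fake_cfg *chip_data; };

/* Assumed to dereference its argument, so it must never be passed NULL. */
static struct fake_cfg *fake_irqd_cfg(struct fake_irq_data *d)
{
	return d->chip_data;
}

/* Check the pointer first, then derive cfg from it. Returns 0 or -1. */
static int lookup_cfg(struct fake_irq_data *d, struct fake_cfg **out)
{
	struct fake_cfg *cfg = d ? fake_irqd_cfg(d) : NULL;

	if (!cfg)
		return -1;	/* the patch returns -EINVAL at this point */
	*out = cfg;
	return 0;
}
```

With the check folded into the assignment, a single test of cfg covers both
the missing irq_data and the missing cfg case.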


* [patch RFC 02/38] x86/init: Remove unused init ops
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 01/38] iommu/amd: Prevent NULL pointer dereference Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 03/38] x86/irq: Rename X86_IRQ_ALLOC_TYPE_MSI* to reflect PCI dependency Thomas Gleixner
                   ` (36 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: x86-init--Remove-unused-init-ops.patch --]
[-- Type: text/plain, Size: 5159 bytes --]

Some past platform removal forgot to get rid of this unused ballast.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/mpspec.h   |   10 ----------
 arch/x86/include/asm/x86_init.h |   10 ----------
 arch/x86/kernel/mpparse.c       |   26 ++++----------------------
 arch/x86/kernel/x86_init.c      |    4 ----
 4 files changed, 4 insertions(+), 46 deletions(-)

--- a/arch/x86/include/asm/mpspec.h
+++ b/arch/x86/include/asm/mpspec.h
@@ -67,21 +67,11 @@ static inline void find_smp_config(void)
 #ifdef CONFIG_X86_MPPARSE
 extern void e820__memblock_alloc_reserved_mpc_new(void);
 extern int enable_update_mptable;
-extern int default_mpc_apic_id(struct mpc_cpu *m);
-extern void default_smp_read_mpc_oem(struct mpc_table *mpc);
-# ifdef CONFIG_X86_IO_APIC
-extern void default_mpc_oem_bus_info(struct mpc_bus *m, char *str);
-# else
-#  define default_mpc_oem_bus_info NULL
-# endif
 extern void default_find_smp_config(void);
 extern void default_get_smp_config(unsigned int early);
 #else
 static inline void e820__memblock_alloc_reserved_mpc_new(void) { }
 #define enable_update_mptable 0
-#define default_mpc_apic_id NULL
-#define default_smp_read_mpc_oem NULL
-#define default_mpc_oem_bus_info NULL
 #define default_find_smp_config x86_init_noop
 #define default_get_smp_config x86_init_uint_noop
 #endif
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -11,22 +11,12 @@ struct cpuinfo_x86;
 
 /**
  * struct x86_init_mpparse - platform specific mpparse ops
- * @mpc_record:			platform specific mpc record accounting
  * @setup_ioapic_ids:		platform specific ioapic id override
- * @mpc_apic_id:		platform specific mpc apic id assignment
- * @smp_read_mpc_oem:		platform specific oem mpc table setup
- * @mpc_oem_pci_bus:		platform specific pci bus setup (default NULL)
- * @mpc_oem_bus_info:		platform specific mpc bus info
  * @find_smp_config:		find the smp configuration
  * @get_smp_config:		get the smp configuration
  */
 struct x86_init_mpparse {
-	void (*mpc_record)(unsigned int mode);
 	void (*setup_ioapic_ids)(void);
-	int (*mpc_apic_id)(struct mpc_cpu *m);
-	void (*smp_read_mpc_oem)(struct mpc_table *mpc);
-	void (*mpc_oem_pci_bus)(struct mpc_bus *m);
-	void (*mpc_oem_bus_info)(struct mpc_bus *m, char *name);
 	void (*find_smp_config)(void);
 	void (*get_smp_config)(unsigned int early);
 };
--- a/arch/x86/kernel/mpparse.c
+++ b/arch/x86/kernel/mpparse.c
@@ -46,11 +46,6 @@ static int __init mpf_checksum(unsigned
 	return sum & 0xFF;
 }
 
-int __init default_mpc_apic_id(struct mpc_cpu *m)
-{
-	return m->apicid;
-}
-
 static void __init MP_processor_info(struct mpc_cpu *m)
 {
 	int apicid;
@@ -61,7 +56,7 @@ static void __init MP_processor_info(str
 		return;
 	}
 
-	apicid = x86_init.mpparse.mpc_apic_id(m);
+	apicid = m->apicid;
 
 	if (m->cpuflag & CPU_BOOTPROCESSOR) {
 		bootup_cpu = " (Bootup-CPU)";
@@ -73,7 +68,7 @@ static void __init MP_processor_info(str
 }
 
 #ifdef CONFIG_X86_IO_APIC
-void __init default_mpc_oem_bus_info(struct mpc_bus *m, char *str)
+static void __init mpc_oem_bus_info(struct mpc_bus *m, char *str)
 {
 	memcpy(str, m->bustype, 6);
 	str[6] = 0;
@@ -84,7 +79,7 @@ static void __init MP_bus_info(struct mp
 {
 	char str[7];
 
-	x86_init.mpparse.mpc_oem_bus_info(m, str);
+	mpc_oem_bus_info(m, str);
 
 #if MAX_MP_BUSSES < 256
 	if (m->busid >= MAX_MP_BUSSES) {
@@ -100,9 +95,6 @@ static void __init MP_bus_info(struct mp
 		mp_bus_id_to_type[m->busid] = MP_BUS_ISA;
 #endif
 	} else if (strncmp(str, BUSTYPE_PCI, sizeof(BUSTYPE_PCI) - 1) == 0) {
-		if (x86_init.mpparse.mpc_oem_pci_bus)
-			x86_init.mpparse.mpc_oem_pci_bus(m);
-
 		clear_bit(m->busid, mp_bus_not_pci);
 #ifdef CONFIG_EISA
 		mp_bus_id_to_type[m->busid] = MP_BUS_PCI;
@@ -198,8 +190,6 @@ static void __init smp_dump_mptable(stru
 			1, mpc, mpc->length, 1);
 }
 
-void __init default_smp_read_mpc_oem(struct mpc_table *mpc) { }
-
 static int __init smp_read_mpc(struct mpc_table *mpc, unsigned early)
 {
 	char str[16];
@@ -218,14 +208,7 @@ static int __init smp_read_mpc(struct mp
 	if (early)
 		return 1;
 
-	if (mpc->oemptr)
-		x86_init.mpparse.smp_read_mpc_oem(mpc);
-
-	/*
-	 *      Now process the configuration blocks.
-	 */
-	x86_init.mpparse.mpc_record(0);
-
+	/* Now process the configuration blocks. */
 	while (count < mpc->length) {
 		switch (*mpt) {
 		case MP_PROCESSOR:
@@ -256,7 +239,6 @@ static int __init smp_read_mpc(struct mp
 			count = mpc->length;
 			break;
 		}
-		x86_init.mpparse.mpc_record(1);
 	}
 
 	if (!num_processors)
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -67,11 +67,7 @@ struct x86_init_ops x86_init __initdata
 	},
 
 	.mpparse = {
-		.mpc_record		= x86_init_uint_noop,
 		.setup_ioapic_ids	= x86_init_noop,
-		.mpc_apic_id		= default_mpc_apic_id,
-		.smp_read_mpc_oem	= default_smp_read_mpc_oem,
-		.mpc_oem_bus_info	= default_mpc_oem_bus_info,
 		.find_smp_config	= default_find_smp_config,
 		.get_smp_config		= default_get_smp_config,
 	},


* [patch RFC 03/38] x86/irq: Rename X86_IRQ_ALLOC_TYPE_MSI* to reflect PCI dependency
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 01/38] iommu/amd: Prevent NULL pointer dereference Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 02/38] x86/init: Remove unused init ops Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 04/38] x86/irq: Add allocation type for parent domain retrieval Thomas Gleixner
                   ` (35 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: x86-irq--Rename-X86_IRQ_ALLOC_TYPE_MSI --]
[-- Type: text/plain, Size: 5684 bytes --]

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
---
 arch/x86/include/asm/hw_irq.h       |    4 ++--
 arch/x86/kernel/apic/msi.c          |    6 +++---
 drivers/iommu/amd/iommu.c           |   24 ++++++++++++------------
 drivers/iommu/intel/irq_remapping.c |   18 +++++++++---------
 4 files changed, 26 insertions(+), 26 deletions(-)

--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -36,8 +36,8 @@ struct msi_desc;
 enum irq_alloc_type {
 	X86_IRQ_ALLOC_TYPE_IOAPIC = 1,
 	X86_IRQ_ALLOC_TYPE_HPET,
-	X86_IRQ_ALLOC_TYPE_MSI,
-	X86_IRQ_ALLOC_TYPE_MSIX,
+	X86_IRQ_ALLOC_TYPE_PCI_MSI,
+	X86_IRQ_ALLOC_TYPE_PCI_MSIX,
 	X86_IRQ_ALLOC_TYPE_DMAR,
 	X86_IRQ_ALLOC_TYPE_UV,
 };
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -188,7 +188,7 @@ int native_setup_msi_irqs(struct pci_dev
 	struct irq_alloc_info info;
 
 	init_irq_alloc_info(&info, NULL);
-	info.type = X86_IRQ_ALLOC_TYPE_MSI;
+	info.type = X86_IRQ_ALLOC_TYPE_PCI_MSI;
 	info.msi_dev = dev;
 
 	domain = irq_remapping_get_irq_domain(&info);
@@ -220,9 +220,9 @@ int pci_msi_prepare(struct irq_domain *d
 	init_irq_alloc_info(arg, NULL);
 	arg->msi_dev = pdev;
 	if (desc->msi_attrib.is_msix) {
-		arg->type = X86_IRQ_ALLOC_TYPE_MSIX;
+		arg->type = X86_IRQ_ALLOC_TYPE_PCI_MSIX;
 	} else {
-		arg->type = X86_IRQ_ALLOC_TYPE_MSI;
+		arg->type = X86_IRQ_ALLOC_TYPE_PCI_MSI;
 		arg->flags |= X86_IRQ_ALLOC_CONTIGUOUS_VECTORS;
 	}
 
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3514,8 +3514,8 @@ static int get_devid(struct irq_alloc_in
 	case X86_IRQ_ALLOC_TYPE_HPET:
 		devid     = get_hpet_devid(info->hpet_id);
 		break;
-	case X86_IRQ_ALLOC_TYPE_MSI:
-	case X86_IRQ_ALLOC_TYPE_MSIX:
+	case X86_IRQ_ALLOC_TYPE_PCI_MSI:
+	case X86_IRQ_ALLOC_TYPE_PCI_MSIX:
 		devid = get_device_id(&info->msi_dev->dev);
 		break;
 	default:
@@ -3553,8 +3553,8 @@ static struct irq_domain *get_irq_domain
 		return NULL;
 
 	switch (info->type) {
-	case X86_IRQ_ALLOC_TYPE_MSI:
-	case X86_IRQ_ALLOC_TYPE_MSIX:
+	case X86_IRQ_ALLOC_TYPE_PCI_MSI:
+	case X86_IRQ_ALLOC_TYPE_PCI_MSIX:
 		devid = get_device_id(&info->msi_dev->dev);
 		if (devid < 0)
 			return NULL;
@@ -3615,8 +3615,8 @@ static void irq_remapping_prepare_irte(s
 		break;
 
 	case X86_IRQ_ALLOC_TYPE_HPET:
-	case X86_IRQ_ALLOC_TYPE_MSI:
-	case X86_IRQ_ALLOC_TYPE_MSIX:
+	case X86_IRQ_ALLOC_TYPE_PCI_MSI:
+	case X86_IRQ_ALLOC_TYPE_PCI_MSIX:
 		msg->address_hi = MSI_ADDR_BASE_HI;
 		msg->address_lo = MSI_ADDR_BASE_LO;
 		msg->data = irte_info->index;
@@ -3660,15 +3660,15 @@ static int irq_remapping_alloc(struct ir
 
 	if (!info)
 		return -EINVAL;
-	if (nr_irqs > 1 && info->type != X86_IRQ_ALLOC_TYPE_MSI &&
-	    info->type != X86_IRQ_ALLOC_TYPE_MSIX)
+	if (nr_irqs > 1 && info->type != X86_IRQ_ALLOC_TYPE_PCI_MSI &&
+	    info->type != X86_IRQ_ALLOC_TYPE_PCI_MSIX)
 		return -EINVAL;
 
 	/*
 	 * With IRQ remapping enabled, don't need contiguous CPU vectors
 	 * to support multiple MSI interrupts.
 	 */
-	if (info->type == X86_IRQ_ALLOC_TYPE_MSI)
+	if (info->type == X86_IRQ_ALLOC_TYPE_PCI_MSI)
 		info->flags &= ~X86_IRQ_ALLOC_CONTIGUOUS_VECTORS;
 
 	devid = get_devid(info);
@@ -3700,9 +3700,9 @@ static int irq_remapping_alloc(struct ir
 		} else {
 			index = -ENOMEM;
 		}
-	} else if (info->type == X86_IRQ_ALLOC_TYPE_MSI ||
-		   info->type == X86_IRQ_ALLOC_TYPE_MSIX) {
-		bool align = (info->type == X86_IRQ_ALLOC_TYPE_MSI);
+	} else if (info->type == X86_IRQ_ALLOC_TYPE_PCI_MSI ||
+		   info->type == X86_IRQ_ALLOC_TYPE_PCI_MSIX) {
+		bool align = (info->type == X86_IRQ_ALLOC_TYPE_PCI_MSI);
 
 		index = alloc_irq_index(devid, nr_irqs, align, info->msi_dev);
 	} else {
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -1115,8 +1115,8 @@ static struct irq_domain *intel_get_ir_i
 	case X86_IRQ_ALLOC_TYPE_HPET:
 		iommu = map_hpet_to_ir(info->hpet_id);
 		break;
-	case X86_IRQ_ALLOC_TYPE_MSI:
-	case X86_IRQ_ALLOC_TYPE_MSIX:
+	case X86_IRQ_ALLOC_TYPE_PCI_MSI:
+	case X86_IRQ_ALLOC_TYPE_PCI_MSIX:
 		iommu = map_dev_to_ir(info->msi_dev);
 		break;
 	default:
@@ -1135,8 +1135,8 @@ static struct irq_domain *intel_get_irq_
 		return NULL;
 
 	switch (info->type) {
-	case X86_IRQ_ALLOC_TYPE_MSI:
-	case X86_IRQ_ALLOC_TYPE_MSIX:
+	case X86_IRQ_ALLOC_TYPE_PCI_MSI:
+	case X86_IRQ_ALLOC_TYPE_PCI_MSIX:
 		iommu = map_dev_to_ir(info->msi_dev);
 		if (iommu)
 			return iommu->ir_msi_domain;
@@ -1306,8 +1306,8 @@ static void intel_irq_remapping_prepare_
 		break;
 
 	case X86_IRQ_ALLOC_TYPE_HPET:
-	case X86_IRQ_ALLOC_TYPE_MSI:
-	case X86_IRQ_ALLOC_TYPE_MSIX:
+	case X86_IRQ_ALLOC_TYPE_PCI_MSI:
+	case X86_IRQ_ALLOC_TYPE_PCI_MSIX:
 		if (info->type == X86_IRQ_ALLOC_TYPE_HPET)
 			set_hpet_sid(irte, info->hpet_id);
 		else
@@ -1362,15 +1362,15 @@ static int intel_irq_remapping_alloc(str
 
 	if (!info || !iommu)
 		return -EINVAL;
-	if (nr_irqs > 1 && info->type != X86_IRQ_ALLOC_TYPE_MSI &&
-	    info->type != X86_IRQ_ALLOC_TYPE_MSIX)
+	if (nr_irqs > 1 && info->type != X86_IRQ_ALLOC_TYPE_PCI_MSI &&
+	    info->type != X86_IRQ_ALLOC_TYPE_PCI_MSIX)
 		return -EINVAL;
 
 	/*
 	 * With IRQ remapping enabled, don't need contiguous CPU vectors
 	 * to support multiple MSI interrupts.
 	 */
-	if (info->type == X86_IRQ_ALLOC_TYPE_MSI)
+	if (info->type == X86_IRQ_ALLOC_TYPE_PCI_MSI)
 		info->flags &= ~X86_IRQ_ALLOC_CONTIGUOUS_VECTORS;
 
 	ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg);


* [patch RFC 04/38] x86/irq: Add allocation type for parent domain retrieval
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (2 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 03/38] x86/irq: Rename X86_IRQ_ALLOC_TYPE_MSI* to reflect PCI dependency Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 05/38] iommu/vt-d: Consolidate irq domain getter Thomas Gleixner
                   ` (34 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: x86-irq--Add-allocation-flag-for-domain-retrieval.patch --]
[-- Type: text/plain, Size: 3764 bytes --]

irq_remapping_ir_irq_domain() is used to retrieve the remapping parent
domain for an allocation type. irq_remapping_irq_domain() is for retrieving
the actual device domain for allocating interrupts for a device.

The two functions are similar and can be unified by using explicit modes
for parent irq domain retrieval.

Add X86_IRQ_ALLOC_TYPE_IOAPIC/HPET_GET_PARENT and use it in the iommu
implementations. Drop the parent domain retrieval for PCI_MSI/X as that is
unused.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: x86@kernel.org
Cc: linux-hyperv@vger.kernel.org
Cc: iommu@lists.linux-foundation.org
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jon Derrick <jonathan.derrick@intel.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
---
 arch/x86/include/asm/hw_irq.h       |    2 ++
 arch/x86/kernel/apic/io_apic.c      |    2 +-
 arch/x86/kernel/apic/msi.c          |    2 +-
 drivers/iommu/amd/iommu.c           |    8 ++++++++
 drivers/iommu/hyperv-iommu.c        |    2 +-
 drivers/iommu/intel/irq_remapping.c |    8 ++------
 6 files changed, 15 insertions(+), 9 deletions(-)

--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -40,6 +40,8 @@ enum irq_alloc_type {
 	X86_IRQ_ALLOC_TYPE_PCI_MSIX,
 	X86_IRQ_ALLOC_TYPE_DMAR,
 	X86_IRQ_ALLOC_TYPE_UV,
+	X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT,
+	X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT,
 };
 
 struct irq_alloc_info {
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2296,7 +2296,7 @@ static int mp_irqdomain_create(int ioapi
 		return 0;
 
 	init_irq_alloc_info(&info, NULL);
-	info.type = X86_IRQ_ALLOC_TYPE_IOAPIC;
+	info.type = X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT;
 	info.ioapic_id = mpc_ioapic_id(ioapic);
 	parent = irq_remapping_get_ir_irq_domain(&info);
 	if (!parent)
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -476,7 +476,7 @@ struct irq_domain *hpet_create_irq_domai
 	domain_info->data = (void *)(long)hpet_id;
 
 	init_irq_alloc_info(&info, NULL);
-	info.type = X86_IRQ_ALLOC_TYPE_HPET;
+	info.type = X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT;
 	info.hpet_id = hpet_id;
 	parent = irq_remapping_get_ir_irq_domain(&info);
 	if (parent == NULL)
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3534,6 +3534,14 @@ static struct irq_domain *get_ir_irq_dom
 	if (!info)
 		return NULL;
 
+	switch (info->type) {
+	case X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT:
+	case X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT:
+		break;
+	default:
+		return NULL;
+	}
+
 	devid = get_devid(info);
 	if (devid >= 0) {
 		iommu = amd_iommu_rlookup_table[devid];
--- a/drivers/iommu/hyperv-iommu.c
+++ b/drivers/iommu/hyperv-iommu.c
@@ -184,7 +184,7 @@ static int __init hyperv_enable_irq_rema
 
 static struct irq_domain *hyperv_get_ir_irq_domain(struct irq_alloc_info *info)
 {
-	if (info->type == X86_IRQ_ALLOC_TYPE_IOAPIC)
+	if (info->type == X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT)
 		return ioapic_ir_domain;
 	else
 		return NULL;
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -1109,16 +1109,12 @@ static struct irq_domain *intel_get_ir_i
 		return NULL;
 
 	switch (info->type) {
-	case X86_IRQ_ALLOC_TYPE_IOAPIC:
+	case X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT:
 		iommu = map_ioapic_to_ir(info->ioapic_id);
 		break;
-	case X86_IRQ_ALLOC_TYPE_HPET:
+	case X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT:
 		iommu = map_hpet_to_ir(info->hpet_id);
 		break;
-	case X86_IRQ_ALLOC_TYPE_PCI_MSI:
-	case X86_IRQ_ALLOC_TYPE_PCI_MSIX:
-		iommu = map_dev_to_ir(info->msi_dev);
-		break;
 	default:
 		BUG_ON(1);
 		break;
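
The idea of encoding the lookup mode in the request type can be shown with
a small standalone model. It is a sketch under assumed names (the toy_*
identifiers and the two static domains are hypothetical, and the enum
values are illustrative): one getter serves both parent-domain and
device-domain requests, and plain allocation types are rejected as lookups.

```c
#include <stddef.h>

/* Hypothetical mirror of the request types; values are illustrative. */
enum toy_alloc_type {
	TOY_TYPE_IOAPIC = 1,
	TOY_TYPE_PCI_MSI,
	TOY_TYPE_IOAPIC_GET_PARENT,
	TOY_TYPE_HPET_GET_PARENT,
};

struct toy_irq_domain { const char *name; };
struct toy_alloc_info { enum toy_alloc_type type; };

static struct toy_irq_domain toy_ir_domain     = { "IR-parent" };
static struct toy_irq_domain toy_ir_msi_domain = { "IR-PCI-MSI" };

/* Single getter: the request type selects which domain is returned. */
static struct toy_irq_domain *
toy_get_irq_domain(const struct toy_alloc_info *info)
{
	if (!info)
		return NULL;

	switch (info->type) {
	case TOY_TYPE_IOAPIC_GET_PARENT:
	case TOY_TYPE_HPET_GET_PARENT:
		return &toy_ir_domain;		/* remapping parent domain */
	case TOY_TYPE_PCI_MSI:
		return &toy_ir_msi_domain;	/* device-facing MSI domain */
	default:
		return NULL;			/* not a lookup request */
	}
}
```

Because the mode is part of the type, a caller that wants the parent domain
cannot accidentally receive the device domain, and vice versa.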


* [patch RFC 05/38] iommu/vt-d: Consolidate irq domain getter
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (3 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 04/38] x86/irq: Add allocation type for parent domain retrieval Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 06/38] iommu/amd: " Thomas Gleixner
                   ` (33 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: iommu-vt-d--Consolidate-irq-domain-getter.patch --]
[-- Type: text/plain, Size: 3932 bytes --]

The irq domain request mode is now indicated in irq_alloc_info::type.

Consolidate the two getter functions into one.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
Cc: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel/irq_remapping.c |   67 ++++++++++++------------------------
 1 file changed, 24 insertions(+), 43 deletions(-)

--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -204,35 +204,40 @@ static int modify_irte(struct irq_2_iomm
 	return rc;
 }
 
-static struct intel_iommu *map_hpet_to_ir(u8 hpet_id)
+static struct irq_domain *map_hpet_to_ir(u8 hpet_id)
 {
 	int i;
 
-	for (i = 0; i < MAX_HPET_TBS; i++)
+	for (i = 0; i < MAX_HPET_TBS; i++) {
 		if (ir_hpet[i].id == hpet_id && ir_hpet[i].iommu)
-			return ir_hpet[i].iommu;
+			return ir_hpet[i].iommu->ir_domain;
+	}
 	return NULL;
 }
 
-static struct intel_iommu *map_ioapic_to_ir(int apic)
+static struct intel_iommu *map_ioapic_to_iommu(int apic)
 {
 	int i;
 
-	for (i = 0; i < MAX_IO_APICS; i++)
+	for (i = 0; i < MAX_IO_APICS; i++) {
 		if (ir_ioapic[i].id == apic && ir_ioapic[i].iommu)
 			return ir_ioapic[i].iommu;
+	}
 	return NULL;
 }
 
-static struct intel_iommu *map_dev_to_ir(struct pci_dev *dev)
+static struct irq_domain *map_ioapic_to_ir(int apic)
 {
-	struct dmar_drhd_unit *drhd;
+	struct intel_iommu *iommu = map_ioapic_to_iommu(apic);
 
-	drhd = dmar_find_matched_drhd_unit(dev);
-	if (!drhd)
-		return NULL;
+	return iommu ? iommu->ir_domain : NULL;
+}
+
+static struct irq_domain *map_dev_to_ir(struct pci_dev *dev)
+{
+	struct dmar_drhd_unit *drhd = dmar_find_matched_drhd_unit(dev);
 
-	return drhd->iommu;
+	return drhd ? drhd->iommu->ir_msi_domain : NULL;
 }
 
 static int clear_entries(struct irq_2_iommu *irq_iommu)
@@ -996,7 +1001,7 @@ static int __init parse_ioapics_under_ir
 
 	for (ioapic_idx = 0; ioapic_idx < nr_ioapics; ioapic_idx++) {
 		int ioapic_id = mpc_ioapic_id(ioapic_idx);
-		if (!map_ioapic_to_ir(ioapic_id)) {
+		if (!map_ioapic_to_iommu(ioapic_id)) {
 			pr_err(FW_BUG "ioapic %d has no mapping iommu, "
 			       "interrupt remapping will be disabled\n",
 			       ioapic_id);
@@ -1101,47 +1106,23 @@ static void prepare_irte(struct irte *ir
 	irte->redir_hint = 1;
 }
 
-static struct irq_domain *intel_get_ir_irq_domain(struct irq_alloc_info *info)
+static struct irq_domain *intel_get_irq_domain(struct irq_alloc_info *info)
 {
-	struct intel_iommu *iommu = NULL;
-
 	if (!info)
 		return NULL;
 
 	switch (info->type) {
 	case X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT:
-		iommu = map_ioapic_to_ir(info->ioapic_id);
-		break;
+		return map_ioapic_to_ir(info->ioapic_id);
 	case X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT:
-		iommu = map_hpet_to_ir(info->hpet_id);
-		break;
-	default:
-		BUG_ON(1);
-		break;
-	}
-
-	return iommu ? iommu->ir_domain : NULL;
-}
-
-static struct irq_domain *intel_get_irq_domain(struct irq_alloc_info *info)
-{
-	struct intel_iommu *iommu;
-
-	if (!info)
-		return NULL;
-
-	switch (info->type) {
+		return map_hpet_to_ir(info->hpet_id);
 	case X86_IRQ_ALLOC_TYPE_PCI_MSI:
 	case X86_IRQ_ALLOC_TYPE_PCI_MSIX:
-		iommu = map_dev_to_ir(info->msi_dev);
-		if (iommu)
-			return iommu->ir_msi_domain;
-		break;
+		return map_dev_to_ir(info->msi_dev);
 	default:
-		break;
+		WARN_ON_ONCE(1);
+		return NULL;
 	}
-
-	return NULL;
 }
 
 struct irq_remap_ops intel_irq_remap_ops = {
@@ -1150,7 +1131,7 @@ struct irq_remap_ops intel_irq_remap_ops
 	.disable		= disable_irq_remapping,
 	.reenable		= reenable_irq_remapping,
 	.enable_faulting	= enable_drhd_fault_handling,
-	.get_ir_irq_domain	= intel_get_ir_irq_domain,
+	.get_ir_irq_domain	= intel_get_irq_domain,
 	.get_irq_domain		= intel_get_irq_domain,
 };
 

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


* [patch RFC 06/38] iommu/amd: Consolidate irq domain getter
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (4 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 05/38] iommu/vt-d: Consolidate irq domain getter Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 07/38] iommu/irq_remapping: Consolidate irq domain lookup Thomas Gleixner
                   ` (32 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: iommu-amd--Condolidate-irq-domain-getter.patch --]
[-- Type: text/plain, Size: 3106 bytes --]

The irq domain request mode is now indicated in irq_alloc_info::type.

Consolidate the two getter functions into one.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
---
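The consolidation described above can be illustrated outside the kernel. The following is only a rough sketch under simplifying assumptions: the names `alloc_type`, `alloc_info`, `domain`, `iommu`, `rlookup` and `get_domain` are hypothetical stand-ins, not the kernel definitions. It shows one getter that resolves a device id and then picks the domain based on the request type.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum alloc_type {
	ALLOC_IOAPIC_GET_PARENT,
	ALLOC_HPET_GET_PARENT,
	ALLOC_PCI_MSI,
	ALLOC_UNKNOWN,
};

struct domain { const char *name; };

struct iommu {
	struct domain ir_domain;	/* parent domain for IOAPIC/HPET */
	struct domain msi_domain;	/* domain for PCI MSI/MSI-X */
};

struct alloc_info {
	enum alloc_type type;
	int id;				/* IOAPIC, HPET or device id */
};

#define MAX_DEVID 4
static struct iommu iommu0 = { { "ir" }, { "msi" } };
/* Reverse lookup table: devid -> IOMMU, NULL when nothing is mapped. */
static struct iommu *rlookup[MAX_DEVID] = { &iommu0, NULL, &iommu0, NULL };

/* Resolve the device id; unknown types and out of range ids yield -1. */
static int get_devid(const struct alloc_info *info)
{
	switch (info->type) {
	case ALLOC_IOAPIC_GET_PARENT:
	case ALLOC_HPET_GET_PARENT:
	case ALLOC_PCI_MSI:
		return (info->id >= 0 && info->id < MAX_DEVID) ? info->id : -1;
	default:
		return -1;
	}
}

/* Single getter: the request type selects which domain is returned. */
static struct domain *get_domain(const struct alloc_info *info)
{
	struct iommu *iommu;
	int devid;

	if (!info)
		return NULL;
	devid = get_devid(info);
	if (devid < 0)
		return NULL;
	iommu = rlookup[devid];
	if (!iommu)
		return NULL;

	switch (info->type) {
	case ALLOC_IOAPIC_GET_PARENT:
	case ALLOC_HPET_GET_PARENT:
		return &iommu->ir_domain;
	case ALLOC_PCI_MSI:
		return &iommu->msi_domain;
	default:
		return NULL;
	}
}
```

Callers no longer need to know which of two getters applies; they set the type and call the one function.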
 drivers/iommu/amd/iommu.c |   65 ++++++++++++++--------------------------------
 1 file changed, 21 insertions(+), 44 deletions(-)

--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3505,77 +3505,54 @@ static void irte_ga_clear_allocated(stru
 
 static int get_devid(struct irq_alloc_info *info)
 {
-	int devid = -1;
-
 	switch (info->type) {
 	case X86_IRQ_ALLOC_TYPE_IOAPIC:
-		devid     = get_ioapic_devid(info->ioapic_id);
-		break;
+	case X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT:
+		return get_ioapic_devid(info->ioapic_id);
 	case X86_IRQ_ALLOC_TYPE_HPET:
-		devid     = get_hpet_devid(info->hpet_id);
-		break;
+	case X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT:
+		return get_hpet_devid(info->hpet_id);
 	case X86_IRQ_ALLOC_TYPE_PCI_MSI:
 	case X86_IRQ_ALLOC_TYPE_PCI_MSIX:
-		devid = get_device_id(&info->msi_dev->dev);
-		break;
+		return get_device_id(&info->msi_dev->dev);
 	default:
-		BUG_ON(1);
-		break;
+		WARN_ON_ONCE(1);
+		return -1;
 	}
-
-	return devid;
 }
 
-static struct irq_domain *get_ir_irq_domain(struct irq_alloc_info *info)
+static struct irq_domain *get_irq_domain_for_devid(struct irq_alloc_info *info,
+						   int devid)
 {
-	struct amd_iommu *iommu;
-	int devid;
+	struct amd_iommu *iommu = amd_iommu_rlookup_table[devid];
 
-	if (!info)
+	if (!iommu)
 		return NULL;
 
 	switch (info->type) {
 	case X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT:
 	case X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT:
-		break;
+		return iommu->ir_domain;
+	case X86_IRQ_ALLOC_TYPE_PCI_MSI:
+	case X86_IRQ_ALLOC_TYPE_PCI_MSIX:
+		return iommu->msi_domain;
 	default:
+		WARN_ON_ONCE(1);
 		return NULL;
 	}
-
-	devid = get_devid(info);
-	if (devid >= 0) {
-		iommu = amd_iommu_rlookup_table[devid];
-		if (iommu)
-			return iommu->ir_domain;
-	}
-
-	return NULL;
 }
 
 static struct irq_domain *get_irq_domain(struct irq_alloc_info *info)
 {
-	struct amd_iommu *iommu;
 	int devid;
 
 	if (!info)
 		return NULL;
 
-	switch (info->type) {
-	case X86_IRQ_ALLOC_TYPE_PCI_MSI:
-	case X86_IRQ_ALLOC_TYPE_PCI_MSIX:
-		devid = get_device_id(&info->msi_dev->dev);
-		if (devid < 0)
-			return NULL;
-
-		iommu = amd_iommu_rlookup_table[devid];
-		if (iommu)
-			return iommu->msi_domain;
-		break;
-	default:
-		break;
-	}
-
-	return NULL;
+	devid = get_devid(info);
+	if (devid < 0)
+		return NULL;
+	return get_irq_domain_for_devid(info, devid);
 }
 
 struct irq_remap_ops amd_iommu_irq_ops = {
@@ -3584,7 +3561,7 @@ struct irq_remap_ops amd_iommu_irq_ops =
 	.disable		= amd_iommu_disable,
 	.reenable		= amd_iommu_reenable,
 	.enable_faulting	= amd_iommu_enable_faulting,
-	.get_ir_irq_domain	= get_ir_irq_domain,
+	.get_ir_irq_domain	= get_irq_domain,
 	.get_irq_domain		= get_irq_domain,
 };
 


* [patch RFC 07/38] iommu/irq_remapping: Consolidate irq domain lookup
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (5 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 06/38] iommu/amd: " Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 08/38] x86/irq: Prepare consolidation of irq_alloc_info Thomas Gleixner
                   ` (31 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: iommu-irq_remapping--Consolidate-irq-domain-lookup.patch --]
[-- Type: text/plain, Size: 5900 bytes --]

Now that the iommu implementations handle the
X86_IRQ_ALLOC_TYPE_*_GET_PARENT types, consolidate the two getter functions.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: linux-hyperv@vger.kernel.org
Cc: iommu@lists.linux-foundation.org
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jon Derrick <jonathan.derrick@intel.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
---
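As a rough illustration of the single lookup path, here is a minimal userspace sketch. It assumes hypothetical stand-in types (`domain`, `alloc_info`, `remap_ops`) and a hypothetical backend `demo_get_irq_domain`; none of these are the kernel definitions. It shows the NULL-guarded wrapper around one callback and the caller falling back to a default parent domain.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real kernel structures carry far more state. */
struct domain { const char *name; };
struct alloc_info { int type; };

struct remap_ops {
	/* Single lookup callback; a backend may leave it NULL. */
	struct domain *(*get_irq_domain)(struct alloc_info *info);
};

/* Stays NULL until a remapping backend registers its ops. */
static struct remap_ops *remap_ops;

/* One entry point for all request types; NULL means "no remapping". */
static struct domain *remapping_get_irq_domain(struct alloc_info *info)
{
	if (!remap_ops || !remap_ops->get_irq_domain)
		return NULL;
	return remap_ops->get_irq_domain(info);
}

static struct domain vector_domain = { "vector" };
static struct domain remap_domain = { "remap" };

/* Hypothetical backend: serves only requests of type 1. */
static struct domain *demo_get_irq_domain(struct alloc_info *info)
{
	return info->type == 1 ? &remap_domain : NULL;
}

static struct remap_ops demo_ops = { .get_irq_domain = demo_get_irq_domain };

/* Caller side: fall back to the default parent when lookup fails. */
static struct domain *pick_parent(struct alloc_info *info)
{
	struct domain *parent = remapping_get_irq_domain(info);

	return parent ? parent : &vector_domain;
}
```

With `remap_ops` unset, `pick_parent()` returns the default domain; once `remap_ops = &demo_ops` is set, requests of type 1 get the remapping domain instead.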
 arch/x86/include/asm/irq_remapping.h |    8 --------
 arch/x86/kernel/apic/io_apic.c       |    2 +-
 arch/x86/kernel/apic/msi.c           |    2 +-
 drivers/iommu/amd/iommu.c            |    1 -
 drivers/iommu/hyperv-iommu.c         |    4 ++--
 drivers/iommu/intel/irq_remapping.c  |    1 -
 drivers/iommu/irq_remapping.c        |   23 +----------------------
 drivers/iommu/irq_remapping.h        |    5 +----
 8 files changed, 6 insertions(+), 40 deletions(-)

--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -45,8 +45,6 @@ extern int irq_remap_enable_fault_handli
 extern void panic_if_irq_remap(const char *msg);
 
 extern struct irq_domain *
-irq_remapping_get_ir_irq_domain(struct irq_alloc_info *info);
-extern struct irq_domain *
 irq_remapping_get_irq_domain(struct irq_alloc_info *info);
 
 /* Create PCI MSI/MSIx irqdomain, use @parent as the parent irqdomain. */
@@ -74,12 +72,6 @@ static inline void panic_if_irq_remap(co
 }
 
 static inline struct irq_domain *
-irq_remapping_get_ir_irq_domain(struct irq_alloc_info *info)
-{
-	return NULL;
-}
-
-static inline struct irq_domain *
 irq_remapping_get_irq_domain(struct irq_alloc_info *info)
 {
 	return NULL;
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2298,7 +2298,7 @@ static int mp_irqdomain_create(int ioapi
 	init_irq_alloc_info(&info, NULL);
 	info.type = X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT;
 	info.ioapic_id = mpc_ioapic_id(ioapic);
-	parent = irq_remapping_get_ir_irq_domain(&info);
+	parent = irq_remapping_get_irq_domain(&info);
 	if (!parent)
 		parent = x86_vector_domain;
 	else
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -478,7 +478,7 @@ struct irq_domain *hpet_create_irq_domai
 	init_irq_alloc_info(&info, NULL);
 	info.type = X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT;
 	info.hpet_id = hpet_id;
-	parent = irq_remapping_get_ir_irq_domain(&info);
+	parent = irq_remapping_get_irq_domain(&info);
 	if (parent == NULL)
 		parent = x86_vector_domain;
 	else
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3561,7 +3561,6 @@ struct irq_remap_ops amd_iommu_irq_ops =
 	.disable		= amd_iommu_disable,
 	.reenable		= amd_iommu_reenable,
 	.enable_faulting	= amd_iommu_enable_faulting,
-	.get_ir_irq_domain	= get_irq_domain,
 	.get_irq_domain		= get_irq_domain,
 };
 
--- a/drivers/iommu/hyperv-iommu.c
+++ b/drivers/iommu/hyperv-iommu.c
@@ -182,7 +182,7 @@ static int __init hyperv_enable_irq_rema
 	return IRQ_REMAP_X2APIC_MODE;
 }
 
-static struct irq_domain *hyperv_get_ir_irq_domain(struct irq_alloc_info *info)
+static struct irq_domain *hyperv_get_irq_domain(struct irq_alloc_info *info)
 {
 	if (info->type == X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT)
 		return ioapic_ir_domain;
@@ -193,7 +193,7 @@ static struct irq_domain *hyperv_get_ir_
 struct irq_remap_ops hyperv_irq_remap_ops = {
 	.prepare		= hyperv_prepare_irq_remapping,
 	.enable			= hyperv_enable_irq_remapping,
-	.get_ir_irq_domain	= hyperv_get_ir_irq_domain,
+	.get_irq_domain		= hyperv_get_irq_domain,
 };
 
 #endif
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -1131,7 +1131,6 @@ struct irq_remap_ops intel_irq_remap_ops
 	.disable		= disable_irq_remapping,
 	.reenable		= reenable_irq_remapping,
 	.enable_faulting	= enable_drhd_fault_handling,
-	.get_ir_irq_domain	= intel_get_irq_domain,
 	.get_irq_domain		= intel_get_irq_domain,
 };
 
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -160,33 +160,12 @@ void panic_if_irq_remap(const char *msg)
 }
 
 /**
- * irq_remapping_get_ir_irq_domain - Get the irqdomain associated with the IOMMU
- *				     device serving request @info
- * @info: interrupt allocation information, used to identify the IOMMU device
- *
- * It's used to get parent irqdomain for HPET and IOAPIC irqdomains.
- * Returns pointer to IRQ domain, or NULL on failure.
- */
-struct irq_domain *
-irq_remapping_get_ir_irq_domain(struct irq_alloc_info *info)
-{
-	if (!remap_ops || !remap_ops->get_ir_irq_domain)
-		return NULL;
-
-	return remap_ops->get_ir_irq_domain(info);
-}
-
-/**
  * irq_remapping_get_irq_domain - Get the irqdomain serving the request @info
  * @info: interrupt allocation information, used to identify the IOMMU device
  *
- * There will be one PCI MSI/MSIX irqdomain associated with each interrupt
- * remapping device, so this interface is used to retrieve the PCI MSI/MSIX
- * irqdomain serving request @info.
  * Returns pointer to IRQ domain, or NULL on failure.
  */
-struct irq_domain *
-irq_remapping_get_irq_domain(struct irq_alloc_info *info)
+struct irq_domain *irq_remapping_get_irq_domain(struct irq_alloc_info *info)
 {
 	if (!remap_ops || !remap_ops->get_irq_domain)
 		return NULL;
--- a/drivers/iommu/irq_remapping.h
+++ b/drivers/iommu/irq_remapping.h
@@ -43,10 +43,7 @@ struct irq_remap_ops {
 	/* Enable fault handling */
 	int  (*enable_faulting)(void);
 
-	/* Get the irqdomain associated the IOMMU device */
-	struct irq_domain *(*get_ir_irq_domain)(struct irq_alloc_info *);
-
-	/* Get the MSI irqdomain associated with the IOMMU device */
+	/* Get the irqdomain associated to IOMMU device */
 	struct irq_domain *(*get_irq_domain)(struct irq_alloc_info *);
 };
 


* [patch RFC 08/38] x86/irq: Prepare consolidation of irq_alloc_info
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (6 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 07/38] iommu/irq_remapping: Consolidate irq domain lookup Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 09/38] x86/msi: Consolidate HPET allocation Thomas Gleixner
                   ` (30 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: x86-irq--Prepare-consolidation-of-irq_alloc_info.patch --]
[-- Type: text/plain, Size: 1771 bytes --]

struct irq_alloc_info is a horrible zoo of unnamed structs in a union. Many
of the struct fields can be generic and don't have to be type specific like
hpet_id, ioapic_id...

Provide a generic set of members to prepare for the consolidation. The goal
is to make irq_alloc_info have the same basic members as the generic
msi_alloc_info so generic MSI domain ops can be reused and yet more mess
can be avoided when (non-PCI) device MSI support comes along.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
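To make the intent concrete, a small userspace sketch follows. The layout is a simplified assumption, not the real struct irq_alloc_info, and the helpers `generic_get_hwirq` and `fill_hpet` are hypothetical. It shows generic members placed ahead of the type-specific union, so one accessor works for every allocation type.

```c
#include <stddef.h>
#include <stdint.h>

enum alloc_type { ALLOC_HPET, ALLOC_PCI_MSI };

/*
 * Hypothetical, simplified layout: generic members first, type-specific
 * leftovers in the anonymous union (C11).
 */
struct alloc_info {
	enum alloc_type type;
	uint32_t devid;		/* device id, meaning depends on type */
	unsigned long hwirq;	/* hardware interrupt number in the domain */
	void *data;		/* allocation specific data */
	union {
		int unused;
		struct { int bus; } pci;
	};
};

/* One accessor serves every allocation type. */
static unsigned long generic_get_hwirq(const struct alloc_info *info)
{
	return info->hwirq;
}

/* An HPET style request needs nothing beyond the generic members. */
static void fill_hpet(struct alloc_info *info, uint32_t hpet_id,
		      unsigned long channel, void *priv)
{
	*info = (struct alloc_info){ .type = ALLOC_HPET };
	info->devid = hpet_id;
	info->hwirq = channel;
	info->data = priv;
}
```

Because the accessor never looks into the union, it does not have to switch on the allocation type.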
 arch/x86/include/asm/hw_irq.h |   22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -44,10 +44,25 @@ enum irq_alloc_type {
 	X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT,
 };
 
+/**
+ * irq_alloc_info - X86 specific interrupt allocation info
+ * @type:	X86 specific allocation type
+ * @flags:	Flags for allocation tweaks
+ * @devid:	Device ID for allocations
+ * @hwirq:	Associated hw interrupt number in the domain
+ * @mask:	CPU mask for vector allocation
+ * @desc:	Pointer to msi descriptor
+ * @data:	Allocation specific data
+ */
 struct irq_alloc_info {
 	enum irq_alloc_type	type;
 	u32			flags;
-	const struct cpumask	*mask;	/* CPU mask for vector allocation */
+	u32			devid;
+	irq_hw_number_t		hwirq;
+	const struct cpumask	*mask;
+	struct msi_desc		*desc;
+	void			*data;
+
 	union {
 		int		unused;
 #ifdef	CONFIG_HPET_TIMER
@@ -88,11 +103,6 @@ struct irq_alloc_info {
 			char		*uv_name;
 		};
 #endif
-#if IS_ENABLED(CONFIG_VMD)
-		struct {
-			struct msi_desc *desc;
-		};
-#endif
 	};
 };
 


* [patch RFC 09/38] x86/msi: Consolidate HPET allocation
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (7 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 08/38] x86/irq: Prepare consolidation of irq_alloc_info Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 10/38] x86/ioapic: Consolidate IOAPIC allocation Thomas Gleixner
                   ` (29 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: x86-msi--Consolidate-HPET-allocation.patch --]
[-- Type: text/plain, Size: 3606 bytes --]

None of the magic HPET fields are required. The generic devid, hwirq and
data members of irq_alloc_info carry the same information.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
Cc: Lu Baolu <baolu.lu@linux.intel.com>
---
 arch/x86/include/asm/hw_irq.h       |    7 -------
 arch/x86/kernel/apic/msi.c          |   14 +++++++-------
 drivers/iommu/amd/iommu.c           |    2 +-
 drivers/iommu/intel/irq_remapping.c |    4 ++--
 4 files changed, 10 insertions(+), 17 deletions(-)

--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -65,13 +65,6 @@ struct irq_alloc_info {
 
 	union {
 		int		unused;
-#ifdef	CONFIG_HPET_TIMER
-		struct {
-			int		hpet_id;
-			int		hpet_index;
-			void		*hpet_data;
-		};
-#endif
 #ifdef	CONFIG_PCI_MSI
 		struct {
 			struct pci_dev	*msi_dev;
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -427,7 +427,7 @@ static struct irq_chip hpet_msi_controll
 static irq_hw_number_t hpet_msi_get_hwirq(struct msi_domain_info *info,
 					  msi_alloc_info_t *arg)
 {
-	return arg->hpet_index;
+	return arg->hwirq;
 }
 
 static int hpet_msi_init(struct irq_domain *domain,
@@ -435,8 +435,8 @@ static int hpet_msi_init(struct irq_doma
 			 irq_hw_number_t hwirq, msi_alloc_info_t *arg)
 {
 	irq_set_status_flags(virq, IRQ_MOVE_PCNTXT);
-	irq_domain_set_info(domain, virq, arg->hpet_index, info->chip, NULL,
-			    handle_edge_irq, arg->hpet_data, "edge");
+	irq_domain_set_info(domain, virq, arg->hwirq, info->chip, NULL,
+			    handle_edge_irq, arg->data, "edge");
 
 	return 0;
 }
@@ -477,7 +477,7 @@ struct irq_domain *hpet_create_irq_domai
 
 	init_irq_alloc_info(&info, NULL);
 	info.type = X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT;
-	info.hpet_id = hpet_id;
+	info.devid = hpet_id;
 	parent = irq_remapping_get_irq_domain(&info);
 	if (parent == NULL)
 		parent = x86_vector_domain;
@@ -506,9 +506,9 @@ int hpet_assign_irq(struct irq_domain *d
 
 	init_irq_alloc_info(&info, NULL);
 	info.type = X86_IRQ_ALLOC_TYPE_HPET;
-	info.hpet_data = hc;
-	info.hpet_id = hpet_dev_id(domain);
-	info.hpet_index = dev_num;
+	info.data = hc;
+	info.devid = hpet_dev_id(domain);
+	info.hwirq = dev_num;
 
 	return irq_domain_alloc_irqs(domain, 1, NUMA_NO_NODE, &info);
 }
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3511,7 +3511,7 @@ static int get_devid(struct irq_alloc_in
 		return get_ioapic_devid(info->ioapic_id);
 	case X86_IRQ_ALLOC_TYPE_HPET:
 	case X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT:
-		return get_hpet_devid(info->hpet_id);
+		return get_hpet_devid(info->devid);
 	case X86_IRQ_ALLOC_TYPE_PCI_MSI:
 	case X86_IRQ_ALLOC_TYPE_PCI_MSIX:
 		return get_device_id(&info->msi_dev->dev);
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -1115,7 +1115,7 @@ static struct irq_domain *intel_get_irq_
 	case X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT:
 		return map_ioapic_to_ir(info->ioapic_id);
 	case X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT:
-		return map_hpet_to_ir(info->hpet_id);
+		return map_hpet_to_ir(info->devid);
 	case X86_IRQ_ALLOC_TYPE_PCI_MSI:
 	case X86_IRQ_ALLOC_TYPE_PCI_MSIX:
 		return map_dev_to_ir(info->msi_dev);
@@ -1285,7 +1285,7 @@ static void intel_irq_remapping_prepare_
 	case X86_IRQ_ALLOC_TYPE_PCI_MSI:
 	case X86_IRQ_ALLOC_TYPE_PCI_MSIX:
 		if (info->type == X86_IRQ_ALLOC_TYPE_HPET)
-			set_hpet_sid(irte, info->hpet_id);
+			set_hpet_sid(irte, info->devid);
 		else
 			set_msi_sid(irte, info->msi_dev);
 


* [patch RFC 10/38] x86/ioapic: Consolidate IOAPIC allocation
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (8 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 09/38] x86/msi: Consolidate HPET allocation Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-26  8:40   ` Boqun Feng
  2020-08-21  0:24 ` [patch RFC 11/38] x86/irq: Consolidate DMAR irq allocation Thomas Gleixner
                   ` (28 subsequent siblings)
  38 siblings, 1 reply; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: x86-ioapic--Consolidate-IOAPIC-allocation.patch --]
[-- Type: text/plain, Size: 11954 bytes --]

Move the IOAPIC specific fields into their own struct and reuse the common
devid. Get rid of the #ifdeffery as it does not matter at all whether the
alloc info is a couple of bytes longer or not.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: linux-hyperv@vger.kernel.org
Cc: iommu@lists.linux-foundation.org
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jon Derrick <jonathan.derrick@intel.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
---
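The shape of the change can be sketched in isolation. This is a simplified illustration under stated assumptions: `ioapic_info`, `alloc_info`, `copy_ioapic_attr` and the three constants are hypothetical stand-ins, and the ACPI override lookup done by the real code is left out. It shows a named sub-struct for the IOAPIC attributes, the shared devid, and the fallback to level triggered, active low defaults when no valid source attributes exist.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants standing in for the kernel ones. */
#define NO_NODE		(-1)
#define TRIGGER_LEVEL	1
#define POLARITY_LOW	1

/* IOAPIC specifics live in their own named struct. */
struct ioapic_info {
	int pin;
	int node;
	unsigned int trigger : 1;
	unsigned int polarity : 1;
	unsigned int valid : 1;
};

struct alloc_info {
	uint32_t devid;		/* generic: holds the IOAPIC id here */
	union {
		struct ioapic_info ioapic;
		int unused;
	};
};

/*
 * Copy attributes from @src when they are valid, otherwise fall back to
 * level triggered, active low. The ACPI override lookup is omitted.
 */
static void copy_ioapic_attr(struct alloc_info *dst,
			     const struct alloc_info *src,
			     uint32_t ioapic_id, int pin)
{
	*dst = (struct alloc_info){ 0 };
	dst->devid = ioapic_id;
	dst->ioapic.pin = pin;
	dst->ioapic.valid = 1;
	if (src && src->ioapic.valid) {
		dst->ioapic.node = src->ioapic.node;
		dst->ioapic.trigger = src->ioapic.trigger;
		dst->ioapic.polarity = src->ioapic.polarity;
	} else {
		dst->ioapic.node = NO_NODE;
		dst->ioapic.trigger = TRIGGER_LEVEL;
		dst->ioapic.polarity = POLARITY_LOW;
	}
}
```

Accesses then read `info->ioapic.pin` rather than `info->ioapic_pin`, which keeps the IOAPIC members grouped without any preprocessor guards around them.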
 arch/x86/include/asm/hw_irq.h       |   23 ++++++-----
 arch/x86/kernel/apic/io_apic.c      |   70 ++++++++++++++++++------------------
 drivers/iommu/amd/iommu.c           |   14 +++----
 drivers/iommu/hyperv-iommu.c        |    2 -
 drivers/iommu/intel/irq_remapping.c |   18 ++++-----
 5 files changed, 64 insertions(+), 63 deletions(-)

--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -44,6 +44,15 @@ enum irq_alloc_type {
 	X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT,
 };
 
+struct ioapic_alloc_info {
+	int				pin;
+	int				node;
+	u32				trigger : 1;
+	u32				polarity : 1;
+	u32				valid : 1;
+	struct IO_APIC_route_entry	*entry;
+};
+
 /**
  * irq_alloc_info - X86 specific interrupt allocation info
  * @type:	X86 specific allocation type
@@ -53,6 +62,8 @@ enum irq_alloc_type {
  * @mask:	CPU mask for vector allocation
  * @desc:	Pointer to msi descriptor
  * @data:	Allocation specific data
+ *
+ * @ioapic:	IOAPIC specific allocation data
  */
 struct irq_alloc_info {
 	enum irq_alloc_type	type;
@@ -64,6 +75,7 @@ struct irq_alloc_info {
 	void			*data;
 
 	union {
+		struct ioapic_alloc_info	ioapic;
 		int		unused;
 #ifdef	CONFIG_PCI_MSI
 		struct {
@@ -71,17 +83,6 @@ struct irq_alloc_info {
 			irq_hw_number_t	msi_hwirq;
 		};
 #endif
-#ifdef	CONFIG_X86_IO_APIC
-		struct {
-			int		ioapic_id;
-			int		ioapic_pin;
-			int		ioapic_node;
-			u32		ioapic_trigger : 1;
-			u32		ioapic_polarity : 1;
-			u32		ioapic_valid : 1;
-			struct IO_APIC_route_entry *ioapic_entry;
-		};
-#endif
 #ifdef	CONFIG_DMAR_TABLE
 		struct {
 			int		dmar_id;
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -860,10 +860,10 @@ void ioapic_set_alloc_attr(struct irq_al
 {
 	init_irq_alloc_info(info, NULL);
 	info->type = X86_IRQ_ALLOC_TYPE_IOAPIC;
-	info->ioapic_node = node;
-	info->ioapic_trigger = trigger;
-	info->ioapic_polarity = polarity;
-	info->ioapic_valid = 1;
+	info->ioapic.node = node;
+	info->ioapic.trigger = trigger;
+	info->ioapic.polarity = polarity;
+	info->ioapic.valid = 1;
 }
 
 #ifndef CONFIG_ACPI
@@ -878,32 +878,32 @@ static void ioapic_copy_alloc_attr(struc
 
 	copy_irq_alloc_info(dst, src);
 	dst->type = X86_IRQ_ALLOC_TYPE_IOAPIC;
-	dst->ioapic_id = mpc_ioapic_id(ioapic_idx);
-	dst->ioapic_pin = pin;
-	dst->ioapic_valid = 1;
-	if (src && src->ioapic_valid) {
-		dst->ioapic_node = src->ioapic_node;
-		dst->ioapic_trigger = src->ioapic_trigger;
-		dst->ioapic_polarity = src->ioapic_polarity;
+	dst->devid = mpc_ioapic_id(ioapic_idx);
+	dst->ioapic.pin = pin;
+	dst->ioapic.valid = 1;
+	if (src && src->ioapic.valid) {
+		dst->ioapic.node = src->ioapic.node;
+		dst->ioapic.trigger = src->ioapic.trigger;
+		dst->ioapic.polarity = src->ioapic.polarity;
 	} else {
-		dst->ioapic_node = NUMA_NO_NODE;
+		dst->ioapic.node = NUMA_NO_NODE;
 		if (acpi_get_override_irq(gsi, &trigger, &polarity) >= 0) {
-			dst->ioapic_trigger = trigger;
-			dst->ioapic_polarity = polarity;
+			dst->ioapic.trigger = trigger;
+			dst->ioapic.polarity = polarity;
 		} else {
 			/*
 			 * PCI interrupts are always active low level
 			 * triggered.
 			 */
-			dst->ioapic_trigger = IOAPIC_LEVEL;
-			dst->ioapic_polarity = IOAPIC_POL_LOW;
+			dst->ioapic.trigger = IOAPIC_LEVEL;
+			dst->ioapic.polarity = IOAPIC_POL_LOW;
 		}
 	}
 }
 
 static int ioapic_alloc_attr_node(struct irq_alloc_info *info)
 {
-	return (info && info->ioapic_valid) ? info->ioapic_node : NUMA_NO_NODE;
+	return (info && info->ioapic.valid) ? info->ioapic.node : NUMA_NO_NODE;
 }
 
 static void mp_register_handler(unsigned int irq, unsigned long trigger)
@@ -933,14 +933,14 @@ static bool mp_check_pin_attr(int irq, s
 	 * pin with real trigger and polarity attributes.
 	 */
 	if (irq < nr_legacy_irqs() && data->count == 1) {
-		if (info->ioapic_trigger != data->trigger)
-			mp_register_handler(irq, info->ioapic_trigger);
-		data->entry.trigger = data->trigger = info->ioapic_trigger;
-		data->entry.polarity = data->polarity = info->ioapic_polarity;
+		if (info->ioapic.trigger != data->trigger)
+			mp_register_handler(irq, info->ioapic.trigger);
+		data->entry.trigger = data->trigger = info->ioapic.trigger;
+		data->entry.polarity = data->polarity = info->ioapic.polarity;
 	}
 
-	return data->trigger == info->ioapic_trigger &&
-	       data->polarity == info->ioapic_polarity;
+	return data->trigger == info->ioapic.trigger &&
+	       data->polarity == info->ioapic.polarity;
 }
 
 static int alloc_irq_from_domain(struct irq_domain *domain, int ioapic, u32 gsi,
@@ -1002,7 +1002,7 @@ static int alloc_isa_irq_from_domain(str
 		if (!mp_check_pin_attr(irq, info))
 			return -EBUSY;
 		if (__add_pin_to_irq_node(irq_data->chip_data, node, ioapic,
-					  info->ioapic_pin))
+					  info->ioapic.pin))
 			return -ENOMEM;
 	} else {
 		info->flags |= X86_IRQ_ALLOC_LEGACY;
@@ -2092,8 +2092,8 @@ static int mp_alloc_timer_irq(int ioapic
 		struct irq_alloc_info info;
 
 		ioapic_set_alloc_attr(&info, NUMA_NO_NODE, 0, 0);
-		info.ioapic_id = mpc_ioapic_id(ioapic);
-		info.ioapic_pin = pin;
+		info.devid = mpc_ioapic_id(ioapic);
+		info.ioapic.pin = pin;
 		mutex_lock(&ioapic_mutex);
 		irq = alloc_isa_irq_from_domain(domain, 0, ioapic, pin, &info);
 		mutex_unlock(&ioapic_mutex);
@@ -2297,7 +2297,7 @@ static int mp_irqdomain_create(int ioapi
 
 	init_irq_alloc_info(&info, NULL);
 	info.type = X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT;
-	info.ioapic_id = mpc_ioapic_id(ioapic);
+	info.devid = mpc_ioapic_id(ioapic);
 	parent = irq_remapping_get_irq_domain(&info);
 	if (!parent)
 		parent = x86_vector_domain;
@@ -2932,9 +2932,9 @@ int mp_ioapic_registered(u32 gsi_base)
 static void mp_irqdomain_get_attr(u32 gsi, struct mp_chip_data *data,
 				  struct irq_alloc_info *info)
 {
-	if (info && info->ioapic_valid) {
-		data->trigger = info->ioapic_trigger;
-		data->polarity = info->ioapic_polarity;
+	if (info && info->ioapic.valid) {
+		data->trigger = info->ioapic.trigger;
+		data->polarity = info->ioapic.polarity;
 	} else if (acpi_get_override_irq(gsi, &data->trigger,
 					 &data->polarity) < 0) {
 		/* PCI interrupts are always active low level triggered. */
@@ -2980,7 +2980,7 @@ int mp_irqdomain_alloc(struct irq_domain
 		return -EINVAL;
 
 	ioapic = mp_irqdomain_ioapic_idx(domain);
-	pin = info->ioapic_pin;
+	pin = info->ioapic.pin;
 	if (irq_find_mapping(domain, (irq_hw_number_t)pin) > 0)
 		return -EEXIST;
 
@@ -2988,7 +2988,7 @@ int mp_irqdomain_alloc(struct irq_domain
 	if (!data)
 		return -ENOMEM;
 
-	info->ioapic_entry = &data->entry;
+	info->ioapic.entry = &data->entry;
 	ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, info);
 	if (ret < 0) {
 		kfree(data);
@@ -2996,7 +2996,7 @@ int mp_irqdomain_alloc(struct irq_domain
 	}
 
 	INIT_LIST_HEAD(&data->irq_2_pin);
-	irq_data->hwirq = info->ioapic_pin;
+	irq_data->hwirq = info->ioapic.pin;
 	irq_data->chip = (domain->parent == x86_vector_domain) ?
 			  &ioapic_chip : &ioapic_ir_chip;
 	irq_data->chip_data = data;
@@ -3006,8 +3006,8 @@ int mp_irqdomain_alloc(struct irq_domain
 	add_pin_to_irq_node(data, ioapic_alloc_attr_node(info), ioapic, pin);
 
 	local_irq_save(flags);
-	if (info->ioapic_entry)
-		mp_setup_entry(cfg, data, info->ioapic_entry);
+	if (info->ioapic.entry)
+		mp_setup_entry(cfg, data, info->ioapic.entry);
 	mp_register_handler(virq, data->trigger);
 	if (virq < nr_legacy_irqs())
 		legacy_pic->mask(virq);
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3508,7 +3508,7 @@ static int get_devid(struct irq_alloc_in
 	switch (info->type) {
 	case X86_IRQ_ALLOC_TYPE_IOAPIC:
 	case X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT:
-		return get_ioapic_devid(info->ioapic_id);
+		return get_ioapic_devid(info->devid);
 	case X86_IRQ_ALLOC_TYPE_HPET:
 	case X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT:
 		return get_hpet_devid(info->devid);
@@ -3586,15 +3586,15 @@ static void irq_remapping_prepare_irte(s
 	switch (info->type) {
 	case X86_IRQ_ALLOC_TYPE_IOAPIC:
 		/* Setup IOAPIC entry */
-		entry = info->ioapic_entry;
-		info->ioapic_entry = NULL;
+		entry = info->ioapic.entry;
+		info->ioapic.entry = NULL;
 		memset(entry, 0, sizeof(*entry));
 		entry->vector        = index;
 		entry->mask          = 0;
-		entry->trigger       = info->ioapic_trigger;
-		entry->polarity      = info->ioapic_polarity;
+		entry->trigger       = info->ioapic.trigger;
+		entry->polarity      = info->ioapic.polarity;
 		/* Mask level triggered irqs. */
-		if (info->ioapic_trigger)
+		if (info->ioapic.trigger)
 			entry->mask = 1;
 		break;
 
@@ -3680,7 +3680,7 @@ static int irq_remapping_alloc(struct ir
 					iommu->irte_ops->set_allocated(table, i);
 			}
 			WARN_ON(table->min_index != 32);
-			index = info->ioapic_pin;
+			index = info->ioapic.pin;
 		} else {
 			index = -ENOMEM;
 		}
--- a/drivers/iommu/hyperv-iommu.c
+++ b/drivers/iommu/hyperv-iommu.c
@@ -101,7 +101,7 @@ static int hyperv_irq_remapping_alloc(st
 	 * in the chip_data and hyperv_irq_remapping_activate()/hyperv_ir_set_
 	 * affinity() set vector and dest_apicid directly into IO-APIC entry.
 	 */
-	irq_data->chip_data = info->ioapic_entry;
+	irq_data->chip_data = info->ioapic.entry;
 
 	/*
 	 * Hypver-V IO APIC irq affinity should be in the scope of
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -1113,7 +1113,7 @@ static struct irq_domain *intel_get_irq_
 
 	switch (info->type) {
 	case X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT:
-		return map_ioapic_to_ir(info->ioapic_id);
+		return map_ioapic_to_ir(info->devid);
 	case X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT:
 		return map_hpet_to_ir(info->devid);
 	case X86_IRQ_ALLOC_TYPE_PCI_MSI:
@@ -1254,16 +1254,16 @@ static void intel_irq_remapping_prepare_
 	switch (info->type) {
 	case X86_IRQ_ALLOC_TYPE_IOAPIC:
 		/* Set source-id of interrupt request */
-		set_ioapic_sid(irte, info->ioapic_id);
+		set_ioapic_sid(irte, info->devid);
 		apic_printk(APIC_VERBOSE, KERN_DEBUG "IOAPIC[%d]: Set IRTE entry (P:%d FPD:%d Dst_Mode:%d Redir_hint:%d Trig_Mode:%d Dlvry_Mode:%X Avail:%X Vector:%02X Dest:%08X SID:%04X SQ:%X SVT:%X)\n",
-			info->ioapic_id, irte->present, irte->fpd,
+			info->devid, irte->present, irte->fpd,
 			irte->dst_mode, irte->redir_hint,
 			irte->trigger_mode, irte->dlvry_mode,
 			irte->avail, irte->vector, irte->dest_id,
 			irte->sid, irte->sq, irte->svt);
 
-		entry = (struct IR_IO_APIC_route_entry *)info->ioapic_entry;
-		info->ioapic_entry = NULL;
+		entry = (struct IR_IO_APIC_route_entry *)info->ioapic.entry;
+		info->ioapic.entry = NULL;
 		memset(entry, 0, sizeof(*entry));
 		entry->index2	= (index >> 15) & 0x1;
 		entry->zero	= 0;
@@ -1273,11 +1273,11 @@ static void intel_irq_remapping_prepare_
 		 * IO-APIC RTE will be configured with virtual vector.
 		 * irq handler will do the explicit EOI to the io-apic.
 		 */
-		entry->vector	= info->ioapic_pin;
+		entry->vector	= info->ioapic.pin;
 		entry->mask	= 0;			/* enable IRQ */
-		entry->trigger	= info->ioapic_trigger;
-		entry->polarity	= info->ioapic_polarity;
-		if (info->ioapic_trigger)
+		entry->trigger	= info->ioapic.trigger;
+		entry->polarity	= info->ioapic.polarity;
+		if (info->ioapic.trigger)
 			entry->mask = 1; /* Mask level triggered irqs. */
 		break;
 

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

* [patch RFC 11/38] x86/irq: Consolidate DMAR irq allocation
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (9 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 10/38] x86/ioapic: Consolidate IOAPIC allocation Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 12/38] x86/irq: Consolidate UV domain allocation Thomas Gleixner
                   ` (27 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: x86-irq--Consolidate-DMAR-irq-allocation.patch --]
[-- Type: text/plain, Size: 1687 bytes --]

None of the DMAR specific fields are required.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/hw_irq.h |    6 ------
 arch/x86/kernel/apic/msi.c    |   10 +++++-----
 2 files changed, 5 insertions(+), 11 deletions(-)

--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -83,12 +83,6 @@ struct irq_alloc_info {
 			irq_hw_number_t	msi_hwirq;
 		};
 #endif
-#ifdef	CONFIG_DMAR_TABLE
-		struct {
-			int		dmar_id;
-			void		*dmar_data;
-		};
-#endif
 #ifdef	CONFIG_X86_UV
 		struct {
 			int		uv_limit;
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -329,15 +329,15 @@ static struct irq_chip dmar_msi_controll
 static irq_hw_number_t dmar_msi_get_hwirq(struct msi_domain_info *info,
 					  msi_alloc_info_t *arg)
 {
-	return arg->dmar_id;
+	return arg->hwirq;
 }
 
 static int dmar_msi_init(struct irq_domain *domain,
 			 struct msi_domain_info *info, unsigned int virq,
 			 irq_hw_number_t hwirq, msi_alloc_info_t *arg)
 {
-	irq_domain_set_info(domain, virq, arg->dmar_id, info->chip, NULL,
-			    handle_edge_irq, arg->dmar_data, "edge");
+	irq_domain_set_info(domain, virq, arg->devid, info->chip, NULL,
+			    handle_edge_irq, arg->data, "edge");
 
 	return 0;
 }
@@ -384,8 +384,8 @@ int dmar_alloc_hwirq(int id, int node, v
 
 	init_irq_alloc_info(&info, NULL);
 	info.type = X86_IRQ_ALLOC_TYPE_DMAR;
-	info.dmar_id = id;
-	info.dmar_data = arg;
+	info.devid = id;
+	info.data = arg;
 
 	return irq_domain_alloc_irqs(domain, 1, node, &info);
 }

* [patch RFC 12/38] x86/irq: Consolidate UV domain allocation
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (10 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 11/38] x86/irq: Consolidate DMAR irq allocation Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 13/38] PCI: MSI: Rework pci_msi_domain_calc_hwirq() Thomas Gleixner
                   ` (26 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: x86-irq--Consolidate-UV-domain-allocation.patch --]
[-- Type: text/plain, Size: 2974 bytes --]

Move the UV specific fields into their own struct for readability's sake. Get
rid of the #ifdeffery as it does not matter at all whether the alloc info
is a couple of bytes longer or not.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Steve Wahl <steve.wahl@hpe.com>
Cc: Dimitri Sivanich <sivanich@hpe.com>
Cc: Russ Anderson <rja@hpe.com>
---
 arch/x86/include/asm/hw_irq.h |   21 ++++++++++++---------
 arch/x86/platform/uv/uv_irq.c |   16 ++++++++--------
 2 files changed, 20 insertions(+), 17 deletions(-)

--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -53,6 +53,14 @@ struct ioapic_alloc_info {
 	struct IO_APIC_route_entry	*entry;
 };
 
+struct uv_alloc_info {
+	int		limit;
+	int		blade;
+	unsigned long	offset;
+	char		*name;
+
+};
+
 /**
  * irq_alloc_info - X86 specific interrupt allocation info
  * @type:	X86 specific allocation type
@@ -64,7 +72,8 @@ struct ioapic_alloc_info {
  * @data:	Allocation specific data
  *
  * @ioapic:	IOAPIC specific allocation data
- */
+ * @uv:		UV specific allocation data
+ */
 struct irq_alloc_info {
 	enum irq_alloc_type	type;
 	u32			flags;
@@ -76,6 +85,8 @@ struct irq_alloc_info {
 
 	union {
 		struct ioapic_alloc_info	ioapic;
+		struct uv_alloc_info		uv;
+
 		int		unused;
 #ifdef	CONFIG_PCI_MSI
 		struct {
@@ -83,14 +94,6 @@ struct irq_alloc_info {
 			irq_hw_number_t	msi_hwirq;
 		};
 #endif
-#ifdef	CONFIG_X86_UV
-		struct {
-			int		uv_limit;
-			int		uv_blade;
-			unsigned long	uv_offset;
-			char		*uv_name;
-		};
-#endif
 	};
 };
 
--- a/arch/x86/platform/uv/uv_irq.c
+++ b/arch/x86/platform/uv/uv_irq.c
@@ -90,15 +90,15 @@ static int uv_domain_alloc(struct irq_do
 
 	ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg);
 	if (ret >= 0) {
-		if (info->uv_limit == UV_AFFINITY_CPU)
+		if (info->uv.limit == UV_AFFINITY_CPU)
 			irq_set_status_flags(virq, IRQ_NO_BALANCING);
 		else
 			irq_set_status_flags(virq, IRQ_MOVE_PCNTXT);
 
-		chip_data->pnode = uv_blade_to_pnode(info->uv_blade);
-		chip_data->offset = info->uv_offset;
+		chip_data->pnode = uv_blade_to_pnode(info->uv.blade);
+		chip_data->offset = info->uv.offset;
 		irq_domain_set_info(domain, virq, virq, &uv_irq_chip, chip_data,
-				    handle_percpu_irq, NULL, info->uv_name);
+				    handle_percpu_irq, NULL, info->uv.name);
 	} else {
 		kfree(chip_data);
 	}
@@ -193,10 +193,10 @@ int uv_setup_irq(char *irq_name, int cpu
 
 	init_irq_alloc_info(&info, cpumask_of(cpu));
 	info.type = X86_IRQ_ALLOC_TYPE_UV;
-	info.uv_limit = limit;
-	info.uv_blade = mmr_blade;
-	info.uv_offset = mmr_offset;
-	info.uv_name = irq_name;
+	info.uv.limit = limit;
+	info.uv.blade = mmr_blade;
+	info.uv.offset = mmr_offset;
+	info.uv.name = irq_name;
 
 	return irq_domain_alloc_irqs(domain, 1,
 				     uv_blade_to_memory_nid(mmr_blade), &info);
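
The layout change above can be sketched in plain userspace C. This is only an illustration under simplifying assumptions: all sketch_* names are made up, and the structs carry just enough fields to show the idea of one named union member per allocation type instead of #ifdef'ed anonymous structs with prefixed field names.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the per-type allocation structs (hypothetical). */
struct sketch_ioapic_info {
	int	pin;
	int	node;
	void	*entry;
};

struct sketch_uv_info {
	int		limit;
	int		blade;
	unsigned long	offset;
	char		*name;
};

/* Common fields first, then one named union member per allocation type. */
struct sketch_alloc_info {
	int	type;
	union {
		struct sketch_ioapic_info	ioapic;
		struct sketch_uv_info		uv;
	};
};

/* Fill the UV part via info->uv.*, mirroring what uv_setup_irq() does. */
static inline void sketch_fill_uv(struct sketch_alloc_info *info, int limit,
				  int blade, unsigned long offset, char *name)
{
	assert(info != NULL);

	info->uv.limit = limit;
	info->uv.blade = blade;
	info->uv.offset = offset;
	info->uv.name = name;
}
```

Because the members share storage in the union, always compiling in the UV member costs at most the size difference to the largest other member, which is the point the changelog makes about dropping the #ifdef.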

* [patch RFC 13/38] PCI: MSI: Rework pci_msi_domain_calc_hwirq()
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (11 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 12/38] x86/irq: Consolidate UV domain allocation Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-25 20:03   ` Bjorn Helgaas
  2020-08-21  0:24 ` [patch RFC 14/38] x86/msi: Consolidate MSI allocation Thomas Gleixner
                   ` (25 subsequent siblings)
  38 siblings, 1 reply; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: PCI--MSI--Rework-pci_msi_domain_calc_hwirq--.patch --]
[-- Type: text/plain, Size: 2631 bytes --]

Retrieve the PCI device from the MSI descriptor instead of doing so at the
call sites.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-pci@vger.kernel.org
---
 arch/x86/kernel/apic/msi.c |    2 +-
 drivers/pci/msi.c          |   13 ++++++-------
 include/linux/msi.h        |    3 +--
 3 files changed, 8 insertions(+), 10 deletions(-)

--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -232,7 +232,7 @@ EXPORT_SYMBOL_GPL(pci_msi_prepare);
 
 void pci_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc)
 {
-	arg->msi_hwirq = pci_msi_domain_calc_hwirq(arg->msi_dev, desc);
+	arg->msi_hwirq = pci_msi_domain_calc_hwirq(desc);
 }
 EXPORT_SYMBOL_GPL(pci_msi_set_desc);
 
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1346,17 +1346,17 @@ void pci_msi_domain_write_msg(struct irq
 
 /**
  * pci_msi_domain_calc_hwirq - Generate a unique ID for an MSI source
- * @dev:	Pointer to the PCI device
  * @desc:	Pointer to the MSI descriptor
  *
  * The ID number is only used within the irqdomain.
  */
-irq_hw_number_t pci_msi_domain_calc_hwirq(struct pci_dev *dev,
-					  struct msi_desc *desc)
+irq_hw_number_t pci_msi_domain_calc_hwirq(struct msi_desc *desc)
 {
+	struct pci_dev *pdev = msi_desc_to_pci_dev(desc);
+
 	return (irq_hw_number_t)desc->msi_attrib.entry_nr |
-		pci_dev_id(dev) << 11 |
-		(pci_domain_nr(dev->bus) & 0xFFFFFFFF) << 27;
+		pci_dev_id(pdev) << 11 |
+		(pci_domain_nr(pdev->bus) & 0xFFFFFFFF) << 27;
 }
 
 static inline bool pci_msi_desc_is_multi_msi(struct msi_desc *desc)
@@ -1406,8 +1406,7 @@ static void pci_msi_domain_set_desc(msi_
 				    struct msi_desc *desc)
 {
 	arg->desc = desc;
-	arg->hwirq = pci_msi_domain_calc_hwirq(msi_desc_to_pci_dev(desc),
-					       desc);
+	arg->hwirq = pci_msi_domain_calc_hwirq(desc);
 }
 #else
 #define pci_msi_domain_set_desc		NULL
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -369,8 +369,7 @@ void pci_msi_domain_write_msg(struct irq
 struct irq_domain *pci_msi_create_irq_domain(struct fwnode_handle *fwnode,
 					     struct msi_domain_info *info,
 					     struct irq_domain *parent);
-irq_hw_number_t pci_msi_domain_calc_hwirq(struct pci_dev *dev,
-					  struct msi_desc *desc);
+irq_hw_number_t pci_msi_domain_calc_hwirq(struct msi_desc *desc);
 int pci_msi_domain_check_cap(struct irq_domain *domain,
 			     struct msi_domain_info *info, struct device *dev);
 u32 pci_msi_domain_get_msi_rid(struct irq_domain *domain, struct pci_dev *pdev);
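
The value computed by pci_msi_domain_calc_hwirq() packs three fields into one number: the MSI entry number in the low bits, the PCI device ID shifted by 11 and the PCI domain number shifted by 27. A minimal userspace sketch of that layout follows. The sketch_* names are illustrative and not kernel functions, and the widening to 64 bit is an assumption of the sketch, not something the patch does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a bus/devfn ID: bus in bits 15:8, devfn in bits 7:0. */
static inline uint16_t sketch_pci_devid(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)(((uint16_t)bus << 8) | devfn);
}

/*
 * Userspace model of the layout used by pci_msi_domain_calc_hwirq():
 *   bits 10:0   MSI entry number
 *   bits 26:11  PCI device ID (bus/devfn)
 *   bits 27+    PCI domain (segment) number
 * The casts to uint64_t are an assumption of this sketch.
 */
static inline uint64_t sketch_calc_hwirq(uint32_t domain, uint16_t devid,
					 uint16_t entry_nr)
{
	/* MSI-X allows at most 2048 entries, so entry_nr fits in 11 bits. */
	assert(entry_nr < (1u << 11));

	return (uint64_t)entry_nr |
	       ((uint64_t)devid << 11) |
	       ((uint64_t)domain << 27);
}
```

For example, entry 3 of device 01:02.0 (bus 0x01, devfn 0x10) in domain 0 gives 0x88003 in this model.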

* [patch RFC 14/38] x86/msi: Consolidate MSI allocation
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (12 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 13/38] PCI: MSI: Rework pci_msi_domain_calc_hwirq() Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 15/38] x86/msi: Use generic MSI domain ops Thomas Gleixner
                   ` (24 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: x86-msi--Consolidate-MSI-allocation.patch --]
[-- Type: text/plain, Size: 4345 bytes --]

Convert the interrupt remap drivers to retrieve the PCI device from the MSI
descriptor and use info::hwirq.

This is the first step to prepare x86 for using the generic MSI domain ops.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: linux-pci@vger.kernel.org
Cc: linux-hyperv@vger.kernel.org
Cc: iommu@lists.linux-foundation.org
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
---
 arch/x86/include/asm/hw_irq.h       |    8 --------
 arch/x86/kernel/apic/msi.c          |    7 +++----
 drivers/iommu/amd/iommu.c           |    5 +++--
 drivers/iommu/intel/irq_remapping.c |    4 ++--
 drivers/pci/controller/pci-hyperv.c |    2 +-
 5 files changed, 9 insertions(+), 17 deletions(-)

--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -85,14 +85,6 @@ struct irq_alloc_info {
 	union {
 		struct ioapic_alloc_info	ioapic;
 		struct uv_alloc_info		uv;
-
-		int		unused;
-#ifdef	CONFIG_PCI_MSI
-		struct {
-			struct pci_dev	*msi_dev;
-			irq_hw_number_t	msi_hwirq;
-		};
-#endif
 	};
 };
 
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -189,7 +189,6 @@ int native_setup_msi_irqs(struct pci_dev
 
 	init_irq_alloc_info(&info, NULL);
 	info.type = X86_IRQ_ALLOC_TYPE_PCI_MSI;
-	info.msi_dev = dev;
 
 	domain = irq_remapping_get_irq_domain(&info);
 	if (domain == NULL)
@@ -208,7 +207,7 @@ void native_teardown_msi_irq(unsigned in
 static irq_hw_number_t pci_msi_get_hwirq(struct msi_domain_info *info,
 					 msi_alloc_info_t *arg)
 {
-	return arg->msi_hwirq;
+	return arg->hwirq;
 }
 
 int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
@@ -218,7 +217,6 @@ int pci_msi_prepare(struct irq_domain *d
 	struct msi_desc *desc = first_pci_msi_entry(pdev);
 
 	init_irq_alloc_info(arg, NULL);
-	arg->msi_dev = pdev;
 	if (desc->msi_attrib.is_msix) {
 		arg->type = X86_IRQ_ALLOC_TYPE_PCI_MSIX;
 	} else {
@@ -232,7 +230,8 @@ EXPORT_SYMBOL_GPL(pci_msi_prepare);
 
 void pci_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc)
 {
-	arg->msi_hwirq = pci_msi_domain_calc_hwirq(desc);
+	arg->desc = desc;
+	arg->hwirq = pci_msi_domain_calc_hwirq(desc);
 }
 EXPORT_SYMBOL_GPL(pci_msi_set_desc);
 
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3514,7 +3514,7 @@ static int get_devid(struct irq_alloc_in
 		return get_hpet_devid(info->devid);
 	case X86_IRQ_ALLOC_TYPE_PCI_MSI:
 	case X86_IRQ_ALLOC_TYPE_PCI_MSIX:
-		return get_device_id(&info->msi_dev->dev);
+		return get_device_id(msi_desc_to_dev(info->desc));
 	default:
 		WARN_ON_ONCE(1);
 		return -1;
@@ -3688,7 +3688,8 @@ static int irq_remapping_alloc(struct ir
 		   info->type == X86_IRQ_ALLOC_TYPE_PCI_MSIX) {
 		bool align = (info->type == X86_IRQ_ALLOC_TYPE_PCI_MSI);
 
-		index = alloc_irq_index(devid, nr_irqs, align, info->msi_dev);
+		index = alloc_irq_index(devid, nr_irqs, align,
+					msi_desc_to_pci_dev(info->desc));
 	} else {
 		index = alloc_irq_index(devid, nr_irqs, false, NULL);
 	}
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -1118,7 +1118,7 @@ static struct irq_domain *intel_get_irq_
 		return map_hpet_to_ir(info->devid);
 	case X86_IRQ_ALLOC_TYPE_PCI_MSI:
 	case X86_IRQ_ALLOC_TYPE_PCI_MSIX:
-		return map_dev_to_ir(info->msi_dev);
+		return map_dev_to_ir(msi_desc_to_pci_dev(info->desc));
 	default:
 		WARN_ON_ONCE(1);
 		return NULL;
@@ -1287,7 +1287,7 @@ static void intel_irq_remapping_prepare_
 		if (info->type == X86_IRQ_ALLOC_TYPE_HPET)
 			set_hpet_sid(irte, info->devid);
 		else
-			set_msi_sid(irte, info->msi_dev);
+			set_msi_sid(irte, msi_desc_to_pci_dev(info->desc));
 
 		msg->address_hi = MSI_ADDR_BASE_HI;
 		msg->data = sub_handle;
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -1534,7 +1534,7 @@ static struct irq_chip hv_msi_irq_chip =
 static irq_hw_number_t hv_msi_domain_ops_get_hwirq(struct msi_domain_info *info,
 						   msi_alloc_info_t *arg)
 {
-	return arg->msi_hwirq;
+	return arg->hwirq;
 }
 
 static struct msi_domain_ops hv_msi_ops = {
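
Dropping msi_dev works because the descriptor already points at the generic device which is embedded in the PCI device. A hedged userspace sketch of that lookup, assuming the usual container_of() style embedding; the sketch_* types are simplified stand-ins and not the kernel structures:

```c
#include <assert.h>
#include <stddef.h>

struct sketch_device {
	const char *name;
};

/* A PCI device embeds a generic device, similar to struct pci_dev. */
struct sketch_pci_dev {
	unsigned int		devfn;
	struct sketch_device	dev;
};

/* The MSI descriptor only stores a pointer to the generic device. */
struct sketch_msi_desc {
	struct sketch_device	*dev;
};

/* Recover the containing PCI device from the embedded generic device. */
static inline struct sketch_pci_dev *
sketch_desc_to_pci_dev(struct sketch_msi_desc *desc)
{
	assert(desc && desc->dev);

	return (struct sketch_pci_dev *)((char *)desc->dev -
					 offsetof(struct sketch_pci_dev, dev));
}
```

With this in place a separate device pointer in the allocation info is redundant, since every consumer can derive it from the descriptor.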

* [patch RFC 15/38] x86/msi: Use generic MSI domain ops
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (13 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 14/38] x86/msi: Consolidate MSI allocation Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 16/38] x86/irq: Move apic_post_init() invocation to one place Thomas Gleixner
                   ` (23 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: x86-msi--Use-generic-MSI-domain-ops.patch --]
[-- Type: text/plain, Size: 3832 bytes --]

pci_msi_get_hwirq() and pci_msi_set_desc() are no longer special. Enable the
generic MSI domain ops in the core and PCI MSI code unconditionally and get
rid of the x86 specific implementations in the X86 MSI code and in the
hyperv PCI driver.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: linux-pci@vger.kernel.org
Cc: linux-hyperv@vger.kernel.org
---
 arch/x86/include/asm/msi.h          |    2 --
 arch/x86/kernel/apic/msi.c          |   15 ---------------
 drivers/pci/controller/pci-hyperv.c |    8 --------
 drivers/pci/msi.c                   |    4 ----
 kernel/irq/msi.c                    |    6 ------
 5 files changed, 35 deletions(-)

--- a/arch/x86/include/asm/msi.h
+++ b/arch/x86/include/asm/msi.h
@@ -9,6 +9,4 @@ typedef struct irq_alloc_info msi_alloc_
 int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
 		    msi_alloc_info_t *arg);
 
-void pci_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc);
-
 #endif /* _ASM_X86_MSI_H */
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -204,12 +204,6 @@ void native_teardown_msi_irq(unsigned in
 	irq_domain_free_irqs(irq, 1);
 }
 
-static irq_hw_number_t pci_msi_get_hwirq(struct msi_domain_info *info,
-					 msi_alloc_info_t *arg)
-{
-	return arg->hwirq;
-}
-
 int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
 		    msi_alloc_info_t *arg)
 {
@@ -228,17 +222,8 @@ int pci_msi_prepare(struct irq_domain *d
 }
 EXPORT_SYMBOL_GPL(pci_msi_prepare);
 
-void pci_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc)
-{
-	arg->desc = desc;
-	arg->hwirq = pci_msi_domain_calc_hwirq(desc);
-}
-EXPORT_SYMBOL_GPL(pci_msi_set_desc);
-
 static struct msi_domain_ops pci_msi_domain_ops = {
-	.get_hwirq	= pci_msi_get_hwirq,
 	.msi_prepare	= pci_msi_prepare,
-	.set_desc	= pci_msi_set_desc,
 };
 
 static struct msi_domain_info pci_msi_domain_info = {
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -1531,16 +1531,8 @@ static struct irq_chip hv_msi_irq_chip =
 	.irq_unmask		= hv_irq_unmask,
 };
 
-static irq_hw_number_t hv_msi_domain_ops_get_hwirq(struct msi_domain_info *info,
-						   msi_alloc_info_t *arg)
-{
-	return arg->hwirq;
-}
-
 static struct msi_domain_ops hv_msi_ops = {
-	.get_hwirq	= hv_msi_domain_ops_get_hwirq,
 	.msi_prepare	= pci_msi_prepare,
-	.set_desc	= pci_msi_set_desc,
 	.msi_free	= hv_msi_free,
 };
 
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1401,16 +1401,12 @@ static int pci_msi_domain_handle_error(s
 	return error;
 }
 
-#ifdef GENERIC_MSI_DOMAIN_OPS
 static void pci_msi_domain_set_desc(msi_alloc_info_t *arg,
 				    struct msi_desc *desc)
 {
 	arg->desc = desc;
 	arg->hwirq = pci_msi_domain_calc_hwirq(desc);
 }
-#else
-#define pci_msi_domain_set_desc		NULL
-#endif
 
 static struct msi_domain_ops pci_msi_domain_ops_default = {
 	.set_desc	= pci_msi_domain_set_desc,
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -187,7 +187,6 @@ static const struct irq_domain_ops msi_d
 	.deactivate	= msi_domain_deactivate,
 };
 
-#ifdef GENERIC_MSI_DOMAIN_OPS
 static irq_hw_number_t msi_domain_ops_get_hwirq(struct msi_domain_info *info,
 						msi_alloc_info_t *arg)
 {
@@ -206,11 +205,6 @@ static void msi_domain_ops_set_desc(msi_
 {
 	arg->desc = desc;
 }
-#else
-#define msi_domain_ops_get_hwirq	NULL
-#define msi_domain_ops_prepare		NULL
-#define msi_domain_ops_set_desc		NULL
-#endif /* !GENERIC_MSI_DOMAIN_OPS */
 
 static int msi_domain_ops_init(struct irq_domain *domain,
 			       struct msi_domain_info *info,
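
The reason the x86 and hyperv callbacks can simply go away is that the core fills in every callback a domain leaves unset with a generic default. A minimal userspace sketch of that fill-in pattern, with made-up sketch_* names and a reduced callback set; it models the idea only and is not the kernel implementation:

```c
#include <assert.h>
#include <stddef.h>

struct sketch_alloc_info {
	unsigned long	hwirq;
	void		*desc;
};

struct sketch_msi_ops {
	unsigned long	(*get_hwirq)(struct sketch_alloc_info *arg);
	void		(*set_desc)(struct sketch_alloc_info *arg, void *desc);
};

/* Generic defaults: return the stored hwirq, store the descriptor. */
static unsigned long sketch_default_get_hwirq(struct sketch_alloc_info *arg)
{
	return arg->hwirq;
}

static void sketch_default_set_desc(struct sketch_alloc_info *arg, void *desc)
{
	arg->desc = desc;
}

/* Fill every callback the driver left NULL with the generic default. */
static void sketch_update_ops(struct sketch_msi_ops *ops)
{
	assert(ops != NULL);

	if (!ops->get_hwirq)
		ops->get_hwirq = sketch_default_get_hwirq;
	if (!ops->set_desc)
		ops->set_desc = sketch_default_set_desc;
}
```

A driver which only needs special handling in one callback sets just that one and inherits the rest, which is what pci_msi_domain_ops and hv_msi_ops do after this patch.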

* [patch RFC 16/38] x86/irq: Move apic_post_init() invocation to one place
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (14 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 15/38] x86/msi: Use generic MSI domain ops Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 17/38] x86/pci: Reduce #ifdeffery in PCI init code Thomas Gleixner
                   ` (22 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: x86-irq--Move-apic_post_init --]
[-- Type: text/plain, Size: 1383 bytes --]

There is no point in calling it from both the 32bit and the 64bit
implementations of default_setup_apic_routing(). Move it to the caller.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/apic.c     |    3 +++
 arch/x86/kernel/apic/probe_32.c |    3 ---
 arch/x86/kernel/apic/probe_64.c |    3 ---
 3 files changed, 3 insertions(+), 6 deletions(-)

--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1429,6 +1429,9 @@ void __init apic_intr_mode_init(void)
 		break;
 	}
 
+	if (x86_platform.apic_post_init)
+		x86_platform.apic_post_init();
+
 	apic_bsp_setup(upmode);
 }
 
--- a/arch/x86/kernel/apic/probe_32.c
+++ b/arch/x86/kernel/apic/probe_32.c
@@ -170,9 +170,6 @@ void __init default_setup_apic_routing(v
 
 	if (apic->setup_apic_routing)
 		apic->setup_apic_routing();
-
-	if (x86_platform.apic_post_init)
-		x86_platform.apic_post_init();
 }
 
 void __init generic_apic_probe(void)
--- a/arch/x86/kernel/apic/probe_64.c
+++ b/arch/x86/kernel/apic/probe_64.c
@@ -32,9 +32,6 @@ void __init default_setup_apic_routing(v
 			break;
 		}
 	}
-
-	if (x86_platform.apic_post_init)
-		x86_platform.apic_post_init();
 }
 
 int __init default_acpi_madt_oem_check(char *oem_id, char *oem_table_id)

* [patch RFC 17/38] x86/pci: Reduce #ifdeffery in PCI init code
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (15 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 16/38] x86/irq: Move apic_post_init() invocation to one place Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-25 20:20   ` Bjorn Helgaas
  2020-08-21  0:24 ` [patch RFC 18/38] x86/irq: Initialize PCI/MSI domain at PCI init time Thomas Gleixner
                   ` (21 subsequent siblings)
  38 siblings, 1 reply; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: x86-pci--Reducde-#ifdeffery-in-PCI-init-code.patch --]
[-- Type: text/plain, Size: 2185 bytes --]

Adding a function call before the first #ifdef in pci_arch_init() triggers
a 'mixed declarations and code' warning if PCI_DIRECT is enabled.

Use stub functions and move the #ifdeffery to the header file where it is
not in the way.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-pci@vger.kernel.org
---
 arch/x86/include/asm/pci_x86.h |   11 +++++++++++
 arch/x86/pci/init.c            |   10 +++-------
 2 files changed, 14 insertions(+), 7 deletions(-)

--- a/arch/x86/include/asm/pci_x86.h
+++ b/arch/x86/include/asm/pci_x86.h
@@ -114,9 +114,20 @@ extern const struct pci_raw_ops pci_dire
 extern bool port_cf9_safe;
 
 /* arch_initcall level */
+#ifdef CONFIG_PCI_DIRECT
 extern int pci_direct_probe(void);
 extern void pci_direct_init(int type);
+#else
+static inline int pci_direct_probe(void) { return -1; }
+static inline void pci_direct_init(int type) { }
+#endif
+
+#ifdef CONFIG_PCI_BIOS
 extern void pci_pcbios_init(void);
+#else
+static inline void pci_pcbios_init(void) { }
+#endif
+
 extern void __init dmi_check_pciprobe(void);
 extern void __init dmi_check_skip_isa_align(void);
 
--- a/arch/x86/pci/init.c
+++ b/arch/x86/pci/init.c
@@ -8,11 +8,9 @@
    in the right sequence from here. */
 static __init int pci_arch_init(void)
 {
-#ifdef CONFIG_PCI_DIRECT
-	int type = 0;
+	int type;
 
 	type = pci_direct_probe();
-#endif
 
 	if (!(pci_probe & PCI_PROBE_NOEARLY))
 		pci_mmcfg_early_init();
@@ -20,18 +18,16 @@ static __init int pci_arch_init(void)
 	if (x86_init.pci.arch_init && !x86_init.pci.arch_init())
 		return 0;
 
-#ifdef CONFIG_PCI_BIOS
 	pci_pcbios_init();
-#endif
+
 	/*
 	 * don't check for raw_pci_ops here because we want pcbios as last
 	 * fallback, yet it's needed to run first to set pcibios_last_bus
 	 * in case legacy PCI probing is used. otherwise detecting peer busses
 	 * fails.
 	 */
-#ifdef CONFIG_PCI_DIRECT
 	pci_direct_init(type);
-#endif
+
 	if (!raw_pci_ops && !raw_pci_ext_ops)
 		printk(KERN_ERR
 		"PCI: Fatal: No config space access function found\n");

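The stub approach can be shown in isolation. The following is a small sketch, assuming a made-up option SKETCH_CONFIG_DIRECT in place of CONFIG_PCI_DIRECT and made-up sketch_* function names: the header part provides either the real prototypes or empty inline stubs, so the caller needs no conditional compilation and has its declarations in one place.

```c
/*
 * Header part: SKETCH_CONFIG_DIRECT is a hypothetical option. When it is
 * not defined, the inline stubs keep the callers free of #ifdef blocks.
 */
#ifdef SKETCH_CONFIG_DIRECT
int sketch_direct_probe(void);
void sketch_direct_init(int type);
#else
static inline int sketch_direct_probe(void) { return -1; }
static inline void sketch_direct_init(int type) { (void)type; }
#endif

/* Caller part: a single code path without conditional compilation. */
static int sketch_arch_init(void)
{
	int type;

	type = sketch_direct_probe();
	/* ... further init steps would run here ... */
	sketch_direct_init(type);

	return type;
}
```

With the option disabled the compiler inlines the stubs and drops the calls, so the generated code should match the #ifdef variant.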
* [patch RFC 18/38] x86/irq: Initialize PCI/MSI domain at PCI init time
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (16 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 17/38] x86/pci: Reduce #ifdeffery in PCI init code Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 19/38] irqdomain/msi: Provide DOMAIN_BUS_VMD_MSI Thomas Gleixner
                   ` (20 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: x86-irq--Initialize-PCI-MSI-domain-at-PCI-init-time.patch --]
[-- Type: text/plain, Size: 5012 bytes --]

There is no point in initializing the default PCI/MSI interrupt domain early
and no point in creating it when XEN PV/HVM/DOM0 are active.

Move the initialization to pci_arch_init() and convert it to init ops so
that XEN can override it as XEN has its own PCI/MSI management. The XEN
override comes in a later step.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-pci@vger.kernel.org
---
 arch/x86/include/asm/irqdomain.h |    6 ++++--
 arch/x86/include/asm/x86_init.h  |    3 +++
 arch/x86/kernel/apic/msi.c       |   26 ++++++++++++++++----------
 arch/x86/kernel/apic/vector.c    |    2 --
 arch/x86/kernel/x86_init.c       |    3 ++-
 arch/x86/pci/init.c              |    3 +++
 6 files changed, 28 insertions(+), 15 deletions(-)

--- a/arch/x86/include/asm/irqdomain.h
+++ b/arch/x86/include/asm/irqdomain.h
@@ -51,9 +51,11 @@ extern int mp_irqdomain_ioapic_idx(struc
 #endif /* CONFIG_X86_IO_APIC */
 
 #ifdef CONFIG_PCI_MSI
-extern void arch_init_msi_domain(struct irq_domain *domain);
+void x86_create_pci_msi_domain(void);
+struct irq_domain *native_create_pci_msi_domain(void);
 #else
-static inline void arch_init_msi_domain(struct irq_domain *domain) { }
+static inline void x86_create_pci_msi_domain(void) { }
+#define native_create_pci_msi_domain	NULL
 #endif
 
 #endif
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -8,6 +8,7 @@ struct mpc_bus;
 struct mpc_cpu;
 struct mpc_table;
 struct cpuinfo_x86;
+struct irq_domain;
 
 /**
  * struct x86_init_mpparse - platform specific mpparse ops
@@ -42,12 +43,14 @@ struct x86_init_resources {
  * @intr_init:			interrupt init code
  * @intr_mode_select:		interrupt delivery mode selection
  * @intr_mode_init:		interrupt delivery mode setup
+ * @create_pci_msi_domain:	Create the PCI/MSI interrupt domain
  */
 struct x86_init_irqs {
 	void (*pre_vector_init)(void);
 	void (*intr_init)(void);
 	void (*intr_mode_select)(void);
 	void (*intr_mode_init)(void);
+	struct irq_domain *(*create_pci_msi_domain)(void);
 };
 
 /**
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -21,7 +21,7 @@
 #include <asm/apic.h>
 #include <asm/irq_remapping.h>
 
-static struct irq_domain *msi_default_domain;
+static struct irq_domain *x86_pci_msi_default_domain __ro_after_init;
 
 static void __irq_msi_compose_msg(struct irq_cfg *cfg, struct msi_msg *msg)
 {
@@ -192,7 +192,7 @@ int native_setup_msi_irqs(struct pci_dev
 
 	domain = irq_remapping_get_irq_domain(&info);
 	if (domain == NULL)
-		domain = msi_default_domain;
+		domain = x86_pci_msi_default_domain;
 	if (domain == NULL)
 		return -ENOSYS;
 
@@ -243,25 +243,31 @@ static struct msi_domain_info pci_msi_do
 	.handler_name	= "edge",
 };
 
-void __init arch_init_msi_domain(struct irq_domain *parent)
+struct irq_domain * __init native_create_pci_msi_domain(void)
 {
 	struct fwnode_handle *fn;
+	struct irq_domain *d = NULL;
 
 	if (disable_apic)
-		return;
+		return NULL;
 
 	fn = irq_domain_alloc_named_fwnode("PCI-MSI");
 	if (fn) {
-		msi_default_domain =
-			pci_msi_create_irq_domain(fn, &pci_msi_domain_info,
-						  parent);
+		d = pci_msi_create_irq_domain(fn, &pci_msi_domain_info,
+					      x86_vector_domain);
 	}
-	if (!msi_default_domain) {
+	if (!d) {
 		irq_domain_free_fwnode(fn);
-		pr_warn("failed to initialize irqdomain for MSI/MSI-x.\n");
+		pr_warn("Failed to initialize PCI-MSI irqdomain.\n");
 	} else {
-		msi_default_domain->flags |= IRQ_DOMAIN_MSI_NOMASK_QUIRK;
+		d->flags |= IRQ_DOMAIN_MSI_NOMASK_QUIRK;
 	}
+	return d;
+}
+
+void __init x86_create_pci_msi_domain(void)
+{
+	x86_pci_msi_default_domain = x86_init.irqs.create_pci_msi_domain();
 }
 
 #ifdef CONFIG_IRQ_REMAP
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -713,8 +713,6 @@ int __init arch_early_irq_init(void)
 	BUG_ON(x86_vector_domain == NULL);
 	irq_set_default_host(x86_vector_domain);
 
-	arch_init_msi_domain(x86_vector_domain);
-
 	BUG_ON(!alloc_cpumask_var(&vector_searchmask, GFP_KERNEL));
 
 	/*
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -76,7 +76,8 @@ struct x86_init_ops x86_init __initdata
 		.pre_vector_init	= init_ISA_irqs,
 		.intr_init		= native_init_IRQ,
 		.intr_mode_select	= apic_intr_mode_select,
-		.intr_mode_init		= apic_intr_mode_init
+		.intr_mode_init		= apic_intr_mode_init,
+		.create_pci_msi_domain	= native_create_pci_msi_domain,
 	},
 
 	.oem = {
--- a/arch/x86/pci/init.c
+++ b/arch/x86/pci/init.c
@@ -3,6 +3,7 @@
 #include <linux/init.h>
 #include <asm/pci_x86.h>
 #include <asm/x86_init.h>
+#include <asm/irqdomain.h>
 
 /* arch_initcall has too random ordering, so call the initializers
    in the right sequence from here. */
@@ -10,6 +11,8 @@ static __init int pci_arch_init(void)
 {
 	int type;
 
+	x86_create_pci_msi_domain();
+
 	type = pci_direct_probe();
 
 	if (!(pci_probe & PCI_PROBE_NOEARLY))

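The hook mechanism above can be illustrated outside the kernel. What follows
is a minimal user-space sketch, not kernel code: struct init_irqs,
native_create(), xen_create() and create_default_domain() are hypothetical
stand-ins for struct x86_init_irqs, native_create_pci_msi_domain(), a platform
override and x86_create_pci_msi_domain(). The domain names are made up too.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct irq_domain. */
struct domain {
	const char *name;
};

/* Hypothetical stand-in for struct x86_init_irqs: a single creation hook. */
struct init_irqs {
	struct domain *(*create_pci_msi_domain)(void);
};

static struct domain native_domain = { .name = "PCI-MSI" };
static struct domain xen_domain = { .name = "XEN-MSI" };

static struct domain *native_create(void)
{
	return &native_domain;
}

/* A platform specific replacement which can be installed before PCI init. */
static struct domain *xen_create(void)
{
	return &xen_domain;
}

/* Default ops. Platform code may overwrite the hook early during boot. */
static struct init_irqs init_irqs = {
	.create_pci_msi_domain = native_create,
};

static struct domain *default_domain;

/* Invoke whatever hook is installed and cache the resulting domain. */
static void create_default_domain(void)
{
	assert(init_irqs.create_pci_msi_domain != NULL);
	default_domain = init_irqs.create_pci_msi_domain();
}
```

The point of the indirection is that the caller in the PCI init path does not
need to know which flavour of domain it gets.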

^ permalink raw reply	[flat|nested] 71+ messages in thread

* [patch RFC 19/38] irqdomain/msi: Provide DOMAIN_BUS_VMD_MSI
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (17 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 18/38] x86/irq: Initialize PCI/MSI domain at PCI init time Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 20/38] PCI: vmd: Mark VMD irqdomain with DOMAIN_BUS_VMD_MSI Thomas Gleixner
                   ` (19 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: genirq-msi--Provide-DOMAIN_BUS_VMD_MSI.patch --]
[-- Type: text/plain, Size: 1302 bytes --]

PCI devices behind a VMD bus are not subject to interrupt remapping, but
the irq domain for VMD MSI cannot be distinguished from a regular PCI/MSI
irq domain.

Add a new domain bus token and allow it in the bus token check in
msi_check_reservation_mode() to keep the functionality the same once VMD
uses this token.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jon Derrick <jonathan.derrick@intel.com>
---
 include/linux/irqdomain.h |    1 +
 kernel/irq/msi.c          |    7 ++++++-
 2 files changed, 7 insertions(+), 1 deletion(-)

--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -84,6 +84,7 @@ enum irq_domain_bus_token {
 	DOMAIN_BUS_FSL_MC_MSI,
 	DOMAIN_BUS_TI_SCI_INTA_MSI,
 	DOMAIN_BUS_WAKEUP,
+	DOMAIN_BUS_VMD_MSI,
 };
 
 /**
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -370,8 +370,13 @@ static bool msi_check_reservation_mode(s
 {
 	struct msi_desc *desc;
 
-	if (domain->bus_token != DOMAIN_BUS_PCI_MSI)
+	switch (domain->bus_token) {
+	case DOMAIN_BUS_PCI_MSI:
+	case DOMAIN_BUS_VMD_MSI:
+		break;
+	default:
 		return false;
+	}
 
 	if (!(info->flags & MSI_FLAG_MUST_REACTIVATE))
 		return false;

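The token check can be tried in isolation. This is a reduced user-space
sketch under stated assumptions: enum bus_token is a shortened, hypothetical
copy of enum irq_domain_bus_token, and token_allows_reservation() only models
the first test in msi_check_reservation_mode(), not the flag and descriptor
checks which follow it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Shortened, hypothetical copy of enum irq_domain_bus_token. */
enum bus_token {
	BUS_ANY,
	BUS_PCI_MSI,
	BUS_WAKEUP,
	BUS_VMD_MSI,
};

/* Hypothetical stand-in for struct irq_domain. */
struct domain {
	enum bus_token bus_token;
};

/* Only regular PCI/MSI and VMD MSI domains pass the token check. */
static bool token_allows_reservation(const struct domain *domain)
{
	assert(domain != NULL);

	switch (domain->bus_token) {
	case BUS_PCI_MSI:
	case BUS_VMD_MSI:
		return true;
	default:
		return false;
	}
}
```

Using a switch instead of chained comparisons keeps the list of accepted
tokens easy to extend.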

^ permalink raw reply	[flat|nested] 71+ messages in thread

* [patch RFC 20/38] PCI: vmd: Mark VMD irqdomain with DOMAIN_BUS_VMD_MSI
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (18 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 19/38] irqdomain/msi: Provide DOMAIN_BUS_VMD_MSI Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-25 20:04   ` Bjorn Helgaas
  2020-08-21  0:24 ` [patch RFC 21/38] PCI: MSI: Provide pci_dev_has_special_msi_domain() helper Thomas Gleixner
                   ` (18 subsequent siblings)
  38 siblings, 1 reply; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: PCI--vmd--Mark-VMD-irqdomain-with-DOMAIN_BUS_VMD_PCI.patch --]
[-- Type: text/plain, Size: 1257 bytes --]

Devices on the VMD bus use their own MSI irq domain, but it is not
distinguishable from regular PCI/MSI irq domains. Such a distinction is
required to exclude VMD devices from getting the irq domain pointer set by
interrupt remapping.

Override the default bus token.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Jonathan Derrick <jonathan.derrick@intel.com>
Cc: linux-pci@vger.kernel.org
---
 drivers/pci/controller/vmd.c |    6 ++++++
 1 file changed, 6 insertions(+)

--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -579,6 +579,12 @@ static int vmd_enable_domain(struct vmd_
 		return -ENODEV;
 	}
 
+	/*
+	 * Override the irq domain bus token so the domain can be distinguished
+	 * from a regular PCI/MSI domain.
+	 */
+	irq_domain_update_bus_token(vmd->irq_domain, DOMAIN_BUS_VMD_MSI);
+
 	pci_add_resource(&resources, &vmd->resources[0]);
 	pci_add_resource_offset(&resources, &vmd->resources[1], offset[0]);
 	pci_add_resource_offset(&resources, &vmd->resources[2], offset[1]);


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [patch RFC 21/38] PCI: MSI: Provide pci_dev_has_special_msi_domain() helper
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (19 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 20/38] PCI: vmd: Mark VMD irqdomain with DOMAIN_BUS_VMD_MSI Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-25 20:16   ` Bjorn Helgaas
  2020-08-21  0:24 ` [patch RFC 22/38] x86/xen: Make xen_msi_init() static and rename it to xen_hvm_msi_init() Thomas Gleixner
                   ` (17 subsequent siblings)
  38 siblings, 1 reply; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: genirq-msi--Provide-pci_dev_has_special_msi_domain --]
[-- Type: text/plain, Size: 1849 bytes --]

Provide a helper function to check whether a PCI device is handled by a
non-standard PCI/MSI domain. This will be used to exclude devices which
hang off a special bus, e.g. VMD, from the irq domain override in irq
remapping.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
---
 drivers/pci/msi.c   |   22 ++++++++++++++++++++++
 include/linux/msi.h |    1 +
 2 files changed, 23 insertions(+)

--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1553,4 +1553,26 @@ struct irq_domain *pci_msi_get_device_do
 					     DOMAIN_BUS_PCI_MSI);
 	return dom;
 }
+
+/**
+ * pci_dev_has_special_msi_domain - Check whether the device is handled by
+ *				    a non-standard PCI-MSI domain
+ * @pdev:	The PCI device to check.
+ *
+ * Returns: True if the device irqdomain or the bus irqdomain is
+ * non-standard PCI/MSI.
+ */
+bool pci_dev_has_special_msi_domain(struct pci_dev *pdev)
+{
+	struct irq_domain *dom = dev_get_msi_domain(&pdev->dev);
+
+	if (!dom)
+		dom = dev_get_msi_domain(&pdev->bus->dev);
+
+	if (!dom)
+		return true;
+
+	return dom->bus_token != DOMAIN_BUS_PCI_MSI;
+}
+
 #endif /* CONFIG_PCI_MSI_IRQ_DOMAIN */
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -374,6 +374,7 @@ int pci_msi_domain_check_cap(struct irq_
 			     struct msi_domain_info *info, struct device *dev);
 u32 pci_msi_domain_get_msi_rid(struct irq_domain *domain, struct pci_dev *pdev);
 struct irq_domain *pci_msi_get_device_domain(struct pci_dev *pdev);
+bool pci_dev_has_special_msi_domain(struct pci_dev *pdev);
 #else
 static inline struct irq_domain *pci_msi_get_device_domain(struct pci_dev *pdev)
 {

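The lookup order of the helper (device domain first, then bus domain, and
"special" when neither exists) can be exercised with a small user-space
model. The types below are hypothetical stand-ins: struct device and struct
bus are reduced to a domain pointer each, which is not how the kernel lays
them out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum bus_token {
	BUS_PCI_MSI,
	BUS_VMD_MSI,
};

/* Hypothetical stand-in for struct irq_domain. */
struct domain {
	enum bus_token bus_token;
};

/* Hypothetical device model: device and bus may each carry a domain. */
struct bus {
	struct domain *msi_domain;
};

struct device {
	struct domain *msi_domain;
	struct bus *bus;
};

static bool has_special_msi_domain(const struct device *dev)
{
	const struct domain *dom;

	assert(dev != NULL && dev->bus != NULL);

	/* Prefer the device's own domain, fall back to the bus domain. */
	dom = dev->msi_domain;
	if (!dom)
		dom = dev->bus->msi_domain;

	/* No domain at all is treated as non-standard. */
	if (!dom)
		return true;

	return dom->bus_token != BUS_PCI_MSI;
}
```

Note that a device level domain takes precedence, so a device with a regular
PCI/MSI domain is reported as standard even when its bus domain is special.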

^ permalink raw reply	[flat|nested] 71+ messages in thread

* [patch RFC 22/38] x86/xen: Make xen_msi_init() static and rename it to xen_hvm_msi_init()
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (20 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 21/38] PCI: MSI: Provide pci_dev_has_special_msi_domain() helper Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-24  4:48   ` Jürgen Groß
  2020-08-21  0:24 ` [patch RFC 23/38] x86/xen: Rework MSI teardown Thomas Gleixner
                   ` (16 subsequent siblings)
  38 siblings, 1 reply; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stefano Stabellini,
	Stephen Hemminger, Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe,
	Megha Dey, xen-devel, Kevin Tian, Konrad Rzeszutek Wilk,
	Haiyang Zhang, Alex Williamson, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: x86-xen--Make-xen_msi_init --]
[-- Type: text/plain, Size: 1171 bytes --]

The only user is in the same file and the name is too generic because this
function is only ever used for HVM domains.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: linux-pci@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Cc: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>

---
 arch/x86/pci/xen.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -419,7 +419,7 @@ int __init pci_xen_init(void)
 }
 
 #ifdef CONFIG_PCI_MSI
-void __init xen_msi_init(void)
+static void __init xen_hvm_msi_init(void)
 {
 	if (!disable_apic) {
 		/*
@@ -459,7 +459,7 @@ int __init pci_xen_hvm_init(void)
 	 * We need to wait until after x2apic is initialized
 	 * before we can set MSI IRQ ops.
 	 */
-	x86_platform.apic_post_init = xen_msi_init;
+	x86_platform.apic_post_init = xen_hvm_msi_init;
 #endif
 	return 0;
 }


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [patch RFC 23/38] x86/xen: Rework MSI teardown
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (21 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 22/38] x86/xen: Make xen_msi_init() static and rename it to xen_hvm_msi_init() Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-24  5:09   ` Jürgen Groß
  2020-08-21  0:24 ` [patch RFC 24/38] x86/xen: Consolidate XEN-MSI init Thomas Gleixner
                   ` (15 subsequent siblings)
  38 siblings, 1 reply; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: x86-xen--Rework-XEN-MSI-management.patch --]
[-- Type: text/plain, Size: 2745 bytes --]

X86 cannot store the irq domain pointer in struct device without breaking
XEN because the irq domain pointer takes precedence over arch_*_msi_irqs()
fallbacks.

XEN's MSI teardown relies on default_teardown_msi_irqs() which invokes
arch_teardown_msi_irq(). default_teardown_msi_irqs() is a trivial iterator
over the MSI entries associated with a device.

Implement this loop in xen_teardown_msi_irqs() to prepare for removal of
the fallbacks for X86.

This is a preparatory step to wrap XEN MSI alloc/free into an irq domain
which in turn allows storing the irq domain pointer in struct device and
using the irq domain functions directly.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/pci/xen.c |   23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -376,20 +376,31 @@ static void xen_initdom_restore_msi_irqs
 static void xen_teardown_msi_irqs(struct pci_dev *dev)
 {
 	struct msi_desc *msidesc;
+	int i;
+
+	for_each_pci_msi_entry(msidesc, dev) {
+		if (msidesc->irq) {
+			for (i = 0; i < msidesc->nvec_used; i++)
+				xen_destroy_irq(msidesc->irq + i);
+		}
+	}
+}
+
+static void xen_pv_teardown_msi_irqs(struct pci_dev *dev)
+{
+	struct msi_desc *msidesc = first_pci_msi_entry(dev);
 
-	msidesc = first_pci_msi_entry(dev);
 	if (msidesc->msi_attrib.is_msix)
 		xen_pci_frontend_disable_msix(dev);
 	else
 		xen_pci_frontend_disable_msi(dev);
 
-	/* Free the IRQ's and the msidesc using the generic code. */
-	default_teardown_msi_irqs(dev);
+	xen_teardown_msi_irqs(dev);
 }
 
 static void xen_teardown_msi_irq(unsigned int irq)
 {
-	xen_destroy_irq(irq);
+	WARN_ON_ONCE(1);
 }
 
 #endif
@@ -412,7 +423,7 @@ int __init pci_xen_init(void)
 #ifdef CONFIG_PCI_MSI
 	x86_msi.setup_msi_irqs = xen_setup_msi_irqs;
 	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
-	x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
+	x86_msi.teardown_msi_irqs = xen_pv_teardown_msi_irqs;
 	pci_msi_ignore_mask = 1;
 #endif
 	return 0;
@@ -436,6 +447,7 @@ static void __init xen_hvm_msi_init(void
 	}
 
 	x86_msi.setup_msi_irqs = xen_hvm_setup_msi_irqs;
+	x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
 	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
 }
 #endif
@@ -472,6 +484,7 @@ int __init pci_xen_initial_domain(void)
 #ifdef CONFIG_PCI_MSI
 	x86_msi.setup_msi_irqs = xen_initdom_setup_msi_irqs;
 	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
+	x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
 	x86_msi.restore_msi_irqs = xen_initdom_restore_msi_irqs;
 	pci_msi_ignore_mask = 1;
 #endif

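The open coded teardown loop is simple enough to model in user space. The
sketch below makes several assumptions: struct msi_entry is a hypothetical
reduction of struct msi_desc to the two fields the loop reads, an array
replaces the per-device list walked by for_each_pci_msi_entry(), and
destroy_irq() merely counts calls where xen_destroy_irq() would release the
interrupt.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IRQ	64

/* Hypothetical reduction of struct msi_desc. */
struct msi_entry {
	unsigned int irq;	/* First Linux interrupt, 0 if unassigned */
	unsigned int nvec_used;	/* Number of consecutive interrupts */
};

/* Records how often each interrupt was destroyed. */
static int destroyed[MAX_IRQ];

static void destroy_irq(unsigned int irq)
{
	assert(irq < MAX_IRQ);
	destroyed[irq]++;
}

/* Destroy every interrupt of every entry which has one assigned. */
static void teardown_msi_irqs(const struct msi_entry *entries, size_t count)
{
	unsigned int i;
	size_t n;

	for (n = 0; n < count; n++) {
		if (!entries[n].irq)
			continue;
		for (i = 0; i < entries[n].nvec_used; i++)
			destroy_irq(entries[n].irq + i);
	}
}
```

Entries without an assigned interrupt are skipped, which matches the
msidesc->irq check in the patch.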

^ permalink raw reply	[flat|nested] 71+ messages in thread

* [patch RFC 24/38] x86/xen: Consolidate XEN-MSI init
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (22 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 23/38] x86/xen: Rework MSI teardown Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-24  4:59   ` Jürgen Groß
  2020-08-21  0:24 ` [patch RFC 25/38] irqdomain/msi: Allow to override msi_domain_alloc/free_irqs() Thomas Gleixner
                   ` (14 subsequent siblings)
  38 siblings, 1 reply; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: x86-xen--Consolidate-XEN-MSI-init.patch --]
[-- Type: text/plain, Size: 3150 bytes --]

X86 cannot store the irq domain pointer in struct device without breaking
XEN because the irq domain pointer takes precedence over arch_*_msi_irqs()
fallbacks.

To make storing the pointer possible, XEN MSI interrupt management needs
to be wrapped into an irq domain.

Move the x86_msi ops setup into a single function to prepare for this.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/pci/xen.c |   51 ++++++++++++++++++++++++++++++++-------------------
 1 file changed, 32 insertions(+), 19 deletions(-)

--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -371,7 +371,10 @@ static void xen_initdom_restore_msi_irqs
 		WARN(ret && ret != -ENOSYS, "restore_msi -> %d\n", ret);
 	}
 }
-#endif
+#else /* CONFIG_XEN_DOM0 */
+#define xen_initdom_setup_msi_irqs	NULL
+#define xen_initdom_restore_msi_irqs	NULL
+#endif /* !CONFIG_XEN_DOM0 */
 
 static void xen_teardown_msi_irqs(struct pci_dev *dev)
 {
@@ -403,7 +406,31 @@ static void xen_teardown_msi_irq(unsigne
 	WARN_ON_ONCE(1);
 }
 
-#endif
+static __init void xen_setup_pci_msi(void)
+{
+	if (xen_initial_domain()) {
+		x86_msi.setup_msi_irqs = xen_initdom_setup_msi_irqs;
+		x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
+		x86_msi.restore_msi_irqs = xen_initdom_restore_msi_irqs;
+		pci_msi_ignore_mask = 1;
+	} else if (xen_pv_domain()) {
+		x86_msi.setup_msi_irqs = xen_setup_msi_irqs;
+		x86_msi.teardown_msi_irqs = xen_pv_teardown_msi_irqs;
+		pci_msi_ignore_mask = 1;
+	} else if (xen_hvm_domain()) {
+		x86_msi.setup_msi_irqs = xen_hvm_setup_msi_irqs;
+		x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
+	} else {
+		WARN_ON_ONCE(1);
+		return;
+	}
+
+	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
+}
+
+#else /* CONFIG_PCI_MSI */
+static inline void xen_setup_pci_msi(void) { }
+#endif /* CONFIG_PCI_MSI */
 
 int __init pci_xen_init(void)
 {
@@ -420,12 +447,7 @@ int __init pci_xen_init(void)
 	/* Keep ACPI out of the picture */
 	acpi_noirq_set();
 
-#ifdef CONFIG_PCI_MSI
-	x86_msi.setup_msi_irqs = xen_setup_msi_irqs;
-	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
-	x86_msi.teardown_msi_irqs = xen_pv_teardown_msi_irqs;
-	pci_msi_ignore_mask = 1;
-#endif
+	xen_setup_pci_msi();
 	return 0;
 }
 
@@ -445,10 +467,7 @@ static void __init xen_hvm_msi_init(void
 		    ((eax & XEN_HVM_CPUID_APIC_ACCESS_VIRT) && boot_cpu_has(X86_FEATURE_APIC)))
 			return;
 	}
-
-	x86_msi.setup_msi_irqs = xen_hvm_setup_msi_irqs;
-	x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
-	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
+	xen_setup_pci_msi();
 }
 #endif
 
@@ -481,13 +500,7 @@ int __init pci_xen_initial_domain(void)
 {
 	int irq;
 
-#ifdef CONFIG_PCI_MSI
-	x86_msi.setup_msi_irqs = xen_initdom_setup_msi_irqs;
-	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
-	x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
-	x86_msi.restore_msi_irqs = xen_initdom_restore_msi_irqs;
-	pci_msi_ignore_mask = 1;
-#endif
+	xen_setup_pci_msi();
 	__acpi_register_gsi = acpi_register_gsi_xen;
 	__acpi_unregister_gsi = NULL;
 	/*


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [patch RFC 25/38] irqdomain/msi: Allow to override msi_domain_alloc/free_irqs()
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (23 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 24/38] x86/xen: Consolidate XEN-MSI init Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 26/38] x86/xen: Wrap XEN MSI management into irqdomain Thomas Gleixner
                   ` (13 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: irqdomain-msi--Allow-to-override-msi_domain_alloc-free_irqs--.patch --]
[-- Type: text/plain, Size: 7698 bytes --]

To support MSI irq domains which do not fit at all into the regular MSI
irqdomain scheme, like the XEN MSI interrupt management for PV/HVM/DOM0,
it's necessary to allow overriding the alloc/free implementation.

This is a preparatory step to switch X86 away from arch_*_msi_irqs() and
store the irq domain pointer right in struct device.

No functional change for existing MSI irq domain users.

Aside from the evil XEN wrapper this is also useful for special MSI domains
which need to do extra alloc/free work before/after calling the generic
core function, such as allocating/freeing MSI descriptors, MSI storage
space etc.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <maz@kernel.org>
---
 include/linux/msi.h |   27 ++++++++++++++++++++
 kernel/irq/msi.c    |   70 +++++++++++++++++++++++++++++++++++-----------------
 2 files changed, 75 insertions(+), 22 deletions(-)

--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -241,6 +241,10 @@ struct msi_domain_info;
  * @msi_finish:		Optional callback to finalize the allocation
  * @set_desc:		Set the msi descriptor for an interrupt
  * @handle_error:	Optional error handler if the allocation fails
+ * @domain_alloc_irqs:	Optional function to override the default allocation
+ *			function.
+ * @domain_free_irqs:	Optional function to override the default free
+ *			function.
  *
  * @get_hwirq, @msi_init and @msi_free are callbacks used by
  * msi_create_irq_domain() and related interfaces
@@ -248,6 +252,22 @@ struct msi_domain_info;
  * @msi_check, @msi_prepare, @msi_finish, @set_desc and @handle_error
  * are callbacks used by msi_domain_alloc_irqs() and related
  * interfaces which are based on msi_desc.
+ *
+ * @domain_alloc_irqs, @domain_free_irqs can be used to override the
+ * default allocation/free functions (__msi_domain_alloc/free_irqs). This
+ * is initially for a wrapper around XEN's separate MSI universe which can't
+ * be wrapped into the regular irq domains concepts by mere mortals.  This
+ * allows to universally use msi_domain_alloc/free_irqs without having to
+ * special case XEN all over the place.
+ *
+ * Contrary to other operations @domain_alloc_irqs and @domain_free_irqs
+ * are set to the default implementation if NULL and even when
+ * MSI_FLAG_USE_DEF_DOM_OPS is not set to avoid breaking existing users and
+ * because these callbacks are obviously mandatory.
+ *
+ * This is NOT meant to be abused, but it can be useful to build wrappers
+ * for specialized MSI irq domains which need extra work before and after
+ * calling __msi_domain_alloc_irqs()/__msi_domain_free_irqs().
  */
 struct msi_domain_ops {
 	irq_hw_number_t	(*get_hwirq)(struct msi_domain_info *info,
@@ -270,6 +290,10 @@ struct msi_domain_ops {
 				    struct msi_desc *desc);
 	int		(*handle_error)(struct irq_domain *domain,
 					struct msi_desc *desc, int error);
+	int		(*domain_alloc_irqs)(struct irq_domain *domain,
+					     struct device *dev, int nvec);
+	void		(*domain_free_irqs)(struct irq_domain *domain,
+					    struct device *dev);
 };
 
 /**
@@ -327,8 +351,11 @@ int msi_domain_set_affinity(struct irq_d
 struct irq_domain *msi_create_irq_domain(struct fwnode_handle *fwnode,
 					 struct msi_domain_info *info,
 					 struct irq_domain *parent);
+int __msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
+			    int nvec);
 int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
 			  int nvec);
+void __msi_domain_free_irqs(struct irq_domain *domain, struct device *dev);
 void msi_domain_free_irqs(struct irq_domain *domain, struct device *dev);
 struct msi_domain_info *msi_get_domain_info(struct irq_domain *domain);
 
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -229,11 +229,13 @@ static int msi_domain_ops_check(struct i
 }
 
 static struct msi_domain_ops msi_domain_ops_default = {
-	.get_hwirq	= msi_domain_ops_get_hwirq,
-	.msi_init	= msi_domain_ops_init,
-	.msi_check	= msi_domain_ops_check,
-	.msi_prepare	= msi_domain_ops_prepare,
-	.set_desc	= msi_domain_ops_set_desc,
+	.get_hwirq		= msi_domain_ops_get_hwirq,
+	.msi_init		= msi_domain_ops_init,
+	.msi_check		= msi_domain_ops_check,
+	.msi_prepare		= msi_domain_ops_prepare,
+	.set_desc		= msi_domain_ops_set_desc,
+	.domain_alloc_irqs	= __msi_domain_alloc_irqs,
+	.domain_free_irqs	= __msi_domain_free_irqs,
 };
 
 static void msi_domain_update_dom_ops(struct msi_domain_info *info)
@@ -245,6 +247,14 @@ static void msi_domain_update_dom_ops(st
 		return;
 	}
 
+	if (ops->domain_alloc_irqs == NULL)
+		ops->domain_alloc_irqs = msi_domain_ops_default.domain_alloc_irqs;
+	if (ops->domain_free_irqs == NULL)
+		ops->domain_free_irqs = msi_domain_ops_default.domain_free_irqs;
+
+	if (!(info->flags & MSI_FLAG_USE_DEF_DOM_OPS))
+		return;
+
 	if (ops->get_hwirq == NULL)
 		ops->get_hwirq = msi_domain_ops_default.get_hwirq;
 	if (ops->msi_init == NULL)
@@ -278,8 +288,7 @@ struct irq_domain *msi_create_irq_domain
 {
 	struct irq_domain *domain;
 
-	if (info->flags & MSI_FLAG_USE_DEF_DOM_OPS)
-		msi_domain_update_dom_ops(info);
+	msi_domain_update_dom_ops(info);
 	if (info->flags & MSI_FLAG_USE_DEF_CHIP_OPS)
 		msi_domain_update_chip_ops(info);
 
@@ -386,17 +395,8 @@ static bool msi_check_reservation_mode(s
 	return desc->msi_attrib.is_msix || desc->msi_attrib.maskbit;
 }
 
-/**
- * msi_domain_alloc_irqs - Allocate interrupts from a MSI interrupt domain
- * @domain:	The domain to allocate from
- * @dev:	Pointer to device struct of the device for which the interrupts
- *		are allocated
- * @nvec:	The number of interrupts to allocate
- *
- * Returns 0 on success or an error code.
- */
-int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
-			  int nvec)
+int __msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
+			    int nvec)
 {
 	struct msi_domain_info *info = domain->host_data;
 	struct msi_domain_ops *ops = info->ops;
@@ -490,12 +490,24 @@ int msi_domain_alloc_irqs(struct irq_dom
 }
 
 /**
- * msi_domain_free_irqs - Free interrupts from a MSI interrupt @domain associated tp @dev
- * @domain:	The domain to managing the interrupts
+ * msi_domain_alloc_irqs - Allocate interrupts from a MSI interrupt domain
+ * @domain:	The domain to allocate from
  * @dev:	Pointer to device struct of the device for which the interrupts
- *		are free
+ *		are allocated
+ * @nvec:	The number of interrupts to allocate
+ *
+ * Returns 0 on success or an error code.
  */
-void msi_domain_free_irqs(struct irq_domain *domain, struct device *dev)
+int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
+			  int nvec)
+{
+	struct msi_domain_info *info = domain->host_data;
+	struct msi_domain_ops *ops = info->ops;
+
+	return ops->domain_alloc_irqs(domain, dev, nvec);
+}
+
+void __msi_domain_free_irqs(struct irq_domain *domain, struct device *dev)
 {
 	struct msi_desc *desc;
 
@@ -513,6 +525,20 @@ void msi_domain_free_irqs(struct irq_dom
 }
 
 /**
+ * msi_domain_free_irqs - Free interrupts from a MSI interrupt @domain associated to @dev
+ * @domain:	The domain managing the interrupts
+ * @dev:	Pointer to device struct of the device for which the interrupts
+ *		are freed
+ */
+void msi_domain_free_irqs(struct irq_domain *domain, struct device *dev)
+{
+	struct msi_domain_info *info = domain->host_data;
+	struct msi_domain_ops *ops = info->ops;
+
+	ops->domain_free_irqs(domain, dev);
+}
+
+/**
  * msi_get_domain_info - Get the MSI interrupt domain info for @domain
  * @domain:	The interrupt domain to retrieve data from
  *

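The interplay of default fill-in and dispatch can be modelled in user space.
This is a sketch under stated assumptions, not the kernel implementation:
struct domain_ops is a hypothetical two-callback reduction of
struct msi_domain_ops, FLAG_USE_DEF_DOM_OPS stands in for
MSI_FLAG_USE_DEF_DOM_OPS, default_alloc_irqs() replaces
__msi_domain_alloc_irqs() with a trivial check, and wrapper_alloc_irqs() is
an invented example of a domain which does extra work around the default.

```c
#include <assert.h>
#include <stddef.h>

#define FLAG_USE_DEF_DOM_OPS	0x1u

struct domain;

/* Hypothetical two-callback reduction of struct msi_domain_ops. */
struct domain_ops {
	int (*get_hwirq)(void);
	int (*alloc_irqs)(struct domain *d, int nvec);
};

struct domain_info {
	unsigned int flags;
	struct domain_ops *ops;
};

struct domain {
	struct domain_info *info;
};

static int default_get_hwirq(void)
{
	return 0;
}

/* Trivial stand-in for the generic allocation: succeeds for nvec > 0. */
static int default_alloc_irqs(struct domain *d, int nvec)
{
	(void)d;
	return nvec > 0 ? 0 : -1;
}

static struct domain_ops default_ops = {
	.get_hwirq	= default_get_hwirq,
	.alloc_irqs	= default_alloc_irqs,
};

static void update_dom_ops(struct domain_info *info)
{
	struct domain_ops *ops;

	assert(info != NULL);

	ops = info->ops;
	if (!ops) {
		info->ops = &default_ops;
		return;
	}

	/* The mandatory callback is filled in regardless of the flag. */
	if (!ops->alloc_irqs)
		ops->alloc_irqs = default_ops.alloc_irqs;

	if (!(info->flags & FLAG_USE_DEF_DOM_OPS))
		return;

	if (!ops->get_hwirq)
		ops->get_hwirq = default_ops.get_hwirq;
}

/* Public entry point: always dispatches through the ops. */
static int domain_alloc_irqs(struct domain *d, int nvec)
{
	return d->info->ops->alloc_irqs(d, nvec);
}

/* Invented wrapper: bookkeeping before calling the default allocation. */
static int wrapped_calls;

static int wrapper_alloc_irqs(struct domain *d, int nvec)
{
	wrapped_calls++;
	return default_alloc_irqs(d, nvec);
}
```

Because the mandatory callback is always populated, the public entry point
can call through the ops unconditionally without a NULL check.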

^ permalink raw reply	[flat|nested] 71+ messages in thread

* [patch RFC 26/38] x86/xen: Wrap XEN MSI management into irqdomain
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (24 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 25/38] irqdomain/msi: Allow to override msi_domain_alloc/free_irqs() Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-24  6:21   ` Jürgen Groß
  2020-08-21  0:24 ` [patch RFC 27/38] iommm/vt-d: Store irq domain in struct device Thomas Gleixner
                   ` (12 subsequent siblings)
  38 siblings, 1 reply; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: x86-xen--Wrap-XEN-MSI-management-into-irqdomain.patch --]
[-- Type: text/plain, Size: 3100 bytes --]

To allow utilizing the irq domain pointer in struct device it is necessary
to make XEN/MSI irq domain compatible.

While the right solution would be to truly convert XEN to irq domains, this
is an exercise which is not possible for mere mortals with limited XENology.

Provide a plain irqdomain wrapper around XEN. While this is a blatant
violation of the irqdomain design, it's the only solution for a XEN ignorant
person to make progress on the issue which triggered this change.
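
The wrapper idea can be modelled in plain user-space C. What follows is
only a sketch under stated assumptions: every toy_*, wrap_* and legacy_*
name is a hypothetical stand-in rather than a kernel type or function,
and the "legacy" backend merely records what it was asked to do. It
shows callers going through the domain ops while the callbacks forward
to the old setup/teardown style interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are kernel types. */
struct toy_dev {
	int is_pci;	/* models dev_is_pci() */
	int is_msix;	/* models msi_attrib.is_msix of the first MSI entry */
};

struct toy_domain;

struct toy_msi_ops {
	int  (*domain_alloc_irqs)(struct toy_domain *d, struct toy_dev *dev, int nvec);
	void (*domain_free_irqs)(struct toy_domain *d, struct toy_dev *dev);
};

struct toy_domain {
	const struct toy_msi_ops *ops;
};

/* Same numeric values as PCI_CAP_ID_MSI and PCI_CAP_ID_MSIX. */
enum { TOY_CAP_MSI = 0x05, TOY_CAP_MSIX = 0x11 };

/* Legacy backend (assumption): only records the last request. */
static int legacy_last_type;
static int legacy_active_vectors;

static int legacy_setup(struct toy_dev *dev, int nvec, int type)
{
	(void)dev;
	legacy_last_type = type;
	legacy_active_vectors += nvec;
	return 0;
}

static void legacy_teardown(struct toy_dev *dev)
{
	(void)dev;
	legacy_active_vectors = 0;
}

/* Wrapper callbacks: translate the domain interface to the legacy one. */
static int wrap_alloc(struct toy_domain *d, struct toy_dev *dev, int nvec)
{
	(void)d;
	if (!dev->is_pci)
		return -EINVAL;
	return legacy_setup(dev, nvec, dev->is_msix ? TOY_CAP_MSIX : TOY_CAP_MSI);
}

static void wrap_free(struct toy_domain *d, struct toy_dev *dev)
{
	(void)d;
	if (!dev->is_pci)
		return;
	legacy_teardown(dev);
}

static const struct toy_msi_ops wrap_ops = {
	.domain_alloc_irqs = wrap_alloc,
	.domain_free_irqs  = wrap_free,
};

/* Generic entry points: callers only ever go through the domain ops. */
static int toy_domain_alloc_irqs(struct toy_domain *d, struct toy_dev *dev, int nvec)
{
	assert(d && d->ops && d->ops->domain_alloc_irqs);
	return d->ops->domain_alloc_irqs(d, dev, nvec);
}

static void toy_domain_free_irqs(struct toy_domain *d, struct toy_dev *dev)
{
	assert(d && d->ops && d->ops->domain_free_irqs);
	d->ops->domain_free_irqs(d, dev);
}
```

The point of the model is that the caller never needs to know whether a
real hierarchical domain or a thin wrapper sits behind the ops pointer.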

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-pci@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
---
Note: This is completely untested, but it compiles so it must be perfect.
---
 arch/x86/pci/xen.c |   63 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -406,6 +406,63 @@ static void xen_teardown_msi_irq(unsigne
 	WARN_ON_ONCE(1);
 }
 
+static int xen_msi_domain_alloc_irqs(struct irq_domain *domain,
+				     struct device *dev,  int nvec)
+{
+	int type;
+
+	if (WARN_ON_ONCE(!dev_is_pci(dev)))
+		return -EINVAL;
+
+	if (first_msi_entry(dev)->msi_attrib.is_msix)
+		type = PCI_CAP_ID_MSIX;
+	else
+		type = PCI_CAP_ID_MSI;
+
+	return x86_msi.setup_msi_irqs(to_pci_dev(dev), nvec, type);
+}
+
+static void xen_msi_domain_free_irqs(struct irq_domain *domain,
+				     struct device *dev)
+{
+	if (WARN_ON_ONCE(!dev_is_pci(dev)))
+		return;
+
+	x86_msi.teardown_msi_irqs(to_pci_dev(dev));
+}
+
+static struct msi_domain_ops xen_pci_msi_domain_ops = {
+	.domain_alloc_irqs	= xen_msi_domain_alloc_irqs,
+	.domain_free_irqs	= xen_msi_domain_free_irqs,
+};
+
+static struct msi_domain_info xen_pci_msi_domain_info = {
+	.ops			= &xen_pci_msi_domain_ops,
+};
+
+/*
+ * This irq domain is a blatant violation of the irq domain design, but
+ * disentangling XEN into real irq domains is not a job for mere mortals with
+ * limited XENology. But it's the least dangerous way for a mere mortal to
+ * get rid of the arch_*_msi_irqs() hackery in order to store the irq
+ * domain pointer in struct device. This irq domain wrappery allows to do
+ * that without breaking XEN terminally.
+ */
+static __init struct irq_domain *xen_create_pci_msi_domain(void)
+{
+	struct irq_domain *d = NULL;
+	struct fwnode_handle *fn;
+
+	fn = irq_domain_alloc_named_fwnode("XEN-MSI");
+	if (fn)
+		d = msi_create_irq_domain(fn, &xen_pci_msi_domain_info, NULL);
+
+	/* FIXME: No idea how to survive if this fails */
+	BUG_ON(!d);
+
+	return d;
+}
+
 static __init void xen_setup_pci_msi(void)
 {
 	if (xen_initial_domain()) {
@@ -426,6 +483,12 @@ static __init void xen_setup_pci_msi(voi
 	}
 
 	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
+
+	/*
+	 * Override the PCI/MSI irq domain init function. No point
+	 * in allocating the native domain and never use it.
+	 */
+	x86_init.irqs.create_pci_msi_domain = xen_create_pci_msi_domain;
 }
 
 #else /* CONFIG_PCI_MSI */

* [patch RFC 27/38] iommu/vt-d: Store irq domain in struct device
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (25 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 26/38] x86/xen: Wrap XEN MSI management into irqdomain Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 28/38] iommu/amd: " Thomas Gleixner
                   ` (11 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: iommm-vt-d--Store-irq-domain-in-struct-device.patch --]
[-- Type: text/plain, Size: 2662 bytes --]

As a first step to make X86 utilize the direct MSI irq domain operations,
store the irq domain pointer in the device struct when a device is probed.

This is done from dmar_pci_bus_add_dev() because it has to work even when
DMA remapping is disabled. It only overrides the irqdomain of devices which
are handled by a regular PCI/MSI irq domain. This protects PCI devices
behind special buses like VMD, which have their own irq domain.

No functional change. It just avoids the redirection through
arch_*_msi_irqs() and allows the PCI/MSI core to directly invoke the irq
domain alloc/free functions instead of having to look up the irq domain for
every single MSI interrupt.
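
The override rule described above can be sketched as a small user-space
model. This is an illustration under assumptions, not the driver code:
the toy_* names are hypothetical, the global flag stands in for
irq_remapping_enabled, and the boolean member stands in for what
pci_dev_has_special_msi_domain() reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects. */
struct toy_domain {
	const char *name;
};

struct toy_dev {
	struct toy_domain *msi_domain;	/* models the pointer in struct device */
	bool special_msi_domain;	/* e.g. a device behind VMD */
};

/* Models irq_remapping_enabled. */
static bool toy_irq_remapping_enabled;

/*
 * Store the remapping domain in the device, but only when remapping is
 * active and the device is not owned by a special MSI domain.
 */
static void toy_remap_add_device(struct toy_dev *dev, struct toy_domain *remap)
{
	assert(dev != NULL);

	if (!toy_irq_remapping_enabled || dev->special_msi_domain)
		return;

	dev->msi_domain = remap;
}
```

With remapping disabled the call is a no-op, which matches the "no
functional change" claim for that configuration.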

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
Cc: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel/dmar.c          |    3 +++
 drivers/iommu/intel/irq_remapping.c |   16 ++++++++++++++++
 include/linux/intel-iommu.h         |    5 +++++
 3 files changed, 24 insertions(+)

--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -316,6 +316,9 @@ static int dmar_pci_bus_add_dev(struct d
 	if (ret < 0 && dmar_dev_scope_status == 0)
 		dmar_dev_scope_status = ret;
 
+	if (ret >= 0)
+		intel_irq_remap_add_device(info);
+
 	return ret;
 }
 
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -1086,6 +1086,22 @@ static int reenable_irq_remapping(int ei
 	return -1;
 }
 
+/*
+ * Store the MSI remapping domain pointer in the device if enabled.
+ *
+ * This is called from dmar_pci_bus_add_dev() so it works even when DMA
+ * remapping is disabled. Only update the pointer if the device is not
+ * already handled by a non default PCI/MSI interrupt domain. This protects
+ * e.g. VMD devices.
+ */
+void intel_irq_remap_add_device(struct dmar_pci_notify_info *info)
+{
+	if (!irq_remapping_enabled || pci_dev_has_special_msi_domain(info->dev))
+		return;
+
+	dev_set_msi_domain(&info->dev->dev, map_dev_to_ir(info->dev));
+}
+
 static void prepare_irte(struct irte *irte, int vector, unsigned int dest)
 {
 	memset(irte, 0, sizeof(*irte));
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -439,6 +439,11 @@ struct ir_table {
 	struct irte *base;
 	unsigned long *bitmap;
 };
+
+void intel_irq_remap_add_device(struct dmar_pci_notify_info *info);
+#else
+static inline void
+intel_irq_remap_add_device(struct dmar_pci_notify_info *info) { }
 #endif
 
 struct iommu_flush {

* [patch RFC 28/38] iommu/amd: Store irq domain in struct device
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (26 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 27/38] iommu/vt-d: Store irq domain in struct device Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 29/38] x86/pci: Set default irq domain in pcibios_add_device() Thomas Gleixner
                   ` (10 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: iommm-amd--Store-irq-domain-in-struct-device.patch --]
[-- Type: text/plain, Size: 1833 bytes --]

As the next step to make X86 utilize the direct MSI irq domain operations,
store the irq domain pointer in the device struct when a device is probed.

It only overrides the irqdomain of devices which are handled by a regular
PCI/MSI irq domain. This protects PCI devices behind special buses like
VMD, which have their own irq domain.

No functional change.

It just avoids the redirection through arch_*_msi_irqs() and allows the
PCI/MSI core to directly invoke the irq domain alloc/free functions instead
of having to look up the irq domain for every single MSI interrupt.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
---
 drivers/iommu/amd/iommu.c |   17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -729,7 +729,21 @@ static void iommu_poll_ga_log(struct amd
 		}
 	}
 }
-#endif /* CONFIG_IRQ_REMAP */
+
+static void
+amd_iommu_set_pci_msi_domain(struct device *dev, struct amd_iommu *iommu)
+{
+	if (!irq_remapping_enabled || !dev_is_pci(dev) ||
+	    pci_dev_has_special_msi_domain(to_pci_dev(dev)))
+		return;
+
+	dev_set_msi_domain(dev, iommu->msi_domain);
+}
+
+#else /* CONFIG_IRQ_REMAP */
+static inline void
+amd_iommu_set_pci_msi_domain(struct device *dev, struct amd_iommu *iommu) { }
+#endif /* !CONFIG_IRQ_REMAP */
 
 #define AMD_IOMMU_INT_MASK	\
 	(MMIO_STATUS_EVT_INT_MASK | \
@@ -2157,6 +2171,7 @@ static struct iommu_device *amd_iommu_pr
 		iommu_dev = ERR_PTR(ret);
 		iommu_ignore_device(dev);
 	} else {
+		amd_iommu_set_pci_msi_domain(dev, iommu);
 		iommu_dev = &iommu->iommu;
 	}
 

* [patch RFC 29/38] x86/pci: Set default irq domain in pcibios_add_device()
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (27 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 28/38] iommu/amd: " Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 30/38] PCI/MSI: Allow to disable arch fallbacks Thomas Gleixner
                   ` (9 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: x86-pci--Set-default-irq-domain-in-pcibios_add_device.patch --]
[-- Type: text/plain, Size: 3208 bytes --]

Now that interrupt remapping sets the irqdomain pointer when a PCI device
is added, it's possible to store the default irq domain in the device struct
in pcibios_add_device().

If the bus to which a device is connected has an irq domain associated, then
this domain is used. Otherwise the default domain (PCI/MSI native or XEN
PCI/MSI) is used. Using the bus domain ensures that special MSI bus domains
like VMD work.

This makes XEN and the non-remapped native case work solely based on the
irq domain pointer in struct device for PCI/MSI. It allows removing the
arch fallback and making most of the x86_msi ops private to XEN in the next
steps.
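
The selection rule (bus domain first, default domain otherwise) can be
modelled in a few lines of user-space C. This is a sketch only: the
toy_* types are hypothetical, and toy_default_domain merely plays the
role of the default PCI/MSI domain pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects. */
struct toy_domain {
	const char *name;
};

struct toy_bus {
	struct toy_domain *msi_domain;	/* NULL unless the bus has its own */
};

struct toy_dev {
	struct toy_bus *bus;
	struct toy_domain *msi_domain;
};

/* Plays the role of the default PCI/MSI domain (native or XEN). */
static struct toy_domain *toy_default_domain;

/* Pick the initial MSI domain when a device is added. */
static void toy_add_device(struct toy_dev *dev)
{
	struct toy_domain *msidom;

	assert(dev != NULL && dev->bus != NULL);

	msidom = dev->bus->msi_domain;
	if (!msidom)
		msidom = toy_default_domain;
	dev->msi_domain = msidom;
}
```

A later step, such as an interrupt remapping unit claiming the device,
is free to overwrite the pointer chosen here.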

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-pci@vger.kernel.org
---
 arch/x86/include/asm/irqdomain.h |    2 ++
 arch/x86/kernel/apic/msi.c       |    2 +-
 arch/x86/pci/common.c            |   18 +++++++++++++++++-
 3 files changed, 20 insertions(+), 2 deletions(-)

--- a/arch/x86/include/asm/irqdomain.h
+++ b/arch/x86/include/asm/irqdomain.h
@@ -53,9 +53,11 @@ extern int mp_irqdomain_ioapic_idx(struc
 #ifdef CONFIG_PCI_MSI
 void x86_create_pci_msi_domain(void);
 struct irq_domain *native_create_pci_msi_domain(void);
+extern struct irq_domain *x86_pci_msi_default_domain;
 #else
 static inline void x86_create_pci_msi_domain(void) { }
 #define native_create_pci_msi_domain	NULL
+#define x86_pci_msi_default_domain	NULL
 #endif
 
 #endif
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -21,7 +21,7 @@
 #include <asm/apic.h>
 #include <asm/irq_remapping.h>
 
-static struct irq_domain *x86_pci_msi_default_domain __ro_after_init;
+struct irq_domain *x86_pci_msi_default_domain __ro_after_init;
 
 static void __irq_msi_compose_msg(struct irq_cfg *cfg, struct msi_msg *msg)
 {
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -19,6 +19,7 @@
 #include <asm/smp.h>
 #include <asm/pci_x86.h>
 #include <asm/setup.h>
+#include <asm/irqdomain.h>
 
 unsigned int pci_probe = PCI_PROBE_BIOS | PCI_PROBE_CONF1 | PCI_PROBE_CONF2 |
 				PCI_PROBE_MMCONF;
@@ -633,8 +634,9 @@ static void set_dev_domain_options(struc
 
 int pcibios_add_device(struct pci_dev *dev)
 {
-	struct setup_data *data;
 	struct pci_setup_rom *rom;
+	struct irq_domain *msidom;
+	struct setup_data *data;
 	u64 pa_data;
 
 	pa_data = boot_params.hdr.setup_data;
@@ -661,6 +663,20 @@ int pcibios_add_device(struct pci_dev *d
 		memunmap(data);
 	}
 	set_dev_domain_options(dev);
+
+	/*
+	 * Setup the initial MSI domain of the device. If the underlying
+	 * bus has a PCI/MSI irqdomain associated use the bus domain,
+	 * otherwise set the default domain. This ensures that special irq
+	 * domains e.g. VMD are preserved. The default ensures initial
+	 * operation if irq remapping is not active. If irq remapping is
+	 * active it will overwrite the domain pointer when the device is
+	 * associated to a remapping domain.
+	 */
+	msidom = dev_get_msi_domain(&dev->bus->dev);
+	if (!msidom)
+		msidom = x86_pci_msi_default_domain;
+	dev_set_msi_domain(&dev->dev, msidom);
 	return 0;
 }
 

* [patch RFC 30/38] PCI/MSI: Allow to disable arch fallbacks
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (28 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 29/38] x86/pci: Set default irq domain in pcibios_add_device() Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-25 20:07   ` Bjorn Helgaas
  2020-08-21  0:24 ` [patch RFC 31/38] x86/irq: Cleanup the arch_*_msi_irqs() leftovers Thomas Gleixner
                   ` (8 subsequent siblings)
  38 siblings, 1 reply; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: PCI-MSI--Allow-to-disable-arch-fallbacks.patch --]
[-- Type: text/plain, Size: 3058 bytes --]

If an architecture does not require the MSI setup/teardown fallback
functions, then allow them to be replaced by stub functions which emit a
warning.
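
The compile-time switch can be illustrated with a stand-alone model. It
is a sketch under assumptions: TOY_DISABLE_ARCH_FALLBACKS is a
hypothetical macro standing in for the new Kconfig symbol, and
TOY_WARN_ONCE only approximates the once-per-call-site behaviour of
WARN_ON_ONCE().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical switch modelling CONFIG_PCI_MSI_DISABLE_ARCH_FALLBACKS. */
#define TOY_DISABLE_ARCH_FALLBACKS 1

static int toy_warnings;	/* number of warnings actually emitted */

/* Rough model of WARN_ON_ONCE(1): warn at most once per call site. */
#define TOY_WARN_ONCE(msg)						\
	do {								\
		static int warned;					\
		if (!warned) {						\
			warned = 1;					\
			toy_warnings++;					\
			fprintf(stderr, "WARNING: %s\n", msg);		\
		}							\
	} while (0)

#ifndef TOY_DISABLE_ARCH_FALLBACKS
/* Fallbacks wanted: real implementations would be provided elsewhere. */
int toy_arch_setup_msi_irqs(int nvec);
void toy_arch_teardown_msi_irqs(void);
#else
/* Fallbacks disabled: any caller that still ends up here is a bug. */
static inline int toy_arch_setup_msi_irqs(int nvec)
{
	assert(nvec > 0);
	TOY_WARN_ONCE("arch MSI setup fallback invoked");
	return -ENODEV;
}

static inline void toy_arch_teardown_msi_irqs(void)
{
	TOY_WARN_ONCE("arch MSI teardown fallback invoked");
}
#endif
```

Keeping the stubs rather than dropping the symbols means stray callers
still link and fail loudly at run time.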

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
---
 drivers/pci/Kconfig |    3 +++
 drivers/pci/msi.c   |    3 ++-
 include/linux/msi.h |   31 ++++++++++++++++++++++++++-----
 3 files changed, 31 insertions(+), 6 deletions(-)

--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -56,6 +56,9 @@ config PCI_MSI_IRQ_DOMAIN
 	depends on PCI_MSI
 	select GENERIC_MSI_IRQ_DOMAIN
 
+config PCI_MSI_DISABLE_ARCH_FALLBACKS
+	bool
+
 config PCI_QUIRKS
 	default y
 	bool "Enable PCI quirk workarounds" if EXPERT
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -58,8 +58,8 @@ static void pci_msi_teardown_msi_irqs(st
 #define pci_msi_teardown_msi_irqs	arch_teardown_msi_irqs
 #endif
 
+#ifndef CONFIG_PCI_MSI_DISABLE_ARCH_FALLBACKS
 /* Arch hooks */
-
 int __weak arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
 {
 	struct msi_controller *chip = dev->bus->msi;
@@ -132,6 +132,7 @@ void __weak arch_teardown_msi_irqs(struc
 {
 	return default_teardown_msi_irqs(dev);
 }
+#endif /* !CONFIG_PCI_MSI_DISABLE_ARCH_FALLBACKS */
 
 static void default_restore_msi_irq(struct pci_dev *dev, int irq)
 {
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -193,17 +193,38 @@ void pci_msi_mask_irq(struct irq_data *d
 void pci_msi_unmask_irq(struct irq_data *data);
 
 /*
- * The arch hooks to setup up msi irqs. Those functions are
- * implemented as weak symbols so that they /can/ be overriden by
- * architecture specific code if needed.
+ * The arch hooks to set up MSI irqs. Default functions are implemented
+ * as weak symbols so that they /can/ be overridden by architecture specific
+ * code if needed.
+ *
+ * They can be replaced by stubs with warnings via
+ * CONFIG_PCI_MSI_DISABLE_ARCH_FALLBACKS when the architecture fully
+ * utilizes direct irqdomain based setup.
  */
+#ifndef CONFIG_PCI_MSI_DISABLE_ARCH_FALLBACKS
 int arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc);
 void arch_teardown_msi_irq(unsigned int irq);
 int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type);
 void arch_teardown_msi_irqs(struct pci_dev *dev);
-void arch_restore_msi_irqs(struct pci_dev *dev);
-
 void default_teardown_msi_irqs(struct pci_dev *dev);
+#else
+static inline int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
+{
+	WARN_ON_ONCE(1);
+	return -ENODEV;
+}
+
+static inline void arch_teardown_msi_irqs(struct pci_dev *dev)
+{
+	WARN_ON_ONCE(1);
+}
+#endif
+
+/*
+ * The restore hooks are still available as they are useful even
+ * for fully irq domain based setups. Courtesy to XEN/X86.
+ */
+void arch_restore_msi_irqs(struct pci_dev *dev);
 void default_restore_msi_irqs(struct pci_dev *dev);
 
 struct msi_controller {

* [patch RFC 31/38] x86/irq: Cleanup the arch_*_msi_irqs() leftovers
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (29 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 30/38] PCI/MSI: Allow to disable arch fallbacks Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 32/38] x86/irq: Make most MSI ops XEN private Thomas Gleixner
                   ` (7 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: x86-irq--Cleanup-the-arch_msi_irqs-leftovers.patch --]
[-- Type: text/plain, Size: 4115 bytes --]

Get rid of all the gunk and enable CONFIG_PCI_MSI_DISABLE_ARCH_FALLBACKS.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: xen-devel@lists.xenproject.org
Cc: linux-pci@vger.kernel.org
---
 arch/x86/Kconfig                |    1 +
 arch/x86/include/asm/pci.h      |   11 -----------
 arch/x86/include/asm/x86_init.h |    1 -
 arch/x86/kernel/apic/msi.c      |   22 ----------------------
 arch/x86/kernel/x86_init.c      |   18 ------------------
 arch/x86/pci/xen.c              |    7 -------
 6 files changed, 1 insertion(+), 59 deletions(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -225,6 +225,7 @@ config X86
 	select NEED_SG_DMA_LENGTH
 	select PCI_DOMAINS			if PCI
 	select PCI_LOCKLESS_CONFIG		if PCI
+	select PCI_MSI_DISABLE_ARCH_FALLBACKS
 	select PERF_EVENTS
 	select RTC_LIB
 	select RTC_MC146818_LIB
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -105,17 +105,6 @@ static inline void early_quirks(void) {
 
 extern void pci_iommu_alloc(void);
 
-#ifdef CONFIG_PCI_MSI
-/* implemented in arch/x86/kernel/apic/io_apic. */
-struct msi_desc;
-int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type);
-void native_teardown_msi_irq(unsigned int irq);
-void native_restore_msi_irqs(struct pci_dev *dev);
-#else
-#define native_setup_msi_irqs		NULL
-#define native_teardown_msi_irq		NULL
-#endif
-
 /* generic pci stuff */
 #include <asm-generic/pci.h>
 
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -277,7 +277,6 @@ struct pci_dev;
 
 struct x86_msi_ops {
 	int (*setup_msi_irqs)(struct pci_dev *dev, int nvec, int type);
-	void (*teardown_msi_irq)(unsigned int irq);
 	void (*teardown_msi_irqs)(struct pci_dev *dev);
 	void (*restore_msi_irqs)(struct pci_dev *dev);
 };
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -182,28 +182,6 @@ static struct irq_chip pci_msi_controlle
 	.flags			= IRQCHIP_SKIP_SET_WAKE,
 };
 
-int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
-{
-	struct irq_domain *domain;
-	struct irq_alloc_info info;
-
-	init_irq_alloc_info(&info, NULL);
-	info.type = X86_IRQ_ALLOC_TYPE_PCI_MSI;
-
-	domain = irq_remapping_get_irq_domain(&info);
-	if (domain == NULL)
-		domain = x86_pci_msi_default_domain;
-	if (domain == NULL)
-		return -ENOSYS;
-
-	return msi_domain_alloc_irqs(domain, &dev->dev, nvec);
-}
-
-void native_teardown_msi_irq(unsigned int irq)
-{
-	irq_domain_free_irqs(irq, 1);
-}
-
 int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
 		    msi_alloc_info_t *arg)
 {
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -145,28 +145,10 @@ EXPORT_SYMBOL_GPL(x86_platform);
 
 #if defined(CONFIG_PCI_MSI)
 struct x86_msi_ops x86_msi __ro_after_init = {
-	.setup_msi_irqs		= native_setup_msi_irqs,
-	.teardown_msi_irq	= native_teardown_msi_irq,
-	.teardown_msi_irqs	= default_teardown_msi_irqs,
 	.restore_msi_irqs	= default_restore_msi_irqs,
 };
 
 /* MSI arch specific hooks */
-int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
-{
-	return x86_msi.setup_msi_irqs(dev, nvec, type);
-}
-
-void arch_teardown_msi_irqs(struct pci_dev *dev)
-{
-	x86_msi.teardown_msi_irqs(dev);
-}
-
-void arch_teardown_msi_irq(unsigned int irq)
-{
-	x86_msi.teardown_msi_irq(irq);
-}
-
 void arch_restore_msi_irqs(struct pci_dev *dev)
 {
 	x86_msi.restore_msi_irqs(dev);
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -401,11 +401,6 @@ static void xen_pv_teardown_msi_irqs(str
 	xen_teardown_msi_irqs(dev);
 }
 
-static void xen_teardown_msi_irq(unsigned int irq)
-{
-	WARN_ON_ONCE(1);
-}
-
 static int xen_msi_domain_alloc_irqs(struct irq_domain *domain,
 				     struct device *dev,  int nvec)
 {
@@ -482,8 +477,6 @@ static __init void xen_setup_pci_msi(voi
 		return;
 	}
 
-	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
-
 	/*
 	 * Override the PCI/MSI irq domain init function. No point
 	 * in allocating the native domain and never use it.

* [patch RFC 32/38] x86/irq: Make most MSI ops XEN private
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (30 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 31/38] x86/irq: Cleanup the arch_*_msi_irqs() leftovers Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 33/38] x86/irq: Add DEV_MSI allocation type Thomas Gleixner
                   ` (6 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: x86-irq--Make-most-MSI-ops-XEN-private.patch --]
[-- Type: text/plain, Size: 2927 bytes --]

Nothing except XEN uses the setup/teardown ops. Hide them there.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: xen-devel@lists.xenproject.org
Cc: linux-pci@vger.kernel.org
---
 arch/x86/include/asm/x86_init.h |    2 --
 arch/x86/pci/xen.c              |   23 +++++++++++++++--------
 2 files changed, 15 insertions(+), 10 deletions(-)

--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -276,8 +276,6 @@ struct x86_platform_ops {
 struct pci_dev;
 
 struct x86_msi_ops {
-	int (*setup_msi_irqs)(struct pci_dev *dev, int nvec, int type);
-	void (*teardown_msi_irqs)(struct pci_dev *dev);
 	void (*restore_msi_irqs)(struct pci_dev *dev);
 };
 
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -156,6 +156,13 @@ static int acpi_register_gsi_xen(struct
 struct xen_pci_frontend_ops *xen_pci_frontend;
 EXPORT_SYMBOL_GPL(xen_pci_frontend);
 
+struct xen_msi_ops {
+	int (*setup_msi_irqs)(struct pci_dev *dev, int nvec, int type);
+	void (*teardown_msi_irqs)(struct pci_dev *dev);
+};
+
+static struct xen_msi_ops xen_msi_ops __ro_after_init;
+
 static int xen_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
 	int irq, ret, i;
@@ -414,7 +421,7 @@ static int xen_msi_domain_alloc_irqs(str
 	else
 		type = PCI_CAP_ID_MSI;
 
-	return x86_msi.setup_msi_irqs(to_pci_dev(dev), nvec, type);
+	return xen_msi_ops.setup_msi_irqs(to_pci_dev(dev), nvec, type);
 }
 
 static void xen_msi_domain_free_irqs(struct irq_domain *domain,
@@ -423,7 +430,7 @@ static void xen_msi_domain_free_irqs(str
 	if (WARN_ON_ONCE(!dev_is_pci(dev)))
 		return;
 
-	x86_msi.teardown_msi_irqs(to_pci_dev(dev));
+	xen_msi_ops.teardown_msi_irqs(to_pci_dev(dev));
 }
 
 static struct msi_domain_ops xen_pci_msi_domain_ops = {
@@ -461,17 +468,17 @@ static __init struct irq_domain *xen_cre
 static __init void xen_setup_pci_msi(void)
 {
 	if (xen_initial_domain()) {
-		x86_msi.setup_msi_irqs = xen_initdom_setup_msi_irqs;
-		x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
+		xen_msi_ops.setup_msi_irqs = xen_initdom_setup_msi_irqs;
+		xen_msi_ops.teardown_msi_irqs = xen_teardown_msi_irqs;
 		x86_msi.restore_msi_irqs = xen_initdom_restore_msi_irqs;
 		pci_msi_ignore_mask = 1;
 	} else if (xen_pv_domain()) {
-		x86_msi.setup_msi_irqs = xen_setup_msi_irqs;
-		x86_msi.teardown_msi_irqs = xen_pv_teardown_msi_irqs;
+		xen_msi_ops.setup_msi_irqs = xen_setup_msi_irqs;
+		xen_msi_ops.teardown_msi_irqs = xen_pv_teardown_msi_irqs;
 		pci_msi_ignore_mask = 1;
 	} else if (xen_hvm_domain()) {
-		x86_msi.setup_msi_irqs = xen_hvm_setup_msi_irqs;
-		x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
+		xen_msi_ops.setup_msi_irqs = xen_hvm_setup_msi_irqs;
+		xen_msi_ops.teardown_msi_irqs = xen_teardown_msi_irqs;
 	} else {
 		WARN_ON_ONCE(1);
 		return;

* [patch RFC 33/38] x86/irq: Add DEV_MSI allocation type
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (31 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 32/38] x86/irq: Make most MSI ops XEN private Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:24 ` [patch RFC 34/38] x86/msi: Let pci_msi_prepare() handle non-PCI MSI Thomas Gleixner
                   ` (5 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: x86-irq--Add-DEVMSI-allocation-type.patch --]
[-- Type: text/plain, Size: 666 bytes --]

For the upcoming device MSI support a new allocation type is
required.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/hw_irq.h |    1 +
 1 file changed, 1 insertion(+)

--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -40,6 +40,7 @@ enum irq_alloc_type {
 	X86_IRQ_ALLOC_TYPE_PCI_MSIX,
 	X86_IRQ_ALLOC_TYPE_DMAR,
 	X86_IRQ_ALLOC_TYPE_UV,
+	X86_IRQ_ALLOC_TYPE_DEV_MSI,
 	X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT,
 	X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT,
 };

* [patch RFC 34/38] x86/msi: Let pci_msi_prepare() handle non-PCI MSI
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (32 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 33/38] x86/irq: Add DEV_MSI allocation type Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-25 20:24   ` Bjorn Helgaas
  2020-08-21  0:24 ` [patch RFC 35/38] platform-msi: Provide default irq_chip::ack Thomas Gleixner
                   ` (4 subsequent siblings)
  38 siblings, 1 reply; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: x86-msi--Let-pci_msi_prepare---handle-non-PCI-MSI.patch --]
[-- Type: text/plain, Size: 3189 bytes --]

Rename pci_msi_prepare() to x86_msi_prepare() and handle the allocation
type setup depending on the device type.

Add a new arch_msi_prepare define which will be utilized by the upcoming
device MSI support. Define it to NULL if not provided by an architecture in
the generic MSI header.

One arch specific function for MSI support is truly enough.
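
The dispatch on device type can be sketched in user-space C as follows.
The toy_* names, the enum values and the flag value are assumptions made
for illustration; only the shape of the logic (reset the info, then pick
the PCI or the generic device path) mirrors the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the allocation types and flag. */
enum toy_alloc_type {
	TOY_TYPE_PCI_MSI = 1,
	TOY_TYPE_PCI_MSIX,
	TOY_TYPE_DEV_MSI,
};

#define TOY_ALLOC_CONTIGUOUS	0x1u

struct toy_dev {
	bool is_pci;	/* models dev_is_pci() */
	bool is_msix;	/* models msi_attrib.is_msix */
};

struct toy_alloc_info {
	enum toy_alloc_type type;
	unsigned int flags;
};

/* PCI path: MSI-X vs. MSI, the latter wants contiguous vectors. */
static void toy_pci_prepare(const struct toy_dev *dev, struct toy_alloc_info *arg)
{
	if (dev->is_msix) {
		arg->type = TOY_TYPE_PCI_MSIX;
	} else {
		arg->type = TOY_TYPE_PCI_MSI;
		arg->flags |= TOY_ALLOC_CONTIGUOUS;
	}
}

/* Non-PCI path: plain device MSI. */
static void toy_dev_prepare(const struct toy_dev *dev, struct toy_alloc_info *arg)
{
	(void)dev;
	arg->type = TOY_TYPE_DEV_MSI;
}

/* Single entry point: reset the info, then dispatch on the device type. */
static int toy_msi_prepare(const struct toy_dev *dev, struct toy_alloc_info *arg)
{
	assert(dev != NULL && arg != NULL);

	memset(arg, 0, sizeof(*arg));

	if (dev->is_pci)
		toy_pci_prepare(dev, arg);
	else
		toy_dev_prepare(dev, arg);

	return 0;
}
```

Doing the reset once in the common entry point keeps both helpers free
of initialization concerns.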

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-pci@vger.kernel.org
Cc: linux-hyperv@vger.kernel.org
---
 arch/x86/include/asm/msi.h          |    4 +++-
 arch/x86/kernel/apic/msi.c          |   27 ++++++++++++++++++++-------
 drivers/pci/controller/pci-hyperv.c |    2 +-
 include/linux/msi.h                 |    4 ++++
 4 files changed, 28 insertions(+), 9 deletions(-)

--- a/arch/x86/include/asm/msi.h
+++ b/arch/x86/include/asm/msi.h
@@ -6,7 +6,9 @@
 
 typedef struct irq_alloc_info msi_alloc_info_t;
 
-int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
+int x86_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
 		    msi_alloc_info_t *arg);
 
+#define arch_msi_prepare		x86_msi_prepare
+
 #endif /* _ASM_X86_MSI_H */
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -182,26 +182,39 @@ static struct irq_chip pci_msi_controlle
 	.flags			= IRQCHIP_SKIP_SET_WAKE,
 };
 
-int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
-		    msi_alloc_info_t *arg)
+static void pci_msi_prepare(struct device *dev, msi_alloc_info_t *arg)
 {
-	struct pci_dev *pdev = to_pci_dev(dev);
-	struct msi_desc *desc = first_pci_msi_entry(pdev);
+	struct msi_desc *desc = first_msi_entry(dev);
 
-	init_irq_alloc_info(arg, NULL);
 	if (desc->msi_attrib.is_msix) {
 		arg->type = X86_IRQ_ALLOC_TYPE_PCI_MSIX;
 	} else {
 		arg->type = X86_IRQ_ALLOC_TYPE_PCI_MSI;
 		arg->flags |= X86_IRQ_ALLOC_CONTIGUOUS_VECTORS;
 	}
+}
+
+static void dev_msi_prepare(struct device *dev, msi_alloc_info_t *arg)
+{
+	arg->type = X86_IRQ_ALLOC_TYPE_DEV_MSI;
+}
+
+int x86_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
+		    msi_alloc_info_t *arg)
+{
+	init_irq_alloc_info(arg, NULL);
+
+	if (dev_is_pci(dev))
+		pci_msi_prepare(dev, arg);
+	else
+		dev_msi_prepare(dev, arg);
 
 	return 0;
 }
-EXPORT_SYMBOL_GPL(pci_msi_prepare);
+EXPORT_SYMBOL_GPL(x86_msi_prepare);
 
 static struct msi_domain_ops pci_msi_domain_ops = {
-	.msi_prepare	= pci_msi_prepare,
+	.msi_prepare	= x86_msi_prepare,
 };
 
 static struct msi_domain_info pci_msi_domain_info = {
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -1532,7 +1532,7 @@ static struct irq_chip hv_msi_irq_chip =
 };
 
 static struct msi_domain_ops hv_msi_ops = {
-	.msi_prepare	= pci_msi_prepare,
+	.msi_prepare	= arch_msi_prepare,
 	.msi_free	= hv_msi_free,
 };
 
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -430,4 +430,8 @@ static inline struct irq_domain *pci_msi
 }
 #endif /* CONFIG_PCI_MSI_IRQ_DOMAIN */
 
+#ifndef arch_msi_prepare
+# define arch_msi_prepare	NULL
+#endif
+
 #endif /* LINUX_MSI_H */

^ permalink raw reply	[flat|nested] 71+ messages in thread

* [patch RFC 35/38] platform-msi: Provide default irq_chip::ack
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (33 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 34/38] x86/msi: Let pci_msi_prepare() handle non-PCI MSI Thomas Gleixner
@ 2020-08-21  0:24 ` Thomas Gleixner
  2020-08-21  0:25 ` [patch RFC 36/38] platform-msi: Add device MSI infrastructure Thomas Gleixner
                   ` (3 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: platform-msi--Provide-default-irq_chip--ack.patch --]
[-- Type: text/plain, Size: 890 bytes --]

For the upcoming device MSI support it's required to have a default
irq_chip::ack implementation (irq_chip_ack_parent) so the drivers do not
need to care.
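
The "fill in what the driver left unset" pattern can be sketched in plain
userspace C as follows. The sketch_* names and the reduced two-callback chip
are assumptions for illustration; the real code operates on struct irq_chip
and the irq_chip_*_parent() helpers.

```c
#include <stddef.h>

/* Hypothetical, reduced interrupt chip with just two callbacks. */
struct sketch_chip {
	void (*irq_mask)(int irq);
	void (*irq_ack)(int irq);
};

static int sketch_parent_acks;
static int sketch_parent_masks;
static int sketch_driver_masks;

/* Defaults which forward the operation to the parent chip. */
static void sketch_ack_parent(int irq)  { (void)irq; sketch_parent_acks++; }
static void sketch_mask_parent(int irq) { (void)irq; sketch_parent_masks++; }

/* A callback which a driver chose to implement itself. */
static void sketch_driver_mask(int irq) { (void)irq; sketch_driver_masks++; }

/* Only unset callbacks get a default; driver choices are preserved. */
static void sketch_update_chip_ops(struct sketch_chip *chip)
{
	if (!chip->irq_mask)
		chip->irq_mask = sketch_mask_parent;
	if (!chip->irq_ack)
		chip->irq_ack = sketch_ack_parent;
}
```

A driver which provides only a mask callback ends up with its own mask and
the parent-forwarding ack after the update.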

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/base/platform-msi.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/base/platform-msi.c
+++ b/drivers/base/platform-msi.c
@@ -95,6 +95,8 @@ static void platform_msi_update_chip_ops
 		chip->irq_mask = irq_chip_mask_parent;
 	if (!chip->irq_unmask)
 		chip->irq_unmask = irq_chip_unmask_parent;
+	if (!chip->irq_ack)
+		chip->irq_ack = irq_chip_ack_parent;
 	if (!chip->irq_eoi)
 		chip->irq_eoi = irq_chip_eoi_parent;
 	if (!chip->irq_set_affinity)

^ permalink raw reply	[flat|nested] 71+ messages in thread

* [patch RFC 36/38] platform-msi: Add device MSI infrastructure
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (34 preceding siblings ...)
  2020-08-21  0:24 ` [patch RFC 35/38] platform-msi: Provide default irq_chip::ack Thomas Gleixner
@ 2020-08-21  0:25 ` Thomas Gleixner
  2020-08-21  0:25 ` [patch RFC 37/38] irqdomain/msi: Provide msi_alloc/free_store() callbacks Thomas Gleixner
                   ` (2 subsequent siblings)
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:25 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Rafael J. Wysocki, linux-pci,
	Steve Wahl, K. Y. Srinivasan, Dan Williams, Wei Liu,
	Stephen Hemminger, Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe,
	Megha Dey, xen-devel, Kevin Tian, Konrad Rzeszutek Wilk,
	Haiyang Zhang, Alex Williamson, Stefano Stabellini,
	Bjorn Helgaas, Dave Jiang, Boris Ostrovsky, Jonathan Derrick,
	Juergen Gross, Russ Anderson, Greg Kroah-Hartman, iommu,
	Jacob Pan

[-- Attachment #1: platform-msi--Add-device-MSI-infrastructure.patch --]
[-- Type: text/plain, Size: 7122 bytes --]

Add device specific MSI domain infrastructure for devices which have their
own resource management and interrupt chip. These devices are not related
to PCI and contrary to platform MSI they do not share a common resource and
interrupt chip. They provide their own domain specific resource management
and interrupt chip.

This utilizes the new alloc/free override in a non evil way which avoids
having yet another set of specialized alloc/free functions. Just using
msi_domain_alloc/free_irqs() is sufficient.

While initially it was suggested and tried to piggyback device MSI on
platform MSI, the better variant is to reimplement platform MSI on top of
device MSI.
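
The shape of the alloc override, reduced to a userspace sketch: allocate the
descriptors first, hand over to the generic allocation, and undo the
descriptors if either step fails. All sketch_* names are hypothetical, and
the "limit" parameter exists only to make the generic step fail on demand.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical descriptor list, standing in for the per device MSI list. */
struct sketch_desc {
	struct sketch_desc *next;
};

struct sketch_dev {
	struct sketch_desc *descs;
	int ndescs;
};

static void sketch_free_descs(struct sketch_dev *dev)
{
	while (dev->descs) {
		struct sketch_desc *d = dev->descs;

		dev->descs = d->next;
		free(d);
	}
	dev->ndescs = 0;
}

/* Stand-in for the generic interrupt allocation; fails above 'limit'. */
static int sketch_generic_alloc(struct sketch_dev *dev, int nvec, int limit)
{
	(void)dev;
	return nvec <= limit ? 0 : -ENOSPC;
}

/* The override: descriptors first, then the generic allocation. */
static int sketch_device_alloc(struct sketch_dev *dev, int nvec, int limit)
{
	int i, ret = -ENOMEM;

	for (i = 0; i < nvec; i++) {
		struct sketch_desc *d = calloc(1, sizeof(*d));

		if (!d)
			goto fail;
		d->next = dev->descs;
		dev->descs = d;
		dev->ndescs++;
	}

	ret = sketch_generic_alloc(dev, nvec, limit);
	if (!ret)
		return 0;
fail:
	sketch_free_descs(dev);
	return ret;
}
```

Callers only ever see the single entry point; the descriptor handling stays
private to the override.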

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
---
 drivers/base/platform-msi.c |  129 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/irqdomain.h   |    1 
 include/linux/msi.h         |   24 ++++++++
 kernel/irq/Kconfig          |    4 +
 4 files changed, 158 insertions(+)

--- a/drivers/base/platform-msi.c
+++ b/drivers/base/platform-msi.c
@@ -412,3 +412,132 @@ int platform_msi_domain_alloc(struct irq
 
 	return err;
 }
+
+#ifdef CONFIG_DEVICE_MSI
+/*
+ * Device specific MSI domain infrastructure for devices which have their
+ * own resource management and interrupt chip. These devices are not
+ * related to PCI and contrary to platform MSI they do not share a common
+ * resource and interrupt chip. They provide their own domain specific
+ * resource management and interrupt chip.
+ */
+
+static void device_msi_free_msi_entries(struct device *dev)
+{
+	struct list_head *msi_list = dev_to_msi_list(dev);
+	struct msi_desc *entry, *tmp;
+
+	list_for_each_entry_safe(entry, tmp, msi_list, list) {
+		list_del(&entry->list);
+		free_msi_entry(entry);
+	}
+}
+
+/**
+ * device_msi_free_irqs - Free MSI interrupts assigned to a device
+ * @dev:	Pointer to the device
+ *
+ * Frees the interrupt and the MSI descriptors.
+ */
+static void device_msi_free_irqs(struct irq_domain *domain, struct device *dev)
+{
+	__msi_domain_free_irqs(domain, dev);
+	device_msi_free_msi_entries(dev);
+}
+
+/**
+ * device_msi_alloc_irqs - Allocate MSI interrupts for a device
+ * @dev:	Pointer to the device
+ * @nvec:	Number of vectors
+ *
+ * Allocates the required number of MSI descriptors and the corresponding
+ * interrupt descriptors.
+ */
+static int device_msi_alloc_irqs(struct irq_domain *domain, struct device *dev, int nvec)
+{
+	int i, ret = -ENOMEM;
+
+	for (i = 0; i < nvec; i++) {
+		struct msi_desc *entry = alloc_msi_entry(dev, 1, NULL);
+
+		if (!entry)
+			goto fail;
+		list_add_tail(&entry->list, dev_to_msi_list(dev));
+	}
+
+	ret = __msi_domain_alloc_irqs(domain, dev, nvec);
+	if (!ret)
+		return 0;
+fail:
+	device_msi_free_msi_entries(dev);
+	return ret;
+}
+
+static void device_msi_update_dom_ops(struct msi_domain_info *info)
+{
+	if (!info->ops->domain_alloc_irqs)
+		info->ops->domain_alloc_irqs = device_msi_alloc_irqs;
+	if (!info->ops->domain_free_irqs)
+		info->ops->domain_free_irqs = device_msi_free_irqs;
+	if (!info->ops->msi_prepare)
+		info->ops->msi_prepare = arch_msi_prepare;
+}
+
+/**
+ * device_msi_create_irq_domain - Create an irq domain for devices
+ * @fn:		Firmware node of the interrupt controller
+ * @info:	MSI domain info to configure the new domain
+ * @parent:	Parent domain
+ */
+struct irq_domain *device_msi_create_irq_domain(struct fwnode_handle *fn,
+						struct msi_domain_info *info,
+						struct irq_domain *parent)
+{
+	struct irq_domain *domain;
+
+	if (info->flags & MSI_FLAG_USE_DEF_CHIP_OPS)
+		platform_msi_update_chip_ops(info);
+
+	if (info->flags & MSI_FLAG_USE_DEF_DOM_OPS)
+		device_msi_update_dom_ops(info);
+
+	domain = msi_create_irq_domain(fn, info, parent);
+	if (domain)
+		irq_domain_update_bus_token(domain, DOMAIN_BUS_DEVICE_MSI);
+	return domain;
+}
+
+#ifdef CONFIG_PCI
+#include <linux/pci.h>
+
+/**
+ * pci_subdevice_msi_create_irq_domain - Create an irq domain for subdevices
+ * @pdev:	Pointer to PCI device for which the subdevice domain is created
+ * @info:	MSI domain info to configure the new domain
+ */
+struct irq_domain *pci_subdevice_msi_create_irq_domain(struct pci_dev *pdev,
+						       struct msi_domain_info *info)
+{
+	struct irq_domain *domain, *pdev_msi;
+	struct fwnode_handle *fn;
+
+	/*
+	 * Retrieve the parent domain of the underlying PCI device's MSI
+	 * domain. This is going to be the parent of the new subdevice
+	 * domain as well.
+	 */
+	pdev_msi = dev_get_msi_domain(&pdev->dev);
+	if (!pdev_msi)
+		return NULL;
+
+	fn = irq_domain_alloc_named_fwnode(dev_name(&pdev->dev));
+	if (!fn)
+		return NULL;
+	domain = device_msi_create_irq_domain(fn, info, pdev_msi->parent);
+	if (!domain)
+		irq_domain_free_fwnode(fn);
+	return domain;
+}
+EXPORT_SYMBOL_GPL(pci_subdevice_msi_create_irq_domain);
+#endif /* CONFIG_PCI */
+#endif /* CONFIG_DEVICE_MSI */
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -85,6 +85,7 @@ enum irq_domain_bus_token {
 	DOMAIN_BUS_TI_SCI_INTA_MSI,
 	DOMAIN_BUS_WAKEUP,
 	DOMAIN_BUS_VMD_MSI,
+	DOMAIN_BUS_DEVICE_MSI,
 };
 
 /**
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -56,6 +56,18 @@ struct ti_sci_inta_msi_desc {
 };
 
 /**
+ * struct device_msi_desc - Device MSI specific MSI descriptor data
+ * @priv:		Pointer to device specific private data
+ * @priv_iomem:		Pointer to device specific private io memory
+ * @hwirq:		The hardware irq number in the device domain
+ */
+struct device_msi_desc {
+	void		*priv;
+	void __iomem	*priv_iomem;
+	u16		hwirq;
+};
+
+/**
  * struct msi_desc - Descriptor structure for MSI based interrupts
  * @list:	List head for management
  * @irq:	The base interrupt number
@@ -127,6 +139,7 @@ struct msi_desc {
 		struct platform_msi_desc platform;
 		struct fsl_mc_msi_desc fsl_mc;
 		struct ti_sci_inta_msi_desc inta;
+		struct device_msi_desc device_msi;
 	};
 };
 
@@ -412,6 +425,17 @@ void platform_msi_domain_free(struct irq
 void *platform_msi_get_host_data(struct irq_domain *domain);
 #endif /* CONFIG_GENERIC_MSI_IRQ_DOMAIN */
 
+#ifdef CONFIG_DEVICE_MSI
+struct irq_domain *device_msi_create_irq_domain(struct fwnode_handle *fn,
+						struct msi_domain_info *info,
+						struct irq_domain *parent);
+
+# ifdef CONFIG_PCI
+struct irq_domain *pci_subdevice_msi_create_irq_domain(struct pci_dev *pdev,
+						       struct msi_domain_info *info);
+# endif
+#endif /* CONFIG_DEVICE_MSI */
+
 #ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
 void pci_msi_domain_write_msg(struct irq_data *irq_data, struct msi_msg *msg);
 struct irq_domain *pci_msi_create_irq_domain(struct fwnode_handle *fwnode,
--- a/kernel/irq/Kconfig
+++ b/kernel/irq/Kconfig
@@ -93,6 +93,10 @@ config GENERIC_MSI_IRQ_DOMAIN
 	select IRQ_DOMAIN_HIERARCHY
 	select GENERIC_MSI_IRQ
 
+config DEVICE_MSI
+	bool
+	select GENERIC_MSI_IRQ_DOMAIN
+
 config IRQ_MSI_IOMMU
 	bool
 

^ permalink raw reply	[flat|nested] 71+ messages in thread

* [patch RFC 37/38] irqdomain/msi: Provide msi_alloc/free_store() callbacks
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (35 preceding siblings ...)
  2020-08-21  0:25 ` [patch RFC 36/38] platform-msi: Add device MSI infrastructure Thomas Gleixner
@ 2020-08-21  0:25 ` Thomas Gleixner
  2020-08-21  0:25 ` [patch RFC 38/38] irqchip: Add IMS array driver - NOT FOR MERGING Thomas Gleixner
  2020-08-22 14:19 ` [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Jürgen Groß
  38 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:25 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, iommu, Jacob Pan, Rafael J. Wysocki

[-- Attachment #1: irqdomain-msi--Provide-msi_alloc-free_store---callbacks.patch --]
[-- Type: text/plain, Size: 2808 bytes --]

For devices which don't have a standard storage for MSI messages like the
upcoming IMS (Interrupt Message Store) it's required to allocate storage
space before allocating interrupts and to free it after freeing them.

This could be achieved with the existing callbacks, but that would be
awkward because they operate on msi_alloc_info_t which is not uniform
across architectures. Also these callbacks are invoked per interrupt but
the allocation might have bulk requirements depending on the device.

As such devices can operate on different architectures it is simpler to
have separate callbacks which operate on struct device. The resulting
storage information has to be stored in struct msi_desc so the underlying
irq chip implementation can retrieve it for the relevant operations.
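
The call ordering this introduces can be sketched in userspace C: the
optional store callback runs once, in bulk, before the per-interrupt
allocation, and its counterpart runs once after the interrupts have been
freed. The sk_* names and the trace buffer are assumptions made for the
illustration, not kernel interfaces.

```c
#include <stddef.h>

enum { SK_STORE_ALLOC = 1, SK_IRQ_ALLOC, SK_IRQ_FREE, SK_STORE_FREE };

/* Records the order in which the steps ran. */
struct sk_trace {
	int ev[8];
	int n;
};

/* Both callbacks are optional, mirroring the new ops members. */
struct sk_ops {
	int  (*alloc_store)(struct sk_trace *t, int nvec);
	void (*free_store)(struct sk_trace *t);
};

static void sk_log(struct sk_trace *t, int ev)
{
	if (t->n < 8)
		t->ev[t->n++] = ev;
}

static int sk_alloc_irqs(const struct sk_ops *ops, struct sk_trace *t, int nvec)
{
	if (ops->alloc_store) {
		int ret = ops->alloc_store(t, nvec);

		if (ret)
			return ret;
	}
	/* Stands in for the per descriptor interrupt allocation loop */
	sk_log(t, SK_IRQ_ALLOC);
	return 0;
}

static void sk_free_irqs(const struct sk_ops *ops, struct sk_trace *t)
{
	/* Stands in for the per descriptor interrupt free loop */
	sk_log(t, SK_IRQ_FREE);
	if (ops->free_store)
		ops->free_store(t);
}

/* Example device callbacks which only record that they ran. */
static int sk_dev_alloc_store(struct sk_trace *t, int nvec)
{
	(void)nvec;
	sk_log(t, SK_STORE_ALLOC);
	return 0;
}

static void sk_dev_free_store(struct sk_trace *t)
{
	sk_log(t, SK_STORE_FREE);
}
```

A domain which leaves both callbacks unset sees no change in behaviour.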

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <maz@kernel.org>
---
 include/linux/msi.h |    8 ++++++++
 kernel/irq/msi.c    |   11 +++++++++++
 2 files changed, 19 insertions(+)

--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -279,6 +279,10 @@ struct msi_domain_info;
  *			function.
  * @domain_free_irqs:	Optional function to override the default free
  *			function.
+ * @msi_alloc_store:	Optional callback to allocate storage in a device
+ *			specific non-standard MSI store
+ * @msi_free_store:	Optional callback to free storage in a device
+ *			specific non-standard MSI store
  *
  * @get_hwirq, @msi_init and @msi_free are callbacks used by
  * msi_create_irq_domain() and related interfaces
@@ -328,6 +332,10 @@ struct msi_domain_ops {
 					     struct device *dev, int nvec);
 	void		(*domain_free_irqs)(struct irq_domain *domain,
 					    struct device *dev);
+	int		(*msi_alloc_store)(struct irq_domain *domain,
+					   struct device *dev, int nvec);
+	void		(*msi_free_store)(struct irq_domain *domain,
+					    struct device *dev);
 };
 
 /**
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -410,6 +410,12 @@ int __msi_domain_alloc_irqs(struct irq_d
 	if (ret)
 		return ret;
 
+	if (ops->msi_alloc_store) {
+		ret = ops->msi_alloc_store(domain, dev, nvec);
+		if (ret)
+			return ret;
+	}
+
 	for_each_msi_entry(desc, dev) {
 		ops->set_desc(&arg, desc);
 
@@ -509,6 +515,8 @@ int msi_domain_alloc_irqs(struct irq_dom
 
 void __msi_domain_free_irqs(struct irq_domain *domain, struct device *dev)
 {
+	struct msi_domain_info *info = domain->host_data;
+	struct msi_domain_ops *ops = info->ops;
 	struct msi_desc *desc;
 
 	for_each_msi_entry(desc, dev) {
@@ -522,6 +530,9 @@ void __msi_domain_free_irqs(struct irq_d
 			desc->irq = 0;
 		}
 	}
+
+	if (ops->msi_free_store)
+		ops->msi_free_store(domain, dev);
 }
 
 /**

^ permalink raw reply	[flat|nested] 71+ messages in thread

* [patch RFC 38/38] irqchip: Add IMS array driver - NOT FOR MERGING
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (36 preceding siblings ...)
  2020-08-21  0:25 ` [patch RFC 37/38] irqdomain/msi: Provide msi_alloc/free_store() callbacks Thomas Gleixner
@ 2020-08-21  0:25 ` Thomas Gleixner
  2020-08-21 12:45   ` Jason Gunthorpe
  2020-08-22 14:19 ` [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Jürgen Groß
  38 siblings, 1 reply; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21  0:25 UTC (permalink / raw)
  To: LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Boris Ostrovsky, Wei Liu, Dave Jiang, Baolu Lu,
	Marc Zyngier, x86, Jason Gunthorpe, Megha Dey, xen-devel,
	Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas,
	Stephen Hemminger, Dan Williams, Jonathan Derrick, Juergen Gross,
	Russ Anderson, Greg Kroah-Hartman, iommu, Jacob Pan,
	Rafael J. Wysocki

[-- Attachment #1: irqchip--Add-IMS-array-driver.patch --]
[-- Type: text/plain, Size: 7718 bytes --]

A generic IMS irq chip and irq domain implementation for IMS based devices
which utilize an MSI message store array on chip.

Allows IMS devices with an MSI message store array to reuse this code for
different array sizes.
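
The slot bookkeeping for such an array can be sketched in userspace C with a
bitmap whose usable width is a per-device parameter. This is a simplified
illustration: the sk_* names are hypothetical, the bitmap is a single 64-bit
word, and the rollback only undoes the slots taken by the failing request.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical per device state: one bit per slot, at most 64 slots. */
struct sk_ims {
	unsigned int max_slots;
	uint64_t map;
};

/* Returns the index of the first free slot, or max if none is free. */
static unsigned int sk_find_first_zero(uint64_t map, unsigned int max)
{
	unsigned int i;

	for (i = 0; i < max; i++) {
		if (!((map >> i) & 1))
			return i;
	}
	return max;
}

/* Claims nvec slots and stores their indices; all or nothing. */
static int sk_alloc_slots(struct sk_ims *ims, int *slots, int nvec)
{
	int i;

	for (i = 0; i < nvec; i++) {
		unsigned int idx = sk_find_first_zero(ims->map, ims->max_slots);

		if (idx >= ims->max_slots)
			goto fail;
		ims->map |= UINT64_C(1) << idx;
		slots[i] = (int)idx;
	}
	return 0;

fail:
	/* Give back the slots which this request already claimed */
	while (i-- > 0)
		ims->map &= ~(UINT64_C(1) << slots[i]);
	return -ENOSPC;
}
```

The slot index doubles as the hardware interrupt number in the device
domain, which is why it is recorded per descriptor.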

Allocation and freeing of interrupts happens via the generic
msi_domain_alloc/free_irqs() interface. No special purpose IMS magic
required as long as the interrupt domain is stored in the underlying device
struct.

Completely untested of course and mostly for illustration and educational
purposes. This should of course be a modular irq chip, but adding that
support is left as an exercise for the people who care about this deeply.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Megha Dey <megha.dey@intel.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Jacob Pan <jacob.jun.pan@intel.com>
Cc: Baolu Lu <baolu.lu@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
---
 drivers/irqchip/Kconfig             |    8 +
 drivers/irqchip/Makefile            |    1 
 drivers/irqchip/irq-ims-msi.c       |  169 ++++++++++++++++++++++++++++++++++++
 include/linux/irqchip/irq-ims-msi.h |   41 ++++++++
 4 files changed, 219 insertions(+)

--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -571,4 +571,12 @@ config LOONGSON_PCH_MSI
 	help
 	  Support for the Loongson PCH MSI Controller.
 
+config IMS_MSI
+	bool "IMS Interrupt Message Store MSI controller"
+	depends on PCI
+	select DEVICE_MSI
+	help
+	  Support for IMS Interrupt Message Store MSI controller
+	  with IMS slot storage in a slot array
+
 endmenu
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -111,3 +111,4 @@ obj-$(CONFIG_LOONGSON_HTPIC)		+= irq-loo
 obj-$(CONFIG_LOONGSON_HTVEC)		+= irq-loongson-htvec.o
 obj-$(CONFIG_LOONGSON_PCH_PIC)		+= irq-loongson-pch-pic.o
 obj-$(CONFIG_LOONGSON_PCH_MSI)		+= irq-loongson-pch-msi.o
+obj-$(CONFIG_IMS_MSI)			+= irq-ims-msi.o
--- /dev/null
+++ b/drivers/irqchip/irq-ims-msi.c
@@ -0,0 +1,169 @@
+// SPDX-License-Identifier: GPL-2.0
+// (C) Copyright 2020 Thomas Gleixner <tglx@linutronix.de>
+/*
+ * Shared interrupt chip and irq domain for Intel IMS devices
+ */
+#include <linux/device.h>
+#include <linux/slab.h>
+#include <linux/msi.h>
+#include <linux/irq.h>
+
+#include <linux/irqchip/irq-ims-msi.h>
+
+struct ims_data {
+	struct ims_array_info	info;
+	unsigned long		map[0];
+};
+
+static void ims_mask_irq(struct irq_data *data)
+{
+	struct msi_desc *desc = irq_data_get_msi_desc(data);
+	struct ims_array_slot __iomem *slot = desc->device_msi.priv_iomem;
+	u32 __iomem *ctrl = &slot->ctrl;
+
+	iowrite32(ioread32(ctrl) & ~IMS_VECTOR_CTRL_UNMASK, ctrl);
+}
+
+static void ims_unmask_irq(struct irq_data *data)
+{
+	struct msi_desc *desc = irq_data_get_msi_desc(data);
+	struct ims_array_slot __iomem *slot = desc->device_msi.priv_iomem;
+	u32 __iomem *ctrl = &slot->ctrl;
+
+	iowrite32(ioread32(ctrl) | IMS_VECTOR_CTRL_UNMASK, ctrl);
+}
+
+static void ims_write_msi_msg(struct irq_data *data, struct msi_msg *msg)
+{
+	struct msi_desc *desc = irq_data_get_msi_desc(data);
+	struct ims_array_slot __iomem *slot = desc->device_msi.priv_iomem;
+
+	iowrite32(msg->address_lo, &slot->address_lo);
+	iowrite32(msg->address_hi, &slot->address_hi);
+	iowrite32(msg->data, &slot->data);
+}
+
+static const struct irq_chip ims_msi_controller = {
+	.name			= "IMS",
+	.irq_mask		= ims_mask_irq,
+	.irq_unmask		= ims_unmask_irq,
+	.irq_write_msi_msg	= ims_write_msi_msg,
+	.irq_retrigger		= irq_chip_retrigger_hierarchy,
+	.flags			= IRQCHIP_SKIP_SET_WAKE,
+};
+
+static void ims_reset_slot(struct ims_array_slot __iomem *slot)
+{
+	iowrite32(0, &slot->address_lo);
+	iowrite32(0, &slot->address_hi);
+	iowrite32(0, &slot->data);
+	iowrite32(0, &slot->ctrl);
+}
+
+static void ims_free_msi_store(struct irq_domain *domain, struct device *dev)
+{
+	struct msi_domain_info *info = domain->host_data;
+	struct ims_data *ims = info->data;
+	struct msi_desc *entry;
+
+	for_each_msi_entry(entry, dev) {
+		if (entry->device_msi.priv_iomem) {
+			clear_bit(entry->device_msi.hwirq, ims->map);
+			ims_reset_slot(entry->device_msi.priv_iomem);
+			entry->device_msi.priv_iomem = NULL;
+			entry->device_msi.hwirq = 0;
+		}
+	}
+}
+
+static int ims_alloc_msi_store(struct irq_domain *domain, struct device *dev,
+			       int nvec)
+{
+	struct msi_domain_info *info = domain->host_data;
+	struct ims_data *ims = info->data;
+	struct msi_desc *entry;
+
+	for_each_msi_entry(entry, dev) {
+		unsigned int idx;
+
+		idx = find_first_zero_bit(ims->map, ims->info.max_slots);
+		if (idx >= ims->info.max_slots)
+			goto fail;
+		set_bit(idx, ims->map);
+		entry->device_msi.priv_iomem = &ims->info.slots[idx];
+		entry->device_msi.hwirq = idx;
+	}
+	return 0;
+
+fail:
+	ims_free_msi_store(domain, dev);
+	return -ENOSPC;
+}
+
+struct ims_domain_template {
+	struct msi_domain_ops	ops;
+	struct msi_domain_info	info;
+};
+
+static const struct ims_domain_template ims_domain_template = {
+	.ops = {
+		.msi_alloc_store	= ims_alloc_msi_store,
+		.msi_free_store		= ims_free_msi_store,
+	},
+	.info = {
+		.flags		= MSI_FLAG_USE_DEF_DOM_OPS |
+				  MSI_FLAG_USE_DEF_CHIP_OPS,
+		.handler	= handle_edge_irq,
+		.handler_name	= "edge",
+	},
+};
+
+struct irq_domain *
+pci_ims_create_msi_irq_domain(struct pci_dev *pdev,
+			      struct ims_array_info *ims_info)
+{
+	struct ims_domain_template *info;
+	struct irq_domain *domain;
+	struct irq_chip *chip;
+	struct ims_data *data;
+	unsigned int size;
+
+	/* Allocate new domain storage */
+	info = kmemdup(&ims_domain_template, sizeof(ims_domain_template),
+		       GFP_KERNEL);
+	if (!info)
+		return NULL;
+	/* Link the ops */
+	info->info.ops = &info->ops;
+
+	/* Allocate ims_info along with the bitmap */
+	size = sizeof(*data);
+	size += BITS_TO_LONGS(ims_info->max_slots) * sizeof(unsigned long);
+	data = kzalloc(size, GFP_KERNEL);
+	if (!data)
+		goto err_info;
+
+	data->info = *ims_info;
+	info->info.data = data;
+
+	chip = kmemdup(&ims_msi_controller, sizeof(ims_msi_controller),
+		       GFP_KERNEL);
+	if (!chip)
+		goto err_data;
+	info->info.chip = chip;
+
+	domain = pci_subdevice_msi_create_irq_domain(pdev, &info->info);
+	if (!domain)
+		goto err_chip;
+
+	return domain;
+
+err_chip:
+	kfree(chip);
+err_data:
+	kfree(data);
+err_info:
+	kfree(info);
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(pci_ims_create_msi_irq_domain);
--- /dev/null
+++ b/include/linux/irqchip/irq-ims-msi.h
@@ -0,0 +1,41 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* (C) Copyright 2020 Thomas Gleixner <tglx@linutronix.de> */
+
+#ifndef _LINUX_IRQCHIP_IRQ_IMS_MSI_H
+#define _LINUX_IRQCHIP_IRQ_IMS_MSI_H
+
+#include <linux/types.h>
+
+struct ims_array_slot {
+	u32	address_lo;
+	u32	address_hi;
+	u32	data;
+	u32	ctrl;
+};
+
+/* Bit to unmask the interrupt in slot->ctrl */
+#define IMS_VECTOR_CTRL_UNMASK	0x01
+
+struct ims_array_info {
+	struct ims_array_slot	__iomem *slots;
+	unsigned int		max_slots;
+};
+
+/* Dummy forward declaration for illustration */
+struct ims_queue_slot;
+
+/**
+ * struct ims_msi_store - Interrupt Message Store descriptor data
+ * @array_slot:	Pointer to an on-device IMS storage array slot
+ * @queue_slot:	Pointer to storage embedded in queue data
+ * @hw_irq:	Index of the slot or queue. Also hardware irq number
+ */
+struct ims_msi_store {
+	union {
+		struct ims_array_slot __iomem	*array_slot;
+		struct ims_queue_slot		*queue_slot;
+	};
+	unsigned int				hw_irq;
+};
+
+#endif

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 38/38] irqchip: Add IMS array driver - NOT FOR MERGING
  2020-08-21  0:25 ` [patch RFC 38/38] irqchip: Add IMS array driver - NOT FOR MERGING Thomas Gleixner
@ 2020-08-21 12:45   ` Jason Gunthorpe
  2020-08-21 19:47     ` Thomas Gleixner
  0 siblings, 1 reply; 71+ messages in thread
From: Jason Gunthorpe @ 2020-08-21 12:45 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Boris Ostrovsky, Wei Liu, Dave Jiang, Baolu Lu,
	Marc Zyngier, x86, Megha Dey, xen-devel, Kevin Tian,
	Konrad Rzeszutek Wilk, Haiyang Zhang, Alex Williamson,
	Stefano Stabellini, Bjorn Helgaas, Stephen Hemminger,
	Dan Williams, Jon Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, LKML, iommu, Jacob Pan, Rafael J. Wysocki

On Fri, Aug 21, 2020 at 02:25:02AM +0200, Thomas Gleixner wrote:
> +static void ims_mask_irq(struct irq_data *data)
> +{
> +	struct msi_desc *desc = irq_data_get_msi_desc(data);
> +	struct ims_array_slot __iomem *slot = desc->device_msi.priv_iomem;
> +	u32 __iomem *ctrl = &slot->ctrl;
> +
> +	iowrite32(ioread32(ctrl) & ~IMS_VECTOR_CTRL_UNMASK, ctrl);

Just to be clear, this is exactly the sort of operation we can't do
with non-MSI interrupts. For a real PCI device to execute this it
would have to keep the data on die.

I saw the idxd driver was doing something like this, I assume it
avoids trouble because it is a fake PCI device integrated with the
CPU, not on a real PCI bus?

It is really nice to see irq_domain used properly in x86!

Thanks,
Jason

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 38/38] irqchip: Add IMS array driver - NOT FOR MERGING
  2020-08-21 12:45   ` Jason Gunthorpe
@ 2020-08-21 19:47     ` Thomas Gleixner
  2020-08-21 20:17       ` Jason Gunthorpe
  0 siblings, 1 reply; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21 19:47 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Boris Ostrovsky, Wei Liu, Dave Jiang, Baolu Lu,
	Marc Zyngier, x86, Megha Dey, xen-devel, Kevin Tian,
	Konrad Rzeszutek Wilk, Haiyang Zhang, Alex Williamson,
	Stefano Stabellini, Bjorn Helgaas, Stephen Hemminger,
	Dan Williams, Jon Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, LKML, iommu, Jacob Pan, Rafael J. Wysocki

On Fri, Aug 21 2020 at 09:45, Jason Gunthorpe wrote:
> On Fri, Aug 21, 2020 at 02:25:02AM +0200, Thomas Gleixner wrote:
>> +static void ims_mask_irq(struct irq_data *data)
>> +{
>> +	struct msi_desc *desc = irq_data_get_msi_desc(data);
>> +	struct ims_array_slot __iomem *slot = desc->device_msi.priv_iomem;
>> +	u32 __iomem *ctrl = &slot->ctrl;
>> +
>> +	iowrite32(ioread32(ctrl) & ~IMS_VECTOR_CTRL_UNMASK, ctrl);
>
> Just to be clear, this is exactly the sort of operation we can't do
> with non-MSI interrupts. For a real PCI device to execute this it
> would have to keep the data on die.

We means NVIDIA and your new device, right?

So if I understand correctly then the queue memory where the MSI
descriptor sits is in RAM.

How is that supposed to work if interrupt remapping is disabled?

That means irq migration and proper disabling of an interrupt become an
interesting exercise. I'm so not looking forward to that.

If interrupt remapping is enabled then both are trivial because then the
irq chip can delegate everything to the parent chip, i.e. the remapping
unit.

Can you please explain that a bit more precisely?

> I saw the idxd driver was doing something like this, I assume it
> avoids trouble because it is a fake PCI device integrated with the
> CPU, not on a real PCI bus?

That's how it is implemented as far as I understood the patches. It's
device memory therefore iowrite32().

> It is really nice to see irq_domain used properly in x86!

If you ignore the abuse in XEN :)

And to be fair proper and usable (hierarchical) irq domains originate
from x86 and happened to solve quite a few horrorshows on the ARM side.

Just back then when we converted the original maze, nobody had a good
idea and the stomach to touch XEN.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 38/38] irqchip: Add IMS array driver - NOT FOR MERGING
  2020-08-21 19:47     ` Thomas Gleixner
@ 2020-08-21 20:17       ` Jason Gunthorpe
  2020-08-21 23:47         ` Thomas Gleixner
  0 siblings, 1 reply; 71+ messages in thread
From: Jason Gunthorpe @ 2020-08-21 20:17 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Boris Ostrovsky, Wei Liu, Dave Jiang, Baolu Lu,
	Marc Zyngier, x86, Megha Dey, xen-devel, Kevin Tian,
	Konrad Rzeszutek Wilk, Haiyang Zhang, Alex Williamson,
	Stefano Stabellini, Bjorn Helgaas, Stephen Hemminger,
	Dan Williams, Jon Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, LKML, iommu, Jacob Pan, Rafael J. Wysocki

On Fri, Aug 21, 2020 at 09:47:43PM +0200, Thomas Gleixner wrote:
> On Fri, Aug 21 2020 at 09:45, Jason Gunthorpe wrote:
> > On Fri, Aug 21, 2020 at 02:25:02AM +0200, Thomas Gleixner wrote:
> >> +static void ims_mask_irq(struct irq_data *data)
> >> +{
> >> +	struct msi_desc *desc = irq_data_get_msi_desc(data);
> >> +	struct ims_array_slot __iomem *slot = desc->device_msi.priv_iomem;
> >> +	u32 __iomem *ctrl = &slot->ctrl;
> >> +
> >> +	iowrite32(ioread32(ctrl) & ~IMS_VECTOR_CTRL_UNMASK, ctrl);
> >
> > Just to be clear, this is exactly the sort of operation we can't do
> > with non-MSI interrupts. For a real PCI device to execute this it
> > would have to keep the data on die.
> 
> We means NVIDIA and your new device, right?

We'd like to use this in the current Mellanox NIC HW, eg the mlx5
driver. (NVIDIA acquired Mellanox recently)

> So if I understand correctly then the queue memory where the MSI
> descriptor sits is in RAM.

Yes, IMHO that is the whole point of this 'IMS' stuff. If devices
could have enough on-die memory then they could just use really big
MSI-X tables. Currently due to on-die memory constraints mlx5 is
limited to a few hundred MSI-X vectors.

Since MSI-X tables are exposed via MMIO they can't be 'swapped' to
RAM.

Moving away from MSI-X's MMIO access model allows them to be swapped
to RAM. The cost is that accessing them for update is a
command/response operation not a MMIO operation.

The HW is already swapping the queues causing the interrupts to RAM,
so adding a bit of additional data to store the MSI addr/data is
reasonable.

To give some sense, a 'working set' for the NIC device in some cases
can be hundreds of megabytes of data. System RAM is used to store
this, and precious on-die memory holds some dynamic active set, much
like a processor cache.

> How is that supposed to work if interrupt remapping is disabled?

The best we can do is issue a command to the device and spin/sleep
until completion. The device will serialize everything internally.

If the device has died the driver has code to detect and trigger a
PCI function reset which will definitely stop the interrupt.

So, the implementation of these functions would be to push any change
onto a command queue, trigger the device to DMA the command, spin/sleep
until the device returns a response and then continue on. If the
device doesn't return a response in a time window then trigger a WQ to
do a full device reset.

The spin/sleep is only needed if the update has to be synchronous, so
things like rebalancing could just push the rebalancing work and
immediately return.
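
As a rough illustration of that flow, here is a minimal userspace model in
C. Everything in it is hypothetical (struct fake_dev, struct fake_cmd,
dev_update_msi_sync() and the poll budget are invented for this sketch and
are not mlx5 code). It only shows the shape: queue a command, wait for the
response, and schedule a reset if the response does not arrive in time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum cmd_status { CMD_OK, CMD_TIMEOUT };

/* Hypothetical device model, not a real driver structure. */
struct fake_dev {
	unsigned int latency;   /* polls needed before a command completes */
	bool dead;              /* a dead device never responds */
	bool reset_scheduled;   /* stand-in for "trigger a WQ to reset" */
	uint32_t msi_data;      /* the device's current view of the MSI data */
};

struct fake_cmd {
	uint32_t new_msi_data;
	unsigned int polls;
	bool done;
};

/* One step of the simulated device processing a queued command. */
static void fake_dev_poll(struct fake_dev *dev, struct fake_cmd *cmd)
{
	if (dev->dead || cmd->done)
		return;
	if (++cmd->polls > dev->latency) {
		dev->msi_data = cmd->new_msi_data;
		cmd->done = true;
	}
}

/* Synchronous update: queue the command, wait, reset on timeout. */
static enum cmd_status dev_update_msi_sync(struct fake_dev *dev,
					   uint32_t msi_data,
					   unsigned int max_polls)
{
	struct fake_cmd cmd = { .new_msi_data = msi_data };

	assert(max_polls > 0);
	for (unsigned int i = 0; i < max_polls; i++) {
		fake_dev_poll(dev, &cmd);	/* stands in for spin/sleep */
		if (cmd.done)
			return CMD_OK;
	}
	dev->reset_scheduled = true;
	return CMD_TIMEOUT;
}
```

An asynchronous caller would queue the command and return without running
the wait loop.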

> If interrupt remapping is enabled then both are trivial because then the
> irq chip can delegate everything to the parent chip, i.e. the remapping
> unit.

I did like this notion that IRQ remapping could avoid the overhead of
spin/sleep. Most of the use cases we have for this will require the
IOMMU anyhow.

> > I saw the idxd driver was doing something like this, I assume it
> > avoids trouble because it is a fake PCI device integrated with the
> > CPU, not on a real PCI bus?
> 
> That's how it is implemented as far as I understood the patches. It's
> device memory therefore iowrite32().

I don't know anything about idxd.. Given the scale of interrupt need I
assumed the idxd HW had some hidden swapping to RAM. 

Since it is on-die with the CPU there are a bunch of ways I could
imagine Intel could make MMIO triggered swapping work that are not
available to a true PCI-E device.

Jason

* Re: [patch RFC 38/38] irqchip: Add IMS array driver - NOT FOR MERGING
  2020-08-21 20:17       ` Jason Gunthorpe
@ 2020-08-21 23:47         ` Thomas Gleixner
  2020-08-22  0:51           ` Jason Gunthorpe
  0 siblings, 1 reply; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-21 23:47 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Boris Ostrovsky, Wei Liu, Dave Jiang, Baolu Lu,
	Marc Zyngier, x86, Megha Dey, xen-devel, Kevin Tian,
	Konrad Rzeszutek Wilk, Haiyang Zhang, Alex Williamson,
	Stefano Stabellini, Bjorn Helgaas, Stephen Hemminger,
	Dan Williams, Jon Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, LKML, iommu, Jacob Pan, Rafael J. Wysocki

On Fri, Aug 21 2020 at 17:17, Jason Gunthorpe wrote:
> On Fri, Aug 21, 2020 at 09:47:43PM +0200, Thomas Gleixner wrote:
>> So if I understand correctly then the queue memory where the MSI
>> descriptor sits is in RAM.
>
> Yes, IMHO that is the whole point of this 'IMS' stuff. If devices
> could have enough on-die memory then they could just use really big
> MSI-X tables. Currently due to on-die memory constraints mlx5 is
> limited to a few hundred MSI-X vectors.

Right, that's the limit of a particular device, but nothing prevents you
to have a larger table on a new device.

The MSI-X limitation to 2048 is defined by the PCI spec and you'd need
either some non spec compliant abuse of the reserved size bits or some
extra config entry. So IMS is a way to work around that. But I
understand why you want to move them to main memory, but you have to
deal with the problems this creates upfront.

> Since MSI-X tables are exposed via MMIO they can't be 'swapped' to
> RAM.
>
> Moving away from MSI-X's MMIO access model allows them to be swapped
> to RAM. The cost is that accessing them for update is a
> command/response operation not a MMIO operation.
>
> The HW is already swapping the queues causing the interrupts to RAM,
> so adding a bit of additional data to store the MSI addr/data is
> reasonable.

Makes sense.

>> How is that supposed to work if interrupt remapping is disabled?
>
> The best we can do is issue a command to the device and spin/sleep
> until completion. The device will serialize everything internally.
>
> If the device has died the driver has code to detect and trigger a
> PCI function reset which will definitely stop the interrupt.

If that interrupt is gone into storm mode for some reason then this will
render your machine unusable before you can do that.

> So, the implementation of these functions would be to push any change
> onto a command queue, trigger the device to DMA the command, spin/sleep
> until the device returns a response and then continue on. If the
> device doesn't return a response in a time window then trigger a WQ to
> do a full device reset.

I really don't want to do that with the irq descriptor lock held or in
case of affinity from the interrupt handler as we have to do with PCI
MSI/MSI-X due to the horrors of the X86 interrupt delivery trainwreck.
Also you cannot call into command queue code from interrupt disabled and
interrupt descriptor lock held sections. You can try, but lockdep will
yell at you immediately. 

There is also CPU hotplug where we have to force migrate an interrupt
away from an outgoing CPU. This needs some serious thought.

One question is whether the device can see partial updates to that
memory due to the async 'swap' of context from the device CPU.

So we have address_lo, address_hi, data and ctrl. Each of them 32 bit.

address_hi is only relevant when the number of CPUs is > 255 which
requires X2APIC which in turn requires interrupt remapping. For all
others the address_hi value never changes. Let's ignore that case for
now, but see further down.

So what's interesting is address_lo and data. If the device sees a
partial update, i.e. address_lo is written and the device grabs the
update before data is written then the next interrupt will end up in
lala land. We have code for that in place in msi_set_affinity() in
arch/x86/kernel/apic/msi.c. Get eyecancer protection glasses before
opening that and keep beer ready to wipe out the horrors immediately
afterwards.

If the device updates the data only when a command is issued then this
is not a problem, but it causes other problems because you still cannot
access the command queue from that context. This makes it even worse for
the CPU hotplug case. But see all of the reasoning on that.

If it takes whatever it sees while grabbing the state when switching to
a different queue or at the point of actual interrupt delivery, then you
have a problem. Not only you, I'm going to have one as well because I'm
going to be the poor sod to come up with the workaround.

So we better address that _before_ you start deploying this piece of
art. I'm not really interested in another slightly different and probably
more horrible version of the same story. Don't blame me, it's the way
how Intel decided to make this "work".

There are a couple of options to ensure that the device will never see
inconsistent state:

      1) Use a locked 16 byte wide operation (cmpxchg16b) which is not
         available on 32bit

      2) Order the MSG entry differently in the queue storage:

         u32 address_lo
         u32 data
         u32 address_hi
         u32 ctrl

         And then enforce an 8 byte store on 64 bit which is guaranteed
         to be atomic vs. other CPUs and bus agents, i.e. DMA.

         I said enforce because compilers are known to do stupid things.

Both are fine for me and the only caveat is that the access does not go
across a cache line boundary. The restriction to 64bit shouldn't be a
problem either. Running such a device on 32bit causes more problems than
it solves :)
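
To make option 2 concrete, here is a minimal userspace sketch in C. The
struct name ims_msg_slot, the helper ims_slot_write_msg() and the use of
the GCC/Clang __atomic_store_n() builtin are assumptions made for this
illustration; kernel code would use its own accessors. The point is only
that address_lo and data share one naturally aligned 8 byte word which is
written with a single store that the compiler may not split.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical slot layout following option 2: address_lo and data are
 * adjacent, so both fit into one naturally aligned 64-bit word.
 */
struct ims_msg_slot {
	union {
		struct {
			uint32_t address_lo;
			uint32_t data;
		};
		uint64_t lo_data;	/* the word which is updated atomically */
	};
	uint32_t address_hi;
	uint32_t ctrl;
} __attribute__((aligned(16)));

/* Update address_lo and data with one 8 byte store. */
static void ims_slot_write_msg(struct ims_msg_slot *slot,
			       uint32_t address_lo, uint32_t data)
{
	/* Build the 64-bit image in memory order, independent of endianness */
	union {
		uint32_t w[2];
		uint64_t q;
	} tmp = { .w = { address_lo, data } };

	/* Natural alignment also keeps the store inside one cache line */
	assert(((uintptr_t)&slot->lo_data & 7) == 0);
	__atomic_store_n(&slot->lo_data, tmp.q, __ATOMIC_RELEASE);
}
```

The builtin is what does the "enforce" part here: a plain struct
assignment would leave the compiler free to emit two 32-bit stores.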

> The spin/sleep is only needed if the update has to be synchronous, so
> things like rebalancing could just push the rebalancing work and
> immediately return.

Interrupt migration is async anyway. An interrupt might have been sent
to the old vector just before the new vector was written. That's already
dealt with. The old vector is cleaned up when the first interrupt
arrives on the new vector which is the most reliable indicator that it's
done.

In that case you can avoid issuing a command, but that needs some
thought as well when the queue data is never reloaded. But you can mark
the queue that affinity has changed and let the next operation on the
queue (RX, TX, whatever) which needs to talk to the device anyway deal
with it, i.e. set some command flag in the next operation which tells
the queue to reload that message.
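
A minimal sketch of that deferred reload idea, as a userspace model in C11.
The names (struct ims_queue, QCMD_FLAG_RELOAD_MSG, queue_set_affinity(),
queue_next_cmd_flags()) are hypothetical and only illustrate the
bookkeeping: the affinity change writes the message and marks the queue,
and the next regular queue operation picks up the mark exactly once.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QCMD_FLAG_RELOAD_MSG	0x1u	/* hypothetical command flag */

/* Hypothetical per queue state kept in host RAM. */
struct ims_queue {
	_Atomic uint64_t msg;		/* address_lo in bits 0-31, data in 32-63 */
	atomic_bool msg_changed;	/* set when the device must reload msg */
};

/* Affinity change: update the message and mark it, no device command. */
static void queue_set_affinity(struct ims_queue *q,
			       uint32_t address_lo, uint32_t data)
{
	assert(q != NULL);
	atomic_store_explicit(&q->msg, ((uint64_t)data << 32) | address_lo,
			      memory_order_release);
	atomic_store_explicit(&q->msg_changed, true, memory_order_release);
}

/* Flags for the next regular queue operation (RX, TX, whatever). */
static uint32_t queue_next_cmd_flags(struct ims_queue *q)
{
	assert(q != NULL);
	/* The exchange consumes the mark, so the reload is requested once */
	return atomic_exchange(&q->msg_changed, false) ?
		QCMD_FLAG_RELOAD_MSG : 0;
}
```

Several affinity changes between two queue operations collapse into one
reload request, which is fine because only the latest message matters.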

The only exception is CPU hotplug, but I have an idea how to deal with
that.

Aside of that some stuff wants to be synchronous though, e.g. shutdown,
startup.

irq chips have already a mechanism in place to deal with stuff which
cannot be handled from within the irq descriptor spinlock held and
interrupt disabled section.

The mechanism was invented to deal with interrupt chips which are
connected to i2c, spi, etc.. The access to an interrupt chip control
register has to queue stuff on the bus and wait for completion.
Obviously not what you can do from interrupt disabled, raw spinlock held
context either.

So we have for most operations (except affinity setting) the concept of
update on lock release. For these devices the interrupt chip which
handles all lines on that controller on the slow bus has an additional
lock, called bus lock. The core code does not know about that lock at
all. It's managed at the irq chip side.

The irqchip has two callbacks: irq_bus_lock() and irq_bus_sync_unlock().
irq_bus_lock() is invoked before interrupts are disabled and the
spinlock is taken and irq_bus_sync_unlock() after releasing the spinlock
and reenabling interrupts. The "real" chip operations like mask, unmask
etc. are operating on a chip internal state cache.

For such devices irq_bus_lock() usually takes a sleepable lock (mutex)
to protect the state cache and the update logic over the slow bus.

irq_bus_sync_unlock() releases that lock, but before doing so it checks
whether the operation has changed the state cache and if so it queues a
command on the slow bus and waits for completion.

That makes sure that the device state and the state cache are in sync
before the next operation on a maybe different irq line on the same chip
happens.

Now for your case you might just not have irq_mask()/irq_unmask() callbacks or
simple ones which just update the queue memory in RAM, but then you want
irq_disable()/irq_enable() callbacks which manipulate state cache and
then provide the irq_bus_lock() and irq_bus_sync_unlock() callbacks as
well which do not necessarily need a lock underneath, but the unlock
side implements the 'Queue command and wait for completion' part.
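
The state cache pattern can be modelled in a few lines of userspace C.
This is only a sketch under assumptions: struct slow_chip and the chip_*()
helpers are invented names that mirror the roles of irq_bus_lock(),
irq_enable()/irq_disable() and irq_bus_sync_unlock(), and the "command" is
reduced to copying the cache and bumping a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a chip which can only be updated via commands. */
struct slow_chip {
	uint32_t cached_enable;		/* state cache, updated in atomic context */
	uint32_t device_enable;		/* what the device currently uses */
	bool bus_locked;
	unsigned int commands_sent;
};

/* Role of irq_bus_lock(): sleepable context, before the spinlock. */
static void chip_bus_lock(struct slow_chip *chip)
{
	assert(!chip->bus_locked);	/* a real chip would take a mutex */
	chip->bus_locked = true;
}

/* Role of irq_enable(): runs with the lock held, touches only the cache. */
static void chip_irq_enable(struct slow_chip *chip, unsigned int hwirq)
{
	assert(hwirq < 32);
	chip->cached_enable |= UINT32_C(1) << hwirq;
}

/* Role of irq_disable(): same, cache update only. */
static void chip_irq_disable(struct slow_chip *chip, unsigned int hwirq)
{
	assert(hwirq < 32);
	chip->cached_enable &= ~(UINT32_C(1) << hwirq);
}

/* Role of irq_bus_sync_unlock(): sleepable again, sync cache to device. */
static void chip_bus_sync_unlock(struct slow_chip *chip)
{
	assert(chip->bus_locked);
	if (chip->cached_enable != chip->device_enable) {
		/* Stand-in for 'Queue command and wait for completion' */
		chip->device_enable = chip->cached_enable;
		chip->commands_sent++;
	}
	chip->bus_locked = false;
}
```

Note that a command is only issued when the cache actually differs from
the device state, so a lock/unlock pair without changes costs nothing.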

Now coming back to affinity setting. I'd love to avoid adding the bus
lock magic to those interfaces because until now they can be called and
are called from atomic contexts. And obviously none of the devices which
use the buslock magic support affinity setting because they all deliver
a single interrupt to a demultiplex interrupt and that one is usually
sitting at the CPU level where interrupt steering works.

If we really can get away with atomically updating the message as
outlined above and just let it happen at some point in the future then
most problems are solved, except for the nastiness of CPU hotplug.

But that's actually a non issue. Nothing prevents us from having an
early 'migrate interrupts away from the outgoing CPU hotplug state'
which runs in thread context and can therefore utilize the buslock
mechanism. Actually I was thinking about that for other reasons already.

That state would need some thought and consequently some minor changes
to the affinity mask checks to prevent that the interrupt gets migrated
back to the outgoing CPU before that CPU reaches offline state. Nothing
fundamental though.

Just to be clear: We really need to do that at the core level and not
again in some dark place in a driver as that will cause state
inconsistency and hard to debug wreckage.

>> If interrupt remapping is enabled then both are trivial because then the
>> irq chip can delegate everything to the parent chip, i.e. the remapping
>> unit.
>
> I did like this notion that IRQ remapping could avoid the overhead of
> spin/sleep. Most of the use cases we have for this will require the
> IOMMU anyhow.

You still need to support !remap scenarios I fear.

And even for the remap case you need some of that bus lock magic to
handle startup and teardown properly without the usual horrible hacks in
the driver.

If your hard^Wfirmware does the right thing then the only place you need
to worry about the command queueing is startup and teardown and the
extra bit for the early hotplug migration.

Let me summarize what I think would be the sane solution for this:

  1) Utilize atomic writes for either all 16 bytes or reorder the bytes
     and update 8 bytes atomically which is sufficient as the wide
     address is only used with irq remapping and the MSI message in the
     device is never changed after startup.

  2) No requirement for issuing a command for regular migration
     operations as they have no requirements to be synchronous.

     Eventually store some state to force a reload on the next regular
     queue operation.

  3) No requirement for issuing a command for mask and unmask operations.
     The core code uses and handles lazy masking already. So if the
     hardware causes the laziness, so be it.

  4) Issue commands for startup and teardown as they need to be
     synchronous

  5) Have an early migration state for CPU hotunplug which issues a
     command from appropriate context. That would even allow to handle
     queue shutdown for managed interrupts when the last CPU in the
     managed affinity set goes down. Restart of such a managed interrupt
     when the first CPU in an affinity set comes online again would only
     need minor modifications of the existing code to make it work.
     
Thoughts?

Thanks,

        tglx

* Re: [patch RFC 38/38] irqchip: Add IMS array driver - NOT FOR MERGING
  2020-08-21 23:47         ` Thomas Gleixner
@ 2020-08-22  0:51           ` Jason Gunthorpe
  2020-08-22  1:34             ` Thomas Gleixner
  0 siblings, 1 reply; 71+ messages in thread
From: Jason Gunthorpe @ 2020-08-22  0:51 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Boris Ostrovsky, Wei Liu, Dave Jiang, Baolu Lu,
	Marc Zyngier, x86, Megha Dey, xen-devel, Kevin Tian,
	Konrad Rzeszutek Wilk, Haiyang Zhang, Alex Williamson,
	Stefano Stabellini, Bjorn Helgaas, Stephen Hemminger,
	Dan Williams, Jon Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, LKML, iommu, Jacob Pan, Rafael J. Wysocki

On Sat, Aug 22, 2020 at 01:47:12AM +0200, Thomas Gleixner wrote:
> On Fri, Aug 21 2020 at 17:17, Jason Gunthorpe wrote:
> > On Fri, Aug 21, 2020 at 09:47:43PM +0200, Thomas Gleixner wrote:
> >> So if I understand correctly then the queue memory where the MSI
> >> descriptor sits is in RAM.
> >
> > Yes, IMHO that is the whole point of this 'IMS' stuff. If devices
> > could have enough on-die memory then they could just use really big
> > MSI-X tables. Currently due to on-die memory constraints mlx5 is
> > limited to a few hundred MSI-X vectors.
> 
> Right, that's the limit of a particular device, but nothing prevents you
> to have a larger table on a new device.

Well, physics are a problem.. The SRAM to store the MSI vectors costs
die space and making the chip die larger is not an option. So the
question is what do you throw out of the chip to get a 10-20x increase
in MSI SRAM?

This is why using host memory is so appealing. It is
economically/functionally better.

I'm going to guess other HW is in the same situation, virtualization
is really pushing up the number of required IRQs.

> >> How is that supposed to work if interrupt remapping is disabled?
> >
> > The best we can do is issue a command to the device and spin/sleep
> > until completion. The device will serialize everything internally.
> >
> > If the device has died the driver has code to detect and trigger a
> > PCI function reset which will definitely stop the interrupt.
> 
> If that interrupt is gone into storm mode for some reason then this will
> render your machine unusable before you can do that.

Yes, but in general the HW design is to have one-shot interrupts, it
would have to be well off the rails to storm. The kind of off the
rails where it could also be doing crazy stuff on PCI-E that would be
very harmful.

> > So, the implementation of these functions would be to push any change
> > onto a command queue, trigger the device to DMA the command, spin/sleep
> > until the device returns a response and then continue on. If the
> > device doesn't return a response in a time window then trigger a WQ to
> > do a full device reset.
> 
> I really don't want to do that with the irq descriptor lock held or in
> case of affinity from the interrupt handler as we have to do with PCI
> MSI/MSI-X due to the horrors of the X86 interrupt delivery trainwreck.
> Also you cannot call into command queue code from interrupt disabled and
> interrupt descriptor lock held sections. You can try, but lockdep will
> yell at you immediately.

Yes, I wouldn't want to do this from an IRQ.

> One question is whether the device can see partial updates to that
> memory due to the async 'swap' of context from the device CPU.

It is worse than just partial updates.. The device operation is much
more like you'd imagine a CPU cache. There could be copies of the RAM
in the device for long periods of time, dirty data in the device that
will flush back to CPU RAM overwriting CPU changes, etc.

Without involving the device there is just no way to create data
consistency, and no way to change the data from the CPU. 

This is the down side of having device data in the RAM. It cannot be
so simple as 'just fetch it every time before you use it' as
performance would be horrible.

> irq chips have already a mechanism in place to deal with stuff which
> cannot be handled from within the irq descriptor spinlock held and
> interrupt disabled section.
> 
> The mechanism was invented to deal with interrupt chips which are
> connected to i2c, spi, etc.. The access to an interrupt chip control
> register has to queue stuff on the bus and wait for completion.
> Obviously not what you can do from interrupt disabled, raw spinlock held
> context either.

Ah interesting, sounds like the right parts! I didn't know about this..

> Now coming back to affinity setting. I'd love to avoid adding the bus
> lock magic to those interfaces because until now they can be called and
> are called from atomic contexts. And obviously none of the devices which
> use the buslock magic support affinity setting because they all deliver
> a single interrupt to a demultiplex interrupt and that one is usually
> sitting at the CPU level where interrupt steering works.
> 
> If we really can get away with atomically updating the message as
> outlined above and just let it happen at some point in the future then
> most problems are solved, except for the nastiness of CPU hotplug.

Since we can't avoid a device command, I'm thinking more along the lines
of having the affinity update trigger an async WQ to issue the command
from a thread context. Since it doesn't need to be synchronous it can
make it out 'eventually'.

I suppose the core code could provide this as a service? Sort of a
variant of the other lazy things above?

> But that's actually a non issue. Nothing prevents us from having an
> early 'migrate interrupts away from the outgoing CPU hotplug state'
> which runs in thread context and can therefore utilize the buslock
> mechanism. Actually I was thinking about that for other reasons already.

That would certainly work well, seems like it fits with the other
lazy/sleeping stuff above as well.

> >> If interrupt remapping is enabled then both are trivial because then the
> >> irq chip can delegate everything to the parent chip, i.e. the remapping
> >> unit.
> >
> > I did like this notion that IRQ remapping could avoid the overhead of
> > spin/sleep. Most of the use cases we have for this will require the
> > IOMMU anyhow.
> 
> You still need to support !remap scenarios I fear.

For x86 I think we could accept linking this to IOMMU, if really
necessary.

But it would have to work with ARM - is remapping a x86 only thing?
Does ARM put the affinity in the GIC tables not in the MSI data?

> Let me summarize what I think would be the sane solution for this:
> 
>   1) Utilize atomic writes for either all 16 bytes or reorder the bytes
>      and update 8 bytes atomically which is sufficient as the wide
>      address is only used with irq remapping and the MSI message in the
>      device is never changed after startup.

Sadly not something the device can manage due to data coherence

>   2) No requirement for issuing a command for regular migration
>      operations as they have no requirements to be synchronous.
> 
>      Eventually store some state to force a reload on the next regular
>      queue operation.

Would the async version above be OK?

>   3) No requirement for issuing a command for mask and unmask operations.
>      The core code uses and handles lazy masking already. So if the
>      hardware causes the laziness, so be it.

This lazy masking thing sounds good, I'm totally unfamiliar with it
though.

>   4) Issue commands for startup and teardown as they need to be
>      synchronous

Yep

>   5) Have an early migration state for CPU hotunplug which issues a
>      command from appropriate context. That would even allow to handle
>      queue shutdown for managed interrupts when the last CPU in the
>      managed affinity set goes down. Restart of such a managed interrupt
>      when the first CPU in an affinity set comes online again would only
>      need minor modifications of the existing code to make it work.

Yep

> Thoughts?

This email is super helpful, I definitely don't know all these corners
of the IRQ subsystem as my past with it has mostly been SOC stuff that
isn't as complicated!

Thanks,
Jason

* Re: [patch RFC 38/38] irqchip: Add IMS array driver - NOT FOR MERGING
  2020-08-22  0:51           ` Jason Gunthorpe
@ 2020-08-22  1:34             ` Thomas Gleixner
  2020-08-22 23:05               ` Jason Gunthorpe
  0 siblings, 1 reply; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-22  1:34 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Boris Ostrovsky, Wei Liu, Dave Jiang, Baolu Lu,
	Marc Zyngier, x86, Megha Dey, xen-devel, Kevin Tian,
	Konrad Rzeszutek Wilk, Haiyang Zhang, Alex Williamson,
	Stefano Stabellini, Bjorn Helgaas, Stephen Hemminger,
	Dan Williams, Jon Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, LKML, iommu, Jacob Pan, Rafael J. Wysocki

Jason,

On Fri, Aug 21 2020 at 21:51, Jason Gunthorpe wrote:
> On Sat, Aug 22, 2020 at 01:47:12AM +0200, Thomas Gleixner wrote:
>> > If the device has died the driver has code to detect and trigger a
>> > PCI function reset which will definitely stop the interrupt.
>> 
>> If that interrupt is gone into storm mode for some reason then this will
>> render your machine unusable before you can do that.
>
> Yes, but in general the HW design is to have one-shot interrupts, it
> would have to be well off the rails to storm. The kind of off the
> rails where it could also be doing crazy stuff on PCI-E that would be
> very harmful.

Yeah. One shot should prevent most of the wreckage. I just wanted to
spell it out.

>> One question is whether the device can see partial updates to that
>> memory due to the async 'swap' of context from the device CPU.
>
> It is worse than just partial updates.. The device operation is much
> more like you'd imagine a CPU cache. There could be copies of the RAM
> in the device for long periods of time, dirty data in the device that
> will flush back to CPU RAM overwriting CPU changes, etc.

TBH, that's insane. You clearly want to think about this some more. If
you swap out device state and device control state then you definitely
want to have regions which are read only from the device POV and never
written back. The MSI msg store clearly belongs into that category.
But that's not restricted to the MSI msg store, there is certainly other
stuff which never wants to be written back by the device.

If you don't do that then you simply can't write to that space from the
CPU and you have to transport this kind of information always via command
queues.

But that does not make sense. It's trivial enough to have

    | RO state |
    | RW state |

and on swap in the whole thing is DMA'd into the device and on swap out
only the RW state part. It's not rocket science and makes a huge amount
of sense.
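
As a sketch of that split, here is a small userspace model in C. The
structure and function names (queue_ro_state, queue_rw_state, queue_ctx,
dev_swap_in(), dev_swap_out()) and the example fields are hypothetical;
the only point is that swap in copies the whole context while swap out
writes back just the RW part, so a CPU update to the RO part in host
memory can never be overwritten by the device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Written by the CPU only, never written back by the device. */
struct queue_ro_state {
	uint32_t msi_address_lo;
	uint32_t msi_data;
};

/* Owned by the device while the context is swapped in. */
struct queue_rw_state {
	uint32_t producer;
	uint32_t consumer;
};

/* Hypothetical queue context as stored in host RAM. */
struct queue_ctx {
	struct queue_ro_state ro;
	struct queue_rw_state rw;
};

/* Swap in: the whole context is DMA'd into the device. */
static void dev_swap_in(struct queue_ctx *on_die, const struct queue_ctx *host)
{
	assert(on_die != NULL && host != NULL);
	memcpy(on_die, host, sizeof(*on_die));
}

/* Swap out: only the RW part goes back to host RAM. */
static void dev_swap_out(struct queue_ctx *host, const struct queue_ctx *on_die)
{
	assert(on_die != NULL && host != NULL);
	memcpy(&host->rw, &on_die->rw, sizeof(host->rw));
}
```

The device copy of the RO part stays stale until the next swap in, which
is where some mechanism to trigger a reload comes in.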

> Without involving the device there is just no way to create data
> consistency, and no way to change the data from the CPU. 
>
> This is the down side of having device data in the RAM. It cannot be
> so simple as 'just fetch it every time before you use it' as
> performance would be horrible.

That's clear, but with a proper separation like the above and some extra
mechanism which allows you to tickle a reload of 'RO state' you can
avoid quite some of the problems which you create otherwise.

>> If we really can get away with atomically updating the message as
>> outlined above and just let it happen at some point in the future then
>> most problems are solved, except for the nastiness of CPU hotplug.
>
> Since we can't avoid a device command, I'm thinking more along the lines
> of having the affinity update trigger an async WQ to issue the command
> from a thread context. Since it doesn't need to be synchronous it can
> make it out 'eventually'.
>
> I suppose the core code could provide this as a service? Sort of a
> variant of the other lazy things above?

Kinda. That needs a lot of thought for the affinity setting stuff
because it can be called from contexts which do not allow that. It's
solvable though, but I clearly need to stare at the corner cases for a
while.

> But it would have to work with ARM - is remapping a x86 only thing?

No. ARM64 has that as well.

> Does ARM put the affinity in the GIC tables not in the MSI data?

IIRC, yes.

>> Let me summarize what I think would be the sane solution for this:
>> 
>>   1) Utilize atomic writes for either all 16 bytes or reorder the bytes
>>      and update 8 bytes atomically which is sufficient as the wide
>>      address is only used with irq remapping and the MSI message in the
>>      device is never changed after startup.
>
> Sadly not something the device can manage due to data coherence

I disagree :)

>>   2) No requirement for issuing a command for regular migration
>>      operations as they have no requirements to be synchronous.
>> 
>>      Eventually store some state to force a reload on the next regular
>>      queue operation.
>
> Would the async version above be OK?

Async is fine in any variant (except for hotplug). Though having an
async WQ or whatever there needs some thought.

>>   3) No requirement for issuing a command for mask and unmask operations.
>>      The core code uses and handles lazy masking already. So if the
>>      hardware causes the laziness, so be it.
>
> This lazy masking thing sounds good, I'm totally unfamiliar with it
> though.

It's used to avoid irq chip (often MMIO) access in scenarios where
disable/enable of an interrupt line happens with high frequency. Serial
has that issue. So we mark it disabled, but do not mask it, and the core
masks it only once an interrupt comes in while it is marked disabled.
That obviously does not work out of the box to protect against the not
disabled but masked state, but conceptually it's a similar problem and
can be made to work without massive changes.
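
The lazy disable scheme can be modelled roughly as below. This is a
userspace sketch in C with invented names (struct lazy_irq and the lazy_*()
helpers), loosely following what the core does; the replay of a pending
interrupt on enable is part of this model's assumptions. chip_accesses
counts how often the (expensive) irq chip would actually be touched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one interrupt line with lazy disable. */
struct lazy_irq {
	bool disabled;			/* software state only */
	bool masked;			/* state of the mask bit in the chip */
	bool pending;			/* arrived while marked disabled */
	unsigned int chip_accesses;	/* mask/unmask operations on the chip */
	unsigned int handled;		/* handler invocations */
};

/* Disable is lazy: mark the line, do not touch the chip. */
static void lazy_disable(struct lazy_irq *irq)
{
	assert(irq != NULL);
	irq->disabled = true;
}

/* An interrupt arrives from the hardware. */
static void lazy_irq_arrives(struct lazy_irq *irq)
{
	assert(irq != NULL);
	if (irq->masked)
		return;			/* hardware does not deliver it */
	if (irq->disabled) {
		/* Only now pay for the chip access and remember the event */
		irq->masked = true;
		irq->chip_accesses++;
		irq->pending = true;
		return;
	}
	irq->handled++;
}

/* Enable: unmask if the line got masked and replay a pending event. */
static void lazy_enable(struct lazy_irq *irq)
{
	assert(irq != NULL);
	irq->disabled = false;
	if (irq->masked) {
		irq->masked = false;
		irq->chip_accesses++;
	}
	if (irq->pending) {
		irq->pending = false;
		irq->handled++;
	}
}
```

A disable/enable pair with no interrupt in between never touches the chip,
which is the whole point for the high frequency case.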

OTOH, in normal operation for MSI interrupts (edge type) masking is not
used at all and just restricted to startup and teardown.

But I clearly need to think about it with a more awake brain some more.

> This email is super helpful, I definitely don't know all these corners
> of the IRQ subsystem as my past with it has mostly been SOC stuff that
> isn't as complicated!

It's differently complicated and not less horrible :)

Thanks,

        tglx

* Re: [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI
  2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
                   ` (37 preceding siblings ...)
  2020-08-21  0:25 ` [patch RFC 38/38] irqchip: Add IMS array driver - NOT FOR MERGING Thomas Gleixner
@ 2020-08-22 14:19 ` Jürgen Groß
  38 siblings, 0 replies; 71+ messages in thread
From: Jürgen Groß @ 2020-08-22 14:19 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Russ Anderson, Greg Kroah-Hartman,
	iommu, Jacob Pan, Rafael J. Wysocki

On 21.08.20 02:24, Thomas Gleixner wrote:
> First of all, sorry for the horrible long Cc list, which was
> unfortunately unavoidable as this touches the world and some more.
> 
> This patch series aims to provide a base to support device MSI (non
> PCI based) in a halfways architecture independent way.
> 
> It's a mixed bag of bug fixes, cleanups and general improvements which
> are worthwhile independent of the device MSI stuff. Unfortunately this
> also comes with an evil abuse of the irqdomain system to coerce XEN on
> x86 into compliance without rewriting XEN from scratch.
> 
> As discussed at length in this mail thread:
> 
>    https://lore.kernel.org/r/87h7tcgbs2.fsf@nanos.tec.linutronix.de
> 
> the initial attempt of piggybacking device MSI support on platform MSI
> is doomed for various reasons, but creating independent interrupt
> domains for these upcoming magic PCI subdevices which are not PCI, but
> might be exposed as PCI devices is not as trivial as it seems.
> 
> The initially suggested and evaluated approach of extending platform
> MSI turned out to be the completely wrong direction and in fact
> platform MSI should be rewritten on top of device MSI or completely
> replaced by it.
> 
> One of the main issues is that x86 does not support the concept of irq
> domain associations stored in device::msi_domain and still relies on
> the arch_*_msi_irqs() fallback implementations which have their own set
> of problems as outlined in
> 
>    https://lore.kernel.org/r/87bljg7u4f.fsf@nanos.tec.linutronix.de/
> 
> in the very same thread.
> 
> The main obstacle of storing that pointer is XEN which has its own
> historical notion of handling PCI MSI interrupts.
> 
> This series tries to address these issues in several steps:
> 
>   1) Accidental bug fixes
> 	iommu/amd: Prevent NULL pointer dereference
> 
>   2) Janitoring
> 	x86/init: Remove unused init ops
> 
>   3) Simplification of the x86 specific interrupt allocation mechanism
> 
> 	x86/irq: Rename X86_IRQ_ALLOC_TYPE_MSI* to reflect PCI dependency
> 	x86/irq: Add allocation type for parent domain retrieval
> 	iommu/vt-d: Consolidate irq domain getter
> 	iommu/amd: Consolidate irq domain getter
> 	iommu/irq_remapping: Consolidate irq domain lookup
> 
>   4) Consolidation of the X86 specific interrupt allocation mechanism to be as close
>      as possible to the generic MSI allocation mechanism which allows to get rid
>      of quite a bunch of x86'isms which are pointless
> 
> 	x86/irq: Prepare consolidation of irq_alloc_info
> 	x86/msi: Consolidate HPET allocation
> 	x86/ioapic: Consolidate IOAPIC allocation
> 	x86/irq: Consolidate DMAR irq allocation
> 	x86/irq: Consolidate UV domain allocation
> 	PCI: MSI: Rework pci_msi_domain_calc_hwirq()
> 	x86/msi: Consolidate MSI allocation
> 	x86/msi: Use generic MSI domain ops
> 
>    5) x86 specific cleanups to remove the dependency on arch_*_msi_irqs()
> 
> 	x86/irq: Move apic_post_init() invocation to one place
> 	x86/pci: Reduce #ifdeffery in PCI init code
> 	x86/irq: Initialize PCI/MSI domain at PCI init time
> 	irqdomain/msi: Provide DOMAIN_BUS_VMD_MSI
> 	PCI: vmd: Mark VMD irqdomain with DOMAIN_BUS_VMD_MSI
> 	PCI: MSI: Provide pci_dev_has_special_msi_domain() helper
> 	x86/xen: Make xen_msi_init() static and rename it to xen_hvm_msi_init()
> 	x86/xen: Rework MSI teardown
> 	x86/xen: Consolidate XEN-MSI init
> 	irqdomain/msi: Allow to override msi_domain_alloc/free_irqs()
> 	x86/xen: Wrap XEN MSI management into irqdomain
> 	iommu/vt-d: Store irq domain in struct device
> 	iommu/amd: Store irq domain in struct device
> 	x86/pci: Set default irq domain in pcibios_add_device()
> 	PCI/MSI: Allow to disable arch fallbacks
> 	x86/irq: Cleanup the arch_*_msi_irqs() leftovers
> 	x86/irq: Make most MSI ops XEN private
> 
>      This one is paving the way to device MSI support, but it comes
>      with an ugly and evil hack. The ability of overriding the default
>      allocation/free functions of an MSI irq domain is useful in general as
>      (hopefully) demonstrated with the device MSI POC, but the abuse
>      in context of XEN is evil. OTOH without enough XENology and without
>      rewriting XEN from scratch wrapping XEN MSI handling into a pseudo
>      irq domain is a reasonable step forward for mere mortals with severely
>      limited XENology. One day the XEN folks might make it a real irq domain.
>      Perhaps when they have to support the same mess on other architectures.
>      Hope dies last...
> 
>      At least the mechanism to override alloc/free turned out to be useful
>      for implementing the base infrastructure for device MSI. So it's not a
>      completely lost case.
> 
>    6) X86 specific preparation for device MSI
> 
>         x86/irq: Add DEV_MSI allocation type
>         x86/msi: Let pci_msi_prepare() handle non-PCI MSI
> 
>    7) Generic device MSI infrastructure
> 
>         platform-msi: Provide default irq_chip:ack
>         platform-msi: Add device MSI infrastructure
> 
>    8) Infrastructure for and a POC of an IMS (Interrupt Message
>       Store) irq domain and irqchip implementation
> 
>         irqdomain/msi: Provide msi_alloc/free_store() callbacks
>         irqchip: Add IMS array driver - NOT FOR MERGING
> 
> The whole lot is also available from git:
> 
>     git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git device-msi
> 
> This has been tested on Intel/AMD/KVM but lacks testing on:
> 
>      - HYPERV (-ENODEV)
>      - VMD enabled systems (-ENODEV)
>      - XEN (-ENOCLUE)

Tested to work in Xen dom0. Network is running fine with eth0 MSI
interrupts being routed through Xen.

You can add my:

Tested-by: Juergen Gross <jgross@suse.com>


Juergen

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 38/38] irqchip: Add IMS array driver - NOT FOR MERGING
  2020-08-22  1:34             ` Thomas Gleixner
@ 2020-08-22 23:05               ` Jason Gunthorpe
  2020-08-23  8:03                 ` Thomas Gleixner
  0 siblings, 1 reply; 71+ messages in thread
From: Jason Gunthorpe @ 2020-08-22 23:05 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Boris Ostrovsky, Wei Liu, Dave Jiang, Baolu Lu,
	Marc Zyngier, x86, Megha Dey, xen-devel, Kevin Tian,
	Konrad Rzeszutek Wilk, Haiyang Zhang, Alex Williamson,
	Stefano Stabellini, Bjorn Helgaas, Stephen Hemminger,
	Dan Williams, Jon Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, LKML, iommu, Jacob Pan, Rafael J. Wysocki

On Sat, Aug 22, 2020 at 03:34:45AM +0200, Thomas Gleixner wrote:
> >> One question is whether the device can see partial updates to that
> >> memory due to the async 'swap' of context from the device CPU.
> >
> > It is worse than just partial updates.. The device operation is much
> > more like you'd imagine a CPU cache. There could be copies of the RAM
> > in the device for long periods of time, dirty data in the device that
> > will flush back to CPU RAM overwriting CPU changes, etc.
> 
> TBH, that's insane. You clearly want to think about this some
> more. If

I think this general design is around 15 years old, across a healthy
number of silicon generations, and a rather large number of shipped
devices. People have thought about it :)

> you swap out device state and device control state then you definitely
> want to have regions which are read only from the device POV and never
> written back. 

It is not as useful as you'd think - the issue with atomicity of
update still largely prevents doing much useful from the CPU, and to
make any CPU side changes visible a device command would still be
needed to synchronize the internal state to that modified memory.

So, CPU centric updates would cover a very limited number of
operations, and a device command is required anyhow. Little is
actually gained.

> The MSI msg store clearly belongs into that category.
> But that's not restricted to the MSI msg store, there is certainly other
> stuff which never wants to be written back by the device.

To get a design where you'd be able to run everything from a CPU
atomic context that can't trigger a WQ..

New silicon would have to implement some MSI-only 'cache' that can
invalidate entries based on a simple MemWr TLP.

Then the affinity update would write to the host memory, then send a
MemWr to the device to trigger invalidate.
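
To make that sequence concrete, here is a minimal userspace sketch, not
driver code. It assumes a hypothetical device that caches one MSI message
and drops the cached copy when an invalidate register is written; every
name in it (msi_msg_model, dev_model, dev_invalidate, cpu_set_affinity) is
made up for illustration and does not correspond to any existing hardware
or kernel interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one MSI message slot; not a kernel structure. */
struct msi_msg_model {
	uint64_t addr;
	uint32_t data;
};

/* Hypothetical device with an MSI-only cache in front of host memory. */
struct dev_model {
	const struct msi_msg_model *host_store;	/* message store in host RAM */
	struct msi_msg_model cached;		/* device-internal copy */
	bool cache_valid;
};

/* Device side: refetch from host memory only after an invalidate. */
static struct msi_msg_model dev_get_msg(struct dev_model *dev)
{
	if (!dev->cache_valid) {
		dev->cached = *dev->host_store;
		dev->cache_valid = true;
	}
	return dev->cached;
}

/* Stands in for the simple MemWr TLP that hits an invalidate register. */
static void dev_invalidate(struct dev_model *dev)
{
	dev->cache_valid = false;
}

/* CPU side: write the host memory first, then trigger the invalidate. */
static void cpu_set_affinity(struct msi_msg_model *store,
			     struct dev_model *dev,
			     uint64_t addr, uint32_t data)
{
	store->addr = addr;
	store->data = data;
	dev_invalidate(dev);
}
```

In this model a plain store to host memory without the invalidate leaves
the device using its stale copy, which is the reason the extra MemWr would
be needed.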

As a silicon design it might work, but it means existing devices can't
be used with this dev_msi. It is also the sort of thing that would
need a standard document to have any hope of multiple vendors fitting
into it. Eg at PCI-SIG or something.

> If you don't do that then you simply can't write to that space from the
> CPU and you have to transport this kind of information always via command
> queues.

Yes, exactly. This is part of the architectural design of the device,
has been for a long time. Has positives and negatives.

> > I suppose the core code could provide this as a service? Sort of a
> > variant of the other lazy things above?
> 
> Kinda. That needs a lot of thought for the affinity setting stuff
> because it can be called from contexts which do not allow that. It's
> solvable though, but I clearly need to stare at the corner cases for a
> while.

If possible, this would be ideal, as we could use the dev_msi on a big
installed base of existing HW.

I suspect other HW can probably fit into this too as the basic
ingredients should be fairly widespread.

Even a restricted version for situations where affinity does not need
a device update would possibly be interesting (eg x86 IOMMU remap, ARM
GIC, etc)

> OTOH, in normal operation for MSI interrupts (edge type) masking is not
> used at all and just restricted to the startup/teardown.

Yeah, at least this device doesn't need masking at runtime, just
startup/teardown and affinity update.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 38/38] irqchip: Add IMS array driver - NOT FOR MERGING
  2020-08-22 23:05               ` Jason Gunthorpe
@ 2020-08-23  8:03                 ` Thomas Gleixner
  0 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-23  8:03 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Boris Ostrovsky, Wei Liu, Dave Jiang, Baolu Lu,
	Marc Zyngier, x86, Megha Dey, xen-devel, Kevin Tian,
	Konrad Rzeszutek Wilk, Haiyang Zhang, Alex Williamson,
	Stefano Stabellini, Bjorn Helgaas, Stephen Hemminger,
	Dan Williams, Jon Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, LKML, iommu, Jacob Pan, Rafael J. Wysocki

On Sat, Aug 22 2020 at 20:05, Jason Gunthorpe wrote:
> On Sat, Aug 22, 2020 at 03:34:45AM +0200, Thomas Gleixner wrote:
> As a silicon design it might work, but it means existing devices can't
> be used with this dev_msi. It is also the sort of thing that would
> need a standard document to have any hope of multiple vendors fitting
> into it. Eg at PCI-SIG or something.

Fair enough.

>> If you don't do that then you simply can't write to that space from the
>> CPU and you have to transport this kind of information always via command
>> queues.
>
> Yes, exactly. This is part of the architectural design of the device,
> has been for a long time. Has positives and negatives.

As always and it clearly follows the general HW design rule "we can fix
that in software".

>> > I suppose the core code could provide this as a service? Sort of a
>> > variant of the other lazy things above?
>> 
>> Kinda. That needs a lot of thought for the affinity setting stuff
>> because it can be called from contexts which do not allow that. It's
>> solvable though, but I clearly need to stare at the corner cases for a
>> while.
>
> If possible, this would be ideal, as we could use the dev_msi on a big
> installed base of existing HW.

I'll have a look, but I'm surely not going to like the outcome.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 22/38] x86/xen: Make xen_msi_init() static and rename it to xen_hvm_msi_init()
  2020-08-21  0:24 ` [patch RFC 22/38] x86/xen: Make xen_msi_init() static and rename it to xen_hvm_msi_init() Thomas Gleixner
@ 2020-08-24  4:48   ` Jürgen Groß
  0 siblings, 0 replies; 71+ messages in thread
From: Jürgen Groß @ 2020-08-24  4:48 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stefano Stabellini,
	Stephen Hemminger, Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe,
	Megha Dey, xen-devel, Kevin Tian, Konrad Rzeszutek Wilk,
	Haiyang Zhang, Alex Williamson, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Russ Anderson, Greg Kroah-Hartman,
	iommu, Jacob Pan, Rafael J. Wysocki

On 21.08.20 02:24, Thomas Gleixner wrote:
> The only user is in the same file and the name is too generic because this
> function is only ever used for HVM domains.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: linux-pci@vger.kernel.org
> Cc: xen-devel@lists.xenproject.org
> Cc: Juergen Gross <jgross@suse.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>

Reviewed-by: Juergen Gross <jgross@suse.com>


Juergen

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 24/38] x86/xen: Consolidate XEN-MSI init
  2020-08-21  0:24 ` [patch RFC 24/38] x86/xen: Consolidate XEN-MSI init Thomas Gleixner
@ 2020-08-24  4:59   ` Jürgen Groß
  2020-08-24 21:21     ` Thomas Gleixner
  0 siblings, 1 reply; 71+ messages in thread
From: Jürgen Groß @ 2020-08-24  4:59 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Russ Anderson, Greg Kroah-Hartman,
	iommu, Jacob Pan, Rafael J. Wysocki

On 21.08.20 02:24, Thomas Gleixner wrote:
> X86 cannot store the irq domain pointer in struct device without breaking
> XEN because the irq domain pointer takes precedence over arch_*_msi_irqs()
> fallbacks.
> 
> To achieve this XEN MSI interrupt management needs to be wrapped into an
> irq domain.
> 
> Move the x86_msi ops setup into a single function to prepare for this.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>   arch/x86/pci/xen.c |   51 ++++++++++++++++++++++++++++++++-------------------
>   1 file changed, 32 insertions(+), 19 deletions(-)
> 
> --- a/arch/x86/pci/xen.c
> +++ b/arch/x86/pci/xen.c
> @@ -371,7 +371,10 @@ static void xen_initdom_restore_msi_irqs
>   		WARN(ret && ret != -ENOSYS, "restore_msi -> %d\n", ret);
>   	}
>   }
> -#endif
> +#else /* CONFIG_XEN_DOM0 */
> +#define xen_initdom_setup_msi_irqs	NULL
> +#define xen_initdom_restore_msi_irqs	NULL
> +#endif /* !CONFIG_XEN_DOM0 */
>   
>   static void xen_teardown_msi_irqs(struct pci_dev *dev)
>   {
> @@ -403,7 +406,31 @@ static void xen_teardown_msi_irq(unsigne
>   	WARN_ON_ONCE(1);
>   }
>   
> -#endif
> +static __init void xen_setup_pci_msi(void)
> +{
> +	if (xen_initial_domain()) {
> +		x86_msi.setup_msi_irqs = xen_initdom_setup_msi_irqs;
> +		x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
> +		x86_msi.restore_msi_irqs = xen_initdom_restore_msi_irqs;
> +		pci_msi_ignore_mask = 1;

This is wrong, as a PVH initial domain shouldn't do the pv settings.

The "if (xen_initial_domain())" should be inside the pv case, like:

if (xen_pv_domain()) {
	if (xen_initial_domain()) {
		...
	} else {
		...
	}
} else if (xen_hvm_domain()) {
	...

Juergen

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 23/38] x86/xen: Rework MSI teardown
  2020-08-21  0:24 ` [patch RFC 23/38] x86/xen: Rework MSI teardown Thomas Gleixner
@ 2020-08-24  5:09   ` Jürgen Groß
  0 siblings, 0 replies; 71+ messages in thread
From: Jürgen Groß @ 2020-08-24  5:09 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Russ Anderson, Greg Kroah-Hartman,
	iommu, Jacob Pan, Rafael J. Wysocki

On 21.08.20 02:24, Thomas Gleixner wrote:
> X86 cannot store the irq domain pointer in struct device without breaking
> XEN because the irq domain pointer takes precedence over arch_*_msi_irqs()
> fallbacks.
> 
> XENs MSI teardown relies on default_teardown_msi_irqs() which invokes
> arch_teardown_msi_irq(). default_teardown_msi_irqs() is a trivial iterator
> over the msi entries associated to a device.
> 
> Implement this loop in xen_teardown_msi_irqs() to prepare for removal of
> the fallbacks for X86.
> 
> This is a preparatory step to wrap XEN MSI alloc/free into a irq domain
> which in turn allows to store the irq domain pointer in struct device and
> to use the irq domain functions directly.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>   arch/x86/pci/xen.c |   23 ++++++++++++++++++-----
>   1 file changed, 18 insertions(+), 5 deletions(-)
> 
> --- a/arch/x86/pci/xen.c
> +++ b/arch/x86/pci/xen.c
> @@ -376,20 +376,31 @@ static void xen_initdom_restore_msi_irqs
>   static void xen_teardown_msi_irqs(struct pci_dev *dev)
>   {
>   	struct msi_desc *msidesc;
> +	int i;
> +
> +	for_each_pci_msi_entry(msidesc, dev) {
> +		if (msidesc->irq) {
> +			for (i = 0; i < msidesc->nvec_used; i++)
> +				xen_destroy_irq(msidesc->irq + i);
> +		}
> +	}
> +}
> +
> +static void xen_pv_teardown_msi_irqs(struct pci_dev *dev)
> +{
> +	struct msi_desc *msidesc = first_pci_msi_entry(dev);
>   
> -	msidesc = first_pci_msi_entry(dev);
>   	if (msidesc->msi_attrib.is_msix)
>   		xen_pci_frontend_disable_msix(dev);
>   	else
>   		xen_pci_frontend_disable_msi(dev);
>   
> -	/* Free the IRQ's and the msidesc using the generic code. */
> -	default_teardown_msi_irqs(dev);
> +	xen_teardown_msi_irqs(dev);
>   }
>   
>   static void xen_teardown_msi_irq(unsigned int irq)
>   {
> -	xen_destroy_irq(irq);
> +	WARN_ON_ONCE(1);
>   }
>   
>   #endif
> @@ -412,7 +423,7 @@ int __init pci_xen_init(void)
>   #ifdef CONFIG_PCI_MSI
>   	x86_msi.setup_msi_irqs = xen_setup_msi_irqs;
>   	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
> -	x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
> +	x86_msi.teardown_msi_irqs = xen_pv_teardown_msi_irqs;
>   	pci_msi_ignore_mask = 1;
>   #endif
>   	return 0;
> @@ -436,6 +447,7 @@ static void __init xen_hvm_msi_init(void
>   	}
>   
>   	x86_msi.setup_msi_irqs = xen_hvm_setup_msi_irqs;
> +	x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
>   	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
>   }
>   #endif
> @@ -472,6 +484,7 @@ int __init pci_xen_initial_domain(void)
>   #ifdef CONFIG_PCI_MSI
>   	x86_msi.setup_msi_irqs = xen_initdom_setup_msi_irqs;
>   	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
> +	x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;

This should be xen_pv_teardown_msi_irqs, as pci_xen_initial_domain() is
called only for the pv initial domain case today.

>   	x86_msi.restore_msi_irqs = xen_initdom_restore_msi_irqs;
>   	pci_msi_ignore_mask = 1;
>   #endif
> 
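
For readers following along, the open-coded iteration from the hunk above
can be modelled in plain userspace C. This is only a sketch under stated
assumptions: msi_desc_model, destroy_log and the model_* helpers are
hypothetical stand-ins for struct msi_desc, xen_destroy_irq() and
for_each_pci_msi_entry(), and the log exists only so the result can be
inspected.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two msi_desc fields the loop uses. */
struct msi_desc_model {
	unsigned int irq;	/* first Linux irq number, 0 if unassigned */
	unsigned int nvec_used;	/* number of consecutive vectors */
};

/* Records which irqs were destroyed, for inspection only. */
struct destroy_log {
	unsigned int irqs[32];
	size_t count;
};

/* Stand-in for xen_destroy_irq(): just remember the irq number. */
static void model_destroy_irq(struct destroy_log *log, unsigned int irq)
{
	if (log->count < 32)
		log->irqs[log->count++] = irq;
}

/* Walk all descriptors and tear down every vector of each one. */
static void model_teardown_msi_irqs(const struct msi_desc_model *descs,
				    size_t ndescs, struct destroy_log *log)
{
	for (size_t n = 0; n < ndescs; n++) {
		if (!descs[n].irq)
			continue;	/* nothing was set up for this entry */
		for (unsigned int i = 0; i < descs[n].nvec_used; i++)
			model_destroy_irq(log, descs[n].irq + i);
	}
}
```

A multi-MSI descriptor with nvec_used greater than one contributes a run
of consecutive irq numbers, while descriptors with irq == 0 are skipped.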


Juergen

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 26/38] x86/xen: Wrap XEN MSI management into irqdomain
  2020-08-21  0:24 ` [patch RFC 26/38] x86/xen: Wrap XEN MSI management into irqdomain Thomas Gleixner
@ 2020-08-24  6:21   ` Jürgen Groß
  2020-08-25  7:57     ` Thomas Gleixner
  0 siblings, 1 reply; 71+ messages in thread
From: Jürgen Groß @ 2020-08-24  6:21 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Russ Anderson, Greg Kroah-Hartman,
	iommu, Jacob Pan, Rafael J. Wysocki

On 21.08.20 02:24, Thomas Gleixner wrote:
> To allow utilizing the irq domain pointer in struct device it is necessary
> to make XEN/MSI irq domain compatible.
> 
> While the right solution would be to truly convert XEN to irq domains, this
> is an exercise which is not possible for mere mortals with limited XENology.
> 
> Provide a plain irqdomain wrapper around XEN. While this is a blatant
> violation of the irqdomain design, it's the only solution for a XEN ignorant
> person to make progress on the issue which triggered this change.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: linux-pci@vger.kernel.org
> Cc: xen-devel@lists.xenproject.org

Acked-by: Juergen Gross <jgross@suse.com>

Looking into https://www.kernel.org/doc/Documentation/IRQ-domain.txt (is
this still valid?) I believe Xen should be able to use the "No Map"
approach, as Xen only ever uses software IRQs (at least those are the
only ones visible to any driver). The (virtualized) hardware interrupts
are Xen events after all.

So maybe morphing Xen into supporting irqdomains in a sane way isn't
that complicated. Maybe I'm missing the main complexities, though.
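
To illustrate what "No Map" would mean here, a minimal userspace sketch
follows. It is not kernel code and only models the idea as described in
that document: the Linux irq number itself is used as the hardware
number, so no translation table is needed. All names
(nomap_domain_model, nomap_create_direct_mapping, nomap_find_mapping) and
the MODEL_MAX_IRQ limit are hypothetical.

```c
#include <assert.h>

#define MODEL_MAX_IRQ 64	/* arbitrary limit for the sketch */

/* Hypothetical model of a "No Map" domain: hwirq == Linux irq number. */
struct nomap_domain_model {
	unsigned int max_irq;		/* exclusive upper bound */
	unsigned int next_virq;		/* next free Linux irq number */
	unsigned char in_use[MODEL_MAX_IRQ];
};

/* Allocate a Linux irq number and use it directly as the hwirq. */
static unsigned int nomap_create_direct_mapping(struct nomap_domain_model *d)
{
	unsigned int virq = d->next_virq;

	if (virq == 0 || virq >= d->max_irq || virq >= MODEL_MAX_IRQ)
		return 0;	/* 0 means "no irq" in this model */

	d->in_use[virq] = 1;
	d->next_virq++;
	return virq;
}

/* Lookup is the identity function for numbers that were handed out. */
static unsigned int nomap_find_mapping(const struct nomap_domain_model *d,
				       unsigned int hwirq)
{
	if (hwirq < d->max_irq && hwirq < MODEL_MAX_IRQ && d->in_use[hwirq])
		return hwirq;
	return 0;
}
```

Whether Xen event channels fit this scheme without further complications
is exactly the open question; the sketch only shows the mapping model.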


Juergen

> ---
> Note: This is completely untested, but it compiles so it must be perfect.
> ---
>   arch/x86/pci/xen.c |   63 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 63 insertions(+)
> 
> --- a/arch/x86/pci/xen.c
> +++ b/arch/x86/pci/xen.c
> @@ -406,6 +406,63 @@ static void xen_teardown_msi_irq(unsigne
>   	WARN_ON_ONCE(1);
>   }
>   
> +static int xen_msi_domain_alloc_irqs(struct irq_domain *domain,
> +				     struct device *dev,  int nvec)
> +{
> +	int type;
> +
> +	if (WARN_ON_ONCE(!dev_is_pci(dev)))
> +		return -EINVAL;
> +
> +	if (first_msi_entry(dev)->msi_attrib.is_msix)
> +		type = PCI_CAP_ID_MSIX;
> +	else
> +		type = PCI_CAP_ID_MSI;
> +
> +	return x86_msi.setup_msi_irqs(to_pci_dev(dev), nvec, type);
> +}
> +
> +static void xen_msi_domain_free_irqs(struct irq_domain *domain,
> +				     struct device *dev)
> +{
> +	if (WARN_ON_ONCE(!dev_is_pci(dev)))
> +		return;
> +
> +	x86_msi.teardown_msi_irqs(to_pci_dev(dev));
> +}
> +
> +static struct msi_domain_ops xen_pci_msi_domain_ops = {
> +	.domain_alloc_irqs	= xen_msi_domain_alloc_irqs,
> +	.domain_free_irqs	= xen_msi_domain_free_irqs,
> +};
> +
> +static struct msi_domain_info xen_pci_msi_domain_info = {
> +	.ops			= &xen_pci_msi_domain_ops,
> +};
> +
> +/*
> + * This irq domain is a blatant violation of the irq domain design, but
> + * disentangling XEN into real irq domains is not a job for mere mortals with
> + * limited XENology. But it's the least dangerous way for a mere mortal to
> + * get rid of the arch_*_msi_irqs() hackery in order to store the irq
> + * domain pointer in struct device. This irq domain wrappery allows to do
> + * that without breaking XEN terminally.
> + */
> +static __init struct irq_domain *xen_create_pci_msi_domain(void)
> +{
> +	struct irq_domain *d = NULL;
> +	struct fwnode_handle *fn;
> +
> +	fn = irq_domain_alloc_named_fwnode("XEN-MSI");
> +	if (fn)
> +		d = msi_create_irq_domain(fn, &xen_pci_msi_domain_info, NULL);
> +
> +	/* FIXME: No idea how to survive if this fails */
> +	BUG_ON(!d);
> +
> +	return d;
> +}
> +
>   static __init void xen_setup_pci_msi(void)
>   {
>   	if (xen_initial_domain()) {
> @@ -426,6 +483,12 @@ static __init void xen_setup_pci_msi(voi
>   	}
>   
>   	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
> +
> +	/*
> +	 * Override the PCI/MSI irq domain init function. No point
> +	 * in allocating the native domain and never use it.
> +	 */
> +	x86_init.irqs.create_pci_msi_domain = xen_create_pci_msi_domain;
>   }
>   
>   #else /* CONFIG_PCI_MSI */
> 


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 24/38] x86/xen: Consolidate XEN-MSI init
  2020-08-24  4:59   ` Jürgen Groß
@ 2020-08-24 21:21     ` Thomas Gleixner
  2020-08-25  4:21       ` Jürgen Groß
  0 siblings, 1 reply; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-24 21:21 UTC (permalink / raw)
  To: Jürgen Groß, LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Russ Anderson, Greg Kroah-Hartman,
	iommu, Jacob Pan, Rafael J. Wysocki

On Mon, Aug 24 2020 at 06:59, Jürgen Groß wrote:
> On 21.08.20 02:24, Thomas Gleixner wrote:
>> +static __init void xen_setup_pci_msi(void)
>> +{
>> +	if (xen_initial_domain()) {
>> +		x86_msi.setup_msi_irqs = xen_initdom_setup_msi_irqs;
>> +		x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
>> +		x86_msi.restore_msi_irqs = xen_initdom_restore_msi_irqs;
>> +		pci_msi_ignore_mask = 1;
>
> This is wrong, as a PVH initial domain shouldn't do the pv settings.
>
> The "if (xen_initial_domain())" should be inside the pv case, like:
>
> if (xen_pv_domain()) {
> 	if (xen_initial_domain()) {
> 		...
> 	} else {
> 		...
> 	}
> } else if (xen_hvm_domain()) {
> 	...

I still think it does the right thing depending on the place it is
called from, but even if so, it's completely unreadable gunk. I'll fix
that properly.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 24/38] x86/xen: Consolidate XEN-MSI init
  2020-08-24 21:21     ` Thomas Gleixner
@ 2020-08-25  4:21       ` Jürgen Groß
  2020-08-25  9:51         ` Thomas Gleixner
  0 siblings, 1 reply; 71+ messages in thread
From: Jürgen Groß @ 2020-08-25  4:21 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Russ Anderson, Greg Kroah-Hartman,
	iommu, Jacob Pan, Rafael J. Wysocki

On 24.08.20 23:21, Thomas Gleixner wrote:
> On Mon, Aug 24 2020 at 06:59, Jürgen Groß wrote:
>> On 21.08.20 02:24, Thomas Gleixner wrote:
>>> +static __init void xen_setup_pci_msi(void)
>>> +{
>>> +	if (xen_initial_domain()) {
>>> +		x86_msi.setup_msi_irqs = xen_initdom_setup_msi_irqs;
>>> +		x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
>>> +		x86_msi.restore_msi_irqs = xen_initdom_restore_msi_irqs;
>>> +		pci_msi_ignore_mask = 1;
>>
>> This is wrong, as a PVH initial domain shouldn't do the pv settings.
>>
>> The "if (xen_initial_domain())" should be inside the pv case, like:
>>
>> if (xen_pv_domain()) {
>> 	if (xen_initial_domain()) {
>> 		...
>> 	} else {
>> 		...
>> 	}
>> } else if (xen_hvm_domain()) {
>> 	...
> 
> I still think it does the right thing depending on the place it is
> called from, but even if so, it's completely unreadable gunk. I'll fix
> that properly.

The main issue is that xen_initial_domain() and xen_pv_domain() are
orthogonal to each other. So xen_initial_domain() can either be true
for xen_pv_domain() (the "classic" pv dom0) or for xen_hvm_domain()
(the new PVH dom0).
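
A small userspace model may make the difference visible. It is a sketch
under assumptions: the predicates are plain booleans instead of the real
xen_*_domain() helpers, the SETUP_* values are hypothetical labels for
the three sets of x86_msi settings, and the flat variant assumes that the
branches after the first check test pv and then hvm. The quoted hunk only
shows the first check, and the model ignores which call site invokes the
function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the MSI settings a domain would end up with. */
enum msi_setup_model {
	SETUP_NONE,
	SETUP_PV_DOM0,
	SETUP_PV_DOMU,
	SETUP_HVM,
};

/* Stand-ins for xen_pv_domain(), xen_hvm_domain(), xen_initial_domain(). */
struct xen_model {
	bool pv;
	bool hvm;
	bool initial;
};

/* Flat ordering: the initial domain check comes first. */
static enum msi_setup_model select_flat(struct xen_model x)
{
	if (x.initial)
		return SETUP_PV_DOM0;
	if (x.pv)
		return SETUP_PV_DOMU;
	if (x.hvm)
		return SETUP_HVM;
	return SETUP_NONE;
}

/* Nested ordering: the initial domain check lives inside the pv case. */
static enum msi_setup_model select_nested(struct xen_model x)
{
	if (x.pv)
		return x.initial ? SETUP_PV_DOM0 : SETUP_PV_DOMU;
	if (x.hvm)
		return SETUP_HVM;
	return SETUP_NONE;
}
```

For a PVH dom0 (hvm and initial both true) the flat ordering picks the pv
dom0 settings, while the nested ordering picks the hvm ones. For a
classic pv dom0 and for pv guests both orderings agree.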


Juergen

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 26/38] x86/xen: Wrap XEN MSI management into irqdomain
  2020-08-24  6:21   ` Jürgen Groß
@ 2020-08-25  7:57     ` Thomas Gleixner
  0 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-25  7:57 UTC (permalink / raw)
  To: Jürgen Groß, LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Russ Anderson, Greg Kroah-Hartman,
	iommu, Jacob Pan, Rafael J. Wysocki

On Mon, Aug 24 2020 at 08:21, Jürgen Groß wrote:
> On 21.08.20 02:24, Thomas Gleixner wrote:
>
> Looking into https://www.kernel.org/doc/Documentation/IRQ-domain.txt (is
> this still valid?)

It's halfways correct. Emphasis on halfways.

> I believe Xen should be able to use the "No Map" approach, as Xen only
> ever uses software IRQs (at least those are the only ones visible to
> any driver). The (virtualized) hardware interrupts are Xen events
> after all.
>
> So maybe morphing Xen into supporting irqdomains in a sane way isn't
> that complicated. Maybe I'm missing the main complexities, though.

The wrapper domain I did is pretty much that, but with the extra
functionality required by hierarchical irq domains. So, yes it's
functionally correct, but it's only utilizing the alloc/free interface
and not any of the other mechanisms provided by irqdomains. The latter
should make the overall code simpler but that obviously needs some
thought.
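
As a rough illustration of "only utilizing the alloc/free interface",
here is a userspace sketch of a domain whose ops may override allocation
and free, with a fallback to the default path when no override is set.
It is an assumption-laden model, not the code from the series: the *_model
types, model_alloc_irqs(), model_free_irqs() and the wrapper_* functions
are hypothetical, and the real core code may install the defaults
differently.

```c
#include <assert.h>
#include <stddef.h>

struct domain_model;

/* Hypothetical ops with optional alloc/free overrides. */
struct domain_ops_model {
	int  (*domain_alloc_irqs)(struct domain_model *d, int nvec);
	void (*domain_free_irqs)(struct domain_model *d);
};

struct domain_model {
	const struct domain_ops_model *ops;
	int allocated;		/* vectors currently allocated */
	int used_override;	/* set when an override callback ran */
};

/* Default path, standing in for the generic hierarchical allocation. */
static int default_alloc_irqs(struct domain_model *d, int nvec)
{
	d->allocated += nvec;
	return 0;
}

static void default_free_irqs(struct domain_model *d)
{
	d->allocated = 0;
}

/* Core entry points: use the override if present, else the default. */
static int model_alloc_irqs(struct domain_model *d, int nvec)
{
	if (d->ops && d->ops->domain_alloc_irqs)
		return d->ops->domain_alloc_irqs(d, nvec);
	return default_alloc_irqs(d, nvec);
}

static void model_free_irqs(struct domain_model *d)
{
	if (d->ops && d->ops->domain_free_irqs)
		d->ops->domain_free_irqs(d);
	else
		default_free_irqs(d);
}

/* Wrapper domain: forwards to a legacy mechanism, nothing else. */
static int wrapper_alloc(struct domain_model *d, int nvec)
{
	d->used_override = 1;
	d->allocated += nvec;
	return 0;
}

static void wrapper_free(struct domain_model *d)
{
	d->used_override = 1;
	d->allocated = 0;
}

static const struct domain_ops_model wrapper_ops = {
	.domain_alloc_irqs = wrapper_alloc,
	.domain_free_irqs  = wrapper_free,
};
```

The wrapper domain in this model never touches any other domain
machinery, which mirrors the limitation described above.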

Thanks,

        tglx




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 24/38] x86/xen: Consolidate XEN-MSI init
  2020-08-25  4:21       ` Jürgen Groß
@ 2020-08-25  9:51         ` Thomas Gleixner
  0 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-25  9:51 UTC (permalink / raw)
  To: Jürgen Groß, LKML
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Russ Anderson, Greg Kroah-Hartman,
	iommu, Jacob Pan, Rafael J. Wysocki

On Tue, Aug 25 2020 at 06:21, Jürgen Groß wrote:
> On 24.08.20 23:21, Thomas Gleixner wrote:
>> I still think it does the right thing depending on the place it is
>> called from, but even if so, it's completely unreadable gunk. I'll fix
>> that properly.
>
> The main issue is that xen_initial_domain() and xen_pv_domain() are
> orthogonal to each other. So xen_initial_domain() can either be true
> for xen_pv_domain() (the "classic" pv dom0) or for xen_hvm_domain()
> (the new PVH dom0).

Fair enough. My limited XENology struck again.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 13/38] PCI: MSI: Rework pci_msi_domain_calc_hwirq()
  2020-08-21  0:24 ` [patch RFC 13/38] PCI: MSI: Rework pci_msi_domain_calc_hwirq() Thomas Gleixner
@ 2020-08-25 20:03   ` Bjorn Helgaas
  2020-08-25 21:11     ` Thomas Gleixner
  0 siblings, 1 reply; 71+ messages in thread
From: Bjorn Helgaas @ 2020-08-25 20:03 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, LKML, iommu, Jacob Pan, Rafael J. Wysocki

On Fri, Aug 21, 2020 at 02:24:37AM +0200, Thomas Gleixner wrote:
> Retrieve the PCI device from the msi descriptor instead of doing so at the
> call sites.

I'd like it *better* with "PCI/MSI: " in the subject (to match history
and other patches in this series) and "MSI" here in the commit log,
but nice cleanup and:

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

Minor comments below.

> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: linux-pci@vger.kernel.org
> ---
>  arch/x86/kernel/apic/msi.c |    2 +-
>  drivers/pci/msi.c          |   13 ++++++-------
>  include/linux/msi.h        |    3 +--
>  3 files changed, 8 insertions(+), 10 deletions(-)
> 
> --- a/arch/x86/kernel/apic/msi.c
> +++ b/arch/x86/kernel/apic/msi.c
> @@ -232,7 +232,7 @@ EXPORT_SYMBOL_GPL(pci_msi_prepare);
>  
>  void pci_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc)
>  {
> -	arg->msi_hwirq = pci_msi_domain_calc_hwirq(arg->msi_dev, desc);
> +	arg->msi_hwirq = pci_msi_domain_calc_hwirq(desc);

I guess it's safe to assume that "arg->msi_dev ==
msi_desc_to_pci_dev(desc)"?  I didn't try to verify that.

>  }
>  EXPORT_SYMBOL_GPL(pci_msi_set_desc);
>  
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -1346,17 +1346,17 @@ void pci_msi_domain_write_msg(struct irq
>  
>  /**
>   * pci_msi_domain_calc_hwirq - Generate a unique ID for an MSI source
> - * @dev:	Pointer to the PCI device
>   * @desc:	Pointer to the MSI descriptor
>   *
>   * The ID number is only used within the irqdomain.
>   */
> -irq_hw_number_t pci_msi_domain_calc_hwirq(struct pci_dev *dev,
> -					  struct msi_desc *desc)
> +irq_hw_number_t pci_msi_domain_calc_hwirq(struct msi_desc *desc)
>  {
> +	struct pci_dev *pdev = msi_desc_to_pci_dev(desc);

If you named this "struct pci_dev *dev" (not "pdev"), the diff would
be a little smaller and it would match other usage in the file.

>  	return (irq_hw_number_t)desc->msi_attrib.entry_nr |
> -		pci_dev_id(dev) << 11 |
> -		(pci_domain_nr(dev->bus) & 0xFFFFFFFF) << 27;
> +		pci_dev_id(pdev) << 11 |
> +		(pci_domain_nr(pdev->bus) & 0xFFFFFFFF) << 27;
>  }
>  
>  static inline bool pci_msi_desc_is_multi_msi(struct msi_desc *desc)
> @@ -1406,8 +1406,7 @@ static void pci_msi_domain_set_desc(msi_
>  				    struct msi_desc *desc)
>  {
>  	arg->desc = desc;
> -	arg->hwirq = pci_msi_domain_calc_hwirq(msi_desc_to_pci_dev(desc),
> -					       desc);
> +	arg->hwirq = pci_msi_domain_calc_hwirq(desc);
>  }
>  #else
>  #define pci_msi_domain_set_desc		NULL
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -369,8 +369,7 @@ void pci_msi_domain_write_msg(struct irq
>  struct irq_domain *pci_msi_create_irq_domain(struct fwnode_handle *fwnode,
>  					     struct msi_domain_info *info,
>  					     struct irq_domain *parent);
> -irq_hw_number_t pci_msi_domain_calc_hwirq(struct pci_dev *dev,
> -					  struct msi_desc *desc);
> +irq_hw_number_t pci_msi_domain_calc_hwirq(struct msi_desc *desc);
>  int pci_msi_domain_check_cap(struct irq_domain *domain,
>  			     struct msi_domain_info *info, struct device *dev);
>  u32 pci_msi_domain_get_msi_rid(struct irq_domain *domain, struct pci_dev *pdev);
> 

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 20/38] PCI: vmd: Mark VMD irqdomain with DOMAIN_BUS_VMD_MSI
  2020-08-21  0:24 ` [patch RFC 20/38] PCI: vmd: Mark VMD irqdomain with DOMAIN_BUS_VMD_MSI Thomas Gleixner
@ 2020-08-25 20:04   ` Bjorn Helgaas
  0 siblings, 0 replies; 71+ messages in thread
From: Bjorn Helgaas @ 2020-08-25 20:04 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jonathan Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, LKML, iommu, Jacob Pan, Rafael J. Wysocki

On Fri, Aug 21, 2020 at 02:24:44AM +0200, Thomas Gleixner wrote:
> Devices on the VMD bus use their own MSI irq domain, but it is not
> distinguishable from regular PCI/MSI irq domains. Being able to tell
> them apart is required to exclude VMD devices from getting the irq
> domain pointer set by interrupt remapping.
> 
> Override the default bus token.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> Cc: Jonathan Derrick <jonathan.derrick@intel.com>
> Cc: linux-pci@vger.kernel.org

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

> ---
>  drivers/pci/controller/vmd.c |    6 ++++++
>  1 file changed, 6 insertions(+)
> 
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -579,6 +579,12 @@ static int vmd_enable_domain(struct vmd_
>  		return -ENODEV;
>  	}
>  
> +	/*
> +	 * Override the irq domain bus token so the domain can be distinguished
> +	 * from a regular PCI/MSI domain.
> +	 */
> +	irq_domain_update_bus_token(vmd->irq_domain, DOMAIN_BUS_VMD_MSI);
> +
>  	pci_add_resource(&resources, &vmd->resources[0]);
>  	pci_add_resource_offset(&resources, &vmd->resources[1], offset[0]);
>  	pci_add_resource_offset(&resources, &vmd->resources[2], offset[1]);
> 

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 30/38] PCI/MSI: Allow to disable arch fallbacks
  2020-08-21  0:24 ` [patch RFC 30/38] PCI/MSI: Allow to disable arch fallbacks Thomas Gleixner
@ 2020-08-25 20:07   ` Bjorn Helgaas
  2020-08-25 21:28     ` Thomas Gleixner
  0 siblings, 1 reply; 71+ messages in thread
From: Bjorn Helgaas @ 2020-08-25 20:07 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, LKML, iommu, Jacob Pan, Rafael J. Wysocki

On Fri, Aug 21, 2020 at 02:24:54AM +0200, Thomas Gleixner wrote:
> If an architecture does not require the MSI setup/teardown fallback
> functions, then allow them to be replaced by stub functions which emit a
> warning.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: linux-pci@vger.kernel.org

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

Question/comment below.

> ---
>  drivers/pci/Kconfig |    3 +++
>  drivers/pci/msi.c   |    3 ++-
>  include/linux/msi.h |   31 ++++++++++++++++++++++++++-----
>  3 files changed, 31 insertions(+), 6 deletions(-)
> 
> --- a/drivers/pci/Kconfig
> +++ b/drivers/pci/Kconfig
> @@ -56,6 +56,9 @@ config PCI_MSI_IRQ_DOMAIN
>  	depends on PCI_MSI
>  	select GENERIC_MSI_IRQ_DOMAIN
>  
> +config PCI_MSI_DISABLE_ARCH_FALLBACKS
> +	bool
> +
>  config PCI_QUIRKS
>  	default y
>  	bool "Enable PCI quirk workarounds" if EXPERT
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -58,8 +58,8 @@ static void pci_msi_teardown_msi_irqs(st
>  #define pci_msi_teardown_msi_irqs	arch_teardown_msi_irqs
>  #endif
>  
> +#ifndef CONFIG_PCI_MSI_DISABLE_ARCH_FALLBACKS
>  /* Arch hooks */
> -
>  int __weak arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
>  {
>  	struct msi_controller *chip = dev->bus->msi;
> @@ -132,6 +132,7 @@ void __weak arch_teardown_msi_irqs(struc
>  {
>  	return default_teardown_msi_irqs(dev);
>  }
> +#endif /* !CONFIG_PCI_MSI_DISABLE_ARCH_FALLBACKS */
>  
>  static void default_restore_msi_irq(struct pci_dev *dev, int irq)
>  {
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -193,17 +193,38 @@ void pci_msi_mask_irq(struct irq_data *d
>  void pci_msi_unmask_irq(struct irq_data *data);
>  
>  /*
> - * The arch hooks to setup up msi irqs. Those functions are
> - * implemented as weak symbols so that they /can/ be overriden by
> - * architecture specific code if needed.
> + * The arch hooks to setup up msi irqs. Default functions are implemented
> + * as weak symbols so that they /can/ be overriden by architecture specific
> + * code if needed.
> + *
> + * They can be replaced by stubs with warnings via
> + * CONFIG_PCI_MSI_DISABLE_ARCH_FALLBACKS when the architecture fully
> + * utilizes direct irqdomain based setup.

Do you expect *all* arches to eventually use direct irqdomain setup?
And in that case, to remove the config option?

If not, it seems like it'd be nicer to have the burden on the arches
that need/want to use arch-specific code instead of on the arches that
do things generically.

>   */
> +#ifndef CONFIG_PCI_MSI_DISABLE_ARCH_FALLBACKS
>  int arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc);
>  void arch_teardown_msi_irq(unsigned int irq);
>  int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type);
>  void arch_teardown_msi_irqs(struct pci_dev *dev);
> -void arch_restore_msi_irqs(struct pci_dev *dev);
> -
>  void default_teardown_msi_irqs(struct pci_dev *dev);
> +#else
> +static inline int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
> +{
> +	WARN_ON_ONCE(1);
> +	return -ENODEV;
> +}
> +
> +static inline void arch_teardown_msi_irqs(struct pci_dev *dev)
> +{
> +	WARN_ON_ONCE(1);
> +}
> +#endif
> +
> +/*
> + * The restore hooks are still available as they are useful even
> + * for fully irq domain based setups. Courtesy to XEN/X86.
> + */
> +void arch_restore_msi_irqs(struct pci_dev *dev);
>  void default_restore_msi_irqs(struct pci_dev *dev);
>  
>  struct msi_controller {
> 
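
For readers following the stub discussion, here is a minimal userspace
sketch of the pattern in the quoted header hunk: when the opt-out symbol is
set, callers get inline stubs that warn once and fail. The names
DEMO_DISABLE_ARCH_FALLBACKS, DEMO_WARN_ONCE and demo_* are hypothetical
stand-ins for the Kconfig symbol, WARN_ON_ONCE() and the arch hooks; the
macro only imitates the once-per-call-site behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the Kconfig symbol being selected. */
#define DEMO_DISABLE_ARCH_FALLBACKS 1

int demo_warn_count;	/* counts emitted warnings, for illustration only */

/* Warn the first time a given call site is reached, then stay silent. */
#define DEMO_WARN_ONCE(msg) do {				\
	static bool warned;					\
	if (!warned) {						\
		warned = true;					\
		demo_warn_count++;				\
		fprintf(stderr, "WARNING: %s\n", (msg));	\
	}							\
} while (0)

#ifndef DEMO_DISABLE_ARCH_FALLBACKS
/* Real fallbacks would be defined elsewhere (weak defaults or arch code). */
int demo_arch_setup_msi_irqs(int nvec);
void demo_arch_teardown_msi_irqs(void);
#else
/* Stubs: reaching them means a setup path bypassed the irqdomain code. */
static inline int demo_arch_setup_msi_irqs(int nvec)
{
	(void)nvec;
	DEMO_WARN_ONCE("arch MSI setup fallback called");
	return -ENODEV;
}

static inline void demo_arch_teardown_msi_irqs(void)
{
	DEMO_WARN_ONCE("arch MSI teardown fallback called");
}
#endif

/* Usage: the stub always fails, but only the first call warns. */
void demo_usage(void)
{
	assert(demo_arch_setup_msi_irqs(4) == -ENODEV);
	assert(demo_arch_setup_msi_irqs(4) == -ENODEV);
	assert(demo_warn_count == 1);
}
```

Inverting the option, as asked about above, would flip the #ifndef into an
#ifdef on an opt-in symbol selected by the architectures that still need
the fallbacks; the stub bodies would stay the same.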

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 21/38] PCI: MSI: Provide pci_dev_has_special_msi_domain() helper
  2020-08-21  0:24 ` [patch RFC 21/38] PCI: MSI: Provide pci_dev_has_special_msi_domain() helper Thomas Gleixner
@ 2020-08-25 20:16   ` Bjorn Helgaas
  0 siblings, 0 replies; 71+ messages in thread
From: Bjorn Helgaas @ 2020-08-25 20:16 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, LKML, iommu, Jacob Pan, Rafael J. Wysocki

On Fri, Aug 21, 2020 at 02:24:45AM +0200, Thomas Gleixner wrote:
> Provide a helper function to check whether a PCI device is handled by a
> non-standard PCI/MSI domain. This will be used to exclude devices which
> hang off a special bus, e.g. VMD, from the irq domain override in irq
> remapping.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: linux-pci@vger.kernel.org

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

s|PCI: MSI:|PCI/MSI:| in the subject if feasible.

> ---
>  drivers/pci/msi.c   |   22 ++++++++++++++++++++++
>  include/linux/msi.h |    1 +
>  2 files changed, 23 insertions(+)
> 
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -1553,4 +1553,26 @@ struct irq_domain *pci_msi_get_device_do
>  					     DOMAIN_BUS_PCI_MSI);
>  	return dom;
>  }
> +
> +/**
> + * pci_dev_has_special_msi_domain - Check whether the device is handled by
> + *				    a non-standard PCI-MSI domain
> + * @pdev:	The PCI device to check.
> + *
> + * Returns: True if the device irqdomain or the bus irqdomain is
> + * non-standard PCI/MSI.
> + */
> +bool pci_dev_has_special_msi_domain(struct pci_dev *pdev)
> +{
> +	struct irq_domain *dom = dev_get_msi_domain(&pdev->dev);
> +
> +	if (!dom)
> +		dom = dev_get_msi_domain(&pdev->bus->dev);
> +
> +	if (!dom)
> +		return true;
> +
> +	return dom->bus_token != DOMAIN_BUS_PCI_MSI;
> +}
> +
>  #endif /* CONFIG_PCI_MSI_IRQ_DOMAIN */
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -374,6 +374,7 @@ int pci_msi_domain_check_cap(struct irq_
>  			     struct msi_domain_info *info, struct device *dev);
>  u32 pci_msi_domain_get_msi_rid(struct irq_domain *domain, struct pci_dev *pdev);
>  struct irq_domain *pci_msi_get_device_domain(struct pci_dev *pdev);
> +bool pci_dev_has_special_msi_domain(struct pci_dev *pdev);
>  #else
>  static inline struct irq_domain *pci_msi_get_device_domain(struct pci_dev *pdev)
>  {
> 
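
The lookup order in the helper above (device domain first, then the bus
domain, and "special" when neither exists) can be tried out in isolation
with the following sketch. The demo_* types are hypothetical
simplifications of struct irq_domain and struct pci_dev that carry only
the fields the check needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced versions of the kernel structures. */
enum demo_bus_token { DEMO_BUS_PCI_MSI, DEMO_BUS_VMD_MSI };

struct demo_irq_domain {
	enum demo_bus_token bus_token;
};

struct demo_pci_dev {
	const struct demo_irq_domain *dev_domain;	/* may be NULL */
	const struct demo_irq_domain *bus_domain;	/* may be NULL */
};

/* Same decision sequence as the quoted helper, on the demo types. */
bool demo_has_special_msi_domain(const struct demo_pci_dev *pdev)
{
	const struct demo_irq_domain *dom;

	assert(pdev != NULL);

	dom = pdev->dev_domain;
	if (!dom)
		dom = pdev->bus_domain;	/* fall back to the bus domain */

	if (!dom)
		return true;		/* no domain at all counts as special */

	return dom->bus_token != DEMO_BUS_PCI_MSI;
}
```

A device behind a domain whose token was changed to the VMD value is
reported as special, while a device on a bus with the default PCI/MSI token
is not.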

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 17/38] x86/pci: Reducde #ifdeffery in PCI init code
  2020-08-21  0:24 ` [patch RFC 17/38] x86/pci: Reducde #ifdeffery in PCI init code Thomas Gleixner
@ 2020-08-25 20:20   ` Bjorn Helgaas
  0 siblings, 0 replies; 71+ messages in thread
From: Bjorn Helgaas @ 2020-08-25 20:20 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, LKML, iommu, Jacob Pan, Rafael J. Wysocki

s/Reducde/Reduce/ (in subject)

On Fri, Aug 21, 2020 at 02:24:41AM +0200, Thomas Gleixner wrote:
> Adding a function call before the first #ifdef in arch_pci_init() triggers
> a 'mixed declarations and code' warning if PCI_DIRECT is enabled.
> 
> Use stub functions and move the #ifdeffery to the header file where it is
> not in the way.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: linux-pci@vger.kernel.org

Nice cleanup, thanks.  Glad to get rid of the useless initializer,
too.

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

> ---
>  arch/x86/include/asm/pci_x86.h |   11 +++++++++++
>  arch/x86/pci/init.c            |   10 +++-------
>  2 files changed, 14 insertions(+), 7 deletions(-)
> 
> --- a/arch/x86/include/asm/pci_x86.h
> +++ b/arch/x86/include/asm/pci_x86.h
> @@ -114,9 +114,20 @@ extern const struct pci_raw_ops pci_dire
>  extern bool port_cf9_safe;
>  
>  /* arch_initcall level */
> +#ifdef CONFIG_PCI_DIRECT
>  extern int pci_direct_probe(void);
>  extern void pci_direct_init(int type);
> +#else
> +static inline int pci_direct_probe(void) { return -1; }
> +static inline  void pci_direct_init(int type) { }
> +#endif
> +
> +#ifdef CONFIG_PCI_BIOS
>  extern void pci_pcbios_init(void);
> +#else
> +static inline void pci_pcbios_init(void) { }
> +#endif
> +
>  extern void __init dmi_check_pciprobe(void);
>  extern void __init dmi_check_skip_isa_align(void);
>  
> --- a/arch/x86/pci/init.c
> +++ b/arch/x86/pci/init.c
> @@ -8,11 +8,9 @@
>     in the right sequence from here. */
>  static __init int pci_arch_init(void)
>  {
> -#ifdef CONFIG_PCI_DIRECT
> -	int type = 0;
> +	int type;
>  
>  	type = pci_direct_probe();
> -#endif
>  
>  	if (!(pci_probe & PCI_PROBE_NOEARLY))
>  		pci_mmcfg_early_init();
> @@ -20,18 +18,16 @@ static __init int pci_arch_init(void)
>  	if (x86_init.pci.arch_init && !x86_init.pci.arch_init())
>  		return 0;
>  
> -#ifdef CONFIG_PCI_BIOS
>  	pci_pcbios_init();
> -#endif
> +
>  	/*
>  	 * don't check for raw_pci_ops here because we want pcbios as last
>  	 * fallback, yet it's needed to run first to set pcibios_last_bus
>  	 * in case legacy PCI probing is used. otherwise detecting peer busses
>  	 * fails.
>  	 */
> -#ifdef CONFIG_PCI_DIRECT
>  	pci_direct_init(type);
> -#endif
> +
>  	if (!raw_pci_ops && !raw_pci_ext_ops)
>  		printk(KERN_ERR
>  		"PCI: Fatal: No config space access function found\n");
> 
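
The header-stub technique used by this patch, reduced to a standalone
sketch: the conditional lives next to the declarations, so the caller
compiles unchanged whether or not the feature is built. DEMO_HAVE_DIRECT
and the demo_* functions are hypothetical stand-ins for CONFIG_PCI_DIRECT
and the pci_direct_*() pair.

```c
#include <assert.h>

/*
 * DEMO_HAVE_DIRECT is a hypothetical stand-in for CONFIG_PCI_DIRECT. It is
 * left undefined here, so the stubs below are what gets compiled.
 */
#ifdef DEMO_HAVE_DIRECT
int demo_direct_probe(void);
void demo_direct_init(int type);
#else
static inline int demo_direct_probe(void) { return -1; }
static inline void demo_direct_init(int type) { (void)type; }
#endif

/* The caller needs no #ifdef and no dummy initializer for 'type'. */
int demo_arch_init(void)
{
	int type;

	type = demo_direct_probe();
	/* ... other init steps would run here ... */
	demo_direct_init(type);
	return type;
}

/* Usage: with the stubs in effect the probe reports "nothing found". */
void demo_usage(void)
{
	assert(demo_arch_init() == -1);
}
```

Because the first statement after the declaration is now unconditional,
adding another call ahead of it no longer risks a declaration appearing
after code.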

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 34/38] x86/msi: Let pci_msi_prepare() handle non-PCI MSI
  2020-08-21  0:24 ` [patch RFC 34/38] x86/msi: Let pci_msi_prepare() handle non-PCI MSI Thomas Gleixner
@ 2020-08-25 20:24   ` Bjorn Helgaas
  2020-08-25 21:30     ` Thomas Gleixner
  0 siblings, 1 reply; 71+ messages in thread
From: Bjorn Helgaas @ 2020-08-25 20:24 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, LKML, iommu, Jacob Pan, Rafael J. Wysocki

On Fri, Aug 21, 2020 at 02:24:58AM +0200, Thomas Gleixner wrote:
> Rename it to x86_msi_prepare() and handle the allocation type setup
> depending on the device type.

I see what you're doing, but the subject reads a little strangely
("pci_msi_prepare() handling non-PCI" stuff) since it doesn't mention
the rename.  Maybe not practical or worthwhile to split into a rename
+ make generic, I dunno.

> Add a new arch_msi_prepare define which will be utilized by the upcoming
> device MSI support. Define it to NULL if not provided by an architecture in
> the generic MSI header.
> 
> One arch specific function for MSI support is truly enough.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: linux-pci@vger.kernel.org
> Cc: linux-hyperv@vger.kernel.org
> ---
>  arch/x86/include/asm/msi.h          |    4 +++-
>  arch/x86/kernel/apic/msi.c          |   27 ++++++++++++++++++++-------
>  drivers/pci/controller/pci-hyperv.c |    2 +-
>  include/linux/msi.h                 |    4 ++++
>  4 files changed, 28 insertions(+), 9 deletions(-)
> 
> --- a/arch/x86/include/asm/msi.h
> +++ b/arch/x86/include/asm/msi.h
> @@ -6,7 +6,9 @@
>  
>  typedef struct irq_alloc_info msi_alloc_info_t;
>  
> -int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
> +int x86_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
>  		    msi_alloc_info_t *arg);
>  
> +#define arch_msi_prepare		x86_msi_prepare
> +
>  #endif /* _ASM_X86_MSI_H */
> --- a/arch/x86/kernel/apic/msi.c
> +++ b/arch/x86/kernel/apic/msi.c
> @@ -182,26 +182,39 @@ static struct irq_chip pci_msi_controlle
>  	.flags			= IRQCHIP_SKIP_SET_WAKE,
>  };
>  
> -int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
> -		    msi_alloc_info_t *arg)
> +static void pci_msi_prepare(struct device *dev, msi_alloc_info_t *arg)
>  {
> -	struct pci_dev *pdev = to_pci_dev(dev);
> -	struct msi_desc *desc = first_pci_msi_entry(pdev);
> +	struct msi_desc *desc = first_msi_entry(dev);
>  
> -	init_irq_alloc_info(arg, NULL);
>  	if (desc->msi_attrib.is_msix) {
>  		arg->type = X86_IRQ_ALLOC_TYPE_PCI_MSIX;
>  	} else {
>  		arg->type = X86_IRQ_ALLOC_TYPE_PCI_MSI;
>  		arg->flags |= X86_IRQ_ALLOC_CONTIGUOUS_VECTORS;
>  	}
> +}
> +
> +static void dev_msi_prepare(struct device *dev, msi_alloc_info_t *arg)
> +{
> +	arg->type = X86_IRQ_ALLOC_TYPE_DEV_MSI;
> +}
> +
> +int x86_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
> +		    msi_alloc_info_t *arg)
> +{
> +	init_irq_alloc_info(arg, NULL);
> +
> +	if (dev_is_pci(dev))
> +		pci_msi_prepare(dev, arg);
> +	else
> +		dev_msi_prepare(dev, arg);
>  
>  	return 0;
>  }
> -EXPORT_SYMBOL_GPL(pci_msi_prepare);
> +EXPORT_SYMBOL_GPL(x86_msi_prepare);
>  
>  static struct msi_domain_ops pci_msi_domain_ops = {
> -	.msi_prepare	= pci_msi_prepare,
> +	.msi_prepare	= x86_msi_prepare,
>  };
>  
>  static struct msi_domain_info pci_msi_domain_info = {
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -1532,7 +1532,7 @@ static struct irq_chip hv_msi_irq_chip =
>  };
>  
>  static struct msi_domain_ops hv_msi_ops = {
> -	.msi_prepare	= pci_msi_prepare,
> +	.msi_prepare	= arch_msi_prepare,
>  	.msi_free	= hv_msi_free,
>  };
>  
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -430,4 +430,8 @@ static inline struct irq_domain *pci_msi
>  }
>  #endif /* CONFIG_PCI_MSI_IRQ_DOMAIN */
>  
> +#ifndef arch_msi_prepare
> +# define arch_msi_prepare	NULL
> +#endif
> +
>  #endif /* LINUX_MSI_H */
> 
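
A compact model of the two mechanisms in this patch, offered as a sketch
rather than the kernel code: one prepare entry point that dispatches on the
device type, and an arch_msi_prepare-style define that generic code
defaults to NULL when no architecture provides it. All demo_* names and the
reduced structures are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_alloc_type {
	DEMO_ALLOC_UNSET, DEMO_ALLOC_PCI_MSI,
	DEMO_ALLOC_PCI_MSIX, DEMO_ALLOC_DEV_MSI,
};

struct demo_device { bool is_pci; bool is_msix; };
struct demo_alloc_info { enum demo_alloc_type type; bool contiguous; };

static void demo_pci_prepare(const struct demo_device *dev,
			     struct demo_alloc_info *arg)
{
	if (dev->is_msix) {
		arg->type = DEMO_ALLOC_PCI_MSIX;
	} else {
		arg->type = DEMO_ALLOC_PCI_MSI;
		arg->contiguous = true;	/* models X86_IRQ_ALLOC_CONTIGUOUS_VECTORS */
	}
}

static void demo_dev_prepare(struct demo_alloc_info *arg)
{
	arg->type = DEMO_ALLOC_DEV_MSI;
}

/* Single arch entry point: reset the info, then pick by device type. */
int demo_x86_msi_prepare(const struct demo_device *dev,
			 struct demo_alloc_info *arg)
{
	assert(dev != NULL && arg != NULL);

	*arg = (struct demo_alloc_info){ .type = DEMO_ALLOC_UNSET };
	if (dev->is_pci)
		demo_pci_prepare(dev, arg);
	else
		demo_dev_prepare(arg);
	return 0;
}

/* What an arch header would provide ... */
#define demo_arch_msi_prepare demo_x86_msi_prepare

/* ... and the generic fallback for architectures that provide nothing. */
#ifndef demo_arch_msi_prepare
# define demo_arch_msi_prepare NULL
#endif

typedef int (*demo_prepare_fn)(const struct demo_device *,
			       struct demo_alloc_info *);

struct demo_domain_ops { demo_prepare_fn msi_prepare; };

/* A driver refers only to the arch-neutral name. */
const struct demo_domain_ops demo_ops = {
	.msi_prepare = demo_arch_msi_prepare,
};
```

The driver-side table never names the x86 function directly, which is the
same indirection the pci-hyperv hunk introduces.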

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 13/38] PCI: MSI: Rework pci_msi_domain_calc_hwirq()
  2020-08-25 20:03   ` Bjorn Helgaas
@ 2020-08-25 21:11     ` Thomas Gleixner
  0 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-25 21:11 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, LKML, iommu, Jacob Pan, Rafael J. Wysocki

On Tue, Aug 25 2020 at 15:03, Bjorn Helgaas wrote:
> On Fri, Aug 21, 2020 at 02:24:37AM +0200, Thomas Gleixner wrote:
>> Retrieve the PCI device from the msi descriptor instead of doing so at the
>> call sites.
>
> I'd like it *better* with "PCI/MSI: " in the subject (to match history

Duh, yes.

> and other patches in this series) and "MSI" here in the commit log,
> but nice cleanup and:
>> --- a/arch/x86/kernel/apic/msi.c
>> +++ b/arch/x86/kernel/apic/msi.c
>> @@ -232,7 +232,7 @@ EXPORT_SYMBOL_GPL(pci_msi_prepare);
>>  
>>  void pci_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc)
>>  {
>> -	arg->msi_hwirq = pci_msi_domain_calc_hwirq(arg->msi_dev, desc);
>> +	arg->msi_hwirq = pci_msi_domain_calc_hwirq(desc);
>
> I guess it's safe to assume that "arg->msi_dev ==
> msi_desc_to_pci_dev(desc)"?  I didn't try to verify that.

It is.

>> +irq_hw_number_t pci_msi_domain_calc_hwirq(struct msi_desc *desc)
>>  {
>> +	struct pci_dev *pdev = msi_desc_to_pci_dev(desc);
>
> If you named this "struct pci_dev *dev" (not "pdev"), the diff would
> be a little smaller and it would match other usage in the file.

Ok. I'm always happy to see pdev because that doesn't make me wonder
which type of dev it is :) But yeah, let's keep it consistent.

Thanks,

        tglx
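
For reference, the ID that pci_msi_domain_calc_hwirq() builds in the patch
under discussion packs three fields into one number: the MSI entry number
in the low 11 bits, the 16-bit bus/devfn ID shifted left by 11, and the PCI
domain number shifted left by 27. The sketch below reproduces that layout
in userspace. demo_calc_hwirq() is a hypothetical helper, and the use of a
64-bit result is an assumption standing in for irq_hw_number_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring the bit layout in the quoted patch. */
uint64_t demo_calc_hwirq(uint32_t domain_nr, uint8_t bus, uint8_t devfn,
			 uint16_t entry_nr)
{
	/* 16-bit device ID: bus number in the high byte, devfn in the low. */
	uint64_t devid = ((uint64_t)bus << 8) | devfn;

	/* The entry number has to fit below the device ID field. */
	assert(entry_nr < (1u << 11));

	return (uint64_t)entry_nr |
	       devid << 11 |
	       (uint64_t)domain_nr << 27;
}
```

With this layout, two descriptors yield the same ID only if domain, bus,
devfn and entry number all match, which is all the irqdomain needs from it.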

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 30/38] PCI/MSI: Allow to disable arch fallbacks
  2020-08-25 20:07   ` Bjorn Helgaas
@ 2020-08-25 21:28     ` Thomas Gleixner
  2020-08-25 21:35       ` Bjorn Helgaas
  0 siblings, 1 reply; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-25 21:28 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, LKML, iommu, Jacob Pan, Rafael J. Wysocki

On Tue, Aug 25 2020 at 15:07, Bjorn Helgaas wrote:
>> + * The arch hooks to setup up msi irqs. Default functions are implemented
>> + * as weak symbols so that they /can/ be overriden by architecture specific
>> + * code if needed.
>> + *
>> + * They can be replaced by stubs with warnings via
>> + * CONFIG_PCI_MSI_DISABLE_ARCH_FALLBACKS when the architecture fully
>> + * utilizes direct irqdomain based setup.
>
> Do you expect *all* arches to eventually use direct irqdomain setup?

Ideally that happens some day. We have five left when x86 is converted:

IA64, MIPS, POWERPC, S390, SPARC

IA64 is unlikely to be fixed, but might be solved naturally by removal.

For the others I don't know, but it's not on the horizon anytime soon I
fear.

> And in that case, to remove the config option?

Yes, and all the code which depends on it.

> If not, it seems like it'd be nicer to have the burden on the arches
> that need/want to use arch-specific code instead of on the arches that
> do things generically.

Right, but they still share the common code there and some of them
provide only parts of the weak callbacks. I'm not sure whether it's a
good idea to copy all of this into each affected architecture.

Or did you just mean that those architectures should select
CONFIG_I_WANT_THE CRUFT instead of opting out on the fully irq domain
based ones?

Thanks,

        tglx



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 34/38] x86/msi: Let pci_msi_prepare() handle non-PCI MSI
  2020-08-25 20:24   ` Bjorn Helgaas
@ 2020-08-25 21:30     ` Thomas Gleixner
  2020-08-25 21:50       ` Bjorn Helgaas
  0 siblings, 1 reply; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-25 21:30 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, LKML, iommu, Jacob Pan, Rafael J. Wysocki

On Tue, Aug 25 2020 at 15:24, Bjorn Helgaas wrote:
> On Fri, Aug 21, 2020 at 02:24:58AM +0200, Thomas Gleixner wrote:
>> Rename it to x86_msi_prepare() and handle the allocation type setup
>> depending on the device type.
>
> I see what you're doing, but the subject reads a little strangely

Yes :(

> ("pci_msi_prepare() handling non-PCI" stuff) since it doesn't mention
> the rename.  Maybe not practical or worthwhile to split into a rename
> + make generic, I dunno.

What about

x86/msi: Rename and rework pci_msi_prepare() to cover non-PCI MSI

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 30/38] PCI/MSI: Allow to disable arch fallbacks
  2020-08-25 21:28     ` Thomas Gleixner
@ 2020-08-25 21:35       ` Bjorn Helgaas
  2020-08-25 21:40         ` Thomas Gleixner
  0 siblings, 1 reply; 71+ messages in thread
From: Bjorn Helgaas @ 2020-08-25 21:35 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, LKML, iommu, Jacob Pan, Rafael J. Wysocki

On Tue, Aug 25, 2020 at 11:28:30PM +0200, Thomas Gleixner wrote:
> On Tue, Aug 25 2020 at 15:07, Bjorn Helgaas wrote:
> >> + * The arch hooks to setup up msi irqs. Default functions are implemented
> >> + * as weak symbols so that they /can/ be overriden by architecture specific
> >> + * code if needed.
> >> + *
> >> + * They can be replaced by stubs with warnings via
> >> + * CONFIG_PCI_MSI_DISABLE_ARCH_FALLBACKS when the architecture fully
> >> + * utilizes direct irqdomain based setup.

> > If not, it seems like it'd be nicer to have the burden on the arches
> > that need/want to use arch-specific code instead of on the arches that
> > do things generically.
> 
> Right, but they still share the common code there and some of them
> provide only parts of the weak callbacks. I'm not sure whether it's a
> good idea to copy all of this into each affected architecture.
> 
> Or did you just mean that those architectures should select
> CONFIG_I_WANT_THE CRUFT instead of opting out on the fully irq domain
> based ones?

Yes, that was my real question -- can we confine the cruft in the
crufty arches?  If not, no big deal.

Bjorn

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 30/38] PCI/MSI: Allow to disable arch fallbacks
  2020-08-25 21:35       ` Bjorn Helgaas
@ 2020-08-25 21:40         ` Thomas Gleixner
  2020-08-25 22:03           ` Thomas Gleixner
  0 siblings, 1 reply; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-25 21:40 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, LKML, iommu, Jacob Pan, Rafael J. Wysocki

On Tue, Aug 25 2020 at 16:35, Bjorn Helgaas wrote:
> On Tue, Aug 25, 2020 at 11:28:30PM +0200, Thomas Gleixner wrote:
>> 
>> Or did you just mean that those architectures should select
>> CONFIG_I_WANT_THE CRUFT instead of opting out on the fully irq domain
>> based ones?
>
> Yes, that was my real question -- can we confine the cruft in the
> crufty arches?  If not, no big deal.

Should be doable. Let me try.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [patch RFC 34/38] x86/msi: Let pci_msi_prepare() handle non-PCI MSI
  2020-08-25 21:30     ` Thomas Gleixner
@ 2020-08-25 21:50       ` Bjorn Helgaas
  0 siblings, 0 replies; 71+ messages in thread
From: Bjorn Helgaas @ 2020-08-25 21:50 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, LKML, iommu, Jacob Pan, Rafael J. Wysocki

On Tue, Aug 25, 2020 at 11:30:41PM +0200, Thomas Gleixner wrote:
> On Tue, Aug 25 2020 at 15:24, Bjorn Helgaas wrote:
> > On Fri, Aug 21, 2020 at 02:24:58AM +0200, Thomas Gleixner wrote:
> >> Rename it to x86_msi_prepare() and handle the allocation type setup
> >> depending on the device type.
> >
> > I see what you're doing, but the subject reads a little strangely
> 
> Yes :(
> 
> > ("pci_msi_prepare() handling non-PCI" stuff) since it doesn't mention
> > the rename.  Maybe not practical or worthwhile to split into a rename
> > + make generic, I dunno.
> 
> What about
> 
> x86/msi: Rename and rework pci_msi_prepare() to cover non-PCI MSI

Perfect!


* Re: [patch RFC 30/38] PCI/MSI: Allow to disable arch fallbacks
  2020-08-25 21:40         ` Thomas Gleixner
@ 2020-08-25 22:03           ` Thomas Gleixner
  0 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-25 22:03 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, LKML, iommu, Jacob Pan, Rafael J. Wysocki

On Tue, Aug 25 2020 at 23:40, Thomas Gleixner wrote:
> On Tue, Aug 25 2020 at 16:35, Bjorn Helgaas wrote:
>> On Tue, Aug 25, 2020 at 11:28:30PM +0200, Thomas Gleixner wrote:
>>> 
>>> Or did you just mean that those architectures should select
>>> CONFIG_I_WANT_THE_CRUFT instead of opting out on the fully irq domain
>>> based ones?
>>
>> Yes, that was my real question -- can we confine the cruft in the
>> crufty arches?  If not, no big deal.
>
> Should be doable. Let me try.

Bah. There is more cruft.

The weak implementation has another way to go sideways via
msi_controller::setup_irq[s] and msi_controller::teardown_irq

drivers/pci/controller/pci-tegra.c
drivers/pci/controller/pcie-rcar-host.c
drivers/pci/controller/pcie-xilinx.c

Groan....
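
A rough user-space sketch of the fallback path described above, for readers
who do not have drivers/pci/msi.c at hand. All sketch_* names and the calls
counter are made up for illustration and are not kernel API. Only the shape
follows the mail: a weak arch hook that dispatches to a per-controller
setup_irq callback. The -EINVAL return for a missing callback and the use of
__attribute__((weak)) (GCC or Clang) are assumptions of this sketch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct msi_controller: one callback plus a
 * counter so the dispatch can be observed. */
struct sketch_msi_controller {
	int (*setup_irq)(struct sketch_msi_controller *chip, int hwirq);
	int calls;
};

/* Hypothetical stand-in for the device/bus: msi stays NULL when the
 * platform relies purely on irq domains. */
struct sketch_dev {
	struct sketch_msi_controller *msi;
};

/*
 * Weak default. An architecture could override it with a strong
 * definition; otherwise it goes "sideways" through the controller
 * callback, which is the extra legacy path mentioned in the mail.
 */
__attribute__((weak)) int sketch_arch_setup_msi_irq(struct sketch_dev *dev,
						     int hwirq)
{
	struct sketch_msi_controller *chip = dev->msi;

	if (!chip || !chip->setup_irq)
		return -EINVAL;

	return chip->setup_irq(chip, hwirq);
}

/* What a host controller driver callback might look like in this sketch. */
static int sketch_ctrl_setup_irq(struct sketch_msi_controller *chip, int hwirq)
{
	(void)hwirq;
	chip->calls++;
	return 0;
}
```

A device whose msi pointer refers to a controller with sketch_ctrl_setup_irq
installed gets 0 back and the counter increments; a device with msi == NULL
gets -EINVAL. Confining the cruft therefore has to cover this callback path
as well as the arch_*() overrides.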



* Re: [patch RFC 10/38] x86/ioapic: Consolidate IOAPIC allocation
  2020-08-21  0:24 ` [patch RFC 10/38] x86/ioapic: Consolidate IOAPIC allocation Thomas Gleixner
@ 2020-08-26  8:40   ` Boqun Feng
  2020-08-26  9:53     ` Thomas Gleixner
  0 siblings, 1 reply; 71+ messages in thread
From: Boqun Feng @ 2020-08-26  8:40 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, LKML, iommu, Jacob Pan, Rafael J. Wysocki

Hi Thomas,

I hit a compiler error while I was trying to compile this patchset:

arch/x86/kernel/devicetree.c: In function ‘dt_irqdomain_alloc’:
arch/x86/kernel/devicetree.c:232:6: error: ‘struct irq_alloc_info’ has no member named ‘ioapic_id’; did you mean ‘ioapic’?
  232 |  tmp.ioapic_id = mpc_ioapic_id(mp_irqdomain_ioapic_idx(domain));
      |      ^~~~~~~~~
      |      ioapic
arch/x86/kernel/devicetree.c:233:6: error: ‘struct irq_alloc_info’ has no member named ‘ioapic_pin’; did you mean ‘ioapic’?
  233 |  tmp.ioapic_pin = fwspec->param[0];
      |      ^~~~~~~~~~
      |      ioapic

with CONFIG_OF=y. IIUC, the following changes need to be folded into
this patch. (At least the kernel compiles again for me with this
change.)

diff --git a/arch/x86/kernel/devicetree.c b/arch/x86/kernel/devicetree.c
index a0e8fc7d85f1..ddffd80f5c52 100644
--- a/arch/x86/kernel/devicetree.c
+++ b/arch/x86/kernel/devicetree.c
@@ -229,8 +229,8 @@ static int dt_irqdomain_alloc(struct irq_domain *domain, unsigned int virq,
 
 	it = &of_ioapic_type[type_index];
 	ioapic_set_alloc_attr(&tmp, NUMA_NO_NODE, it->trigger, it->polarity);
-	tmp.ioapic_id = mpc_ioapic_id(mp_irqdomain_ioapic_idx(domain));
-	tmp.ioapic_pin = fwspec->param[0];
+	tmp.devid = mpc_ioapic_id(mp_irqdomain_ioapic_idx(domain));
+	tmp.ioapic.pin = fwspec->param[0];
 
 	return mp_irqdomain_alloc(domain, virq, nr_irqs, &tmp);
 }
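
To make the rename in the fix above easier to follow, here is a stand-alone
sketch of the consolidated layout. It is a simplification and not the
kernel header: the *_sketch names and sketch_fill_ioapic() are hypothetical,
the fields outside the union are reduced to the ones relevant here, and the
u32 type of devid is an assumption. Only the ioapic_alloc_info members are
taken from the quoted patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Members as in the quoted patch; entry is a plain void * here because
 * struct IO_APIC_route_entry is not defined in this sketch. */
struct ioapic_alloc_info_sketch {
	int		pin;
	int		node;
	uint32_t	trigger : 1;
	uint32_t	polarity : 1;
	uint32_t	valid : 1;
	void		*entry;
};

/* Reduced stand-in for struct irq_alloc_info. */
struct irq_alloc_info_sketch {
	int		type;
	uint32_t	devid;	/* common id, takes over from ioapic_id */
	union {
		struct ioapic_alloc_info_sketch	ioapic;
		int				unused;
	};
};

/* Hypothetical helper mirroring the two assignments in the fix:
 * the old ioapic_id goes to devid, the old ioapic_pin to ioapic.pin. */
static void sketch_fill_ioapic(struct irq_alloc_info_sketch *info,
			       uint32_t ioapic_id, int pin)
{
	info->devid = ioapic_id;
	info->ioapic.pin = pin;
}
```

Any caller still writing info.ioapic_id or info.ioapic_pin fails to build
against this layout, which matches the error reported for devicetree.c.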

Regards,
Boqun

On Fri, Aug 21, 2020 at 02:24:34AM +0200, Thomas Gleixner wrote:
> Move the IOAPIC specific fields into their own struct and reuse the common
> devid. Get rid of the #ifdeffery as it does not matter at all whether the
> alloc info is a couple of bytes longer or not.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Wei Liu <wei.liu@kernel.org>
> Cc: "K. Y. Srinivasan" <kys@microsoft.com>
> Cc: Stephen Hemminger <sthemmin@microsoft.com>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: linux-hyperv@vger.kernel.org
> Cc: iommu@lists.linux-foundation.org
> Cc: Haiyang Zhang <haiyangz@microsoft.com>
> Cc: Jon Derrick <jonathan.derrick@intel.com>
> Cc: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  arch/x86/include/asm/hw_irq.h       |   23 ++++++-----
>  arch/x86/kernel/apic/io_apic.c      |   70 ++++++++++++++++++------------------
>  drivers/iommu/amd/iommu.c           |   14 +++----
>  drivers/iommu/hyperv-iommu.c        |    2 -
>  drivers/iommu/intel/irq_remapping.c |   18 ++++-----
>  5 files changed, 64 insertions(+), 63 deletions(-)
> 
> --- a/arch/x86/include/asm/hw_irq.h
> +++ b/arch/x86/include/asm/hw_irq.h
> @@ -44,6 +44,15 @@ enum irq_alloc_type {
>  	X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT,
>  };
>  
> +struct ioapic_alloc_info {
> +	int				pin;
> +	int				node;
> +	u32				trigger : 1;
> +	u32				polarity : 1;
> +	u32				valid : 1;
> +	struct IO_APIC_route_entry	*entry;
> +};
> +
>  /**
>   * irq_alloc_info - X86 specific interrupt allocation info
>   * @type:	X86 specific allocation type
> @@ -53,6 +62,8 @@ enum irq_alloc_type {
>   * @mask:	CPU mask for vector allocation
>   * @desc:	Pointer to msi descriptor
>   * @data:	Allocation specific data
> + *
> + * @ioapic:	IOAPIC specific allocation data
>   */
>  struct irq_alloc_info {
>  	enum irq_alloc_type	type;
> @@ -64,6 +75,7 @@ struct irq_alloc_info {
>  	void			*data;
>  
>  	union {
> +		struct ioapic_alloc_info	ioapic;
>  		int		unused;
>  #ifdef	CONFIG_PCI_MSI
>  		struct {
> @@ -71,17 +83,6 @@ struct irq_alloc_info {
>  			irq_hw_number_t	msi_hwirq;
>  		};
>  #endif
> -#ifdef	CONFIG_X86_IO_APIC
> -		struct {
> -			int		ioapic_id;
> -			int		ioapic_pin;
> -			int		ioapic_node;
> -			u32		ioapic_trigger : 1;
> -			u32		ioapic_polarity : 1;
> -			u32		ioapic_valid : 1;
> -			struct IO_APIC_route_entry *ioapic_entry;
> -		};
> -#endif
>  #ifdef	CONFIG_DMAR_TABLE
>  		struct {
>  			int		dmar_id;
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -860,10 +860,10 @@ void ioapic_set_alloc_attr(struct irq_al
>  {
>  	init_irq_alloc_info(info, NULL);
>  	info->type = X86_IRQ_ALLOC_TYPE_IOAPIC;
> -	info->ioapic_node = node;
> -	info->ioapic_trigger = trigger;
> -	info->ioapic_polarity = polarity;
> -	info->ioapic_valid = 1;
> +	info->ioapic.node = node;
> +	info->ioapic.trigger = trigger;
> +	info->ioapic.polarity = polarity;
> +	info->ioapic.valid = 1;
>  }
>  
>  #ifndef CONFIG_ACPI
> @@ -878,32 +878,32 @@ static void ioapic_copy_alloc_attr(struc
>  
>  	copy_irq_alloc_info(dst, src);
>  	dst->type = X86_IRQ_ALLOC_TYPE_IOAPIC;
> -	dst->ioapic_id = mpc_ioapic_id(ioapic_idx);
> -	dst->ioapic_pin = pin;
> -	dst->ioapic_valid = 1;
> -	if (src && src->ioapic_valid) {
> -		dst->ioapic_node = src->ioapic_node;
> -		dst->ioapic_trigger = src->ioapic_trigger;
> -		dst->ioapic_polarity = src->ioapic_polarity;
> +	dst->devid = mpc_ioapic_id(ioapic_idx);
> +	dst->ioapic.pin = pin;
> +	dst->ioapic.valid = 1;
> +	if (src && src->ioapic.valid) {
> +		dst->ioapic.node = src->ioapic.node;
> +		dst->ioapic.trigger = src->ioapic.trigger;
> +		dst->ioapic.polarity = src->ioapic.polarity;
>  	} else {
> -		dst->ioapic_node = NUMA_NO_NODE;
> +		dst->ioapic.node = NUMA_NO_NODE;
>  		if (acpi_get_override_irq(gsi, &trigger, &polarity) >= 0) {
> -			dst->ioapic_trigger = trigger;
> -			dst->ioapic_polarity = polarity;
> +			dst->ioapic.trigger = trigger;
> +			dst->ioapic.polarity = polarity;
>  		} else {
>  			/*
>  			 * PCI interrupts are always active low level
>  			 * triggered.
>  			 */
> -			dst->ioapic_trigger = IOAPIC_LEVEL;
> -			dst->ioapic_polarity = IOAPIC_POL_LOW;
> +			dst->ioapic.trigger = IOAPIC_LEVEL;
> +			dst->ioapic.polarity = IOAPIC_POL_LOW;
>  		}
>  	}
>  }
>  
>  static int ioapic_alloc_attr_node(struct irq_alloc_info *info)
>  {
> -	return (info && info->ioapic_valid) ? info->ioapic_node : NUMA_NO_NODE;
> +	return (info && info->ioapic.valid) ? info->ioapic.node : NUMA_NO_NODE;
>  }
>  
>  static void mp_register_handler(unsigned int irq, unsigned long trigger)
> @@ -933,14 +933,14 @@ static bool mp_check_pin_attr(int irq, s
>  	 * pin with real trigger and polarity attributes.
>  	 */
>  	if (irq < nr_legacy_irqs() && data->count == 1) {
> -		if (info->ioapic_trigger != data->trigger)
> -			mp_register_handler(irq, info->ioapic_trigger);
> -		data->entry.trigger = data->trigger = info->ioapic_trigger;
> -		data->entry.polarity = data->polarity = info->ioapic_polarity;
> +		if (info->ioapic.trigger != data->trigger)
> +			mp_register_handler(irq, info->ioapic.trigger);
> +		data->entry.trigger = data->trigger = info->ioapic.trigger;
> +		data->entry.polarity = data->polarity = info->ioapic.polarity;
>  	}
>  
> -	return data->trigger == info->ioapic_trigger &&
> -	       data->polarity == info->ioapic_polarity;
> +	return data->trigger == info->ioapic.trigger &&
> +	       data->polarity == info->ioapic.polarity;
>  }
>  
>  static int alloc_irq_from_domain(struct irq_domain *domain, int ioapic, u32 gsi,
> @@ -1002,7 +1002,7 @@ static int alloc_isa_irq_from_domain(str
>  		if (!mp_check_pin_attr(irq, info))
>  			return -EBUSY;
>  		if (__add_pin_to_irq_node(irq_data->chip_data, node, ioapic,
> -					  info->ioapic_pin))
> +					  info->ioapic.pin))
>  			return -ENOMEM;
>  	} else {
>  		info->flags |= X86_IRQ_ALLOC_LEGACY;
> @@ -2092,8 +2092,8 @@ static int mp_alloc_timer_irq(int ioapic
>  		struct irq_alloc_info info;
>  
>  		ioapic_set_alloc_attr(&info, NUMA_NO_NODE, 0, 0);
> -		info.ioapic_id = mpc_ioapic_id(ioapic);
> -		info.ioapic_pin = pin;
> +		info.devid = mpc_ioapic_id(ioapic);
> +		info.ioapic.pin = pin;
>  		mutex_lock(&ioapic_mutex);
>  		irq = alloc_isa_irq_from_domain(domain, 0, ioapic, pin, &info);
>  		mutex_unlock(&ioapic_mutex);
> @@ -2297,7 +2297,7 @@ static int mp_irqdomain_create(int ioapi
>  
>  	init_irq_alloc_info(&info, NULL);
>  	info.type = X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT;
> -	info.ioapic_id = mpc_ioapic_id(ioapic);
> +	info.devid = mpc_ioapic_id(ioapic);
>  	parent = irq_remapping_get_irq_domain(&info);
>  	if (!parent)
>  		parent = x86_vector_domain;
> @@ -2932,9 +2932,9 @@ int mp_ioapic_registered(u32 gsi_base)
>  static void mp_irqdomain_get_attr(u32 gsi, struct mp_chip_data *data,
>  				  struct irq_alloc_info *info)
>  {
> -	if (info && info->ioapic_valid) {
> -		data->trigger = info->ioapic_trigger;
> -		data->polarity = info->ioapic_polarity;
> +	if (info && info->ioapic.valid) {
> +		data->trigger = info->ioapic.trigger;
> +		data->polarity = info->ioapic.polarity;
>  	} else if (acpi_get_override_irq(gsi, &data->trigger,
>  					 &data->polarity) < 0) {
>  		/* PCI interrupts are always active low level triggered. */
> @@ -2980,7 +2980,7 @@ int mp_irqdomain_alloc(struct irq_domain
>  		return -EINVAL;
>  
>  	ioapic = mp_irqdomain_ioapic_idx(domain);
> -	pin = info->ioapic_pin;
> +	pin = info->ioapic.pin;
>  	if (irq_find_mapping(domain, (irq_hw_number_t)pin) > 0)
>  		return -EEXIST;
>  
> @@ -2988,7 +2988,7 @@ int mp_irqdomain_alloc(struct irq_domain
>  	if (!data)
>  		return -ENOMEM;
>  
> -	info->ioapic_entry = &data->entry;
> +	info->ioapic.entry = &data->entry;
>  	ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, info);
>  	if (ret < 0) {
>  		kfree(data);
> @@ -2996,7 +2996,7 @@ int mp_irqdomain_alloc(struct irq_domain
>  	}
>  
>  	INIT_LIST_HEAD(&data->irq_2_pin);
> -	irq_data->hwirq = info->ioapic_pin;
> +	irq_data->hwirq = info->ioapic.pin;
>  	irq_data->chip = (domain->parent == x86_vector_domain) ?
>  			  &ioapic_chip : &ioapic_ir_chip;
>  	irq_data->chip_data = data;
> @@ -3006,8 +3006,8 @@ int mp_irqdomain_alloc(struct irq_domain
>  	add_pin_to_irq_node(data, ioapic_alloc_attr_node(info), ioapic, pin);
>  
>  	local_irq_save(flags);
> -	if (info->ioapic_entry)
> -		mp_setup_entry(cfg, data, info->ioapic_entry);
> +	if (info->ioapic.entry)
> +		mp_setup_entry(cfg, data, info->ioapic.entry);
>  	mp_register_handler(virq, data->trigger);
>  	if (virq < nr_legacy_irqs())
>  		legacy_pic->mask(virq);
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -3508,7 +3508,7 @@ static int get_devid(struct irq_alloc_in
>  	switch (info->type) {
>  	case X86_IRQ_ALLOC_TYPE_IOAPIC:
>  	case X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT:
> -		return get_ioapic_devid(info->ioapic_id);
> +		return get_ioapic_devid(info->devid);
>  	case X86_IRQ_ALLOC_TYPE_HPET:
>  	case X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT:
>  		return get_hpet_devid(info->devid);
> @@ -3586,15 +3586,15 @@ static void irq_remapping_prepare_irte(s
>  	switch (info->type) {
>  	case X86_IRQ_ALLOC_TYPE_IOAPIC:
>  		/* Setup IOAPIC entry */
> -		entry = info->ioapic_entry;
> -		info->ioapic_entry = NULL;
> +		entry = info->ioapic.entry;
> +		info->ioapic.entry = NULL;
>  		memset(entry, 0, sizeof(*entry));
>  		entry->vector        = index;
>  		entry->mask          = 0;
> -		entry->trigger       = info->ioapic_trigger;
> -		entry->polarity      = info->ioapic_polarity;
> +		entry->trigger       = info->ioapic.trigger;
> +		entry->polarity      = info->ioapic.polarity;
>  		/* Mask level triggered irqs. */
> -		if (info->ioapic_trigger)
> +		if (info->ioapic.trigger)
>  			entry->mask = 1;
>  		break;
>  
> @@ -3680,7 +3680,7 @@ static int irq_remapping_alloc(struct ir
>  					iommu->irte_ops->set_allocated(table, i);
>  			}
>  			WARN_ON(table->min_index != 32);
> -			index = info->ioapic_pin;
> +			index = info->ioapic.pin;
>  		} else {
>  			index = -ENOMEM;
>  		}
> --- a/drivers/iommu/hyperv-iommu.c
> +++ b/drivers/iommu/hyperv-iommu.c
> @@ -101,7 +101,7 @@ static int hyperv_irq_remapping_alloc(st
>  	 * in the chip_data and hyperv_irq_remapping_activate()/hyperv_ir_set_
>  	 * affinity() set vector and dest_apicid directly into IO-APIC entry.
>  	 */
> -	irq_data->chip_data = info->ioapic_entry;
> +	irq_data->chip_data = info->ioapic.entry;
>  
>  	/*
>  	 * Hypver-V IO APIC irq affinity should be in the scope of
> --- a/drivers/iommu/intel/irq_remapping.c
> +++ b/drivers/iommu/intel/irq_remapping.c
> @@ -1113,7 +1113,7 @@ static struct irq_domain *intel_get_irq_
>  
>  	switch (info->type) {
>  	case X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT:
> -		return map_ioapic_to_ir(info->ioapic_id);
> +		return map_ioapic_to_ir(info->devid);
>  	case X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT:
>  		return map_hpet_to_ir(info->devid);
>  	case X86_IRQ_ALLOC_TYPE_PCI_MSI:
> @@ -1254,16 +1254,16 @@ static void intel_irq_remapping_prepare_
>  	switch (info->type) {
>  	case X86_IRQ_ALLOC_TYPE_IOAPIC:
>  		/* Set source-id of interrupt request */
> -		set_ioapic_sid(irte, info->ioapic_id);
> +		set_ioapic_sid(irte, info->devid);
>  		apic_printk(APIC_VERBOSE, KERN_DEBUG "IOAPIC[%d]: Set IRTE entry (P:%d FPD:%d Dst_Mode:%d Redir_hint:%d Trig_Mode:%d Dlvry_Mode:%X Avail:%X Vector:%02X Dest:%08X SID:%04X SQ:%X SVT:%X)\n",
> -			info->ioapic_id, irte->present, irte->fpd,
> +			info->devid, irte->present, irte->fpd,
>  			irte->dst_mode, irte->redir_hint,
>  			irte->trigger_mode, irte->dlvry_mode,
>  			irte->avail, irte->vector, irte->dest_id,
>  			irte->sid, irte->sq, irte->svt);
>  
> -		entry = (struct IR_IO_APIC_route_entry *)info->ioapic_entry;
> -		info->ioapic_entry = NULL;
> +		entry = (struct IR_IO_APIC_route_entry *)info->ioapic.entry;
> +		info->ioapic.entry = NULL;
>  		memset(entry, 0, sizeof(*entry));
>  		entry->index2	= (index >> 15) & 0x1;
>  		entry->zero	= 0;
> @@ -1273,11 +1273,11 @@ static void intel_irq_remapping_prepare_
>  		 * IO-APIC RTE will be configured with virtual vector.
>  		 * irq handler will do the explicit EOI to the io-apic.
>  		 */
> -		entry->vector	= info->ioapic_pin;
> +		entry->vector	= info->ioapic.pin;
>  		entry->mask	= 0;			/* enable IRQ */
> -		entry->trigger	= info->ioapic_trigger;
> -		entry->polarity	= info->ioapic_polarity;
> -		if (info->ioapic_trigger)
> +		entry->trigger	= info->ioapic.trigger;
> +		entry->polarity	= info->ioapic.polarity;
> +		if (info->ioapic.trigger)
>  			entry->mask = 1; /* Mask level triggered irqs. */
>  		break;
>  
> 


* Re: [patch RFC 10/38] x86/ioapic: Consolidate IOAPIC allocation
  2020-08-26  8:40   ` Boqun Feng
@ 2020-08-26  9:53     ` Thomas Gleixner
  0 siblings, 0 replies; 71+ messages in thread
From: Thomas Gleixner @ 2020-08-26  9:53 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Dimitri Sivanich, linux-hyperv, Steve Wahl, linux-pci,
	K. Y. Srinivasan, Dan Williams, Wei Liu, Stephen Hemminger,
	Baolu Lu, Marc Zyngier, x86, Jason Gunthorpe, Megha Dey,
	xen-devel, Kevin Tian, Konrad Rzeszutek Wilk, Haiyang Zhang,
	Alex Williamson, Stefano Stabellini, Bjorn Helgaas, Dave Jiang,
	Boris Ostrovsky, Jon Derrick, Juergen Gross, Russ Anderson,
	Greg Kroah-Hartman, LKML, iommu, Jacob Pan, Rafael J. Wysocki

On Wed, Aug 26 2020 at 16:40, Boqun Feng wrote:
> I hit a compiler error while I was trying to compile this patchset:
>
> arch/x86/kernel/devicetree.c: In function ‘dt_irqdomain_alloc’:
> arch/x86/kernel/devicetree.c:232:6: error: ‘struct irq_alloc_info’ has no member named ‘ioapic_id’; did you mean ‘ioapic’?
>   232 |  tmp.ioapic_id = mpc_ioapic_id(mp_irqdomain_ioapic_idx(domain));

Yeah, noticed myself already and 0day complained as well :)



end of thread, other threads:[~2020-08-26  9:53 UTC | newest]

Thread overview: 71+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-21  0:24 [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 01/38] iommu/amd: Prevent NULL pointer dereference Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 02/38] x86/init: Remove unused init ops Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 03/38] x86/irq: Rename X86_IRQ_ALLOC_TYPE_MSI* to reflect PCI dependency Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 04/38] x86/irq: Add allocation type for parent domain retrieval Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 05/38] iommu/vt-d: Consolidate irq domain getter Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 06/38] iommu/amd: " Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 07/38] iommu/irq_remapping: Consolidate irq domain lookup Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 08/38] x86/irq: Prepare consolidation of irq_alloc_info Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 09/38] x86/msi: Consolidate HPET allocation Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 10/38] x86/ioapic: Consolidate IOAPIC allocation Thomas Gleixner
2020-08-26  8:40   ` Boqun Feng
2020-08-26  9:53     ` Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 11/38] x86/irq: Consolidate DMAR irq allocation Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 12/38] x86/irq: Consolidate UV domain allocation Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 13/38] PCI: MSI: Rework pci_msi_domain_calc_hwirq() Thomas Gleixner
2020-08-25 20:03   ` Bjorn Helgaas
2020-08-25 21:11     ` Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 14/38] x86/msi: Consolidate MSI allocation Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 15/38] x86/msi: Use generic MSI domain ops Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 16/38] x86/irq: Move apic_post_init() invocation to one place Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 17/38] x86/pci: Reducde #ifdeffery in PCI init code Thomas Gleixner
2020-08-25 20:20   ` Bjorn Helgaas
2020-08-21  0:24 ` [patch RFC 18/38] x86/irq: Initialize PCI/MSI domain at PCI init time Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 19/38] irqdomain/msi: Provide DOMAIN_BUS_VMD_MSI Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 20/38] PCI: vmd: Mark VMD irqdomain with DOMAIN_BUS_VMD_MSI Thomas Gleixner
2020-08-25 20:04   ` Bjorn Helgaas
2020-08-21  0:24 ` [patch RFC 21/38] PCI: MSI: Provide pci_dev_has_special_msi_domain() helper Thomas Gleixner
2020-08-25 20:16   ` Bjorn Helgaas
2020-08-21  0:24 ` [patch RFC 22/38] x86/xen: Make xen_msi_init() static and rename it to xen_hvm_msi_init() Thomas Gleixner
2020-08-24  4:48   ` Jürgen Groß
2020-08-21  0:24 ` [patch RFC 23/38] x86/xen: Rework MSI teardown Thomas Gleixner
2020-08-24  5:09   ` Jürgen Groß
2020-08-21  0:24 ` [patch RFC 24/38] x86/xen: Consolidate XEN-MSI init Thomas Gleixner
2020-08-24  4:59   ` Jürgen Groß
2020-08-24 21:21     ` Thomas Gleixner
2020-08-25  4:21       ` Jürgen Groß
2020-08-25  9:51         ` Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 25/38] irqdomain/msi: Allow to override msi_domain_alloc/free_irqs() Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 26/38] x86/xen: Wrap XEN MSI management into irqdomain Thomas Gleixner
2020-08-24  6:21   ` Jürgen Groß
2020-08-25  7:57     ` Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 27/38] iommm/vt-d: Store irq domain in struct device Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 28/38] iommm/amd: " Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 29/38] x86/pci: Set default irq domain in pcibios_add_device() Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 30/38] PCI/MSI: Allow to disable arch fallbacks Thomas Gleixner
2020-08-25 20:07   ` Bjorn Helgaas
2020-08-25 21:28     ` Thomas Gleixner
2020-08-25 21:35       ` Bjorn Helgaas
2020-08-25 21:40         ` Thomas Gleixner
2020-08-25 22:03           ` Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 31/38] x86/irq: Cleanup the arch_*_msi_irqs() leftovers Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 32/38] x86/irq: Make most MSI ops XEN private Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 33/38] x86/irq: Add DEV_MSI allocation type Thomas Gleixner
2020-08-21  0:24 ` [patch RFC 34/38] x86/msi: Let pci_msi_prepare() handle non-PCI MSI Thomas Gleixner
2020-08-25 20:24   ` Bjorn Helgaas
2020-08-25 21:30     ` Thomas Gleixner
2020-08-25 21:50       ` Bjorn Helgaas
2020-08-21  0:24 ` [patch RFC 35/38] platform-msi: Provide default irq_chip::ack Thomas Gleixner
2020-08-21  0:25 ` [patch RFC 36/38] platform-msi: Add device MSI infrastructure Thomas Gleixner
2020-08-21  0:25 ` [patch RFC 37/38] irqdomain/msi: Provide msi_alloc/free_store() callbacks Thomas Gleixner
2020-08-21  0:25 ` [patch RFC 38/38] irqchip: Add IMS array driver - NOT FOR MERGING Thomas Gleixner
2020-08-21 12:45   ` Jason Gunthorpe
2020-08-21 19:47     ` Thomas Gleixner
2020-08-21 20:17       ` Jason Gunthorpe
2020-08-21 23:47         ` Thomas Gleixner
2020-08-22  0:51           ` Jason Gunthorpe
2020-08-22  1:34             ` Thomas Gleixner
2020-08-22 23:05               ` Jason Gunthorpe
2020-08-23  8:03                 ` Thomas Gleixner
2020-08-22 14:19 ` [patch RFC 00/38] x86, PCI, XEN, genirq ...: Prepare for device MSI Jürgen Groß
