Linux-PCI Archive on lore.kernel.org
 help / color / Atom feed
* [RFC V1 0/7] Add support for a new IMS interrupt mechanism
@ 2019-09-13  1:32 Megha Dey
  2019-09-13  1:32 ` [RFC V1 1/7] genirq/msi: Differentiate between various MSI based interrupts Megha Dey
                   ` (7 more replies)
  0 siblings, 8 replies; 21+ messages in thread
From: Megha Dey @ 2019-09-13  1:32 UTC (permalink / raw)
  To: linux-kernel, x86, linux-pci, maz, bhelgaas, rafael, gregkh,
	tglx, hpa, alex.williamson, jgg
  Cc: ashok.raj, megha.dey, jacob.jun.pan, Megha Dey

Currently, MSI (Message signaled interrupts) and MSI-X are the de facto
standard for device interrupt mechanism. MSI-X supports up to 2048
interrupts per device while MSI supports 32, which seems more than enough
for current devices. However, the introduction of SIOV (Scalable IO
virtualization) shifts the creation of assignable virtual devices from
hardware to a more software assisted approach. This flexible composition
of direct assignable devices, a.k.a. assignable device interfaces (ADIs)
unchains hardware from costly PCI standard. Under SIOV, device resource
can now be mapped directly to a guest or other user space drivers for
near native DMA performance. To complete functionality of ADIs, a matching
interrupt resource must also be introduced which will be scalable.

Interrupt message storage (IMS) is conceived  as a scalable albeit device
specific interrupt mechanism to meet such a demand. With IMS, there is
theoretically no upper bound on the number of interrupts which a device
can support. The size and location of IMS is device-specific; some devices
may implement IMS as on-device storage which are memory-mapped, others may
opt to implement IMS in system memory. IMS stores each interrupt message as
a DWORD size data payload and a 64-bit address(same as MSI-X). Access to
the IMS is through the host driver due to the non-architectural nature of
device IMS unlike the architectural MSI-X table which are accessed through
PCI drivers.

In this patchset, we introduce generic IMS APIs that fits the Linux IRQ
subsystem, supports IMS IRQ chip and domains that can be used by drivers
which are capable of generating IMS interrupts.

The IMS has been introduced as part of Intel's Scalable I/O virtualization
specification:
https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification

This patchset is based on Linux 5.3-rc8.

Currently there is no device out in the market which supports SIOV (Hence no
device supports IMS).

This series is a basic patchset to get the ball rolling and receive some
inital comments. As per my discussion with Marc Zyngier and Thomas Gleixner
at the Linux Plumbers, I need to do the following:
1. Since a device can support MSI-X and IMS simultaneously, ensure proper
   locking mechanism for the 'msi_list' in the device structure.
2. Introduce dynamic allocation of IMS vectors perhaps by using a group ID
3. IMS support of a device needs to be discoverable. A bit in the vendor
   specific capability in the PCI config is to be added rather than getting
   this information from each device driver.

Jason Gunthorpe of Mellanox technologies is looking to do something similar
on ARM platforms and was wondering why IMS is x86 sepcific. Perhaps we can
use this thread to discuss further on this. 

Megha Dey (7):
  genirq/msi: Differentiate between various MSI based interrupts
  drivers/base: Introduce callbacks for IMS interrupt domain
  x86/ims: Add support for a new IMS irq domain
  irq_remapping: New interfaces to support IMS irqdomain
  x86/ims: Introduce x86_ims_ops
  ims-msi: Add APIs to allocate/free IMS interrupts
  ims: Add the set_desc callback

 arch/mips/pci/msi-xlp.c              |   2 +-
 arch/s390/pci/pci_irq.c              |   2 +-
 arch/x86/include/asm/irq_remapping.h |  13 ++
 arch/x86/include/asm/msi.h           |   4 +
 arch/x86/include/asm/pci.h           |   4 +
 arch/x86/include/asm/x86_init.h      |  10 +
 arch/x86/kernel/apic/Makefile        |   1 +
 arch/x86/kernel/apic/ims.c           | 118 ++++++++++++
 arch/x86/kernel/apic/msi.c           |   6 +-
 arch/x86/kernel/x86_init.c           |  23 +++
 arch/x86/pci/xen.c                   |   2 +-
 drivers/base/Kconfig                 |   7 +
 drivers/base/Makefile                |   1 +
 drivers/base/ims-msi.c               | 353 +++++++++++++++++++++++++++++++++++
 drivers/iommu/intel_irq_remapping.c  |  30 +++
 drivers/iommu/irq_remapping.c        |   9 +
 drivers/iommu/irq_remapping.h        |   3 +
 drivers/pci/msi.c                    |  19 +-
 drivers/vfio/mdev/mdev_core.c        |   6 +
 drivers/vfio/mdev/mdev_private.h     |   1 -
 include/linux/intel-iommu.h          |   1 +
 include/linux/mdev.h                 |   2 +
 include/linux/msi.h                  |  55 +++++-
 kernel/irq/msi.c                     |   2 +-
 24 files changed, 655 insertions(+), 19 deletions(-)
 create mode 100644 arch/x86/kernel/apic/ims.c
 create mode 100644 drivers/base/ims-msi.c

-- 
2.7.4


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC V1 1/7] genirq/msi: Differentiate between various MSI based interrupts
  2019-09-13  1:32 [RFC V1 0/7] Add support for a new IMS interrupt mechanism Megha Dey
@ 2019-09-13  1:32 ` Megha Dey
  2019-09-13  4:40   ` Greg KH
                     ` (2 more replies)
  2019-09-13  1:32 ` [RFC V1 2/7] drivers/base: Introduce callbacks for IMS interrupt domain Megha Dey
                   ` (6 subsequent siblings)
  7 siblings, 3 replies; 21+ messages in thread
From: Megha Dey @ 2019-09-13  1:32 UTC (permalink / raw)
  To: linux-kernel, x86, linux-pci, maz, bhelgaas, rafael, gregkh,
	tglx, hpa, alex.williamson, jgg
  Cc: ashok.raj, megha.dey, jacob.jun.pan, Megha Dey

Since a device can support both MSI-X and IMS interrupts simultaneously,
do away with is_msix and introduce a new enum msi_desc_tag to
differentiate between the various types of msi_descs.

Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
---
 arch/mips/pci/msi-xlp.c    |  2 +-
 arch/s390/pci/pci_irq.c    |  2 +-
 arch/x86/kernel/apic/msi.c |  2 +-
 arch/x86/pci/xen.c         |  2 +-
 drivers/pci/msi.c          | 19 ++++++++++---------
 include/linux/msi.h        | 11 ++++++++++-
 kernel/irq/msi.c           |  2 +-
 7 files changed, 25 insertions(+), 15 deletions(-)

diff --git a/arch/mips/pci/msi-xlp.c b/arch/mips/pci/msi-xlp.c
index bb14335..0f06ad1 100644
--- a/arch/mips/pci/msi-xlp.c
+++ b/arch/mips/pci/msi-xlp.c
@@ -457,7 +457,7 @@ int arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
 	node = slot / 8;
 	lnkbase = nlm_get_pcie_base(node, link);
 
-	if (desc->msi_attrib.is_msix)
+	if (desc->tag == IRQ_MSI_TAG_MSIX)
 		return xlp_setup_msix(lnkbase, node, link, desc);
 	else
 		return xlp_setup_msi(lnkbase, node, link, desc);
diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
index d80616a..1938582 100644
--- a/arch/s390/pci/pci_irq.c
+++ b/arch/s390/pci/pci_irq.c
@@ -332,7 +332,7 @@ void arch_teardown_msi_irqs(struct pci_dev *pdev)
 	for_each_pci_msi_entry(msi, pdev) {
 		if (!msi->irq)
 			continue;
-		if (msi->msi_attrib.is_msix)
+		if (msi->tag == IRQ_MSI_TAG_MSIX)
 			__pci_msix_desc_mask_irq(msi, 1);
 		else
 			__pci_msi_desc_mask_irq(msi, 1, 1);
diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 7f75334..435bcda 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -98,7 +98,7 @@ int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
 
 	init_irq_alloc_info(arg, NULL);
 	arg->msi_dev = pdev;
-	if (desc->msi_attrib.is_msix) {
+	if (desc->tag == IRQ_MSI_TAG_MSIX) {
 		arg->type = X86_IRQ_ALLOC_TYPE_MSIX;
 	} else {
 		arg->type = X86_IRQ_ALLOC_TYPE_MSI;
diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index 91220cc..5e850b8 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -382,7 +382,7 @@ static void xen_teardown_msi_irqs(struct pci_dev *dev)
 	struct msi_desc *msidesc;
 
 	msidesc = first_pci_msi_entry(dev);
-	if (msidesc->msi_attrib.is_msix)
+	if (msidesc->tag == IRQ_MSI_TAG_MSIX)
 		xen_pci_frontend_disable_msix(dev);
 	else
 		xen_pci_frontend_disable_msi(dev);
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 0884bed..8a05416 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -235,7 +235,7 @@ static void msi_set_mask_bit(struct irq_data *data, u32 flag)
 {
 	struct msi_desc *desc = irq_data_get_msi_desc(data);
 
-	if (desc->msi_attrib.is_msix) {
+	if (desc->tag == IRQ_MSI_TAG_MSIX) {
 		msix_mask_irq(desc, flag);
 		readl(desc->mask_base);		/* Flush write to device */
 	} else {
@@ -278,7 +278,7 @@ void __pci_read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
 
 	BUG_ON(dev->current_state != PCI_D0);
 
-	if (entry->msi_attrib.is_msix) {
+	if (entry->tag == IRQ_MSI_TAG_MSIX) {
 		void __iomem *base = pci_msix_desc_addr(entry);
 
 		if (!base) {
@@ -313,7 +313,7 @@ void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
 
 	if (dev->current_state != PCI_D0 || pci_dev_is_disconnected(dev)) {
 		/* Don't touch the hardware now */
-	} else if (entry->msi_attrib.is_msix) {
+	} else if (entry->tag == IRQ_MSI_TAG_MSIX) {
 		void __iomem *base = pci_msix_desc_addr(entry);
 
 		if (!base)
@@ -376,7 +376,7 @@ static void free_msi_irqs(struct pci_dev *dev)
 	pci_msi_teardown_msi_irqs(dev);
 
 	list_for_each_entry_safe(entry, tmp, msi_list, list) {
-		if (entry->msi_attrib.is_msix) {
+		if (entry->tag == IRQ_MSI_TAG_MSIX) {
 			if (list_is_last(&entry->list, msi_list))
 				iounmap(entry->mask_base);
 		}
@@ -471,7 +471,7 @@ static ssize_t msi_mode_show(struct device *dev, struct device_attribute *attr,
 	entry = irq_get_msi_desc(irq);
 	if (entry)
 		return sprintf(buf, "%s\n",
-				entry->msi_attrib.is_msix ? "msix" : "msi");
+			(entry->tag == IRQ_MSI_TAG_MSIX) ? "msix" : "msi");
 
 	return -ENODEV;
 }
@@ -570,7 +570,7 @@ msi_setup_entry(struct pci_dev *dev, int nvec, struct irq_affinity *affd)
 
 	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
 
-	entry->msi_attrib.is_msix	= 0;
+	entry->tag			= IRQ_MSI_TAG_MSI;
 	entry->msi_attrib.is_64		= !!(control & PCI_MSI_FLAGS_64BIT);
 	entry->msi_attrib.is_virtual    = 0;
 	entry->msi_attrib.entry_nr	= 0;
@@ -714,7 +714,7 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
 			goto out;
 		}
 
-		entry->msi_attrib.is_msix	= 1;
+		entry->tag			= IRQ_MSI_TAG_MSIX;
 		entry->msi_attrib.is_64		= 1;
 		if (entries)
 			entry->msi_attrib.entry_nr = entries[i].entry;
@@ -1380,7 +1380,7 @@ irq_hw_number_t pci_msi_domain_calc_hwirq(struct pci_dev *dev,
 
 static inline bool pci_msi_desc_is_multi_msi(struct msi_desc *desc)
 {
-	return !desc->msi_attrib.is_msix && desc->nvec_used > 1;
+	return (desc->tag == IRQ_MSI_TAG_MSI) && desc->nvec_used > 1;
 }
 
 /**
@@ -1404,7 +1404,8 @@ int pci_msi_domain_check_cap(struct irq_domain *domain,
 	if (pci_msi_desc_is_multi_msi(desc) &&
 	    !(info->flags & MSI_FLAG_MULTI_PCI_MSI))
 		return 1;
-	else if (desc->msi_attrib.is_msix && !(info->flags & MSI_FLAG_PCI_MSIX))
+	else if ((desc->tag == IRQ_MSI_TAG_MSIX) &&
+					!(info->flags & MSI_FLAG_PCI_MSIX))
 		return -ENOTSUPP;
 
 	return 0;
diff --git a/include/linux/msi.h b/include/linux/msi.h
index 8ad679e..22591b6 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -55,6 +55,15 @@ struct ti_sci_inta_msi_desc {
 	u16	dev_index;
 };
 
+enum msi_desc_tags {
+	IRQ_MSI_TAG_MSI,
+	IRQ_MSI_TAG_MSIX,
+	IRQ_MSI_TAG_IMS,
+	IRQ_MSI_TAG_PLAT,
+	IRQ_MSI_TAG_FSL,
+	IRQ_MSI_TAG_SCI,
+};
+
 /**
  * struct msi_desc - Descriptor structure for MSI based interrupts
  * @list:	List head for management
@@ -90,6 +99,7 @@ struct msi_desc {
 	struct device			*dev;
 	struct msi_msg			msg;
 	struct irq_affinity_desc	*affinity;
+	enum msi_desc_tags		tag;
 #ifdef CONFIG_IRQ_MSI_IOMMU
 	const void			*iommu_cookie;
 #endif
@@ -102,7 +112,6 @@ struct msi_desc {
 		struct {
 			u32 masked;
 			struct {
-				u8	is_msix		: 1;
 				u8	multiple	: 3;
 				u8	multi_cap	: 3;
 				u8	maskbit		: 1;
diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
index ad26fbc..0819395 100644
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -384,7 +384,7 @@ static bool msi_check_reservation_mode(struct irq_domain *domain,
 	 * masking and MSI does so when the maskbit is set.
 	 */
 	desc = first_msi_entry(dev);
-	return desc->msi_attrib.is_msix || desc->msi_attrib.maskbit;
+	return (desc->tag == IRQ_MSI_TAG_MSIX) || desc->msi_attrib.maskbit;
 }
 
 /**
-- 
2.7.4


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC V1 2/7] drivers/base: Introduce callbacks for IMS interrupt domain
  2019-09-13  1:32 [RFC V1 0/7] Add support for a new IMS interrupt mechanism Megha Dey
  2019-09-13  1:32 ` [RFC V1 1/7] genirq/msi: Differentiate between various MSI based interrupts Megha Dey
@ 2019-09-13  1:32 ` Megha Dey
  2019-09-13  4:39   ` Greg KH
  2019-09-13 14:50   ` Jason Gunthorpe
  2019-09-13  1:32 ` [RFC V1 3/7] x86/ims: Add support for a new IMS irq domain Megha Dey
                   ` (5 subsequent siblings)
  7 siblings, 2 replies; 21+ messages in thread
From: Megha Dey @ 2019-09-13  1:32 UTC (permalink / raw)
  To: linux-kernel, x86, linux-pci, maz, bhelgaas, rafael, gregkh,
	tglx, hpa, alex.williamson, jgg
  Cc: ashok.raj, megha.dey, jacob.jun.pan, Megha Dey, Jacob Pan, Sanjay Kumar

This patch serves as a preparatory patch to introduce a new IMS
(Interrupt Message Store) domain. It consists of APIs which would
be used as callbacks to the IRQ chip associated with the IMS domain.

The APIs introduced in this patch are:
dev_ims_mask_irq - Generic irq chip callback to mask IMS interrupts
dev_ims_unmask_irq - Generic irq chip callback to unmask IMS interrupts
dev_ims_domain_write_msg - Helper to write MSI message to Device IMS

It also introduces IMS specific structures namely:
dev_ims_ops - Callbacks for IMS domain ops
dev_ims_desc - Device specific IMS msi descriptor data
dev_ims_priv_data - Internal data structure containing a unique devid
and a pointer to the IMS domain ops

Lastly, it adds a new config option MSI_IMS which must be enabled by
any driver who would want to use the IMS infrastructure.

Since IMS is not PCI compliant (like platform-msi), most of the code is
similar to platform-msi.c.

TODO: Conclude if ims-msi.c and platform-msi.c can be merged.

Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
---
 drivers/base/Kconfig   |  7 ++++
 drivers/base/Makefile  |  1 +
 drivers/base/ims-msi.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/msi.h    | 35 ++++++++++++++++++-
 4 files changed, 136 insertions(+), 1 deletion(-)
 create mode 100644 drivers/base/ims-msi.c

diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
index dc40449..038fabd 100644
--- a/drivers/base/Kconfig
+++ b/drivers/base/Kconfig
@@ -206,3 +206,10 @@ config GENERIC_ARCH_TOPOLOGY
 	  runtime.
 
 endmenu
+
+config MSI_IMS
+	bool "Device Specific Interrupt Message Storage (IMS)"
+	select GENERIC_MSI_IRQ
+	help
+	  This allows device drivers to enable device specific
+	  interrupt message storage (IMS) besides standard MSI-X interrupts.
diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 1574520..659b9b0 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -22,6 +22,7 @@ obj-$(CONFIG_SOC_BUS) += soc.o
 obj-$(CONFIG_PINCTRL) += pinctrl.o
 obj-$(CONFIG_DEV_COREDUMP) += devcoredump.o
 obj-$(CONFIG_GENERIC_MSI_IRQ_DOMAIN) += platform-msi.o
+obj-$(CONFIG_MSI_IMS) += ims-msi.o
 obj-$(CONFIG_GENERIC_ARCH_TOPOLOGY) += arch_topology.o
 
 obj-y			+= test/
diff --git a/drivers/base/ims-msi.c b/drivers/base/ims-msi.c
new file mode 100644
index 0000000..68dc10f
--- /dev/null
+++ b/drivers/base/ims-msi.c
@@ -0,0 +1,94 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright © 2019 Intel Corporation.
+ *
+ * Author: Megha Dey <megha.dey@intel.com>
+ */
+
+#include <linux/device.h>
+#include <linux/export.h>
+#include <linux/irq.h>
+#include <linux/msi.h>
+
+struct dev_ims_priv_data {
+	struct device		*dev;
+	msi_alloc_info_t	arg;
+	int			devid;
+	struct dev_ims_ops	*ims_ops;
+};
+
+u32 __dev_ims_desc_mask_irq(struct msi_desc *desc, u32 flag)
+{
+	u32 mask_bits = desc->dev_ims.masked;
+	struct dev_ims_ops *ops;
+
+	ops = desc->dev_ims.priv->ims_ops;
+	if (!ops)
+		return 0;
+
+	if (flag) {
+		if (ops->irq_mask)
+			mask_bits = ops->irq_mask(desc);
+	} else {
+		if (ops->irq_unmask)
+			mask_bits = ops->irq_unmask(desc);
+	}
+
+	return mask_bits;
+}
+
+static void ims_mask_irq(struct msi_desc *desc, u32 flag)
+{
+	desc->dev_ims.masked = __dev_ims_desc_mask_irq(desc, flag);
+}
+
+static void ims_set_mask_bit(struct irq_data *data, u32 flag)
+{
+	struct msi_desc *desc = irq_data_get_msi_desc(data);
+
+	ims_mask_irq(desc, flag);
+}
+
+static void __dev_write_ims_msg(struct msi_desc *desc, struct msi_msg *msg)
+{
+	struct dev_ims_ops *ops;
+
+	ops = desc->dev_ims.priv->ims_ops;
+	if (ops && ops->irq_write_msi_msg)
+		ops->irq_write_msi_msg(desc, msg);
+
+	desc->msg = *msg;
+}
+
+/**
+ * dev_ims_mask_irq - Generic irq chip callback to mask IMS interrupts
+ * @data: pointer to irqdata associated to that interrupt
+ */
+void dev_ims_mask_irq(struct irq_data *data)
+{
+	ims_set_mask_bit(data, 1);
+}
+EXPORT_SYMBOL_GPL(dev_ims_mask_irq);
+
+/**
+ * dev_msi_unmask_irq - Generic irq chip callback to unmask IMS interrupts
+ * @data: pointer to irqdata associated to that interrupt
+ */
+void dev_ims_unmask_irq(struct irq_data *data)
+{
+	ims_set_mask_bit(data, 0);
+}
+EXPORT_SYMBOL_GPL(dev_ims_unmask_irq);
+
+/**
+ * dev_ims_write_msg - Helper to write MSI message to Device IMS
+ * @irq_data: Pointer to interrupt data of the MSI interrupt
+ * @msg:      Pointer to the message
+ */
+void dev_ims_write_msg(struct irq_data *data, struct msi_msg *msg)
+{
+	struct msi_desc *desc = irq_data_get_msi_desc(data);
+
+	__dev_write_ims_msg(desc, msg);
+}
+EXPORT_SYMBOL_GPL(dev_ims_write_msg);
diff --git a/include/linux/msi.h b/include/linux/msi.h
index 22591b6..246285a 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -17,6 +17,7 @@ struct irq_data;
 struct msi_desc;
 struct pci_dev;
 struct platform_msi_priv_data;
+struct dev_ims_priv_data;
 void __get_cached_msi_msg(struct msi_desc *entry, struct msi_msg *msg);
 #ifdef CONFIG_GENERIC_MSI_IRQ
 void get_cached_msi_msg(unsigned int irq, struct msi_msg *msg);
@@ -40,6 +41,31 @@ struct platform_msi_desc {
 };
 
 /**
+ * dev_ims_ops - Callbacks for IMS domain ops
+ * @irq_mask:		mask an interrupt source
+ * @irq_unmask:		unmask an interrupt source
+ * @irq_write_msi_msg:	write message content
+ */
+struct dev_ims_ops {
+	unsigned int	(*irq_mask)(struct msi_desc *desc);
+	unsigned int	(*irq_unmask)(struct msi_desc *desc);
+	void		(*irq_write_msi_msg)(struct msi_desc *desc,
+						struct msi_msg *msg);
+};
+
+/**
+ * dev_ims_desc - Device specific interrupt message storage msi desc data
+ * @ims_priv_data:	Pointer to device private data
+ * @ims_index:		The index of the MSI descriptor
+ * @masked:		mask bits
+ */
+struct dev_ims_desc {
+	struct dev_ims_priv_data	*priv;
+	u16				ims_index;
+	u32				masked;
+};
+
+/**
  * fsl_mc_msi_desc - FSL-MC device specific msi descriptor data
  * @msi_index:		The index of the MSI descriptor
  */
@@ -90,6 +116,7 @@ enum msi_desc_tags {
  * @platform:	[platform]  Platform device specific msi descriptor data
  * @fsl_mc:	[fsl-mc]    FSL MC device specific msi descriptor data
  * @inta:	[INTA]	    TISCI based INTA specific msi descriptor data
+ * @dev_ims:	[dev_ims]   Device specific IMS msi descriptor data
  */
 struct msi_desc {
 	/* Shared device/bus type independent data */
@@ -136,6 +163,7 @@ struct msi_desc {
 		struct platform_msi_desc platform;
 		struct fsl_mc_msi_desc fsl_mc;
 		struct ti_sci_inta_msi_desc inta;
+		struct dev_ims_desc dev_ims;
 	};
 };
 
@@ -180,6 +208,7 @@ static inline void msi_desc_set_iommu_cookie(struct msi_desc *desc,
 struct pci_dev *msi_desc_to_pci_dev(struct msi_desc *desc);
 void *msi_desc_to_pci_sysdata(struct msi_desc *desc);
 void pci_write_msi_msg(unsigned int irq, struct msi_msg *msg);
+
 #else /* CONFIG_PCI_MSI */
 static inline void *msi_desc_to_pci_sysdata(struct msi_desc *desc)
 {
@@ -201,6 +230,10 @@ u32 __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag);
 void pci_msi_mask_irq(struct irq_data *data);
 void pci_msi_unmask_irq(struct irq_data *data);
 
+void dev_ims_unmask_irq(struct irq_data *data);
+void dev_ims_mask_irq(struct irq_data *data);
+void dev_ims_write_msg(struct irq_data *data, struct msi_msg *msg);
+
 /*
  * The arch hooks to setup up msi irqs. Those functions are
  * implemented as weak symbols so that they /can/ be overriden by
@@ -228,7 +261,7 @@ struct msi_controller {
 	void (*teardown_irq)(struct msi_controller *chip, unsigned int irq);
 };
 
-#ifdef CONFIG_GENERIC_MSI_IRQ_DOMAIN
+#if defined(CONFIG_GENERIC_MSI_IRQ_DOMAIN) || defined(CONFIG_MSI_IMS)
 
 #include <linux/irqhandler.h>
 #include <asm/msi.h>
-- 
2.7.4


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC V1 3/7] x86/ims: Add support for a new IMS irq domain
  2019-09-13  1:32 [RFC V1 0/7] Add support for a new IMS interrupt mechanism Megha Dey
  2019-09-13  1:32 ` [RFC V1 1/7] genirq/msi: Differentiate between various MSI based interrupts Megha Dey
  2019-09-13  1:32 ` [RFC V1 2/7] drivers/base: Introduce callbacks for IMS interrupt domain Megha Dey
@ 2019-09-13  1:32 ` Megha Dey
  2019-09-13 14:52   ` Jason Gunthorpe
  2019-09-13 22:07   ` Alex Williamson
  2019-09-13  1:32 ` [RFC V1 4/7] irq_remapping: New interfaces to support IMS irqdomain Megha Dey
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 21+ messages in thread
From: Megha Dey @ 2019-09-13  1:32 UTC (permalink / raw)
  To: linux-kernel, x86, linux-pci, maz, bhelgaas, rafael, gregkh,
	tglx, hpa, alex.williamson, jgg
  Cc: ashok.raj, megha.dey, jacob.jun.pan, Megha Dey, Jacob Pan, Sanjay Kumar

This patch adds support for the creation of a new IMS irq domain. It
creates a new irq_chip associated with the IMS domain and adds the
necessary domain operations to it.

Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
---
 arch/x86/include/asm/msi.h       |  4 ++
 arch/x86/kernel/apic/Makefile    |  1 +
 arch/x86/kernel/apic/ims.c       | 93 ++++++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/apic/msi.c       |  4 +-
 drivers/vfio/mdev/mdev_core.c    |  6 +++
 drivers/vfio/mdev/mdev_private.h |  1 -
 include/linux/mdev.h             |  2 +
 7 files changed, 108 insertions(+), 3 deletions(-)
 create mode 100644 arch/x86/kernel/apic/ims.c

diff --git a/arch/x86/include/asm/msi.h b/arch/x86/include/asm/msi.h
index 25ddd09..51f9d25 100644
--- a/arch/x86/include/asm/msi.h
+++ b/arch/x86/include/asm/msi.h
@@ -11,4 +11,8 @@ int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
 
 void pci_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc);
 
+struct msi_domain_info;
+
+irq_hw_number_t msi_get_hwirq(struct msi_domain_info *info,
+						msi_alloc_info_t *arg);
 #endif /* _ASM_X86_MSI_H */
diff --git a/arch/x86/kernel/apic/Makefile b/arch/x86/kernel/apic/Makefile
index a6fcaf16..75a2270 100644
--- a/arch/x86/kernel/apic/Makefile
+++ b/arch/x86/kernel/apic/Makefile
@@ -12,6 +12,7 @@ obj-y				+= hw_nmi.o
 
 obj-$(CONFIG_X86_IO_APIC)	+= io_apic.o
 obj-$(CONFIG_PCI_MSI)		+= msi.o
+obj-$(CONFIG_MSI_IMS)		+= ims.o
 obj-$(CONFIG_SMP)		+= ipi.o
 
 ifeq ($(CONFIG_X86_64),y)
diff --git a/arch/x86/kernel/apic/ims.c b/arch/x86/kernel/apic/ims.c
new file mode 100644
index 0000000..d9808a5
--- /dev/null
+++ b/arch/x86/kernel/apic/ims.c
@@ -0,0 +1,93 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright © 2019 Intel Corporation.
+ *
+ * Author: Megha Dey <megha.dey@intel.com>
+ */
+
+#include <linux/dmar.h>
+#include <linux/irq.h>
+#include <linux/mdev.h>
+#include <linux/pci.h>
+
+/*
+ * Determine if a dev is mdev or not. Return NULL if not mdev device.
+ * Return mdev's parent dev if success.
+ */
+static inline struct device *mdev_to_parent(struct device *dev)
+{
+	struct device *ret = NULL;
+	struct device *(*fn)(struct device *dev);
+	struct bus_type *bus = symbol_get(mdev_bus_type);
+
+	if (bus && dev->bus == bus) {
+		fn = symbol_get(mdev_dev_to_parent_dev);
+		ret = fn(dev);
+		symbol_put(mdev_dev_to_parent_dev);
+		symbol_put(mdev_bus_type);
+	}
+
+	return ret;
+}
+
+static struct pci_dev *ims_get_pci_dev(struct device *dev)
+{
+	struct pci_dev *pdev;
+
+	if (dev_is_mdev(dev)) {
+		struct device *parent = mdev_to_parent(dev);
+
+		pdev = to_pci_dev(parent);
+	} else {
+		pdev = to_pci_dev(dev);
+	}
+
+	return pdev;
+}
+
+int dev_ims_prepare(struct irq_domain *domain, struct device *dev, int nvec,
+		    msi_alloc_info_t *arg)
+{
+	struct pci_dev *pdev = ims_get_pci_dev(dev);
+
+	init_irq_alloc_info(arg, NULL);
+	arg->msi_dev = pdev;
+	arg->type = X86_IRQ_ALLOC_TYPE_MSIX;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(dev_ims_prepare);
+
+#ifdef CONFIG_IRQ_REMAP
+
+static struct msi_domain_ops dev_ims_domain_ops = {
+	.get_hwirq	= msi_get_hwirq,
+	.msi_prepare	= dev_ims_prepare,
+};
+
+static struct irq_chip dev_ims_ir_controller = {
+	.name			= "IR-DEV-IMS",
+	.irq_unmask		= dev_ims_unmask_irq,
+	.irq_mask		= dev_ims_mask_irq,
+	.irq_ack		= irq_chip_ack_parent,
+	.irq_retrigger		= irq_chip_retrigger_hierarchy,
+	.irq_set_vcpu_affinity	= irq_chip_set_vcpu_affinity_parent,
+	.flags			= IRQCHIP_SKIP_SET_WAKE,
+	.irq_write_msi_msg	= dev_ims_write_msg,
+};
+
+static struct msi_domain_info ims_ir_domain_info = {
+	.flags		= MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
+			  MSI_FLAG_MULTI_PCI_MSI | MSI_FLAG_PCI_MSIX,
+	.ops		= &dev_ims_domain_ops,
+	.chip		= &dev_ims_ir_controller,
+	.handler	= handle_edge_irq,
+	.handler_name	= "edge",
+};
+
+struct irq_domain *arch_create_ims_irq_domain(struct irq_domain *parent)
+{
+	return pci_msi_create_irq_domain(NULL, &ims_ir_domain_info, parent);
+}
+
+#endif
diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 435bcda..65da813 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -84,7 +84,7 @@ void native_teardown_msi_irq(unsigned int irq)
 	irq_domain_free_irqs(irq, 1);
 }
 
-static irq_hw_number_t pci_msi_get_hwirq(struct msi_domain_info *info,
+irq_hw_number_t msi_get_hwirq(struct msi_domain_info *info,
 					 msi_alloc_info_t *arg)
 {
 	return arg->msi_hwirq;
@@ -116,7 +116,7 @@ void pci_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc)
 EXPORT_SYMBOL_GPL(pci_msi_set_desc);
 
 static struct msi_domain_ops pci_msi_domain_ops = {
-	.get_hwirq	= pci_msi_get_hwirq,
+	.get_hwirq	= msi_get_hwirq,
 	.msi_prepare	= pci_msi_prepare,
 	.set_desc	= pci_msi_set_desc,
 };
diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
index b558d4c..cecc6a6 100644
--- a/drivers/vfio/mdev/mdev_core.c
+++ b/drivers/vfio/mdev/mdev_core.c
@@ -33,6 +33,12 @@ struct device *mdev_parent_dev(struct mdev_device *mdev)
 }
 EXPORT_SYMBOL(mdev_parent_dev);
 
+struct device *mdev_dev_to_parent_dev(struct device *dev)
+{
+	return to_mdev_device(dev)->parent->dev;
+}
+EXPORT_SYMBOL(mdev_dev_to_parent_dev);
+
 void *mdev_get_drvdata(struct mdev_device *mdev)
 {
 	return mdev->driver_data;
diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h
index 7d92295..c21f130 100644
--- a/drivers/vfio/mdev/mdev_private.h
+++ b/drivers/vfio/mdev/mdev_private.h
@@ -36,7 +36,6 @@ struct mdev_device {
 };
 
 #define to_mdev_device(dev)	container_of(dev, struct mdev_device, dev)
-#define dev_is_mdev(d)		((d)->bus == &mdev_bus_type)
 
 struct mdev_type {
 	struct kobject kobj;
diff --git a/include/linux/mdev.h b/include/linux/mdev.h
index 0ce30ca..9dcbffe 100644
--- a/include/linux/mdev.h
+++ b/include/linux/mdev.h
@@ -144,5 +144,7 @@ void mdev_unregister_driver(struct mdev_driver *drv);
 struct device *mdev_parent_dev(struct mdev_device *mdev);
 struct device *mdev_dev(struct mdev_device *mdev);
 struct mdev_device *mdev_from_dev(struct device *dev);
+struct device *mdev_dev_to_parent_dev(struct device *dev);
 
+#define dev_is_mdev(d)          ((d)->bus == symbol_get(mdev_bus_type))
 #endif /* MDEV_H */
-- 
2.7.4


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC V1 4/7] irq_remapping: New interfaces to support IMS irqdomain
  2019-09-13  1:32 [RFC V1 0/7] Add support for a new IMS interrupt mechanism Megha Dey
                   ` (2 preceding siblings ...)
  2019-09-13  1:32 ` [RFC V1 3/7] x86/ims: Add support for a new IMS irq domain Megha Dey
@ 2019-09-13  1:32 ` Megha Dey
  2019-09-13  1:32 ` [RFC V1 5/7] x86/ims: Introduce x86_ims_ops Megha Dey
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Megha Dey @ 2019-09-13  1:32 UTC (permalink / raw)
  To: linux-kernel, x86, linux-pci, maz, bhelgaas, rafael, gregkh,
	tglx, hpa, alex.williamson, jgg
  Cc: ashok.raj, megha.dey, jacob.jun.pan, Megha Dey, Jacob Pan, Sanjay Kumar

Introduce new interfaces for interrupt remapping drivers to support
IMS irqdomains:

irq_remapping_get_ims_irq_domain(): get the IMS irqdomain for an IRQ
allocation. We must build one IMS irqdomain for each interrupt remapping
unit. The driver calls this interface to get the IMS irqdomain associated
with an IR irqdomain which manages the devices.

Architecture specific hooks:
arch_create_ims_irq_domain(): create an IMS irqdomain associated with the
interrupt remapping unit.

We also add following callback into struct irq_remap_ops:
struct irq_domain *(*get_ims_irq_domain)(struct irq_alloc_info *);

Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
---
 arch/x86/include/asm/irq_remapping.h | 13 +++++++++++++
 drivers/iommu/intel_irq_remapping.c  | 30 ++++++++++++++++++++++++++++++
 drivers/iommu/irq_remapping.c        |  9 +++++++++
 drivers/iommu/irq_remapping.h        |  3 +++
 include/linux/intel-iommu.h          |  1 +
 5 files changed, 56 insertions(+)

diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
index 4bc985f..a735507 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -48,11 +48,18 @@ extern struct irq_domain *
 irq_remapping_get_ir_irq_domain(struct irq_alloc_info *info);
 extern struct irq_domain *
 irq_remapping_get_irq_domain(struct irq_alloc_info *info);
+extern struct irq_domain *
+irq_remapping_get_ims_irq_domain(struct irq_alloc_info *info);
 
 /* Create PCI MSI/MSIx irqdomain, use @parent as the parent irqdomain. */
 extern struct irq_domain *
 arch_create_remap_msi_irq_domain(struct irq_domain *par, const char *n, int id);
 
+/* Create IMS irqdomain, use @parent as the parent irqdomain. */
+#ifdef CONFIG_MSI_IMS
+extern struct irq_domain *arch_create_ims_irq_domain(struct irq_domain *parent);
+#endif
+
 /* Get parent irqdomain for interrupt remapping irqdomain */
 static inline struct irq_domain *arch_get_ir_parent_domain(void)
 {
@@ -85,5 +92,11 @@ irq_remapping_get_irq_domain(struct irq_alloc_info *info)
 	return NULL;
 }
 
+static inline struct irq_domain *
+irq_remapping_get_ims_irq_domain(struct irq_alloc_info *info)
+{
+	return NULL;
+}
+
 #endif /* CONFIG_IRQ_REMAP */
 #endif /* __X86_IRQ_REMAPPING_H */
diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
index 4786ca0..3c0c0cb 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -573,6 +573,10 @@ static int intel_setup_irq_remapping(struct intel_iommu *iommu)
 						 "INTEL-IR-MSI",
 						 iommu->seq_id);
 
+#ifdef CONFIG_MSI_IMS
+	iommu->ir_ims_domain = arch_create_ims_irq_domain(iommu->ir_domain);
+#endif
+
 	ir_table->base = page_address(pages);
 	ir_table->bitmap = bitmap;
 	iommu->ir_table = ir_table;
@@ -633,6 +637,10 @@ static void intel_teardown_irq_remapping(struct intel_iommu *iommu)
 			irq_domain_remove(iommu->ir_msi_domain);
 			iommu->ir_msi_domain = NULL;
 		}
+		if (iommu->ir_ims_domain) {
+			irq_domain_remove(iommu->ir_ims_domain);
+			iommu->ir_ims_domain = NULL;
+		}
 		if (iommu->ir_domain) {
 			irq_domain_remove(iommu->ir_domain);
 			iommu->ir_domain = NULL;
@@ -1139,6 +1147,27 @@ static struct irq_domain *intel_get_irq_domain(struct irq_alloc_info *info)
 	return NULL;
 }
 
+static struct irq_domain *intel_get_ims_irq_domain(struct irq_alloc_info *info)
+{
+	struct intel_iommu *iommu;
+
+	if (!info)
+		return NULL;
+
+	switch (info->type) {
+	case X86_IRQ_ALLOC_TYPE_MSI:
+	case X86_IRQ_ALLOC_TYPE_MSIX:
+		iommu = map_dev_to_ir(info->msi_dev);
+		if (iommu)
+			return iommu->ir_ims_domain;
+		break;
+	default:
+		break;
+	}
+
+	return NULL;
+}
+
 struct irq_remap_ops intel_irq_remap_ops = {
 	.prepare		= intel_prepare_irq_remapping,
 	.enable			= intel_enable_irq_remapping,
@@ -1147,6 +1176,7 @@ struct irq_remap_ops intel_irq_remap_ops = {
 	.enable_faulting	= enable_drhd_fault_handling,
 	.get_ir_irq_domain	= intel_get_ir_irq_domain,
 	.get_irq_domain		= intel_get_irq_domain,
+	.get_ims_irq_domain     = intel_get_ims_irq_domain,
 };
 
 static void intel_ir_reconfigure_irte(struct irq_data *irqd, bool force)
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index 83f36f6..c4352fc 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -193,3 +193,12 @@ irq_remapping_get_irq_domain(struct irq_alloc_info *info)
 
 	return remap_ops->get_irq_domain(info);
 }
+
+struct irq_domain *
+irq_remapping_get_ims_irq_domain(struct irq_alloc_info *info)
+{
+	if (!remap_ops || !remap_ops->get_ims_irq_domain)
+		return NULL;
+
+	return remap_ops->get_ims_irq_domain(info);
+}
diff --git a/drivers/iommu/irq_remapping.h b/drivers/iommu/irq_remapping.h
index 6a190d5..8e198ad 100644
--- a/drivers/iommu/irq_remapping.h
+++ b/drivers/iommu/irq_remapping.h
@@ -48,6 +48,9 @@ struct irq_remap_ops {
 
 	/* Get the MSI irqdomain associated with the IOMMU device */
 	struct irq_domain *(*get_irq_domain)(struct irq_alloc_info *);
+
+	/* Get the IMS irqdomain associated with the IOMMU device */
+	struct irq_domain *(*get_ims_irq_domain)(struct irq_alloc_info *);
 };
 
 extern struct irq_remap_ops intel_irq_remap_ops;
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 4fc6454f..26be769 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -546,6 +546,7 @@ struct intel_iommu {
 	struct ir_table *ir_table;	/* Interrupt remapping info */
 	struct irq_domain *ir_domain;
 	struct irq_domain *ir_msi_domain;
+	struct irq_domain *ir_ims_domain;
 #endif
 	struct iommu_device iommu;  /* IOMMU core code handle */
 	int		node;
-- 
2.7.4


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC V1 5/7] x86/ims: Introduce x86_ims_ops
  2019-09-13  1:32 [RFC V1 0/7] Add support for a new IMS interrupt mechanism Megha Dey
                   ` (3 preceding siblings ...)
  2019-09-13  1:32 ` [RFC V1 4/7] irq_remapping: New interfaces to support IMS irqdomain Megha Dey
@ 2019-09-13  1:32 ` Megha Dey
  2019-09-13 14:54   ` Jason Gunthorpe
  2019-09-13  1:32 ` [RFC V1 6/7] ims-msi: Add APIs to allocate/free IMS interrupts Megha Dey
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Megha Dey @ 2019-09-13  1:32 UTC (permalink / raw)
  To: linux-kernel, x86, linux-pci, maz, bhelgaas, rafael, gregkh,
	tglx, hpa, alex.williamson, jgg
  Cc: ashok.raj, megha.dey, jacob.jun.pan, Megha Dey, Jacob Pan, Sanjay Kumar

This patch introduces an x86 specific indirect mechanism to setup the
interrupt message storage. The IMS specific functions (setup, teardown,
restore) become function pointers in an x86_ims_ops struct, that
defaults to their implementations in ims.c and ims-msi.c.

Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
---
 arch/x86/include/asm/pci.h      |  4 ++++
 arch/x86/include/asm/x86_init.h | 10 ++++++++++
 arch/x86/kernel/apic/ims.c      | 18 ++++++++++++++++++
 arch/x86/kernel/x86_init.c      | 23 +++++++++++++++++++++++
 drivers/base/ims-msi.c          | 34 ++++++++++++++++++++++++++++++++++
 include/linux/msi.h             |  6 ++++++
 6 files changed, 95 insertions(+)

diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index e662f98..2ef513f 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -114,6 +114,10 @@ struct msi_desc;
 int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type);
 void native_teardown_msi_irq(unsigned int irq);
 void native_restore_msi_irqs(struct pci_dev *dev);
+#ifdef CONFIG_MSI_IMS
+int native_setup_ims_irqs(struct device *dev, int nvec);
+#endif
+
 #else
 #define native_setup_msi_irqs		NULL
 #define native_teardown_msi_irq		NULL
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index ac09341..9c2cbbb 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -287,6 +287,15 @@ struct x86_msi_ops {
 	void (*restore_msi_irqs)(struct pci_dev *dev);
 };
 
+struct device;
+
+struct x86_ims_ops {
+	int (*setup_ims_irqs)(struct device *dev, int nvec);
+	void (*teardown_ims_irq)(unsigned int irq);
+	void (*teardown_ims_irqs)(struct device *dev);
+	void (*restore_ims_irqs)(struct device *dev);
+};
+
 struct x86_apic_ops {
 	unsigned int	(*io_apic_read)   (unsigned int apic, unsigned int reg);
 	void		(*restore)(void);
@@ -297,6 +306,7 @@ extern struct x86_cpuinit_ops x86_cpuinit;
 extern struct x86_platform_ops x86_platform;
 extern struct x86_msi_ops x86_msi;
 extern struct x86_apic_ops x86_apic_ops;
+extern struct x86_ims_ops x86_ims;
 
 extern void x86_early_init_platform_quirks(void);
 extern void x86_init_noop(void);
diff --git a/arch/x86/kernel/apic/ims.c b/arch/x86/kernel/apic/ims.c
index d9808a5..a539666 100644
--- a/arch/x86/kernel/apic/ims.c
+++ b/arch/x86/kernel/apic/ims.c
@@ -9,6 +9,7 @@
 #include <linux/irq.h>
 #include <linux/mdev.h>
 #include <linux/pci.h>
+#include <asm/irq_remapping.h>
 
 /*
  * Determine if a dev is mdev or not. Return NULL if not mdev device.
@@ -45,6 +46,23 @@ static struct pci_dev *ims_get_pci_dev(struct device *dev)
 	return pdev;
 }
 
+int native_setup_ims_irqs(struct device *dev, int nvec)
+{
+	struct irq_domain *domain;
+	struct irq_alloc_info info;
+	struct pci_dev *pdev = ims_get_pci_dev(dev);
+
+	init_irq_alloc_info(&info, NULL);
+	info.type = X86_IRQ_ALLOC_TYPE_MSIX;
+	info.msi_dev = pdev;
+
+	domain = irq_remapping_get_ims_irq_domain(&info);
+	if (!domain)
+		return -ENOSYS;
+
+	return msi_domain_alloc_irqs(domain, dev, nvec);
+}
+
 int dev_ims_prepare(struct irq_domain *domain, struct device *dev, int nvec,
 		    msi_alloc_info_t *arg)
 {
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index 1bef687..3ce42d4 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -153,6 +153,29 @@ void arch_restore_msi_irqs(struct pci_dev *dev)
 }
 #endif
 
+#if defined(CONFIG_MSI_IMS)
+struct x86_ims_ops x86_ims __ro_after_init = {
+	.setup_ims_irqs		= native_setup_ims_irqs,
+	.teardown_ims_irqs	= dev_ims_teardown_irqs,
+	.restore_ims_irqs	= dev_ims_restore_irqs,
+};
+
+int arch_setup_ims_irqs(struct device *dev, int nvec)
+{
+	return x86_ims.setup_ims_irqs(dev, nvec);
+}
+
+void arch_teardown_ims_irqs(struct device *dev)
+{
+	x86_ims.teardown_ims_irqs(dev);
+}
+
+void arch_restore_ims_irqs(struct device *dev)
+{
+	x86_ims.restore_ims_irqs(dev);
+}
+#endif
+
 struct x86_apic_ops x86_apic_ops __ro_after_init = {
 	.io_apic_read	= native_io_apic_read,
 	.restore	= native_restore_boot_irq_mode,
diff --git a/drivers/base/ims-msi.c b/drivers/base/ims-msi.c
index 68dc10f..df28ee2 100644
--- a/drivers/base/ims-msi.c
+++ b/drivers/base/ims-msi.c
@@ -92,3 +92,37 @@ void dev_ims_write_msg(struct irq_data *data, struct msi_msg *msg)
 	__dev_write_ims_msg(desc, msg);
 }
 EXPORT_SYMBOL_GPL(dev_ims_write_msg);
+
+void dev_ims_teardown_irqs(struct device *dev)
+{
+	struct msi_desc *entry;
+
+	for_each_msi_entry(entry, dev)
+		if (entry->irq && entry->tag == IRQ_MSI_TAG_IMS)
+			arch_teardown_msi_irq(entry->irq);
+}
+
+static void dev_ims_restore_irq(struct device *dev, int irq)
+{
+	struct msi_desc *entry = NULL;
+	struct dev_ims_ops *ops;
+
+	for_each_msi_entry(entry, dev)
+		if (irq == entry->irq && entry->tag == IRQ_MSI_TAG_IMS)
+			break;
+
+	if (entry) {
+		ops = entry->dev_ims.priv->ims_ops;
+		if (ops && ops->irq_write_msi_msg)
+			ops->irq_write_msi_msg(entry, &entry->msg);
+	}
+}
+
+void dev_ims_restore_irqs(struct device *dev)
+{
+	struct msi_desc *entry;
+
+	for_each_msi_entry(entry, dev)
+		if (entry->tag == IRQ_MSI_TAG_IMS)
+			dev_ims_restore_irq(dev, entry->irq);
+}
diff --git a/include/linux/msi.h b/include/linux/msi.h
index 246285a..37070bf 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -233,6 +233,8 @@ void pci_msi_unmask_irq(struct irq_data *data);
 void dev_ims_unmask_irq(struct irq_data *data);
 void dev_ims_mask_irq(struct irq_data *data);
 void dev_ims_write_msg(struct irq_data *data, struct msi_msg *msg);
+void dev_ims_teardown_irqs(struct device *dev);
+void dev_ims_restore_irqs(struct device *dev);
 
 /*
  * The arch hooks to setup up msi irqs. Those functions are
@@ -245,6 +247,10 @@ int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type);
 void arch_teardown_msi_irqs(struct pci_dev *dev);
 void arch_restore_msi_irqs(struct pci_dev *dev);
 
+int arch_setup_ims_irqs(struct device *dev, int nvec);
+void arch_teardown_ims_irqs(struct device *dev);
+void arch_restore_ims_irqs(struct device *dev);
+
 void default_teardown_msi_irqs(struct pci_dev *dev);
 void default_restore_msi_irqs(struct pci_dev *dev);
 
-- 
2.7.4


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC V1 6/7] ims-msi: Add APIs to allocate/free IMS interrupts
  2019-09-13  1:32 [RFC V1 0/7] Add support for a new IMS interrupt mechanism Megha Dey
                   ` (4 preceding siblings ...)
  2019-09-13  1:32 ` [RFC V1 5/7] x86/ims: Introduce x86_ims_ops Megha Dey
@ 2019-09-13  1:32 ` Megha Dey
  2019-09-26 21:44   ` Bjorn Helgaas
  2019-09-13  1:32 ` [RFC V1 7/7] ims: Add the set_desc callback Megha Dey
  2019-09-13 19:50 ` [RFC V1 0/7] Add support for a new IMS interrupt mechanism Jason Gunthorpe
  7 siblings, 1 reply; 21+ messages in thread
From: Megha Dey @ 2019-09-13  1:32 UTC (permalink / raw)
  To: linux-kernel, x86, linux-pci, maz, bhelgaas, rafael, gregkh,
	tglx, hpa, alex.williamson, jgg
  Cc: ashok.raj, megha.dey, jacob.jun.pan, Megha Dey, Jacob Pan, Sanjay Kumar

This patch introduces APIs to allocate and free IMS interrupts.

Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
---
 drivers/base/ims-msi.c | 216 +++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/msi.h    |   2 +
 2 files changed, 218 insertions(+)

diff --git a/drivers/base/ims-msi.c b/drivers/base/ims-msi.c
index df28ee2..3e579c9 100644
--- a/drivers/base/ims-msi.c
+++ b/drivers/base/ims-msi.c
@@ -7,9 +7,12 @@
 
 #include <linux/device.h>
 #include <linux/export.h>
+#include <linux/interrupt.h>
 #include <linux/irq.h>
 #include <linux/msi.h>
 
+#define DEVIMS_ID_SHIFT        21
+
 struct dev_ims_priv_data {
 	struct device		*dev;
 	msi_alloc_info_t	arg;
@@ -17,6 +20,8 @@ struct dev_ims_priv_data {
 	struct dev_ims_ops	*ims_ops;
 };
 
+static DEFINE_IDA(dev_ims_devid_ida);
+
 u32 __dev_ims_desc_mask_irq(struct msi_desc *desc, u32 flag)
 {
 	u32 mask_bits = desc->dev_ims.masked;
@@ -126,3 +131,214 @@ void dev_ims_restore_irqs(struct device *dev)
 		if (entry->tag == IRQ_MSI_TAG_IMS)
 			dev_ims_restore_irq(dev, entry->irq);
 }
+
+static void dev_ims_free_descs(struct device *dev)
+{
+	struct msi_desc *desc, *tmp;
+
+	for_each_msi_entry(desc, dev)
+		if (desc->irq && desc->tag == IRQ_MSI_TAG_IMS)
+			BUG_ON(irq_has_action(desc->irq));
+
+	dev_ims_teardown_irqs(dev);
+
+	list_for_each_entry_safe(desc, tmp, dev_to_msi_list(dev), list) {
+		if (desc->tag == IRQ_MSI_TAG_IMS) {
+			list_del(&desc->list);
+			free_msi_entry(desc);
+		}
+	}
+}
+
+static int dev_ims_setup_msi_irqs(struct device *dev, int nvec)
+{
+	struct irq_domain *domain;
+
+	domain = dev_get_msi_domain(dev);
+	if (domain && irq_domain_is_hierarchy(domain))
+		return msi_domain_alloc_irqs(domain, dev, nvec);
+
+	return arch_setup_ims_irqs(dev, nvec);
+}
+
+static struct dev_ims_priv_data *
+dev_ims_alloc_priv_data(struct device *dev, unsigned int nvec,
+			struct dev_ims_ops *ops)
+{
+	struct dev_ims_priv_data *datap;
+	int ret;
+
+	/*
+	 * Currently there is no limit to the number of IRQs a device can
+	 * allocate.
+	 */
+	if (!nvec)
+		return ERR_PTR(-EINVAL);
+
+	datap = kzalloc(sizeof(*datap), GFP_KERNEL);
+	if (!datap)
+		return ERR_PTR(-ENOMEM);
+
+	ret = ida_simple_get(&dev_ims_devid_ida,
+				0, 1 << DEVIMS_ID_SHIFT, GFP_KERNEL);
+
+	if (ret < 0) {
+		kfree(datap);
+		return ERR_PTR(ret);
+	}
+
+	datap->devid = ret;
+	datap->ims_ops = ops;
+	datap->dev = dev;
+
+	return datap;
+}
+
+static int dev_ims_alloc_descs(struct device *dev,
+			       int nvec, struct dev_ims_priv_data *data,
+			       struct irq_affinity *affd)
+{
+	struct irq_affinity_desc *curmsk, *masks = NULL;
+	struct msi_desc *desc;
+	int i, base = 0;
+
+	if (!list_empty(dev_to_msi_list(dev))) {
+		desc = list_last_entry(dev_to_msi_list(dev),
+					struct msi_desc, list);
+		base = desc->dev_ims.ims_index + 1;
+	}
+
+	if (affd) {
+		masks = irq_create_affinity_masks(nvec, affd);
+		if (!masks)
+			dev_err(dev, "Unable to allocate affinity masks, ignoring\n");
+	}
+
+	for (i = 0, curmsk = masks; i < nvec; i++) {
+		desc = alloc_msi_entry(dev, 1, NULL);
+		if (!desc)
+			break;
+
+		desc->dev_ims.priv = data;
+		desc->tag = IRQ_MSI_TAG_IMS;
+		desc->dev_ims.ims_index = base + i;
+
+		list_add_tail(&desc->list, dev_to_msi_list(dev));
+
+		if (masks)
+			curmsk++;
+	}
+
+	kfree(masks);
+
+	if (i != nvec) {
+		/* Clean up the mess */
+		dev_ims_free_descs(dev);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static void dev_ims_free_priv_data(struct dev_ims_priv_data *data)
+{
+	ida_simple_remove(&dev_ims_devid_ida, data->devid);
+	kfree(data);
+}
+
+/**
+ * dev_ims_enable_irqs - Allocate IMS interrupts for @dev
+ * @dev:		The device for which to allocate interrupts
+ * @nvec:		The number of interrupts to allocate
+ * @ops:		IMS device operations
+ * @affd:		optional description of the affinity requirements
+ *
+ * Returns:
+ * Zero for success, or an error code in case of failure
+ */
+int dev_ims_enable_irqs(struct device *dev, unsigned int nvec,
+			struct dev_ims_ops *ops,
+			struct irq_affinity *affd)
+{
+	struct dev_ims_priv_data *priv_data;
+	int err;
+
+	priv_data = dev_ims_alloc_priv_data(dev, nvec, ops);
+	if (IS_ERR(priv_data))
+		return PTR_ERR(priv_data);
+
+	err = dev_ims_alloc_descs(dev, nvec, priv_data, affd);
+	if (err)
+		goto out_free_priv_data;
+
+	err = dev_ims_setup_msi_irqs(dev, nvec);
+	if (err)
+		goto out_free_desc;
+
+	return 0;
+
+out_free_desc:
+	dev_ims_free_descs(dev);
+out_free_priv_data:
+	dev_ims_free_priv_data(priv_data);
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(dev_ims_enable_irqs);
+
+int __dev_ims_alloc_irqs(struct device *dev, int nvec,
+			 struct dev_ims_ops *ops,
+			 struct irq_affinity *affd)
+{
+	int nvec1;
+	int rc;
+
+	for (;;) {
+		if (affd) {
+			nvec1 = irq_calc_affinity_vectors(nvec, nvec, affd);
+			if (nvec < nvec1)
+				return -ENOSPC;
+		}
+
+		rc = dev_ims_enable_irqs(dev, nvec, ops, affd);
+		if (rc == 0)
+			return nvec;
+
+		if (rc < 0)
+			return rc;
+		if (rc < nvec)
+			return -ENOSPC;
+
+		nvec = rc;
+	}
+}
+
+/**
+ * dev_ims_alloc_irqs - Alloc IMS interrupts for @dev
+ * @dev:	The device for which to allocate interrupts
+ * @nvec:	The number of interrupts to allocate
+ * @ops:	IMS device operations
+ */
+int dev_ims_alloc_irqs(struct device *dev, int nvec, struct dev_ims_ops *ops)
+{
+	return __dev_ims_alloc_irqs(dev, nvec, ops, NULL);
+}
+EXPORT_SYMBOL_GPL(dev_ims_alloc_irqs);
+
+/**
+ * dev_ims_domain_free_irqs - Free IMS interrupts for @dev
+ * @dev:	The device for which to free interrupts
+ */
+void dev_ims_free_irqs(struct device *dev)
+{
+	struct msi_desc *desc;
+
+	for_each_msi_entry(desc, dev)
+		if (desc->tag == IRQ_MSI_TAG_IMS) {
+			dev_ims_free_priv_data(desc->dev_ims.priv);
+			break;
+	}
+
+	dev_ims_free_descs(dev);
+}
+EXPORT_SYMBOL_GPL(dev_ims_free_irqs);
diff --git a/include/linux/msi.h b/include/linux/msi.h
index 37070bf..4543bbf 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -235,6 +235,8 @@ void dev_ims_mask_irq(struct irq_data *data);
 void dev_ims_write_msg(struct irq_data *data, struct msi_msg *msg);
 void dev_ims_teardown_irqs(struct device *dev);
 void dev_ims_restore_irqs(struct device *dev);
+int dev_ims_alloc_irqs(struct device *dev, int nvec, struct dev_ims_ops *ops);
+void dev_ims_free_irqs(struct device *dev);
 
 /*
  * The arch hooks to setup up msi irqs. Those functions are
-- 
2.7.4


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC V1 7/7] ims: Add the set_desc callback
  2019-09-13  1:32 [RFC V1 0/7] Add support for a new IMS interrupt mechanism Megha Dey
                   ` (5 preceding siblings ...)
  2019-09-13  1:32 ` [RFC V1 6/7] ims-msi: Add APIs to allocate/free IMS interrupts Megha Dey
@ 2019-09-13  1:32 ` Megha Dey
  2019-09-26 21:47   ` Bjorn Helgaas
  2019-09-13 19:50 ` [RFC V1 0/7] Add support for a new IMS interrupt mechanism Jason Gunthorpe
  7 siblings, 1 reply; 21+ messages in thread
From: Megha Dey @ 2019-09-13  1:32 UTC (permalink / raw)
  To: linux-kernel, x86, linux-pci, maz, bhelgaas, rafael, gregkh,
	tglx, hpa, alex.williamson, jgg
  Cc: ashok.raj, megha.dey, jacob.jun.pan, Megha Dey, Jacob Pan, Sanjay Kumar

Add the set_desc callback to the ims domain ops.

The set_desc callback is used to find a unique hwirq number from a given
domain.

Each mdev can have a maximum of 2048 IMS interrupts.

Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
---
 arch/x86/kernel/apic/ims.c | 7 +++++++
 drivers/base/ims-msi.c     | 9 +++++++++
 include/linux/msi.h        | 1 +
 3 files changed, 17 insertions(+)

diff --git a/arch/x86/kernel/apic/ims.c b/arch/x86/kernel/apic/ims.c
index a539666..7e36571 100644
--- a/arch/x86/kernel/apic/ims.c
+++ b/arch/x86/kernel/apic/ims.c
@@ -76,11 +76,18 @@ int dev_ims_prepare(struct irq_domain *domain, struct device *dev, int nvec,
 }
 EXPORT_SYMBOL_GPL(dev_ims_prepare);
 
+void dev_ims_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc)
+{
+	arg->msi_hwirq = dev_ims_calc_hwirq(desc);
+}
+EXPORT_SYMBOL_GPL(dev_ims_set_desc);
+
 #ifdef CONFIG_IRQ_REMAP
 
 static struct msi_domain_ops dev_ims_domain_ops = {
 	.get_hwirq	= msi_get_hwirq,
 	.msi_prepare	= dev_ims_prepare,
+	.set_desc       = dev_ims_set_desc,
 };
 
 static struct irq_chip dev_ims_ir_controller = {
diff --git a/drivers/base/ims-msi.c b/drivers/base/ims-msi.c
index 3e579c9..48f3d24 100644
--- a/drivers/base/ims-msi.c
+++ b/drivers/base/ims-msi.c
@@ -22,6 +22,15 @@ struct dev_ims_priv_data {
 
 static DEFINE_IDA(dev_ims_devid_ida);
 
+irq_hw_number_t dev_ims_calc_hwirq(struct msi_desc *desc)
+{
+	u32 devid;
+
+	devid = desc->dev_ims.priv->devid;
+
+	return (devid << (32 - DEVIMS_ID_SHIFT)) | desc->dev_ims.ims_index;
+}
+
 u32 __dev_ims_desc_mask_irq(struct msi_desc *desc, u32 flag)
 {
 	u32 mask_bits = desc->dev_ims.masked;
diff --git a/include/linux/msi.h b/include/linux/msi.h
index 4543bbf..fe4678e 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -237,6 +237,7 @@ void dev_ims_teardown_irqs(struct device *dev);
 void dev_ims_restore_irqs(struct device *dev);
 int dev_ims_alloc_irqs(struct device *dev, int nvec, struct dev_ims_ops *ops);
 void dev_ims_free_irqs(struct device *dev);
+irq_hw_number_t dev_ims_calc_hwirq(struct msi_desc *desc);
 
 /*
  * The arch hooks to setup up msi irqs. Those functions are
-- 
2.7.4


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V1 2/7] drivers/base: Introduce callbacks for IMS interrupt domain
  2019-09-13  1:32 ` [RFC V1 2/7] drivers/base: Introduce callbacks for IMS interrupt domain Megha Dey
@ 2019-09-13  4:39   ` Greg KH
  2019-09-13 14:50   ` Jason Gunthorpe
  1 sibling, 0 replies; 21+ messages in thread
From: Greg KH @ 2019-09-13  4:39 UTC (permalink / raw)
  To: Megha Dey
  Cc: linux-kernel, x86, linux-pci, maz, bhelgaas, rafael, tglx, hpa,
	alex.williamson, jgg, ashok.raj, megha.dey, jacob.jun.pan,
	Jacob Pan, Sanjay Kumar

On Thu, Sep 12, 2019 at 06:32:03PM -0700, Megha Dey wrote:
> This patch serves as a preparatory patch to introduce a new IMS
> (Interrupt Message Store) domain. It consists of APIs which would
> be used as callbacks to the IRQ chip associated with the IMS domain.
> 
> The APIs introduced in this patch are:
> dev_ims_mask_irq - Generic irq chip callback to mask IMS interrupts
> dev_ims_unmask_irq - Generic irq chip callback to unmask IMS interrupts
> dev_ims_domain_write_msg - Helper to write MSI message to Device IMS
> 
> It also introduces IMS specific structures namely:
> dev_ims_ops - Callbacks for IMS domain ops
> dev_ims_desc - Device specific IMS msi descriptor data
> dev_ims_priv_data - Internal data structure containing a unique devid
> and a pointer to the IMS domain ops
> 
> Lastly, it adds a new config option MSI_IMS which must be enabled by
> any driver who would want to use the IMS infrastructure.
> 
> Since IMS is not PCI compliant (like platform-msi), most of the code is
> similar to platform-msi.c.
> 
> TODO: Conclude if ims-msi.c and platform-msi.c can be merged.
> 
> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
> Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
> ---
>  drivers/base/Kconfig   |  7 ++++
>  drivers/base/Makefile  |  1 +
>  drivers/base/ims-msi.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/msi.h    | 35 ++++++++++++++++++-
>  4 files changed, 136 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/base/ims-msi.c
> 
> diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
> index dc40449..038fabd 100644
> --- a/drivers/base/Kconfig
> +++ b/drivers/base/Kconfig
> @@ -206,3 +206,10 @@ config GENERIC_ARCH_TOPOLOGY
>  	  runtime.
>  
>  endmenu
> +
> +config MSI_IMS
> +	bool "Device Specific Interrupt Message Storage (IMS)"
> +	select GENERIC_MSI_IRQ
> +	help
> +	  This allows device drivers to enable device specific
> +	  interrupt message storage (IMS) besides standard MSI-X interrupts.

This text tells me nothing about if I want to enable this or not.  How
is a user (or even a developer) supposed to know if their hardware
requires this?

And I _really_ dont want to see this in drivers/base/ if at all possible
because suddenly I am responsible for this code that I know nothing
about.

greg k-h

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V1 1/7] genirq/msi: Differentiate between various MSI based interrupts
  2019-09-13  1:32 ` [RFC V1 1/7] genirq/msi: Differentiate between various MSI based interrupts Megha Dey
@ 2019-09-13  4:40   ` Greg KH
  2019-09-13 14:40   ` Jason Gunthorpe
  2019-09-26 21:50   ` Bjorn Helgaas
  2 siblings, 0 replies; 21+ messages in thread
From: Greg KH @ 2019-09-13  4:40 UTC (permalink / raw)
  To: Megha Dey
  Cc: linux-kernel, x86, linux-pci, maz, bhelgaas, rafael, tglx, hpa,
	alex.williamson, jgg, ashok.raj, megha.dey, jacob.jun.pan

On Thu, Sep 12, 2019 at 06:32:02PM -0700, Megha Dey wrote:
> +enum msi_desc_tags {
> +	IRQ_MSI_TAG_MSI,
> +	IRQ_MSI_TAG_MSIX,
> +	IRQ_MSI_TAG_IMS,
> +	IRQ_MSI_TAG_PLAT,
> +	IRQ_MSI_TAG_FSL,
> +	IRQ_MSI_TAG_SCI,
> +};

What does any of these mean?  Can you please provide comments at the
very least saying what FSL, SCI, IMS and everything else is?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V1 1/7] genirq/msi: Differentiate between various MSI based interrupts
  2019-09-13  1:32 ` [RFC V1 1/7] genirq/msi: Differentiate between various MSI based interrupts Megha Dey
  2019-09-13  4:40   ` Greg KH
@ 2019-09-13 14:40   ` Jason Gunthorpe
  2019-09-26 21:50   ` Bjorn Helgaas
  2 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2019-09-13 14:40 UTC (permalink / raw)
  To: Megha Dey
  Cc: linux-kernel, x86, linux-pci, maz, bhelgaas, rafael, gregkh,
	tglx, hpa, alex.williamson, ashok.raj, megha.dey, jacob.jun.pan

On Thu, Sep 12, 2019 at 06:32:02PM -0700, Megha Dey wrote:
> Since a device can support both MSI-X and IMS interrupts simultaneously,
> do away with is_msix and introduce a new enum msi_desc_tag to
> differentiate between the various types of msi_descs.

It would be clearer if this commit message explaind that this is a
cleanup creating a normal tagged union:

struct msi_desc {
 [..]
        union {
        
Where the tag says which of the elements in the union is filled in

> +enum msi_desc_tags {
> +	IRQ_MSI_TAG_MSI,
> +	IRQ_MSI_TAG_MSIX,
> +	IRQ_MSI_TAG_IMS,

And don't add IMS to the enum until you add a IMS member to the union.

Jason

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V1 2/7] drivers/base: Introduce callbacks for IMS interrupt domain
  2019-09-13  1:32 ` [RFC V1 2/7] drivers/base: Introduce callbacks for IMS interrupt domain Megha Dey
  2019-09-13  4:39   ` Greg KH
@ 2019-09-13 14:50   ` Jason Gunthorpe
  1 sibling, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2019-09-13 14:50 UTC (permalink / raw)
  To: Megha Dey
  Cc: linux-kernel, x86, linux-pci, maz, bhelgaas, rafael, gregkh,
	tglx, hpa, alex.williamson, ashok.raj, megha.dey, jacob.jun.pan,
	Jacob Pan, Sanjay Kumar

On Thu, Sep 12, 2019 at 06:32:03PM -0700, Megha Dey wrote:
> This patch serves as a preparatory patch to introduce a new IMS
> (Interrupt Message Store) domain. It consists of APIs which would
> be used as callbacks to the IRQ chip associated with the IMS domain.
> 
> The APIs introduced in this patch are:
> dev_ims_mask_irq - Generic irq chip callback to mask IMS interrupts
> dev_ims_unmask_irq - Generic irq chip callback to unmask IMS interrupts
> dev_ims_domain_write_msg - Helper to write MSI message to Device IMS
> 
> It also introduces IMS specific structures namely:
> dev_ims_ops - Callbacks for IMS domain ops
> dev_ims_desc - Device specific IMS msi descriptor data
> dev_ims_priv_data - Internal data structure containing a unique devid
> and a pointer to the IMS domain ops
> 
> Lastly, it adds a new config option MSI_IMS which must be enabled by
> any driver who would want to use the IMS infrastructure.
> 
> Since IMS is not PCI compliant (like platform-msi), most of the code is
> similar to platform-msi.c.

It is very unclear what driver facing API is being proposed here
without a driver code example.

> TODO: Conclude if ims-msi.c and platform-msi.c can be merged.

Wow, the commit message for platform-msi.c explains that it is doing
*exactly* the same thing as ims - at least as far as the driver is
concerned.

Clearly the ims idea already exists and the driver facing API should
be harmonized and enhanced to support both PCI and platform devices.

It would also make sense if the arch facing API was harmonized between
ARM and x86 as well. Would updating x86 to use the new generic
msi_domains ARM is using be needed?

Jason

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V1 3/7] x86/ims: Add support for a new IMS irq domain
  2019-09-13  1:32 ` [RFC V1 3/7] x86/ims: Add support for a new IMS irq domain Megha Dey
@ 2019-09-13 14:52   ` Jason Gunthorpe
  2019-09-13 22:07   ` Alex Williamson
  1 sibling, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2019-09-13 14:52 UTC (permalink / raw)
  To: Megha Dey
  Cc: linux-kernel, x86, linux-pci, maz, bhelgaas, rafael, gregkh,
	tglx, hpa, alex.williamson, ashok.raj, megha.dey, jacob.jun.pan,
	Jacob Pan, Sanjay Kumar

On Thu, Sep 12, 2019 at 06:32:04PM -0700, Megha Dey wrote:
> +/*
> + * Determine if a dev is mdev or not. Return NULL if not mdev device.
> + * Return mdev's parent dev if success.
> + */
> +static inline struct device *mdev_to_parent(struct device *dev)
> +{
> +	struct device *ret = NULL;
> +	struct device *(*fn)(struct device *dev);
> +	struct bus_type *bus = symbol_get(mdev_bus_type);
> +
> +	if (bus && dev->bus == bus) {
> +		fn = symbol_get(mdev_dev_to_parent_dev);
> +		ret = fn(dev);
> +		symbol_put(mdev_dev_to_parent_dev);
> +		symbol_put(mdev_bus_type);
> +	}

Yuk, arch code should not have special knowledge about vfio-mdev, and
in any case the above way to get it is really gross.

> +static struct msi_domain_ops dev_ims_domain_ops = {
> +	.get_hwirq	= msi_get_hwirq,
> +	.msi_prepare	= dev_ims_prepare,
> +};
> +
> +static struct irq_chip dev_ims_ir_controller = {
> +	.name			= "IR-DEV-IMS",
> +	.irq_unmask		= dev_ims_unmask_irq,
> +	.irq_mask		= dev_ims_mask_irq,
> +	.irq_ack		= irq_chip_ack_parent,
> +	.irq_retrigger		= irq_chip_retrigger_hierarchy,
> +	.irq_set_vcpu_affinity	= irq_chip_set_vcpu_affinity_parent,
> +	.flags			= IRQCHIP_SKIP_SET_WAKE,
> +	.irq_write_msi_msg	= dev_ims_write_msg,
> +};

It seems really weird to wrapper an irq_chip like this, if the caller
is supposed to provide the functions more or less directly why not
have the caller create and own the irq chip? Is something in the way
of this? It looked like the platform version of this did the same API approach..

Jason

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V1 5/7] x86/ims: Introduce x86_ims_ops
  2019-09-13  1:32 ` [RFC V1 5/7] x86/ims: Introduce x86_ims_ops Megha Dey
@ 2019-09-13 14:54   ` Jason Gunthorpe
  0 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2019-09-13 14:54 UTC (permalink / raw)
  To: Megha Dey
  Cc: linux-kernel, x86, linux-pci, maz, bhelgaas, rafael, gregkh,
	tglx, hpa, alex.williamson, ashok.raj, megha.dey, jacob.jun.pan,
	Jacob Pan, Sanjay Kumar

On Thu, Sep 12, 2019 at 06:32:06PM -0700, Megha Dey wrote:
> This patch introduces an x86 specific indirect mechanism to setup the
> interrupt message storage. The IMS specific functions (setup, teardown,
> restore) become function pointers in an x86_ims_ops struct, that
> defaults to their implementations in ims.c and ims-msi.c.
> 
> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
> Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
>  arch/x86/include/asm/pci.h      |  4 ++++
>  arch/x86/include/asm/x86_init.h | 10 ++++++++++
>  arch/x86/kernel/apic/ims.c      | 18 ++++++++++++++++++
>  arch/x86/kernel/x86_init.c      | 23 +++++++++++++++++++++++
>  drivers/base/ims-msi.c          | 34 ++++++++++++++++++++++++++++++++++
>  include/linux/msi.h             |  6 ++++++
>  6 files changed, 95 insertions(+)
> 
> diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
> index e662f98..2ef513f 100644
> +++ b/arch/x86/include/asm/pci.h
> @@ -114,6 +114,10 @@ struct msi_desc;
>  int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type);
>  void native_teardown_msi_irq(unsigned int irq);
>  void native_restore_msi_irqs(struct pci_dev *dev);
> +#ifdef CONFIG_MSI_IMS
> +int native_setup_ims_irqs(struct device *dev, int nvec);
> +#endif
> +
>  #else
>  #define native_setup_msi_irqs		NULL
>  #define native_teardown_msi_irq		NULL
> diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
> index ac09341..9c2cbbb 100644
> +++ b/arch/x86/include/asm/x86_init.h
> @@ -287,6 +287,15 @@ struct x86_msi_ops {
>  	void (*restore_msi_irqs)(struct pci_dev *dev);
>  };
>  
> +struct device;
> +
> +struct x86_ims_ops {
> +	int (*setup_ims_irqs)(struct device *dev, int nvec);
> +	void (*teardown_ims_irq)(unsigned int irq);
> +	void (*teardown_ims_irqs)(struct device *dev);
> +	void (*restore_ims_irqs)(struct device *dev);
> +};

This looks alot like the generic struct msi_controller..

Jason

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V1 0/7] Add support for a new IMS interrupt mechanism
  2019-09-13  1:32 [RFC V1 0/7] Add support for a new IMS interrupt mechanism Megha Dey
                   ` (6 preceding siblings ...)
  2019-09-13  1:32 ` [RFC V1 7/7] ims: Add the set_desc callback Megha Dey
@ 2019-09-13 19:50 ` Jason Gunthorpe
  2019-09-13 20:27   ` Raj, Ashok
  7 siblings, 1 reply; 21+ messages in thread
From: Jason Gunthorpe @ 2019-09-13 19:50 UTC (permalink / raw)
  To: Megha Dey
  Cc: linux-kernel, x86, linux-pci, maz, bhelgaas, rafael, gregkh,
	tglx, hpa, alex.williamson, ashok.raj, megha.dey, jacob.jun.pan

On Thu, Sep 12, 2019 at 06:32:01PM -0700, Megha Dey wrote:

> This series is a basic patchset to get the ball rolling and receive some
> inital comments. As per my discussion with Marc Zyngier and Thomas Gleixner
> at the Linux Plumbers, I need to do the following:
> 1. Since a device can support MSI-X and IMS simultaneously, ensure proper
>    locking mechanism for the 'msi_list' in the device structure.
> 2. Introduce dynamic allocation of IMS vectors perhaps by using a group ID
> 3. IMS support of a device needs to be discoverable. A bit in the vendor
>    specific capability in the PCI config is to be added rather than getting
>    this information from each device driver.

Why #3? The point of this scheme is to delegate programming the
addr/data pairs to the driver so it can be done in some
device-specific way. There is no PCI standard behind this, and no
change in PCI semantics. 

I think it would be a tall ask to get a config space bit from PCI-SIG
for something that has little to do with PCI.

After seeing that we already have a platform device based version of
this same idea (drivers/base/platform-msi.c), I think the task here is
really just to extend and expand that approach to work generically for
platform and PCI devices. Along the way tidying the arch interface so
x86 and ARM's stuff to support that uses the same generic interfaces.

ie it is re-organizing code and ideas already in Linux, not defining
some new standard.

I also think refering to this existing idea by some new IMS name is
only confusing people what the goal is... Which is perhaps why #3 was
suggested??

Stated more clearly, I think all uses would be satisfied if
platform_msi_domain_alloc_irqs() could be called for struct
pci_device, could be called multiple times for the same struct
pci_device, and co-existed with MSI and MSI-X on the same pci_device.

Jason

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V1 0/7] Add support for a new IMS interrupt mechanism
  2019-09-13 19:50 ` [RFC V1 0/7] Add support for a new IMS interrupt mechanism Jason Gunthorpe
@ 2019-09-13 20:27   ` Raj, Ashok
  2019-09-19 18:25     ` Jason Gunthorpe
  0 siblings, 1 reply; 21+ messages in thread
From: Raj, Ashok @ 2019-09-13 20:27 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Megha Dey, linux-kernel, x86, linux-pci, maz, bhelgaas, rafael,
	gregkh, tglx, hpa, alex.williamson, megha.dey, jacob.jun.pan,
	Ashok Raj

On Fri, Sep 13, 2019 at 07:50:50PM +0000, Jason Gunthorpe wrote:
> On Thu, Sep 12, 2019 at 06:32:01PM -0700, Megha Dey wrote:
> 
> > This series is a basic patchset to get the ball rolling and receive some
> > inital comments. As per my discussion with Marc Zyngier and Thomas Gleixner
> > at the Linux Plumbers, I need to do the following:
> > 1. Since a device can support MSI-X and IMS simultaneously, ensure proper
> >    locking mechanism for the 'msi_list' in the device structure.
> > 2. Introduce dynamic allocation of IMS vectors perhaps by using a group ID
> > 3. IMS support of a device needs to be discoverable. A bit in the vendor
> >    specific capability in the PCI config is to be added rather than getting
> >    this information from each device driver.
> 
> Why #3? The point of this scheme is to delegate programming the
> addr/data pairs to the driver so it can be done in some
> device-specific way. There is no PCI standard behind this, and no
> change in PCI semantics. 
> 
> I think it would be a tall ask to get a config space bit from PCI-SIG
> for something that has little to do with PCI.

This isn't a standard config capability. Its Designated Vendor Specific
Capability (DVSEC). The device is responsible for managing the addr-data
pair. This provides a hint to the OS framework that this device has
device specific methods.

Agreed its not required, but some OSV's like a generic way to discover
these capabilities, hence its there so device vendors can have
a common guideline.

Check here for some of those details:

https://software.intel.com/en-us/blogs/2018/06/25/introducing-intel-scalable-io-virtualization 

> 
> After seeing that we already have a platform device based version of
> this same idea (drivers/base/platform-msi.c), I think the task here is
> really just to extend and expand that approach to work generically for
> platform and PCI devices. Along the way tidying the arch interface so
> x86 and ARM's stuff to support that uses the same generic interfaces.
> 
> ie it is re-organizing code and ideas already in Linux, not defining
> some new standard.
> 
> I also think refering to this existing idea by some new IMS name is
> only confusing people what the goal is... Which is perhaps why #3 was
> suggested??
> 
> Stated more clearly, I think all uses would be satisfied if
> platform_msi_domain_alloc_irqs() could be called for struct
> pci_device, could be called multiple times for the same struct
> pci_device, and co-existed with MSI and MSI-X on the same pci_device.
> 
> Jason

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V1 3/7] x86/ims: Add support for a new IMS irq domain
  2019-09-13  1:32 ` [RFC V1 3/7] x86/ims: Add support for a new IMS irq domain Megha Dey
  2019-09-13 14:52   ` Jason Gunthorpe
@ 2019-09-13 22:07   ` Alex Williamson
  1 sibling, 0 replies; 21+ messages in thread
From: Alex Williamson @ 2019-09-13 22:07 UTC (permalink / raw)
  To: Megha Dey
  Cc: linux-kernel, x86, linux-pci, maz, bhelgaas, rafael, gregkh,
	tglx, hpa, jgg, ashok.raj, megha.dey, jacob.jun.pan, Jacob Pan,
	Sanjay Kumar

On Thu, 12 Sep 2019 18:32:04 -0700
Megha Dey <megha.dey@linux.intel.com> wrote:

> This patch adds support for the creation of a new IMS irq domain. It
> creates a new irq_chip associated with the IMS domain and adds the
> necessary domain operations to it.
> 
> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
> Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
> ---
>  arch/x86/include/asm/msi.h       |  4 ++
>  arch/x86/kernel/apic/Makefile    |  1 +
>  arch/x86/kernel/apic/ims.c       | 93 ++++++++++++++++++++++++++++++++++++++++
>  arch/x86/kernel/apic/msi.c       |  4 +-
>  drivers/vfio/mdev/mdev_core.c    |  6 +++
>  drivers/vfio/mdev/mdev_private.h |  1 -
>  include/linux/mdev.h             |  2 +
>  7 files changed, 108 insertions(+), 3 deletions(-)
>  create mode 100644 arch/x86/kernel/apic/ims.c
> 
> diff --git a/arch/x86/include/asm/msi.h b/arch/x86/include/asm/msi.h
> index 25ddd09..51f9d25 100644
> --- a/arch/x86/include/asm/msi.h
> +++ b/arch/x86/include/asm/msi.h
> @@ -11,4 +11,8 @@ int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
>  
>  void pci_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc);
>  
> +struct msi_domain_info;
> +
> +irq_hw_number_t msi_get_hwirq(struct msi_domain_info *info,
> +						msi_alloc_info_t *arg);
>  #endif /* _ASM_X86_MSI_H */
> diff --git a/arch/x86/kernel/apic/Makefile b/arch/x86/kernel/apic/Makefile
> index a6fcaf16..75a2270 100644
> --- a/arch/x86/kernel/apic/Makefile
> +++ b/arch/x86/kernel/apic/Makefile
> @@ -12,6 +12,7 @@ obj-y				+= hw_nmi.o
>  
>  obj-$(CONFIG_X86_IO_APIC)	+= io_apic.o
>  obj-$(CONFIG_PCI_MSI)		+= msi.o
> +obj-$(CONFIG_MSI_IMS)		+= ims.o
>  obj-$(CONFIG_SMP)		+= ipi.o
>  
>  ifeq ($(CONFIG_X86_64),y)
> diff --git a/arch/x86/kernel/apic/ims.c b/arch/x86/kernel/apic/ims.c
> new file mode 100644
> index 0000000..d9808a5
> --- /dev/null
> +++ b/arch/x86/kernel/apic/ims.c
> @@ -0,0 +1,93 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright © 2019 Intel Corporation.
> + *
> + * Author: Megha Dey <megha.dey@intel.com>
> + */
> +
> +#include <linux/dmar.h>
> +#include <linux/irq.h>
> +#include <linux/mdev.h>
> +#include <linux/pci.h>
> +
> +/*
> + * Determine if a dev is mdev or not. Return NULL if not mdev device.
> + * Return mdev's parent dev if success.
> + */
> +static inline struct device *mdev_to_parent(struct device *dev)
> +{
> +	struct device *ret = NULL;
> +	struct device *(*fn)(struct device *dev);
> +	struct bus_type *bus = symbol_get(mdev_bus_type);
> +
> +	if (bus && dev->bus == bus) {
> +		fn = symbol_get(mdev_dev_to_parent_dev);
> +		ret = fn(dev);
> +		symbol_put(mdev_dev_to_parent_dev);
> +		symbol_put(mdev_bus_type);

Leaks a reference to the mdev module if dev->bus != bus.  The new
version of dev_is_mdev() unconditionally leaks a reference.  Thanks,

Alex

> +	}
> +
> +	return ret;
> +}
> +
> +static struct pci_dev *ims_get_pci_dev(struct device *dev)
> +{
> +	struct pci_dev *pdev;
> +
> +	if (dev_is_mdev(dev)) {
> +		struct device *parent = mdev_to_parent(dev);
> +
> +		pdev = to_pci_dev(parent);
> +	} else {
> +		pdev = to_pci_dev(dev);
> +	}
> +
> +	return pdev;
> +}
> +
> +int dev_ims_prepare(struct irq_domain *domain, struct device *dev, int nvec,
> +		    msi_alloc_info_t *arg)
> +{
> +	struct pci_dev *pdev = ims_get_pci_dev(dev);
> +
> +	init_irq_alloc_info(arg, NULL);
> +	arg->msi_dev = pdev;
> +	arg->type = X86_IRQ_ALLOC_TYPE_MSIX;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(dev_ims_prepare);
> +
> +#ifdef CONFIG_IRQ_REMAP
> +
> +static struct msi_domain_ops dev_ims_domain_ops = {
> +	.get_hwirq	= msi_get_hwirq,
> +	.msi_prepare	= dev_ims_prepare,
> +};
> +
> +static struct irq_chip dev_ims_ir_controller = {
> +	.name			= "IR-DEV-IMS",
> +	.irq_unmask		= dev_ims_unmask_irq,
> +	.irq_mask		= dev_ims_mask_irq,
> +	.irq_ack		= irq_chip_ack_parent,
> +	.irq_retrigger		= irq_chip_retrigger_hierarchy,
> +	.irq_set_vcpu_affinity	= irq_chip_set_vcpu_affinity_parent,
> +	.flags			= IRQCHIP_SKIP_SET_WAKE,
> +	.irq_write_msi_msg	= dev_ims_write_msg,
> +};
> +
> +static struct msi_domain_info ims_ir_domain_info = {
> +	.flags		= MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
> +			  MSI_FLAG_MULTI_PCI_MSI | MSI_FLAG_PCI_MSIX,
> +	.ops		= &dev_ims_domain_ops,
> +	.chip		= &dev_ims_ir_controller,
> +	.handler	= handle_edge_irq,
> +	.handler_name	= "edge",
> +};
> +
> +struct irq_domain *arch_create_ims_irq_domain(struct irq_domain *parent)
> +{
> +	return pci_msi_create_irq_domain(NULL, &ims_ir_domain_info, parent);
> +}
> +
> +#endif
> diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
> index 435bcda..65da813 100644
> --- a/arch/x86/kernel/apic/msi.c
> +++ b/arch/x86/kernel/apic/msi.c
> @@ -84,7 +84,7 @@ void native_teardown_msi_irq(unsigned int irq)
>  	irq_domain_free_irqs(irq, 1);
>  }
>  
> -static irq_hw_number_t pci_msi_get_hwirq(struct msi_domain_info *info,
> +irq_hw_number_t msi_get_hwirq(struct msi_domain_info *info,
>  					 msi_alloc_info_t *arg)
>  {
>  	return arg->msi_hwirq;
> @@ -116,7 +116,7 @@ void pci_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc)
>  EXPORT_SYMBOL_GPL(pci_msi_set_desc);
>  
>  static struct msi_domain_ops pci_msi_domain_ops = {
> -	.get_hwirq	= pci_msi_get_hwirq,
> +	.get_hwirq	= msi_get_hwirq,
>  	.msi_prepare	= pci_msi_prepare,
>  	.set_desc	= pci_msi_set_desc,
>  };
> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
> index b558d4c..cecc6a6 100644
> --- a/drivers/vfio/mdev/mdev_core.c
> +++ b/drivers/vfio/mdev/mdev_core.c
> @@ -33,6 +33,12 @@ struct device *mdev_parent_dev(struct mdev_device *mdev)
>  }
>  EXPORT_SYMBOL(mdev_parent_dev);
>  
> +struct device *mdev_dev_to_parent_dev(struct device *dev)
> +{
> +	return to_mdev_device(dev)->parent->dev;
> +}
> +EXPORT_SYMBOL(mdev_dev_to_parent_dev);
> +
>  void *mdev_get_drvdata(struct mdev_device *mdev)
>  {
>  	return mdev->driver_data;
> diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h
> index 7d92295..c21f130 100644
> --- a/drivers/vfio/mdev/mdev_private.h
> +++ b/drivers/vfio/mdev/mdev_private.h
> @@ -36,7 +36,6 @@ struct mdev_device {
>  };
>  
>  #define to_mdev_device(dev)	container_of(dev, struct mdev_device, dev)
> -#define dev_is_mdev(d)		((d)->bus == &mdev_bus_type)
>  
>  struct mdev_type {
>  	struct kobject kobj;
> diff --git a/include/linux/mdev.h b/include/linux/mdev.h
> index 0ce30ca..9dcbffe 100644
> --- a/include/linux/mdev.h
> +++ b/include/linux/mdev.h
> @@ -144,5 +144,7 @@ void mdev_unregister_driver(struct mdev_driver *drv);
>  struct device *mdev_parent_dev(struct mdev_device *mdev);
>  struct device *mdev_dev(struct mdev_device *mdev);
>  struct mdev_device *mdev_from_dev(struct device *dev);
> +struct device *mdev_dev_to_parent_dev(struct device *dev);
>  
> +#define dev_is_mdev(d)          ((d)->bus == symbol_get(mdev_bus_type))
>  #endif /* MDEV_H */


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V1 0/7] Add support for a new IMS interrupt mechanism
  2019-09-13 20:27   ` Raj, Ashok
@ 2019-09-19 18:25     ` Jason Gunthorpe
  0 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2019-09-19 18:25 UTC (permalink / raw)
  To: Raj, Ashok
  Cc: Megha Dey, linux-kernel, x86, linux-pci, maz, bhelgaas, rafael,
	gregkh, tglx, hpa, alex.williamson, megha.dey, jacob.jun.pan

On Fri, Sep 13, 2019 at 01:27:10PM -0700, Raj, Ashok wrote:
> On Fri, Sep 13, 2019 at 07:50:50PM +0000, Jason Gunthorpe wrote:
> > On Thu, Sep 12, 2019 at 06:32:01PM -0700, Megha Dey wrote:
> > 
> > > This series is a basic patchset to get the ball rolling and receive some
> > > inital comments. As per my discussion with Marc Zyngier and Thomas Gleixner
> > > at the Linux Plumbers, I need to do the following:
> > > 1. Since a device can support MSI-X and IMS simultaneously, ensure proper
> > >    locking mechanism for the 'msi_list' in the device structure.
> > > 2. Introduce dynamic allocation of IMS vectors perhaps by using a group ID
> > > 3. IMS support of a device needs to be discoverable. A bit in the vendor
> > >    specific capability in the PCI config is to be added rather than getting
> > >    this information from each device driver.
> > 
> > Why #3? The point of this scheme is to delegate programming the
> > addr/data pairs to the driver so it can be done in some
> > device-specific way. There is no PCI standard behind this, and no
> > change in PCI semantics. 
> > 
> > I think it would be a tall ask to get a config space bit from PCI-SIG
> > for something that has little to do with PCI.
> 
> This isn't a standard config capability. Its Designated Vendor Specific
> Capability (DVSEC). The device is responsible for managing the addr-data
> pair. This provides a hint to the OS framework that this device has
> device specific methods.
> 
> Agreed its not required, but some OSV's like a generic way to discover
> these capabilities, hence its there so device vendors can have
> a common guideline.

I think it would be reasonable for a specific device driver to test
the DVSEC, if it wishes too. 

Since it is not required it does not make sense for the core kernel to
enforce this on all devices, at least for such a nebulous reason.

Jason

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V1 6/7] ims-msi: Add APIs to allocate/free IMS interrupts
  2019-09-13  1:32 ` [RFC V1 6/7] ims-msi: Add APIs to allocate/free IMS interrupts Megha Dey
@ 2019-09-26 21:44   ` Bjorn Helgaas
  0 siblings, 0 replies; 21+ messages in thread
From: Bjorn Helgaas @ 2019-09-26 21:44 UTC (permalink / raw)
  To: Megha Dey
  Cc: linux-kernel, x86, linux-pci, maz, rafael, gregkh, tglx, hpa,
	alex.williamson, jgg, ashok.raj, megha.dey, jacob.jun.pan,
	Jacob Pan, Sanjay Kumar

On Thu, Sep 12, 2019 at 06:32:07PM -0700, Megha Dey wrote:
> This patch introduces APIs to allocate and free IMS interrupts.

> +int __dev_ims_alloc_irqs(struct device *dev, int nvec,
> +			 struct dev_ims_ops *ops,
> +			 struct irq_affinity *affd)

Should be static?  The only reference is in this file.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V1 7/7] ims: Add the set_desc callback
  2019-09-13  1:32 ` [RFC V1 7/7] ims: Add the set_desc callback Megha Dey
@ 2019-09-26 21:47   ` Bjorn Helgaas
  0 siblings, 0 replies; 21+ messages in thread
From: Bjorn Helgaas @ 2019-09-26 21:47 UTC (permalink / raw)
  To: Megha Dey
  Cc: linux-kernel, x86, linux-pci, maz, rafael, gregkh, tglx, hpa,
	alex.williamson, jgg, ashok.raj, megha.dey, jacob.jun.pan,
	Jacob Pan, Sanjay Kumar

On Thu, Sep 12, 2019 at 06:32:08PM -0700, Megha Dey wrote:
> Add the set_desc callback to the ims domain ops.

Elsewhere you capitalize "IMS" when it's an initialism.

Generally you capitalized "IRQ" and "MSI" in similar situations, but
there are a couple exceptions (in other commit logs).

> The set_desc callback is used to find a unique hwirq number from a given
> domain.
> 
> Each mdev can have a maximum of 2048 IMS interrupts.

Maybe you could mention where this limit comes from and whether it's
related to any #define in these patches?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V1 1/7] genirq/msi: Differentiate between various MSI based interrupts
  2019-09-13  1:32 ` [RFC V1 1/7] genirq/msi: Differentiate between various MSI based interrupts Megha Dey
  2019-09-13  4:40   ` Greg KH
  2019-09-13 14:40   ` Jason Gunthorpe
@ 2019-09-26 21:50   ` Bjorn Helgaas
  2 siblings, 0 replies; 21+ messages in thread
From: Bjorn Helgaas @ 2019-09-26 21:50 UTC (permalink / raw)
  To: Megha Dey
  Cc: linux-kernel, x86, linux-pci, maz, rafael, gregkh, tglx, hpa,
	alex.williamson, jgg, ashok.raj, megha.dey, jacob.jun.pan

On Thu, Sep 12, 2019 at 06:32:02PM -0700, Megha Dey wrote:
> Since a device can support both MSI-X and IMS interrupts simultaneously,
> do away with is_msix and introduce a new enum msi_desc_tag to
> differentiate between the various types of msi_descs.
> 
> Signed-off-by: Megha Dey <megha.dey@linux.intel.com>

Acked-by: Bjorn Helgaas <bhelgaas@google.com>	# for drivers/pci/msi.c

> ---
>  arch/mips/pci/msi-xlp.c    |  2 +-
>  arch/s390/pci/pci_irq.c    |  2 +-
>  arch/x86/kernel/apic/msi.c |  2 +-
>  arch/x86/pci/xen.c         |  2 +-
>  drivers/pci/msi.c          | 19 ++++++++++---------
>  include/linux/msi.h        | 11 ++++++++++-
>  kernel/irq/msi.c           |  2 +-
>  7 files changed, 25 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/mips/pci/msi-xlp.c b/arch/mips/pci/msi-xlp.c
> index bb14335..0f06ad1 100644
> --- a/arch/mips/pci/msi-xlp.c
> +++ b/arch/mips/pci/msi-xlp.c
> @@ -457,7 +457,7 @@ int arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
>  	node = slot / 8;
>  	lnkbase = nlm_get_pcie_base(node, link);
>  
> -	if (desc->msi_attrib.is_msix)
> +	if (desc->tag == IRQ_MSI_TAG_MSIX)
>  		return xlp_setup_msix(lnkbase, node, link, desc);
>  	else
>  		return xlp_setup_msi(lnkbase, node, link, desc);
> diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
> index d80616a..1938582 100644
> --- a/arch/s390/pci/pci_irq.c
> +++ b/arch/s390/pci/pci_irq.c
> @@ -332,7 +332,7 @@ void arch_teardown_msi_irqs(struct pci_dev *pdev)
>  	for_each_pci_msi_entry(msi, pdev) {
>  		if (!msi->irq)
>  			continue;
> -		if (msi->msi_attrib.is_msix)
> +		if (msi->tag == IRQ_MSI_TAG_MSIX)
>  			__pci_msix_desc_mask_irq(msi, 1);
>  		else
>  			__pci_msi_desc_mask_irq(msi, 1, 1);
> diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
> index 7f75334..435bcda 100644
> --- a/arch/x86/kernel/apic/msi.c
> +++ b/arch/x86/kernel/apic/msi.c
> @@ -98,7 +98,7 @@ int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
>  
>  	init_irq_alloc_info(arg, NULL);
>  	arg->msi_dev = pdev;
> -	if (desc->msi_attrib.is_msix) {
> +	if (desc->tag == IRQ_MSI_TAG_MSIX) {
>  		arg->type = X86_IRQ_ALLOC_TYPE_MSIX;
>  	} else {
>  		arg->type = X86_IRQ_ALLOC_TYPE_MSI;
> diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
> index 91220cc..5e850b8 100644
> --- a/arch/x86/pci/xen.c
> +++ b/arch/x86/pci/xen.c
> @@ -382,7 +382,7 @@ static void xen_teardown_msi_irqs(struct pci_dev *dev)
>  	struct msi_desc *msidesc;
>  
>  	msidesc = first_pci_msi_entry(dev);
> -	if (msidesc->msi_attrib.is_msix)
> +	if (msidesc->tag == IRQ_MSI_TAG_MSIX)
>  		xen_pci_frontend_disable_msix(dev);
>  	else
>  		xen_pci_frontend_disable_msi(dev);
> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index 0884bed..8a05416 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -235,7 +235,7 @@ static void msi_set_mask_bit(struct irq_data *data, u32 flag)
>  {
>  	struct msi_desc *desc = irq_data_get_msi_desc(data);
>  
> -	if (desc->msi_attrib.is_msix) {
> +	if (desc->tag == IRQ_MSI_TAG_MSIX) {
>  		msix_mask_irq(desc, flag);
>  		readl(desc->mask_base);		/* Flush write to device */
>  	} else {
> @@ -278,7 +278,7 @@ void __pci_read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
>  
>  	BUG_ON(dev->current_state != PCI_D0);
>  
> -	if (entry->msi_attrib.is_msix) {
> +	if (entry->tag == IRQ_MSI_TAG_MSIX) {
>  		void __iomem *base = pci_msix_desc_addr(entry);
>  
>  		if (!base) {
> @@ -313,7 +313,7 @@ void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
>  
>  	if (dev->current_state != PCI_D0 || pci_dev_is_disconnected(dev)) {
>  		/* Don't touch the hardware now */
> -	} else if (entry->msi_attrib.is_msix) {
> +	} else if (entry->tag == IRQ_MSI_TAG_MSIX) {
>  		void __iomem *base = pci_msix_desc_addr(entry);
>  
>  		if (!base)
> @@ -376,7 +376,7 @@ static void free_msi_irqs(struct pci_dev *dev)
>  	pci_msi_teardown_msi_irqs(dev);
>  
>  	list_for_each_entry_safe(entry, tmp, msi_list, list) {
> -		if (entry->msi_attrib.is_msix) {
> +		if (entry->tag == IRQ_MSI_TAG_MSIX) {
>  			if (list_is_last(&entry->list, msi_list))
>  				iounmap(entry->mask_base);
>  		}
> @@ -471,7 +471,7 @@ static ssize_t msi_mode_show(struct device *dev, struct device_attribute *attr,
>  	entry = irq_get_msi_desc(irq);
>  	if (entry)
>  		return sprintf(buf, "%s\n",
> -				entry->msi_attrib.is_msix ? "msix" : "msi");
> +			(entry->tag == IRQ_MSI_TAG_MSIX) ? "msix" : "msi");
>  
>  	return -ENODEV;
>  }
> @@ -570,7 +570,7 @@ msi_setup_entry(struct pci_dev *dev, int nvec, struct irq_affinity *affd)
>  
>  	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
>  
> -	entry->msi_attrib.is_msix	= 0;
> +	entry->tag			= IRQ_MSI_TAG_MSI;
>  	entry->msi_attrib.is_64		= !!(control & PCI_MSI_FLAGS_64BIT);
>  	entry->msi_attrib.is_virtual    = 0;
>  	entry->msi_attrib.entry_nr	= 0;
> @@ -714,7 +714,7 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
>  			goto out;
>  		}
>  
> -		entry->msi_attrib.is_msix	= 1;
> +		entry->tag			= IRQ_MSI_TAG_MSIX;
>  		entry->msi_attrib.is_64		= 1;
>  		if (entries)
>  			entry->msi_attrib.entry_nr = entries[i].entry;
> @@ -1380,7 +1380,7 @@ irq_hw_number_t pci_msi_domain_calc_hwirq(struct pci_dev *dev,
>  
>  static inline bool pci_msi_desc_is_multi_msi(struct msi_desc *desc)
>  {
> -	return !desc->msi_attrib.is_msix && desc->nvec_used > 1;
> +	return (desc->tag == IRQ_MSI_TAG_MSI) && desc->nvec_used > 1;
>  }
>  
>  /**
> @@ -1404,7 +1404,8 @@ int pci_msi_domain_check_cap(struct irq_domain *domain,
>  	if (pci_msi_desc_is_multi_msi(desc) &&
>  	    !(info->flags & MSI_FLAG_MULTI_PCI_MSI))
>  		return 1;
> -	else if (desc->msi_attrib.is_msix && !(info->flags & MSI_FLAG_PCI_MSIX))
> +	else if ((desc->tag == IRQ_MSI_TAG_MSIX) &&
> +					!(info->flags & MSI_FLAG_PCI_MSIX))
>  		return -ENOTSUPP;
>  
>  	return 0;
> diff --git a/include/linux/msi.h b/include/linux/msi.h
> index 8ad679e..22591b6 100644
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -55,6 +55,15 @@ struct ti_sci_inta_msi_desc {
>  	u16	dev_index;
>  };
>  
> +enum msi_desc_tags {
> +	IRQ_MSI_TAG_MSI,
> +	IRQ_MSI_TAG_MSIX,
> +	IRQ_MSI_TAG_IMS,
> +	IRQ_MSI_TAG_PLAT,
> +	IRQ_MSI_TAG_FSL,
> +	IRQ_MSI_TAG_SCI,
> +};
> +
>  /**
>   * struct msi_desc - Descriptor structure for MSI based interrupts
>   * @list:	List head for management
> @@ -90,6 +99,7 @@ struct msi_desc {
>  	struct device			*dev;
>  	struct msi_msg			msg;
>  	struct irq_affinity_desc	*affinity;
> +	enum msi_desc_tags		tag;
>  #ifdef CONFIG_IRQ_MSI_IOMMU
>  	const void			*iommu_cookie;
>  #endif
> @@ -102,7 +112,6 @@ struct msi_desc {
>  		struct {
>  			u32 masked;
>  			struct {
> -				u8	is_msix		: 1;
>  				u8	multiple	: 3;
>  				u8	multi_cap	: 3;
>  				u8	maskbit		: 1;
> diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
> index ad26fbc..0819395 100644
> --- a/kernel/irq/msi.c
> +++ b/kernel/irq/msi.c
> @@ -384,7 +384,7 @@ static bool msi_check_reservation_mode(struct irq_domain *domain,
>  	 * masking and MSI does so when the maskbit is set.
>  	 */
>  	desc = first_msi_entry(dev);
> -	return desc->msi_attrib.is_msix || desc->msi_attrib.maskbit;
> +	return (desc->tag == IRQ_MSI_TAG_MSIX) || desc->msi_attrib.maskbit;
>  }
>  
>  /**
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, back to index

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-13  1:32 [RFC V1 0/7] Add support for a new IMS interrupt mechanism Megha Dey
2019-09-13  1:32 ` [RFC V1 1/7] genirq/msi: Differentiate between various MSI based interrupts Megha Dey
2019-09-13  4:40   ` Greg KH
2019-09-13 14:40   ` Jason Gunthorpe
2019-09-26 21:50   ` Bjorn Helgaas
2019-09-13  1:32 ` [RFC V1 2/7] drivers/base: Introduce callbacks for IMS interrupt domain Megha Dey
2019-09-13  4:39   ` Greg KH
2019-09-13 14:50   ` Jason Gunthorpe
2019-09-13  1:32 ` [RFC V1 3/7] x86/ims: Add support for a new IMS irq domain Megha Dey
2019-09-13 14:52   ` Jason Gunthorpe
2019-09-13 22:07   ` Alex Williamson
2019-09-13  1:32 ` [RFC V1 4/7] irq_remapping: New interfaces to support IMS irqdomain Megha Dey
2019-09-13  1:32 ` [RFC V1 5/7] x86/ims: Introduce x86_ims_ops Megha Dey
2019-09-13 14:54   ` Jason Gunthorpe
2019-09-13  1:32 ` [RFC V1 6/7] ims-msi: Add APIs to allocate/free IMS interrupts Megha Dey
2019-09-26 21:44   ` Bjorn Helgaas
2019-09-13  1:32 ` [RFC V1 7/7] ims: Add the set_desc callback Megha Dey
2019-09-26 21:47   ` Bjorn Helgaas
2019-09-13 19:50 ` [RFC V1 0/7] Add support for a new IMS interrupt mechanism Jason Gunthorpe
2019-09-13 20:27   ` Raj, Ashok
2019-09-19 18:25     ` Jason Gunthorpe

Linux-PCI Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-pci/0 linux-pci/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-pci linux-pci/ https://lore.kernel.org/linux-pci \
		linux-pci@vger.kernel.org linux-pci@archiver.kernel.org
	public-inbox-index linux-pci

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-pci


AGPL code for this site: git clone https://public-inbox.org/ public-inbox