linux-pci.vger.kernel.org archive mirror
* [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
@ 2014-07-26  3:08 Yijing Wang
  2014-07-26  3:08 ` [RFC PATCH 01/11] PCI/MSI: Use pci_dev->msi_cap instead of msi_desc->msi_attrib.pos Yijing Wang
                   ` (12 more replies)
  0 siblings, 13 replies; 41+ messages in thread
From: Yijing Wang @ 2014-07-26  3:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, arnab.basu, virtualization, Hanjun Guo,
	Yijing Wang

Hi all,

This series is a draft of a generic MSI driver that supports both PCI
and non-PCI devices that have MSI capability. If you're not interested
in it, sorry for the noise.

The series is based on Linux-3.16-rc1.

MSI was introduced in PCI Spec 2.2. Currently, the kernel MSI driver
code is tightly bound to PCI devices. Because MSI has many design
advantages, more and more non-PCI devices want to use it as their
default interrupt mechanism. HPET is one existing example: the HPET
driver provides its own code to initialize and process MSI interrupts.
In the latest GIC v3 spec, a legacy device can deliver MSI with the
help of a relay device named the consolidator, which translates the
legacy interrupts connected to it into MSI/MSI-X. New non-PCI devices
will also be designed to support MSI in the future. Making the MSI
driver code generic will therefore make it simpler for non-PCI devices
to use MSI.

The new data structures for the generic MSI driver:
struct msi_irqs {
	u8 msi_enabled:1; /* Enable flag */
	u8 msix_enabled:1;
	struct list_head msi_list; /* MSI desc list */
	void *data;	/* help to find the MSI device */
	struct msi_ops *ops; /* MSI device specific hook */
};
struct msi_irqs is used to manage MSI-related information. Every device that
supports MSI should contain this data structure and allocate it.
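
As a rough illustration of the embedding described above, the following userspace sketch shows a non-PCI device carrying its own struct msi_irqs. The names demo_dev, demo_msi_irqs_init() and demo_dev_from_msi() are hypothetical and not part of the series; the list_head type is a minimal stand-in for the kernel's, and the bit-fields use unsigned int instead of u8 for portability.

```c
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

struct msi_ops;	/* device-specific hooks, opaque in this sketch */

/* Same fields as the struct proposed in the cover letter. */
struct msi_irqs {
	unsigned int msi_enabled:1;	/* Enable flag */
	unsigned int msix_enabled:1;
	struct list_head msi_list;	/* MSI desc list */
	void *data;			/* help to find the MSI device */
	struct msi_ops *ops;		/* MSI device specific hook */
};

/* Hypothetical non-PCI device that embeds the MSI bookkeeping. */
struct demo_dev {
	const char *name;
	struct msi_irqs msi;
};

/* Hypothetical helper: put the embedded msi_irqs into a known state. */
static void demo_msi_irqs_init(struct demo_dev *dev, struct msi_ops *ops)
{
	dev->msi.msi_enabled = 0;
	dev->msi.msix_enabled = 0;
	init_list_head(&dev->msi.msi_list);
	dev->msi.data = dev;	/* back-pointer to the owning device */
	dev->msi.ops = ops;
}

/* Hypothetical helper: recover the owning device from the generic struct. */
static struct demo_dev *demo_dev_from_msi(struct msi_irqs *msi)
{
	return msi->data;
}
```

Here the data pointer is assumed to hold a back-pointer to the owning device, which is one plausible reading of "help to find the MSI device".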

struct msi_ops {
	struct msi_desc *(*msi_setup_entry)(struct msi_irqs *msi, struct msi_desc *entry);
	int (*msix_setup_entries)(struct msi_irqs *msi, struct msix_entry *entries);
	u32 (*msi_mask_irq)(struct msi_desc *desc, u32 mask, u32 flag);
	u32 (*msix_mask_irq)(struct msi_desc *desc, u32 flag);
	void (*msi_read_message)(struct msi_desc *desc, struct msi_msg *msg);
	void (*msi_write_message)(struct msi_desc *desc, struct msi_msg *msg);
	void (*msi_set_intx)(struct msi_irqs *msi, int enable);
};
struct msi_ops provides several hook functions. The generic MSI driver calls
these hooks to access device-specific registers. All PCI devices share the same
msi_ops, because they access the MSI hardware registers in the same way.
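
To make the hook idea concrete, here is a minimal userspace sketch, not the code from this series. It reduces msi_ops to the two message hooks and backs them with a hypothetical memory-mapped register block (demo_regs); the generic_*() wrappers, demo_*() functions and the regs member of msi_desc are illustrative assumptions.

```c
#include <stdint.h>

/* Same three fields as the kernel's struct msi_msg. */
struct msi_msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

/* Hypothetical register block of a non-PCI MSI-capable device. */
struct demo_regs {
	uint32_t addr_lo;
	uint32_t addr_hi;
	uint32_t data;
};

/* Reduced descriptor: only what this sketch needs. */
struct msi_desc {
	struct demo_regs *regs;
};

/* Reduced ops table: only the two message hooks from the cover letter. */
struct msi_ops {
	void (*msi_read_message)(struct msi_desc *desc, struct msi_msg *msg);
	void (*msi_write_message)(struct msi_desc *desc, struct msi_msg *msg);
};

/* Device-specific hooks: the only code that knows the register layout. */
static void demo_read_message(struct msi_desc *desc, struct msi_msg *msg)
{
	msg->address_lo = desc->regs->addr_lo;
	msg->address_hi = desc->regs->addr_hi;
	msg->data = desc->regs->data;
}

static void demo_write_message(struct msi_desc *desc, struct msi_msg *msg)
{
	desc->regs->addr_lo = msg->address_lo;
	desc->regs->addr_hi = msg->address_hi;
	desc->regs->data = msg->data;
}

static const struct msi_ops demo_msi_ops = {
	.msi_read_message = demo_read_message,
	.msi_write_message = demo_write_message,
};

/* Generic layer: dispatches through the table, no device knowledge. */
static void generic_write_msi_msg(const struct msi_ops *ops,
				  struct msi_desc *desc, struct msi_msg *msg)
{
	ops->msi_write_message(desc, msg);
}

static void generic_read_msi_msg(const struct msi_ops *ops,
				 struct msi_desc *desc, struct msi_msg *msg)
{
	ops->msi_read_message(desc, msg);
}
```

A PCI implementation would presumably supply hooks that go through config space instead, while the generic wrappers stay unchanged.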

The generic MSI layer exports msi_capability_init() and msix_capability_init()
to drivers. These functions initialize the MSI capability data structure msi_desc,
allocate the irq, and then write the MSI address/data values to the hardware registers.
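
The three steps named above (set up the descriptor, allocate an irq, program address/data) can be sketched in userspace as follows. This is only an outline of the ordering and the error unwinding under simplified assumptions; demo_capability_init(), struct demo_hooks and the stub hooks are hypothetical and do not reflect the signatures used in the patches.

```c
#include <stdint.h>
#include <stdlib.h>

struct msi_msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

/* Reduced descriptor: irq number plus the last message programmed. */
struct msi_desc {
	int irq;
	struct msi_msg msg;
};

/* Hypothetical per-device hooks used by the generic init path. */
struct demo_hooks {
	int (*alloc_irq)(void);	/* irq number, or negative on failure */
	void (*write_message)(struct msi_desc *desc, const struct msi_msg *msg);
};

/* Returns a new descriptor on success, NULL on failure. */
static struct msi_desc *demo_capability_init(const struct demo_hooks *hooks,
					     const struct msi_msg *msg)
{
	struct msi_desc *desc;
	int irq;

	/* Step 1: set up the descriptor. */
	desc = calloc(1, sizeof(*desc));
	if (!desc)
		return NULL;

	/* Step 2: allocate an interrupt number; undo step 1 on failure. */
	irq = hooks->alloc_irq();
	if (irq < 0) {
		free(desc);
		return NULL;
	}
	desc->irq = irq;

	/* Step 3: program address/data through the device hook. */
	hooks->write_message(desc, msg);
	return desc;
}

/* Stub hooks so the sketch can be exercised without hardware. */
static int demo_alloc_irq_ok(void)   { return 32; }
static int demo_alloc_irq_fail(void) { return -1; }

static void demo_write_message(struct msi_desc *desc,
			       const struct msi_msg *msg)
{
	desc->msg = *msg;	/* stand-in for a register write */
}

static const struct demo_hooks demo_hooks_ok = {
	demo_alloc_irq_ok, demo_write_message
};
static const struct demo_hooks demo_hooks_fail = {
	demo_alloc_irq_fail, demo_write_message
};
```

The point of the ordering is that the device is only programmed once an irq exists, so a failed allocation leaves the hardware untouched.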

This series has only been compile-tested; we will test it on x86 and ARM platforms later.

Any comments are welcome.

Thanks!
Yijing.




Yijing Wang (11):
  PCI/MSI: Use pci_dev->msi_cap instead of msi_desc->msi_attrib.pos
  PCI/MSI: Use new MSI type macro instead of PCI MSI flags
  PCI/MSI: Refactor pci_dev_msi_enabled()
  PCI/MSI: Move MSIX table address mapping out of msix_capability_init
  PCI/MSI: Move populate_msi_sysfs() out of msi_capability_init()
  PCI/MSI: Save MSI irq in PCI MSI layer
  PCI/MSI: Mask MSI-X entry in msix_setup_entries()
  PCI/MSI: Introduce new struct msi_irqs and struct msi_ops
  PCI/MSI: refactor PCI MSI driver
  PCI/MSI: Split the generic MSI code into new file
  x86/MSI: Refactor x86 MSI code

 arch/cris/arch-v32/drivers/pci/bios.c     |    2 +-
 arch/frv/mb93090-mb00/pci-vdk.c           |    2 +-
 arch/ia64/pci/pci.c                       |    4 +-
 arch/mips/pci/msi-octeon.c                |    8 +-
 arch/powerpc/kernel/eeh_driver.c          |    2 +-
 arch/powerpc/kernel/msi.c                 |    2 +-
 arch/powerpc/platforms/pseries/msi.c      |    8 +-
 arch/s390/pci/pci.c                       |    2 +-
 arch/x86/include/asm/io_apic.h            |    2 +-
 arch/x86/include/asm/irq_remapping.h      |    4 +-
 arch/x86/include/asm/pci.h                |    6 +-
 arch/x86/include/asm/x86_init.h           |   10 +-
 arch/x86/kernel/apic/io_apic.c            |   25 +-
 arch/x86/kernel/x86_init.c                |   12 +-
 arch/x86/pci/common.c                     |    5 +-
 arch/x86/pci/xen.c                        |   24 +-
 drivers/Kconfig                           |    1 +
 drivers/Makefile                          |    1 +
 drivers/block/nvme-core.c                 |    4 +-
 drivers/dma/ioat/dma.c                    |    2 +-
 drivers/firewire/ohci.c                   |    2 +-
 drivers/gpu/drm/i915/i915_dma.c           |    4 +-
 drivers/iommu/amd_iommu.c                 |   16 +-
 drivers/iommu/intel_irq_remapping.c       |    9 +-
 drivers/iommu/irq_remapping.c             |   53 ++--
 drivers/iommu/irq_remapping.h             |    6 +-
 drivers/irqchip/irq-armada-370-xp.c       |    2 +-
 drivers/misc/mei/hw-me.c                  |    2 +-
 drivers/misc/mei/hw-txe.c                 |    2 +-
 drivers/misc/mei/pci-me.c                 |    4 +-
 drivers/misc/mei/pci-txe.c                |    4 +-
 drivers/misc/mic/host/mic_debugfs.c       |    4 +-
 drivers/misc/mic/host/mic_intr.c          |    8 +-
 drivers/msi/Kconfig                       |    8 +
 drivers/msi/Makefile                      |    1 +
 drivers/msi/msi.c                         |  539 +++++++++++++++++++++++
 drivers/ntb/ntb_hw.c                      |    2 +-
 drivers/pci/Kconfig                       |    6 +-
 drivers/pci/host/pcie-designware.c        |    2 +-
 drivers/pci/irq.c                         |    4 +-
 drivers/pci/msi.c                         |  660 +++++++----------------------
 drivers/pci/pci.c                         |    6 +-
 drivers/pci/pcie/portdrv_core.c           |    4 +-
 drivers/scsi/esas2r/esas2r_init.c         |    4 +-
 drivers/scsi/esas2r/esas2r_ioctl.c        |    4 +-
 drivers/scsi/hpsa.c                       |    4 +-
 drivers/staging/crystalhd/crystalhd_lnx.c |    2 +-
 drivers/xen/xen-pciback/pciback_ops.c     |   12 +-
 include/linux/msi.h                       |   73 +++-
 include/linux/pci.h                       |   24 +-
 virt/kvm/assigned-dev.c                   |    2 +-
 51 files changed, 929 insertions(+), 670 deletions(-)
 create mode 100644 drivers/msi/Kconfig
 create mode 100644 drivers/msi/Makefile
 create mode 100644 drivers/msi/msi.c



* [RFC PATCH 01/11] PCI/MSI: Use pci_dev->msi_cap instead of msi_desc->msi_attrib.pos
  2014-07-26  3:08 [RFC PATCH 00/11] Refactor MSI to support Non-PCI device Yijing Wang
@ 2014-07-26  3:08 ` Yijing Wang
  2014-07-26  3:08 ` [RFC PATCH 02/11] PCI/MSI: Use new MSI type macro instead of PCI MSI flags Yijing Wang
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 41+ messages in thread
From: Yijing Wang @ 2014-07-26  3:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, arnab.basu, virtualization, Hanjun Guo,
	Yijing Wang

PCI devices save the MSI and MSI-X capability offsets in
pci_dev->msi_cap and pci_dev->msix_cap. When accessing a PCI device's
MSI and MSI-X registers, we can use msi_cap and msix_cap in pci_dev
directly, so remove the pos member from msi_attrib.

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
---
 arch/mips/pci/msi-octeon.c         |    4 ++--
 drivers/pci/host/pcie-designware.c |    2 +-
 drivers/pci/msi.c                  |    2 --
 include/linux/msi.h                |    1 -
 4 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/mips/pci/msi-octeon.c b/arch/mips/pci/msi-octeon.c
index ab0c5d1..6a6a99f 100644
--- a/arch/mips/pci/msi-octeon.c
+++ b/arch/mips/pci/msi-octeon.c
@@ -73,7 +73,7 @@ int arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
 	 * wants.  Most devices only want 1, which will give
 	 * configured_private_bits and request_private_bits equal 0.
 	 */
-	pci_read_config_word(dev, desc->msi_attrib.pos + PCI_MSI_FLAGS,
+	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS,
 			     &control);
 
 	/*
@@ -176,7 +176,7 @@ msi_irq_allocated:
 	/* Update the number of IRQs the device has available to it */
 	control &= ~PCI_MSI_FLAGS_QSIZE;
 	control |= request_private_bits << 4;
-	pci_write_config_word(dev, desc->msi_attrib.pos + PCI_MSI_FLAGS,
+	pci_write_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS,
 			      control);
 
 	irq_set_msi_desc(irq, desc);
diff --git a/drivers/pci/host/pcie-designware.c b/drivers/pci/host/pcie-designware.c
index 1eaf4df..04339cd 100644
--- a/drivers/pci/host/pcie-designware.c
+++ b/drivers/pci/host/pcie-designware.c
@@ -335,7 +335,7 @@ static int dw_msi_setup_irq(struct msi_chip *chip, struct pci_dev *pdev,
 		return -EINVAL;
 	}
 
-	pci_read_config_word(pdev, desc->msi_attrib.pos+PCI_MSI_FLAGS,
+	pci_read_config_word(pdev, pdev->msi_cap + PCI_MSI_FLAGS,
 				&msg_ctr);
 	msgvec = (msg_ctr&PCI_MSI_FLAGS_QSIZE) >> 4;
 	if (msgvec == 0)
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 5a40516..e67acd1 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -595,7 +595,6 @@ static struct msi_desc *msi_setup_entry(struct pci_dev *dev)
 	entry->msi_attrib.entry_nr	= 0;
 	entry->msi_attrib.maskbit	= !!(control & PCI_MSI_FLAGS_MASKBIT);
 	entry->msi_attrib.default_irq	= dev->irq;	/* Save IOAPIC IRQ */
-	entry->msi_attrib.pos		= dev->msi_cap;
 	entry->msi_attrib.multi_cap	= (control & PCI_MSI_FLAGS_QMASK) >> 1;
 
 	if (control & PCI_MSI_FLAGS_64BIT)
@@ -699,7 +698,6 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
 		entry->msi_attrib.is_64		= 1;
 		entry->msi_attrib.entry_nr	= entries[i].entry;
 		entry->msi_attrib.default_irq	= dev->irq;
-		entry->msi_attrib.pos		= dev->msix_cap;
 		entry->mask_base		= base;
 
 		list_add_tail(&entry->list, &dev->msi_list);
diff --git a/include/linux/msi.h b/include/linux/msi.h
index 8103f32..ce88c5b 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -29,7 +29,6 @@ struct msi_desc {
 		__u8	multi_cap : 3;	/* log2 num of messages supported */
 		__u8	maskbit	: 1;	/* mask-pending bit supported ? */
 		__u8	is_64	: 1;	/* Address size: 0=32bit 1=64bit */
-		__u8	pos;		/* Location of the msi capability */
 		__u16	entry_nr;	/* specific enabled entry */
 		unsigned default_irq;	/* default pre-assigned irq */
 	} msi_attrib;
-- 
1.7.1



* [RFC PATCH 02/11] PCI/MSI: Use new MSI type macro instead of PCI MSI flags
  2014-07-26  3:08 [RFC PATCH 00/11] Refactor MSI to support Non-PCI device Yijing Wang
  2014-07-26  3:08 ` [RFC PATCH 01/11] PCI/MSI: Use pci_dev->msi_cap instead of msi_desc->msi_attrib.pos Yijing Wang
@ 2014-07-26  3:08 ` Yijing Wang
  2014-07-26  3:08 ` [RFC PATCH 03/11] PCI/MSI: Refactor pci_dev_msi_enabled() Yijing Wang
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 41+ messages in thread
From: Yijing Wang @ 2014-07-26  3:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, arnab.basu, virtualization, Hanjun Guo,
	Yijing Wang

Add new MSI type macros (MSI_TYPE and MSIX_TYPE) to support
the future generic MSI driver. The coming generic MSI driver
will be used by PCI and non-PCI devices that have MSI capability.
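
A small userspace sketch of how the new type values relate to the PCI capability IDs they replace. MSI_TYPE and MSIX_TYPE use the values from this patch; PCI_CAP_ID_MSI (0x05) and PCI_CAP_ID_MSIX (0x11) are the standard capability IDs. The helpers demo_cap_id_to_msi_type() and demo_is_single_msi_type() are hypothetical and only illustrate that the new values occupy distinct bits, so they can be combined with a bitwise OR, whereas the capability IDs share a bit.

```c
#include <stdbool.h>

/* Values introduced by this patch. */
#define MSI_TYPE	0x01
#define MSIX_TYPE	0x02

/* Standard PCI capability IDs. */
#define PCI_CAP_ID_MSI	0x05
#define PCI_CAP_ID_MSIX	0x11

/*
 * Hypothetical helper: map a PCI capability ID to the generic type,
 * returning 0 for any other capability.
 */
static int demo_cap_id_to_msi_type(int cap_id)
{
	switch (cap_id) {
	case PCI_CAP_ID_MSI:
		return MSI_TYPE;
	case PCI_CAP_ID_MSIX:
		return MSIX_TYPE;
	default:
		return 0;
	}
}

/* Hypothetical helper: true if 'type' names exactly one interrupt kind. */
static bool demo_is_single_msi_type(int type)
{
	return type == MSI_TYPE || type == MSIX_TYPE;
}
```

Because 0x05 and 0x11 both have bit 0 set, a mask built from the capability IDs could not distinguish the two kinds; the new values do not have that problem.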

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
---
 arch/mips/pci/msi-octeon.c           |    4 ++--
 arch/powerpc/kernel/msi.c            |    2 +-
 arch/powerpc/platforms/pseries/msi.c |    8 ++++----
 arch/s390/pci/pci.c                  |    2 +-
 arch/x86/kernel/apic/io_apic.c       |    2 +-
 arch/x86/pci/xen.c                   |   24 ++++++++++++------------
 drivers/iommu/irq_remapping.c        |    2 +-
 drivers/irqchip/irq-armada-370-xp.c  |    2 +-
 drivers/pci/msi.c                    |   10 +++++-----
 include/linux/msi.h                  |    3 +++
 10 files changed, 31 insertions(+), 28 deletions(-)

diff --git a/arch/mips/pci/msi-octeon.c b/arch/mips/pci/msi-octeon.c
index 6a6a99f..8105610 100644
--- a/arch/mips/pci/msi-octeon.c
+++ b/arch/mips/pci/msi-octeon.c
@@ -192,14 +192,14 @@ int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 	/*
 	 * MSI-X is not supported.
 	 */
-	if (type == PCI_CAP_ID_MSIX)
+	if (type == MSIX_TYPE)
 		return -EINVAL;
 
 	/*
 	 * If an architecture wants to support multiple MSI, it needs to
 	 * override arch_setup_msi_irqs()
 	 */
-	if (type == PCI_CAP_ID_MSI && nvec > 1)
+	if (type == MSI_TYPE && nvec > 1)
 		return 1;
 
 	list_for_each_entry(entry, &dev->msi_list, list) {
diff --git a/arch/powerpc/kernel/msi.c b/arch/powerpc/kernel/msi.c
index 8bbc12d..05b3133 100644
--- a/arch/powerpc/kernel/msi.c
+++ b/arch/powerpc/kernel/msi.c
@@ -21,7 +21,7 @@ int arch_msi_check_device(struct pci_dev* dev, int nvec, int type)
 	}
 
 	/* PowerPC doesn't support multiple MSI yet */
-	if (type == PCI_CAP_ID_MSI && nvec > 1)
+	if (type == MSI_TYPE && nvec > 1)
 		return 1;
 
 	if (ppc_md.msi_check_device) {
diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
index 0c882e8..e2f27d6 100644
--- a/arch/powerpc/platforms/pseries/msi.c
+++ b/arch/powerpc/platforms/pseries/msi.c
@@ -339,7 +339,7 @@ static int rtas_msi_check_device(struct pci_dev *pdev, int nvec, int type)
 {
 	int quota, rc;
 
-	if (type == PCI_CAP_ID_MSIX)
+	if (type == MSIX_TYPE)
 		rc = check_req_msix(pdev, nvec);
 	else
 		rc = check_req_msi(pdev, nvec);
@@ -406,14 +406,14 @@ static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
 	if (!pdn)
 		return -ENODEV;
 
-	if (type == PCI_CAP_ID_MSIX && check_msix_entries(pdev))
+	if (type == MSIX_TYPE && check_msix_entries(pdev))
 		return -EINVAL;
 
 	/*
 	 * Firmware currently refuse any non power of two allocation
 	 * so we round up if the quota will allow it.
 	 */
-	if (type == PCI_CAP_ID_MSIX) {
+	if (type == MSIX_TYPE) {
 		int m = roundup_pow_of_two(nvec);
 		int quota = msi_quota_for_device(pdev, m);
 
@@ -427,7 +427,7 @@ static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
 	 * return MSI-Xs.
 	 */
 again:
-	if (type == PCI_CAP_ID_MSI) {
+	if (type == MSI_TYPE) {
 		if (pdn->force_32bit_msi) {
 			rc = rtas_change_msi(pdn, RTAS_CHANGE_32MSI_FN, nvec);
 			if (rc < 0) {
diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index 9ddc51e..fe3a40c 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -407,7 +407,7 @@ int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
 	struct msi_msg msg;
 	int rc, irq;
 
-	if (type == PCI_CAP_ID_MSI && nvec > 1)
+	if (type == MSI_TYPE && nvec > 1)
 		return 1;
 	msi_vecs = min(nvec, ZPCI_MSI_VEC_MAX);
 	msi_vecs = min_t(unsigned int, msi_vecs, CONFIG_PCI_NR_MSI);
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 81e08ef..b833042 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -3069,7 +3069,7 @@ int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 	int node, ret;
 
 	/* Multiple MSI vectors only supported with interrupt remapping */
-	if (type == PCI_CAP_ID_MSI && nvec > 1)
+	if (type == MSI_TYPE && nvec > 1)
 		return 1;
 
 	node = dev_to_node(&dev->dev);
diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index 905956f..c19a8de 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -162,14 +162,14 @@ static int xen_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 	struct msi_desc *msidesc;
 	int *v;
 
-	if (type == PCI_CAP_ID_MSI && nvec > 1)
+	if (type == MSI_TYPE && nvec > 1)
 		return 1;
 
 	v = kzalloc(sizeof(int) * max(1, nvec), GFP_KERNEL);
 	if (!v)
 		return -ENOMEM;
 
-	if (type == PCI_CAP_ID_MSIX)
+	if (type == MSIX_TYPE)
 		ret = xen_pci_frontend_enable_msix(dev, v, nvec);
 	else
 		ret = xen_pci_frontend_enable_msi(dev, v);
@@ -178,8 +178,8 @@ static int xen_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 	i = 0;
 	list_for_each_entry(msidesc, &dev->msi_list, list) {
 		irq = xen_bind_pirq_msi_to_irq(dev, msidesc, v[i],
-					       (type == PCI_CAP_ID_MSI) ? nvec : 1,
-					       (type == PCI_CAP_ID_MSIX) ?
+					       (type == MSI_TYPE) ? nvec : 1,
+					       (type == MSIX_TYPE) ?
 					       "pcifront-msi-x" :
 					       "pcifront-msi",
 						DOMID_SELF);
@@ -224,7 +224,7 @@ static int xen_hvm_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 	struct msi_desc *msidesc;
 	struct msi_msg msg;
 
-	if (type == PCI_CAP_ID_MSI && nvec > 1)
+	if (type == MSI_TYPE && nvec > 1)
 		return 1;
 
 	list_for_each_entry(msidesc, &dev->msi_list, list) {
@@ -246,8 +246,8 @@ static int xen_hvm_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 				"xen: msi already bound to pirq=%d\n", pirq);
 		}
 		irq = xen_bind_pirq_msi_to_irq(dev, msidesc, pirq,
-					       (type == PCI_CAP_ID_MSI) ? nvec : 1,
-					       (type == PCI_CAP_ID_MSIX) ?
+					       (type == MSI_TYPE) ? nvec : 1,
+					       (type == MSIX_TYPE) ?
 					       "msi-x" : "msi",
 					       DOMID_SELF);
 		if (irq < 0)
@@ -290,10 +290,10 @@ static int xen_initdom_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 			      (pci_domain_nr(dev->bus) << 16);
 		map_irq.devfn = dev->devfn;
 
-		if (type == PCI_CAP_ID_MSI && nvec > 1) {
+		if (type == MSI_TYPE && nvec > 1) {
 			map_irq.type = MAP_PIRQ_TYPE_MULTI_MSI;
 			map_irq.entry_nr = nvec;
-		} else if (type == PCI_CAP_ID_MSIX) {
+		} else if (type == MSIX_TYPE) {
 			int pos;
 			u32 table_offset, bir;
 
@@ -310,7 +310,7 @@ static int xen_initdom_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 		if (pci_seg_supported)
 			ret = HYPERVISOR_physdev_op(PHYSDEVOP_map_pirq,
 						    &map_irq);
-		if (type == PCI_CAP_ID_MSI && nvec > 1 && ret) {
+		if (type == MSI_TYPE && nvec > 1 && ret) {
 			/*
 			 * If MAP_PIRQ_TYPE_MULTI_MSI is not available
 			 * there's nothing else we can do in this case.
@@ -337,8 +337,8 @@ static int xen_initdom_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 		}
 
 		ret = xen_bind_pirq_msi_to_irq(dev, msidesc, map_irq.pirq,
-		                               (type == PCI_CAP_ID_MSI) ? nvec : 1,
-		                               (type == PCI_CAP_ID_MSIX) ? "msi-x" : "msi",
+		                               (type == MSI_TYPE) ? nvec : 1,
+		                               (type == MSIX_TYPE) ? "msi-x" : "msi",
 		                               domid);
 		if (ret < 0)
 			goto out;
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index 33c4395..a3b1805 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -142,7 +142,7 @@ error:
 static int irq_remapping_setup_msi_irqs(struct pci_dev *dev,
 					int nvec, int type)
 {
-	if (type == PCI_CAP_ID_MSI)
+	if (type == MSI_TYPE)
 		return do_setup_msi_irqs(dev, nvec);
 	else
 		return do_setup_msix_irqs(dev, nvec);
diff --git a/drivers/irqchip/irq-armada-370-xp.c b/drivers/irqchip/irq-armada-370-xp.c
index c887e6e..249823b 100644
--- a/drivers/irqchip/irq-armada-370-xp.c
+++ b/drivers/irqchip/irq-armada-370-xp.c
@@ -170,7 +170,7 @@ static int armada_370_xp_check_msi_device(struct msi_chip *chip, struct pci_dev
 					  int nvec, int type)
 {
 	/* We support MSI, but not MSI-X */
-	if (type == PCI_CAP_ID_MSI)
+	if (type == MSI_TYPE)
 		return 0;
 	return -EINVAL;
 }
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index e67acd1..e416dc0 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -75,7 +75,7 @@ int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 	 * If an architecture wants to support multiple MSI, it needs to
 	 * override arch_setup_msi_irqs()
 	 */
-	if (type == PCI_CAP_ID_MSI && nvec > 1)
+	if (type == MSI_TYPE && nvec > 1)
 		return 1;
 
 	list_for_each_entry(entry, &dev->msi_list, list) {
@@ -639,7 +639,7 @@ static int msi_capability_init(struct pci_dev *dev, int nvec)
 	list_add_tail(&entry->list, &dev->msi_list);
 
 	/* Configure MSI capability structure */
-	ret = arch_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSI);
+	ret = arch_setup_msi_irqs(dev, nvec, MSI_TYPE);
 	if (ret) {
 		msi_mask_irq(entry, mask, ~mask);
 		free_msi_irqs(dev);
@@ -754,7 +754,7 @@ static int msix_capability_init(struct pci_dev *dev,
 	if (ret)
 		return ret;
 
-	ret = arch_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX);
+	ret = arch_setup_msi_irqs(dev, nvec, MSIX_TYPE);
 	if (ret)
 		goto out_avail;
 
@@ -950,7 +950,7 @@ int pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries, int nvec)
 	if (!entries || !dev->msix_cap || dev->current_state != PCI_D0)
 		return -EINVAL;
 
-	status = pci_msi_check_device(dev, nvec, PCI_CAP_ID_MSIX);
+	status = pci_msi_check_device(dev, nvec, MSIX_TYPE);
 	if (status)
 		return status;
 
@@ -1084,7 +1084,7 @@ int pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec)
 		nvec = maxvec;
 
 	do {
-		rc = pci_msi_check_device(dev, nvec, PCI_CAP_ID_MSI);
+		rc = pci_msi_check_device(dev, nvec, MSI_TYPE);
 		if (rc < 0) {
 			return rc;
 		} else if (rc > 0) {
diff --git a/include/linux/msi.h b/include/linux/msi.h
index ce88c5b..3ad8416 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -67,6 +67,9 @@ void default_restore_msi_irqs(struct pci_dev *dev);
 u32 default_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag);
 u32 default_msix_mask_irq(struct msi_desc *desc, u32 flag);
 
+#define MSI_TYPE	0x01
+#define MSIX_TYPE	0x02
+
 struct msi_chip {
 	struct module *owner;
 	struct device *dev;
-- 
1.7.1



* [RFC PATCH 03/11] PCI/MSI: Refactor pci_dev_msi_enabled()
  2014-07-26  3:08 [RFC PATCH 00/11] Refactor MSI to support Non-PCI device Yijing Wang
  2014-07-26  3:08 ` [RFC PATCH 01/11] PCI/MSI: Use pci_dev->msi_cap instead of msi_desc->msi_attrib.pos Yijing Wang
  2014-07-26  3:08 ` [RFC PATCH 02/11] PCI/MSI: Use new MSI type macro instead of PCI MSI flags Yijing Wang
@ 2014-07-26  3:08 ` Yijing Wang
  2014-08-05 22:35   ` Stuart Yoder
  2014-08-20  5:57   ` Bharat.Bhushan
  2014-07-26  3:08 ` [RFC PATCH 04/11] PCI/MSI: Move MSIX table address mapping out of msix_capability_init Yijing Wang
                   ` (9 subsequent siblings)
  12 siblings, 2 replies; 41+ messages in thread
From: Yijing Wang @ 2014-07-26  3:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, arnab.basu, virtualization, Hanjun Guo,
	Yijing Wang

pci_dev_msi_enabled() is used to check whether a device has
MSI or MSI-X enabled. Refactor this function to support
checking for MSI only, MSI-X only, or either.
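
The call sites in this patch pass MSI_TYPE, MSIX_TYPE, or MSI_TYPE | MSIX_TYPE as the second argument. One way such a check could behave is sketched below in userspace; this is an assumption about the semantics, not the definition from include/linux/pci.h, and demo_pci_dev and demo_dev_msi_enabled() are hypothetical names.

```c
#include <stdbool.h>

#define MSI_TYPE	0x01
#define MSIX_TYPE	0x02

/* Hypothetical stand-in for the two enable bits in struct pci_dev. */
struct demo_pci_dev {
	unsigned int msi_enabled:1;
	unsigned int msix_enabled:1;
};

/*
 * Returns true if any interrupt kind selected by 'type' is enabled.
 * 'type' is a bitmask of MSI_TYPE and/or MSIX_TYPE.
 */
static bool demo_dev_msi_enabled(const struct demo_pci_dev *dev, int type)
{
	if ((type & MSI_TYPE) && dev->msi_enabled)
		return true;
	if ((type & MSIX_TYPE) && dev->msix_enabled)
		return true;
	return false;
}
```

Under this reading, a former `dev->msi_enabled || dev->msix_enabled` test corresponds to passing MSI_TYPE | MSIX_TYPE, and a former `dev->msi_enabled` test corresponds to passing MSI_TYPE alone.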

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
---
 arch/cris/arch-v32/drivers/pci/bios.c     |    2 +-
 arch/frv/mb93090-mb00/pci-vdk.c           |    2 +-
 arch/ia64/pci/pci.c                       |    4 ++--
 arch/powerpc/kernel/eeh_driver.c          |    2 +-
 arch/x86/pci/common.c                     |    5 +++--
 drivers/block/nvme-core.c                 |    4 ++--
 drivers/dma/ioat/dma.c                    |    2 +-
 drivers/firewire/ohci.c                   |    2 +-
 drivers/gpu/drm/i915/i915_dma.c           |    4 ++--
 drivers/misc/mei/hw-me.c                  |    2 +-
 drivers/misc/mei/hw-txe.c                 |    2 +-
 drivers/misc/mei/pci-me.c                 |    4 ++--
 drivers/misc/mei/pci-txe.c                |    4 ++--
 drivers/misc/mic/host/mic_debugfs.c       |    4 ++--
 drivers/misc/mic/host/mic_intr.c          |    8 ++++----
 drivers/ntb/ntb_hw.c                      |    2 +-
 drivers/pci/irq.c                         |    4 ++--
 drivers/pci/msi.c                         |   15 +++++++++------
 drivers/pci/pci.c                         |    6 +++---
 drivers/pci/pcie/portdrv_core.c           |    4 ++--
 drivers/scsi/esas2r/esas2r_init.c         |    4 ++--
 drivers/scsi/esas2r/esas2r_ioctl.c        |    4 ++--
 drivers/scsi/hpsa.c                       |    4 ++--
 drivers/staging/crystalhd/crystalhd_lnx.c |    2 +-
 drivers/xen/xen-pciback/pciback_ops.c     |   12 ++++++------
 include/linux/pci.h                       |   12 ++++++++++--
 virt/kvm/assigned-dev.c                   |    2 +-
 27 files changed, 67 insertions(+), 55 deletions(-)

diff --git a/arch/cris/arch-v32/drivers/pci/bios.c b/arch/cris/arch-v32/drivers/pci/bios.c
index 64a5fb9..d9d8332 100644
--- a/arch/cris/arch-v32/drivers/pci/bios.c
+++ b/arch/cris/arch-v32/drivers/pci/bios.c
@@ -93,7 +93,7 @@ int pcibios_enable_device(struct pci_dev *dev, int mask)
 	if ((err = pcibios_enable_resources(dev, mask)) < 0)
 		return err;
 
-	if (!dev->msi_enabled)
+	if (!pci_dev_msi_enabled(dev, MSI_TYPE))
 		pcibios_enable_irq(dev);
 	return 0;
 }
diff --git a/arch/frv/mb93090-mb00/pci-vdk.c b/arch/frv/mb93090-mb00/pci-vdk.c
index efa5d65..b96c128 100644
--- a/arch/frv/mb93090-mb00/pci-vdk.c
+++ b/arch/frv/mb93090-mb00/pci-vdk.c
@@ -409,7 +409,7 @@ int pcibios_enable_device(struct pci_dev *dev, int mask)
 
 	if ((err = pci_enable_resources(dev, mask)) < 0)
 		return err;
-	if (!dev->msi_enabled)
+	if (!pci_dev_msi_enabled(dev, MSI_TYPE))
 		pcibios_enable_irq(dev);
 	return 0;
 }
diff --git a/arch/ia64/pci/pci.c b/arch/ia64/pci/pci.c
index 291a582..da8ddff 100644
--- a/arch/ia64/pci/pci.c
+++ b/arch/ia64/pci/pci.c
@@ -568,7 +568,7 @@ pcibios_enable_device (struct pci_dev *dev, int mask)
 	if (ret < 0)
 		return ret;
 
-	if (!dev->msi_enabled)
+	if (!pci_dev_msi_enabled(dev, MSI_TYPE))
 		return acpi_pci_irq_enable(dev);
 	return 0;
 }
@@ -577,7 +577,7 @@ void
 pcibios_disable_device (struct pci_dev *dev)
 {
 	BUG_ON(atomic_read(&dev->enable_cnt));
-	if (!dev->msi_enabled)
+	if (!pci_dev_msi_enabled(dev, MSI_TYPE))
 		acpi_pci_irq_disable(dev);
 }
 
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 420da61..e3f2074 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -123,7 +123,7 @@ static void eeh_disable_irq(struct pci_dev *dev)
 	 * effectively disabled by the DMA Stopped state
 	 * when an EEH error occurs.
 	 */
-	if (dev->msi_enabled || dev->msix_enabled)
+	if (pci_dev_msi_enabled(dev, MSI_TYPE | MSIX_TYPE))
 		return;
 
 	if (!irq_has_action(dev->irq))
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 059a76c..4597940 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -662,14 +662,15 @@ int pcibios_enable_device(struct pci_dev *dev, int mask)
 	if ((err = pci_enable_resources(dev, mask)) < 0)
 		return err;
 
-	if (!pci_dev_msi_enabled(dev))
+	if (!pci_dev_msi_enabled(dev, MSI_TYPE | MSIX_TYPE))
 		return pcibios_enable_irq(dev);
 	return 0;
 }
 
 void pcibios_disable_device (struct pci_dev *dev)
 {
-	if (!pci_dev_msi_enabled(dev) && pcibios_disable_irq)
+	if (!pci_dev_msi_enabled(dev, MSI_TYPE | MSIX_TYPE) 
+			&& pcibios_disable_irq)
 		pcibios_disable_irq(dev);
 }
 
diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index 02351e2..f96b90f 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -2325,9 +2325,9 @@ static int nvme_dev_map(struct nvme_dev *dev)
 
 static void nvme_dev_unmap(struct nvme_dev *dev)
 {
-	if (dev->pci_dev->msi_enabled)
+	if (pci_dev_msi_enabled(dev->pci_dev, MSI_TYPE))
 		pci_disable_msi(dev->pci_dev);
-	else if (dev->pci_dev->msix_enabled)
+	else if (pci_dev_msi_enabled(dev->pci_dev, MSIX_TYPE))
 		pci_disable_msix(dev->pci_dev);
 
 	if (dev->bar) {
diff --git a/drivers/dma/ioat/dma.c b/drivers/dma/ioat/dma.c
index 4e3549a..a11dac1 100644
--- a/drivers/dma/ioat/dma.c
+++ b/drivers/dma/ioat/dma.c
@@ -1088,7 +1088,7 @@ static void ioat1_intr_quirk(struct ioatdma_device *device)
 	u32 dmactrl;
 
 	pci_read_config_dword(pdev, IOAT_PCI_DMACTRL_OFFSET, &dmactrl);
-	if (pdev->msi_enabled)
+	if (pci_dev_msi_enabled(pdev, MSI_TYPE))
 		dmactrl |= IOAT_PCI_DMACTRL_MSI_EN;
 	else
 		dmactrl &= ~IOAT_PCI_DMACTRL_MSI_EN;
diff --git a/drivers/firewire/ohci.c b/drivers/firewire/ohci.c
index 5798541..ec0a794 100644
--- a/drivers/firewire/ohci.c
+++ b/drivers/firewire/ohci.c
@@ -3705,7 +3705,7 @@ static int pci_probe(struct pci_dev *dev,
 	if (!(ohci->quirks & QUIRK_NO_MSI))
 		pci_enable_msi(dev);
 	if (request_irq(dev->irq, irq_handler,
-			pci_dev_msi_enabled(dev) ? 0 : IRQF_SHARED,
+			pci_dev_msi_enabled(dev, MSI_TYPE | MSIX_TYPE) ? 0 : IRQF_SHARED,
 			ohci_driver_name, ohci)) {
 		ohci_err(ohci, "failed to allocate interrupt %d\n", dev->irq);
 		err = -EIO;
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 4c22a5b..0c248fe 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1745,7 +1745,7 @@ out_gem_unload:
 	WARN_ON(unregister_oom_notifier(&dev_priv->mm.oom_notifier));
 	unregister_shrinker(&dev_priv->mm.shrinker);
 
-	if (dev->pdev->msi_enabled)
+	if (pci_dev_msi_enabled(dev->pdev, MSI_TYPE))
 		pci_disable_msi(dev->pdev);
 
 	intel_teardown_gmbus(dev);
@@ -1826,7 +1826,7 @@ int i915_driver_unload(struct drm_device *dev)
 	cancel_work_sync(&dev_priv->gpu_error.work);
 	i915_destroy_error_state(dev);
 
-	if (dev->pdev->msi_enabled)
+	if (pci_dev_msi_enabled(dev->pdev, MSI_TYPE))
 		pci_disable_msi(dev->pdev);
 
 	intel_opregion_fini(dev);
diff --git a/drivers/misc/mei/hw-me.c b/drivers/misc/mei/hw-me.c
index 6a2d272..d7595d4 100644
--- a/drivers/misc/mei/hw-me.c
+++ b/drivers/misc/mei/hw-me.c
@@ -647,7 +647,7 @@ irqreturn_t mei_me_irq_thread_handler(int irq, void *dev_id)
 
 	/* Ack the interrupt here
 	 * In case of MSI we don't go through the quick handler */
-	if (pci_dev_msi_enabled(dev->pdev))
+	if (pci_dev_msi_enabled(dev->pdev, MSI_TYPE | MSIX_TYPE))
 		mei_clear_interrupts(dev);
 
 	/* check if ME wants a reset */
diff --git a/drivers/misc/mei/hw-txe.c b/drivers/misc/mei/hw-txe.c
index 9327378..8c2d95c 100644
--- a/drivers/misc/mei/hw-txe.c
+++ b/drivers/misc/mei/hw-txe.c
@@ -951,7 +951,7 @@ irqreturn_t mei_txe_irq_thread_handler(int irq, void *dev_id)
 	mutex_lock(&dev->device_lock);
 	mei_io_list_init(&complete_list);
 
-	if (pci_dev_msi_enabled(dev->pdev))
+	if (pci_dev_msi_enabled(dev->pdev, MSI_TYPE | MSIX_TYPE))
 		mei_txe_check_and_ack_intrs(dev, true);
 
 	/* show irq events */
diff --git a/drivers/misc/mei/pci-me.c b/drivers/misc/mei/pci-me.c
index 1b46c64..283fc09 100644
--- a/drivers/misc/mei/pci-me.c
+++ b/drivers/misc/mei/pci-me.c
@@ -181,7 +181,7 @@ static int mei_me_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	pci_enable_msi(pdev);
 
 	 /* request and enable interrupt */
-	if (pci_dev_msi_enabled(pdev))
+	if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE))
 		err = request_threaded_irq(pdev->irq,
 			NULL,
 			mei_me_irq_thread_handler,
@@ -329,7 +329,7 @@ static int mei_me_pci_resume(struct device *device)
 	pci_enable_msi(pdev);
 
 	/* request and enable interrupt */
-	if (pci_dev_msi_enabled(pdev))
+	if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE))
 		err = request_threaded_irq(pdev->irq,
 			NULL,
 			mei_me_irq_thread_handler,
diff --git a/drivers/misc/mei/pci-txe.c b/drivers/misc/mei/pci-txe.c
index 2343c62..a3bf202 100644
--- a/drivers/misc/mei/pci-txe.c
+++ b/drivers/misc/mei/pci-txe.c
@@ -124,7 +124,7 @@ static int mei_txe_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	mei_clear_interrupts(dev);
 
 	/* request and enable interrupt  */
-	if (pci_dev_msi_enabled(pdev))
+	if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE))
 		err = request_threaded_irq(pdev->irq,
 			NULL,
 			mei_txe_irq_thread_handler,
@@ -272,7 +272,7 @@ static int mei_txe_pci_resume(struct device *device)
 	mei_clear_interrupts(dev);
 
 	/* request and enable interrupt */
-	if (pci_dev_msi_enabled(pdev))
+	if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE))
 		err = request_threaded_irq(pdev->irq,
 			NULL,
 			mei_txe_irq_thread_handler,
diff --git a/drivers/misc/mic/host/mic_debugfs.c b/drivers/misc/mic/host/mic_debugfs.c
index 028ba5d..6e1a553 100644
--- a/drivers/misc/mic/host/mic_debugfs.c
+++ b/drivers/misc/mic/host/mic_debugfs.c
@@ -376,9 +376,9 @@ static int mic_msi_irq_info_show(struct seq_file *s, void *pos)
 	struct pci_dev *pdev = container_of(mdev->sdev->parent,
 		struct pci_dev, dev);
 
-	if (pci_dev_msi_enabled(pdev)) {
+	if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE)) {
 		for (i = 0; i < mdev->irq_info.num_vectors; i++) {
-			if (pdev->msix_enabled) {
+			if (pci_dev_msi_enabled(pdev, MSIX_TYPE)) {
 				entry = mdev->irq_info.msix_entries[i].entry;
 				vector = mdev->irq_info.msix_entries[i].vector;
 			} else {
diff --git a/drivers/misc/mic/host/mic_intr.c b/drivers/misc/mic/host/mic_intr.c
index dbc5afd..9eab900 100644
--- a/drivers/misc/mic/host/mic_intr.c
+++ b/drivers/misc/mic/host/mic_intr.c
@@ -468,7 +468,7 @@ struct mic_irq *mic_request_irq(struct mic_device *mdev,
 		}
 
 		entry = 0;
-		if (pci_dev_msi_enabled(pdev)) {
+		if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE)) {
 			mdev->irq_info.mic_msi_map[entry] |= (1 << offset);
 			mdev->intr_ops->program_msi_to_src_map(mdev,
 				entry, offset, true);
@@ -526,7 +526,7 @@ void mic_free_irq(struct mic_device *mdev,
 			dev_warn(mdev->sdev->parent, "Error unregistering callback\n");
 			return;
 		}
-		if (pci_dev_msi_enabled(pdev)) {
+		if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE)) {
 			mdev->irq_info.mic_msi_map[entry] &= ~(BIT(src_id));
 			mdev->intr_ops->program_msi_to_src_map(mdev,
 				entry, src_id, false);
@@ -589,7 +589,7 @@ void mic_free_interrupts(struct mic_device *mdev, struct pci_dev *pdev)
 		kfree(mdev->irq_info.msix_entries);
 		pci_disable_msix(pdev);
 	} else {
-		if (pci_dev_msi_enabled(pdev)) {
+		if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE)) {
 			free_irq(pdev->irq, mdev);
 			kfree(mdev->irq_info.mic_msi_map);
 			pci_disable_msi(pdev);
@@ -617,7 +617,7 @@ void mic_intr_restore(struct mic_device *mdev)
 	struct pci_dev *pdev = container_of(mdev->sdev->parent,
 		struct pci_dev, dev);
 
-	if (!pci_dev_msi_enabled(pdev))
+	if (!pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE))
 		return;
 
 	for (entry = 0; entry < mdev->irq_info.num_vectors; entry++) {
diff --git a/drivers/ntb/ntb_hw.c b/drivers/ntb/ntb_hw.c
index 372e08c..868f685 100644
--- a/drivers/ntb/ntb_hw.c
+++ b/drivers/ntb/ntb_hw.c
@@ -1306,7 +1306,7 @@ static void ntb_free_interrupts(struct ntb_device *ndev)
 	} else {
 		free_irq(pdev->irq, ndev);
 
-		if (pci_dev_msi_enabled(pdev))
+		if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE))
 			pci_disable_msi(pdev);
 	}
 }
diff --git a/drivers/pci/irq.c b/drivers/pci/irq.c
index 6684f15..e3e3293 100644
--- a/drivers/pci/irq.c
+++ b/drivers/pci/irq.c
@@ -36,10 +36,10 @@ static void pci_note_irq_problem(struct pci_dev *pdev, const char *reason)
  */
 enum pci_lost_interrupt_reason pci_lost_interrupt(struct pci_dev *pdev)
 {
-	if (pdev->msi_enabled || pdev->msix_enabled) {
+	if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE)) {
 		enum pci_lost_interrupt_reason ret;
 
-		if (pdev->msix_enabled) {
+		if (pci_dev_msi_enabled(pdev, MSIX_TYPE)) {
 			pci_note_irq_problem(pdev, "MSIX routing failure");
 			ret = PCI_LOST_IRQ_DISABLE_MSIX;
 		} else {
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index e416dc0..d5c8e56 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -125,7 +125,7 @@ static void default_restore_msi_irq(struct pci_dev *dev, int irq)
 			if (irq == entry->irq)
 				break;
 		}
-	} else if (dev->msi_enabled)  {
+	} else if (pci_dev_msi_enabled(dev, MSI_TYPE))  {
 		entry = irq_get_msi_desc(irq);
 	}
 
@@ -439,7 +439,7 @@ static void __pci_restore_msi_state(struct pci_dev *dev)
 	u16 control;
 	struct msi_desc *entry;
 
-	if (!dev->msi_enabled)
+	if (!pci_dev_msi_enabled(dev, MSI_TYPE))
 		return;
 
 	entry = irq_get_msi_desc(dev->irq);
@@ -878,7 +878,8 @@ void pci_msi_shutdown(struct pci_dev *dev)
 	struct msi_desc *desc;
 	u32 mask;
 
-	if (!pci_msi_enable || !dev || !dev->msi_enabled)
+	if (!pci_msi_enable || !dev ||
+			!pci_dev_msi_enabled(dev, MSI_TYPE))
 		return;
 
 	BUG_ON(list_empty(&dev->msi_list));
@@ -899,7 +900,8 @@ void pci_msi_shutdown(struct pci_dev *dev)
 
 void pci_disable_msi(struct pci_dev *dev)
 {
-	if (!pci_msi_enable || !dev || !dev->msi_enabled)
+	if (!pci_msi_enable || !dev ||
+			!pci_dev_msi_enabled(dev, MSI_TYPE))
 		return;
 
 	pci_msi_shutdown(dev);
@@ -972,7 +974,7 @@ int pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries, int nvec)
 	WARN_ON(!!dev->msix_enabled);
 
 	/* Check whether driver already requested for MSI irq */
-	if (dev->msi_enabled) {
+	if (pci_dev_msi_enabled(dev, MSI_TYPE)) {
 		dev_info(&dev->dev, "can't enable MSI-X (MSI IRQ already assigned)\n");
 		return -EINVAL;
 	}
@@ -1001,7 +1003,8 @@ void pci_msix_shutdown(struct pci_dev *dev)
 
 void pci_disable_msix(struct pci_dev *dev)
 {
-	if (!pci_msi_enable || !dev || !dev->msix_enabled)
+	if (!pci_msi_enable || !dev ||
+			!pci_dev_msi_enabled(dev, MSIX_TYPE))
 		return;
 
 	pci_msix_shutdown(dev);
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 74043a2..6e9e7bd 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1206,7 +1206,7 @@ static int do_pci_enable_device(struct pci_dev *dev, int bars)
 		return err;
 	pci_fixup_device(pci_fixup_enable, dev);
 
-	if (dev->msi_enabled || dev->msix_enabled)
+	if (pci_dev_msi_enabled(dev, MSI_TYPE | MSIX_TYPE))
 		return 0;
 
 	pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
@@ -1361,9 +1361,9 @@ static void pcim_release(struct device *gendev, void *res)
 	struct pci_devres *this = res;
 	int i;
 
-	if (dev->msi_enabled)
+	if (pci_dev_msi_enabled(dev, MSI_TYPE))
 		pci_disable_msi(dev);
-	if (dev->msix_enabled)
+	if (pci_dev_msi_enabled(dev, MSIX_TYPE))
 		pci_disable_msix(dev);
 
 	for (i = 0; i < DEVICE_COUNT_RESOURCE; i++)
diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index 2f0ce66..7a1b6ec 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -235,9 +235,9 @@ static int init_service_irqs(struct pci_dev *dev, int *irqs, int mask)
 
 static void cleanup_service_irqs(struct pci_dev *dev)
 {
-	if (dev->msix_enabled)
+	if (pci_dev_msi_enabled(dev, MSIX_TYPE))
 		pci_disable_msix(dev);
-	else if (dev->msi_enabled)
+	else if (pci_dev_msi_enabled(dev, MSI_TYPE))
 		pci_disable_msi(dev);
 }
 
diff --git a/drivers/scsi/esas2r/esas2r_init.c b/drivers/scsi/esas2r/esas2r_init.c
index 6776931..444f64d 100644
--- a/drivers/scsi/esas2r/esas2r_init.c
+++ b/drivers/scsi/esas2r/esas2r_init.c
@@ -617,8 +617,8 @@ void esas2r_kill_adapter(int i)
 			       &(a->pcid->dev),
 			       "pci_disable_device() called.  msix_enabled: %d "
 			       "msi_enabled: %d irq: %d pin: %d",
-			       a->pcid->msix_enabled,
-			       a->pcid->msi_enabled,
+			       pci_dev_msi_enabled(a->pcid, MSIX_TYPE),
+			       pci_dev_msi_enabled(a->pcid, MSI_TYPE),
 			       a->pcid->irq,
 			       a->pcid->pin);
 
diff --git a/drivers/scsi/esas2r/esas2r_ioctl.c b/drivers/scsi/esas2r/esas2r_ioctl.c
index d89a027..31e06bd 100644
--- a/drivers/scsi/esas2r/esas2r_ioctl.c
+++ b/drivers/scsi/esas2r/esas2r_ioctl.c
@@ -810,9 +810,9 @@ static int hba_ioctl_callback(struct esas2r_adapter *a,
 
 		gai->pci.msi_vector_cnt = 1;
 
-		if (a->pcid->msix_enabled)
+		if (pci_dev_msi_enabled(a->pcid, MSIX_TYPE))
 			gai->pci.interrupt_mode = ATTO_GAI_PCIIM_MSIX;
-		else if (a->pcid->msi_enabled)
+		else if (pci_dev_msi_enabled(a->pcid, MSI_TYPE))
 			gai->pci.interrupt_mode = ATTO_GAI_PCIIM_MSI;
 		else
 			gai->pci.interrupt_mode = ATTO_GAI_PCIIM_LEGACY;
diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 31184b3..964d809 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -6707,10 +6707,10 @@ static void hpsa_free_irqs_and_disable_msix(struct ctlr_info *h)
 	free_irqs(h);
 #ifdef CONFIG_PCI_MSI
 	if (h->msix_vector) {
-		if (h->pdev->msix_enabled)
+		if (pci_dev_msi_enabled(h->pdev, MSIX_TYPE))
 			pci_disable_msix(h->pdev);
 	} else if (h->msi_vector) {
-		if (h->pdev->msi_enabled)
+		if (pci_dev_msi_enabled(h->pdev, MSI_TYPE))
 			pci_disable_msi(h->pdev);
 	}
 #endif /* CONFIG_PCI_MSI */
diff --git a/drivers/staging/crystalhd/crystalhd_lnx.c b/drivers/staging/crystalhd/crystalhd_lnx.c
index e6fb331..9459b42 100644
--- a/drivers/staging/crystalhd/crystalhd_lnx.c
+++ b/drivers/staging/crystalhd/crystalhd_lnx.c
@@ -45,7 +45,7 @@ static int chd_dec_enable_int(struct crystalhd_adp *adp)
 		return -EINVAL;
 	}
 
-	if (adp->pdev->msi_enabled)
+	if (pci_dev_msi_enabled(adp->pdev, MSI_TYPE))
 		adp->msi = 1;
 	else
 		adp->msi = pci_enable_msi(adp->pdev);
diff --git a/drivers/xen/xen-pciback/pciback_ops.c b/drivers/xen/xen-pciback/pciback_ops.c
index c4a0666..fee2f19 100644
--- a/drivers/xen/xen-pciback/pciback_ops.c
+++ b/drivers/xen/xen-pciback/pciback_ops.c
@@ -64,8 +64,8 @@ static void xen_pcibk_control_isr(struct pci_dev *dev, int reset)
 		dev_data->irq_name,
 		dev_data->irq,
 		pci_is_enabled(dev) ? "on" : "off",
-		dev->msi_enabled ? "MSI" : "",
-		dev->msix_enabled ? "MSI/X" : "",
+		pci_dev_msi_enabled(dev, MSI_TYPE) ? "MSI" : "",
+		pci_dev_msi_enabled(dev, MSIX_TYPE) ? "MSI/X" : "",
 		dev_data->isr_on ? "enable" : "disable",
 		enable ? "enable" : "disable");
 
@@ -90,8 +90,8 @@ out:
 		dev_data->irq_name,
 		dev_data->irq,
 		pci_is_enabled(dev) ? "on" : "off",
-		dev->msi_enabled ? "MSI" : "",
-		dev->msix_enabled ? "MSI/X" : "",
+		pci_dev_msi_enabled(dev, MSI_TYPE) ? "MSI" : "",
+		pci_dev_msi_enabled(dev, MSIX_TYPE) ? "MSI/X" : "",
 		enable ? (dev_data->isr_on ? "enabled" : "failed to enable") :
 			(dev_data->isr_on ? "failed to disable" : "disabled"));
 }
@@ -111,9 +111,9 @@ void xen_pcibk_reset_device(struct pci_dev *dev)
 #ifdef CONFIG_PCI_MSI
 		/* The guest could have been abruptly killed without
 		 * disabling MSI/MSI-X interrupts.*/
-		if (dev->msix_enabled)
+		if (pci_dev_msi_enabled(dev, MSIX_TYPE))
 			pci_disable_msix(dev);
-		if (dev->msi_enabled)
+		if (pci_dev_msi_enabled(dev, MSI_TYPE))
 			pci_disable_msi(dev);
 #endif
 		if (pci_is_enabled(dev))
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 6ed3647..c6c01ae 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -33,6 +33,7 @@
 
 #include <linux/pci_ids.h>
 
+#include <linux/msi.h>
 /*
  * The PCI interface treats multi-function devices as independent
  * devices.  The slot/function address of each device is encoded
@@ -506,9 +507,16 @@ static inline struct pci_dev *pci_upstream_bridge(struct pci_dev *dev)
 }
 
 #ifdef CONFIG_PCI_MSI
-static inline bool pci_dev_msi_enabled(struct pci_dev *pci_dev)
+static inline bool pci_dev_msi_enabled(struct pci_dev *pci_dev, int type)
 {
-	return pci_dev->msi_enabled || pci_dev->msix_enabled;
+	bool enabled = 0;
+
+	if (type & MSI_TYPE)
+		enabled |= pci_dev->msi_enabled;
+	if (type & MSIX_TYPE)
+		enabled |= pci_dev->msix_enabled;
+
+	return enabled;
 }
 #else
 static inline bool pci_dev_msi_enabled(struct pci_dev *pci_dev) { return false; }
diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
index bf06577..4634bd0 100644
--- a/virt/kvm/assigned-dev.c
+++ b/virt/kvm/assigned-dev.c
@@ -366,7 +366,7 @@ static int assigned_device_enable_host_msi(struct kvm *kvm,
 {
 	int r;
 
-	if (!dev->dev->msi_enabled) {
+	if (!pci_dev_msi_enabled(dev->dev, MSI_TYPE)) {
 		r = pci_enable_msi(dev->dev);
 		if (r)
 			return r;
-- 
1.7.1



* [RFC PATCH 04/11] PCI/MSI: Move MSIX table address mapping out of msix_capability_init
  2014-07-26  3:08 [RFC PATCH 00/11] Refactor MSI to support Non-PCI device Yijing Wang
                   ` (2 preceding siblings ...)
  2014-07-26  3:08 ` [RFC PATCH 03/11] PCI/MSI: Refactor pci_dev_msi_enabled() Yijing Wang
@ 2014-07-26  3:08 ` Yijing Wang
  2014-07-26  3:08 ` [RFC PATCH 05/11] PCI/MSI: Move populate_msi_sysfs() out of msi_capability_init() Yijing Wang
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 41+ messages in thread
From: Yijing Wang @ 2014-07-26  3:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, arnab.basu, virtualization, Hanjun Guo,
	Yijing Wang

Move the MSI-X table address mapping work into the PCI MSI-X layer.
Some non-PCI MSI devices do their address mapping before enabling the
MSI-X capability, or their MSI-X table lies within the device address
block. So move the address mapping out of the generic MSI-X core.
This is preparation for the generic MSI driver.
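
As a rough illustration of the split described above, here is a
userspace sketch, not kernel code. Every demo_* name is hypothetical
and the "table" is just an array. It only shows the shape of the
change: the bus-specific caller maps the table, and the generic init
routine receives the mapped base as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device; NOT the kernel's struct pci_dev. */
struct demo_dev {
	uint32_t table[8];	/* pretend MSI-X table memory */
	int is_pci;		/* 1 if the table is reachable through a BAR */
	int msix_enabled;
};

/* Bus-specific step: a PCI device locates its table (simulated here). */
static uint32_t *demo_pci_map_table(struct demo_dev *dev)
{
	assert(dev != NULL);
	return dev->is_pci ? dev->table : NULL;
}

/* Generic step: consumes an already-mapped base and never maps anything. */
static int demo_msix_capability_init(struct demo_dev *dev, uint32_t *base,
				     int nvec)
{
	int i;

	assert(dev != NULL);
	if (!base || nvec <= 0 || nvec > 8)
		return -1;
	for (i = 0; i < nvec; i++)
		base[i] = 1;	/* mark each requested vector as masked */
	dev->msix_enabled = 1;
	return 0;
}

/* PCI entry point: map first, then hand the base to the generic code. */
static int demo_pci_enable_msix(struct demo_dev *dev, int nvec)
{
	uint32_t *base = demo_pci_map_table(dev);

	if (!base)
		return -12;	/* stands in for the -ENOMEM return in the patch */
	return demo_msix_capability_init(dev, base, nvec);
}
```

In this sketch a non-PCI caller that already has its table mapped can
call demo_msix_capability_init() directly with its own base.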

Suggested-by: Yun Wu <wuyun.wu@huawei.com>
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
---
 drivers/pci/msi.c |   25 +++++++++++++------------
 1 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index d5c8e56..116383c 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -668,8 +668,8 @@ static void __iomem *msix_map_region(struct pci_dev *dev, unsigned nr_entries)
 	u32 table_offset;
 	u8 bir;
 
-	pci_read_config_dword(dev, dev->msix_cap + PCI_MSIX_TABLE,
-			      &table_offset);
+	pci_read_config_dword(dev, dev->msix_cap + PCI_MSIX_TABLE, 
+				&table_offset);
 	bir = (u8)(table_offset & PCI_MSIX_TABLE_BIR);
 	table_offset &= PCI_MSIX_TABLE_OFFSET;
 	phys_addr = pci_resource_start(dev, bir) + table_offset;
@@ -734,22 +734,14 @@ static void msix_program_entries(struct pci_dev *dev,
  * single MSI-X irq. A return of zero indicates the successful setup of
  * requested MSI-X entries with allocated irqs or non-zero for otherwise.
  **/
-static int msix_capability_init(struct pci_dev *dev,
+static int msix_capability_init(struct pci_dev *dev, void __iomem *base,
 				struct msix_entry *entries, int nvec)
 {
 	int ret;
-	u16 control;
-	void __iomem *base;
 
 	/* Ensure MSI-X is disabled while it is set up */
 	msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
 
-	pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, &control);
-	/* Request & Map MSI-X table region */
-	base = msix_map_region(dev, msix_table_size(control));
-	if (!base)
-		return -ENOMEM;
-
 	ret = msix_setup_entries(dev, base, entries, nvec);
 	if (ret)
 		return ret;
@@ -948,6 +940,8 @@ int pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries, int nvec)
 {
 	int status, nr_entries;
 	int i, j;
+	void __iomem *base;
+	u16 control;
 
 	if (!entries || !dev->msix_cap || dev->current_state != PCI_D0)
 		return -EINVAL;
@@ -978,7 +972,14 @@ int pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries, int nvec)
 		dev_info(&dev->dev, "can't enable MSI-X (MSI IRQ already assigned)\n");
 		return -EINVAL;
 	}
-	status = msix_capability_init(dev, entries, nvec);
+
+	/* Request & Map MSI-X table region */
+	pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, &control);
+	base = msix_map_region(dev, msix_table_size(control));
+	if (!base)
+		return -ENOMEM;
+
+	status = msix_capability_init(dev, base, entries, nvec);
 	return status;
 }
 EXPORT_SYMBOL(pci_enable_msix);
-- 
1.7.1



* [RFC PATCH 05/11] PCI/MSI: Move populate_msi_sysfs() out of msi_capability_init()
  2014-07-26  3:08 [RFC PATCH 00/11] Refactor MSI to support Non-PCI device Yijing Wang
                   ` (3 preceding siblings ...)
  2014-07-26  3:08 ` [RFC PATCH 04/11] PCI/MSI: Move MSIX table address mapping out of msix_capability_init Yijing Wang
@ 2014-07-26  3:08 ` Yijing Wang
  2014-07-26  3:08 ` [RFC PATCH 06/11] PCI/MSI: Save MSI irq in PCI MSI layer Yijing Wang
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 41+ messages in thread
From: Yijing Wang @ 2014-07-26  3:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, arnab.basu, virtualization, Hanjun Guo,
	Yijing Wang

Some non-PCI devices don't need to create sysfs objects, so move
populate_msi_sysfs() out of the generic MSI functions
msi_capability_init() and msix_capability_init().
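
The resulting control flow can be sketched in userspace as follows.
This is an illustration under assumptions, not the kernel code: all
demo_* names and fields are hypothetical. The generic init routine no
longer touches sysfs, and the caller populates it afterwards and rolls
the enable back if that step fails.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device state; NOT the kernel's struct pci_dev. */
struct demo_dev {
	int msi_enabled;
	int irqs_allocated;
	int wants_sysfs;	/* PCI devices do; some non-PCI devices don't */
	int sysfs_fails;	/* knob to simulate a sysfs failure */
	int sysfs_populated;
};

/* Generic part: allocates irqs and enables MSI, with no sysfs work. */
static int demo_msi_capability_init(struct demo_dev *dev, int nvec)
{
	assert(dev != NULL);
	if (nvec <= 0)
		return -22;	/* stands in for -EINVAL */
	dev->irqs_allocated = nvec;
	dev->msi_enabled = 1;
	return 0;
}

/* Bus-specific part: create the sysfs entries (simulated). */
static int demo_populate_sysfs(struct demo_dev *dev)
{
	if (dev->sysfs_fails)
		return -12;	/* stands in for -ENOMEM */
	dev->sysfs_populated = 1;
	return 0;
}

/* Caller: run the generic init, then sysfs, undoing the enable on error. */
static int demo_enable_msi(struct demo_dev *dev, int nvec)
{
	int rc = demo_msi_capability_init(dev, nvec);

	if (rc)
		return rc;
	if (!dev->wants_sysfs)
		return nvec;

	rc = demo_populate_sysfs(dev);
	if (rc) {
		dev->msi_enabled = 0;	/* roll back the enable */
		dev->irqs_allocated = 0;	/* release the irqs */
		return rc;
	}
	return nvec;
}
```

Note that the sketch returns the sysfs error to its caller, so a
failed sysfs step is never reported as success.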

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
---
 drivers/pci/msi.c |   31 ++++++++++++++++++-------------
 1 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 116383c..21b16e0 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -646,13 +646,6 @@ static int msi_capability_init(struct pci_dev *dev, int nvec)
 		return ret;
 	}
 
-	ret = populate_msi_sysfs(dev);
-	if (ret) {
-		msi_mask_irq(entry, mask, ~mask);
-		free_msi_irqs(dev);
-		return ret;
-	}
-
 	/* Set MSI enabled bits	 */
 	pci_intx_for_msi(dev, 0);
 	msi_set_enable(dev, 1);
@@ -760,10 +753,6 @@ static int msix_capability_init(struct pci_dev *dev, void __iomem *base,
 
 	msix_program_entries(dev, entries);
 
-	ret = populate_msi_sysfs(dev);
-	if (ret)
-		goto out_free;
-
 	/* Set MSI-X enabled bits and unmask the function */
 	pci_intx_for_msi(dev, 0);
 	dev->msix_enabled = 1;
@@ -789,7 +778,6 @@ out_avail:
 			ret = avail;
 	}
 
-out_free:
 	free_msi_irqs(dev);
 
 	return ret;
@@ -939,7 +927,7 @@ EXPORT_SYMBOL(pci_msix_vec_count);
 int pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries, int nvec)
 {
 	int status, nr_entries;
-	int i, j;
+	int i, j, ret;
 	void __iomem *base;
 	u16 control;
 
@@ -980,6 +968,14 @@ int pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries, int nvec)
 		return -ENOMEM;
 
 	status = msix_capability_init(dev, base, entries, nvec);
+	if (!status) {
+		ret = populate_msi_sysfs(dev);
+		if (ret) {
+			dev->msix_enabled = 0;
+			pci_intx_for_msi(dev, 1);
+			free_msi_irqs(dev);
+		}
+	}
 	return status;
 }
 EXPORT_SYMBOL(pci_enable_msix);
@@ -1109,6 +1105,15 @@ int pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec)
 		}
 	} while (rc);
 
+	rc = populate_msi_sysfs(dev);
+	if (rc) {
+		msi_set_enable(dev, 0);
+		pci_intx_for_msi(dev, 1);
+		dev->msi_enabled = 0;
+		free_msi_irqs(dev);
+		return rc;
+	}
+
 	return nvec;
 }
 EXPORT_SYMBOL(pci_enable_msi_range);
-- 
1.7.1



* [RFC PATCH 06/11] PCI/MSI: Save MSI irq in PCI MSI layer
  2014-07-26  3:08 [RFC PATCH 00/11] Refactor MSI to support Non-PCI device Yijing Wang
                   ` (4 preceding siblings ...)
  2014-07-26  3:08 ` [RFC PATCH 05/11] PCI/MSI: Move populate_msi_sysfs() out of msi_capability_init() Yijing Wang
@ 2014-07-26  3:08 ` Yijing Wang
  2014-07-26  3:08 ` [RFC PATCH 07/11] PCI/MSI: Mask MSI-X entry in msix_setup_entries() Yijing Wang
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 41+ messages in thread
From: Yijing Wang @ 2014-07-26  3:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, arnab.basu, virtualization, Hanjun Guo,
	Yijing Wang

Save the MSI irq in the PCI MSI layer. This is preparation
for generic MSI.

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
---
 drivers/pci/msi.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 21b16e0..f96dd38 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -650,8 +650,6 @@ static int msi_capability_init(struct pci_dev *dev, int nvec)
 	pci_intx_for_msi(dev, 0);
 	msi_set_enable(dev, 1);
 	dev->msi_enabled = 1;
-
-	dev->irq = entry->irq;
 	return 0;
 }
 
@@ -1059,6 +1057,7 @@ int pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec)
 {
 	int nvec;
 	int rc;
+	struct msi_desc *entry;
 
 	if (dev->current_state != PCI_D0)
 		return -EINVAL;
@@ -1114,6 +1113,8 @@ int pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec)
 		return rc;
 	}
 
+	entry = list_entry(dev->msi_list.next, struct msi_desc, list);
+	dev->irq = entry->irq;
 	return nvec;
 }
 EXPORT_SYMBOL(pci_enable_msi_range);
-- 
1.7.1



* [RFC PATCH 07/11] PCI/MSI: Mask MSI-X entry in msix_setup_entries()
  2014-07-26  3:08 [RFC PATCH 00/11] Refactor MSI to support Non-PCI device Yijing Wang
                   ` (5 preceding siblings ...)
  2014-07-26  3:08 ` [RFC PATCH 06/11] PCI/MSI: Save MSI irq in PCI MSI layer Yijing Wang
@ 2014-07-26  3:08 ` Yijing Wang
  2014-07-26  3:08 ` [RFC PATCH 08/11] PCI/MSI: Introduce new struct msi_irqs and struct msi_ops Yijing Wang
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 41+ messages in thread
From: Yijing Wang @ 2014-07-26  3:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, arnab.basu, virtualization, Hanjun Guo,
	Yijing Wang

Save each MSI-X entry's initial mask status in
msix_setup_entries(), and mask the entry there as well.
This is preparation for generic MSI.
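
A minimal userspace sketch of the per-entry step follows. It assumes
the standard MSI-X table layout (16-byte entries, Vector Control at
offset 12, mask bit 0). The demo_* names are hypothetical and a plain
array stands in for the memory-mapped table. It reads the current
Vector Control word into the descriptor's cache, then sets the mask
bit in both the cache and the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_ENTRY_SIZE		16	/* bytes per MSI-X table entry */
#define DEMO_VECTOR_CTRL	12	/* Vector Control offset in an entry */
#define DEMO_CTRL_MASKBIT	1u	/* bit 0 masks the vector */

/* Hypothetical descriptor; NOT the kernel's struct msi_desc. */
struct demo_desc {
	uint32_t masked;	/* cached Vector Control value */
	uint32_t *table;	/* table viewed as 32-bit words */
	unsigned int entry_nr;	/* index of this entry in the table */
};

/* Locate the Vector Control word of the descriptor's table entry. */
static uint32_t *demo_ctrl_reg(struct demo_desc *d)
{
	return d->table +
	       (d->entry_nr * DEMO_ENTRY_SIZE + DEMO_VECTOR_CTRL) / 4;
}

/* Setup-time step: cache the current state, then mask the entry. */
static void demo_setup_entry(struct demo_desc *d, uint32_t *table,
			     unsigned int entry_nr)
{
	assert(d != NULL && table != NULL);
	d->table = table;
	d->entry_nr = entry_nr;
	d->masked = *demo_ctrl_reg(d);	/* read the initial Vector Control */
	d->masked |= DEMO_CTRL_MASKBIT;	/* set the mask bit in the cache */
	*demo_ctrl_reg(d) = d->masked;	/* and write it back to the table */
}
```

The other bits of the Vector Control word are preserved; only the mask
bit is forced on.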

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
---
 drivers/pci/msi.c |   21 +++++++++++----------
 1 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index f96dd38..41c33da 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -672,7 +672,7 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
 			      struct msix_entry *entries, int nvec)
 {
 	struct msi_desc *entry;
-	int i;
+	int i, offset;
 
 	for (i = 0; i < nvec; i++) {
 		entry = alloc_msi_entry(dev);
@@ -691,6 +691,15 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
 		entry->msi_attrib.default_irq	= dev->irq;
 		entry->mask_base		= base;
 
+		msix_clear_and_set_ctrl(dev, 0,
+				PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE);
+		offset = entries[i].entry * PCI_MSIX_ENTRY_SIZE +
+						PCI_MSIX_ENTRY_VECTOR_CTRL;
+		entry->masked = readl(entry->mask_base + offset);
+		msix_mask_irq(entry, 1);
+		msix_clear_and_set_ctrl(dev,
+				PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE, 0);
+
 		list_add_tail(&entry->list, &dev->msi_list);
 	}
 
@@ -704,13 +713,8 @@ static void msix_program_entries(struct pci_dev *dev,
 	int i = 0;
 
 	list_for_each_entry(entry, &dev->msi_list, list) {
-		int offset = entries[i].entry * PCI_MSIX_ENTRY_SIZE +
-						PCI_MSIX_ENTRY_VECTOR_CTRL;
-
 		entries[i].vector = entry->irq;
 		irq_set_msi_desc(entry->irq, entry);
-		entry->masked = readl(entry->mask_base + offset);
-		msix_mask_irq(entry, 1);
 		i++;
 	}
 }
@@ -746,16 +750,13 @@ static int msix_capability_init(struct pci_dev *dev, void __iomem *base,
 	 * MSI-X registers.  We need to mask all the vectors to prevent
 	 * interrupts coming in before they're fully set up.
 	 */
-	msix_clear_and_set_ctrl(dev, 0,
-				PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE);
-
 	msix_program_entries(dev, entries);
 
 	/* Set MSI-X enabled bits and unmask the function */
 	pci_intx_for_msi(dev, 0);
 	dev->msix_enabled = 1;
 
-	msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
+	msix_clear_and_set_ctrl(dev, 0, PCI_MSIX_FLAGS_ENABLE);
 
 	return 0;
 
-- 
1.7.1



* [RFC PATCH 08/11] PCI/MSI: Introduce new struct msi_irqs and struct msi_ops
  2014-07-26  3:08 [RFC PATCH 00/11] Refactor MSI to support Non-PCI device Yijing Wang
                   ` (6 preceding siblings ...)
  2014-07-26  3:08 ` [RFC PATCH 07/11] PCI/MSI: Mask MSI-X entry in msix_setup_entries() Yijing Wang
@ 2014-07-26  3:08 ` Yijing Wang
  2014-07-26  3:08 ` [RFC PATCH 09/11] PCI/MSI: refactor PCI MSI driver Yijing Wang
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 41+ messages in thread
From: Yijing Wang @ 2014-07-26  3:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, arnab.basu, virtualization, Hanjun Guo,
	Yijing Wang

Currently, the MSI driver is tightly bound to PCI everywhere.
Introduce a new struct msi_irqs to manage all MSI-related
information in an MSI-capable device. In addition,
introduce struct msi_ops to hook all device-specific
MSI operations. The MSI driver can then be decoupled from
PCI.
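
To illustrate the intended decoupling, here is a small userspace
sketch. It is not the kernel code: the demo_* structures are
simplified, hypothetical stand-ins for struct msi_irqs and struct
msi_ops, with a single hook. The generic code calls through the ops
table, and each device type recovers its own structure from the opaque
data pointer.

```c
#include <assert.h>
#include <stddef.h>

struct demo_msi_irqs;

/* Simplified ops table; the struct msi_ops in the patch has more hooks. */
struct demo_msi_ops {
	void (*set_enable)(struct demo_msi_irqs *msi, int enable);
};

/* Simplified per-device MSI state. */
struct demo_msi_irqs {
	unsigned int msi_enabled:1;
	void *data;			/* back-pointer to the owning device */
	const struct demo_msi_ops *ops;	/* device-specific hooks */
};

/* Two hypothetical device types embedding the generic MSI state. */
struct demo_pci_dev {
	int cfg_enable_bit;
	struct demo_msi_irqs msi;
};

struct demo_platform_dev {
	int route_enable;
	struct demo_msi_irqs msi;
};

static void demo_pci_set_enable(struct demo_msi_irqs *msi, int enable)
{
	struct demo_pci_dev *dev = msi->data;

	dev->cfg_enable_bit = enable;	/* PCI would write config space */
}

static void demo_platform_set_enable(struct demo_msi_irqs *msi, int enable)
{
	struct demo_platform_dev *dev = msi->data;

	dev->route_enable = enable;	/* non-PCI would write its own register */
}

static const struct demo_msi_ops demo_pci_ops = {
	.set_enable = demo_pci_set_enable,
};

static const struct demo_msi_ops demo_platform_ops = {
	.set_enable = demo_platform_set_enable,
};

/* Generic code: knows nothing about which device type sits behind msi. */
static void demo_msi_enable(struct demo_msi_irqs *msi)
{
	assert(msi != NULL && msi->ops != NULL && msi->ops->set_enable != NULL);
	msi->ops->set_enable(msi, 1);
	msi->msi_enabled = 1;
}
```

In this sketch, demo_msi_enable() works unchanged for both device
types because the device-specific work lives entirely behind the ops
table.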

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
---
 include/linux/msi.h |   30 +++++++++++++++++++++++++++++-
 include/linux/pci.h |    7 +------
 2 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/include/linux/msi.h b/include/linux/msi.h
index 3ad8416..5a672d3 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -10,6 +10,34 @@ struct msi_msg {
 	u32	data;		/* 16 bits of msi message data */
 };
 
+struct msi_ops;
+
+struct msi_irqs {
+	u8 msi_enabled:1;
+	u8 msix_enabled:1;
+	int node;
+	struct list_head msi_list;
+	void *data;
+	struct msi_ops *ops;
+};
+
+struct msix_entry {
+	u32	vector;	/* kernel uses to write allocated vector */
+	u16	entry;	/* driver uses to specify entry, OS writes */
+};
+
+struct msi_ops {
+	void (*msi_set_enable)(struct msi_irqs *msi, int enable, int type);
+	struct msi_desc *(*msi_setup_entry)(struct msi_irqs *msi);
+	int (*msix_setup_entries)(struct msi_irqs *msi, void __iomem *base,
+			struct msix_entry *entries, int nvec);
+	u32 (*msi_mask_irq)(struct msi_desc *desc, u32 mask, u32 flag);
+	u32 (*msix_mask_irq)(struct msi_desc *desc, u32 flag);
+	void (*msi_read_message)(struct msi_desc *desc, struct msi_msg *msg);
+	void (*msi_write_message)(struct msi_desc *desc, struct msi_msg *msg);
+	void (*msi_set_intx)(struct msi_irqs *msi, int enable);
+};
+
 /* Helper functions */
 struct irq_data;
 struct msi_desc;
@@ -42,7 +70,7 @@ struct msi_desc {
 		void __iomem *mask_base;
 		u8 mask_pos;
 	};
-	struct pci_dev *dev;
+	struct msi_irqs *msi;
 
 	/* Last set MSI message */
 	struct msi_msg msg;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index c6c01ae..c7bca1c 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -32,8 +32,8 @@
 #include <uapi/linux/pci.h>
 
 #include <linux/pci_ids.h>
-
 #include <linux/msi.h>
+
 /*
  * The PCI interface treats multi-function devices as independent
  * devices.  The slot/function address of each device is encoded
@@ -1182,11 +1182,6 @@ enum pci_dma_burst_strategy {
 				   strategy_parameter byte boundaries */
 };
 
-struct msix_entry {
-	u32	vector;	/* kernel uses to write allocated vector */
-	u16	entry;	/* driver uses to specify entry, OS writes */
-};
-
 
 #ifdef CONFIG_PCI_MSI
 int pci_msi_vec_count(struct pci_dev *dev);
-- 
1.7.1



* [RFC PATCH 09/11] PCI/MSI: refactor PCI MSI driver
  2014-07-26  3:08 [RFC PATCH 00/11] Refactor MSI to support Non-PCI device Yijing Wang
                   ` (7 preceding siblings ...)
  2014-07-26  3:08 ` [RFC PATCH 08/11] PCI/MSI: Introduce new struct msi_irqs and struct msi_ops Yijing Wang
@ 2014-07-26  3:08 ` Yijing Wang
  2014-08-20  6:06   ` Bharat.Bhushan
  2014-07-26  3:08 ` [RFC PATCH 10/11] PCI/MSI: Split the generic MSI code into new file Yijing Wang
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 41+ messages in thread
From: Yijing Wang @ 2014-07-26  3:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, arnab.basu, virtualization, Hanjun Guo,
	Yijing Wang

Use struct msi_ops to hook the PCI MSI operations,
and use struct msi_irqs to refactor the PCI MSI driver.

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
---
 drivers/pci/msi.c   |  351 ++++++++++++++++++++++++++++++---------------------
 include/linux/msi.h |   14 +-
 include/linux/pci.h |   11 +-
 3 files changed, 222 insertions(+), 154 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 41c33da..f0c5989 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -29,8 +29,9 @@ static int pci_msi_enable = 1;
 
 /* Arch hooks */
 
-int __weak arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
+int __weak arch_setup_msi_irq(struct msi_irqs *msi, struct msi_desc *desc)
 {
+	struct pci_dev *dev = msi->data; //TO BE DONE: rework msi_chip to support Non-PCI
 	struct msi_chip *chip = dev->bus->msi;
 	int err;
 
@@ -56,8 +57,9 @@ void __weak arch_teardown_msi_irq(unsigned int irq)
 	chip->teardown_irq(chip, irq);
 }
 
-int __weak arch_msi_check_device(struct pci_dev *dev, int nvec, int type)
+int __weak arch_msi_check_device(struct msi_irqs *msi, int nvec, int type)
 {
+	struct pci_dev *dev = msi->data; //TO BE DONE: rework msi_chip to support Non-PCI
 	struct msi_chip *chip = dev->bus->msi;
 
 	if (!chip || !chip->check_device)
@@ -66,7 +68,7 @@ int __weak arch_msi_check_device(struct pci_dev *dev, int nvec, int type)
 	return chip->check_device(chip, dev, nvec, type);
 }
 
-int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
+int __weak arch_setup_msi_irqs(struct msi_irqs *msi, int nvec, int type)
 {
 	struct msi_desc *entry;
 	int ret;
@@ -78,8 +80,8 @@ int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 	if (type == MSI_TYPE && nvec > 1)
 		return 1;
 
-	list_for_each_entry(entry, &dev->msi_list, list) {
-		ret = arch_setup_msi_irq(dev, entry);
+	list_for_each_entry(entry, &msi->msi_list, list) {
+		ret = arch_setup_msi_irq(msi, entry);
 		if (ret < 0)
 			return ret;
 		if (ret > 0)
@@ -93,11 +95,11 @@ int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
  * We have a default implementation available as a separate non-weak
  * function, as it is used by the Xen x86 PCI code
  */
-void default_teardown_msi_irqs(struct pci_dev *dev)
+void default_teardown_msi_irqs(struct msi_irqs *msi)
 {
 	struct msi_desc *entry;
 
-	list_for_each_entry(entry, &dev->msi_list, list) {
+	list_for_each_entry(entry, &msi->msi_list, list) {
 		int i, nvec;
 		if (entry->irq == 0)
 			continue;
@@ -110,22 +112,22 @@ void default_teardown_msi_irqs(struct pci_dev *dev)
 	}
 }
 
-void __weak arch_teardown_msi_irqs(struct pci_dev *dev)
+void __weak arch_teardown_msi_irqs(struct msi_irqs *msi)
 {
-	return default_teardown_msi_irqs(dev);
+	return default_teardown_msi_irqs(msi);
 }
 
-static void default_restore_msi_irq(struct pci_dev *dev, int irq)
+static void default_restore_msi_irq(struct msi_irqs *msi, int irq)
 {
 	struct msi_desc *entry;
 
 	entry = NULL;
-	if (dev->msix_enabled) {
-		list_for_each_entry(entry, &dev->msi_list, list) {
+	if (msi->msix_enabled) {
+		list_for_each_entry(entry, &msi->msi_list, list) {
 			if (irq == entry->irq)
 				break;
 		}
-	} else if (pci_dev_msi_enabled(dev, MSI_TYPE))  {
+	} else if (msi->msi_enabled)  {
 		entry = irq_get_msi_desc(irq);
 	}
 
@@ -133,20 +135,9 @@ static void default_restore_msi_irq(struct pci_dev *dev, int irq)
 		write_msi_msg(irq, &entry->msg);
 }
 
-void __weak arch_restore_msi_irqs(struct pci_dev *dev)
+void __weak arch_restore_msi_irqs(struct msi_irqs *msi)
 {
-	return default_restore_msi_irqs(dev);
-}
-
-static void msi_set_enable(struct pci_dev *dev, int enable)
-{
-	u16 control;
-
-	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
-	control &= ~PCI_MSI_FLAGS_ENABLE;
-	if (enable)
-		control |= PCI_MSI_FLAGS_ENABLE;
-	pci_write_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, control);
+	return default_restore_msi_irqs(msi);
 }
 
 static void msix_clear_and_set_ctrl(struct pci_dev *dev, u16 clear, u16 set)
@@ -159,6 +150,25 @@ static void msix_clear_and_set_ctrl(struct pci_dev *dev, u16 clear, u16 set)
 	pci_write_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, ctrl);
 }
 
+static void msi_set_enable(struct msi_irqs *msi, int enable, int type)
+{
+	u16 control;
+	struct pci_dev *dev = msi->data;
+
+	if (type == MSI_TYPE) {
+		pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
+		control &= ~PCI_MSI_FLAGS_ENABLE;
+		if (enable)
+			control |= PCI_MSI_FLAGS_ENABLE;
+		pci_write_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, control);
+	} else if (type == MSIX_TYPE) {
+		if (enable)
+			msix_clear_and_set_ctrl(dev, 0, PCI_MSIX_FLAGS_ENABLE);
+		else
+			msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
+	}
+}
+
 static inline __attribute_const__ u32 msi_mask(unsigned x)
 {
 	/* Don't shift by >= width of type */
@@ -175,6 +185,7 @@ static inline __attribute_const__ u32 msi_mask(unsigned x)
  */
 u32 default_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
 {
+	struct pci_dev *dev = desc->msi->data;
 	u32 mask_bits = desc->masked;
 
 	if (!desc->msi_attrib.maskbit)
@@ -182,7 +193,7 @@ u32 default_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
 
 	mask_bits &= ~mask;
 	mask_bits |= flag;
-	pci_write_config_dword(desc->dev, desc->mask_pos, mask_bits);
+	pci_write_config_dword(dev, desc->mask_pos, mask_bits);
 
 	return mask_bits;
 }
@@ -250,18 +261,30 @@ void unmask_msi_irq(struct irq_data *data)
 	msi_set_mask_bit(data, 0);
 }
 
-void default_restore_msi_irqs(struct pci_dev *dev)
+static void msix_set_all_mask(struct msi_irqs *msi, int flag)
+{
+	struct pci_dev *dev = msi->data;
+
+	if (flag)
+		msix_clear_and_set_ctrl(dev, 0, PCI_MSIX_FLAGS_MASKALL);
+	else
+		msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
+}
+
+void default_restore_msi_irqs(struct msi_irqs *msi)
 {
 	struct msi_desc *entry;
 
-	list_for_each_entry(entry, &dev->msi_list, list) {
-		default_restore_msi_irq(dev, entry->irq);
+	list_for_each_entry(entry, &msi->msi_list, list) {
+		default_restore_msi_irq(msi, entry->irq);
 	}
 }
 
 void __read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
 {
-	BUG_ON(entry->dev->current_state != PCI_D0);
+	struct pci_dev *dev = entry->msi->data;
+
+	BUG_ON(dev->current_state != PCI_D0);
 
 	if (entry->msi_attrib.is_msix) {
 		void __iomem *base = entry->mask_base +
@@ -271,7 +294,6 @@ void __read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
 		msg->address_hi = readl(base + PCI_MSIX_ENTRY_UPPER_ADDR);
 		msg->data = readl(base + PCI_MSIX_ENTRY_DATA);
 	} else {
-		struct pci_dev *dev = entry->dev;
 		int pos = dev->msi_cap;
 		u16 data;
 
@@ -315,7 +337,9 @@ void get_cached_msi_msg(unsigned int irq, struct msi_msg *msg)
 
 void __write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
 {
-	if (entry->dev->current_state != PCI_D0) {
+	struct pci_dev *dev = entry->msi->data;
+
+	if (dev->current_state != PCI_D0) {
 		/* Don't touch the hardware now */
 	} else if (entry->msi_attrib.is_msix) {
 		void __iomem *base;
@@ -326,7 +350,6 @@ void __write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
 		writel(msg->address_hi, base + PCI_MSIX_ENTRY_UPPER_ADDR);
 		writel(msg->data, base + PCI_MSIX_ENTRY_DATA);
 	} else {
-		struct pci_dev *dev = entry->dev;
 		int pos = dev->msi_cap;
 		u16 msgctl;
 
@@ -357,14 +380,34 @@ void write_msi_msg(unsigned int irq, struct msi_msg *msg)
 	__write_msi_msg(entry, msg);
 }
 
-static void free_msi_irqs(struct pci_dev *dev)
+static void free_msi_sysfs(struct pci_dev *dev)
 {
-	struct msi_desc *entry, *tmp;
 	struct attribute **msi_attrs;
 	struct device_attribute *dev_attr;
 	int count = 0;
 
-	list_for_each_entry(entry, &dev->msi_list, list) {
+	if (dev->msi_irq_groups) {
+		sysfs_remove_groups(&dev->dev.kobj, dev->msi_irq_groups);
+		msi_attrs = dev->msi_irq_groups[0]->attrs;
+		while (msi_attrs[count]) {
+			dev_attr = container_of(msi_attrs[count],
+						struct device_attribute, attr);
+			kfree(dev_attr->attr.name);
+			kfree(dev_attr);
+			++count;
+		}
+		kfree(msi_attrs);
+		kfree(dev->msi_irq_groups[0]);
+		kfree(dev->msi_irq_groups);
+		dev->msi_irq_groups = NULL;
+	}
+}
+
+static void free_msi_irqs(struct msi_irqs *msi)
+{
+	struct msi_desc *entry, *tmp;
+
+	list_for_each_entry(entry, &msi->msi_list, list) {
 		int i, nvec;
 		if (!entry->irq)
 			continue;
@@ -376,11 +419,11 @@ static void free_msi_irqs(struct pci_dev *dev)
 			BUG_ON(irq_has_action(entry->irq + i));
 	}
 
-	arch_teardown_msi_irqs(dev);
+	arch_teardown_msi_irqs(msi);
 
-	list_for_each_entry_safe(entry, tmp, &dev->msi_list, list) {
+	list_for_each_entry_safe(entry, tmp, &msi->msi_list, list) {
 		if (entry->msi_attrib.is_msix) {
-			if (list_is_last(&entry->list, &dev->msi_list))
+			if (list_is_last(&entry->list, &msi->msi_list))
 				iounmap(entry->mask_base);
 		}
 
@@ -398,38 +441,24 @@ static void free_msi_irqs(struct pci_dev *dev)
 		list_del(&entry->list);
 		kfree(entry);
 	}
-
-	if (dev->msi_irq_groups) {
-		sysfs_remove_groups(&dev->dev.kobj, dev->msi_irq_groups);
-		msi_attrs = dev->msi_irq_groups[0]->attrs;
-		while (msi_attrs[count]) {
-			dev_attr = container_of(msi_attrs[count],
-						struct device_attribute, attr);
-			kfree(dev_attr->attr.name);
-			kfree(dev_attr);
-			++count;
-		}
-		kfree(msi_attrs);
-		kfree(dev->msi_irq_groups[0]);
-		kfree(dev->msi_irq_groups);
-		dev->msi_irq_groups = NULL;
-	}
 }
 
-static struct msi_desc *alloc_msi_entry(struct pci_dev *dev)
+static struct msi_desc *alloc_msi_entry(struct msi_irqs *msi)
 {
 	struct msi_desc *desc = kzalloc(sizeof(*desc), GFP_KERNEL);
 	if (!desc)
 		return NULL;
 
 	INIT_LIST_HEAD(&desc->list);
-	desc->dev = dev;
+	desc->msi = msi;
 
 	return desc;
 }
 
-static void pci_intx_for_msi(struct pci_dev *dev, int enable)
+static void pci_intx_for_msi(struct msi_irqs *msi, int enable)
 {
+	struct pci_dev *dev = msi->data;
+
 	if (!(dev->dev_flags & PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG))
 		pci_intx(dev, enable);
 }
@@ -444,9 +473,9 @@ static void __pci_restore_msi_state(struct pci_dev *dev)
 
 	entry = irq_get_msi_desc(dev->irq);
 
-	pci_intx_for_msi(dev, 0);
-	msi_set_enable(dev, 0);
-	arch_restore_msi_irqs(dev);
+	pci_intx_for_msi(dev->msi, 0);
+	msi_set_enable(dev->msi, 0, MSI_TYPE);
+	arch_restore_msi_irqs(dev->msi);
 
 	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
 	msi_mask_irq(entry, msi_mask(entry->msi_attrib.multi_cap),
@@ -459,22 +488,21 @@ static void __pci_restore_msi_state(struct pci_dev *dev)
 static void __pci_restore_msix_state(struct pci_dev *dev)
 {
 	struct msi_desc *entry;
+	struct msi_irqs *msi = dev->msi;
 
-	if (!dev->msix_enabled)
+	if (!pci_dev_msi_enabled(dev, MSIX_TYPE))
 		return;
-	BUG_ON(list_empty(&dev->msi_list));
+	BUG_ON(list_empty(&msi->msi_list));
 
 	/* route the table */
-	pci_intx_for_msi(dev, 0);
-	msix_clear_and_set_ctrl(dev, 0,
-				PCI_MSIX_FLAGS_ENABLE | PCI_MSIX_FLAGS_MASKALL);
-
-	arch_restore_msi_irqs(dev);
-	list_for_each_entry(entry, &dev->msi_list, list) {
+	pci_intx_for_msi(msi, 0);
+	msi_set_enable(msi, 1, MSIX_TYPE);
+	msix_set_all_mask(msi, 1);
+	arch_restore_msi_irqs(msi);
+	list_for_each_entry(entry, &msi->msi_list, list) 
 		msix_mask_irq(entry, entry->masked);
-	}
 
-	msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
+	msix_set_all_mask(msi, 0);
 }
 
 void pci_restore_msi_state(struct pci_dev *dev)
@@ -516,7 +544,7 @@ static int populate_msi_sysfs(struct pci_dev *pdev)
 	int count = 0;
 
 	/* Determine how many msi entries we have */
-	list_for_each_entry(entry, &pdev->msi_list, list) {
+	list_for_each_entry(entry, &pdev->msi->msi_list, list) {
 		++num_msi;
 	}
 	if (!num_msi)
@@ -526,7 +554,7 @@ static int populate_msi_sysfs(struct pci_dev *pdev)
 	msi_attrs = kzalloc(sizeof(void *) * (num_msi + 1), GFP_KERNEL);
 	if (!msi_attrs)
 		return -ENOMEM;
-	list_for_each_entry(entry, &pdev->msi_list, list) {
+	list_for_each_entry(entry, &pdev->msi->msi_list, list) {
 		msi_dev_attr = kzalloc(sizeof(*msi_dev_attr), GFP_KERNEL);
 		if (!msi_dev_attr)
 			goto error_attrs;
@@ -578,13 +606,14 @@ error_attrs:
 	return ret;
 }
 
-static struct msi_desc *msi_setup_entry(struct pci_dev *dev)
+static struct msi_desc *msi_setup_entry(struct msi_irqs *msi)
 {
 	u16 control;
 	struct msi_desc *entry;
+	struct pci_dev *dev = msi->data;
 
 	/* MSI Entry Initialization */
-	entry = alloc_msi_entry(dev);
+	entry = alloc_msi_entry(msi);
 	if (!entry)
 		return NULL;
 
@@ -620,15 +649,15 @@ static struct msi_desc *msi_setup_entry(struct pci_dev *dev)
  * an error, and a positive return value indicates the number of interrupts
  * which could have been allocated.
  */
-static int msi_capability_init(struct pci_dev *dev, int nvec)
+static int msi_capability_init(struct msi_irqs *msi, int nvec)
 {
 	struct msi_desc *entry;
 	int ret;
 	unsigned mask;
 
-	msi_set_enable(dev, 0);	/* Disable MSI during set up */
+	msi_set_enable(msi, 0, MSI_TYPE);	/* Disable MSI during set up */
 
-	entry = msi_setup_entry(dev);
+	entry = msi_setup_entry(msi);
 	if (!entry)
 		return -ENOMEM;
 
@@ -636,21 +665,23 @@ static int msi_capability_init(struct pci_dev *dev, int nvec)
 	mask = msi_mask(entry->msi_attrib.multi_cap);
 	msi_mask_irq(entry, mask, mask);
 
-	list_add_tail(&entry->list, &dev->msi_list);
+	list_add_tail(&entry->list, &msi->msi_list);
 
 	/* Configure MSI capability structure */
-	ret = arch_setup_msi_irqs(dev, nvec, MSI_TYPE);
-	if (ret) {
-		msi_mask_irq(entry, mask, ~mask);
-		free_msi_irqs(dev);
-		return ret;
-	}
+	ret = arch_setup_msi_irqs(msi, nvec, MSI_TYPE);
+	if (ret)
+		goto err;
 
 	/* Set MSI enabled bits	 */
-	pci_intx_for_msi(dev, 0);
-	msi_set_enable(dev, 1);
-	dev->msi_enabled = 1;
+	pci_intx_for_msi(msi, 0);
+	msi_set_enable(msi, 1, MSI_TYPE);
+	msi->msi_enabled = 1;
 	return 0;
+
+err:
+	msi_mask_irq(entry, mask, ~mask);
+	free_msi_irqs(msi);
+	return ret;
 }
 
 static void __iomem *msix_map_region(struct pci_dev *dev, unsigned nr_entries)
@@ -668,19 +699,20 @@ static void __iomem *msix_map_region(struct pci_dev *dev, unsigned nr_entries)
 	return ioremap_nocache(phys_addr, nr_entries * PCI_MSIX_ENTRY_SIZE);
 }
 
-static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
+static int msix_setup_entries(struct msi_irqs *msi, void __iomem *base,
 			      struct msix_entry *entries, int nvec)
 {
 	struct msi_desc *entry;
 	int i, offset;
+	struct pci_dev *dev = msi->data;
 
 	for (i = 0; i < nvec; i++) {
-		entry = alloc_msi_entry(dev);
+		entry = alloc_msi_entry(msi);
 		if (!entry) {
 			if (!i)
 				iounmap(base);
 			else
-				free_msi_irqs(dev);
+				free_msi_irqs(msi);
 			/* No enough memory. Don't try again */
 			return -ENOMEM;
 		}
@@ -688,7 +720,6 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
 		entry->msi_attrib.is_msix	= 1;
 		entry->msi_attrib.is_64		= 1;
 		entry->msi_attrib.entry_nr	= entries[i].entry;
-		entry->msi_attrib.default_irq	= dev->irq;
 		entry->mask_base		= base;
 
 		msix_clear_and_set_ctrl(dev, 0, 
@@ -700,19 +731,19 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
 		msix_clear_and_set_ctrl(dev, 
 				PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE, 0);
 
-		list_add_tail(&entry->list, &dev->msi_list);
+		list_add_tail(&entry->list, &msi->msi_list);
 	}
 
 	return 0;
 }
 
-static void msix_program_entries(struct pci_dev *dev,
+static void msix_program_entries(struct msi_irqs *msi,
 				 struct msix_entry *entries)
 {
 	struct msi_desc *entry;
 	int i = 0;
 
-	list_for_each_entry(entry, &dev->msi_list, list) {
+	list_for_each_entry(entry, &msi->msi_list, list) {
 		entries[i].vector = entry->irq;
 		irq_set_msi_desc(entry->irq, entry);
 		i++;
@@ -729,19 +760,19 @@ static void msix_program_entries(struct pci_dev *dev,
  * single MSI-X irq. A return of zero indicates the successful setup of
  * requested MSI-X entries with allocated irqs or non-zero for otherwise.
  **/
-static int msix_capability_init(struct pci_dev *dev, void __iomem *base,
+static int msix_capability_init(struct msi_irqs *msi, void __iomem *base,
 				struct msix_entry *entries, int nvec)
 {
 	int ret;
 
 	/* Ensure MSI-X is disabled while it is set up */
-	msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
+	msi_set_enable(msi, 0, MSIX_TYPE);
 
-	ret = msix_setup_entries(dev, base, entries, nvec);
+	ret = msix_setup_entries(msi, base, entries, nvec);
 	if (ret)
 		return ret;
 
-	ret = arch_setup_msi_irqs(dev, nvec, MSIX_TYPE);
+	ret = arch_setup_msi_irqs(msi, nvec, MSIX_TYPE);
 	if (ret)
 		goto out_avail;
 
@@ -750,13 +781,13 @@ static int msix_capability_init(struct pci_dev *dev, void __iomem *base,
 	 * MSI-X registers.  We need to mask all the vectors to prevent
 	 * interrupts coming in before they're fully set up.
 	 */
-	msix_program_entries(dev, entries);
+	msix_program_entries(msi, entries);
 
 	/* Set MSI-X enabled bits and unmask the function */
-	pci_intx_for_msi(dev, 0);
-	dev->msix_enabled = 1;
+	pci_intx_for_msi(msi, 0);
+	msi->msix_enabled = 1;
 
-	msix_clear_and_set_ctrl(dev, 0, PCI_MSIX_FLAGS_ENABLE);
+	msi_set_enable(msi, 1, MSIX_TYPE);
 
 	return 0;
 
@@ -769,7 +800,7 @@ out_avail:
 		struct msi_desc *entry;
 		int avail = 0;
 
-		list_for_each_entry(entry, &dev->msi_list, list) {
+		list_for_each_entry(entry, &msi->msi_list, list) {
 			if (entry->irq != 0)
 				avail++;
 		}
@@ -777,7 +808,7 @@ out_avail:
 			ret = avail;
 	}
 
-	free_msi_irqs(dev);
+	free_msi_irqs(msi);
 
 	return ret;
 }
@@ -820,7 +851,7 @@ static int pci_msi_check_device(struct pci_dev *dev, int nvec, int type)
 		if (bus->bus_flags & PCI_BUS_FLAGS_NO_MSI)
 			return -EINVAL;
 
-	ret = arch_msi_check_device(dev, nvec, type);
+	ret = arch_msi_check_device(dev->msi, nvec, type);
 	if (ret)
 		return ret;
 
@@ -861,12 +892,12 @@ void pci_msi_shutdown(struct pci_dev *dev)
 			!pci_dev_msi_enabled(dev, MSI_TYPE))
 		return;
 
-	BUG_ON(list_empty(&dev->msi_list));
-	desc = list_first_entry(&dev->msi_list, struct msi_desc, list);
+	BUG_ON(list_empty(&dev->msi->msi_list));
+	desc = list_first_entry(&dev->msi->msi_list, struct msi_desc, list);
 
-	msi_set_enable(dev, 0);
-	pci_intx_for_msi(dev, 1);
-	dev->msi_enabled = 0;
+	msi_set_enable(dev->msi, 0, MSI_TYPE);
+	pci_intx_for_msi(dev->msi, 1);
+	dev->msi->msi_enabled = 0;
 
 	/* Return the device with MSI unmasked as initial states */
 	mask = msi_mask(desc->msi_attrib.multi_cap);
@@ -884,7 +915,8 @@ void pci_disable_msi(struct pci_dev *dev)
 		return;
 
 	pci_msi_shutdown(dev);
-	free_msi_irqs(dev);
+	free_msi_irqs(dev->msi);
+	free_msi_sysfs(dev);
 }
 EXPORT_SYMBOL(pci_disable_msi);
 
@@ -930,9 +962,10 @@ int pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries, int nvec)
 	void __iomem *base;
 	u16 control;
 
-	if (!entries || !dev->msix_cap || dev->current_state != PCI_D0)
+	if (!entries || !dev->msix_cap || !dev->msi ||
+	    dev->current_state != PCI_D0)
 		return -EINVAL;
-
+	
 	status = pci_msi_check_device(dev, nvec, MSIX_TYPE);
 	if (status)
 		return status;
@@ -952,7 +985,7 @@ int pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries, int nvec)
 				return -EINVAL;	/* duplicate entry */
 		}
 	}
-	WARN_ON(!!dev->msix_enabled);
+	WARN_ON(!!pci_dev_msi_enabled(dev, MSIX_TYPE));
 
 	/* Check whether driver already requested for MSI irq */
 	if (pci_dev_msi_enabled(dev, MSI_TYPE)) {
@@ -966,13 +999,13 @@ int pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries, int nvec)
 	if (!base)
 		return -ENOMEM;
 
-	status = msix_capability_init(dev, base, entries, nvec);
+	status = msix_capability_init(dev->msi, base, entries, nvec);
 	if (!status) {
 		ret = populate_msi_sysfs(dev);
 		if (ret) {
-			dev->msix_enabled = 0;
-			pci_intx_for_msi(dev, 1);
-			free_msi_irqs(dev);
+			dev->msi->msix_enabled = 0;
+			pci_intx_for_msi(dev->msi, 1);
+			free_msi_irqs(dev->msi);
 		}
 	}
 	return status;
@@ -983,18 +1016,18 @@ void pci_msix_shutdown(struct pci_dev *dev)
 {
 	struct msi_desc *entry;
 
-	if (!pci_msi_enable || !dev || !dev->msix_enabled)
+	if (!pci_msi_enable || !dev || !pci_dev_msi_enabled(dev, MSIX_TYPE))
 		return;
 
 	/* Return the device with MSI-X masked as initial states */
-	list_for_each_entry(entry, &dev->msi_list, list) {
+	list_for_each_entry(entry, &dev->msi->msi_list, list) {
 		/* Keep cached states to be restored */
 		arch_msix_mask_irq(entry, 1);
 	}
 
-	msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
-	pci_intx_for_msi(dev, 1);
-	dev->msix_enabled = 0;
+	msi_set_enable(dev->msi, 0, MSIX_TYPE);
+	pci_intx_for_msi(dev->msi, 1);
+	dev->msi->msix_enabled = 0;
 }
 
 void pci_disable_msix(struct pci_dev *dev)
@@ -1004,7 +1037,8 @@ void pci_disable_msix(struct pci_dev *dev)
 		return;
 
 	pci_msix_shutdown(dev);
-	free_msi_irqs(dev);
+	free_msi_irqs(dev->msi);
+	free_msi_sysfs(dev);
 }
 EXPORT_SYMBOL(pci_disable_msix);
 
@@ -1025,21 +1059,52 @@ int pci_msi_enabled(void)
 }
 EXPORT_SYMBOL(pci_msi_enabled);
 
-void pci_msi_init_pci_dev(struct pci_dev *dev)
+static struct msi_ops pci_msi = {
+	.msi_set_enable = msi_set_enable,
+	.msi_setup_entry = msi_setup_entry,
+	.msix_setup_entries = msix_setup_entries,
+	.msi_mask_irq = default_msi_mask_irq,
+	.msix_mask_irq = default_msix_mask_irq,
+	.msi_read_message = __read_msi_msg,
+	.msi_write_message = __write_msi_msg,
+	.msi_set_intx =  pci_intx_for_msi,
+};
+
+struct msi_irqs *alloc_msi_irqs(void *data, struct msi_ops *ops)
 {
-	INIT_LIST_HEAD(&dev->msi_list);
+	struct msi_irqs *msi;
+
+	msi = kzalloc(sizeof(struct msi_irqs), GFP_KERNEL);
+	if (!msi)
+		return NULL;
 
+	INIT_LIST_HEAD(&msi->msi_list);
+	msi->data = data;
+	msi->ops = ops;
+	return msi;
+}
+
+void pci_msi_init_pci_dev(struct pci_dev *dev)
+{
 	/* Disable the msi hardware to avoid screaming interrupts
 	 * during boot.  This is the power on reset default so
 	 * usually this should be a noop.
 	 */
 	dev->msi_cap = pci_find_capability(dev, PCI_CAP_ID_MSI);
-	if (dev->msi_cap)
-		msi_set_enable(dev, 0);
-
 	dev->msix_cap = pci_find_capability(dev, PCI_CAP_ID_MSIX);
-	if (dev->msix_cap)
-		msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
+
+	if (dev->msi_cap || dev->msix_cap) {
+		dev->msi = alloc_msi_irqs(dev, &pci_msi);
+		if (!dev->msi)
+			return;
+			
+		dev->msi->node = dev_to_node(&dev->dev);
+		if (dev->msi_cap) 
+			msi_set_enable(dev->msi, 0, MSI_TYPE);
+
+		if (dev->msix_cap) 
+			msi_set_enable(dev->msi, 0, MSIX_TYPE);
+	}
 }
 
 /**
@@ -1060,13 +1125,13 @@ int pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec)
 	int rc;
 	struct msi_desc *entry;
 
-	if (dev->current_state != PCI_D0)
+	if (dev->current_state != PCI_D0 || !dev->msi)
 		return -EINVAL;
 
-	WARN_ON(!!dev->msi_enabled);
+	WARN_ON(!!pci_dev_msi_enabled(dev, MSI_TYPE));
 
 	/* Check whether driver already requested MSI-X irqs */
-	if (dev->msix_enabled) {
+	if (pci_dev_msi_enabled(dev, MSIX_TYPE)) {
 		dev_info(&dev->dev,
 			 "can't enable MSI (MSI-X already enabled)\n");
 		return -EINVAL;
@@ -1095,7 +1160,7 @@ int pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec)
 	} while (rc);
 
 	do {
-		rc = msi_capability_init(dev, nvec);
+		rc = msi_capability_init(dev->msi, nvec);
 		if (rc < 0) {
 			return rc;
 		} else if (rc > 0) {
@@ -1107,14 +1172,14 @@ int pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec)
 
 	rc = populate_msi_sysfs(dev);
 	if (rc) {
-		msi_set_enable(dev, 0);
-		pci_intx_for_msi(dev, 1);
-		dev->msi_enabled = 0;
-		free_msi_irqs(dev);
+		msi_set_enable(dev->msi, 0, MSI_TYPE);
+		pci_intx_for_msi(dev->msi, 1);
+		dev->msi->msi_enabled = 0;
+		free_msi_irqs(dev->msi);
 		return rc;
 	}
 
-	entry = list_entry(dev->msi_list.next, struct msi_desc, list);
+	entry = list_entry(dev->msi->msi_list.next, struct msi_desc, list);
 	dev->irq = entry->irq;
 	return nvec;
 }
@@ -1158,3 +1223,5 @@ int pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries,
 	return nvec;
 }
 EXPORT_SYMBOL(pci_enable_msix_range);
+
+
diff --git a/include/linux/msi.h b/include/linux/msi.h
index 5a672d3..fc8f3e8 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -83,15 +83,15 @@ struct msi_desc {
  * implemented as weak symbols so that they /can/ be overriden by
  * architecture specific code if needed.
  */
-int arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc);
+int arch_setup_msi_irq(struct msi_irqs *msi, struct msi_desc *desc);
 void arch_teardown_msi_irq(unsigned int irq);
-int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type);
-void arch_teardown_msi_irqs(struct pci_dev *dev);
-int arch_msi_check_device(struct pci_dev* dev, int nvec, int type);
-void arch_restore_msi_irqs(struct pci_dev *dev);
+int arch_setup_msi_irqs(struct msi_irqs *msi, int nvec, int type);
+void arch_teardown_msi_irqs(struct msi_irqs *msi);
+int arch_msi_check_device(struct msi_irqs *msi, int nvec, int type);
+void arch_restore_msi_irqs(struct msi_irqs *msi);
 
-void default_teardown_msi_irqs(struct pci_dev *dev);
-void default_restore_msi_irqs(struct pci_dev *dev);
+void default_teardown_msi_irqs(struct msi_irqs *msi);
+void default_restore_msi_irqs(struct msi_irqs *msi);
 u32 default_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag);
 u32 default_msix_mask_irq(struct msi_desc *desc, u32 flag);
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index c7bca1c..d7126fc 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -334,8 +334,6 @@ struct pci_dev {
 	unsigned int	block_cfg_access:1;	/* config space access is blocked */
 	unsigned int	broken_parity_status:1;	/* Device generates false positive parity */
 	unsigned int	irq_reroute_variant:2;	/* device needs IRQ rerouting variant */
-	unsigned int	msi_enabled:1;
-	unsigned int	msix_enabled:1;
 	unsigned int	ari_enabled:1;	/* ARI forwarding */
 	unsigned int	is_managed:1;
 	unsigned int    needs_freset:1; /* Dev requires fundamental reset */
@@ -358,7 +356,7 @@ struct pci_dev {
 	struct bin_attribute *res_attr[DEVICE_COUNT_RESOURCE]; /* sysfs file for resources */
 	struct bin_attribute *res_attr_wc[DEVICE_COUNT_RESOURCE]; /* sysfs file for WC mapping of resources */
 #ifdef CONFIG_PCI_MSI
-	struct list_head msi_list;
+	struct msi_irqs *msi;
 	const struct attribute_group **msi_irq_groups;
 #endif
 	struct pci_vpd *vpd;
@@ -510,11 +508,14 @@ static inline struct pci_dev *pci_upstream_bridge(struct pci_dev *dev)
 static inline bool pci_dev_msi_enabled(struct pci_dev *pci_dev, int type)
 {
 	bool enabled = 0;
+	
+	if (!pci_dev->msi)
+		return false;
 
 	if (type & MSI_TYPE)
-		enabled |= pci_dev->msi_enabled;
+		enabled |= pci_dev->msi->msi_enabled;
 	if (type & MSIX_TYPE)
-		enabled |= pci_dev->msix_enabled;
+		enabled |= pci_dev->msi->msix_enabled;
 
 	return enabled;
 }
-- 
1.7.1



* [RFC PATCH 10/11] PCI/MSI: Split the generic MSI code into new file
  2014-07-26  3:08 [RFC PATCH 00/11] Refactor MSI to support Non-PCI device Yijing Wang
                   ` (8 preceding siblings ...)
  2014-07-26  3:08 ` [RFC PATCH 09/11] PCI/MSI: refactor PCI MSI driver Yijing Wang
@ 2014-07-26  3:08 ` Yijing Wang
  2014-08-20  6:18   ` Bharat.Bhushan
  2014-07-26  3:08 ` [RFC PATCH 11/11] x86/MSI: Refactor x86 MSI code Yijing Wang
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 41+ messages in thread
From: Yijing Wang @ 2014-07-26  3:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, arnab.basu, virtualization, Hanjun Guo,
	Yijing Wang

MSI is no longer used only by PCI devices; more and more
non-PCI devices also want to use it. According to the ARM
GIC v3 spec, on ARM platforms with a GIC v3 controller,
non-PCI devices can also be designed to support MSI in order
to simplify interrupt wiring. For existing non-PCI devices,
a consolidator is used to translate legacy interrupts to MSI.
Supporting non-PCI MSI devices therefore requires a generic
MSI driver. Split the generic MSI code into a new location,
drivers/msi/msi.c, so that the MSI driver no longer depends
on PCI.
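
As a rough illustration of the intended split (a userspace sketch, not
code from this series): the generic layer only touches struct msi_irqs
and calls back through msi_ops, while the device supplies the hooks.
The names demo_dev, DEMO_CTRL_MSI_EN, demo_set_enable and demo_selftest
are hypothetical, the MSI_TYPE/MSIX_TYPE values are assumed, and the
descriptor list and the remaining hooks are left out for brevity.

```c
#include <assert.h>
#include <stdlib.h>

#define MSI_TYPE	0x01	/* flag values assumed for this sketch */
#define MSIX_TYPE	0x02

struct msi_irqs;

/* Device specific hooks; only one of the hooks is modelled here. */
struct msi_ops {
	void (*msi_set_enable)(struct msi_irqs *msi, int enable, int type);
};

/* Reduced model of struct msi_irqs (no msi_list in this sketch). */
struct msi_irqs {
	unsigned int msi_enabled:1;
	unsigned int msix_enabled:1;
	void *data;		/* owning device, opaque to generic code */
	struct msi_ops *ops;
};

/* Hypothetical non-PCI device with a single control register. */
struct demo_dev {
	unsigned int ctrl;
};
#define DEMO_CTRL_MSI_EN	0x1u

static void demo_set_enable(struct msi_irqs *msi, int enable, int type)
{
	struct demo_dev *dev = msi->data;

	if (type != MSI_TYPE)	/* this device has no MSI-X */
		return;
	if (enable)
		dev->ctrl |= DEMO_CTRL_MSI_EN;
	else
		dev->ctrl &= ~DEMO_CTRL_MSI_EN;
}

static struct msi_ops demo_msi_ops = {
	.msi_set_enable = demo_set_enable,
};

/* Generic side: allocate the container and remember the hooks. */
static struct msi_irqs *alloc_msi_irqs(void *data, struct msi_ops *ops)
{
	struct msi_irqs *msi = calloc(1, sizeof(*msi));

	if (!msi)
		return NULL;
	msi->data = data;
	msi->ops = ops;
	return msi;
}

/* Generic side: dispatch to the device hook, if there is one. */
static void msi_set_enable(struct msi_irqs *msi, int enable, int type)
{
	if (!msi || !msi->ops || !msi->ops->msi_set_enable)
		return;
	msi->ops->msi_set_enable(msi, enable, type);
}

/* Walk through enable/disable on the hypothetical device. */
static int demo_selftest(void)
{
	struct demo_dev dev = { .ctrl = 0 };
	struct msi_irqs *msi = alloc_msi_irqs(&dev, &demo_msi_ops);

	if (!msi)
		return -1;
	msi_set_enable(msi, 1, MSI_TYPE);
	assert(dev.ctrl & DEMO_CTRL_MSI_EN);
	msi_set_enable(msi, 1, MSIX_TYPE);	/* ignored by this device */
	assert(dev.ctrl == DEMO_CTRL_MSI_EN);
	msi_set_enable(msi, 0, MSI_TYPE);
	assert(dev.ctrl == 0);
	msi_set_enable(NULL, 1, MSI_TYPE);	/* NULL container is a no-op */
	free(msi);
	return 0;
}
```

The point of the sketch is that msi_set_enable() never looks inside
msi->data; only the device hook knows what kind of device it is.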

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
---
 drivers/Kconfig      |    1 +
 drivers/Makefile     |    1 +
 drivers/msi/Kconfig  |    8 +
 drivers/msi/Makefile |    1 +
 drivers/msi/msi.c    |  540 ++++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/pci/Kconfig  |    6 +-
 drivers/pci/msi.c    |  500 ++++-------------------------------------------
 include/linux/msi.h  |   31 +++-
 8 files changed, 617 insertions(+), 471 deletions(-)
 create mode 100644 drivers/msi/Kconfig
 create mode 100644 drivers/msi/Makefile
 create mode 100644 drivers/msi/msi.c

diff --git a/drivers/Kconfig b/drivers/Kconfig
index 0e87a34..4d05749 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -176,4 +176,5 @@ source "drivers/powercap/Kconfig"
 
 source "drivers/mcb/Kconfig"
 
+source "drivers/msi/Kconfig"
 endmenu
diff --git a/drivers/Makefile b/drivers/Makefile
index f98b50d..47ae3d1 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -158,3 +158,4 @@ obj-$(CONFIG_NTB)		+= ntb/
 obj-$(CONFIG_FMC)		+= fmc/
 obj-$(CONFIG_POWERCAP)		+= powercap/
 obj-$(CONFIG_MCB)		+= mcb/
+obj-$(CONFIG_MSI)		+= msi/
diff --git a/drivers/msi/Kconfig b/drivers/msi/Kconfig
new file mode 100644
index 0000000..739bd13
--- /dev/null
+++ b/drivers/msi/Kconfig
@@ -0,0 +1,8 @@
+config MSI
+	bool "Message Signaled Interrupts (MSI and MSI-X)"
+	default y
+	help
+	  This allows device drivers to use generic MSI (Message
+	  Signaled Interrupts). Message Signaled Interrupts enable
+	  a device to generate an interrupt using an inbound Memory
+	  Write to a specific target address.
diff --git a/drivers/msi/Makefile b/drivers/msi/Makefile
new file mode 100644
index 0000000..39cb026
--- /dev/null
+++ b/drivers/msi/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_MSI) += msi.o
diff --git a/drivers/msi/msi.c b/drivers/msi/msi.c
new file mode 100644
index 0000000..3fbd539
--- /dev/null
+++ b/drivers/msi/msi.c
@@ -0,0 +1,540 @@
+/*
+ * File:	msi.c
+ * Purpose:	Message Signaled Interrupt (MSI)
+ *
+ * Copyright (C) 2014 Huawei Ltd.
+ * Copyright (C) Yijing Wang <wangyijing@huawei.com> 
+ */
+#include <linux/err.h>
+#include <linux/mm.h>
+#include <linux/irq.h>
+#include <linux/interrupt.h>
+#include <linux/export.h>
+#include <linux/ioport.h>
+#include <linux/proc_fs.h>
+#include <linux/msi.h>
+#include <linux/smp.h>
+#include <linux/errno.h>
+#include <linux/io.h>
+#include <linux/slab.h>
+#include <linux/device.h>
+#include <linux/pci.h>
+
+/* Arch hooks */
+
+int __weak arch_setup_msi_irq(struct msi_irqs *msi, struct msi_desc *desc)
+{
+	struct pci_dev *dev = msi->data;
+	struct msi_chip *chip = dev->bus->msi; /* TODO: rework msi_chip to support non-PCI MSI */
+	int err;
+
+	if (!chip || !chip->setup_irq)
+		return -EINVAL;
+
+	err = chip->setup_irq(chip, dev, desc);
+	if (err < 0)
+		return err;
+
+	irq_set_chip_data(desc->irq, chip);
+	return 0;
+}
+
+void __weak arch_teardown_msi_irq(unsigned int irq)
+{
+	struct msi_chip *chip = irq_get_chip_data(irq);
+
+	if (!chip || !chip->teardown_irq)
+		return;
+
+	chip->teardown_irq(chip, irq);
+}
+
+int __weak arch_msi_check_device(struct msi_irqs *msi, int nvec, int type)
+{
+	struct pci_dev *dev = msi->data;
+	struct msi_chip *chip = dev->bus->msi; /* TODO: rework msi_chip to support non-PCI MSI */
+
+	if (!chip || !chip->check_device)
+		return 0;
+
+	return chip->check_device(chip, dev, nvec, type);
+}
+
+int __weak arch_setup_msi_irqs(struct msi_irqs *msi, int nvec, int type)
+{
+	struct msi_desc *entry;
+	int ret;
+
+	/*
+	 * If an architecture wants to support multiple MSI, it needs to
+	 * override arch_setup_msi_irqs()
+	 */
+	if (type == MSI_TYPE && nvec > 1)
+		return 1;
+
+	list_for_each_entry(entry, &msi->msi_list, list) {
+		ret = arch_setup_msi_irq(msi, entry);
+		if (ret < 0)
+			return ret;
+		if (ret > 0)
+			return -ENOSPC;
+	}
+	return 0;
+}
+
+
+void __weak arch_teardown_msi_irqs(struct msi_irqs *msi)
+{
+	return default_teardown_msi_irqs(msi);
+}
+
+/*
+ * We have a default implementation available as a separate non-weak
+ * function, as it is used by the Xen x86 PCI code
+ */
+void default_teardown_msi_irqs(struct msi_irqs *msi)
+{
+	struct msi_desc *entry;
+
+	list_for_each_entry(entry, &msi->msi_list, list) {
+		int i, nvec;
+		if (entry->irq == 0)
+			continue;
+		if (entry->nvec_used)
+			nvec = entry->nvec_used;
+		else
+			nvec = 1 << entry->msi_attrib.multiple;
+		for (i = 0; i < nvec; i++)
+			arch_teardown_msi_irq(entry->irq + i);
+	}
+}
+
+static void default_restore_msi_irq(struct msi_irqs *msi, int irq)
+{
+	struct msi_desc *entry;
+
+	entry = NULL;
+	if (msi->msix_enabled) {
+		list_for_each_entry(entry, &msi->msi_list, list) {
+			if (irq == entry->irq)
+				break;
+		}
+	} else if (msi->msi_enabled)  {
+		entry = irq_get_msi_desc(irq);
+	}
+
+	if (entry)
+		write_msi_msg(irq, &entry->msg);
+}
+
+void default_restore_msi_irqs(struct msi_irqs *msi)
+{
+	struct msi_desc *entry;
+
+	list_for_each_entry(entry, &msi->msi_list, list) {
+		default_restore_msi_irq(msi, entry->irq);
+	}
+}
+
+void __weak arch_restore_msi_irqs(struct msi_irqs *msi)
+{
+	return default_restore_msi_irqs(msi);
+}
+
+u32 default_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
+{
+	struct msi_irqs *msi = desc->msi;
+
+	if (!msi || !msi->ops || !msi->ops->msi_mask_irq)
+		return desc->masked;
+	return msi->ops->msi_mask_irq(desc, mask, flag);
+}
+
+__weak u32 arch_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
+{
+	return default_msi_mask_irq(desc, mask, flag);
+}
+
+void msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
+{
+	desc->masked = arch_msi_mask_irq(desc, mask, flag);
+}
+
+u32 default_msix_mask_irq(struct msi_desc *desc, u32 flag)
+{
+	struct msi_irqs *msi = desc->msi;
+
+	if (!msi || !msi->ops || !msi->ops->msix_mask_irq)
+		return desc->masked;
+
+	return msi->ops->msix_mask_irq(desc, flag);
+}
+
+__weak u32 arch_msix_mask_irq(struct msi_desc *desc, u32 flag)
+{
+	return default_msix_mask_irq(desc, flag);
+}
+
+void msix_mask_irq(struct msi_desc *desc, u32 flag)
+{
+	desc->masked = arch_msix_mask_irq(desc, flag);
+}
+
+static void msi_set_mask_bit(struct irq_data *data, u32 flag)
+{
+	struct msi_desc *desc = irq_data_get_msi(data);
+
+	if (desc->msi_attrib.is_msix) {
+		msix_mask_irq(desc, flag);
+		readl(desc->mask_base);		/* Flush write to device */
+	} else {
+		unsigned offset = data->irq - desc->irq;
+		msi_mask_irq(desc, 1 << offset, flag << offset);
+	}
+}
+
+void mask_msi_irq(struct irq_data *data)
+{
+	msi_set_mask_bit(data, 1);
+}
+
+void unmask_msi_irq(struct irq_data *data)
+{
+	msi_set_mask_bit(data, 0);
+}
+
+void msi_set_enable(struct msi_irqs *msi, int enable, int type)
+{
+	if (!msi || !msi->ops || !msi->ops->msi_set_enable)
+		return;
+	msi->ops->msi_set_enable(msi, enable, type);
+}
+EXPORT_SYMBOL(msi_set_enable);
+
+void __read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
+{
+	struct msi_irqs *msi = entry->msi;
+
+	if (!msi || !msi->ops || !msi->ops->msi_read_message)
+		return;
+	msi->ops->msi_read_message(entry, msg);
+}
+
+void read_msi_msg(unsigned int irq, struct msi_msg *msg)
+{
+	struct msi_desc *entry = irq_get_msi_desc(irq);
+
+	__read_msi_msg(entry, msg);
+}
+
+void __get_cached_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
+{
+	/* Assert that the cache is valid, assuming that
+	 * valid messages are not all-zeroes. */
+	BUG_ON(!(entry->msg.address_hi | entry->msg.address_lo |
+		 entry->msg.data));
+
+	*msg = entry->msg;
+}
+
+void get_cached_msi_msg(unsigned int irq, struct msi_msg *msg)
+{
+	struct msi_desc *entry = irq_get_msi_desc(irq);
+
+	__get_cached_msi_msg(entry, msg);
+}
+
+void __write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
+{
+	struct msi_irqs *msi = entry->msi;
+
+	if (!msi || !msi->ops || !msi->ops->msi_write_message)
+		return;
+	msi->ops->msi_write_message(entry, msg);
+}
+
+void write_msi_msg(unsigned int irq, struct msi_msg *msg)
+{
+	struct msi_desc *entry = irq_get_msi_desc(irq);
+
+	__write_msi_msg(entry, msg);
+}
+
+void free_msi_irqs(struct msi_irqs *msi)
+{
+	struct msi_desc *entry, *tmp;
+
+	list_for_each_entry(entry, &msi->msi_list, list) {
+		int i, nvec;
+		if (!entry->irq)
+			continue;
+		if (entry->nvec_used)
+			nvec = entry->nvec_used;
+		else
+			nvec = 1 << entry->msi_attrib.multiple;
+		for (i = 0; i < nvec; i++)
+			BUG_ON(irq_has_action(entry->irq + i));
+	}
+
+	arch_teardown_msi_irqs(msi);
+
+	list_for_each_entry_safe(entry, tmp, &msi->msi_list, list) {
+		if (entry->msi_attrib.is_msix) {
+			if (list_is_last(&entry->list, &msi->msi_list))
+				iounmap(entry->mask_base);
+		}
+
+		/*
+		 * It's possible that we get into this path
+		 * when populate_msi_sysfs fails, which means the entries
+		 * were not registered with sysfs.  In that case don't
+		 * unregister them.
+		 */
+		if (entry->kobj.parent) {
+			kobject_del(&entry->kobj);
+			kobject_put(&entry->kobj);
+		}
+
+		list_del(&entry->list);
+		kfree(entry);
+	}
+}
+EXPORT_SYMBOL(free_msi_irqs);
+
+struct msi_irqs *alloc_msi_irqs(void *data, struct msi_ops *ops)
+{
+	struct msi_irqs *msi;
+
+	msi = kzalloc(sizeof(struct msi_irqs), GFP_KERNEL);
+	if (!msi)
+		return NULL;
+
+	INIT_LIST_HEAD(&msi->msi_list);
+	msi->data = data;
+	msi->ops = ops;
+	return msi;
+}
+EXPORT_SYMBOL(alloc_msi_irqs);
+
+struct msi_desc *alloc_msi_entry(struct msi_irqs *msi)
+{
+	struct msi_desc *desc = kzalloc(sizeof(*desc), GFP_KERNEL);
+	if (!desc)
+		return NULL;
+
+	INIT_LIST_HEAD(&desc->list);
+	desc->msi = msi;
+
+	return desc;
+}
+EXPORT_SYMBOL(alloc_msi_entry);
+
+static void msi_set_intx(struct msi_irqs *msi, int flag)
+{
+	if (!msi || !msi->ops || !msi->ops->msi_set_intx)
+		return;
+	msi->ops->msi_set_intx(msi, flag);
+}
+
+void msi_shutdown(struct msi_irqs *msi)
+{
+	u32 mask;
+	struct msi_desc *desc;
+
+	if (!msi || !msi->msi_enabled)
+		return;
+
+	BUG_ON(list_empty(&msi->msi_list));
+
+	desc = list_first_entry(&msi->msi_list, struct msi_desc, list);
+	msi_set_enable(msi, 0, MSI_TYPE);
+	msi_set_intx(msi, 1);
+	msi->msi_enabled = 0;
+
+	mask = msi_mask(desc->msi_attrib.multi_cap);
+	arch_msi_mask_irq(desc, mask, ~mask);
+}
+
+void msix_shutdown(struct msi_irqs *msi)
+{
+	struct msi_desc *entry;
+
+	if (!msi || !msi->msix_enabled)
+		return;
+
+	list_for_each_entry(entry, &msi->msi_list, list)
+		arch_msix_mask_irq(entry, 1);
+
+	msi_set_enable(msi, 0, MSIX_TYPE);
+	msi_set_intx(msi, 1);
+	msi->msix_enabled = 0;
+}
+
+static struct msi_desc *msi_setup_entry(struct msi_irqs *msi)
+{
+	struct msi_desc *entry;
+
+	entry = alloc_msi_entry(msi);
+	if (!entry)
+		return NULL;
+
+	entry->msi_attrib.is_msix	= 0;
+	entry->msi_attrib.entry_nr	= 0;
+
+	if (!msi->ops || !msi->ops->msi_setup_entry) {
+		kfree(entry);
+		return NULL;
+	}
+
+	msi->ops->msi_setup_entry(msi, entry);
+	return entry;
+}
+
+static int msix_setup_entries(struct msi_irqs *msi, void __iomem *base,
+			      struct msix_entry *entries, int nvec)
+{
+	struct msi_desc *entry;
+	int i;
+
+	for (i = 0; i < nvec; i++) {
+		entry = alloc_msi_entry(msi);
+		if (!entry) {
+			if (!i)
+				iounmap(base);
+			else
+				free_msi_irqs(msi);
+			/* Not enough memory. Don't try again */
+			return -ENOMEM;
+		}
+
+		entry->msi_attrib.is_msix	= 1;
+		entry->msi_attrib.is_64		= 1;
+		entry->msi_attrib.entry_nr	= entries[i].entry;
+		entry->mask_base		= base;
+
+		list_add_tail(&entry->list, &msi->msi_list);
+	}
+
+	if (msi->ops && msi->ops->msix_setup_entries)
+		return msi->ops->msix_setup_entries(msi, entries);
+
+	return 0;
+}
+
+/**
+ * msi_capability_init - configure device's MSI capability structure
+ * @msi: pointer to the msi_irqs data structure of MSI device function
+ * @nvec: number of interrupts to allocate
+ *
+ * Setup the MSI capability structure of the device with the requested
+ * number of interrupts.  A return value of zero indicates the successful
+ * setup of an entry with the new MSI irq.  A negative return value indicates
+ * an error, and a positive return value indicates the number of interrupts
+ * which could have been allocated.
+ */
+int msi_capability_init(struct msi_irqs *msi, int nvec)
+{
+	struct msi_desc *entry;
+	int ret;
+	unsigned mask;
+
+	msi_set_enable(msi, 0, MSI_TYPE);	/* Disable MSI during set up */
+
+	/* MSI Entry Initialization */
+	entry = msi_setup_entry(msi);
+	if (!entry)
+		return -ENOMEM;
+
+	/* All MSIs are unmasked by default; mask them all */
+	mask = msi_mask(entry->msi_attrib.multi_cap);
+	msi_mask_irq(entry, mask, mask);
+
+	/* Configure MSI capability structure */
+	ret = arch_setup_msi_irqs(msi, nvec, MSI_TYPE);
+	if (ret)
+		goto err;
+
+	/* Set MSI enabled bits */
+	msi_set_intx(msi, 0);
+	msi_set_enable(msi, 1, MSI_TYPE);
+	msi->msi_enabled = 1;
+
+	return 0;
+
+err:
+	msi_mask_irq(entry, mask, ~mask);
+	free_msi_irqs(msi);
+	return ret;
+}
+
+static void msix_program_entries(struct msi_irqs *msi,
+				 struct msix_entry *entries)
+{
+	struct msi_desc *entry;
+	int i = 0;
+
+	list_for_each_entry(entry, &msi->msi_list, list) {
+		entries[i].vector = entry->irq;
+		irq_set_msi_desc(entry->irq, entry);
+		i++;
+	}
+}
+
+/**
+ * msix_capability_init - configure device's MSI-X capability
+ * @msi: pointer to the msi_irqs data structure of MSI-X device function
+ * @base: mapped base address of the MSI-X table
+ * @entries: pointer to an array of struct msix_entry entries
+ * @nvec: number of @entries
+ *
+ * Set up the MSI-X capability structure of the device function with the
+ * requested @entries.  Returns zero on success and non-zero otherwise.
+ **/
+int msix_capability_init(struct msi_irqs *msi, void __iomem *base,
+				struct msix_entry *entries, int nvec)
+{
+	int ret;
+
+	/* Ensure MSI-X is disabled while it is set up */
+	msi_set_enable(msi, 0, MSIX_TYPE);
+
+	ret = msix_setup_entries(msi, base, entries, nvec);
+	if (ret)
+		return ret;
+
+	ret = arch_setup_msi_irqs(msi, nvec, MSIX_TYPE);
+	if (ret)
+		goto out_avail;
+
+	msix_program_entries(msi, entries);
+
+	/* Set MSI-X enabled bits and unmask the function */
+	msi_set_intx(msi, 0);
+	msi->msix_enabled = 1;
+
+	msi_set_enable(msi, 1, MSIX_TYPE);
+
+	return 0;
+
+out_avail:
+	if (ret < 0) {
+		/*
+		 * If we had some success, report the number of irqs
+		 * we succeeded in setting up.
+		 */
+		struct msi_desc *entry;
+		int avail = 0;
+
+		list_for_each_entry(entry, &msi->msi_list, list) {
+			if (entry->irq != 0)
+				avail++;
+		}
+		if (avail != 0)
+			ret = avail;
+	}
+
+	free_msi_irqs(msi);
+
+	return ret;
+}
+
diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 893503f..1a10488 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -2,10 +2,10 @@
 # PCI configuration
 #
 config PCI_MSI
-	bool "Message Signaled Interrupts (MSI and MSI-X)"
-	depends on PCI
+	bool "PCI Message Signaled Interrupts (MSI and MSI-X)"
+	depends on PCI && MSI
 	help
-	   This allows device drivers to enable MSI (Message Signaled
+	   This allows PCI device drivers to enable MSI (Message Signaled
 	   Interrupts).  Message Signaled Interrupts enable a device to
 	   generate an interrupt using an inbound Memory Write on its
 	   PCI bus instead of asserting a device IRQ pin.
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index f0c5989..df7223c 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -26,121 +26,8 @@ static int pci_msi_enable = 1;
 
 #define msix_table_size(flags)	((flags & PCI_MSIX_FLAGS_QSIZE) + 1)
 
-
-/* Arch hooks */
-
-int __weak arch_setup_msi_irq(struct msi_irqs *msi, struct msi_desc *desc)
-{
-	struct pci_dev *dev = msi->data; //TO BE DONE: rework msi_chip to support Non-PCI
-	struct msi_chip *chip = dev->bus->msi;
-	int err;
-
-	if (!chip || !chip->setup_irq)
-		return -EINVAL;
-
-	err = chip->setup_irq(chip, dev, desc);
-	if (err < 0)
-		return err;
-
-	irq_set_chip_data(desc->irq, chip);
-
-	return 0;
-}
-
-void __weak arch_teardown_msi_irq(unsigned int irq)
-{
-	struct msi_chip *chip = irq_get_chip_data(irq);
-
-	if (!chip || !chip->teardown_irq)
-		return;
-
-	chip->teardown_irq(chip, irq);
-}
-
-int __weak arch_msi_check_device(struct msi_irqs *msi, int nvec, int type)
-{
-	struct pci_dev *dev = msi->data; //TO BE DONE: rework msi_chip to support Non-PCI
-	struct msi_chip *chip = dev->bus->msi;
-
-	if (!chip || !chip->check_device)
-		return 0;
-
-	return chip->check_device(chip, dev, nvec, type);
-}
-
-int __weak arch_setup_msi_irqs(struct msi_irqs *msi, int nvec, int type)
-{
-	struct msi_desc *entry;
-	int ret;
-
-	/*
-	 * If an architecture wants to support multiple MSI, it needs to
-	 * override arch_setup_msi_irqs()
-	 */
-	if (type == MSI_TYPE && nvec > 1)
-		return 1;
-
-	list_for_each_entry(entry, &msi->msi_list, list) {
-		ret = arch_setup_msi_irq(msi, entry);
-		if (ret < 0)
-			return ret;
-		if (ret > 0)
-			return -ENOSPC;
-	}
-
-	return 0;
-}
-
-/*
- * We have a default implementation available as a separate non-weak
- * function, as it is used by the Xen x86 PCI code
- */
-void default_teardown_msi_irqs(struct msi_irqs *msi)
-{
-	struct msi_desc *entry;
-
-	list_for_each_entry(entry, &msi->msi_list, list) {
-		int i, nvec;
-		if (entry->irq == 0)
-			continue;
-		if (entry->nvec_used)
-			nvec = entry->nvec_used;
-		else
-			nvec = 1 << entry->msi_attrib.multiple;
-		for (i = 0; i < nvec; i++)
-			arch_teardown_msi_irq(entry->irq + i);
-	}
-}
-
-void __weak arch_teardown_msi_irqs(struct msi_irqs *msi)
-{
-	return default_teardown_msi_irqs(msi);
-}
-
-static void default_restore_msi_irq(struct msi_irqs *msi, int irq)
-{
-	struct msi_desc *entry;
-
-	entry = NULL;
-	if (msi->msix_enabled) {
-		list_for_each_entry(entry, &msi->msi_list, list) {
-			if (irq == entry->irq)
-				break;
-		}
-	} else if (msi->msi_enabled)  {
-		entry = irq_get_msi_desc(irq);
-	}
-
-	if (entry)
-		write_msi_msg(irq, &entry->msg);
-}
-
-void __weak arch_restore_msi_irqs(struct msi_irqs *msi)
-{
-	return default_restore_msi_irqs(msi);
-}
-
-static void msix_clear_and_set_ctrl(struct pci_dev *dev, u16 clear, u16 set)
+static void msix_clear_and_set_ctrl(struct pci_dev *dev,
+		u16 clear, u16 set)
 {
 	u16 ctrl;
 
@@ -150,7 +37,7 @@ static void msix_clear_and_set_ctrl(struct pci_dev *dev, u16 clear, u16 set)
 	pci_write_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, ctrl);
 }
 
-static void msi_set_enable(struct msi_irqs *msi, int enable, int type)
+static void pci_msi_set_enable(struct msi_irqs *msi, int enable, int type)
 {
 	u16 control;
 	struct pci_dev *dev = msi->data;
@@ -169,21 +56,13 @@ static void msi_set_enable(struct msi_irqs *msi, int enable, int type)
 	}
 }
 
-static inline __attribute_const__ u32 msi_mask(unsigned x)
-{
-	/* Don't shift by >= width of type */
-	if (x >= 5)
-		return 0xffffffff;
-	return (1 << (1 << x)) - 1;
-}
-
 /*
  * PCI 2.3 does not specify mask bits for each MSI interrupt.  Attempting to
  * mask all MSI interrupts by clearing the MSI enable bit does not work
  * reliably as devices without an INTx disable bit will then generate a
  * level IRQ which will never be cleared.
  */
-u32 default_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
+u32 pci_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
 {
 	struct pci_dev *dev = desc->msi->data;
 	u32 mask_bits = desc->masked;
@@ -198,16 +77,6 @@ u32 default_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
 	return mask_bits;
 }
 
-__weak u32 arch_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
-{
-	return default_msi_mask_irq(desc, mask, flag);
-}
-
-static void msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
-{
-	desc->masked = arch_msi_mask_irq(desc, mask, flag);
-}
-
 /*
  * This internal function does not flush PCI writes to the device.
  * All users must ensure that they read from the device before either
@@ -215,7 +84,7 @@ static void msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
  * file.  This saves a few milliseconds when initialising devices with lots
  * of MSI-X interrupts.
  */
-u32 default_msix_mask_irq(struct msi_desc *desc, u32 flag)
+u32 pci_msix_mask_irq(struct msi_desc *desc, u32 flag)
 {
 	u32 mask_bits = desc->masked;
 	unsigned offset = desc->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE +
@@ -228,40 +97,7 @@ u32 default_msix_mask_irq(struct msi_desc *desc, u32 flag)
 	return mask_bits;
 }
 
-__weak u32 arch_msix_mask_irq(struct msi_desc *desc, u32 flag)
-{
-	return default_msix_mask_irq(desc, flag);
-}
-
-static void msix_mask_irq(struct msi_desc *desc, u32 flag)
-{
-	desc->masked = arch_msix_mask_irq(desc, flag);
-}
-
-static void msi_set_mask_bit(struct irq_data *data, u32 flag)
-{
-	struct msi_desc *desc = irq_data_get_msi(data);
-
-	if (desc->msi_attrib.is_msix) {
-		msix_mask_irq(desc, flag);
-		readl(desc->mask_base);		/* Flush write to device */
-	} else {
-		unsigned offset = data->irq - desc->irq;
-		msi_mask_irq(desc, 1 << offset, flag << offset);
-	}
-}
-
-void mask_msi_irq(struct irq_data *data)
-{
-	msi_set_mask_bit(data, 1);
-}
-
-void unmask_msi_irq(struct irq_data *data)
-{
-	msi_set_mask_bit(data, 0);
-}
-
-static void msix_set_all_mask(struct msi_irqs *msi, int flag)
+static void pci_msix_set_all_mask(struct msi_irqs *msi, int flag)
 {
 	struct pci_dev *dev = msi->data;
 
@@ -271,16 +107,7 @@ static void msix_set_all_mask(struct msi_irqs *msi, int flag)
 		msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
 }
 
-void default_restore_msi_irqs(struct msi_irqs *msi)
-{
-	struct msi_desc *entry;
-
-	list_for_each_entry(entry, &msi->msi_list, list) {
-		default_restore_msi_irq(msi, entry->irq);
-	}
-}
-
-void __read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
+void pci_read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
 {
 	struct pci_dev *dev = entry->msi->data;
 
@@ -311,31 +138,7 @@ void __read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
 	}
 }
 
-void read_msi_msg(unsigned int irq, struct msi_msg *msg)
-{
-	struct msi_desc *entry = irq_get_msi_desc(irq);
-
-	__read_msi_msg(entry, msg);
-}
-
-void __get_cached_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
-{
-	/* Assert that the cache is valid, assuming that
-	 * valid messages are not all-zeroes. */
-	BUG_ON(!(entry->msg.address_hi | entry->msg.address_lo |
-		 entry->msg.data));
-
-	*msg = entry->msg;
-}
-
-void get_cached_msi_msg(unsigned int irq, struct msi_msg *msg)
-{
-	struct msi_desc *entry = irq_get_msi_desc(irq);
-
-	__get_cached_msi_msg(entry, msg);
-}
-
-void __write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
+void pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
 {
 	struct pci_dev *dev = entry->msi->data;
 
@@ -373,13 +176,6 @@ void __write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
 	entry->msg = *msg;
 }
 
-void write_msi_msg(unsigned int irq, struct msi_msg *msg)
-{
-	struct msi_desc *entry = irq_get_msi_desc(irq);
-
-	__write_msi_msg(entry, msg);
-}
-
 static void free_msi_sysfs(struct pci_dev *dev)
 {
 	struct attribute **msi_attrs;
@@ -403,58 +199,6 @@ static void free_msi_sysfs(struct pci_dev *dev)
 	}
 }
 
-static void free_msi_irqs(struct msi_irqs *msi)
-{
-	struct msi_desc *entry, *tmp;
-
-	list_for_each_entry(entry, &msi->msi_list, list) {
-		int i, nvec;
-		if (!entry->irq)
-			continue;
-		if (entry->nvec_used)
-			nvec = entry->nvec_used;
-		else
-			nvec = 1 << entry->msi_attrib.multiple;
-		for (i = 0; i < nvec; i++)
-			BUG_ON(irq_has_action(entry->irq + i));
-	}
-
-	arch_teardown_msi_irqs(msi);
-
-	list_for_each_entry_safe(entry, tmp, &msi->msi_list, list) {
-		if (entry->msi_attrib.is_msix) {
-			if (list_is_last(&entry->list, &msi->msi_list))
-				iounmap(entry->mask_base);
-		}
-
-		/*
-		 * Its possible that we get into this path
-		 * When populate_msi_sysfs fails, which means the entries
-		 * were not registered with sysfs.  In that case don't
-		 * unregister them.
-		 */
-		if (entry->kobj.parent) {
-			kobject_del(&entry->kobj);
-			kobject_put(&entry->kobj);
-		}
-
-		list_del(&entry->list);
-		kfree(entry);
-	}
-}
-
-static struct msi_desc *alloc_msi_entry(struct msi_irqs *msi)
-{
-	struct msi_desc *desc = kzalloc(sizeof(*desc), GFP_KERNEL);
-	if (!desc)
-		return NULL;
-
-	INIT_LIST_HEAD(&desc->list);
-	desc->msi = msi;
-
-	return desc;
-}
-
 static void pci_intx_for_msi(struct msi_irqs *msi, int enable)
 {
 	struct pci_dev *dev = msi->data;
@@ -474,7 +218,7 @@ static void __pci_restore_msi_state(struct pci_dev *dev)
 	entry = irq_get_msi_desc(dev->irq);
 
 	pci_intx_for_msi(dev->msi, 0);
-	msi_set_enable(dev->msi, 0, MSI_TYPE);
+	pci_msi_set_enable(dev->msi, 0, MSI_TYPE);
 	arch_restore_msi_irqs(dev->msi);
 
 	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
@@ -496,13 +240,13 @@ static void __pci_restore_msix_state(struct pci_dev *dev)
 
 	/* route the table */
 	pci_intx_for_msi(msi, 0);
-	msi_set_enable(msi, 1, MSIX_TYPE);
-	msix_set_all_mask(msi, 1);
+	pci_msi_set_enable(msi, 1, MSIX_TYPE);
+	pci_msix_set_all_mask(msi, 1);
 	arch_restore_msi_irqs(msi);
 	list_for_each_entry(entry, &msi->msi_list, list) 
 		msix_mask_irq(entry, entry->masked);
 
-	msix_set_all_mask(msi, 0);
+	pci_msix_set_all_mask(msi, 0);
 }
 
 void pci_restore_msi_state(struct pci_dev *dev)
@@ -606,22 +350,16 @@ error_attrs:
 	return ret;
 }
 
-static struct msi_desc *msi_setup_entry(struct msi_irqs *msi)
+static struct msi_desc *pci_msi_setup_entry(struct msi_irqs *msi,
+		struct msi_desc *entry)
 {
 	u16 control;
-	struct msi_desc *entry;
 	struct pci_dev *dev = msi->data;
 
 	/* MSI Entry Initialization */
-	entry = alloc_msi_entry(msi);
-	if (!entry)
-		return NULL;
-
 	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
 
-	entry->msi_attrib.is_msix	= 0;
 	entry->msi_attrib.is_64		= !!(control & PCI_MSI_FLAGS_64BIT);
-	entry->msi_attrib.entry_nr	= 0;
 	entry->msi_attrib.maskbit	= !!(control & PCI_MSI_FLAGS_MASKBIT);
 	entry->msi_attrib.default_irq	= dev->irq;	/* Save IOAPIC IRQ */
 	entry->msi_attrib.multi_cap	= (control & PCI_MSI_FLAGS_QMASK) >> 1;
@@ -638,52 +376,6 @@ static struct msi_desc *msi_setup_entry(struct msi_irqs *msi)
 	return entry;
 }
 
-/**
- * msi_capability_init - configure device's MSI capability structure
- * @dev: pointer to the pci_dev data structure of MSI device function
- * @nvec: number of interrupts to allocate
- *
- * Setup the MSI capability structure of the device with the requested
- * number of interrupts.  A return value of zero indicates the successful
- * setup of an entry with the new MSI irq.  A negative return value indicates
- * an error, and a positive return value indicates the number of interrupts
- * which could have been allocated.
- */
-static int msi_capability_init(struct msi_irqs *msi, int nvec)
-{
-	struct msi_desc *entry;
-	int ret;
-	unsigned mask;
-
-	msi_set_enable(msi, 0, MSI_TYPE);	/* Disable MSI during set up */
-
-	entry = msi_setup_entry(msi);
-	if (!entry)
-		return -ENOMEM;
-
-	/* All MSIs are unmasked by default, Mask them all */
-	mask = msi_mask(entry->msi_attrib.multi_cap);
-	msi_mask_irq(entry, mask, mask);
-
-	list_add_tail(&entry->list, &msi->msi_list);
-
-	/* Configure MSI capability structure */
-	ret = arch_setup_msi_irqs(msi, nvec, MSI_TYPE);
-	if (ret)
-		goto err;
-
-	/* Set MSI enabled bits	 */
-	pci_intx_for_msi(msi, 0);
-	msi_set_enable(msi, 1, MSI_TYPE);
-	msi->msi_enabled = 1;
-	return 0;
-
-err:
-	msi_mask_irq(entry, mask, ~mask);
-	free_msi_irqs(msi);
-	return ret;
-}
-
 static void __iomem *msix_map_region(struct pci_dev *dev, unsigned nr_entries)
 {
 	resource_size_t phys_addr;
@@ -699,28 +391,19 @@ static void __iomem *msix_map_region(struct pci_dev *dev, unsigned nr_entries)
 	return ioremap_nocache(phys_addr, nr_entries * PCI_MSIX_ENTRY_SIZE);
 }
 
-static int msix_setup_entries(struct msi_irqs *msi, void __iomem *base,
-			      struct msix_entry *entries, int nvec)
+static int pci_msix_setup_entries(struct msi_irqs *msi, struct msix_entry *entries)
 {
+	int offset, i = 0;
 	struct msi_desc *entry;
-	int i, offset;
 	struct pci_dev *dev = msi->data;
 
-	for (i = 0; i < nvec; i++) {
-		entry = alloc_msi_entry(msi);
-		if (!entry) {
-			if (!i)
-				iounmap(base);
-			else
-				free_msi_irqs(msi);
-			/* No enough memory. Don't try again */
-			return -ENOMEM;
-		}
 
-		entry->msi_attrib.is_msix	= 1;
-		entry->msi_attrib.is_64		= 1;
-		entry->msi_attrib.entry_nr	= entries[i].entry;
-		entry->mask_base		= base;
+	list_for_each_entry(entry, &msi->msi_list, list) {
+		/*
+		 * Some devices require MSI-X to be enabled before we can touch the
+		 * MSI-X registers.  We need to mask all the vectors to prevent
+		 * interrupts coming in before they're fully set up.
+		 */
 
 		msix_clear_and_set_ctrl(dev, 0, 
 				PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE);
@@ -730,87 +413,10 @@ static int msix_setup_entries(struct msi_irqs *msi, void __iomem *base,
 		msix_mask_irq(entry, 1);
 		msix_clear_and_set_ctrl(dev, 
 				PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE, 0);
-
-		list_add_tail(&entry->list, &msi->msi_list);
-	}
-
-	return 0;
-}
-
-static void msix_program_entries(struct msi_irqs *msi,
-				 struct msix_entry *entries)
-{
-	struct msi_desc *entry;
-	int i = 0;
-
-	list_for_each_entry(entry, &msi->msi_list, list) {
-		entries[i].vector = entry->irq;
-		irq_set_msi_desc(entry->irq, entry);
 		i++;
 	}
-}
-
-/**
- * msix_capability_init - configure device's MSI-X capability
- * @dev: pointer to the pci_dev data structure of MSI-X device function
- * @entries: pointer to an array of struct msix_entry entries
- * @nvec: number of @entries
- *
- * Setup the MSI-X capability structure of device function with a
- * single MSI-X irq. A return of zero indicates the successful setup of
- * requested MSI-X entries with allocated irqs or non-zero for otherwise.
- **/
-static int msix_capability_init(struct msi_irqs *msi, void __iomem *base,
-				struct msix_entry *entries, int nvec)
-{
-	int ret;
-
-	/* Ensure MSI-X is disabled while it is set up */
-	msi_set_enable(msi, 0, MSIX_TYPE);
-
-	ret = msix_setup_entries(msi, base, entries, nvec);
-	if (ret)
-		return ret;
-
-	ret = arch_setup_msi_irqs(msi, nvec, MSIX_TYPE);
-	if (ret)
-		goto out_avail;
-
-	/*
-	 * Some devices require MSI-X to be enabled before we can touch the
-	 * MSI-X registers.  We need to mask all the vectors to prevent
-	 * interrupts coming in before they're fully set up.
-	 */
-	msix_program_entries(msi, entries);
-
-	/* Set MSI-X enabled bits and unmask the function */
-	pci_intx_for_msi(msi, 0);
-	msi->msix_enabled = 1;
-
-	msi_set_enable(msi, 1, MSIX_TYPE);
 
 	return 0;
-
-out_avail:
-	if (ret < 0) {
-		/*
-		 * If we had some success, report the number of irqs
-		 * we succeeded in setting up.
-		 */
-		struct msi_desc *entry;
-		int avail = 0;
-
-		list_for_each_entry(entry, &msi->msi_list, list) {
-			if (entry->irq != 0)
-				avail++;
-		}
-		if (avail != 0)
-			ret = avail;
-	}
-
-	free_msi_irqs(msi);
-
-	return ret;
 }
 
 /**
@@ -886,25 +492,14 @@ EXPORT_SYMBOL(pci_msi_vec_count);
 void pci_msi_shutdown(struct pci_dev *dev)
 {
 	struct msi_desc *desc;
-	u32 mask;
 
 	if (!pci_msi_enable || !dev || 
 			!pci_dev_msi_enabled(dev, MSI_TYPE))
 		return;
 
-	BUG_ON(list_empty(&dev->msi->msi_list));
-	desc = list_first_entry(&dev->msi->msi_list, struct msi_desc, list);
-
-	msi_set_enable(dev->msi, 0, MSI_TYPE);
-	pci_intx_for_msi(dev->msi, 1);
-	dev->msi->msi_enabled = 0;
-
-	/* Return the device with MSI unmasked as initial states */
-	mask = msi_mask(desc->msi_attrib.multi_cap);
-	/* Keep cached state to be restored */
-	arch_msi_mask_irq(desc, mask, ~mask);
-
+	msi_shutdown(dev->msi);
 	/* Restore dev->irq to its default pin-assertion irq */
+	desc = list_first_entry(&dev->msi->msi_list, struct msi_desc, list);
 	dev->irq = desc->msi_attrib.default_irq;
 }
 
@@ -1014,20 +609,10 @@ EXPORT_SYMBOL(pci_enable_msix);
 
 void pci_msix_shutdown(struct pci_dev *dev)
 {
-	struct msi_desc *entry;
-
-	if (!pci_msi_enable || !dev || !pci_dev_msi_enabled(dev, MSIX_TYPE))
+	if (!pci_msi_enable || !dev)
 		return;
 
-	/* Return the device with MSI-X masked as initial states */
-	list_for_each_entry(entry, &dev->msi->msi_list, list) {
-		/* Keep cached states to be restored */
-		arch_msix_mask_irq(entry, 1);
-	}
-
-	msi_set_enable(dev->msi, 0, MSIX_TYPE);
-	pci_intx_for_msi(dev->msi, 1);
-	dev->msi->msix_enabled = 0;
+	msix_shutdown(dev->msi);
 }
 
 void pci_disable_msix(struct pci_dev *dev)
@@ -1060,30 +645,16 @@ int pci_msi_enabled(void)
 EXPORT_SYMBOL(pci_msi_enabled);
 
 static struct msi_ops pci_msi = {
-	.msi_set_enable = msi_set_enable,
-	.msi_setup_entry = msi_setup_entry,
-	.msix_setup_entries = msix_setup_entries,
-	.msi_mask_irq = default_msi_mask_irq,
-	.msix_mask_irq = default_msix_mask_irq,
-	.msi_read_message = __read_msi_msg,
-	.msi_write_message = __write_msi_msg,
+	.msi_set_enable = pci_msi_set_enable,
+	.msi_setup_entry = pci_msi_setup_entry,
+	.msix_setup_entries = pci_msix_setup_entries,
+	.msi_mask_irq = pci_msi_mask_irq,
+	.msix_mask_irq = pci_msix_mask_irq,
+	.msi_read_message = pci_read_msi_msg,
+	.msi_write_message = pci_write_msi_msg,
 	.msi_set_intx =  pci_intx_for_msi,
 };
 
-struct msi_irqs *alloc_msi_irqs(void *data, struct msi_ops *ops)
-{
-	struct msi_irqs *msi;
-
-	msi = kzalloc(sizeof(struct msi_irqs), GFP_KERNEL);
-	if (!msi)
-		return NULL;
-
-	INIT_LIST_HEAD(&msi->msi_list);
-	msi->data = data;
-	msi->ops = ops;
-	return msi;
-}
-
 void pci_msi_init_pci_dev(struct pci_dev *dev)
 {
 	/* Disable the msi hardware to avoid screaming interrupts
@@ -1100,10 +671,10 @@ void pci_msi_init_pci_dev(struct pci_dev *dev)
 			
 		dev->msi->node = dev_to_node(&dev->dev);
 		if (dev->msi_cap) 
-			msi_set_enable(dev->msi, 0, MSI_TYPE);
+			pci_msi_set_enable(dev->msi, 0, MSI_TYPE);
 
 		if (dev->msix_cap) 
-			msi_set_enable(dev->msi, 0, MSIX_TYPE);
+			pci_msi_set_enable(dev->msi, 0, MSIX_TYPE);
 	}
 }
 
@@ -1224,4 +795,3 @@ int pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries,
 }
 EXPORT_SYMBOL(pci_enable_msix_range);
 
-
diff --git a/include/linux/msi.h b/include/linux/msi.h
index fc8f3e8..87ed0dd 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -28,9 +28,9 @@ struct msix_entry {
 
 struct msi_ops {
 	void (*msi_set_enable)(struct msi_irqs *msi, int enable, int type);
-	struct msi_desc *(*msi_setup_entry)(struct msi_irqs *msi);
-	int (*msix_setup_entries)(struct msi_irqs *msi, void __iomem *base,
-			struct msix_entry *entries, int nvec);
+	struct msi_desc *(*msi_setup_entry)(struct msi_irqs *msi,
+			struct msi_desc *entry);
+	int (*msix_setup_entries)(struct msi_irqs *msi, struct msix_entry *entries);
 	u32 (*msi_mask_irq)(struct msi_desc *desc, u32 mask, u32 flag);
 	u32 (*msix_mask_irq)(struct msi_desc *desc, u32 flag);
 	void (*msi_read_message)(struct msi_desc *desc, struct msi_msg *msg);
@@ -49,6 +49,18 @@ void __write_msi_msg(struct msi_desc *entry, struct msi_msg *msg);
 void read_msi_msg(unsigned int irq, struct msi_msg *msg);
 void get_cached_msi_msg(unsigned int irq, struct msi_msg *msg);
 void write_msi_msg(unsigned int irq, struct msi_msg *msg);
+struct msi_desc *alloc_msi_entry(struct msi_irqs *msi);
+void msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag);
+void msix_mask_irq(struct msi_desc *desc, u32 flag);
+void msi_set_enable(struct msi_irqs *msi, int enable, int type);
+
+struct msi_irqs *alloc_msi_irqs(void *data, struct msi_ops *ops);
+
+void free_msi_irqs(struct msi_irqs *msi);
+
+int msi_capability_init(struct msi_irqs *msi, int nvec);
+int msix_capability_init(struct msi_irqs *msi, void __iomem *base,
+		struct msix_entry *entries, int nvec);
 
 struct msi_desc {
 	struct {
@@ -89,12 +101,17 @@ int arch_setup_msi_irqs(struct msi_irqs *msi, int nvec, int type);
 void arch_teardown_msi_irqs(struct msi_irqs *msi);
 int arch_msi_check_device(struct msi_irqs *msi, int nvec, int type);
 void arch_restore_msi_irqs(struct msi_irqs *msi);
+u32 arch_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag);
+u32 arch_msix_mask_irq(struct msi_desc *desc, u32 flag);
 
 void default_teardown_msi_irqs(struct msi_irqs *msi);
 void default_restore_msi_irqs(struct msi_irqs *msi);
 u32 default_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag);
 u32 default_msix_mask_irq(struct msi_desc *desc, u32 flag);
 
+void msi_shutdown(struct msi_irqs *msi);
+void msix_shutdown(struct msi_irqs *msi);
+
 #define MSI_TYPE	0x01
 #define MSIX_TYPE	0x02
 
@@ -111,4 +128,12 @@ struct msi_chip {
 			    int nvec, int type);
 };
 
+static inline __attribute_const__ u32 msi_mask(unsigned x)
+{
+	/* Don't shift by >= width of type */
+	if (x >= 5)
+		return 0xffffffff;
+	return (1 << (1 << x)) - 1;
+}
+
 #endif /* LINUX_MSI_H */
-- 
1.7.1



* [RFC PATCH 11/11] x86/MSI: Refactor x86 MSI code
  2014-07-26  3:08 [RFC PATCH 00/11] Refactor MSI to support Non-PCI device Yijing Wang
                   ` (9 preceding siblings ...)
  2014-07-26  3:08 ` [RFC PATCH 10/11] PCI/MSI: Split the generic MSI code into new file Yijing Wang
@ 2014-07-26  3:08 ` Yijing Wang
  2014-08-20  6:20   ` Bharat.Bhushan
  2014-07-29 14:08 ` [RFC PATCH 00/11] Refactor MSI to support Non-PCI device Arnd Bergmann
  2014-08-01 10:27 ` arnab.basu
  12 siblings, 1 reply; 41+ messages in thread
From: Yijing Wang @ 2014-07-26  3:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, arnab.basu, virtualization, Hanjun Guo,
	Yijing Wang

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
---
 arch/x86/include/asm/io_apic.h       |    2 +-
 arch/x86/include/asm/irq_remapping.h |    4 +-
 arch/x86/include/asm/pci.h           |    6 ++--
 arch/x86/include/asm/x86_init.h      |   10 +++---
 arch/x86/kernel/apic/io_apic.c       |   23 +++++++--------
 arch/x86/kernel/x86_init.c           |   12 ++++----
 drivers/iommu/amd_iommu.c            |   16 ++++++----
 drivers/iommu/intel_irq_remapping.c  |    9 ++++--
 drivers/iommu/irq_remapping.c        |   51 ++++++++++++++++-----------------
 drivers/iommu/irq_remapping.h        |    6 ++--
 drivers/msi/msi.c                    |    3 +-
 11 files changed, 72 insertions(+), 70 deletions(-)

diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
index 90f97b4..692a90f 100644
--- a/arch/x86/include/asm/io_apic.h
+++ b/arch/x86/include/asm/io_apic.h
@@ -158,7 +158,7 @@ extern int native_setup_ioapic_entry(int, struct IO_APIC_route_entry *,
 				     struct io_apic_irq_attr *);
 extern void eoi_ioapic_irq(unsigned int irq, struct irq_cfg *cfg);
 
-extern void native_compose_msi_msg(struct pci_dev *pdev,
+extern void native_compose_msi_msg(struct msi_irqs *msi,
 				   unsigned int irq, unsigned int dest,
 				   struct msi_msg *msg, u8 hpet_id);
 extern void native_eoi_ioapic_pin(int apic, int pin, int vector);
diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
index b7747c4..a10003d 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -47,7 +47,7 @@ extern int setup_ioapic_remapped_entry(int irq,
 				       int vector,
 				       struct io_apic_irq_attr *attr);
 extern void free_remapped_irq(int irq);
-extern void compose_remapped_msi_msg(struct pci_dev *pdev,
+extern void compose_remapped_msi_msg(struct msi_irqs *msi,
 				     unsigned int irq, unsigned int dest,
 				     struct msi_msg *msg, u8 hpet_id);
 extern int setup_hpet_msi_remapped(unsigned int irq, unsigned int id);
@@ -77,7 +77,7 @@ static inline int setup_ioapic_remapped_entry(int irq,
 	return -ENODEV;
 }
 static inline void free_remapped_irq(int irq) { }
-static inline void compose_remapped_msi_msg(struct pci_dev *pdev,
+static inline void compose_remapped_msi_msg(struct msi_irqs *msi,
 					    unsigned int irq, unsigned int dest,
 					    struct msi_msg *msg, u8 hpet_id)
 {
diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index 0892ea0..04c9ef6 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -96,10 +96,10 @@ extern void pci_iommu_alloc(void);
 #ifdef CONFIG_PCI_MSI
 /* implemented in arch/x86/kernel/apic/io_apic. */
 struct msi_desc;
-int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type);
+int native_setup_msi_irqs(struct msi_irqs *msi, int nvec, int type);
 void native_teardown_msi_irq(unsigned int irq);
-void native_restore_msi_irqs(struct pci_dev *dev);
-int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
+void native_restore_msi_irqs(struct msi_irqs *msi);
+int setup_msi_irq(struct msi_irqs *msi, struct msi_desc *msidesc,
 		  unsigned int irq_base, unsigned int irq_offset);
 #else
 #define native_setup_msi_irqs		NULL
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index e45e4da..8e42f17 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -170,18 +170,18 @@ struct x86_platform_ops {
 	void (*apic_post_init)(void);
 };
 
-struct pci_dev;
+struct msi_irqs;
 struct msi_msg;
 struct msi_desc;
 
 struct x86_msi_ops {
-	int (*setup_msi_irqs)(struct pci_dev *dev, int nvec, int type);
-	void (*compose_msi_msg)(struct pci_dev *dev, unsigned int irq,
+	int (*setup_msi_irqs)(struct msi_irqs *msi, int nvec, int type);
+	void (*compose_msi_msg)(struct msi_irqs *msi, unsigned int irq,
 				unsigned int dest, struct msi_msg *msg,
 			       u8 hpet_id);
 	void (*teardown_msi_irq)(unsigned int irq);
-	void (*teardown_msi_irqs)(struct pci_dev *dev);
-	void (*restore_msi_irqs)(struct pci_dev *dev);
+	void (*teardown_msi_irqs)(struct msi_irqs *msi);
+	void (*restore_msi_irqs)(struct msi_irqs *msi);
 	int  (*setup_hpet_msi)(unsigned int irq, unsigned int id);
 	u32 (*msi_mask_irq)(struct msi_desc *desc, u32 mask, u32 flag);
 	u32 (*msix_mask_irq)(struct msi_desc *desc, u32 flag);
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index b833042..3cb4a6a 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2939,7 +2939,7 @@ void arch_teardown_hwirq(unsigned int irq)
 /*
  * MSI message composition
  */
-void native_compose_msi_msg(struct pci_dev *pdev,
+void native_compose_msi_msg(struct msi_irqs *msi,
 			    unsigned int irq, unsigned int dest,
 			    struct msi_msg *msg, u8 hpet_id)
 {
@@ -2970,7 +2970,7 @@ void native_compose_msi_msg(struct pci_dev *pdev,
 }
 
 #ifdef CONFIG_PCI_MSI
-static int msi_compose_msg(struct pci_dev *pdev, unsigned int irq,
+static int msi_compose_msg(struct msi_irqs *msi, unsigned int irq,
 			   struct msi_msg *msg, u8 hpet_id)
 {
 	struct irq_cfg *cfg;
@@ -2990,7 +2990,7 @@ static int msi_compose_msg(struct pci_dev *pdev, unsigned int irq,
 	if (err)
 		return err;
 
-	x86_msi.compose_msi_msg(pdev, irq, dest, msg, hpet_id);
+	x86_msi.compose_msi_msg(msi, irq, dest, msg, hpet_id);
 
 	return 0;
 }
@@ -3032,15 +3032,16 @@ static struct irq_chip msi_chip = {
 	.irq_retrigger		= ioapic_retrigger_irq,
 };
 
-int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
+int setup_msi_irq(struct msi_irqs *msi, struct msi_desc *msidesc,
 		  unsigned int irq_base, unsigned int irq_offset)
 {
 	struct irq_chip *chip = &msi_chip;
 	struct msi_msg msg;
 	unsigned int irq = irq_base + irq_offset;
 	int ret;
+	struct pci_dev *dev = msi->data;
 
-	ret = msi_compose_msg(dev, irq, &msg, -1);
+	ret = msi_compose_msg(msi, irq, &msg, -1);
 	if (ret < 0)
 		return ret;
 
@@ -3062,24 +3063,22 @@ int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
 	return 0;
 }
 
-int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
+int native_setup_msi_irqs(struct msi_irqs *msi, int nvec, int type)
 {
 	struct msi_desc *msidesc;
 	unsigned int irq;
-	int node, ret;
+	int ret;
 
 	/* Multiple MSI vectors only supported with interrupt remapping */
 	if (type == MSI_TYPE && nvec > 1)
 		return 1;
 
-	node = dev_to_node(&dev->dev);
-
-	list_for_each_entry(msidesc, &dev->msi_list, list) {
-		irq = irq_alloc_hwirq(node);
+	list_for_each_entry(msidesc, &msi->msi_list, list) {
+		irq = irq_alloc_hwirq(msi->node);
 		if (!irq)
 			return -ENOSPC;
 
-		ret = setup_msi_irq(dev, msidesc, irq, 0);
+		ret = setup_msi_irq(msi, msidesc, irq, 0);
 		if (ret < 0) {
 			irq_free_hwirq(irq);
 			return ret;
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index e48b674..a277faf 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -121,14 +121,14 @@ struct x86_msi_ops x86_msi = {
 };
 
 /* MSI arch specific hooks */
-int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
+int arch_setup_msi_irqs(struct msi_irqs *msi, int nvec, int type)
 {
-	return x86_msi.setup_msi_irqs(dev, nvec, type);
+	return x86_msi.setup_msi_irqs(msi, nvec, type);
 }
 
-void arch_teardown_msi_irqs(struct pci_dev *dev)
+void arch_teardown_msi_irqs(struct msi_irqs *msi)
 {
-	x86_msi.teardown_msi_irqs(dev);
+	x86_msi.teardown_msi_irqs(msi);
 }
 
 void arch_teardown_msi_irq(unsigned int irq)
@@ -136,9 +136,9 @@ void arch_teardown_msi_irq(unsigned int irq)
 	x86_msi.teardown_msi_irq(irq);
 }
 
-void arch_restore_msi_irqs(struct pci_dev *dev)
+void arch_restore_msi_irqs(struct msi_irqs *msi)
 {
-	x86_msi.restore_msi_irqs(dev);
+	x86_msi.restore_msi_irqs(msi);
 }
 u32 arch_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
 {
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 4aec6a2..0e45cb7 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -4237,7 +4237,7 @@ static int free_irq(int irq)
 	return 0;
 }
 
-static void compose_msi_msg(struct pci_dev *pdev,
+static void compose_msi_msg(struct msi_irqs *msi,
 			    unsigned int irq, unsigned int dest,
 			    struct msi_msg *msg, u8 hpet_id)
 {
@@ -4265,33 +4265,35 @@ static void compose_msi_msg(struct pci_dev *pdev,
 	msg->data       = irte_info->index;
 }
 
-static int msi_alloc_irq(struct pci_dev *pdev, int irq, int nvec)
+static int msi_alloc_irq(struct msi_irqs *msi, int irq, int nvec)
 {
 	struct irq_cfg *cfg;
 	int index;
 	u16 devid;
+	struct pci_dev *dev = msi->data;
 
-	if (!pdev)
+	if (!dev)
 		return -EINVAL;
 
 	cfg = irq_get_chip_data(irq);
 	if (!cfg)
 		return -EINVAL;
 
-	devid = get_device_id(&pdev->dev);
+	devid = get_device_id(&dev->dev);
 	index = alloc_irq_index(cfg, devid, nvec);
 
 	return index < 0 ? MAX_IRQS_PER_TABLE : index;
 }
 
-static int msi_setup_irq(struct pci_dev *pdev, unsigned int irq,
+static int msi_setup_irq(struct msi_irqs *msi, unsigned int irq,
 			 int index, int offset)
 {
 	struct irq_2_irte *irte_info;
 	struct irq_cfg *cfg;
 	u16 devid;
+	struct pci_dev *dev = msi->data;
 
-	if (!pdev)
+	if (!dev)
 		return -EINVAL;
 
 	cfg = irq_get_chip_data(irq);
@@ -4301,7 +4303,7 @@ static int msi_setup_irq(struct pci_dev *pdev, unsigned int irq,
 	if (index >= MAX_IRQS_PER_TABLE)
 		return 0;
 
-	devid		= get_device_id(&pdev->dev);
+	devid		= get_device_id(&dev->dev);
 	irte_info	= &cfg->irq_2_irte;
 
 	cfg->remapped	      = 1;
diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
index 9b17489..d6bde63 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -1027,7 +1027,7 @@ intel_ioapic_set_affinity(struct irq_data *data, const struct cpumask *mask,
 	return 0;
 }
 
-static void intel_compose_msi_msg(struct pci_dev *pdev,
+static void intel_compose_msi_msg(struct msi_irqs *msi,
 				  unsigned int irq, unsigned int dest,
 				  struct msi_msg *msg, u8 hpet_id)
 {
@@ -1035,6 +1035,7 @@ static void intel_compose_msi_msg(struct pci_dev *pdev,
 	struct irte irte;
 	u16 sub_handle = 0;
 	int ir_index;
+	struct pci_dev *pdev = msi->data;
 
 	cfg = irq_get_chip_data(irq);
 
@@ -1064,10 +1065,11 @@ static void intel_compose_msi_msg(struct pci_dev *pdev,
  * and allocate 'nvec' consecutive interrupt-remapping table entries
  * in it.
  */
-static int intel_msi_alloc_irq(struct pci_dev *dev, int irq, int nvec)
+static int intel_msi_alloc_irq(struct msi_irqs *msi, int irq, int nvec)
 {
 	struct intel_iommu *iommu;
 	int index;
+	struct pci_dev *dev = msi->data;
 
 	down_read(&dmar_global_lock);
 	iommu = map_dev_to_ir(dev);
@@ -1089,11 +1091,12 @@ static int intel_msi_alloc_irq(struct pci_dev *dev, int irq, int nvec)
 	return index;
 }
 
-static int intel_msi_setup_irq(struct pci_dev *pdev, unsigned int irq,
+static int intel_msi_setup_irq(struct msi_irqs *msi, unsigned int irq,
 			       int index, int sub_handle)
 {
 	struct intel_iommu *iommu;
 	int ret = -ENOENT;
+	struct pci_dev *pdev = msi->data;
 
 	down_read(&dmar_global_lock);
 	iommu = map_dev_to_ir(pdev);
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index a3b1805..1fe14e5 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -24,8 +24,8 @@ int no_x2apic_optout;
 
 static struct irq_remap_ops *remap_ops;
 
-static int msi_alloc_remapped_irq(struct pci_dev *pdev, int irq, int nvec);
-static int msi_setup_remapped_irq(struct pci_dev *pdev, unsigned int irq,
+static int msi_alloc_remapped_irq(struct msi_irqs *msi, int irq, int nvec);
+static int msi_setup_remapped_irq(struct msi_irqs *msi, unsigned int irq,
 				  int index, int sub_handle);
 static int set_remapped_irq_affinity(struct irq_data *data,
 				     const struct cpumask *mask,
@@ -49,19 +49,19 @@ static void irq_remapping_disable_io_apic(void)
 		disconnect_bsp_APIC(0);
 }
 
-static int do_setup_msi_irqs(struct pci_dev *dev, int nvec)
+static int do_setup_msi_irqs(struct msi_irqs *msi, int nvec)
 {
 	int ret, sub_handle, nvec_pow2, index = 0;
 	unsigned int irq;
 	struct msi_desc *msidesc;
 
-	WARN_ON(!list_is_singular(&dev->msi_list));
-	msidesc = list_entry(dev->msi_list.next, struct msi_desc, list);
+	WARN_ON(!list_is_singular(&msi->msi_list));
+	msidesc = list_entry(msi->msi_list.next, struct msi_desc, list);
 	WARN_ON(msidesc->irq);
 	WARN_ON(msidesc->msi_attrib.multiple);
 	WARN_ON(msidesc->nvec_used);
 
-	irq = irq_alloc_hwirqs(nvec, dev_to_node(&dev->dev));
+	irq = irq_alloc_hwirqs(nvec, msi->node);
 	if (irq == 0)
 		return -ENOSPC;
 
@@ -70,18 +70,18 @@ static int do_setup_msi_irqs(struct pci_dev *dev, int nvec)
 	msidesc->msi_attrib.multiple = ilog2(nvec_pow2);
 	for (sub_handle = 0; sub_handle < nvec; sub_handle++) {
 		if (!sub_handle) {
-			index = msi_alloc_remapped_irq(dev, irq, nvec_pow2);
+			index = msi_alloc_remapped_irq(msi, irq, nvec_pow2);
 			if (index < 0) {
 				ret = index;
 				goto error;
 			}
 		} else {
-			ret = msi_setup_remapped_irq(dev, irq + sub_handle,
+			ret = msi_setup_remapped_irq(msi, irq + sub_handle,
 						     index, sub_handle);
 			if (ret < 0)
 				goto error;
 		}
-		ret = setup_msi_irq(dev, msidesc, irq, sub_handle);
+		ret = setup_msi_irq(msi, msidesc, irq, sub_handle);
 		if (ret < 0)
 			goto error;
 	}
@@ -101,30 +101,29 @@ error:
 	return ret;
 }
 
-static int do_setup_msix_irqs(struct pci_dev *dev, int nvec)
+static int do_setup_msix_irqs(struct msi_irqs *msi, int nvec)
 {
-	int node, ret, sub_handle, index = 0;
+	int ret, sub_handle, index = 0;
 	struct msi_desc *msidesc;
 	unsigned int irq;
 
-	node		= dev_to_node(&dev->dev);
 	sub_handle	= 0;
 
-	list_for_each_entry(msidesc, &dev->msi_list, list) {
+	list_for_each_entry(msidesc, &msi->msi_list, list) {
 
-		irq = irq_alloc_hwirq(node);
+		irq = irq_alloc_hwirq(msi->node);
 		if (irq == 0)
 			return -1;
 
 		if (sub_handle == 0)
-			ret = index = msi_alloc_remapped_irq(dev, irq, nvec);
+			ret = index = msi_alloc_remapped_irq(msi, irq, nvec);
 		else
-			ret = msi_setup_remapped_irq(dev, irq, index, sub_handle);
+			ret = msi_setup_remapped_irq(msi, irq, index, sub_handle);
 
 		if (ret < 0)
 			goto error;
 
-		ret = setup_msi_irq(dev, msidesc, irq, 0);
+		ret = setup_msi_irq(msi, msidesc, irq, 0);
 		if (ret < 0)
 			goto error;
 
@@ -139,13 +138,13 @@ error:
 	return ret;
 }
 
-static int irq_remapping_setup_msi_irqs(struct pci_dev *dev,
+static int irq_remapping_setup_msi_irqs(struct msi_irqs *msi,
 					int nvec, int type)
 {
 	if (type == MSI_TYPE)
-		return do_setup_msi_irqs(dev, nvec);
+		return do_setup_msi_irqs(msi, nvec);
 	else
-		return do_setup_msix_irqs(dev, nvec);
+		return do_setup_msix_irqs(msi, nvec);
 }
 
 static void eoi_ioapic_pin_remapped(int apic, int pin, int vector)
@@ -314,33 +313,33 @@ void free_remapped_irq(int irq)
 		remap_ops->free_irq(irq);
 }
 
-void compose_remapped_msi_msg(struct pci_dev *pdev,
+void compose_remapped_msi_msg(struct msi_irqs *msi,
 			      unsigned int irq, unsigned int dest,
 			      struct msi_msg *msg, u8 hpet_id)
 {
 	struct irq_cfg *cfg = irq_get_chip_data(irq);
 
 	if (!irq_remapped(cfg))
-		native_compose_msi_msg(pdev, irq, dest, msg, hpet_id);
+		native_compose_msi_msg(msi, irq, dest, msg, hpet_id);
 	else if (remap_ops && remap_ops->compose_msi_msg)
-		remap_ops->compose_msi_msg(pdev, irq, dest, msg, hpet_id);
+		remap_ops->compose_msi_msg(msi, irq, dest, msg, hpet_id);
 }
 
-static int msi_alloc_remapped_irq(struct pci_dev *pdev, int irq, int nvec)
+static int msi_alloc_remapped_irq(struct msi_irqs *msi, int irq, int nvec)
 {
 	if (!remap_ops || !remap_ops->msi_alloc_irq)
 		return -ENODEV;
 
-	return remap_ops->msi_alloc_irq(pdev, irq, nvec);
+	return remap_ops->msi_alloc_irq(msi, irq, nvec);
 }
 
-static int msi_setup_remapped_irq(struct pci_dev *pdev, unsigned int irq,
+static int msi_setup_remapped_irq(struct msi_irqs *msi, unsigned int irq,
 				  int index, int sub_handle)
 {
 	if (!remap_ops || !remap_ops->msi_setup_irq)
 		return -ENODEV;
 
-	return remap_ops->msi_setup_irq(pdev, irq, index, sub_handle);
+	return remap_ops->msi_setup_irq(msi, irq, index, sub_handle);
 }
 
 int setup_hpet_msi_remapped(unsigned int irq, unsigned int id)
diff --git a/drivers/iommu/irq_remapping.h b/drivers/iommu/irq_remapping.h
index 90c4dae..59c4cfb 100644
--- a/drivers/iommu/irq_remapping.h
+++ b/drivers/iommu/irq_remapping.h
@@ -69,15 +69,15 @@ struct irq_remap_ops {
 	int (*free_irq)(int);
 
 	/* Create MSI msg to use for interrupt remapping */
-	void (*compose_msi_msg)(struct pci_dev *,
+	void (*compose_msi_msg)(struct msi_irqs *,
 				unsigned int, unsigned int,
 				struct msi_msg *, u8);
 
 	/* Allocate remapping resources for MSI */
-	int (*msi_alloc_irq)(struct pci_dev *, int, int);
+	int (*msi_alloc_irq)(struct msi_irqs *, int, int);
 
 	/* Setup the remapped MSI irq */
-	int (*msi_setup_irq)(struct pci_dev *, unsigned int, int, int);
+	int (*msi_setup_irq)(struct msi_irqs *, unsigned int, int, int);
 
 	/* Setup interrupt remapping for an HPET MSI */
 	int (*setup_hpet_msi)(unsigned int, unsigned int);
diff --git a/drivers/msi/msi.c b/drivers/msi/msi.c
index 3fbd539..8462c6c 100644
--- a/drivers/msi/msi.c
+++ b/drivers/msi/msi.c
@@ -510,9 +510,8 @@ int msix_capability_init(struct msi_irqs *msi, void __iomem *base,
 
 	/* Set MSI-X enabled bits and unmask the function */
 	msi_set_intx(msi, 0);
-	msi->msix_enabled = 1;
-
 	msi_set_enable(msi, 1, MSIX_TYPE);
+	msi->msix_enabled = 1;
 
 	return 0;
 
-- 
1.7.1



* Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
  2014-07-26  3:08 [RFC PATCH 00/11] Refactor MSI to support Non-PCI device Yijing Wang
                   ` (10 preceding siblings ...)
  2014-07-26  3:08 ` [RFC PATCH 11/11] x86/MSI: Refactor x86 MSI code Yijing Wang
@ 2014-07-29 14:08 ` Arnd Bergmann
  2014-07-30  2:45   ` Yijing Wang
  2014-08-01 10:27 ` arnab.basu
  12 siblings, 1 reply; 41+ messages in thread
From: Arnd Bergmann @ 2014-07-29 14:08 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Yijing Wang, linux-kernel, linux-arch, Russell King, Paul.Mundt,
	Marc Zyngier, linux-pci, James E.J. Bottomley, virtualization,
	Xinwei Hu, Hanjun Guo, Bjorn Helgaas, Wuyun

On Saturday 26 July 2014 11:08:37 Yijing Wang wrote:
>         The series is a draft of generic MSI driver that supports PCI
> and Non-PCI device which have MSI capability. If you're not interested
> it, sorry for the noise.

I've finally managed to take some time to look at the series. Overall,
the concept looks good to me, and the patches look very well implemented.

The part I'm not sure about is the interface we want to end up with
at the end of the series. More on that below.

> The series is based on Linux-3.16-rc1.
> 
> MSI was introduced in PCI Spec 2.2. Currently, kernel MSI 
> driver codes are bonding with PCI device. Because MSI has a lot
> advantages in design. More and more non-PCI devices want to
> use MSI as their default interrupt. The existing MSI device
> include HPET. HPET driver provide its own MSI code to initialize
> and process MSI interrupts. In the latest GIC v3 spec, legacy device
> can deliver MSI by the help of a relay device named consolidator.
> Consolidator can translate the legacy interrupts connected to it
> to MSI/MSI-X. And new non-PCI device will be designed to 
> support MSI in future. So make the MSI driver code be generic will 
> help the non-PCI device use MSI more simply.
> 
> The new data struct for generic MSI driver.
> struct msi_irqs {
>         u8 msi_enabled:1; /* Enable flag */
>         u8 msix_enabled:1;
>         struct list_head msi_list; /* MSI desc list */
>         void *data;     /* help to find the MSI device */
>         struct msi_ops *ops; /* MSI device specific hook */
> };
> struct msi_irqs is used to manage MSI related informations. Every device supports
> MSI should contain this data struct and allocate it.

I think you should have a stronger association with the 'struct
device' here. Can you replace the 'void *data' with 'struct device *dev'?
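
As a minimal userspace sketch of what a typed back-pointer would buy,
assuming simplified stand-ins (the struct layouts and the helper
msi_to_pci_dev() below are mock definitions for illustration, not the
kernel's): with 'struct device *dev', the owning device is recovered with
container_of(), the same pattern the kernel's to_pci_dev() uses, rather
than through an unchecked cast from 'void *'.

```c
#include <assert.h>
#include <stddef.h>

/* Mock stand-ins for the kernel types; layouts are illustrative only. */
struct device {
	const char *name;
};

struct pci_dev {
	unsigned int devfn;
	struct device dev;	/* embedded, as in the kernel's struct pci_dev */
};

/* Hypothetical variant of struct msi_irqs with a typed back-pointer. */
struct msi_irqs {
	struct device *dev;	/* instead of 'void *data' */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the PCI device that embeds msi->dev. */
static struct pci_dev *msi_to_pci_dev(struct msi_irqs *msi)
{
	assert(msi && msi->dev);
	return container_of(msi->dev, struct pci_dev, dev);
}
```

This only works for devices that actually embed a struct device, which is
exactly the constraint being discussed for HPET- or DMAR-style users.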

The other part I'm not completely sure about is how you want to
have MSIs map into normal IRQ descriptors. At the moment, all
MSI users are based on IRQ numbers, but this has known scalability
problems. I wonder if we can do the interface in a way that
hides the interrupt number from generic device drivers and just
passes a 'struct irq_desc'. Note that there are long-term plans to
get rid of IRQ numbers entirely, but those plans have existed for
a long time already without anybody seriously addressing the device
driver interfaces so far, so it might never really happen.

> struct msi_ops {
>         struct msi_desc *(*msi_setup_entry)(struct msi_irqs *msi, struct msi_desc *entry);
>         int msix_setup_entries(struct msi_irqs *msi, struct msix_entry *entries);
>         u32 (*msi_mask_irq)(struct msi_desc *desc, u32 mask, u32 flag);
>         u32 (*msix_mask_irq)(struct msi_desc *desc, u32 flag);
>         void (*msi_read_message)(struct msi_desc *desc, struct msi_msg *msg);
>         void (*msi_write_message)(struct msi_desc *desc, struct msi_msg *msg);
>         void (*msi_set_intx)(struct msi_irqs *msi, int enable);
> };
> struct msi_ops provides several hook functions, generic MSI driver will call
> the hook functions to access device specific registers. PCI devices will share
> the same msi_ops, because they have the same way to access MSI hardware registers.
> 
> Generic MSI layer export msi_capability_init() and msix_capability_init() functions
> to drivers. msi/x_capability_init() will initialize MSI capability data struct msi_desc
> and alloc the irq, then write the msi address/data value to hardware registers.
> 
> This series only did compile test, we will test it in x86 and arm platform later.

For the generic drivers, I don't see much point in differentiating between
MSI and MSI-X, as I believe the difference is something internal to the PCI
implementation.

With the other operations, I think they should all take a 'struct device *'
as the first argument for convenience and consistency. I don't think you actually
need msi_read_message(), and we could avoid msi_write_message() by doing it
the other way round.

What I'd envision as the API from the device driver perspective is something
as simple as this:

struct msi_desc *msi_request(struct msi_chip *chip, irq_handler_t handler,
			unsigned long flags, const char *name, struct device *dev);

which would get an msi descriptor that is valid for this device (dev)
connected to a particular msi_chip, and associate a handler function
with it. The device driver can call that function and retrieve the
address/message pair from the msi_desc in order to store it in its own
device specific registers. The request_irq() can be handled internally
to msi_request().
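
To make the proposed flow concrete, here is a hedged userspace mock of it.
Only the msi_request() prototype comes from the proposal above; the struct
layouts, the fixed-size descriptor pool, the doorbell field, and
mock_driver_probe() are assumptions made up for this sketch and do not
correspond to existing kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct device { const char *name; };

struct msi_msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

typedef int (*irq_handler_t)(int irq, void *dev_id);

/* Mock descriptor: message plus the handler bound by msi_request(). */
struct msi_desc {
	struct msi_msg msg;
	irq_handler_t handler;
	struct device *dev;
	int in_use;
};

#define MOCK_MSI_VECTORS 4

/* Mock MSI controller with a small pool of descriptors. */
struct msi_chip {
	uint32_t doorbell;	/* address devices write to (assumed) */
	struct msi_desc descs[MOCK_MSI_VECTORS];
};

/* Hand out a free descriptor, bind the handler, fill in address/data. */
struct msi_desc *msi_request(struct msi_chip *chip, irq_handler_t handler,
			     unsigned long flags, const char *name,
			     struct device *dev)
{
	(void)flags;
	(void)name;
	assert(chip && handler && dev);

	for (uint32_t i = 0; i < MOCK_MSI_VECTORS; i++) {
		struct msi_desc *desc = &chip->descs[i];

		if (desc->in_use)
			continue;
		desc->in_use = 1;
		desc->handler = handler;
		desc->dev = dev;
		desc->msg.address_lo = chip->doorbell;
		desc->msg.address_hi = 0;
		desc->msg.data = i;
		return desc;
	}
	return NULL;	/* pool exhausted */
}

/* Mock device-specific MSI registers. */
struct mock_dev_regs {
	uint32_t msi_addr;
	uint32_t msi_data;
};

static int mock_handler(int irq, void *dev_id)
{
	(void)irq;
	(void)dev_id;
	return 1;
}

/* Driver side: request a descriptor, then program the device itself. */
int mock_driver_probe(struct msi_chip *chip, struct device *dev,
		      struct mock_dev_regs *regs)
{
	struct msi_desc *desc = msi_request(chip, mock_handler, 0,
					    dev->name, dev);

	if (!desc)
		return -1;
	regs->msi_addr = desc->msg.address_lo;
	regs->msi_data = desc->msg.data;
	return 0;
}
```

The point of the shape is that the driver never sees an IRQ number and
writes the message itself, so no msi_write_message() hook is needed.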

Would that work for you?

	Arnd


* Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
  2014-07-29 14:08 ` [RFC PATCH 00/11] Refactor MSI to support Non-PCI device Arnd Bergmann
@ 2014-07-30  2:45   ` Yijing Wang
  2014-07-30  6:47     ` Jiang Liu
  2014-08-01 13:52     ` Arnd Bergmann
  0 siblings, 2 replies; 41+ messages in thread
From: Yijing Wang @ 2014-07-30  2:45 UTC (permalink / raw)
  To: Arnd Bergmann, linux-arm-kernel
  Cc: linux-kernel, linux-arch, Russell King, Paul.Mundt, Marc Zyngier,
	linux-pci, James E.J. Bottomley, virtualization, Xinwei Hu,
	Hanjun Guo, Bjorn Helgaas, Wuyun

On 2014/7/29 22:08, Arnd Bergmann wrote:
> On Saturday 26 July 2014 11:08:37 Yijing Wang wrote:
>>         The series is a draft of generic MSI driver that supports PCI
>> and Non-PCI device which have MSI capability. If you're not interested
>> it, sorry for the noise.
> 
> I've finally managed to take some time to look at the series. Overall,
> the concept looks good to me, and the patches look very well implemented.
> 
> The part I'm not sure about is the interface we want to end up with
> at the end of the series. More on that below

Hi Arnd,
   Thanks very much for your review and comments!
Please see my inline replies below.

> 
>> The series is based on Linux-3.16-rc1.
>>
>> MSI was introduced in PCI Spec 2.2. Currently, kernel MSI 
>> driver codes are bonding with PCI device. Because MSI has a lot
>> advantages in design. More and more non-PCI devices want to
>> use MSI as their default interrupt. The existing MSI device
>> include HPET. HPET driver provide its own MSI code to initialize
>> and process MSI interrupts. In the latest GIC v3 spec, legacy device
>> can deliver MSI by the help of a relay device named consolidator.
>> Consolidator can translate the legacy interrupts connected to it
>> to MSI/MSI-X. And new non-PCI device will be designed to 
>> support MSI in future. So make the MSI driver code be generic will 
>> help the non-PCI device use MSI more simply.
>>
>> The new data struct for generic MSI driver.
>> struct msi_irqs {
>>         u8 msi_enabled:1; /* Enable flag */
>>         u8 msix_enabled:1;
>>         struct list_head msi_list; /* MSI desc list */
>>         void *data;     /* help to find the MSI device */
>>         struct msi_ops *ops; /* MSI device specific hook */
>> };
>> struct msi_irqs is used to manage MSI related informations. Every device supports
>> MSI should contain this data struct and allocate it.
> 
> I think you should have a stronger association with the 'struct
> device' here. Can you replace the 'void *data' with 'struct device *dev'?

Actually, I used struct device *dev in my first draft, but I finally replaced
it with void *data because some MSI devices don't have a struct device,
such as the existing HPET device, the DMAR MSI device, and OF devices like the
ARM consolidator.

Of course, we could give these MSI devices their own struct device and register
them in the device hierarchy, e.g. by adding a class device named MSI_DEV. But
I'm not sure whether that is appropriate.

> 
> The other part I'm not completely sure about is how you want to
> have MSIs map into normal IRQ descriptors. At the moment, all
> MSI users are based on IRQ numbers, but this has known scalability problems.

Hmmm, I still use the IRQ number to map the MSIs to IRQ descriptors.
I'm sorry, I don't understand what you mean.
What are the scalability problems you mentioned?
Device drivers always set up an interrupt in two steps.
For a legacy interrupt, the driver first uses
irq_of_parse_and_map() or pci_enable_device() to parse and get the IRQ number,
then calls request_irq() to register the interrupt handler.
For MSI, the driver first calls pci_enable_msi/x() to get the IRQ number and
then calls request_irq() to register the interrupt handler.

> I wonder if we can do the interface in a way that
> hides the interrupt number from generic device drivers and just
> passes a 'struct irq_desc'. Note that there are long-term plans to
> get rid of IRQ numbers entirely, but those plans have existed for
> a long time already without anybody seriously addressing the device
> driver interfaces so far, so it might never really happen.
> 

That would probably be a huge amount of work, since hundreds of drivers use the
IRQ number today, so maybe we can handle this as a separate topic.

>> struct msi_ops {
>>         struct msi_desc *(*msi_setup_entry)(struct msi_irqs *msi, struct msi_desc *entry);
>>         int msix_setup_entries(struct msi_irqs *msi, struct msix_entry *entries);
>>         u32 (*msi_mask_irq)(struct msi_desc *desc, u32 mask, u32 flag);
>>         u32 (*msix_mask_irq)(struct msi_desc *desc, u32 flag);
>>         void (*msi_read_message)(struct msi_desc *desc, struct msi_msg *msg);
>>         void (*msi_write_message)(struct msi_desc *desc, struct msi_msg *msg);
>>         void (*msi_set_intx)(struct msi_irqs *msi, int enable);
>> };
>> struct msi_ops provides several hook functions, generic MSI driver will call
>> the hook functions to access device specific registers. PCI devices will share
>> the same msi_ops, because they have the same way to access MSI hardware registers.
>>
>> Generic MSI layer export msi_capability_init() and msix_capability_init() functions
>> to drivers. msi/x_capability_init() will initialize MSI capability data struct msi_desc
>> and alloc the irq, then write the msi address/data value to hardware registers.
>>
>> This series only did compile test, we will test it in x86 and arm platform later.
> 
> For the generic drivers, I don't see much point in differentiating between
> MSI and MSI-X, as I believe the difference is something internal to the PCI
> implementation.

Yes, we can merge them into one generic set of ops and add a type argument to
the hook functions to differentiate them.
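
One possible shape for such a merged hook, sketched in userspace under
stated assumptions: the enum, the 'type' and 'masked' fields, and
struct mock_msi_ops are all hypothetical names invented here, and the
masking logic is a simplification of what PCI really does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type tag carried by each descriptor. */
enum mock_msi_type { MOCK_MSI_TYPE, MOCK_MSIX_TYPE };

struct msi_desc {
	uint32_t masked;		/* mock of the hardware mask state */
	enum mock_msi_type type;
};

/* One mask hook instead of separate msi_mask_irq/msix_mask_irq. */
struct mock_msi_ops {
	uint32_t (*mask_irq)(struct msi_desc *desc, uint32_t mask,
			     uint32_t flag);
};

/* PCI-flavoured implementation that dispatches on the descriptor type. */
static uint32_t mock_pci_mask_irq(struct msi_desc *desc, uint32_t mask,
				  uint32_t flag)
{
	assert(desc);
	if (desc->type == MOCK_MSIX_TYPE) {
		/* MSI-X: one mask bit per table entry, 'mask' is ignored. */
		desc->masked = flag & 1u;
	} else {
		/* MSI: update only the selected bits of the mask register. */
		desc->masked = (desc->masked & ~mask) | (flag & mask);
	}
	return desc->masked;
}

const struct mock_msi_ops mock_pci_msi_ops = {
	.mask_irq = mock_pci_mask_irq,
};
```

A non-PCI device would supply its own ops table and could ignore the type
distinction entirely.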

> 
> With the other operations, I think they should all take a 'struct device *'
> as the first argument for convenience and consistency. I don't think you actually
> need msi_read_message(), and we could avoid msi_write_message() by doing it
> the other way round.
> 

Only two functions use read_msi_msg(). Because every msi_desc has a struct
msi_msg that caches the MSI address and data, I will look into retrieving the
message from the cached msi_msg, so that we can avoid msi_read_message().
But msi_write_message() may be necessary: some xxx_set_affinity() functions and
the restore functions need msi_write_message() to rewrite the address and data.
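
The cached-message idea can be sketched in userspace as follows. All
mock_* names and the register struct are hypothetical, and the
destination-ID field position (bits 12-19 of the low address word) is an
x86-style assumption used only to give the affinity update something to
rewrite.

```c
#include <assert.h>
#include <stdint.h>

struct msi_msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

/* Mock of a device's MSI address/data registers. */
struct mock_msi_regs {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

struct msi_desc {
	struct msi_msg msg;		/* last message written to the device */
	struct mock_msi_regs *regs;
};

/* Write path: program the hardware and refresh the cached copy. */
void mock_write_msi_msg(struct msi_desc *desc, const struct msi_msg *msg)
{
	assert(desc && desc->regs && msg);
	desc->regs->address_lo = msg->address_lo;
	desc->regs->address_hi = msg->address_hi;
	desc->regs->data = msg->data;
	desc->msg = *msg;
}

/* Read path: return the cached copy, no hardware access needed. */
void mock_get_cached_msi_msg(const struct msi_desc *desc, struct msi_msg *msg)
{
	assert(desc && msg);
	*msg = desc->msg;
}

#define MOCK_DEST_SHIFT	12
#define MOCK_DEST_MASK	(0xffu << MOCK_DEST_SHIFT)

/* Affinity change: start from the cache, rewrite only the destination. */
void mock_set_affinity(struct msi_desc *desc, uint32_t dest)
{
	struct msi_msg msg;

	mock_get_cached_msi_msg(desc, &msg);
	msg.address_lo &= ~MOCK_DEST_MASK;
	msg.address_lo |= (dest << MOCK_DEST_SHIFT) & MOCK_DEST_MASK;
	mock_write_msi_msg(desc, &msg);
}
```

With this split, only the write hook has to be device specific; the read
side is served from the descriptor.
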

> What I'd envision as the API from the device driver perspective is something
> as simple like this:
> 
> struct msi_desc *msi_request(struct msi_chip *chip, irq_handler_t handler,
> 			unsigned long flags, const char *name, struct device *dev);
> 
> which would get an msi descriptor that is valid for this device (dev)
> connected to a particular msi_chip, and associate a handler function
> with it. The device driver can call that function and retrieve the
> address/message pair from the msi_desc in order to store it in its own
> device specific registers. The request_irq() can be handled internally
> to msi_request().

This is a huge change for device drivers, and some device drivers don't know
which msi_chip their MSI irqs are delivered to. I'm reworking msi_chip and
trying to use it to eliminate all the arch_msi_xxx() functions in every
architecture. The important point is how to create the binding between an MSI
device and its target msi_chip.

For PCI devices, some ARM platforms already bind the msi_chip to the PCI host
bridge, so all PCI devices under that host bridge deliver their MSI irqs to the
target msi_chip. Other platforms create the binding in the DTS file, so the MSI
device can find its msi_chip by device_node.
I don't know whether there are other situations; we should provide a generic
interface that every MSI device on every platform can use to find its msi_chip.


Thanks!
Yijing.

> 
> .
> 


-- 
Thanks!
Yijing



* Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
  2014-07-30  2:45   ` Yijing Wang
@ 2014-07-30  6:47     ` Jiang Liu
  2014-07-30  7:20       ` Yijing Wang
  2014-08-01 13:52     ` Arnd Bergmann
  1 sibling, 1 reply; 41+ messages in thread
From: Jiang Liu @ 2014-07-30  6:47 UTC (permalink / raw)
  To: Yijing Wang, Arnd Bergmann, linux-arm-kernel
  Cc: linux-kernel, linux-arch, Russell King, Paul.Mundt, Marc Zyngier,
	linux-pci, James E.J. Bottomley, virtualization, Xinwei Hu,
	Hanjun Guo, Bjorn Helgaas, Wuyun



On 2014/7/30 10:45, Yijing Wang wrote:
> On 2014/7/29 22:08, Arnd Bergmann wrote:
>> On Saturday 26 July 2014 11:08:37 Yijing Wang wrote:
>>>         The series is a draft of generic MSI driver that supports PCI
>>> and Non-PCI device which have MSI capability. If you're not interested
>>> it, sorry for the noise.
>>
>> I've finally managed to take some time to look at the series. Overall,
>> the concept looks good to me, and the patches look very well implemented.
>>
>> The part I'm not sure about is the interface we want to end up with
>> at the end of the series. More on that below
> 
> Hi Arnd,
>    Thanks for your review and comments very much!
> Please refer the inline comments.
> 
>>
>>> The series is based on Linux-3.16-rc1.
>>>
>>> MSI was introduced in PCI Spec 2.2. Currently, kernel MSI 
>>> driver codes are bonding with PCI device. Because MSI has a lot
>>> advantages in design. More and more non-PCI devices want to
>>> use MSI as their default interrupt. The existing MSI device
>>> include HPET. HPET driver provide its own MSI code to initialize
>>> and process MSI interrupts. In the latest GIC v3 spec, legacy device
>>> can deliver MSI by the help of a relay device named consolidator.
>>> Consolidator can translate the legacy interrupts connected to it
>>> to MSI/MSI-X. And new non-PCI device will be designed to 
>>> support MSI in future. So make the MSI driver code be generic will 
>>> help the non-PCI device use MSI more simply.
>>>
>>> The new data struct for generic MSI driver.
>>> struct msi_irqs {
>>>         u8 msi_enabled:1; /* Enable flag */
>>>         u8 msix_enabled:1;
>>>         struct list_head msi_list; /* MSI desc list */
>>>         void *data;     /* help to find the MSI device */
>>>         struct msi_ops *ops; /* MSI device specific hook */
>>> };
>>> struct msi_irqs is used to manage MSI related informations. Every device supports
>>> MSI should contain this data struct and allocate it.
>>
>> I think you should have a stronger association with the 'struct
>> device' here. Can you replace the 'void *data' with 'struct device *dev'?
> 
> Actually, I used the struct device *dev in my first draft, finally, I replaced
> it with void *data, because some MSI devices don't have a struct device *dev,
> like the existing hpet device, dmar msi device, and OF device, like the ARM consolidator.
> 
> Of course, we can make the MSI devices have their own struct device, and register to
> device tree, eg, add a class device named MSI_DEV. But I'm not sure whether it is appropriate.
> 
>>
>> The other part I'm not completely sure about is how you want to
>> have MSIs map into normal IRQ descriptors. At the moment, all
>> MSI users are based on IRQ numbers, but this has known scalability problems.
> 
> Hmmm, I still use the IRQ number to map the MSIs to IRQ description.
> I'm sorry, I don't understand you meaning.
> What are the scalability problems you mentioned ?
There is a soft limit, nr_irqs, and a hard limit, NR_IRQS, so in some cases
we cannot allocate as many IRQ numbers as we need, for example to
support MSI-X.
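
A toy userspace model of that limit, with hypothetical names and an
arbitrarily small MOCK_NR_IRQS chosen for the example: every vector consumes
one slot in a fixed-size IRQ number space, so a few devices asking for many
MSI-X vectors (the spec allows up to 2048 per function) can exhaust it.

```c
#include <assert.h>
#include <stdbool.h>

#define MOCK_NR_IRQS 64		/* stand-in for the real IRQ number limit */

static bool mock_irq_used[MOCK_NR_IRQS];

/* Return a free IRQ number, or 0 when the number space is exhausted. */
unsigned int mock_irq_alloc(void)
{
	for (unsigned int irq = 1; irq < MOCK_NR_IRQS; irq++) {
		if (!mock_irq_used[irq]) {
			mock_irq_used[irq] = true;
			return irq;
		}
	}
	return 0;
}

/* Try to allocate nvec vectors; return how many were actually granted. */
unsigned int mock_enable_msix(unsigned int nvec)
{
	unsigned int got = 0;

	while (got < nvec && mock_irq_alloc())
		got++;
	return got;
}
```

In this model a second device requesting 32 vectors gets only 31, because
the first device already took 32 of the 63 usable numbers.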

> For device drivers, they always process interrupt in two steps.
> If irq is the legacy interrupt, drivers will first
> use the irq_of_parse_and_map() or pci_enable_device() to parse and get the IRQ number.
> Then drivers will call the request_irq() to register the interrupt handler.
> If irq is MSIs, first call pci_enable_msi/x() to get the IRQ number and then call
> request_irq() to register interrupt handler.
> 
>> I wonder if we can do the interface in a way that
>> hides the interrupt number from generic device drivers and just
>> passes a 'struct irq_desc'. Note that there are long-term plans to
>> get rid of IRQ numbers entirely, but those plans have existed for
>> a long time already without anybody seriously addressing the device
>> driver interfaces so far, so it might never really happen.
>>
> 
> Maybe this is a huge work, now hundreds drivers use the IRQ number, so maybe we can consider
> this in a separate title.
> 
>>> struct msi_ops {
>>>         struct msi_desc *(*msi_setup_entry)(struct msi_irqs *msi, struct msi_desc *entry);
>>>         int msix_setup_entries(struct msi_irqs *msi, struct msix_entry *entries);
>>>         u32 (*msi_mask_irq)(struct msi_desc *desc, u32 mask, u32 flag);
>>>         u32 (*msix_mask_irq)(struct msi_desc *desc, u32 flag);
>>>         void (*msi_read_message)(struct msi_desc *desc, struct msi_msg *msg);
>>>         void (*msi_write_message)(struct msi_desc *desc, struct msi_msg *msg);
>>>         void (*msi_set_intx)(struct msi_irqs *msi, int enable);
>>> };
>>> struct msi_ops provides several hook functions, generic MSI driver will call
>>> the hook functions to access device specific registers. PCI devices will share
>>> the same msi_ops, because they have the same way to access MSI hardware registers.
>>>
>>> Generic MSI layer export msi_capability_init() and msix_capability_init() functions
>>> to drivers. msi/x_capability_init() will initialize MSI capability data struct msi_desc
>>> and alloc the irq, then write the msi address/data value to hardware registers.
>>>
>>> This series only did compile test, we will test it in x86 and arm platform later.
>>
>> For the generic drivers, I don't see much point in differentiating between
>> MSI and MSI-X, as I believe the difference is something internal to the PCI
>> implementation.
> 
> Yes, we can integrate them, and use a generic ops, add a type in hook function to
> differentiate them.
> 
>>
>> With the other operations, I think they should all take a 'struct device *'
>> as the first argument for convenience and consistency. I don't think you actually
>> need msi_read_message(), and we could avoid msi_write_message() by doing it
>> the other way round.
>>
> 
> There only two functions use the read_msi_msg(), because every msi_desc has
> a struct msi_msg, and it caches the msi address and data. I will consider to
> retrieve the msg from cached msi_msg, then we can avoid the msi_read_message().
> But msi_write_message() maybe necessary, some xxx_set_affinity() functions and
> restore functions need the msi_write_message() to rewrite the address and data.
> 
>> What I'd envision as the API from the device driver perspective is something
>> as simple like this:
>>
>> struct msi_desc *msi_request(struct msi_chip *chip, irq_handler_t handler,
>> 			unsigned long flags, const char *name, struct device *dev);
>>
>> which would get an msi descriptor that is valid for this device (dev)
>> connected to a particular msi_chip, and associate a handler function
>> with it. The device driver can call that function and retrieve the
>> address/message pair from the msi_desc in order to store it in its own
>> device specific registers. The request_irq() can be handled internally
>> to msi_request().
> 
> This is a huge change for device drivers, and some device drivers don't know which msi_chip
> their MSI irq deliver to. I'm reworking the msi_chip, and try to use msi_chip to eliminate
> all arch_msi_xxx() under every arch in kernel. And the important point is how to create the
> binding for the MSI device to the target msi_chip.
> 
> For PCI device, some arm platform already bound the msi_chip to the pci hostbridge, then all
> pci devices under the pci hostbridge deliver their MSI irqs to the target msi_chip.
> And other platform create the binding in DTS file, then the MSI device can find their msi_chip
> by device_node.
> I don't know whether there are other situations, we should provide a generic interface that
> every MSI device under every platform can use it to find its msi_chip exactly.
> 
> 
> Thanks!
> Yijing.
> 
>>
>> .
>>
> 
> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
  2014-07-30  6:47     ` Jiang Liu
@ 2014-07-30  7:20       ` Yijing Wang
  2014-08-01 13:16         ` Arnd Bergmann
  0 siblings, 1 reply; 41+ messages in thread
From: Yijing Wang @ 2014-07-30  7:20 UTC (permalink / raw)
  To: Jiang Liu, Arnd Bergmann, linux-arm-kernel
  Cc: linux-kernel, linux-arch, Russell King, Paul.Mundt, Marc Zyngier,
	linux-pci, James E.J. Bottomley, virtualization, Xinwei Hu,
	Hanjun Guo, Bjorn Helgaas, Wuyun

On 2014/7/30 14:47, Jiang Liu wrote:
> 
> 
> On 2014/7/30 10:45, Yijing Wang wrote:
>> On 2014/7/29 22:08, Arnd Bergmann wrote:
>>> On Saturday 26 July 2014 11:08:37 Yijing Wang wrote:
>>>>         The series is a draft of generic MSI driver that supports PCI
>>>> and Non-PCI device which have MSI capability. If you're not interested
>>>> it, sorry for the noise.
>>>
>>> I've finally managed to take some time to look at the series. Overall,
>>> the concept looks good to me, and the patches look very well implemented.
>>>
>>> The part I'm not sure about is the interface we want to end up with
>>> at the end of the series. More on that below
>>
>> Hi Arnd,
>>    Thanks for your review and comments very much!
>> Please refer the inline comments.
>>
>>>
>>>> The series is based on Linux-3.16-rc1.
>>>>
>>>> MSI was introduced in PCI Spec 2.2. Currently, kernel MSI 
>>>> driver codes are bonding with PCI device. Because MSI has a lot
>>>> advantages in design. More and more non-PCI devices want to
>>>> use MSI as their default interrupt. The existing MSI device
>>>> include HPET. HPET driver provide its own MSI code to initialize
>>>> and process MSI interrupts. In the latest GIC v3 spec, legacy device
>>>> can deliver MSI by the help of a relay device named consolidator.
>>>> Consolidator can translate the legacy interrupts connected to it
>>>> to MSI/MSI-X. And new non-PCI device will be designed to 
>>>> support MSI in future. So make the MSI driver code be generic will 
>>>> help the non-PCI device use MSI more simply.
>>>>
>>>> The new data struct for generic MSI driver.
>>>> struct msi_irqs {
>>>>         u8 msi_enabled:1; /* Enable flag */
>>>>         u8 msix_enabled:1;
>>>>         struct list_head msi_list; /* MSI desc list */
>>>>         void *data;     /* help to find the MSI device */
>>>>         struct msi_ops *ops; /* MSI device specific hook */
>>>> };
>>>> struct msi_irqs is used to manage MSI related informations. Every device supports
>>>> MSI should contain this data struct and allocate it.
>>>
>>> I think you should have a stronger association with the 'struct
>>> device' here. Can you replace the 'void *data' with 'struct device *dev'?
>>
>> Actually, I used the struct device *dev in my first draft, finally, I replaced
>> it with void *data, because some MSI devices don't have a struct device *dev,
>> like the existing hpet device, dmar msi device, and OF device, like the ARM consolidator.
>>
>> Of course, we can make the MSI devices have their own struct device, and register to
>> device tree, eg, add a class device named MSI_DEV. But I'm not sure whether it is appropriate.
>>
>>>
>>> The other part I'm not completely sure about is how you want to
>>> have MSIs map into normal IRQ descriptors. At the moment, all
>>> MSI users are based on IRQ numbers, but this has known scalability problems.
>>
>> Hmmm, I still use the IRQ number to map the MSIs to IRQ description.
>> I'm sorry, I don't understand you meaning.
>> What are the scalability problems you mentioned ?
> We have soft limitation of nr_irqs or hard limitation NR_IRQS,
> we couldn't allocate as much irq number as we need in some cases,
> such as to support MSI-x.

Oh, yes, this is a potential issue. Gerry, thanks for your explanation. :)

> 
>> For device drivers, they always process interrupt in two steps.
>> If irq is the legacy interrupt, drivers will first
>> use the irq_of_parse_and_map() or pci_enable_device() to parse and get the IRQ number.
>> Then drivers will call the request_irq() to register the interrupt handler.
>> If irq is MSIs, first call pci_enable_msi/x() to get the IRQ number and then call
>> request_irq() to register interrupt handler.
>>
>>> I wonder if we can do the interface in a way that
>>> hides the interrupt number from generic device drivers and just
>>> passes a 'struct irq_desc'. Note that there are long-term plans to
>>> get rid of IRQ numbers entirely, but those plans have existed for
>>> a long time already without anybody seriously addressing the device
>>> driver interfaces so far, so it might never really happen.
>>>
>>
>> Maybe this is a huge work, now hundreds drivers use the IRQ number, so maybe we can consider
>> this in a separate title.
>>
>>>> struct msi_ops {
>>>>         struct msi_desc *(*msi_setup_entry)(struct msi_irqs *msi, struct msi_desc *entry);
>>>>         int msix_setup_entries(struct msi_irqs *msi, struct msix_entry *entries);
>>>>         u32 (*msi_mask_irq)(struct msi_desc *desc, u32 mask, u32 flag);
>>>>         u32 (*msix_mask_irq)(struct msi_desc *desc, u32 flag);
>>>>         void (*msi_read_message)(struct msi_desc *desc, struct msi_msg *msg);
>>>>         void (*msi_write_message)(struct msi_desc *desc, struct msi_msg *msg);
>>>>         void (*msi_set_intx)(struct msi_irqs *msi, int enable);
>>>> };
>>>> struct msi_ops provides several hook functions, generic MSI driver will call
>>>> the hook functions to access device specific registers. PCI devices will share
>>>> the same msi_ops, because they have the same way to access MSI hardware registers.
>>>>
>>>> Generic MSI layer export msi_capability_init() and msix_capability_init() functions
>>>> to drivers. msi/x_capability_init() will initialize MSI capability data struct msi_desc
>>>> and alloc the irq, then write the msi address/data value to hardware registers.
>>>>
>>>> This series only did compile test, we will test it in x86 and arm platform later.
>>>
>>> For the generic drivers, I don't see much point in differentiating between
>>> MSI and MSI-X, as I believe the difference is something internal to the PCI
>>> implementation.
>>
>> Yes, we can integrate them, and use a generic ops, add a type in hook function to
>> differentiate them.
>>
>>>
>>> With the other operations, I think they should all take a 'struct device *'
>>> as the first argument for convenience and consistency. I don't think you actually
>>> need msi_read_message(), and we could avoid msi_write_message() by doing it
>>> the other way round.
>>>
>>
>> There only two functions use the read_msi_msg(), because every msi_desc has
>> a struct msi_msg, and it caches the msi address and data. I will consider to
>> retrieve the msg from cached msi_msg, then we can avoid the msi_read_message().
>> But msi_write_message() maybe necessary, some xxx_set_affinity() functions and
>> restore functions need the msi_write_message() to rewrite the address and data.
>>
>>> What I'd envision as the API from the device driver perspective is something
>>> as simple like this:
>>>
>>> struct msi_desc *msi_request(struct msi_chip *chip, irq_handler_t handler,
>>> 			unsigned long flags, const char *name, struct device *dev);
>>>
>>> which would get an msi descriptor that is valid for this device (dev)
>>> connected to a particular msi_chip, and associate a handler function
>>> with it. The device driver can call that function and retrieve the
>>> address/message pair from the msi_desc in order to store it in its own
>>> device specific registers. The request_irq() can be handled internally
>>> to msi_request().
>>
>> This is a huge change for device drivers, and some device drivers don't know which msi_chip
>> their MSI irq deliver to. I'm reworking the msi_chip, and try to use msi_chip to eliminate
>> all arch_msi_xxx() under every arch in kernel. And the important point is how to create the
>> binding for the MSI device to the target msi_chip.
>>
>> For PCI device, some arm platform already bound the msi_chip to the pci hostbridge, then all
>> pci devices under the pci hostbridge deliver their MSI irqs to the target msi_chip.
>> And other platform create the binding in DTS file, then the MSI device can find their msi_chip
>> by device_node.
>> I don't know whether there are other situations, we should provide a generic interface that
>> every MSI device under every platform can use it to find its msi_chip exactly.
>>
>>
>> Thanks!
>> Yijing.
>>
>>>
>>> .
>>>
>>
>>
> 
> .
> 


-- 
Thanks!
Yijing


^ permalink raw reply	[flat|nested] 41+ messages in thread

* RE: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
  2014-07-26  3:08 [RFC PATCH 00/11] Refactor MSI to support Non-PCI device Yijing Wang
                   ` (11 preceding siblings ...)
  2014-07-29 14:08 ` [RFC PATCH 00/11] Refactor MSI to support Non-PCI device Arnd Bergmann
@ 2014-08-01 10:27 ` arnab.basu
  2014-08-04  3:03   ` Yijing Wang
  12 siblings, 1 reply; 41+ messages in thread
From: arnab.basu @ 2014-08-01 10:27 UTC (permalink / raw)
  To: Yijing Wang
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, virtualization, Hanjun Guo,
	linux-kernel

Hi Yijing

> -----Original Message-----
> From: Yijing Wang [mailto:wangyijing@huawei.com]
> Sent: Saturday, July 26, 2014 8:39 AM
> To: linux-kernel@vger.kernel.org
> Cc: Xinwei Hu; Wuyun; Bjorn Helgaas; linux-pci@vger.kernel.org;
> Paul.Mundt@huawei.com; James E.J. Bottomley; Marc Zyngier; linux-arm-
> kernel@lists.infradead.org; Russell King; linux-arch@vger.kernel.org;
> Basu Arnab-B45036; virtualization@lists.linux-foundation.org; Hanjun Guo;
> Yijing Wang
> Subject: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
> 
> Hi all,
> 	The series is a draft of generic MSI driver that supports PCI and
> Non-PCI device which have MSI capability. If you're not interested it,
> sorry for the noise.
> 

Thanks for sending out these patches, I have some (very basic) questions.

> The series is based on Linux-3.16-rc1.
> 
> MSI was introduced in PCI Spec 2.2. Currently, kernel MSI driver codes
> are bonding with PCI device. Because MSI has a lot advantages in design.
> More and more non-PCI devices want to use MSI as their default interrupt.
> The existing MSI device include HPET. HPET driver provide its own MSI
> code to initialize and process MSI interrupts. In the latest GIC v3 spec,
> legacy device can deliver MSI by the help of a relay device named
> consolidator.
> Consolidator can translate the legacy interrupts connected to it to
> MSI/MSI-X. And new non-PCI device will be designed to support MSI in
> future. So make the MSI driver code be generic will help the non-PCI
> device use MSI more simply.

As I understand it, the GICv3 provides a service that converts writes to a specified address into IRQs delivered to the core and, as you mention above, MSIs are part of the PCI spec. So I can see a strong case for non-PCI devices wanting MSI-like functionality without being fully compliant with the requirements of the MSI spec.

My question is: do we necessarily want to rework so much of the PCI-MSI layer to support non-PCI devices? Or would it be sufficient to create a framework that allows non-PCI devices to hook up with a device that can convert their writes into an IRQ to the core?

As I understand it, the msi_chip is (almost) such a framework. The only problem being that it makes some PCI specific assumptions (such as PCI specific writes from within msi_chip functions). Won't it be sufficient to make the msi_chip framework generic enough to be used by non-PCI devices and let each bus/device manage any additional requirements (such as configuration flow, bit definitions etc) that it places on message based interrupts?

Thanks
Arnab

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
  2014-07-30  7:20       ` Yijing Wang
@ 2014-08-01 13:16         ` Arnd Bergmann
  2014-08-04  3:32           ` Yijing Wang
  0 siblings, 1 reply; 41+ messages in thread
From: Arnd Bergmann @ 2014-08-01 13:16 UTC (permalink / raw)
  To: Yijing Wang
  Cc: Jiang Liu, linux-arm-kernel, linux-kernel, linux-arch,
	Russell King, Paul.Mundt, Marc Zyngier, linux-pci,
	James E.J. Bottomley, virtualization, Xinwei Hu, Hanjun Guo,
	Bjorn Helgaas, Wuyun

On Wednesday 30 July 2014, Yijing Wang wrote:
> >>>
> >>> The other part I'm not completely sure about is how you want to
> >>> have MSIs map into normal IRQ descriptors. At the moment, all
> >>> MSI users are based on IRQ numbers, but this has known scalability problems.
> >>
> >> Hmmm, I still use the IRQ number to map the MSIs to IRQ description.
> >> I'm sorry, I don't understand you meaning.
> >> What are the scalability problems you mentioned ?
> > We have soft limitation of nr_irqs or hard limitation NR_IRQS,
> > we couldn't allocate as much irq number as we need in some cases,
> > such as to support MSI-x.
> 
> Oh, yes, this is a potential issue. Gerry, thanks for you explanation. :)

This should no longer be an issue, as arm64 uses CONFIG_SPARSE_IRQ
and the number of interrupts is not limited in any form.

My point was more that the device driver should not need to care about
the interrupt number: it gets made up on the spot when the MSI is
needed, and then it is only used to request the IRQ. This can be
simplified into one interface at the device driver level, even though
the internals still use numbers somewhere. If we ever remove IRQ numbers
from the driver API, this part doesn't need to get touched again.
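
To make the "one interface" idea concrete, here is a rough userspace sketch,
not kernel code. It assumes a simplified msi_request() (the flags and name
arguments are dropped and 'struct device' is replaced by an opaque pointer);
the struct layouts, DEMO_MAX_MSI, msi_chip_deliver() and demo_handler() are
all invented for illustration. The driver only sees the descriptor and its
address/data pair, while the IRQ number stays inside the helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for kernel types; not the real definitions. */
struct msi_msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

typedef int (*irq_handler_t)(int irq, void *dev_id);

struct msi_desc {
	unsigned int irq;       /* internal bookkeeping; drivers never read this */
	struct msi_msg msg;     /* address/data pair the device must be programmed with */
	irq_handler_t handler;
	void *dev_id;
	int in_use;
};

#define DEMO_MAX_MSI 4

struct msi_chip {
	uint32_t doorbell;      /* address that devices write their message to */
	unsigned int irq_base;  /* first IRQ number owned by this chip */
	struct msi_desc descs[DEMO_MAX_MSI];
};

/*
 * Allocate a descriptor, pick an IRQ number internally and attach the
 * handler in one step. Returns NULL when the chip has no free vectors.
 */
static struct msi_desc *msi_request(struct msi_chip *chip, irq_handler_t handler,
				    void *dev_id)
{
	assert(chip != NULL && handler != NULL);

	for (unsigned int i = 0; i < DEMO_MAX_MSI; i++) {
		struct msi_desc *desc = &chip->descs[i];

		if (desc->in_use)
			continue;
		desc->in_use = 1;
		desc->irq = chip->irq_base + i;
		desc->msg.address_lo = chip->doorbell;
		desc->msg.address_hi = 0;
		desc->msg.data = i;
		desc->handler = handler;
		desc->dev_id = dev_id;
		return desc;
	}
	return NULL;
}

/* Simulate the chip receiving a write of 'data' to its doorbell. */
static int msi_chip_deliver(struct msi_chip *chip, uint32_t data)
{
	if (data >= DEMO_MAX_MSI || !chip->descs[data].in_use)
		return -1;

	struct msi_desc *desc = &chip->descs[data];
	return desc->handler((int)desc->irq, desc->dev_id);
}

/* Example driver handler: counts interrupts in the driver's private data. */
static int demo_handler(int irq, void *dev_id)
{
	(void)irq;
	(*(int *)dev_id)++;
	return 1;
}
```

A driver using this would copy desc->msg into its own device registers after
the call and never handle desc->irq directly.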

	Arnd

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
  2014-07-30  2:45   ` Yijing Wang
  2014-07-30  6:47     ` Jiang Liu
@ 2014-08-01 13:52     ` Arnd Bergmann
  2014-08-04  6:43       ` Yijing Wang
  1 sibling, 1 reply; 41+ messages in thread
From: Arnd Bergmann @ 2014-08-01 13:52 UTC (permalink / raw)
  To: Yijing Wang
  Cc: linux-arm-kernel, linux-kernel, linux-arch, Russell King,
	Paul.Mundt, Marc Zyngier, linux-pci, James E.J. Bottomley,
	virtualization, Xinwei Hu, Hanjun Guo, Bjorn Helgaas, Wuyun

On Wednesday 30 July 2014, Yijing Wang wrote:
> On 2014/7/29 22:08, Arnd Bergmann wrote:
> > On Saturday 26 July 2014 11:08:37 Yijing Wang wrote:
> >>
> >> The new data struct for generic MSI driver.
> >> struct msi_irqs {
> >>         u8 msi_enabled:1; /* Enable flag */
> >>         u8 msix_enabled:1;
> >>         struct list_head msi_list; /* MSI desc list */
> >>         void *data;     /* help to find the MSI device */
> >>         struct msi_ops *ops; /* MSI device specific hook */
> >> };
> >> struct msi_irqs is used to manage MSI related informations. Every device supports
> >> MSI should contain this data struct and allocate it.
> > 
> > I think you should have a stronger association with the 'struct
> > device' here. Can you replace the 'void *data' with 'struct device *dev'?
> 
> Actually, I used the struct device *dev in my first draft, finally, I replaced
> it with void *data, because some MSI devices don't have a struct device *dev,
> like the existing hpet device, dmar msi device, and OF device, like the ARM consolidator.
> 
> Of course, we can make the MSI devices have their own struct device, and register to
> device tree, eg, add a class device named MSI_DEV. But I'm not sure whether it is appropriate.

It doesn't have to be in the (OF) device tree, but I think it absolutely makes
sense to use the 'struct device' infrastructure here, as almost everything uses
a device, and the ones that don't do that today can be easily changed.

> > The other part I'm not completely sure about is how you want to
> > have MSIs map into normal IRQ descriptors. At the moment, all
> > MSI users are based on IRQ numbers, but this has known scalability problems.
> 
> Hmmm, I still use the IRQ number to map the MSIs to IRQ description.
> I'm sorry, I don't understand you meaning.
> What are the scalability problems you mentioned ?
> For device drivers, they always process interrupt in two steps.
> If irq is the legacy interrupt, drivers will first
> use the irq_of_parse_and_map() or pci_enable_device() to parse and get the IRQ number.
> Then drivers will call the request_irq() to register the interrupt handler.
> If irq is MSIs, first call pci_enable_msi/x() to get the IRQ number and then call
> request_irq() to register interrupt handler.

The method you describe here makes sense for PCI devices that are required to support
legacy interrupts and may or may not support MSI on a given system, but not so much
for platform devices for which we know exactly whether we want to use MSI
or legacy interrupts.

In particular if you have a device that can only do MSI, the entire pci_enable_msi
step is pointless: all we need to do is program the correct MSI target address/message
pair into the device and register the handler.

> > I wonder if we can do the interface in a way that
> > hides the interrupt number from generic device drivers and just
> > passes a 'struct irq_desc'. Note that there are long-term plans to
> > get rid of IRQ numbers entirely, but those plans have existed for
> > a long time already without anybody seriously addressing the device
> > driver interfaces so far, so it might never really happen.
> > 
> 
> Maybe this is a huge work, now hundreds drivers use the IRQ number, so maybe we can consider
> this in a separate title.

Sorry for being unclear here: I did not suggest changing all drivers now. What I meant
is that we use a different API for non-PCI devices that works without IRQ numbers.
I don't think we should touch the PCI interfaces at this point.

> > With the other operations, I think they should all take a 'struct device *'
> > as the first argument for convenience and consistency. I don't think you actually
> > need msi_read_message(), and we could avoid msi_write_message() by doing it
> > the other way round.
> > 
> 
> There only two functions use the read_msi_msg(), because every msi_desc has
> a struct msi_msg, and it caches the msi address and data. I will consider to
> retrieve the msg from cached msi_msg, then we can avoid the msi_read_message().
> But msi_write_message() maybe necessary, some xxx_set_affinity() functions and
> restore functions need the msi_write_message() to rewrite the address and data.

Makes sense. I'd have to think about it more, but I think you are right
about the affinity APIs needing this.
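
A small sketch of the caching idea from the quoted paragraph, under stated
assumptions: this is plain userspace C, and every name in it (demo_msg,
demo_desc, demo_write_message(), demo_set_affinity(), the per-CPU doorbell
address) is hypothetical. Reads are served from the cached copy, so no
read hook is needed, but retargeting the interrupt still has to go through
the write path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message and descriptor; not the kernel's msi_msg/msi_desc. */
struct demo_msg {
	uint32_t address;
	uint32_t data;
};

struct demo_desc {
	struct demo_msg cached;  /* copy of what was last written to hardware */
	uint32_t hw_address;     /* stand-in for the device's address register */
	uint32_t hw_data;        /* stand-in for the device's data register */
	unsigned int hw_writes;  /* how often the "hardware" was written */
};

/* The only path that touches the device; it also refreshes the cache. */
static void demo_write_message(struct demo_desc *desc, const struct demo_msg *msg)
{
	assert(desc != NULL && msg != NULL);
	desc->hw_address = msg->address;
	desc->hw_data = msg->data;
	desc->hw_writes++;
	desc->cached = *msg;
}

/* Reading never touches the device: the cached copy is authoritative. */
static struct demo_msg demo_get_cached_message(const struct demo_desc *desc)
{
	return desc->cached;
}

/* Assumed affinity model: each target CPU has its own doorbell address. */
static void demo_set_affinity(struct demo_desc *desc, uint32_t cpu_doorbell)
{
	struct demo_msg msg = demo_get_cached_message(desc);

	msg.address = cpu_doorbell;
	demo_write_message(desc, &msg);
}
```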

> > What I'd envision as the API from the device driver perspective is something
> > as simple like this:
> > 
> > struct msi_desc *msi_request(struct msi_chip *chip, irq_handler_t handler,
> > 			unsigned long flags, const char *name, struct device *dev);
> > 
> > which would get an msi descriptor that is valid for this device (dev)
> > connected to a particular msi_chip, and associate a handler function
> > with it. The device driver can call that function and retrieve the
> > address/message pair from the msi_desc in order to store it in its own
> > device specific registers. The request_irq() can be handled internally
> > to msi_request().
> 
> This is a huge change for device drivers, and some device drivers don't know which msi_chip
> their MSI irq deliver to. I'm reworking the msi_chip, and try to use msi_chip to eliminate
> all arch_msi_xxx() under every arch in kernel. And the important point is how to create the
> binding for the MSI device to the target msi_chip.

Which drivers are you thinking of? Again, I wouldn't expect to change any PCI drivers,
but only platform drivers that do native MSI, so we only have to change drivers that
do not support any MSI at all yet and that need to be changed anyway in order to add
support.

> For PCI device, some arm platform already bound the msi_chip to the pci hostbridge, then all
> pci devices under the pci hostbridge deliver their MSI irqs to the target msi_chip.
> And other platform create the binding in DTS file, then the MSI device can find their msi_chip
> by device_node.
> I don't know whether there are other situations, we should provide a generic interface that
> every MSI device under every platform can use it to find its msi_chip exactly.

We have introduced the "msi-parent" property to mirror the "interrupt-parent" property.
For the PCI case, this property is only needed in the PCI host controller, and there
can be a system-wide default, by putting the "msi-parent" property into the root device
node or the node of a bus that is parent to all devices supporting MSI.

For non-PCI devices, it should be possible to override the "msi-parent" property per
device, but those can also use the global property.
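
The lookup described above could be sketched roughly as follows. This is a
userspace illustration only: demo_node and demo_find_msi_parent() are made-up
names, not the kernel's OF API, and it assumes the property is resolved by
walking from the device node towards the root until a node carrying
"msi-parent" is found.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal device node; not the kernel's struct device_node. */
struct demo_node {
	const char *name;
	const struct demo_node *parent;      /* NULL for the root node */
	const struct demo_node *msi_parent;  /* NULL when the node has no "msi-parent" */
};

/*
 * Return the MSI controller for 'node': its own "msi-parent" if present,
 * otherwise the nearest ancestor's, or NULL if no node on the path has one.
 */
static const struct demo_node *demo_find_msi_parent(const struct demo_node *node)
{
	assert(node != NULL);

	for (; node != NULL; node = node->parent) {
		if (node->msi_parent != NULL)
			return node->msi_parent;
	}
	return NULL;
}
```

With a default set on the root node, a platform device without its own
property resolves to the system-wide controller, while a PCI host controller
that names its own MSI catcher overrides it.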

The main use case that I see is PCI host controllers that have their own MSI catcher
included, meaning that any PCI device can send its MSIs either there or to a
system-wide GICv3 instance, and we need a way to select which one.

A degenerate case of this would be a system where a PCI device sends its MSI into
the host controller, that generates a legacy interrupt and that in turn gets 
sent to an irqchip which turns it back into an MSI for the GICv3. This would of
course be very inefficient, but I think we should be able to express this with
both the binding and the in-kernel framework just to be on the safe side.


	Arnd

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
  2014-08-01 10:27 ` arnab.basu
@ 2014-08-04  3:03   ` Yijing Wang
  2014-08-20  5:44     ` Bharat.Bhushan
  0 siblings, 1 reply; 41+ messages in thread
From: Yijing Wang @ 2014-08-04  3:03 UTC (permalink / raw)
  To: arnab.basu
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, virtualization, Hanjun Guo,
	linux-kernel

>> MSI was introduced in PCI Spec 2.2. Currently, kernel MSI driver codes
>> are bonding with PCI device. Because MSI has a lot advantages in design.
>> More and more non-PCI devices want to use MSI as their default interrupt.
>> The existing MSI device include HPET. HPET driver provide its own MSI
>> code to initialize and process MSI interrupts. In the latest GIC v3 spec,
>> legacy device can deliver MSI by the help of a relay device named
>> consolidator.
>> Consolidator can translate the legacy interrupts connected to it to
>> MSI/MSI-X. And new non-PCI device will be designed to support MSI in
>> future. So make the MSI driver code be generic will help the non-PCI
>> device use MSI more simply.
> 
> As per my understanding the GICv3 provides a service that will convert writes to a specified address to IRQs delivered to the core and as you mention above MSIs are part of the PCI spec. So I can see a strong case for non-PCI devices to want MSI like functionality without being fully compliant with the requirements of the MSI spec.

In GICv3 this service is called MBI, but no detailed information about it is available; all we know is that MBI is analogous to MSI.
MBI devices must have address/data registers, but other registers such as enable/mask/ctrl are not mandatory.
I don't know whether an MBI spec will be released, but in any case I think the MSI refactoring makes sense, since there are already non-PCI MSI devices such as hpet and dmar.
For simplicity, let's refer to both MSI and MBI as MSI for now.
> 
> My question is do we necessarily want to rework so much of the PCI-MSI layer to support non PCI devices? Or will it be sufficient to create a framework to allow non PCI devices to hook up with a device that can convert their writes to an IRQ to the core.
> 
> As I understand it, the msi_chip is (almost) such a framework. The only problem being that it makes some PCI specific assumptions (such as PCI specific writes from within msi_chip functions). Won't it be sufficient to make the msi_chip framework generic enough to be used by non-PCI devices and let each bus/device manage any additional requirements (such as configuration flow, bit definitions etc) that it places on message based interrupts?

The msi_chip framework is important for supporting that, but I don't think it is enough on its own: msi_chip is only responsible for IRQ allocation, teardown, and so on.

The key difference between PCI devices and non-PCI MSI devices is the interface used to access the hardware MSI registers.
For instance, msi_chip->setup_irq() currently sets up the MSI irq and configures the MSI address/data registers, so we need to provide a device-specific write_msi_msg() interface;
then, when we call msi_chip->setup_irq(), the device's MSI registers can be configured appropriately.
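
A rough sketch of that split, assuming a per-device write hook. This is
userspace C for illustration only; demo_msi_ops, demo_setup_irq(), the fake
config space and MMIO arrays, and the DEMO_PCI_MSI_CAP offset are all
hypothetical and not taken from the patchset. The generic setup path composes
the message once and leaves the register access to the device type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All types and names here are hypothetical stand-ins, not kernel definitions. */
struct demo_msi_msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

struct demo_msi_dev;

/* Device-specific hook: how the address/data pair reaches the hardware. */
struct demo_msi_ops {
	void (*write_msg)(struct demo_msi_dev *dev, const struct demo_msi_msg *msg);
};

struct demo_msi_dev {
	const struct demo_msi_ops *ops;
	uint32_t cfg_space[16];  /* stand-in for PCI config space, in dwords */
	uint32_t mmio[3];        /* stand-in for a platform device's registers */
};

#define DEMO_PCI_MSI_CAP 4       /* made-up capability offset, in dwords */

/* PCI-style device: the message lives in the MSI capability. */
static void demo_pci_write_msg(struct demo_msi_dev *dev, const struct demo_msi_msg *msg)
{
	dev->cfg_space[DEMO_PCI_MSI_CAP + 1] = msg->address_lo;
	dev->cfg_space[DEMO_PCI_MSI_CAP + 2] = msg->address_hi;
	dev->cfg_space[DEMO_PCI_MSI_CAP + 3] = msg->data;
}

/* Non-PCI device: the message lives in device-specific registers. */
static void demo_platform_write_msg(struct demo_msi_dev *dev, const struct demo_msi_msg *msg)
{
	dev->mmio[0] = msg->address_lo;
	dev->mmio[1] = msg->address_hi;
	dev->mmio[2] = msg->data;
}

static const struct demo_msi_ops demo_pci_ops = { .write_msg = demo_pci_write_msg };
static const struct demo_msi_ops demo_platform_ops = { .write_msg = demo_platform_write_msg };

/* Generic setup: compose the message, then delegate the register access. */
static int demo_setup_irq(struct demo_msi_dev *dev, uint32_t doorbell, uint32_t vector)
{
	assert(dev != NULL);
	if (dev->ops == NULL || dev->ops->write_msg == NULL)
		return -1;

	struct demo_msi_msg msg = { .address_lo = doorbell, .address_hi = 0, .data = vector };
	dev->ops->write_msg(dev, &msg);
	return 0;
}
```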

My patchset is just an RFC draft and I will update it later; all we want to do is make the kernel support non-PCI MSI devices.

Thanks!
Yijing.


> 
> Thanks
> Arnab
> 
> .
> 


-- 
Thanks!
Yijing


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
  2014-08-01 13:16         ` Arnd Bergmann
@ 2014-08-04  3:32           ` Yijing Wang
  2014-08-04 14:45             ` Arnd Bergmann
  0 siblings, 1 reply; 41+ messages in thread
From: Yijing Wang @ 2014-08-04  3:32 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Jiang Liu, linux-arm-kernel, linux-kernel, linux-arch,
	Russell King, Paul.Mundt, Marc Zyngier, linux-pci,
	James E.J. Bottomley, virtualization, Xinwei Hu, Hanjun Guo,
	Bjorn Helgaas, Wuyun

On 2014/8/1 21:16, Arnd Bergmann wrote:
> On Wednesday 30 July 2014, Yijing Wang wrote:
>>>>>
>>>>> The other part I'm not completely sure about is how you want to
>>>>> have MSIs map into normal IRQ descriptors. At the moment, all
>>>>> MSI users are based on IRQ numbers, but this has known scalability problems.
>>>>
>>>> Hmmm, I still use the IRQ number to map the MSIs to IRQ description.
>>>> I'm sorry, I don't understand you meaning.
>>>> What are the scalability problems you mentioned ?
>>> We have soft limitation of nr_irqs or hard limitation NR_IRQS,
>>> we couldn't allocate as much irq number as we need in some cases,
>>> such as to support MSI-x.
>>
>> Oh, yes, this is a potential issue. Gerry, thanks for you explanation. :)
> 
> This should no longer be an issue, as arm64 uses CONFIG_SPARSE_IRQ
> and the number of interrupts is not limited in any form.
> 
> My point was more that the device driver should not need to care about
> the interrupt number: it gets made up on the spot when the MSI is
> needed, and then it is only used to request the IRQ. This can be
> simplified into one interface at the device driver level, even though
> the internal still use numbers somewhere. If we ever remove IRQ numbers
> from the driver API, this part doesn't need to get touched again.
> 

Hi Arnd, I have another question: some drivers request more than one
MSI/MSI-X IRQ and use them for different purposes.
E.g. a network driver generally uses one of them to handle miscellaneous
network events and the others to transmit/receive data.

So in this case it seems the driver does need to touch the IRQ numbers.

wr-linux:~ # cat /proc/interrupts
            CPU0       CPU1       CPU2     ....      CPU17      CPU18      CPU19      CPU20      CPU21      CPU22      CPU23
 ......
 100:          0          0          0               0          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0
 101:          2          0          0               0          0          0  302830488          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-0
 102:        110          0          0               0          0  360675897          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-1
 103:        109          0          0               0          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-2
 104:        107          0          0         9678933          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-3
 105:        107          0          0               0  357838258          0          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-4
 106:        115          0          0               0          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-5
 107:        114          0          0               0          0          0          0  337866096          0          0  IR-PCI-MSI-edge      eth0-TxRx-6
 108:  373801199          0          0               0          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-7

Thanks!
Yijing.

> 
> .
> 


-- 
Thanks!
Yijing


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
  2014-08-01 13:52     ` Arnd Bergmann
@ 2014-08-04  6:43       ` Yijing Wang
  2014-08-04 14:59         ` Arnd Bergmann
  0 siblings, 1 reply; 41+ messages in thread
From: Yijing Wang @ 2014-08-04  6:43 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-arm-kernel, linux-kernel, linux-arch, Russell King,
	Paul.Mundt, Marc Zyngier, linux-pci, James E.J. Bottomley,
	virtualization, Xinwei Hu, Hanjun Guo, Bjorn Helgaas, Wuyun

On 2014/8/1 21:52, Arnd Bergmann wrote:
> On Wednesday 30 July 2014, Yijing Wang wrote:
>> On 2014/7/29 22:08, Arnd Bergmann wrote:
>>> On Saturday 26 July 2014 11:08:37 Yijing Wang wrote:
>>>>
>>>> The new data struct for generic MSI driver.
>>>> struct msi_irqs {
>>>>         u8 msi_enabled:1; /* Enable flag */
>>>>         u8 msix_enabled:1;
>>>>         struct list_head msi_list; /* MSI desc list */
>>>>         void *data;     /* help to find the MSI device */
>>>>         struct msi_ops *ops; /* MSI device specific hook */
>>>> };
>>>> struct msi_irqs is used to manage MSI related informations. Every device supports
>>>> MSI should contain this data struct and allocate it.
>>>
>>> I think you should have a stronger association with the 'struct
>>> device' here. Can you replace the 'void *data' with 'struct device *dev'?
>>
>> Actually, I used the struct device *dev in my first draft, finally, I replaced
>> it with void *data, because some MSI devices don't have a struct device *dev,
>> like the existing hpet device, dmar msi device, and OF device, like the ARM consolidator.
>>
>> Of course, we can make the MSI devices have their own struct device, and register to
>> device tree, eg, add a class device named MSI_DEV. But I'm not sure whether it is appropriate.
> 
> It doesn't have to be in the (OF) device tree, but I think it absolutely makes
> sense to use the 'struct device' infrastructure here, as almost everything uses
> a device, and the ones that don't do that today can be easily changed.

I will try to use "struct device" infrastructure, thanks for your suggestion. :)

> 
>>> The other part I'm not completely sure about is how you want to
>>> have MSIs map into normal IRQ descriptors. At the moment, all
>>> MSI users are based on IRQ numbers, but this has known scalability problems.
>>
>> Hmmm, I still use the IRQ number to map the MSIs to IRQ description.
>> I'm sorry, I don't understand you meaning.
>> What are the scalability problems you mentioned ?
>> For device drivers, they always process interrupt in two steps.
>> If irq is the legacy interrupt, drivers will first
>> use the irq_of_parse_and_map() or pci_enable_device() to parse and get the IRQ number.
>> Then drivers will call the request_irq() to register the interrupt handler.
>> If irq is MSIs, first call pci_enable_msi/x() to get the IRQ number and then call
>> request_irq() to register interrupt handler.
> 
> The method you describe here makes sense for PCI devices that are required to support
> legacy interrupts and may or may not support MSI on a given system, but not so much
> for platform devices for which we know exactly whether we want to use MSI
> or legacy interrupts.
> 
> In particular if you have a device that can only do MSI, the entire pci_enable_msi
> step is pointless: all we need to do is program the correct MSI target address/message
> pair into the device and register the handler.

Yes, I mostly agree, as long as we don't change the existing hundreds of drivers. What
I am worried about is that some drivers may want to know the IRQ numbers and use them
to handle different things, as I mentioned in another reply.
But we can also provide an interface that integrates MSI configuration and request_irq(),
if most drivers don't care about the IRQ number.

> 
>>> I wonder if we can do the interface in a way that
>>> hides the interrupt number from generic device drivers and just
>>> passes a 'struct irq_desc'. Note that there are long-term plans to
>>> get rid of IRQ numbers entirely, but those plans have existed for
>>> a long time already without anybody seriously addressing the device
>>> driver interfaces so far, so it might never really happen.
>>>
>>
>> Maybe this is a huge work, now hundreds drivers use the IRQ number, so maybe we can consider
>> this in a separate title.
> 
> Sorry for being unclear here: I did not suggest changing all drivers now. What I meant
> is that we use a different API for non-PCI devices that works without IRQ numbers.
> I don't think we should touch the PCI interfaces at this point.

OK, I got it.

>>> What I'd envision as the API from the device driver perspective is something
>>> as simple like this:
>>>
>>> struct msi_desc *msi_request(struct msi_chip *chip, irq_handler_t handler,
>>> 			unsigned long flags, const char *name, struct device *dev);
>>>
>>> which would get an msi descriptor that is valid for this device (dev)
>>> connected to a particular msi_chip, and associate a handler function
>>> with it. The device driver can call that function and retrieve the
>>> address/message pair from the msi_desc in order to store it in its own
>>> device specific registers. The request_irq() can be handled internally
>>> to msi_request().
>>
>> This is a huge change for device drivers, and some device drivers don't know which msi_chip
>> their MSI irq deliver to. I'm reworking the msi_chip, and try to use msi_chip to eliminate
>> all arch_msi_xxx() under every arch in kernel. And the important point is how to create the
>> binding for the MSI device to the target msi_chip.
> 
> Which drivers are you thinking of? Again, I wouldn't expect to change any PCI drivers,
> but only platform drivers that do native MSI, so we only have to change drivers that
> do not support any MSI at all yet and that need to be changed anyway in order to add
> support.

I mean platform device drivers. We can find the target msi_chip through some platform
interface (like the existing of_pci_find_msi_chip_by_node()), so there is no need to explicitly
pass the msi_chip as a function argument.

> 
>> For PCI device, some arm platform already bound the msi_chip to the pci hostbridge, then all
>> pci devices under the pci hostbridge deliver their MSI irqs to the target msi_chip.
>> And other platform create the binding in DTS file, then the MSI device can find their msi_chip
>> by device_node.
>> I don't know whether there are other situations, we should provide a generic interface that
>> every MSI device under every platform can use it to find its msi_chip exactly.
> 
> We have introduced the "msi-parent" property to mirror the "interrupt-parent" property.
> For the PCI case, this property is only needed in the PCI host controller, and there
> can be a system-wide default, by putting the "msi-parent" property into the root device
> node or the node of a bus that is parent to all devices supporting MSI.
> 
> For non-PCI devices, it should be possible to override the "msi-parent" property per
> device, but those can also use the global property.
> 
> The main use case that I see are PCI host controllers that have their own MSI catcher
> included, so meaning that any PCI device can either send its MSIs there, or to a
> system-wide GICv3 instance, and we need a way to select which one.

Yes, agree.

> 
> A degenerate case of this would be a system where a PCI device sends its MSI into
> the host controller, that generates a legacy interrupt and that in turn gets 
> sent to an irqchip which turns it back into an MSI for the GICv3. This would of
> course be very inefficient, but I think we should be able to express this with
> both the binding and the in-kernel framework just to be on the safe side.

Yes, the best way to tell the kernel which msi_chip a device should deliver to is to describe
the binding in the DTS file. If a real degenerate case is found, we can update the platform
interface that is responsible for finding the matching msi_chip in the future.

> 
> 
> 	Arnd
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 
> .
> 


-- 
Thanks!
Yijing


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
  2014-08-04  3:32           ` Yijing Wang
@ 2014-08-04 14:45             ` Arnd Bergmann
  2014-08-05  2:20               ` Yijing Wang
  0 siblings, 1 reply; 41+ messages in thread
From: Arnd Bergmann @ 2014-08-04 14:45 UTC (permalink / raw)
  To: Yijing Wang
  Cc: Jiang Liu, linux-arm-kernel, linux-kernel, linux-arch,
	Russell King, Paul.Mundt, Marc Zyngier, linux-pci,
	James E.J. Bottomley, virtualization, Xinwei Hu, Hanjun Guo,
	Bjorn Helgaas, Wuyun

On Monday 04 August 2014, Yijing Wang wrote:
> I have another question is some drivers will request more than one
> MSI/MSI-X IRQ, and the driver will use them to process different things.
> Eg. network driver generally uses one of them to process trivial network thins,
> and others to transmit/receive data.
> 
> So, in this case, it seems to driver need to touch the IRQ numbers.
> 
> wr-linux:~ # cat /proc/interrupts
>             CPU0       CPU1       CPU2     ....      CPU17      CPU18      CPU19      CPU20      CPU21      CPU22      CPU23
>  ......
>  100:          0          0          0               0          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0
>  101:          2          0          0               0          0          0  302830488          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-0
>  102:        110          0          0               0          0  360675897          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-1
>  103:        109          0          0               0          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-2
>  104:        107          0          0         9678933          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-3
>  105:        107          0          0               0  357838258          0          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-4
>  106:        115          0          0               0          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-5
>  107:        114          0          0               0          0          0          0  337866096          0          0  IR-PCI-MSI-edge      eth0-TxRx-6
>  108:  373801199          0          0               0          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-7
> 

I think in this example, you just need to request eight interrupts, and pass a
different data pointer each time, pointing to the napi_struct of each of the
NIC queues. The driver has no need to deal with the IRQ number at all,
and I would be surprised if it cared today.
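
To illustrate the point above, here is a minimal userspace sketch (not kernel
code). Every name in it (mock_msi, queue_ctx, request_all, fire) is a
hypothetical stand-in, and queue_ctx only plays the role that a per-queue
napi_struct would play in a real NIC driver. It shows that a handler can tell
the queues apart from its data pointer alone, without ever seeing an IRQ number.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these are kernel APIs. */
#define NUM_QUEUES 8

struct queue_ctx {		/* plays the role of a per-queue napi_struct */
	int index;
	unsigned long events;
};

typedef void (*mock_handler_t)(void *data);

struct mock_msi {		/* one requested interrupt */
	mock_handler_t handler;
	void *data;
};

/* The handler identifies its queue purely from the data pointer. */
static void queue_handler(void *data)
{
	struct queue_ctx *q = data;

	q->events++;
}

/* Register one handler per queue, each with its own context pointer. */
static void request_all(struct mock_msi *msis, struct queue_ctx *queues)
{
	for (int i = 0; i < NUM_QUEUES; i++) {
		queues[i].index = i;
		queues[i].events = 0;
		msis[i].handler = queue_handler;
		msis[i].data = &queues[i];
	}
}

/* Simulate delivery of one interrupt to its registered handler. */
static void fire(struct mock_msi *msi)
{
	assert(msi->handler != NULL);
	msi->handler(msi->data);
}
```

Firing the fourth entry twice would bump only that queue's event counter; no
code path in the sketch needs a number like 100..108 from /proc/interrupts.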

	Arnd

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
  2014-08-04  6:43       ` Yijing Wang
@ 2014-08-04 14:59         ` Arnd Bergmann
  2014-08-05  2:12           ` Yijing Wang
  0 siblings, 1 reply; 41+ messages in thread
From: Arnd Bergmann @ 2014-08-04 14:59 UTC (permalink / raw)
  To: Yijing Wang
  Cc: linux-arm-kernel, linux-kernel, linux-arch, Russell King,
	Paul.Mundt, Marc Zyngier, linux-pci, James E.J. Bottomley,
	virtualization, Xinwei Hu, Hanjun Guo, Bjorn Helgaas, Wuyun

On Monday 04 August 2014, Yijing Wang wrote:
> On 2014/8/1 21:52, Arnd Bergmann wrote:
> > On Wednesday 30 July 2014, Yijing Wang wrote:
> >> On 2014/7/29 22:08, Arnd Bergmann wrote:
> >>> The other part I'm not completely sure about is how you want to
> >>> have MSIs map into normal IRQ descriptors. At the moment, all
> >>> MSI users are based on IRQ numbers, but this has known scalability problems.
> >>
> >> Hmmm, I still use the IRQ number to map the MSIs to IRQ description.
> >> I'm sorry, I don't understand you meaning.
> >> What are the scalability problems you mentioned ?
> >> For device drivers, they always process interrupt in two steps.
> >> If irq is the legacy interrupt, drivers will first
> >> use the irq_of_parse_and_map() or pci_enable_device() to parse and get the IRQ number.
> >> Then drivers will call the request_irq() to register the interrupt handler.
> >> If irq is MSIs, first call pci_enable_msi/x() to get the IRQ number and then call
> >> request_irq() to register interrupt handler.
> > 
> > The method you describe here makes sense for PCI devices that are required to support
> > legacy interrupts and may or may not support MSI on a given system, but not so much
> > for platform devices for which we know exactly whether we want to use MSI
> > or legacy interrupts.
> > 
> > In particular if you have a device that can only do MSI, the entire pci_enable_msi
> > step is pointless: all we need to do is program the correct MSI target address/message
> > pair into the device and register the handler.
> 
> Yes, I almost agree if we won't change the existing hundreds of drivers, what
> I worried about is some drivers may want to know the IRQ numbers, and use the IRQ
> number to process different things, as I mentioned in another reply.
> But we can also provide the interface which integrate MSI configuration and request_irq(),
> if most drivers don't care the IRQ number.

The driver would still have the option of getting the IRQ number for now: With
the interface I imagine, you would get a 'struct msi_desc' pointer, from which
you can look up the 'struct irq_desc' pointer (either embedded in msi_desc,
or using a pointer from a member of msi_desc), and you can already get the
interrupt number from the irq_desc.

My point was that a well-written driver already does not care about the interrupt
number: the only information a driver needs in the interrupt handler is a pointer
to its own context, which we already derive from the irq_desc.

The main interface that currently requires the irq number is free_irq(), but
I would argue that we can just add a wrapper that takes the msi_desc pointer
as its first argument so the driver does not have to worry about it.

We can add additional wrappers like that as needed.

> >>> What I'd envision as the API from the device driver perspective is something
> >>> as simple like this:
> >>>
> >>> struct msi_desc *msi_request(struct msi_chip *chip, irq_handler_t handler,
> >>> 			unsigned long flags, const char *name, struct device *dev);
> >>>
> >>> which would get an msi descriptor that is valid for this device (dev)
> >>> connected to a particular msi_chip, and associate a handler function
> >>> with it. The device driver can call that function and retrieve the
> >>> address/message pair from the msi_desc in order to store it in its own
> >>> device specific registers. The request_irq() can be handled internally
> >>> to msi_request().
> >>
> >> This is a huge change for device drivers, and some device drivers don't know which msi_chip
> >> their MSI irq deliver to. I'm reworking the msi_chip, and try to use msi_chip to eliminate
> >> all arch_msi_xxx() under every arch in kernel. And the important point is how to create the
> >> binding for the MSI device to the target msi_chip.
> > 
> > Which drivers are you thinking of? Again, I wouldn't expect to change any PCI drivers,
> > but only platform drivers that do native MSI, so we only have to change drivers that
> > do not support any MSI at all yet and that need to be changed anyway in order to add
> > support.
> 
> I mean platform device drivers, because we can find the target msi_chip by some platform
> interfaces(like the existing of_pci_find_msi_chip_by_node()). So we no need to explicitly
> provide the msi_chip as the function argument.

Right, that works too. I was thinking we might need an interface that allows us to
pick a particular msi_chip if there are several alternatives (e.g. one in the GIC
and one in the PCI host), but you are right: we should normally be able to hardwire
that information in DT or elsewhere, and just need the 'struct device' pointer, which
should probably be the first argument here.

As you pointed out, it's common to have multiple MSIs for a single device, so we
also need a context to pass around, so my suggestion would become something like:

struct msi_desc *msi_request(struct device *dev, irq_handler_t handler,
 			unsigned long flags, const char *name, void *data);

It's possible that we have to add one or two more arguments here.
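
As a rough userspace sketch of how such an interface could look to a driver
(an assumption-laden mock, not a proposed implementation): all sketch_* names
are hypothetical, the handler type is simplified compared to irq_handler_t,
and the address/message values are arbitrary placeholders. It also includes
the kind of free wrapper mentioned above, which takes the descriptor rather
than an IRQ number.

```c
#include <assert.h>
#include <stddef.h>

/* All types and functions here are hypothetical userspace stand-ins. */
struct sketch_device {
	const char *name;
};

struct sketch_msi_desc {
	unsigned int irq;		/* still exists internally, hidden from drivers */
	unsigned long long address;	/* pair the driver would program into its device */
	unsigned int message;
	void (*handler)(struct sketch_msi_desc *desc, void *data);
	void *data;			/* per-interrupt driver context */
	int in_use;
};

#define SKETCH_MAX_MSI 4
static struct sketch_msi_desc sketch_pool[SKETCH_MAX_MSI];

/* Allocate a descriptor and bind handler + context in a single call. */
static struct sketch_msi_desc *sketch_msi_request(struct sketch_device *dev,
		void (*handler)(struct sketch_msi_desc *desc, void *data),
		unsigned long flags, const char *name, void *data)
{
	(void)dev;
	(void)flags;
	(void)name;

	for (unsigned int i = 0; i < SKETCH_MAX_MSI; i++) {
		struct sketch_msi_desc *d = &sketch_pool[i];

		if (d->in_use)
			continue;
		d->in_use = 1;
		d->irq = 100 + i;	/* made up on the spot */
		d->address = 0x1000ULL;	/* arbitrary placeholder value */
		d->message = i;
		d->handler = handler;
		d->data = data;
		return d;
	}
	return NULL;			/* pool exhausted */
}

/* Free wrapper keyed on the descriptor, so the driver never passes an IRQ number. */
static void sketch_msi_free(struct sketch_msi_desc *desc)
{
	assert(desc != NULL && desc->in_use);
	desc->handler = NULL;
	desc->data = NULL;
	desc->in_use = 0;
}
```

A driver with several MSIs would call the request function once per
interrupt with a different data pointer, then read the address/message pair
from each returned descriptor to program its own registers.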

> > A degenerate case of this would be a system where a PCI device sends its MSI into
> > the host controller, that generates a legacy interrupt and that in turn gets 
> > sent to an irqchip which turns it back into an MSI for the GICv3. This would of
> > course be very inefficient, but I think we should be able to express this with
> > both the binding and the in-kernel framework just to be on the safe side.
> 
> Yes, the best way to tell the kernel which msi_chip should deliver to is describe
> the binding in DTS file. If a real degenerate case found, we can update the platform
> interface which is responsible for getting the match msi_chip in future.

Ok.

	Arnd

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
  2014-08-04 14:59         ` Arnd Bergmann
@ 2014-08-05  2:12           ` Yijing Wang
  0 siblings, 0 replies; 41+ messages in thread
From: Yijing Wang @ 2014-08-05  2:12 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-arm-kernel, linux-kernel, linux-arch, Russell King,
	Paul.Mundt, Marc Zyngier, linux-pci, James E.J. Bottomley,
	virtualization, Xinwei Hu, Hanjun Guo, Bjorn Helgaas, Wuyun

>>> The method you describe here makes sense for PCI devices that are required to support
>>> legacy interrupts and may or may not support MSI on a given system, but not so much
>>> for platform devices for which we know exactly whether we want to use MSI
>>> or legacy interrupts.
>>>
>>> In particular if you have a device that can only do MSI, the entire pci_enable_msi
>>> step is pointless: all we need to do is program the correct MSI target address/message
>>> pair into the device and register the handler.
>>
>> Yes, I almost agree if we won't change the existing hundreds of drivers, what
>> I worried about is some drivers may want to know the IRQ numbers, and use the IRQ
>> number to process different things, as I mentioned in another reply.
>> But we can also provide the interface which integrate MSI configuration and request_irq(),
>> if most drivers don't care the IRQ number.
> 
> The driver would still have the option of getting the IRQ number for now: With
> the interface I imagine, you would get a 'struct msi_desc' pointer, from which
> you can look up the 'struct irq_desc' pointer (either embedded in msi_desc,
> or using a pointer from a member of msi_desc), and you can already get the
> interrupt number from the irq_desc.
> 
> My point was that a well-written driver already does not care about the interrupt
> number: the only information a driver needs in the interrupt handler is a pointer
> to its own context, which we already derive from the irq_desc.

Agreed, I will try to introduce a similar interface in the next version, thanks!

> 
> The main interface that currently requires the irq number is free_irq(), but
> I would argue that we can just add a wrapper that takes the msi_desc pointer
> as its first argument so the driver does not have to worry about it.
> 
> We can add additional wrappers like that as needed.

OK

>>>> This is a huge change for device drivers, and some device drivers don't know which msi_chip
>>>> their MSI irq deliver to. I'm reworking the msi_chip, and try to use msi_chip to eliminate
>>>> all arch_msi_xxx() under every arch in kernel. And the important point is how to create the
>>>> binding for the MSI device to the target msi_chip.
>>>
>>> Which drivers are you thinking of? Again, I wouldn't expect to change any PCI drivers,
>>> but only platform drivers that do native MSI, so we only have to change drivers that
>>> do not support any MSI at all yet and that need to be changed anyway in order to add
>>> support.
>>
>> I mean platform device drivers, because we can find the target msi_chip by some platform
>> interfaces(like the existing of_pci_find_msi_chip_by_node()). So we no need to explicitly
>> provide the msi_chip as the function argument.
> 
> Right, that works too. I was thinking we might need an interface that allows us to
> pick a particular msi_chip if there are several alternatives (e.g. one in the GIC
> and one in the PCI host), but you are right: we should normally be able to hardwire
> that information in DT or elsewhere, and just need the 'struct device pointer' which
> should probably be the first argument here.
> 
> As you pointed out, it's common to have multiple MSIs for a single device, so we
> also need a context to pass around, so my suggestion would become something like:
> 
> struct msi_desc *msi_request(struct device *dev, irq_handler_t handler,
>  			unsigned long flags, const char *name, void *data);
> 
> It's possible that we have to add one or two more arguments here.

Good suggestion, thanks!

> 
>>> A degenerate case of this would be a system where a PCI device sends its MSI into
>>> the host controller, that generates a legacy interrupt and that in turn gets 
>>> sent to an irqchip which turns it back into an MSI for the GICv3. This would of
>>> course be very inefficient, but I think we should be able to express this with
>>> both the binding and the in-kernel framework just to be on the safe side.
>>
>> Yes, the best way to tell the kernel which msi_chip should deliver to is describe
>> the binding in DTS file. If a real degenerate case found, we can update the platform
>> interface which is responsible for getting the match msi_chip in future.
> 
> Ok.
> 
> 	Arnd
> 
> .
> 


-- 
Thanks!
Yijing


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
  2014-08-04 14:45             ` Arnd Bergmann
@ 2014-08-05  2:20               ` Yijing Wang
  0 siblings, 0 replies; 41+ messages in thread
From: Yijing Wang @ 2014-08-05  2:20 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Jiang Liu, linux-arm-kernel, linux-kernel, linux-arch,
	Russell King, Paul.Mundt, Marc Zyngier, linux-pci,
	James E.J. Bottomley, virtualization, Xinwei Hu, Hanjun Guo,
	Bjorn Helgaas, Wuyun

On 2014/8/4 22:45, Arnd Bergmann wrote:
> On Monday 04 August 2014, Yijing Wang wrote:
>> I have another question is some drivers will request more than one
>> MSI/MSI-X IRQ, and the driver will use them to process different things.
>> Eg. network driver generally uses one of them to process trivial network thins,
>> and others to transmit/receive data.
>>
>> So, in this case, it seems to driver need to touch the IRQ numbers.
>>
>> wr-linux:~ # cat /proc/interrupts
>>             CPU0       CPU1       CPU2     ....      CPU17      CPU18      CPU19      CPU20      CPU21      CPU22      CPU23
>>  ......
>>  100:          0          0          0               0          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0
>>  101:          2          0          0               0          0          0  302830488          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-0
>>  102:        110          0          0               0          0  360675897          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-1
>>  103:        109          0          0               0          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-2
>>  104:        107          0          0         9678933          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-3
>>  105:        107          0          0               0  357838258          0          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-4
>>  106:        115          0          0               0          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-5
>>  107:        114          0          0               0          0          0          0  337866096          0          0  IR-PCI-MSI-edge      eth0-TxRx-6
>>  108:  373801199          0          0               0          0          0          0          0          0          0  IR-PCI-MSI-edge      eth0-TxRx-7
>>
> 
> I think in this example, you just need to request eight interrupts, and pass a
> different data pointer each time, pointing to the napi_struct of each of the
> NIC queues. The driver has no need to deal with the IRQ number at all,
> and I would be surprised if it cared today.

Yes, you are right, this is not a stumbling block. :)

> 
> 	Arnd
> 
> .
> 


-- 
Thanks!
Yijing


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 03/11] PCI/MSI: Refactor pci_dev_msi_enabled()
  2014-07-26  3:08 ` [RFC PATCH 03/11] PCI/MSI: Refactor pci_dev_msi_enabled() Yijing Wang
@ 2014-08-05 22:35   ` Stuart Yoder
  2014-08-06  1:23     ` Yijing Wang
  2014-08-20  5:57   ` Bharat.Bhushan
  1 sibling, 1 reply; 41+ messages in thread
From: Stuart Yoder @ 2014-08-05 22:35 UTC (permalink / raw)
  To: Yijing Wang
  Cc: linux-kernel, Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci,
	Paul.Mundt, James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, arnab.basu, virtualization, Hanjun Guo

On Fri, Jul 25, 2014 at 10:08 PM, Yijing Wang <wangyijing@huawei.com> wrote:
> Pci_dev_msi_enabled() is used to check whether device
> MSI/MSIX enabled. Refactor this function  to suuport
> checking only device MSI or MSIX enabled.
>
> Signed-off-by: Yijing Wang <wangyijing@huawei.com>

So this patch refactors things so that checks like this:
   > -       if (!dev->msi_enabled)

are moved into a function:
   > +       if (!pci_dev_msi_enabled(dev, MSI_TYPE))

Can you explain a bit more why this is needed?  Is it just cleanup?

Thanks,
Stuart

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 03/11] PCI/MSI: Refactor pci_dev_msi_enabled()
  2014-08-05 22:35   ` Stuart Yoder
@ 2014-08-06  1:23     ` Yijing Wang
  0 siblings, 0 replies; 41+ messages in thread
From: Yijing Wang @ 2014-08-06  1:23 UTC (permalink / raw)
  To: Stuart Yoder
  Cc: linux-kernel, Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci,
	Paul.Mundt, James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, arnab.basu, virtualization, Hanjun Guo

On 2014/8/6 6:35, Stuart Yoder wrote:
> On Fri, Jul 25, 2014 at 10:08 PM, Yijing Wang <wangyijing@huawei.com> wrote:
>> Pci_dev_msi_enabled() is used to check whether device
>> MSI/MSIX enabled. Refactor this function  to suuport
>> checking only device MSI or MSIX enabled.
>>
>> Signed-off-by: Yijing Wang <wangyijing@huawei.com>
> 
> So this patch refactors things so that checks like this:
>    > -       if (!dev->msi_enabled)
> 
> are moved into a function:
>    > +       if (!pci_dev_msi_enabled(dev, MSI_TYPE))
> 
> Can you explain a bit more why this  needed.   Is it just cleanup?

Hi Stuart, it's not just cleanup. "[RFC PATCH 08/11] PCI/MSI: Introduce new struct msi_irqs and struct msi_ops"
introduces struct msi_irqs, so without this patch the check would have to change to
if (!dev->msi_irqs->msi_enabled)

I think drivers should not need to know the details of the MSI members,
so I reworked pci_dev_msi_enabled() to hide the detailed MSI info.
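
A small userspace sketch of the idea, under stated assumptions: the sketch_*
types and the bit-flag encoding of the type argument are hypothetical and are
not taken from the patch; only the two one-bit enable flags mirror the
struct msi_irqs shown in the cover letter. The point is that callers ask a
helper and never dereference the MSI state themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag encoding; the real patch may define the types differently. */
enum sketch_msi_type {
	SKETCH_MSI_TYPE  = 1,
	SKETCH_MSIX_TYPE = 2,
};

struct sketch_msi_irqs {
	unsigned char msi_enabled:1;
	unsigned char msix_enabled:1;
};

struct sketch_pci_dev {
	struct sketch_msi_irqs *msi_irqs;	/* NULL if no MSI state was allocated */
};

/* Callers pass the type(s) they care about and never touch msi_irqs directly. */
static bool sketch_dev_msi_enabled(const struct sketch_pci_dev *dev, int type)
{
	if (!dev || !dev->msi_irqs)
		return false;
	if ((type & SKETCH_MSI_TYPE) && dev->msi_irqs->msi_enabled)
		return true;
	if ((type & SKETCH_MSIX_TYPE) && dev->msi_irqs->msix_enabled)
		return true;
	return false;
}
```

With this shape, moving the flags from the device structure into a separate
structure only changes the helper body, not its callers.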


Thanks!
Yijing.


> 
> .
> 


-- 
Thanks!
Yijing


^ permalink raw reply	[flat|nested] 41+ messages in thread

* RE: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
  2014-08-04  3:03   ` Yijing Wang
@ 2014-08-20  5:44     ` Bharat.Bhushan
  2014-08-20  6:28       ` Yijing Wang
  0 siblings, 1 reply; 41+ messages in thread
From: Bharat.Bhushan @ 2014-08-20  5:44 UTC (permalink / raw)
  To: Yijing Wang, arnab.basu
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, virtualization, Hanjun Guo,
	linux-kernel

Hi Yijing

> -----Original Message-----
> From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-owner@vger.kernel.org]
> On Behalf Of Yijing Wang
> Sent: Monday, August 04, 2014 8:34 AM
> To: Basu Arnab-B45036
> Cc: Xinwei Hu; Wuyun; Bjorn Helgaas; linux-pci@vger.kernel.org;
> Paul.Mundt@huawei.com; James E.J. Bottomley; Marc Zyngier; linux-arm-
> kernel@lists.infradead.org; Russell King; linux-arch@vger.kernel.org;
> virtualization@lists.linux-foundation.org; Hanjun Guo; linux-
> kernel@vger.kernel.org
> Subject: Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
> 
> >> MSI was introduced in PCI Spec 2.2. Currently, kernel MSI driver
> >> codes are bonding with PCI device. Because MSI has a lot advantages in
> design.
> >> More and more non-PCI devices want to use MSI as their default interrupt.
> >> The existing MSI device include HPET. HPET driver provide its own MSI
> >> code to initialize and process MSI interrupts. In the latest GIC v3
> >> spec, legacy device can deliver MSI by the help of a relay device
> >> named consolidator.
> >> Consolidator can translate the legacy interrupts connected to it to
> >> MSI/MSI-X. And new non-PCI device will be designed to support MSI in
> >> future. So make the MSI driver code be generic will help the non-PCI
> >> device use MSI more simply.
> >
> > As per my understanding the GICv3 provides a service that will convert writes
> to a specified address to IRQs delivered to the core and as you mention above
> MSIs are part of the PCI spec. So I can see a strong case for non-PCI devices to
> want MSI like functionality without being fully compliant with the requirements
> of the MSI spec.
> 
> In GICv3, MBI is named for the service, but there is no more detailed
> information about it, only we can know is MBI is analogous to MSI, MBI devices
> must have address/data registers, but other registers like enable/mask/ctrl are
> not mandatory requirement.
> I don't know whether the MBI spec will be release, but anyway I think MSI
> refactoring is make sense, there are some existing Non-PCI MSI device like hpet,
> dmar.
> For simplicity, let name MSI and MBI to MSI temporarily.
> >
> > My question is do we necessarily want to rework so much of the PCI-MSI layer
> to support non PCI devices? Or will it be sufficient to create a framework to
> allow non PCI devices to hook up with a device that can convert their writes to
> an IRQ to the core.
> >
> > As I understand it, the msi_chip is (almost) such a framework. The only
> problem being that it makes some PCI specific assumptions (such as PCI specific
> writes from within msi_chip functions). Won't it be sufficient to make the
> msi_chip framework generic enough to be used by non-PCI devices and let each
> bus/device manage any additional requirements (such as configuration flow, bit
> definitions etc) that it places on message based interrupts?
> 
> msi_chip framework is important to support that, but I think maybe it's not
> enough, msi_chip is only responsible for IRQ allocation, teardown, etc..
> 
> The key difference between PCI device and Non-PCI MSI is the interfaces to
> access hardware MSI registers.
> for instance, currently, msi_chip->setup_irq() to setup MSI irq and configure
> the MSI address/data registers, so we need to provide device specific
> write_msi_msg() interface, then when we call msi_chip->setup_irq(), the device
> MSI registers can be configured appropriately.

What if we could register or override setup_irq() from the bus driver (or possibly from the device driver itself)? For example, the PCI bus driver would provide setup_irq(), or at least the part of setup_irq() that programs the address and data into the hardware, and would configure them as the PCI standard requires.

At Freescale we will be using MSI for devices behind a new bus that is not PCI based, and we have a separate bus driver for it. This new bus driver registers its own address/data write function, which follows that bus's protocol.
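
To illustrate the split, here is a rough userspace sketch, not kernel code. Every name in it (struct msi_bus_ops, struct newbus_dev, generic_setup_irq(), the register layout) is hypothetical and only meant to show the idea: the generic layer composes the address/data pair, and the bus driver supplies the function that writes it to the hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Address/data pair to be programmed; loosely modelled on the kernel's struct msi_msg. */
struct msi_msg_sketch {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

/* Hypothetical per-bus hook table; not an existing kernel structure. */
struct msi_bus_ops {
	void (*write_msg)(void *bus_dev, const struct msi_msg_sketch *msg);
};

/* Hypothetical device on a non-PCI bus; plain fields stand in for its registers. */
struct newbus_dev {
	uint32_t msi_addr_lo;
	uint32_t msi_addr_hi;
	uint32_t msi_data;
};

/* Bus-specific part: only the bus driver knows how to reach the registers. */
static void newbus_write_msg(void *bus_dev, const struct msi_msg_sketch *msg)
{
	struct newbus_dev *d = bus_dev;

	assert(d != NULL && msg != NULL);
	d->msi_addr_lo = msg->address_lo;
	d->msi_addr_hi = msg->address_hi;
	d->msi_data = msg->data;
}

static const struct msi_bus_ops newbus_msi_ops = {
	.write_msg = newbus_write_msg,
};

/*
 * Generic part of a setup_irq()-like path: compose the message from the
 * doorbell address and the interrupt ID, then delegate the hardware write
 * to whatever the bus driver registered. Returns 0 on success, -1 if no
 * write hook is available.
 */
static int generic_setup_irq(const struct msi_bus_ops *ops, void *bus_dev,
			     uint64_t doorbell, uint32_t hwirq)
{
	struct msi_msg_sketch msg;

	if (!ops || !ops->write_msg)
		return -1;

	msg.address_lo = (uint32_t)(doorbell & 0xffffffffu);
	msg.address_hi = (uint32_t)(doorbell >> 32);
	msg.data = hwirq;
	ops->write_msg(bus_dev, &msg);
	return 0;
}
```

With something like this, a PCI bus driver would register a write_msg() that goes through PCI config space, while our bus driver would register one that follows its own protocol; the msi_chip side would not need to know which one it is calling.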

Thanks
-Bharat

> 
> My patchset is just a RFC draft, I will update it later, all we want to do is
> make kernel support Non-PCI MSI devices.
> 
> Thanks!
> Yijing.
> 
> 
> >
> > Thanks
> > Arnab
> > --
> > To unsubscribe from this list: send the line "unsubscribe
> > linux-kernel" in the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> >
> > .
> >
> 
> 
> --
> Thanks!
> Yijing
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body
> of a message to majordomo@vger.kernel.org More majordomo info at
> http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 41+ messages in thread

* RE: [RFC PATCH 03/11] PCI/MSI: Refactor pci_dev_msi_enabled()
  2014-07-26  3:08 ` [RFC PATCH 03/11] PCI/MSI: Refactor pci_dev_msi_enabled() Yijing Wang
  2014-08-05 22:35   ` Stuart Yoder
@ 2014-08-20  5:57   ` Bharat.Bhushan
  2014-08-20  6:30     ` Yijing Wang
  1 sibling, 1 reply; 41+ messages in thread
From: Bharat.Bhushan @ 2014-08-20  5:57 UTC (permalink / raw)
  To: Yijing Wang, linux-kernel
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, arnab.basu, virtualization, Hanjun Guo



> -----Original Message-----
> From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-owner@vger.kernel.org]
> On Behalf Of Yijing Wang
> Sent: Saturday, July 26, 2014 8:39 AM
> To: linux-kernel@vger.kernel.org
> Cc: Xinwei Hu; Wuyun; Bjorn Helgaas; linux-pci@vger.kernel.org;
> Paul.Mundt@huawei.com; James E.J. Bottomley; Marc Zyngier; linux-arm-
> kernel@lists.infradead.org; Russell King; linux-arch@vger.kernel.org; Basu
> Arnab-B45036; virtualization@lists.linux-foundation.org; Hanjun Guo; Yijing Wang
> Subject: [RFC PATCH 03/11] PCI/MSI: Refactor pci_dev_msi_enabled()
> 
> Pci_dev_msi_enabled() is used to check whether device MSI/MSIX enabled. Refactor
> this function  to suuport checking only device MSI or MSIX enabled.

s/suuport/support/

From the code it looks like you added one more parameter to pci_dev_msi_enabled() so that callers can check for a specific type, whereas earlier it checked whether either MSI or MSI-X was enabled. The description does not make that clear to me. Am I missing something?
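
To check my reading, this is a small userspace sketch of the semantics as I understand them from the patch. The flag values and struct pci_dev_sketch are assumptions for illustration only; the real MSI_TYPE/MSIX_TYPE definitions come from elsewhere in the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed flag values; the series defines the real ones elsewhere. */
#define MSI_TYPE  0x01
#define MSIX_TYPE 0x02

/* Stand-in for the two enable bits kept in struct pci_dev. */
struct pci_dev_sketch {
	unsigned int msi_enabled:1;
	unsigned int msix_enabled:1;
};

/*
 * Returns true if any interrupt type selected in the 'type' mask is
 * currently enabled on the device. Passing MSI_TYPE | MSIX_TYPE gives
 * the behaviour of the old one-argument helper.
 */
static bool dev_msi_enabled(const struct pci_dev_sketch *dev, int type)
{
	bool enabled = false;

	assert(dev != NULL);
	if (type & MSI_TYPE)
		enabled |= dev->msi_enabled;
	if (type & MSIX_TYPE)
		enabled |= dev->msix_enabled;

	return enabled;
}
```

If that is the intent, the commit message could say that the new argument is a mask of types and that MSI_TYPE | MSIX_TYPE keeps the old behaviour.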

Thanks
-Bharat


> 
> Signed-off-by: Yijing Wang <wangyijing@huawei.com>
> ---
>  arch/cris/arch-v32/drivers/pci/bios.c     |    2 +-
>  arch/frv/mb93090-mb00/pci-vdk.c           |    2 +-
>  arch/ia64/pci/pci.c                       |    4 ++--
>  arch/powerpc/kernel/eeh_driver.c          |    2 +-
>  arch/x86/pci/common.c                     |    5 +++--
>  drivers/block/nvme-core.c                 |    4 ++--
>  drivers/dma/ioat/dma.c                    |    2 +-
>  drivers/firewire/ohci.c                   |    2 +-
>  drivers/gpu/drm/i915/i915_dma.c           |    4 ++--
>  drivers/misc/mei/hw-me.c                  |    2 +-
>  drivers/misc/mei/hw-txe.c                 |    2 +-
>  drivers/misc/mei/pci-me.c                 |    4 ++--
>  drivers/misc/mei/pci-txe.c                |    4 ++--
>  drivers/misc/mic/host/mic_debugfs.c       |    4 ++--
>  drivers/misc/mic/host/mic_intr.c          |    8 ++++----
>  drivers/ntb/ntb_hw.c                      |    2 +-
>  drivers/pci/irq.c                         |    4 ++--
>  drivers/pci/msi.c                         |   15 +++++++++------
>  drivers/pci/pci.c                         |    6 +++---
>  drivers/pci/pcie/portdrv_core.c           |    4 ++--
>  drivers/scsi/esas2r/esas2r_init.c         |    4 ++--
>  drivers/scsi/esas2r/esas2r_ioctl.c        |    4 ++--
>  drivers/scsi/hpsa.c                       |    4 ++--
>  drivers/staging/crystalhd/crystalhd_lnx.c |    2 +-
>  drivers/xen/xen-pciback/pciback_ops.c     |   12 ++++++------
>  include/linux/pci.h                       |   12 ++++++++++--
>  virt/kvm/assigned-dev.c                   |    2 +-
>  27 files changed, 67 insertions(+), 55 deletions(-)
> 
> diff --git a/arch/cris/arch-v32/drivers/pci/bios.c b/arch/cris/arch-
> v32/drivers/pci/bios.c
> index 64a5fb9..d9d8332 100644
> --- a/arch/cris/arch-v32/drivers/pci/bios.c
> +++ b/arch/cris/arch-v32/drivers/pci/bios.c
> @@ -93,7 +93,7 @@ int pcibios_enable_device(struct pci_dev *dev, int mask)
>  	if ((err = pcibios_enable_resources(dev, mask)) < 0)
>  		return err;
> 
> -	if (!dev->msi_enabled)
> +	if (!pci_dev_msi_enabled(dev, MSI_TYPE))
>  		pcibios_enable_irq(dev);
>  	return 0;
>  }
> diff --git a/arch/frv/mb93090-mb00/pci-vdk.c b/arch/frv/mb93090-mb00/pci-vdk.c
> index efa5d65..b96c128 100644
> --- a/arch/frv/mb93090-mb00/pci-vdk.c
> +++ b/arch/frv/mb93090-mb00/pci-vdk.c
> @@ -409,7 +409,7 @@ int pcibios_enable_device(struct pci_dev *dev, int mask)
> 
>  	if ((err = pci_enable_resources(dev, mask)) < 0)
>  		return err;
> -	if (!dev->msi_enabled)
> +	if (!pci_dev_msi_enabled(dev, MSI_TYPE))
>  		pcibios_enable_irq(dev);
>  	return 0;
>  }
> diff --git a/arch/ia64/pci/pci.c b/arch/ia64/pci/pci.c index 291a582..da8ddff
> 100644
> --- a/arch/ia64/pci/pci.c
> +++ b/arch/ia64/pci/pci.c
> @@ -568,7 +568,7 @@ pcibios_enable_device (struct pci_dev *dev, int mask)
>  	if (ret < 0)
>  		return ret;
> 
> -	if (!dev->msi_enabled)
> +	if (!pci_dev_msi_enabled(dev, MSI_TYPE))
>  		return acpi_pci_irq_enable(dev);
>  	return 0;
>  }
> @@ -577,7 +577,7 @@ void
>  pcibios_disable_device (struct pci_dev *dev)  {
>  	BUG_ON(atomic_read(&dev->enable_cnt));
> -	if (!dev->msi_enabled)
> +	if (!pci_dev_msi_enabled(dev, MSI_TYPE))
>  		acpi_pci_irq_disable(dev);
>  }
> 
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index 420da61..e3f2074 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -123,7 +123,7 @@ static void eeh_disable_irq(struct pci_dev *dev)
>  	 * effectively disabled by the DMA Stopped state
>  	 * when an EEH error occurs.
>  	 */
> -	if (dev->msi_enabled || dev->msix_enabled)
> +	if (pci_dev_msi_enabled(dev, MSI_TYPE | MSIX_TYPE))
>  		return;
> 
>  	if (!irq_has_action(dev->irq))
> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c index
> 059a76c..4597940 100644
> --- a/arch/x86/pci/common.c
> +++ b/arch/x86/pci/common.c
> @@ -662,14 +662,15 @@ int pcibios_enable_device(struct pci_dev *dev, int mask)
>  	if ((err = pci_enable_resources(dev, mask)) < 0)
>  		return err;
> 
> -	if (!pci_dev_msi_enabled(dev))
> +	if (!pci_dev_msi_enabled(dev, MSI_TYPE | MSIX_TYPE))
>  		return pcibios_enable_irq(dev);
>  	return 0;
>  }
> 
>  void pcibios_disable_device (struct pci_dev *dev)  {
> -	if (!pci_dev_msi_enabled(dev) && pcibios_disable_irq)
> +	if (!pci_dev_msi_enabled(dev, MSI_TYPE | MSIX_TYPE)
> +			&& pcibios_disable_irq)
>  		pcibios_disable_irq(dev);
>  }
> 
> diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c index
> 02351e2..f96b90f 100644
> --- a/drivers/block/nvme-core.c
> +++ b/drivers/block/nvme-core.c
> @@ -2325,9 +2325,9 @@ static int nvme_dev_map(struct nvme_dev *dev)
> 
>  static void nvme_dev_unmap(struct nvme_dev *dev)  {
> -	if (dev->pci_dev->msi_enabled)
> +	if (pci_dev_msi_enabled(dev->pci_dev, MSI_TYPE))
>  		pci_disable_msi(dev->pci_dev);
> -	else if (dev->pci_dev->msix_enabled)
> +	else if (pci_dev_msi_enabled(dev->pci_dev, MSIX_TYPE))
>  		pci_disable_msix(dev->pci_dev);
> 
>  	if (dev->bar) {
> diff --git a/drivers/dma/ioat/dma.c b/drivers/dma/ioat/dma.c index
> 4e3549a..a11dac1 100644
> --- a/drivers/dma/ioat/dma.c
> +++ b/drivers/dma/ioat/dma.c
> @@ -1088,7 +1088,7 @@ static void ioat1_intr_quirk(struct ioatdma_device
> *device)
>  	u32 dmactrl;
> 
>  	pci_read_config_dword(pdev, IOAT_PCI_DMACTRL_OFFSET, &dmactrl);
> -	if (pdev->msi_enabled)
> +	if (pci_dev_msi_enabled(pdev, MSI_TYPE))
>  		dmactrl |= IOAT_PCI_DMACTRL_MSI_EN;
>  	else
>  		dmactrl &= ~IOAT_PCI_DMACTRL_MSI_EN;
> diff --git a/drivers/firewire/ohci.c b/drivers/firewire/ohci.c index
> 5798541..ec0a794 100644
> --- a/drivers/firewire/ohci.c
> +++ b/drivers/firewire/ohci.c
> @@ -3705,7 +3705,7 @@ static int pci_probe(struct pci_dev *dev,
>  	if (!(ohci->quirks & QUIRK_NO_MSI))
>  		pci_enable_msi(dev);
>  	if (request_irq(dev->irq, irq_handler,
> -			pci_dev_msi_enabled(dev) ? 0 : IRQF_SHARED,
> +			pci_dev_msi_enabled(dev, MSI_TYPE | MSIX_TYPE) ? 0 :
> IRQF_SHARED,
>  			ohci_driver_name, ohci)) {
>  		ohci_err(ohci, "failed to allocate interrupt %d\n", dev->irq);
>  		err = -EIO;
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 4c22a5b..0c248fe 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1745,7 +1745,7 @@ out_gem_unload:
>  	WARN_ON(unregister_oom_notifier(&dev_priv->mm.oom_notifier));
>  	unregister_shrinker(&dev_priv->mm.shrinker);
> 
> -	if (dev->pdev->msi_enabled)
> +	if (pci_dev_msi_enabled(dev->pdev, MSI_TYPE))
>  		pci_disable_msi(dev->pdev);
> 
>  	intel_teardown_gmbus(dev);
> @@ -1826,7 +1826,7 @@ int i915_driver_unload(struct drm_device *dev)
>  	cancel_work_sync(&dev_priv->gpu_error.work);
>  	i915_destroy_error_state(dev);
> 
> -	if (dev->pdev->msi_enabled)
> +	if (pci_dev_msi_enabled(dev->pdev, MSI_TYPE))
>  		pci_disable_msi(dev->pdev);
> 
>  	intel_opregion_fini(dev);
> diff --git a/drivers/misc/mei/hw-me.c b/drivers/misc/mei/hw-me.c index
> 6a2d272..d7595d4 100644
> --- a/drivers/misc/mei/hw-me.c
> +++ b/drivers/misc/mei/hw-me.c
> @@ -647,7 +647,7 @@ irqreturn_t mei_me_irq_thread_handler(int irq, void *dev_id)
> 
>  	/* Ack the interrupt here
>  	 * In case of MSI we don't go through the quick handler */
> -	if (pci_dev_msi_enabled(dev->pdev))
> +	if (pci_dev_msi_enabled(dev->pdev, MSI_TYPE | MSIX_TYPE))
>  		mei_clear_interrupts(dev);
> 
>  	/* check if ME wants a reset */
> diff --git a/drivers/misc/mei/hw-txe.c b/drivers/misc/mei/hw-txe.c index
> 9327378..8c2d95c 100644
> --- a/drivers/misc/mei/hw-txe.c
> +++ b/drivers/misc/mei/hw-txe.c
> @@ -951,7 +951,7 @@ irqreturn_t mei_txe_irq_thread_handler(int irq, void
> *dev_id)
>  	mutex_lock(&dev->device_lock);
>  	mei_io_list_init(&complete_list);
> 
> -	if (pci_dev_msi_enabled(dev->pdev))
> +	if (pci_dev_msi_enabled(dev->pdev, MSI_TYPE | MSIX_TYPE))
>  		mei_txe_check_and_ack_intrs(dev, true);
> 
>  	/* show irq events */
> diff --git a/drivers/misc/mei/pci-me.c b/drivers/misc/mei/pci-me.c index
> 1b46c64..283fc09 100644
> --- a/drivers/misc/mei/pci-me.c
> +++ b/drivers/misc/mei/pci-me.c
> @@ -181,7 +181,7 @@ static int mei_me_probe(struct pci_dev *pdev, const struct
> pci_device_id *ent)
>  	pci_enable_msi(pdev);
> 
>  	 /* request and enable interrupt */
> -	if (pci_dev_msi_enabled(pdev))
> +	if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE))
>  		err = request_threaded_irq(pdev->irq,
>  			NULL,
>  			mei_me_irq_thread_handler,
> @@ -329,7 +329,7 @@ static int mei_me_pci_resume(struct device *device)
>  	pci_enable_msi(pdev);
> 
>  	/* request and enable interrupt */
> -	if (pci_dev_msi_enabled(pdev))
> +	if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE))
>  		err = request_threaded_irq(pdev->irq,
>  			NULL,
>  			mei_me_irq_thread_handler,
> diff --git a/drivers/misc/mei/pci-txe.c b/drivers/misc/mei/pci-txe.c index
> 2343c62..a3bf202 100644
> --- a/drivers/misc/mei/pci-txe.c
> +++ b/drivers/misc/mei/pci-txe.c
> @@ -124,7 +124,7 @@ static int mei_txe_probe(struct pci_dev *pdev, const struct
> pci_device_id *ent)
>  	mei_clear_interrupts(dev);
> 
>  	/* request and enable interrupt  */
> -	if (pci_dev_msi_enabled(pdev))
> +	if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE))
>  		err = request_threaded_irq(pdev->irq,
>  			NULL,
>  			mei_txe_irq_thread_handler,
> @@ -272,7 +272,7 @@ static int mei_txe_pci_resume(struct device *device)
>  	mei_clear_interrupts(dev);
> 
>  	/* request and enable interrupt */
> -	if (pci_dev_msi_enabled(pdev))
> +	if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE))
>  		err = request_threaded_irq(pdev->irq,
>  			NULL,
>  			mei_txe_irq_thread_handler,
> diff --git a/drivers/misc/mic/host/mic_debugfs.c
> b/drivers/misc/mic/host/mic_debugfs.c
> index 028ba5d..6e1a553 100644
> --- a/drivers/misc/mic/host/mic_debugfs.c
> +++ b/drivers/misc/mic/host/mic_debugfs.c
> @@ -376,9 +376,9 @@ static int mic_msi_irq_info_show(struct seq_file *s, void
> *pos)
>  	struct pci_dev *pdev = container_of(mdev->sdev->parent,
>  		struct pci_dev, dev);
> 
> -	if (pci_dev_msi_enabled(pdev)) {
> +	if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE)) {
>  		for (i = 0; i < mdev->irq_info.num_vectors; i++) {
> -			if (pdev->msix_enabled) {
> +			if (pci_dev_msi_enabled(pdev, MSIX_TYPE)) {
>  				entry = mdev->irq_info.msix_entries[i].entry;
>  				vector = mdev->irq_info.msix_entries[i].vector;
>  			} else {
> diff --git a/drivers/misc/mic/host/mic_intr.c b/drivers/misc/mic/host/mic_intr.c
> index dbc5afd..9eab900 100644
> --- a/drivers/misc/mic/host/mic_intr.c
> +++ b/drivers/misc/mic/host/mic_intr.c
> @@ -468,7 +468,7 @@ struct mic_irq *mic_request_irq(struct mic_device *mdev,
>  		}
> 
>  		entry = 0;
> -		if (pci_dev_msi_enabled(pdev)) {
> +		if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE)) {
>  			mdev->irq_info.mic_msi_map[entry] |= (1 << offset);
>  			mdev->intr_ops->program_msi_to_src_map(mdev,
>  				entry, offset, true);
> @@ -526,7 +526,7 @@ void mic_free_irq(struct mic_device *mdev,
>  			dev_warn(mdev->sdev->parent, "Error unregistering
> callback\n");
>  			return;
>  		}
> -		if (pci_dev_msi_enabled(pdev)) {
> +		if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE)) {
>  			mdev->irq_info.mic_msi_map[entry] &= ~(BIT(src_id));
>  			mdev->intr_ops->program_msi_to_src_map(mdev,
>  				entry, src_id, false);
> @@ -589,7 +589,7 @@ void mic_free_interrupts(struct mic_device *mdev, struct
> pci_dev *pdev)
>  		kfree(mdev->irq_info.msix_entries);
>  		pci_disable_msix(pdev);
>  	} else {
> -		if (pci_dev_msi_enabled(pdev)) {
> +		if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE)) {
>  			free_irq(pdev->irq, mdev);
>  			kfree(mdev->irq_info.mic_msi_map);
>  			pci_disable_msi(pdev);
> @@ -617,7 +617,7 @@ void mic_intr_restore(struct mic_device *mdev)
>  	struct pci_dev *pdev = container_of(mdev->sdev->parent,
>  		struct pci_dev, dev);
> 
> -	if (!pci_dev_msi_enabled(pdev))
> +	if (!pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE))
>  		return;
> 
>  	for (entry = 0; entry < mdev->irq_info.num_vectors; entry++) { diff --git
> a/drivers/ntb/ntb_hw.c b/drivers/ntb/ntb_hw.c index 372e08c..868f685 100644
> --- a/drivers/ntb/ntb_hw.c
> +++ b/drivers/ntb/ntb_hw.c
> @@ -1306,7 +1306,7 @@ static void ntb_free_interrupts(struct ntb_device *ndev)
>  	} else {
>  		free_irq(pdev->irq, ndev);
> 
> -		if (pci_dev_msi_enabled(pdev))
> +		if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE))
>  			pci_disable_msi(pdev);
>  	}
>  }
> diff --git a/drivers/pci/irq.c b/drivers/pci/irq.c index 6684f15..e3e3293 100644
> --- a/drivers/pci/irq.c
> +++ b/drivers/pci/irq.c
> @@ -36,10 +36,10 @@ static void pci_note_irq_problem(struct pci_dev *pdev, const
> char *reason)
>   */
>  enum pci_lost_interrupt_reason pci_lost_interrupt(struct pci_dev *pdev)  {
> -	if (pdev->msi_enabled || pdev->msix_enabled) {
> +	if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE)) {
>  		enum pci_lost_interrupt_reason ret;
> 
> -		if (pdev->msix_enabled) {
> +		if (pci_dev_msi_enabled(pdev, MSIX_TYPE)) {
>  			pci_note_irq_problem(pdev, "MSIX routing failure");
>  			ret = PCI_LOST_IRQ_DISABLE_MSIX;
>  		} else {
> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index e416dc0..d5c8e56 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -125,7 +125,7 @@ static void default_restore_msi_irq(struct pci_dev *dev, int
> irq)
>  			if (irq == entry->irq)
>  				break;
>  		}
> -	} else if (dev->msi_enabled)  {
> +	} else if (pci_dev_msi_enabled(dev, MSI_TYPE))  {
>  		entry = irq_get_msi_desc(irq);
>  	}
> 
> @@ -439,7 +439,7 @@ static void __pci_restore_msi_state(struct pci_dev *dev)
>  	u16 control;
>  	struct msi_desc *entry;
> 
> -	if (!dev->msi_enabled)
> +	if (!pci_dev_msi_enabled(dev, MSI_TYPE))
>  		return;
> 
>  	entry = irq_get_msi_desc(dev->irq);
> @@ -878,7 +878,8 @@ void pci_msi_shutdown(struct pci_dev *dev)
>  	struct msi_desc *desc;
>  	u32 mask;
> 
> -	if (!pci_msi_enable || !dev || !dev->msi_enabled)
> +	if (!pci_msi_enable || !dev ||
> +			!pci_dev_msi_enabled(dev, MSI_TYPE))
>  		return;
> 
>  	BUG_ON(list_empty(&dev->msi_list));
> @@ -899,7 +900,8 @@ void pci_msi_shutdown(struct pci_dev *dev)
> 
>  void pci_disable_msi(struct pci_dev *dev)  {
> -	if (!pci_msi_enable || !dev || !dev->msi_enabled)
> +	if (!pci_msi_enable || !dev ||
> +			!pci_dev_msi_enabled(dev, MSI_TYPE))
>  		return;
> 
>  	pci_msi_shutdown(dev);
> @@ -972,7 +974,7 @@ int pci_enable_msix(struct pci_dev *dev, struct msix_entry
> *entries, int nvec)
>  	WARN_ON(!!dev->msix_enabled);
> 
>  	/* Check whether driver already requested for MSI irq */
> -	if (dev->msi_enabled) {
> +	if (pci_dev_msi_enabled(dev, MSI_TYPE)) {
>  		dev_info(&dev->dev, "can't enable MSI-X (MSI IRQ already
> assigned)\n");
>  		return -EINVAL;
>  	}
> @@ -1001,7 +1003,8 @@ void pci_msix_shutdown(struct pci_dev *dev)
> 
>  void pci_disable_msix(struct pci_dev *dev)  {
> -	if (!pci_msi_enable || !dev || !dev->msix_enabled)
> +	if (!pci_msi_enable || !dev ||
> +			!pci_dev_msi_enabled(dev, MSIX_TYPE))
>  		return;
> 
>  	pci_msix_shutdown(dev);
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 74043a2..6e9e7bd 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1206,7 +1206,7 @@ static int do_pci_enable_device(struct pci_dev *dev, int
> bars)
>  		return err;
>  	pci_fixup_device(pci_fixup_enable, dev);
> 
> -	if (dev->msi_enabled || dev->msix_enabled)
> +	if (pci_dev_msi_enabled(dev, MSI_TYPE | MSIX_TYPE))
>  		return 0;
> 
>  	pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin); @@ -1361,9 +1361,9 @@
> static void pcim_release(struct device *gendev, void *res)
>  	struct pci_devres *this = res;
>  	int i;
> 
> -	if (dev->msi_enabled)
> +	if (pci_dev_msi_enabled(dev, MSI_TYPE))
>  		pci_disable_msi(dev);
> -	if (dev->msix_enabled)
> +	if (pci_dev_msi_enabled(dev, MSIX_TYPE))
>  		pci_disable_msix(dev);
> 
>  	for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) diff --git
> a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c index
> 2f0ce66..7a1b6ec 100644
> --- a/drivers/pci/pcie/portdrv_core.c
> +++ b/drivers/pci/pcie/portdrv_core.c
> @@ -235,9 +235,9 @@ static int init_service_irqs(struct pci_dev *dev, int *irqs,
> int mask)
> 
>  static void cleanup_service_irqs(struct pci_dev *dev)  {
> -	if (dev->msix_enabled)
> +	if (pci_dev_msi_enabled(dev, MSIX_TYPE))
>  		pci_disable_msix(dev);
> -	else if (dev->msi_enabled)
> +	else if (pci_dev_msi_enabled(dev, MSI_TYPE))
>  		pci_disable_msi(dev);
>  }
> 
> diff --git a/drivers/scsi/esas2r/esas2r_init.c
> b/drivers/scsi/esas2r/esas2r_init.c
> index 6776931..444f64d 100644
> --- a/drivers/scsi/esas2r/esas2r_init.c
> +++ b/drivers/scsi/esas2r/esas2r_init.c
> @@ -617,8 +617,8 @@ void esas2r_kill_adapter(int i)
>  			       &(a->pcid->dev),
>  			       "pci_disable_device() called.  msix_enabled: %d "
>  			       "msi_enabled: %d irq: %d pin: %d",
> -			       a->pcid->msix_enabled,
> -			       a->pcid->msi_enabled,
> +			       pci_dev_msi_enabled(a->pcid, MSIX_TYPE),
> +			       pci_dev_msi_enabled(a->pcid, MSI_TYPE),
>  			       a->pcid->irq,
>  			       a->pcid->pin);
> 
> diff --git a/drivers/scsi/esas2r/esas2r_ioctl.c
> b/drivers/scsi/esas2r/esas2r_ioctl.c
> index d89a027..31e06bd 100644
> --- a/drivers/scsi/esas2r/esas2r_ioctl.c
> +++ b/drivers/scsi/esas2r/esas2r_ioctl.c
> @@ -810,9 +810,9 @@ static int hba_ioctl_callback(struct esas2r_adapter *a,
> 
>  		gai->pci.msi_vector_cnt = 1;
> 
> -		if (a->pcid->msix_enabled)
> +		if (pci_dev_msi_enabled(a->pcid, MSIX_TYPE))
>  			gai->pci.interrupt_mode = ATTO_GAI_PCIIM_MSIX;
> -		else if (a->pcid->msi_enabled)
> +		else if (pci_dev_msi_enabled(a->pcid, MSI_TYPE))
>  			gai->pci.interrupt_mode = ATTO_GAI_PCIIM_MSI;
>  		else
>  			gai->pci.interrupt_mode = ATTO_GAI_PCIIM_LEGACY; diff --git
> a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c index 31184b3..964d809 100644
> --- a/drivers/scsi/hpsa.c
> +++ b/drivers/scsi/hpsa.c
> @@ -6707,10 +6707,10 @@ static void hpsa_free_irqs_and_disable_msix(struct
> ctlr_info *h)
>  	free_irqs(h);
>  #ifdef CONFIG_PCI_MSI
>  	if (h->msix_vector) {
> -		if (h->pdev->msix_enabled)
> +		if (pci_dev_msi_enabled(h->pdev, MSIX_TYPE))
>  			pci_disable_msix(h->pdev);
>  	} else if (h->msi_vector) {
> -		if (h->pdev->msi_enabled)
> +		if (pci_dev_msi_enabled(h->pdev, MSI_TYPE))
>  			pci_disable_msi(h->pdev);
>  	}
>  #endif /* CONFIG_PCI_MSI */
> diff --git a/drivers/staging/crystalhd/crystalhd_lnx.c
> b/drivers/staging/crystalhd/crystalhd_lnx.c
> index e6fb331..9459b42 100644
> --- a/drivers/staging/crystalhd/crystalhd_lnx.c
> +++ b/drivers/staging/crystalhd/crystalhd_lnx.c
> @@ -45,7 +45,7 @@ static int chd_dec_enable_int(struct crystalhd_adp *adp)
>  		return -EINVAL;
>  	}
> 
> -	if (adp->pdev->msi_enabled)
> +	if (pci_dev_msi_enabled(adp->pdev, MSI_TYPE))
>  		adp->msi = 1;
>  	else
>  		adp->msi = pci_enable_msi(adp->pdev); diff --git a/drivers/xen/xen-
> pciback/pciback_ops.c b/drivers/xen/xen-pciback/pciback_ops.c
> index c4a0666..fee2f19 100644
> --- a/drivers/xen/xen-pciback/pciback_ops.c
> +++ b/drivers/xen/xen-pciback/pciback_ops.c
> @@ -64,8 +64,8 @@ static void xen_pcibk_control_isr(struct pci_dev *dev, int
> reset)
>  		dev_data->irq_name,
>  		dev_data->irq,
>  		pci_is_enabled(dev) ? "on" : "off",
> -		dev->msi_enabled ? "MSI" : "",
> -		dev->msix_enabled ? "MSI/X" : "",
> +		pci_dev_msi_enabled(dev, MSI_TYPE) ? "MSI" : "",
> +		pci_dev_msi_enabled(dev, MSIX_TYPE) ? "MSI/X" : "",
>  		dev_data->isr_on ? "enable" : "disable",
>  		enable ? "enable" : "disable");
> 
> @@ -90,8 +90,8 @@ out:
>  		dev_data->irq_name,
>  		dev_data->irq,
>  		pci_is_enabled(dev) ? "on" : "off",
> -		dev->msi_enabled ? "MSI" : "",
> -		dev->msix_enabled ? "MSI/X" : "",
> +		pci_dev_msi_enabled(dev, MSI_TYPE) ? "MSI" : "",
> +		pci_dev_msi_enabled(dev, MSIX_TYPE) ? "MSI/X" : "",
>  		enable ? (dev_data->isr_on ? "enabled" : "failed to enable") :
>  			(dev_data->isr_on ? "failed to disable" : "disabled"));  }
> @@ -111,9 +111,9 @@ void xen_pcibk_reset_device(struct pci_dev *dev)  #ifdef
> CONFIG_PCI_MSI
>  		/* The guest could have been abruptly killed without
>  		 * disabling MSI/MSI-X interrupts.*/
> -		if (dev->msix_enabled)
> +		if (pci_dev_msi_enabled(dev, MSIX_TYPE))
>  			pci_disable_msix(dev);
> -		if (dev->msi_enabled)
> +		if (pci_dev_msi_enabled(dev, MSI_TYPE))
>  			pci_disable_msi(dev);
>  #endif
>  		if (pci_is_enabled(dev))
> diff --git a/include/linux/pci.h b/include/linux/pci.h index 6ed3647..c6c01ae
> 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -33,6 +33,7 @@
> 
>  #include <linux/pci_ids.h>
> 
> +#include <linux/msi.h>
>  /*
>   * The PCI interface treats multi-function devices as independent
>   * devices.  The slot/function address of each device is encoded @@ -506,9
> +507,16 @@ static inline struct pci_dev *pci_upstream_bridge(struct pci_dev
> *dev)  }
> 
>  #ifdef CONFIG_PCI_MSI
> -static inline bool pci_dev_msi_enabled(struct pci_dev *pci_dev)
> +static inline bool pci_dev_msi_enabled(struct pci_dev *pci_dev, int
> +type)
>  {
> -	return pci_dev->msi_enabled || pci_dev->msix_enabled;
> +	bool enabled = 0;
> +
> +	if (type & MSI_TYPE)
> +		enabled |= pci_dev->msi_enabled;
> +	if (type & MSIX_TYPE)
> +		enabled |= pci_dev->msix_enabled;
> +
> +	return enabled;
>  }
>  #else
>  static inline bool pci_dev_msi_enabled(struct pci_dev *pci_dev) { return false;
> } diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c index
> bf06577..4634bd0 100644
> --- a/virt/kvm/assigned-dev.c
> +++ b/virt/kvm/assigned-dev.c
> @@ -366,7 +366,7 @@ static int assigned_device_enable_host_msi(struct kvm *kvm,
> {
>  	int r;
> 
> -	if (!dev->dev->msi_enabled) {
> +	if (!pci_dev_msi_enabled(dev->dev, MSI_TYPE)) {
>  		r = pci_enable_msi(dev->dev);
>  		if (r)
>  			return r;
> --
> 1.7.1
> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* RE: [RFC PATCH 09/11] PCI/MSI: refactor PCI MSI driver
  2014-07-26  3:08 ` [RFC PATCH 09/11] PCI/MSI: refactor PCI MSI driver Yijing Wang
@ 2014-08-20  6:06   ` Bharat.Bhushan
  2014-08-20  6:34     ` Yijing Wang
  0 siblings, 1 reply; 41+ messages in thread
From: Bharat.Bhushan @ 2014-08-20  6:06 UTC (permalink / raw)
  To: Yijing Wang, linux-kernel
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, arnab.basu, virtualization, Hanjun Guo



> -----Original Message-----
> From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-owner@vger.kernel.org]
> On Behalf Of Yijing Wang
> Sent: Saturday, July 26, 2014 8:39 AM
> To: linux-kernel@vger.kernel.org
> Cc: Xinwei Hu; Wuyun; Bjorn Helgaas; linux-pci@vger.kernel.org;
> Paul.Mundt@huawei.com; James E.J. Bottomley; Marc Zyngier; linux-arm-
> kernel@lists.infradead.org; Russell King; linux-arch@vger.kernel.org; Basu
> Arnab-B45036; virtualization@lists.linux-foundation.org; Hanjun Guo; Yijing Wang
> Subject: [RFC PATCH 09/11] PCI/MSI: refactor PCI MSI driver
> 
> Use struct msi_ops to hook PCI MSI operations,
> and use struct msi_irqs to refactor the PCI MSI driver.
> 
> Signed-off-by: Yijing Wang <wangyijing@huawei.com>
> ---
>  drivers/pci/msi.c   |  351 ++++++++++++++++++++++++++++++---------------------
>  include/linux/msi.h |   14 +-
>  include/linux/pci.h |   11 +-
>  3 files changed, 222 insertions(+), 154 deletions(-)
> 
> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index 41c33da..f0c5989 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -29,8 +29,9 @@ static int pci_msi_enable = 1;
> 
>  /* Arch hooks */
> 
> -int __weak arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
> +int __weak arch_setup_msi_irq(struct msi_irqs *msi, struct msi_desc *desc)
>  {
> +	struct pci_dev *dev = msi->data; //TO BE DONE: rework msi_chip to support
> Non-PCI
>  	struct msi_chip *chip = dev->bus->msi;
>  	int err;
> 
> @@ -56,8 +57,9 @@ void __weak arch_teardown_msi_irq(unsigned int irq)
>  	chip->teardown_irq(chip, irq);
>  }
> 
> -int __weak arch_msi_check_device(struct pci_dev *dev, int nvec, int type)
> +int __weak arch_msi_check_device(struct msi_irqs *msi, int nvec, int type)
>  {
> +	struct pci_dev *dev = msi->data; //TO BE DONE: rework msi_chip to support
> Non-PCI
>  	struct msi_chip *chip = dev->bus->msi;
> 
>  	if (!chip || !chip->check_device)
> @@ -66,7 +68,7 @@ int __weak arch_msi_check_device(struct pci_dev *dev, int
> nvec, int type)
>  	return chip->check_device(chip, dev, nvec, type);
>  }
> 
> -int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
> +int __weak arch_setup_msi_irqs(struct msi_irqs *msi, int nvec, int type)
>  {
>  	struct msi_desc *entry;
>  	int ret;
> @@ -78,8 +80,8 @@ int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec,
> int type)
>  	if (type == MSI_TYPE && nvec > 1)
>  		return 1;
> 
> -	list_for_each_entry(entry, &dev->msi_list, list) {
> -		ret = arch_setup_msi_irq(dev, entry);
> +	list_for_each_entry(entry, &msi->msi_list, list) {
> +		ret = arch_setup_msi_irq(msi, entry);
>  		if (ret < 0)
>  			return ret;
>  		if (ret > 0)
> @@ -93,11 +95,11 @@ int __weak arch_setup_msi_irqs(struct pci_dev *dev, int
> nvec, int type)
>   * We have a default implementation available as a separate non-weak
>   * function, as it is used by the Xen x86 PCI code
>   */
> -void default_teardown_msi_irqs(struct pci_dev *dev)
> +void default_teardown_msi_irqs(struct msi_irqs *msi)
>  {
>  	struct msi_desc *entry;
> 
> -	list_for_each_entry(entry, &dev->msi_list, list) {
> +	list_for_each_entry(entry, &msi->msi_list, list) {
>  		int i, nvec;
>  		if (entry->irq == 0)
>  			continue;
> @@ -110,22 +112,22 @@ void default_teardown_msi_irqs(struct pci_dev *dev)
>  	}
>  }
> 
> -void __weak arch_teardown_msi_irqs(struct pci_dev *dev)
> +void __weak arch_teardown_msi_irqs(struct msi_irqs *msi)
>  {
> -	return default_teardown_msi_irqs(dev);
> +	return default_teardown_msi_irqs(msi);
>  }
> 
> -static void default_restore_msi_irq(struct pci_dev *dev, int irq)
> +static void default_restore_msi_irq(struct msi_irqs *msi, int irq)
>  {
>  	struct msi_desc *entry;
> 
>  	entry = NULL;
> -	if (dev->msix_enabled) {
> -		list_for_each_entry(entry, &dev->msi_list, list) {
> +	if (msi->msix_enabled) {
> +		list_for_each_entry(entry, &msi->msi_list, list) {
>  			if (irq == entry->irq)
>  				break;
>  		}
> -	} else if (pci_dev_msi_enabled(dev, MSI_TYPE))  {
> +	} else if (msi->msi_enabled)  {
>  		entry = irq_get_msi_desc(irq);
>  	}
> 
> @@ -133,20 +135,9 @@ static void default_restore_msi_irq(struct pci_dev *dev,
> int irq)
>  		write_msi_msg(irq, &entry->msg);
>  }
> 
> -void __weak arch_restore_msi_irqs(struct pci_dev *dev)
> +void __weak arch_restore_msi_irqs(struct msi_irqs *msi)
>  {
> -	return default_restore_msi_irqs(dev);
> -}
> -
> -static void msi_set_enable(struct pci_dev *dev, int enable)
> -{
> -	u16 control;
> -
> -	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
> -	control &= ~PCI_MSI_FLAGS_ENABLE;
> -	if (enable)
> -		control |= PCI_MSI_FLAGS_ENABLE;
> -	pci_write_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, control);
> +	return default_restore_msi_irqs(msi);
>  }
> 
>  static void msix_clear_and_set_ctrl(struct pci_dev *dev, u16 clear, u16 set)
> @@ -159,6 +150,25 @@ static void msix_clear_and_set_ctrl(struct pci_dev *dev,
> u16 clear, u16 set)
>  	pci_write_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, ctrl);
>  }
> 
> +static void msi_set_enable(struct msi_irqs *msi, int enable, int type)
> +{
> +	u16 control;
> +	struct pci_dev *dev = msi->data;
> +
> +	if (type == MSI_TYPE) {
> +		pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
> +		control &= ~PCI_MSI_FLAGS_ENABLE;
> +		if (enable)
> +			control |= PCI_MSI_FLAGS_ENABLE;
> +		pci_write_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, control);
> +	} else if (type == MSIX_TYPE) {
> +		if (enable)
> +			msix_clear_and_set_ctrl(dev, 0, PCI_MSIX_FLAGS_ENABLE);
> +		else
> +			msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
> +	}
> +}
> +
>  static inline __attribute_const__ u32 msi_mask(unsigned x)
>  {
>  	/* Don't shift by >= width of type */
> @@ -175,6 +185,7 @@ static inline __attribute_const__ u32 msi_mask(unsigned x)
>   */
>  u32 default_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
>  {
> +	struct pci_dev *dev = desc->msi->data;
>  	u32 mask_bits = desc->masked;
> 
>  	if (!desc->msi_attrib.maskbit)
> @@ -182,7 +193,7 @@ u32 default_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
> 
>  	mask_bits &= ~mask;
>  	mask_bits |= flag;
> -	pci_write_config_dword(desc->dev, desc->mask_pos, mask_bits);
> +	pci_write_config_dword(dev, desc->mask_pos, mask_bits);
> 
>  	return mask_bits;
>  }
> @@ -250,18 +261,30 @@ void unmask_msi_irq(struct irq_data *data)
>  	msi_set_mask_bit(data, 0);
>  }
> 
> -void default_restore_msi_irqs(struct pci_dev *dev)
> +static void msix_set_all_mask(struct msi_irqs *msi, int flag)
> +{
> +	struct pci_dev *dev = msi->data;
> +
> +	if (flag)
> +		msix_clear_and_set_ctrl(dev, 0, PCI_MSIX_FLAGS_MASKALL);
> +	else
> +		msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
> +}
> +
> +void default_restore_msi_irqs(struct msi_irqs *msi)
>  {
>  	struct msi_desc *entry;
> 
> -	list_for_each_entry(entry, &dev->msi_list, list) {
> -		default_restore_msi_irq(dev, entry->irq);
> +	list_for_each_entry(entry, &msi->msi_list, list) {
> +		default_restore_msi_irq(msi, entry->irq);
>  	}
>  }
> 
>  void __read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
>  {
> -	BUG_ON(entry->dev->current_state != PCI_D0);
> +	struct pci_dev *dev = entry->msi->data;
> +
> +	BUG_ON(dev->current_state != PCI_D0);
> 
>  	if (entry->msi_attrib.is_msix) {
>  		void __iomem *base = entry->mask_base +
> @@ -271,7 +294,6 @@ void __read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
>  		msg->address_hi = readl(base + PCI_MSIX_ENTRY_UPPER_ADDR);
>  		msg->data = readl(base + PCI_MSIX_ENTRY_DATA);
>  	} else {
> -		struct pci_dev *dev = entry->dev;
>  		int pos = dev->msi_cap;
>  		u16 data;
> 
> @@ -315,7 +337,9 @@ void get_cached_msi_msg(unsigned int irq, struct msi_msg *msg)
> 
>  void __write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
>  {
> -	if (entry->dev->current_state != PCI_D0) {
> +	struct pci_dev *dev = entry->msi->data;
> +
> +	if (dev->current_state != PCI_D0) {
>  		/* Don't touch the hardware now */
>  	} else if (entry->msi_attrib.is_msix) {
>  		void __iomem *base;
> @@ -326,7 +350,6 @@ void __write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
>  		writel(msg->address_hi, base + PCI_MSIX_ENTRY_UPPER_ADDR);
>  		writel(msg->data, base + PCI_MSIX_ENTRY_DATA);
>  	} else {
> -		struct pci_dev *dev = entry->dev;
>  		int pos = dev->msi_cap;
>  		u16 msgctl;
> 
> @@ -357,14 +380,34 @@ void write_msi_msg(unsigned int irq, struct msi_msg *msg)
>  	__write_msi_msg(entry, msg);
>  }
> 
> -static void free_msi_irqs(struct pci_dev *dev)
> +static void free_msi_sysfs(struct pci_dev *dev)
>  {
> -	struct msi_desc *entry, *tmp;
>  	struct attribute **msi_attrs;
>  	struct device_attribute *dev_attr;
>  	int count = 0;
> 
> -	list_for_each_entry(entry, &dev->msi_list, list) {
> +	if (dev->msi_irq_groups) {
> +		sysfs_remove_groups(&dev->dev.kobj, dev->msi_irq_groups);
> +		msi_attrs = dev->msi_irq_groups[0]->attrs;
> +		while (msi_attrs[count]) {
> +			dev_attr = container_of(msi_attrs[count],
> +						struct device_attribute, attr);
> +			kfree(dev_attr->attr.name);
> +			kfree(dev_attr);
> +			++count;
> +		}
> +		kfree(msi_attrs);
> +		kfree(dev->msi_irq_groups[0]);
> +		kfree(dev->msi_irq_groups);
> +		dev->msi_irq_groups = NULL;
> +	}
> +}
> +
> +static void free_msi_irqs(struct msi_irqs *msi)
> +{
> +	struct msi_desc *entry, *tmp;
> +
> +	list_for_each_entry(entry, &msi->msi_list, list) {
>  		int i, nvec;
>  		if (!entry->irq)
>  			continue;
> @@ -376,11 +419,11 @@ static void free_msi_irqs(struct pci_dev *dev)
>  			BUG_ON(irq_has_action(entry->irq + i));
>  	}
> 
> -	arch_teardown_msi_irqs(dev);
> +	arch_teardown_msi_irqs(msi);
> 
> -	list_for_each_entry_safe(entry, tmp, &dev->msi_list, list) {
> +	list_for_each_entry_safe(entry, tmp, &msi->msi_list, list) {
>  		if (entry->msi_attrib.is_msix) {
> -			if (list_is_last(&entry->list, &dev->msi_list))
> +			if (list_is_last(&entry->list, &msi->msi_list))
>  				iounmap(entry->mask_base);
>  		}
> 
> @@ -398,38 +441,24 @@ static void free_msi_irqs(struct pci_dev *dev)
>  		list_del(&entry->list);
>  		kfree(entry);
>  	}
> -
> -	if (dev->msi_irq_groups) {
> -		sysfs_remove_groups(&dev->dev.kobj, dev->msi_irq_groups);
> -		msi_attrs = dev->msi_irq_groups[0]->attrs;
> -		while (msi_attrs[count]) {
> -			dev_attr = container_of(msi_attrs[count],
> -						struct device_attribute, attr);
> -			kfree(dev_attr->attr.name);
> -			kfree(dev_attr);
> -			++count;
> -		}
> -		kfree(msi_attrs);
> -		kfree(dev->msi_irq_groups[0]);
> -		kfree(dev->msi_irq_groups);
> -		dev->msi_irq_groups = NULL;
> -	}
>  }
> 
> -static struct msi_desc *alloc_msi_entry(struct pci_dev *dev)
> +static struct msi_desc *alloc_msi_entry(struct msi_irqs *msi)
>  {
>  	struct msi_desc *desc = kzalloc(sizeof(*desc), GFP_KERNEL);
>  	if (!desc)
>  		return NULL;
> 
>  	INIT_LIST_HEAD(&desc->list);
> -	desc->dev = dev;
> +	desc->msi = msi;
> 
>  	return desc;
>  }
> 
> -static void pci_intx_for_msi(struct pci_dev *dev, int enable)
> +static void pci_intx_for_msi(struct msi_irqs *msi, int enable)
>  {
> +	struct pci_dev *dev = msi->data;
> +
>  	if (!(dev->dev_flags & PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG))
>  		pci_intx(dev, enable);
>  }
> @@ -444,9 +473,9 @@ static void __pci_restore_msi_state(struct pci_dev *dev)
> 
>  	entry = irq_get_msi_desc(dev->irq);
> 
> -	pci_intx_for_msi(dev, 0);
> -	msi_set_enable(dev, 0);
> -	arch_restore_msi_irqs(dev);
> +	pci_intx_for_msi(dev->msi, 0);
> +	msi_set_enable(dev->msi, 0, MSI_TYPE);
> +	arch_restore_msi_irqs(dev->msi);
> 
>  	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
>  	msi_mask_irq(entry, msi_mask(entry->msi_attrib.multi_cap),
> @@ -459,22 +488,21 @@ static void __pci_restore_msi_state(struct pci_dev *dev)
>  static void __pci_restore_msix_state(struct pci_dev *dev)
>  {
>  	struct msi_desc *entry;
> +	struct msi_irqs *msi = dev->msi;
> 
> -	if (!dev->msix_enabled)
> +	if (!pci_dev_msi_enabled(dev, MSIX_TYPE))
>  		return;
> -	BUG_ON(list_empty(&dev->msi_list));
> +	BUG_ON(list_empty(&msi->msi_list));
> 
>  	/* route the table */
> -	pci_intx_for_msi(dev, 0);
> -	msix_clear_and_set_ctrl(dev, 0,
> -				PCI_MSIX_FLAGS_ENABLE | PCI_MSIX_FLAGS_MASKALL);
> -
> -	arch_restore_msi_irqs(dev);
> -	list_for_each_entry(entry, &dev->msi_list, list) {
> +	pci_intx_for_msi(msi, 0);
> +	msi_set_enable(msi, 1, MSIX_TYPE);
> +	msix_set_all_mask(msi, 1);
> +	arch_restore_msi_irqs(msi);
> +	list_for_each_entry(entry, &msi->msi_list, list)
>  		msix_mask_irq(entry, entry->masked);
> -	}
> 
> -	msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
> +	msix_set_all_mask(msi, 0);
>  }
> 
>  void pci_restore_msi_state(struct pci_dev *dev)
> @@ -516,7 +544,7 @@ static int populate_msi_sysfs(struct pci_dev *pdev)
>  	int count = 0;
> 
>  	/* Determine how many msi entries we have */
> -	list_for_each_entry(entry, &pdev->msi_list, list) {
> +	list_for_each_entry(entry, &pdev->msi->msi_list, list) {
>  		++num_msi;
>  	}
>  	if (!num_msi)
> @@ -526,7 +554,7 @@ static int populate_msi_sysfs(struct pci_dev *pdev)
>  	msi_attrs = kzalloc(sizeof(void *) * (num_msi + 1), GFP_KERNEL);
>  	if (!msi_attrs)
>  		return -ENOMEM;
> -	list_for_each_entry(entry, &pdev->msi_list, list) {
> +	list_for_each_entry(entry, &pdev->msi->msi_list, list) {
>  		msi_dev_attr = kzalloc(sizeof(*msi_dev_attr), GFP_KERNEL);
>  		if (!msi_dev_attr)
>  			goto error_attrs;
> @@ -578,13 +606,14 @@ error_attrs:
>  	return ret;
>  }
> 
> -static struct msi_desc *msi_setup_entry(struct pci_dev *dev)
> +static struct msi_desc *msi_setup_entry(struct msi_irqs *msi)
>  {
>  	u16 control;
>  	struct msi_desc *entry;
> +	struct pci_dev *dev = msi->data;
> 
>  	/* MSI Entry Initialization */
> -	entry = alloc_msi_entry(dev);
> +	entry = alloc_msi_entry(msi);
>  	if (!entry)
>  		return NULL;
> 
> @@ -620,15 +649,15 @@ static struct msi_desc *msi_setup_entry(struct pci_dev *dev)
>   * an error, and a positive return value indicates the number of interrupts
>   * which could have been allocated.
>   */
> -static int msi_capability_init(struct pci_dev *dev, int nvec)
> +static int msi_capability_init(struct msi_irqs *msi, int nvec)
>  {
>  	struct msi_desc *entry;
>  	int ret;
>  	unsigned mask;
> 
> -	msi_set_enable(dev, 0);	/* Disable MSI during set up */
> +	msi_set_enable(msi, 0, MSI_TYPE);	/* Disable MSI during set up */
> 
> -	entry = msi_setup_entry(dev);
> +	entry = msi_setup_entry(msi);
>  	if (!entry)
>  		return -ENOMEM;
> 
> @@ -636,21 +665,23 @@ static int msi_capability_init(struct pci_dev *dev, int nvec)
>  	mask = msi_mask(entry->msi_attrib.multi_cap);
>  	msi_mask_irq(entry, mask, mask);
> 
> -	list_add_tail(&entry->list, &dev->msi_list);
> +	list_add_tail(&entry->list, &msi->msi_list);
> 
>  	/* Configure MSI capability structure */
> -	ret = arch_setup_msi_irqs(dev, nvec, MSI_TYPE);
> -	if (ret) {
> -		msi_mask_irq(entry, mask, ~mask);
> -		free_msi_irqs(dev);
> -		return ret;
> -	}
> +	ret = arch_setup_msi_irqs(msi, nvec, MSI_TYPE);
> +	if (ret)
> +		goto err;
> 
>  	/* Set MSI enabled bits	 */
> -	pci_intx_for_msi(dev, 0);
> -	msi_set_enable(dev, 1);
> -	dev->msi_enabled = 1;
> +	pci_intx_for_msi(msi, 0);
> +	msi_set_enable(msi, 1, MSI_TYPE);
> +	msi->msi_enabled = 1;
>  	return 0;
> +
> +err:
> +	msi_mask_irq(entry, mask, ~mask);
> +	free_msi_irqs(msi);
> +	return ret;
>  }
> 
>  static void __iomem *msix_map_region(struct pci_dev *dev, unsigned nr_entries)
> @@ -668,19 +699,20 @@ static void __iomem *msix_map_region(struct pci_dev *dev, unsigned nr_entries)
>  	return ioremap_nocache(phys_addr, nr_entries * PCI_MSIX_ENTRY_SIZE);
>  }
> 
> -static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
> +static int msix_setup_entries(struct msi_irqs *msi, void __iomem *base,
>  			      struct msix_entry *entries, int nvec)
>  {
>  	struct msi_desc *entry;
>  	int i, offset;
> +	struct pci_dev *dev = msi->data;
> 
>  	for (i = 0; i < nvec; i++) {
> -		entry = alloc_msi_entry(dev);
> +		entry = alloc_msi_entry(msi);
>  		if (!entry) {
>  			if (!i)
>  				iounmap(base);
>  			else
> -				free_msi_irqs(dev);
> +				free_msi_irqs(msi);
>  			/* No enough memory. Don't try again */
>  			return -ENOMEM;
>  		}
> @@ -688,7 +720,6 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
>  		entry->msi_attrib.is_msix	= 1;
>  		entry->msi_attrib.is_64		= 1;
>  		entry->msi_attrib.entry_nr	= entries[i].entry;
> -		entry->msi_attrib.default_irq	= dev->irq;
>  		entry->mask_base		= base;
> 
>  		msix_clear_and_set_ctrl(dev, 0,
> @@ -700,19 +731,19 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
>  		msix_clear_and_set_ctrl(dev,
>  				PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE, 0);
> 
> -		list_add_tail(&entry->list, &dev->msi_list);
> +		list_add_tail(&entry->list, &msi->msi_list);
>  	}
> 
>  	return 0;
>  }
> 
> -static void msix_program_entries(struct pci_dev *dev,
> +static void msix_program_entries(struct msi_irqs *msi,
>  				 struct msix_entry *entries)
>  {
>  	struct msi_desc *entry;
>  	int i = 0;
> 
> -	list_for_each_entry(entry, &dev->msi_list, list) {
> +	list_for_each_entry(entry, &msi->msi_list, list) {
>  		entries[i].vector = entry->irq;
>  		irq_set_msi_desc(entry->irq, entry);
>  		i++;
> @@ -729,19 +760,19 @@ static void msix_program_entries(struct pci_dev *dev,
>   * single MSI-X irq. A return of zero indicates the successful setup of
>   * requested MSI-X entries with allocated irqs or non-zero for otherwise.
>   **/
> -static int msix_capability_init(struct pci_dev *dev, void __iomem *base,
> +static int msix_capability_init(struct msi_irqs *msi, void __iomem *base,
>  				struct msix_entry *entries, int nvec)
>  {
>  	int ret;
> 
>  	/* Ensure MSI-X is disabled while it is set up */
> -	msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
> +	msi_set_enable(msi, 0, MSIX_TYPE);
> 
> -	ret = msix_setup_entries(dev, base, entries, nvec);
> +	ret = msix_setup_entries(msi, base, entries, nvec);
>  	if (ret)
>  		return ret;
> 
> -	ret = arch_setup_msi_irqs(dev, nvec, MSIX_TYPE);
> +	ret = arch_setup_msi_irqs(msi, nvec, MSIX_TYPE);
>  	if (ret)
>  		goto out_avail;
> 
> @@ -750,13 +781,13 @@ static int msix_capability_init(struct pci_dev *dev, void __iomem *base,
>  	 * MSI-X registers.  We need to mask all the vectors to prevent
>  	 * interrupts coming in before they're fully set up.
>  	 */
> -	msix_program_entries(dev, entries);
> +	msix_program_entries(msi, entries);
> 
>  	/* Set MSI-X enabled bits and unmask the function */
> -	pci_intx_for_msi(dev, 0);
> -	dev->msix_enabled = 1;
> +	pci_intx_for_msi(msi, 0);
> +	msi->msix_enabled = 1;
> 
> -	msix_clear_and_set_ctrl(dev, 0, PCI_MSIX_FLAGS_ENABLE);
> +	msi_set_enable(msi, 1, MSIX_TYPE);
> 
>  	return 0;
> 
> @@ -769,7 +800,7 @@ out_avail:
>  		struct msi_desc *entry;
>  		int avail = 0;
> 
> -		list_for_each_entry(entry, &dev->msi_list, list) {
> +		list_for_each_entry(entry, &msi->msi_list, list) {
>  			if (entry->irq != 0)
>  				avail++;
>  		}
> @@ -777,7 +808,7 @@ out_avail:
>  			ret = avail;
>  	}
> 
> -	free_msi_irqs(dev);
> +	free_msi_irqs(msi);
> 
>  	return ret;
>  }
> @@ -820,7 +851,7 @@ static int pci_msi_check_device(struct pci_dev *dev, int nvec, int type)
>  		if (bus->bus_flags & PCI_BUS_FLAGS_NO_MSI)
>  			return -EINVAL;
> 
> -	ret = arch_msi_check_device(dev, nvec, type);
> +	ret = arch_msi_check_device(dev->msi, nvec, type);
>  	if (ret)
>  		return ret;
> 
> @@ -861,12 +892,12 @@ void pci_msi_shutdown(struct pci_dev *dev)
>  			!pci_dev_msi_enabled(dev, MSI_TYPE))
>  		return;
> 
> -	BUG_ON(list_empty(&dev->msi_list));
> -	desc = list_first_entry(&dev->msi_list, struct msi_desc, list);
> +	BUG_ON(list_empty(&dev->msi->msi_list));
> +	desc = list_first_entry(&dev->msi->msi_list, struct msi_desc, list);
> 
> -	msi_set_enable(dev, 0);
> -	pci_intx_for_msi(dev, 1);
> -	dev->msi_enabled = 0;
> +	msi_set_enable(dev->msi, 0, MSI_TYPE);
> +	pci_intx_for_msi(dev->msi, 1);
> +	dev->msi->msi_enabled = 0;
> 
>  	/* Return the device with MSI unmasked as initial states */
>  	mask = msi_mask(desc->msi_attrib.multi_cap);
> @@ -884,7 +915,8 @@ void pci_disable_msi(struct pci_dev *dev)
>  		return;
> 
>  	pci_msi_shutdown(dev);
> -	free_msi_irqs(dev);
> +	free_msi_irqs(dev->msi);
> +	free_msi_sysfs(dev);
>  }
>  EXPORT_SYMBOL(pci_disable_msi);
> 
> @@ -930,9 +962,10 @@ int pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries, int nvec)
>  	void __iomem *base;
>  	u16 control;
> 
> -	if (!entries || !dev->msix_cap || dev->current_state != PCI_D0)
> +	if (!entries || !dev->msix_cap || !dev->msi
> +		   	|| dev->current_state != PCI_D0)
>  		return -EINVAL;
> -
> +
>  	status = pci_msi_check_device(dev, nvec, MSIX_TYPE);
>  	if (status)
>  		return status;
> @@ -952,7 +985,7 @@ int pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries, int nvec)
>  				return -EINVAL;	/* duplicate entry */
>  		}
>  	}
> -	WARN_ON(!!dev->msix_enabled);
> +	WARN_ON(!!pci_dev_msi_enabled(dev, MSIX_TYPE));
> 
>  	/* Check whether driver already requested for MSI irq */
>  	if (pci_dev_msi_enabled(dev, MSI_TYPE)) {
> @@ -966,13 +999,13 @@ int pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries, int nvec)
>  	if (!base)
>  		return -ENOMEM;
> 
> -	status = msix_capability_init(dev, base, entries, nvec);
> +	status = msix_capability_init(dev->msi, base, entries, nvec);
>  	if (!status) {
>  		ret = populate_msi_sysfs(dev);
>  		if (ret) {
> -			dev->msix_enabled = 0;
> -			pci_intx_for_msi(dev, 1);
> -			free_msi_irqs(dev);
> +			dev->msi->msix_enabled = 0;
> +			pci_intx_for_msi(dev->msi, 1);
> +			free_msi_irqs(dev->msi);
>  		}
>  	}
>  	return status;
> @@ -983,18 +1016,18 @@ void pci_msix_shutdown(struct pci_dev *dev)
>  {
>  	struct msi_desc *entry;
> 
> -	if (!pci_msi_enable || !dev || !dev->msix_enabled)
> +	if (!pci_msi_enable || !dev || !pci_dev_msi_enabled(dev, MSIX_TYPE))
>  		return;
> 
>  	/* Return the device with MSI-X masked as initial states */
> -	list_for_each_entry(entry, &dev->msi_list, list) {
> +	list_for_each_entry(entry, &dev->msi->msi_list, list) {
>  		/* Keep cached states to be restored */
>  		arch_msix_mask_irq(entry, 1);
>  	}
> 
> -	msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
> -	pci_intx_for_msi(dev, 1);
> -	dev->msix_enabled = 0;
> +	msi_set_enable(dev->msi, 0, MSIX_TYPE);
> +	pci_intx_for_msi(dev->msi, 1);
> +	dev->msi->msix_enabled = 0;
>  }
> 
>  void pci_disable_msix(struct pci_dev *dev)
> @@ -1004,7 +1037,8 @@ void pci_disable_msix(struct pci_dev *dev)
>  		return;
> 
>  	pci_msix_shutdown(dev);
> -	free_msi_irqs(dev);
> +	free_msi_irqs(dev->msi);
> +	free_msi_sysfs(dev);
>  }
>  EXPORT_SYMBOL(pci_disable_msix);
> 
> @@ -1025,21 +1059,52 @@ int pci_msi_enabled(void)
>  }
>  EXPORT_SYMBOL(pci_msi_enabled);
> 
> -void pci_msi_init_pci_dev(struct pci_dev *dev)
> +static struct msi_ops pci_msi = {
> +	.msi_set_enable = msi_set_enable,
> +	.msi_setup_entry = msi_setup_entry,
> +	.msix_setup_entries = msix_setup_entries,
> +	.msi_mask_irq = default_msi_mask_irq,
> +	.msix_mask_irq = default_msix_mask_irq,
> +	.msi_read_message = __read_msi_msg,
> +	.msi_write_message = __write_msi_msg,
> +	.msi_set_intx =  pci_intx_for_msi,
> +};

I want to be sure I understand this correctly. If I have a non-PCI driver "xyz" that wants to use its own ops, then I need to implement all of these functions in that driver, in something like driver/xyz/msi.c?
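
To make the question concrete, here is a minimal userspace sketch (not kernel code) of what such a driver might supply. It assumes a reduced msi_ops with a single hook; struct xyz_device, XYZ_CTRL_MSI_EN, xyz_msi_set_enable() and xyz_demo() are hypothetical names, the MSI_TYPE/MSIX_TYPE values are placeholders, and calloc() stands in for kzalloc(). The generic wrappers in patch 10/11 return early when a hook is NULL, so it may be that a driver only has to fill in the hooks it actually needs, but that is my reading of the patch and not something the series states.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Placeholder values; the real definitions live in the kernel headers. */
#define MSI_TYPE  0x01
#define MSIX_TYPE 0x02

struct msi_irqs;

/* Reduced stand-in for the RFC's struct msi_ops: one hook only. */
struct msi_ops {
	void (*msi_set_enable)(struct msi_irqs *msi, int enable, int type);
};

/* Reduced stand-in for the RFC's struct msi_irqs. */
struct msi_irqs {
	void *data;		/* back-pointer to the owning device */
	struct msi_ops *ops;	/* device-specific hooks */
};

/* Hypothetical non-PCI device with a single MSI enable bit. */
#define XYZ_CTRL_MSI_EN 0x1u

struct xyz_device {
	uint32_t ctrl;		/* fake control register */
	struct msi_irqs *msi;
};

/* Device-specific hook, e.g. what driver/xyz/msi.c might contain. */
static void xyz_msi_set_enable(struct msi_irqs *msi, int enable, int type)
{
	struct xyz_device *xyz = msi->data;

	if (type != MSI_TYPE)	/* this fake device has no MSI-X */
		return;
	if (enable)
		xyz->ctrl |= XYZ_CTRL_MSI_EN;
	else
		xyz->ctrl &= ~XYZ_CTRL_MSI_EN;
}

static struct msi_ops xyz_msi_ops = {
	.msi_set_enable = xyz_msi_set_enable,
};

/* Same shape as alloc_msi_irqs() in the patch; calloc() replaces kzalloc(). */
static struct msi_irqs *alloc_msi_irqs(void *data, struct msi_ops *ops)
{
	struct msi_irqs *msi = calloc(1, sizeof(*msi));

	if (!msi)
		return NULL;
	msi->data = data;
	msi->ops = ops;
	return msi;
}

/* Generic wrapper: dispatches to the hook, or does nothing if it is absent. */
static void msi_set_enable(struct msi_irqs *msi, int enable, int type)
{
	if (!msi || !msi->ops || !msi->ops->msi_set_enable)
		return;
	msi->ops->msi_set_enable(msi, enable, type);
}

/* Enable MSI, optionally disable it again; returns the register value. */
static uint32_t xyz_demo(int disable_again)
{
	struct xyz_device xyz = { .ctrl = 0 };
	uint32_t ctrl;

	xyz.msi = alloc_msi_irqs(&xyz, &xyz_msi_ops);
	msi_set_enable(xyz.msi, 1, MSI_TYPE);
	msi_set_enable(xyz.msi, 1, MSIX_TYPE);	/* ignored by this device */
	if (disable_again)
		msi_set_enable(xyz.msi, 0, MSI_TYPE);
	ctrl = xyz.ctrl;
	free(xyz.msi);
	return ctrl;
}
```

In this sketch, xyz_demo(0) leaves the enable bit set in the fake register and xyz_demo(1) clears it again; the generic code never needs to know what kind of device sits behind msi->data.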

Thanks
-Bharat

> +
> +struct msi_irqs *alloc_msi_irqs(void *data, struct msi_ops *ops)
>  {
> -	INIT_LIST_HEAD(&dev->msi_list);
> +	struct msi_irqs *msi;
> +
> +	msi = kzalloc(sizeof(struct msi_irqs), GFP_KERNEL);
> +	if (!msi)
> +		return NULL;
> 
> +	INIT_LIST_HEAD(&msi->msi_list);
> +	msi->data = data;
> +	msi->ops = ops;
> +	return msi;
> +}
> +
> +void pci_msi_init_pci_dev(struct pci_dev *dev)
> +{
>  	/* Disable the msi hardware to avoid screaming interrupts
>  	 * during boot.  This is the power on reset default so
>  	 * usually this should be a noop.
>  	 */
>  	dev->msi_cap = pci_find_capability(dev, PCI_CAP_ID_MSI);
> -	if (dev->msi_cap)
> -		msi_set_enable(dev, 0);
> -
>  	dev->msix_cap = pci_find_capability(dev, PCI_CAP_ID_MSIX);
> -	if (dev->msix_cap)
> -		msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
> +
> +	if (dev->msi_cap || dev->msix_cap) {
> +		dev->msi = alloc_msi_irqs(dev, &pci_msi);
> +		if (!dev->msi)
> +			return;
> +
> +		dev->msi->node = dev_to_node(&dev->dev);
> +		if (dev->msi_cap)
> +			msi_set_enable(dev->msi, 0, MSI_TYPE);
> +
> +		if (dev->msix_cap)
> +			msi_set_enable(dev->msi, 0, MSIX_TYPE);
> +	}
>  }
> 
>  /**
> @@ -1060,13 +1125,13 @@ int pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec)
>  	int rc;
>  	struct msi_desc *entry;
> 
> -	if (dev->current_state != PCI_D0)
> +	if (dev->current_state != PCI_D0 || !dev->msi)
>  		return -EINVAL;
> 
> -	WARN_ON(!!dev->msi_enabled);
> +	WARN_ON(!!pci_dev_msi_enabled(dev, MSI_TYPE));
> 
>  	/* Check whether driver already requested MSI-X irqs */
> -	if (dev->msix_enabled) {
> +	if (pci_dev_msi_enabled(dev, MSIX_TYPE)) {
>  		dev_info(&dev->dev,
>  			 "can't enable MSI (MSI-X already enabled)\n");
>  		return -EINVAL;
> @@ -1095,7 +1160,7 @@ int pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec)
>  	} while (rc);
> 
>  	do {
> -		rc = msi_capability_init(dev, nvec);
> +		rc = msi_capability_init(dev->msi, nvec);
>  		if (rc < 0) {
>  			return rc;
>  		} else if (rc > 0) {
> @@ -1107,14 +1172,14 @@ int pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec)
> 
>  	rc = populate_msi_sysfs(dev);
>  	if (rc) {
> -		msi_set_enable(dev, 0);
> -		pci_intx_for_msi(dev, 1);
> -		dev->msi_enabled = 0;
> -		free_msi_irqs(dev);
> +		msi_set_enable(dev->msi, 0, MSI_TYPE);
> +		pci_intx_for_msi(dev->msi, 1);
> +		dev->msi->msi_enabled = 0;
> +		free_msi_irqs(dev->msi);
>  		return rc;
>  	}
> 
> -	entry = list_entry(dev->msi_list.next, struct msi_desc, list);
> +	entry = list_entry(dev->msi->msi_list.next, struct msi_desc, list);
>  	dev->irq = entry->irq;
>  	return nvec;
>  }
> @@ -1158,3 +1223,5 @@ int pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries,
>  	return nvec;
>  }
>  EXPORT_SYMBOL(pci_enable_msix_range);
> +
> +
> diff --git a/include/linux/msi.h b/include/linux/msi.h
> index 5a672d3..fc8f3e8 100644
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -83,15 +83,15 @@ struct msi_desc {
>   * implemented as weak symbols so that they /can/ be overriden by
>   * architecture specific code if needed.
>   */
> -int arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc);
> +int arch_setup_msi_irq(struct msi_irqs *msi, struct msi_desc *desc);
>  void arch_teardown_msi_irq(unsigned int irq);
> -int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type);
> -void arch_teardown_msi_irqs(struct pci_dev *dev);
> -int arch_msi_check_device(struct pci_dev* dev, int nvec, int type);
> -void arch_restore_msi_irqs(struct pci_dev *dev);
> +int arch_setup_msi_irqs(struct msi_irqs *msi, int nvec, int type);
> +void arch_teardown_msi_irqs(struct msi_irqs *msi);
> +int arch_msi_check_device(struct msi_irqs *msi, int nvec, int type);
> +void arch_restore_msi_irqs(struct msi_irqs *msi);
> 
> -void default_teardown_msi_irqs(struct pci_dev *dev);
> -void default_restore_msi_irqs(struct pci_dev *dev);
> +void default_teardown_msi_irqs(struct msi_irqs *msi);
> +void default_restore_msi_irqs(struct msi_irqs *msi);
>  u32 default_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag);
>  u32 default_msix_mask_irq(struct msi_desc *desc, u32 flag);
> 
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index c7bca1c..d7126fc 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -334,8 +334,6 @@ struct pci_dev {
>  	unsigned int	block_cfg_access:1;	/* config space access is blocked */
>  	unsigned int	broken_parity_status:1;	/* Device generates false positive parity */
>  	unsigned int	irq_reroute_variant:2;	/* device needs IRQ rerouting variant */
> -	unsigned int	msi_enabled:1;
> -	unsigned int	msix_enabled:1;
>  	unsigned int	ari_enabled:1;	/* ARI forwarding */
>  	unsigned int	is_managed:1;
>  	unsigned int    needs_freset:1; /* Dev requires fundamental reset */
> @@ -358,7 +356,7 @@ struct pci_dev {
>  	struct bin_attribute *res_attr[DEVICE_COUNT_RESOURCE]; /* sysfs file for resources */
>  	struct bin_attribute *res_attr_wc[DEVICE_COUNT_RESOURCE]; /* sysfs file for WC mapping of resources */
>  #ifdef CONFIG_PCI_MSI
> -	struct list_head msi_list;
> +	struct msi_irqs *msi;
>  	const struct attribute_group **msi_irq_groups;
>  #endif
>  	struct pci_vpd *vpd;
> @@ -510,11 +508,14 @@ static inline struct pci_dev *pci_upstream_bridge(struct pci_dev *dev)
>  static inline bool pci_dev_msi_enabled(struct pci_dev *pci_dev, int type)
>  {
>  	bool enabled = 0;
> +
> +	if (!pci_dev->msi)
> +		return false;
> 
>  	if (type & MSI_TYPE)
> -		enabled |= pci_dev->msi_enabled;
> +		enabled |= pci_dev->msi->msi_enabled;
>  	if (type & MSIX_TYPE)
> -		enabled |= pci_dev->msix_enabled;
> +		enabled |= pci_dev->msi->msix_enabled;
> 
>  	return enabled;
>  }
> --
> 1.7.1
> 


* RE: [RFC PATCH 10/11] PCI/MSI: Split the generic MSI code into new file
  2014-07-26  3:08 ` [RFC PATCH 10/11] PCI/MSI: Split the generic MSI code into new file Yijing Wang
@ 2014-08-20  6:18   ` Bharat.Bhushan
  2014-08-20  6:43     ` Yijing Wang
  0 siblings, 1 reply; 41+ messages in thread
From: Bharat.Bhushan @ 2014-08-20  6:18 UTC (permalink / raw)
  To: Yijing Wang, linux-kernel
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, arnab.basu, virtualization, Hanjun Guo



> -----Original Message-----
> From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-owner@vger.kernel.org]
> On Behalf Of Yijing Wang
> Sent: Saturday, July 26, 2014 8:39 AM
> To: linux-kernel@vger.kernel.org
> Cc: Xinwei Hu; Wuyun; Bjorn Helgaas; linux-pci@vger.kernel.org;
> Paul.Mundt@huawei.com; James E.J. Bottomley; Marc Zyngier; linux-arm-
> kernel@lists.infradead.org; Russell King; linux-arch@vger.kernel.org; Basu
> Arnab-B45036; virtualization@lists.linux-foundation.org; Hanjun Guo; Yijing Wang
> Subject: [RFC PATCH 10/11] PCI/MSI: Split the generic MSI code into new file
> 
> MSI interrupts are no longer used only by PCI devices; more
> and more non-PCI devices also want to use MSI. The ARM
> GIC v3 spec says that on ARM platforms with a GIC v3 controller,
> non-PCI devices can also be designed to support MSI to
> simplify interrupt wiring. For existing non-PCI
> devices, a consolidator is designed and used to translate
> legacy interrupts to MSI. So to support non-PCI MSI
> devices, a generic MSI driver is needed. Split the generic
> MSI code into a new location, drivers/msi/msi.c, so that the
> MSI driver no longer depends on PCI.
> 
> Signed-off-by: Yijing Wang <wangyijing@huawei.com>
> ---
>  drivers/Kconfig      |    1 +
>  drivers/Makefile     |    1 +
>  drivers/msi/Kconfig  |    8 +
>  drivers/msi/Makefile |    1 +
>  drivers/msi/msi.c    |  540 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/pci/Kconfig  |    6 +-
>  drivers/pci/msi.c    |  500 ++++-------------------------------------------
>  include/linux/msi.h  |   31 +++-
>  8 files changed, 617 insertions(+), 471 deletions(-)
>  create mode 100644 drivers/msi/Kconfig
>  create mode 100644 drivers/msi/Makefile
>  create mode 100644 drivers/msi/msi.c
> 
> diff --git a/drivers/Kconfig b/drivers/Kconfig
> index 0e87a34..4d05749 100644
> --- a/drivers/Kconfig
> +++ b/drivers/Kconfig
> @@ -176,4 +176,5 @@ source "drivers/powercap/Kconfig"
> 
>  source "drivers/mcb/Kconfig"
> 
> +source "drivers/msi/Kconfig"
>  endmenu
> diff --git a/drivers/Makefile b/drivers/Makefile
> index f98b50d..47ae3d1 100644
> --- a/drivers/Makefile
> +++ b/drivers/Makefile
> @@ -158,3 +158,4 @@ obj-$(CONFIG_NTB)		+= ntb/
>  obj-$(CONFIG_FMC)		+= fmc/
>  obj-$(CONFIG_POWERCAP)		+= powercap/
>  obj-$(CONFIG_MCB)		+= mcb/
> +obj-$(CONFIG_MSI)		+= msi/
> diff --git a/drivers/msi/Kconfig b/drivers/msi/Kconfig
> new file mode 100644
> index 0000000..739bd13
> --- /dev/null
> +++ b/drivers/msi/Kconfig
> @@ -0,0 +1,8 @@
> +config MSI
> +	bool "Message Signaled Interrupts (MSI and MSI-X)"
> +	default y
> +	help
> +		This allows device drivers to use generic MSI(Message
> +		Signaled Interrupt). Message Signaled Interrupts enable
> +		a device to generate an interrupt using an inbound Memory
> +		Write to a specific target address.
> diff --git a/drivers/msi/Makefile b/drivers/msi/Makefile
> new file mode 100644
> index 0000000..39cb026
> --- /dev/null
> +++ b/drivers/msi/Makefile
> @@ -0,0 +1 @@
> +obj-$(CONFIG_MSI) += msi.o
> diff --git a/drivers/msi/msi.c b/drivers/msi/msi.c
> new file mode 100644
> index 0000000..3fbd539
> --- /dev/null
> +++ b/drivers/msi/msi.c
> @@ -0,0 +1,540 @@
> +/*
> + * File:	msi.c
> + * Purpose:	Message Signaled Interrupt (MSI)
> + *
> + * Copyright (C) 2014 Huawei Ltd.
> + * Copyright (C) Yijing Wang <wangyijing@huawei.com>
> + */
> +#include <linux/err.h>
> +#include <linux/mm.h>
> +#include <linux/irq.h>
> +#include <linux/interrupt.h>
> +#include <linux/export.h>
> +#include <linux/ioport.h>
> +#include <linux/proc_fs.h>
> +#include <linux/msi.h>
> +#include <linux/smp.h>
> +#include <linux/errno.h>
> +#include <linux/io.h>
> +#include <linux/slab.h>
> +#include <linux/device.h>
> +#include <linux/pci.h>
> +
> +/* Arch hooks */
> +
> +int __weak arch_setup_msi_irq(struct msi_irqs *msi, struct msi_desc *desc)
> +{
> +	struct pci_dev *dev = msi->data;
> +	struct msi_chip *chip = dev->bus->msi; //TO BE DONE: rework msi_chip to support Non-PCI MSI
> +	int err;
> +
> +	if (!chip || !chip->setup_irq)
> +		return -EINVAL;
> +
> +	err = chip->setup_irq(chip, dev, desc);
> +	if (err < 0)
> +		return err;
> +
> +	irq_set_chip_data(desc->irq, chip);
> +	return 0;
> +}
> +
> +void __weak arch_teardown_msi_irq(unsigned int irq)
> +{
> +	struct msi_chip *chip = irq_get_chip_data(irq);
> +
> +	if (!chip || !chip->teardown_irq)
> +		return;
> +
> +	chip->teardown_irq(chip, irq);
> +}
> +
> +int __weak arch_msi_check_device(struct msi_irqs *msi, int nvec, int type)
> +{
> +	struct pci_dev *dev = msi->data;
> +	struct msi_chip *chip = dev->bus->msi; //TO BE DONE: rework msi_chip to support Non-PCI MSI
> +
> +	if (!chip || !chip->check_device)
> +		return 0;
> +
> +	return chip->check_device(chip, dev, nvec, type);
> +}
> +
> +int __weak arch_setup_msi_irqs(struct msi_irqs *msi, int nvec, int type)
> +{
> +	struct msi_desc *entry;
> +	int ret;
> +
> +	/*
> +	 * If an architecture wants to support multiple MSI, it needs to
> +	 * override arch_setup_msi_irqs()
> +	 */
> +	if (type == MSI_TYPE && nvec > 1)
> +		return 1;
> +
> +	list_for_each_entry(entry, &msi->msi_list, list) {
> +		ret = arch_setup_msi_irq(msi, entry);
> +		if (ret < 0)
> +			return ret;
> +		if (ret > 0)
> +			return -ENOSPC;
> +	}
> +	return 0;
> +}
> +
> +
> +void __weak arch_teardown_msi_irqs(struct msi_irqs *msi)
> +{
> +	return default_teardown_msi_irqs(msi);
> +}
> +
> +/*
> + * We have a default implementation available as a separate non-weak
> + * function, as it is used by the Xen x86 PCI code
> + */
> +void default_teardown_msi_irqs(struct msi_irqs *msi)
> +{
> +	struct msi_desc *entry;
> +
> +	list_for_each_entry(entry, &msi->msi_list, list) {
> +		int i, nvec;
> +		if (entry->irq == 0)
> +			continue;
> +		if (entry->nvec_used)
> +			nvec = entry->nvec_used;
> +		else
> +			nvec = 1 << entry->msi_attrib.multiple;
> +		for (i = 0; i < nvec; i++)
> +			arch_teardown_msi_irq(entry->irq + i);
> +	}
> +}
> +
> +static void default_restore_msi_irq(struct msi_irqs *msi, int irq)
> +{
> +	struct msi_desc *entry;
> +
> +	entry = NULL;
> +	if (msi->msix_enabled) {
> +		list_for_each_entry(entry, &msi->msi_list, list) {
> +			if (irq == entry->irq)
> +				break;
> +		}
> +	} else if (msi->msi_enabled)  {
> +		entry = irq_get_msi_desc(irq);
> +	}
> +
> +	if (entry)
> +		write_msi_msg(irq, &entry->msg);
> +}
> +
> +void default_restore_msi_irqs(struct msi_irqs *msi)
> +{
> +	struct msi_desc *entry;
> +
> +	list_for_each_entry(entry, &msi->msi_list, list) {
> +		default_restore_msi_irq(msi, entry->irq);
> +	}
> +}
> +
> +void __weak arch_restore_msi_irqs(struct msi_irqs *msi)
> +{
> +	return default_restore_msi_irqs(msi);
> +}
> +
> +u32 default_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
> +{
> +	struct msi_irqs *msi = desc->msi;
> +
> +	if (!msi || !msi->ops || !msi->ops->msi_mask_irq)
> +		return desc->masked;
> +	return msi->ops->msi_mask_irq(desc, mask, flag);
> +}
> +
> +__weak u32 arch_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
> +{
> +	return default_msi_mask_irq(desc, mask, flag);
> +}
> +
> +void msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
> +{
> +	desc->masked = arch_msi_mask_irq(desc, mask, flag);
> +}
> +
> +u32 default_msix_mask_irq(struct msi_desc *desc, u32 flag)
> +{
> +	struct msi_irqs *msi = desc->msi;
> +
> +	if (!msi || !msi->ops || !msi->ops->msix_mask_irq)
> +		return desc->masked;
> +
> +	return msi->ops->msix_mask_irq(desc, flag);
> +}
> +
> +__weak u32 arch_msix_mask_irq(struct msi_desc *desc, u32 flag)
> +{
> +	return default_msix_mask_irq(desc, flag);
> +}
> +
> +void msix_mask_irq(struct msi_desc *desc, u32 flag)
> +{
> +	desc->masked = arch_msix_mask_irq(desc, flag);
> +}
> +
> +static void msi_set_mask_bit(struct irq_data *data, u32 flag)
> +{
> +	struct msi_desc *desc = irq_data_get_msi(data);
> +
> +	if (desc->msi_attrib.is_msix) {
> +		msix_mask_irq(desc, flag);
> +		readl(desc->mask_base);		/* Flush write to device */
> +	} else {
> +		unsigned offset = data->irq - desc->irq;
> +		msi_mask_irq(desc, 1 << offset, flag << offset);
> +	}
> +}
> +
> +void mask_msi_irq(struct irq_data *data)
> +{
> +	msi_set_mask_bit(data, 1);
> +}
> +
> +void unmask_msi_irq(struct irq_data *data)
> +{
> +	msi_set_mask_bit(data, 0);
> +}
> +
> +void msi_set_enable(struct msi_irqs *msi, int enable, int type)
> +{
> +	if (!msi || !msi->ops || !msi->ops->msi_set_enable)
> +		return;
> +	msi->ops->msi_set_enable(msi, enable, type);
> +}
> +EXPORT_SYMBOL(msi_set_enable);
> +
> +void __read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
> +{
> +	struct msi_irqs *msi = entry->msi;
> +
> +	if (!msi || !msi->ops || !msi->ops->msi_read_message)
> +		return;
> +	msi->ops->msi_read_message(entry, msg);
> +}
> +
> +void read_msi_msg(unsigned int irq, struct msi_msg *msg)
> +{
> +	struct msi_desc *entry = irq_get_msi_desc(irq);
> +
> +	__read_msi_msg(entry, msg);
> +}
> +
> +void __get_cached_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
> +{
> +	/* Assert that the cache is valid, assuming that
> +	 * valid messages are not all-zeroes. */
> +	BUG_ON(!(entry->msg.address_hi | entry->msg.address_lo |
> +		 entry->msg.data));
> +
> +	*msg = entry->msg;
> +}
> +
> +void get_cached_msi_msg(unsigned int irq, struct msi_msg *msg)
> +{
> +	struct msi_desc *entry = irq_get_msi_desc(irq);
> +
> +	__get_cached_msi_msg(entry, msg);
> +}
> +
> +void __write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
> +{
> +	struct msi_irqs *msi = entry->msi;
> +
> +	if (!msi || !msi->ops || !msi->ops->msi_write_message)
> +		return;
> +	msi->ops->msi_write_message(entry, msg);
> +}
> +
> +void write_msi_msg(unsigned int irq, struct msi_msg *msg)
> +{
> +	struct msi_desc *entry = irq_get_msi_desc(irq);
> +
> +	__write_msi_msg(entry, msg);
> +}
> +
> +void free_msi_irqs(struct msi_irqs *msi)
> +{
> +	struct msi_desc *entry, *tmp;
> +
> +	list_for_each_entry(entry, &msi->msi_list, list) {
> +		int i, nvec;
> +		if (!entry->irq)
> +			continue;
> +		if (entry->nvec_used)
> +			nvec = entry->nvec_used;
> +		else
> +			nvec = 1 << entry->msi_attrib.multiple;
> +		for (i = 0; i < nvec; i++)
> +			BUG_ON(irq_has_action(entry->irq + i));
> +	}
> +
> +	arch_teardown_msi_irqs(msi);
> +
> +	list_for_each_entry_safe(entry, tmp, &msi->msi_list, list) {
> +		if (entry->msi_attrib.is_msix) {
> +			if (list_is_last(&entry->list, &msi->msi_list))
> +				iounmap(entry->mask_base);
> +		}
> +
> +		/*
> +		 * Its possible that we get into this path
> +		 * When populate_msi_sysfs fails, which means the entries
> +		 * were not registered with sysfs.  In that case don't
> +		 * unregister them.
> +		 */
> +		if (entry->kobj.parent) {
> +			kobject_del(&entry->kobj);
> +			kobject_put(&entry->kobj);
> +		}
> +
> +		list_del(&entry->list);
> +		kfree(entry);
> +	}
> +}
> +EXPORT_SYMBOL(free_msi_irqs);
> +
> +struct msi_irqs *alloc_msi_irqs(void *data, struct msi_ops *ops)
> +{
> +	struct msi_irqs *msi;
> +
> +	msi = kzalloc(sizeof(struct msi_irqs), GFP_KERNEL);
> +	if (!msi)
> +		return NULL;
> +
> +	INIT_LIST_HEAD(&msi->msi_list);
> +	msi->data = data;
> +	msi->ops = ops;
> +	return msi;
> +}
> +EXPORT_SYMBOL(alloc_msi_irqs);
> +
> +struct msi_desc *alloc_msi_entry(struct msi_irqs *msi)
> +{
> +	struct msi_desc *desc = kzalloc(sizeof(*desc), GFP_KERNEL);
> +	if (!desc)
> +		return NULL;
> +
> +	INIT_LIST_HEAD(&desc->list);
> +	desc->msi = msi;
> +
> +	return desc;
> +}
> +EXPORT_SYMBOL(alloc_msi_entry);
> +
> +static void msi_set_intx(struct msi_irqs *msi, int flag)
> +{
> +	if (!msi || !msi->ops || !msi->ops->msi_set_intx)
> +		return;
> +	msi->ops->msi_set_intx(msi, flag);
> +}
> +
> +void msi_shutdown(struct msi_irqs *msi)
> +{
> +	u32 mask;
> +	struct msi_desc *desc;
> +
> +	if (!msi || !msi->msi_enabled)
> +		return;
> +
> +	BUG_ON(list_empty(&msi->msi_list));
> +
> +	desc = list_first_entry(&msi->msi_list, struct msi_desc, list);
> +	msi_set_enable(msi, 0, MSI_TYPE);
> +	msi_set_intx(msi, 1);
> +	msi->msi_enabled = 0;
> +
> +	mask = msi_mask(desc->msi_attrib.multi_cap);
> +	arch_msi_mask_irq(desc, mask, ~mask);
> +}
> +
> +void msix_shutdown(struct msi_irqs *msi)
> +{
> +	struct msi_desc *entry;
> +
> +	if (!msi || !msi->msix_enabled)
> +		return;
> +
> +	list_for_each_entry(entry, &msi->msi_list, list)
> +		arch_msix_mask_irq(entry, 1);
> +
> +	msi_set_enable(msi, 0, MSIX_TYPE);
> +	msi_set_intx(msi, 1);
> +	msi->msix_enabled = 0;
> +}
> +
> +static struct msi_desc * msi_setup_entry(struct msi_irqs *msi)
> +{
> +	struct msi_desc *entry;
> +
> +	entry = alloc_msi_entry(msi);
> +	if (!entry)
> +		return NULL;
> +
> +	entry->msi_attrib.is_msix	= 0;
> +	entry->msi_attrib.entry_nr	= 0;
> +
> +	if (!msi->ops || !msi->ops->msi_setup_entry) {
> +		kfree(entry);
> +		return NULL;
> +	}

Can we move this check to the start of the function?
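
A rough, untested sketch of that reordering, written as a userspace model
rather than kernel code. Assumptions: calloc()/free() stand in for
kzalloc()/kfree(), the structs are cut down to the fields used here, and
demo_setup_entry() is a made-up hook that is not part of the patch.

```c
#include <assert.h>
#include <stdlib.h>

struct msi_irqs;

/* Cut-down stand-ins for the kernel structures (assumption) */
struct msi_desc {
	struct {
		unsigned is_msix:1;
		unsigned entry_nr:8;
	} msi_attrib;
	struct msi_irqs *msi;
};

struct msi_ops {
	struct msi_desc *(*msi_setup_entry)(struct msi_irqs *msi,
					    struct msi_desc *entry);
};

struct msi_irqs {
	struct msi_ops *ops;
};

static struct msi_desc *alloc_msi_entry(struct msi_irqs *msi)
{
	struct msi_desc *desc = calloc(1, sizeof(*desc));

	if (!desc)
		return NULL;
	desc->msi = msi;
	return desc;
}

/* Hypothetical bus hook, only here so the sketch can be exercised */
static struct msi_desc *demo_setup_entry(struct msi_irqs *msi,
					 struct msi_desc *entry)
{
	(void)msi;
	return entry;
}

static struct msi_desc *msi_setup_entry(struct msi_irqs *msi)
{
	struct msi_desc *entry;

	/* Check the hook first, so nothing has to be freed on this path */
	if (!msi->ops || !msi->ops->msi_setup_entry)
		return NULL;

	entry = alloc_msi_entry(msi);
	if (!entry)
		return NULL;

	entry->msi_attrib.is_msix	= 0;
	entry->msi_attrib.entry_nr	= 0;

	msi->ops->msi_setup_entry(msi, entry);
	return entry;
}
```

With the check first, the kfree() on the missing-hook path goes away. Note
that msi_capability_init() still turns a NULL return into -ENOMEM, so a
missing hook and a failed allocation remain indistinguishable to the caller.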

> +
> +	msi->ops->msi_setup_entry(msi, entry);
> +	return entry;
> +}
> +
> +static int msix_setup_entries(struct msi_irqs *msi, void __iomem *base,
> +			      struct msix_entry *entries, int nvec)
> +{
> +	struct msi_desc *entry;
> +	int i;
> +
> +	for (i = 0; i < nvec; i++) {
> +		entry = alloc_msi_entry(msi);
> +		if (!entry) {
> +			if (!i)
> +				iounmap(base);
> +			else
> +				free_msi_irqs(msi);
> +			/* No enough memory. Don't try again */
> +			return -ENOMEM;
> +		}
> +
> +		entry->msi_attrib.is_msix	= 1;
> +		entry->msi_attrib.is_64		= 1;
> +		entry->msi_attrib.entry_nr	= entries[i].entry;
> +		entry->mask_base		= base;
> +
> +		list_add_tail(&entry->list, &msi->msi_list);
> +	}
> +
> +	if (msi->ops && msi->ops->msix_setup_entries)
> +		return msi->ops->msix_setup_entries(msi, entries);
> +
> +	return 0;
> +}
> +
> +/**
> + * msi_capability_init - configure device's MSI capability structure
> + * @msi: pointer to the msi_irqs data structure of MSI device function
> + * @nvec: number of interrupts to allocate
> + *
> + * Setup the MSI capability structure of the device with the requested
> + * number of interrupts.  A return value of zero indicates the successful
> + * setup of an entry with the new MSI irq.  A negative return value indicates
> + * an error, and a positive return value indicates the number of interrupts
> + * which could have been allocated.
> + */
> +int msi_capability_init(struct msi_irqs *msi, int nvec)
> +{
> +	struct msi_desc *entry;
> +	int ret;
> +	unsigned mask;
> +
> +	msi_set_enable(msi, 0, MSI_TYPE);	/* Disable MSI during set up */
> +
> +	/* MSI Entry Initialization */
> +	entry = msi_setup_entry(msi);
> +	if (!entry)
> +		return -ENOMEM;
> +
> +	/* All MSIs are unmasked by default, Mask them all */

Will this be true for non-PCI devices as well?
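
For context, going only by the generic default_msi_mask_irq() quoted earlier
in this patch: when a device supplies no msi_mask_irq hook, the call simply
returns the cached desc->masked value and touches no hardware. A small
userspace model of that path follows. Assumptions: the structs are cut down,
the arch_msi_mask_irq() indirection is skipped, and demo_mask_irq() is a
made-up hook for a device that does have mask bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;

struct msi_desc;

struct msi_ops {
	u32 (*msi_mask_irq)(struct msi_desc *desc, u32 mask, u32 flag);
};

struct msi_irqs {
	struct msi_ops *ops;
};

struct msi_desc {
	u32 masked;		/* cached mask bits */
	struct msi_irqs *msi;
};

/* Same arithmetic as msi_mask() in the patch; 1u keeps the shift unsigned */
static u32 msi_mask(unsigned x)
{
	/* Don't shift by >= width of type */
	if (x >= 5)
		return 0xffffffff;
	return (1u << (1u << x)) - 1;
}

static u32 default_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
{
	struct msi_irqs *msi = desc->msi;

	/* No hook: report the cached state, nothing is written */
	if (!msi || !msi->ops || !msi->ops->msi_mask_irq)
		return desc->masked;
	return msi->ops->msi_mask_irq(desc, mask, flag);
}

static void msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
{
	desc->masked = default_msi_mask_irq(desc, mask, flag);
}

/* Hypothetical hook: clear the bits in mask, then set the bits in flag */
static u32 demo_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
{
	return (desc->masked & ~mask) | flag;
}
```

So in this model a device without the hook stays "unmasked" as far as the
cached state is concerned, and the "Mask them all" step is a no-op for it.
Whether that is the intended behaviour for non-PCI devices is exactly the
open question.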

Thanks
-Bharat

> +	mask = msi_mask(entry->msi_attrib.multi_cap);
> +	msi_mask_irq(entry, mask, mask);
> +
> +	/* Configure MSI capability structure */
> +	ret = arch_setup_msi_irqs(msi, nvec, MSI_TYPE);
> +	if (ret)
> +		goto err;
> +
> +	/* Set MSI enabled bits	 */
> +	msi_set_intx(msi, 0);
> +	msi_set_enable(msi, 1, MSI_TYPE);
> +	msi->msi_enabled = 1;
> +
> +	return 0;
> +
> +err:
> +	msi_mask_irq(entry, mask, ~mask);
> +	free_msi_irqs(msi);
> +	return ret;
> +}
> +
> +static void msix_program_entries(struct msi_irqs *msi,
> +				 struct msix_entry *entries)
> +{
> +	struct msi_desc *entry;
> +	int i = 0;
> +
> +	list_for_each_entry(entry, &msi->msi_list, list) {
> +		entries[i].vector = entry->irq;
> +		irq_set_msi_desc(entry->irq, entry);
> +		i++;
> +	}
> +}
> +
> +/**
> + * msix_capability_init - configure device's MSI-X capability
> + * @dev: pointer to the pci_dev data structure of MSI-X device function
> + * @entries: pointer to an array of struct msix_entry entries
> + * @nvec: number of @entries
> + *
> + * Setup the MSI-X capability structure of device function with a
> + * single MSI-X irq. A return of zero indicates the successful setup of
> + * requested MSI-X entries with allocated irqs or non-zero for otherwise.
> + **/
> +int msix_capability_init(struct msi_irqs *msi, void __iomem *base,
> +				struct msix_entry *entries, int nvec)
> +{
> +	int ret;
> +
> +	/* Ensure MSI-X is disabled while it is set up */
> +	msi_set_enable(msi, 0, MSIX_TYPE);
> +
> +	ret = msix_setup_entries(msi, base, entries, nvec);
> +	if (ret)
> +		return ret;
> +
> +	ret = arch_setup_msi_irqs(msi, nvec, MSIX_TYPE);
> +	if (ret)
> +		goto out_avail;
> +
> +	msix_program_entries(msi, entries);
> +
> +	/* Set MSI-X enabled bits and unmask the function */
> +	msi_set_intx(msi, 0);
> +	msi->msix_enabled = 1;
> +
> +	msi_set_enable(msi, 1, MSIX_TYPE);
> +
> +	return 0;
> +
> +out_avail:
> +	if (ret < 0) {
> +		/*
> +		 * If we had some success, report the number of irqs
> +		 * we succeeded in setting up.
> +		 */
> +		struct msi_desc *entry;
> +		int avail = 0;
> +
> +		list_for_each_entry(entry, &msi->msi_list, list) {
> +			if (entry->irq != 0)
> +				avail++;
> +		}
> +		if (avail != 0)
> +			ret = avail;
> +	}
> +
> +	free_msi_irqs(msi);
> +
> +	return ret;
> +}
> +
> diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
> index 893503f..1a10488 100644
> --- a/drivers/pci/Kconfig
> +++ b/drivers/pci/Kconfig
> @@ -2,10 +2,10 @@
>  # PCI configuration
>  #
>  config PCI_MSI
> -	bool "Message Signaled Interrupts (MSI and MSI-X)"
> -	depends on PCI
> +	bool "PCI Message Signaled Interrupts (MSI and MSI-X)"
> +	depends on PCI && MSI
>  	help
> -	   This allows device drivers to enable MSI (Message Signaled
> +	   This allows PCI device drivers to enable MSI (Message Signaled
>  	   Interrupts).  Message Signaled Interrupts enable a device to
>  	   generate an interrupt using an inbound Memory Write on its
>  	   PCI bus instead of asserting a device IRQ pin.
> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index f0c5989..df7223c 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -26,121 +26,8 @@ static int pci_msi_enable = 1;
> 
>  #define msix_table_size(flags)	((flags & PCI_MSIX_FLAGS_QSIZE) + 1)
> 
> -
> -/* Arch hooks */
> -
> -int __weak arch_setup_msi_irq(struct msi_irqs *msi, struct msi_desc *desc)
> -{
> -	struct pci_dev *dev = msi->data; //TO BE DONE: rework msi_chip to support Non-PCI
> -	struct msi_chip *chip = dev->bus->msi;
> -	int err;
> -
> -	if (!chip || !chip->setup_irq)
> -		return -EINVAL;
> -
> -	err = chip->setup_irq(chip, dev, desc);
> -	if (err < 0)
> -		return err;
> -
> -	irq_set_chip_data(desc->irq, chip);
> -
> -	return 0;
> -}
> -
> -void __weak arch_teardown_msi_irq(unsigned int irq)
> -{
> -	struct msi_chip *chip = irq_get_chip_data(irq);
> -
> -	if (!chip || !chip->teardown_irq)
> -		return;
> -
> -	chip->teardown_irq(chip, irq);
> -}
> -
> -int __weak arch_msi_check_device(struct msi_irqs *msi, int nvec, int type)
> -{
> -	struct pci_dev *dev = msi->data; //TO BE DONE: rework msi_chip to support Non-PCI
> -	struct msi_chip *chip = dev->bus->msi;
> -
> -	if (!chip || !chip->check_device)
> -		return 0;
> -
> -	return chip->check_device(chip, dev, nvec, type);
> -}
> -
> -int __weak arch_setup_msi_irqs(struct msi_irqs *msi, int nvec, int type)
> -{
> -	struct msi_desc *entry;
> -	int ret;
> -
> -	/*
> -	 * If an architecture wants to support multiple MSI, it needs to
> -	 * override arch_setup_msi_irqs()
> -	 */
> -	if (type == MSI_TYPE && nvec > 1)
> -		return 1;
> -
> -	list_for_each_entry(entry, &msi->msi_list, list) {
> -		ret = arch_setup_msi_irq(msi, entry);
> -		if (ret < 0)
> -			return ret;
> -		if (ret > 0)
> -			return -ENOSPC;
> -	}
> -
> -	return 0;
> -}
> -
> -/*
> - * We have a default implementation available as a separate non-weak
> - * function, as it is used by the Xen x86 PCI code
> - */
> -void default_teardown_msi_irqs(struct msi_irqs *msi)
> -{
> -	struct msi_desc *entry;
> -
> -	list_for_each_entry(entry, &msi->msi_list, list) {
> -		int i, nvec;
> -		if (entry->irq == 0)
> -			continue;
> -		if (entry->nvec_used)
> -			nvec = entry->nvec_used;
> -		else
> -			nvec = 1 << entry->msi_attrib.multiple;
> -		for (i = 0; i < nvec; i++)
> -			arch_teardown_msi_irq(entry->irq + i);
> -	}
> -}
> -
> -void __weak arch_teardown_msi_irqs(struct msi_irqs *msi)
> -{
> -	return default_teardown_msi_irqs(msi);
> -}
> -
> -static void default_restore_msi_irq(struct msi_irqs *msi, int irq)
> -{
> -	struct msi_desc *entry;
> -
> -	entry = NULL;
> -	if (msi->msix_enabled) {
> -		list_for_each_entry(entry, &msi->msi_list, list) {
> -			if (irq == entry->irq)
> -				break;
> -		}
> -	} else if (msi->msi_enabled)  {
> -		entry = irq_get_msi_desc(irq);
> -	}
> -
> -	if (entry)
> -		write_msi_msg(irq, &entry->msg);
> -}
> -
> -void __weak arch_restore_msi_irqs(struct msi_irqs *msi)
> -{
> -	return default_restore_msi_irqs(msi);
> -}
> -
> -static void msix_clear_and_set_ctrl(struct pci_dev *dev, u16 clear, u16 set)
> +static void msix_clear_and_set_ctrl(struct pci_dev *dev,
> +		u16 clear, u16 set)
>  {
>  	u16 ctrl;
> 
> @@ -150,7 +37,7 @@ static void msix_clear_and_set_ctrl(struct pci_dev *dev, u16 clear, u16 set)
>  	pci_write_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, ctrl);
>  }
> 
> -static void msi_set_enable(struct msi_irqs *msi, int enable, int type)
> +static void pci_msi_set_enable(struct msi_irqs *msi, int enable, int type)
>  {
>  	u16 control;
>  	struct pci_dev *dev = msi->data;
> @@ -169,21 +56,13 @@ static void msi_set_enable(struct msi_irqs *msi, int enable, int type)
>  	}
>  }
> 
> -static inline __attribute_const__ u32 msi_mask(unsigned x)
> -{
> -	/* Don't shift by >= width of type */
> -	if (x >= 5)
> -		return 0xffffffff;
> -	return (1 << (1 << x)) - 1;
> -}
> -
>  /*
>   * PCI 2.3 does not specify mask bits for each MSI interrupt.  Attempting to
>   * mask all MSI interrupts by clearing the MSI enable bit does not work
>   * reliably as devices without an INTx disable bit will then generate a
>   * level IRQ which will never be cleared.
>   */
> -u32 default_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
> +u32 pci_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
>  {
>  	struct pci_dev *dev = desc->msi->data;
>  	u32 mask_bits = desc->masked;
> @@ -198,16 +77,6 @@ u32 default_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
>  	return mask_bits;
>  }
> 
> -__weak u32 arch_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
> -{
> -	return default_msi_mask_irq(desc, mask, flag);
> -}
> -
> -static void msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
> -{
> -	desc->masked = arch_msi_mask_irq(desc, mask, flag);
> -}
> -
>  /*
>   * This internal function does not flush PCI writes to the device.
>   * All users must ensure that they read from the device before either
> @@ -215,7 +84,7 @@ static void msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
>   * file.  This saves a few milliseconds when initialising devices with lots
>   * of MSI-X interrupts.
>   */
> -u32 default_msix_mask_irq(struct msi_desc *desc, u32 flag)
> +u32 pci_msix_mask_irq(struct msi_desc *desc, u32 flag)
>  {
>  	u32 mask_bits = desc->masked;
>  	unsigned offset = desc->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE +
> @@ -228,40 +97,7 @@ u32 default_msix_mask_irq(struct msi_desc *desc, u32 flag)
>  	return mask_bits;
>  }
> 
> -__weak u32 arch_msix_mask_irq(struct msi_desc *desc, u32 flag)
> -{
> -	return default_msix_mask_irq(desc, flag);
> -}
> -
> -static void msix_mask_irq(struct msi_desc *desc, u32 flag)
> -{
> -	desc->masked = arch_msix_mask_irq(desc, flag);
> -}
> -
> -static void msi_set_mask_bit(struct irq_data *data, u32 flag)
> -{
> -	struct msi_desc *desc = irq_data_get_msi(data);
> -
> -	if (desc->msi_attrib.is_msix) {
> -		msix_mask_irq(desc, flag);
> -		readl(desc->mask_base);		/* Flush write to device */
> -	} else {
> -		unsigned offset = data->irq - desc->irq;
> -		msi_mask_irq(desc, 1 << offset, flag << offset);
> -	}
> -}
> -
> -void mask_msi_irq(struct irq_data *data)
> -{
> -	msi_set_mask_bit(data, 1);
> -}
> -
> -void unmask_msi_irq(struct irq_data *data)
> -{
> -	msi_set_mask_bit(data, 0);
> -}
> -
> -static void msix_set_all_mask(struct msi_irqs *msi, int flag)
> +static void pci_msix_set_all_mask(struct msi_irqs *msi, int flag)
>  {
>  	struct pci_dev *dev = msi->data;
> 
> @@ -271,16 +107,7 @@ static void msix_set_all_mask(struct msi_irqs *msi, int flag)
>  		msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
>  }
> 
> -void default_restore_msi_irqs(struct msi_irqs *msi)
> -{
> -	struct msi_desc *entry;
> -
> -	list_for_each_entry(entry, &msi->msi_list, list) {
> -		default_restore_msi_irq(msi, entry->irq);
> -	}
> -}
> -
> -void __read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
> +void pci_read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
>  {
>  	struct pci_dev *dev = entry->msi->data;
> 
> @@ -311,31 +138,7 @@ void __read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
>  	}
>  }
> 
> -void read_msi_msg(unsigned int irq, struct msi_msg *msg)
> -{
> -	struct msi_desc *entry = irq_get_msi_desc(irq);
> -
> -	__read_msi_msg(entry, msg);
> -}
> -
> -void __get_cached_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
> -{
> -	/* Assert that the cache is valid, assuming that
> -	 * valid messages are not all-zeroes. */
> -	BUG_ON(!(entry->msg.address_hi | entry->msg.address_lo |
> -		 entry->msg.data));
> -
> -	*msg = entry->msg;
> -}
> -
> -void get_cached_msi_msg(unsigned int irq, struct msi_msg *msg)
> -{
> -	struct msi_desc *entry = irq_get_msi_desc(irq);
> -
> -	__get_cached_msi_msg(entry, msg);
> -}
> -
> -void __write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
> +void pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
>  {
>  	struct pci_dev *dev = entry->msi->data;
> 
> @@ -373,13 +176,6 @@ void __write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
>  	entry->msg = *msg;
>  }
> 
> -void write_msi_msg(unsigned int irq, struct msi_msg *msg)
> -{
> -	struct msi_desc *entry = irq_get_msi_desc(irq);
> -
> -	__write_msi_msg(entry, msg);
> -}
> -
>  static void free_msi_sysfs(struct pci_dev *dev)
>  {
>  	struct attribute **msi_attrs;
> @@ -403,58 +199,6 @@ static void free_msi_sysfs(struct pci_dev *dev)
>  	}
>  }
> 
> -static void free_msi_irqs(struct msi_irqs *msi)
> -{
> -	struct msi_desc *entry, *tmp;
> -
> -	list_for_each_entry(entry, &msi->msi_list, list) {
> -		int i, nvec;
> -		if (!entry->irq)
> -			continue;
> -		if (entry->nvec_used)
> -			nvec = entry->nvec_used;
> -		else
> -			nvec = 1 << entry->msi_attrib.multiple;
> -		for (i = 0; i < nvec; i++)
> -			BUG_ON(irq_has_action(entry->irq + i));
> -	}
> -
> -	arch_teardown_msi_irqs(msi);
> -
> -	list_for_each_entry_safe(entry, tmp, &msi->msi_list, list) {
> -		if (entry->msi_attrib.is_msix) {
> -			if (list_is_last(&entry->list, &msi->msi_list))
> -				iounmap(entry->mask_base);
> -		}
> -
> -		/*
> -		 * Its possible that we get into this path
> -		 * When populate_msi_sysfs fails, which means the entries
> -		 * were not registered with sysfs.  In that case don't
> -		 * unregister them.
> -		 */
> -		if (entry->kobj.parent) {
> -			kobject_del(&entry->kobj);
> -			kobject_put(&entry->kobj);
> -		}
> -
> -		list_del(&entry->list);
> -		kfree(entry);
> -	}
> -}
> -
> -static struct msi_desc *alloc_msi_entry(struct msi_irqs *msi)
> -{
> -	struct msi_desc *desc = kzalloc(sizeof(*desc), GFP_KERNEL);
> -	if (!desc)
> -		return NULL;
> -
> -	INIT_LIST_HEAD(&desc->list);
> -	desc->msi = msi;
> -
> -	return desc;
> -}
> -
>  static void pci_intx_for_msi(struct msi_irqs *msi, int enable)
>  {
>  	struct pci_dev *dev = msi->data;
> @@ -474,7 +218,7 @@ static void __pci_restore_msi_state(struct pci_dev *dev)
>  	entry = irq_get_msi_desc(dev->irq);
> 
>  	pci_intx_for_msi(dev->msi, 0);
> -	msi_set_enable(dev->msi, 0, MSI_TYPE);
> +	pci_msi_set_enable(dev->msi, 0, MSI_TYPE);
>  	arch_restore_msi_irqs(dev->msi);
> 
>  	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
> @@ -496,13 +240,13 @@ static void __pci_restore_msix_state(struct pci_dev *dev)
> 
>  	/* route the table */
>  	pci_intx_for_msi(msi, 0);
> -	msi_set_enable(msi, 1, MSIX_TYPE);
> -	msix_set_all_mask(msi, 1);
> +	pci_msi_set_enable(msi, 1, MSIX_TYPE);
> +	pci_msix_set_all_mask(msi, 1);
>  	arch_restore_msi_irqs(msi);
>  	list_for_each_entry(entry, &msi->msi_list, list)
>  		msix_mask_irq(entry, entry->masked);
> 
> -	msix_set_all_mask(msi, 0);
> +	pci_msix_set_all_mask(msi, 0);
>  }
> 
>  void pci_restore_msi_state(struct pci_dev *dev)
> @@ -606,22 +350,16 @@ error_attrs:
>  	return ret;
>  }
> 
> -static struct msi_desc *msi_setup_entry(struct msi_irqs *msi)
> +static struct msi_desc *pci_msi_setup_entry(struct msi_irqs *msi,
> +		struct msi_desc *entry)
>  {
>  	u16 control;
> -	struct msi_desc *entry;
>  	struct pci_dev *dev = msi->data;
> 
>  	/* MSI Entry Initialization */
> -	entry = alloc_msi_entry(msi);
> -	if (!entry)
> -		return NULL;
> -
>  	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
> 
> -	entry->msi_attrib.is_msix	= 0;
>  	entry->msi_attrib.is_64		= !!(control & PCI_MSI_FLAGS_64BIT);
> -	entry->msi_attrib.entry_nr	= 0;
>  	entry->msi_attrib.maskbit	= !!(control & PCI_MSI_FLAGS_MASKBIT);
>  	entry->msi_attrib.default_irq	= dev->irq;	/* Save IOAPIC IRQ */
>  	entry->msi_attrib.multi_cap	= (control & PCI_MSI_FLAGS_QMASK) >> 1;
> @@ -638,52 +376,6 @@ static struct msi_desc *msi_setup_entry(struct msi_irqs *msi)
>  	return entry;
>  }
> 
> -/**
> - * msi_capability_init - configure device's MSI capability structure
> - * @dev: pointer to the pci_dev data structure of MSI device function
> - * @nvec: number of interrupts to allocate
> - *
> - * Setup the MSI capability structure of the device with the requested
> - * number of interrupts.  A return value of zero indicates the successful
> - * setup of an entry with the new MSI irq.  A negative return value indicates
> - * an error, and a positive return value indicates the number of interrupts
> - * which could have been allocated.
> - */
> -static int msi_capability_init(struct msi_irqs *msi, int nvec)
> -{
> -	struct msi_desc *entry;
> -	int ret;
> -	unsigned mask;
> -
> -	msi_set_enable(msi, 0, MSI_TYPE);	/* Disable MSI during set up */
> -
> -	entry = msi_setup_entry(msi);
> -	if (!entry)
> -		return -ENOMEM;
> -
> -	/* All MSIs are unmasked by default, Mask them all */
> -	mask = msi_mask(entry->msi_attrib.multi_cap);
> -	msi_mask_irq(entry, mask, mask);
> -
> -	list_add_tail(&entry->list, &msi->msi_list);
> -
> -	/* Configure MSI capability structure */
> -	ret = arch_setup_msi_irqs(msi, nvec, MSI_TYPE);
> -	if (ret)
> -		goto err;
> -
> -	/* Set MSI enabled bits	 */
> -	pci_intx_for_msi(msi, 0);
> -	msi_set_enable(msi, 1, MSI_TYPE);
> -	msi->msi_enabled = 1;
> -	return 0;
> -
> -err:
> -	msi_mask_irq(entry, mask, ~mask);
> -	free_msi_irqs(msi);
> -	return ret;
> -}
> -
>  static void __iomem *msix_map_region(struct pci_dev *dev, unsigned nr_entries)
>  {
>  	resource_size_t phys_addr;
> @@ -699,28 +391,19 @@ static void __iomem *msix_map_region(struct pci_dev *dev, unsigned nr_entries)
>  	return ioremap_nocache(phys_addr, nr_entries * PCI_MSIX_ENTRY_SIZE);
>  }
> 
> -static int msix_setup_entries(struct msi_irqs *msi, void __iomem *base,
> -			      struct msix_entry *entries, int nvec)
> +static int pci_msix_setup_entries(struct msi_irqs *msi, struct msix_entry *entries)
>  {
> +	int offset, i = 0;
>  	struct msi_desc *entry;
> -	int i, offset;
>  	struct pci_dev *dev = msi->data;
> 
> -	for (i = 0; i < nvec; i++) {
> -		entry = alloc_msi_entry(msi);
> -		if (!entry) {
> -			if (!i)
> -				iounmap(base);
> -			else
> -				free_msi_irqs(msi);
> -			/* No enough memory. Don't try again */
> -			return -ENOMEM;
> -		}
> 
> -		entry->msi_attrib.is_msix	= 1;
> -		entry->msi_attrib.is_64		= 1;
> -		entry->msi_attrib.entry_nr	= entries[i].entry;
> -		entry->mask_base		= base;
> +	list_for_each_entry(entry, &msi->msi_list, list) {
> +		/*
> +		 * Some devices require MSI-X to be enabled before we can touch the
> +		 * MSI-X registers.  We need to mask all the vectors to prevent
> +		 * interrupts coming in before they're fully set up.
> +		 */
> 
>  		msix_clear_and_set_ctrl(dev, 0,
>  				PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE);
> @@ -730,87 +413,10 @@ static int msix_setup_entries(struct msi_irqs *msi, void __iomem *base,
>  		msix_mask_irq(entry, 1);
>  		msix_clear_and_set_ctrl(dev,
>  				PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE, 0);
> -
> -		list_add_tail(&entry->list, &msi->msi_list);
> -	}
> -
> -	return 0;
> -}
> -
> -static void msix_program_entries(struct msi_irqs *msi,
> -				 struct msix_entry *entries)
> -{
> -	struct msi_desc *entry;
> -	int i = 0;
> -
> -	list_for_each_entry(entry, &msi->msi_list, list) {
> -		entries[i].vector = entry->irq;
> -		irq_set_msi_desc(entry->irq, entry);
>  		i++;
>  	}
> -}
> -
> -/**
> - * msix_capability_init - configure device's MSI-X capability
> - * @dev: pointer to the pci_dev data structure of MSI-X device function
> - * @entries: pointer to an array of struct msix_entry entries
> - * @nvec: number of @entries
> - *
> - * Setup the MSI-X capability structure of device function with a
> - * single MSI-X irq. A return of zero indicates the successful setup of
> - * requested MSI-X entries with allocated irqs or non-zero for otherwise.
> - **/
> -static int msix_capability_init(struct msi_irqs *msi, void __iomem *base,
> -				struct msix_entry *entries, int nvec)
> -{
> -	int ret;
> -
> -	/* Ensure MSI-X is disabled while it is set up */
> -	msi_set_enable(msi, 0, MSIX_TYPE);
> -
> -	ret = msix_setup_entries(msi, base, entries, nvec);
> -	if (ret)
> -		return ret;
> -
> -	ret = arch_setup_msi_irqs(msi, nvec, MSIX_TYPE);
> -	if (ret)
> -		goto out_avail;
> -
> -	/*
> -	 * Some devices require MSI-X to be enabled before we can touch the
> -	 * MSI-X registers.  We need to mask all the vectors to prevent
> -	 * interrupts coming in before they're fully set up.
> -	 */
> -	msix_program_entries(msi, entries);
> -
> -	/* Set MSI-X enabled bits and unmask the function */
> -	pci_intx_for_msi(msi, 0);
> -	msi->msix_enabled = 1;
> -
> -	msi_set_enable(msi, 1, MSIX_TYPE);
> 
>  	return 0;
> -
> -out_avail:
> -	if (ret < 0) {
> -		/*
> -		 * If we had some success, report the number of irqs
> -		 * we succeeded in setting up.
> -		 */
> -		struct msi_desc *entry;
> -		int avail = 0;
> -
> -		list_for_each_entry(entry, &msi->msi_list, list) {
> -			if (entry->irq != 0)
> -				avail++;
> -		}
> -		if (avail != 0)
> -			ret = avail;
> -	}
> -
> -	free_msi_irqs(msi);
> -
> -	return ret;
>  }
> 
>  /**
> @@ -886,25 +492,14 @@ EXPORT_SYMBOL(pci_msi_vec_count);
>  void pci_msi_shutdown(struct pci_dev *dev)
>  {
>  	struct msi_desc *desc;
> -	u32 mask;
> 
>  	if (!pci_msi_enable || !dev ||
>  			!pci_dev_msi_enabled(dev, MSI_TYPE))
>  		return;
> 
> -	BUG_ON(list_empty(&dev->msi->msi_list));
> -	desc = list_first_entry(&dev->msi->msi_list, struct msi_desc, list);
> -
> -	msi_set_enable(dev->msi, 0, MSI_TYPE);
> -	pci_intx_for_msi(dev->msi, 1);
> -	dev->msi->msi_enabled = 0;
> -
> -	/* Return the device with MSI unmasked as initial states */
> -	mask = msi_mask(desc->msi_attrib.multi_cap);
> -	/* Keep cached state to be restored */
> -	arch_msi_mask_irq(desc, mask, ~mask);
> -
> +	msi_shutdown(dev->msi);
>  	/* Restore dev->irq to its default pin-assertion irq */
> +	desc = list_first_entry(&dev->msi->msi_list, struct msi_desc, list);
>  	dev->irq = desc->msi_attrib.default_irq;
>  }
> 
> @@ -1014,20 +609,10 @@ EXPORT_SYMBOL(pci_enable_msix);
> 
>  void pci_msix_shutdown(struct pci_dev *dev)
>  {
> -	struct msi_desc *entry;
> -
> -	if (!pci_msi_enable || !dev || !pci_dev_msi_enabled(dev, MSIX_TYPE))
> +	if (!pci_msi_enable || !dev)
>  		return;
> 
> -	/* Return the device with MSI-X masked as initial states */
> -	list_for_each_entry(entry, &dev->msi->msi_list, list) {
> -		/* Keep cached states to be restored */
> -		arch_msix_mask_irq(entry, 1);
> -	}
> -
> -	msi_set_enable(dev->msi, 0, MSIX_TYPE);
> -	pci_intx_for_msi(dev->msi, 1);
> -	dev->msi->msix_enabled = 0;
> +	msix_shutdown(dev->msi);
>  }
> 
>  void pci_disable_msix(struct pci_dev *dev)
> @@ -1060,30 +645,16 @@ int pci_msi_enabled(void)
>  EXPORT_SYMBOL(pci_msi_enabled);
> 
>  static struct msi_ops pci_msi = {
> -	.msi_set_enable = msi_set_enable,
> -	.msi_setup_entry = msi_setup_entry,
> -	.msix_setup_entries = msix_setup_entries,
> -	.msi_mask_irq = default_msi_mask_irq,
> -	.msix_mask_irq = default_msix_mask_irq,
> -	.msi_read_message = __read_msi_msg,
> -	.msi_write_message = __write_msi_msg,
> +	.msi_set_enable = pci_msi_set_enable,
> +	.msi_setup_entry = pci_msi_setup_entry,
> +	.msix_setup_entries = pci_msix_setup_entries,
> +	.msi_mask_irq = pci_msi_mask_irq,
> +	.msix_mask_irq = pci_msix_mask_irq,
> +	.msi_read_message = pci_read_msi_msg,
> +	.msi_write_message = pci_write_msi_msg,
>  	.msi_set_intx =  pci_intx_for_msi,
>  };
> 
> -struct msi_irqs *alloc_msi_irqs(void *data, struct msi_ops *ops)
> -{
> -	struct msi_irqs *msi;
> -
> -	msi = kzalloc(sizeof(struct msi_irqs), GFP_KERNEL);
> -	if (!msi)
> -		return NULL;
> -
> -	INIT_LIST_HEAD(&msi->msi_list);
> -	msi->data = data;
> -	msi->ops = ops;
> -	return msi;
> -}
> -
>  void pci_msi_init_pci_dev(struct pci_dev *dev)
>  {
>  	/* Disable the msi hardware to avoid screaming interrupts
> @@ -1100,10 +671,10 @@ void pci_msi_init_pci_dev(struct pci_dev *dev)
> 
>  		dev->msi->node = dev_to_node(&dev->dev);
>  		if (dev->msi_cap)
> -			msi_set_enable(dev->msi, 0, MSI_TYPE);
> +			pci_msi_set_enable(dev->msi, 0, MSI_TYPE);
> 
>  		if (dev->msix_cap)
> -			msi_set_enable(dev->msi, 0, MSIX_TYPE);
> +			pci_msi_set_enable(dev->msi, 0, MSIX_TYPE);
>  	}
>  }
> 
> @@ -1224,4 +795,3 @@ int pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries,
>  }
>  EXPORT_SYMBOL(pci_enable_msix_range);
> 
> -
> diff --git a/include/linux/msi.h b/include/linux/msi.h
> index fc8f3e8..87ed0dd 100644
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -28,9 +28,9 @@ struct msix_entry {
> 
>  struct msi_ops {
>  	void (*msi_set_enable)(struct msi_irqs *msi, int enable, int type);
> -	struct msi_desc *(*msi_setup_entry)(struct msi_irqs *msi);
> -	int (*msix_setup_entries)(struct msi_irqs *msi, void __iomem *base,
> -			struct msix_entry *entries, int nvec);
> +	struct msi_desc *(*msi_setup_entry)(struct msi_irqs *msi,
> +			struct msi_desc *entry);
> +	int (*msix_setup_entries)(struct msi_irqs *msi, struct msix_entry *entries);
>  	u32 (*msi_mask_irq)(struct msi_desc *desc, u32 mask, u32 flag);
>  	u32 (*msix_mask_irq)(struct msi_desc *desc, u32 flag);
>  	void (*msi_read_message)(struct msi_desc *desc, struct msi_msg *msg);
> @@ -49,6 +49,18 @@ void __write_msi_msg(struct msi_desc *entry, struct msi_msg *msg);
>  void read_msi_msg(unsigned int irq, struct msi_msg *msg);
>  void get_cached_msi_msg(unsigned int irq, struct msi_msg *msg);
>  void write_msi_msg(unsigned int irq, struct msi_msg *msg);
> +struct msi_desc *alloc_msi_entry(struct msi_irqs *msi);
> +void msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag);
> +void msix_mask_irq(struct msi_desc *desc, u32 flag);
> +void msi_set_enable(struct msi_irqs *msi, int enable, int type);
> +
> +struct msi_irqs *alloc_msi_irqs(void *data, struct msi_ops *ops);
> +
> +void free_msi_irqs(struct msi_irqs *msi);
> +
> +int msi_capability_init(struct msi_irqs *msi, int nvec);
> +int msix_capability_init(struct msi_irqs *msi, void __iomem *base,
> +		struct msix_entry *entries, int nvec);
> 
>  struct msi_desc {
>  	struct {
> @@ -89,12 +101,17 @@ int arch_setup_msi_irqs(struct msi_irqs *msi, int nvec, int type);
>  void arch_teardown_msi_irqs(struct msi_irqs *msi);
>  int arch_msi_check_device(struct msi_irqs *msi, int nvec, int type);
>  void arch_restore_msi_irqs(struct msi_irqs *msi);
> +u32 arch_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag);
> +u32 arch_msix_mask_irq(struct msi_desc *desc, u32 flag);
> 
>  void default_teardown_msi_irqs(struct msi_irqs *msi);
>  void default_restore_msi_irqs(struct msi_irqs *msi);
>  u32 default_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag);
>  u32 default_msix_mask_irq(struct msi_desc *desc, u32 flag);
> 
> +void msi_shutdown(struct msi_irqs *msi);
> +void msix_shutdown(struct msi_irqs *msi);
> +
>  #define MSI_TYPE	0x01
>  #define MSIX_TYPE	0x02
> 
> @@ -111,4 +128,12 @@ struct msi_chip {
>  			    int nvec, int type);
>  };
> 
> +static inline __attribute_const__ u32 msi_mask(unsigned x)
> +{
> +	/* Don't shift by >= width of type */
> +	if (x >= 5)
> +		return 0xffffffff;
> +	return (1 << (1 << x)) - 1;
> +}
> +
>  #endif /* LINUX_MSI_H */
> --
> 1.7.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* RE: [RFC PATCH 11/11] x86/MSI: Refactor x86 MSI code
  2014-07-26  3:08 ` [RFC PATCH 11/11] x86/MSI: Refactor x86 MSI code Yijing Wang
@ 2014-08-20  6:20   ` Bharat.Bhushan
  2014-08-20  7:01     ` Yijing Wang
  0 siblings, 1 reply; 41+ messages in thread
From: Bharat.Bhushan @ 2014-08-20  6:20 UTC (permalink / raw)
  To: Yijing Wang, linux-kernel
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, arnab.basu, virtualization, Hanjun Guo



> -----Original Message-----
> From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-owner@vger.kernel.org]
> On Behalf Of Yijing Wang
> Sent: Saturday, July 26, 2014 8:39 AM
> To: linux-kernel@vger.kernel.org
> Cc: Xinwei Hu; Wuyun; Bjorn Helgaas; linux-pci@vger.kernel.org;
> Paul.Mundt@huawei.com; James E.J. Bottomley; Marc Zyngier; linux-arm-
> kernel@lists.infradead.org; Russell King; linux-arch@vger.kernel.org; Basu
> Arnab-B45036; virtualization@lists.linux-foundation.org; Hanjun Guo; Yijing Wang
> Subject: [RFC PATCH 11/11] x86/MSI: Refactor x86 MSI code

Please describe what this refactoring does. Do other architectures also need similar refactoring?

Thanks
-Bharat

> 
> Signed-off-by: Yijing Wang <wangyijing@huawei.com>
> ---
>  arch/x86/include/asm/io_apic.h       |    2 +-
>  arch/x86/include/asm/irq_remapping.h |    4 +-
>  arch/x86/include/asm/pci.h           |    6 ++--
>  arch/x86/include/asm/x86_init.h      |   10 +++---
>  arch/x86/kernel/apic/io_apic.c       |   23 +++++++--------
>  arch/x86/kernel/x86_init.c           |   12 ++++----
>  drivers/iommu/amd_iommu.c            |   16 ++++++----
>  drivers/iommu/intel_irq_remapping.c  |    9 ++++--
>  drivers/iommu/irq_remapping.c        |   51 ++++++++++++++++-----------------
>  drivers/iommu/irq_remapping.h        |    6 ++--
>  drivers/msi/msi.c                    |    3 +-
>  11 files changed, 72 insertions(+), 70 deletions(-)
> 
> diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
> index 90f97b4..692a90f 100644
> --- a/arch/x86/include/asm/io_apic.h
> +++ b/arch/x86/include/asm/io_apic.h
> @@ -158,7 +158,7 @@ extern int native_setup_ioapic_entry(int, struct IO_APIC_route_entry *,
>  				     struct io_apic_irq_attr *);
>  extern void eoi_ioapic_irq(unsigned int irq, struct irq_cfg *cfg);
> 
> -extern void native_compose_msi_msg(struct pci_dev *pdev,
> +extern void native_compose_msi_msg(struct msi_irqs *msi,
>  				   unsigned int irq, unsigned int dest,
>  				   struct msi_msg *msg, u8 hpet_id);
>  extern void native_eoi_ioapic_pin(int apic, int pin, int vector);
> diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
> index b7747c4..a10003d 100644
> --- a/arch/x86/include/asm/irq_remapping.h
> +++ b/arch/x86/include/asm/irq_remapping.h
> @@ -47,7 +47,7 @@ extern int setup_ioapic_remapped_entry(int irq,
>  				       int vector,
>  				       struct io_apic_irq_attr *attr);
>  extern void free_remapped_irq(int irq);
> -extern void compose_remapped_msi_msg(struct pci_dev *pdev,
> +extern void compose_remapped_msi_msg(struct msi_irqs *msi,
>  				     unsigned int irq, unsigned int dest,
>  				     struct msi_msg *msg, u8 hpet_id);
>  extern int setup_hpet_msi_remapped(unsigned int irq, unsigned int id);
> @@ -77,7 +77,7 @@ static inline int setup_ioapic_remapped_entry(int irq,
>  	return -ENODEV;
>  }
>  static inline void free_remapped_irq(int irq) { }
> -static inline void compose_remapped_msi_msg(struct pci_dev *pdev,
> +static inline void compose_remapped_msi_msg(struct msi_irqs *msi,
>  					    unsigned int irq, unsigned int dest,
>  					    struct msi_msg *msg, u8 hpet_id)
>  {
> diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
> index 0892ea0..04c9ef6 100644
> --- a/arch/x86/include/asm/pci.h
> +++ b/arch/x86/include/asm/pci.h
> @@ -96,10 +96,10 @@ extern void pci_iommu_alloc(void);
>  #ifdef CONFIG_PCI_MSI
>  /* implemented in arch/x86/kernel/apic/io_apic. */
>  struct msi_desc;
> -int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type);
> +int native_setup_msi_irqs(struct msi_irqs *msi, int nvec, int type);
>  void native_teardown_msi_irq(unsigned int irq);
> -void native_restore_msi_irqs(struct pci_dev *dev);
> -int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
> +void native_restore_msi_irqs(struct msi_irqs *msi);
> +int setup_msi_irq(struct msi_irqs *msi, struct msi_desc *msidesc,
>  		  unsigned int irq_base, unsigned int irq_offset);
>  #else
>  #define native_setup_msi_irqs		NULL
> diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
> index e45e4da..8e42f17 100644
> --- a/arch/x86/include/asm/x86_init.h
> +++ b/arch/x86/include/asm/x86_init.h
> @@ -170,18 +170,18 @@ struct x86_platform_ops {
>  	void (*apic_post_init)(void);
>  };
> 
> -struct pci_dev;
> +struct msi_irqs;
>  struct msi_msg;
>  struct msi_desc;
> 
>  struct x86_msi_ops {
> -	int (*setup_msi_irqs)(struct pci_dev *dev, int nvec, int type);
> -	void (*compose_msi_msg)(struct pci_dev *dev, unsigned int irq,
> +	int (*setup_msi_irqs)(struct msi_irqs *msi, int nvec, int type);
> +	void (*compose_msi_msg)(struct msi_irqs *msi, unsigned int irq,
>  				unsigned int dest, struct msi_msg *msg,
>  			       u8 hpet_id);
>  	void (*teardown_msi_irq)(unsigned int irq);
> -	void (*teardown_msi_irqs)(struct pci_dev *dev);
> -	void (*restore_msi_irqs)(struct pci_dev *dev);
> +	void (*teardown_msi_irqs)(struct msi_irqs *msi);
> +	void (*restore_msi_irqs)(struct msi_irqs *msi);
>  	int  (*setup_hpet_msi)(unsigned int irq, unsigned int id);
>  	u32 (*msi_mask_irq)(struct msi_desc *desc, u32 mask, u32 flag);
>  	u32 (*msix_mask_irq)(struct msi_desc *desc, u32 flag);
> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> index b833042..3cb4a6a 100644
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -2939,7 +2939,7 @@ void arch_teardown_hwirq(unsigned int irq)
>  /*
>   * MSI message composition
>   */
> -void native_compose_msi_msg(struct pci_dev *pdev,
> +void native_compose_msi_msg(struct msi_irqs *msi,
>  			    unsigned int irq, unsigned int dest,
>  			    struct msi_msg *msg, u8 hpet_id)
>  {
> @@ -2970,7 +2970,7 @@ void native_compose_msi_msg(struct pci_dev *pdev,
>  }
> 
>  #ifdef CONFIG_PCI_MSI
> -static int msi_compose_msg(struct pci_dev *pdev, unsigned int irq,
> +static int msi_compose_msg(struct msi_irqs *msi, unsigned int irq,
>  			   struct msi_msg *msg, u8 hpet_id)
>  {
>  	struct irq_cfg *cfg;
> @@ -2990,7 +2990,7 @@ static int msi_compose_msg(struct pci_dev *pdev, unsigned int irq,
>  	if (err)
>  		return err;
> 
> -	x86_msi.compose_msi_msg(pdev, irq, dest, msg, hpet_id);
> +	x86_msi.compose_msi_msg(msi, irq, dest, msg, hpet_id);
> 
>  	return 0;
>  }
> @@ -3032,15 +3032,16 @@ static struct irq_chip msi_chip = {
>  	.irq_retrigger		= ioapic_retrigger_irq,
>  };
> 
> -int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
> +int setup_msi_irq(struct msi_irqs *msi, struct msi_desc *msidesc,
>  		  unsigned int irq_base, unsigned int irq_offset)
>  {
>  	struct irq_chip *chip = &msi_chip;
>  	struct msi_msg msg;
>  	unsigned int irq = irq_base + irq_offset;
>  	int ret;
> +	struct pci_dev *dev = msi->data;
> 
> -	ret = msi_compose_msg(dev, irq, &msg, -1);
> +	ret = msi_compose_msg(msi, irq, &msg, -1);
>  	if (ret < 0)
>  		return ret;
> 
> @@ -3062,24 +3063,22 @@ int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
>  	return 0;
>  }
> 
> -int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
> +int native_setup_msi_irqs(struct msi_irqs *msi, int nvec, int type)
>  {
>  	struct msi_desc *msidesc;
>  	unsigned int irq;
> -	int node, ret;
> +	int ret;
> 
>  	/* Multiple MSI vectors only supported with interrupt remapping */
>  	if (type == MSI_TYPE && nvec > 1)
>  		return 1;
> 
> -	node = dev_to_node(&dev->dev);
> -
> -	list_for_each_entry(msidesc, &dev->msi_list, list) {
> -		irq = irq_alloc_hwirq(node);
> +	list_for_each_entry(msidesc, &msi->msi_list, list) {
> +		irq = irq_alloc_hwirq(msi->node);
>  		if (!irq)
>  			return -ENOSPC;
> 
> -		ret = setup_msi_irq(dev, msidesc, irq, 0);
> +		ret = setup_msi_irq(msi, msidesc, irq, 0);
>  		if (ret < 0) {
>  			irq_free_hwirq(irq);
>  			return ret;
> diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
> index e48b674..a277faf 100644
> --- a/arch/x86/kernel/x86_init.c
> +++ b/arch/x86/kernel/x86_init.c
> @@ -121,14 +121,14 @@ struct x86_msi_ops x86_msi = {
>  };
> 
>  /* MSI arch specific hooks */
> -int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
> +int arch_setup_msi_irqs(struct msi_irqs *msi, int nvec, int type)
>  {
> -	return x86_msi.setup_msi_irqs(dev, nvec, type);
> +	return x86_msi.setup_msi_irqs(msi, nvec, type);
>  }
> 
> -void arch_teardown_msi_irqs(struct pci_dev *dev)
> +void arch_teardown_msi_irqs(struct msi_irqs *msi)
>  {
> -	x86_msi.teardown_msi_irqs(dev);
> +	x86_msi.teardown_msi_irqs(msi);
>  }
> 
>  void arch_teardown_msi_irq(unsigned int irq)
> @@ -136,9 +136,9 @@ void arch_teardown_msi_irq(unsigned int irq)
>  	x86_msi.teardown_msi_irq(irq);
>  }
> 
> -void arch_restore_msi_irqs(struct pci_dev *dev)
> +void arch_restore_msi_irqs(struct msi_irqs *msi)
>  {
> -	x86_msi.restore_msi_irqs(dev);
> +	x86_msi.restore_msi_irqs(msi);
>  }
>  u32 arch_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
>  {
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 4aec6a2..0e45cb7 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -4237,7 +4237,7 @@ static int free_irq(int irq)
>  	return 0;
>  }
> 
> -static void compose_msi_msg(struct pci_dev *pdev,
> +static void compose_msi_msg(struct msi_irqs *msi,
>  			    unsigned int irq, unsigned int dest,
>  			    struct msi_msg *msg, u8 hpet_id)
>  {
> @@ -4265,33 +4265,35 @@ static void compose_msi_msg(struct pci_dev *pdev,
>  	msg->data       = irte_info->index;
>  }
> 
> -static int msi_alloc_irq(struct pci_dev *pdev, int irq, int nvec)
> +static int msi_alloc_irq(struct msi_irqs *msi, int irq, int nvec)
>  {
>  	struct irq_cfg *cfg;
>  	int index;
>  	u16 devid;
> +	struct pci_dev *dev = msi->data;
> 
> -	if (!pdev)
> +	if (!dev)
>  		return -EINVAL;
> 
>  	cfg = irq_get_chip_data(irq);
>  	if (!cfg)
>  		return -EINVAL;
> 
> -	devid = get_device_id(&pdev->dev);
> +	devid = get_device_id(&dev->dev);
>  	index = alloc_irq_index(cfg, devid, nvec);
> 
>  	return index < 0 ? MAX_IRQS_PER_TABLE : index;
>  }
> 
> -static int msi_setup_irq(struct pci_dev *pdev, unsigned int irq,
> +static int msi_setup_irq(struct msi_irqs *msi, unsigned int irq,
>  			 int index, int offset)
>  {
>  	struct irq_2_irte *irte_info;
>  	struct irq_cfg *cfg;
>  	u16 devid;
> +	struct pci_dev *dev = msi->data;
> 
> -	if (!pdev)
> +	if (!dev)
>  		return -EINVAL;
> 
>  	cfg = irq_get_chip_data(irq);
> @@ -4301,7 +4303,7 @@ static int msi_setup_irq(struct pci_dev *pdev, unsigned int irq,
>  	if (index >= MAX_IRQS_PER_TABLE)
>  		return 0;
> 
> -	devid		= get_device_id(&pdev->dev);
> +	devid		= get_device_id(&dev->dev);
>  	irte_info	= &cfg->irq_2_irte;
> 
>  	cfg->remapped	      = 1;
> diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
> index 9b17489..d6bde63 100644
> --- a/drivers/iommu/intel_irq_remapping.c
> +++ b/drivers/iommu/intel_irq_remapping.c
> @@ -1027,7 +1027,7 @@ intel_ioapic_set_affinity(struct irq_data *data, const struct cpumask *mask,
>  	return 0;
>  }
> 
> -static void intel_compose_msi_msg(struct pci_dev *pdev,
> +static void intel_compose_msi_msg(struct msi_irqs *msi,
>  				  unsigned int irq, unsigned int dest,
>  				  struct msi_msg *msg, u8 hpet_id)
>  {
> @@ -1035,6 +1035,7 @@ static void intel_compose_msi_msg(struct pci_dev *pdev,
>  	struct irte irte;
>  	u16 sub_handle = 0;
>  	int ir_index;
> +	struct pci_dev *pdev = msi->data;
> 
>  	cfg = irq_get_chip_data(irq);
> 
> @@ -1064,10 +1065,11 @@ static void intel_compose_msi_msg(struct pci_dev *pdev,
>   * and allocate 'nvec' consecutive interrupt-remapping table entries
>   * in it.
>   */
> -static int intel_msi_alloc_irq(struct pci_dev *dev, int irq, int nvec)
> +static int intel_msi_alloc_irq(struct msi_irqs *msi, int irq, int nvec)
>  {
>  	struct intel_iommu *iommu;
>  	int index;
> +	struct pci_dev *dev = msi->data;
> 
>  	down_read(&dmar_global_lock);
>  	iommu = map_dev_to_ir(dev);
> @@ -1089,11 +1091,12 @@ static int intel_msi_alloc_irq(struct pci_dev *dev, int irq, int nvec)
>  	return index;
>  }
> 
> -static int intel_msi_setup_irq(struct pci_dev *pdev, unsigned int irq,
> +static int intel_msi_setup_irq(struct msi_irqs *msi, unsigned int irq,
>  			       int index, int sub_handle)
>  {
>  	struct intel_iommu *iommu;
>  	int ret = -ENOENT;
> +	struct pci_dev *pdev = msi->data;
> 
>  	down_read(&dmar_global_lock);
>  	iommu = map_dev_to_ir(pdev);
> diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
> index a3b1805..1fe14e5 100644
> --- a/drivers/iommu/irq_remapping.c
> +++ b/drivers/iommu/irq_remapping.c
> @@ -24,8 +24,8 @@ int no_x2apic_optout;
> 
>  static struct irq_remap_ops *remap_ops;
> 
> -static int msi_alloc_remapped_irq(struct pci_dev *pdev, int irq, int nvec);
> -static int msi_setup_remapped_irq(struct pci_dev *pdev, unsigned int irq,
> +static int msi_alloc_remapped_irq(struct msi_irqs *msi, int irq, int nvec);
> +static int msi_setup_remapped_irq(struct msi_irqs *msi, unsigned int irq,
>  				  int index, int sub_handle);
>  static int set_remapped_irq_affinity(struct irq_data *data,
>  				     const struct cpumask *mask,
> @@ -49,19 +49,19 @@ static void irq_remapping_disable_io_apic(void)
>  		disconnect_bsp_APIC(0);
>  }
> 
> -static int do_setup_msi_irqs(struct pci_dev *dev, int nvec)
> +static int do_setup_msi_irqs(struct msi_irqs *msi, int nvec)
>  {
>  	int ret, sub_handle, nvec_pow2, index = 0;
>  	unsigned int irq;
>  	struct msi_desc *msidesc;
> 
> -	WARN_ON(!list_is_singular(&dev->msi_list));
> -	msidesc = list_entry(dev->msi_list.next, struct msi_desc, list);
> +	WARN_ON(!list_is_singular(&msi->msi_list));
> +	msidesc = list_entry(msi->msi_list.next, struct msi_desc, list);
>  	WARN_ON(msidesc->irq);
>  	WARN_ON(msidesc->msi_attrib.multiple);
>  	WARN_ON(msidesc->nvec_used);
> 
> -	irq = irq_alloc_hwirqs(nvec, dev_to_node(&dev->dev));
> +	irq = irq_alloc_hwirqs(nvec, msi->node);
>  	if (irq == 0)
>  		return -ENOSPC;
> 
> @@ -70,18 +70,18 @@ static int do_setup_msi_irqs(struct pci_dev *dev, int nvec)
>  	msidesc->msi_attrib.multiple = ilog2(nvec_pow2);
>  	for (sub_handle = 0; sub_handle < nvec; sub_handle++) {
>  		if (!sub_handle) {
> -			index = msi_alloc_remapped_irq(dev, irq, nvec_pow2);
> +			index = msi_alloc_remapped_irq(msi, irq, nvec_pow2);
>  			if (index < 0) {
>  				ret = index;
>  				goto error;
>  			}
>  		} else {
> -			ret = msi_setup_remapped_irq(dev, irq + sub_handle,
> +			ret = msi_setup_remapped_irq(msi, irq + sub_handle,
>  						     index, sub_handle);
>  			if (ret < 0)
>  				goto error;
>  		}
> -		ret = setup_msi_irq(dev, msidesc, irq, sub_handle);
> +		ret = setup_msi_irq(msi, msidesc, irq, sub_handle);
>  		if (ret < 0)
>  			goto error;
>  	}
> @@ -101,30 +101,29 @@ error:
>  	return ret;
>  }
> 
> -static int do_setup_msix_irqs(struct pci_dev *dev, int nvec)
> +static int do_setup_msix_irqs(struct msi_irqs *msi, int nvec)
>  {
>  	int node, ret, sub_handle, index = 0;
>  	struct msi_desc *msidesc;
>  	unsigned int irq;
> 
> -	node		= dev_to_node(&dev->dev);
>  	sub_handle	= 0;
> 
> -	list_for_each_entry(msidesc, &dev->msi_list, list) {
> +	list_for_each_entry(msidesc, &msi->msi_list, list) {
> 
> -		irq = irq_alloc_hwirq(node);
> +		irq = irq_alloc_hwirq(msi->node);
>  		if (irq == 0)
>  			return -1;
> 
>  		if (sub_handle == 0)
> -			ret = index = msi_alloc_remapped_irq(dev, irq, nvec);
> +			ret = index = msi_alloc_remapped_irq(msi, irq, nvec);
>  		else
> -			ret = msi_setup_remapped_irq(dev, irq, index, sub_handle);
> +			ret = msi_setup_remapped_irq(msi, irq, index, sub_handle);
> 
>  		if (ret < 0)
>  			goto error;
> 
> -		ret = setup_msi_irq(dev, msidesc, irq, 0);
> +		ret = setup_msi_irq(msi, msidesc, irq, 0);
>  		if (ret < 0)
>  			goto error;
> 
> @@ -139,13 +138,13 @@ error:
>  	return ret;
>  }
> 
> -static int irq_remapping_setup_msi_irqs(struct pci_dev *dev,
> +static int irq_remapping_setup_msi_irqs(struct msi_irqs *msi,
>  					int nvec, int type)
>  {
>  	if (type == MSI_TYPE)
> -		return do_setup_msi_irqs(dev, nvec);
> +		return do_setup_msi_irqs(msi, nvec);
>  	else
> -		return do_setup_msix_irqs(dev, nvec);
> +		return do_setup_msix_irqs(msi, nvec);
>  }
> 
>  static void eoi_ioapic_pin_remapped(int apic, int pin, int vector)
> @@ -314,33 +313,33 @@ void free_remapped_irq(int irq)
>  		remap_ops->free_irq(irq);
>  }
> 
> -void compose_remapped_msi_msg(struct pci_dev *pdev,
> +void compose_remapped_msi_msg(struct msi_irqs *msi,
>  			      unsigned int irq, unsigned int dest,
>  			      struct msi_msg *msg, u8 hpet_id)
>  {
>  	struct irq_cfg *cfg = irq_get_chip_data(irq);
> 
>  	if (!irq_remapped(cfg))
> -		native_compose_msi_msg(pdev, irq, dest, msg, hpet_id);
> +		native_compose_msi_msg(msi, irq, dest, msg, hpet_id);
>  	else if (remap_ops && remap_ops->compose_msi_msg)
> -		remap_ops->compose_msi_msg(pdev, irq, dest, msg, hpet_id);
> +		remap_ops->compose_msi_msg(msi, irq, dest, msg, hpet_id);
>  }
> 
> -static int msi_alloc_remapped_irq(struct pci_dev *pdev, int irq, int nvec)
> +static int msi_alloc_remapped_irq(struct msi_irqs *msi, int irq, int nvec)
>  {
>  	if (!remap_ops || !remap_ops->msi_alloc_irq)
>  		return -ENODEV;
> 
> -	return remap_ops->msi_alloc_irq(pdev, irq, nvec);
> +	return remap_ops->msi_alloc_irq(msi, irq, nvec);
>  }
> 
> -static int msi_setup_remapped_irq(struct pci_dev *pdev, unsigned int irq,
> +static int msi_setup_remapped_irq(struct msi_irqs *msi, unsigned int irq,
>  				  int index, int sub_handle)
>  {
>  	if (!remap_ops || !remap_ops->msi_setup_irq)
>  		return -ENODEV;
> 
> -	return remap_ops->msi_setup_irq(pdev, irq, index, sub_handle);
> +	return remap_ops->msi_setup_irq(msi, irq, index, sub_handle);
>  }
> 
>  int setup_hpet_msi_remapped(unsigned int irq, unsigned int id)
> diff --git a/drivers/iommu/irq_remapping.h b/drivers/iommu/irq_remapping.h
> index 90c4dae..59c4cfb 100644
> --- a/drivers/iommu/irq_remapping.h
> +++ b/drivers/iommu/irq_remapping.h
> @@ -69,15 +69,15 @@ struct irq_remap_ops {
>  	int (*free_irq)(int);
> 
>  	/* Create MSI msg to use for interrupt remapping */
> -	void (*compose_msi_msg)(struct pci_dev *,
> +	void (*compose_msi_msg)(struct msi_irqs *,
>  				unsigned int, unsigned int,
>  				struct msi_msg *, u8);
> 
>  	/* Allocate remapping resources for MSI */
> -	int (*msi_alloc_irq)(struct pci_dev *, int, int);
> +	int (*msi_alloc_irq)(struct msi_irqs *, int, int);
> 
>  	/* Setup the remapped MSI irq */
> -	int (*msi_setup_irq)(struct pci_dev *, unsigned int, int, int);
> +	int (*msi_setup_irq)(struct msi_irqs *, unsigned int, int, int);
> 
>  	/* Setup interrupt remapping for an HPET MSI */
>  	int (*setup_hpet_msi)(unsigned int, unsigned int);
> diff --git a/drivers/msi/msi.c b/drivers/msi/msi.c
> index 3fbd539..8462c6c 100644
> --- a/drivers/msi/msi.c
> +++ b/drivers/msi/msi.c
> @@ -510,9 +510,8 @@ int msix_capability_init(struct msi_irqs *msi, void __iomem *base,
> 
>  	/* Set MSI-X enabled bits and unmask the function */
>  	msi_set_intx(msi, 0);
> -	msi->msix_enabled = 1;
> -
>  	msi_set_enable(msi, 1, MSIX_TYPE);
> +	msi->msix_enabled = 1;
> 
>  	return 0;
> 
> --
> 1.7.1
> 


* Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
  2014-08-20  5:44     ` Bharat.Bhushan
@ 2014-08-20  6:28       ` Yijing Wang
  2014-08-20  7:41         ` Bharat.Bhushan
  0 siblings, 1 reply; 41+ messages in thread
From: Yijing Wang @ 2014-08-20  6:28 UTC (permalink / raw)
  To: Bharat.Bhushan, arnab.basu
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, virtualization, Hanjun Guo,
	linux-kernel

>> The key difference between PCI and non-PCI MSI devices is the interface used to
>> access the hardware MSI registers.
>> For instance, msi_chip->setup_irq() currently sets up the MSI irq and configures
>> the MSI address/data registers, so we need to provide a device-specific
>> write_msi_msg() interface; then, when we call msi_chip->setup_irq(), the device
>> MSI registers can be configured appropriately.
> 
> What if we could register/override setup_irq() from the bus driver (or maybe from the device driver itself, I'm not sure)? For example, the PCI bus driver would provide setup_irq() (or the part of setup_irq() that sets the address and data in hardware), which configures address/data in hardware as per the PCI standard.
> 
> We at Freescale will be using MSI for devices behind a new bus (which is not PCI based), and we have a separate bus driver for it. This new bus driver registers/provides its own address/data write function, based on that specific bus protocol.

Hi Bharat, thanks for explaining how your MSI devices work.
Providing private MSI setup functions in the bus-driver layer can't cover all non-PCI MSI devices,
because we cannot guarantee that a non-PCI MSI device always sits on a bus. The existing HPET and DMAR devices
are not bound to any bus. I'm working on a new MSI setup framework based on the device-driver model, as you mentioned before.

I abstracted a new virtual device (struct msi_dev) that manages all MSI info, and introduced a new bus
named msi_bus together with a new driver type, msi_driver; msi_bus is responsible for binding an msi_dev to its msi_driver.
All MSI devices are classified into MSI device types, such as MSI_TYPE_PCI, MSI_TYPE_HPET, MSI_TYPE_DMAR, etc.

Each MSI device type should provide its own struct msi_driver. The msi_driver contains the type-specific
MSI ops that help set up and enable the MSI device and request its MSI irqs.
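
To make the binding idea concrete, here is a rough userspace sketch of it. It is not the code from the upcoming draft: only the names msi_dev, msi_driver, msi_bus and the MSI_TYPE_* constants come from the description above, while the fields, the helper functions (msi_bus_register_driver(), msi_bus_bind()) and the HPET ops are hypothetical and exist only for illustration.

```c
/* Device types named in the mail; the numeric values are arbitrary. */
enum msi_dev_type { MSI_TYPE_PCI, MSI_TYPE_HPET, MSI_TYPE_DMAR, MSI_TYPE_MAX };

struct msi_dev;

/* Hypothetical per-type driver: one set of MSI ops per device type. */
struct msi_driver {
	enum msi_dev_type type;
	int (*setup)(struct msi_dev *dev, int nvec);	/* program MSI registers */
	void (*enable)(struct msi_dev *dev, int on);	/* flip the enable bit */
};

/* Hypothetical virtual device holding the MSI state of one real device. */
struct msi_dev {
	enum msi_dev_type type;
	void *data;			/* the real device, e.g. a pci_dev */
	int nvec;
	int enabled;
	const struct msi_driver *drv;	/* filled in by msi_bus_bind() */
};

/* Stand-in for msi_bus: a table of registered drivers indexed by type. */
static const struct msi_driver *msi_bus_drivers[MSI_TYPE_MAX];

/* Register one driver per type; a second registration is rejected. */
int msi_bus_register_driver(const struct msi_driver *drv)
{
	if (!drv || drv->type >= MSI_TYPE_MAX || msi_bus_drivers[drv->type])
		return -1;
	msi_bus_drivers[drv->type] = drv;
	return 0;
}

/* Bind a device to the driver registered for its type, if any. */
int msi_bus_bind(struct msi_dev *dev)
{
	if (!dev || dev->type >= MSI_TYPE_MAX || !msi_bus_drivers[dev->type])
		return -1;
	dev->drv = msi_bus_drivers[dev->type];
	return 0;
}

/* Example HPET-type ops; a real driver would touch hardware here. */
static int hpet_msi_setup(struct msi_dev *dev, int nvec)
{
	dev->nvec = nvec;
	return 0;
}

static void hpet_msi_enable(struct msi_dev *dev, int on)
{
	dev->enabled = on;
}

const struct msi_driver hpet_msi_driver = {
	.type = MSI_TYPE_HPET,
	.setup = hpet_msi_setup,
	.enable = hpet_msi_enable,
};
```

In this sketch a device with no bus at all (HPET, DMAR) still gets type-specific ops, because the lookup key is the MSI device type rather than the bus the device sits on.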

I have almost finished the first draft and plan to post it next week :)


Thanks!
Yijing.

> 
> Thanks
> -Bharat
> 
>>
>> My patchset is just an RFC draft and I will update it later. All we want to do is
>> make the kernel support non-PCI MSI devices.
>>
>> Thanks!
>> Yijing.
>>
>>
>>>
>>> Thanks
>>> Arnab
>>
>>
>> --
>> Thanks!
>> Yijing
>>


-- 
Thanks!
Yijing



* Re: [RFC PATCH 03/11] PCI/MSI: Refactor pci_dev_msi_enabled()
  2014-08-20  5:57   ` Bharat.Bhushan
@ 2014-08-20  6:30     ` Yijing Wang
  0 siblings, 0 replies; 41+ messages in thread
From: Yijing Wang @ 2014-08-20  6:30 UTC (permalink / raw)
  To: Bharat.Bhushan, linux-kernel
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, arnab.basu, virtualization, Hanjun Guo

On 2014/8/20 13:57, Bharat.Bhushan@freescale.com wrote:
> 
> 
>> -----Original Message-----
>> From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-owner@vger.kernel.org]
>> On Behalf Of Yijing Wang
>> Sent: Saturday, July 26, 2014 8:39 AM
>> To: linux-kernel@vger.kernel.org
>> Cc: Xinwei Hu; Wuyun; Bjorn Helgaas; linux-pci@vger.kernel.org;
>> Paul.Mundt@huawei.com; James E.J. Bottomley; Marc Zyngier; linux-arm-
>> kernel@lists.infradead.org; Russell King; linux-arch@vger.kernel.org; Basu
>> Arnab-B45036; virtualization@lists.linux-foundation.org; Hanjun Guo; Yijing Wang
>> Subject: [RFC PATCH 03/11] PCI/MSI: Refactor pci_dev_msi_enabled()
>>
>> Pci_dev_msi_enabled() is used to check whether device MSI/MSIX enabled. Refactor
>> this function  to suuport checking only device MSI or MSIX enabled.
> 
> s/suuport/support
> 
> From the code, it looks like you added one more parameter to pci_dev_msi_enabled() to check for a specific type, whereas earlier it checked for either MSI or MSI-X being enabled. The description is not clear to me, though. Am I missing something?

Right~
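
A small userspace sketch of the intended semantics, in case it helps. It is not the helper from the patch: the struct and the function name dev_msi_enabled() are hypothetical stand-ins, and only the MSI_TYPE/MSIX_TYPE values and the call patterns (MSI_TYPE alone, or MSI_TYPE | MSIX_TYPE) are taken from this series.

```c
#define MSI_TYPE	0x01
#define MSIX_TYPE	0x02

/* Minimal stand-in for the two enable flags kept per PCI device. */
struct pci_dev_flags {
	unsigned int msi_enabled:1;
	unsigned int msix_enabled:1;
};

/* Return nonzero if any interrupt type selected in 'type' is enabled. */
static inline int dev_msi_enabled(const struct pci_dev_flags *dev, int type)
{
	if ((type & MSI_TYPE) && dev->msi_enabled)
		return 1;
	if ((type & MSIX_TYPE) && dev->msix_enabled)
		return 1;
	return 0;
}
```

With this shape, a caller that used to test dev->msi_enabled passes MSI_TYPE, and a caller that used to test dev->msi_enabled || dev->msix_enabled passes MSI_TYPE | MSIX_TYPE.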


> 
> Thanks
> -Bharat
> 
> 
>>
>> Signed-off-by: Yijing Wang <wangyijing@huawei.com>
>> ---
>>  arch/cris/arch-v32/drivers/pci/bios.c     |    2 +-
>>  arch/frv/mb93090-mb00/pci-vdk.c           |    2 +-
>>  arch/ia64/pci/pci.c                       |    4 ++--
>>  arch/powerpc/kernel/eeh_driver.c          |    2 +-
>>  arch/x86/pci/common.c                     |    5 +++--
>>  drivers/block/nvme-core.c                 |    4 ++--
>>  drivers/dma/ioat/dma.c                    |    2 +-
>>  drivers/firewire/ohci.c                   |    2 +-
>>  drivers/gpu/drm/i915/i915_dma.c           |    4 ++--
>>  drivers/misc/mei/hw-me.c                  |    2 +-
>>  drivers/misc/mei/hw-txe.c                 |    2 +-
>>  drivers/misc/mei/pci-me.c                 |    4 ++--
>>  drivers/misc/mei/pci-txe.c                |    4 ++--
>>  drivers/misc/mic/host/mic_debugfs.c       |    4 ++--
>>  drivers/misc/mic/host/mic_intr.c          |    8 ++++----
>>  drivers/ntb/ntb_hw.c                      |    2 +-
>>  drivers/pci/irq.c                         |    4 ++--
>>  drivers/pci/msi.c                         |   15 +++++++++------
>>  drivers/pci/pci.c                         |    6 +++---
>>  drivers/pci/pcie/portdrv_core.c           |    4 ++--
>>  drivers/scsi/esas2r/esas2r_init.c         |    4 ++--
>>  drivers/scsi/esas2r/esas2r_ioctl.c        |    4 ++--
>>  drivers/scsi/hpsa.c                       |    4 ++--
>>  drivers/staging/crystalhd/crystalhd_lnx.c |    2 +-
>>  drivers/xen/xen-pciback/pciback_ops.c     |   12 ++++++------
>>  include/linux/pci.h                       |   12 ++++++++++--
>>  virt/kvm/assigned-dev.c                   |    2 +-
>>  27 files changed, 67 insertions(+), 55 deletions(-)
>>
>> diff --git a/arch/cris/arch-v32/drivers/pci/bios.c b/arch/cris/arch-v32/drivers/pci/bios.c
>> index 64a5fb9..d9d8332 100644
>> --- a/arch/cris/arch-v32/drivers/pci/bios.c
>> +++ b/arch/cris/arch-v32/drivers/pci/bios.c
>> @@ -93,7 +93,7 @@ int pcibios_enable_device(struct pci_dev *dev, int mask)
>>  	if ((err = pcibios_enable_resources(dev, mask)) < 0)
>>  		return err;
>>
>> -	if (!dev->msi_enabled)
>> +	if (!pci_dev_msi_enabled(dev, MSI_TYPE))
>>  		pcibios_enable_irq(dev);
>>  	return 0;
>>  }
>> diff --git a/arch/frv/mb93090-mb00/pci-vdk.c b/arch/frv/mb93090-mb00/pci-vdk.c
>> index efa5d65..b96c128 100644
>> --- a/arch/frv/mb93090-mb00/pci-vdk.c
>> +++ b/arch/frv/mb93090-mb00/pci-vdk.c
>> @@ -409,7 +409,7 @@ int pcibios_enable_device(struct pci_dev *dev, int mask)
>>
>>  	if ((err = pci_enable_resources(dev, mask)) < 0)
>>  		return err;
>> -	if (!dev->msi_enabled)
>> +	if (!pci_dev_msi_enabled(dev, MSI_TYPE))
>>  		pcibios_enable_irq(dev);
>>  	return 0;
>>  }
>> diff --git a/arch/ia64/pci/pci.c b/arch/ia64/pci/pci.c
>> index 291a582..da8ddff 100644
>> --- a/arch/ia64/pci/pci.c
>> +++ b/arch/ia64/pci/pci.c
>> @@ -568,7 +568,7 @@ pcibios_enable_device (struct pci_dev *dev, int mask)
>>  	if (ret < 0)
>>  		return ret;
>>
>> -	if (!dev->msi_enabled)
>> +	if (!pci_dev_msi_enabled(dev, MSI_TYPE))
>>  		return acpi_pci_irq_enable(dev);
>>  	return 0;
>>  }
>> @@ -577,7 +577,7 @@ void
>>  pcibios_disable_device (struct pci_dev *dev)
>>  {
>>  	BUG_ON(atomic_read(&dev->enable_cnt));
>> -	if (!dev->msi_enabled)
>> +	if (!pci_dev_msi_enabled(dev, MSI_TYPE))
>>  		acpi_pci_irq_disable(dev);
>>  }
>>
>> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
>> index 420da61..e3f2074 100644
>> --- a/arch/powerpc/kernel/eeh_driver.c
>> +++ b/arch/powerpc/kernel/eeh_driver.c
>> @@ -123,7 +123,7 @@ static void eeh_disable_irq(struct pci_dev *dev)
>>  	 * effectively disabled by the DMA Stopped state
>>  	 * when an EEH error occurs.
>>  	 */
>> -	if (dev->msi_enabled || dev->msix_enabled)
>> +	if (pci_dev_msi_enabled(dev, MSI_TYPE | MSIX_TYPE))
>>  		return;
>>
>>  	if (!irq_has_action(dev->irq))
>> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
>> index 059a76c..4597940 100644
>> --- a/arch/x86/pci/common.c
>> +++ b/arch/x86/pci/common.c
>> @@ -662,14 +662,15 @@ int pcibios_enable_device(struct pci_dev *dev, int mask)
>>  	if ((err = pci_enable_resources(dev, mask)) < 0)
>>  		return err;
>>
>> -	if (!pci_dev_msi_enabled(dev))
>> +	if (!pci_dev_msi_enabled(dev, MSI_TYPE | MSIX_TYPE))
>>  		return pcibios_enable_irq(dev);
>>  	return 0;
>>  }
>>
>>  void pcibios_disable_device (struct pci_dev *dev)
>>  {
>> -	if (!pci_dev_msi_enabled(dev) && pcibios_disable_irq)
>> +	if (!pci_dev_msi_enabled(dev, MSI_TYPE | MSIX_TYPE)
>> +			&& pcibios_disable_irq)
>>  		pcibios_disable_irq(dev);
>>  }
>>
>> diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
>> index 02351e2..f96b90f 100644
>> --- a/drivers/block/nvme-core.c
>> +++ b/drivers/block/nvme-core.c
>> @@ -2325,9 +2325,9 @@ static int nvme_dev_map(struct nvme_dev *dev)
>>
>>  static void nvme_dev_unmap(struct nvme_dev *dev)
>>  {
>> -	if (dev->pci_dev->msi_enabled)
>> +	if (pci_dev_msi_enabled(dev->pci_dev, MSI_TYPE))
>>  		pci_disable_msi(dev->pci_dev);
>> -	else if (dev->pci_dev->msix_enabled)
>> +	else if (pci_dev_msi_enabled(dev->pci_dev, MSIX_TYPE))
>>  		pci_disable_msix(dev->pci_dev);
>>
>>  	if (dev->bar) {
>> diff --git a/drivers/dma/ioat/dma.c b/drivers/dma/ioat/dma.c
>> index 4e3549a..a11dac1 100644
>> --- a/drivers/dma/ioat/dma.c
>> +++ b/drivers/dma/ioat/dma.c
>> @@ -1088,7 +1088,7 @@ static void ioat1_intr_quirk(struct ioatdma_device *device)
>>  	u32 dmactrl;
>>
>>  	pci_read_config_dword(pdev, IOAT_PCI_DMACTRL_OFFSET, &dmactrl);
>> -	if (pdev->msi_enabled)
>> +	if (pci_dev_msi_enabled(pdev, MSI_TYPE))
>>  		dmactrl |= IOAT_PCI_DMACTRL_MSI_EN;
>>  	else
>>  		dmactrl &= ~IOAT_PCI_DMACTRL_MSI_EN;
>> diff --git a/drivers/firewire/ohci.c b/drivers/firewire/ohci.c
>> index 5798541..ec0a794 100644
>> --- a/drivers/firewire/ohci.c
>> +++ b/drivers/firewire/ohci.c
>> @@ -3705,7 +3705,7 @@ static int pci_probe(struct pci_dev *dev,
>>  	if (!(ohci->quirks & QUIRK_NO_MSI))
>>  		pci_enable_msi(dev);
>>  	if (request_irq(dev->irq, irq_handler,
>> -			pci_dev_msi_enabled(dev) ? 0 : IRQF_SHARED,
>> +			pci_dev_msi_enabled(dev, MSI_TYPE | MSIX_TYPE) ? 0 : IRQF_SHARED,
>>  			ohci_driver_name, ohci)) {
>>  		ohci_err(ohci, "failed to allocate interrupt %d\n", dev->irq);
>>  		err = -EIO;
>> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
>> index 4c22a5b..0c248fe 100644
>> --- a/drivers/gpu/drm/i915/i915_dma.c
>> +++ b/drivers/gpu/drm/i915/i915_dma.c
>> @@ -1745,7 +1745,7 @@ out_gem_unload:
>>  	WARN_ON(unregister_oom_notifier(&dev_priv->mm.oom_notifier));
>>  	unregister_shrinker(&dev_priv->mm.shrinker);
>>
>> -	if (dev->pdev->msi_enabled)
>> +	if (pci_dev_msi_enabled(dev->pdev, MSI_TYPE))
>>  		pci_disable_msi(dev->pdev);
>>
>>  	intel_teardown_gmbus(dev);
>> @@ -1826,7 +1826,7 @@ int i915_driver_unload(struct drm_device *dev)
>>  	cancel_work_sync(&dev_priv->gpu_error.work);
>>  	i915_destroy_error_state(dev);
>>
>> -	if (dev->pdev->msi_enabled)
>> +	if (pci_dev_msi_enabled(dev->pdev, MSI_TYPE))
>>  		pci_disable_msi(dev->pdev);
>>
>>  	intel_opregion_fini(dev);
>> diff --git a/drivers/misc/mei/hw-me.c b/drivers/misc/mei/hw-me.c
>> index 6a2d272..d7595d4 100644
>> --- a/drivers/misc/mei/hw-me.c
>> +++ b/drivers/misc/mei/hw-me.c
>> @@ -647,7 +647,7 @@ irqreturn_t mei_me_irq_thread_handler(int irq, void *dev_id)
>>
>>  	/* Ack the interrupt here
>>  	 * In case of MSI we don't go through the quick handler */
>> -	if (pci_dev_msi_enabled(dev->pdev))
>> +	if (pci_dev_msi_enabled(dev->pdev, MSI_TYPE | MSIX_TYPE))
>>  		mei_clear_interrupts(dev);
>>
>>  	/* check if ME wants a reset */
>> diff --git a/drivers/misc/mei/hw-txe.c b/drivers/misc/mei/hw-txe.c
>> index 9327378..8c2d95c 100644
>> --- a/drivers/misc/mei/hw-txe.c
>> +++ b/drivers/misc/mei/hw-txe.c
>> @@ -951,7 +951,7 @@ irqreturn_t mei_txe_irq_thread_handler(int irq, void *dev_id)
>>  	mutex_lock(&dev->device_lock);
>>  	mei_io_list_init(&complete_list);
>>
>> -	if (pci_dev_msi_enabled(dev->pdev))
>> +	if (pci_dev_msi_enabled(dev->pdev, MSI_TYPE | MSIX_TYPE))
>>  		mei_txe_check_and_ack_intrs(dev, true);
>>
>>  	/* show irq events */
>> diff --git a/drivers/misc/mei/pci-me.c b/drivers/misc/mei/pci-me.c
>> index 1b46c64..283fc09 100644
>> --- a/drivers/misc/mei/pci-me.c
>> +++ b/drivers/misc/mei/pci-me.c
>> @@ -181,7 +181,7 @@ static int mei_me_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>>  	pci_enable_msi(pdev);
>>
>>  	 /* request and enable interrupt */
>> -	if (pci_dev_msi_enabled(pdev))
>> +	if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE))
>>  		err = request_threaded_irq(pdev->irq,
>>  			NULL,
>>  			mei_me_irq_thread_handler,
>> @@ -329,7 +329,7 @@ static int mei_me_pci_resume(struct device *device)
>>  	pci_enable_msi(pdev);
>>
>>  	/* request and enable interrupt */
>> -	if (pci_dev_msi_enabled(pdev))
>> +	if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE))
>>  		err = request_threaded_irq(pdev->irq,
>>  			NULL,
>>  			mei_me_irq_thread_handler,
>> diff --git a/drivers/misc/mei/pci-txe.c b/drivers/misc/mei/pci-txe.c
>> index 2343c62..a3bf202 100644
>> --- a/drivers/misc/mei/pci-txe.c
>> +++ b/drivers/misc/mei/pci-txe.c
>> @@ -124,7 +124,7 @@ static int mei_txe_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>>  	mei_clear_interrupts(dev);
>>
>>  	/* request and enable interrupt  */
>> -	if (pci_dev_msi_enabled(pdev))
>> +	if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE))
>>  		err = request_threaded_irq(pdev->irq,
>>  			NULL,
>>  			mei_txe_irq_thread_handler,
>> @@ -272,7 +272,7 @@ static int mei_txe_pci_resume(struct device *device)
>>  	mei_clear_interrupts(dev);
>>
>>  	/* request and enable interrupt */
>> -	if (pci_dev_msi_enabled(pdev))
>> +	if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE))
>>  		err = request_threaded_irq(pdev->irq,
>>  			NULL,
>>  			mei_txe_irq_thread_handler,
>> diff --git a/drivers/misc/mic/host/mic_debugfs.c b/drivers/misc/mic/host/mic_debugfs.c
>> index 028ba5d..6e1a553 100644
>> --- a/drivers/misc/mic/host/mic_debugfs.c
>> +++ b/drivers/misc/mic/host/mic_debugfs.c
>> @@ -376,9 +376,9 @@ static int mic_msi_irq_info_show(struct seq_file *s, void *pos)
>>  	struct pci_dev *pdev = container_of(mdev->sdev->parent,
>>  		struct pci_dev, dev);
>>
>> -	if (pci_dev_msi_enabled(pdev)) {
>> +	if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE)) {
>>  		for (i = 0; i < mdev->irq_info.num_vectors; i++) {
>> -			if (pdev->msix_enabled) {
>> +			if (pci_dev_msi_enabled(pdev, MSIX_TYPE)) {
>>  				entry = mdev->irq_info.msix_entries[i].entry;
>>  				vector = mdev->irq_info.msix_entries[i].vector;
>>  			} else {
>> diff --git a/drivers/misc/mic/host/mic_intr.c b/drivers/misc/mic/host/mic_intr.c
>> index dbc5afd..9eab900 100644
>> --- a/drivers/misc/mic/host/mic_intr.c
>> +++ b/drivers/misc/mic/host/mic_intr.c
>> @@ -468,7 +468,7 @@ struct mic_irq *mic_request_irq(struct mic_device *mdev,
>>  		}
>>
>>  		entry = 0;
>> -		if (pci_dev_msi_enabled(pdev)) {
>> +		if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE)) {
>>  			mdev->irq_info.mic_msi_map[entry] |= (1 << offset);
>>  			mdev->intr_ops->program_msi_to_src_map(mdev,
>>  				entry, offset, true);
>> @@ -526,7 +526,7 @@ void mic_free_irq(struct mic_device *mdev,
>>  			dev_warn(mdev->sdev->parent, "Error unregistering callback\n");
>>  			return;
>>  		}
>> -		if (pci_dev_msi_enabled(pdev)) {
>> +		if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE)) {
>>  			mdev->irq_info.mic_msi_map[entry] &= ~(BIT(src_id));
>>  			mdev->intr_ops->program_msi_to_src_map(mdev,
>>  				entry, src_id, false);
>> @@ -589,7 +589,7 @@ void mic_free_interrupts(struct mic_device *mdev, struct pci_dev *pdev)
>>  		kfree(mdev->irq_info.msix_entries);
>>  		pci_disable_msix(pdev);
>>  	} else {
>> -		if (pci_dev_msi_enabled(pdev)) {
>> +		if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE)) {
>>  			free_irq(pdev->irq, mdev);
>>  			kfree(mdev->irq_info.mic_msi_map);
>>  			pci_disable_msi(pdev);
>> @@ -617,7 +617,7 @@ void mic_intr_restore(struct mic_device *mdev)
>>  	struct pci_dev *pdev = container_of(mdev->sdev->parent,
>>  		struct pci_dev, dev);
>>
>> -	if (!pci_dev_msi_enabled(pdev))
>> +	if (!pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE))
>>  		return;
>>
>>  	for (entry = 0; entry < mdev->irq_info.num_vectors; entry++) {
>> diff --git a/drivers/ntb/ntb_hw.c b/drivers/ntb/ntb_hw.c
>> index 372e08c..868f685 100644
>> --- a/drivers/ntb/ntb_hw.c
>> +++ b/drivers/ntb/ntb_hw.c
>> @@ -1306,7 +1306,7 @@ static void ntb_free_interrupts(struct ntb_device *ndev)
>>  	} else {
>>  		free_irq(pdev->irq, ndev);
>>
>> -		if (pci_dev_msi_enabled(pdev))
>> +		if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE))
>>  			pci_disable_msi(pdev);
>>  	}
>>  }
>> diff --git a/drivers/pci/irq.c b/drivers/pci/irq.c
>> index 6684f15..e3e3293 100644
>> --- a/drivers/pci/irq.c
>> +++ b/drivers/pci/irq.c
>> @@ -36,10 +36,10 @@ static void pci_note_irq_problem(struct pci_dev *pdev, const char *reason)
>>   */
>>  enum pci_lost_interrupt_reason pci_lost_interrupt(struct pci_dev *pdev)
>>  {
>> -	if (pdev->msi_enabled || pdev->msix_enabled) {
>> +	if (pci_dev_msi_enabled(pdev, MSI_TYPE | MSIX_TYPE)) {
>>  		enum pci_lost_interrupt_reason ret;
>>
>> -		if (pdev->msix_enabled) {
>> +		if (pci_dev_msi_enabled(pdev, MSIX_TYPE)) {
>>  			pci_note_irq_problem(pdev, "MSIX routing failure");
>>  			ret = PCI_LOST_IRQ_DISABLE_MSIX;
>>  		} else {
>> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
>> index e416dc0..d5c8e56 100644
>> --- a/drivers/pci/msi.c
>> +++ b/drivers/pci/msi.c
>> @@ -125,7 +125,7 @@ static void default_restore_msi_irq(struct pci_dev *dev, int irq)
>>  			if (irq == entry->irq)
>>  				break;
>>  		}
>> -	} else if (dev->msi_enabled)  {
>> +	} else if (pci_dev_msi_enabled(dev, MSI_TYPE))  {
>>  		entry = irq_get_msi_desc(irq);
>>  	}
>>
>> @@ -439,7 +439,7 @@ static void __pci_restore_msi_state(struct pci_dev *dev)
>>  	u16 control;
>>  	struct msi_desc *entry;
>>
>> -	if (!dev->msi_enabled)
>> +	if (!pci_dev_msi_enabled(dev, MSI_TYPE))
>>  		return;
>>
>>  	entry = irq_get_msi_desc(dev->irq);
>> @@ -878,7 +878,8 @@ void pci_msi_shutdown(struct pci_dev *dev)
>>  	struct msi_desc *desc;
>>  	u32 mask;
>>
>> -	if (!pci_msi_enable || !dev || !dev->msi_enabled)
>> +	if (!pci_msi_enable || !dev ||
>> +			!pci_dev_msi_enabled(dev, MSI_TYPE))
>>  		return;
>>
>>  	BUG_ON(list_empty(&dev->msi_list));
>> @@ -899,7 +900,8 @@ void pci_msi_shutdown(struct pci_dev *dev)
>>
>>  void pci_disable_msi(struct pci_dev *dev)
>>  {
>> -	if (!pci_msi_enable || !dev || !dev->msi_enabled)
>> +	if (!pci_msi_enable || !dev ||
>> +			!pci_dev_msi_enabled(dev, MSI_TYPE))
>>  		return;
>>
>>  	pci_msi_shutdown(dev);
>> @@ -972,7 +974,7 @@ int pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries, int nvec)
>>  	WARN_ON(!!dev->msix_enabled);
>>
>>  	/* Check whether driver already requested for MSI irq */
>> -	if (dev->msi_enabled) {
>> +	if (pci_dev_msi_enabled(dev, MSI_TYPE)) {
>>  		dev_info(&dev->dev, "can't enable MSI-X (MSI IRQ already assigned)\n");
>>  		return -EINVAL;
>>  	}
>> @@ -1001,7 +1003,8 @@ void pci_msix_shutdown(struct pci_dev *dev)
>>
>>  void pci_disable_msix(struct pci_dev *dev)
>>  {
>> -	if (!pci_msi_enable || !dev || !dev->msix_enabled)
>> +	if (!pci_msi_enable || !dev ||
>> +			!pci_dev_msi_enabled(dev, MSIX_TYPE))
>>  		return;
>>
>>  	pci_msix_shutdown(dev);
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 74043a2..6e9e7bd 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -1206,7 +1206,7 @@ static int do_pci_enable_device(struct pci_dev *dev, int bars)
>>  		return err;
>>  	pci_fixup_device(pci_fixup_enable, dev);
>>
>> -	if (dev->msi_enabled || dev->msix_enabled)
>> +	if (pci_dev_msi_enabled(dev, MSI_TYPE | MSIX_TYPE))
>>  		return 0;
>>
>>  	pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
>> @@ -1361,9 +1361,9 @@ static void pcim_release(struct device *gendev, void *res)
>>  	struct pci_devres *this = res;
>>  	int i;
>>
>> -	if (dev->msi_enabled)
>> +	if (pci_dev_msi_enabled(dev, MSI_TYPE))
>>  		pci_disable_msi(dev);
>> -	if (dev->msix_enabled)
>> +	if (pci_dev_msi_enabled(dev, MSIX_TYPE))
>>  		pci_disable_msix(dev);
>>
>>  	for (i = 0; i < DEVICE_COUNT_RESOURCE; i++)
>> diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
>> index 2f0ce66..7a1b6ec 100644
>> --- a/drivers/pci/pcie/portdrv_core.c
>> +++ b/drivers/pci/pcie/portdrv_core.c
>> @@ -235,9 +235,9 @@ static int init_service_irqs(struct pci_dev *dev, int *irqs, int mask)
>>
>>  static void cleanup_service_irqs(struct pci_dev *dev)
>>  {
>> -	if (dev->msix_enabled)
>> +	if (pci_dev_msi_enabled(dev, MSIX_TYPE))
>>  		pci_disable_msix(dev);
>> -	else if (dev->msi_enabled)
>> +	else if (pci_dev_msi_enabled(dev, MSI_TYPE))
>>  		pci_disable_msi(dev);
>>  }
>>
>> diff --git a/drivers/scsi/esas2r/esas2r_init.c b/drivers/scsi/esas2r/esas2r_init.c
>> index 6776931..444f64d 100644
>> --- a/drivers/scsi/esas2r/esas2r_init.c
>> +++ b/drivers/scsi/esas2r/esas2r_init.c
>> @@ -617,8 +617,8 @@ void esas2r_kill_adapter(int i)
>>  			       &(a->pcid->dev),
>>  			       "pci_disable_device() called.  msix_enabled: %d "
>>  			       "msi_enabled: %d irq: %d pin: %d",
>> -			       a->pcid->msix_enabled,
>> -			       a->pcid->msi_enabled,
>> +			       pci_dev_msi_enabled(a->pcid, MSIX_TYPE),
>> +			       pci_dev_msi_enabled(a->pcid, MSI_TYPE),
>>  			       a->pcid->irq,
>>  			       a->pcid->pin);
>>
>> diff --git a/drivers/scsi/esas2r/esas2r_ioctl.c b/drivers/scsi/esas2r/esas2r_ioctl.c
>> index d89a027..31e06bd 100644
>> --- a/drivers/scsi/esas2r/esas2r_ioctl.c
>> +++ b/drivers/scsi/esas2r/esas2r_ioctl.c
>> @@ -810,9 +810,9 @@ static int hba_ioctl_callback(struct esas2r_adapter *a,
>>
>>  		gai->pci.msi_vector_cnt = 1;
>>
>> -		if (a->pcid->msix_enabled)
>> +		if (pci_dev_msi_enabled(a->pcid, MSIX_TYPE))
>>  			gai->pci.interrupt_mode = ATTO_GAI_PCIIM_MSIX;
>> -		else if (a->pcid->msi_enabled)
>> +		else if (pci_dev_msi_enabled(a->pcid, MSI_TYPE))
>>  			gai->pci.interrupt_mode = ATTO_GAI_PCIIM_MSI;
>>  		else
>>  			gai->pci.interrupt_mode = ATTO_GAI_PCIIM_LEGACY;
>> diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
>> index 31184b3..964d809 100644
>> --- a/drivers/scsi/hpsa.c
>> +++ b/drivers/scsi/hpsa.c
>> @@ -6707,10 +6707,10 @@ static void hpsa_free_irqs_and_disable_msix(struct ctlr_info *h)
>>  	free_irqs(h);
>>  #ifdef CONFIG_PCI_MSI
>>  	if (h->msix_vector) {
>> -		if (h->pdev->msix_enabled)
>> +		if (pci_dev_msi_enabled(h->pdev, MSIX_TYPE))
>>  			pci_disable_msix(h->pdev);
>>  	} else if (h->msi_vector) {
>> -		if (h->pdev->msi_enabled)
>> +		if (pci_dev_msi_enabled(h->pdev, MSI_TYPE))
>>  			pci_disable_msi(h->pdev);
>>  	}
>>  #endif /* CONFIG_PCI_MSI */
>> diff --git a/drivers/staging/crystalhd/crystalhd_lnx.c b/drivers/staging/crystalhd/crystalhd_lnx.c
>> index e6fb331..9459b42 100644
>> --- a/drivers/staging/crystalhd/crystalhd_lnx.c
>> +++ b/drivers/staging/crystalhd/crystalhd_lnx.c
>> @@ -45,7 +45,7 @@ static int chd_dec_enable_int(struct crystalhd_adp *adp)
>>  		return -EINVAL;
>>  	}
>>
>> -	if (adp->pdev->msi_enabled)
>> +	if (pci_dev_msi_enabled(adp->pdev, MSI_TYPE))
>>  		adp->msi = 1;
>>  	else
>>  		adp->msi = pci_enable_msi(adp->pdev);
>> diff --git a/drivers/xen/xen-pciback/pciback_ops.c b/drivers/xen/xen-pciback/pciback_ops.c
>> index c4a0666..fee2f19 100644
>> --- a/drivers/xen/xen-pciback/pciback_ops.c
>> +++ b/drivers/xen/xen-pciback/pciback_ops.c
>> @@ -64,8 +64,8 @@ static void xen_pcibk_control_isr(struct pci_dev *dev, int reset)
>>  		dev_data->irq_name,
>>  		dev_data->irq,
>>  		pci_is_enabled(dev) ? "on" : "off",
>> -		dev->msi_enabled ? "MSI" : "",
>> -		dev->msix_enabled ? "MSI/X" : "",
>> +		pci_dev_msi_enabled(dev, MSI_TYPE) ? "MSI" : "",
>> +		pci_dev_msi_enabled(dev, MSIX_TYPE) ? "MSI/X" : "",
>>  		dev_data->isr_on ? "enable" : "disable",
>>  		enable ? "enable" : "disable");
>>
>> @@ -90,8 +90,8 @@ out:
>>  		dev_data->irq_name,
>>  		dev_data->irq,
>>  		pci_is_enabled(dev) ? "on" : "off",
>> -		dev->msi_enabled ? "MSI" : "",
>> -		dev->msix_enabled ? "MSI/X" : "",
>> +		pci_dev_msi_enabled(dev, MSI_TYPE) ? "MSI" : "",
>> +		pci_dev_msi_enabled(dev, MSIX_TYPE) ? "MSI/X" : "",
>>  		enable ? (dev_data->isr_on ? "enabled" : "failed to enable") :
>>  			(dev_data->isr_on ? "failed to disable" : "disabled"));
>>  }
>> @@ -111,9 +111,9 @@ void xen_pcibk_reset_device(struct pci_dev *dev)
>>  #ifdef CONFIG_PCI_MSI
>>  		/* The guest could have been abruptly killed without
>>  		 * disabling MSI/MSI-X interrupts.*/
>> -		if (dev->msix_enabled)
>> +		if (pci_dev_msi_enabled(dev, MSIX_TYPE))
>>  			pci_disable_msix(dev);
>> -		if (dev->msi_enabled)
>> +		if (pci_dev_msi_enabled(dev, MSI_TYPE))
>>  			pci_disable_msi(dev);
>>  #endif
>>  		if (pci_is_enabled(dev))
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index 6ed3647..c6c01ae 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -33,6 +33,7 @@
>>
>>  #include <linux/pci_ids.h>
>>
>> +#include <linux/msi.h>
>>  /*
>>   * The PCI interface treats multi-function devices as independent
>>   * devices.  The slot/function address of each device is encoded
>> @@ -506,9 +507,16 @@ static inline struct pci_dev *pci_upstream_bridge(struct pci_dev *dev)
>>  }
>>
>>  #ifdef CONFIG_PCI_MSI
>> -static inline bool pci_dev_msi_enabled(struct pci_dev *pci_dev)
>> +static inline bool pci_dev_msi_enabled(struct pci_dev *pci_dev, int type)
>>  {
>> -	return pci_dev->msi_enabled || pci_dev->msix_enabled;
>> +	bool enabled = 0;
>> +
>> +	if (type & MSI_TYPE)
>> +		enabled |= pci_dev->msi_enabled;
>> +	if (type & MSIX_TYPE)
>> +		enabled |= pci_dev->msix_enabled;
>> +
>> +	return enabled;
>>  }
>>  #else
>>  static inline bool pci_dev_msi_enabled(struct pci_dev *pci_dev) { return false; }
>> diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
>> index bf06577..4634bd0 100644
>> --- a/virt/kvm/assigned-dev.c
>> +++ b/virt/kvm/assigned-dev.c
>> @@ -366,7 +366,7 @@ static int assigned_device_enable_host_msi(struct kvm *kvm,
>>  {
>>  	int r;
>>
>> -	if (!dev->dev->msi_enabled) {
>> +	if (!pci_dev_msi_enabled(dev->dev, MSI_TYPE)) {
>>  		r = pci_enable_msi(dev->dev);
>>  		if (r)
>>  			return r;
>> --
>> 1.7.1
>>
> 
> .
> 
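
For readers skimming the quoted patch: the reworked pci_dev_msi_enabled()
takes a type mask and returns true if any of the requested modes is enabled.
Below is a minimal user-space sketch of that check. The flag values for
MSI_TYPE/MSIX_TYPE and the struct fake_pci_dev stand-in are assumptions made
for illustration only; they are not taken from the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed flag values; the series defines MSI_TYPE and MSIX_TYPE elsewhere. */
#define MSI_TYPE  0x01
#define MSIX_TYPE 0x02

/* Hypothetical stand-in for the two enable bits kept in struct pci_dev. */
struct fake_pci_dev {
	unsigned int msi_enabled:1;
	unsigned int msix_enabled:1;
};

/* Same logic as the reworked helper: true if any requested mode is on. */
static bool fake_dev_msi_enabled(const struct fake_pci_dev *dev, int type)
{
	bool enabled = false;

	if (type & MSI_TYPE)
		enabled |= dev->msi_enabled;
	if (type & MSIX_TYPE)
		enabled |= dev->msix_enabled;

	return enabled;
}
```

Callers that used to test "msi_enabled || msix_enabled" pass
MSI_TYPE | MSIX_TYPE; callers that tested a single bit pass only that type.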


-- 
Thanks!
Yijing



* Re: [RFC PATCH 09/11] PCI/MSI: refactor PCI MSI driver
  2014-08-20  6:06   ` Bharat.Bhushan
@ 2014-08-20  6:34     ` Yijing Wang
  0 siblings, 0 replies; 41+ messages in thread
From: Yijing Wang @ 2014-08-20  6:34 UTC (permalink / raw)
  To: Bharat.Bhushan, linux-kernel
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, arnab.basu, virtualization, Hanjun Guo

>> @@ -1025,21 +1059,52 @@ int pci_msi_enabled(void)
>>  }
>>  EXPORT_SYMBOL(pci_msi_enabled);
>>
>> -void pci_msi_init_pci_dev(struct pci_dev *dev)
>> +static struct msi_ops pci_msi = {
>> +	.msi_set_enable = msi_set_enable,
>> +	.msi_setup_entry = msi_setup_entry,
>> +	.msix_setup_entries = msix_setup_entries,
>> +	.msi_mask_irq = default_msi_mask_irq,
>> +	.msix_mask_irq = default_msix_mask_irq,
>> +	.msi_read_message = __read_msi_msg,
>> +	.msi_write_message = __write_msi_msg,
>> +	.msi_set_intx =  pci_intx_for_msi,
>> +};
> 
> Just to be sure I understand this correctly: if I have a non-PCI driver "xyz" that wants to use separate ops, then I need to implement all of these functions in that driver, in something like driver/xyz/msi.c?

Yes. Different MSI devices have different MSI hardware registers, so every MSI type should provide its own msi_ops, or its own msi_driver in my new proposal.
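
A rough user-space sketch of what that could look like for a non-PCI device
follows. Everything named xyz_* is hypothetical, struct msi_ops and
struct msi_irqs are reduced to the fields needed here, and the MSI_TYPE
value is assumed; this only illustrates the dispatch idea, not the final
interface.

```c
#include <assert.h>
#include <stddef.h>

#define MSI_TYPE 0x01	/* assumed value */

struct msi_irqs;

/* Reduced, hypothetical subset of the proposed struct msi_ops. */
struct msi_ops {
	void (*msi_set_enable)(struct msi_irqs *msi, int enable, int type);
};

/* Reduced form of the proposed struct msi_irqs. */
struct msi_irqs {
	unsigned char msi_enabled:1;
	void *data;		/* back-pointer to the owning device */
	struct msi_ops *ops;	/* device-specific hooks */
};

/* Hypothetical non-PCI device with its own MSI control register. */
struct xyz_dev {
	unsigned int msi_ctrl;	/* bit 0: MSI enable */
	struct msi_irqs msi;
};

/* Device-private hook: only this code knows the register layout. */
static void xyz_msi_set_enable(struct msi_irqs *msi, int enable, int type)
{
	struct xyz_dev *xyz = msi->data;

	if (type != MSI_TYPE)
		return;
	if (enable)
		xyz->msi_ctrl |= 1u;
	else
		xyz->msi_ctrl &= ~1u;
}

static struct msi_ops xyz_msi_ops = {
	.msi_set_enable = xyz_msi_set_enable,
};

/* Generic code only sees struct msi_irqs and dispatches through ops. */
static void generic_msi_set_enable(struct msi_irqs *msi, int enable, int type)
{
	if (msi->ops && msi->ops->msi_set_enable)
		msi->ops->msi_set_enable(msi, enable, type);
}

static void xyz_msi_init(struct xyz_dev *xyz)
{
	xyz->msi_ctrl = 0;
	xyz->msi.msi_enabled = 0;
	xyz->msi.data = xyz;
	xyz->msi.ops = &xyz_msi_ops;
}
```

The generic layer never touches xyz registers directly; it reaches them only
through the ops table, which is the same role pci_msi plays for PCI devices.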


> 
> Thanks
> -Bharat
> 
>> +
>> +struct msi_irqs *alloc_msi_irqs(void *data, struct msi_ops *ops)
>>  {
>> -	INIT_LIST_HEAD(&dev->msi_list);
>> +	struct msi_irqs *msi;
>> +
>> +	msi = kzalloc(sizeof(struct msi_irqs), GFP_KERNEL);
>> +	if (!msi)
>> +		return NULL;
>>
>> +	INIT_LIST_HEAD(&msi->msi_list);
>> +	msi->data = data;
>> +	msi->ops = ops;
>> +	return msi;
>> +}
>> +
>> +void pci_msi_init_pci_dev(struct pci_dev *dev)
>> +{
>>  	/* Disable the msi hardware to avoid screaming interrupts
>>  	 * during boot.  This is the power on reset default so
>>  	 * usually this should be a noop.
>>  	 */
>>  	dev->msi_cap = pci_find_capability(dev, PCI_CAP_ID_MSI);
>> -	if (dev->msi_cap)
>> -		msi_set_enable(dev, 0);
>> -
>>  	dev->msix_cap = pci_find_capability(dev, PCI_CAP_ID_MSIX);
>> -	if (dev->msix_cap)
>> -		msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
>> +
>> +	if (dev->msi_cap || dev->msix_cap) {
>> +		dev->msi = alloc_msi_irqs(dev, &pci_msi);
>> +		if (!dev->msi)
>> +			return;
>> +
>> +		dev->msi->node = dev_to_node(&dev->dev);
>> +		if (dev->msi_cap)
>> +			msi_set_enable(dev->msi, 0, MSI_TYPE);
>> +
>> +		if (dev->msix_cap)
>> +			msi_set_enable(dev->msi, 0, MSIX_TYPE);
>> +	}
>>  }
>>
>>  /**
>> @@ -1060,13 +1125,13 @@ int pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec)
>>  	int rc;
>>  	struct msi_desc *entry;
>>
>> -	if (dev->current_state != PCI_D0)
>> +	if (dev->current_state != PCI_D0 || !dev->msi)
>>  		return -EINVAL;
>>
>> -	WARN_ON(!!dev->msi_enabled);
>> +	WARN_ON(!!pci_dev_msi_enabled(dev, MSI_TYPE));
>>
>>  	/* Check whether driver already requested MSI-X irqs */
>> -	if (dev->msix_enabled) {
>> +	if (pci_dev_msi_enabled(dev, MSIX_TYPE)) {
>>  		dev_info(&dev->dev,
>>  			 "can't enable MSI (MSI-X already enabled)\n");
>>  		return -EINVAL;
>> @@ -1095,7 +1160,7 @@ int pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec)
>>  	} while (rc);
>>
>>  	do {
>> -		rc = msi_capability_init(dev, nvec);
>> +		rc = msi_capability_init(dev->msi, nvec);
>>  		if (rc < 0) {
>>  			return rc;
>>  		} else if (rc > 0) {
>> @@ -1107,14 +1172,14 @@ int pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec)
>>
>>  	rc = populate_msi_sysfs(dev);
>>  	if (rc) {
>> -		msi_set_enable(dev, 0);
>> -		pci_intx_for_msi(dev, 1);
>> -		dev->msi_enabled = 0;
>> -		free_msi_irqs(dev);
>> +		msi_set_enable(dev->msi, 0, MSI_TYPE);
>> +		pci_intx_for_msi(dev->msi, 1);
>> +		dev->msi->msi_enabled = 0;
>> +		free_msi_irqs(dev->msi);
>>  		return rc;
>>  	}
>>
>> -	entry = list_entry(dev->msi_list.next, struct msi_desc, list);
>> +	entry = list_entry(dev->msi->msi_list.next, struct msi_desc, list);
>>  	dev->irq = entry->irq;
>>  	return nvec;
>>  }
>> @@ -1158,3 +1223,5 @@ int pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries,
>>  	return nvec;
>>  }
>>  EXPORT_SYMBOL(pci_enable_msix_range);
>> +
>> +
>> diff --git a/include/linux/msi.h b/include/linux/msi.h
>> index 5a672d3..fc8f3e8 100644
>> --- a/include/linux/msi.h
>> +++ b/include/linux/msi.h
>> @@ -83,15 +83,15 @@ struct msi_desc {
>>   * implemented as weak symbols so that they /can/ be overriden by
>>   * architecture specific code if needed.
>>   */
>> -int arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc);
>> +int arch_setup_msi_irq(struct msi_irqs *msi, struct msi_desc *desc);
>>  void arch_teardown_msi_irq(unsigned int irq);
>> -int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type);
>> -void arch_teardown_msi_irqs(struct pci_dev *dev);
>> -int arch_msi_check_device(struct pci_dev* dev, int nvec, int type);
>> -void arch_restore_msi_irqs(struct pci_dev *dev);
>> +int arch_setup_msi_irqs(struct msi_irqs *msi, int nvec, int type);
>> +void arch_teardown_msi_irqs(struct msi_irqs *msi);
>> +int arch_msi_check_device(struct msi_irqs *msi, int nvec, int type);
>> +void arch_restore_msi_irqs(struct msi_irqs *msi);
>>
>> -void default_teardown_msi_irqs(struct pci_dev *dev);
>> -void default_restore_msi_irqs(struct pci_dev *dev);
>> +void default_teardown_msi_irqs(struct msi_irqs *msi);
>> +void default_restore_msi_irqs(struct msi_irqs *msi);
>>  u32 default_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag);
>>  u32 default_msix_mask_irq(struct msi_desc *desc, u32 flag);
>>
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index c7bca1c..d7126fc 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -334,8 +334,6 @@ struct pci_dev {
>>  	unsigned int	block_cfg_access:1;	/* config space access is blocked */
>>  	unsigned int	broken_parity_status:1;	/* Device generates false positive parity */
>>  	unsigned int	irq_reroute_variant:2;	/* device needs IRQ rerouting variant */
>> -	unsigned int	msi_enabled:1;
>> -	unsigned int	msix_enabled:1;
>>  	unsigned int	ari_enabled:1;	/* ARI forwarding */
>>  	unsigned int	is_managed:1;
>>  	unsigned int    needs_freset:1; /* Dev requires fundamental reset */
>> @@ -358,7 +356,7 @@ struct pci_dev {
>>  	struct bin_attribute *res_attr[DEVICE_COUNT_RESOURCE]; /* sysfs file for resources */
>>  	struct bin_attribute *res_attr_wc[DEVICE_COUNT_RESOURCE]; /* sysfs file for WC mapping of resources */
>>  #ifdef CONFIG_PCI_MSI
>> -	struct list_head msi_list;
>> +	struct msi_irqs *msi;
>>  	const struct attribute_group **msi_irq_groups;
>>  #endif
>>  	struct pci_vpd *vpd;
>> @@ -510,11 +508,14 @@ static inline struct pci_dev *pci_upstream_bridge(struct pci_dev *dev)
>>  static inline bool pci_dev_msi_enabled(struct pci_dev *pci_dev, int type)
>>  {
>>  	bool enabled = 0;
>> +
>> +	if (!pci_dev->msi)
>> +		return false;
>>
>>  	if (type & MSI_TYPE)
>> -		enabled |= pci_dev->msi_enabled;
>> +		enabled |= pci_dev->msi->msi_enabled;
>>  	if (type & MSIX_TYPE)
>> -		enabled |= pci_dev->msix_enabled;
>> +		enabled |= pci_dev->msi->msix_enabled;
>>
>>  	return enabled;
>>  }
>> --
>> 1.7.1
>>
> 
> .
> 
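
To make the lifetime rule in the quoted hunks explicit: a device only gets a
struct msi_irqs when it has an MSI or MSI-X capability, so every enable check
has to tolerate a NULL pointer. A minimal user-space sketch follows, with
calloc() standing in for kzalloc(), a hand-rolled list head standing in for
the kernel's, and assumed values for MSI_TYPE/MSIX_TYPE; struct fake_dev is
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MSI_TYPE  0x01	/* assumed value */
#define MSIX_TYPE 0x02	/* assumed value */

/* Minimal circular list head, standing in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

struct msi_ops;	/* opaque in this sketch */

struct msi_irqs {
	unsigned char msi_enabled:1;
	unsigned char msix_enabled:1;
	struct list_head msi_list;
	void *data;
	struct msi_ops *ops;
};

/* User-space analogue of alloc_msi_irqs(): calloc() replaces kzalloc(). */
static struct msi_irqs *alloc_msi_irqs(void *data, struct msi_ops *ops)
{
	struct msi_irqs *msi = calloc(1, sizeof(*msi));

	if (!msi)
		return NULL;

	/* Equivalent of INIT_LIST_HEAD(): an empty list points at itself. */
	msi->msi_list.next = &msi->msi_list;
	msi->msi_list.prev = &msi->msi_list;
	msi->data = data;
	msi->ops = ops;
	return msi;
}

/* Hypothetical device that owns a msi_irqs only if it is MSI capable. */
struct fake_dev {
	struct msi_irqs *msi;
};

static bool fake_dev_msi_enabled(const struct fake_dev *dev, int type)
{
	bool enabled = false;

	if (!dev->msi)	/* no MSI capability: never enabled */
		return false;

	if (type & MSI_TYPE)
		enabled |= dev->msi->msi_enabled;
	if (type & MSIX_TYPE)
		enabled |= dev->msi->msix_enabled;

	return enabled;
}
```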


-- 
Thanks!
Yijing



* Re: [RFC PATCH 10/11] PCI/MSI: Split the generic MSI code into new file
  2014-08-20  6:18   ` Bharat.Bhushan
@ 2014-08-20  6:43     ` Yijing Wang
  0 siblings, 0 replies; 41+ messages in thread
From: Yijing Wang @ 2014-08-20  6:43 UTC (permalink / raw)
  To: Bharat.Bhushan, linux-kernel
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, arnab.basu, virtualization, Hanjun Guo

>> +int msi_capability_init(struct msi_irqs *msi, int nvec)
>> +{
>> +	struct msi_desc *entry;
>> +	int ret;
>> +	unsigned mask;
>> +
>> +	msi_set_enable(msi, 0, MSI_TYPE);	/* Disable MSI during set up */
>> +
>> +	/* MSI Entry Initialization */
>> +	entry = msi_setup_entry(msi);
>> +	if (!entry)
>> +		return -ENOMEM;
>> +
>> +	/* All MSIs are unmasked by default, Mask them all */
> 
> Will this be true for non-pci devices as well?

In my opinion, yes. I think all MSI devices should be masked during setup.
Of course, the mask and unmask functions will be overridden by device-private mask/unmask functions.
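
As a rough illustration of "mask everything during setup, through a hook the
device can replace", here is a user-space sketch. msi_mask() uses the same
arithmetic as the helper in the patch; struct fake_msi_desc, its mask_irq
hook and the fake_* functions are hypothetical names for this example only.

```c
#include <assert.h>
#include <stdint.h>

/* Same arithmetic as msi_mask() in the patch: 2^x vectors -> 2^x mask bits. */
static uint32_t msi_mask(unsigned int x)
{
	/* Don't shift by >= width of type */
	if (x >= 5)
		return 0xffffffffu;
	return (1u << (1u << x)) - 1;
}

/* Hypothetical descriptor carrying a per-device mask hook. */
struct fake_msi_desc {
	uint32_t masked;	/* cached mask bits */
	uint32_t (*mask_irq)(struct fake_msi_desc *desc,
			     uint32_t mask, uint32_t flag);
};

/* Default behaviour: clear the bits selected by mask, then set flag. */
static uint32_t fake_default_mask_irq(struct fake_msi_desc *desc,
				      uint32_t mask, uint32_t flag)
{
	return (desc->masked & ~mask) | flag;
}

/* Generic entry point: always goes through the (overridable) hook. */
static void fake_msi_mask_irq(struct fake_msi_desc *desc,
			      uint32_t mask, uint32_t flag)
{
	desc->masked = desc->mask_irq(desc, mask, flag);
}
```

A device with different mask registers would install its own mask_irq
function; the setup path that masks all vectors before programming them
stays the same.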


> 
> Thanks
> -Bharat
> 
>> +	mask = msi_mask(entry->msi_attrib.multi_cap);
>> +	msi_mask_irq(entry, mask, mask);
>> +
>> +	/* Configure MSI capability structure */
>> +	ret = arch_setup_msi_irqs(msi, nvec, MSI_TYPE);
>> +	if (ret)
>> +		goto err;
>> +
>> +	/* Set MSI enabled bits	 */
>> +	msi_set_intx(msi, 0);
>> +	msi_set_enable(msi, 1, MSI_TYPE);
>> +	msi->msi_enabled = 1;
>> +
>> +	return 0;
>> +
>> +err:
>> +	msi_mask_irq(entry, mask, ~mask);
>> +	free_msi_irqs(msi);
>> +	return ret;
>> +}
>> +
>> +static void msix_program_entries(struct msi_irqs *msi,
>> +				 struct msix_entry *entries)
>> +{
>> +	struct msi_desc *entry;
>> +	int i = 0;
>> +
>> +	list_for_each_entry(entry, &msi->msi_list, list) {
>> +		entries[i].vector = entry->irq;
>> +		irq_set_msi_desc(entry->irq, entry);
>> +		i++;
>> +	}
>> +}
>> +
>> +/**
>> + * msix_capability_init - configure device's MSI-X capability
>> + * @dev: pointer to the pci_dev data structure of MSI-X device function
>> + * @entries: pointer to an array of struct msix_entry entries
>> + * @nvec: number of @entries
>> + *
>> + * Setup the MSI-X capability structure of device function with a
>> + * single MSI-X irq. A return of zero indicates the successful setup of
>> + * requested MSI-X entries with allocated irqs or non-zero for otherwise.
>> + **/
>> +int msix_capability_init(struct msi_irqs *msi, void __iomem *base,
>> +				struct msix_entry *entries, int nvec)
>> +{
>> +	int ret;
>> +
>> +	/* Ensure MSI-X is disabled while it is set up */
>> +	msi_set_enable(msi, 0, MSIX_TYPE);
>> +
>> +	ret = msix_setup_entries(msi, base, entries, nvec);
>> +	if (ret)
>> +		return ret;
>> +
>> +	ret = arch_setup_msi_irqs(msi, nvec, MSIX_TYPE);
>> +	if (ret)
>> +		goto out_avail;
>> +
>> +	msix_program_entries(msi, entries);
>> +
>> +	/* Set MSI-X enabled bits and unmask the function */
>> +	msi_set_intx(msi, 0);
>> +	msi->msix_enabled = 1;
>> +
>> +	msi_set_enable(msi, 1, MSIX_TYPE);
>> +
>> +	return 0;
>> +
>> +out_avail:
>> +	if (ret < 0) {
>> +		/*
>> +		 * If we had some success, report the number of irqs
>> +		 * we succeeded in setting up.
>> +		 */
>> +		struct msi_desc *entry;
>> +		int avail = 0;
>> +
>> +		list_for_each_entry(entry, &msi->msi_list, list) {
>> +			if (entry->irq != 0)
>> +				avail++;
>> +		}
>> +		if (avail != 0)
>> +			ret = avail;
>> +	}
>> +
>> +	free_msi_irqs(msi);
>> +
>> +	return ret;
>> +}
>> +
>> diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
>> index 893503f..1a10488 100644
>> --- a/drivers/pci/Kconfig
>> +++ b/drivers/pci/Kconfig
>> @@ -2,10 +2,10 @@
>>  # PCI configuration
>>  #
>>  config PCI_MSI
>> -	bool "Message Signaled Interrupts (MSI and MSI-X)"
>> -	depends on PCI
>> +	bool "PCI Message Signaled Interrupts (MSI and MSI-X)"
>> +	depends on PCI && MSI
>>  	help
>> -	   This allows device drivers to enable MSI (Message Signaled
>> +	   This allows PCI device drivers to enable MSI (Message Signaled
>>  	   Interrupts).  Message Signaled Interrupts enable a device to
>>  	   generate an interrupt using an inbound Memory Write on its
>>  	   PCI bus instead of asserting a device IRQ pin.
>> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
>> index f0c5989..df7223c 100644
>> --- a/drivers/pci/msi.c
>> +++ b/drivers/pci/msi.c
>> @@ -26,121 +26,8 @@ static int pci_msi_enable = 1;
>>
>>  #define msix_table_size(flags)	((flags & PCI_MSIX_FLAGS_QSIZE) + 1)
>>
>> -
>> -/* Arch hooks */
>> -
>> -int __weak arch_setup_msi_irq(struct msi_irqs *msi, struct msi_desc *desc)
>> -{
>> -	struct pci_dev *dev = msi->data; //TO BE DONE: rework msi_chip to support Non-PCI
>> -	struct msi_chip *chip = dev->bus->msi;
>> -	int err;
>> -
>> -	if (!chip || !chip->setup_irq)
>> -		return -EINVAL;
>> -
>> -	err = chip->setup_irq(chip, dev, desc);
>> -	if (err < 0)
>> -		return err;
>> -
>> -	irq_set_chip_data(desc->irq, chip);
>> -
>> -	return 0;
>> -}
>> -
>> -void __weak arch_teardown_msi_irq(unsigned int irq)
>> -{
>> -	struct msi_chip *chip = irq_get_chip_data(irq);
>> -
>> -	if (!chip || !chip->teardown_irq)
>> -		return;
>> -
>> -	chip->teardown_irq(chip, irq);
>> -}
>> -
>> -int __weak arch_msi_check_device(struct msi_irqs *msi, int nvec, int type)
>> -{
>> -	struct pci_dev *dev = msi->data; //TO BE DONE: rework msi_chip to support Non-PCI
>> -	struct msi_chip *chip = dev->bus->msi;
>> -
>> -	if (!chip || !chip->check_device)
>> -		return 0;
>> -
>> -	return chip->check_device(chip, dev, nvec, type);
>> -}
>> -
>> -int __weak arch_setup_msi_irqs(struct msi_irqs *msi, int nvec, int type)
>> -{
>> -	struct msi_desc *entry;
>> -	int ret;
>> -
>> -	/*
>> -	 * If an architecture wants to support multiple MSI, it needs to
>> -	 * override arch_setup_msi_irqs()
>> -	 */
>> -	if (type == MSI_TYPE && nvec > 1)
>> -		return 1;
>> -
>> -	list_for_each_entry(entry, &msi->msi_list, list) {
>> -		ret = arch_setup_msi_irq(msi, entry);
>> -		if (ret < 0)
>> -			return ret;
>> -		if (ret > 0)
>> -			return -ENOSPC;
>> -	}
>> -
>> -	return 0;
>> -}
>> -
>> -/*
>> - * We have a default implementation available as a separate non-weak
>> - * function, as it is used by the Xen x86 PCI code
>> - */
>> -void default_teardown_msi_irqs(struct msi_irqs *msi)
>> -{
>> -	struct msi_desc *entry;
>> -
>> -	list_for_each_entry(entry, &msi->msi_list, list) {
>> -		int i, nvec;
>> -		if (entry->irq == 0)
>> -			continue;
>> -		if (entry->nvec_used)
>> -			nvec = entry->nvec_used;
>> -		else
>> -			nvec = 1 << entry->msi_attrib.multiple;
>> -		for (i = 0; i < nvec; i++)
>> -			arch_teardown_msi_irq(entry->irq + i);
>> -	}
>> -}
>> -
>> -void __weak arch_teardown_msi_irqs(struct msi_irqs *msi)
>> -{
>> -	return default_teardown_msi_irqs(msi);
>> -}
>> -
>> -static void default_restore_msi_irq(struct msi_irqs *msi, int irq)
>> -{
>> -	struct msi_desc *entry;
>> -
>> -	entry = NULL;
>> -	if (msi->msix_enabled) {
>> -		list_for_each_entry(entry, &msi->msi_list, list) {
>> -			if (irq == entry->irq)
>> -				break;
>> -		}
>> -	} else if (msi->msi_enabled)  {
>> -		entry = irq_get_msi_desc(irq);
>> -	}
>> -
>> -	if (entry)
>> -		write_msi_msg(irq, &entry->msg);
>> -}
>> -
>> -void __weak arch_restore_msi_irqs(struct msi_irqs *msi)
>> -{
>> -	return default_restore_msi_irqs(msi);
>> -}
>> -
>> -static void msix_clear_and_set_ctrl(struct pci_dev *dev, u16 clear, u16 set)
>> +static void msix_clear_and_set_ctrl(struct pci_dev *dev,
>> +		u16 clear, u16 set)
>>  {
>>  	u16 ctrl;
>>
>> @@ -150,7 +37,7 @@ static void msix_clear_and_set_ctrl(struct pci_dev *dev, u16 clear, u16 set)
>>  	pci_write_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, ctrl);
>>  }
>>
>> -static void msi_set_enable(struct msi_irqs *msi, int enable, int type)
>> +static void pci_msi_set_enable(struct msi_irqs *msi, int enable, int type)
>>  {
>>  	u16 control;
>>  	struct pci_dev *dev = msi->data;
>> @@ -169,21 +56,13 @@ static void msi_set_enable(struct msi_irqs *msi, int enable, int type)
>>  	}
>>  }
>>
>> -static inline __attribute_const__ u32 msi_mask(unsigned x)
>> -{
>> -	/* Don't shift by >= width of type */
>> -	if (x >= 5)
>> -		return 0xffffffff;
>> -	return (1 << (1 << x)) - 1;
>> -}
>> -
>>  /*
>>   * PCI 2.3 does not specify mask bits for each MSI interrupt.  Attempting to
>>   * mask all MSI interrupts by clearing the MSI enable bit does not work
>>   * reliably as devices without an INTx disable bit will then generate a
>>   * level IRQ which will never be cleared.
>>   */
>> -u32 default_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
>> +u32 pci_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
>>  {
>>  	struct pci_dev *dev = desc->msi->data;
>>  	u32 mask_bits = desc->masked;
>> @@ -198,16 +77,6 @@ u32 default_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
>>  	return mask_bits;
>>  }
>>
>> -__weak u32 arch_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
>> -{
>> -	return default_msi_mask_irq(desc, mask, flag);
>> -}
>> -
>> -static void msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
>> -{
>> -	desc->masked = arch_msi_mask_irq(desc, mask, flag);
>> -}
>> -
>>  /*
>>   * This internal function does not flush PCI writes to the device.
>>   * All users must ensure that they read from the device before either
>> @@ -215,7 +84,7 @@ static void msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
>>   * file.  This saves a few milliseconds when initialising devices with lots
>>   * of MSI-X interrupts.
>>   */
>> -u32 default_msix_mask_irq(struct msi_desc *desc, u32 flag)
>> +u32 pci_msix_mask_irq(struct msi_desc *desc, u32 flag)
>>  {
>>  	u32 mask_bits = desc->masked;
>>  	unsigned offset = desc->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE +
>> @@ -228,40 +97,7 @@ u32 default_msix_mask_irq(struct msi_desc *desc, u32 flag)
>>  	return mask_bits;
>>  }
>>
>> -__weak u32 arch_msix_mask_irq(struct msi_desc *desc, u32 flag)
>> -{
>> -	return default_msix_mask_irq(desc, flag);
>> -}
>> -
>> -static void msix_mask_irq(struct msi_desc *desc, u32 flag)
>> -{
>> -	desc->masked = arch_msix_mask_irq(desc, flag);
>> -}
>> -
>> -static void msi_set_mask_bit(struct irq_data *data, u32 flag)
>> -{
>> -	struct msi_desc *desc = irq_data_get_msi(data);
>> -
>> -	if (desc->msi_attrib.is_msix) {
>> -		msix_mask_irq(desc, flag);
>> -		readl(desc->mask_base);		/* Flush write to device */
>> -	} else {
>> -		unsigned offset = data->irq - desc->irq;
>> -		msi_mask_irq(desc, 1 << offset, flag << offset);
>> -	}
>> -}
>> -
>> -void mask_msi_irq(struct irq_data *data)
>> -{
>> -	msi_set_mask_bit(data, 1);
>> -}
>> -
>> -void unmask_msi_irq(struct irq_data *data)
>> -{
>> -	msi_set_mask_bit(data, 0);
>> -}
>> -
>> -static void msix_set_all_mask(struct msi_irqs *msi, int flag)
>> +static void pci_msix_set_all_mask(struct msi_irqs *msi, int flag)
>>  {
>>  	struct pci_dev *dev = msi->data;
>>
>> @@ -271,16 +107,7 @@ static void msix_set_all_mask(struct msi_irqs *msi, int flag)
>>  		msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
>>  }
>>
>> -void default_restore_msi_irqs(struct msi_irqs *msi)
>> -{
>> -	struct msi_desc *entry;
>> -
>> -	list_for_each_entry(entry, &msi->msi_list, list) {
>> -		default_restore_msi_irq(msi, entry->irq);
>> -	}
>> -}
>> -
>> -void __read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
>> +void pci_read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
>>  {
>>  	struct pci_dev *dev = entry->msi->data;
>>
>> @@ -311,31 +138,7 @@ void __read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
>>  	}
>>  }
>>
>> -void read_msi_msg(unsigned int irq, struct msi_msg *msg)
>> -{
>> -	struct msi_desc *entry = irq_get_msi_desc(irq);
>> -
>> -	__read_msi_msg(entry, msg);
>> -}
>> -
>> -void __get_cached_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
>> -{
>> -	/* Assert that the cache is valid, assuming that
>> -	 * valid messages are not all-zeroes. */
>> -	BUG_ON(!(entry->msg.address_hi | entry->msg.address_lo |
>> -		 entry->msg.data));
>> -
>> -	*msg = entry->msg;
>> -}
>> -
>> -void get_cached_msi_msg(unsigned int irq, struct msi_msg *msg)
>> -{
>> -	struct msi_desc *entry = irq_get_msi_desc(irq);
>> -
>> -	__get_cached_msi_msg(entry, msg);
>> -}
>> -
>> -void __write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
>> +void pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
>>  {
>>  	struct pci_dev *dev = entry->msi->data;
>>
>> @@ -373,13 +176,6 @@ void __write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
>>  	entry->msg = *msg;
>>  }
>>
>> -void write_msi_msg(unsigned int irq, struct msi_msg *msg)
>> -{
>> -	struct msi_desc *entry = irq_get_msi_desc(irq);
>> -
>> -	__write_msi_msg(entry, msg);
>> -}
>> -
>>  static void free_msi_sysfs(struct pci_dev *dev)
>>  {
>>  	struct attribute **msi_attrs;
>> @@ -403,58 +199,6 @@ static void free_msi_sysfs(struct pci_dev *dev)
>>  	}
>>  }
>>
>> -static void free_msi_irqs(struct msi_irqs *msi)
>> -{
>> -	struct msi_desc *entry, *tmp;
>> -
>> -	list_for_each_entry(entry, &msi->msi_list, list) {
>> -		int i, nvec;
>> -		if (!entry->irq)
>> -			continue;
>> -		if (entry->nvec_used)
>> -			nvec = entry->nvec_used;
>> -		else
>> -			nvec = 1 << entry->msi_attrib.multiple;
>> -		for (i = 0; i < nvec; i++)
>> -			BUG_ON(irq_has_action(entry->irq + i));
>> -	}
>> -
>> -	arch_teardown_msi_irqs(msi);
>> -
>> -	list_for_each_entry_safe(entry, tmp, &msi->msi_list, list) {
>> -		if (entry->msi_attrib.is_msix) {
>> -			if (list_is_last(&entry->list, &msi->msi_list))
>> -				iounmap(entry->mask_base);
>> -		}
>> -
>> -		/*
>> -		 * Its possible that we get into this path
>> -		 * When populate_msi_sysfs fails, which means the entries
>> -		 * were not registered with sysfs.  In that case don't
>> -		 * unregister them.
>> -		 */
>> -		if (entry->kobj.parent) {
>> -			kobject_del(&entry->kobj);
>> -			kobject_put(&entry->kobj);
>> -		}
>> -
>> -		list_del(&entry->list);
>> -		kfree(entry);
>> -	}
>> -}
>> -
>> -static struct msi_desc *alloc_msi_entry(struct msi_irqs *msi)
>> -{
>> -	struct msi_desc *desc = kzalloc(sizeof(*desc), GFP_KERNEL);
>> -	if (!desc)
>> -		return NULL;
>> -
>> -	INIT_LIST_HEAD(&desc->list);
>> -	desc->msi = msi;
>> -
>> -	return desc;
>> -}
>> -
>>  static void pci_intx_for_msi(struct msi_irqs *msi, int enable)
>>  {
>>  	struct pci_dev *dev = msi->data;
>> @@ -474,7 +218,7 @@ static void __pci_restore_msi_state(struct pci_dev *dev)
>>  	entry = irq_get_msi_desc(dev->irq);
>>
>>  	pci_intx_for_msi(dev->msi, 0);
>> -	msi_set_enable(dev->msi, 0, MSI_TYPE);
>> +	pci_msi_set_enable(dev->msi, 0, MSI_TYPE);
>>  	arch_restore_msi_irqs(dev->msi);
>>
>>  	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
>> @@ -496,13 +240,13 @@ static void __pci_restore_msix_state(struct pci_dev *dev)
>>
>>  	/* route the table */
>>  	pci_intx_for_msi(msi, 0);
>> -	msi_set_enable(msi, 1, MSIX_TYPE);
>> -	msix_set_all_mask(msi, 1);
>> +	pci_msi_set_enable(msi, 1, MSIX_TYPE);
>> +	pci_msix_set_all_mask(msi, 1);
>>  	arch_restore_msi_irqs(msi);
>>  	list_for_each_entry(entry, &msi->msi_list, list)
>>  		msix_mask_irq(entry, entry->masked);
>>
>> -	msix_set_all_mask(msi, 0);
>> +	pci_msix_set_all_mask(msi, 0);
>>  }
>>
>>  void pci_restore_msi_state(struct pci_dev *dev)
>> @@ -606,22 +350,16 @@ error_attrs:
>>  	return ret;
>>  }
>>
>> -static struct msi_desc *msi_setup_entry(struct msi_irqs *msi)
>> +static struct msi_desc *pci_msi_setup_entry(struct msi_irqs *msi,
>> +		struct msi_desc *entry)
>>  {
>>  	u16 control;
>> -	struct msi_desc *entry;
>>  	struct pci_dev *dev = msi->data;
>>
>>  	/* MSI Entry Initialization */
>> -	entry = alloc_msi_entry(msi);
>> -	if (!entry)
>> -		return NULL;
>> -
>>  	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
>>
>> -	entry->msi_attrib.is_msix	= 0;
>>  	entry->msi_attrib.is_64		= !!(control & PCI_MSI_FLAGS_64BIT);
>> -	entry->msi_attrib.entry_nr	= 0;
>>  	entry->msi_attrib.maskbit	= !!(control & PCI_MSI_FLAGS_MASKBIT);
>>  	entry->msi_attrib.default_irq	= dev->irq;	/* Save IOAPIC IRQ */
>>  	entry->msi_attrib.multi_cap	= (control & PCI_MSI_FLAGS_QMASK) >> 1;
>> @@ -638,52 +376,6 @@ static struct msi_desc *msi_setup_entry(struct msi_irqs *msi)
>>  	return entry;
>>  }
>>
>> -/**
>> - * msi_capability_init - configure device's MSI capability structure
>> - * @dev: pointer to the pci_dev data structure of MSI device function
>> - * @nvec: number of interrupts to allocate
>> - *
>> - * Setup the MSI capability structure of the device with the requested
>> - * number of interrupts.  A return value of zero indicates the successful
>> - * setup of an entry with the new MSI irq.  A negative return value indicates
>> - * an error, and a positive return value indicates the number of interrupts
>> - * which could have been allocated.
>> - */
>> -static int msi_capability_init(struct msi_irqs *msi, int nvec)
>> -{
>> -	struct msi_desc *entry;
>> -	int ret;
>> -	unsigned mask;
>> -
>> -	msi_set_enable(msi, 0, MSI_TYPE);	/* Disable MSI during set up */
>> -
>> -	entry = msi_setup_entry(msi);
>> -	if (!entry)
>> -		return -ENOMEM;
>> -
>> -	/* All MSIs are unmasked by default, Mask them all */
>> -	mask = msi_mask(entry->msi_attrib.multi_cap);
>> -	msi_mask_irq(entry, mask, mask);
>> -
>> -	list_add_tail(&entry->list, &msi->msi_list);
>> -
>> -	/* Configure MSI capability structure */
>> -	ret = arch_setup_msi_irqs(msi, nvec, MSI_TYPE);
>> -	if (ret)
>> -		goto err;
>> -
>> -	/* Set MSI enabled bits	 */
>> -	pci_intx_for_msi(msi, 0);
>> -	msi_set_enable(msi, 1, MSI_TYPE);
>> -	msi->msi_enabled = 1;
>> -	return 0;
>> -
>> -err:
>> -	msi_mask_irq(entry, mask, ~mask);
>> -	free_msi_irqs(msi);
>> -	return ret;
>> -}
>> -
>>  static void __iomem *msix_map_region(struct pci_dev *dev, unsigned nr_entries)
>>  {
>>  	resource_size_t phys_addr;
>> @@ -699,28 +391,19 @@ static void __iomem *msix_map_region(struct pci_dev *dev, unsigned nr_entries)
>>  	return ioremap_nocache(phys_addr, nr_entries * PCI_MSIX_ENTRY_SIZE);
>>  }
>>
>> -static int msix_setup_entries(struct msi_irqs *msi, void __iomem *base,
>> -			      struct msix_entry *entries, int nvec)
>> +static int pci_msix_setup_entries(struct msi_irqs *msi, struct msix_entry *entries)
>>  {
>> +	int offset, i = 0;
>>  	struct msi_desc *entry;
>> -	int i, offset;
>>  	struct pci_dev *dev = msi->data;
>>
>> -	for (i = 0; i < nvec; i++) {
>> -		entry = alloc_msi_entry(msi);
>> -		if (!entry) {
>> -			if (!i)
>> -				iounmap(base);
>> -			else
>> -				free_msi_irqs(msi);
>> -			/* No enough memory. Don't try again */
>> -			return -ENOMEM;
>> -		}
>>
>> -		entry->msi_attrib.is_msix	= 1;
>> -		entry->msi_attrib.is_64		= 1;
>> -		entry->msi_attrib.entry_nr	= entries[i].entry;
>> -		entry->mask_base		= base;
>> +	list_for_each_entry(entry, &msi->msi_list, list) {
>> +		/*
>> +		 * Some devices require MSI-X to be enabled before we can touch the
>> +		 * MSI-X registers.  We need to mask all the vectors to prevent
>> +		 * interrupts coming in before they're fully set up.
>> +		 */
>>
>>  		msix_clear_and_set_ctrl(dev, 0,
>>  				PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE);
>> @@ -730,87 +413,10 @@ static int msix_setup_entries(struct msi_irqs *msi, void __iomem *base,
>>  		msix_mask_irq(entry, 1);
>>  		msix_clear_and_set_ctrl(dev,
>>  				PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE, 0);
>> -
>> -		list_add_tail(&entry->list, &msi->msi_list);
>> -	}
>> -
>> -	return 0;
>> -}
>> -
>> -static void msix_program_entries(struct msi_irqs *msi,
>> -				 struct msix_entry *entries)
>> -{
>> -	struct msi_desc *entry;
>> -	int i = 0;
>> -
>> -	list_for_each_entry(entry, &msi->msi_list, list) {
>> -		entries[i].vector = entry->irq;
>> -		irq_set_msi_desc(entry->irq, entry);
>>  		i++;
>>  	}
>> -}
>> -
>> -/**
>> - * msix_capability_init - configure device's MSI-X capability
>> - * @dev: pointer to the pci_dev data structure of MSI-X device function
>> - * @entries: pointer to an array of struct msix_entry entries
>> - * @nvec: number of @entries
>> - *
>> - * Setup the MSI-X capability structure of device function with a
>> - * single MSI-X irq. A return of zero indicates the successful setup of
>> - * requested MSI-X entries with allocated irqs or non-zero for otherwise.
>> - **/
>> -static int msix_capability_init(struct msi_irqs *msi, void __iomem *base,
>> -				struct msix_entry *entries, int nvec)
>> -{
>> -	int ret;
>> -
>> -	/* Ensure MSI-X is disabled while it is set up */
>> -	msi_set_enable(msi, 0, MSIX_TYPE);
>> -
>> -	ret = msix_setup_entries(msi, base, entries, nvec);
>> -	if (ret)
>> -		return ret;
>> -
>> -	ret = arch_setup_msi_irqs(msi, nvec, MSIX_TYPE);
>> -	if (ret)
>> -		goto out_avail;
>> -
>> -	/*
>> -	 * Some devices require MSI-X to be enabled before we can touch the
>> -	 * MSI-X registers.  We need to mask all the vectors to prevent
>> -	 * interrupts coming in before they're fully set up.
>> -	 */
>> -	msix_program_entries(msi, entries);
>> -
>> -	/* Set MSI-X enabled bits and unmask the function */
>> -	pci_intx_for_msi(msi, 0);
>> -	msi->msix_enabled = 1;
>> -
>> -	msi_set_enable(msi, 1, MSIX_TYPE);
>>
>>  	return 0;
>> -
>> -out_avail:
>> -	if (ret < 0) {
>> -		/*
>> -		 * If we had some success, report the number of irqs
>> -		 * we succeeded in setting up.
>> -		 */
>> -		struct msi_desc *entry;
>> -		int avail = 0;
>> -
>> -		list_for_each_entry(entry, &msi->msi_list, list) {
>> -			if (entry->irq != 0)
>> -				avail++;
>> -		}
>> -		if (avail != 0)
>> -			ret = avail;
>> -	}
>> -
>> -	free_msi_irqs(msi);
>> -
>> -	return ret;
>>  }
>>
>>  /**
>> @@ -886,25 +492,14 @@ EXPORT_SYMBOL(pci_msi_vec_count);
>>  void pci_msi_shutdown(struct pci_dev *dev)
>>  {
>>  	struct msi_desc *desc;
>> -	u32 mask;
>>
>>  	if (!pci_msi_enable || !dev ||
>>  			!pci_dev_msi_enabled(dev, MSI_TYPE))
>>  		return;
>>
>> -	BUG_ON(list_empty(&dev->msi->msi_list));
>> -	desc = list_first_entry(&dev->msi->msi_list, struct msi_desc, list);
>> -
>> -	msi_set_enable(dev->msi, 0, MSI_TYPE);
>> -	pci_intx_for_msi(dev->msi, 1);
>> -	dev->msi->msi_enabled = 0;
>> -
>> -	/* Return the device with MSI unmasked as initial states */
>> -	mask = msi_mask(desc->msi_attrib.multi_cap);
>> -	/* Keep cached state to be restored */
>> -	arch_msi_mask_irq(desc, mask, ~mask);
>> -
>> +	msi_shutdown(dev->msi);
>>  	/* Restore dev->irq to its default pin-assertion irq */
>> +	desc = list_first_entry(&dev->msi->msi_list, struct msi_desc, list);
>>  	dev->irq = desc->msi_attrib.default_irq;
>>  }
>>
>> @@ -1014,20 +609,10 @@ EXPORT_SYMBOL(pci_enable_msix);
>>
>>  void pci_msix_shutdown(struct pci_dev *dev)
>>  {
>> -	struct msi_desc *entry;
>> -
>> -	if (!pci_msi_enable || !dev || !pci_dev_msi_enabled(dev, MSIX_TYPE))
>> +	if (!pci_msi_enable || !dev)
>>  		return;
>>
>> -	/* Return the device with MSI-X masked as initial states */
>> -	list_for_each_entry(entry, &dev->msi->msi_list, list) {
>> -		/* Keep cached states to be restored */
>> -		arch_msix_mask_irq(entry, 1);
>> -	}
>> -
>> -	msi_set_enable(dev->msi, 0, MSIX_TYPE);
>> -	pci_intx_for_msi(dev->msi, 1);
>> -	dev->msi->msix_enabled = 0;
>> +	msix_shutdown(dev->msi);
>>  }
>>
>>  void pci_disable_msix(struct pci_dev *dev)
>> @@ -1060,30 +645,16 @@ int pci_msi_enabled(void)
>>  EXPORT_SYMBOL(pci_msi_enabled);
>>
>>  static struct msi_ops pci_msi = {
>> -	.msi_set_enable = msi_set_enable,
>> -	.msi_setup_entry = msi_setup_entry,
>> -	.msix_setup_entries = msix_setup_entries,
>> -	.msi_mask_irq = default_msi_mask_irq,
>> -	.msix_mask_irq = default_msix_mask_irq,
>> -	.msi_read_message = __read_msi_msg,
>> -	.msi_write_message = __write_msi_msg,
>> +	.msi_set_enable = pci_msi_set_enable,
>> +	.msi_setup_entry = pci_msi_setup_entry,
>> +	.msix_setup_entries = pci_msix_setup_entries,
>> +	.msi_mask_irq = pci_msi_mask_irq,
>> +	.msix_mask_irq = pci_msix_mask_irq,
>> +	.msi_read_message = pci_read_msi_msg,
>> +	.msi_write_message = pci_write_msi_msg,
>>  	.msi_set_intx =  pci_intx_for_msi,
>>  };
>>
>> -struct msi_irqs *alloc_msi_irqs(void *data, struct msi_ops *ops)
>> -{
>> -	struct msi_irqs *msi;
>> -
>> -	msi = kzalloc(sizeof(struct msi_irqs), GFP_KERNEL);
>> -	if (!msi)
>> -		return NULL;
>> -
>> -	INIT_LIST_HEAD(&msi->msi_list);
>> -	msi->data = data;
>> -	msi->ops = ops;
>> -	return msi;
>> -}
>> -
>>  void pci_msi_init_pci_dev(struct pci_dev *dev)
>>  {
>>  	/* Disable the msi hardware to avoid screaming interrupts
>> @@ -1100,10 +671,10 @@ void pci_msi_init_pci_dev(struct pci_dev *dev)
>>
>>  		dev->msi->node = dev_to_node(&dev->dev);
>>  		if (dev->msi_cap)
>> -			msi_set_enable(dev->msi, 0, MSI_TYPE);
>> +			pci_msi_set_enable(dev->msi, 0, MSI_TYPE);
>>
>>  		if (dev->msix_cap)
>> -			msi_set_enable(dev->msi, 0, MSIX_TYPE);
>> +			pci_msi_set_enable(dev->msi, 0, MSIX_TYPE);
>>  	}
>>  }
>>
>> @@ -1224,4 +795,3 @@ int pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries,
>>  }
>>  EXPORT_SYMBOL(pci_enable_msix_range);
>>
>> -
>> diff --git a/include/linux/msi.h b/include/linux/msi.h
>> index fc8f3e8..87ed0dd 100644
>> --- a/include/linux/msi.h
>> +++ b/include/linux/msi.h
>> @@ -28,9 +28,9 @@ struct msix_entry {
>>
>>  struct msi_ops {
>>  	void (*msi_set_enable)(struct msi_irqs *msi, int enable, int type);
>> -	struct msi_desc *(*msi_setup_entry)(struct msi_irqs *msi);
>> -	int (*msix_setup_entries)(struct msi_irqs *msi, void __iomem *base,
>> -			struct msix_entry *entries, int nvec);
>> +	struct msi_desc *(*msi_setup_entry)(struct msi_irqs *msi,
>> +			struct msi_desc *entry);
>> +	int (*msix_setup_entries)(struct msi_irqs *msi, struct msix_entry *entries);
>>  	u32 (*msi_mask_irq)(struct msi_desc *desc, u32 mask, u32 flag);
>>  	u32 (*msix_mask_irq)(struct msi_desc *desc, u32 flag);
>>  	void (*msi_read_message)(struct msi_desc *desc, struct msi_msg *msg);
>> @@ -49,6 +49,18 @@ void __write_msi_msg(struct msi_desc *entry, struct msi_msg *msg);
>>  void read_msi_msg(unsigned int irq, struct msi_msg *msg);
>>  void get_cached_msi_msg(unsigned int irq, struct msi_msg *msg);
>>  void write_msi_msg(unsigned int irq, struct msi_msg *msg);
>> +struct msi_desc *alloc_msi_entry(struct msi_irqs *msi);
>> +void msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag);
>> +void msix_mask_irq(struct msi_desc *desc, u32 flag);
>> +void msi_set_enable(struct msi_irqs *msi, int enable, int type);
>> +
>> +struct msi_irqs *alloc_msi_irqs(void *data, struct msi_ops *ops);
>> +
>> +void free_msi_irqs(struct msi_irqs *msi);
>> +
>> +int msi_capability_init(struct msi_irqs *msi, int nvec);
>> +int msix_capability_init(struct msi_irqs *msi, void __iomem *base,
>> +		struct msix_entry *entries, int nvec);
>>
>>  struct msi_desc {
>>  	struct {
>> @@ -89,12 +101,17 @@ int arch_setup_msi_irqs(struct msi_irqs *msi, int nvec, int type);
>>  void arch_teardown_msi_irqs(struct msi_irqs *msi);
>>  int arch_msi_check_device(struct msi_irqs *msi, int nvec, int type);
>>  void arch_restore_msi_irqs(struct msi_irqs *msi);
>> +u32 arch_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag);
>> +u32 arch_msix_mask_irq(struct msi_desc *desc, u32 flag);
>>
>>  void default_teardown_msi_irqs(struct msi_irqs *msi);
>>  void default_restore_msi_irqs(struct msi_irqs *msi);
>>  u32 default_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag);
>>  u32 default_msix_mask_irq(struct msi_desc *desc, u32 flag);
>>
>> +void msi_shutdown(struct msi_irqs *msi);
>> +void msix_shutdown(struct msi_irqs *msi);
>> +
>>  #define MSI_TYPE	0x01
>>  #define MSIX_TYPE	0x02
>>
>> @@ -111,4 +128,12 @@ struct msi_chip {
>>  			    int nvec, int type);
>>  };
>>
>> +static inline __attribute_const__ u32 msi_mask(unsigned x)
>> +{
>> +	/* Don't shift by >= width of type */
>> +	if (x >= 5)
>> +		return 0xffffffff;
>> +	return (1 << (1 << x)) - 1;
>> +}
>> +
>>  #endif /* LINUX_MSI_H */
>> --
>> 1.7.1
>>
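For reference, the msi_mask() helper that the patch above moves into include/linux/msi.h turns the MSI "multiple message capable" field (the log2 of the vector count) into a bitmask with one bit per vector. What follows is only a user-space sketch of that arithmetic, not kernel code: uint32_t stands in for the kernel's u32, the shift uses an unsigned literal, and check_msi_mask() is a hypothetical helper added purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* User-space copy of the helper for illustration only. */
static uint32_t msi_mask(unsigned x)
{
	/* Don't shift by >= width of type: 1 << 5 == 32 vectors needs all bits */
	if (x >= 5)
		return 0xffffffffu;
	/* 2^x vectors -> the low 2^x bits set */
	return (1u << (1u << x)) - 1;
}

/* Hypothetical self-check walking every encodable multi_cap value. */
static void check_msi_mask(void)
{
	assert(msi_mask(0) == 0x00000001u);	/*  1 vector  */
	assert(msi_mask(1) == 0x00000003u);	/*  2 vectors */
	assert(msi_mask(2) == 0x0000000fu);	/*  4 vectors */
	assert(msi_mask(3) == 0x000000ffu);	/*  8 vectors */
	assert(msi_mask(4) == 0x0000ffffu);	/* 16 vectors */
	assert(msi_mask(5) == 0xffffffffu);	/* 32 vectors */
}
```

The special case for x >= 5 exists because shifting a 32-bit value by 32 is undefined in C, so the full mask has to be returned directly.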


-- 
Thanks!
Yijing


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH 11/11] x86/MSI: Refactor x86 MSI code
  2014-08-20  6:20   ` Bharat.Bhushan
@ 2014-08-20  7:01     ` Yijing Wang
  0 siblings, 0 replies; 41+ messages in thread
From: Yijing Wang @ 2014-08-20  7:01 UTC (permalink / raw)
  To: Bharat.Bhushan, linux-kernel
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, arnab.basu, virtualization, Hanjun Guo

On 2014/8/20 14:20, Bharat.Bhushan@freescale.com wrote:
> 
> 
>> -----Original Message-----
>> From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-owner@vger.kernel.org]
>> On Behalf Of Yijing Wang
>> Sent: Saturday, July 26, 2014 8:39 AM
>> To: linux-kernel@vger.kernel.org
>> Cc: Xinwei Hu; Wuyun; Bjorn Helgaas; linux-pci@vger.kernel.org;
>> Paul.Mundt@huawei.com; James E.J. Bottomley; Marc Zyngier;
>> linux-arm-kernel@lists.infradead.org; Russell King; linux-arch@vger.kernel.org;
>> Basu Arnab-B45036; virtualization@lists.linux-foundation.org; Hanjun Guo; Yijing Wang
>> Subject: [RFC PATCH 11/11] x86/MSI: Refactor x86 MSI code
> 
> Please provide a description of what this refactoring does. Do other architectures need similar refactoring as well?

Sorry, I will update all of the descriptions in my new proposal.

I posted another patchset that decouples the MSI driver from the arch MSI code,
link: http://marc.info/?l=linux-pci&m=140782732604433&w=2

Based on that, there are few changes related to arch MSI code.

I will rebase this patchset on top of that one.
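
To make the shape of the x86 changes easier to follow: most hunks in this patch replace a `struct pci_dev *` parameter with a `struct msi_irqs *`, and PCI-specific code recovers the device through `msi->data` where it still needs it. A rough user-space sketch of that pattern is below. All `model_*` names and the reduced structures are hypothetical stand-ins rather than the kernel definitions, and the checks simply combine the NULL test from the amd_iommu hunk with the "multiple MSI" test from native_setup_msi_irqs().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_MSI_TYPE	0x01
#define MODEL_MSIX_TYPE	0x02

/* Hypothetical stand-in for struct pci_dev. */
struct model_pci_dev {
	unsigned int irq;
};

/* Hypothetical stand-in for struct msi_irqs: the generic layer only
 * sees an opaque owner pointer plus the NUMA node. */
struct model_msi_irqs {
	void *data;	/* owning device, e.g. a PCI device */
	int node;
};

/* Arch-style hook taking the generic handle instead of a PCI device. */
static int model_arch_setup(struct model_msi_irqs *msi, int nvec, int type)
{
	/* PCI-specific code gets its device back from the opaque pointer */
	struct model_pci_dev *dev = msi->data;

	if (!dev)
		return -EINVAL;

	/* Mirrors the quoted check: multi-vector MSI is not handled here */
	if (type == MODEL_MSI_TYPE && nvec > 1)
		return 1;

	return 0;
}
```

A non-PCI user would store its own device structure in `data` and supply its own hooks, which is the point of passing the generic handle around.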

> 
> Thanks
> -Bharat
> 
>>
>> Signed-off-by: Yijing Wang <wangyijing@huawei.com>
>> ---
>>  arch/x86/include/asm/io_apic.h       |    2 +-
>>  arch/x86/include/asm/irq_remapping.h |    4 +-
>>  arch/x86/include/asm/pci.h           |    6 ++--
>>  arch/x86/include/asm/x86_init.h      |   10 +++---
>>  arch/x86/kernel/apic/io_apic.c       |   23 +++++++--------
>>  arch/x86/kernel/x86_init.c           |   12 ++++----
>>  drivers/iommu/amd_iommu.c            |   16 ++++++----
>>  drivers/iommu/intel_irq_remapping.c  |    9 ++++--
>>  drivers/iommu/irq_remapping.c        |   51 ++++++++++++++++-----------------
>>  drivers/iommu/irq_remapping.h        |    6 ++--
>>  drivers/msi/msi.c                    |    3 +-
>>  11 files changed, 72 insertions(+), 70 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
>> index 90f97b4..692a90f 100644
>> --- a/arch/x86/include/asm/io_apic.h
>> +++ b/arch/x86/include/asm/io_apic.h
>> @@ -158,7 +158,7 @@ extern int native_setup_ioapic_entry(int, struct IO_APIC_route_entry *,
>>  				     struct io_apic_irq_attr *);
>>  extern void eoi_ioapic_irq(unsigned int irq, struct irq_cfg *cfg);
>>
>> -extern void native_compose_msi_msg(struct pci_dev *pdev,
>> +extern void native_compose_msi_msg(struct msi_irqs *msi,
>>  				   unsigned int irq, unsigned int dest,
>>  				   struct msi_msg *msg, u8 hpet_id);
>>  extern void native_eoi_ioapic_pin(int apic, int pin, int vector);
>> diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
>> index b7747c4..a10003d 100644
>> --- a/arch/x86/include/asm/irq_remapping.h
>> +++ b/arch/x86/include/asm/irq_remapping.h
>> @@ -47,7 +47,7 @@ extern int setup_ioapic_remapped_entry(int irq,
>>  				       int vector,
>>  				       struct io_apic_irq_attr *attr);
>>  extern void free_remapped_irq(int irq);
>> -extern void compose_remapped_msi_msg(struct pci_dev *pdev,
>> +extern void compose_remapped_msi_msg(struct msi_irqs *msi,
>>  				     unsigned int irq, unsigned int dest,
>>  				     struct msi_msg *msg, u8 hpet_id);
>>  extern int setup_hpet_msi_remapped(unsigned int irq, unsigned int id);
>> @@ -77,7 +77,7 @@ static inline int setup_ioapic_remapped_entry(int irq,
>>  	return -ENODEV;
>>  }
>>  static inline void free_remapped_irq(int irq) { }
>> -static inline void compose_remapped_msi_msg(struct pci_dev *pdev,
>> +static inline void compose_remapped_msi_msg(struct msi_irqs *msi,
>>  					    unsigned int irq, unsigned int dest,
>>  					    struct msi_msg *msg, u8 hpet_id)
>>  {
>> diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
>> index 0892ea0..04c9ef6 100644
>> --- a/arch/x86/include/asm/pci.h
>> +++ b/arch/x86/include/asm/pci.h
>> @@ -96,10 +96,10 @@ extern void pci_iommu_alloc(void);
>>  #ifdef CONFIG_PCI_MSI
>>  /* implemented in arch/x86/kernel/apic/io_apic. */
>>  struct msi_desc;
>> -int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type);
>> +int native_setup_msi_irqs(struct msi_irqs *msi, int nvec, int type);
>>  void native_teardown_msi_irq(unsigned int irq);
>> -void native_restore_msi_irqs(struct pci_dev *dev);
>> -int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
>> +void native_restore_msi_irqs(struct msi_irqs *msi);
>> +int setup_msi_irq(struct msi_irqs *msi, struct msi_desc *msidesc,
>>  		  unsigned int irq_base, unsigned int irq_offset);
>>  #else
>>  #define native_setup_msi_irqs		NULL
>> diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
>> index e45e4da..8e42f17 100644
>> --- a/arch/x86/include/asm/x86_init.h
>> +++ b/arch/x86/include/asm/x86_init.h
>> @@ -170,18 +170,18 @@ struct x86_platform_ops {
>>  	void (*apic_post_init)(void);
>>  };
>>
>> -struct pci_dev;
>> +struct msi_irqs;
>>  struct msi_msg;
>>  struct msi_desc;
>>
>>  struct x86_msi_ops {
>> -	int (*setup_msi_irqs)(struct pci_dev *dev, int nvec, int type);
>> -	void (*compose_msi_msg)(struct pci_dev *dev, unsigned int irq,
>> +	int (*setup_msi_irqs)(struct msi_irqs *msi, int nvec, int type);
>> +	void (*compose_msi_msg)(struct msi_irqs *msi, unsigned int irq,
>>  				unsigned int dest, struct msi_msg *msg,
>>  			       u8 hpet_id);
>>  	void (*teardown_msi_irq)(unsigned int irq);
>> -	void (*teardown_msi_irqs)(struct pci_dev *dev);
>> -	void (*restore_msi_irqs)(struct pci_dev *dev);
>> +	void (*teardown_msi_irqs)(struct msi_irqs *msi);
>> +	void (*restore_msi_irqs)(struct msi_irqs *msi);
>>  	int  (*setup_hpet_msi)(unsigned int irq, unsigned int id);
>>  	u32 (*msi_mask_irq)(struct msi_desc *desc, u32 mask, u32 flag);
>>  	u32 (*msix_mask_irq)(struct msi_desc *desc, u32 flag);
>> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
>> index b833042..3cb4a6a 100644
>> --- a/arch/x86/kernel/apic/io_apic.c
>> +++ b/arch/x86/kernel/apic/io_apic.c
>> @@ -2939,7 +2939,7 @@ void arch_teardown_hwirq(unsigned int irq)
>>  /*
>>   * MSI message composition
>>   */
>> -void native_compose_msi_msg(struct pci_dev *pdev,
>> +void native_compose_msi_msg(struct msi_irqs *msi,
>>  			    unsigned int irq, unsigned int dest,
>>  			    struct msi_msg *msg, u8 hpet_id)
>>  {
>> @@ -2970,7 +2970,7 @@ void native_compose_msi_msg(struct pci_dev *pdev,
>>  }
>>
>>  #ifdef CONFIG_PCI_MSI
>> -static int msi_compose_msg(struct pci_dev *pdev, unsigned int irq,
>> +static int msi_compose_msg(struct msi_irqs *msi, unsigned int irq,
>>  			   struct msi_msg *msg, u8 hpet_id)
>>  {
>>  	struct irq_cfg *cfg;
>> @@ -2990,7 +2990,7 @@ static int msi_compose_msg(struct pci_dev *pdev, unsigned int irq,
>>  	if (err)
>>  		return err;
>>
>> -	x86_msi.compose_msi_msg(pdev, irq, dest, msg, hpet_id);
>> +	x86_msi.compose_msi_msg(msi, irq, dest, msg, hpet_id);
>>
>>  	return 0;
>>  }
>> @@ -3032,15 +3032,16 @@ static struct irq_chip msi_chip = {
>>  	.irq_retrigger		= ioapic_retrigger_irq,
>>  };
>>
>> -int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
>> +int setup_msi_irq(struct msi_irqs *msi, struct msi_desc *msidesc,
>>  		  unsigned int irq_base, unsigned int irq_offset)
>>  {
>>  	struct irq_chip *chip = &msi_chip;
>>  	struct msi_msg msg;
>>  	unsigned int irq = irq_base + irq_offset;
>>  	int ret;
>> +	struct pci_dev *dev = msi->data;
>>
>> -	ret = msi_compose_msg(dev, irq, &msg, -1);
>> +	ret = msi_compose_msg(msi, irq, &msg, -1);
>>  	if (ret < 0)
>>  		return ret;
>>
>> @@ -3062,24 +3063,22 @@ int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
>>  	return 0;
>>  }
>>
>> -int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
>> +int native_setup_msi_irqs(struct msi_irqs *msi, int nvec, int type)
>>  {
>>  	struct msi_desc *msidesc;
>>  	unsigned int irq;
>> -	int node, ret;
>> +	int ret;
>>
>>  	/* Multiple MSI vectors only supported with interrupt remapping */
>>  	if (type == MSI_TYPE && nvec > 1)
>>  		return 1;
>>
>> -	node = dev_to_node(&dev->dev);
>> -
>> -	list_for_each_entry(msidesc, &dev->msi_list, list) {
>> -		irq = irq_alloc_hwirq(node);
>> +	list_for_each_entry(msidesc, &msi->msi_list, list) {
>> +		irq = irq_alloc_hwirq(msi->node);
>>  		if (!irq)
>>  			return -ENOSPC;
>>
>> -		ret = setup_msi_irq(dev, msidesc, irq, 0);
>> +		ret = setup_msi_irq(msi, msidesc, irq, 0);
>>  		if (ret < 0) {
>>  			irq_free_hwirq(irq);
>>  			return ret;
>> diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
>> index e48b674..a277faf 100644
>> --- a/arch/x86/kernel/x86_init.c
>> +++ b/arch/x86/kernel/x86_init.c
>> @@ -121,14 +121,14 @@ struct x86_msi_ops x86_msi = {
>>  };
>>
>>  /* MSI arch specific hooks */
>> -int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
>> +int arch_setup_msi_irqs(struct msi_irqs *msi, int nvec, int type)
>>  {
>> -	return x86_msi.setup_msi_irqs(dev, nvec, type);
>> +	return x86_msi.setup_msi_irqs(msi, nvec, type);
>>  }
>>
>> -void arch_teardown_msi_irqs(struct pci_dev *dev)
>> +void arch_teardown_msi_irqs(struct msi_irqs *msi)
>>  {
>> -	x86_msi.teardown_msi_irqs(dev);
>> +	x86_msi.teardown_msi_irqs(msi);
>>  }
>>
>>  void arch_teardown_msi_irq(unsigned int irq)
>> @@ -136,9 +136,9 @@ void arch_teardown_msi_irq(unsigned int irq)
>>  	x86_msi.teardown_msi_irq(irq);
>>  }
>>
>> -void arch_restore_msi_irqs(struct pci_dev *dev)
>> +void arch_restore_msi_irqs(struct msi_irqs *msi)
>>  {
>> -	x86_msi.restore_msi_irqs(dev);
>> +	x86_msi.restore_msi_irqs(msi);
>>  }
>>  u32 arch_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
>>  {
>> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
>> index 4aec6a2..0e45cb7 100644
>> --- a/drivers/iommu/amd_iommu.c
>> +++ b/drivers/iommu/amd_iommu.c
>> @@ -4237,7 +4237,7 @@ static int free_irq(int irq)
>>  	return 0;
>>  }
>>
>> -static void compose_msi_msg(struct pci_dev *pdev,
>> +static void compose_msi_msg(struct msi_irqs *msi,
>>  			    unsigned int irq, unsigned int dest,
>>  			    struct msi_msg *msg, u8 hpet_id)
>>  {
>> @@ -4265,33 +4265,35 @@ static void compose_msi_msg(struct pci_dev *pdev,
>>  	msg->data       = irte_info->index;
>>  }
>>
>> -static int msi_alloc_irq(struct pci_dev *pdev, int irq, int nvec)
>> +static int msi_alloc_irq(struct msi_irqs *msi, int irq, int nvec)
>>  {
>>  	struct irq_cfg *cfg;
>>  	int index;
>>  	u16 devid;
>> +	struct pci_dev *dev = msi->data;
>>
>> -	if (!pdev)
>> +	if (!dev)
>>  		return -EINVAL;
>>
>>  	cfg = irq_get_chip_data(irq);
>>  	if (!cfg)
>>  		return -EINVAL;
>>
>> -	devid = get_device_id(&pdev->dev);
>> +	devid = get_device_id(&dev->dev);
>>  	index = alloc_irq_index(cfg, devid, nvec);
>>
>>  	return index < 0 ? MAX_IRQS_PER_TABLE : index;
>>  }
>>
>> -static int msi_setup_irq(struct pci_dev *pdev, unsigned int irq,
>> +static int msi_setup_irq(struct msi_irqs *msi, unsigned int irq,
>>  			 int index, int offset)
>>  {
>>  	struct irq_2_irte *irte_info;
>>  	struct irq_cfg *cfg;
>>  	u16 devid;
>> +	struct pci_dev *dev = msi->data;
>>
>> -	if (!pdev)
>> +	if (!dev)
>>  		return -EINVAL;
>>
>>  	cfg = irq_get_chip_data(irq);
>> @@ -4301,7 +4303,7 @@ static int msi_setup_irq(struct pci_dev *pdev, unsigned int irq,
>>  	if (index >= MAX_IRQS_PER_TABLE)
>>  		return 0;
>>
>> -	devid		= get_device_id(&pdev->dev);
>> +	devid		= get_device_id(&dev->dev);
>>  	irte_info	= &cfg->irq_2_irte;
>>
>>  	cfg->remapped	      = 1;
>> diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
>> index 9b17489..d6bde63 100644
>> --- a/drivers/iommu/intel_irq_remapping.c
>> +++ b/drivers/iommu/intel_irq_remapping.c
>> @@ -1027,7 +1027,7 @@ intel_ioapic_set_affinity(struct irq_data *data, const struct cpumask *mask,
>>  	return 0;
>>  }
>>
>> -static void intel_compose_msi_msg(struct pci_dev *pdev,
>> +static void intel_compose_msi_msg(struct msi_irqs *msi,
>>  				  unsigned int irq, unsigned int dest,
>>  				  struct msi_msg *msg, u8 hpet_id)
>>  {
>> @@ -1035,6 +1035,7 @@ static void intel_compose_msi_msg(struct pci_dev *pdev,
>>  	struct irte irte;
>>  	u16 sub_handle = 0;
>>  	int ir_index;
>> +	struct pci_dev *pdev = msi->data;
>>
>>  	cfg = irq_get_chip_data(irq);
>>
>> @@ -1064,10 +1065,11 @@ static void intel_compose_msi_msg(struct pci_dev *pdev,
>>   * and allocate 'nvec' consecutive interrupt-remapping table entries
>>   * in it.
>>   */
>> -static int intel_msi_alloc_irq(struct pci_dev *dev, int irq, int nvec)
>> +static int intel_msi_alloc_irq(struct msi_irqs *msi, int irq, int nvec)
>>  {
>>  	struct intel_iommu *iommu;
>>  	int index;
>> +	struct pci_dev *dev = msi->data;
>>
>>  	down_read(&dmar_global_lock);
>>  	iommu = map_dev_to_ir(dev);
>> @@ -1089,11 +1091,12 @@ static int intel_msi_alloc_irq(struct pci_dev *dev, int irq, int nvec)
>>  	return index;
>>  }
>>
>> -static int intel_msi_setup_irq(struct pci_dev *pdev, unsigned int irq,
>> +static int intel_msi_setup_irq(struct msi_irqs *msi, unsigned int irq,
>>  			       int index, int sub_handle)
>>  {
>>  	struct intel_iommu *iommu;
>>  	int ret = -ENOENT;
>> +	struct pci_dev *pdev = msi->data;
>>
>>  	down_read(&dmar_global_lock);
>>  	iommu = map_dev_to_ir(pdev);
>> diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
>> index a3b1805..1fe14e5 100644
>> --- a/drivers/iommu/irq_remapping.c
>> +++ b/drivers/iommu/irq_remapping.c
>> @@ -24,8 +24,8 @@ int no_x2apic_optout;
>>
>>  static struct irq_remap_ops *remap_ops;
>>
>> -static int msi_alloc_remapped_irq(struct pci_dev *pdev, int irq, int nvec);
>> -static int msi_setup_remapped_irq(struct pci_dev *pdev, unsigned int irq,
>> +static int msi_alloc_remapped_irq(struct msi_irqs *msi, int irq, int nvec);
>> +static int msi_setup_remapped_irq(struct msi_irqs *msi, unsigned int irq,
>>  				  int index, int sub_handle);
>>  static int set_remapped_irq_affinity(struct irq_data *data,
>>  				     const struct cpumask *mask,
>> @@ -49,19 +49,19 @@ static void irq_remapping_disable_io_apic(void)
>>  		disconnect_bsp_APIC(0);
>>  }
>>
>> -static int do_setup_msi_irqs(struct pci_dev *dev, int nvec)
>> +static int do_setup_msi_irqs(struct msi_irqs *msi, int nvec)
>>  {
>>  	int ret, sub_handle, nvec_pow2, index = 0;
>>  	unsigned int irq;
>>  	struct msi_desc *msidesc;
>>
>> -	WARN_ON(!list_is_singular(&dev->msi_list));
>> -	msidesc = list_entry(dev->msi_list.next, struct msi_desc, list);
>> +	WARN_ON(!list_is_singular(&msi->msi_list));
>> +	msidesc = list_entry(msi->msi_list.next, struct msi_desc, list);
>>  	WARN_ON(msidesc->irq);
>>  	WARN_ON(msidesc->msi_attrib.multiple);
>>  	WARN_ON(msidesc->nvec_used);
>>
>> -	irq = irq_alloc_hwirqs(nvec, dev_to_node(&dev->dev));
>> +	irq = irq_alloc_hwirqs(nvec, msi->node);
>>  	if (irq == 0)
>>  		return -ENOSPC;
>>
>> @@ -70,18 +70,18 @@ static int do_setup_msi_irqs(struct pci_dev *dev, int nvec)
>>  	msidesc->msi_attrib.multiple = ilog2(nvec_pow2);
>>  	for (sub_handle = 0; sub_handle < nvec; sub_handle++) {
>>  		if (!sub_handle) {
>> -			index = msi_alloc_remapped_irq(dev, irq, nvec_pow2);
>> +			index = msi_alloc_remapped_irq(msi, irq, nvec_pow2);
>>  			if (index < 0) {
>>  				ret = index;
>>  				goto error;
>>  			}
>>  		} else {
>> -			ret = msi_setup_remapped_irq(dev, irq + sub_handle,
>> +			ret = msi_setup_remapped_irq(msi, irq + sub_handle,
>>  						     index, sub_handle);
>>  			if (ret < 0)
>>  				goto error;
>>  		}
>> -		ret = setup_msi_irq(dev, msidesc, irq, sub_handle);
>> +		ret = setup_msi_irq(msi, msidesc, irq, sub_handle);
>>  		if (ret < 0)
>>  			goto error;
>>  	}
>> @@ -101,30 +101,29 @@ error:
>>  	return ret;
>>  }
>>
>> -static int do_setup_msix_irqs(struct pci_dev *dev, int nvec)
>> +static int do_setup_msix_irqs(struct msi_irqs *msi, int nvec)
>>  {
>>  	int node, ret, sub_handle, index = 0;
>>  	struct msi_desc *msidesc;
>>  	unsigned int irq;
>>
>> -	node		= dev_to_node(&dev->dev);
>>  	sub_handle	= 0;
>>
>> -	list_for_each_entry(msidesc, &dev->msi_list, list) {
>> +	list_for_each_entry(msidesc, &msi->msi_list, list) {
>>
>> -		irq = irq_alloc_hwirq(node);
>> +		irq = irq_alloc_hwirq(msi->node);
>>  		if (irq == 0)
>>  			return -1;
>>
>>  		if (sub_handle == 0)
>> -			ret = index = msi_alloc_remapped_irq(dev, irq, nvec);
>> +			ret = index = msi_alloc_remapped_irq(msi, irq, nvec);
>>  		else
>> -			ret = msi_setup_remapped_irq(dev, irq, index, sub_handle);
>> +			ret = msi_setup_remapped_irq(msi, irq, index, sub_handle);
>>
>>  		if (ret < 0)
>>  			goto error;
>>
>> -		ret = setup_msi_irq(dev, msidesc, irq, 0);
>> +		ret = setup_msi_irq(msi, msidesc, irq, 0);
>>  		if (ret < 0)
>>  			goto error;
>>
>> @@ -139,13 +138,13 @@ error:
>>  	return ret;
>>  }
>>
>> -static int irq_remapping_setup_msi_irqs(struct pci_dev *dev,
>> +static int irq_remapping_setup_msi_irqs(struct msi_irqs *msi,
>>  					int nvec, int type)
>>  {
>>  	if (type == MSI_TYPE)
>> -		return do_setup_msi_irqs(dev, nvec);
>> +		return do_setup_msi_irqs(msi, nvec);
>>  	else
>> -		return do_setup_msix_irqs(dev, nvec);
>> +		return do_setup_msix_irqs(msi, nvec);
>>  }
>>
>>  static void eoi_ioapic_pin_remapped(int apic, int pin, int vector)
>> @@ -314,33 +313,33 @@ void free_remapped_irq(int irq)
>>  		remap_ops->free_irq(irq);
>>  }
>>
>> -void compose_remapped_msi_msg(struct pci_dev *pdev,
>> +void compose_remapped_msi_msg(struct msi_irqs *msi,
>>  			      unsigned int irq, unsigned int dest,
>>  			      struct msi_msg *msg, u8 hpet_id)
>>  {
>>  	struct irq_cfg *cfg = irq_get_chip_data(irq);
>>
>>  	if (!irq_remapped(cfg))
>> -		native_compose_msi_msg(pdev, irq, dest, msg, hpet_id);
>> +		native_compose_msi_msg(msi, irq, dest, msg, hpet_id);
>>  	else if (remap_ops && remap_ops->compose_msi_msg)
>> -		remap_ops->compose_msi_msg(pdev, irq, dest, msg, hpet_id);
>> +		remap_ops->compose_msi_msg(msi, irq, dest, msg, hpet_id);
>>  }
>>
>> -static int msi_alloc_remapped_irq(struct pci_dev *pdev, int irq, int nvec)
>> +static int msi_alloc_remapped_irq(struct msi_irqs *msi, int irq, int nvec)
>>  {
>>  	if (!remap_ops || !remap_ops->msi_alloc_irq)
>>  		return -ENODEV;
>>
>> -	return remap_ops->msi_alloc_irq(pdev, irq, nvec);
>> +	return remap_ops->msi_alloc_irq(msi, irq, nvec);
>>  }
>>
>> -static int msi_setup_remapped_irq(struct pci_dev *pdev, unsigned int irq,
>> +static int msi_setup_remapped_irq(struct msi_irqs *msi, unsigned int irq,
>>  				  int index, int sub_handle)
>>  {
>>  	if (!remap_ops || !remap_ops->msi_setup_irq)
>>  		return -ENODEV;
>>
>> -	return remap_ops->msi_setup_irq(pdev, irq, index, sub_handle);
>> +	return remap_ops->msi_setup_irq(msi, irq, index, sub_handle);
>>  }
>>
>>  int setup_hpet_msi_remapped(unsigned int irq, unsigned int id)
>> diff --git a/drivers/iommu/irq_remapping.h b/drivers/iommu/irq_remapping.h
>> index 90c4dae..59c4cfb 100644
>> --- a/drivers/iommu/irq_remapping.h
>> +++ b/drivers/iommu/irq_remapping.h
>> @@ -69,15 +69,15 @@ struct irq_remap_ops {
>>  	int (*free_irq)(int);
>>
>>  	/* Create MSI msg to use for interrupt remapping */
>> -	void (*compose_msi_msg)(struct pci_dev *,
>> +	void (*compose_msi_msg)(struct msi_irqs *,
>>  				unsigned int, unsigned int,
>>  				struct msi_msg *, u8);
>>
>>  	/* Allocate remapping resources for MSI */
>> -	int (*msi_alloc_irq)(struct pci_dev *, int, int);
>> +	int (*msi_alloc_irq)(struct msi_irqs *, int, int);
>>
>>  	/* Setup the remapped MSI irq */
>> -	int (*msi_setup_irq)(struct pci_dev *, unsigned int, int, int);
>> +	int (*msi_setup_irq)(struct msi_irqs *, unsigned int, int, int);
>>
>>  	/* Setup interrupt remapping for an HPET MSI */
>>  	int (*setup_hpet_msi)(unsigned int, unsigned int);
>> diff --git a/drivers/msi/msi.c b/drivers/msi/msi.c
>> index 3fbd539..8462c6c 100644
>> --- a/drivers/msi/msi.c
>> +++ b/drivers/msi/msi.c
>> @@ -510,9 +510,8 @@ int msix_capability_init(struct msi_irqs *msi, void __iomem *base,
>>
>>  	/* Set MSI-X enabled bits and unmask the function */
>>  	msi_set_intx(msi, 0);
>> -	msi->msix_enabled = 1;
>> -
>>  	msi_set_enable(msi, 1, MSIX_TYPE);
>> +	msi->msix_enabled = 1;
>>
>>  	return 0;
>>
>> --
>> 1.7.1
>>


-- 
Thanks!
Yijing



* RE: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
  2014-08-20  6:28       ` Yijing Wang
@ 2014-08-20  7:41         ` Bharat.Bhushan
  2014-08-20  7:55           ` Yijing Wang
  2014-09-03  7:15           ` Yijing Wang
  0 siblings, 2 replies; 41+ messages in thread
From: Bharat.Bhushan @ 2014-08-20  7:41 UTC (permalink / raw)
  To: Yijing Wang, arnab.basu
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, virtualization, Hanjun Guo,
	linux-kernel



> -----Original Message-----
> From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-owner@vger.kernel.org]
> On Behalf Of Yijing Wang
> Sent: Wednesday, August 20, 2014 11:59 AM
> To: Bhushan Bharat-R65777; Basu Arnab-B45036
> Cc: Xinwei Hu; Wuyun; Bjorn Helgaas; linux-pci@vger.kernel.org;
> Paul.Mundt@huawei.com; James E.J. Bottomley; Marc Zyngier;
> linux-arm-kernel@lists.infradead.org; Russell King; linux-arch@vger.kernel.org;
> virtualization@lists.linux-foundation.org; Hanjun Guo;
> linux-kernel@vger.kernel.org
> Subject: Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
> 
> >> The key difference between PCI and non-PCI MSI devices is the
> >> interface used to access the hardware MSI registers.
> >> For instance, msi_chip->setup_irq() currently sets up the MSI irq and
> >> configures the MSI address/data registers, so we need to provide a
> >> device-specific write_msi_msg() interface; then, when we call
> >> msi_chip->setup_irq(), the device's MSI registers can be configured
> >> appropriately.
> >
> > What if we could register/override setup_irq() from the bus driver (not sure,
> > but maybe from the device driver itself)? For example, the PCI bus driver would
> > provide setup_irq() (or the part of setup_irq() that sets the address and data
> > in h/w), which configures address/data in h/w as per the PCI standard.
> >
> > We at Freescale will be using MSI for devices behind a new bus (which is not
> > PCI based). We have a separate bus driver for it, and this new bus driver
> > registers/provides its own address/data write function, which is based on that
> > specific bus protocol.
> 
> Hi Bharat, I'm glad to learn how your MSI devices work.
> Providing private MSI setup functions in the bus-driver layer can't apply to
> all non-PCI MSI devices, because we cannot guarantee that non-PCI MSI devices
> are always on a bus. The existing HPET and DMAR devices are not bound to any bus.

Yes, that's why I was not sure whether to use the bus-driver or device-driver model.

> I'm working on a
> new MSI setup framework, as you mentioned before, based on the device-driver model.
> 
> I abstracted a new virtual device (called struct msi_dev), which will
> manage all MSI info,

Will this "struct msi_dev" be part of "struct device"?

> and a new bus named msi_bus. I also introduced a new driver type,
> msi_driver; msi_bus is responsible for binding msi_dev to msi_driver.
> All MSI devices will be classified into different MSI device types, like
> MSI_TYPE_PCI, MSI_TYPE_HPET, MSI_TYPE_DMAR, etc.
> 
> Each MSI device type should provide a private struct msi_driver. msi_driver
> should contain the type-specific MSI ops functions that help set up and enable
> the MSI device and request MSI irqs.
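
For illustration, the type-based binding described above might be sketched
roughly as follows. This is a userspace sketch under stated assumptions, not
code from the patchset: the type field in struct msi_driver, the fixed-size
registry, and the functions msi_driver_register(), msi_bus_match() and
msi_bus_probe() are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Device types named in the discussion. */
enum msi_type { MSI_TYPE_PCI, MSI_TYPE_HPET, MSI_TYPE_DMAR };

/* Reduced stand-ins; the "type" member of msi_driver is an assumption. */
struct msi_driver {
	const char *name;
	enum msi_type type;
};

struct msi_dev {
	enum msi_type type;
	const struct msi_driver *driver;	/* NULL until bound */
};

/* Hypothetical registry standing in for the msi_bus driver list. */
#define MAX_MSI_DRIVERS 8
static const struct msi_driver *msi_drivers[MAX_MSI_DRIVERS];
static size_t msi_driver_count;

/* Add a driver to the registry; returns 0 on success, -1 if full/invalid. */
int msi_driver_register(const struct msi_driver *drv)
{
	if (!drv || msi_driver_count >= MAX_MSI_DRIVERS)
		return -1;
	msi_drivers[msi_driver_count++] = drv;
	return 0;
}

/* Match rule: a driver binds to a device of the same MSI type. */
int msi_bus_match(const struct msi_dev *dev, const struct msi_driver *drv)
{
	return dev->type == drv->type;
}

/* Bind dev to the first matching registered driver; 0 on success. */
int msi_bus_probe(struct msi_dev *dev)
{
	for (size_t i = 0; i < msi_driver_count; i++) {
		if (msi_bus_match(dev, msi_drivers[i])) {
			dev->driver = msi_drivers[i];
			return 0;
		}
	}
	return -1;
}
```

In this sketch a device whose type has no registered driver simply stays
unbound; how the real msi_bus would handle that case is not stated in the
thread.
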
> 
> I have almost finished the first draft and plan to post it next week :)

Looking forward to the next version.

Thanks
-Bharat

> 
> 
> Thanks!
> Yijing.
> 
> >
> > Thanks
> > -Bharat
> >
> >>
> >> My patchset is just a RFC draft, I will update it later, all we want
> >> to do is make kernel support Non-PCI MSI devices.
> >>
> >> Thanks!
> >> Yijing.
> >>
> >>
> >>>
> >>> Thanks
> >>> Arnab


* Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
  2014-08-20  7:41         ` Bharat.Bhushan
@ 2014-08-20  7:55           ` Yijing Wang
  2014-09-03  7:15           ` Yijing Wang
  1 sibling, 0 replies; 41+ messages in thread
From: Yijing Wang @ 2014-08-20  7:55 UTC (permalink / raw)
  To: Bharat.Bhushan, arnab.basu
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, virtualization, Hanjun Guo,
	linux-kernel

>>> We at Freescale will be using MSI for devices behind a new bus (which is not
>>> PCI based). We have a separate bus driver for it, and this new bus driver
>>> registers/provides its own address/data write function, which is based on that
>>> specific bus protocol.
>>
>> Hi Bharat, I'm glad to learn how your MSI devices work.
>> Providing private MSI setup functions in the bus-driver layer can't apply to
>> all non-PCI MSI devices, because we cannot guarantee that non-PCI MSI devices
>> are always on a bus. The existing HPET and DMAR devices are not bound to any bus.
> 
> Yes, that's why I was not sure whether to use the bus-driver or device-driver model.
> 
>> I'm working on a
>> new MSI setup framework, as you mentioned before, based on the device-driver model.
>>
>> I abstracted a new virtual device (called struct msi_dev), which will
>> manage all MSI info,
> 
> Will this "struct msi_dev" be part of "struct device"?

No, the other way around: struct msi_dev embeds a struct device.

A code fragment:

struct msi_dev {
    u8 type;
    u8 enabled;
    u8 nvec;
    u8 nvec_retry;
    char *id;
    void __iomem *base;
    struct msix_entry *entries;
    struct list_head msi_list;
    struct device dev;
    void *msi_data;
    struct msi_driver *driver;
    const struct attribute_group **irq_groups;
};


struct msi_driver {
    const char *name;
    char *id;
    void (*msi_set_enable)(struct msi_dev *dev, int enable);
    int (*msi_setup_entry)(struct msi_dev *dev, struct msi_desc *entry);
    int (*msix_setup_entries)(struct msi_dev *dev, struct msi_desc *entry, int index);
    u32 (*msi_mask_irq)(struct msi_desc *desc, u32 mask, u32 flag);
    u32 (*msix_mask_irq)(struct msi_desc *desc, u32 flag);
    void (*msi_read_message)(struct msi_desc *desc, struct msi_msg *msg);
    void (*msi_write_message)(struct msi_desc *desc, struct msi_msg *msg);
    void (*msi_set_legacy_irq)(struct msi_dev *dev, int enable);
    struct device_driver driver;
};
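
Because struct device is embedded in struct msi_dev, generic code that only
holds a struct device pointer can get back to the msi_dev and call the
type-specific ops. A minimal userspace sketch of that pattern follows. The
assumptions are: the structs are cut down to a few members, the write hook
takes the msi_dev directly instead of a struct msi_desc, and to_msi_dev(),
msi_dev_enable() and the fake_hpet backend are hypothetical names, not part of
the draft.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for kernel types; reduced for illustration. */
struct device { const char *name; };
struct msi_msg { unsigned int address_lo, address_hi, data; };

struct msi_dev;

struct msi_driver {
	const char *name;
	void (*msi_set_enable)(struct msi_dev *dev, int enable);
	/* Simplified: the draft passes a struct msi_desc here. */
	void (*msi_write_message)(struct msi_dev *dev, const struct msi_msg *msg);
};

struct msi_dev {
	unsigned char type;
	unsigned char enabled;
	struct device dev;		/* embedded, as in the draft */
	void *msi_data;			/* type-specific private data */
	struct msi_driver *driver;
};

/* Same idea as the kernel's container_of(), written with offsetof(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct msi_dev *to_msi_dev(struct device *d)
{
	return container_of(d, struct msi_dev, dev);
}

/* Hypothetical HPET-like backend: just records the last message written. */
struct fake_hpet { struct msi_msg last; };

static void fake_hpet_set_enable(struct msi_dev *mdev, int enable)
{
	mdev->enabled = enable ? 1 : 0;
}

static void fake_hpet_write_message(struct msi_dev *mdev,
				    const struct msi_msg *msg)
{
	struct fake_hpet *hpet = mdev->msi_data;

	hpet->last = *msg;
}

struct msi_driver fake_hpet_driver = {
	.name = "fake-hpet-msi",
	.msi_set_enable = fake_hpet_set_enable,
	.msi_write_message = fake_hpet_write_message,
};

/* Generic path: sees only struct device, dispatches through driver ops. */
int msi_dev_enable(struct device *d, const struct msi_msg *msg)
{
	struct msi_dev *mdev = to_msi_dev(d);

	if (!mdev->driver || !mdev->driver->msi_write_message ||
	    !mdev->driver->msi_set_enable)
		return -1;
	mdev->driver->msi_write_message(mdev, msg);
	mdev->driver->msi_set_enable(mdev, 1);
	return 0;
}
```

The generic function never touches HPET or PCI registers itself, which is the
point of moving the register access behind the per-type ops.
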

Thanks!
Yijing.

> 
>> and a new bus named msi_bus. I also introduced a new driver type,
>> msi_driver; msi_bus is responsible for binding msi_dev to msi_driver.
>> All MSI devices will be classified into different MSI device types, like
>> MSI_TYPE_PCI, MSI_TYPE_HPET, MSI_TYPE_DMAR, etc.
>>
>> Each MSI device type should provide a private struct msi_driver. msi_driver
>> should contain the type-specific MSI ops functions that help set up and enable
>> the MSI device and request MSI irqs.
>>
>> I have almost finished the first draft and plan to post it next week :)
> 
> Looking forward to the next version.
> 
> Thanks
> -Bharat
> 
>>
>>
>> Thanks!
>> Yijing.
>>
>>>
>>> Thanks
>>> -Bharat
>>>
>>>>
>>>> My patchset is just a RFC draft, I will update it later, all we want
>>>> to do is make kernel support Non-PCI MSI devices.
>>>>
>>>> Thanks!
>>>> Yijing.
>>>>
>>>>
>>>>>
>>>>> Thanks
>>>>> Arnab


-- 
Thanks!
Yijing



* Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device
  2014-08-20  7:41         ` Bharat.Bhushan
  2014-08-20  7:55           ` Yijing Wang
@ 2014-09-03  7:15           ` Yijing Wang
  1 sibling, 0 replies; 41+ messages in thread
From: Yijing Wang @ 2014-09-03  7:15 UTC (permalink / raw)
  To: Bharat.Bhushan, arnab.basu
  Cc: Xinwei Hu, Wuyun, Bjorn Helgaas, linux-pci, Paul.Mundt,
	James E.J. Bottomley, Marc Zyngier, linux-arm-kernel,
	Russell King, linux-arch, virtualization, Hanjun Guo,
	linux-kernel

>> Providing private MSI setup functions in the bus-driver layer can't apply to
>> all non-PCI MSI devices, because we cannot guarantee that non-PCI MSI devices
>> are always on a bus. The existing HPET and DMAR devices are not bound to any bus.
> 
> Yes, that's why I was not sure whether to use the bus-driver or device-driver model.
> 
>> I'm working on a
>> new MSI setup framework, as you mentioned before, based on the device-driver model.
>>
>> I abstracted a new virtual device (called struct msi_dev), which will
>> manage all MSI info,
> 
> Will this "struct msi_dev" be part of "struct device"?
> 
>> and a new bus named msi_bus. I also introduced a new driver type,
>> msi_driver; msi_bus is responsible for binding msi_dev to msi_driver.
>> All MSI devices will be classified into different MSI device types, like
>> MSI_TYPE_PCI, MSI_TYPE_HPET, MSI_TYPE_DMAR, etc.
>>
>> Each MSI device type should provide a private struct msi_driver. msi_driver
>> should contain the type-specific MSI ops functions that help set up and enable
>> the MSI device and request MSI irqs.
>>
>> I have almost finished the first draft and plan to post it next week :)
> 
> Looking forward to the next version.

Hi Bharat, I'm sorry, but I have to delay sending out the new version :(. I found some risks
in the new MSI framework, e.g. DMAR initializes its MSI before the Linux device-driver tree is
built, and we also found some problems during testing. So I think I need more time to review
and test.

Thanks!
Yijing.

> 
> Thanks
> -Bharat
> 
>>
>>
>> Thanks!
>> Yijing.
>>
>>>
>>> Thanks
>>> -Bharat
>>>
>>>>
>>>> My patchset is just a RFC draft, I will update it later, all we want
>>>> to do is make kernel support Non-PCI MSI devices.
>>>>
>>>> Thanks!
>>>> Yijing.
>>>>
>>>>
>>>>>
>>>>> Thanks
>>>>> Arnab


-- 
Thanks!
Yijing



end of thread, other threads:[~2014-09-03  7:15 UTC | newest]

Thread overview: 41+ messages
2014-07-26  3:08 [RFC PATCH 00/11] Refactor MSI to support Non-PCI device Yijing Wang
2014-07-26  3:08 ` [RFC PATCH 01/11] PCI/MSI: Use pci_dev->msi_cap instead of msi_desc->msi_attrib.pos Yijing Wang
2014-07-26  3:08 ` [RFC PATCH 02/11] PCI/MSI: Use new MSI type macro instead of PCI MSI flags Yijing Wang
2014-07-26  3:08 ` [RFC PATCH 03/11] PCI/MSI: Refactor pci_dev_msi_enabled() Yijing Wang
2014-08-05 22:35   ` Stuart Yoder
2014-08-06  1:23     ` Yijing Wang
2014-08-20  5:57   ` Bharat.Bhushan
2014-08-20  6:30     ` Yijing Wang
2014-07-26  3:08 ` [RFC PATCH 04/11] PCI/MSI: Move MSIX table address mapping out of msix_capability_init Yijing Wang
2014-07-26  3:08 ` [RFC PATCH 05/11] PCI/MSI: Move populate_msi_sysfs() out of msi_capability_init() Yijing Wang
2014-07-26  3:08 ` [RFC PATCH 06/11] PCI/MSI: Save MSI irq in PCI MSI layer Yijing Wang
2014-07-26  3:08 ` [RFC PATCH 07/11] PCI/MSI: Mask MSI-X entry in msix_setup_entries() Yijing Wang
2014-07-26  3:08 ` [RFC PATCH 08/11] PCI/MSI: Introduce new struct msi_irqs and struct msi_ops Yijing Wang
2014-07-26  3:08 ` [RFC PATCH 09/11] PCI/MSI: refactor PCI MSI driver Yijing Wang
2014-08-20  6:06   ` Bharat.Bhushan
2014-08-20  6:34     ` Yijing Wang
2014-07-26  3:08 ` [RFC PATCH 10/11] PCI/MSI: Split the generic MSI code into new file Yijing Wang
2014-08-20  6:18   ` Bharat.Bhushan
2014-08-20  6:43     ` Yijing Wang
2014-07-26  3:08 ` [RFC PATCH 11/11] x86/MSI: Refactor x86 MSI code Yijing Wang
2014-08-20  6:20   ` Bharat.Bhushan
2014-08-20  7:01     ` Yijing Wang
2014-07-29 14:08 ` [RFC PATCH 00/11] Refactor MSI to support Non-PCI device Arnd Bergmann
2014-07-30  2:45   ` Yijing Wang
2014-07-30  6:47     ` Jiang Liu
2014-07-30  7:20       ` Yijing Wang
2014-08-01 13:16         ` Arnd Bergmann
2014-08-04  3:32           ` Yijing Wang
2014-08-04 14:45             ` Arnd Bergmann
2014-08-05  2:20               ` Yijing Wang
2014-08-01 13:52     ` Arnd Bergmann
2014-08-04  6:43       ` Yijing Wang
2014-08-04 14:59         ` Arnd Bergmann
2014-08-05  2:12           ` Yijing Wang
2014-08-01 10:27 ` arnab.basu
2014-08-04  3:03   ` Yijing Wang
2014-08-20  5:44     ` Bharat.Bhushan
2014-08-20  6:28       ` Yijing Wang
2014-08-20  7:41         ` Bharat.Bhushan
2014-08-20  7:55           ` Yijing Wang
2014-09-03  7:15           ` Yijing Wang
