linux-s390.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3
@ 2021-12-06 22:51 Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 01/31] genirq/msi: Move descriptor list to struct msi_device_data Thomas Gleixner
                   ` (31 more replies)
  0 siblings, 32 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

This is the third part of [PCI]MSI refactoring which aims to provide the
ability of expanding MSI-X vectors after enabling MSI-X.

The first two parts of this work can be found here:

    https://lore.kernel.org/r/20211206210147.872865823@linutronix.de
    https://lore.kernel.org/r/20211206210307.625116253@linutronix.de

This third part has the following important changes:

   1) Add locking to protect the MSI descriptor storage

      Right now the MSI descriptor storage (linked list) is not protected
      by anything under the assumption that the list is installed before
      use and destroyed after use. As this is about to change there has to
      be protection

   2) A new set of iterators which allow filtering on the state of the
      descriptors namely whether a descriptor is associated to a Linux
      interrupt or not.

      This cleans up a lot of use cases which have to do this filtering
      manually.

   3) A new set of MSI descriptor allocation functions which make the usage
      sites simpler and confine the storage handling to the core code.

      Trivial MSI descriptors (non PCI) are now allocated by the core code
      automatically when the underlying irq domain requests that.

   4) Rework of sysfs handling to prepare for dynamic extension of MSI-X

      The current mechanism which creates the directory and the attributes
      for all MSI descriptors in one go is obviously not suitable for
      dynamic extension. The rework splits the directory creation out and
      lets the MSI interrupt allocation create the per descriptor
      attributes.

   5) Conversion of the MSI descriptor storage to xarray

      The linked list based storage is suboptimal even without dynamic
      expansion as it requires full list walks to get to a specific
      descriptor. With dynamic expansion this gets even more
      convoluted. Xarray is way more suitable and simplifies the
      final goal of dynamic expansion of the MSI-X space.

This third series is based on:

     git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git msi-v2-part-2

and also available from git:

     git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git msi-v2-part-3

V1 of this series can be found here:

    https://lore.kernel.org/r/20211126224100.303046749@linutronix.de

Changes versus V1:

  - Remove the allocation counter patch as it is not required

  - Fix the powerpc fallout - Cedric

  - Fix the CONFIG typo - Niklas

  - Picked up Reviewed/Tested/Acked-by tags as appropriate

As a consequence of the discussion vs. the general direction of these
patches, part 4 is not going to be updated in it's current form.

The assumption that MSI[X] and IMS are mutually exclusive and that IMS is
basically an extension for finer grained splitup of the PCI device does not
hold.

There is a plan to refactor the code further in order to provide the
desired functionality of MSI[X]/IMS which will also gain the dynamic
extension of MSI-X vectors:

      https://lore.kernel.org/r/87o85v3znb.ffs@tglx

Thanks,

	tglx
---
 .clang-format                          |    1 
 arch/powerpc/platforms/4xx/hsta_msi.c  |    7 
 arch/powerpc/platforms/cell/axon_msi.c |    7 
 arch/powerpc/platforms/pasemi/msi.c    |    9 
 arch/powerpc/sysdev/fsl_msi.c          |    8 
 arch/powerpc/sysdev/mpic_u3msi.c       |    9 
 arch/s390/pci/pci_irq.c                |    6 
 arch/x86/pci/xen.c                     |   14 
 drivers/base/core.c                    |    3 
 drivers/base/platform-msi.c            |  110 -----
 drivers/bus/fsl-mc/fsl-mc-msi.c        |   61 --
 drivers/ntb/msi.c                      |   19 
 drivers/pci/controller/pci-hyperv.c    |   15 
 drivers/pci/msi/irqdomain.c            |   11 
 drivers/pci/msi/legacy.c               |   20 
 drivers/pci/msi/msi.c                  |  258 ++++++------
 drivers/pci/xen-pcifront.c             |    2 
 drivers/soc/ti/ti_sci_inta_msi.c       |   77 +--
 include/linux/device.h                 |    4 
 include/linux/msi.h                    |  107 +++--
 include/linux/soc/ti/ti_sci_inta_msi.h |    1 
 kernel/irq/msi.c                       |  700 ++++++++++++++++++++++-----------
 22 files changed, 794 insertions(+), 655 deletions(-)

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 01/31] genirq/msi: Move descriptor list to struct msi_device_data
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 02/31] genirq/msi: Add mutex for MSI list protection Thomas Gleixner
                   ` (30 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

It's only required when MSI is in use.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/base/core.c    |    3 ---
 include/linux/device.h |    4 ----
 include/linux/msi.h    |    4 +++-
 kernel/irq/msi.c       |    5 ++++-
 4 files changed, 7 insertions(+), 9 deletions(-)

--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2874,9 +2874,6 @@ void device_initialize(struct device *de
 	INIT_LIST_HEAD(&dev->devres_head);
 	device_pm_init(dev);
 	set_dev_node(dev, NUMA_NO_NODE);
-#ifdef CONFIG_GENERIC_MSI_IRQ
-	INIT_LIST_HEAD(&dev->msi_list);
-#endif
 	INIT_LIST_HEAD(&dev->links.consumers);
 	INIT_LIST_HEAD(&dev->links.suppliers);
 	INIT_LIST_HEAD(&dev->links.defer_sync);
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -423,7 +423,6 @@ struct dev_msi_info {
  * @pins:	For device pin management.
  *		See Documentation/driver-api/pin-control.rst for details.
  * @msi:	MSI related data
- * @msi_list:	Hosts MSI descriptors
  * @numa_node:	NUMA node this device is close to.
  * @dma_ops:    DMA mapping operations for this device.
  * @dma_mask:	Dma mask (if dma'ble device).
@@ -519,9 +518,6 @@ struct device {
 	struct dev_pin_info	*pins;
 #endif
 	struct dev_msi_info	msi;
-#ifdef CONFIG_GENERIC_MSI_IRQ
-	struct list_head	msi_list;
-#endif
 #ifdef CONFIG_DMA_OPS
 	const struct dma_map_ops *dma_ops;
 #endif
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -145,11 +145,13 @@ struct msi_desc {
  * @properties:		MSI properties which are interesting to drivers
  * @attrs:		Pointer to the sysfs attribute group
  * @platform_data:	Platform-MSI specific data
+ * @list:		List of MSI descriptors associated to the device
  */
 struct msi_device_data {
 	unsigned long			properties;
 	const struct attribute_group    **attrs;
 	struct platform_msi_priv_data	*platform_data;
+	struct list_head		list;
 };
 
 int msi_setup_device_data(struct device *dev);
@@ -174,7 +176,7 @@ unsigned int msi_get_virq(struct device
 
 /* Helpers to hide struct msi_desc implementation details */
 #define msi_desc_to_dev(desc)		((desc)->dev)
-#define dev_to_msi_list(dev)		(&(dev)->msi_list)
+#define dev_to_msi_list(dev)		(&(dev)->msi.data->list)
 #define first_msi_entry(dev)		\
 	list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
 #define for_each_msi_entry(desc, dev)	\
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -99,7 +99,9 @@ EXPORT_SYMBOL_GPL(get_cached_msi_msg);
 
 static void msi_device_data_release(struct device *dev, void *res)
 {
-	WARN_ON_ONCE(!list_empty(&dev->msi_list));
+	struct msi_device_data *md = res;
+
+	WARN_ON_ONCE(!list_empty(&md->list));
 	dev->msi.data = NULL;
 }
 
@@ -124,6 +126,7 @@ int msi_setup_device_data(struct device
 	if (!md)
 		return -ENOMEM;
 
+	INIT_LIST_HEAD(&md->list);
 	dev->msi.data = md;
 	devres_add(dev, md);
 	return 0;


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 02/31] genirq/msi: Add mutex for MSI list protection
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 01/31] genirq/msi: Move descriptor list to struct msi_device_data Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-09  0:47   ` Jason Gunthorpe
  2021-12-06 22:51 ` [patch V2 03/31] genirq/msi: Provide msi_domain_alloc/free_irqs_descs_locked() Thomas Gleixner
                   ` (29 subsequent siblings)
  31 siblings, 1 reply; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

For upcoming runtime extensions of MSI-X interrupts it's required to
protect the MSI descriptor list. Add a mutex to struct msi_device_data and
provide lock/unlock functions.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/msi.h |    5 +++++
 kernel/irq/msi.c    |   25 +++++++++++++++++++++++++
 2 files changed, 30 insertions(+)

--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -3,6 +3,7 @@
 #define LINUX_MSI_H
 
 #include <linux/cpumask.h>
+#include <linux/mutex.h>
 #include <linux/list.h>
 #include <linux/bits.h>
 #include <asm/msi.h>
@@ -146,12 +147,14 @@ struct msi_desc {
  * @attrs:		Pointer to the sysfs attribute group
  * @platform_data:	Platform-MSI specific data
  * @list:		List of MSI descriptors associated to the device
+ * @mutex:		Mutex protecting the MSI list
  */
 struct msi_device_data {
 	unsigned long			properties;
 	const struct attribute_group    **attrs;
 	struct platform_msi_priv_data	*platform_data;
 	struct list_head		list;
+	struct mutex			mutex;
 };
 
 int msi_setup_device_data(struct device *dev);
@@ -173,6 +176,8 @@ static inline void msi_device_set_proper
 #endif
 
 unsigned int msi_get_virq(struct device *dev, unsigned int index);
+void msi_lock_descs(struct device *dev);
+void msi_unlock_descs(struct device *dev);
 
 /* Helpers to hide struct msi_desc implementation details */
 #define msi_desc_to_dev(desc)		((desc)->dev)
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -127,12 +127,37 @@ int msi_setup_device_data(struct device
 		return -ENOMEM;
 
 	INIT_LIST_HEAD(&md->list);
+	mutex_init(&md->mutex);
 	dev->msi.data = md;
 	devres_add(dev, md);
 	return 0;
 }
 
 /**
+ * msi_lock_descs - Lock the MSI descriptor storage of a device
+ * @dev:	Device to operate on
+ */
+void msi_lock_descs(struct device *dev)
+{
+	if (WARN_ON_ONCE(!dev->msi.data))
+		return;
+	mutex_lock(&dev->msi.data->mutex);
+}
+EXPORT_SYMBOL_GPL(msi_lock_descs);
+
+/**
+ * msi_unlock_descs - Unlock the MSI descriptor storage of a device
+ * @dev:	Device to operate on
+ */
+void msi_unlock_descs(struct device *dev)
+{
+	if (WARN_ON_ONCE(!dev->msi.data))
+		return;
+	mutex_unlock(&dev->msi.data->mutex);
+}
+EXPORT_SYMBOL_GPL(msi_unlock_descs);
+
+/**
  * msi_get_virq - Return Linux interrupt number of a MSI interrupt
  * @dev:	Device to operate on
  * @index:	MSI interrupt index to look for (0-based)


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 03/31] genirq/msi: Provide msi_domain_alloc/free_irqs_descs_locked()
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 01/31] genirq/msi: Move descriptor list to struct msi_device_data Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 02/31] genirq/msi: Add mutex for MSI list protection Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 04/31] genirq/msi: Provide a set of advanced MSI accessors and iterators Thomas Gleixner
                   ` (28 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Usage sites which do allocations of the MSI descriptors before invoking
msi_domain_alloc_irqs() require to lock the MSI decriptors accross the
operation.

Provide entry points which can be called with the MSI mutex held and lock
the mutex in the existing entry points.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/msi.h |    3 ++
 kernel/irq/msi.c    |   74 ++++++++++++++++++++++++++++++++++++++++------------
 2 files changed, 61 insertions(+), 16 deletions(-)

--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -399,9 +399,12 @@ struct irq_domain *msi_create_irq_domain
 					 struct irq_domain *parent);
 int __msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
 			    int nvec);
+int msi_domain_alloc_irqs_descs_locked(struct irq_domain *domain, struct device *dev,
+				       int nvec);
 int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
 			  int nvec);
 void __msi_domain_free_irqs(struct irq_domain *domain, struct device *dev);
+void msi_domain_free_irqs_descs_locked(struct irq_domain *domain, struct device *dev);
 void msi_domain_free_irqs(struct irq_domain *domain, struct device *dev);
 struct msi_domain_info *msi_get_domain_info(struct irq_domain *domain);
 
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -689,10 +689,8 @@ int __msi_domain_alloc_irqs(struct irq_d
 		virq = __irq_domain_alloc_irqs(domain, -1, desc->nvec_used,
 					       dev_to_node(dev), &arg, false,
 					       desc->affinity);
-		if (virq < 0) {
-			ret = msi_handle_pci_fail(domain, desc, allocated);
-			goto cleanup;
-		}
+		if (virq < 0)
+			return msi_handle_pci_fail(domain, desc, allocated);
 
 		for (i = 0; i < desc->nvec_used; i++) {
 			irq_set_msi_desc_off(virq, i, desc);
@@ -726,7 +724,7 @@ int __msi_domain_alloc_irqs(struct irq_d
 		}
 		ret = irq_domain_activate_irq(irq_data, can_reserve);
 		if (ret)
-			goto cleanup;
+			return ret;
 	}
 
 skip_activate:
@@ -741,38 +739,63 @@ int __msi_domain_alloc_irqs(struct irq_d
 		}
 	}
 	return 0;
-
-cleanup:
-	msi_domain_free_irqs(domain, dev);
-	return ret;
 }
 
 /**
- * msi_domain_alloc_irqs - Allocate interrupts from a MSI interrupt domain
+ * msi_domain_alloc_irqs_descs_locked - Allocate interrupts from a MSI interrupt domain
  * @domain:	The domain to allocate from
  * @dev:	Pointer to device struct of the device for which the interrupts
  *		are allocated
  * @nvec:	The number of interrupts to allocate
  *
+ * Must be invoked from within a msi_lock_descs() / msi_unlock_descs()
+ * pair. Use this for MSI irqdomains which implement their own vector
+ * allocation/free.
+ *
  * Return: %0 on success or an error code.
  */
-int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
-			  int nvec)
+int msi_domain_alloc_irqs_descs_locked(struct irq_domain *domain, struct device *dev,
+				       int nvec)
 {
 	struct msi_domain_info *info = domain->host_data;
 	struct msi_domain_ops *ops = info->ops;
 	int ret;
 
+	lockdep_assert_held(&dev->msi.data->mutex);
+
 	ret = ops->domain_alloc_irqs(domain, dev, nvec);
 	if (ret)
-		return ret;
+		goto cleanup;
 
 	if (!(info->flags & MSI_FLAG_DEV_SYSFS))
 		return 0;
 
 	ret = msi_device_populate_sysfs(dev);
 	if (ret)
-		msi_domain_free_irqs(domain, dev);
+		goto cleanup;
+	return 0;
+
+cleanup:
+	msi_domain_free_irqs_descs_locked(domain, dev);
+	return ret;
+}
+
+/**
+ * msi_domain_alloc_irqs - Allocate interrupts from a MSI interrupt domain
+ * @domain:	The domain to allocate from
+ * @dev:	Pointer to device struct of the device for which the interrupts
+ *		are allocated
+ * @nvec:	The number of interrupts to allocate
+ *
+ * Return: %0 on success or an error code.
+ */
+int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev, int nvec)
+{
+	int ret;
+
+	msi_lock_descs(dev);
+	ret = msi_domain_alloc_irqs_descs_locked(domain, dev, nvec);
+	msi_unlock_descs(dev);
 	return ret;
 }
 
@@ -802,22 +825,41 @@ void __msi_domain_free_irqs(struct irq_d
 }
 
 /**
- * msi_domain_free_irqs - Free interrupts from a MSI interrupt @domain associated to @dev
+ * msi_domain_free_irqs_descs_locked - Free interrupts from a MSI interrupt @domain associated to @dev
  * @domain:	The domain to managing the interrupts
  * @dev:	Pointer to device struct of the device for which the interrupts
  *		are free
+ *
+ * Must be invoked from within a msi_lock_descs() / msi_unlock_descs()
+ * pair. Use this for MSI irqdomains which implement their own vector
+ * allocation.
  */
-void msi_domain_free_irqs(struct irq_domain *domain, struct device *dev)
+void msi_domain_free_irqs_descs_locked(struct irq_domain *domain, struct device *dev)
 {
 	struct msi_domain_info *info = domain->host_data;
 	struct msi_domain_ops *ops = info->ops;
 
+	lockdep_assert_held(&dev->msi.data->mutex);
+
 	if (info->flags & MSI_FLAG_DEV_SYSFS)
 		msi_device_destroy_sysfs(dev);
 	ops->domain_free_irqs(domain, dev);
 }
 
 /**
+ * msi_domain_free_irqs - Free interrupts from a MSI interrupt @domain associated to @dev
+ * @domain:	The domain to managing the interrupts
+ * @dev:	Pointer to device struct of the device for which the interrupts
+ *		are free
+ */
+void msi_domain_free_irqs(struct irq_domain *domain, struct device *dev)
+{
+	msi_lock_descs(dev);
+	msi_domain_free_irqs_descs_locked(domain, dev);
+	msi_unlock_descs(dev);
+}
+
+/**
  * msi_get_domain_info - Get the MSI interrupt domain info for @domain
  * @domain:	The interrupt domain to retrieve data from
  *


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 04/31] genirq/msi: Provide a set of advanced MSI accessors and iterators
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (2 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 03/31] genirq/msi: Provide msi_domain_alloc/free_irqs_descs_locked() Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 05/31] genirq/msi: Provide msi_alloc_msi_desc() and a simple allocator Thomas Gleixner
                   ` (27 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

In preparation for dynamic handling of MSI-X interrupts provide a new set
of MSI descriptor accessor functions and iterators. They are benefitial per
se as they allow to cleanup quite some code in various MSI domain
implementations.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/msi.h |   33 +++++++++++++++++
 kernel/irq/msi.c    |   96 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 129 insertions(+)

--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -141,6 +141,18 @@ struct msi_desc {
 	struct pci_msi_desc		pci;
 };
 
+/*
+ * Filter values for the MSI descriptor iterators and accessor functions.
+ */
+enum msi_desc_filter {
+	/* All descriptors */
+	MSI_DESC_ALL,
+	/* Descriptors which have no interrupt associated */
+	MSI_DESC_NOTASSOCIATED,
+	/* Descriptors which have an interrupt associated */
+	MSI_DESC_ASSOCIATED,
+};
+
 /**
  * msi_device_data - MSI per device data
  * @properties:		MSI properties which are interesting to drivers
@@ -148,6 +160,7 @@ struct msi_desc {
  * @platform_data:	Platform-MSI specific data
  * @list:		List of MSI descriptors associated to the device
  * @mutex:		Mutex protecting the MSI list
+ * @__next:		Cached pointer to the next entry for iterators
  */
 struct msi_device_data {
 	unsigned long			properties;
@@ -155,6 +168,7 @@ struct msi_device_data {
 	struct platform_msi_priv_data	*platform_data;
 	struct list_head		list;
 	struct mutex			mutex;
+	struct msi_desc			*__next;
 };
 
 int msi_setup_device_data(struct device *dev);
@@ -177,6 +191,25 @@ unsigned int msi_get_virq(struct device
 void msi_lock_descs(struct device *dev);
 void msi_unlock_descs(struct device *dev);
 
+struct msi_desc *msi_first_desc(struct device *dev, enum msi_desc_filter filter);
+struct msi_desc *msi_next_desc(struct device *dev, enum msi_desc_filter filter);
+
+/**
+ * msi_for_each_desc - Iterate the MSI descriptors
+ *
+ * @desc:	struct msi_desc pointer used as iterator
+ * @dev:	struct device pointer - device to iterate
+ * @filter:	Filter for descriptor selection
+ *
+ * Notes:
+ *  - The loop must be protected with a msi_lock_descs()/msi_unlock_descs()
+ *    pair.
+ *  - It is safe to remove a retrieved MSI descriptor in the loop.
+ */
+#define msi_for_each_desc(desc, dev, filter)			\
+	for ((desc) = msi_first_desc((dev), (filter)); (desc);	\
+	     (desc) = msi_next_desc((dev), (filter)))
+
 /* Helpers to hide struct msi_desc implementation details */
 #define msi_desc_to_dev(desc)		((desc)->dev)
 #define dev_to_msi_list(dev)		(&(dev)->msi.data->list)
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -142,10 +142,106 @@ void msi_unlock_descs(struct device *dev
 {
 	if (WARN_ON_ONCE(!dev->msi.data))
 		return;
+	/* Clear the next pointer which was cached by the iterator */
+	dev->msi.data->__next = NULL;
 	mutex_unlock(&dev->msi.data->mutex);
 }
 EXPORT_SYMBOL_GPL(msi_unlock_descs);
 
+static bool msi_desc_match(struct msi_desc *desc, enum msi_desc_filter filter)
+{
+	switch (filter) {
+	case MSI_DESC_ALL:
+		return true;
+	case MSI_DESC_NOTASSOCIATED:
+		return !desc->irq;
+	case MSI_DESC_ASSOCIATED:
+		return !!desc->irq;
+	}
+	WARN_ON_ONCE(1);
+	return false;
+}
+
+static struct msi_desc *msi_find_first_desc(struct device *dev, enum msi_desc_filter filter)
+{
+	struct msi_desc *desc;
+
+	list_for_each_entry(desc, dev_to_msi_list(dev), list) {
+		if (msi_desc_match(desc, filter))
+			return desc;
+	}
+	return NULL;
+}
+
+/**
+ * msi_first_desc - Get the first MSI descriptor of a device
+ * @dev:	Device to operate on
+ * @filter:	Descriptor state filter
+ *
+ * Must be called with the MSI descriptor mutex held, i.e. msi_lock_descs()
+ * must be invoked before the call.
+ *
+ * Return: Pointer to the first MSI descriptor matching the search
+ *	   criteria, NULL if none found.
+ */
+struct msi_desc *msi_first_desc(struct device *dev, enum msi_desc_filter filter)
+{
+	struct msi_desc *desc;
+
+	if (WARN_ON_ONCE(!dev->msi.data))
+		return NULL;
+
+	lockdep_assert_held(&dev->msi.data->mutex);
+
+	desc = msi_find_first_desc(dev, filter);
+	dev->msi.data->__next = desc ? list_next_entry(desc, list) : NULL;
+	return desc;
+}
+EXPORT_SYMBOL_GPL(msi_first_desc);
+
+static struct msi_desc *__msi_next_desc(struct device *dev, enum msi_desc_filter filter,
+					struct msi_desc *from)
+{
+	struct msi_desc *desc = from;
+
+	list_for_each_entry_from(desc, dev_to_msi_list(dev), list) {
+		if (msi_desc_match(desc, filter))
+			return desc;
+	}
+	return NULL;
+}
+
+/**
+ * msi_next_desc - Get the next MSI descriptor of a device
+ * @dev:	Device to operate on
+ *
+ * The first invocation of msi_next_desc() has to be preceeded by a
+ * successful incovation of __msi_first_desc(). Consecutive invocations are
+ * only valid if the previous one was successful. All these operations have
+ * to be done within the same MSI mutex held region.
+ *
+ * Return: Pointer to the next MSI descriptor matching the search
+ *	   criteria, NULL if none found.
+ */
+struct msi_desc *msi_next_desc(struct device *dev, enum msi_desc_filter filter)
+{
+	struct msi_device_data *data = dev->msi.data;
+	struct msi_desc *desc;
+
+	if (WARN_ON_ONCE(!data))
+		return NULL;
+
+	lockdep_assert_held(&data->mutex);
+
+	if (!data->__next)
+		return NULL;
+
+	desc = __msi_next_desc(dev, filter, data->__next);
+	dev->msi.data->__next = desc ? list_next_entry(desc, list) : NULL;
+	return desc;
+}
+EXPORT_SYMBOL_GPL(msi_next_desc);
+
 /**
  * msi_get_virq - Return Linux interrupt number of a MSI interrupt
  * @dev:	Device to operate on


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 05/31] genirq/msi: Provide msi_alloc_msi_desc() and a simple allocator
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (3 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 04/31] genirq/msi: Provide a set of advanced MSI accessors and iterators Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 06/31] genirq/msi: Provide domain flags to allocate/free MSI descriptors automatically Thomas Gleixner
                   ` (26 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Provide msi_alloc_msi_desc() which takes a template MSI descriptor for
initializing a newly allocated descriptor. This allows to simplify various
usage sites of alloc_msi_entry() and moves the storage handling into the
core code.

For simple cases where only a linear vector space is required provide
msi_add_simple_msi_descs() which just allocates a linear range of MSI
descriptors and fills msi_desc::msi_index accordingly.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/msi.h |    2 +
 kernel/irq/msi.c    |   59 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+)

--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -264,6 +264,8 @@ static inline void pci_write_msi_msg(uns
 }
 #endif /* CONFIG_PCI_MSI */
 
+int msi_add_msi_desc(struct device *dev, struct msi_desc *init_desc);
+
 struct msi_desc *alloc_msi_entry(struct device *dev, int nvec,
 				 const struct irq_affinity_desc *affinity);
 void free_msi_entry(struct msi_desc *entry);
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -61,6 +61,65 @@ void free_msi_entry(struct msi_desc *ent
 }
 
 /**
+ * msi_add_msi_desc - Allocate and initialize a MSI descriptor
+ * @dev:	Pointer to the device for which the descriptor is allocated
+ * @init_desc:	Pointer to an MSI descriptor to initialize the new descriptor
+ *
+ * Return: 0 on success or an appropriate failure code.
+ */
+int msi_add_msi_desc(struct device *dev, struct msi_desc *init_desc)
+{
+	struct msi_desc *desc;
+
+	lockdep_assert_held(&dev->msi.data->mutex);
+
+	desc = alloc_msi_entry(dev, init_desc->nvec_used, init_desc->affinity);
+	if (!desc)
+		return -ENOMEM;
+
+	/* Copy the MSI index and type specific data to the new descriptor. */
+	desc->msi_index = init_desc->msi_index;
+	desc->pci = init_desc->pci;
+
+	list_add_tail(&desc->list, &dev->msi.data->list);
+	return 0;
+}
+
+/**
+ * msi_add_simple_msi_descs - Allocate and initialize MSI descriptors
+ * @dev:	Pointer to the device for which the descriptors are allocated
+ * @index:	Index for the first MSI descriptor
+ * @ndesc:	Number of descriptors to allocate
+ *
+ * Return: 0 on success or an appropriate failure code.
+ */
+static int msi_add_simple_msi_descs(struct device *dev, unsigned int index, unsigned int ndesc)
+{
+	struct msi_desc *desc, *tmp;
+	LIST_HEAD(list);
+	unsigned int i;
+
+	lockdep_assert_held(&dev->msi.data->mutex);
+
+	for (i = 0; i < ndesc; i++) {
+		desc = alloc_msi_entry(dev, 1, NULL);
+		if (!desc)
+			goto fail;
+		desc->msi_index = index + i;
+		list_add_tail(&desc->list, &list);
+	}
+	list_splice_tail(&list, &dev->msi.data->list);
+	return 0;
+
+fail:
+	list_for_each_entry_safe(desc, tmp, &list, list) {
+		list_del(&desc->list);
+		free_msi_entry(desc);
+	}
+	return -ENOMEM;
+}
+
+/**
  * msi_device_set_properties - Set device specific MSI properties
  * @dev:	Pointer to the device which is queried
  * @prop:	Properties to set


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 06/31] genirq/msi: Provide domain flags to allocate/free MSI descriptors automatically
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (4 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 05/31] genirq/msi: Provide msi_alloc_msi_desc() and a simple allocator Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 07/31] PCI/MSI: Protect MSI operations Thomas Gleixner
                   ` (25 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Provide domain info flags which tell the core to allocate simple
descriptors or to free descriptors when the interrupts are freed and
implement the required functionality.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/msi.h |   17 +++++++++++++++++
 kernel/irq/msi.c    |   48 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+)

--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -106,6 +106,8 @@ struct pci_msi_desc {
 	};
 };
 
+#define MSI_MAX_INDEX		((unsigned int)USHRT_MAX)
+
 /**
  * struct msi_desc - Descriptor structure for MSI based interrupts
  * @list:	List head for management
@@ -264,6 +266,17 @@ static inline void pci_write_msi_msg(uns
 #endif /* CONFIG_PCI_MSI */
 
 int msi_add_msi_desc(struct device *dev, struct msi_desc *init_desc);
+void msi_free_msi_descs_range(struct device *dev, enum msi_desc_filter filter,
+			      unsigned int first_index, unsigned int last_index);
+
+/**
+ * msi_free_msi_descs - Free MSI descriptors of a device
+ * @dev:	Device to free the descriptors
+ */
+static inline void msi_free_msi_descs(struct device *dev)
+{
+	msi_free_msi_descs_range(dev, MSI_DESC_ALL, 0, MSI_MAX_INDEX);
+}
 
 struct msi_desc *alloc_msi_entry(struct device *dev, int nvec,
 				 const struct irq_affinity_desc *affinity);
@@ -424,6 +437,10 @@ enum {
 	MSI_FLAG_DEV_SYSFS		= (1 << 7),
 	/* MSI-X entries must be contiguous */
 	MSI_FLAG_MSIX_CONTIGUOUS	= (1 << 8),
+	/* Allocate simple MSI descriptors */
+	MSI_FLAG_ALLOC_SIMPLE_MSI_DESCS	= (1 << 9),
+	/* Free MSI descriptors */
+	MSI_FLAG_FREE_MSI_DESCS		= (1 << 10),
 };
 
 int msi_domain_set_affinity(struct irq_data *data, const struct cpumask *mask,
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -120,6 +120,32 @@ static int msi_add_simple_msi_descs(stru
 }
 
 /**
+ * msi_free_msi_descs_range - Free MSI descriptors of a device
+ * @dev:		Device to free the descriptors
+ * @filter:		Descriptor state filter
+ * @first_index:	Index to start freeing from
+ * @last_index:		Last index to be freed
+ */
+void msi_free_msi_descs_range(struct device *dev, enum msi_desc_filter filter,
+			      unsigned int first_index, unsigned int last_index)
+{
+	struct msi_desc *desc;
+
+	lockdep_assert_held(&dev->msi.data->mutex);
+
+	msi_for_each_desc(desc, dev, filter) {
+		/*
+		 * Stupid for now to handle MSI device domain until the
+		 * storage is switched over to an xarray.
+		 */
+		if (desc->msi_index < first_index || desc->msi_index > last_index)
+			continue;
+		list_del(&desc->list);
+		free_msi_entry(desc);
+	}
+}
+
+/**
  * msi_device_has_property - Check whether a device has a specific MSI property
  * @dev:	Pointer to the device which is queried
  * @prop:	Property to check for
@@ -896,6 +922,16 @@ int __msi_domain_alloc_irqs(struct irq_d
 	return 0;
 }
 
+static int msi_domain_add_simple_msi_descs(struct msi_domain_info *info,
+					   struct device *dev,
+					   unsigned int num_descs)
+{
+	if (!(info->flags & MSI_FLAG_ALLOC_SIMPLE_MSI_DESCS))
+		return 0;
+
+	return msi_add_simple_msi_descs(dev, 0, num_descs);
+}
+
 /**
  * msi_domain_alloc_irqs_descs_locked - Allocate interrupts from a MSI interrupt domain
  * @domain:	The domain to allocate from
@@ -918,6 +954,10 @@ int msi_domain_alloc_irqs_descs_locked(s
 
 	lockdep_assert_held(&dev->msi.data->mutex);
 
+	ret = msi_domain_add_simple_msi_descs(info, dev, nvec);
+	if (ret)
+		return ret;
+
 	ret = ops->domain_alloc_irqs(domain, dev, nvec);
 	if (ret)
 		goto cleanup;
@@ -979,6 +1019,13 @@ void __msi_domain_free_irqs(struct irq_d
 	}
 }
 
+static void msi_domain_free_msi_descs(struct msi_domain_info *info,
+				      struct device *dev)
+{
+	if (info->flags & MSI_FLAG_FREE_MSI_DESCS)
+		msi_free_msi_descs(dev);
+}
+
 /**
  * msi_domain_free_irqs_descs_locked - Free interrupts from a MSI interrupt @domain associated to @dev
  * @domain:	The domain to managing the interrupts
@@ -999,6 +1046,7 @@ void msi_domain_free_irqs_descs_locked(s
 	if (info->flags & MSI_FLAG_DEV_SYSFS)
 		msi_device_destroy_sysfs(dev);
 	ops->domain_free_irqs(domain, dev);
+	msi_domain_free_msi_descs(info, dev);
 }
 
 /**


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 07/31] PCI/MSI: Protect MSI operations
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (5 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 06/31] genirq/msi: Provide domain flags to allocate/free MSI descriptors automatically Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-07 21:06   ` Bjorn Helgaas
  2021-12-06 22:51 ` [patch V2 08/31] PCI/MSI: Use msi_add_msi_desc() Thomas Gleixner
                   ` (24 subsequent siblings)
  31 siblings, 1 reply; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

To prepare for dynamic extension of MSI-X vectors, protect the MSI
operations for MSI and MSI-X. This requires to move the invocation of
irq_create_affinity_masks() out of the descriptor lock section to avoid
reverse lock ordering vs. CPU hotplug lock as some callers of the PCI/MSI
allocation interfaces already hold it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/pci/msi/irqdomain.c |    4 -
 drivers/pci/msi/msi.c       |  120 ++++++++++++++++++++++++++------------------
 2 files changed, 73 insertions(+), 51 deletions(-)

--- a/drivers/pci/msi/irqdomain.c
+++ b/drivers/pci/msi/irqdomain.c
@@ -14,7 +14,7 @@ int pci_msi_setup_msi_irqs(struct pci_de
 
 	domain = dev_get_msi_domain(&dev->dev);
 	if (domain && irq_domain_is_hierarchy(domain))
-		return msi_domain_alloc_irqs(domain, &dev->dev, nvec);
+		return msi_domain_alloc_irqs_descs_locked(domain, &dev->dev, nvec);
 
 	return pci_msi_legacy_setup_msi_irqs(dev, nvec, type);
 }
@@ -25,7 +25,7 @@ void pci_msi_teardown_msi_irqs(struct pc
 
 	domain = dev_get_msi_domain(&dev->dev);
 	if (domain && irq_domain_is_hierarchy(domain))
-		msi_domain_free_irqs(domain, &dev->dev);
+		msi_domain_free_irqs_descs_locked(domain, &dev->dev);
 	else
 		pci_msi_legacy_teardown_msi_irqs(dev);
 }
--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -322,11 +322,13 @@ static void __pci_restore_msix_state(str
 
 	write_msg = arch_restore_msi_irqs(dev);
 
+	msi_lock_descs(&dev->dev);
 	for_each_pci_msi_entry(entry, dev) {
 		if (write_msg)
 			__pci_write_msi_msg(entry, &entry->msg);
 		pci_msix_write_vector_ctrl(entry, entry->pci.msix_ctrl);
 	}
+	msi_unlock_descs(&dev->dev);
 
 	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
 }
@@ -339,20 +341,16 @@ void pci_restore_msi_state(struct pci_de
 EXPORT_SYMBOL_GPL(pci_restore_msi_state);
 
 static struct msi_desc *
-msi_setup_entry(struct pci_dev *dev, int nvec, struct irq_affinity *affd)
+msi_setup_entry(struct pci_dev *dev, int nvec, struct irq_affinity_desc *masks)
 {
-	struct irq_affinity_desc *masks = NULL;
 	struct msi_desc *entry;
 	unsigned long prop;
 	u16 control;
 
-	if (affd)
-		masks = irq_create_affinity_masks(nvec, affd);
-
 	/* MSI Entry Initialization */
 	entry = alloc_msi_entry(&dev->dev, nvec, masks);
 	if (!entry)
-		goto out;
+		return NULL;
 
 	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
 	/* Lies, damned lies, and MSIs */
@@ -379,8 +377,7 @@ msi_setup_entry(struct pci_dev *dev, int
 	if (entry->pci.msi_attrib.is_64)
 		prop |= MSI_PROP_64BIT;
 	msi_device_set_properties(&dev->dev, prop);
-out:
-	kfree(masks);
+
 	return entry;
 }
 
@@ -416,14 +413,21 @@ static int msi_verify_entries(struct pci
 static int msi_capability_init(struct pci_dev *dev, int nvec,
 			       struct irq_affinity *affd)
 {
+	struct irq_affinity_desc *masks = NULL;
 	struct msi_desc *entry;
 	int ret;
 
 	pci_msi_set_enable(dev, 0);	/* Disable MSI during set up */
 
-	entry = msi_setup_entry(dev, nvec, affd);
-	if (!entry)
-		return -ENOMEM;
+	if (affd)
+		masks = irq_create_affinity_masks(nvec, affd);
+
+	msi_lock_descs(&dev->dev);
+	entry = msi_setup_entry(dev, nvec, masks);
+	if (!entry) {
+		ret = -ENOMEM;
+		goto unlock;
+	}
 
 	/* All MSIs are unmasked by default; mask them all */
 	pci_msi_mask(entry, msi_multi_mask(entry));
@@ -446,11 +450,14 @@ static int msi_capability_init(struct pc
 
 	pcibios_free_irq(dev);
 	dev->irq = entry->irq;
-	return 0;
+	goto unlock;
 
 err:
 	pci_msi_unmask(entry, msi_multi_mask(entry));
 	free_msi_irqs(dev);
+unlock:
+	msi_unlock_descs(&dev->dev);
+	kfree(masks);
 	return ret;
 }
 
@@ -477,23 +484,18 @@ static void __iomem *msix_map_region(str
 
 static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
 			      struct msix_entry *entries, int nvec,
-			      struct irq_affinity *affd)
+			      struct irq_affinity_desc *masks)
 {
-	struct irq_affinity_desc *curmsk, *masks = NULL;
+	int i, vec_count = pci_msix_vec_count(dev);
+	struct irq_affinity_desc *curmsk;
 	struct msi_desc *entry;
 	void __iomem *addr;
-	int ret, i;
-	int vec_count = pci_msix_vec_count(dev);
-
-	if (affd)
-		masks = irq_create_affinity_masks(nvec, affd);
 
 	for (i = 0, curmsk = masks; i < nvec; i++) {
 		entry = alloc_msi_entry(&dev->dev, 1, curmsk);
 		if (!entry) {
 			/* No enough memory. Don't try again */
-			ret = -ENOMEM;
-			goto out;
+			return -ENOMEM;
 		}
 
 		entry->pci.msi_attrib.is_msix	= 1;
@@ -522,10 +524,7 @@ static int msix_setup_entries(struct pci
 			curmsk++;
 	}
 	msi_device_set_properties(&dev->dev, MSI_PROP_PCI_MSIX | MSI_PROP_64BIT);
-	ret = 0;
-out:
-	kfree(masks);
-	return ret;
+	return 0;
 }
 
 static void msix_update_entries(struct pci_dev *dev, struct msix_entry *entries)
@@ -552,6 +551,41 @@ static void msix_mask_all(void __iomem *
 		writel(ctrl, base + PCI_MSIX_ENTRY_VECTOR_CTRL);
 }
 
+static int msix_setup_interrupts(struct pci_dev *dev, void __iomem *base,
+				 struct msix_entry *entries, int nvec,
+				 struct irq_affinity *affd)
+{
+	struct irq_affinity_desc *masks = NULL;
+	int ret;
+
+	if (affd)
+		masks = irq_create_affinity_masks(nvec, affd);
+
+	msi_lock_descs(&dev->dev);
+	ret = msix_setup_entries(dev, base, entries, nvec, masks);
+	if (ret)
+		goto out_free;
+
+	ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX);
+	if (ret)
+		goto out_free;
+
+	/* Check if all MSI entries honor device restrictions */
+	ret = msi_verify_entries(dev);
+	if (ret)
+		goto out_free;
+
+	msix_update_entries(dev, entries);
+	goto out_unlock;
+
+out_free:
+	free_msi_irqs(dev);
+out_unlock:
+	msi_unlock_descs(&dev->dev);
+	kfree(masks);
+	return ret;
+}
+
 /**
  * msix_capability_init - configure device's MSI-X capability
  * @dev: pointer to the pci_dev data structure of MSI-X device function
@@ -592,20 +626,9 @@ static int msix_capability_init(struct p
 	/* Ensure that all table entries are masked. */
 	msix_mask_all(base, tsize);
 
-	ret = msix_setup_entries(dev, base, entries, nvec, affd);
+	ret = msix_setup_interrupts(dev, base, entries, nvec, affd);
 	if (ret)
-		goto out_free;
-
-	ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX);
-	if (ret)
-		goto out_free;
-
-	/* Check if all MSI entries honor device restrictions */
-	ret = msi_verify_entries(dev);
-	if (ret)
-		goto out_free;
-
-	msix_update_entries(dev, entries);
+		goto out_disable;
 
 	/* Set MSI-X enabled bits and unmask the function */
 	pci_intx_for_msi(dev, 0);
@@ -615,12 +638,8 @@ static int msix_capability_init(struct p
 	pcibios_free_irq(dev);
 	return 0;
 
-out_free:
-	free_msi_irqs(dev);
-
 out_disable:
 	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
-
 	return ret;
 }
 
@@ -725,8 +744,10 @@ void pci_disable_msi(struct pci_dev *dev
 	if (!pci_msi_enable || !dev || !dev->msi_enabled)
 		return;
 
+	msi_lock_descs(&dev->dev);
 	pci_msi_shutdown(dev);
 	free_msi_irqs(dev);
+	msi_unlock_descs(&dev->dev);
 }
 EXPORT_SYMBOL(pci_disable_msi);
 
@@ -812,8 +833,10 @@ void pci_disable_msix(struct pci_dev *de
 	if (!pci_msi_enable || !dev || !dev->msix_enabled)
 		return;
 
+	msi_lock_descs(&dev->dev);
 	pci_msix_shutdown(dev);
 	free_msi_irqs(dev);
+	msi_unlock_descs(&dev->dev);
 }
 EXPORT_SYMBOL(pci_disable_msix);
 
@@ -874,7 +897,6 @@ int pci_enable_msi(struct pci_dev *dev)
 
 	if (!rc)
 		rc = __pci_enable_msi_range(dev, 1, 1, NULL);
-
 	return rc < 0 ? rc : 0;
 }
 EXPORT_SYMBOL(pci_enable_msi);
@@ -961,11 +983,7 @@ int pci_alloc_irq_vectors_affinity(struc
 				   struct irq_affinity *affd)
 {
 	struct irq_affinity msi_default_affd = {0};
-	int ret = msi_setup_device_data(&dev->dev);
-	int nvecs = -ENOSPC;
-
-	if (ret)
-		return ret;
+	int ret, nvecs;
 
 	if (flags & PCI_IRQ_AFFINITY) {
 		if (!affd)
@@ -975,6 +993,10 @@ int pci_alloc_irq_vectors_affinity(struc
 			affd = NULL;
 	}
 
+	ret = msi_setup_device_data(&dev->dev);
+	if (ret)
+		return ret;
+
 	if (flags & PCI_IRQ_MSIX) {
 		nvecs = __pci_enable_msix_range(dev, NULL, min_vecs, max_vecs,
 						affd, flags);
@@ -1003,7 +1025,7 @@ int pci_alloc_irq_vectors_affinity(struc
 		}
 	}
 
-	return nvecs;
+	return -ENOSPC;
 }
 EXPORT_SYMBOL(pci_alloc_irq_vectors_affinity);
 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 08/31] PCI/MSI: Use msi_add_msi_desc()
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (6 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 07/31] PCI/MSI: Protect MSI operations Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-07 21:07   ` Bjorn Helgaas
  2021-12-06 22:51 ` [patch V2 09/31] PCI/MSI: Let core code free MSI descriptors Thomas Gleixner
                   ` (23 subsequent siblings)
  31 siblings, 1 reply; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Simplify the allocation of MSI descriptors by using msi_add_msi_desc()
which moves the storage handling to core code and prepares for dynamic
extension of the MSI-X vector space.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/pci/msi/msi.c |  122 ++++++++++++++++++++++++--------------------------
 1 file changed, 59 insertions(+), 63 deletions(-)

--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -340,45 +340,51 @@ void pci_restore_msi_state(struct pci_de
 }
 EXPORT_SYMBOL_GPL(pci_restore_msi_state);
 
-static struct msi_desc *
-msi_setup_entry(struct pci_dev *dev, int nvec, struct irq_affinity_desc *masks)
+static int msi_setup_msi_desc(struct pci_dev *dev, int nvec,
+			      struct irq_affinity_desc *masks)
 {
-	struct msi_desc *entry;
+	struct msi_desc desc;
 	unsigned long prop;
 	u16 control;
+	int ret;
 
 	/* MSI Entry Initialization */
-	entry = alloc_msi_entry(&dev->dev, nvec, masks);
-	if (!entry)
-		return NULL;
+	memset(&desc, 0, sizeof(desc));
 
 	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
 	/* Lies, damned lies, and MSIs */
 	if (dev->dev_flags & PCI_DEV_FLAGS_HAS_MSI_MASKING)
 		control |= PCI_MSI_FLAGS_MASKBIT;
+	/* Respect XEN's mask disabling */
+	if (pci_msi_ignore_mask)
+		control &= ~PCI_MSI_FLAGS_MASKBIT;
 
-	entry->pci.msi_attrib.is_64	= !!(control & PCI_MSI_FLAGS_64BIT);
-	entry->pci.msi_attrib.can_mask	= !pci_msi_ignore_mask &&
-					  !!(control & PCI_MSI_FLAGS_MASKBIT);
-	entry->pci.msi_attrib.default_irq = dev->irq;
-	entry->pci.msi_attrib.multi_cap	= (control & PCI_MSI_FLAGS_QMASK) >> 1;
-	entry->pci.msi_attrib.multiple	= ilog2(__roundup_pow_of_two(nvec));
+	desc.nvec_used			= nvec;
+	desc.pci.msi_attrib.is_64	= !!(control & PCI_MSI_FLAGS_64BIT);
+	desc.pci.msi_attrib.can_mask	= !!(control & PCI_MSI_FLAGS_MASKBIT);
+	desc.pci.msi_attrib.default_irq	= dev->irq;
+	desc.pci.msi_attrib.multi_cap	= (control & PCI_MSI_FLAGS_QMASK) >> 1;
+	desc.pci.msi_attrib.multiple	= ilog2(__roundup_pow_of_two(nvec));
+	desc.affinity			= masks;
 
 	if (control & PCI_MSI_FLAGS_64BIT)
-		entry->pci.mask_pos = dev->msi_cap + PCI_MSI_MASK_64;
+		desc.pci.mask_pos = dev->msi_cap + PCI_MSI_MASK_64;
 	else
-		entry->pci.mask_pos = dev->msi_cap + PCI_MSI_MASK_32;
+		desc.pci.mask_pos = dev->msi_cap + PCI_MSI_MASK_32;
 
 	/* Save the initial mask status */
-	if (entry->pci.msi_attrib.can_mask)
-		pci_read_config_dword(dev, entry->pci.mask_pos, &entry->pci.msi_mask);
+	if (desc.pci.msi_attrib.can_mask)
+		pci_read_config_dword(dev, desc.pci.mask_pos, &desc.pci.msi_mask);
 
-	prop = MSI_PROP_PCI_MSI;
-	if (entry->pci.msi_attrib.is_64)
-		prop |= MSI_PROP_64BIT;
-	msi_device_set_properties(&dev->dev, prop);
+	ret = msi_add_msi_desc(&dev->dev, &desc);
+	if (!ret) {
+		prop = MSI_PROP_PCI_MSI;
+		if (desc.pci.msi_attrib.is_64)
+			prop |= MSI_PROP_64BIT;
+		msi_device_set_properties(&dev->dev, prop);
+	}
 
-	return entry;
+	return ret;
 }
 
 static int msi_verify_entries(struct pci_dev *dev)
@@ -423,17 +429,14 @@ static int msi_capability_init(struct pc
 		masks = irq_create_affinity_masks(nvec, affd);
 
 	msi_lock_descs(&dev->dev);
-	entry = msi_setup_entry(dev, nvec, masks);
-	if (!entry) {
-		ret = -ENOMEM;
+	ret = msi_setup_msi_desc(dev, nvec, masks);
+	if (ret)
 		goto unlock;
-	}
 
 	/* All MSIs are unmasked by default; mask them all */
+	entry = first_pci_msi_entry(dev);
 	pci_msi_mask(entry, msi_multi_mask(entry));
 
-	list_add_tail(&entry->list, dev_to_msi_list(&dev->dev));
-
 	/* Configure MSI capability structure */
 	ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSI);
 	if (ret)
@@ -482,49 +485,40 @@ static void __iomem *msix_map_region(str
 	return ioremap(phys_addr, nr_entries * PCI_MSIX_ENTRY_SIZE);
 }
 
-static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
-			      struct msix_entry *entries, int nvec,
-			      struct irq_affinity_desc *masks)
+static int msix_setup_msi_descs(struct pci_dev *dev, void __iomem *base,
+				struct msix_entry *entries, int nvec,
+				struct irq_affinity_desc *masks)
 {
-	int i, vec_count = pci_msix_vec_count(dev);
+	int ret = 0, i, vec_count = pci_msix_vec_count(dev);
 	struct irq_affinity_desc *curmsk;
-	struct msi_desc *entry;
+	struct msi_desc desc;
 	void __iomem *addr;
 
-	for (i = 0, curmsk = masks; i < nvec; i++) {
-		entry = alloc_msi_entry(&dev->dev, 1, curmsk);
-		if (!entry) {
-			/* No enough memory. Don't try again */
-			return -ENOMEM;
-		}
-
-		entry->pci.msi_attrib.is_msix	= 1;
-		entry->pci.msi_attrib.is_64	= 1;
-
-		if (entries)
-			entry->msi_index = entries[i].entry;
-		else
-			entry->msi_index = i;
-
-		entry->pci.msi_attrib.is_virtual = entry->msi_index >= vec_count;
-
-		entry->pci.msi_attrib.can_mask	= !pci_msi_ignore_mask &&
-						  !entry->pci.msi_attrib.is_virtual;
-
-		entry->pci.msi_attrib.default_irq	= dev->irq;
-		entry->pci.mask_base			= base;
+	memset(&desc, 0, sizeof(desc));
 
-		if (entry->pci.msi_attrib.can_mask) {
-			addr = pci_msix_desc_addr(entry);
-			entry->pci.msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
+	desc.nvec_used			= 1;
+	desc.pci.msi_attrib.is_msix	= 1;
+	desc.pci.msi_attrib.is_64	= 1;
+	desc.pci.msi_attrib.default_irq	= dev->irq;
+	desc.pci.mask_base		= base;
+
+	for (i = 0, curmsk = masks; i < nvec; i++, curmsk++) {
+		desc.msi_index = entries ? entries[i].entry : i;
+		desc.affinity = masks ? curmsk : NULL;
+		desc.pci.msi_attrib.is_virtual = desc.msi_index >= vec_count;
+		desc.pci.msi_attrib.can_mask = !pci_msi_ignore_mask &&
+					       !desc.pci.msi_attrib.is_virtual;
+
+		if (!desc.pci.msi_attrib.can_mask) {
+			addr = pci_msix_desc_addr(&desc);
+			desc.pci.msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
 		}
 
-		list_add_tail(&entry->list, dev_to_msi_list(&dev->dev));
-		if (masks)
-			curmsk++;
+		ret = msi_add_msi_desc(&dev->dev, &desc);
+		if (ret)
+			break;
 	}
-	msi_device_set_properties(&dev->dev, MSI_PROP_PCI_MSIX | MSI_PROP_64BIT);
-	return 0;
+	return ret;
 }
 
 static void msix_update_entries(struct pci_dev *dev, struct msix_entry *entries)
@@ -562,10 +556,12 @@ static int msix_setup_interrupts(struct
 		masks = irq_create_affinity_masks(nvec, affd);
 
 	msi_lock_descs(&dev->dev);
-	ret = msix_setup_entries(dev, base, entries, nvec, masks);
+	ret = msix_setup_msi_descs(dev, base, entries, nvec, masks);
 	if (ret)
 		goto out_free;
 
+	msi_device_set_properties(&dev->dev, MSI_PROP_PCI_MSIX | MSI_PROP_64BIT);
+
 	ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX);
 	if (ret)
 		goto out_free;


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 09/31] PCI/MSI: Let core code free MSI descriptors
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (7 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 08/31] PCI/MSI: Use msi_add_msi_desc() Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-07 21:07   ` Bjorn Helgaas
  2021-12-06 22:51 ` [patch V2 10/31] PCI/MSI: Use msi_on_each_desc() Thomas Gleixner
                   ` (22 subsequent siblings)
  31 siblings, 1 reply; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Set the domain info flag which tells the core code to free the MSI
descriptors from msi_domain_free_irqs() and add an explicit call to the
core function into the legacy code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/pci/msi/irqdomain.c |    3 ++-
 drivers/pci/msi/legacy.c    |    1 +
 drivers/pci/msi/msi.c       |   14 --------------
 3 files changed, 3 insertions(+), 15 deletions(-)

--- a/drivers/pci/msi/irqdomain.c
+++ b/drivers/pci/msi/irqdomain.c
@@ -171,7 +171,8 @@ struct irq_domain *pci_msi_create_irq_do
 	if (info->flags & MSI_FLAG_USE_DEF_CHIP_OPS)
 		pci_msi_domain_update_chip_ops(info);
 
-	info->flags |= MSI_FLAG_ACTIVATE_EARLY | MSI_FLAG_DEV_SYSFS;
+	info->flags |= MSI_FLAG_ACTIVATE_EARLY | MSI_FLAG_DEV_SYSFS |
+		       MSI_FLAG_FREE_MSI_DESCS;
 	if (IS_ENABLED(CONFIG_GENERIC_IRQ_RESERVATION_MODE))
 		info->flags |= MSI_FLAG_MUST_REACTIVATE;
 
--- a/drivers/pci/msi/legacy.c
+++ b/drivers/pci/msi/legacy.c
@@ -80,4 +80,5 @@ void pci_msi_legacy_teardown_msi_irqs(st
 {
 	msi_device_destroy_sysfs(&dev->dev);
 	arch_teardown_msi_irqs(dev);
+	msi_free_msi_descs(&dev->dev);
 }
--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -224,22 +224,8 @@ EXPORT_SYMBOL_GPL(pci_write_msi_msg);
 
 static void free_msi_irqs(struct pci_dev *dev)
 {
-	struct list_head *msi_list = dev_to_msi_list(&dev->dev);
-	struct msi_desc *entry, *tmp;
-	int i;
-
-	for_each_pci_msi_entry(entry, dev)
-		if (entry->irq)
-			for (i = 0; i < entry->nvec_used; i++)
-				BUG_ON(irq_has_action(entry->irq + i));
-
 	pci_msi_teardown_msi_irqs(dev);
 
-	list_for_each_entry_safe(entry, tmp, msi_list, list) {
-		list_del(&entry->list);
-		free_msi_entry(entry);
-	}
-
 	if (dev->msix_base) {
 		iounmap(dev->msix_base);
 		dev->msix_base = NULL;


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 10/31] PCI/MSI: Use msi_on_each_desc()
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (8 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 09/31] PCI/MSI: Let core code free MSI descriptors Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-07 21:07   ` Bjorn Helgaas
  2021-12-06 22:51 ` [patch V2 11/31] x86/pci/xen: Use msi_for_each_desc() Thomas Gleixner
                   ` (21 subsequent siblings)
  31 siblings, 1 reply; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Use the new iterator functions which pave the way for dynamically extending
MSI-X vectors.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/pci/msi/irqdomain.c |    4 ++--
 drivers/pci/msi/legacy.c    |   19 ++++++++-----------
 drivers/pci/msi/msi.c       |   30 ++++++++++++++----------------
 3 files changed, 24 insertions(+), 29 deletions(-)

--- a/drivers/pci/msi/irqdomain.c
+++ b/drivers/pci/msi/irqdomain.c
@@ -83,7 +83,7 @@ static int pci_msi_domain_check_cap(stru
 				    struct msi_domain_info *info,
 				    struct device *dev)
 {
-	struct msi_desc *desc = first_pci_msi_entry(to_pci_dev(dev));
+	struct msi_desc *desc = msi_first_desc(dev, MSI_DESC_ALL);
 
 	/* Special handling to support __pci_enable_msi_range() */
 	if (pci_msi_desc_is_multi_msi(desc) &&
@@ -98,7 +98,7 @@ static int pci_msi_domain_check_cap(stru
 			unsigned int idx = 0;
 
 			/* Check for gaps in the entry indices */
-			for_each_msi_entry(desc, dev) {
+			msi_for_each_desc(desc, dev, MSI_DESC_ALL) {
 				if (desc->msi_index != idx++)
 					return -ENOTSUPP;
 			}
--- a/drivers/pci/msi/legacy.c
+++ b/drivers/pci/msi/legacy.c
@@ -28,7 +28,7 @@ int __weak arch_setup_msi_irqs(struct pc
 	if (type == PCI_CAP_ID_MSI && nvec > 1)
 		return 1;
 
-	for_each_pci_msi_entry(desc, dev) {
+	msi_for_each_desc(desc, &dev->dev, MSI_DESC_NOTASSOCIATED) {
 		ret = arch_setup_msi_irq(dev, desc);
 		if (ret)
 			return ret < 0 ? ret : -ENOSPC;
@@ -42,27 +42,24 @@ void __weak arch_teardown_msi_irqs(struc
 	struct msi_desc *desc;
 	int i;
 
-	for_each_pci_msi_entry(desc, dev) {
-		if (desc->irq) {
-			for (i = 0; i < desc->nvec_used; i++)
-				arch_teardown_msi_irq(desc->irq + i);
-		}
+	msi_for_each_desc(desc, &dev->dev, MSI_DESC_ASSOCIATED) {
+		for (i = 0; i < desc->nvec_used; i++)
+			arch_teardown_msi_irq(desc->irq + i);
 	}
 }
 
 static int pci_msi_setup_check_result(struct pci_dev *dev, int type, int ret)
 {
-	struct msi_desc *entry;
+	struct msi_desc *desc;
 	int avail = 0;
 
 	if (type != PCI_CAP_ID_MSIX || ret >= 0)
 		return ret;
 
 	/* Scan the MSI descriptors for successfully allocated ones. */
-	for_each_pci_msi_entry(entry, dev) {
-		if (entry->irq != 0)
-			avail++;
-	}
+	msi_for_each_desc(desc, &dev->dev, MSI_DESC_ASSOCIATED)
+		avail++;
+
 	return avail ? avail : ret;
 }
 
--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -299,7 +299,6 @@ static void __pci_restore_msix_state(str
 
 	if (!dev->msix_enabled)
 		return;
-	BUG_ON(list_empty(dev_to_msi_list(&dev->dev)));
 
 	/* route the table */
 	pci_intx_for_msi(dev, 0);
@@ -309,7 +308,7 @@ static void __pci_restore_msix_state(str
 	write_msg = arch_restore_msi_irqs(dev);
 
 	msi_lock_descs(&dev->dev);
-	for_each_pci_msi_entry(entry, dev) {
+	msi_for_each_desc(entry, &dev->dev, MSI_DESC_ALL) {
 		if (write_msg)
 			__pci_write_msi_msg(entry, &entry->msg);
 		pci_msix_write_vector_ctrl(entry, entry->pci.msix_ctrl);
@@ -378,14 +377,14 @@ static int msi_verify_entries(struct pci
 	if (!dev->no_64bit_msi)
 		return 0;
 
-	for_each_pci_msi_entry(entry, dev) {
+	msi_for_each_desc(entry, &dev->dev, MSI_DESC_ALL) {
 		if (entry->msg.address_hi) {
 			pci_err(dev, "arch assigned 64-bit MSI address %#x%08x but device only supports 32 bits\n",
 				entry->msg.address_hi, entry->msg.address_lo);
-			return -EIO;
+			break;
 		}
 	}
-	return 0;
+	return !entry ? 0 : -EIO;
 }
 
 /**
@@ -418,7 +417,7 @@ static int msi_capability_init(struct pc
 		goto unlock;
 
 	/* All MSIs are unmasked by default; mask them all */
-	entry = first_pci_msi_entry(dev);
+	entry = msi_first_desc(&dev->dev, MSI_DESC_ALL);
 	pci_msi_mask(entry, msi_multi_mask(entry));
 
 	/* Configure MSI capability structure */
@@ -508,11 +507,11 @@ static int msix_setup_msi_descs(struct p
 
 static void msix_update_entries(struct pci_dev *dev, struct msix_entry *entries)
 {
-	struct msi_desc *entry;
+	struct msi_desc *desc;
 
 	if (entries) {
-		for_each_pci_msi_entry(entry, dev) {
-			entries->vector = entry->irq;
+		msi_for_each_desc(desc, &dev->dev, MSI_DESC_ALL) {
+			entries->vector = desc->irq;
 			entries++;
 		}
 	}
@@ -705,15 +704,14 @@ static void pci_msi_shutdown(struct pci_
 	if (!pci_msi_enable || !dev || !dev->msi_enabled)
 		return;
 
-	BUG_ON(list_empty(dev_to_msi_list(&dev->dev)));
-	desc = first_pci_msi_entry(dev);
-
 	pci_msi_set_enable(dev, 0);
 	pci_intx_for_msi(dev, 1);
 	dev->msi_enabled = 0;
 
 	/* Return the device with MSI unmasked as initial states */
-	pci_msi_unmask(desc, msi_multi_mask(desc));
+	desc = msi_first_desc(&dev->dev, MSI_DESC_ALL);
+	if (!WARN_ON_ONCE(!desc))
+		pci_msi_unmask(desc, msi_multi_mask(desc));
 
 	/* Restore dev->irq to its default pin-assertion IRQ */
 	dev->irq = desc->pci.msi_attrib.default_irq;
@@ -789,7 +787,7 @@ static int __pci_enable_msix(struct pci_
 
 static void pci_msix_shutdown(struct pci_dev *dev)
 {
-	struct msi_desc *entry;
+	struct msi_desc *desc;
 
 	if (!pci_msi_enable || !dev || !dev->msix_enabled)
 		return;
@@ -800,8 +798,8 @@ static void pci_msix_shutdown(struct pci
 	}
 
 	/* Return the device with MSI-X masked as initial states */
-	for_each_pci_msi_entry(entry, dev)
-		pci_msix_mask(entry);
+	msi_for_each_desc(desc, &dev->dev, MSI_DESC_ALL)
+		pci_msix_mask(desc);
 
 	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
 	pci_intx_for_msi(dev, 1);


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 11/31] x86/pci/xen: Use msi_for_each_desc()
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (9 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 10/31] PCI/MSI: Use msi_on_each_desc() Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 12/31] xen/pcifront: Rework MSI handling Thomas Gleixner
                   ` (20 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Replace the about to vanish iterators.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/pci/xen.c |   14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -184,7 +184,7 @@ static int xen_setup_msi_irqs(struct pci
 	if (ret)
 		goto error;
 	i = 0;
-	for_each_pci_msi_entry(msidesc, dev) {
+	msi_for_each_desc(msidesc, &dev->dev, MSI_DESC_NOTASSOCIATED) {
 		irq = xen_bind_pirq_msi_to_irq(dev, msidesc, v[i],
 					       (type == PCI_CAP_ID_MSI) ? nvec : 1,
 					       (type == PCI_CAP_ID_MSIX) ?
@@ -235,7 +235,7 @@ static int xen_hvm_setup_msi_irqs(struct
 	if (type == PCI_CAP_ID_MSI && nvec > 1)
 		return 1;
 
-	for_each_pci_msi_entry(msidesc, dev) {
+	msi_for_each_desc(msidesc, &dev->dev, MSI_DESC_NOTASSOCIATED) {
 		pirq = xen_allocate_pirq_msi(dev, msidesc);
 		if (pirq < 0) {
 			irq = -ENODEV;
@@ -270,7 +270,7 @@ static int xen_initdom_setup_msi_irqs(st
 	int ret = 0;
 	struct msi_desc *msidesc;
 
-	for_each_pci_msi_entry(msidesc, dev) {
+	msi_for_each_desc(msidesc, &dev->dev, MSI_DESC_NOTASSOCIATED) {
 		struct physdev_map_pirq map_irq;
 		domid_t domid;
 
@@ -389,11 +389,9 @@ static void xen_teardown_msi_irqs(struct
 	struct msi_desc *msidesc;
 	int i;
 
-	for_each_pci_msi_entry(msidesc, dev) {
-		if (msidesc->irq) {
-			for (i = 0; i < msidesc->nvec_used; i++)
-				xen_destroy_irq(msidesc->irq + i);
-		}
+	msi_for_each_desc(msidesc, &dev->dev, MSI_DESC_ASSOCIATED) {
+		for (i = 0; i < msidesc->nvec_used; i++)
+			xen_destroy_irq(msidesc->irq + i);
 	}
 }
 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 12/31] xen/pcifront: Rework MSI handling
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (10 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 11/31] x86/pci/xen: Use msi_for_each_desc() Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 13/31] s390/pci: Rework MSI descriptor walk Thomas Gleixner
                   ` (19 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Replace the about to vanish iterators.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/pci/xen-pcifront.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/pci/xen-pcifront.c
+++ b/drivers/pci/xen-pcifront.c
@@ -262,7 +262,7 @@ static int pci_frontend_enable_msix(stru
 	}
 
 	i = 0;
-	for_each_pci_msi_entry(entry, dev) {
+	msi_for_each_desc(entry, &dev->dev, MSI_DESC_NOTASSOCIATED) {
 		op.msix_entries[i].entry = entry->msi_index;
 		/* Vector is useless at this point. */
 		op.msix_entries[i].vector = -1;


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 13/31] s390/pci: Rework MSI descriptor walk
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (11 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 12/31] xen/pcifront: Rework MSI handling Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 14/31] powerpc/4xx/hsta: Rework MSI handling Thomas Gleixner
                   ` (18 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Niklas Schnelle,
	linux-s390, Heiko Carstens, Christian Borntraeger,
	Greg Kroah-Hartman, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Replace the about to vanish iterators and make use of the filtering.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Acked-by: Niklas Schnelle <schnelle@linux.ibm.com>
Cc: linux-s390@vger.kernel.org
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/pci/pci_irq.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

--- a/arch/s390/pci/pci_irq.c
+++ b/arch/s390/pci/pci_irq.c
@@ -303,7 +303,7 @@ int arch_setup_msi_irqs(struct pci_dev *
 
 	/* Request MSI interrupts */
 	hwirq = bit;
-	for_each_pci_msi_entry(msi, pdev) {
+	msi_for_each_desc(msi, &pdev->dev, MSI_DESC_NOTASSOCIATED) {
 		rc = -EIO;
 		if (hwirq - bit >= msi_vecs)
 			break;
@@ -362,9 +362,7 @@ void arch_teardown_msi_irqs(struct pci_d
 		return;
 
 	/* Release MSI interrupts */
-	for_each_pci_msi_entry(msi, pdev) {
-		if (!msi->irq)
-			continue;
+	msi_for_each_desc(msi, &pdev->dev, MSI_DESC_ASSOCIATED) {
 		irq_set_msi_desc(msi->irq, NULL);
 		irq_free_desc(msi->irq);
 		msi->msg.address_lo = 0;


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 14/31] powerpc/4xx/hsta: Rework MSI handling
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (12 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 13/31] s390/pci: Rework MSI descriptor walk Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 15/31] powerpc/cell/axon_msi: Convert to msi_on_each_desc() Thomas Gleixner
                   ` (17 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Replace the about to vanish iterators and make use of the filtering.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/powerpc/platforms/4xx/hsta_msi.c |    7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

--- a/arch/powerpc/platforms/4xx/hsta_msi.c
+++ b/arch/powerpc/platforms/4xx/hsta_msi.c
@@ -47,7 +47,7 @@ static int hsta_setup_msi_irqs(struct pc
 		return -EINVAL;
 	}
 
-	for_each_pci_msi_entry(entry, dev) {
+	msi_for_each_desc(entry, &dev->dev, MSI_DESC_NOTASSOCIATED) {
 		irq = msi_bitmap_alloc_hwirqs(&ppc4xx_hsta_msi.bmp, 1);
 		if (irq < 0) {
 			pr_debug("%s: Failed to allocate msi interrupt\n",
@@ -105,10 +105,7 @@ static void hsta_teardown_msi_irqs(struc
 	struct msi_desc *entry;
 	int irq;
 
-	for_each_pci_msi_entry(entry, dev) {
-		if (!entry->irq)
-			continue;
-
+	msi_for_each_desc(entry, &dev->dev, MSI_DESC_ASSOCIATED) {
 		irq = hsta_find_hwirq_offset(entry->irq);
 
 		/* entry->irq should always be in irq_map */


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 15/31] powerpc/cell/axon_msi: Convert to msi_on_each_desc()
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (13 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 14/31] powerpc/4xx/hsta: Rework MSI handling Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 16/31] powerpc/pasemi/msi: Convert to msi_on_each_dec() Thomas Gleixner
                   ` (16 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Replace the about to vanish iterators and make use of the filtering.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/powerpc/platforms/cell/axon_msi.c |    7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

--- a/arch/powerpc/platforms/cell/axon_msi.c
+++ b/arch/powerpc/platforms/cell/axon_msi.c
@@ -265,7 +265,7 @@ static int axon_msi_setup_msi_irqs(struc
 	if (rc)
 		return rc;
 
-	for_each_pci_msi_entry(entry, dev) {
+	msi_for_each_desc(entry, &dev->dev, MSI_DESC_NOTASSOCIATED) {
 		virq = irq_create_direct_mapping(msic->irq_domain);
 		if (!virq) {
 			dev_warn(&dev->dev,
@@ -288,10 +288,7 @@ static void axon_msi_teardown_msi_irqs(s
 
 	dev_dbg(&dev->dev, "axon_msi: tearing down msi irqs\n");
 
-	for_each_pci_msi_entry(entry, dev) {
-		if (!entry->irq)
-			continue;
-
+	msi_for_each_desc(entry, &dev->dev, MSI_DESC_ASSOCIATED) {
 		irq_set_msi_desc(entry->irq, NULL);
 		irq_dispose_mapping(entry->irq);
 	}


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 16/31] powerpc/pasemi/msi: Convert to msi_on_each_dec()
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (14 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 15/31] powerpc/cell/axon_msi: Convert to msi_on_each_desc() Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 17/31] powerpc/fsl_msi: Use msi_for_each_desc() Thomas Gleixner
                   ` (15 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Replace the about to vanish iterators and make use of the filtering.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/powerpc/platforms/pasemi/msi.c |    9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

--- a/arch/powerpc/platforms/pasemi/msi.c
+++ b/arch/powerpc/platforms/pasemi/msi.c
@@ -62,17 +62,12 @@ static void pasemi_msi_teardown_msi_irqs
 
 	pr_debug("pasemi_msi_teardown_msi_irqs, pdev %p\n", pdev);
 
-	for_each_pci_msi_entry(entry, pdev) {
-		if (!entry->irq)
-			continue;
-
+	msi_for_each_desc(entry, &pdev->dev, MSI_DESC_ASSOCIATED) {
 		hwirq = virq_to_hw(entry->irq);
 		irq_set_msi_desc(entry->irq, NULL);
 		irq_dispose_mapping(entry->irq);
 		msi_bitmap_free_hwirqs(&msi_mpic->msi_bitmap, hwirq, ALLOC_CHUNK);
 	}
-
-	return;
 }
 
 static int pasemi_msi_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
@@ -90,7 +85,7 @@ static int pasemi_msi_setup_msi_irqs(str
 	msg.address_hi = 0;
 	msg.address_lo = PASEMI_MSI_ADDR;
 
-	for_each_pci_msi_entry(entry, pdev) {
+	msi_for_each_desc(entry, &pdev->dev, MSI_DESC_NOTASSOCIATED) {
 		/* Allocate 16 interrupts for now, since that's the grouping for
 		 * affinity. This can be changed later if it turns out 32 is too
 		 * few MSIs for someone, but restrictions will apply to how the


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 17/31] powerpc/fsl_msi: Use msi_for_each_desc()
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (15 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 16/31] powerpc/pasemi/msi: Convert to msi_on_each_dec() Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 18/31] powerpc/mpic_u3msi: Use msi_for_each-desc() Thomas Gleixner
                   ` (14 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Replace the about to vanish iterators and make use of the filtering.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/powerpc/sysdev/fsl_msi.c |    8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

--- a/arch/powerpc/sysdev/fsl_msi.c
+++ b/arch/powerpc/sysdev/fsl_msi.c
@@ -125,17 +125,13 @@ static void fsl_teardown_msi_irqs(struct
 	struct fsl_msi *msi_data;
 	irq_hw_number_t hwirq;
 
-	for_each_pci_msi_entry(entry, pdev) {
-		if (!entry->irq)
-			continue;
+	msi_for_each_desc(entry, &pdev->dev, MSI_DESC_ASSOCIATED) {
 		hwirq = virq_to_hw(entry->irq);
 		msi_data = irq_get_chip_data(entry->irq);
 		irq_set_msi_desc(entry->irq, NULL);
 		irq_dispose_mapping(entry->irq);
 		msi_bitmap_free_hwirqs(&msi_data->bitmap, hwirq, 1);
 	}
-
-	return;
 }
 
 static void fsl_compose_msi_msg(struct pci_dev *pdev, int hwirq,
@@ -215,7 +211,7 @@ static int fsl_setup_msi_irqs(struct pci
 		}
 	}
 
-	for_each_pci_msi_entry(entry, pdev) {
+	msi_for_each_desc(entry, &pdev->dev, MSI_DESC_NOTASSOCIATED) {
 		/*
 		 * Loop over all the MSI devices until we find one that has an
 		 * available interrupt.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 18/31] powerpc/mpic_u3msi: Use msi_for_each-desc()
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (16 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 17/31] powerpc/fsl_msi: Use msi_for_each_desc() Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 19/31] PCI: hv: Rework MSI handling Thomas Gleixner
                   ` (13 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Replace the about to vanish iterators and make use of the filtering.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/powerpc/sysdev/mpic_u3msi.c |    9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

--- a/arch/powerpc/sysdev/mpic_u3msi.c
+++ b/arch/powerpc/sysdev/mpic_u3msi.c
@@ -104,17 +104,12 @@ static void u3msi_teardown_msi_irqs(stru
 	struct msi_desc *entry;
 	irq_hw_number_t hwirq;
 
-	for_each_pci_msi_entry(entry, pdev) {
-		if (!entry->irq)
-			continue;
-
+	msi_for_each_desc(entry, &pdev->dev, MSI_DESC_ASSOCIATED) {
 		hwirq = virq_to_hw(entry->irq);
 		irq_set_msi_desc(entry->irq, NULL);
 		irq_dispose_mapping(entry->irq);
 		msi_bitmap_free_hwirqs(&msi_mpic->msi_bitmap, hwirq, 1);
 	}
-
-	return;
 }
 
 static int u3msi_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
@@ -136,7 +131,7 @@ static int u3msi_setup_msi_irqs(struct p
 		return -ENXIO;
 	}
 
-	for_each_pci_msi_entry(entry, pdev) {
+	msi_for_each_desc(entry, &pdev->dev, MSI_DESC_NOTASSOCIATED) {
 		hwirq = msi_bitmap_alloc_hwirqs(&msi_mpic->msi_bitmap, 1);
 		if (hwirq < 0) {
 			pr_debug("u3msi: failed allocating hwirq\n");


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 19/31] PCI: hv: Rework MSI handling
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (17 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 18/31] powerpc/mpic_u3msi: Use msi_for_each-desc() Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-07 21:08   ` Bjorn Helgaas
  2021-12-06 22:51 ` [patch V2 20/31] NTB/msi: Convert to msi_on_each_desc() Thomas Gleixner
                   ` (12 subsequent siblings)
  31 siblings, 1 reply; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Replace the about to vanish iterators and make use of the filtering. Take
the descriptor lock around the iterators.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/pci/controller/pci-hyperv.c |   15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -3445,18 +3445,23 @@ static int hv_pci_suspend(struct hv_devi
 
 static int hv_pci_restore_msi_msg(struct pci_dev *pdev, void *arg)
 {
-	struct msi_desc *entry;
 	struct irq_data *irq_data;
+	struct msi_desc *entry;
+	int ret = 0;
 
-	for_each_pci_msi_entry(entry, pdev) {
+	msi_lock_descs(&pdev->dev);
+	msi_for_each_desc(entry, &pdev->dev, MSI_DESC_ASSOCIATED) {
 		irq_data = irq_get_irq_data(entry->irq);
-		if (WARN_ON_ONCE(!irq_data))
-			return -EINVAL;
+		if (WARN_ON_ONCE(!irq_data)) {
+			ret = -EINVAL;
+			break;
+		}
 
 		hv_compose_msi_msg(irq_data, &entry->msg);
 	}
+	msi_unlock_descs(&pdev->dev);
 
-	return 0;
+	return ret;
 }
 
 /*


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 20/31] NTB/msi: Convert to msi_on_each_desc()
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (18 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 19/31] PCI: hv: Rework MSI handling Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 21/31] soc: ti: ti_sci_inta_msi: Rework MSI descriptor allocation Thomas Gleixner
                   ` (11 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Logan Gunthorpe,
	Jon Mason, Dave Jiang, Allen Hubbe, linux-ntb,
	Greg Kroah-Hartman, Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger

Replace the about to vanish iterators, make use of the filtering and take
the descriptor lock around the iteration.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Allen Hubbe <allenbh@gmail.com>
Cc: linux-ntb@googlegroups.com
---
 drivers/ntb/msi.c |   19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

--- a/drivers/ntb/msi.c
+++ b/drivers/ntb/msi.c
@@ -108,8 +108,10 @@ int ntb_msi_setup_mws(struct ntb_dev *nt
 	if (!ntb->msi)
 		return -EINVAL;
 
-	desc = first_msi_entry(&ntb->pdev->dev);
+	msi_lock_descs(&ntb->pdev->dev);
+	desc = msi_first_desc(&ntb->pdev->dev, MSI_DESC_ASSOCIATED);
 	addr = desc->msg.address_lo + ((uint64_t)desc->msg.address_hi << 32);
+	msi_unlock_descs(&ntb->pdev->dev);
 
 	for (peer = 0; peer < ntb_peer_port_count(ntb); peer++) {
 		peer_widx = ntb_peer_highest_mw_idx(ntb, peer);
@@ -281,13 +283,15 @@ int ntbm_msi_request_threaded_irq(struct
 				  const char *name, void *dev_id,
 				  struct ntb_msi_desc *msi_desc)
 {
+	struct device *dev = &ntb->pdev->dev;
 	struct msi_desc *entry;
 	int ret;
 
 	if (!ntb->msi)
 		return -EINVAL;
 
-	for_each_pci_msi_entry(entry, ntb->pdev) {
+	msi_lock_descs(dev);
+	msi_for_each_desc(entry, dev, MSI_DESC_ASSOCIATED) {
 		if (irq_has_action(entry->irq))
 			continue;
 
@@ -304,14 +308,17 @@ int ntbm_msi_request_threaded_irq(struct
 		ret = ntbm_msi_setup_callback(ntb, entry, msi_desc);
 		if (ret) {
 			devm_free_irq(&ntb->dev, entry->irq, dev_id);
-			return ret;
+			goto unlock;
 		}
 
-
-		return entry->irq;
+		ret = entry->irq;
+		goto unlock;
 	}
+	ret = -ENODEV;
 
-	return -ENODEV;
+unlock:
+	msi_unlock_descs(dev);
+	return ret;
 }
 EXPORT_SYMBOL(ntbm_msi_request_threaded_irq);
 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 21/31] soc: ti: ti_sci_inta_msi: Rework MSI descriptor allocation
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (19 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 20/31] NTB/msi: Convert to msi_on_each_desc() Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-15 20:50   ` Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 22/31] soc: ti: ti_sci_inta_msi: Remove ti_sci_inta_msi_domain_free_irqs() Thomas Gleixner
                   ` (10 subsequent siblings)
  31 siblings, 1 reply; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Protect the allocation properly and use the core allocation and free
mechanism.

No functional change intended.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/soc/ti/ti_sci_inta_msi.c |   71 +++++++++++++--------------------------
 1 file changed, 25 insertions(+), 46 deletions(-)

--- a/drivers/soc/ti/ti_sci_inta_msi.c
+++ b/drivers/soc/ti/ti_sci_inta_msi.c
@@ -51,6 +51,7 @@ struct irq_domain *ti_sci_inta_msi_creat
 	struct irq_domain *domain;
 
 	ti_sci_inta_msi_update_chip_ops(info);
+	info->flags |= MSI_FLAG_FREE_MSI_DESCS;
 
 	domain = msi_create_irq_domain(fwnode, info, parent);
 	if (domain)
@@ -60,50 +61,31 @@ struct irq_domain *ti_sci_inta_msi_creat
 }
 EXPORT_SYMBOL_GPL(ti_sci_inta_msi_create_irq_domain);
 
-static void ti_sci_inta_msi_free_descs(struct device *dev)
-{
-	struct msi_desc *desc, *tmp;
-
-	list_for_each_entry_safe(desc, tmp, dev_to_msi_list(dev), list) {
-		list_del(&desc->list);
-		free_msi_entry(desc);
-	}
-}
-
 static int ti_sci_inta_msi_alloc_descs(struct device *dev,
 				       struct ti_sci_resource *res)
 {
-	struct msi_desc *msi_desc;
+	struct msi_desc msi_desc;
 	int set, i, count = 0;
 
+	memset(&msi_desc, 0, sizeof(msi_desc));
+
 	for (set = 0; set < res->sets; set++) {
-		for (i = 0; i < res->desc[set].num; i++) {
-			msi_desc = alloc_msi_entry(dev, 1, NULL);
-			if (!msi_desc) {
-				ti_sci_inta_msi_free_descs(dev);
-				return -ENOMEM;
-			}
-
-			msi_desc->msi_index = res->desc[set].start + i;
-			INIT_LIST_HEAD(&msi_desc->list);
-			list_add_tail(&msi_desc->list, dev_to_msi_list(dev));
-			count++;
+		for (i = 0; i < res->desc[set].num; i++, count++) {
+			msi_desc.msi_index = res->desc[set].start + i;
+			if (msi_add_msi_desc(dev, &msi_desc))
+				goto fail;
 		}
-		for (i = 0; i < res->desc[set].num_sec; i++) {
-			msi_desc = alloc_msi_entry(dev, 1, NULL);
-			if (!msi_desc) {
-				ti_sci_inta_msi_free_descs(dev);
-				return -ENOMEM;
-			}
-
-			msi_desc->msi_index = res->desc[set].start_sec + i;
-			INIT_LIST_HEAD(&msi_desc->list);
-			list_add_tail(&msi_desc->list, dev_to_msi_list(dev));
-			count++;
+
+		for (i = 0; i < res->desc[set].num_sec; i++, count++) {
+			msi_desc.msi_index = res->desc[set].start_sec + i;
+			if (msi_add_msi_desc(dev, &msi_desc))
+				goto fail;
 		}
 	}
-
 	return count;
+fail:
+	msi_free_msi_descs(dev);
+	return -ENOMEM;
 }
 
 int ti_sci_inta_msi_domain_alloc_irqs(struct device *dev,
@@ -124,20 +106,18 @@ int ti_sci_inta_msi_domain_alloc_irqs(st
 	if (ret)
 		return ret;
 
+	msi_lock_descs(dev);
 	nvec = ti_sci_inta_msi_alloc_descs(dev, res);
-	if (nvec <= 0)
-		return nvec;
-
-	ret = msi_domain_alloc_irqs(msi_domain, dev, nvec);
-	if (ret) {
-		dev_err(dev, "Failed to allocate IRQs %d\n", ret);
-		goto cleanup;
+	if (nvec <= 0) {
+		ret = nvec;
+		goto unlock;
 	}
 
-	return 0;
-
-cleanup:
-	ti_sci_inta_msi_free_descs(&pdev->dev);
+	ret = msi_domain_alloc_irqs_descs_locked(msi_domain, dev, nvec);
+	if (ret)
+		dev_err(dev, "Failed to allocate IRQs %d\n", ret);
+unlock:
+	msi_unlock_descs(dev);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(ti_sci_inta_msi_domain_alloc_irqs);
@@ -145,6 +125,5 @@ EXPORT_SYMBOL_GPL(ti_sci_inta_msi_domain
 void ti_sci_inta_msi_domain_free_irqs(struct device *dev)
 {
 	msi_domain_free_irqs(dev->msi.domain, dev);
-	ti_sci_inta_msi_free_descs(dev);
 }
 EXPORT_SYMBOL_GPL(ti_sci_inta_msi_domain_free_irqs);


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 22/31] soc: ti: ti_sci_inta_msi: Remove ti_sci_inta_msi_domain_free_irqs()
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (20 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 21/31] soc: ti: ti_sci_inta_msi: Rework MSI descriptor allocation Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 23/31] bus: fsl-mc-msi: Simplify MSI descriptor handling Thomas Gleixner
                   ` (9 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

The function has no users and is pointless now that the core frees the MSI
descriptors, which means potential users can just use msi_domain_free_irqs().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/soc/ti/ti_sci_inta_msi.c       |    6 ------
 include/linux/soc/ti/ti_sci_inta_msi.h |    1 -
 2 files changed, 7 deletions(-)

--- a/drivers/soc/ti/ti_sci_inta_msi.c
+++ b/drivers/soc/ti/ti_sci_inta_msi.c
@@ -121,9 +121,3 @@ int ti_sci_inta_msi_domain_alloc_irqs(st
 	return ret;
 }
 EXPORT_SYMBOL_GPL(ti_sci_inta_msi_domain_alloc_irqs);
-
-void ti_sci_inta_msi_domain_free_irqs(struct device *dev)
-{
-	msi_domain_free_irqs(dev->msi.domain, dev);
-}
-EXPORT_SYMBOL_GPL(ti_sci_inta_msi_domain_free_irqs);
--- a/include/linux/soc/ti/ti_sci_inta_msi.h
+++ b/include/linux/soc/ti/ti_sci_inta_msi.h
@@ -18,5 +18,4 @@ struct irq_domain
 				   struct irq_domain *parent);
 int ti_sci_inta_msi_domain_alloc_irqs(struct device *dev,
 				      struct ti_sci_resource *res);
-void ti_sci_inta_msi_domain_free_irqs(struct device *dev);
 #endif /* __INCLUDE_LINUX_IRQCHIP_TI_SCI_INTA_H */


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 23/31] bus: fsl-mc-msi: Simplify MSI descriptor handling
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (21 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 22/31] soc: ti: ti_sci_inta_msi: Remove ti_sci_inta_msi_domain_free_irqs() Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 24/31] platform-msi: Let core code handle MSI descriptors Thomas Gleixner
                   ` (8 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Let the MSI irq domain code handle descriptor allocation and free.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/bus/fsl-mc/fsl-mc-msi.c |   61 ++--------------------------------------
 1 file changed, 4 insertions(+), 57 deletions(-)

--- a/drivers/bus/fsl-mc/fsl-mc-msi.c
+++ b/drivers/bus/fsl-mc/fsl-mc-msi.c
@@ -170,6 +170,7 @@ struct irq_domain *fsl_mc_msi_create_irq
 		fsl_mc_msi_update_dom_ops(info);
 	if (info->flags & MSI_FLAG_USE_DEF_CHIP_OPS)
 		fsl_mc_msi_update_chip_ops(info);
+	info->flags |= MSI_FLAG_ALLOC_SIMPLE_MSI_DESCS | MSI_FLAG_FREE_MSI_DESCS;
 
 	domain = msi_create_irq_domain(fwnode, info, parent);
 	if (domain)
@@ -210,45 +211,7 @@ struct irq_domain *fsl_mc_find_msi_domai
 	return msi_domain;
 }
 
-static void fsl_mc_msi_free_descs(struct device *dev)
-{
-	struct msi_desc *desc, *tmp;
-
-	list_for_each_entry_safe(desc, tmp, dev_to_msi_list(dev), list) {
-		list_del(&desc->list);
-		free_msi_entry(desc);
-	}
-}
-
-static int fsl_mc_msi_alloc_descs(struct device *dev, unsigned int irq_count)
-
-{
-	unsigned int i;
-	int error;
-	struct msi_desc *msi_desc;
-
-	for (i = 0; i < irq_count; i++) {
-		msi_desc = alloc_msi_entry(dev, 1, NULL);
-		if (!msi_desc) {
-			dev_err(dev, "Failed to allocate msi entry\n");
-			error = -ENOMEM;
-			goto cleanup_msi_descs;
-		}
-
-		msi_desc->msi_index = i;
-		INIT_LIST_HEAD(&msi_desc->list);
-		list_add_tail(&msi_desc->list, dev_to_msi_list(dev));
-	}
-
-	return 0;
-
-cleanup_msi_descs:
-	fsl_mc_msi_free_descs(dev);
-	return error;
-}
-
-int fsl_mc_msi_domain_alloc_irqs(struct device *dev,
-				 unsigned int irq_count)
+int fsl_mc_msi_domain_alloc_irqs(struct device *dev,  unsigned int irq_count)
 {
 	struct irq_domain *msi_domain;
 	int error;
@@ -261,28 +224,17 @@ int fsl_mc_msi_domain_alloc_irqs(struct
 	if (error)
 		return error;
 
-	if (!list_empty(dev_to_msi_list(dev)))
+	if (msi_first_desc(dev, MSI_DESC_ALL))
 		return -EINVAL;
 
-	error = fsl_mc_msi_alloc_descs(dev, irq_count);
-	if (error < 0)
-		return error;
-
 	/*
 	 * NOTE: Calling this function will trigger the invocation of the
 	 * its_fsl_mc_msi_prepare() callback
 	 */
 	error = msi_domain_alloc_irqs(msi_domain, dev, irq_count);
 
-	if (error) {
+	if (error)
 		dev_err(dev, "Failed to allocate IRQs\n");
-		goto cleanup_msi_descs;
-	}
-
-	return 0;
-
-cleanup_msi_descs:
-	fsl_mc_msi_free_descs(dev);
 	return error;
 }
 
@@ -295,9 +247,4 @@ void fsl_mc_msi_domain_free_irqs(struct
 		return;
 
 	msi_domain_free_irqs(msi_domain, dev);
-
-	if (list_empty(dev_to_msi_list(dev)))
-		return;
-
-	fsl_mc_msi_free_descs(dev);
 }


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 24/31] platform-msi: Let core code handle MSI descriptors
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (22 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 23/31] bus: fsl-mc-msi: Simplify MSI descriptor handling Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 25/31] platform-msi: Simplify platform device MSI code Thomas Gleixner
                   ` (7 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Use the core functionality for platform MSI interrupt domains. The platform
device MSI interrupt domains will be converted in a later step.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/base/platform-msi.c |  112 ++++++++++++++++++--------------------------
 1 file changed, 48 insertions(+), 64 deletions(-)

--- a/drivers/base/platform-msi.c
+++ b/drivers/base/platform-msi.c
@@ -107,57 +107,6 @@ static void platform_msi_update_chip_ops
 		info->flags &= ~MSI_FLAG_LEVEL_CAPABLE;
 }
 
-static void platform_msi_free_descs(struct device *dev, int base, int nvec)
-{
-	struct msi_desc *desc, *tmp;
-
-	list_for_each_entry_safe(desc, tmp, dev_to_msi_list(dev), list) {
-		if (desc->msi_index >= base &&
-		    desc->msi_index < (base + nvec)) {
-			list_del(&desc->list);
-			free_msi_entry(desc);
-		}
-	}
-}
-
-static int platform_msi_alloc_descs_with_irq(struct device *dev, int virq,
-					     int nvec)
-{
-	struct msi_desc *desc;
-	int i, base = 0;
-
-	if (!list_empty(dev_to_msi_list(dev))) {
-		desc = list_last_entry(dev_to_msi_list(dev),
-				       struct msi_desc, list);
-		base = desc->msi_index + 1;
-	}
-
-	for (i = 0; i < nvec; i++) {
-		desc = alloc_msi_entry(dev, 1, NULL);
-		if (!desc)
-			break;
-
-		desc->msi_index = base + i;
-		desc->irq = virq ? virq + i : 0;
-
-		list_add_tail(&desc->list, dev_to_msi_list(dev));
-	}
-
-	if (i != nvec) {
-		/* Clean up the mess */
-		platform_msi_free_descs(dev, base, nvec);
-
-		return -ENOMEM;
-	}
-
-	return 0;
-}
-
-static int platform_msi_alloc_descs(struct device *dev, int nvec)
-{
-	return platform_msi_alloc_descs_with_irq(dev, 0, nvec);
-}
-
 /**
  * platform_msi_create_irq_domain - Create a platform MSI interrupt domain
  * @fwnode:		Optional fwnode of the interrupt controller
@@ -180,7 +129,8 @@ struct irq_domain *platform_msi_create_i
 		platform_msi_update_dom_ops(info);
 	if (info->flags & MSI_FLAG_USE_DEF_CHIP_OPS)
 		platform_msi_update_chip_ops(info);
-	info->flags |= MSI_FLAG_DEV_SYSFS;
+	info->flags |= MSI_FLAG_DEV_SYSFS | MSI_FLAG_ALLOC_SIMPLE_MSI_DESCS |
+		       MSI_FLAG_FREE_MSI_DESCS;
 
 	domain = msi_create_irq_domain(fwnode, info, parent);
 	if (domain)
@@ -262,20 +212,10 @@ int platform_msi_domain_alloc_irqs(struc
 	if (err)
 		return err;
 
-	err = platform_msi_alloc_descs(dev, nvec);
-	if (err)
-		goto out_free_priv_data;
-
 	err = msi_domain_alloc_irqs(dev->msi.domain, dev, nvec);
 	if (err)
-		goto out_free_desc;
-
-	return 0;
+		platform_msi_free_priv_data(dev);
 
-out_free_desc:
-	platform_msi_free_descs(dev, 0, nvec);
-out_free_priv_data:
-	platform_msi_free_priv_data(dev);
 	return err;
 }
 EXPORT_SYMBOL_GPL(platform_msi_domain_alloc_irqs);
@@ -287,7 +227,6 @@ EXPORT_SYMBOL_GPL(platform_msi_domain_al
 void platform_msi_domain_free_irqs(struct device *dev)
 {
 	msi_domain_free_irqs(dev->msi.domain, dev);
-	platform_msi_free_descs(dev, 0, MAX_DEV_MSIS);
 	platform_msi_free_priv_data(dev);
 }
 EXPORT_SYMBOL_GPL(platform_msi_domain_free_irqs);
@@ -361,6 +300,51 @@ struct irq_domain *
 	return NULL;
 }
 
+static void platform_msi_free_descs(struct device *dev, int base, int nvec)
+{
+	struct msi_desc *desc, *tmp;
+
+	list_for_each_entry_safe(desc, tmp, dev_to_msi_list(dev), list) {
+		if (desc->msi_index >= base &&
+		    desc->msi_index < (base + nvec)) {
+			list_del(&desc->list);
+			free_msi_entry(desc);
+		}
+	}
+}
+
+static int platform_msi_alloc_descs_with_irq(struct device *dev, int virq,
+					     int nvec)
+{
+	struct msi_desc *desc;
+	int i, base = 0;
+
+	if (!list_empty(dev_to_msi_list(dev))) {
+		desc = list_last_entry(dev_to_msi_list(dev),
+				       struct msi_desc, list);
+		base = desc->msi_index + 1;
+	}
+
+	for (i = 0; i < nvec; i++) {
+		desc = alloc_msi_entry(dev, 1, NULL);
+		if (!desc)
+			break;
+
+		desc->msi_index = base + i;
+		desc->irq = virq + i;
+
+		list_add_tail(&desc->list, dev_to_msi_list(dev));
+	}
+
+	if (i != nvec) {
+		/* Clean up the mess */
+		platform_msi_free_descs(dev, base, nvec);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
 /**
  * platform_msi_device_domain_free - Free interrupts associated with a platform-msi
  *				     device domain


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 25/31] platform-msi: Simplify platform device MSI code
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (23 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 24/31] platform-msi: Let core code handle MSI descriptors Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 26/31] genirq/msi: Make interrupt allocation less convoluted Thomas Gleixner
                   ` (6 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

The allocation code is overly complex. It tries to have the MSI index space
packed, which is not working when an interrupt is freed. There is no
requirement for this. The only requirement is that the MSI index is unique.

Move the MSI descriptor allocation into msi_domain_populate_irqs() and use
the Linux interrupt number as MSI index which fulfils the unique
requirement.

This requires to lock the MSI descriptors which makes the lock order
reverse to the regular MSI alloc/free functions vs. the domain
mutex. Assign a seperate lockdep class for these MSI device domains.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/base/platform-msi.c |   88 +++++++++-----------------------------------
 kernel/irq/msi.c            |   45 ++++++++++------------
 2 files changed, 39 insertions(+), 94 deletions(-)

--- a/drivers/base/platform-msi.c
+++ b/drivers/base/platform-msi.c
@@ -246,6 +246,8 @@ void *platform_msi_get_host_data(struct
 	return data->host_data;
 }
 
+static struct lock_class_key platform_device_msi_lock_class;
+
 /**
  * __platform_msi_create_device_domain - Create a platform-msi device domain
  *
@@ -278,6 +280,13 @@ struct irq_domain *
 	if (err)
 		return NULL;
 
+	/*
+	 * Use a separate lock class for the MSI descriptor mutex on
+	 * platform MSI device domains because the descriptor mutex nests
+	 * into the domain mutex. See alloc/free below.
+	 */
+	lockdep_set_class(&dev->msi.data->mutex, &platform_device_msi_lock_class);
+
 	data = dev->msi.data->platform_data;
 	data->host_data = host_data;
 	domain = irq_domain_create_hierarchy(dev->msi.domain, 0,
@@ -300,75 +309,23 @@ struct irq_domain *
 	return NULL;
 }
 
-static void platform_msi_free_descs(struct device *dev, int base, int nvec)
-{
-	struct msi_desc *desc, *tmp;
-
-	list_for_each_entry_safe(desc, tmp, dev_to_msi_list(dev), list) {
-		if (desc->msi_index >= base &&
-		    desc->msi_index < (base + nvec)) {
-			list_del(&desc->list);
-			free_msi_entry(desc);
-		}
-	}
-}
-
-static int platform_msi_alloc_descs_with_irq(struct device *dev, int virq,
-					     int nvec)
-{
-	struct msi_desc *desc;
-	int i, base = 0;
-
-	if (!list_empty(dev_to_msi_list(dev))) {
-		desc = list_last_entry(dev_to_msi_list(dev),
-				       struct msi_desc, list);
-		base = desc->msi_index + 1;
-	}
-
-	for (i = 0; i < nvec; i++) {
-		desc = alloc_msi_entry(dev, 1, NULL);
-		if (!desc)
-			break;
-
-		desc->msi_index = base + i;
-		desc->irq = virq + i;
-
-		list_add_tail(&desc->list, dev_to_msi_list(dev));
-	}
-
-	if (i != nvec) {
-		/* Clean up the mess */
-		platform_msi_free_descs(dev, base, nvec);
-		return -ENOMEM;
-	}
-
-	return 0;
-}
-
 /**
  * platform_msi_device_domain_free - Free interrupts associated with a platform-msi
  *				     device domain
  *
  * @domain:	The platform-msi device domain
  * @virq:	The base irq from which to perform the free operation
- * @nvec:	How many interrupts to free from @virq
+ * @nr_irqs:	How many interrupts to free from @virq
  */
 void platform_msi_device_domain_free(struct irq_domain *domain, unsigned int virq,
-				     unsigned int nvec)
+				     unsigned int nr_irqs)
 {
 	struct platform_msi_priv_data *data = domain->host_data;
-	struct msi_desc *desc, *tmp;
 
-	for_each_msi_entry_safe(desc, tmp, data->dev) {
-		if (WARN_ON(!desc->irq || desc->nvec_used != 1))
-			return;
-		if (!(desc->irq >= virq && desc->irq < (virq + nvec)))
-			continue;
-
-		irq_domain_free_irqs_common(domain, desc->irq, 1);
-		list_del(&desc->list);
-		free_msi_entry(desc);
-	}
+	msi_lock_descs(data->dev);
+	irq_domain_free_irqs_common(domain, virq, nr_irqs);
+	msi_free_msi_descs_range(data->dev, MSI_DESC_ALL, virq, virq + nr_irqs - 1);
+	msi_unlock_descs(data->dev);
 }
 
 /**
@@ -377,7 +334,7 @@ void platform_msi_device_domain_free(str
  *
  * @domain:	The platform-msi device domain
  * @virq:	The base irq from which to perform the allocate operation
- * @nr_irqs:	How many interrupts to free from @virq
+ * @nr_irqs:	How many interrupts to allocate from @virq
  *
  * Return 0 on success, or an error code on failure. Must be called
  * with irq_domain_mutex held (which can only be done as part of a
@@ -387,16 +344,7 @@ int platform_msi_device_domain_alloc(str
 				     unsigned int nr_irqs)
 {
 	struct platform_msi_priv_data *data = domain->host_data;
-	int err;
-
-	err = platform_msi_alloc_descs_with_irq(data->dev, virq, nr_irqs);
-	if (err)
-		return err;
-
-	err = msi_domain_populate_irqs(domain->parent, data->dev,
-				       virq, nr_irqs, &data->arg);
-	if (err)
-		platform_msi_device_domain_free(domain, virq, nr_irqs);
+	struct device *dev = data->dev;
 
-	return err;
+	return msi_domain_populate_irqs(domain->parent, dev, virq, nr_irqs, &data->arg);
 }
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -748,43 +748,40 @@ int msi_domain_prepare_irqs(struct irq_d
 }
 
 int msi_domain_populate_irqs(struct irq_domain *domain, struct device *dev,
-			     int virq, int nvec, msi_alloc_info_t *arg)
+			     int virq_base, int nvec, msi_alloc_info_t *arg)
 {
 	struct msi_domain_info *info = domain->host_data;
 	struct msi_domain_ops *ops = info->ops;
 	struct msi_desc *desc;
-	int ret = 0;
+	int ret, virq;
 
-	for_each_msi_entry(desc, dev) {
-		/* Don't even try the multi-MSI brain damage. */
-		if (WARN_ON(!desc->irq || desc->nvec_used != 1)) {
-			ret = -EINVAL;
-			break;
+	msi_lock_descs(dev);
+	for (virq = virq_base; virq < virq_base + nvec; virq++) {
+		desc = alloc_msi_entry(dev, 1, NULL);
+		if (!desc) {
+			ret = -ENOMEM;
+			goto fail;
 		}
 
-		if (!(desc->irq >= virq && desc->irq < (virq + nvec)))
-			continue;
+		desc->msi_index = virq;
+		desc->irq = virq;
+		list_add_tail(&desc->list, &dev->msi.data->list);
 
 		ops->set_desc(arg, desc);
-		/* Assumes the domain mutex is held! */
-		ret = irq_domain_alloc_irqs_hierarchy(domain, desc->irq, 1,
-						      arg);
+		ret = irq_domain_alloc_irqs_hierarchy(domain, virq, 1, arg);
 		if (ret)
-			break;
+			goto fail;
 
-		irq_set_msi_desc_off(desc->irq, 0, desc);
-	}
-
-	if (ret) {
-		/* Mop up the damage */
-		for_each_msi_entry(desc, dev) {
-			if (!(desc->irq >= virq && desc->irq < (virq + nvec)))
-				continue;
-
-			irq_domain_free_irqs_common(domain, desc->irq, 1);
-		}
+		irq_set_msi_desc(virq, desc);
 	}
+	msi_unlock_descs(dev);
+	return 0;
 
+fail:
+	for (--virq; virq >= virq_base; virq--)
+		irq_domain_free_irqs_common(domain, virq, 1);
+	msi_free_msi_descs_range(dev, MSI_DESC_ALL, virq_base, virq_base + nvec - 1);
+	msi_unlock_descs(dev);
 	return ret;
 }
 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 26/31] genirq/msi: Make interrupt allocation less convoluted
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (24 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 25/31] platform-msi: Simplify platform device MSI code Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 27/31] genirq/msi: Convert to new functions Thomas Gleixner
                   ` (5 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

There is no real reason to do several loops over the MSI descriptors
instead of just doing one loop. In case of an error everything is undone
anyway so it does not matter whether it's a partial or a full rollback.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 .clang-format       |    1 
 include/linux/msi.h |    6 --
 kernel/irq/msi.c    |  129 +++++++++++++++++++++++++++-------------------------
 3 files changed, 69 insertions(+), 67 deletions(-)

--- a/.clang-format
+++ b/.clang-format
@@ -216,7 +216,6 @@ ExperimentalAutoDetectBinPacking: false
   - 'for_each_migratetype_order'
   - 'for_each_msi_entry'
   - 'for_each_msi_entry_safe'
-  - 'for_each_msi_vector'
   - 'for_each_net'
   - 'for_each_net_continue_reverse'
   - 'for_each_netdev'
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -223,12 +223,6 @@ struct msi_desc *msi_next_desc(struct de
 	list_for_each_entry((desc), dev_to_msi_list((dev)), list)
 #define for_each_msi_entry_safe(desc, tmp, dev)	\
 	list_for_each_entry_safe((desc), (tmp), dev_to_msi_list((dev)), list)
-#define for_each_msi_vector(desc, __irq, dev)				\
-	for_each_msi_entry((desc), (dev))				\
-		if ((desc)->irq)					\
-			for (__irq = (desc)->irq;			\
-			     __irq < ((desc)->irq + (desc)->nvec_used);	\
-			     __irq++)
 
 #ifdef CONFIG_IRQ_MSI_IOMMU
 static inline const void *msi_desc_get_iommu_cookie(struct msi_desc *desc)
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -856,23 +856,74 @@ static int msi_handle_pci_fail(struct ir
 	return allocated ? allocated : -ENOSPC;
 }
 
+#define VIRQ_CAN_RESERVE	0x01
+#define VIRQ_ACTIVATE		0x02
+#define VIRQ_NOMASK_QUIRK	0x04
+
+static int msi_init_virq(struct irq_domain *domain, int virq, unsigned int vflags)
+{
+	struct irq_data *irqd = irq_domain_get_irq_data(domain, virq);
+	int ret;
+
+	if (!(vflags & VIRQ_CAN_RESERVE)) {
+		irqd_clr_can_reserve(irqd);
+		if (vflags & VIRQ_NOMASK_QUIRK)
+			irqd_set_msi_nomask_quirk(irqd);
+	}
+
+	if (!(vflags & VIRQ_ACTIVATE))
+		return 0;
+
+	ret = irq_domain_activate_irq(irqd, vflags & VIRQ_CAN_RESERVE);
+	if (ret)
+		return ret;
+	/*
+	 * If the interrupt uses reservation mode, clear the activated bit
+	 * so request_irq() will assign the final vector.
+	 */
+	if (vflags & VIRQ_CAN_RESERVE)
+		irqd_clr_activated(irqd);
+	return 0;
+}
+
 int __msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
 			    int nvec)
 {
 	struct msi_domain_info *info = domain->host_data;
 	struct msi_domain_ops *ops = info->ops;
-	struct irq_data *irq_data;
-	struct msi_desc *desc;
 	msi_alloc_info_t arg = { };
+	unsigned int vflags = 0;
+	struct msi_desc *desc;
 	int allocated = 0;
 	int i, ret, virq;
-	bool can_reserve;
 
 	ret = msi_domain_prepare_irqs(domain, dev, nvec, &arg);
 	if (ret)
 		return ret;
 
-	for_each_msi_entry(desc, dev) {
+	/*
+	 * This flag is set by the PCI layer as we need to activate
+	 * the MSI entries before the PCI layer enables MSI in the
+	 * card. Otherwise the card latches a random msi message.
+	 */
+	if (info->flags & MSI_FLAG_ACTIVATE_EARLY)
+		vflags |= VIRQ_ACTIVATE;
+
+	/*
+	 * Interrupt can use a reserved vector and will not occupy
+	 * a real device vector until the interrupt is requested.
+	 */
+	if (msi_check_reservation_mode(domain, info, dev)) {
+		vflags |= VIRQ_CAN_RESERVE;
+		/*
+		 * MSI affinity setting requires a special quirk (X86) when
+		 * reservation mode is active.
+		 */
+		if (domain->flags & IRQ_DOMAIN_MSI_NOMASK_QUIRK)
+			vflags |= VIRQ_NOMASK_QUIRK;
+	}
+
+	msi_for_each_desc(desc, dev, MSI_DESC_NOTASSOCIATED) {
 		ops->set_desc(&arg, desc);
 
 		virq = __irq_domain_alloc_irqs(domain, -1, desc->nvec_used,
@@ -884,49 +935,12 @@ int __msi_domain_alloc_irqs(struct irq_d
 		for (i = 0; i < desc->nvec_used; i++) {
 			irq_set_msi_desc_off(virq, i, desc);
 			irq_debugfs_copy_devname(virq + i, dev);
+			ret = msi_init_virq(domain, virq + i, vflags);
+			if (ret)
+				return ret;
 		}
 		allocated++;
 	}
-
-	can_reserve = msi_check_reservation_mode(domain, info, dev);
-
-	/*
-	 * This flag is set by the PCI layer as we need to activate
-	 * the MSI entries before the PCI layer enables MSI in the
-	 * card. Otherwise the card latches a random msi message.
-	 */
-	if (!(info->flags & MSI_FLAG_ACTIVATE_EARLY))
-		goto skip_activate;
-
-	for_each_msi_vector(desc, i, dev) {
-		if (desc->irq == i) {
-			virq = desc->irq;
-			dev_dbg(dev, "irq [%d-%d] for MSI\n",
-				virq, virq + desc->nvec_used - 1);
-		}
-
-		irq_data = irq_domain_get_irq_data(domain, i);
-		if (!can_reserve) {
-			irqd_clr_can_reserve(irq_data);
-			if (domain->flags & IRQ_DOMAIN_MSI_NOMASK_QUIRK)
-				irqd_set_msi_nomask_quirk(irq_data);
-		}
-		ret = irq_domain_activate_irq(irq_data, can_reserve);
-		if (ret)
-			return ret;
-	}
-
-skip_activate:
-	/*
-	 * If these interrupts use reservation mode, clear the activated bit
-	 * so request_irq() will assign the final vector.
-	 */
-	if (can_reserve) {
-		for_each_msi_vector(desc, i, dev) {
-			irq_data = irq_domain_get_irq_data(domain, i);
-			irqd_clr_activated(irq_data);
-		}
-	}
 	return 0;
 }
 
@@ -1004,26 +1018,21 @@ int msi_domain_alloc_irqs(struct irq_dom
 
 void __msi_domain_free_irqs(struct irq_domain *domain, struct device *dev)
 {
-	struct irq_data *irq_data;
+	struct irq_data *irqd;
 	struct msi_desc *desc;
 	int i;
 
-	for_each_msi_vector(desc, i, dev) {
-		irq_data = irq_domain_get_irq_data(domain, i);
-		if (irqd_is_activated(irq_data))
-			irq_domain_deactivate_irq(irq_data);
-	}
-
-	for_each_msi_entry(desc, dev) {
-		/*
-		 * We might have failed to allocate an MSI early
-		 * enough that there is no IRQ associated to this
-		 * entry. If that's the case, don't do anything.
-		 */
-		if (desc->irq) {
-			irq_domain_free_irqs(desc->irq, desc->nvec_used);
-			desc->irq = 0;
+	/* Only handle MSI entries which have an interrupt associated */
+	msi_for_each_desc(desc, dev, MSI_DESC_ASSOCIATED) {
+		/* Make sure all interrupts are deactivated */
+		for (i = 0; i < desc->nvec_used; i++) {
+			irqd = irq_domain_get_irq_data(domain, desc->irq + i);
+			if (irqd && irqd_is_activated(irqd))
+				irq_domain_deactivate_irq(irqd);
 		}
+
+		irq_domain_free_irqs(desc->irq, desc->nvec_used);
+		desc->irq = 0;
 	}
 }
 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 27/31] genirq/msi: Convert to new functions
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (25 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 26/31] genirq/msi: Make interrupt allocation less convoluted Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 28/31] genirq/msi: Mop up old interfaces Thomas Gleixner
                   ` (4 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Use the new iterator functions and add locking where required.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/irq/msi.c |   23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -348,6 +348,7 @@ EXPORT_SYMBOL_GPL(msi_next_desc);
 unsigned int msi_get_virq(struct device *dev, unsigned int index)
 {
 	struct msi_desc *desc;
+	unsigned int ret = 0;
 	bool pcimsi;
 
 	if (!dev->msi.data)
@@ -355,11 +356,12 @@ unsigned int msi_get_virq(struct device
 
 	pcimsi = msi_device_has_property(dev, MSI_PROP_PCI_MSI);
 
-	for_each_msi_entry(desc, dev) {
+	msi_lock_descs(dev);
+	msi_for_each_desc(desc, dev, MSI_DESC_ASSOCIATED) {
 		/* PCI-MSI has only one descriptor for multiple interrupts. */
 		if (pcimsi) {
-			if (desc->irq && index < desc->nvec_used)
-				return desc->irq + index;
+			if (index < desc->nvec_used)
+				ret = desc->irq + index;
 			break;
 		}
 
@@ -367,10 +369,13 @@ unsigned int msi_get_virq(struct device
 		 * PCI-MSIX and platform MSI use a descriptor per
 		 * interrupt.
 		 */
-		if (desc->msi_index == index)
-			return desc->irq;
+		if (desc->msi_index == index) {
+			ret = desc->irq;
+			break;
+		}
 	}
-	return 0;
+	msi_unlock_descs(dev);
+	return ret;
 }
 EXPORT_SYMBOL_GPL(msi_get_virq);
 
@@ -401,7 +406,7 @@ static const struct attribute_group **ms
 	int i;
 
 	/* Determine how many msi entries we have */
-	for_each_msi_entry(entry, dev)
+	msi_for_each_desc(entry, dev, MSI_DESC_ALL)
 		num_msi += entry->nvec_used;
 	if (!num_msi)
 		return NULL;
@@ -411,7 +416,7 @@ static const struct attribute_group **ms
 	if (!msi_attrs)
 		return ERR_PTR(-ENOMEM);
 
-	for_each_msi_entry(entry, dev) {
+	msi_for_each_desc(entry, dev, MSI_DESC_ALL) {
 		for (i = 0; i < entry->nvec_used; i++) {
 			msi_dev_attr = kzalloc(sizeof(*msi_dev_attr), GFP_KERNEL);
 			if (!msi_dev_attr)
@@ -831,7 +836,7 @@ static bool msi_check_reservation_mode(s
 	 * Checking the first MSI descriptor is sufficient. MSIX supports
 	 * masking and MSI does so when the can_mask attribute is set.
 	 */
-	desc = first_msi_entry(dev);
+	desc = msi_first_desc(dev, MSI_DESC_ALL);
 	return desc->pci.msi_attrib.is_msix || desc->pci.msi_attrib.can_mask;
 }
 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 28/31] genirq/msi: Mop up old interfaces
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (26 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 27/31] genirq/msi: Convert to new functions Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 29/31] genirq/msi: Add abuse prevention comment to msi header Thomas Gleixner
                   ` (3 subsequent siblings)
  31 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Get rid of the old iterators, alloc/free functions and adjust the core code
accordingly.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/msi.h |   15 ---------------
 kernel/irq/msi.c    |   31 +++++++++++++++----------------
 2 files changed, 15 insertions(+), 31 deletions(-)

--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -214,15 +214,7 @@ struct msi_desc *msi_next_desc(struct de
 	for ((desc) = msi_first_desc((dev), (filter)); (desc);	\
 	     (desc) = msi_next_desc((dev), (filter)))
 
-/* Helpers to hide struct msi_desc implementation details */
 #define msi_desc_to_dev(desc)		((desc)->dev)
-#define dev_to_msi_list(dev)		(&(dev)->msi.data->list)
-#define first_msi_entry(dev)		\
-	list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
-#define for_each_msi_entry(desc, dev)	\
-	list_for_each_entry((desc), dev_to_msi_list((dev)), list)
-#define for_each_msi_entry_safe(desc, tmp, dev)	\
-	list_for_each_entry_safe((desc), (tmp), dev_to_msi_list((dev)), list)
 
 #ifdef CONFIG_IRQ_MSI_IOMMU
 static inline const void *msi_desc_get_iommu_cookie(struct msi_desc *desc)
@@ -248,10 +240,6 @@ static inline void msi_desc_set_iommu_co
 #endif
 
 #ifdef CONFIG_PCI_MSI
-#define first_pci_msi_entry(pdev)	first_msi_entry(&(pdev)->dev)
-#define for_each_pci_msi_entry(desc, pdev)	\
-	for_each_msi_entry((desc), &(pdev)->dev)
-
 struct pci_dev *msi_desc_to_pci_dev(struct msi_desc *desc);
 void pci_write_msi_msg(unsigned int irq, struct msi_msg *msg);
 #else /* CONFIG_PCI_MSI */
@@ -273,9 +261,6 @@ static inline void msi_free_msi_descs(st
 	msi_free_msi_descs_range(dev, MSI_DESC_ALL, 0, MSI_MAX_INDEX);
 }
 
-struct msi_desc *alloc_msi_entry(struct device *dev, int nvec,
-				 const struct irq_affinity_desc *affinity);
-void free_msi_entry(struct msi_desc *entry);
 void __pci_read_msi_msg(struct msi_desc *entry, struct msi_msg *msg);
 void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg);
 
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -19,8 +19,10 @@
 
 #include "internals.h"
 
+#define dev_to_msi_list(dev)	(&(dev)->msi.data->list)
+
 /**
- * alloc_msi_entry - Allocate an initialized msi_desc
+ * msi_alloc_desc - Allocate an initialized msi_desc
  * @dev:	Pointer to the device for which this is allocated
  * @nvec:	The number of vectors used in this entry
  * @affinity:	Optional pointer to an affinity mask array size of @nvec
@@ -30,12 +32,11 @@
  *
  * Return: pointer to allocated &msi_desc on success or %NULL on failure
  */
-struct msi_desc *alloc_msi_entry(struct device *dev, int nvec,
-				 const struct irq_affinity_desc *affinity)
+static struct msi_desc *msi_alloc_desc(struct device *dev, int nvec,
+					const struct irq_affinity_desc *affinity)
 {
-	struct msi_desc *desc;
+	struct msi_desc *desc = kzalloc(sizeof(*desc), GFP_KERNEL);
 
-	desc = kzalloc(sizeof(*desc), GFP_KERNEL);
 	if (!desc)
 		return NULL;
 
@@ -43,21 +44,19 @@ struct msi_desc *alloc_msi_entry(struct
 	desc->dev = dev;
 	desc->nvec_used = nvec;
 	if (affinity) {
-		desc->affinity = kmemdup(affinity,
-			nvec * sizeof(*desc->affinity), GFP_KERNEL);
+		desc->affinity = kmemdup(affinity, nvec * sizeof(*desc->affinity), GFP_KERNEL);
 		if (!desc->affinity) {
 			kfree(desc);
 			return NULL;
 		}
 	}
-
 	return desc;
 }
 
-void free_msi_entry(struct msi_desc *entry)
+static void msi_free_desc(struct msi_desc *desc)
 {
-	kfree(entry->affinity);
-	kfree(entry);
+	kfree(desc->affinity);
+	kfree(desc);
 }
 
 /**
@@ -73,7 +72,7 @@ int msi_add_msi_desc(struct device *dev,
 
 	lockdep_assert_held(&dev->msi.data->mutex);
 
-	desc = alloc_msi_entry(dev, init_desc->nvec_used, init_desc->affinity);
+	desc = msi_alloc_desc(dev, init_desc->nvec_used, init_desc->affinity);
 	if (!desc)
 		return -ENOMEM;
 
@@ -102,7 +101,7 @@ static int msi_add_simple_msi_descs(stru
 	lockdep_assert_held(&dev->msi.data->mutex);
 
 	for (i = 0; i < ndesc; i++) {
-		desc = alloc_msi_entry(dev, 1, NULL);
+		desc = msi_alloc_desc(dev, 1, NULL);
 		if (!desc)
 			goto fail;
 		desc->msi_index = index + i;
@@ -114,7 +113,7 @@ static int msi_add_simple_msi_descs(stru
 fail:
 	list_for_each_entry_safe(desc, tmp, &list, list) {
 		list_del(&desc->list);
-		free_msi_entry(desc);
+		msi_free_desc(desc);
 	}
 	return -ENOMEM;
 }
@@ -141,7 +140,7 @@ void msi_free_msi_descs_range(struct dev
 		if (desc->msi_index < first_index || desc->msi_index > last_index)
 			continue;
 		list_del(&desc->list);
-		free_msi_entry(desc);
+		msi_free_desc(desc);
 	}
 }
 
@@ -762,7 +761,7 @@ int msi_domain_populate_irqs(struct irq_
 
 	msi_lock_descs(dev);
 	for (virq = virq_base; virq < virq_base + nvec; virq++) {
-		desc = alloc_msi_entry(dev, 1, NULL);
+		desc = msi_alloc_desc(dev, 1, NULL);
 		if (!desc) {
 			ret = -ENOMEM;
 			goto fail;


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 29/31] genirq/msi: Add abuse prevention comment to msi header
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (27 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 28/31] genirq/msi: Mop up old interfaces Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-07  8:21   ` Greg Kroah-Hartman
  2021-12-06 22:51 ` [patch V2 30/31] genirq/msi: Simplify sysfs handling Thomas Gleixner
                   ` (2 subsequent siblings)
  31 siblings, 1 reply; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/msi.h |   14 ++++++++++++++
 1 file changed, 14 insertions(+)

--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -2,6 +2,20 @@
 #ifndef LINUX_MSI_H
 #define LINUX_MSI_H
 
+/*
+ * This header file contains MSI data structures and functions which are
+ * only relevant for:
+ *	- Interrupt core code
+ *	- PCI/MSI core code
+ *	- MSI interrupt domain implementations
+ *	- IOMMU, low level VFIO, NTB and other justified exceptions
+ *	  dealing with low level MSI details.
+ *
+ * Regular device drivers have no business with any of these functions and
+ * especially storing MSI descriptor pointers in random code is considered
+ * abuse. The only function which is relevant for drivers is msi_get_virq().
+ */
+
 #include <linux/cpumask.h>
 #include <linux/mutex.h>
 #include <linux/list.h>


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 30/31] genirq/msi: Simplify sysfs handling
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (28 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 29/31] genirq/msi: Add abuse prevention comment to msi header Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2022-01-10 18:12   ` [patch] genirq/msi: Populate sysfs entry only once Thomas Gleixner
  2021-12-06 22:51 ` [patch V2 31/31] genirq/msi: Convert storage to xarray Thomas Gleixner
  2021-12-09  1:01 ` [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Jason Gunthorpe
  31 siblings, 1 reply; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

The sysfs handling for MSI is a convoluted maze and it is in the way of
supporting dynamic expansion of the MSI-X vectors because it only supports
a one off bulk population/free of the sysfs entries.

Change it to do:

   1) Creating an empty sysfs attribute group when msi_device_data is
      allocated

   2) Populate the entries when the MSI descriptor is initialized

   3) Free the entries when a MSI descriptor is detached from a Linux
      interrupt.

   4) Provide functions for the legacy non-irqdomain fallback code to
      do a bulk population/free. This code won't support dynamic
      expansion.

This makes the code simpler and reduces the number of allocations as the
empty attribute group can be shared.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/msi.h |    8 +-
 kernel/irq/msi.c    |  196 +++++++++++++++++++++++-----------------------------
 2 files changed, 95 insertions(+), 109 deletions(-)

--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -72,7 +72,7 @@ struct irq_data;
 struct msi_desc;
 struct pci_dev;
 struct platform_msi_priv_data;
-struct attribute_group;
+struct device_attribute;
 
 void __get_cached_msi_msg(struct msi_desc *entry, struct msi_msg *msg);
 #ifdef CONFIG_GENERIC_MSI_IRQ
@@ -130,6 +130,7 @@ struct pci_msi_desc {
  * @dev:	Pointer to the device which uses this descriptor
  * @msg:	The last set MSI message cached for reuse
  * @affinity:	Optional pointer to a cpu affinity mask for this descriptor
+ * @sysfs_attr:	Pointer to sysfs device attribute
  *
  * @write_msi_msg:	Callback that may be called when the MSI message
  *			address or data changes
@@ -149,6 +150,9 @@ struct msi_desc {
 #ifdef CONFIG_IRQ_MSI_IOMMU
 	const void			*iommu_cookie;
 #endif
+#ifdef CONFIG_SYSFS
+	struct device_attribute		*sysfs_attrs;
+#endif
 
 	void (*write_msi_msg)(struct msi_desc *entry, void *data);
 	void *write_msi_msg_data;
@@ -172,7 +176,6 @@ enum msi_desc_filter {
 /**
  * msi_device_data - MSI per device data
  * @properties:		MSI properties which are interesting to drivers
- * @attrs:		Pointer to the sysfs attribute group
  * @platform_data:	Platform-MSI specific data
  * @list:		List of MSI descriptors associated to the device
  * @mutex:		Mutex protecting the MSI list
@@ -180,7 +183,6 @@ enum msi_desc_filter {
  */
 struct msi_device_data {
 	unsigned long			properties;
-	const struct attribute_group    **attrs;
 	struct platform_msi_priv_data	*platform_data;
 	struct list_head		list;
 	struct mutex			mutex;
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -19,6 +19,7 @@
 
 #include "internals.h"
 
+static inline int msi_sysfs_create_group(struct device *dev);
 #define dev_to_msi_list(dev)	(&(dev)->msi.data->list)
 
 /**
@@ -202,6 +203,7 @@ static void msi_device_data_release(stru
 int msi_setup_device_data(struct device *dev)
 {
 	struct msi_device_data *md;
+	int ret;
 
 	if (dev->msi.data)
 		return 0;
@@ -210,6 +212,12 @@ int msi_setup_device_data(struct device
 	if (!md)
 		return -ENOMEM;
 
+	ret = msi_sysfs_create_group(dev);
+	if (ret) {
+		devres_free(md);
+		return ret;
+	}
+
 	INIT_LIST_HEAD(&md->list);
 	mutex_init(&md->mutex);
 	dev->msi.data = md;
@@ -379,6 +387,20 @@ unsigned int msi_get_virq(struct device
 EXPORT_SYMBOL_GPL(msi_get_virq);
 
 #ifdef CONFIG_SYSFS
+static struct attribute *msi_dev_attrs[] = {
+	NULL
+};
+
+static const struct attribute_group msi_irqs_group = {
+	.name	= "msi_irqs",
+	.attrs	= msi_dev_attrs,
+};
+
+static inline int msi_sysfs_create_group(struct device *dev)
+{
+	return devm_device_add_group(dev, &msi_irqs_group);
+}
+
 static ssize_t msi_mode_show(struct device *dev, struct device_attribute *attr,
 			     char *buf)
 {
@@ -388,97 +410,74 @@ static ssize_t msi_mode_show(struct devi
 	return sysfs_emit(buf, "%s\n", is_msix ? "msix" : "msi");
 }
 
-/**
- * msi_populate_sysfs - Populate msi_irqs sysfs entries for devices
- * @dev:	The device(PCI, platform etc) who will get sysfs entries
- */
-static const struct attribute_group **msi_populate_sysfs(struct device *dev)
+static void msi_sysfs_remove_desc(struct device *dev, struct msi_desc *desc)
 {
-	const struct attribute_group **msi_irq_groups;
-	struct attribute **msi_attrs, *msi_attr;
-	struct device_attribute *msi_dev_attr;
-	struct attribute_group *msi_irq_group;
-	struct msi_desc *entry;
-	int ret = -ENOMEM;
-	int num_msi = 0;
-	int count = 0;
+	struct device_attribute *attrs = desc->sysfs_attrs;
 	int i;
 
-	/* Determine how many msi entries we have */
-	msi_for_each_desc(entry, dev, MSI_DESC_ALL)
-		num_msi += entry->nvec_used;
-	if (!num_msi)
-		return NULL;
+	if (!attrs)
+		return;
 
-	/* Dynamically create the MSI attributes for the device */
-	msi_attrs = kcalloc(num_msi + 1, sizeof(void *), GFP_KERNEL);
-	if (!msi_attrs)
-		return ERR_PTR(-ENOMEM);
-
-	msi_for_each_desc(entry, dev, MSI_DESC_ALL) {
-		for (i = 0; i < entry->nvec_used; i++) {
-			msi_dev_attr = kzalloc(sizeof(*msi_dev_attr), GFP_KERNEL);
-			if (!msi_dev_attr)
-				goto error_attrs;
-			msi_attrs[count] = &msi_dev_attr->attr;
-
-			sysfs_attr_init(&msi_dev_attr->attr);
-			msi_dev_attr->attr.name = kasprintf(GFP_KERNEL, "%d",
-							    entry->irq + i);
-			if (!msi_dev_attr->attr.name)
-				goto error_attrs;
-			msi_dev_attr->attr.mode = 0444;
-			msi_dev_attr->show = msi_mode_show;
-			++count;
-		}
+	desc->sysfs_attrs = NULL;
+	for (i = 0; i < desc->nvec_used; i++) {
+		if (attrs[i].show)
+			sysfs_remove_file_from_group(&dev->kobj, &attrs[i].attr, msi_irqs_group.name);
+		kfree(attrs[i].attr.name);
 	}
+	kfree(attrs);
+}
 
-	msi_irq_group = kzalloc(sizeof(*msi_irq_group), GFP_KERNEL);
-	if (!msi_irq_group)
-		goto error_attrs;
-	msi_irq_group->name = "msi_irqs";
-	msi_irq_group->attrs = msi_attrs;
-
-	msi_irq_groups = kcalloc(2, sizeof(void *), GFP_KERNEL);
-	if (!msi_irq_groups)
-		goto error_irq_group;
-	msi_irq_groups[0] = msi_irq_group;
+static int msi_sysfs_populate_desc(struct device *dev, struct msi_desc *desc)
+{
+	struct device_attribute *attrs;
+	int ret, i;
 
-	ret = sysfs_create_groups(&dev->kobj, msi_irq_groups);
-	if (ret)
-		goto error_irq_groups;
+	attrs = kcalloc(desc->nvec_used, sizeof(*attrs), GFP_KERNEL);
+	if (!attrs)
+		return -ENOMEM;
+
+	desc->sysfs_attrs = attrs;
+	for (i = 0; i < desc->nvec_used; i++) {
+		sysfs_attr_init(&attrs[i].attr);
+		attrs[i].attr.name = kasprintf(GFP_KERNEL, "%d", desc->irq + i);
+		if (!attrs[i].attr.name) {
+			ret = -ENOMEM;
+			goto fail;
+		}
 
-	return msi_irq_groups;
+		attrs[i].attr.mode = 0444;
+		attrs[i].show = msi_mode_show;
 
-error_irq_groups:
-	kfree(msi_irq_groups);
-error_irq_group:
-	kfree(msi_irq_group);
-error_attrs:
-	count = 0;
-	msi_attr = msi_attrs[count];
-	while (msi_attr) {
-		msi_dev_attr = container_of(msi_attr, struct device_attribute, attr);
-		kfree(msi_attr->name);
-		kfree(msi_dev_attr);
-		++count;
-		msi_attr = msi_attrs[count];
+		ret = sysfs_add_file_to_group(&dev->kobj, &attrs[i].attr, msi_irqs_group.name);
+		if (ret) {
+			attrs[i].show = NULL;
+			goto fail;
+		}
 	}
-	kfree(msi_attrs);
-	return ERR_PTR(ret);
+	return 0;
+
+fail:
+	msi_sysfs_remove_desc(dev, desc);
+	return ret;
 }
 
+#ifdef CONFIG_PCI_MSI_ARCH_FALLBACKS
 /**
  * msi_device_populate_sysfs - Populate msi_irqs sysfs entries for a device
  * @dev:	The device (PCI, platform etc) which will get sysfs entries
  */
 int msi_device_populate_sysfs(struct device *dev)
 {
-	const struct attribute_group **group = msi_populate_sysfs(dev);
+	struct msi_desc *desc;
+	int ret;
 
-	if (IS_ERR(group))
-		return PTR_ERR(group);
-	dev->msi.data->attrs = group;
+	msi_for_each_desc(desc, dev, MSI_DESC_ASSOCIATED) {
+		if (desc->sysfs_attrs)
+			continue;
+		ret = msi_sysfs_populate_desc(dev, desc);
+		if (ret)
+			return ret;
+	}
 	return 0;
 }
 
@@ -489,28 +488,17 @@ int msi_device_populate_sysfs(struct dev
  */
 void msi_device_destroy_sysfs(struct device *dev)
 {
-	const struct attribute_group **msi_irq_groups = dev->msi.data->attrs;
-	struct device_attribute *dev_attr;
-	struct attribute **msi_attrs;
-	int count = 0;
-
-	dev->msi.data->attrs = NULL;
-	if (!msi_irq_groups)
-		return;
+	struct msi_desc *desc;
 
-	sysfs_remove_groups(&dev->kobj, msi_irq_groups);
-	msi_attrs = msi_irq_groups[0]->attrs;
-	while (msi_attrs[count]) {
-		dev_attr = container_of(msi_attrs[count], struct device_attribute, attr);
-		kfree(dev_attr->attr.name);
-		kfree(dev_attr);
-		++count;
-	}
-	kfree(msi_attrs);
-	kfree(msi_irq_groups[0]);
-	kfree(msi_irq_groups);
+	msi_for_each_desc(desc, dev, MSI_DESC_ALL)
+		msi_sysfs_remove_desc(dev, desc);
 }
-#endif
+#endif /* CONFIG_PCI_MSI_ARCH_FALLBACK */
+#else /* CONFIG_SYSFS */
+static inline int msi_sysfs_create_group(struct device *dev) { return 0; }
+static inline int msi_sysfs_populate_desc(struct device *dev, struct msi_desc *desc) { return 0; }
+static inline void msi_sysfs_remove_desc(struct device *dev, struct msi_desc *desc) { }
+#endif /* !CONFIG_SYSFS */
 
 #ifdef CONFIG_GENERIC_MSI_IRQ_DOMAIN
 static inline void irq_chip_write_msi_msg(struct irq_data *data,
@@ -942,6 +930,12 @@ int __msi_domain_alloc_irqs(struct irq_d
 			ret = msi_init_virq(domain, virq + i, vflags);
 			if (ret)
 				return ret;
+
+			if (info->flags & MSI_FLAG_DEV_SYSFS) {
+				ret = msi_sysfs_populate_desc(dev, desc);
+				if (ret)
+					return ret;
+			}
 		}
 		allocated++;
 	}
@@ -986,18 +980,7 @@ int msi_domain_alloc_irqs_descs_locked(s
 
 	ret = ops->domain_alloc_irqs(domain, dev, nvec);
 	if (ret)
-		goto cleanup;
-
-	if (!(info->flags & MSI_FLAG_DEV_SYSFS))
-		return 0;
-
-	ret = msi_device_populate_sysfs(dev);
-	if (ret)
-		goto cleanup;
-	return 0;
-
-cleanup:
-	msi_domain_free_irqs_descs_locked(domain, dev);
+		msi_domain_free_irqs_descs_locked(domain, dev);
 	return ret;
 }
 
@@ -1022,6 +1005,7 @@ int msi_domain_alloc_irqs(struct irq_dom
 
 void __msi_domain_free_irqs(struct irq_domain *domain, struct device *dev)
 {
+	struct msi_domain_info *info = domain->host_data;
 	struct irq_data *irqd;
 	struct msi_desc *desc;
 	int i;
@@ -1036,6 +1020,8 @@ void __msi_domain_free_irqs(struct irq_d
 		}
 
 		irq_domain_free_irqs(desc->irq, desc->nvec_used);
+		if (info->flags & MSI_FLAG_DEV_SYSFS)
+			msi_sysfs_remove_desc(dev, desc);
 		desc->irq = 0;
 	}
 }
@@ -1064,8 +1050,6 @@ void msi_domain_free_irqs_descs_locked(s
 
 	lockdep_assert_held(&dev->msi.data->mutex);
 
-	if (info->flags & MSI_FLAG_DEV_SYSFS)
-		msi_device_destroy_sysfs(dev);
 	ops->domain_free_irqs(domain, dev);
 	msi_domain_free_msi_descs(info, dev);
 }


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V2 31/31] genirq/msi: Convert storage to xarray
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (29 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 30/31] genirq/msi: Simplify sysfs handling Thomas Gleixner
@ 2021-12-06 22:51 ` Thomas Gleixner
  2021-12-09  1:01 ` [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Jason Gunthorpe
  31 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-06 22:51 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

The current linked list storage for MSI descriptors is suboptimal in
several ways:

  1) Looking up a MSI desciptor requires a O(n) list walk in the worst case

  2) The upcoming support of runtime expansion of MSI-X vectors would need
     to do a full list walk to figure out whether a particular index is
     already associated.

  3) Runtime expansion of sparse allocations is even more complex as the
     current implementation assumes an ordered list (increasing MSI index).

Use an xarray which solves all of the above problems nicely.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/msi.h |   13 +---
 kernel/irq/msi.c    |  169 +++++++++++++++++++++++-----------------------------
 2 files changed, 83 insertions(+), 99 deletions(-)

--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -17,6 +17,7 @@
  */
 
 #include <linux/cpumask.h>
+#include <linux/xarray.h>
 #include <linux/mutex.h>
 #include <linux/list.h>
 #include <linux/bits.h>
@@ -124,7 +125,6 @@ struct pci_msi_desc {
 
 /**
  * struct msi_desc - Descriptor structure for MSI based interrupts
- * @list:	List head for management
  * @irq:	The base interrupt number
  * @nvec_used:	The number of vectors used
  * @dev:	Pointer to the device which uses this descriptor
@@ -141,7 +141,6 @@ struct pci_msi_desc {
  */
 struct msi_desc {
 	/* Shared device/bus type independent data */
-	struct list_head		list;
 	unsigned int			irq;
 	unsigned int			nvec_used;
 	struct device			*dev;
@@ -177,16 +176,16 @@ enum msi_desc_filter {
  * msi_device_data - MSI per device data
  * @properties:		MSI properties which are interesting to drivers
  * @platform_data:	Platform-MSI specific data
- * @list:		List of MSI descriptors associated to the device
- * @mutex:		Mutex protecting the MSI list
- * @__next:		Cached pointer to the next entry for iterators
+ * @mutex:		Mutex protecting the MSI descriptor store
+ * @__store:		Xarray for storing MSI descriptor pointers
+ * @__iter_idx:		Index to search the next entry for iterators
  */
 struct msi_device_data {
 	unsigned long			properties;
 	struct platform_msi_priv_data	*platform_data;
-	struct list_head		list;
 	struct mutex			mutex;
-	struct msi_desc			*__next;
+	struct xarray			__store;
+	unsigned long			__iter_idx;
 };
 
 int msi_setup_device_data(struct device *dev);
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -20,7 +20,6 @@
 #include "internals.h"
 
 static inline int msi_sysfs_create_group(struct device *dev);
-#define dev_to_msi_list(dev)	(&(dev)->msi.data->list)
 
 /**
  * msi_alloc_desc - Allocate an initialized msi_desc
@@ -41,7 +40,6 @@ static struct msi_desc *msi_alloc_desc(s
 	if (!desc)
 		return NULL;
 
-	INIT_LIST_HEAD(&desc->list);
 	desc->dev = dev;
 	desc->nvec_used = nvec;
 	if (affinity) {
@@ -60,6 +58,17 @@ static void msi_free_desc(struct msi_des
 	kfree(desc);
 }
 
+static int msi_insert_desc(struct msi_device_data *md, struct msi_desc *desc, unsigned int index)
+{
+	int ret;
+
+	desc->msi_index = index;
+	ret = xa_insert(&md->__store, index, desc, GFP_KERNEL);
+	if (ret)
+		msi_free_desc(desc);
+	return ret;
+}
+
 /**
  * msi_add_msi_desc - Allocate and initialize a MSI descriptor
  * @dev:	Pointer to the device for which the descriptor is allocated
@@ -77,12 +86,9 @@ int msi_add_msi_desc(struct device *dev,
 	if (!desc)
 		return -ENOMEM;
 
-	/* Copy the MSI index and type specific data to the new descriptor. */
-	desc->msi_index = init_desc->msi_index;
+	/* Copy type specific data to the new descriptor. */
 	desc->pci = init_desc->pci;
-
-	list_add_tail(&desc->list, &dev->msi.data->list);
-	return 0;
+	return msi_insert_desc(dev->msi.data, desc, init_desc->msi_index);
 }
 
 /**
@@ -95,28 +101,41 @@ int msi_add_msi_desc(struct device *dev,
  */
 static int msi_add_simple_msi_descs(struct device *dev, unsigned int index, unsigned int ndesc)
 {
-	struct msi_desc *desc, *tmp;
-	LIST_HEAD(list);
-	unsigned int i;
+	unsigned int idx, last = index + ndesc - 1;
+	struct msi_desc *desc;
+	int ret;
 
 	lockdep_assert_held(&dev->msi.data->mutex);
 
-	for (i = 0; i < ndesc; i++) {
+	for (idx = index; idx <= last; idx++) {
 		desc = msi_alloc_desc(dev, 1, NULL);
 		if (!desc)
+			goto fail_mem;
+		ret = msi_insert_desc(dev->msi.data, desc, idx);
+		if (ret)
 			goto fail;
-		desc->msi_index = index + i;
-		list_add_tail(&desc->list, &list);
 	}
-	list_splice_tail(&list, &dev->msi.data->list);
 	return 0;
 
+fail_mem:
+	ret = -ENOMEM;
 fail:
-	list_for_each_entry_safe(desc, tmp, &list, list) {
-		list_del(&desc->list);
-		msi_free_desc(desc);
+	msi_free_msi_descs_range(dev, MSI_DESC_NOTASSOCIATED, index, last);
+	return ret;
+}
+
+static bool msi_desc_match(struct msi_desc *desc, enum msi_desc_filter filter)
+{
+	switch (filter) {
+	case MSI_DESC_ALL:
+		return true;
+	case MSI_DESC_NOTASSOCIATED:
+		return !desc->irq;
+	case MSI_DESC_ASSOCIATED:
+		return !!desc->irq;
 	}
-	return -ENOMEM;
+	WARN_ON_ONCE(1);
+	return false;
 }
 
 /**
@@ -141,19 +160,17 @@ void msi_device_set_properties(struct de
 void msi_free_msi_descs_range(struct device *dev, enum msi_desc_filter filter,
 			      unsigned int first_index, unsigned int last_index)
 {
+	struct xarray *xa = &dev->msi.data->__store;
 	struct msi_desc *desc;
+	unsigned long idx;
 
 	lockdep_assert_held(&dev->msi.data->mutex);
 
-	msi_for_each_desc(desc, dev, filter) {
-		/*
-		 * Stupid for now to handle MSI device domain until the
-		 * storage is switched over to an xarray.
-		 */
-		if (desc->msi_index < first_index || desc->msi_index > last_index)
-			continue;
-		list_del(&desc->list);
-		msi_free_desc(desc);
+	xa_for_each_range(xa, idx, desc, first_index, last_index) {
+		if (msi_desc_match(desc, filter)) {
+			xa_erase(xa, idx);
+			msi_free_desc(desc);
+		}
 	}
 }
 
@@ -186,7 +203,8 @@ static void msi_device_data_release(stru
 {
 	struct msi_device_data *md = res;
 
-	WARN_ON_ONCE(!list_empty(&md->list));
+	WARN_ON_ONCE(!xa_empty(&md->__store));
+	xa_destroy(&md->__store);
 	dev->msi.data = NULL;
 }
 
@@ -218,7 +236,7 @@ int msi_setup_device_data(struct device
 		return ret;
 	}
 
-	INIT_LIST_HEAD(&md->list);
+	xa_init(&md->__store);
 	mutex_init(&md->mutex);
 	dev->msi.data = md;
 	devres_add(dev, md);
@@ -245,34 +263,21 @@ void msi_unlock_descs(struct device *dev
 {
 	if (WARN_ON_ONCE(!dev->msi.data))
 		return;
-	/* Clear the next pointer which was cached by the iterator */
-	dev->msi.data->__next = NULL;
+	/* Invalidate the index wich was cached by the iterator */
+	dev->msi.data->__iter_idx = MSI_MAX_INDEX;
 	mutex_unlock(&dev->msi.data->mutex);
 }
 EXPORT_SYMBOL_GPL(msi_unlock_descs);
 
-static bool msi_desc_match(struct msi_desc *desc, enum msi_desc_filter filter)
-{
-	switch (filter) {
-	case MSI_DESC_ALL:
-		return true;
-	case MSI_DESC_NOTASSOCIATED:
-		return !desc->irq;
-	case MSI_DESC_ASSOCIATED:
-		return !!desc->irq;
-	}
-	WARN_ON_ONCE(1);
-	return false;
-}
-
-static struct msi_desc *msi_find_first_desc(struct device *dev, enum msi_desc_filter filter)
+static struct msi_desc *msi_find_desc(struct msi_device_data *md, enum msi_desc_filter filter)
 {
 	struct msi_desc *desc;
 
-	list_for_each_entry(desc, dev_to_msi_list(dev), list) {
+	xa_for_each_start(&md->__store, md->__iter_idx, desc, md->__iter_idx) {
 		if (msi_desc_match(desc, filter))
 			return desc;
 	}
+	md->__iter_idx = MSI_MAX_INDEX;
 	return NULL;
 }
 
@@ -289,37 +294,24 @@ static struct msi_desc *msi_find_first_d
  */
 struct msi_desc *msi_first_desc(struct device *dev, enum msi_desc_filter filter)
 {
-	struct msi_desc *desc;
+	struct msi_device_data *md = dev->msi.data;
 
-	if (WARN_ON_ONCE(!dev->msi.data))
+	if (WARN_ON_ONCE(!md))
 		return NULL;
 
-	lockdep_assert_held(&dev->msi.data->mutex);
+	lockdep_assert_held(&md->mutex);
 
-	desc = msi_find_first_desc(dev, filter);
-	dev->msi.data->__next = desc ? list_next_entry(desc, list) : NULL;
-	return desc;
+	md->__iter_idx = 0;
+	return msi_find_desc(md, filter);
 }
 EXPORT_SYMBOL_GPL(msi_first_desc);
 
-static struct msi_desc *__msi_next_desc(struct device *dev, enum msi_desc_filter filter,
-					struct msi_desc *from)
-{
-	struct msi_desc *desc = from;
-
-	list_for_each_entry_from(desc, dev_to_msi_list(dev), list) {
-		if (msi_desc_match(desc, filter))
-			return desc;
-	}
-	return NULL;
-}
-
 /**
  * msi_next_desc - Get the next MSI descriptor of a device
  * @dev:	Device to operate on
  *
  * The first invocation of msi_next_desc() has to be preceeded by a
- * successful incovation of __msi_first_desc(). Consecutive invocations are
+ * successful invocation of __msi_first_desc(). Consecutive invocations are
  * only valid if the previous one was successful. All these operations have
  * to be done within the same MSI mutex held region.
  *
@@ -328,20 +320,18 @@ static struct msi_desc *__msi_next_desc(
  */
 struct msi_desc *msi_next_desc(struct device *dev, enum msi_desc_filter filter)
 {
-	struct msi_device_data *data = dev->msi.data;
-	struct msi_desc *desc;
+	struct msi_device_data *md = dev->msi.data;
 
-	if (WARN_ON_ONCE(!data))
+	if (WARN_ON_ONCE(!md))
 		return NULL;
 
-	lockdep_assert_held(&data->mutex);
+	lockdep_assert_held(&md->mutex);
 
-	if (!data->__next)
+	if (md->__iter_idx >= (unsigned long)MSI_MAX_INDEX)
 		return NULL;
 
-	desc = __msi_next_desc(dev, filter, data->__next);
-	dev->msi.data->__next = desc ? list_next_entry(desc, list) : NULL;
-	return desc;
+	md->__iter_idx++;
+	return msi_find_desc(md, filter);
 }
 EXPORT_SYMBOL_GPL(msi_next_desc);
 
@@ -364,21 +354,18 @@ unsigned int msi_get_virq(struct device
 	pcimsi = msi_device_has_property(dev, MSI_PROP_PCI_MSI);
 
 	msi_lock_descs(dev);
-	msi_for_each_desc(desc, dev, MSI_DESC_ASSOCIATED) {
-		/* PCI-MSI has only one descriptor for multiple interrupts. */
-		if (pcimsi) {
-			if (index < desc->nvec_used)
-				ret = desc->irq + index;
-			break;
-		}
-
+	desc = xa_load(&dev->msi.data->__store, pcimsi ? 0 : index);
+	if (desc && desc->irq) {
 		/*
+		 * PCI-MSI has only one descriptor for multiple interrupts.
 		 * PCI-MSIX and platform MSI use a descriptor per
 		 * interrupt.
 		 */
-		if (desc->msi_index == index) {
+		if (pcimsi) {
+			if (index < desc->nvec_used)
+				ret = desc->irq + index;
+		} else {
 			ret = desc->irq;
-			break;
 		}
 	}
 	msi_unlock_descs(dev);
@@ -759,16 +746,13 @@ int msi_domain_populate_irqs(struct irq_
 	int ret, virq;
 
 	msi_lock_descs(dev);
-	for (virq = virq_base; virq < virq_base + nvec; virq++) {
-		desc = msi_alloc_desc(dev, 1, NULL);
-		if (!desc) {
-			ret = -ENOMEM;
-			goto fail;
-		}
+	ret = msi_add_simple_msi_descs(dev, virq_base, nvec);
+	if (ret)
+		goto unlock;
 
-		desc->msi_index = virq;
+	for (virq = virq_base; virq < virq_base + nvec; virq++) {
+		desc = xa_load(&dev->msi.data->__store, virq);
 		desc->irq = virq;
-		list_add_tail(&desc->list, &dev->msi.data->list);
 
 		ops->set_desc(arg, desc);
 		ret = irq_domain_alloc_irqs_hierarchy(domain, virq, 1, arg);
@@ -784,6 +768,7 @@ int msi_domain_populate_irqs(struct irq_
 	for (--virq; virq >= virq_base; virq--)
 		irq_domain_free_irqs_common(domain, virq, 1);
 	msi_free_msi_descs_range(dev, MSI_DESC_ALL, virq_base, virq_base + nvec - 1);
+unlock:
 	msi_unlock_descs(dev);
 	return ret;
 }


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch V2 29/31] genirq/msi: Add abuse prevention comment to msi header
  2021-12-06 22:51 ` [patch V2 29/31] genirq/msi: Add abuse prevention comment to msi header Thomas Gleixner
@ 2021-12-07  8:21   ` Greg Kroah-Hartman
  2021-12-07 12:46     ` Thomas Gleixner
  0 siblings, 1 reply; 50+ messages in thread
From: Greg Kroah-Hartman @ 2021-12-07  8:21 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Niklas Schnelle,
	linux-s390, Heiko Carstens, Christian Borntraeger,
	Logan Gunthorpe, Jon Mason, Dave Jiang, Allen Hubbe, linux-ntb

On Mon, Dec 06, 2021 at 11:51:49PM +0100, Thomas Gleixner wrote:
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  include/linux/msi.h |   14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -2,6 +2,20 @@
>  #ifndef LINUX_MSI_H
>  #define LINUX_MSI_H
>  
> +/*
> + * This header file contains MSI data structures and functions which are
> + * only relevant for:
> + *	- Interrupt core code
> + *	- PCI/MSI core code
> + *	- MSI interrupt domain implementations
> + *	- IOMMU, low level VFIO, NTB and other justified exceptions
> + *	  dealing with low level MSI details.
> + *
> + * Regular device drivers have no business with any of these functions and
> + * especially storing MSI descriptor pointers in random code is considered
> + * abuse. The only function which is relevant for drivers is msi_get_virq().
> + */
> +
>  #include <linux/cpumask.h>
>  #include <linux/mutex.h>
>  #include <linux/list.h>
> 

Ah, to be young and idealistic and hope that kernel developers read
comments in header files :)

You might want to add this to the driver-api kernel doc build?

Anyway, looks good to me:

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch V2 29/31] genirq/msi: Add abuse prevention comment to msi header
  2021-12-07  8:21   ` Greg Kroah-Hartman
@ 2021-12-07 12:46     ` Thomas Gleixner
  0 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-07 12:46 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: LKML, Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Niklas Schnelle,
	linux-s390, Heiko Carstens, Christian Borntraeger,
	Logan Gunthorpe, Jon Mason, Dave Jiang, Allen Hubbe, linux-ntb

On Tue, Dec 07 2021 at 09:21, Greg Kroah-Hartman wrote:
> On Mon, Dec 06, 2021 at 11:51:49PM +0100, Thomas Gleixner wrote:
>>  #include <linux/cpumask.h>
>>  #include <linux/mutex.h>
>>  #include <linux/list.h>
>> 
> Ah, to be young and idealistic and hope that kernel developers read
> comments in header files :)

Hope dies last.

> You might want to add this to the driver-api kernel doc build?

When I do the next round of refactoring, I'm going to move the functions
which are available for drivers into a separate header file.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch V2 07/31] PCI/MSI: Protect MSI operations
  2021-12-06 22:51 ` [patch V2 07/31] PCI/MSI: Protect MSI operations Thomas Gleixner
@ 2021-12-07 21:06   ` Bjorn Helgaas
  0 siblings, 0 replies; 50+ messages in thread
From: Bjorn Helgaas @ 2021-12-07 21:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Marc Zygnier, Alex Williamson, Kevin Tian, Jason Gunthorpe,
	Megha Dey, Ashok Raj, linux-pci, Cedric Le Goater, xen-devel,
	Juergen Gross, Greg Kroah-Hartman, Niklas Schnelle, linux-s390,
	Heiko Carstens, Christian Borntraeger, Logan Gunthorpe,
	Jon Mason, Dave Jiang, Allen Hubbe, linux-ntb

On Mon, Dec 06, 2021 at 11:51:13PM +0100, Thomas Gleixner wrote:
> To prepare for dynamic extension of MSI-X vectors, protect the MSI
> operations for MSI and MSI-X. This requires to move the invocation of
> irq_create_affinity_masks() out of the descriptor lock section to avoid
> reverse lock ordering vs. CPU hotplug lock as some callers of the PCI/MSI
> allocation interfaces already hold it.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

> ---
>  drivers/pci/msi/irqdomain.c |    4 -
>  drivers/pci/msi/msi.c       |  120 ++++++++++++++++++++++++++------------------
>  2 files changed, 73 insertions(+), 51 deletions(-)
> 
> --- a/drivers/pci/msi/irqdomain.c
> +++ b/drivers/pci/msi/irqdomain.c
> @@ -14,7 +14,7 @@ int pci_msi_setup_msi_irqs(struct pci_de
>  
>  	domain = dev_get_msi_domain(&dev->dev);
>  	if (domain && irq_domain_is_hierarchy(domain))
> -		return msi_domain_alloc_irqs(domain, &dev->dev, nvec);
> +		return msi_domain_alloc_irqs_descs_locked(domain, &dev->dev, nvec);
>  
>  	return pci_msi_legacy_setup_msi_irqs(dev, nvec, type);
>  }
> @@ -25,7 +25,7 @@ void pci_msi_teardown_msi_irqs(struct pc
>  
>  	domain = dev_get_msi_domain(&dev->dev);
>  	if (domain && irq_domain_is_hierarchy(domain))
> -		msi_domain_free_irqs(domain, &dev->dev);
> +		msi_domain_free_irqs_descs_locked(domain, &dev->dev);
>  	else
>  		pci_msi_legacy_teardown_msi_irqs(dev);
>  }
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -322,11 +322,13 @@ static void __pci_restore_msix_state(str
>  
>  	write_msg = arch_restore_msi_irqs(dev);
>  
> +	msi_lock_descs(&dev->dev);
>  	for_each_pci_msi_entry(entry, dev) {
>  		if (write_msg)
>  			__pci_write_msi_msg(entry, &entry->msg);
>  		pci_msix_write_vector_ctrl(entry, entry->pci.msix_ctrl);
>  	}
> +	msi_unlock_descs(&dev->dev);
>  
>  	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
>  }
> @@ -339,20 +341,16 @@ void pci_restore_msi_state(struct pci_de
>  EXPORT_SYMBOL_GPL(pci_restore_msi_state);
>  
>  static struct msi_desc *
> -msi_setup_entry(struct pci_dev *dev, int nvec, struct irq_affinity *affd)
> +msi_setup_entry(struct pci_dev *dev, int nvec, struct irq_affinity_desc *masks)
>  {
> -	struct irq_affinity_desc *masks = NULL;
>  	struct msi_desc *entry;
>  	unsigned long prop;
>  	u16 control;
>  
> -	if (affd)
> -		masks = irq_create_affinity_masks(nvec, affd);
> -
>  	/* MSI Entry Initialization */
>  	entry = alloc_msi_entry(&dev->dev, nvec, masks);
>  	if (!entry)
> -		goto out;
> +		return NULL;
>  
>  	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
>  	/* Lies, damned lies, and MSIs */
> @@ -379,8 +377,7 @@ msi_setup_entry(struct pci_dev *dev, int
>  	if (entry->pci.msi_attrib.is_64)
>  		prop |= MSI_PROP_64BIT;
>  	msi_device_set_properties(&dev->dev, prop);
> -out:
> -	kfree(masks);
> +
>  	return entry;
>  }
>  
> @@ -416,14 +413,21 @@ static int msi_verify_entries(struct pci
>  static int msi_capability_init(struct pci_dev *dev, int nvec,
>  			       struct irq_affinity *affd)
>  {
> +	struct irq_affinity_desc *masks = NULL;
>  	struct msi_desc *entry;
>  	int ret;
>  
>  	pci_msi_set_enable(dev, 0);	/* Disable MSI during set up */
>  
> -	entry = msi_setup_entry(dev, nvec, affd);
> -	if (!entry)
> -		return -ENOMEM;
> +	if (affd)
> +		masks = irq_create_affinity_masks(nvec, affd);
> +
> +	msi_lock_descs(&dev->dev);
> +	entry = msi_setup_entry(dev, nvec, masks);
> +	if (!entry) {
> +		ret = -ENOMEM;
> +		goto unlock;
> +	}
>  
>  	/* All MSIs are unmasked by default; mask them all */
>  	pci_msi_mask(entry, msi_multi_mask(entry));
> @@ -446,11 +450,14 @@ static int msi_capability_init(struct pc
>  
>  	pcibios_free_irq(dev);
>  	dev->irq = entry->irq;
> -	return 0;
> +	goto unlock;
>  
>  err:
>  	pci_msi_unmask(entry, msi_multi_mask(entry));
>  	free_msi_irqs(dev);
> +unlock:
> +	msi_unlock_descs(&dev->dev);
> +	kfree(masks);
>  	return ret;
>  }
>  
> @@ -477,23 +484,18 @@ static void __iomem *msix_map_region(str
>  
>  static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
>  			      struct msix_entry *entries, int nvec,
> -			      struct irq_affinity *affd)
> +			      struct irq_affinity_desc *masks)
>  {
> -	struct irq_affinity_desc *curmsk, *masks = NULL;
> +	int i, vec_count = pci_msix_vec_count(dev);
> +	struct irq_affinity_desc *curmsk;
>  	struct msi_desc *entry;
>  	void __iomem *addr;
> -	int ret, i;
> -	int vec_count = pci_msix_vec_count(dev);
> -
> -	if (affd)
> -		masks = irq_create_affinity_masks(nvec, affd);
>  
>  	for (i = 0, curmsk = masks; i < nvec; i++) {
>  		entry = alloc_msi_entry(&dev->dev, 1, curmsk);
>  		if (!entry) {
>  			/* No enough memory. Don't try again */
> -			ret = -ENOMEM;
> -			goto out;
> +			return -ENOMEM;
>  		}
>  
>  		entry->pci.msi_attrib.is_msix	= 1;
> @@ -522,10 +524,7 @@ static int msix_setup_entries(struct pci
>  			curmsk++;
>  	}
>  	msi_device_set_properties(&dev->dev, MSI_PROP_PCI_MSIX | MSI_PROP_64BIT);
> -	ret = 0;
> -out:
> -	kfree(masks);
> -	return ret;
> +	return 0;
>  }
>  
>  static void msix_update_entries(struct pci_dev *dev, struct msix_entry *entries)
> @@ -552,6 +551,41 @@ static void msix_mask_all(void __iomem *
>  		writel(ctrl, base + PCI_MSIX_ENTRY_VECTOR_CTRL);
>  }
>  
> +static int msix_setup_interrupts(struct pci_dev *dev, void __iomem *base,
> +				 struct msix_entry *entries, int nvec,
> +				 struct irq_affinity *affd)
> +{
> +	struct irq_affinity_desc *masks = NULL;
> +	int ret;
> +
> +	if (affd)
> +		masks = irq_create_affinity_masks(nvec, affd);
> +
> +	msi_lock_descs(&dev->dev);
> +	ret = msix_setup_entries(dev, base, entries, nvec, masks);
> +	if (ret)
> +		goto out_free;
> +
> +	ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX);
> +	if (ret)
> +		goto out_free;
> +
> +	/* Check if all MSI entries honor device restrictions */
> +	ret = msi_verify_entries(dev);
> +	if (ret)
> +		goto out_free;
> +
> +	msix_update_entries(dev, entries);
> +	goto out_unlock;
> +
> +out_free:
> +	free_msi_irqs(dev);
> +out_unlock:
> +	msi_unlock_descs(&dev->dev);
> +	kfree(masks);
> +	return ret;
> +}
> +
>  /**
>   * msix_capability_init - configure device's MSI-X capability
>   * @dev: pointer to the pci_dev data structure of MSI-X device function
> @@ -592,20 +626,9 @@ static int msix_capability_init(struct p
>  	/* Ensure that all table entries are masked. */
>  	msix_mask_all(base, tsize);
>  
> -	ret = msix_setup_entries(dev, base, entries, nvec, affd);
> +	ret = msix_setup_interrupts(dev, base, entries, nvec, affd);
>  	if (ret)
> -		goto out_free;
> -
> -	ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX);
> -	if (ret)
> -		goto out_free;
> -
> -	/* Check if all MSI entries honor device restrictions */
> -	ret = msi_verify_entries(dev);
> -	if (ret)
> -		goto out_free;
> -
> -	msix_update_entries(dev, entries);
> +		goto out_disable;
>  
>  	/* Set MSI-X enabled bits and unmask the function */
>  	pci_intx_for_msi(dev, 0);
> @@ -615,12 +638,8 @@ static int msix_capability_init(struct p
>  	pcibios_free_irq(dev);
>  	return 0;
>  
> -out_free:
> -	free_msi_irqs(dev);
> -
>  out_disable:
>  	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
> -
>  	return ret;
>  }
>  
> @@ -725,8 +744,10 @@ void pci_disable_msi(struct pci_dev *dev
>  	if (!pci_msi_enable || !dev || !dev->msi_enabled)
>  		return;
>  
> +	msi_lock_descs(&dev->dev);
>  	pci_msi_shutdown(dev);
>  	free_msi_irqs(dev);
> +	msi_unlock_descs(&dev->dev);
>  }
>  EXPORT_SYMBOL(pci_disable_msi);
>  
> @@ -812,8 +833,10 @@ void pci_disable_msix(struct pci_dev *de
>  	if (!pci_msi_enable || !dev || !dev->msix_enabled)
>  		return;
>  
> +	msi_lock_descs(&dev->dev);
>  	pci_msix_shutdown(dev);
>  	free_msi_irqs(dev);
> +	msi_unlock_descs(&dev->dev);
>  }
>  EXPORT_SYMBOL(pci_disable_msix);
>  
> @@ -874,7 +897,6 @@ int pci_enable_msi(struct pci_dev *dev)
>  
>  	if (!rc)
>  		rc = __pci_enable_msi_range(dev, 1, 1, NULL);
> -
>  	return rc < 0 ? rc : 0;
>  }
>  EXPORT_SYMBOL(pci_enable_msi);
> @@ -961,11 +983,7 @@ int pci_alloc_irq_vectors_affinity(struc
>  				   struct irq_affinity *affd)
>  {
>  	struct irq_affinity msi_default_affd = {0};
> -	int ret = msi_setup_device_data(&dev->dev);
> -	int nvecs = -ENOSPC;
> -
> -	if (ret)
> -		return ret;
> +	int ret, nvecs;
>  
>  	if (flags & PCI_IRQ_AFFINITY) {
>  		if (!affd)
> @@ -975,6 +993,10 @@ int pci_alloc_irq_vectors_affinity(struc
>  			affd = NULL;
>  	}
>  
> +	ret = msi_setup_device_data(&dev->dev);
> +	if (ret)
> +		return ret;
> +
>  	if (flags & PCI_IRQ_MSIX) {
>  		nvecs = __pci_enable_msix_range(dev, NULL, min_vecs, max_vecs,
>  						affd, flags);
> @@ -1003,7 +1025,7 @@ int pci_alloc_irq_vectors_affinity(struc
>  		}
>  	}
>  
> -	return nvecs;
> +	return -ENOSPC;
>  }
>  EXPORT_SYMBOL(pci_alloc_irq_vectors_affinity);
>  
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch V2 08/31] PCI/MSI: Use msi_add_msi_desc()
  2021-12-06 22:51 ` [patch V2 08/31] PCI/MSI: Use msi_add_msi_desc() Thomas Gleixner
@ 2021-12-07 21:07   ` Bjorn Helgaas
  0 siblings, 0 replies; 50+ messages in thread
From: Bjorn Helgaas @ 2021-12-07 21:07 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Marc Zygnier, Alex Williamson, Kevin Tian, Jason Gunthorpe,
	Megha Dey, Ashok Raj, linux-pci, Cedric Le Goater, xen-devel,
	Juergen Gross, Greg Kroah-Hartman, Niklas Schnelle, linux-s390,
	Heiko Carstens, Christian Borntraeger, Logan Gunthorpe,
	Jon Mason, Dave Jiang, Allen Hubbe, linux-ntb

On Mon, Dec 06, 2021 at 11:51:15PM +0100, Thomas Gleixner wrote:
> Simplify the allocation of MSI descriptors by using msi_add_msi_desc()
> which moves the storage handling to core code and prepares for dynamic
> extension of the MSI-X vector space.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

> ---
>  drivers/pci/msi/msi.c |  122 ++++++++++++++++++++++++--------------------------
>  1 file changed, 59 insertions(+), 63 deletions(-)
> 
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -340,45 +340,51 @@ void pci_restore_msi_state(struct pci_de
>  }
>  EXPORT_SYMBOL_GPL(pci_restore_msi_state);
>  
> -static struct msi_desc *
> -msi_setup_entry(struct pci_dev *dev, int nvec, struct irq_affinity_desc *masks)
> +static int msi_setup_msi_desc(struct pci_dev *dev, int nvec,
> +			      struct irq_affinity_desc *masks)
>  {
> -	struct msi_desc *entry;
> +	struct msi_desc desc;
>  	unsigned long prop;
>  	u16 control;
> +	int ret;
>  
>  	/* MSI Entry Initialization */
> -	entry = alloc_msi_entry(&dev->dev, nvec, masks);
> -	if (!entry)
> -		return NULL;
> +	memset(&desc, 0, sizeof(desc));
>  
>  	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
>  	/* Lies, damned lies, and MSIs */
>  	if (dev->dev_flags & PCI_DEV_FLAGS_HAS_MSI_MASKING)
>  		control |= PCI_MSI_FLAGS_MASKBIT;
> +	/* Respect XEN's mask disabling */
> +	if (pci_msi_ignore_mask)
> +		control &= ~PCI_MSI_FLAGS_MASKBIT;
>  
> -	entry->pci.msi_attrib.is_64	= !!(control & PCI_MSI_FLAGS_64BIT);
> -	entry->pci.msi_attrib.can_mask	= !pci_msi_ignore_mask &&
> -					  !!(control & PCI_MSI_FLAGS_MASKBIT);
> -	entry->pci.msi_attrib.default_irq = dev->irq;
> -	entry->pci.msi_attrib.multi_cap	= (control & PCI_MSI_FLAGS_QMASK) >> 1;
> -	entry->pci.msi_attrib.multiple	= ilog2(__roundup_pow_of_two(nvec));
> +	desc.nvec_used			= nvec;
> +	desc.pci.msi_attrib.is_64	= !!(control & PCI_MSI_FLAGS_64BIT);
> +	desc.pci.msi_attrib.can_mask	= !!(control & PCI_MSI_FLAGS_MASKBIT);
> +	desc.pci.msi_attrib.default_irq	= dev->irq;
> +	desc.pci.msi_attrib.multi_cap	= (control & PCI_MSI_FLAGS_QMASK) >> 1;
> +	desc.pci.msi_attrib.multiple	= ilog2(__roundup_pow_of_two(nvec));
> +	desc.affinity			= masks;
>  
>  	if (control & PCI_MSI_FLAGS_64BIT)
> -		entry->pci.mask_pos = dev->msi_cap + PCI_MSI_MASK_64;
> +		desc.pci.mask_pos = dev->msi_cap + PCI_MSI_MASK_64;
>  	else
> -		entry->pci.mask_pos = dev->msi_cap + PCI_MSI_MASK_32;
> +		desc.pci.mask_pos = dev->msi_cap + PCI_MSI_MASK_32;
>  
>  	/* Save the initial mask status */
> -	if (entry->pci.msi_attrib.can_mask)
> -		pci_read_config_dword(dev, entry->pci.mask_pos, &entry->pci.msi_mask);
> +	if (desc.pci.msi_attrib.can_mask)
> +		pci_read_config_dword(dev, desc.pci.mask_pos, &desc.pci.msi_mask);
>  
> -	prop = MSI_PROP_PCI_MSI;
> -	if (entry->pci.msi_attrib.is_64)
> -		prop |= MSI_PROP_64BIT;
> -	msi_device_set_properties(&dev->dev, prop);
> +	ret = msi_add_msi_desc(&dev->dev, &desc);
> +	if (!ret) {
> +		prop = MSI_PROP_PCI_MSI;
> +		if (desc.pci.msi_attrib.is_64)
> +			prop |= MSI_PROP_64BIT;
> +		msi_device_set_properties(&dev->dev, prop);
> +	}
>  
> -	return entry;
> +	return ret;
>  }
>  
>  static int msi_verify_entries(struct pci_dev *dev)
> @@ -423,17 +429,14 @@ static int msi_capability_init(struct pc
>  		masks = irq_create_affinity_masks(nvec, affd);
>  
>  	msi_lock_descs(&dev->dev);
> -	entry = msi_setup_entry(dev, nvec, masks);
> -	if (!entry) {
> -		ret = -ENOMEM;
> +	ret = msi_setup_msi_desc(dev, nvec, masks);
> +	if (ret)
>  		goto unlock;
> -	}
>  
>  	/* All MSIs are unmasked by default; mask them all */
> +	entry = first_pci_msi_entry(dev);
>  	pci_msi_mask(entry, msi_multi_mask(entry));
>  
> -	list_add_tail(&entry->list, dev_to_msi_list(&dev->dev));
> -
>  	/* Configure MSI capability structure */
>  	ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSI);
>  	if (ret)
> @@ -482,49 +485,40 @@ static void __iomem *msix_map_region(str
>  	return ioremap(phys_addr, nr_entries * PCI_MSIX_ENTRY_SIZE);
>  }
>  
> -static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
> -			      struct msix_entry *entries, int nvec,
> -			      struct irq_affinity_desc *masks)
> +static int msix_setup_msi_descs(struct pci_dev *dev, void __iomem *base,
> +				struct msix_entry *entries, int nvec,
> +				struct irq_affinity_desc *masks)
>  {
> -	int i, vec_count = pci_msix_vec_count(dev);
> +	int ret = 0, i, vec_count = pci_msix_vec_count(dev);
>  	struct irq_affinity_desc *curmsk;
> -	struct msi_desc *entry;
> +	struct msi_desc desc;
>  	void __iomem *addr;
>  
> -	for (i = 0, curmsk = masks; i < nvec; i++) {
> -		entry = alloc_msi_entry(&dev->dev, 1, curmsk);
> -		if (!entry) {
> -			/* No enough memory. Don't try again */
> -			return -ENOMEM;
> -		}
> -
> -		entry->pci.msi_attrib.is_msix	= 1;
> -		entry->pci.msi_attrib.is_64	= 1;
> -
> -		if (entries)
> -			entry->msi_index = entries[i].entry;
> -		else
> -			entry->msi_index = i;
> -
> -		entry->pci.msi_attrib.is_virtual = entry->msi_index >= vec_count;
> -
> -		entry->pci.msi_attrib.can_mask	= !pci_msi_ignore_mask &&
> -						  !entry->pci.msi_attrib.is_virtual;
> -
> -		entry->pci.msi_attrib.default_irq	= dev->irq;
> -		entry->pci.mask_base			= base;
> +	memset(&desc, 0, sizeof(desc));
>  
> -		if (entry->pci.msi_attrib.can_mask) {
> -			addr = pci_msix_desc_addr(entry);
> -			entry->pci.msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
> +	desc.nvec_used			= 1;
> +	desc.pci.msi_attrib.is_msix	= 1;
> +	desc.pci.msi_attrib.is_64	= 1;
> +	desc.pci.msi_attrib.default_irq	= dev->irq;
> +	desc.pci.mask_base		= base;
> +
> +	for (i = 0, curmsk = masks; i < nvec; i++, curmsk++) {
> +		desc.msi_index = entries ? entries[i].entry : i;
> +		desc.affinity = masks ? curmsk : NULL;
> +		desc.pci.msi_attrib.is_virtual = desc.msi_index >= vec_count;
> +		desc.pci.msi_attrib.can_mask = !pci_msi_ignore_mask &&
> +					       !desc.pci.msi_attrib.is_virtual;
> +
> +		if (!desc.pci.msi_attrib.can_mask) {
> +			addr = pci_msix_desc_addr(&desc);
> +			desc.pci.msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
>  		}
>  
> -		list_add_tail(&entry->list, dev_to_msi_list(&dev->dev));
> -		if (masks)
> -			curmsk++;
> +		ret = msi_add_msi_desc(&dev->dev, &desc);
> +		if (ret)
> +			break;
>  	}
> -	msi_device_set_properties(&dev->dev, MSI_PROP_PCI_MSIX | MSI_PROP_64BIT);
> -	return 0;
> +	return ret;
>  }
>  
>  static void msix_update_entries(struct pci_dev *dev, struct msix_entry *entries)
> @@ -562,10 +556,12 @@ static int msix_setup_interrupts(struct
>  		masks = irq_create_affinity_masks(nvec, affd);
>  
>  	msi_lock_descs(&dev->dev);
> -	ret = msix_setup_entries(dev, base, entries, nvec, masks);
> +	ret = msix_setup_msi_descs(dev, base, entries, nvec, masks);
>  	if (ret)
>  		goto out_free;
>  
> +	msi_device_set_properties(&dev->dev, MSI_PROP_PCI_MSIX | MSI_PROP_64BIT);
> +
>  	ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX);
>  	if (ret)
>  		goto out_free;
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch V2 09/31] PCI/MSI: Let core code free MSI descriptors
  2021-12-06 22:51 ` [patch V2 09/31] PCI/MSI: Let core code free MSI descriptors Thomas Gleixner
@ 2021-12-07 21:07   ` Bjorn Helgaas
  0 siblings, 0 replies; 50+ messages in thread
From: Bjorn Helgaas @ 2021-12-07 21:07 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Marc Zygnier, Alex Williamson, Kevin Tian, Jason Gunthorpe,
	Megha Dey, Ashok Raj, linux-pci, Cedric Le Goater, xen-devel,
	Juergen Gross, Greg Kroah-Hartman, Niklas Schnelle, linux-s390,
	Heiko Carstens, Christian Borntraeger, Logan Gunthorpe,
	Jon Mason, Dave Jiang, Allen Hubbe, linux-ntb

On Mon, Dec 06, 2021 at 11:51:16PM +0100, Thomas Gleixner wrote:
> Set the domain info flag which tells the core code to free the MSI
> descriptors from msi_domain_free_irqs() and add an explicit call to the
> core function into the legacy code.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

> ---
>  drivers/pci/msi/irqdomain.c |    3 ++-
>  drivers/pci/msi/legacy.c    |    1 +
>  drivers/pci/msi/msi.c       |   14 --------------
>  3 files changed, 3 insertions(+), 15 deletions(-)
> 
> --- a/drivers/pci/msi/irqdomain.c
> +++ b/drivers/pci/msi/irqdomain.c
> @@ -171,7 +171,8 @@ struct irq_domain *pci_msi_create_irq_do
>  	if (info->flags & MSI_FLAG_USE_DEF_CHIP_OPS)
>  		pci_msi_domain_update_chip_ops(info);
>  
> -	info->flags |= MSI_FLAG_ACTIVATE_EARLY | MSI_FLAG_DEV_SYSFS;
> +	info->flags |= MSI_FLAG_ACTIVATE_EARLY | MSI_FLAG_DEV_SYSFS |
> +		       MSI_FLAG_FREE_MSI_DESCS;
>  	if (IS_ENABLED(CONFIG_GENERIC_IRQ_RESERVATION_MODE))
>  		info->flags |= MSI_FLAG_MUST_REACTIVATE;
>  
> --- a/drivers/pci/msi/legacy.c
> +++ b/drivers/pci/msi/legacy.c
> @@ -80,4 +80,5 @@ void pci_msi_legacy_teardown_msi_irqs(st
>  {
>  	msi_device_destroy_sysfs(&dev->dev);
>  	arch_teardown_msi_irqs(dev);
> +	msi_free_msi_descs(&dev->dev);
>  }
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -224,22 +224,8 @@ EXPORT_SYMBOL_GPL(pci_write_msi_msg);
>  
>  static void free_msi_irqs(struct pci_dev *dev)
>  {
> -	struct list_head *msi_list = dev_to_msi_list(&dev->dev);
> -	struct msi_desc *entry, *tmp;
> -	int i;
> -
> -	for_each_pci_msi_entry(entry, dev)
> -		if (entry->irq)
> -			for (i = 0; i < entry->nvec_used; i++)
> -				BUG_ON(irq_has_action(entry->irq + i));
> -
>  	pci_msi_teardown_msi_irqs(dev);
>  
> -	list_for_each_entry_safe(entry, tmp, msi_list, list) {
> -		list_del(&entry->list);
> -		free_msi_entry(entry);
> -	}
> -
>  	if (dev->msix_base) {
>  		iounmap(dev->msix_base);
>  		dev->msix_base = NULL;
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch V2 10/31] PCI/MSI: Use msi_on_each_desc()
  2021-12-06 22:51 ` [patch V2 10/31] PCI/MSI: Use msi_on_each_desc() Thomas Gleixner
@ 2021-12-07 21:07   ` Bjorn Helgaas
  0 siblings, 0 replies; 50+ messages in thread
From: Bjorn Helgaas @ 2021-12-07 21:07 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Marc Zygnier, Alex Williamson, Kevin Tian, Jason Gunthorpe,
	Megha Dey, Ashok Raj, linux-pci, Cedric Le Goater, xen-devel,
	Juergen Gross, Greg Kroah-Hartman, Niklas Schnelle, linux-s390,
	Heiko Carstens, Christian Borntraeger, Logan Gunthorpe,
	Jon Mason, Dave Jiang, Allen Hubbe, linux-ntb

On Mon, Dec 06, 2021 at 11:51:18PM +0100, Thomas Gleixner wrote:
> Use the new iterator functions which pave the way for dynamically extending
> MSI-X vectors.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

> ---
>  drivers/pci/msi/irqdomain.c |    4 ++--
>  drivers/pci/msi/legacy.c    |   19 ++++++++-----------
>  drivers/pci/msi/msi.c       |   30 ++++++++++++++----------------
>  3 files changed, 24 insertions(+), 29 deletions(-)
> 
> --- a/drivers/pci/msi/irqdomain.c
> +++ b/drivers/pci/msi/irqdomain.c
> @@ -83,7 +83,7 @@ static int pci_msi_domain_check_cap(stru
>  				    struct msi_domain_info *info,
>  				    struct device *dev)
>  {
> -	struct msi_desc *desc = first_pci_msi_entry(to_pci_dev(dev));
> +	struct msi_desc *desc = msi_first_desc(dev, MSI_DESC_ALL);
>  
>  	/* Special handling to support __pci_enable_msi_range() */
>  	if (pci_msi_desc_is_multi_msi(desc) &&
> @@ -98,7 +98,7 @@ static int pci_msi_domain_check_cap(stru
>  			unsigned int idx = 0;
>  
>  			/* Check for gaps in the entry indices */
> -			for_each_msi_entry(desc, dev) {
> +			msi_for_each_desc(desc, dev, MSI_DESC_ALL) {
>  				if (desc->msi_index != idx++)
>  					return -ENOTSUPP;
>  			}
> --- a/drivers/pci/msi/legacy.c
> +++ b/drivers/pci/msi/legacy.c
> @@ -28,7 +28,7 @@ int __weak arch_setup_msi_irqs(struct pc
>  	if (type == PCI_CAP_ID_MSI && nvec > 1)
>  		return 1;
>  
> -	for_each_pci_msi_entry(desc, dev) {
> +	msi_for_each_desc(desc, &dev->dev, MSI_DESC_NOTASSOCIATED) {
>  		ret = arch_setup_msi_irq(dev, desc);
>  		if (ret)
>  			return ret < 0 ? ret : -ENOSPC;
> @@ -42,27 +42,24 @@ void __weak arch_teardown_msi_irqs(struc
>  	struct msi_desc *desc;
>  	int i;
>  
> -	for_each_pci_msi_entry(desc, dev) {
> -		if (desc->irq) {
> -			for (i = 0; i < desc->nvec_used; i++)
> -				arch_teardown_msi_irq(desc->irq + i);
> -		}
> +	msi_for_each_desc(desc, &dev->dev, MSI_DESC_ASSOCIATED) {
> +		for (i = 0; i < desc->nvec_used; i++)
> +			arch_teardown_msi_irq(desc->irq + i);
>  	}
>  }
>  
>  static int pci_msi_setup_check_result(struct pci_dev *dev, int type, int ret)
>  {
> -	struct msi_desc *entry;
> +	struct msi_desc *desc;
>  	int avail = 0;
>  
>  	if (type != PCI_CAP_ID_MSIX || ret >= 0)
>  		return ret;
>  
>  	/* Scan the MSI descriptors for successfully allocated ones. */
> -	for_each_pci_msi_entry(entry, dev) {
> -		if (entry->irq != 0)
> -			avail++;
> -	}
> +	msi_for_each_desc(desc, &dev->dev, MSI_DESC_ASSOCIATED)
> +		avail++;
> +
>  	return avail ? avail : ret;
>  }
>  
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -299,7 +299,6 @@ static void __pci_restore_msix_state(str
>  
>  	if (!dev->msix_enabled)
>  		return;
> -	BUG_ON(list_empty(dev_to_msi_list(&dev->dev)));
>  
>  	/* route the table */
>  	pci_intx_for_msi(dev, 0);
> @@ -309,7 +308,7 @@ static void __pci_restore_msix_state(str
>  	write_msg = arch_restore_msi_irqs(dev);
>  
>  	msi_lock_descs(&dev->dev);
> -	for_each_pci_msi_entry(entry, dev) {
> +	msi_for_each_desc(entry, &dev->dev, MSI_DESC_ALL) {
>  		if (write_msg)
>  			__pci_write_msi_msg(entry, &entry->msg);
>  		pci_msix_write_vector_ctrl(entry, entry->pci.msix_ctrl);
> @@ -378,14 +377,14 @@ static int msi_verify_entries(struct pci
>  	if (!dev->no_64bit_msi)
>  		return 0;
>  
> -	for_each_pci_msi_entry(entry, dev) {
> +	msi_for_each_desc(entry, &dev->dev, MSI_DESC_ALL) {
>  		if (entry->msg.address_hi) {
>  			pci_err(dev, "arch assigned 64-bit MSI address %#x%08x but device only supports 32 bits\n",
>  				entry->msg.address_hi, entry->msg.address_lo);
> -			return -EIO;
> +			break;
>  		}
>  	}
> -	return 0;
> +	return !entry ? 0 : -EIO;
>  }
>  
>  /**
> @@ -418,7 +417,7 @@ static int msi_capability_init(struct pc
>  		goto unlock;
>  
>  	/* All MSIs are unmasked by default; mask them all */
> -	entry = first_pci_msi_entry(dev);
> +	entry = msi_first_desc(&dev->dev, MSI_DESC_ALL);
>  	pci_msi_mask(entry, msi_multi_mask(entry));
>  
>  	/* Configure MSI capability structure */
> @@ -508,11 +507,11 @@ static int msix_setup_msi_descs(struct p
>  
>  static void msix_update_entries(struct pci_dev *dev, struct msix_entry *entries)
>  {
> -	struct msi_desc *entry;
> +	struct msi_desc *desc;
>  
>  	if (entries) {
> -		for_each_pci_msi_entry(entry, dev) {
> -			entries->vector = entry->irq;
> +		msi_for_each_desc(desc, &dev->dev, MSI_DESC_ALL) {
> +			entries->vector = desc->irq;
>  			entries++;
>  		}
>  	}
> @@ -705,15 +704,14 @@ static void pci_msi_shutdown(struct pci_
>  	if (!pci_msi_enable || !dev || !dev->msi_enabled)
>  		return;
>  
> -	BUG_ON(list_empty(dev_to_msi_list(&dev->dev)));
> -	desc = first_pci_msi_entry(dev);
> -
>  	pci_msi_set_enable(dev, 0);
>  	pci_intx_for_msi(dev, 1);
>  	dev->msi_enabled = 0;
>  
>  	/* Return the device with MSI unmasked as initial states */
> -	pci_msi_unmask(desc, msi_multi_mask(desc));
> +	desc = msi_first_desc(&dev->dev, MSI_DESC_ALL);
> +	if (!WARN_ON_ONCE(!desc))
> +		pci_msi_unmask(desc, msi_multi_mask(desc));
>  
>  	/* Restore dev->irq to its default pin-assertion IRQ */
>  	dev->irq = desc->pci.msi_attrib.default_irq;
> @@ -789,7 +787,7 @@ static int __pci_enable_msix(struct pci_
>  
>  static void pci_msix_shutdown(struct pci_dev *dev)
>  {
> -	struct msi_desc *entry;
> +	struct msi_desc *desc;
>  
>  	if (!pci_msi_enable || !dev || !dev->msix_enabled)
>  		return;
> @@ -800,8 +798,8 @@ static void pci_msix_shutdown(struct pci
>  	}
>  
>  	/* Return the device with MSI-X masked as initial states */
> -	for_each_pci_msi_entry(entry, dev)
> -		pci_msix_mask(entry);
> +	msi_for_each_desc(desc, &dev->dev, MSI_DESC_ALL)
> +		pci_msix_mask(desc);
>  
>  	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
>  	pci_intx_for_msi(dev, 1);
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch V2 19/31] PCI: hv: Rework MSI handling
  2021-12-06 22:51 ` [patch V2 19/31] PCI: hv: Rework MSI handling Thomas Gleixner
@ 2021-12-07 21:08   ` Bjorn Helgaas
  0 siblings, 0 replies; 50+ messages in thread
From: Bjorn Helgaas @ 2021-12-07 21:08 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Marc Zygnier, Alex Williamson, Kevin Tian, Jason Gunthorpe,
	Megha Dey, Ashok Raj, linux-pci, Cedric Le Goater, xen-devel,
	Juergen Gross, Greg Kroah-Hartman, Niklas Schnelle, linux-s390,
	Heiko Carstens, Christian Borntraeger, Logan Gunthorpe,
	Jon Mason, Dave Jiang, Allen Hubbe, linux-ntb

On Mon, Dec 06, 2021 at 11:51:33PM +0100, Thomas Gleixner wrote:
> Replace the about to vanish iterators and make use of the filtering. Take
> the descriptor lock around the iterators.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

> ---
>  drivers/pci/controller/pci-hyperv.c |   15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
> 
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -3445,18 +3445,23 @@ static int hv_pci_suspend(struct hv_devi
>  
>  static int hv_pci_restore_msi_msg(struct pci_dev *pdev, void *arg)
>  {
> -	struct msi_desc *entry;
>  	struct irq_data *irq_data;
> +	struct msi_desc *entry;
> +	int ret = 0;
>  
> -	for_each_pci_msi_entry(entry, pdev) {
> +	msi_lock_descs(&pdev->dev);
> +	msi_for_each_desc(entry, &pdev->dev, MSI_DESC_ASSOCIATED) {
>  		irq_data = irq_get_irq_data(entry->irq);
> -		if (WARN_ON_ONCE(!irq_data))
> -			return -EINVAL;
> +		if (WARN_ON_ONCE(!irq_data)) {
> +			ret = -EINVAL;
> +			break;
> +		}
>  
>  		hv_compose_msi_msg(irq_data, &entry->msg);
>  	}
> +	msi_unlock_descs(&pdev->dev);
>  
> -	return 0;
> +	return ret;
>  }
>  
>  /*
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch V2 02/31] genirq/msi: Add mutex for MSI list protection
  2021-12-06 22:51 ` [patch V2 02/31] genirq/msi: Add mutex for MSI list protection Thomas Gleixner
@ 2021-12-09  0:47   ` Jason Gunthorpe
  2021-12-09 20:07     ` Thomas Gleixner
  0 siblings, 1 reply; 50+ messages in thread
From: Jason Gunthorpe @ 2021-12-09  0:47 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Megha Dey, Ashok Raj, linux-pci, Cedric Le Goater, xen-devel,
	Juergen Gross, Greg Kroah-Hartman, Niklas Schnelle, linux-s390,
	Heiko Carstens, Christian Borntraeger, Logan Gunthorpe,
	Jon Mason, Dave Jiang, Allen Hubbe, linux-ntb

On Mon, Dec 06, 2021 at 11:51:05PM +0100, Thomas Gleixner wrote:
> +++ b/kernel/irq/msi.c
> @@ -127,12 +127,37 @@ int msi_setup_device_data(struct device
>  		return -ENOMEM;
>  
>  	INIT_LIST_HEAD(&md->list);
> +	mutex_init(&md->mutex);
>  	dev->msi.data = md;
>  	devres_add(dev, md);
>  	return 0;
>  }
>  
>  /**
> + * msi_lock_descs - Lock the MSI descriptor storage of a device
> + * @dev:	Device to operate on
> + */
> +void msi_lock_descs(struct device *dev)
> +{
> +	if (WARN_ON_ONCE(!dev->msi.data))
> +		return;

Is this useful? Other places that call msi_lock_descs will continue on and deref
null dev->msi anyhow - is the dump from the WARN_ON that much better
than the oops from the null deref here:

> +	mutex_lock(&dev->msi.data->mutex);

?

Honestly, still a bit unclear on what the community consensus is for
using WARN_ON.

Jason

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3
  2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
                   ` (30 preceding siblings ...)
  2021-12-06 22:51 ` [patch V2 31/31] genirq/msi: Convert storage to xarray Thomas Gleixner
@ 2021-12-09  1:01 ` Jason Gunthorpe
  31 siblings, 0 replies; 50+ messages in thread
From: Jason Gunthorpe @ 2021-12-09  1:01 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Megha Dey, Ashok Raj, linux-pci, Cedric Le Goater, xen-devel,
	Juergen Gross, Greg Kroah-Hartman, Niklas Schnelle, linux-s390,
	Heiko Carstens, Christian Borntraeger, Logan Gunthorpe,
	Jon Mason, Dave Jiang, Allen Hubbe, linux-ntb

On Mon, Dec 06, 2021 at 11:51:02PM +0100, Thomas Gleixner wrote:
> This is the third part of [PCI]MSI refactoring which aims to provide the
> ability of expanding MSI-X vectors after enabling MSI-X.

I read through this and didn't have any substantive remarks

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch V2 02/31] genirq/msi: Add mutex for MSI list protection
  2021-12-09  0:47   ` Jason Gunthorpe
@ 2021-12-09 20:07     ` Thomas Gleixner
  0 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-09 20:07 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: LKML, Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Megha Dey, Ashok Raj, linux-pci, Cedric Le Goater, xen-devel,
	Juergen Gross, Greg Kroah-Hartman, Niklas Schnelle, linux-s390,
	Heiko Carstens, Christian Borntraeger, Logan Gunthorpe,
	Jon Mason, Dave Jiang, Allen Hubbe, linux-ntb

On Wed, Dec 08 2021 at 20:47, Jason Gunthorpe wrote:
> On Mon, Dec 06, 2021 at 11:51:05PM +0100, Thomas Gleixner wrote:
>> +++ b/kernel/irq/msi.c
>> @@ -127,12 +127,37 @@ int msi_setup_device_data(struct device
>>  		return -ENOMEM;
>>  
>>  	INIT_LIST_HEAD(&md->list);
>> +	mutex_init(&md->mutex);
>>  	dev->msi.data = md;
>>  	devres_add(dev, md);
>>  	return 0;
>>  }
>>  
>>  /**
>> + * msi_lock_descs - Lock the MSI descriptor storage of a device
>> + * @dev:	Device to operate on
>> + */
>> +void msi_lock_descs(struct device *dev)
>> +{
>> +	if (WARN_ON_ONCE(!dev->msi.data))
>> +		return;
>
> Is this useful? Other places that call msi_lock_descs will continue on and deref
> null dev->msi anyhow - is the dump from the WARN_ON that much better
> than the oops from the null deref here:
>
>> +	mutex_lock(&dev->msi.data->mutex);

I put it there for paranoia reasons and forgot to revist it later. In
that case yes, it's of questionable value.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch V2 21/31] soc: ti: ti_sci_inta_msi: Rework MSI descriptor allocation
  2021-12-06 22:51 ` [patch V2 21/31] soc: ti: ti_sci_inta_msi: Rework MSI descriptor allocation Thomas Gleixner
@ 2021-12-15 20:50   ` Thomas Gleixner
  0 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2021-12-15 20:50 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb, Nishanth Menon

On Mon, Dec 06 2021 at 23:51, Thomas Gleixner wrote:
>
> No functional change intended.

Famous last words.

>  static int ti_sci_inta_msi_alloc_descs(struct device *dev,
>  				       struct ti_sci_resource *res)
>  {
> -	struct msi_desc *msi_desc;
> +	struct msi_desc msi_desc;
>  	int set, i, count = 0;
>  
> +	memset(&msi_desc, 0, sizeof(msi_desc));

This fails to initialize msi_desc.nvec_used which makes the subsequent
interrupt allocation fail. Delta fix below.

Thanks,

        tglx
---
--- a/drivers/soc/ti/ti_sci_inta_msi.c
+++ b/drivers/soc/ti/ti_sci_inta_msi.c
@@ -68,6 +68,7 @@ static int ti_sci_inta_msi_alloc_descs(s
 	int set, i, count = 0;
 
 	memset(&msi_desc, 0, sizeof(msi_desc));
+	msi_desc.nvec_used = 1;
 
 	for (set = 0; set < res->sets; set++) {
 		for (i = 0; i < res->desc[set].num; i++, count++) {

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch] genirq/msi: Populate sysfs entry only once
  2021-12-06 22:51 ` [patch V2 30/31] genirq/msi: Simplify sysfs handling Thomas Gleixner
@ 2022-01-10 18:12   ` Thomas Gleixner
  2022-01-10 18:15     ` Borislav Petkov
                       ` (3 more replies)
  0 siblings, 4 replies; 50+ messages in thread
From: Thomas Gleixner @ 2022-01-10 18:12 UTC (permalink / raw)
  To: LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

The MSI entries for multi-MSI are populated en bloc for the MSI descriptor,
but the current code invokes the population inside the per interrupt loop
which triggers a warning in the sysfs code and causes the interrupt
allocation to fail.

Move it outside of the loop so it works correctly for single and multi-MSI.

Fixes: bf5e758f02fc ("genirq/msi: Simplify sysfs handling")
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/irq/msi.c |   11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -887,12 +887,11 @@ int __msi_domain_alloc_irqs(struct irq_d
 			ret = msi_init_virq(domain, virq + i, vflags);
 			if (ret)
 				return ret;
-
-			if (info->flags & MSI_FLAG_DEV_SYSFS) {
-				ret = msi_sysfs_populate_desc(dev, desc);
-				if (ret)
-					return ret;
-			}
+		}
+		if (info->flags & MSI_FLAG_DEV_SYSFS) {
+			ret = msi_sysfs_populate_desc(dev, desc);
+			if (ret)
+				return ret;
 		}
 		allocated++;
 	}

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch] genirq/msi: Populate sysfs entry only once
  2022-01-10 18:12   ` [patch] genirq/msi: Populate sysfs entry only once Thomas Gleixner
@ 2022-01-10 18:15     ` Borislav Petkov
  2022-01-11  8:57     ` Greg Kroah-Hartman
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 50+ messages in thread
From: Borislav Petkov @ 2022-01-10 18:15 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

On Mon, Jan 10, 2022 at 07:12:45PM +0100, Thomas Gleixner wrote:
> The MSI entries for multi-MSI are populated en bloc for the MSI descriptor,
> but the current code invokes the population inside the per interrupt loop
> which triggers a warning in the sysfs code and causes the interrupt
> allocation to fail.
> 
> Move it outside of the loop so it works correctly for single and multi-MSI.
> 
> Fixes: bf5e758f02fc ("genirq/msi: Simplify sysfs handling")
> Reported-by: Borislav Petkov <bp@alien8.de>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  kernel/irq/msi.c |   11 +++++------
>  1 file changed, 5 insertions(+), 6 deletions(-)
> 
> --- a/kernel/irq/msi.c
> +++ b/kernel/irq/msi.c
> @@ -887,12 +887,11 @@ int __msi_domain_alloc_irqs(struct irq_d
>  			ret = msi_init_virq(domain, virq + i, vflags);
>  			if (ret)
>  				return ret;
> -
> -			if (info->flags & MSI_FLAG_DEV_SYSFS) {
> -				ret = msi_sysfs_populate_desc(dev, desc);
> -				if (ret)
> -					return ret;
> -			}
> +		}
> +		if (info->flags & MSI_FLAG_DEV_SYSFS) {
> +			ret = msi_sysfs_populate_desc(dev, desc);
> +			if (ret)
> +				return ret;
>  		}
>  		allocated++;
>  	}

Yap, works.

Tested-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch] genirq/msi: Populate sysfs entry only once
  2022-01-10 18:12   ` [patch] genirq/msi: Populate sysfs entry only once Thomas Gleixner
  2022-01-10 18:15     ` Borislav Petkov
@ 2022-01-11  8:57     ` Greg Kroah-Hartman
  2022-01-11  9:02     ` Cédric Le Goater
  2022-01-12  0:05     ` Kunihiko Hayashi
  3 siblings, 0 replies; 50+ messages in thread
From: Greg Kroah-Hartman @ 2022-01-11  8:57 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Niklas Schnelle,
	linux-s390, Heiko Carstens, Christian Borntraeger,
	Logan Gunthorpe, Jon Mason, Dave Jiang, Allen Hubbe, linux-ntb

On Mon, Jan 10, 2022 at 07:12:45PM +0100, Thomas Gleixner wrote:
> The MSI entries for multi-MSI are populated en bloc for the MSI descriptor,
> but the current code invokes the population inside the per interrupt loop
> which triggers a warning in the sysfs code and causes the interrupt
> allocation to fail.
> 
> Move it outside of the loop so it works correctly for single and multi-MSI.
> 
> Fixes: bf5e758f02fc ("genirq/msi: Simplify sysfs handling")
> Reported-by: Borislav Petkov <bp@alien8.de>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  kernel/irq/msi.c |   11 +++++------
>  1 file changed, 5 insertions(+), 6 deletions(-)

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch] genirq/msi: Populate sysfs entry only once
  2022-01-10 18:12   ` [patch] genirq/msi: Populate sysfs entry only once Thomas Gleixner
  2022-01-10 18:15     ` Borislav Petkov
  2022-01-11  8:57     ` Greg Kroah-Hartman
@ 2022-01-11  9:02     ` Cédric Le Goater
  2022-01-12  0:05     ` Kunihiko Hayashi
  3 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2022-01-11  9:02 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci, xen-devel,
	Juergen Gross, Greg Kroah-Hartman, Niklas Schnelle, linux-s390,
	Heiko Carstens, Christian Borntraeger, Logan Gunthorpe,
	Jon Mason, Dave Jiang, Allen Hubbe, linux-ntb

On 1/10/22 19:12, Thomas Gleixner wrote:
> The MSI entries for multi-MSI are populated en bloc for the MSI descriptor,
> but the current code invokes the population inside the per interrupt loop
> which triggers a warning in the sysfs code and causes the interrupt
> allocation to fail.
> 
> Move it outside of the loop so it works correctly for single and multi-MSI.
> 
> Fixes: bf5e758f02fc ("genirq/msi: Simplify sysfs handling")
> Reported-by: Borislav Petkov <bp@alien8.de>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Reviewed-by: Cédric Le Goater <clg@kaod.org>

Thanks,

C.

> ---
>   kernel/irq/msi.c |   11 +++++------
>   1 file changed, 5 insertions(+), 6 deletions(-)
> 
> --- a/kernel/irq/msi.c
> +++ b/kernel/irq/msi.c
> @@ -887,12 +887,11 @@ int __msi_domain_alloc_irqs(struct irq_d
>   			ret = msi_init_virq(domain, virq + i, vflags);
>   			if (ret)
>   				return ret;
> -
> -			if (info->flags & MSI_FLAG_DEV_SYSFS) {
> -				ret = msi_sysfs_populate_desc(dev, desc);
> -				if (ret)
> -					return ret;
> -			}
> +		}
> +		if (info->flags & MSI_FLAG_DEV_SYSFS) {
> +			ret = msi_sysfs_populate_desc(dev, desc);
> +			if (ret)
> +				return ret;
>   		}
>   		allocated++;
>   	}
> 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch] genirq/msi: Populate sysfs entry only once
  2022-01-10 18:12   ` [patch] genirq/msi: Populate sysfs entry only once Thomas Gleixner
                       ` (2 preceding siblings ...)
  2022-01-11  9:02     ` Cédric Le Goater
@ 2022-01-12  0:05     ` Kunihiko Hayashi
  2022-01-18 23:59       ` Thomas Gleixner
  3 siblings, 1 reply; 50+ messages in thread
From: Kunihiko Hayashi @ 2022-01-12  0:05 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Hi Thomas,

Is this fix the same as below?
https://marc.info/?l=linux-kernel&m=164061119923119&w=2

On 2022/01/11 3:12, Thomas Gleixner wrote:
> The MSI entries for multi-MSI are populated en bloc for the MSI
> descriptor,
> but the current code invokes the population inside the per interrupt loop
> which triggers a warning in the sysfs code and causes the interrupt
> allocation to fail.
> 
> Move it outside of the loop so it works correctly for single and
> multi-MSI.
> 
> Fixes: bf5e758f02fc ("genirq/msi: Simplify sysfs handling")
> Reported-by: Borislav Petkov <bp@alien8.de>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>   kernel/irq/msi.c |   11 +++++------
>   1 file changed, 5 insertions(+), 6 deletions(-)
> 
> --- a/kernel/irq/msi.c
> +++ b/kernel/irq/msi.c
> @@ -887,12 +887,11 @@ int __msi_domain_alloc_irqs(struct irq_d
>   			ret = msi_init_virq(domain, virq + i, vflags);
>   			if (ret)
>   				return ret;
> -
> -			if (info->flags & MSI_FLAG_DEV_SYSFS) {
> -				ret = msi_sysfs_populate_desc(dev, desc);
> -				if (ret)
> -					return ret;
> -			}
> +		}
> +		if (info->flags & MSI_FLAG_DEV_SYSFS) {
> +			ret = msi_sysfs_populate_desc(dev, desc);
> +			if (ret)
> +				return ret;
>   		}
>   		allocated++;
>   	}
> 

---
Best Regards
Kunihiko Hayashi

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch] genirq/msi: Populate sysfs entry only once
  2022-01-12  0:05     ` Kunihiko Hayashi
@ 2022-01-18 23:59       ` Thomas Gleixner
  2022-01-19  8:45         ` Kunihiko Hayashi
  0 siblings, 1 reply; 50+ messages in thread
From: Thomas Gleixner @ 2022-01-18 23:59 UTC (permalink / raw)
  To: Kunihiko Hayashi, LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Kunihiko,

On Wed, Jan 12 2022 at 09:05, Kunihiko Hayashi wrote:
> Is this fix the same as below?
> https://marc.info/?l=linux-kernel&m=164061119923119&w=2

pretty much the same, but I missed that patch. I was off for 2+ weeks
and on return Boris poked me about this issue and I fixed it. Then I
went ahead and marked all vacation mail read as I always do :)

So sorry for not noticing that patch.

Thanks,

        Thomas

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch] genirq/msi: Populate sysfs entry only once
  2022-01-18 23:59       ` Thomas Gleixner
@ 2022-01-19  8:45         ` Kunihiko Hayashi
  0 siblings, 0 replies; 50+ messages in thread
From: Kunihiko Hayashi @ 2022-01-19  8:45 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Bjorn Helgaas, Marc Zygnier, Alex Williamson, Kevin Tian,
	Jason Gunthorpe, Megha Dey, Ashok Raj, linux-pci,
	Cedric Le Goater, xen-devel, Juergen Gross, Greg Kroah-Hartman,
	Niklas Schnelle, linux-s390, Heiko Carstens,
	Christian Borntraeger, Logan Gunthorpe, Jon Mason, Dave Jiang,
	Allen Hubbe, linux-ntb

Hi Thomas,

On 2022/01/19 8:59, Thomas Gleixner wrote:
> Kunihiko,
> 
> On Wed, Jan 12 2022 at 09:05, Kunihiko Hayashi wrote:
>> Is this fix the same as below?
>> https://marc.info/?l=linux-kernel&m=164061119923119&w=2
> 
> pretty much the same, but I missed that patch. I was off for 2+ weeks
> and on return Boris poked me about this issue and I fixed it. Then I
> went ahead and marked all vacation mail read as I always do :)
> 
> So sorry for not noticing that patch.

No problem. If this issue wansn't resolved, the PCIe controller wouldn't
work properly, so I'm relieved to solve the issue and get your response.

Thank you,

---
Best Regards
Kunihiko Hayashi

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2022-01-19  8:45 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-06 22:51 [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Thomas Gleixner
2021-12-06 22:51 ` [patch V2 01/31] genirq/msi: Move descriptor list to struct msi_device_data Thomas Gleixner
2021-12-06 22:51 ` [patch V2 02/31] genirq/msi: Add mutex for MSI list protection Thomas Gleixner
2021-12-09  0:47   ` Jason Gunthorpe
2021-12-09 20:07     ` Thomas Gleixner
2021-12-06 22:51 ` [patch V2 03/31] genirq/msi: Provide msi_domain_alloc/free_irqs_descs_locked() Thomas Gleixner
2021-12-06 22:51 ` [patch V2 04/31] genirq/msi: Provide a set of advanced MSI accessors and iterators Thomas Gleixner
2021-12-06 22:51 ` [patch V2 05/31] genirq/msi: Provide msi_alloc_msi_desc() and a simple allocator Thomas Gleixner
2021-12-06 22:51 ` [patch V2 06/31] genirq/msi: Provide domain flags to allocate/free MSI descriptors automatically Thomas Gleixner
2021-12-06 22:51 ` [patch V2 07/31] PCI/MSI: Protect MSI operations Thomas Gleixner
2021-12-07 21:06   ` Bjorn Helgaas
2021-12-06 22:51 ` [patch V2 08/31] PCI/MSI: Use msi_add_msi_desc() Thomas Gleixner
2021-12-07 21:07   ` Bjorn Helgaas
2021-12-06 22:51 ` [patch V2 09/31] PCI/MSI: Let core code free MSI descriptors Thomas Gleixner
2021-12-07 21:07   ` Bjorn Helgaas
2021-12-06 22:51 ` [patch V2 10/31] PCI/MSI: Use msi_on_each_desc() Thomas Gleixner
2021-12-07 21:07   ` Bjorn Helgaas
2021-12-06 22:51 ` [patch V2 11/31] x86/pci/xen: Use msi_for_each_desc() Thomas Gleixner
2021-12-06 22:51 ` [patch V2 12/31] xen/pcifront: Rework MSI handling Thomas Gleixner
2021-12-06 22:51 ` [patch V2 13/31] s390/pci: Rework MSI descriptor walk Thomas Gleixner
2021-12-06 22:51 ` [patch V2 14/31] powerpc/4xx/hsta: Rework MSI handling Thomas Gleixner
2021-12-06 22:51 ` [patch V2 15/31] powerpc/cell/axon_msi: Convert to msi_on_each_desc() Thomas Gleixner
2021-12-06 22:51 ` [patch V2 16/31] powerpc/pasemi/msi: Convert to msi_on_each_dec() Thomas Gleixner
2021-12-06 22:51 ` [patch V2 17/31] powerpc/fsl_msi: Use msi_for_each_desc() Thomas Gleixner
2021-12-06 22:51 ` [patch V2 18/31] powerpc/mpic_u3msi: Use msi_for_each-desc() Thomas Gleixner
2021-12-06 22:51 ` [patch V2 19/31] PCI: hv: Rework MSI handling Thomas Gleixner
2021-12-07 21:08   ` Bjorn Helgaas
2021-12-06 22:51 ` [patch V2 20/31] NTB/msi: Convert to msi_on_each_desc() Thomas Gleixner
2021-12-06 22:51 ` [patch V2 21/31] soc: ti: ti_sci_inta_msi: Rework MSI descriptor allocation Thomas Gleixner
2021-12-15 20:50   ` Thomas Gleixner
2021-12-06 22:51 ` [patch V2 22/31] soc: ti: ti_sci_inta_msi: Remove ti_sci_inta_msi_domain_free_irqs() Thomas Gleixner
2021-12-06 22:51 ` [patch V2 23/31] bus: fsl-mc-msi: Simplify MSI descriptor handling Thomas Gleixner
2021-12-06 22:51 ` [patch V2 24/31] platform-msi: Let core code handle MSI descriptors Thomas Gleixner
2021-12-06 22:51 ` [patch V2 25/31] platform-msi: Simplify platform device MSI code Thomas Gleixner
2021-12-06 22:51 ` [patch V2 26/31] genirq/msi: Make interrupt allocation less convoluted Thomas Gleixner
2021-12-06 22:51 ` [patch V2 27/31] genirq/msi: Convert to new functions Thomas Gleixner
2021-12-06 22:51 ` [patch V2 28/31] genirq/msi: Mop up old interfaces Thomas Gleixner
2021-12-06 22:51 ` [patch V2 29/31] genirq/msi: Add abuse prevention comment to msi header Thomas Gleixner
2021-12-07  8:21   ` Greg Kroah-Hartman
2021-12-07 12:46     ` Thomas Gleixner
2021-12-06 22:51 ` [patch V2 30/31] genirq/msi: Simplify sysfs handling Thomas Gleixner
2022-01-10 18:12   ` [patch] genirq/msi: Populate sysfs entry only once Thomas Gleixner
2022-01-10 18:15     ` Borislav Petkov
2022-01-11  8:57     ` Greg Kroah-Hartman
2022-01-11  9:02     ` Cédric Le Goater
2022-01-12  0:05     ` Kunihiko Hayashi
2022-01-18 23:59       ` Thomas Gleixner
2022-01-19  8:45         ` Kunihiko Hayashi
2021-12-06 22:51 ` [patch V2 31/31] genirq/msi: Convert storage to xarray Thomas Gleixner
2021-12-09  1:01 ` [patch V2 00/31] genirq/msi, PCI/MSI: Spring cleaning - Part 3 Jason Gunthorpe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).