linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies
@ 2021-07-29 21:51 Thomas Gleixner
  2021-07-29 21:51 ` [patch V2 01/19] PCI/MSI: Enable and mask MSI-X early Thomas Gleixner
                   ` (19 more replies)
  0 siblings, 20 replies; 43+ messages in thread
From: Thomas Gleixner @ 2021-07-29 21:51 UTC (permalink / raw)
  To: LKML
  Cc: Alex Williamson, Raj, Ashok, David S. Miller, Bjorn Helgaas,
	linux-pci, Kevin Tian, Marc Zyngier, Ingo Molnar, x86,
	linux-s390, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

A recent discussion about the PCI/MSI management for virtio unearthed a
violation of the MSI-X specification vs. writing the MSI-X message: under
certain circumstances the entry is written without being masked.

While looking at that and the related violation of the x86 non-remapped
interrupt affinity mechanism a few other issues were discovered by
inspection.

The following series addresses these.

Note this does not fix the virtio issue, but while staring at the above
problems I came up with a plan to address this. I'm still trying to
convince myself that I can get away without sprinkling locking all over the
place, so don't hold your breath that this will materialize tomorrow.

The series is also available from git:

    git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git irq/msi

V1 can be found here:

   https://lore.kernel.org/r/20210721191126.274946280@linutronix.de

Changes vs. V1:

  - Identified and addressed more inconsistencies, especially the lack of
    serialization for multi-MSI masking
    
  - Removed the extra vector masking in S390 

  - Addressed review comments and picked up tags where applicable

  - Clean up of the naming of msi_desc::masked as discussed in the V1
    thread

  - Consolidation of the mask/unmask functions

Thanks,

	tglx
---
 arch/s390/pci/pci_irq.c        |    4 
 arch/x86/kernel/apic/io_apic.c |    6 
 arch/x86/kernel/apic/msi.c     |   11 +
 arch/x86/kernel/hpet.c         |    2 
 drivers/base/core.c            |    1 
 drivers/pci/msi.c              |  274 ++++++++++++++++++++++-------------------
 include/linux/device.h         |    1 
 include/linux/irq.h            |    2 
 include/linux/msi.h            |   10 -
 kernel/irq/chip.c              |    5 
 10 files changed, 178 insertions(+), 138 deletions(-)



^ permalink raw reply	[flat|nested] 43+ messages in thread

* [patch V2 01/19] PCI/MSI: Enable and mask MSI-X early
  2021-07-29 21:51 [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Thomas Gleixner
@ 2021-07-29 21:51 ` Thomas Gleixner
  2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  2021-07-29 21:51 ` [patch V2 02/19] PCI/MSI: Mask all unused MSI-X entries Thomas Gleixner
                   ` (18 subsequent siblings)
  19 siblings, 1 reply; 43+ messages in thread
From: Thomas Gleixner @ 2021-07-29 21:51 UTC (permalink / raw)
  To: LKML
  Cc: Alex Williamson, Raj, Ashok, David S. Miller, Bjorn Helgaas,
	linux-pci, Kevin Tian, Marc Zyngier, Ingo Molnar, x86,
	linux-s390, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

From: Thomas Gleixner <tglx@linutronix.de>

The ordering of MSI-X enable in hardware is dysfunctional:

 1) MSI-X is disabled in the control register
 2) Various setup functions
 3) pci_msi_setup_msi_irqs() is invoked which ends up accessing
    the MSI-X table entries
 4) MSI-X is enabled and masked in the control register with the
    comment that enabling is required for some hardware to access
    the MSI-X table

Step #4 obviously contradicts #3. The history of this is an issue with the
NIU hardware. When #4 was introduced the table access actually happened in
msix_program_entries() which was invoked after enabling and masking MSI-X.

This was changed in commit d71d6432e105 ("PCI/MSI: Kill redundant call of
irq_set_msi_desc() for MSI-X interrupts") which removed the table write
from msix_program_entries().

Interestingly enough nobody noticed and either NIU still works or it did
not get any testing with a kernel 3.19 or later.

Nevertheless this is inconsistent and there is no reason why MSI-X can't be
enabled and masked in the control register early on, i.e. move step #4
above to step #1. This preserves the NIU workaround and has no side effects
on other hardware.

Fixes: d71d6432e105 ("PCI/MSI: Kill redundant call of irq_set_msi_desc() for MSI-X interrupts")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/msi.c |   28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -772,18 +772,25 @@ static int msix_capability_init(struct p
 	u16 control;
 	void __iomem *base;
 
-	/* Ensure MSI-X is disabled while it is set up */
-	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
+	/*
+	 * Some devices require MSI-X to be enabled before the MSI-X
+	 * registers can be accessed.  Mask all the vectors to prevent
+	 * interrupts coming in before they're fully set up.
+	 */
+	pci_msix_clear_and_set_ctrl(dev, 0, PCI_MSIX_FLAGS_MASKALL |
+				    PCI_MSIX_FLAGS_ENABLE);
 
 	pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, &control);
 	/* Request & Map MSI-X table region */
 	base = msix_map_region(dev, msix_table_size(control));
-	if (!base)
-		return -ENOMEM;
+	if (!base) {
+		ret = -ENOMEM;
+		goto out_disable;
+	}
 
 	ret = msix_setup_entries(dev, base, entries, nvec, affd);
 	if (ret)
-		return ret;
+		goto out_disable;
 
 	ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX);
 	if (ret)
@@ -794,14 +801,6 @@ static int msix_capability_init(struct p
 	if (ret)
 		goto out_free;
 
-	/*
-	 * Some devices require MSI-X to be enabled before we can touch the
-	 * MSI-X registers.  We need to mask all the vectors to prevent
-	 * interrupts coming in before they're fully set up.
-	 */
-	pci_msix_clear_and_set_ctrl(dev, 0,
-				PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE);
-
 	msix_program_entries(dev, entries);
 
 	ret = populate_msi_sysfs(dev);
@@ -836,6 +835,9 @@ static int msix_capability_init(struct p
 out_free:
 	free_msi_irqs(dev);
 
+out_disable:
+	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
+
 	return ret;
 }
 



^ permalink raw reply	[flat|nested] 43+ messages in thread

* [patch V2 02/19] PCI/MSI: Mask all unused MSI-X entries
  2021-07-29 21:51 [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Thomas Gleixner
  2021-07-29 21:51 ` [patch V2 01/19] PCI/MSI: Enable and mask MSI-X early Thomas Gleixner
@ 2021-07-29 21:51 ` Thomas Gleixner
  2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  2021-07-29 21:51 ` [patch V2 03/19] PCI/MSI: Enforce that MSI-X table entry is masked for update Thomas Gleixner
                   ` (17 subsequent siblings)
  19 siblings, 1 reply; 43+ messages in thread
From: Thomas Gleixner @ 2021-07-29 21:51 UTC (permalink / raw)
  To: LKML
  Cc: Alex Williamson, Raj, Ashok, David S. Miller, Bjorn Helgaas,
	linux-pci, Kevin Tian, Marc Zyngier, Ingo Molnar, x86,
	linux-s390, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

From: Thomas Gleixner <tglx@linutronix.de>

When MSI-X is enabled the ordering of calls is:

  msix_map_region();
  msix_setup_entries();
  pci_msi_setup_msi_irqs();
  msix_program_entries();

This has a few interesting issues:

 1) msix_setup_entries() allocates the MSI descriptors and initializes them
    except for the msi_desc:masked member which is left zero initialized.

 2) pci_msi_setup_msi_irqs() allocates the interrupt descriptors and sets
    up the MSI interrupts which ends up in pci_write_msi_msg() unless the
    interrupt chip provides its own irq_write_msi_msg() function.

 3) msix_program_entries() does not do what the name suggests. It solely
    updates the entries array (if not NULL) and initializes the masked
    member for each MSI descriptor by reading the hardware state and then
    masks the entry.

Obviously this has some issues:

 1) The uninitialized masked member of msi_desc prevents the enforcement
    of masking the entry in pci_write_msi_msg() depending on the cached
    masked bit. Aside of that half initialized data is a NONO in general

 2) msix_program_entries() only ensures that the actually allocated entries
    are masked. This is wrong as experimentation with crash testing and
    crash kernel kexec has shown.

    This limited testing unearthed that when the production kernel had more
    entries in use and unmasked when it crashed and the crash kernel
    allocated a smaller amount of entries, then a full scan of all entries
    found unmasked entries which were in use in the production kernel.

    This is obviously a device or emulation issue as the device reset
    should mask all MSI-X table entries, but obviously that's just part
    of the paper specification.

Cure this by:

 1) Masking all table entries in hardware
 2) Initializing msi_desc::masked in msix_setup_entries()
 3) Removing the mask dance in msix_program_entries()
 4) Renaming msix_program_entries() to msix_update_entries() to
    reflect the purpose of that function.

As the masking of unused entries has never been done the Fixes tag refers
to a commit in:
   git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git

Fixes: f036d4ea5fa7 ("[PATCH] ia32 Message Signalled Interrupt support")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/msi.c |   45 +++++++++++++++++++++++++++------------------
 1 file changed, 27 insertions(+), 18 deletions(-)

--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -691,6 +691,7 @@ static int msix_setup_entries(struct pci
 {
 	struct irq_affinity_desc *curmsk, *masks = NULL;
 	struct msi_desc *entry;
+	void __iomem *addr;
 	int ret, i;
 	int vec_count = pci_msix_vec_count(dev);
 
@@ -711,6 +712,7 @@ static int msix_setup_entries(struct pci
 
 		entry->msi_attrib.is_msix	= 1;
 		entry->msi_attrib.is_64		= 1;
+
 		if (entries)
 			entry->msi_attrib.entry_nr = entries[i].entry;
 		else
@@ -722,6 +724,10 @@ static int msix_setup_entries(struct pci
 		entry->msi_attrib.default_irq	= dev->irq;
 		entry->mask_base		= base;
 
+		addr = pci_msix_desc_addr(entry);
+		if (addr)
+			entry->masked = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
+
 		list_add_tail(&entry->list, dev_to_msi_list(&dev->dev));
 		if (masks)
 			curmsk++;
@@ -732,26 +738,25 @@ static int msix_setup_entries(struct pci
 	return ret;
 }
 
-static void msix_program_entries(struct pci_dev *dev,
-				 struct msix_entry *entries)
+static void msix_update_entries(struct pci_dev *dev, struct msix_entry *entries)
 {
 	struct msi_desc *entry;
-	int i = 0;
-	void __iomem *desc_addr;
 
 	for_each_pci_msi_entry(entry, dev) {
-		if (entries)
-			entries[i++].vector = entry->irq;
+		if (entries) {
+			entries->vector = entry->irq;
+			entries++;
+		}
+	}
+}
 
-		desc_addr = pci_msix_desc_addr(entry);
-		if (desc_addr)
-			entry->masked = readl(desc_addr +
-					      PCI_MSIX_ENTRY_VECTOR_CTRL);
-		else
-			entry->masked = 0;
+static void msix_mask_all(void __iomem *base, int tsize)
+{
+	u32 ctrl = PCI_MSIX_ENTRY_CTRL_MASKBIT;
+	int i;
 
-		msix_mask_irq(entry, 1);
-	}
+	for (i = 0; i < tsize; i++, base += PCI_MSIX_ENTRY_SIZE)
+		writel(ctrl, base + PCI_MSIX_ENTRY_VECTOR_CTRL);
 }
 
 /**
@@ -768,9 +773,9 @@ static void msix_program_entries(struct
 static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries,
 				int nvec, struct irq_affinity *affd)
 {
-	int ret;
-	u16 control;
 	void __iomem *base;
+	int ret, tsize;
+	u16 control;
 
 	/*
 	 * Some devices require MSI-X to be enabled before the MSI-X
@@ -782,12 +787,16 @@ static int msix_capability_init(struct p
 
 	pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, &control);
 	/* Request & Map MSI-X table region */
-	base = msix_map_region(dev, msix_table_size(control));
+	tsize = msix_table_size(control);
+	base = msix_map_region(dev, tsize);
 	if (!base) {
 		ret = -ENOMEM;
 		goto out_disable;
 	}
 
+	/* Ensure that all table entries are masked. */
+	msix_mask_all(base, tsize);
+
 	ret = msix_setup_entries(dev, base, entries, nvec, affd);
 	if (ret)
 		goto out_disable;
@@ -801,7 +810,7 @@ static int msix_capability_init(struct p
 	if (ret)
 		goto out_free;
 
-	msix_program_entries(dev, entries);
+	msix_update_entries(dev, entries);
 
 	ret = populate_msi_sysfs(dev);
 	if (ret)


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [patch V2 03/19] PCI/MSI: Enforce that MSI-X table entry is masked for update
  2021-07-29 21:51 [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Thomas Gleixner
  2021-07-29 21:51 ` [patch V2 01/19] PCI/MSI: Enable and mask MSI-X early Thomas Gleixner
  2021-07-29 21:51 ` [patch V2 02/19] PCI/MSI: Mask all unused MSI-X entries Thomas Gleixner
@ 2021-07-29 21:51 ` Thomas Gleixner
  2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  2021-07-29 21:51 ` [patch V2 04/19] PCI/MSI: Enforce MSI[X] entry updates to be visible Thomas Gleixner
                   ` (16 subsequent siblings)
  19 siblings, 1 reply; 43+ messages in thread
From: Thomas Gleixner @ 2021-07-29 21:51 UTC (permalink / raw)
  To: LKML
  Cc: Alex Williamson, Raj, Ashok, David S. Miller, Bjorn Helgaas,
	linux-pci, Kevin Tian, Marc Zyngier, Ingo Molnar, x86,
	linux-s390, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

From: Thomas Gleixner <tglx@linutronix.de>

The specification (PCIe r5.0, sec 6.1.4.5) states:

    For MSI-X, a function is permitted to cache Address and Data values
    from unmasked MSI-X Table entries. However, anytime software unmasks a
    currently masked MSI-X Table entry either by clearing its Mask bit or
    by clearing the Function Mask bit, the function must update any Address
    or Data values that it cached from that entry. If software changes the
    Address or Data value of an entry while the entry is unmasked, the
    result is undefined.

The Linux kernel's MSI-X support never enforced that the entry is masked
before the entry is modified hence the Fixes tag refers to a commit in:
      git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git

Enforce the entry to be masked across the update.

There is no point in enforcing this to be handled at all possible call
sites as this is just pointless code duplication and the common update
function is the obvious place to enforce this.

Fixes: f036d4ea5fa7 ("[PATCH] ia32 Message Signalled Interrupt support")
Reported-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/msi.c |   15 +++++++++++++++
 1 file changed, 15 insertions(+)

--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -289,13 +289,28 @@ void __pci_write_msi_msg(struct msi_desc
 		/* Don't touch the hardware now */
 	} else if (entry->msi_attrib.is_msix) {
 		void __iomem *base = pci_msix_desc_addr(entry);
+		bool unmasked = !(entry->masked & PCI_MSIX_ENTRY_CTRL_MASKBIT);
 
 		if (!base)
 			goto skip;
 
+		/*
+		 * The specification mandates that the entry is masked
+		 * when the message is modified:
+		 *
+		 * "If software changes the Address or Data value of an
+		 * entry while the entry is unmasked, the result is
+		 * undefined."
+		 */
+		if (unmasked)
+			__pci_msix_desc_mask_irq(entry, PCI_MSIX_ENTRY_CTRL_MASKBIT);
+
 		writel(msg->address_lo, base + PCI_MSIX_ENTRY_LOWER_ADDR);
 		writel(msg->address_hi, base + PCI_MSIX_ENTRY_UPPER_ADDR);
 		writel(msg->data, base + PCI_MSIX_ENTRY_DATA);
+
+		if (unmasked)
+			__pci_msix_desc_mask_irq(entry, 0);
 	} else {
 		int pos = dev->msi_cap;
 		u16 msgctl;



^ permalink raw reply	[flat|nested] 43+ messages in thread

* [patch V2 04/19] PCI/MSI: Enforce MSI[X] entry updates to be visible
  2021-07-29 21:51 [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Thomas Gleixner
                   ` (2 preceding siblings ...)
  2021-07-29 21:51 ` [patch V2 03/19] PCI/MSI: Enforce that MSI-X table entry is masked for update Thomas Gleixner
@ 2021-07-29 21:51 ` Thomas Gleixner
  2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  2021-07-29 21:51 ` [patch V2 05/19] PCI/MSI: Do not set invalid bits in MSI mask Thomas Gleixner
                   ` (15 subsequent siblings)
  19 siblings, 1 reply; 43+ messages in thread
From: Thomas Gleixner @ 2021-07-29 21:51 UTC (permalink / raw)
  To: LKML
  Cc: Alex Williamson, Raj, Ashok, David S. Miller, Bjorn Helgaas,
	linux-pci, Kevin Tian, Marc Zyngier, Ingo Molnar, x86,
	linux-s390, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

From: Thomas Gleixner <tglx@linutronix.de>

Nothing enforces the posted writes to be visible when the function
returns. Flush them even if the flush might be redundant when the entry is
masked already as the unmask will flush as well. This is either setup or a
rare affinity change event so the extra flush is not the end of the world.

While this is more a theoretical issue especially the logic in the X86
specific msi_set_affinity() function relies on the assumption that the
update has reached the hardware when the function returns.

Again, as this never has been enforced the Fixes tag refers to a commit in:
   git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git

Fixes: f036d4ea5fa7 ("[PATCH] ia32 Message Signalled Interrupt support")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/msi.c |    5 +++++
 1 file changed, 5 insertions(+)

--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -311,6 +311,9 @@ void __pci_write_msi_msg(struct msi_desc
 
 		if (unmasked)
 			__pci_msix_desc_mask_irq(entry, 0);
+
+		/* Ensure that the writes are visible in the device */
+		readl(base + PCI_MSIX_ENTRY_DATA);
 	} else {
 		int pos = dev->msi_cap;
 		u16 msgctl;
@@ -331,6 +334,8 @@ void __pci_write_msi_msg(struct msi_desc
 			pci_write_config_word(dev, pos + PCI_MSI_DATA_32,
 					      msg->data);
 		}
+		/* Ensure that the writes are visible in the device */
+		pci_read_config_word(dev, pos + PCI_MSI_FLAGS, &msgctl);
 	}
 
 skip:


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [patch V2 05/19] PCI/MSI: Do not set invalid bits in MSI mask
  2021-07-29 21:51 [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Thomas Gleixner
                   ` (3 preceding siblings ...)
  2021-07-29 21:51 ` [patch V2 04/19] PCI/MSI: Enforce MSI[X] entry updates to be visible Thomas Gleixner
@ 2021-07-29 21:51 ` Thomas Gleixner
  2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  2021-07-29 21:51 ` [patch V2 06/19] PCI/MSI: Correct misleading comments Thomas Gleixner
                   ` (14 subsequent siblings)
  19 siblings, 1 reply; 43+ messages in thread
From: Thomas Gleixner @ 2021-07-29 21:51 UTC (permalink / raw)
  To: LKML
  Cc: Alex Williamson, Raj, Ashok, David S. Miller, Bjorn Helgaas,
	linux-pci, Kevin Tian, Marc Zyngier, Ingo Molnar, x86,
	linux-s390, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

msi_mask_irq() takes a mask and a flags argument. The mask argument is used
to mask out bits from the cached mask and the flags argument to set bits.

Some places invoke it with a flags argument which sets bits which are not
used by the device, i.e. when the device supports up to 8 vectors a full
unmask in some places sets the mask to 0xFFFFFF00. While devices probably
do not care, it's still bad practice.

Fixes: 7ba1930db02f ("PCI MSI: Unmask MSI if setup failed")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch
---
 drivers/pci/msi.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -656,21 +656,21 @@ static int msi_capability_init(struct pc
 	/* Configure MSI capability structure */
 	ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSI);
 	if (ret) {
-		msi_mask_irq(entry, mask, ~mask);
+		msi_mask_irq(entry, mask, 0);
 		free_msi_irqs(dev);
 		return ret;
 	}
 
 	ret = msi_verify_entries(dev);
 	if (ret) {
-		msi_mask_irq(entry, mask, ~mask);
+		msi_mask_irq(entry, mask, 0);
 		free_msi_irqs(dev);
 		return ret;
 	}
 
 	ret = populate_msi_sysfs(dev);
 	if (ret) {
-		msi_mask_irq(entry, mask, ~mask);
+		msi_mask_irq(entry, mask, 0);
 		free_msi_irqs(dev);
 		return ret;
 	}
@@ -962,7 +962,7 @@ static void pci_msi_shutdown(struct pci_
 	/* Return the device with MSI unmasked as initial states */
 	mask = msi_mask(desc->msi_attrib.multi_cap);
 	/* Keep cached state to be restored */
-	__pci_msi_desc_mask_irq(desc, mask, ~mask);
+	__pci_msi_desc_mask_irq(desc, mask, 0);
 
 	/* Restore dev->irq to its default pin-assertion IRQ */
 	dev->irq = desc->msi_attrib.default_irq;


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [patch V2 06/19] PCI/MSI: Correct misleading comments
  2021-07-29 21:51 [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Thomas Gleixner
                   ` (4 preceding siblings ...)
  2021-07-29 21:51 ` [patch V2 05/19] PCI/MSI: Do not set invalid bits in MSI mask Thomas Gleixner
@ 2021-07-29 21:51 ` Thomas Gleixner
  2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  2021-07-29 21:51 ` [patch V2 07/19] PCI/MSI: Use msi_mask_irq() in pci_msi_shutdown() Thomas Gleixner
                   ` (13 subsequent siblings)
  19 siblings, 1 reply; 43+ messages in thread
From: Thomas Gleixner @ 2021-07-29 21:51 UTC (permalink / raw)
  To: LKML
  Cc: Alex Williamson, Raj, Ashok, David S. Miller, Bjorn Helgaas,
	linux-pci, Kevin Tian, Marc Zyngier, Ingo Molnar, x86,
	linux-s390, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

The comments about preserving the cached state in pci_msi[x]_shutdown() are
misleading as the MSI descriptors are freed right after those functions
return. So there is nothing to restore.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch
---
 drivers/pci/msi.c |    5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -961,7 +961,6 @@ static void pci_msi_shutdown(struct pci_
 
 	/* Return the device with MSI unmasked as initial states */
 	mask = msi_mask(desc->msi_attrib.multi_cap);
-	/* Keep cached state to be restored */
 	__pci_msi_desc_mask_irq(desc, mask, 0);
 
 	/* Restore dev->irq to its default pin-assertion IRQ */
@@ -1047,10 +1046,8 @@ static void pci_msix_shutdown(struct pci
 	}
 
 	/* Return the device with MSI-X masked as initial states */
-	for_each_pci_msi_entry(entry, dev) {
-		/* Keep cached states to be restored */
+	for_each_pci_msi_entry(entry, dev)
 		__pci_msix_desc_mask_irq(entry, 1);
-	}
 
 	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
 	pci_intx_for_msi(dev, 1);


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [patch V2 07/19] PCI/MSI: Use msi_mask_irq() in pci_msi_shutdown()
  2021-07-29 21:51 [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Thomas Gleixner
                   ` (5 preceding siblings ...)
  2021-07-29 21:51 ` [patch V2 06/19] PCI/MSI: Correct misleading comments Thomas Gleixner
@ 2021-07-29 21:51 ` Thomas Gleixner
  2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  2021-07-29 21:51 ` [patch V2 08/19] PCI/MSI: Protect msi_desc::masked for multi-MSI Thomas Gleixner
                   ` (12 subsequent siblings)
  19 siblings, 1 reply; 43+ messages in thread
From: Thomas Gleixner @ 2021-07-29 21:51 UTC (permalink / raw)
  To: LKML
  Cc: Alex Williamson, Raj, Ashok, David S. Miller, Bjorn Helgaas,
	linux-pci, Kevin Tian, Marc Zyngier, Ingo Molnar, x86,
	linux-s390, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

No point in using the raw write function from shutdown. Preparatory change
to introduce proper serialization for the msi_desc::masked cache.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch
---
 drivers/pci/msi.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -961,7 +961,7 @@ static void pci_msi_shutdown(struct pci_
 
 	/* Return the device with MSI unmasked as initial states */
 	mask = msi_mask(desc->msi_attrib.multi_cap);
-	__pci_msi_desc_mask_irq(desc, mask, 0);
+	msi_mask_irq(desc, mask, 0);
 
 	/* Restore dev->irq to its default pin-assertion IRQ */
 	dev->irq = desc->msi_attrib.default_irq;


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [patch V2 08/19] PCI/MSI: Protect msi_desc::masked for multi-MSI
  2021-07-29 21:51 [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Thomas Gleixner
                   ` (6 preceding siblings ...)
  2021-07-29 21:51 ` [patch V2 07/19] PCI/MSI: Use msi_mask_irq() in pci_msi_shutdown() Thomas Gleixner
@ 2021-07-29 21:51 ` Thomas Gleixner
  2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  2021-07-29 21:51 ` [patch V2 09/19] genirq: Provide IRQCHIP_AFFINITY_PRE_STARTUP Thomas Gleixner
                   ` (11 subsequent siblings)
  19 siblings, 1 reply; 43+ messages in thread
From: Thomas Gleixner @ 2021-07-29 21:51 UTC (permalink / raw)
  To: LKML
  Cc: Alex Williamson, Raj, Ashok, David S. Miller, Bjorn Helgaas,
	linux-pci, Kevin Tian, Marc Zyngier, Ingo Molnar, x86,
	linux-s390, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

Multi-MSI uses a single MSI descriptor and there is a single mask register
when the device supports per vector masking. To avoid reading back the mask
register the value is cached in the MSI descriptor and updates are done by
clearing and setting bits in the cache and writing it to the device.

But nothing protects msi_desc::masked and the mask register from being
modified concurrently on two different CPUs for two different Linux
interrupts which belong to the same multi-MSI descriptor.

Add a lock to struct device and protect any operation on the mask and the
mask register with it.

This makes the update of msi_desc::masked unconditional, but there is no
place which requires a modification of the hardware register without
updating the masked cache.

msi_mask_irq() is now an empty wrapper which will be cleaned up in follow
up changes.

The problem goes way back to the initial support of multi-MSI, but picking
the commit which introduced the mask cache is a valid cut off point
(2.6.30).

Fixes: f2440d9acbe8 ("PCI MSI: Refactor interrupt masking code")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/base/core.c    |    1 +
 drivers/pci/msi.c      |   19 ++++++++++---------
 include/linux/device.h |    1 +
 include/linux/msi.h    |    2 +-
 4 files changed, 13 insertions(+), 10 deletions(-)

--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2837,6 +2837,7 @@ void device_initialize(struct device *de
 	device_pm_init(dev);
 	set_dev_node(dev, -1);
 #ifdef CONFIG_GENERIC_MSI_IRQ
+	raw_spin_lock_init(&dev->msi_lock);
 	INIT_LIST_HEAD(&dev->msi_list);
 #endif
 	INIT_LIST_HEAD(&dev->links.consumers);
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -143,24 +143,25 @@ static inline __attribute_const__ u32 ms
  * reliably as devices without an INTx disable bit will then generate a
  * level IRQ which will never be cleared.
  */
-u32 __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
+void __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
 {
-	u32 mask_bits = desc->masked;
+	raw_spinlock_t *lock = &desc->dev->msi_lock;
+	unsigned long flags;
 
 	if (pci_msi_ignore_mask || !desc->msi_attrib.maskbit)
-		return 0;
+		return;
 
-	mask_bits &= ~mask;
-	mask_bits |= flag;
+	raw_spin_lock_irqsave(lock, flags);
+	desc->masked &= ~mask;
+	desc->masked |= flag;
 	pci_write_config_dword(msi_desc_to_pci_dev(desc), desc->mask_pos,
-			       mask_bits);
-
-	return mask_bits;
+			       desc->masked);
+	raw_spin_unlock_irqrestore(lock, flags);
 }
 
 static void msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
 {
-	desc->masked = __pci_msi_desc_mask_irq(desc, mask, flag);
+	__pci_msi_desc_mask_irq(desc, mask, flag);
 }
 
 static void __iomem *pci_msix_desc_addr(struct msi_desc *desc)
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -506,6 +506,7 @@ struct device {
 	struct dev_pin_info	*pins;
 #endif
 #ifdef CONFIG_GENERIC_MSI_IRQ
+	raw_spinlock_t		msi_lock;
 	struct list_head	msi_list;
 #endif
 #ifdef CONFIG_DMA_OPS
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -233,7 +233,7 @@ void __pci_read_msi_msg(struct msi_desc
 void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg);
 
 u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag);
-u32 __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag);
+void __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag);
 void pci_msi_mask_irq(struct irq_data *data);
 void pci_msi_unmask_irq(struct irq_data *data);
 


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [patch V2 09/19] genirq: Provide IRQCHIP_AFFINITY_PRE_STARTUP
  2021-07-29 21:51 [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Thomas Gleixner
                   ` (7 preceding siblings ...)
  2021-07-29 21:51 ` [patch V2 08/19] PCI/MSI: Protect msi_desc::masked for multi-MSI Thomas Gleixner
@ 2021-07-29 21:51 ` Thomas Gleixner
  2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  2021-07-29 21:51 ` [patch V2 10/19] x86/ioapic: Force affinity setup before startup Thomas Gleixner
                   ` (10 subsequent siblings)
  19 siblings, 1 reply; 43+ messages in thread
From: Thomas Gleixner @ 2021-07-29 21:51 UTC (permalink / raw)
  To: LKML
  Cc: Alex Williamson, Raj, Ashok, David S. Miller, Bjorn Helgaas,
	linux-pci, Kevin Tian, Marc Zyngier, Ingo Molnar, x86,
	linux-s390, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

From: Thomas Gleixner <tglx@linutronix.de>

X86 IO/APIC and MSI interrupts (when used without interrupts remapping)
require that the affinity setup on startup is done before the interrupt is
enabled for the first time as the non-remapped operation mode cannot safely
migrate enabled interrupts from arbitrary contexts. Provide a new irq chip
flag which allows affected hardware to request this.

This has to be opt-in because there have been reports in the past that some
interrupt chips cannot handle affinity setting before startup.

Fixes: 18404756765c ("genirq: Expose default irq affinity mask (take 3)")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <maz@kernel.org>

---
 include/linux/irq.h |    2 ++
 kernel/irq/chip.c   |    5 ++++-
 2 files changed, 6 insertions(+), 1 deletion(-)

--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -569,6 +569,7 @@ struct irq_chip {
  * IRQCHIP_SUPPORTS_NMI:              Chip can deliver NMIs, only for root irqchips
  * IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND:  Invokes __enable_irq()/__disable_irq() for wake irqs
  *                                    in the suspend path if they are in disabled state
+ * IRQCHIP_AFFINITY_PRE_STARTUP:      Default affinity update before startup
  */
 enum {
 	IRQCHIP_SET_TYPE_MASKED			= (1 <<  0),
@@ -581,6 +582,7 @@ enum {
 	IRQCHIP_SUPPORTS_LEVEL_MSI		= (1 <<  7),
 	IRQCHIP_SUPPORTS_NMI			= (1 <<  8),
 	IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND	= (1 <<  9),
+	IRQCHIP_AFFINITY_PRE_STARTUP		= (1 << 10),
 };
 
 #include <linux/irqdesc.h>
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -265,8 +265,11 @@ int irq_startup(struct irq_desc *desc, b
 	} else {
 		switch (__irq_startup_managed(desc, aff, force)) {
 		case IRQ_STARTUP_NORMAL:
+			if (d->chip->flags & IRQCHIP_AFFINITY_PRE_STARTUP)
+				irq_setup_affinity(desc);
 			ret = __irq_startup(desc);
-			irq_setup_affinity(desc);
+			if (!(d->chip->flags & IRQCHIP_AFFINITY_PRE_STARTUP))
+				irq_setup_affinity(desc);
 			break;
 		case IRQ_STARTUP_MANAGED:
 			irq_do_set_affinity(d, aff, false);



^ permalink raw reply	[flat|nested] 43+ messages in thread

* [patch V2 10/19] x86/ioapic: Force affinity setup before startup
  2021-07-29 21:51 [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Thomas Gleixner
                   ` (8 preceding siblings ...)
  2021-07-29 21:51 ` [patch V2 09/19] genirq: Provide IRQCHIP_AFFINITY_PRE_STARTUP Thomas Gleixner
@ 2021-07-29 21:51 ` Thomas Gleixner
  2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  2021-07-29 21:51 ` [patch V2 11/19] x86/msi: " Thomas Gleixner
                   ` (9 subsequent siblings)
  19 siblings, 1 reply; 43+ messages in thread
From: Thomas Gleixner @ 2021-07-29 21:51 UTC (permalink / raw)
  To: LKML
  Cc: Alex Williamson, Raj, Ashok, David S. Miller, Bjorn Helgaas,
	linux-pci, Kevin Tian, Marc Zyngier, Ingo Molnar, x86,
	linux-s390, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

From: Thomas Gleixner <tglx@linutronix.de>

The IO/APIC cannot handle interrupt affinity changes safely after startup
other than from an interrupt handler. The startup sequence in the generic
interrupt code violates that assumption.

Mark the irq chip with the new IRQCHIP_AFFINITY_PRE_STARTUP flag so that
the default interrupt setting happens before the interrupt is started up
for the first time.

Fixes: 18404756765c ("genirq: Expose default irq affinity mask (take 3)")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <maz@kernel.org>

---
 arch/x86/kernel/apic/io_apic.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1986,7 +1986,8 @@ static struct irq_chip ioapic_chip __rea
 	.irq_set_affinity	= ioapic_set_affinity,
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
 	.irq_get_irqchip_state	= ioapic_irq_get_chip_state,
-	.flags			= IRQCHIP_SKIP_SET_WAKE,
+	.flags			= IRQCHIP_SKIP_SET_WAKE |
+				  IRQCHIP_AFFINITY_PRE_STARTUP,
 };
 
 static struct irq_chip ioapic_ir_chip __read_mostly = {
@@ -1999,7 +2000,8 @@ static struct irq_chip ioapic_ir_chip __
 	.irq_set_affinity	= ioapic_set_affinity,
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
 	.irq_get_irqchip_state	= ioapic_irq_get_chip_state,
-	.flags			= IRQCHIP_SKIP_SET_WAKE,
+	.flags			= IRQCHIP_SKIP_SET_WAKE |
+				  IRQCHIP_AFFINITY_PRE_STARTUP,
 };
 
 static inline void init_IO_APIC_traps(void)



^ permalink raw reply	[flat|nested] 43+ messages in thread

* [patch V2 11/19] x86/msi: Force affinity setup before startup
  2021-07-29 21:51 [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Thomas Gleixner
                   ` (9 preceding siblings ...)
  2021-07-29 21:51 ` [patch V2 10/19] x86/ioapic: Force affinity setup before startup Thomas Gleixner
@ 2021-07-29 21:51 ` Thomas Gleixner
  2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  2021-07-29 21:51 ` [patch V2 12/19] s390/pci: Do not mask MSI[-X] entries on teardown Thomas Gleixner
                   ` (8 subsequent siblings)
  19 siblings, 1 reply; 43+ messages in thread
From: Thomas Gleixner @ 2021-07-29 21:51 UTC (permalink / raw)
  To: LKML
  Cc: Alex Williamson, Raj, Ashok, David S. Miller, Bjorn Helgaas,
	linux-pci, Kevin Tian, Marc Zyngier, Ingo Molnar, x86,
	linux-s390, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

From: Thomas Gleixner <tglx@linutronix.de>

The X86 MSI mechanism cannot handle interrupt affinity changes safely after
startup other than from an interrupt handler, unless interrupt remapping is
enabled. The startup sequence in the generic interrupt code violates that
assumption.

Mark the irq chips with the new IRQCHIP_AFFINITY_PRE_STARTUP flag so that
the default interrupt setting happens before the interrupt is started up
for the first time.

While the interrupt remapping MSI chip does not require this, there is no
point in treating it differently as this might spare an interrupt to a CPU
which is not in the default affinity mask.

For the non-remapping case go to the direct write path when the interrupt
is not yet started similar to the not yet activated case.

Fixes: 18404756765c ("genirq: Expose default irq affinity mask (take 3)")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <maz@kernel.org>

---
 arch/x86/kernel/apic/msi.c |   11 ++++++++---
 arch/x86/kernel/hpet.c     |    2 +-
 2 files changed, 9 insertions(+), 4 deletions(-)

--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -58,11 +58,13 @@ msi_set_affinity(struct irq_data *irqd,
 	 *   The quirk bit is not set in this case.
 	 * - The new vector is the same as the old vector
 	 * - The old vector is MANAGED_IRQ_SHUTDOWN_VECTOR (interrupt starts up)
+	 * - The interrupt is not yet started up
 	 * - The new destination CPU is the same as the old destination CPU
 	 */
 	if (!irqd_msi_nomask_quirk(irqd) ||
 	    cfg->vector == old_cfg.vector ||
 	    old_cfg.vector == MANAGED_IRQ_SHUTDOWN_VECTOR ||
+	    !irqd_is_started(irqd) ||
 	    cfg->dest_apicid == old_cfg.dest_apicid) {
 		irq_msi_update_msg(irqd, cfg);
 		return ret;
@@ -150,7 +152,8 @@ static struct irq_chip pci_msi_controlle
 	.irq_ack		= irq_chip_ack_parent,
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
 	.irq_set_affinity	= msi_set_affinity,
-	.flags			= IRQCHIP_SKIP_SET_WAKE,
+	.flags			= IRQCHIP_SKIP_SET_WAKE |
+				  IRQCHIP_AFFINITY_PRE_STARTUP,
 };
 
 int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
@@ -219,7 +222,8 @@ static struct irq_chip pci_msi_ir_contro
 	.irq_mask		= pci_msi_mask_irq,
 	.irq_ack		= irq_chip_ack_parent,
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
-	.flags			= IRQCHIP_SKIP_SET_WAKE,
+	.flags			= IRQCHIP_SKIP_SET_WAKE |
+				  IRQCHIP_AFFINITY_PRE_STARTUP,
 };
 
 static struct msi_domain_info pci_msi_ir_domain_info = {
@@ -273,7 +277,8 @@ static struct irq_chip dmar_msi_controll
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
 	.irq_compose_msi_msg	= dmar_msi_compose_msg,
 	.irq_write_msi_msg	= dmar_msi_write_msg,
-	.flags			= IRQCHIP_SKIP_SET_WAKE,
+	.flags			= IRQCHIP_SKIP_SET_WAKE |
+				  IRQCHIP_AFFINITY_PRE_STARTUP,
 };
 
 static int dmar_msi_init(struct irq_domain *domain,
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -508,7 +508,7 @@ static struct irq_chip hpet_msi_controll
 	.irq_set_affinity = msi_domain_set_affinity,
 	.irq_retrigger = irq_chip_retrigger_hierarchy,
 	.irq_write_msi_msg = hpet_msi_write_msg,
-	.flags = IRQCHIP_SKIP_SET_WAKE,
+	.flags = IRQCHIP_SKIP_SET_WAKE | IRQCHIP_AFFINITY_PRE_STARTUP,
 };
 
 static int hpet_msi_init(struct irq_domain *domain,



^ permalink raw reply	[flat|nested] 43+ messages in thread

* [patch V2 12/19] s390/pci: Do not mask MSI[-X] entries on teardown
  2021-07-29 21:51 [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Thomas Gleixner
                   ` (10 preceding siblings ...)
  2021-07-29 21:51 ` [patch V2 11/19] x86/msi: " Thomas Gleixner
@ 2021-07-29 21:51 ` Thomas Gleixner
  2021-08-03 12:48   ` Niklas Schnelle
  2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  2021-07-29 21:51 ` [patch V2 13/19] PCI/MSI: Simplify msi_verify_entries() Thomas Gleixner
                   ` (7 subsequent siblings)
  19 siblings, 2 replies; 43+ messages in thread
From: Thomas Gleixner @ 2021-07-29 21:51 UTC (permalink / raw)
  To: LKML
  Cc: Alex Williamson, Raj, Ashok, David S. Miller, Bjorn Helgaas,
	linux-pci, Kevin Tian, Marc Zyngier, Ingo Molnar, x86,
	linux-s390, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

The PCI core already ensures that the MSI[-X] state is correct when MSI[-X]
is disabled. For MSI the reset state is all entries unmasked and for MSI-X
all vectors are masked.

S390 masks all MSI entries and masks the already masked MSI-X entries
again. Remove it and let the device in the correct state.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-s390@vger.kernel.org
Cc: Niklas Schnelle <schnelle@linux.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/pci/pci_irq.c |    4 ----
 drivers/pci/msi.c       |    4 ++--
 include/linux/msi.h     |    2 --
 3 files changed, 2 insertions(+), 8 deletions(-)

--- a/arch/s390/pci/pci_irq.c
+++ b/arch/s390/pci/pci_irq.c
@@ -365,10 +365,6 @@ void arch_teardown_msi_irqs(struct pci_d
 	for_each_pci_msi_entry(msi, pdev) {
 		if (!msi->irq)
 			continue;
-		if (msi->msi_attrib.is_msix)
-			__pci_msix_desc_mask_irq(msi, 1);
-		else
-			__pci_msi_desc_mask_irq(msi, 1, 1);
 		irq_set_msi_desc(msi->irq, NULL);
 		irq_free_desc(msi->irq);
 		msi->msg.address_lo = 0;
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -143,7 +143,7 @@ static inline __attribute_const__ u32 ms
  * reliably as devices without an INTx disable bit will then generate a
  * level IRQ which will never be cleared.
  */
-void __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
+static void __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
 {
 	raw_spinlock_t *lock = &desc->dev->msi_lock;
 	unsigned long flags;
@@ -180,7 +180,7 @@ static void __iomem *pci_msix_desc_addr(
  * file.  This saves a few milliseconds when initialising devices with lots
  * of MSI-X interrupts.
  */
-u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag)
+static u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag)
 {
 	u32 mask_bits = desc->masked;
 	void __iomem *desc_addr;
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -232,8 +232,6 @@ void free_msi_entry(struct msi_desc *ent
 void __pci_read_msi_msg(struct msi_desc *entry, struct msi_msg *msg);
 void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg);
 
-u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag);
-void __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag);
 void pci_msi_mask_irq(struct irq_data *data);
 void pci_msi_unmask_irq(struct irq_data *data);
 


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [patch V2 13/19] PCI/MSI: Simplify msi_verify_entries()
  2021-07-29 21:51 [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Thomas Gleixner
                   ` (11 preceding siblings ...)
  2021-07-29 21:51 ` [patch V2 12/19] s390/pci: Do not mask MSI[-X] entries on teardown Thomas Gleixner
@ 2021-07-29 21:51 ` Thomas Gleixner
  2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  2021-07-29 21:51 ` [patch V2 14/19] PCI/MSI: Rename msi_desc::masked Thomas Gleixner
                   ` (6 subsequent siblings)
  19 siblings, 1 reply; 43+ messages in thread
From: Thomas Gleixner @ 2021-07-29 21:51 UTC (permalink / raw)
  To: LKML
  Cc: Alex Williamson, Raj, Ashok, David S. Miller, Bjorn Helgaas,
	linux-pci, Kevin Tian, Marc Zyngier, Ingo Molnar, x86,
	linux-s390, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

From: Thomas Gleixner <tglx@linutronix.de>

No point in looping over all entries when 64bit addressing mode is enabled
for nothing.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/msi.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -613,8 +613,11 @@ static int msi_verify_entries(struct pci
 {
 	struct msi_desc *entry;
 
+	if (!dev->no_64bit_msi)
+		return 0;
+
 	for_each_pci_msi_entry(entry, dev) {
-		if (entry->msg.address_hi && dev->no_64bit_msi) {
+		if (entry->msg.address_hi) {
 			pci_err(dev, "arch assigned 64-bit MSI address %#x%08x but device only supports 32 bits\n",
 				entry->msg.address_hi, entry->msg.address_lo);
 			return -EIO;


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [patch V2 14/19] PCI/MSI: Rename msi_desc::masked
  2021-07-29 21:51 [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Thomas Gleixner
                   ` (12 preceding siblings ...)
  2021-07-29 21:51 ` [patch V2 13/19] PCI/MSI: Simplify msi_verify_entries() Thomas Gleixner
@ 2021-07-29 21:51 ` Thomas Gleixner
  2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  2021-07-29 21:51 ` [patch V2 15/19] PCI/MSI: Consolidate error handling in msi_capability_init() Thomas Gleixner
                   ` (5 subsequent siblings)
  19 siblings, 1 reply; 43+ messages in thread
From: Thomas Gleixner @ 2021-07-29 21:51 UTC (permalink / raw)
  To: LKML
  Cc: Alex Williamson, Raj, Ashok, David S. Miller, Bjorn Helgaas,
	linux-pci, Kevin Tian, Marc Zyngier, Ingo Molnar, x86,
	linux-s390, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

msi_desc::masked is a misnomer. For MSI it's used to cache the MSI mask
bits when the device supports per vector masking. For MSI-X it's used to
cache the content of the vector control word which contains the mask bit
for the vector.

Replace it with a union of msi_mask and msix_ctrl to make the purpose clear
and fix up the usage sites.

No functional change

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch
---
 drivers/pci/msi.c   |   30 +++++++++++++++---------------
 include/linux/msi.h |    8 ++++++--
 2 files changed, 21 insertions(+), 17 deletions(-)

--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -152,10 +152,10 @@ static void __pci_msi_desc_mask_irq(stru
 		return;
 
 	raw_spin_lock_irqsave(lock, flags);
-	desc->masked &= ~mask;
-	desc->masked |= flag;
+	desc->msi_mask &= ~mask;
+	desc->msi_mask |= flag;
 	pci_write_config_dword(msi_desc_to_pci_dev(desc), desc->mask_pos,
-			       desc->masked);
+			       desc->msi_mask);
 	raw_spin_unlock_irqrestore(lock, flags);
 }
 
@@ -182,7 +182,7 @@ static void __iomem *pci_msix_desc_addr(
  */
 static u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag)
 {
-	u32 mask_bits = desc->masked;
+	u32 ctrl = desc->msix_ctrl;
 	void __iomem *desc_addr;
 
 	if (pci_msi_ignore_mask)
@@ -192,18 +192,18 @@ static u32 __pci_msix_desc_mask_irq(stru
 	if (!desc_addr)
 		return 0;
 
-	mask_bits &= ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
-	if (flag & PCI_MSIX_ENTRY_CTRL_MASKBIT)
-		mask_bits |= PCI_MSIX_ENTRY_CTRL_MASKBIT;
+	ctrl &= ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
+	if (ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT)
+		ctrl |= PCI_MSIX_ENTRY_CTRL_MASKBIT;
 
-	writel(mask_bits, desc_addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
+	writel(ctrl, desc_addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
 
-	return mask_bits;
+	return ctrl;
 }
 
 static void msix_mask_irq(struct msi_desc *desc, u32 flag)
 {
-	desc->masked = __pci_msix_desc_mask_irq(desc, flag);
+	desc->msix_ctrl = __pci_msix_desc_mask_irq(desc, flag);
 }
 
 static void msi_set_mask_bit(struct irq_data *data, u32 flag)
@@ -290,7 +290,7 @@ void __pci_write_msi_msg(struct msi_desc
 		/* Don't touch the hardware now */
 	} else if (entry->msi_attrib.is_msix) {
 		void __iomem *base = pci_msix_desc_addr(entry);
-		bool unmasked = !(entry->masked & PCI_MSIX_ENTRY_CTRL_MASKBIT);
+		bool unmasked = !(entry->msix_ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT);
 
 		if (!base)
 			goto skip;
@@ -430,7 +430,7 @@ static void __pci_restore_msi_state(stru
 
 	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
 	msi_mask_irq(entry, msi_mask(entry->msi_attrib.multi_cap),
-		     entry->masked);
+		     entry->msi_mask);
 	control &= ~PCI_MSI_FLAGS_QSIZE;
 	control |= (entry->msi_attrib.multiple << 4) | PCI_MSI_FLAGS_ENABLE;
 	pci_write_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, control);
@@ -461,7 +461,7 @@ static void __pci_restore_msix_state(str
 
 	arch_restore_msi_irqs(dev);
 	for_each_pci_msi_entry(entry, dev)
-		msix_mask_irq(entry, entry->masked);
+		msix_mask_irq(entry, entry->msix_ctrl);
 
 	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
 }
@@ -602,7 +602,7 @@ msi_setup_entry(struct pci_dev *dev, int
 
 	/* Save the initial mask status */
 	if (entry->msi_attrib.maskbit)
-		pci_read_config_dword(dev, entry->mask_pos, &entry->masked);
+		pci_read_config_dword(dev, entry->mask_pos, &entry->msi_mask);
 
 out:
 	kfree(masks);
@@ -750,7 +750,7 @@ static int msix_setup_entries(struct pci
 
 		addr = pci_msix_desc_addr(entry);
 		if (addr)
-			entry->masked = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
+			entry->msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
 
 		list_add_tail(&entry->list, dev_to_msi_list(&dev->dev));
 		if (masks)
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -107,7 +107,8 @@ struct ti_sci_inta_msi_desc {
  *			address or data changes
  * @write_msi_msg_data:	Data parameter for the callback.
  *
- * @masked:	[PCI MSI/X] Mask bits
+ * @msi_mask:	[PCI MSI]   MSI cached mask bits
+ * @msix_ctrl:	[PCI MSI-X] MSI-X cached per vector control bits
  * @is_msix:	[PCI MSI/X] True if MSI-X
  * @multiple:	[PCI MSI/X] log2 num of messages allocated
  * @multi_cap:	[PCI MSI/X] log2 num of messages supported
@@ -139,7 +140,10 @@ struct msi_desc {
 	union {
 		/* PCI MSI/X specific data */
 		struct {
-			u32 masked;
+			union {
+				u32 msi_mask;
+				u32 msix_ctrl;
+			};
 			struct {
 				u8	is_msix		: 1;
 				u8	multiple	: 3;


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [patch V2 15/19] PCI/MSI: Consolidate error handling in msi_capability_init()
  2021-07-29 21:51 [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Thomas Gleixner
                   ` (13 preceding siblings ...)
  2021-07-29 21:51 ` [patch V2 14/19] PCI/MSI: Rename msi_desc::masked Thomas Gleixner
@ 2021-07-29 21:51 ` Thomas Gleixner
  2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  2021-07-29 21:51 ` [patch V2 16/19] PCI/MSI: Deobfuscate virtual MSI-X Thomas Gleixner
                   ` (4 subsequent siblings)
  19 siblings, 1 reply; 43+ messages in thread
From: Thomas Gleixner @ 2021-07-29 21:51 UTC (permalink / raw)
  To: LKML
  Cc: Alex Williamson, Raj, Ashok, David S. Miller, Bjorn Helgaas,
	linux-pci, Kevin Tian, Marc Zyngier, Ingo Molnar, x86,
	linux-s390, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

Three error exits doing exactly the same ask for a common error exit point.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch
---
 drivers/pci/msi.c |   26 +++++++++++---------------
 1 file changed, 11 insertions(+), 15 deletions(-)

--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -659,25 +659,16 @@ static int msi_capability_init(struct pc
 
 	/* Configure MSI capability structure */
 	ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSI);
-	if (ret) {
-		msi_mask_irq(entry, mask, 0);
-		free_msi_irqs(dev);
-		return ret;
-	}
+	if (ret)
+		goto err;
 
 	ret = msi_verify_entries(dev);
-	if (ret) {
-		msi_mask_irq(entry, mask, 0);
-		free_msi_irqs(dev);
-		return ret;
-	}
+	if (ret)
+		goto err;
 
 	ret = populate_msi_sysfs(dev);
-	if (ret) {
-		msi_mask_irq(entry, mask, 0);
-		free_msi_irqs(dev);
-		return ret;
-	}
+	if (ret)
+		goto err;
 
 	/* Set MSI enabled bits	*/
 	pci_intx_for_msi(dev, 0);
@@ -687,6 +678,11 @@ static int msi_capability_init(struct pc
 	pcibios_free_irq(dev);
 	dev->irq = entry->irq;
 	return 0;
+
+err:
+	msi_mask_irq(entry, mask, 0);
+	free_msi_irqs(dev);
+	return ret;
 }
 
 static void __iomem *msix_map_region(struct pci_dev *dev, unsigned nr_entries)


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [patch V2 16/19] PCI/MSI: Deobfuscate virtual MSI-X
  2021-07-29 21:51 [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Thomas Gleixner
                   ` (14 preceding siblings ...)
  2021-07-29 21:51 ` [patch V2 15/19] PCI/MSI: Consolidate error handling in msi_capability_init() Thomas Gleixner
@ 2021-07-29 21:51 ` Thomas Gleixner
  2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  2021-07-29 21:51 ` [patch V2 17/19] PCI/MSI: Cleanup msi_mask() Thomas Gleixner
                   ` (3 subsequent siblings)
  19 siblings, 1 reply; 43+ messages in thread
From: Thomas Gleixner @ 2021-07-29 21:51 UTC (permalink / raw)
  To: LKML
  Cc: Alex Williamson, Raj, Ashok, David S. Miller, Bjorn Helgaas,
	linux-pci, Kevin Tian, Marc Zyngier, Ingo Molnar, x86,
	linux-s390, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

Handling of virtual MSI-X is obfuscated by letting pci_msix_desc_addr()
return NULL and checking the pointer.

Just use msi_desc::msi_attrib.is_virtual at the call sites and get rid of
that pointer check.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch
---
 drivers/pci/msi.c |   25 ++++++++-----------------
 1 file changed, 8 insertions(+), 17 deletions(-)

--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -166,11 +166,7 @@ static void msi_mask_irq(struct msi_desc
 
 static void __iomem *pci_msix_desc_addr(struct msi_desc *desc)
 {
-	if (desc->msi_attrib.is_virtual)
-		return NULL;
-
-	return desc->mask_base +
-		desc->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE;
+	return desc->mask_base + desc->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE;
 }
 
 /*
@@ -182,14 +178,10 @@ static void __iomem *pci_msix_desc_addr(
  */
 static u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag)
 {
+	void __iomem *desc_addr = pci_msix_desc_addr(desc);
 	u32 ctrl = desc->msix_ctrl;
-	void __iomem *desc_addr;
-
-	if (pci_msi_ignore_mask)
-		return 0;
 
-	desc_addr = pci_msix_desc_addr(desc);
-	if (!desc_addr)
+	if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual)
 		return 0;
 
 	ctrl &= ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
@@ -256,10 +248,8 @@ void __pci_read_msi_msg(struct msi_desc
 	if (entry->msi_attrib.is_msix) {
 		void __iomem *base = pci_msix_desc_addr(entry);
 
-		if (!base) {
-			WARN_ON(1);
+		if (WARN_ON_ONCE(entry->msi_attrib.is_virtual))
 			return;
-		}
 
 		msg->address_lo = readl(base + PCI_MSIX_ENTRY_LOWER_ADDR);
 		msg->address_hi = readl(base + PCI_MSIX_ENTRY_UPPER_ADDR);
@@ -292,7 +282,7 @@ void __pci_write_msi_msg(struct msi_desc
 		void __iomem *base = pci_msix_desc_addr(entry);
 		bool unmasked = !(entry->msix_ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT);
 
-		if (!base)
+		if (entry->msi_attrib.is_virtual)
 			goto skip;
 
 		/*
@@ -744,9 +734,10 @@ static int msix_setup_entries(struct pci
 		entry->msi_attrib.default_irq	= dev->irq;
 		entry->mask_base		= base;
 
-		addr = pci_msix_desc_addr(entry);
-		if (addr)
+		if (!entry->msi_attrib.is_virtual) {
+			addr = pci_msix_desc_addr(entry);
 			entry->msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
+		}
 
 		list_add_tail(&entry->list, dev_to_msi_list(&dev->dev));
 		if (masks)


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [patch V2 17/19] PCI/MSI: Cleanup msi_mask()
  2021-07-29 21:51 [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Thomas Gleixner
                   ` (15 preceding siblings ...)
  2021-07-29 21:51 ` [patch V2 16/19] PCI/MSI: Deobfuscate virtual MSI-X Thomas Gleixner
@ 2021-07-29 21:51 ` Thomas Gleixner
  2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  2021-07-29 21:51 ` [patch V2 18/19] PCI/MSI: Provide a new set of mask and unmask functions Thomas Gleixner
                   ` (2 subsequent siblings)
  19 siblings, 1 reply; 43+ messages in thread
From: Thomas Gleixner @ 2021-07-29 21:51 UTC (permalink / raw)
  To: LKML
  Cc: Alex Williamson, Raj, Ashok, David S. Miller, Bjorn Helgaas,
	linux-pci, Kevin Tian, Marc Zyngier, Ingo Molnar, x86,
	linux-s390, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

msi_mask() is calculating the possible mask bits for MSI per vector
masking.

Rename it to msi_multi_mask() and hand the MSI descriptor pointer into it
to simplify the call sites.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch
---
 drivers/pci/msi.c |   25 +++++++++++--------------
 1 file changed, 11 insertions(+), 14 deletions(-)

--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -129,14 +129,6 @@ void __weak arch_restore_msi_irqs(struct
 	return default_restore_msi_irqs(dev);
 }
 
-static inline __attribute_const__ u32 msi_mask(unsigned x)
-{
-	/* Don't shift by >= width of type */
-	if (x >= 5)
-		return 0xffffffff;
-	return (1 << (1 << x)) - 1;
-}
-
 /*
  * PCI 2.3 does not specify mask bits for each MSI interrupt.  Attempting to
  * mask all MSI interrupts by clearing the MSI enable bit does not work
@@ -211,6 +203,14 @@ static void msi_set_mask_bit(struct irq_
 	}
 }
 
+static inline __attribute_const__ u32 msi_multi_mask(struct msi_desc *desc)
+{
+	/* Don't shift by >= width of type */
+	if (desc->msi_attrib.multi_cap >= 5)
+		return 0xffffffff;
+	return (1 << (1 << desc->msi_attrib.multi_cap)) - 1;
+}
+
 /**
  * pci_msi_mask_irq - Generic IRQ chip callback to mask PCI/MSI interrupts
  * @data:	pointer to irqdata associated to that interrupt
@@ -419,8 +419,7 @@ static void __pci_restore_msi_state(stru
 	arch_restore_msi_irqs(dev);
 
 	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
-	msi_mask_irq(entry, msi_mask(entry->msi_attrib.multi_cap),
-		     entry->msi_mask);
+	msi_mask_irq(entry, msi_multi_mask(entry), entry->msi_mask);
 	control &= ~PCI_MSI_FLAGS_QSIZE;
 	control |= (entry->msi_attrib.multiple << 4) | PCI_MSI_FLAGS_ENABLE;
 	pci_write_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, control);
@@ -642,7 +641,7 @@ static int msi_capability_init(struct pc
 		return -ENOMEM;
 
 	/* All MSIs are unmasked by default; mask them all */
-	mask = msi_mask(entry->msi_attrib.multi_cap);
+	mask = msi_multi_mask(entry);
 	msi_mask_irq(entry, mask, mask);
 
 	list_add_tail(&entry->list, dev_to_msi_list(&dev->dev));
@@ -938,7 +937,6 @@ EXPORT_SYMBOL(pci_msi_vec_count);
 static void pci_msi_shutdown(struct pci_dev *dev)
 {
 	struct msi_desc *desc;
-	u32 mask;
 
 	if (!pci_msi_enable || !dev || !dev->msi_enabled)
 		return;
@@ -951,8 +949,7 @@ static void pci_msi_shutdown(struct pci_
 	dev->msi_enabled = 0;
 
 	/* Return the device with MSI unmasked as initial states */
-	mask = msi_mask(desc->msi_attrib.multi_cap);
-	msi_mask_irq(desc, mask, 0);
+	msi_mask_irq(desc, msi_multi_mask(desc), 0);
 
 	/* Restore dev->irq to its default pin-assertion IRQ */
 	dev->irq = desc->msi_attrib.default_irq;


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [patch V2 18/19] PCI/MSI: Provide a new set of mask and unmask functions
  2021-07-29 21:51 [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Thomas Gleixner
                   ` (16 preceding siblings ...)
  2021-07-29 21:51 ` [patch V2 17/19] PCI/MSI: Cleanup msi_mask() Thomas Gleixner
@ 2021-07-29 21:51 ` Thomas Gleixner
       [not found]   ` <87r1f6bpt7.wl-maz@kernel.org>
  2021-07-29 21:51 ` [patch V2 19/19] PCI/MSI: Use new mask/unmask functions Thomas Gleixner
  2021-08-10  7:49 ` [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Marc Zyngier
  19 siblings, 1 reply; 43+ messages in thread
From: Thomas Gleixner @ 2021-07-29 21:51 UTC (permalink / raw)
  To: LKML
  Cc: Alex Williamson, Raj, Ashok, David S. Miller, Bjorn Helgaas,
	linux-pci, Kevin Tian, Marc Zyngier, Ingo Molnar, x86,
	linux-s390, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

The existing mask/unmask functions are convoluted and generate suboptimal
assembly code.

Provide a new set of functions which will be used in later patches to
replace the exisiting ones.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch
---
 drivers/pci/msi.c |   72 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 72 insertions(+)

--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -211,6 +211,78 @@ static inline __attribute_const__ u32 ms
 	return (1 << (1 << desc->msi_attrib.multi_cap)) - 1;
 }
 
+static noinline void pci_msi_update_mask(struct msi_desc *desc, u32 clear, u32 set)
+{
+	raw_spinlock_t *lock = &desc->dev->msi_lock;
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(lock, flags);
+	desc->msi_mask &= ~clear;
+	desc->msi_mask |= set;
+	pci_write_config_dword(msi_desc_to_pci_dev(desc), desc->mask_pos,
+			       desc->msi_mask);
+	raw_spin_unlock_irqrestore(lock, flags);
+}
+
+static inline void pci_msi_mask(struct msi_desc *desc, u32 mask)
+{
+	pci_msi_update_mask(desc, 0, mask);
+}
+
+static inline void pci_msi_unmask(struct msi_desc *desc, u32 mask)
+{
+	pci_msi_update_mask(desc, mask, 0);
+}
+
+/*
+ * This internal function does not flush PCI writes to the device.  All
+ * users must ensure that they read from the device before either assuming
+ * that the device state is up to date, or returning out of this file.
+ * It does not affect the msi_desc::msix_ctrl cache either. Use with care!
+ */
+static void pci_msix_write_vector_ctrl(struct msi_desc *desc, u32 ctrl)
+{
+	void __iomem *desc_addr = pci_msix_desc_addr(desc);
+
+	writel(ctrl, desc_addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
+}
+
+static inline void pci_msix_mask(struct msi_desc *desc)
+{
+	desc->msix_ctrl |= PCI_MSIX_ENTRY_CTRL_MASKBIT;
+	pci_msix_write_vector_ctrl(desc, desc->msix_ctrl);
+	/* Flush write to device */
+	readl(desc->mask_base);
+}
+
+static inline void pci_msix_unmask(struct msi_desc *desc)
+{
+	desc->msix_ctrl &= ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
+	pci_msix_write_vector_ctrl(desc, desc->msix_ctrl);
+}
+
+static void __pci_msi_mask_desc(struct msi_desc *desc, u32 mask)
+{
+	if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual)
+		return;
+
+	if (desc->msi_attrib.is_msix)
+		pci_msix_mask(desc);
+	else if (!desc->msi_attrib.maskbit)
+		pci_msi_mask(desc, mask);
+}
+
+static void __pci_msi_unmask_desc(struct msi_desc *desc, u32 mask)
+{
+	if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual)
+		return;
+
+	if (desc->msi_attrib.is_msix)
+		pci_msix_unmask(desc);
+	else if (!desc->msi_attrib.maskbit)
+		pci_msi_unmask(desc, mask);
+}
+
 /**
  * pci_msi_mask_irq - Generic IRQ chip callback to mask PCI/MSI interrupts
  * @data:	pointer to irqdata associated to that interrupt


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [patch V2 19/19] PCI/MSI: Use new mask/unmask functions
  2021-07-29 21:51 [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Thomas Gleixner
                   ` (17 preceding siblings ...)
  2021-07-29 21:51 ` [patch V2 18/19] PCI/MSI: Provide a new set of mask and unmask functions Thomas Gleixner
@ 2021-07-29 21:51 ` Thomas Gleixner
  2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  2021-08-10  7:49 ` [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Marc Zyngier
  19 siblings, 1 reply; 43+ messages in thread
From: Thomas Gleixner @ 2021-07-29 21:51 UTC (permalink / raw)
  To: LKML
  Cc: Alex Williamson, Raj, Ashok, David S. Miller, Bjorn Helgaas,
	linux-pci, Kevin Tian, Marc Zyngier, Ingo Molnar, x86,
	linux-s390, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

Switch the PCI/MSI core to use the new mask/unmask functions. No functional
change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch
---
 drivers/pci/msi.c |  102 +++++++++++-------------------------------------------
 1 file changed, 21 insertions(+), 81 deletions(-)

--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -135,74 +135,6 @@ void __weak arch_restore_msi_irqs(struct
  * reliably as devices without an INTx disable bit will then generate a
  * level IRQ which will never be cleared.
  */
-static void __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
-{
-	raw_spinlock_t *lock = &desc->dev->msi_lock;
-	unsigned long flags;
-
-	if (pci_msi_ignore_mask || !desc->msi_attrib.maskbit)
-		return;
-
-	raw_spin_lock_irqsave(lock, flags);
-	desc->msi_mask &= ~mask;
-	desc->msi_mask |= flag;
-	pci_write_config_dword(msi_desc_to_pci_dev(desc), desc->mask_pos,
-			       desc->msi_mask);
-	raw_spin_unlock_irqrestore(lock, flags);
-}
-
-static void msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
-{
-	__pci_msi_desc_mask_irq(desc, mask, flag);
-}
-
-static void __iomem *pci_msix_desc_addr(struct msi_desc *desc)
-{
-	return desc->mask_base + desc->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE;
-}
-
-/*
- * This internal function does not flush PCI writes to the device.
- * All users must ensure that they read from the device before either
- * assuming that the device state is up to date, or returning out of this
- * file.  This saves a few milliseconds when initialising devices with lots
- * of MSI-X interrupts.
- */
-static u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag)
-{
-	void __iomem *desc_addr = pci_msix_desc_addr(desc);
-	u32 ctrl = desc->msix_ctrl;
-
-	if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual)
-		return 0;
-
-	ctrl &= ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
-	if (ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT)
-		ctrl |= PCI_MSIX_ENTRY_CTRL_MASKBIT;
-
-	writel(ctrl, desc_addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
-
-	return ctrl;
-}
-
-static void msix_mask_irq(struct msi_desc *desc, u32 flag)
-{
-	desc->msix_ctrl = __pci_msix_desc_mask_irq(desc, flag);
-}
-
-static void msi_set_mask_bit(struct irq_data *data, u32 flag)
-{
-	struct msi_desc *desc = irq_data_get_msi_desc(data);
-
-	if (desc->msi_attrib.is_msix) {
-		msix_mask_irq(desc, flag);
-		readl(desc->mask_base);		/* Flush write to device */
-	} else {
-		unsigned offset = data->irq - desc->irq;
-		msi_mask_irq(desc, 1 << offset, flag << offset);
-	}
-}
-
 static inline __attribute_const__ u32 msi_multi_mask(struct msi_desc *desc)
 {
 	/* Don't shift by >= width of type */
@@ -234,6 +166,11 @@ static inline void pci_msi_unmask(struct
 	pci_msi_update_mask(desc, mask, 0);
 }
 
+static inline void __iomem *pci_msix_desc_addr(struct msi_desc *desc)
+{
+	return desc->mask_base + desc->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE;
+}
+
 /*
  * This internal function does not flush PCI writes to the device.  All
  * users must ensure that they read from the device before either assuming
@@ -289,7 +226,9 @@ static void __pci_msi_unmask_desc(struct
  */
 void pci_msi_mask_irq(struct irq_data *data)
 {
-	msi_set_mask_bit(data, 1);
+	struct msi_desc *desc = irq_data_get_msi_desc(data);
+
+	__pci_msi_mask_desc(desc, BIT(data->irq - desc->irq));
 }
 EXPORT_SYMBOL_GPL(pci_msi_mask_irq);
 
@@ -299,7 +238,9 @@ EXPORT_SYMBOL_GPL(pci_msi_mask_irq);
  */
 void pci_msi_unmask_irq(struct irq_data *data)
 {
-	msi_set_mask_bit(data, 0);
+	struct msi_desc *desc = irq_data_get_msi_desc(data);
+
+	__pci_msi_unmask_desc(desc, BIT(data->irq - desc->irq));
 }
 EXPORT_SYMBOL_GPL(pci_msi_unmask_irq);
 
@@ -352,7 +293,8 @@ void __pci_write_msi_msg(struct msi_desc
 		/* Don't touch the hardware now */
 	} else if (entry->msi_attrib.is_msix) {
 		void __iomem *base = pci_msix_desc_addr(entry);
-		bool unmasked = !(entry->msix_ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT);
+		u32 ctrl = entry->msix_ctrl;
+		bool unmasked = !(ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT);
 
 		if (entry->msi_attrib.is_virtual)
 			goto skip;
@@ -366,14 +308,14 @@ void __pci_write_msi_msg(struct msi_desc
 		 * undefined."
 		 */
 		if (unmasked)
-			__pci_msix_desc_mask_irq(entry, PCI_MSIX_ENTRY_CTRL_MASKBIT);
+			pci_msix_write_vector_ctrl(entry, ctrl | PCI_MSIX_ENTRY_CTRL_MASKBIT);
 
 		writel(msg->address_lo, base + PCI_MSIX_ENTRY_LOWER_ADDR);
 		writel(msg->address_hi, base + PCI_MSIX_ENTRY_UPPER_ADDR);
 		writel(msg->data, base + PCI_MSIX_ENTRY_DATA);
 
 		if (unmasked)
-			__pci_msix_desc_mask_irq(entry, 0);
+			pci_msix_write_vector_ctrl(entry, ctrl);
 
 		/* Ensure that the writes are visible in the device */
 		readl(base + PCI_MSIX_ENTRY_DATA);
@@ -491,7 +433,7 @@ static void __pci_restore_msi_state(stru
 	arch_restore_msi_irqs(dev);
 
 	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
-	msi_mask_irq(entry, msi_multi_mask(entry), entry->msi_mask);
+	pci_msi_update_mask(entry, 0, 0);
 	control &= ~PCI_MSI_FLAGS_QSIZE;
 	control |= (entry->msi_attrib.multiple << 4) | PCI_MSI_FLAGS_ENABLE;
 	pci_write_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, control);
@@ -522,7 +464,7 @@ static void __pci_restore_msix_state(str
 
 	arch_restore_msi_irqs(dev);
 	for_each_pci_msi_entry(entry, dev)
-		msix_mask_irq(entry, entry->msix_ctrl);
+		pci_msix_write_vector_ctrl(entry, entry->msix_ctrl);
 
 	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
 }
@@ -704,7 +646,6 @@ static int msi_capability_init(struct pc
 {
 	struct msi_desc *entry;
 	int ret;
-	unsigned mask;
 
 	pci_msi_set_enable(dev, 0);	/* Disable MSI during set up */
 
@@ -713,8 +654,7 @@ static int msi_capability_init(struct pc
 		return -ENOMEM;
 
 	/* All MSIs are unmasked by default; mask them all */
-	mask = msi_multi_mask(entry);
-	msi_mask_irq(entry, mask, mask);
+	pci_msi_mask(entry, msi_multi_mask(entry));
 
 	list_add_tail(&entry->list, dev_to_msi_list(&dev->dev));
 
@@ -741,7 +681,7 @@ static int msi_capability_init(struct pc
 	return 0;
 
 err:
-	msi_mask_irq(entry, mask, 0);
+	pci_msi_unmask(entry, msi_multi_mask(entry));
 	free_msi_irqs(dev);
 	return ret;
 }
@@ -1021,7 +961,7 @@ static void pci_msi_shutdown(struct pci_
 	dev->msi_enabled = 0;
 
 	/* Return the device with MSI unmasked as initial states */
-	msi_mask_irq(desc, msi_multi_mask(desc), 0);
+	pci_msi_unmask(desc, msi_multi_mask(desc));
 
 	/* Restore dev->irq to its default pin-assertion IRQ */
 	dev->irq = desc->msi_attrib.default_irq;
@@ -1107,7 +1047,7 @@ static void pci_msix_shutdown(struct pci
 
 	/* Return the device with MSI-X masked as initial states */
 	for_each_pci_msi_entry(entry, dev)
-		__pci_msix_desc_mask_irq(entry, 1);
+		pci_msix_mask(entry);
 
 	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
 	pci_intx_for_msi(dev, 1);


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [patch V2 12/19] s390/pci: Do not mask MSI[-X] entries on teardown
  2021-07-29 21:51 ` [patch V2 12/19] s390/pci: Do not mask MSI[-X] entries on teardown Thomas Gleixner
@ 2021-08-03 12:48   ` Niklas Schnelle
  2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 43+ messages in thread
From: Niklas Schnelle @ 2021-08-03 12:48 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Alex Williamson, Raj, Ashok, David S. Miller, Bjorn Helgaas,
	linux-pci, Kevin Tian, Marc Zyngier, Ingo Molnar, x86,
	linux-s390, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

On Thu, 2021-07-29 at 23:51 +0200, Thomas Gleixner wrote:
> The PCI core already ensures that the MSI[-X] state is correct when MSI[-X]
> is disabled. For MSI the reset state is all entries unmasked and for MSI-X
> all vectors are masked.
> 
> S390 masks all MSI entries and masks the already masked MSI-X entries
> again. Remove it and let the device in the correct state.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: linux-s390@vger.kernel.org
> Cc: Niklas Schnelle <schnelle@linux.ibm.com>
> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
> Cc: Heiko Carstens <hca@linux.ibm.com>
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  arch/s390/pci/pci_irq.c |    4 ----
>  drivers/pci/msi.c       |    4 ++--
>  include/linux/msi.h     |    2 --
>  3 files changed, 2 insertions(+), 8 deletions(-)
> 
> --- a/arch/s390/pci/pci_irq.c
> +++ b/arch/s390/pci/pci_irq.c
> @@ -365,10 +365,6 @@ void arch_teardown_msi_irqs(struct pci_d
>  	for_each_pci_msi_entry(msi, pdev) {
>  		if (!msi->irq)
>  			continue;
> -		if (msi->msi_attrib.is_msix)
> -			__pci_msix_desc_mask_irq(msi, 1);
> -		else
> -			__pci_msi_desc_mask_irq(msi, 1, 1);
>  		irq_set_msi_desc(msi->irq, NULL);
>  		irq_free_desc(msi->irq);
>  		msi->msg.address_lo = 0;
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -143,7 +143,7 @@ static inline __attribute_const__ u32 ms
>   * reliably as devices without an INTx disable bit will then generate a
>   * level IRQ which will never be cleared.
>   */
> -void __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
> +static void __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
>  {
>  	raw_spinlock_t *lock = &desc->dev->msi_lock;
>  	unsigned long flags;
> @@ -180,7 +180,7 @@ static void __iomem *pci_msix_desc_addr(
>   * file.  This saves a few milliseconds when initialising devices with lots
>   * of MSI-X interrupts.
>   */
> -u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag)
> +static u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag)
>  {
>  	u32 mask_bits = desc->masked;
>  	void __iomem *desc_addr;
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -232,8 +232,6 @@ void free_msi_entry(struct msi_desc *ent
>  void __pci_read_msi_msg(struct msi_desc *entry, struct msi_msg *msg);
>  void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg);
>  
> -u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag);
> -void __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag);
>  void pci_msi_mask_irq(struct irq_data *data);
>  void pci_msi_unmask_irq(struct irq_data *data);
>  
> 

I gave this patch a try, adapted for v5.14-rc4 where
__pci_msi_desc_msg() returns a u32, and all looks good. I tested with
our pretty quirky ISM devices too, which are the only MSI ones on
s390x.

It also makes sense to me to let the common code handle this so feel
free to add my:

Acked-by: Niklas Schnelle <schnelle@linux.ibm.com>

and/or

Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>

Thanks.



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [patch V2 18/19] PCI/MSI: Provide a new set of mask and unmask functions
       [not found]   ` <87r1f6bpt7.wl-maz@kernel.org>
@ 2021-08-09 18:56     ` Thomas Gleixner
  2021-08-09 19:08       ` [patch V3 " Thomas Gleixner
  0 siblings, 1 reply; 43+ messages in thread
From: Thomas Gleixner @ 2021-08-09 18:56 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: LKML, Alex Williamson, Raj, Ashok, David S. Miller,
	Bjorn Helgaas, linux-pci, Kevin Tian, Ingo Molnar, x86,
	linux-s390, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

On Fri, Aug 06 2021 at 15:39, Marc Zyngier wrote:
> On Thu, 29 Jul 2021 22:51:57 +0100,
> Thomas Gleixner <tglx@linutronix.de> wrote:
>> +static void __pci_msi_mask_desc(struct msi_desc *desc, u32 mask)
>> +{
>> +	if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual)
>> +		return;
>> +
>> +	if (desc->msi_attrib.is_msix)
>> +		pci_msix_mask(desc);
>> +	else if (!desc->msi_attrib.maskbit)
>
> This negation is preventing one of my boxes from working correctly (no
> idea why the i350 driver refuses to use MSI-X and sticks to a single
> MSI, but hey, that's another story), as the device supports MSI
> masking, and we definitely don't try to mask/unmask in this case...
>
> Dropping the '!' here and on the unmask path fixes it for me.

Duh. I'm a moron. Of course this needs to check maskbit if it wants to
mask. Sigh.



^ permalink raw reply	[flat|nested] 43+ messages in thread

* [patch V3 18/19] PCI/MSI: Provide a new set of mask and unmask functions
  2021-08-09 18:56     ` Thomas Gleixner
@ 2021-08-09 19:08       ` Thomas Gleixner
  2021-08-10  9:07         ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
  0 siblings, 1 reply; 43+ messages in thread
From: Thomas Gleixner @ 2021-08-09 19:08 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: LKML, Alex Williamson, Raj, Ashok, David S. Miller,
	Bjorn Helgaas, linux-pci, Kevin Tian, Ingo Molnar, x86,
	linux-s390, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

The existing mask/unmask functions are convoluted and generate suboptimal
assembly code.

Provide a new set of functions which will be used in later patches to
replace the exisiting ones.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210729222543.257079238@linutronix.de

---
V3: Check maskbit when masking (Marc)
V2: New patch
---
 drivers/pci/msi.c |   72 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 72 insertions(+)

--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -211,6 +211,78 @@ static inline __attribute_const__ u32 ms
 	return (1 << (1 << desc->msi_attrib.multi_cap)) - 1;
 }
 
+static noinline void pci_msi_update_mask(struct msi_desc *desc, u32 clear, u32 set)
+{
+	raw_spinlock_t *lock = &desc->dev->msi_lock;
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(lock, flags);
+	desc->msi_mask &= ~clear;
+	desc->msi_mask |= set;
+	pci_write_config_dword(msi_desc_to_pci_dev(desc), desc->mask_pos,
+			       desc->msi_mask);
+	raw_spin_unlock_irqrestore(lock, flags);
+}
+
+static inline void pci_msi_mask(struct msi_desc *desc, u32 mask)
+{
+	pci_msi_update_mask(desc, 0, mask);
+}
+
+static inline void pci_msi_unmask(struct msi_desc *desc, u32 mask)
+{
+	pci_msi_update_mask(desc, mask, 0);
+}
+
+/*
+ * This internal function does not flush PCI writes to the device.  All
+ * users must ensure that they read from the device before either assuming
+ * that the device state is up to date, or returning out of this file.
+ * It does not affect the msi_desc::msix_ctrl cache either. Use with care!
+ */
+static void pci_msix_write_vector_ctrl(struct msi_desc *desc, u32 ctrl)
+{
+	void __iomem *desc_addr = pci_msix_desc_addr(desc);
+
+	writel(ctrl, desc_addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
+}
+
+static inline void pci_msix_mask(struct msi_desc *desc)
+{
+	desc->msix_ctrl |= PCI_MSIX_ENTRY_CTRL_MASKBIT;
+	pci_msix_write_vector_ctrl(desc, desc->msix_ctrl);
+	/* Flush write to device */
+	readl(desc->mask_base);
+}
+
+static inline void pci_msix_unmask(struct msi_desc *desc)
+{
+	desc->msix_ctrl &= ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
+	pci_msix_write_vector_ctrl(desc, desc->msix_ctrl);
+}
+
+static void __pci_msi_mask_desc(struct msi_desc *desc, u32 mask)
+{
+	if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual)
+		return;
+
+	if (desc->msi_attrib.is_msix)
+		pci_msix_mask(desc);
+	else if (desc->msi_attrib.maskbit)
+		pci_msi_mask(desc, mask);
+}
+
+static void __pci_msi_unmask_desc(struct msi_desc *desc, u32 mask)
+{
+	if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual)
+		return;
+
+	if (desc->msi_attrib.is_msix)
+		pci_msix_unmask(desc);
+	else if (desc->msi_attrib.maskbit)
+		pci_msi_unmask(desc, mask);
+}
+
 /**
  * pci_msi_mask_irq - Generic IRQ chip callback to mask PCI/MSI interrupts
  * @data:	pointer to irqdata associated to that interrupt

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies
  2021-07-29 21:51 [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Thomas Gleixner
                   ` (18 preceding siblings ...)
  2021-07-29 21:51 ` [patch V2 19/19] PCI/MSI: Use new mask/unmask functions Thomas Gleixner
@ 2021-08-10  7:49 ` Marc Zyngier
  19 siblings, 0 replies; 43+ messages in thread
From: Marc Zyngier @ 2021-08-10  7:49 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Alex Williamson, Raj, Ashok, David S. Miller,
	Bjorn Helgaas, linux-pci, Kevin Tian, Ingo Molnar, x86,
	linux-s390, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
	Christian Borntraeger

On 2021-07-29 22:51, Thomas Gleixner wrote:
> A recent discussion about the PCI/MSI management for virtio unearthed a
> violation of the MSI-X specification vs. writing the MSI-X message: 
> under
> certain circumstances the entry is written without being masked.
> 
> While looking at that and the related violation of the x86 non-remapped
> interrupt affinity mechanism a few other issues were discovered by
> inspection.
> 
> The following series addresses these.
> 
> Note this does not fix the virtio issue, but while staring at the above
> problems I came up with a plan to address this. I'm still trying to
> convince myself that I can get away without sprinkling locking all over 
> the
> place, so don't hold your breath that this will materialize tomorrow.

For the patches I haven't acked yet, and with the fix on patch 18:

Reviewed-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>

Thanks,

         M.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [tip: irq/core] PCI/MSI: Use new mask/unmask functions
  2021-07-29 21:51 ` [patch V2 19/19] PCI/MSI: Use new mask/unmask functions Thomas Gleixner
@ 2021-08-10  9:07   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 43+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-08-10  9:07 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Marc Zyngier, x86, linux-kernel

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     446a98b19fd6da97a1fb148abb1766ad89c9b767
Gitweb:        https://git.kernel.org/tip/446a98b19fd6da97a1fb148abb1766ad89c9b767
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Jul 2021 23:51:58 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 10 Aug 2021 11:03:30 +02:00

PCI/MSI: Use new mask/unmask functions

Switch the PCI/MSI core to use the new mask/unmask functions. No functional
change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210729222543.311207034@linutronix.de

---
 drivers/pci/msi.c | 102 +++++++++------------------------------------
 1 file changed, 21 insertions(+), 81 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 26dd91f..ce841f3 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -135,74 +135,6 @@ void __weak arch_restore_msi_irqs(struct pci_dev *dev)
  * reliably as devices without an INTx disable bit will then generate a
  * level IRQ which will never be cleared.
  */
-static void __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
-{
-	raw_spinlock_t *lock = &desc->dev->msi_lock;
-	unsigned long flags;
-
-	if (pci_msi_ignore_mask || !desc->msi_attrib.maskbit)
-		return;
-
-	raw_spin_lock_irqsave(lock, flags);
-	desc->msi_mask &= ~mask;
-	desc->msi_mask |= flag;
-	pci_write_config_dword(msi_desc_to_pci_dev(desc), desc->mask_pos,
-			       desc->msi_mask);
-	raw_spin_unlock_irqrestore(lock, flags);
-}
-
-static void msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
-{
-	__pci_msi_desc_mask_irq(desc, mask, flag);
-}
-
-static void __iomem *pci_msix_desc_addr(struct msi_desc *desc)
-{
-	return desc->mask_base + desc->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE;
-}
-
-/*
- * This internal function does not flush PCI writes to the device.
- * All users must ensure that they read from the device before either
- * assuming that the device state is up to date, or returning out of this
- * file.  This saves a few milliseconds when initialising devices with lots
- * of MSI-X interrupts.
- */
-static u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag)
-{
-	void __iomem *desc_addr = pci_msix_desc_addr(desc);
-	u32 ctrl = desc->msix_ctrl;
-
-	if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual)
-		return 0;
-
-	ctrl &= ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
-	if (ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT)
-		ctrl |= PCI_MSIX_ENTRY_CTRL_MASKBIT;
-
-	writel(ctrl, desc_addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
-
-	return ctrl;
-}
-
-static void msix_mask_irq(struct msi_desc *desc, u32 flag)
-{
-	desc->msix_ctrl = __pci_msix_desc_mask_irq(desc, flag);
-}
-
-static void msi_set_mask_bit(struct irq_data *data, u32 flag)
-{
-	struct msi_desc *desc = irq_data_get_msi_desc(data);
-
-	if (desc->msi_attrib.is_msix) {
-		msix_mask_irq(desc, flag);
-		readl(desc->mask_base);		/* Flush write to device */
-	} else {
-		unsigned offset = data->irq - desc->irq;
-		msi_mask_irq(desc, 1 << offset, flag << offset);
-	}
-}
-
 static inline __attribute_const__ u32 msi_multi_mask(struct msi_desc *desc)
 {
 	/* Don't shift by >= width of type */
@@ -234,6 +166,11 @@ static inline void pci_msi_unmask(struct msi_desc *desc, u32 mask)
 	pci_msi_update_mask(desc, mask, 0);
 }
 
+static inline void __iomem *pci_msix_desc_addr(struct msi_desc *desc)
+{
+	return desc->mask_base + desc->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE;
+}
+
 /*
  * This internal function does not flush PCI writes to the device.  All
  * users must ensure that they read from the device before either assuming
@@ -289,7 +226,9 @@ static void __pci_msi_unmask_desc(struct msi_desc *desc, u32 mask)
  */
 void pci_msi_mask_irq(struct irq_data *data)
 {
-	msi_set_mask_bit(data, 1);
+	struct msi_desc *desc = irq_data_get_msi_desc(data);
+
+	__pci_msi_mask_desc(desc, BIT(data->irq - desc->irq));
 }
 EXPORT_SYMBOL_GPL(pci_msi_mask_irq);
 
@@ -299,7 +238,9 @@ EXPORT_SYMBOL_GPL(pci_msi_mask_irq);
  */
 void pci_msi_unmask_irq(struct irq_data *data)
 {
-	msi_set_mask_bit(data, 0);
+	struct msi_desc *desc = irq_data_get_msi_desc(data);
+
+	__pci_msi_unmask_desc(desc, BIT(data->irq - desc->irq));
 }
 EXPORT_SYMBOL_GPL(pci_msi_unmask_irq);
 
@@ -352,7 +293,8 @@ void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
 		/* Don't touch the hardware now */
 	} else if (entry->msi_attrib.is_msix) {
 		void __iomem *base = pci_msix_desc_addr(entry);
-		bool unmasked = !(entry->msix_ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT);
+		u32 ctrl = entry->msix_ctrl;
+		bool unmasked = !(ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT);
 
 		if (entry->msi_attrib.is_virtual)
 			goto skip;
@@ -366,14 +308,14 @@ void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
 		 * undefined."
 		 */
 		if (unmasked)
-			__pci_msix_desc_mask_irq(entry, PCI_MSIX_ENTRY_CTRL_MASKBIT);
+			pci_msix_write_vector_ctrl(entry, ctrl | PCI_MSIX_ENTRY_CTRL_MASKBIT);
 
 		writel(msg->address_lo, base + PCI_MSIX_ENTRY_LOWER_ADDR);
 		writel(msg->address_hi, base + PCI_MSIX_ENTRY_UPPER_ADDR);
 		writel(msg->data, base + PCI_MSIX_ENTRY_DATA);
 
 		if (unmasked)
-			__pci_msix_desc_mask_irq(entry, 0);
+			pci_msix_write_vector_ctrl(entry, ctrl);
 
 		/* Ensure that the writes are visible in the device */
 		readl(base + PCI_MSIX_ENTRY_DATA);
@@ -491,7 +433,7 @@ static void __pci_restore_msi_state(struct pci_dev *dev)
 	arch_restore_msi_irqs(dev);
 
 	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
-	msi_mask_irq(entry, msi_multi_mask(entry), entry->msi_mask);
+	pci_msi_update_mask(entry, 0, 0);
 	control &= ~PCI_MSI_FLAGS_QSIZE;
 	control |= (entry->msi_attrib.multiple << 4) | PCI_MSI_FLAGS_ENABLE;
 	pci_write_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, control);
@@ -522,7 +464,7 @@ static void __pci_restore_msix_state(struct pci_dev *dev)
 
 	arch_restore_msi_irqs(dev);
 	for_each_pci_msi_entry(entry, dev)
-		msix_mask_irq(entry, entry->msix_ctrl);
+		pci_msix_write_vector_ctrl(entry, entry->msix_ctrl);
 
 	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
 }
@@ -704,7 +646,6 @@ static int msi_capability_init(struct pci_dev *dev, int nvec,
 {
 	struct msi_desc *entry;
 	int ret;
-	unsigned mask;
 
 	pci_msi_set_enable(dev, 0);	/* Disable MSI during set up */
 
@@ -713,8 +654,7 @@ static int msi_capability_init(struct pci_dev *dev, int nvec,
 		return -ENOMEM;
 
 	/* All MSIs are unmasked by default; mask them all */
-	mask = msi_multi_mask(entry);
-	msi_mask_irq(entry, mask, mask);
+	pci_msi_mask(entry, msi_multi_mask(entry));
 
 	list_add_tail(&entry->list, dev_to_msi_list(&dev->dev));
 
@@ -741,7 +681,7 @@ static int msi_capability_init(struct pci_dev *dev, int nvec,
 	return 0;
 
 err:
-	msi_mask_irq(entry, mask, 0);
+	pci_msi_unmask(entry, msi_multi_mask(entry));
 	free_msi_irqs(dev);
 	return ret;
 }
@@ -1021,7 +961,7 @@ static void pci_msi_shutdown(struct pci_dev *dev)
 	dev->msi_enabled = 0;
 
 	/* Return the device with MSI unmasked as initial states */
-	msi_mask_irq(desc, msi_multi_mask(desc), 0);
+	pci_msi_unmask(desc, msi_multi_mask(desc));
 
 	/* Restore dev->irq to its default pin-assertion IRQ */
 	dev->irq = desc->msi_attrib.default_irq;
@@ -1107,7 +1047,7 @@ static void pci_msix_shutdown(struct pci_dev *dev)
 
 	/* Return the device with MSI-X masked as initial states */
 	for_each_pci_msi_entry(entry, dev)
-		__pci_msix_desc_mask_irq(entry, 1);
+		pci_msix_mask(entry);
 
 	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
 	pci_intx_for_msi(dev, 1);

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [tip: irq/core] PCI/MSI: Cleanup msi_mask()
  2021-07-29 21:51 ` [patch V2 17/19] PCI/MSI: Cleanup msi_mask() Thomas Gleixner
@ 2021-08-10  9:07   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 43+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-08-10  9:07 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Marc Zyngier, x86, linux-kernel

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     7327cefebb85d440fa6a589fdf53979d55b29a5a
Gitweb:        https://git.kernel.org/tip/7327cefebb85d440fa6a589fdf53979d55b29a5a
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Jul 2021 23:51:56 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 10 Aug 2021 11:03:30 +02:00

PCI/MSI: Cleanup msi_mask()

msi_mask() is calculating the possible mask bits for MSI per vector
masking.

Rename it to msi_multi_mask() and hand the MSI descriptor pointer into it
to simplify the call sites.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210729222543.203905260@linutronix.de

---
 drivers/pci/msi.c | 25 +++++++++++--------------
 1 file changed, 11 insertions(+), 14 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 1bde9ec..2eab07b 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -129,14 +129,6 @@ void __weak arch_restore_msi_irqs(struct pci_dev *dev)
 	return default_restore_msi_irqs(dev);
 }
 
-static inline __attribute_const__ u32 msi_mask(unsigned x)
-{
-	/* Don't shift by >= width of type */
-	if (x >= 5)
-		return 0xffffffff;
-	return (1 << (1 << x)) - 1;
-}
-
 /*
  * PCI 2.3 does not specify mask bits for each MSI interrupt.  Attempting to
  * mask all MSI interrupts by clearing the MSI enable bit does not work
@@ -211,6 +203,14 @@ static void msi_set_mask_bit(struct irq_data *data, u32 flag)
 	}
 }
 
+static inline __attribute_const__ u32 msi_multi_mask(struct msi_desc *desc)
+{
+	/* Don't shift by >= width of type */
+	if (desc->msi_attrib.multi_cap >= 5)
+		return 0xffffffff;
+	return (1 << (1 << desc->msi_attrib.multi_cap)) - 1;
+}
+
 /**
  * pci_msi_mask_irq - Generic IRQ chip callback to mask PCI/MSI interrupts
  * @data:	pointer to irqdata associated to that interrupt
@@ -419,8 +419,7 @@ static void __pci_restore_msi_state(struct pci_dev *dev)
 	arch_restore_msi_irqs(dev);
 
 	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
-	msi_mask_irq(entry, msi_mask(entry->msi_attrib.multi_cap),
-		     entry->msi_mask);
+	msi_mask_irq(entry, msi_multi_mask(entry), entry->msi_mask);
 	control &= ~PCI_MSI_FLAGS_QSIZE;
 	control |= (entry->msi_attrib.multiple << 4) | PCI_MSI_FLAGS_ENABLE;
 	pci_write_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, control);
@@ -642,7 +641,7 @@ static int msi_capability_init(struct pci_dev *dev, int nvec,
 		return -ENOMEM;
 
 	/* All MSIs are unmasked by default; mask them all */
-	mask = msi_mask(entry->msi_attrib.multi_cap);
+	mask = msi_multi_mask(entry);
 	msi_mask_irq(entry, mask, mask);
 
 	list_add_tail(&entry->list, dev_to_msi_list(&dev->dev));
@@ -938,7 +937,6 @@ EXPORT_SYMBOL(pci_msi_vec_count);
 static void pci_msi_shutdown(struct pci_dev *dev)
 {
 	struct msi_desc *desc;
-	u32 mask;
 
 	if (!pci_msi_enable || !dev || !dev->msi_enabled)
 		return;
@@ -951,8 +949,7 @@ static void pci_msi_shutdown(struct pci_dev *dev)
 	dev->msi_enabled = 0;
 
 	/* Return the device with MSI unmasked as initial states */
-	mask = msi_mask(desc->msi_attrib.multi_cap);
-	msi_mask_irq(desc, mask, 0);
+	msi_mask_irq(desc, msi_multi_mask(desc), 0);
 
 	/* Restore dev->irq to its default pin-assertion IRQ */
 	dev->irq = desc->msi_attrib.default_irq;

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [tip: irq/core] PCI/MSI: Provide a new set of mask and unmask functions
  2021-08-09 19:08       ` [patch V3 " Thomas Gleixner
@ 2021-08-10  9:07         ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 43+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-08-10  9:07 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Marc Zyngier, x86, linux-kernel

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     fcacdfbef5a1633211ebfac1b669a7739f5b553e
Gitweb:        https://git.kernel.org/tip/fcacdfbef5a1633211ebfac1b669a7739f5b553e
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Mon, 09 Aug 2021 21:08:56 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 10 Aug 2021 11:03:30 +02:00

PCI/MSI: Provide a new set of mask and unmask functions

The existing mask/unmask functions are convoluted and generate suboptimal
assembly code.

Provide a new set of functions which will be used in later patches to
replace the exisiting ones.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/875ywetozb.ffs@tglx
---
 drivers/pci/msi.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 72 insertions(+)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 2eab07b..26dd91f 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -211,6 +211,78 @@ static inline __attribute_const__ u32 msi_multi_mask(struct msi_desc *desc)
 	return (1 << (1 << desc->msi_attrib.multi_cap)) - 1;
 }
 
+static noinline void pci_msi_update_mask(struct msi_desc *desc, u32 clear, u32 set)
+{
+	raw_spinlock_t *lock = &desc->dev->msi_lock;
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(lock, flags);
+	desc->msi_mask &= ~clear;
+	desc->msi_mask |= set;
+	pci_write_config_dword(msi_desc_to_pci_dev(desc), desc->mask_pos,
+			       desc->msi_mask);
+	raw_spin_unlock_irqrestore(lock, flags);
+}
+
+static inline void pci_msi_mask(struct msi_desc *desc, u32 mask)
+{
+	pci_msi_update_mask(desc, 0, mask);
+}
+
+static inline void pci_msi_unmask(struct msi_desc *desc, u32 mask)
+{
+	pci_msi_update_mask(desc, mask, 0);
+}
+
+/*
+ * This internal function does not flush PCI writes to the device.  All
+ * users must ensure that they read from the device before either assuming
+ * that the device state is up to date, or returning out of this file.
+ * It does not affect the msi_desc::msix_ctrl cache either. Use with care!
+ */
+static void pci_msix_write_vector_ctrl(struct msi_desc *desc, u32 ctrl)
+{
+	void __iomem *desc_addr = pci_msix_desc_addr(desc);
+
+	writel(ctrl, desc_addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
+}
+
+static inline void pci_msix_mask(struct msi_desc *desc)
+{
+	desc->msix_ctrl |= PCI_MSIX_ENTRY_CTRL_MASKBIT;
+	pci_msix_write_vector_ctrl(desc, desc->msix_ctrl);
+	/* Flush write to device */
+	readl(desc->mask_base);
+}
+
+static inline void pci_msix_unmask(struct msi_desc *desc)
+{
+	desc->msix_ctrl &= ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
+	pci_msix_write_vector_ctrl(desc, desc->msix_ctrl);
+}
+
+static void __pci_msi_mask_desc(struct msi_desc *desc, u32 mask)
+{
+	if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual)
+		return;
+
+	if (desc->msi_attrib.is_msix)
+		pci_msix_mask(desc);
+	else if (desc->msi_attrib.maskbit)
+		pci_msi_mask(desc, mask);
+}
+
+static void __pci_msi_unmask_desc(struct msi_desc *desc, u32 mask)
+{
+	if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual)
+		return;
+
+	if (desc->msi_attrib.is_msix)
+		pci_msix_unmask(desc);
+	else if (desc->msi_attrib.maskbit)
+		pci_msi_unmask(desc, mask);
+}
+
 /**
  * pci_msi_mask_irq - Generic IRQ chip callback to mask PCI/MSI interrupts
  * @data:	pointer to irqdata associated to that interrupt

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [tip: irq/core] PCI/MSI: Deobfuscate virtual MSI-X
  2021-07-29 21:51 ` [patch V2 16/19] PCI/MSI: Deobfuscate virtual MSI-X Thomas Gleixner
@ 2021-08-10  9:07   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 43+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-08-10  9:07 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Marc Zyngier, x86, linux-kernel

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     b296ababcc4bbf8efbb603d3aec6024a78662c1b
Gitweb:        https://git.kernel.org/tip/b296ababcc4bbf8efbb603d3aec6024a78662c1b
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Jul 2021 23:51:55 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 10 Aug 2021 11:03:30 +02:00

PCI/MSI: Deobfuscate virtual MSI-X

Handling of virtual MSI-X is obfuscated by letting pci_msix_desc_addr()
return NULL and checking the pointer.

Just use msi_desc::msi_attrib.is_virtual at the call sites and get rid of
that pointer check.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210729222543.151522318@linutronix.de

---
 drivers/pci/msi.c | 25 ++++++++-----------------
 1 file changed, 8 insertions(+), 17 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index d2821bb..1bde9ec 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -166,11 +166,7 @@ static void msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
 
 static void __iomem *pci_msix_desc_addr(struct msi_desc *desc)
 {
-	if (desc->msi_attrib.is_virtual)
-		return NULL;
-
-	return desc->mask_base +
-		desc->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE;
+	return desc->mask_base + desc->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE;
 }
 
 /*
@@ -182,14 +178,10 @@ static void __iomem *pci_msix_desc_addr(struct msi_desc *desc)
  */
 static u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag)
 {
+	void __iomem *desc_addr = pci_msix_desc_addr(desc);
 	u32 ctrl = desc->msix_ctrl;
-	void __iomem *desc_addr;
-
-	if (pci_msi_ignore_mask)
-		return 0;
 
-	desc_addr = pci_msix_desc_addr(desc);
-	if (!desc_addr)
+	if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual)
 		return 0;
 
 	ctrl &= ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
@@ -256,10 +248,8 @@ void __pci_read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
 	if (entry->msi_attrib.is_msix) {
 		void __iomem *base = pci_msix_desc_addr(entry);
 
-		if (!base) {
-			WARN_ON(1);
+		if (WARN_ON_ONCE(entry->msi_attrib.is_virtual))
 			return;
-		}
 
 		msg->address_lo = readl(base + PCI_MSIX_ENTRY_LOWER_ADDR);
 		msg->address_hi = readl(base + PCI_MSIX_ENTRY_UPPER_ADDR);
@@ -292,7 +282,7 @@ void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
 		void __iomem *base = pci_msix_desc_addr(entry);
 		bool unmasked = !(entry->msix_ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT);
 
-		if (!base)
+		if (entry->msi_attrib.is_virtual)
 			goto skip;
 
 		/*
@@ -744,9 +734,10 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
 		entry->msi_attrib.default_irq	= dev->irq;
 		entry->mask_base		= base;
 
-		addr = pci_msix_desc_addr(entry);
-		if (addr)
+		if (!entry->msi_attrib.is_virtual) {
+			addr = pci_msix_desc_addr(entry);
 			entry->msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
+		}
 
 		list_add_tail(&entry->list, dev_to_msi_list(&dev->dev));
 		if (masks)

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [tip: irq/core] PCI/MSI: Consolidate error handling in msi_capability_init()
  2021-07-29 21:51 ` [patch V2 15/19] PCI/MSI: Consolidate error handling in msi_capability_init() Thomas Gleixner
@ 2021-08-10  9:07   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 43+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-08-10  9:07 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Marc Zyngier, x86, linux-kernel

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     8eb5ce3f78a5e5d3f1a12248f6b7dc64ebf71da6
Gitweb:        https://git.kernel.org/tip/8eb5ce3f78a5e5d3f1a12248f6b7dc64ebf71da6
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Jul 2021 23:51:54 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 10 Aug 2021 11:03:29 +02:00

PCI/MSI: Consolidate error handling in msi_capability_init()

Three error exits doing exactly the same ask for a common error exit point.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210729222543.098828720@linutronix.de

---
 drivers/pci/msi.c | 26 +++++++++++---------------
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 175f4d6..d2821bb 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -659,25 +659,16 @@ static int msi_capability_init(struct pci_dev *dev, int nvec,
 
 	/* Configure MSI capability structure */
 	ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSI);
-	if (ret) {
-		msi_mask_irq(entry, mask, 0);
-		free_msi_irqs(dev);
-		return ret;
-	}
+	if (ret)
+		goto err;
 
 	ret = msi_verify_entries(dev);
-	if (ret) {
-		msi_mask_irq(entry, mask, 0);
-		free_msi_irqs(dev);
-		return ret;
-	}
+	if (ret)
+		goto err;
 
 	ret = populate_msi_sysfs(dev);
-	if (ret) {
-		msi_mask_irq(entry, mask, 0);
-		free_msi_irqs(dev);
-		return ret;
-	}
+	if (ret)
+		goto err;
 
 	/* Set MSI enabled bits	*/
 	pci_intx_for_msi(dev, 0);
@@ -687,6 +678,11 @@ static int msi_capability_init(struct pci_dev *dev, int nvec,
 	pcibios_free_irq(dev);
 	dev->irq = entry->irq;
 	return 0;
+
+err:
+	msi_mask_irq(entry, mask, 0);
+	free_msi_irqs(dev);
+	return ret;
 }
 
 static void __iomem *msix_map_region(struct pci_dev *dev, unsigned nr_entries)

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [tip: irq/core] PCI/MSI: Rename msi_desc::masked
  2021-07-29 21:51 ` [patch V2 14/19] PCI/MSI: Rename msi_desc::masked Thomas Gleixner
@ 2021-08-10  9:07   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 43+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-08-10  9:07 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Marc Zyngier, x86, linux-kernel

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     67961e77a39b8e975dd1906179b9224f29150357
Gitweb:        https://git.kernel.org/tip/67961e77a39b8e975dd1906179b9224f29150357
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Jul 2021 23:51:53 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 10 Aug 2021 11:03:29 +02:00

PCI/MSI: Rename msi_desc::masked

msi_desc::masked is a misnomer. For MSI it's used to cache the MSI mask
bits when the device supports per vector masking. For MSI-X it's used to
cache the content of the vector control word which contains the mask bit
for the vector.

Replace it with a union of msi_mask and msix_ctrl to make the purpose clear
and fix up the usage sites.

No functional change

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210729222543.045993608@linutronix.de

---
 drivers/pci/msi.c   | 30 +++++++++++++++---------------
 include/linux/msi.h |  8 ++++++--
 2 files changed, 21 insertions(+), 17 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index b59957c..175f4d6 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -152,10 +152,10 @@ static void __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
 		return;
 
 	raw_spin_lock_irqsave(lock, flags);
-	desc->masked &= ~mask;
-	desc->masked |= flag;
+	desc->msi_mask &= ~mask;
+	desc->msi_mask |= flag;
 	pci_write_config_dword(msi_desc_to_pci_dev(desc), desc->mask_pos,
-			       desc->masked);
+			       desc->msi_mask);
 	raw_spin_unlock_irqrestore(lock, flags);
 }
 
@@ -182,7 +182,7 @@ static void __iomem *pci_msix_desc_addr(struct msi_desc *desc)
  */
 static u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag)
 {
-	u32 mask_bits = desc->masked;
+	u32 ctrl = desc->msix_ctrl;
 	void __iomem *desc_addr;
 
 	if (pci_msi_ignore_mask)
@@ -192,18 +192,18 @@ static u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag)
 	if (!desc_addr)
 		return 0;
 
-	mask_bits &= ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
-	if (flag & PCI_MSIX_ENTRY_CTRL_MASKBIT)
-		mask_bits |= PCI_MSIX_ENTRY_CTRL_MASKBIT;
+	ctrl &= ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
+	if (ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT)
+		ctrl |= PCI_MSIX_ENTRY_CTRL_MASKBIT;
 
-	writel(mask_bits, desc_addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
+	writel(ctrl, desc_addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
 
-	return mask_bits;
+	return ctrl;
 }
 
 static void msix_mask_irq(struct msi_desc *desc, u32 flag)
 {
-	desc->masked = __pci_msix_desc_mask_irq(desc, flag);
+	desc->msix_ctrl = __pci_msix_desc_mask_irq(desc, flag);
 }
 
 static void msi_set_mask_bit(struct irq_data *data, u32 flag)
@@ -290,7 +290,7 @@ void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
 		/* Don't touch the hardware now */
 	} else if (entry->msi_attrib.is_msix) {
 		void __iomem *base = pci_msix_desc_addr(entry);
-		bool unmasked = !(entry->masked & PCI_MSIX_ENTRY_CTRL_MASKBIT);
+		bool unmasked = !(entry->msix_ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT);
 
 		if (!base)
 			goto skip;
@@ -430,7 +430,7 @@ static void __pci_restore_msi_state(struct pci_dev *dev)
 
 	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
 	msi_mask_irq(entry, msi_mask(entry->msi_attrib.multi_cap),
-		     entry->masked);
+		     entry->msi_mask);
 	control &= ~PCI_MSI_FLAGS_QSIZE;
 	control |= (entry->msi_attrib.multiple << 4) | PCI_MSI_FLAGS_ENABLE;
 	pci_write_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, control);
@@ -461,7 +461,7 @@ static void __pci_restore_msix_state(struct pci_dev *dev)
 
 	arch_restore_msi_irqs(dev);
 	for_each_pci_msi_entry(entry, dev)
-		msix_mask_irq(entry, entry->masked);
+		msix_mask_irq(entry, entry->msix_ctrl);
 
 	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
 }
@@ -602,7 +602,7 @@ msi_setup_entry(struct pci_dev *dev, int nvec, struct irq_affinity *affd)
 
 	/* Save the initial mask status */
 	if (entry->msi_attrib.maskbit)
-		pci_read_config_dword(dev, entry->mask_pos, &entry->masked);
+		pci_read_config_dword(dev, entry->mask_pos, &entry->msi_mask);
 
 out:
 	kfree(masks);
@@ -750,7 +750,7 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
 
 		addr = pci_msix_desc_addr(entry);
 		if (addr)
-			entry->masked = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
+			entry->msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
 
 		list_add_tail(&entry->list, dev_to_msi_list(&dev->dev));
 		if (masks)
diff --git a/include/linux/msi.h b/include/linux/msi.h
index 3d0e747..a20dc66 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -107,7 +107,8 @@ struct ti_sci_inta_msi_desc {
  *			address or data changes
  * @write_msi_msg_data:	Data parameter for the callback.
  *
- * @masked:	[PCI MSI/X] Mask bits
+ * @msi_mask:	[PCI MSI]   MSI cached mask bits
+ * @msix_ctrl:	[PCI MSI-X] MSI-X cached per vector control bits
  * @is_msix:	[PCI MSI/X] True if MSI-X
  * @multiple:	[PCI MSI/X] log2 num of messages allocated
  * @multi_cap:	[PCI MSI/X] log2 num of messages supported
@@ -139,7 +140,10 @@ struct msi_desc {
 	union {
 		/* PCI MSI/X specific data */
 		struct {
-			u32 masked;
+			union {
+				u32 msi_mask;
+				u32 msix_ctrl;
+			};
 			struct {
 				u8	is_msix		: 1;
 				u8	multiple	: 3;

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [tip: irq/core] PCI/MSI: Simplify msi_verify_entries()
  2021-07-29 21:51 ` [patch V2 13/19] PCI/MSI: Simplify msi_verify_entries() Thomas Gleixner
@ 2021-08-10  9:07   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 43+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-08-10  9:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Marc Zyngier, Bjorn Helgaas, x86, linux-kernel

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     a6e8b946508cda3c3bf0f9b0e133d293dc9754f6
Gitweb:        https://git.kernel.org/tip/a6e8b946508cda3c3bf0f9b0e133d293dc9754f6
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Jul 2021 23:51:52 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 10 Aug 2021 11:03:29 +02:00

PCI/MSI: Simplify msi_verify_entries()

No point in looping over all entries when 64bit addressing mode is enabled
for nothing.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20210729222542.992849326@linutronix.de

---
 drivers/pci/msi.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 95e6ce4..b59957c 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -613,8 +613,11 @@ static int msi_verify_entries(struct pci_dev *dev)
 {
 	struct msi_desc *entry;
 
+	if (!dev->no_64bit_msi)
+		return 0;
+
 	for_each_pci_msi_entry(entry, dev) {
-		if (entry->msg.address_hi && dev->no_64bit_msi) {
+		if (entry->msg.address_hi) {
 			pci_err(dev, "arch assigned 64-bit MSI address %#x%08x but device only supports 32 bits\n",
 				entry->msg.address_hi, entry->msg.address_lo);
 			return -EIO;

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [tip: irq/core] s390/pci: Do not mask MSI[-X] entries on teardown
  2021-07-29 21:51 ` [patch V2 12/19] s390/pci: Do not mask MSI[-X] entries on teardown Thomas Gleixner
  2021-08-03 12:48   ` Niklas Schnelle
@ 2021-08-10  9:07   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 43+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-08-10  9:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Niklas Schnelle, Marc Zyngier, x86, linux-kernel

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     3998527d2e3ee2bfdf710a45b7b90968ff87babc
Gitweb:        https://git.kernel.org/tip/3998527d2e3ee2bfdf710a45b7b90968ff87babc
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Jul 2021 23:51:51 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 10 Aug 2021 11:03:29 +02:00

s390/pci: Do not mask MSI[-X] entries on teardown

The PCI core already ensures that the MSI[-X] state is correct when MSI[-X]
is disabled. For MSI the reset state is all entries unmasked and for MSI-X
all vectors are masked.

S390 masks all MSI entries and masks the already masked MSI-X entries
again. Remove it and let the device in the correct state.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Niklas Schnelle <schnelle@linux.ibm.com>
Link: https://lore.kernel.org/r/20210729222542.939798136@linutronix.de

---
 arch/s390/pci/pci_irq.c | 4 ----
 drivers/pci/msi.c       | 4 ++--
 include/linux/msi.h     | 2 --
 3 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
index 9c7de90..3823e15 100644
--- a/arch/s390/pci/pci_irq.c
+++ b/arch/s390/pci/pci_irq.c
@@ -365,10 +365,6 @@ void arch_teardown_msi_irqs(struct pci_dev *pdev)
 	for_each_pci_msi_entry(msi, pdev) {
 		if (!msi->irq)
 			continue;
-		if (msi->msi_attrib.is_msix)
-			__pci_msix_desc_mask_irq(msi, 1);
-		else
-			__pci_msi_desc_mask_irq(msi, 1, 1);
 		irq_set_msi_desc(msi->irq, NULL);
 		irq_free_desc(msi->irq);
 		msi->msg.address_lo = 0;
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index e5e7533..95e6ce4 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -143,7 +143,7 @@ static inline __attribute_const__ u32 msi_mask(unsigned x)
  * reliably as devices without an INTx disable bit will then generate a
  * level IRQ which will never be cleared.
  */
-void __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
+static void __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
 {
 	raw_spinlock_t *lock = &desc->dev->msi_lock;
 	unsigned long flags;
@@ -180,7 +180,7 @@ static void __iomem *pci_msix_desc_addr(struct msi_desc *desc)
  * file.  This saves a few milliseconds when initialising devices with lots
  * of MSI-X interrupts.
  */
-u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag)
+static u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag)
 {
 	u32 mask_bits = desc->masked;
 	void __iomem *desc_addr;
diff --git a/include/linux/msi.h b/include/linux/msi.h
index e8bdcb8..3d0e747 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -232,8 +232,6 @@ void free_msi_entry(struct msi_desc *entry);
 void __pci_read_msi_msg(struct msi_desc *entry, struct msi_msg *msg);
 void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg);
 
-u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag);
-void __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag);
 void pci_msi_mask_irq(struct irq_data *data);
 void pci_msi_unmask_irq(struct irq_data *data);
 

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [tip: irq/core] x86/ioapic: Force affinity setup before startup
  2021-07-29 21:51 ` [patch V2 10/19] x86/ioapic: Force affinity setup before startup Thomas Gleixner
@ 2021-08-10  9:07   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 43+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-08-10  9:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Marc Zyngier, stable, x86, linux-kernel

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     0c0e37dc11671384e53ba6ede53a4d91162a2cc5
Gitweb:        https://git.kernel.org/tip/0c0e37dc11671384e53ba6ede53a4d91162a2cc5
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Jul 2021 23:51:49 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 10 Aug 2021 10:59:21 +02:00

x86/ioapic: Force affinity setup before startup

The IO/APIC cannot handle interrupt affinity changes safely after startup
other than from an interrupt handler. The startup sequence in the generic
interrupt code violates that assumption.

Mark the irq chip with the new IRQCHIP_AFFINITY_PRE_STARTUP flag so that
the default interrupt setting happens before the interrupt is started up
for the first time.

Fixes: 18404756765c ("genirq: Expose default irq affinity mask (take 3)")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.832143400@linutronix.de


---
 arch/x86/kernel/apic/io_apic.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index d5c691a..39224e0 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1986,7 +1986,8 @@ static struct irq_chip ioapic_chip __read_mostly = {
 	.irq_set_affinity	= ioapic_set_affinity,
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
 	.irq_get_irqchip_state	= ioapic_irq_get_chip_state,
-	.flags			= IRQCHIP_SKIP_SET_WAKE,
+	.flags			= IRQCHIP_SKIP_SET_WAKE |
+				  IRQCHIP_AFFINITY_PRE_STARTUP,
 };
 
 static struct irq_chip ioapic_ir_chip __read_mostly = {
@@ -1999,7 +2000,8 @@ static struct irq_chip ioapic_ir_chip __read_mostly = {
 	.irq_set_affinity	= ioapic_set_affinity,
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
 	.irq_get_irqchip_state	= ioapic_irq_get_chip_state,
-	.flags			= IRQCHIP_SKIP_SET_WAKE,
+	.flags			= IRQCHIP_SKIP_SET_WAKE |
+				  IRQCHIP_AFFINITY_PRE_STARTUP,
 };
 
 static inline void init_IO_APIC_traps(void)

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [tip: irq/core] x86/msi: Force affinity setup before startup
  2021-07-29 21:51 ` [patch V2 11/19] x86/msi: " Thomas Gleixner
@ 2021-08-10  9:07   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 43+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-08-10  9:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Marc Zyngier, stable, x86, linux-kernel

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     ff363f480e5997051dd1de949121ffda3b753741
Gitweb:        https://git.kernel.org/tip/ff363f480e5997051dd1de949121ffda3b753741
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Jul 2021 23:51:50 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 10 Aug 2021 10:59:21 +02:00

x86/msi: Force affinity setup before startup

The X86 MSI mechanism cannot handle interrupt affinity changes safely after
startup other than from an interrupt handler, unless interrupt remapping is
enabled. The startup sequence in the generic interrupt code violates that
assumption.

Mark the irq chips with the new IRQCHIP_AFFINITY_PRE_STARTUP flag so that
the default interrupt setting happens before the interrupt is started up
for the first time.

While the interrupt remapping MSI chip does not require this, there is no
point in treating it differently as this might spare an interrupt to a CPU
which is not in the default affinity mask.

For the non-remapping case go to the direct write path when the interrupt
is not yet started similar to the not yet activated case.

Fixes: 18404756765c ("genirq: Expose default irq affinity mask (take 3)")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.886722080@linutronix.de


---
 arch/x86/kernel/apic/msi.c | 11 ++++++++---
 arch/x86/kernel/hpet.c     |  2 +-
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 44ebe25..dbacb9e 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -58,11 +58,13 @@ msi_set_affinity(struct irq_data *irqd, const struct cpumask *mask, bool force)
 	 *   The quirk bit is not set in this case.
 	 * - The new vector is the same as the old vector
 	 * - The old vector is MANAGED_IRQ_SHUTDOWN_VECTOR (interrupt starts up)
+	 * - The interrupt is not yet started up
 	 * - The new destination CPU is the same as the old destination CPU
 	 */
 	if (!irqd_msi_nomask_quirk(irqd) ||
 	    cfg->vector == old_cfg.vector ||
 	    old_cfg.vector == MANAGED_IRQ_SHUTDOWN_VECTOR ||
+	    !irqd_is_started(irqd) ||
 	    cfg->dest_apicid == old_cfg.dest_apicid) {
 		irq_msi_update_msg(irqd, cfg);
 		return ret;
@@ -150,7 +152,8 @@ static struct irq_chip pci_msi_controller = {
 	.irq_ack		= irq_chip_ack_parent,
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
 	.irq_set_affinity	= msi_set_affinity,
-	.flags			= IRQCHIP_SKIP_SET_WAKE,
+	.flags			= IRQCHIP_SKIP_SET_WAKE |
+				  IRQCHIP_AFFINITY_PRE_STARTUP,
 };
 
 int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
@@ -219,7 +222,8 @@ static struct irq_chip pci_msi_ir_controller = {
 	.irq_mask		= pci_msi_mask_irq,
 	.irq_ack		= irq_chip_ack_parent,
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
-	.flags			= IRQCHIP_SKIP_SET_WAKE,
+	.flags			= IRQCHIP_SKIP_SET_WAKE |
+				  IRQCHIP_AFFINITY_PRE_STARTUP,
 };
 
 static struct msi_domain_info pci_msi_ir_domain_info = {
@@ -273,7 +277,8 @@ static struct irq_chip dmar_msi_controller = {
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
 	.irq_compose_msi_msg	= dmar_msi_compose_msg,
 	.irq_write_msi_msg	= dmar_msi_write_msg,
-	.flags			= IRQCHIP_SKIP_SET_WAKE,
+	.flags			= IRQCHIP_SKIP_SET_WAKE |
+				  IRQCHIP_AFFINITY_PRE_STARTUP,
 };
 
 static int dmar_msi_init(struct irq_domain *domain,
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index 08651a4..42fc41d 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -508,7 +508,7 @@ static struct irq_chip hpet_msi_controller __ro_after_init = {
 	.irq_set_affinity = msi_domain_set_affinity,
 	.irq_retrigger = irq_chip_retrigger_hierarchy,
 	.irq_write_msi_msg = hpet_msi_write_msg,
-	.flags = IRQCHIP_SKIP_SET_WAKE,
+	.flags = IRQCHIP_SKIP_SET_WAKE | IRQCHIP_AFFINITY_PRE_STARTUP,
 };
 
 static int hpet_msi_init(struct irq_domain *domain,

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [tip: irq/core] PCI/MSI: Protect msi_desc::masked for multi-MSI
  2021-07-29 21:51 ` [patch V2 08/19] PCI/MSI: Protect msi_desc::masked for multi-MSI Thomas Gleixner
@ 2021-08-10  9:07   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 43+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-08-10  9:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Marc Zyngier, stable, x86, linux-kernel

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     77e89afc25f30abd56e76a809ee2884d7c1b63ce
Gitweb:        https://git.kernel.org/tip/77e89afc25f30abd56e76a809ee2884d7c1b63ce
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Jul 2021 23:51:47 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 10 Aug 2021 10:59:20 +02:00

PCI/MSI: Protect msi_desc::masked for multi-MSI

Multi-MSI uses a single MSI descriptor and there is a single mask register
when the device supports per vector masking. To avoid reading back the mask
register the value is cached in the MSI descriptor and updates are done by
clearing and setting bits in the cache and writing it to the device.

But nothing protects msi_desc::masked and the mask register from being
modified concurrently on two different CPUs for two different Linux
interrupts which belong to the same multi-MSI descriptor.

Add a lock to struct device and protect any operation on the mask and the
mask register with it.

This makes the update of msi_desc::masked unconditional, but there is no
place which requires a modification of the hardware register without
updating the masked cache.

msi_mask_irq() is now an empty wrapper which will be cleaned up in follow
up changes.

The problem goes way back to the initial support of multi-MSI, but picking
the commit which introduced the mask cache is a valid cut off point
(2.6.30).

Fixes: f2440d9acbe8 ("PCI MSI: Refactor interrupt masking code")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.726833414@linutronix.de

---
 drivers/base/core.c    |  1 +
 drivers/pci/msi.c      | 19 ++++++++++---------
 include/linux/device.h |  1 +
 include/linux/msi.h    |  2 +-
 4 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index f636049..6c0ef9d 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2837,6 +2837,7 @@ void device_initialize(struct device *dev)
 	device_pm_init(dev);
 	set_dev_node(dev, -1);
 #ifdef CONFIG_GENERIC_MSI_IRQ
+	raw_spin_lock_init(&dev->msi_lock);
 	INIT_LIST_HEAD(&dev->msi_list);
 #endif
 	INIT_LIST_HEAD(&dev->links.consumers);
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index f0f7026..e5e7533 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -143,24 +143,25 @@ static inline __attribute_const__ u32 msi_mask(unsigned x)
  * reliably as devices without an INTx disable bit will then generate a
  * level IRQ which will never be cleared.
  */
-u32 __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
+void __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
 {
-	u32 mask_bits = desc->masked;
+	raw_spinlock_t *lock = &desc->dev->msi_lock;
+	unsigned long flags;
 
 	if (pci_msi_ignore_mask || !desc->msi_attrib.maskbit)
-		return 0;
+		return;
 
-	mask_bits &= ~mask;
-	mask_bits |= flag;
+	raw_spin_lock_irqsave(lock, flags);
+	desc->masked &= ~mask;
+	desc->masked |= flag;
 	pci_write_config_dword(msi_desc_to_pci_dev(desc), desc->mask_pos,
-			       mask_bits);
-
-	return mask_bits;
+			       desc->masked);
+	raw_spin_unlock_irqrestore(lock, flags);
 }
 
 static void msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
 {
-	desc->masked = __pci_msi_desc_mask_irq(desc, mask, flag);
+	__pci_msi_desc_mask_irq(desc, mask, flag);
 }
 
 static void __iomem *pci_msix_desc_addr(struct msi_desc *desc)
diff --git a/include/linux/device.h b/include/linux/device.h
index 59940f1..e53aa50 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -506,6 +506,7 @@ struct device {
 	struct dev_pin_info	*pins;
 #endif
 #ifdef CONFIG_GENERIC_MSI_IRQ
+	raw_spinlock_t		msi_lock;
 	struct list_head	msi_list;
 #endif
 #ifdef CONFIG_DMA_OPS
diff --git a/include/linux/msi.h b/include/linux/msi.h
index 6aff469..e8bdcb8 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -233,7 +233,7 @@ void __pci_read_msi_msg(struct msi_desc *entry, struct msi_msg *msg);
 void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg);
 
 u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag);
-u32 __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag);
+void __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag);
 void pci_msi_mask_irq(struct irq_data *data);
 void pci_msi_unmask_irq(struct irq_data *data);
 

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [tip: irq/core] genirq: Provide IRQCHIP_AFFINITY_PRE_STARTUP
  2021-07-29 21:51 ` [patch V2 09/19] genirq: Provide IRQCHIP_AFFINITY_PRE_STARTUP Thomas Gleixner
@ 2021-08-10  9:07   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 43+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-08-10  9:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Marc Zyngier, stable, x86, linux-kernel

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     826da771291fc25a428e871f9e7fb465e390f852
Gitweb:        https://git.kernel.org/tip/826da771291fc25a428e871f9e7fb465e390f852
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Jul 2021 23:51:48 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 10 Aug 2021 10:59:20 +02:00

genirq: Provide IRQCHIP_AFFINITY_PRE_STARTUP

X86 IO/APIC and MSI interrupts (when used without interrupts remapping)
require that the affinity setup on startup is done before the interrupt is
enabled for the first time as the non-remapped operation mode cannot safely
migrate enabled interrupts from arbitrary contexts. Provide a new irq chip
flag which allows affected hardware to request this.

This has to be opt-in because there have been reports in the past that some
interrupt chips cannot handle affinity setting before startup.

Fixes: 18404756765c ("genirq: Expose default irq affinity mask (take 3)")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.779791738@linutronix.de

---
 include/linux/irq.h | 2 ++
 kernel/irq/chip.c   | 5 ++++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/linux/irq.h b/include/linux/irq.h
index 8e9a9ae..c8293c8 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -569,6 +569,7 @@ struct irq_chip {
  * IRQCHIP_SUPPORTS_NMI:              Chip can deliver NMIs, only for root irqchips
  * IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND:  Invokes __enable_irq()/__disable_irq() for wake irqs
  *                                    in the suspend path if they are in disabled state
+ * IRQCHIP_AFFINITY_PRE_STARTUP:      Default affinity update before startup
  */
 enum {
 	IRQCHIP_SET_TYPE_MASKED			= (1 <<  0),
@@ -581,6 +582,7 @@ enum {
 	IRQCHIP_SUPPORTS_LEVEL_MSI		= (1 <<  7),
 	IRQCHIP_SUPPORTS_NMI			= (1 <<  8),
 	IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND	= (1 <<  9),
+	IRQCHIP_AFFINITY_PRE_STARTUP		= (1 << 10),
 };
 
 #include <linux/irqdesc.h>
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 7f04c7d..a98bcfc 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -265,8 +265,11 @@ int irq_startup(struct irq_desc *desc, bool resend, bool force)
 	} else {
 		switch (__irq_startup_managed(desc, aff, force)) {
 		case IRQ_STARTUP_NORMAL:
+			if (d->chip->flags & IRQCHIP_AFFINITY_PRE_STARTUP)
+				irq_setup_affinity(desc);
 			ret = __irq_startup(desc);
-			irq_setup_affinity(desc);
+			if (!(d->chip->flags & IRQCHIP_AFFINITY_PRE_STARTUP))
+				irq_setup_affinity(desc);
 			break;
 		case IRQ_STARTUP_MANAGED:
 			irq_do_set_affinity(d, aff, false);

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [tip: irq/core] PCI/MSI: Correct misleading comments
  2021-07-29 21:51 ` [patch V2 06/19] PCI/MSI: Correct misleading comments Thomas Gleixner
@ 2021-08-10  9:07   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 43+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-08-10  9:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Marc Zyngier, stable, x86, linux-kernel

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     689e6b5351573c38ccf92a0dd8b3e2c2241e4aff
Gitweb:        https://git.kernel.org/tip/689e6b5351573c38ccf92a0dd8b3e2c2241e4aff
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Jul 2021 23:51:45 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 10 Aug 2021 10:59:20 +02:00

PCI/MSI: Correct misleading comments

The comments about preserving the cached state in pci_msi[x]_shutdown() are
misleading as the MSI descriptors are freed right after those functions
return. So there is nothing to restore. Preparatory change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.621609423@linutronix.de

---
 drivers/pci/msi.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index e27ac6b..b3f5807 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -961,7 +961,6 @@ static void pci_msi_shutdown(struct pci_dev *dev)
 
 	/* Return the device with MSI unmasked as initial states */
 	mask = msi_mask(desc->msi_attrib.multi_cap);
-	/* Keep cached state to be restored */
 	__pci_msi_desc_mask_irq(desc, mask, 0);
 
 	/* Restore dev->irq to its default pin-assertion IRQ */
@@ -1047,10 +1046,8 @@ static void pci_msix_shutdown(struct pci_dev *dev)
 	}
 
 	/* Return the device with MSI-X masked as initial states */
-	for_each_pci_msi_entry(entry, dev) {
-		/* Keep cached states to be restored */
+	for_each_pci_msi_entry(entry, dev)
 		__pci_msix_desc_mask_irq(entry, 1);
-	}
 
 	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
 	pci_intx_for_msi(dev, 1);

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [tip: irq/core] PCI/MSI: Use msi_mask_irq() in pci_msi_shutdown()
  2021-07-29 21:51 ` [patch V2 07/19] PCI/MSI: Use msi_mask_irq() in pci_msi_shutdown() Thomas Gleixner
@ 2021-08-10  9:07   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 43+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-08-10  9:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Marc Zyngier, stable, x86, linux-kernel

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     d28d4ad2a1aef27458b3383725bb179beb8d015c
Gitweb:        https://git.kernel.org/tip/d28d4ad2a1aef27458b3383725bb179beb8d015c
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Jul 2021 23:51:46 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 10 Aug 2021 10:59:20 +02:00

PCI/MSI: Use msi_mask_irq() in pci_msi_shutdown()

No point in using the raw write function from shutdown. Preparatory change
to introduce proper serialization for the msi_desc::masked cache.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.674391354@linutronix.de

---
 drivers/pci/msi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index b3f5807..f0f7026 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -961,7 +961,7 @@ static void pci_msi_shutdown(struct pci_dev *dev)
 
 	/* Return the device with MSI unmasked as initial states */
 	mask = msi_mask(desc->msi_attrib.multi_cap);
-	__pci_msi_desc_mask_irq(desc, mask, 0);
+	msi_mask_irq(desc, mask, 0);
 
 	/* Restore dev->irq to its default pin-assertion IRQ */
 	dev->irq = desc->msi_attrib.default_irq;

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [tip: irq/core] PCI/MSI: Do not set invalid bits in MSI mask
  2021-07-29 21:51 ` [patch V2 05/19] PCI/MSI: Do not set invalid bits in MSI mask Thomas Gleixner
@ 2021-08-10  9:07   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 43+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-08-10  9:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Marc Zyngier, stable, x86, linux-kernel

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     361fd37397f77578735907341579397d5bed0a2d
Gitweb:        https://git.kernel.org/tip/361fd37397f77578735907341579397d5bed0a2d
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Jul 2021 23:51:44 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 10 Aug 2021 10:59:20 +02:00

PCI/MSI: Do not set invalid bits in MSI mask

msi_mask_irq() takes a mask and a flags argument. The mask argument is used
to mask out bits from the cached mask and the flags argument to set bits.

Some places invoke it with a flags argument which sets bits which are not
used by the device, i.e. when the device supports up to 8 vectors a full
unmask in some places sets the mask to 0xFFFFFF00. While devices probably
do not care, it's still bad practice.

Fixes: 7ba1930db02f ("PCI MSI: Unmask MSI if setup failed")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.568173099@linutronix.de

---
 drivers/pci/msi.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 434c704..e27ac6b 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -656,21 +656,21 @@ static int msi_capability_init(struct pci_dev *dev, int nvec,
 	/* Configure MSI capability structure */
 	ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSI);
 	if (ret) {
-		msi_mask_irq(entry, mask, ~mask);
+		msi_mask_irq(entry, mask, 0);
 		free_msi_irqs(dev);
 		return ret;
 	}
 
 	ret = msi_verify_entries(dev);
 	if (ret) {
-		msi_mask_irq(entry, mask, ~mask);
+		msi_mask_irq(entry, mask, 0);
 		free_msi_irqs(dev);
 		return ret;
 	}
 
 	ret = populate_msi_sysfs(dev);
 	if (ret) {
-		msi_mask_irq(entry, mask, ~mask);
+		msi_mask_irq(entry, mask, 0);
 		free_msi_irqs(dev);
 		return ret;
 	}
@@ -962,7 +962,7 @@ static void pci_msi_shutdown(struct pci_dev *dev)
 	/* Return the device with MSI unmasked as initial states */
 	mask = msi_mask(desc->msi_attrib.multi_cap);
 	/* Keep cached state to be restored */
-	__pci_msi_desc_mask_irq(desc, mask, ~mask);
+	__pci_msi_desc_mask_irq(desc, mask, 0);
 
 	/* Restore dev->irq to its default pin-assertion IRQ */
 	dev->irq = desc->msi_attrib.default_irq;

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [tip: irq/core] PCI/MSI: Enforce MSI[X] entry updates to be visible
  2021-07-29 21:51 ` [patch V2 04/19] PCI/MSI: Enforce MSI[X] entry updates to be visible Thomas Gleixner
@ 2021-08-10  9:07   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 43+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-08-10  9:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Marc Zyngier, Bjorn Helgaas, stable, x86, linux-kernel

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     b9255a7cb51754e8d2645b65dd31805e282b4f3e
Gitweb:        https://git.kernel.org/tip/b9255a7cb51754e8d2645b65dd31805e282b4f3e
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Jul 2021 23:51:43 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 10 Aug 2021 10:59:20 +02:00

PCI/MSI: Enforce MSI[X] entry updates to be visible

Nothing enforces the posted writes to be visible when the function
returns. Flush them even if the flush might be redundant when the entry is
masked already as the unmask will flush as well. This is either setup or a
rare affinity change event so the extra flush is not the end of the world.

While this is more a theoretical issue especially the logic in the X86
specific msi_set_affinity() function relies on the assumption that the
update has reached the hardware when the function returns.

Again, as this never has been enforced the Fixes tag refers to a commit in:
   git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git

Fixes: f036d4ea5fa7 ("[PATCH] ia32 Message Signalled Interrupt support")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.515188147@linutronix.de

---
 drivers/pci/msi.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 7ee1ac4..434c704 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -311,6 +311,9 @@ void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
 
 		if (unmasked)
 			__pci_msix_desc_mask_irq(entry, 0);
+
+		/* Ensure that the writes are visible in the device */
+		readl(base + PCI_MSIX_ENTRY_DATA);
 	} else {
 		int pos = dev->msi_cap;
 		u16 msgctl;
@@ -331,6 +334,8 @@ void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
 			pci_write_config_word(dev, pos + PCI_MSI_DATA_32,
 					      msg->data);
 		}
+		/* Ensure that the writes are visible in the device */
+		pci_read_config_word(dev, pos + PCI_MSI_FLAGS, &msgctl);
 	}
 
 skip:

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [tip: irq/core] PCI/MSI: Enforce that MSI-X table entry is masked for update
  2021-07-29 21:51 ` [patch V2 03/19] PCI/MSI: Enforce that MSI-X table entry is masked for update Thomas Gleixner
@ 2021-08-10  9:07   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 43+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-08-10  9:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Kevin Tian, Thomas Gleixner, Marc Zyngier, Bjorn Helgaas, stable,
	x86, linux-kernel

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     da181dc974ad667579baece33c2c8d2d1e4558d5
Gitweb:        https://git.kernel.org/tip/da181dc974ad667579baece33c2c8d2d1e4558d5
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Jul 2021 23:51:42 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 10 Aug 2021 10:59:20 +02:00

PCI/MSI: Enforce that MSI-X table entry is masked for update

The specification (PCIe r5.0, sec 6.1.4.5) states:

    For MSI-X, a function is permitted to cache Address and Data values
    from unmasked MSI-X Table entries. However, anytime software unmasks a
    currently masked MSI-X Table entry either by clearing its Mask bit or
    by clearing the Function Mask bit, the function must update any Address
    or Data values that it cached from that entry. If software changes the
    Address or Data value of an entry while the entry is unmasked, the
    result is undefined.

The Linux kernel's MSI-X support never enforced that the entry is masked
before the entry is modified hence the Fixes tag refers to a commit in:
      git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git

Enforce the entry to be masked across the update.

There is no point in enforcing this to be handled at all possible call
sites as this is just pointless code duplication and the common update
function is the obvious place to enforce this.

Fixes: f036d4ea5fa7 ("[PATCH] ia32 Message Signalled Interrupt support")
Reported-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.462096385@linutronix.de

---
 drivers/pci/msi.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 57c9ec9..7ee1ac4 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -289,13 +289,28 @@ void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
 		/* Don't touch the hardware now */
 	} else if (entry->msi_attrib.is_msix) {
 		void __iomem *base = pci_msix_desc_addr(entry);
+		bool unmasked = !(entry->masked & PCI_MSIX_ENTRY_CTRL_MASKBIT);
 
 		if (!base)
 			goto skip;
 
+		/*
+		 * The specification mandates that the entry is masked
+		 * when the message is modified:
+		 *
+		 * "If software changes the Address or Data value of an
+		 * entry while the entry is unmasked, the result is
+		 * undefined."
+		 */
+		if (unmasked)
+			__pci_msix_desc_mask_irq(entry, PCI_MSIX_ENTRY_CTRL_MASKBIT);
+
 		writel(msg->address_lo, base + PCI_MSIX_ENTRY_LOWER_ADDR);
 		writel(msg->address_hi, base + PCI_MSIX_ENTRY_UPPER_ADDR);
 		writel(msg->data, base + PCI_MSIX_ENTRY_DATA);
+
+		if (unmasked)
+			__pci_msix_desc_mask_irq(entry, 0);
 	} else {
 		int pos = dev->msi_cap;
 		u16 msgctl;

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [tip: irq/core] PCI/MSI: Mask all unused MSI-X entries
  2021-07-29 21:51 ` [patch V2 02/19] PCI/MSI: Mask all unused MSI-X entries Thomas Gleixner
@ 2021-08-10  9:07   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 43+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-08-10  9:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Marc Zyngier, Bjorn Helgaas, stable, x86, linux-kernel

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     7d5ec3d3612396dc6d4b76366d20ab9fc06f399f
Gitweb:        https://git.kernel.org/tip/7d5ec3d3612396dc6d4b76366d20ab9fc06f399f
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Jul 2021 23:51:41 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 10 Aug 2021 10:59:20 +02:00

PCI/MSI: Mask all unused MSI-X entries

When MSI-X is enabled the ordering of calls is:

  msix_map_region();
  msix_setup_entries();
  pci_msi_setup_msi_irqs();
  msix_program_entries();

This has a few interesting issues:

 1) msix_setup_entries() allocates the MSI descriptors and initializes them
    except for the msi_desc:masked member which is left zero initialized.

 2) pci_msi_setup_msi_irqs() allocates the interrupt descriptors and sets
    up the MSI interrupts which ends up in pci_write_msi_msg() unless the
    interrupt chip provides its own irq_write_msi_msg() function.

 3) msix_program_entries() does not do what the name suggests. It solely
    updates the entries array (if not NULL) and initializes the masked
    member for each MSI descriptor by reading the hardware state and then
    masks the entry.

Obviously this has some issues:

 1) The uninitialized masked member of msi_desc prevents the enforcement
    of masking the entry in pci_write_msi_msg() depending on the cached
    masked bit. Aside of that half initialized data is a NONO in general

 2) msix_program_entries() only ensures that the actually allocated entries
    are masked. This is wrong as experimentation with crash testing and
    crash kernel kexec has shown.

    This limited testing unearthed that when the production kernel had more
    entries in use and unmasked when it crashed and the crash kernel
    allocated a smaller amount of entries, then a full scan of all entries
    found unmasked entries which were in use in the production kernel.

    This is obviously a device or emulation issue as the device reset
    should mask all MSI-X table entries, but obviously that's just part
    of the paper specification.

Cure this by:

 1) Masking all table entries in hardware
 2) Initializing msi_desc::masked in msix_setup_entries()
 3) Removing the mask dance in msix_program_entries()
 4) Renaming msix_program_entries() to msix_update_entries() to
    reflect the purpose of that function.

As the masking of unused entries has never been done the Fixes tag refers
to a commit in:
   git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git

Fixes: f036d4ea5fa7 ("[PATCH] ia32 Message Signalled Interrupt support")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.403833459@linutronix.de

---
 drivers/pci/msi.c | 45 +++++++++++++++++++++++++++------------------
 1 file changed, 27 insertions(+), 18 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 5d39ed8..57c9ec9 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -691,6 +691,7 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
 {
 	struct irq_affinity_desc *curmsk, *masks = NULL;
 	struct msi_desc *entry;
+	void __iomem *addr;
 	int ret, i;
 	int vec_count = pci_msix_vec_count(dev);
 
@@ -711,6 +712,7 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
 
 		entry->msi_attrib.is_msix	= 1;
 		entry->msi_attrib.is_64		= 1;
+
 		if (entries)
 			entry->msi_attrib.entry_nr = entries[i].entry;
 		else
@@ -722,6 +724,10 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
 		entry->msi_attrib.default_irq	= dev->irq;
 		entry->mask_base		= base;
 
+		addr = pci_msix_desc_addr(entry);
+		if (addr)
+			entry->masked = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
+
 		list_add_tail(&entry->list, dev_to_msi_list(&dev->dev));
 		if (masks)
 			curmsk++;
@@ -732,26 +738,25 @@ out:
 	return ret;
 }
 
-static void msix_program_entries(struct pci_dev *dev,
-				 struct msix_entry *entries)
+static void msix_update_entries(struct pci_dev *dev, struct msix_entry *entries)
 {
 	struct msi_desc *entry;
-	int i = 0;
-	void __iomem *desc_addr;
 
 	for_each_pci_msi_entry(entry, dev) {
-		if (entries)
-			entries[i++].vector = entry->irq;
+		if (entries) {
+			entries->vector = entry->irq;
+			entries++;
+		}
+	}
+}
 
-		desc_addr = pci_msix_desc_addr(entry);
-		if (desc_addr)
-			entry->masked = readl(desc_addr +
-					      PCI_MSIX_ENTRY_VECTOR_CTRL);
-		else
-			entry->masked = 0;
+static void msix_mask_all(void __iomem *base, int tsize)
+{
+	u32 ctrl = PCI_MSIX_ENTRY_CTRL_MASKBIT;
+	int i;
 
-		msix_mask_irq(entry, 1);
-	}
+	for (i = 0; i < tsize; i++, base += PCI_MSIX_ENTRY_SIZE)
+		writel(ctrl, base + PCI_MSIX_ENTRY_VECTOR_CTRL);
 }
 
 /**
@@ -768,9 +773,9 @@ static void msix_program_entries(struct pci_dev *dev,
 static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries,
 				int nvec, struct irq_affinity *affd)
 {
-	int ret;
-	u16 control;
 	void __iomem *base;
+	int ret, tsize;
+	u16 control;
 
 	/*
 	 * Some devices require MSI-X to be enabled before the MSI-X
@@ -782,12 +787,16 @@ static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries,
 
 	pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, &control);
 	/* Request & Map MSI-X table region */
-	base = msix_map_region(dev, msix_table_size(control));
+	tsize = msix_table_size(control);
+	base = msix_map_region(dev, tsize);
 	if (!base) {
 		ret = -ENOMEM;
 		goto out_disable;
 	}
 
+	/* Ensure that all table entries are masked. */
+	msix_mask_all(base, tsize);
+
 	ret = msix_setup_entries(dev, base, entries, nvec, affd);
 	if (ret)
 		goto out_disable;
@@ -801,7 +810,7 @@ static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries,
 	if (ret)
 		goto out_free;
 
-	msix_program_entries(dev, entries);
+	msix_update_entries(dev, entries);
 
 	ret = populate_msi_sysfs(dev);
 	if (ret)

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [tip: irq/core] PCI/MSI: Enable and mask MSI-X early
  2021-07-29 21:51 ` [patch V2 01/19] PCI/MSI: Enable and mask MSI-X early Thomas Gleixner
@ 2021-08-10  9:07   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 43+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-08-10  9:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Marc Zyngier, Ashok Raj, Bjorn Helgaas, stable,
	x86, linux-kernel

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     438553958ba19296663c6d6583d208dfb6792830
Gitweb:        https://git.kernel.org/tip/438553958ba19296663c6d6583d208dfb6792830
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Jul 2021 23:51:40 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 10 Aug 2021 10:59:20 +02:00

PCI/MSI: Enable and mask MSI-X early

The ordering of MSI-X enable in hardware is dysfunctional:

 1) MSI-X is disabled in the control register
 2) Various setup functions
 3) pci_msi_setup_msi_irqs() is invoked which ends up accessing
    the MSI-X table entries
 4) MSI-X is enabled and masked in the control register with the
    comment that enabling is required for some hardware to access
    the MSI-X table

Step #4 obviously contradicts #3. The history of this is an issue with the
NIU hardware. When #4 was introduced the table access actually happened in
msix_program_entries() which was invoked after enabling and masking MSI-X.

This was changed in commit d71d6432e105 ("PCI/MSI: Kill redundant call of
irq_set_msi_desc() for MSI-X interrupts") which removed the table write
from msix_program_entries().

Interestingly enough nobody noticed and either NIU still works or it did
not get any testing with a kernel 3.19 or later.

Nevertheless this is inconsistent and there is no reason why MSI-X can't be
enabled and masked in the control register early on, i.e. move step #4
above to step #1. This preserves the NIU workaround and has no side effects
on other hardware.

Fixes: d71d6432e105 ("PCI/MSI: Kill redundant call of irq_set_msi_desc() for MSI-X interrupts")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.344136412@linutronix.de

---
 drivers/pci/msi.c | 28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 9232255..5d39ed8 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -772,18 +772,25 @@ static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries,
 	u16 control;
 	void __iomem *base;
 
-	/* Ensure MSI-X is disabled while it is set up */
-	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
+	/*
+	 * Some devices require MSI-X to be enabled before the MSI-X
+	 * registers can be accessed.  Mask all the vectors to prevent
+	 * interrupts coming in before they're fully set up.
+	 */
+	pci_msix_clear_and_set_ctrl(dev, 0, PCI_MSIX_FLAGS_MASKALL |
+				    PCI_MSIX_FLAGS_ENABLE);
 
 	pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, &control);
 	/* Request & Map MSI-X table region */
 	base = msix_map_region(dev, msix_table_size(control));
-	if (!base)
-		return -ENOMEM;
+	if (!base) {
+		ret = -ENOMEM;
+		goto out_disable;
+	}
 
 	ret = msix_setup_entries(dev, base, entries, nvec, affd);
 	if (ret)
-		return ret;
+		goto out_disable;
 
 	ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX);
 	if (ret)
@@ -794,14 +801,6 @@ static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries,
 	if (ret)
 		goto out_free;
 
-	/*
-	 * Some devices require MSI-X to be enabled before we can touch the
-	 * MSI-X registers.  We need to mask all the vectors to prevent
-	 * interrupts coming in before they're fully set up.
-	 */
-	pci_msix_clear_and_set_ctrl(dev, 0,
-				PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE);
-
 	msix_program_entries(dev, entries);
 
 	ret = populate_msi_sysfs(dev);
@@ -836,6 +835,9 @@ out_avail:
 out_free:
 	free_msi_irqs(dev);
 
+out_disable:
+	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
+
 	return ret;
 }
 

^ permalink raw reply related	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2021-08-10  9:09 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-29 21:51 [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Thomas Gleixner
2021-07-29 21:51 ` [patch V2 01/19] PCI/MSI: Enable and mask MSI-X early Thomas Gleixner
2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2021-07-29 21:51 ` [patch V2 02/19] PCI/MSI: Mask all unused MSI-X entries Thomas Gleixner
2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2021-07-29 21:51 ` [patch V2 03/19] PCI/MSI: Enforce that MSI-X table entry is masked for update Thomas Gleixner
2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2021-07-29 21:51 ` [patch V2 04/19] PCI/MSI: Enforce MSI[X] entry updates to be visible Thomas Gleixner
2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2021-07-29 21:51 ` [patch V2 05/19] PCI/MSI: Do not set invalid bits in MSI mask Thomas Gleixner
2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2021-07-29 21:51 ` [patch V2 06/19] PCI/MSI: Correct misleading comments Thomas Gleixner
2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2021-07-29 21:51 ` [patch V2 07/19] PCI/MSI: Use msi_mask_irq() in pci_msi_shutdown() Thomas Gleixner
2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2021-07-29 21:51 ` [patch V2 08/19] PCI/MSI: Protect msi_desc::masked for multi-MSI Thomas Gleixner
2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2021-07-29 21:51 ` [patch V2 09/19] genirq: Provide IRQCHIP_AFFINITY_PRE_STARTUP Thomas Gleixner
2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2021-07-29 21:51 ` [patch V2 10/19] x86/ioapic: Force affinity setup before startup Thomas Gleixner
2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2021-07-29 21:51 ` [patch V2 11/19] x86/msi: " Thomas Gleixner
2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2021-07-29 21:51 ` [patch V2 12/19] s390/pci: Do not mask MSI[-X] entries on teardown Thomas Gleixner
2021-08-03 12:48   ` Niklas Schnelle
2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2021-07-29 21:51 ` [patch V2 13/19] PCI/MSI: Simplify msi_verify_entries() Thomas Gleixner
2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2021-07-29 21:51 ` [patch V2 14/19] PCI/MSI: Rename msi_desc::masked Thomas Gleixner
2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2021-07-29 21:51 ` [patch V2 15/19] PCI/MSI: Consolidate error handling in msi_capability_init() Thomas Gleixner
2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2021-07-29 21:51 ` [patch V2 16/19] PCI/MSI: Deobfuscate virtual MSI-X Thomas Gleixner
2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2021-07-29 21:51 ` [patch V2 17/19] PCI/MSI: Cleanup msi_mask() Thomas Gleixner
2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2021-07-29 21:51 ` [patch V2 18/19] PCI/MSI: Provide a new set of mask and unmask functions Thomas Gleixner
     [not found]   ` <87r1f6bpt7.wl-maz@kernel.org>
2021-08-09 18:56     ` Thomas Gleixner
2021-08-09 19:08       ` [patch V3 " Thomas Gleixner
2021-08-10  9:07         ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2021-07-29 21:51 ` [patch V2 19/19] PCI/MSI: Use new mask/unmask functions Thomas Gleixner
2021-08-10  9:07   ` [tip: irq/core] " tip-bot2 for Thomas Gleixner
2021-08-10  7:49 ` [patch V2 00/19] PCI/MSI, x86: Cure a couple of inconsistencies Marc Zyngier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).