* [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown
@ 2015-04-10 22:54 Bjorn Helgaas
  2015-04-10 22:54 ` [PATCH v6 01/10] PCI/MSI: Rename msi_set_enable(), msix_clear_and_set_ctrl() Bjorn Helgaas
                   ` (10 more replies)
  0 siblings, 11 replies; 30+ messages in thread
From: Bjorn Helgaas @ 2015-04-10 22:54 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Yijing Wang, linux-pci, Fam Zheng, Yinghai Lu, Eric W. Biederman

Hi Michael,

I put your patches on my pci/msi branch and I hope to merge them for v4.1.
I didn't apply the acks from Fam and Eric because I made changes to those
patches that weren't completely trivial.  I think the end result is
equivalent, though.  The diff attached to this cover letter is the
difference between your v5 series and this v6 series.

As far as I'm concerned, this is ready to go except that I would like a
little more info about the virtio kernel hang to include in the changelog
for "PCI/MSI: Don't disable MSI/MSI-X at shutdown".

Changes from v5:
	Edit summaries and changelogs for consistency
	Split msi_set_enable() rename/export for reviewability
	Move pci_msi_setup_pci_dev() to its ultimate location to avoid
	    unnecessary diffs in subsequent patch
	Call pci_msi_setup_pci_dev() from its ultimate location to avoid
	    unnecessary diffs in subsequent patch
	Skip pci_msi_off() duplicate code removal since we can remove
	    it completely later
	Remove pci_msi_off() completely

v5 posting: http://lkml.kernel.org/r/1427641227-7574-1-git-send-email-mst@redhat.com

Bjorn
    
---

Bjorn Helgaas (1):
      PCI/MSI: Remove unused pci_msi_off()

Michael S. Tsirkin (9):
      PCI/MSI: Rename msi_set_enable(), msix_clear_and_set_ctrl()
      PCI/MSI: Export pci_msi_set_enable(), pci_msix_clear_and_set_ctrl()
      PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI
      PCI/MSI: Don't disable MSI/MSI-X at shutdown
      PCI/MSI: Make pci_msi_shutdown(), pci_msix_shutdown() static
      virtio_pci: drop pci_msi_off() call during probe
      ntb: Drop pci_msi_off() call during probe
      mic: Drop pci_msi_off() call during probe
      PCI/MSI: Drop pci_msi_off() calls from quirks


 drivers/misc/mic/host/mic_intr.c   |    2 -
 drivers/ntb/ntb_hw.c               |    2 -
 drivers/pci/msi.c                  |   57 ++++++++----------------------------
 drivers/pci/pci-driver.c           |    2 -
 drivers/pci/pci.c                  |   33 ---------------------
 drivers/pci/pci.h                  |   21 +++++++++++++
 drivers/pci/probe.c                |   18 +++++++++++
 drivers/pci/quirks.c               |    2 -
 drivers/virtio/virtio_pci_common.c |    3 --
 include/linux/pci.h                |    5 ---
 10 files changed, 51 insertions(+), 94 deletions(-)


--- This is "git diff v5 v6":

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 54cefb442d19..3d938a7d3b04 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3096,24 +3096,6 @@ bool pci_check_and_unmask_intx(struct pci_dev *dev)
 }
 EXPORT_SYMBOL_GPL(pci_check_and_unmask_intx);
 
-/**
- * pci_msi_off - disables any MSI or MSI-X capabilities
- * @dev: the PCI device to operate on
- *
- * If you want to use MSI, see pci_enable_msi() and friends.
- * This is a lower-level primitive that allows us to disable
- * MSI operation at the device level.
- * Not for use by drivers.
- */
-void pci_msi_off(struct pci_dev *dev)
-{
-	if (dev->msi_cap)
-		pci_msi_set_enable(dev, 0);
-
-	if (dev->msix_cap)
-		pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
-}
-
 int pci_set_dma_max_seg_size(struct pci_dev *dev, unsigned int size)
 {
 	return dma_set_max_seg_size(&dev->dev, size);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 620fcad1935d..17f213d494de 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -146,8 +146,6 @@ static inline void pci_no_msi(void) { }
 static inline void pci_msi_init_pci_dev(struct pci_dev *dev) { }
 #endif
 
-void pci_msi_off(struct pci_dev *dev);
-
 static inline void pci_msi_set_enable(struct pci_dev *dev, int enable)
 {
 	u16 control;
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 120772c219c7..740113b70ade 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1086,14 +1086,18 @@ int pci_cfg_space_size(struct pci_dev *dev)
 
 static void pci_msi_setup_pci_dev(struct pci_dev *dev)
 {
-	dev->msi_cap = pci_find_capability(dev, PCI_CAP_ID_MSI);
-	dev->msix_cap = pci_find_capability(dev, PCI_CAP_ID_MSIX);
-
-	/* Disable the msi hardware to avoid screaming interrupts
+	/*
+	 * Disable the MSI hardware to avoid screaming interrupts
 	 * during boot.  This is the power on reset default so
 	 * usually this should be a noop.
 	 */
-	pci_msi_off(dev);
+	dev->msi_cap = pci_find_capability(dev, PCI_CAP_ID_MSI);
+	if (dev->msi_cap)
+		pci_msi_set_enable(dev, 0);
+
+	dev->msix_cap = pci_find_capability(dev, PCI_CAP_ID_MSIX);
+	if (dev->msix_cap)
+		pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
 }
 
 /**
@@ -1151,7 +1155,6 @@ int pci_setup_device(struct pci_dev *dev)
 	/* "Unknown power state" */
 	dev->current_state = PCI_UNKNOWN;
 
-	/* MSI/MSI-X setup has to be done early since it's used by quirks. */
 	pci_msi_setup_pci_dev(dev);
 
 	/* Early fixups, before probing the BARs */


* [PATCH v6 01/10] PCI/MSI: Rename msi_set_enable(), msix_clear_and_set_ctrl()
  2015-04-10 22:54 [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown Bjorn Helgaas
@ 2015-04-10 22:54 ` Bjorn Helgaas
  2015-04-11  7:30   ` Greg KH
  2015-04-10 22:54 ` [PATCH v6 02/10] PCI/MSI: Export pci_msi_set_enable(), pci_msix_clear_and_set_ctrl() Bjorn Helgaas
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 30+ messages in thread
From: Bjorn Helgaas @ 2015-04-10 22:54 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Fam Zheng, linux-pci, stable, Eric W. Biederman, Yijing Wang, Yinghai Lu

From: Michael S. Tsirkin <mst@redhat.com>

Rename msi_set_enable() to pci_msi_set_enable() and
msix_clear_and_set_ctrl() to pci_msix_clear_and_set_ctrl().

No functional change.

[bhelgaas: changelog, split into separate patch]
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
CC: stable@vger.kernel.org
---
 drivers/pci/msi.c |   28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index c3e7dfcf9ff5..6cd366058ec4 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -185,7 +185,7 @@ void __weak arch_restore_msi_irqs(struct pci_dev *dev)
 	return default_restore_msi_irqs(dev);
 }
 
-static void msi_set_enable(struct pci_dev *dev, int enable)
+static void pci_msi_set_enable(struct pci_dev *dev, int enable)
 {
 	u16 control;
 
@@ -196,7 +196,7 @@ static void msi_set_enable(struct pci_dev *dev, int enable)
 	pci_write_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, control);
 }
 
-static void msix_clear_and_set_ctrl(struct pci_dev *dev, u16 clear, u16 set)
+static void pci_msix_clear_and_set_ctrl(struct pci_dev *dev, u16 clear, u16 set)
 {
 	u16 ctrl;
 
@@ -452,7 +452,7 @@ static void __pci_restore_msi_state(struct pci_dev *dev)
 	entry = irq_get_msi_desc(dev->irq);
 
 	pci_intx_for_msi(dev, 0);
-	msi_set_enable(dev, 0);
+	pci_msi_set_enable(dev, 0);
 	arch_restore_msi_irqs(dev);
 
 	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
@@ -473,14 +473,14 @@ static void __pci_restore_msix_state(struct pci_dev *dev)
 
 	/* route the table */
 	pci_intx_for_msi(dev, 0);
-	msix_clear_and_set_ctrl(dev, 0,
+	pci_msix_clear_and_set_ctrl(dev, 0,
 				PCI_MSIX_FLAGS_ENABLE | PCI_MSIX_FLAGS_MASKALL);
 
 	arch_restore_msi_irqs(dev);
 	list_for_each_entry(entry, &dev->msi_list, list)
 		msix_mask_irq(entry, entry->masked);
 
-	msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
+	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
 }
 
 void pci_restore_msi_state(struct pci_dev *dev)
@@ -647,7 +647,7 @@ static int msi_capability_init(struct pci_dev *dev, int nvec)
 	int ret;
 	unsigned mask;
 
-	msi_set_enable(dev, 0);	/* Disable MSI during set up */
+	pci_msi_set_enable(dev, 0);	/* Disable MSI during set up */
 
 	entry = msi_setup_entry(dev, nvec);
 	if (!entry)
@@ -683,7 +683,7 @@ static int msi_capability_init(struct pci_dev *dev, int nvec)
 
 	/* Set MSI enabled bits	 */
 	pci_intx_for_msi(dev, 0);
-	msi_set_enable(dev, 1);
+	pci_msi_set_enable(dev, 1);
 	dev->msi_enabled = 1;
 
 	dev->irq = entry->irq;
@@ -775,7 +775,7 @@ static int msix_capability_init(struct pci_dev *dev,
 	void __iomem *base;
 
 	/* Ensure MSI-X is disabled while it is set up */
-	msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
+	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
 
 	pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, &control);
 	/* Request & Map MSI-X table region */
@@ -801,7 +801,7 @@ static int msix_capability_init(struct pci_dev *dev,
 	 * MSI-X registers.  We need to mask all the vectors to prevent
 	 * interrupts coming in before they're fully set up.
 	 */
-	msix_clear_and_set_ctrl(dev, 0,
+	pci_msix_clear_and_set_ctrl(dev, 0,
 				PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE);
 
 	msix_program_entries(dev, entries);
@@ -814,7 +814,7 @@ static int msix_capability_init(struct pci_dev *dev,
 	pci_intx_for_msi(dev, 0);
 	dev->msix_enabled = 1;
 
-	msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
+	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
 
 	return 0;
 
@@ -919,7 +919,7 @@ void pci_msi_shutdown(struct pci_dev *dev)
 	BUG_ON(list_empty(&dev->msi_list));
 	desc = list_first_entry(&dev->msi_list, struct msi_desc, list);
 
-	msi_set_enable(dev, 0);
+	pci_msi_set_enable(dev, 0);
 	pci_intx_for_msi(dev, 1);
 	dev->msi_enabled = 0;
 
@@ -1027,7 +1027,7 @@ void pci_msix_shutdown(struct pci_dev *dev)
 		__pci_msix_desc_mask_irq(entry, 1);
 	}
 
-	msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
+	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
 	pci_intx_for_msi(dev, 1);
 	dev->msix_enabled = 0;
 }
@@ -1069,11 +1069,11 @@ void pci_msi_init_pci_dev(struct pci_dev *dev)
 	 */
 	dev->msi_cap = pci_find_capability(dev, PCI_CAP_ID_MSI);
 	if (dev->msi_cap)
-		msi_set_enable(dev, 0);
+		pci_msi_set_enable(dev, 0);
 
 	dev->msix_cap = pci_find_capability(dev, PCI_CAP_ID_MSIX);
 	if (dev->msix_cap)
-		msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
+		pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
 }
 
 /**



* [PATCH v6 02/10] PCI/MSI: Export pci_msi_set_enable(), pci_msix_clear_and_set_ctrl()
  2015-04-10 22:54 [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown Bjorn Helgaas
  2015-04-10 22:54 ` [PATCH v6 01/10] PCI/MSI: Rename msi_set_enable(), msix_clear_and_set_ctrl() Bjorn Helgaas
@ 2015-04-10 22:54 ` Bjorn Helgaas
  2015-04-10 22:54 ` [PATCH v6 03/10] PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI Bjorn Helgaas
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 30+ messages in thread
From: Bjorn Helgaas @ 2015-04-10 22:54 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Fam Zheng, linux-pci, stable, Eric W. Biederman, Yijing Wang, Yinghai Lu

From: Michael S. Tsirkin <mst@redhat.com>

Move pci_msi_set_enable() and pci_msix_clear_and_set_ctrl() to
drivers/pci/pci.h so they're available even when MSI isn't configured
into the kernel.

No functional change.

[bhelgaas: changelog, split into separate patch]
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
CC: stable@vger.kernel.org
---
 drivers/pci/msi.c |   21 ---------------------
 drivers/pci/pci.h |   21 +++++++++++++++++++++
 2 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 6cd366058ec4..9942f6827a4a 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -185,27 +185,6 @@ void __weak arch_restore_msi_irqs(struct pci_dev *dev)
 	return default_restore_msi_irqs(dev);
 }
 
-static void pci_msi_set_enable(struct pci_dev *dev, int enable)
-{
-	u16 control;
-
-	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
-	control &= ~PCI_MSI_FLAGS_ENABLE;
-	if (enable)
-		control |= PCI_MSI_FLAGS_ENABLE;
-	pci_write_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, control);
-}
-
-static void pci_msix_clear_and_set_ctrl(struct pci_dev *dev, u16 clear, u16 set)
-{
-	u16 ctrl;
-
-	pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, &ctrl);
-	ctrl &= ~clear;
-	ctrl |= set;
-	pci_write_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, ctrl);
-}
-
 static inline __attribute_const__ u32 msi_mask(unsigned x)
 {
 	/* Don't shift by >= width of type */
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 4091f82239cd..17f213d494de 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -146,6 +146,27 @@ static inline void pci_no_msi(void) { }
 static inline void pci_msi_init_pci_dev(struct pci_dev *dev) { }
 #endif
 
+static inline void pci_msi_set_enable(struct pci_dev *dev, int enable)
+{
+	u16 control;
+
+	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
+	control &= ~PCI_MSI_FLAGS_ENABLE;
+	if (enable)
+		control |= PCI_MSI_FLAGS_ENABLE;
+	pci_write_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, control);
+}
+
+static inline void pci_msix_clear_and_set_ctrl(struct pci_dev *dev, u16 clear, u16 set)
+{
+	u16 ctrl;
+
+	pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, &ctrl);
+	ctrl &= ~clear;
+	ctrl |= set;
+	pci_write_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, ctrl);
+}
+
 void pci_realloc_get_opt(char *);
 
 static inline int pci_no_d1d2(struct pci_dev *dev)
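
For illustration only (not part of the patch), the read-modify-write that
pci_msi_set_enable() and pci_msix_clear_and_set_ctrl() perform can be
sketched in user space.  This is a minimal sketch under assumptions:
struct fake_dev, cfg_read16() and cfg_write16() are hypothetical stand-ins
for struct pci_dev and pci_read_config_word()/pci_write_config_word(), and
the offsets and bit values are copied from include/uapi/linux/pci_regs.h
as they are commonly defined.

```c
#include <assert.h>
#include <stdint.h>

/* Values assumed to match include/uapi/linux/pci_regs.h. */
#define PCI_MSI_FLAGS		2	/* Message Control, relative to cap */
#define PCI_MSI_FLAGS_ENABLE	0x0001
#define PCI_MSIX_FLAGS		2
#define PCI_MSIX_FLAGS_MASKALL	0x4000
#define PCI_MSIX_FLAGS_ENABLE	0x8000

#define CFG_SIZE		256

/* Hypothetical stand-in for struct pci_dev: fake config space + cap offsets. */
struct fake_dev {
	uint8_t cfg[CFG_SIZE];
	uint8_t msi_cap;
	uint8_t msix_cap;
};

/* Little-endian 16-bit read from the fake config space. */
static uint16_t cfg_read16(const struct fake_dev *dev, unsigned int off)
{
	assert(off + 1 < CFG_SIZE);
	return (uint16_t)(dev->cfg[off] | (dev->cfg[off + 1] << 8));
}

/* Little-endian 16-bit write to the fake config space. */
static void cfg_write16(struct fake_dev *dev, unsigned int off, uint16_t val)
{
	assert(off + 1 < CFG_SIZE);
	dev->cfg[off] = val & 0xff;
	dev->cfg[off + 1] = val >> 8;
}

/* Same shape as pci_msi_set_enable(): touch only the enable bit. */
static void fake_msi_set_enable(struct fake_dev *dev, int enable)
{
	uint16_t control = cfg_read16(dev, dev->msi_cap + PCI_MSI_FLAGS);

	control &= ~PCI_MSI_FLAGS_ENABLE;
	if (enable)
		control |= PCI_MSI_FLAGS_ENABLE;
	cfg_write16(dev, dev->msi_cap + PCI_MSI_FLAGS, control);
}

/* Same shape as pci_msix_clear_and_set_ctrl(): clear first, then set. */
static void fake_msix_clear_and_set_ctrl(struct fake_dev *dev,
					 uint16_t clear, uint16_t set)
{
	uint16_t ctrl = cfg_read16(dev, dev->msix_cap + PCI_MSIX_FLAGS);

	ctrl &= ~clear;
	ctrl |= set;
	cfg_write16(dev, dev->msix_cap + PCI_MSIX_FLAGS, ctrl);
}
```

Starting from an MSI Message Control value of 0x0081,
fake_msi_set_enable(dev, 0) leaves 0x0080: only the enable bit changes and
the other bits are preserved.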



* [PATCH v6 03/10] PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI
  2015-04-10 22:54 [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown Bjorn Helgaas
  2015-04-10 22:54 ` [PATCH v6 01/10] PCI/MSI: Rename msi_set_enable(), msix_clear_and_set_ctrl() Bjorn Helgaas
  2015-04-10 22:54 ` [PATCH v6 02/10] PCI/MSI: Export pci_msi_set_enable(), pci_msix_clear_and_set_ctrl() Bjorn Helgaas
@ 2015-04-10 22:54 ` Bjorn Helgaas
  2015-04-10 22:54 ` [PATCH v6 04/10] PCI/MSI: Don't disable MSI/MSI-X at shutdown Bjorn Helgaas
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 30+ messages in thread
From: Bjorn Helgaas @ 2015-04-10 22:54 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Fam Zheng, linux-pci, Eric W. Biederman, Yijing Wang, Yinghai Lu, stable

From: Michael S. Tsirkin <mst@redhat.com>

If we enable MSI, then kexec a new kernel, the new kernel may receive MSIs
it is not prepared for.  Commit d5dea7d95c48 ("PCI: msi: Disable msi
interrupts when we initialize a pci device") prevents this, but only if the
new kernel is built with CONFIG_PCI_MSI=y.

Move the "disable MSI" functionality from drivers/pci/msi.c to a new
pci_msi_setup_pci_dev() in drivers/pci/probe.c so we can disable MSIs when
we enumerate devices even if the kernel doesn't include full MSI support.

[bhelgaas: changelog, disable MSIs in pci_setup_device(), put
pci_msi_setup_pci_dev() at its final destination]
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@kernel.org
---
 drivers/pci/msi.c   |   12 ------------
 drivers/pci/probe.c |   18 ++++++++++++++++++
 2 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 9942f6827a4a..f66be868ad21 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1041,18 +1041,6 @@ EXPORT_SYMBOL(pci_msi_enabled);
 void pci_msi_init_pci_dev(struct pci_dev *dev)
 {
 	INIT_LIST_HEAD(&dev->msi_list);
-
-	/* Disable the msi hardware to avoid screaming interrupts
-	 * during boot.  This is the power on reset default so
-	 * usually this should be a noop.
-	 */
-	dev->msi_cap = pci_find_capability(dev, PCI_CAP_ID_MSI);
-	if (dev->msi_cap)
-		pci_msi_set_enable(dev, 0);
-
-	dev->msix_cap = pci_find_capability(dev, PCI_CAP_ID_MSIX);
-	if (dev->msix_cap)
-		pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
 }
 
 /**
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 8d2f400e96cb..740113b70ade 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1084,6 +1084,22 @@ int pci_cfg_space_size(struct pci_dev *dev)
 
 #define LEGACY_IO_RESOURCE	(IORESOURCE_IO | IORESOURCE_PCI_FIXED)
 
+static void pci_msi_setup_pci_dev(struct pci_dev *dev)
+{
+	/*
+	 * Disable the MSI hardware to avoid screaming interrupts
+	 * during boot.  This is the power on reset default so
+	 * usually this should be a noop.
+	 */
+	dev->msi_cap = pci_find_capability(dev, PCI_CAP_ID_MSI);
+	if (dev->msi_cap)
+		pci_msi_set_enable(dev, 0);
+
+	dev->msix_cap = pci_find_capability(dev, PCI_CAP_ID_MSIX);
+	if (dev->msix_cap)
+		pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
+}
+
 /**
  * pci_setup_device - fill in class and map information of a device
  * @dev: the device structure to fill
@@ -1139,6 +1155,8 @@ int pci_setup_device(struct pci_dev *dev)
 	/* "Unknown power state" */
 	dev->current_state = PCI_UNKNOWN;
 
+	pci_msi_setup_pci_dev(dev);
+
 	/* Early fixups, before probing the BARs */
 	pci_fixup_device(pci_fixup_early, dev);
 	/* device class may be changed after fixup */
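
For illustration only (not part of the patch), the enumeration-time
sequence in pci_msi_setup_pci_dev() can be modelled in user space: find
each capability, remember its offset, and clear its enable bit.  This is a
rough sketch under assumptions: struct fake_dev, fake_find_capability(),
fake_clear_bits16() and fake_msi_setup_dev() are hypothetical names, the
capability walk is a simplification of what pci_find_capability() does
(it skips the status-register and header-type checks), and the register
values are copied from include/uapi/linux/pci_regs.h as they are commonly
defined.

```c
#include <assert.h>
#include <stdint.h>

/* Values assumed to match include/uapi/linux/pci_regs.h. */
#define PCI_CAPABILITY_LIST	0x34	/* offset of first capability pointer */
#define PCI_CAP_LIST_ID		0
#define PCI_CAP_LIST_NEXT	1
#define PCI_CAP_ID_MSI		0x05
#define PCI_CAP_ID_MSIX		0x11
#define PCI_MSI_FLAGS		2
#define PCI_MSI_FLAGS_ENABLE	0x0001
#define PCI_MSIX_FLAGS		2
#define PCI_MSIX_FLAGS_ENABLE	0x8000

#define CFG_SIZE		256

/* Hypothetical stand-in for struct pci_dev. */
struct fake_dev {
	uint8_t cfg[CFG_SIZE];
	uint8_t msi_cap;
	uint8_t msix_cap;
};

/* Simplified capability-list walk; returns the offset, or 0 if not found. */
static uint8_t fake_find_capability(const struct fake_dev *dev, uint8_t id)
{
	uint8_t pos = dev->cfg[PCI_CAPABILITY_LIST];
	int ttl = 48;			/* guard against looping lists */

	while (pos >= 0x40 && ttl--) {
		uint8_t found;

		pos &= ~3;		/* capabilities are dword aligned */
		found = dev->cfg[pos + PCI_CAP_LIST_ID];
		if (found == 0xff)	/* nothing readable at this offset */
			break;
		if (found == id)
			return pos;
		pos = dev->cfg[pos + PCI_CAP_LIST_NEXT];
	}
	return 0;
}

/* Read-modify-write that clears 'bits' in a little-endian 16-bit register. */
static void fake_clear_bits16(struct fake_dev *dev, unsigned int off,
			      uint16_t bits)
{
	uint16_t v;

	assert(off + 1 < CFG_SIZE);
	v = (uint16_t)(dev->cfg[off] | (dev->cfg[off + 1] << 8));
	v &= ~bits;
	dev->cfg[off] = v & 0xff;
	dev->cfg[off + 1] = v >> 8;
}

/* Mirrors the order of operations in pci_msi_setup_pci_dev(). */
static void fake_msi_setup_dev(struct fake_dev *dev)
{
	dev->msi_cap = fake_find_capability(dev, PCI_CAP_ID_MSI);
	if (dev->msi_cap)
		fake_clear_bits16(dev, dev->msi_cap + PCI_MSI_FLAGS,
				  PCI_MSI_FLAGS_ENABLE);

	dev->msix_cap = fake_find_capability(dev, PCI_CAP_ID_MSIX);
	if (dev->msix_cap)
		fake_clear_bits16(dev, dev->msix_cap + PCI_MSIX_FLAGS,
				  PCI_MSIX_FLAGS_ENABLE);
}
```

On a fake device whose capability list holds MSI at 0x50 and MSI-X at
0x70, fake_msi_setup_dev() records both offsets and clears only the two
enable bits.  On a device with no capabilities it leaves both offsets at 0
and writes nothing, which corresponds to the "usually this should be a
noop" case in the comment above.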



* [PATCH v6 04/10] PCI/MSI: Don't disable MSI/MSI-X at shutdown
  2015-04-10 22:54 [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown Bjorn Helgaas
                   ` (2 preceding siblings ...)
  2015-04-10 22:54 ` [PATCH v6 03/10] PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI Bjorn Helgaas
@ 2015-04-10 22:54 ` Bjorn Helgaas
  2015-04-13  9:37   ` Fam Zheng
  2015-04-16  7:30   ` Michael S. Tsirkin
  2015-04-10 22:54 ` [PATCH v6 05/10] PCI/MSI: Make pci_msi_shutdown(), pci_msix_shutdown() static Bjorn Helgaas
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 30+ messages in thread
From: Bjorn Helgaas @ 2015-04-10 22:54 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Fam Zheng, linux-pci, Rusty Russell, Ulrich Obergfell,
	Yinghai Lu, Eric W. Biederman, Yijing Wang, Yinghai Lu

From: Michael S. Tsirkin <mst@redhat.com>

d52877c7b1af ("pci/irq: let pci_device_shutdown to call pci_msi_shutdown
v2") disabled MSI/MSI-X at device shutdown to address a kexec problem.

The problem is that after we disable MSI, the device may assert INTx, and
if the driver hasn't registered an interrupt handler for it, the interrupt
is never deasserted and causes a kernel hang.  In particular, this was
observed with virtio.

We now disable MSI/MSI-X for all devices during enumeration regardless of
CONFIG_PCI_MSI.  This solves the kexec problem in the new kernel, not the
old one.

Stop disabling MSIs at shutdown to avoid the kernel hang.

XXX bugzilla reference, details about how the hang happens?

[bhelgaas: changelog]
Reported-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Yinghai Lu <yhlu.kernel.send@gmail.com>
CC: Ulrich Obergfell <uobergfe@redhat.com>
CC: Rusty Russell <rusty@rustcorp.com.au>
---
 drivers/pci/pci-driver.c |    2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 3cb2210de553..38a602cb9fb7 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -450,8 +450,6 @@ static void pci_device_shutdown(struct device *dev)
 
 	if (drv && drv->shutdown)
 		drv->shutdown(pci_dev);
-	pci_msi_shutdown(pci_dev);
-	pci_msix_shutdown(pci_dev);
 
 #ifdef CONFIG_KEXEC
 	/*



* [PATCH v6 05/10] PCI/MSI: Make pci_msi_shutdown(), pci_msix_shutdown() static
  2015-04-10 22:54 [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown Bjorn Helgaas
                   ` (3 preceding siblings ...)
  2015-04-10 22:54 ` [PATCH v6 04/10] PCI/MSI: Don't disable MSI/MSI-X at shutdown Bjorn Helgaas
@ 2015-04-10 22:54 ` Bjorn Helgaas
  2015-04-10 22:55 ` [PATCH v6 06/10] virtio_pci: drop pci_msi_off() call during probe Bjorn Helgaas
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 30+ messages in thread
From: Bjorn Helgaas @ 2015-04-10 22:54 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Yijing Wang, linux-pci, Fam Zheng, Yinghai Lu, Eric W. Biederman

From: Michael S. Tsirkin <mst@redhat.com>

pci_msi_shutdown() and pci_msix_shutdown() are now internal to
drivers/pci/msi.c; make them static.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/msi.c   |    4 ++--
 include/linux/pci.h |    4 ----
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index f66be868ad21..ea78a0746a42 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -887,7 +887,7 @@ int pci_msi_vec_count(struct pci_dev *dev)
 }
 EXPORT_SYMBOL(pci_msi_vec_count);
 
-void pci_msi_shutdown(struct pci_dev *dev)
+static void pci_msi_shutdown(struct pci_dev *dev)
 {
 	struct msi_desc *desc;
 	u32 mask;
@@ -993,7 +993,7 @@ int pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries, int nvec)
 }
 EXPORT_SYMBOL(pci_enable_msix);
 
-void pci_msix_shutdown(struct pci_dev *dev)
+static void pci_msix_shutdown(struct pci_dev *dev)
 {
 	struct msi_desc *entry;
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 211e9da8a7d7..a34df456faf2 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1209,11 +1209,9 @@ struct msix_entry {
 
 #ifdef CONFIG_PCI_MSI
 int pci_msi_vec_count(struct pci_dev *dev);
-void pci_msi_shutdown(struct pci_dev *dev);
 void pci_disable_msi(struct pci_dev *dev);
 int pci_msix_vec_count(struct pci_dev *dev);
 int pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries, int nvec);
-void pci_msix_shutdown(struct pci_dev *dev);
 void pci_disable_msix(struct pci_dev *dev);
 void pci_restore_msi_state(struct pci_dev *dev);
 int pci_msi_enabled(void);
@@ -1237,13 +1235,11 @@ static inline int pci_enable_msix_exact(struct pci_dev *dev,
 }
 #else
 static inline int pci_msi_vec_count(struct pci_dev *dev) { return -ENOSYS; }
-static inline void pci_msi_shutdown(struct pci_dev *dev) { }
 static inline void pci_disable_msi(struct pci_dev *dev) { }
 static inline int pci_msix_vec_count(struct pci_dev *dev) { return -ENOSYS; }
 static inline int pci_enable_msix(struct pci_dev *dev,
 				  struct msix_entry *entries, int nvec)
 { return -ENOSYS; }
-static inline void pci_msix_shutdown(struct pci_dev *dev) { }
 static inline void pci_disable_msix(struct pci_dev *dev) { }
 static inline void pci_restore_msi_state(struct pci_dev *dev) { }
 static inline int pci_msi_enabled(void) { return 0; }



* [PATCH v6 06/10] virtio_pci: drop pci_msi_off() call during probe
  2015-04-10 22:54 [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown Bjorn Helgaas
                   ` (4 preceding siblings ...)
  2015-04-10 22:54 ` [PATCH v6 05/10] PCI/MSI: Make pci_msi_shutdown(), pci_msix_shutdown() static Bjorn Helgaas
@ 2015-04-10 22:55 ` Bjorn Helgaas
  2015-04-10 22:55 ` [PATCH v6 07/10] ntb: Drop " Bjorn Helgaas
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 30+ messages in thread
From: Bjorn Helgaas @ 2015-04-10 22:55 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Yijing Wang, linux-pci, Fam Zheng, Yinghai Lu, Eric W. Biederman

From: Michael S. Tsirkin <mst@redhat.com>

The PCI core now disables MSI and MSI-X for all devices during enumeration
regardless of CONFIG_PCI_MSI.  Remove device-specific code to disable
MSI/MSI-X.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/virtio/virtio_pci_common.c |    3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index e894eb278d83..806bb2c2e382 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -501,9 +501,6 @@ static int virtio_pci_probe(struct pci_dev *pci_dev,
 	INIT_LIST_HEAD(&vp_dev->virtqueues);
 	spin_lock_init(&vp_dev->lock);
 
-	/* Disable MSI/MSIX to bring device to a known good state. */
-	pci_msi_off(pci_dev);
-
 	/* enable the device */
 	rc = pci_enable_device(pci_dev);
 	if (rc)



* [PATCH v6 07/10] ntb: Drop pci_msi_off() call during probe
  2015-04-10 22:54 [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown Bjorn Helgaas
                   ` (5 preceding siblings ...)
  2015-04-10 22:55 ` [PATCH v6 06/10] virtio_pci: drop pci_msi_off() call during probe Bjorn Helgaas
@ 2015-04-10 22:55 ` Bjorn Helgaas
  2015-04-10 22:55 ` [PATCH v6 08/10] mic: " Bjorn Helgaas
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 30+ messages in thread
From: Bjorn Helgaas @ 2015-04-10 22:55 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Yijing Wang, linux-pci, Fam Zheng, Yinghai Lu, Eric W. Biederman

From: Michael S. Tsirkin <mst@redhat.com>

The PCI core now disables MSI and MSI-X for all devices during enumeration
regardless of CONFIG_PCI_MSI.  Remove device-specific code to disable
MSI/MSI-X.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/ntb/ntb_hw.c |    2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/ntb/ntb_hw.c b/drivers/ntb/ntb_hw.c
index cd29b1038c5e..8225cbcd6eb8 100644
--- a/drivers/ntb/ntb_hw.c
+++ b/drivers/ntb/ntb_hw.c
@@ -1313,8 +1313,6 @@ static int ntb_setup_intx(struct ntb_device *ndev)
 	struct pci_dev *pdev = ndev->pdev;
 	int rc;
 
-	pci_msi_off(pdev);
-
 	/* Verify intx is enabled */
 	pci_intx(pdev, 1);
 



* [PATCH v6 08/10] mic: Drop pci_msi_off() call during probe
  2015-04-10 22:54 [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown Bjorn Helgaas
                   ` (6 preceding siblings ...)
  2015-04-10 22:55 ` [PATCH v6 07/10] ntb: Drop " Bjorn Helgaas
@ 2015-04-10 22:55 ` Bjorn Helgaas
  2015-04-10 22:55 ` [PATCH v6 09/10] PCI/MSI: Drop pci_msi_off() calls from quirks Bjorn Helgaas
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 30+ messages in thread
From: Bjorn Helgaas @ 2015-04-10 22:55 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Yijing Wang, linux-pci, Fam Zheng, Yinghai Lu, Eric W. Biederman

From: Michael S. Tsirkin <mst@redhat.com>

The PCI core now disables MSI and MSI-X for all devices during enumeration
regardless of CONFIG_PCI_MSI.  Remove device-specific code to disable
MSI/MSI-X.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/misc/mic/host/mic_intr.c |    2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/misc/mic/host/mic_intr.c b/drivers/misc/mic/host/mic_intr.c
index d686f2846ac7..b4ca6c884d19 100644
--- a/drivers/misc/mic/host/mic_intr.c
+++ b/drivers/misc/mic/host/mic_intr.c
@@ -363,8 +363,6 @@ static int mic_setup_intx(struct mic_device *mdev, struct pci_dev *pdev)
 {
 	int rc;
 
-	pci_msi_off(pdev);
-
 	/* Enable intx */
 	pci_intx(pdev, 1);
 	rc = mic_setup_callbacks(mdev);



* [PATCH v6 09/10] PCI/MSI: Drop pci_msi_off() calls from quirks
  2015-04-10 22:54 [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown Bjorn Helgaas
                   ` (7 preceding siblings ...)
  2015-04-10 22:55 ` [PATCH v6 08/10] mic: " Bjorn Helgaas
@ 2015-04-10 22:55 ` Bjorn Helgaas
  2015-04-10 22:55 ` [PATCH v6 10/10] PCI/MSI: Remove unused pci_msi_off() Bjorn Helgaas
  2015-04-26  6:50 ` [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown Michael S. Tsirkin
  10 siblings, 0 replies; 30+ messages in thread
From: Bjorn Helgaas @ 2015-04-10 22:55 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Yijing Wang, linux-pci, Fam Zheng, Yinghai Lu, Eric W. Biederman

From: Michael S. Tsirkin <mst@redhat.com>

The PCI core now disables MSI and MSI-X for all devices during enumeration
regardless of CONFIG_PCI_MSI.  Remove device-specific code to disable
MSI/MSI-X.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/quirks.c |    2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 85f247e28a80..df3e71855316 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -1600,7 +1600,6 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL,	PCI_DEVICE_ID_INTEL_EESSC,	quirk_a
 
 static void quirk_pcie_mch(struct pci_dev *pdev)
 {
-	pci_msi_off(pdev);
 	pdev->no_msi = 1;
 }
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL,	PCI_DEVICE_ID_INTEL_E7520_MCH,	quirk_pcie_mch);
@@ -1614,7 +1613,6 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL,	PCI_DEVICE_ID_INTEL_E7525_MCH,	quir
  */
 static void quirk_pcie_pxh(struct pci_dev *dev)
 {
-	pci_msi_off(dev);
 	dev->no_msi = 1;
 	dev_warn(&dev->dev, "PXH quirk detected; SHPC device MSI disabled\n");
 }



* [PATCH v6 10/10] PCI/MSI: Remove unused pci_msi_off()
  2015-04-10 22:54 [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown Bjorn Helgaas
                   ` (8 preceding siblings ...)
  2015-04-10 22:55 ` [PATCH v6 09/10] PCI/MSI: Drop pci_msi_off() calls from quirks Bjorn Helgaas
@ 2015-04-10 22:55 ` Bjorn Helgaas
  2015-04-26  6:50 ` [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown Michael S. Tsirkin
  10 siblings, 0 replies; 30+ messages in thread
From: Bjorn Helgaas @ 2015-04-10 22:55 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Yijing Wang, linux-pci, Fam Zheng, Yinghai Lu, Eric W. Biederman

pci_msi_off() is unused, so remove it.

Removes the exported symbol pci_msi_off().

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pci.c   |   33 ---------------------------------
 include/linux/pci.h |    1 -
 2 files changed, 34 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 81f06e8dcc04..3d938a7d3b04 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3096,39 +3096,6 @@ bool pci_check_and_unmask_intx(struct pci_dev *dev)
 }
 EXPORT_SYMBOL_GPL(pci_check_and_unmask_intx);
 
-/**
- * pci_msi_off - disables any MSI or MSI-X capabilities
- * @dev: the PCI device to operate on
- *
- * If you want to use MSI, see pci_enable_msi() and friends.
- * This is a lower-level primitive that allows us to disable
- * MSI operation at the device level.
- */
-void pci_msi_off(struct pci_dev *dev)
-{
-	int pos;
-	u16 control;
-
-	/*
-	 * This looks like it could go in msi.c, but we need it even when
-	 * CONFIG_PCI_MSI=n.  For the same reason, we can't use
-	 * dev->msi_cap or dev->msix_cap here.
-	 */
-	pos = pci_find_capability(dev, PCI_CAP_ID_MSI);
-	if (pos) {
-		pci_read_config_word(dev, pos + PCI_MSI_FLAGS, &control);
-		control &= ~PCI_MSI_FLAGS_ENABLE;
-		pci_write_config_word(dev, pos + PCI_MSI_FLAGS, control);
-	}
-	pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
-	if (pos) {
-		pci_read_config_word(dev, pos + PCI_MSIX_FLAGS, &control);
-		control &= ~PCI_MSIX_FLAGS_ENABLE;
-		pci_write_config_word(dev, pos + PCI_MSIX_FLAGS, control);
-	}
-}
-EXPORT_SYMBOL_GPL(pci_msi_off);
-
 int pci_set_dma_max_seg_size(struct pci_dev *dev, unsigned int size)
 {
 	return dma_set_max_seg_size(&dev->dev, size);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index a34df456faf2..ef15f91207b4 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -970,7 +970,6 @@ void pci_intx(struct pci_dev *dev, int enable);
 bool pci_intx_mask_supported(struct pci_dev *dev);
 bool pci_check_and_mask_intx(struct pci_dev *dev);
 bool pci_check_and_unmask_intx(struct pci_dev *dev);
-void pci_msi_off(struct pci_dev *dev);
 int pci_set_dma_max_seg_size(struct pci_dev *dev, unsigned int size);
 int pci_set_dma_seg_boundary(struct pci_dev *dev, unsigned long mask);
 int pci_wait_for_pending(struct pci_dev *dev, int pos, u16 mask);


* Re: [PATCH v6 01/10] PCI/MSI: Rename msi_set_enable(), msix_clear_and_set_ctrl()
  2015-04-10 22:54 ` [PATCH v6 01/10] PCI/MSI: Rename msi_set_enable(), msix_clear_and_set_ctrl() Bjorn Helgaas
@ 2015-04-11  7:30   ` Greg KH
  2015-04-11 16:01     ` Bjorn Helgaas
  0 siblings, 1 reply; 30+ messages in thread
From: Greg KH @ 2015-04-11  7:30 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Michael S. Tsirkin, Fam Zheng, linux-pci, stable,
	Eric W. Biederman, Yijing Wang, Yinghai Lu

On Fri, Apr 10, 2015 at 05:54:26PM -0500, Bjorn Helgaas wrote:
> From: Michael S. Tsirkin <mst@redhat.com>
> 
> Rename msi_set_enable() to pci_msi_set_enable() and
> msix_clear_and_set_ctrl() to pci_msix_clear_and_set_ctrl().
> 
> No functional change.
> 
> [bhelgaas: changelog, split into separate patch]
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> Reviewed-by: Fam Zheng <famz@redhat.com>
> CC: stable@vger.kernel.org
> ---
>  drivers/pci/msi.c |   28 ++++++++++++++--------------
>  1 file changed, 14 insertions(+), 14 deletions(-)

How does this, and the other patch you marked for stable, relate to the
stable_kernel_rules.txt file that dictates what we can take for stable
patches?

confused,

greg k-h

* Re: [PATCH v6 01/10] PCI/MSI: Rename msi_set_enable(), msix_clear_and_set_ctrl()
  2015-04-11  7:30   ` Greg KH
@ 2015-04-11 16:01     ` Bjorn Helgaas
  0 siblings, 0 replies; 30+ messages in thread
From: Bjorn Helgaas @ 2015-04-11 16:01 UTC (permalink / raw)
  To: Greg KH
  Cc: Michael S. Tsirkin, Fam Zheng, linux-pci, stable,
	Eric W. Biederman, Yijing Wang, Yinghai Lu

On Sat, Apr 11, 2015 at 2:30 AM, Greg KH <greg@kroah.com> wrote:
> On Fri, Apr 10, 2015 at 05:54:26PM -0500, Bjorn Helgaas wrote:
>> From: Michael S. Tsirkin <mst@redhat.com>
>>
>> Rename msi_set_enable() to pci_msi_set_enable() and
>> msix_clear_and_set_ctrl() to pci_msix_clear_and_set_ctrl().
>>
>> No functional change.
>>
>> [bhelgaas: changelog, split into separate patch]
>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>> Reviewed-by: Fam Zheng <famz@redhat.com>
>> CC: stable@vger.kernel.org
>> ---
>>  drivers/pci/msi.c |   28 ++++++++++++++--------------
>>  1 file changed, 14 insertions(+), 14 deletions(-)
>
> How does this, and the other patch you marked for stable, relate to the
> stable_kernel_rules.txt file that dictates what we can take for stable
> patches?

I fat-fingered the stable email address on the patches with the actual
bug fixes, so you probably didn't see them.  The candidates for stable
are patches 1-4 of this series:

      PCI/MSI: Rename msi_set_enable(), msix_clear_and_set_ctrl()
      PCI/MSI: Export pci_msi_set_enable(), pci_msix_clear_and_set_ctrl()
      PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI
      PCI/MSI: Don't disable MSI/MSI-X at shutdown

The first two make no functional difference; the last two are the bug fixes.

Bjorn

* Re: [PATCH v6 04/10] PCI/MSI: Don't disable MSI/MSI-X at shutdown
  2015-04-10 22:54 ` [PATCH v6 04/10] PCI/MSI: Don't disable MSI/MSI-X at shutdown Bjorn Helgaas
@ 2015-04-13  9:37   ` Fam Zheng
  2015-04-13 15:41     ` Bjorn Helgaas
  2015-04-16  7:30   ` Michael S. Tsirkin
  1 sibling, 1 reply; 30+ messages in thread
From: Fam Zheng @ 2015-04-13  9:37 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Michael S. Tsirkin, linux-pci, Rusty Russell, Ulrich Obergfell,
	Yinghai Lu, Eric W. Biederman, Yijing Wang, Yinghai Lu, famz

Hi Bjorn,

On Fri, 04/10 17:54, Bjorn Helgaas wrote:
> From: Michael S. Tsirkin <mst@redhat.com>
> 
> d52877c7b1af ("pci/irq: let pci_device_shutdown to call pci_msi_shutdown
> v2") disabled MSI/MSI-X at device shutdown to address a kexec problem.
> 
> The problem is that after we disable MSI, the device may assert INTx, and
> if the driver hasn't registered an interrupt handler for it, the interrupt
> is never deasserted and causes a kernel hang.  In particular, this was
> observed with virtio.
> 
> We now disable MSI/MSI-X for all devices during enumeration regardless of
> CONFIG_PCI_MSI.  This solves the kexec problem in the new kernel, not the
> old one.
> 
> Stop disabling MSIs at shutdown to avoid the kernel hang.
> 
> XXX bugzilla reference, details about how the hang happens?

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=96571

Please let me know if you need any further information in the bug.

Fam

> 
> [bhelgaas: changelog]
> Reported-by: Fam Zheng <famz@redhat.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> CC: Yinghai Lu <yhlu.kernel.send@gmail.com>
> CC: Ulrich Obergfell <uobergfe@redhat.com>
> CC: Rusty Russell <rusty@rustcorp.com.au>
> ---
>  drivers/pci/pci-driver.c |    2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 3cb2210de553..38a602cb9fb7 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -450,8 +450,6 @@ static void pci_device_shutdown(struct device *dev)
>  
>  	if (drv && drv->shutdown)
>  		drv->shutdown(pci_dev);
> -	pci_msi_shutdown(pci_dev);
> -	pci_msix_shutdown(pci_dev);
>  
>  #ifdef CONFIG_KEXEC
>  	/*
> 

* Re: [PATCH v6 04/10] PCI/MSI: Don't disable MSI/MSI-X at shutdown
  2015-04-13  9:37   ` Fam Zheng
@ 2015-04-13 15:41     ` Bjorn Helgaas
  2015-04-13 16:45       ` Eric W. Biederman
  2015-04-14  9:47       ` Michael S. Tsirkin
  0 siblings, 2 replies; 30+ messages in thread
From: Bjorn Helgaas @ 2015-04-13 15:41 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Michael S. Tsirkin, linux-pci, Rusty Russell, Ulrich Obergfell,
	Yinghai Lu, Eric W. Biederman, Yijing Wang, Yinghai Lu

On Mon, Apr 13, 2015 at 4:37 AM, Fam Zheng <famz@redhat.com> wrote:
> Hi Bjorn,
>
> On Fri, 04/10 17:54, Bjorn Helgaas wrote:
>> From: Michael S. Tsirkin <mst@redhat.com>
>>
>> d52877c7b1af ("pci/irq: let pci_device_shutdown to call pci_msi_shutdown
>> v2") disabled MSI/MSI-X at device shutdown to address a kexec problem.
>>
>> The problem is that after we disable MSI, the device may assert INTx, and
>> if the driver hasn't registered an interrupt handler for it, the interrupt
>> is never deasserted and causes a kernel hang.  In particular, this was
>> observed with virtio.
>>
>> We now disable MSI/MSI-X for all devices during enumeration regardless of
>> CONFIG_PCI_MSI.  This solves the kexec problem in the new kernel, not the
>> old one.
>>
>> Stop disabling MSIs at shutdown to avoid the kernel hang.
>>
>> XXX bugzilla reference, details about how the hang happens?
>
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=96571
>
> Please let me know if you need any further information in the bug.

Please attach a complete dmesg log.  The bugzilla doesn't really have
any new information other than that you see a soft lockup.  I'm trying
to connect more of the dots between a spurious interrupt and a hang or
soft lockup.

It doesn't seem right that a spurious interrupt could cause a hang or
soft lockup.  I would think Linux would emit a message about the
unexpected interrupt, but would otherwise be relatively unconcerned.
So I'm trying to figure out why my assumption is wrong.  Probably this
is just because I don't know much about Linux IRQ handling.

Having more details, e.g., a stacktrace fragment from a soft lockup,
can also help people connect a problem they're seeing with the
solution.  It's pretty hard to google for "kernel hang," but if you
can google for a soft lockup in a specific function, that can be much
more useful.

>> [bhelgaas: changelog]
>> Reported-by: Fam Zheng <famz@redhat.com>
>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>> CC: Yinghai Lu <yhlu.kernel.send@gmail.com>
>> CC: Ulrich Obergfell <uobergfe@redhat.com>
>> CC: Rusty Russell <rusty@rustcorp.com.au>
>> ---
>>  drivers/pci/pci-driver.c |    2 --
>>  1 file changed, 2 deletions(-)
>>
>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
>> index 3cb2210de553..38a602cb9fb7 100644
>> --- a/drivers/pci/pci-driver.c
>> +++ b/drivers/pci/pci-driver.c
>> @@ -450,8 +450,6 @@ static void pci_device_shutdown(struct device *dev)
>>
>>       if (drv && drv->shutdown)
>>               drv->shutdown(pci_dev);
>> -     pci_msi_shutdown(pci_dev);
>> -     pci_msix_shutdown(pci_dev);
>>
>>  #ifdef CONFIG_KEXEC
>>       /*
>>

* Re: [PATCH v6 04/10] PCI/MSI: Don't disable MSI/MSI-X at shutdown
  2015-04-13 15:41     ` Bjorn Helgaas
@ 2015-04-13 16:45       ` Eric W. Biederman
  2015-04-14  9:44         ` Michael S. Tsirkin
  2015-04-16 19:42         ` Bjorn Helgaas
  2015-04-14  9:47       ` Michael S. Tsirkin
  1 sibling, 2 replies; 30+ messages in thread
From: Eric W. Biederman @ 2015-04-13 16:45 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Fam Zheng, Michael S. Tsirkin, linux-pci, Rusty Russell,
	Ulrich Obergfell, Yinghai Lu, Yijing Wang, Yinghai Lu

Bjorn Helgaas <bhelgaas@google.com> writes:

> On Mon, Apr 13, 2015 at 4:37 AM, Fam Zheng <famz@redhat.com> wrote:
>> Hi Bjorn,
>>
>> On Fri, 04/10 17:54, Bjorn Helgaas wrote:
>>> From: Michael S. Tsirkin <mst@redhat.com>
>>>
>>> d52877c7b1af ("pci/irq: let pci_device_shutdown to call pci_msi_shutdown
>>> v2") disabled MSI/MSI-X at device shutdown to address a kexec problem.
>>>
>>> The problem is that after we disable MSI, the device may assert INTx, and
>>> if the driver hasn't registered an interrupt handler for it, the interrupt
>>> is never deasserted and causes a kernel hang.  In particular, this was
>>> observed with virtio.
>>>
>>> We now disable MSI/MSI-X for all devices during enumeration regardless of
>>> CONFIG_PCI_MSI.  This solves the kexec problem in the new kernel, not the
>>> old one.
>>>
>>> Stop disabling MSIs at shutdown to avoid the kernel hang.
>>>
>>> XXX bugzilla reference, details about how the hang happens?
>>
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=96571
>>
>> Please let me know if you need any further information in the bug.
>
> Please attach a complete dmesg log.  The bugzilla doesn't really have
> any new information other than that you see a soft lockup.  I'm trying
> to connect more of the dots between a spurious interrupt and a hang or
> soft lockup.
>

The bugzilla implies that there is a screaming irq (which causes the
softlockup when they disable the kernel's protections for buggy irqs).

> It doesn't seem right that a spurious interrupt could cause a hang or
> soft lockup.

The interrupt handler keeps firing.

> I would think Linux would emit a message about the
> unexpected interrupt, but would otherwise be relatively unconcerned.

That was disabled on the kernel command line.

> So I'm trying to figure out why my assumption is wrong.  Probably this
> is just because I don't know much about Linux IRQ handling.
>
> Having more details, e.g., a stacktrace fragment from a soft lockup,
> can also help people connect a problem they're seeing with the
> solution.  It's pretty hard to google for "kernel hang," but if you
> can google for a soft lockup in a specific function, that can be much
> more useful.

The thing is, not disabling msi interrupts for the case described in the
bugzilla report is the wrong fix.

The report is about a buggy driver doing the wrong thing.  Until someone
ships a system that is msi native (aka no intx support) disabling msi
interrupts at shutdown is the right thing to do.  If there is something
that handles intx interrupts it is not an msi native system.

The real bug is probably disabling buggy interrupt detection on the
kernel command line.

Beyond that, to handle kexec cleanly something needs to stop the
interrupts and stop the DMA transfers.  Which in the short term
means someone probably needs to write a shutdown method for the buggy
driver.

An interrupt coming in almost always implies a DMA having completed,
and if that DMA completed in the wrong spot the kexec'd kernel will be
toast.

We disable interrupts at boot so that a kernel started with
kexec-on-panic (which doesn't shut anything down) can boot.  There are
probably other valid use cases (like native msi interrupts) but I am not
aware of them.  But according to the pci spec shutting down msi
interrupts at boot should be a noop.

So in summary not disabling MSI/MSI-X at shutdown is the wrong fix,
and someone needs to fix a buggy driver.

Eric

* Re: [PATCH v6 04/10] PCI/MSI: Don't disable MSI/MSI-X at shutdown
  2015-04-13 16:45       ` Eric W. Biederman
@ 2015-04-14  9:44         ` Michael S. Tsirkin
  2015-04-16 19:42         ` Bjorn Helgaas
  1 sibling, 0 replies; 30+ messages in thread
From: Michael S. Tsirkin @ 2015-04-14  9:44 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Bjorn Helgaas, Fam Zheng, linux-pci, Rusty Russell,
	Ulrich Obergfell, Yinghai Lu, Yijing Wang, Yinghai Lu

On Mon, Apr 13, 2015 at 11:45:31AM -0500, Eric W. Biederman wrote:
> Bjorn Helgaas <bhelgaas@google.com> writes:
> 
> > On Mon, Apr 13, 2015 at 4:37 AM, Fam Zheng <famz@redhat.com> wrote:
> >> Hi Bjorn,
> >>
> >> On Fri, 04/10 17:54, Bjorn Helgaas wrote:
> >>> From: Michael S. Tsirkin <mst@redhat.com>
> >>>
> >>> d52877c7b1af ("pci/irq: let pci_device_shutdown to call pci_msi_shutdown
> >>> v2") disabled MSI/MSI-X at device shutdown to address a kexec problem.
> >>>
> >>> The problem is that after we disable MSI, the device may assert INTx, and
> >>> if the driver hasn't registered an interrupt handler for it, the interrupt
> >>> is never deasserted and causes a kernel hang.  In particular, this was
> >>> observed with virtio.
> >>>
> >>> We now disable MSI/MSI-X for all devices during enumeration regardless of
> >>> CONFIG_PCI_MSI.  This solves the kexec problem in the new kernel, not the
> >>> old one.
> >>>
> >>> Stop disabling MSIs at shutdown to avoid the kernel hang.
> >>>
> >>> XXX bugzilla reference, details about how the hang happens?
> >>
> >> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=96571
> >>
> >> Please let me know if you need any further information in the bug.
> >
> > Please attach a complete dmesg log.  The bugzilla doesn't really have
> > any new information other than that you see a soft lockup.  I'm trying
> > to connect more of the dots between a spurious interrupt and a hang or
> > soft lockup.
> >
> 
> The bugzilla implies that there is a screaming irq (which causes the
> softlockup when they disable the kernel's protections for buggy irqs).
> 
> > It doesn't seem right that a spurious interrupt could cause a hang or
> > soft lockup.
> 
> The interrupt handler keeps firing.
> 
> > I would think Linux would emit a message about the
> > unexpected interrupt, but would otherwise be relatively unconcerned.
> 
> That was disabled on the kernel command line.
> 
> > So I'm trying to figure out why my assumption is wrong.  Probably this
> > is just because I don't know much about Linux IRQ handling.
> >
> > Having more details, e.g., a stacktrace fragment from a soft lockup,
> > can also help people connect a problem they're seeing with the
> > solution.  It's pretty hard to google for "kernel hang," but if you
> > can google for a soft lockup in a specific function, that can be much
> > more useful.
> 
> The thing is, not disabling msi interrupts for the case described in the
> bugzilla report is the wrong fix.
> 
> The report is about a buggy driver doing the wrong thing.  Until someone
> ships a system that is msi native (aka no intx support) disabling msi
> interrupts at shutdown is the right thing to do.  If there is something
> that handles intx interrupts it is not an msi native system.
> 
> The real bug is probably disabling buggy interrupt detection on the
> kernel command line.
> 
> Beyond that, to handle kexec cleanly something needs to stop the
> interrupts and stop the DMA transfers.  Which in the short term
> means someone probably needs to write a shutdown method for the buggy
> driver.
> 
> An interrupt coming in almost always implies a DMA having completed,
> and if that DMA completed in the wrong spot the kexec'd kernel will be
> toast.
> 
> We disable interrupts at boot so that a kernel started with
> kexec-on-panic (which doesn't shut anything down) can boot.  There are
> probably other valid use cases (like native msi interrupts) but I am not
> aware of them.  But according to the pci spec shutting down msi
> interrupts at boot should be a noop.
> 
> So in summary not disabling MSI/MSI-X at shutdown is the wrong fix,
> and someone needs to fix a buggy driver.
> 
> Eric

I'm not all that worried about this patch making it into stable.  So I
suggest for now we ignore the bugzilla and just focus on the patch
itself.

And the patch itself is not about a buggy driver.  It's about
a correct driver causing screaming interrupts because
pci core decided to disable msi at shutdown.

Which is not necessary for two reasons:
- because the previous patches now disable msi during enumeration in
  the kexec'd kernel
- because suppressing DMA automatically suppresses MSI
  as well
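
The second reason follows from how MSI works: an MSI is signalled by a
memory write that the device itself issues, so a function whose Bus
Master Enable bit is cleared cannot deliver one.  A minimal user-space
sketch of that dependency is below.  It is a model, not kernel code: the
struct and the model_* helpers are hypothetical names made up for this
illustration; only the two bit values are taken from
include/uapi/linux/pci_regs.h.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_COMMAND_MASTER   0x0004  /* Bus Master Enable (pci_regs.h) */
#define PCI_MSI_FLAGS_ENABLE 0x0001  /* MSI Enable (pci_regs.h) */

/* Hypothetical model of the two config-space registers that matter here. */
struct model_dev {
	uint16_t command;    /* PCI_COMMAND */
	uint16_t msi_flags;  /* MSI Message Control */
};

/* An MSI is a memory write from the device: it needs MSI Enable
 * and bus mastering at the same time. */
static bool model_can_send_msi(const struct model_dev *dev)
{
	return (dev->msi_flags & PCI_MSI_FLAGS_ENABLE) &&
	       (dev->command & PCI_COMMAND_MASTER);
}

/* Stop DMA only; MSI Enable is deliberately left untouched. */
static void model_clear_master(struct model_dev *dev)
{
	dev->command &= (uint16_t)~PCI_COMMAND_MASTER;
}
```

In this model, calling model_clear_master() on a device with MSI enabled
makes model_can_send_msi() return false while the MSI Enable bit stays
set, so the device is never pushed back to INTx.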




-- 
MST

* Re: [PATCH v6 04/10] PCI/MSI: Don't disable MSI/MSI-X at shutdown
  2015-04-13 15:41     ` Bjorn Helgaas
  2015-04-13 16:45       ` Eric W. Biederman
@ 2015-04-14  9:47       ` Michael S. Tsirkin
  2015-04-14 10:45         ` Fam Zheng
  1 sibling, 1 reply; 30+ messages in thread
From: Michael S. Tsirkin @ 2015-04-14  9:47 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Fam Zheng, linux-pci, Rusty Russell, Ulrich Obergfell,
	Yinghai Lu, Eric W. Biederman, Yijing Wang, Yinghai Lu

On Mon, Apr 13, 2015 at 10:41:22AM -0500, Bjorn Helgaas wrote:
> On Mon, Apr 13, 2015 at 4:37 AM, Fam Zheng <famz@redhat.com> wrote:
> > Hi Bjorn,
> >
> > On Fri, 04/10 17:54, Bjorn Helgaas wrote:
> >> From: Michael S. Tsirkin <mst@redhat.com>
> >>
> >> d52877c7b1af ("pci/irq: let pci_device_shutdown to call pci_msi_shutdown
> >> v2") disabled MSI/MSI-X at device shutdown to address a kexec problem.
> >>
> >> The problem is that after we disable MSI, the device may assert INTx, and
> >> if the driver hasn't registered an interrupt handler for it, the interrupt
> >> is never deasserted and causes a kernel hang.  In particular, this was
> >> observed with virtio.
> >>
> >> We now disable MSI/MSI-X for all devices during enumeration regardless of
> >> CONFIG_PCI_MSI.  This solves the kexec problem in the new kernel, not the
> >> old one.
> >>
> >> Stop disabling MSIs at shutdown to avoid the kernel hang.
> >>
> >> XXX bugzilla reference, details about how the hang happens?
> >
> > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=96571
> >
> > Please let me know if you need any further information in the bug.
> 
> Please attach a complete dmesg log.  The bugzilla doesn't really have
> any new information other than that you see a soft lockup.  I'm trying
> to connect more of the dots between a spurious interrupt and a hang or
> soft lockup.
> 
> It doesn't seem right that a spurious interrupt could cause a hang or
> soft lockup.  I would think Linux would emit a message about the
> unexpected interrupt, but would otherwise be relatively unconcerned.
> So I'm trying to figure out why my assumption is wrong.  Probably this
> is just because I don't know much about Linux IRQ handling.
> 
> Having more details, e.g., a stacktrace fragment from a soft lockup,
> can also help people connect a problem they're seeing with the
> solution.  It's pretty hard to google for "kernel hang," but if you
> can google for a soft lockup in a specific function, that can be much
> more useful.

I have investigated this, and at this point I think the hang is basically
a non-issue. So the commit log should say

	if the driver hasn't registered an interrupt handler for it, the interrupt
	is never deasserted and causes spurious interrupts, typically
	followed by kernel disabling the irq.
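
For readers unfamiliar with that recovery path, here is a simplified
user-space model of how unhandled interrupts can lead to the line being
disabled.  It is only a sketch: the struct and function names are
hypothetical, and the two thresholds are assumed to mirror the ones in
kernel/irq/spurious.c, so treat them as illustrative rather than
authoritative.

```c
#include <stdbool.h>

/* Assumed to mirror kernel/irq/spurious.c; illustrative values only. */
#define MODEL_IRQ_WINDOW      100000
#define MODEL_UNHANDLED_LIMIT  99900

/* Hypothetical per-IRQ bookkeeping for this model. */
struct model_irq_desc {
	unsigned int irq_count;       /* interrupts seen in this window */
	unsigned int irqs_unhandled;  /* of those, how many nobody claimed */
	bool disabled;                /* line has been shut off */
};

/* Account for one interrupt; 'handled' is false when no handler claimed it. */
static void model_note_interrupt(struct model_irq_desc *desc, bool handled)
{
	if (desc->disabled)
		return;
	if (!handled)
		desc->irqs_unhandled++;
	if (++desc->irq_count < MODEL_IRQ_WINDOW)
		return;
	/* End of a window: give up on the line if nearly every
	 * interrupt in it went unclaimed. */
	if (desc->irqs_unhandled > MODEL_UNHANDLED_LIMIT)
		desc->disabled = true;
	desc->irq_count = 0;
	desc->irqs_unhandled = 0;
}
```

With no handler registered, every call passes handled == false, so the
line is marked disabled at the end of the first window; a line whose
interrupts are all handled is never disabled.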


> >> [bhelgaas: changelog]
> >> Reported-by: Fam Zheng <famz@redhat.com>
> >> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> >> CC: Yinghai Lu <yhlu.kernel.send@gmail.com>
> >> CC: Ulrich Obergfell <uobergfe@redhat.com>
> >> CC: Rusty Russell <rusty@rustcorp.com.au>
> >> ---
> >>  drivers/pci/pci-driver.c |    2 --
> >>  1 file changed, 2 deletions(-)
> >>
> >> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> >> index 3cb2210de553..38a602cb9fb7 100644
> >> --- a/drivers/pci/pci-driver.c
> >> +++ b/drivers/pci/pci-driver.c
> >> @@ -450,8 +450,6 @@ static void pci_device_shutdown(struct device *dev)
> >>
> >>       if (drv && drv->shutdown)
> >>               drv->shutdown(pci_dev);
> >> -     pci_msi_shutdown(pci_dev);
> >> -     pci_msix_shutdown(pci_dev);
> >>
> >>  #ifdef CONFIG_KEXEC
> >>       /*
> >>

* Re: [PATCH v6 04/10] PCI/MSI: Don't disable MSI/MSI-X at shutdown
  2015-04-14  9:47       ` Michael S. Tsirkin
@ 2015-04-14 10:45         ` Fam Zheng
  2015-04-14 10:49           ` Michael S. Tsirkin
  0 siblings, 1 reply; 30+ messages in thread
From: Fam Zheng @ 2015-04-14 10:45 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Bjorn Helgaas, linux-pci, Rusty Russell, Ulrich Obergfell,
	Yinghai Lu, Eric W. Biederman, Yijing Wang, Yinghai Lu

On Tue, 04/14 11:47, Michael S. Tsirkin wrote:
> I have investigated this, and at this point I think the hang is basically
> a non-issue. So the commit log should say
> 
> 	if the driver hasn't registered an interrupt handler for it, the interrupt
> 	is never deasserted and causes spurious interrupts, typically
> 	followed by kernel disabling the irq.

Or, how about disabling intx immediately too?

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 3cb2210..dd7dcc1 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -452,6 +452,7 @@ static void pci_device_shutdown(struct device *dev)
                drv->shutdown(pci_dev);
        pci_msi_shutdown(pci_dev);
        pci_msix_shutdown(pci_dev);
+       pci_intx(pci_dev, 0);

 #ifdef CONFIG_KEXEC
        /*


* Re: [PATCH v6 04/10] PCI/MSI: Don't disable MSI/MSI-X at shutdown
  2015-04-14 10:45         ` Fam Zheng
@ 2015-04-14 10:49           ` Michael S. Tsirkin
  0 siblings, 0 replies; 30+ messages in thread
From: Michael S. Tsirkin @ 2015-04-14 10:49 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Bjorn Helgaas, linux-pci, Rusty Russell, Ulrich Obergfell,
	Yinghai Lu, Eric W. Biederman, Yijing Wang, Yinghai Lu

On Tue, Apr 14, 2015 at 06:45:05PM +0800, Fam Zheng wrote:
> On Tue, 04/14 11:47, Michael S. Tsirkin wrote:
> > I have investigated this, and at this point I think the hang is basically
> > a non-issue. So the commit log should say
> > 
> > 	if the driver hasn't registered an interrupt handler for it, the interrupt
> > 	is never deasserted and causes spurious interrupts, typically
> > 	followed by kernel disabling the irq.
> 
> Or, how about disabling intx immediately too?
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 3cb2210..dd7dcc1 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -452,6 +452,7 @@ static void pci_device_shutdown(struct device *dev)
>                 drv->shutdown(pci_dev);
>         pci_msi_shutdown(pci_dev);
>         pci_msix_shutdown(pci_dev);
> +       pci_intx(pci_dev, 0);
> 
>  #ifdef CONFIG_KEXEC
>         /*

That needs to happen before msi shutdown, then.
There are also drivers which call pci_intx() from their interrupt
handlers, which would conflict.
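
The ordering concern can be illustrated with a small user-space model.
This is a sketch under stated assumptions, not kernel code: the device
is assumed to fall back to INTx as soon as MSI-X is disabled (the virtio
behaviour discussed in this thread), and the struct and model_* helpers
are hypothetical.  Only the two bit values come from
include/uapi/linux/pci_regs.h.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_COMMAND_INTX_DISABLE 0x0400  /* INTx Disable (pci_regs.h) */
#define PCI_MSIX_FLAGS_ENABLE    0x8000  /* MSI-X Enable (pci_regs.h) */

/* Hypothetical model of a function with one event waiting to be signalled. */
struct model_fn {
	uint16_t command;     /* PCI_COMMAND */
	uint16_t msix_flags;  /* MSI-X Message Control */
	bool event_pending;
};

/* INTx is asserted when an event is pending, MSI-X is off,
 * and INTx has not been masked in the command register. */
static bool model_intx_asserted(const struct model_fn *fn)
{
	return fn->event_pending &&
	       !(fn->msix_flags & PCI_MSIX_FLAGS_ENABLE) &&
	       !(fn->command & PCI_COMMAND_INTX_DISABLE);
}

/* Run the two teardown steps in the given order and report whether
 * INTx was asserted at any point in between. */
static bool model_teardown_exposes_intx(bool intx_first)
{
	struct model_fn fn = {
		.command = 0,
		.msix_flags = PCI_MSIX_FLAGS_ENABLE,
		.event_pending = true,
	};
	bool seen = false;

	if (intx_first) {
		fn.command |= PCI_COMMAND_INTX_DISABLE;
		seen |= model_intx_asserted(&fn);
		fn.msix_flags &= (uint16_t)~PCI_MSIX_FLAGS_ENABLE;
		seen |= model_intx_asserted(&fn);
	} else {
		fn.msix_flags &= (uint16_t)~PCI_MSIX_FLAGS_ENABLE;
		seen |= model_intx_asserted(&fn);
		fn.command |= PCI_COMMAND_INTX_DISABLE;
		seen |= model_intx_asserted(&fn);
	}
	return seen;
}
```

In the model, disabling MSI-X first leaves a window in which INTx is
asserted, while masking INTx first does not.  The model says nothing
about the second concern, drivers that toggle INTx from their own
interrupt handlers.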

> > 
> > 
> > > >> [bhelgaas: changelog]
> > > >> Reported-by: Fam Zheng <famz@redhat.com>
> > > >> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > >> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> > > >> CC: Yinghai Lu <yhlu.kernel.send@gmail.com>
> > > >> CC: Ulrich Obergfell <uobergfe@redhat.com>
> > > >> CC: Rusty Russell <rusty@rustcorp.com.au>
> > > >> ---
> > > >>  drivers/pci/pci-driver.c |    2 --
> > > >>  1 file changed, 2 deletions(-)
> > > >>
> > > >> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> > > >> index 3cb2210de553..38a602cb9fb7 100644
> > > >> --- a/drivers/pci/pci-driver.c
> > > >> +++ b/drivers/pci/pci-driver.c
> > > >> @@ -450,8 +450,6 @@ static void pci_device_shutdown(struct device *dev)
> > > >>
> > > >>       if (drv && drv->shutdown)
> > > >>               drv->shutdown(pci_dev);
> > > >> -     pci_msi_shutdown(pci_dev);
> > > >> -     pci_msix_shutdown(pci_dev);
> > > >>
> > > >>  #ifdef CONFIG_KEXEC
> > > >>       /*
> > > >>

* Re: [PATCH v6 04/10] PCI/MSI: Don't disable MSI/MSI-X at shutdown
  2015-04-10 22:54 ` [PATCH v6 04/10] PCI/MSI: Don't disable MSI/MSI-X at shutdown Bjorn Helgaas
  2015-04-13  9:37   ` Fam Zheng
@ 2015-04-16  7:30   ` Michael S. Tsirkin
  1 sibling, 0 replies; 30+ messages in thread
From: Michael S. Tsirkin @ 2015-04-16  7:30 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Fam Zheng, linux-pci, Rusty Russell, Ulrich Obergfell,
	Yinghai Lu, Eric W. Biederman, Yijing Wang, Yinghai Lu

OK, a couple more tweaks to the changelog.

On Fri, Apr 10, 2015 at 05:54:47PM -0500, Bjorn Helgaas wrote:
> From: Michael S. Tsirkin <mst@redhat.com>
> 
> d52877c7b1af ("pci/irq: let pci_device_shutdown to call pci_msi_shutdown
> v2") disabled MSI/MSI-X at device shutdown to address a kexec problem.
> 
> The problem is that after we disable MSI, the device may assert INTx, and
> if the driver hasn't registered an interrupt handler for it, the interrupt
> is never deasserted and causes a kernel hang.

I think we should drop "and causes a kernel hang" from this sentence:
most configurations can work around this by disabling the irq
line in the apic.

>  In particular, this was
> observed with virtio.
> 
> We now disable MSI/MSI-X for all devices during enumeration regardless of
> CONFIG_PCI_MSI.  This solves the kexec problem in the new kernel, not the
> old one.
> 
> Stop disabling MSIs at shutdown to avoid the kernel hang.

And replace this one with:
  Stop disabling MSIs at shutdown to avoid conflicting with
  drivers.


> XXX bugzilla reference, details about how the hang happens?

Add:
See bugzilla https://bugzilla.kernel.org/show_bug.cgi?id=96571
(that one's for a kernel that lacks
 1e77d0a1ed7417d2a5a52a7b8d32aea1833faa6c, so
 it doesn't recover by disabling the irq line).

> 
> [bhelgaas: changelog]
> Reported-by: Fam Zheng <famz@redhat.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> CC: Yinghai Lu <yhlu.kernel.send@gmail.com>
> CC: Ulrich Obergfell <uobergfe@redhat.com>
> CC: Rusty Russell <rusty@rustcorp.com.au>
> ---
>  drivers/pci/pci-driver.c |    2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 3cb2210de553..38a602cb9fb7 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -450,8 +450,6 @@ static void pci_device_shutdown(struct device *dev)
>  
>  	if (drv && drv->shutdown)
>  		drv->shutdown(pci_dev);
> -	pci_msi_shutdown(pci_dev);
> -	pci_msix_shutdown(pci_dev);
>  
>  #ifdef CONFIG_KEXEC
>  	/*

* Re: [PATCH v6 04/10] PCI/MSI: Don't disable MSI/MSI-X at shutdown
  2015-04-13 16:45       ` Eric W. Biederman
  2015-04-14  9:44         ` Michael S. Tsirkin
@ 2015-04-16 19:42         ` Bjorn Helgaas
  2015-04-17  1:05           ` Fam Zheng
  1 sibling, 1 reply; 30+ messages in thread
From: Bjorn Helgaas @ 2015-04-16 19:42 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Fam Zheng, Michael S. Tsirkin, linux-pci, Rusty Russell,
	Ulrich Obergfell, Yinghai Lu, Yijing Wang, Yinghai Lu

On Mon, Apr 13, 2015 at 11:45:31AM -0500, Eric W. Biederman wrote:
> ...
> The thing is, not disabling msi interrupts for the case described in the
> bugzilla report is the wrong fix.
> 
> The report is about a buggy driver doing the wrong thing.  Until someone
> ships a system that is msi native (aka no intx support) disabling msi
> interrupts at shutdown is the right thing to do.  If there is something
> that handles intx interrupts it is not an msi native system.
> 
> The real bug is probably disabling buggy interrupt detection on the
> kernel command line.
> 
> Beyond that, to handle kexec cleanly something needs to stop the
> interrupts and stop the DMA transfers.  Which in the short term
> means someone probably needs to write a shutdown method for the buggy
> driver.
> 
> An interrupt coming in almost always implies a DMA having completed,
> and if that DMA completed in the wrong spot the kexec'd kernel will be
> toast.
> 
> We disable interrupts at boot so that a kernel started with
> kexec-on-panic (which doesn't shut anything down) can boot.  There are
> probably other valid use cases (like native msi interrupts) but I am not
> aware of them.  But according to the pci spec shutting down msi
> interrupts at boot should be a noop.
> 
> So in summary not disabling MSI/MSI-X at shutdown is the wrong fix,
> and someone needs to fix a buggy driver.

Are you saying that:

  - pci_device_shutdown() should continue to call pci_msi_shutdown() and
    pci_msix_shutdown() as it does today, and

  - virtio_pci_driver should implement a .shutdown method?

I'm missing a lot of the context, and this is really outside my normal
sphere, so I'm trying to figure out the scenario we're talking about.
Here's my pitiful guess (Michael/Fam, please correct me where I'm wrong):

  qemu emulates machine with virtio device X, e.g., [1af4:1001]

  guest Linux startup
    guest virtio-pci driver claims device X
      virtio_pci_probe
	register_virtio_device		# adds new device Y on virtio_bus

  guest Linux virtblk_probe		# virtio_driver.probe for device Y
    init_vq
      ...
	vp_find_vqs
	  vp_try_to_find_vqs
	    vp_request_msix_vectors
	      pci_enable_msix_exact	# enables MSI-X for qemu virtio device X
	      request_irq(..., vp_config_changed, ...)

  guest Linux shutdown
    kernel_halt
      ...
	pci_device_shutdown		# device X
	  drv->shutdown
	  pci_msi_shutdown
	  pci_msix_shutdown
	    clear PCI_MSIX_FLAGS_ENABLE

  qemu virtio device X generates interrupt
    virtio_pci_notify
      if (!msix_enabled)		# qemu reads MSIX_ENABLE_MASK
	pci_set_irq
	  pci_irq_handler		# assert INTx in guest

  guest Linux virtio-pci has no ISR for INTx

So now the guest Linux has INTx asserted, but it has no ISR for it, so the
CPU receiving the IRQ is stuck calling do_IRQ() endlessly.
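
The scenario above can be sketched as a small userspace model.  This is
only an illustration, not qemu or kernel code: struct model_dev and the
model_*() helpers are hypothetical names, and the only value taken from the
real headers is the PCI_MSIX_FLAGS_ENABLE bit.  It shows why clearing MSI-X
Enable on a still-active device reroutes its next interrupt to INTx, and
why quiescing the device in a driver shutdown method avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PCI_MSIX_FLAGS_ENABLE 0x8000	/* MSI-X Enable bit in Message Control */

enum irq_path { IRQ_NONE_RAISED, IRQ_MSIX, IRQ_INTX };

/* Hypothetical, simplified model of the emulated device (not qemu code). */
struct model_dev {
	uint16_t msix_ctrl;	/* MSI-X Message Control register */
	bool active;		/* device still running, may raise interrupts */
	bool intx_asserted;	/* level-triggered INTx line state */
};

/* What the device does when it wants to notify the guest. */
static enum irq_path model_notify(struct model_dev *d)
{
	if (!d->active)
		return IRQ_NONE_RAISED;
	if (d->msix_ctrl & PCI_MSIX_FLAGS_ENABLE)
		return IRQ_MSIX;
	d->intx_asserted = true;	/* MSI-X is off, so fall back to INTx */
	return IRQ_INTX;
}

/* Models the PCI core step in the trace: only clear MSI-X Enable. */
static void model_core_shutdown(struct model_dev *d)
{
	d->msix_ctrl &= (uint16_t)~PCI_MSIX_FLAGS_ENABLE;
}

/* Hypothetical driver shutdown method that quiesces the device first. */
static void model_driver_shutdown(struct model_dev *d)
{
	d->active = false;
}
```

Calling model_core_shutdown() alone and then model_notify() returns
IRQ_INTX with the line left asserted; calling model_driver_shutdown()
first leaves the line idle.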

Bjorn

* Re: [PATCH v6 04/10] PCI/MSI: Don't disable MSI/MSI-X at shutdown
  2015-04-16 19:42         ` Bjorn Helgaas
@ 2015-04-17  1:05           ` Fam Zheng
  0 siblings, 0 replies; 30+ messages in thread
From: Fam Zheng @ 2015-04-17  1:05 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Eric W. Biederman, Michael S. Tsirkin, linux-pci, Rusty Russell,
	Ulrich Obergfell, Yinghai Lu, Yijing Wang, Yinghai Lu

On Thu, 04/16 14:42, Bjorn Helgaas wrote:
> On Mon, Apr 13, 2015 at 11:45:31AM -0500, Eric W. Biederman wrote:
> > ...
> > The thing is not disabling msi interrupts for the case described in the
> > bugzilla report is the wrong fix.
> > 
> > The report is about a buggy driver doing the wrong thing.  Until someone
> > ships a system that is msi native (aka no intx support) disabling msi
> > interrupts at shutdown is the right thing to do.  If there is something
> > that handles intx interrupts it is not an msi native system.
> > 
> > The real bug is probably disabling buggy interrupt detection on the
> > kernel command line.
> > 
> > Beyond that to handle kexec cleanly something needs to stop the
> > interrupts and stop the DMA transfers.  Which in the short term
> > means someone probably needs to write a shutdown method for the buggy
> > driver.
> > 
> > An interrupt coming in almost always implies a DMA having completed,
> > and if that DMA completed in the wrong spot the kexec'd kernel will be
> > toast.
> > 
> > We disable interrupts at boot so that a kernel started with
> > kexec-on-panic (which doesn't shut anything down) can boot.  There are
> > probably other valid use cases (like native msi interrupts) but I am not
> > aware of them.  But according to the pci spec shutting down msi
> > interrupts at boot should be a noop.
> > 
> > So in summary not disabling MSI/MSI-X at shutdown is the wrong fix,
> > and someone needs to fix a buggy driver.
> 
> Are you saying that:
> 
>   - pci_device_shutdown() should continue to call pci_msi_shutdown() and
>     pci_msix_shutdown() as it does today, and
> 
>   - virtio_pci_driver should implement a .shutdown method?
> 
> I'm missing a lot of the context, and this is really outside my normal
> sphere, so I'm trying to figure out the scenario we're talking about.
> Here's my pitiful guess (Michael/Fam, please correct me where I'm wrong):
> 
>   qemu emulates machine with virtio device X, e.g., [1af4:1001]
> 
>   guest Linux startup
>     guest virtio-pci driver claims device X
>       virtio_pci_probe
> 	register_virtio_device		# adds new device Y on virtio_bus
> 
>   guest Linux virtblk_probe		# virtio_driver.probe for device Y
>     init_vq
>       ...
> 	vp_find_vqs
> 	  vp_try_to_find_vqs
> 	    vp_request_msix_vectors
> 	      pci_enable_msix_exact	# enables MSI-X for qemu virtio device X
> 	      request_irq(..., vp_config_changed, ...)
> 
>   guest Linux shutdown
>     kernel_halt
>       ...
> 	pci_device_shutdown		# device X
> 	  drv->shutdown
> 	  pci_msi_shutdown
> 	  pci_msix_shutdown
> 	    clear PCI_MSIX_FLAGS_ENABLE
> 
>   qemu virtio device X generates interrupt
>     virtio_pci_notify
>       if (!msix_enabled)		# qemu reads MSIX_ENABLE_MASK
> 	pci_set_irq
> 	  pci_irq_handler		# assert INTx in guest
> 
>   guest Linux virtio-pci has no ISR for INTx
> 
> So now the guest Linux has INTx asserted, but it has no ISR for it, so the
> CPU receiving the IRQ is stuck calling do_IRQ() endlessly.

Exactly.

Fam

* Re: [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown
  2015-04-10 22:54 [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown Bjorn Helgaas
                   ` (9 preceding siblings ...)
  2015-04-10 22:55 ` [PATCH v6 10/10] PCI/MSI: Remove unused pci_msi_off() Bjorn Helgaas
@ 2015-04-26  6:50 ` Michael S. Tsirkin
  2015-05-06 21:03   ` Bjorn Helgaas
  10 siblings, 1 reply; 30+ messages in thread
From: Michael S. Tsirkin @ 2015-04-26  6:50 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Yijing Wang, linux-pci, Fam Zheng, Yinghai Lu, Eric W. Biederman

On Fri, Apr 10, 2015 at 05:54:19PM -0500, Bjorn Helgaas wrote:
> Hi Michael,
> 
> I put your patches on my pci/msi branch and I hope to merge them for v4.1.
> I didn't apply the acks from Fam and Eric because I made changes to those
> patches that weren't completely trivial.  I think the end result is
> equivalent, though.  The diff attached to this cover letter is the
> difference between your v5 series and this v6 series.
> 
> As far as I'm concerned, this is ready to go except that I would like a
> little more info about the virtio kernel hang to include in the changelog
> for "PCI/MSI: Don't disable MSI/MSI-X at shutdown".


Hi Bjorn,
do you have everything you need to merge this?

> Changes from v5:
> 	Edit summaries and changelogs for consistency
> 	Split msi_set_enable() rename/export for reviewability
> 	Move pci_msi_setup_pci_dev() to its ultimate location to avoid
> 	    unnecessary diffs in subsequent patch
> 	Call pci_msi_setup_pci_dev() from its ultimate location to avoid
> 	    unnecessary diffs in subsequent patch
> 	Skip pci_msi_off() duplicate code removal since we can remove
> 	    it completely later
> 	Remove pci_msi_off() completely
> 
> v5 posting: http://lkml.kernel.org/r/1427641227-7574-1-git-send-email-mst@redhat.com
> 
> Bjorn
>     
> ---
> 
> Bjorn Helgaas (1):
>       PCI/MSI: Remove unused pci_msi_off()
> 
> Michael S. Tsirkin (9):
>       PCI/MSI: Rename msi_set_enable(), msix_clear_and_set_ctrl()
>       PCI/MSI: Export pci_msi_set_enable(), pci_msix_clear_and_set_ctrl()
>       PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI
>       PCI/MSI: Don't disable MSI/MSI-X at shutdown
>       PCI/MSI: Make pci_msi_shutdown(), pci_msix_shutdown() static
>       virtio_pci: drop pci_msi_off() call during probe
>       ntb: Drop pci_msi_off() call during probe
>       mic: Drop pci_msi_off() call during probe
>       PCI/MSI: Drop pci_msi_off() calls from quirks
> 
> 
>  drivers/misc/mic/host/mic_intr.c   |    2 -
>  drivers/ntb/ntb_hw.c               |    2 -
>  drivers/pci/msi.c                  |   57 ++++++++----------------------------
>  drivers/pci/pci-driver.c           |    2 -
>  drivers/pci/pci.c                  |   33 ---------------------
>  drivers/pci/pci.h                  |   21 +++++++++++++
>  drivers/pci/probe.c                |   18 +++++++++++
>  drivers/pci/quirks.c               |    2 -
>  drivers/virtio/virtio_pci_common.c |    3 --
>  include/linux/pci.h                |    5 ---
>  10 files changed, 51 insertions(+), 94 deletions(-)
> 
> 
> --- This is "git diff v5 v6":
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 54cefb442d19..3d938a7d3b04 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3096,24 +3096,6 @@ bool pci_check_and_unmask_intx(struct pci_dev *dev)
>  }
>  EXPORT_SYMBOL_GPL(pci_check_and_unmask_intx);
>  
> -/**
> - * pci_msi_off - disables any MSI or MSI-X capabilities
> - * @dev: the PCI device to operate on
> - *
> - * If you want to use MSI, see pci_enable_msi() and friends.
> - * This is a lower-level primitive that allows us to disable
> - * MSI operation at the device level.
> - * Not for use by drivers.
> - */
> -void pci_msi_off(struct pci_dev *dev)
> -{
> -	if (dev->msi_cap)
> -		pci_msi_set_enable(dev, 0);
> -
> -	if (dev->msix_cap)
> -		pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
> -}
> -
>  int pci_set_dma_max_seg_size(struct pci_dev *dev, unsigned int size)
>  {
>  	return dma_set_max_seg_size(&dev->dev, size);
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 620fcad1935d..17f213d494de 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -146,8 +146,6 @@ static inline void pci_no_msi(void) { }
>  static inline void pci_msi_init_pci_dev(struct pci_dev *dev) { }
>  #endif
>  
> -void pci_msi_off(struct pci_dev *dev);
> -
>  static inline void pci_msi_set_enable(struct pci_dev *dev, int enable)
>  {
>  	u16 control;
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 120772c219c7..740113b70ade 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1086,14 +1086,18 @@ int pci_cfg_space_size(struct pci_dev *dev)
>  
>  static void pci_msi_setup_pci_dev(struct pci_dev *dev)
>  {
> -	dev->msi_cap = pci_find_capability(dev, PCI_CAP_ID_MSI);
> -	dev->msix_cap = pci_find_capability(dev, PCI_CAP_ID_MSIX);
> -
> -	/* Disable the msi hardware to avoid screaming interrupts
> +	/*
> +	 * Disable the MSI hardware to avoid screaming interrupts
>  	 * during boot.  This is the power on reset default so
>  	 * usually this should be a noop.
>  	 */
> -	pci_msi_off(dev);
> +	dev->msi_cap = pci_find_capability(dev, PCI_CAP_ID_MSI);
> +	if (dev->msi_cap)
> +		pci_msi_set_enable(dev, 0);
> +
> +	dev->msix_cap = pci_find_capability(dev, PCI_CAP_ID_MSIX);
> +	if (dev->msix_cap)
> +		pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
>  }
>  
>  /**
> @@ -1151,7 +1155,6 @@ int pci_setup_device(struct pci_dev *dev)
>  	/* "Unknown power state" */
>  	dev->current_state = PCI_UNKNOWN;
>  
> -	/* MSI/MSI-X setup has to be done early since it's used by quirks. */
>  	pci_msi_setup_pci_dev(dev);
>  
>  	/* Early fixups, before probing the BARs */

* Re: [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown
  2015-04-26  6:50 ` [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown Michael S. Tsirkin
@ 2015-05-06 21:03   ` Bjorn Helgaas
  2015-05-07  0:53     ` Eric W. Biederman
  2015-05-10 11:09     ` Michael S. Tsirkin
  0 siblings, 2 replies; 30+ messages in thread
From: Bjorn Helgaas @ 2015-05-06 21:03 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Yijing Wang, linux-pci, Fam Zheng, Yinghai Lu, Eric W. Biederman

On Sun, Apr 26, 2015 at 08:50:06AM +0200, Michael S. Tsirkin wrote:
> On Fri, Apr 10, 2015 at 05:54:19PM -0500, Bjorn Helgaas wrote:
> > Hi Michael,
> > 
> > I put your patches on my pci/msi branch and I hope to merge them for v4.1.
> > I didn't apply the acks from Fam and Eric because I made changes to those
> > patches that weren't completely trivial.  I think the end result is
> > equivalent, though.  The diff attached to this cover letter is the
> > difference between your v5 series and this v6 series.
> > 
> > As far as I'm concerned, this is ready to go except that I would like a
> > little more info about the virtio kernel hang to include in the changelog
> > for "PCI/MSI: Don't disable MSI/MSI-X at shutdown".
> 
> 
> Hi Bjorn,
> do you have everything you need to merge this?

No.  I made the minor changelog edits you suggested and the result is on
my pci/msi-v7 branch.  But I still have these open issues:

  - The last thing I heard from Eric was that "not disabling MSI/MSI-X at
    shutdown is the wrong fix, and someone needs to fix a buggy driver."
    I want to hear Eric say "OK, we need to leave MSI/MSI-X enabled at
    shutdown for this case."

  - One changelog says "Stop disabling MSIs at shutdown to avoid
    conflicting with drivers."  But I don't know what the conflict is.

  - The bugzilla has no dmesg log or detailed analysis.  Fam said the
    scenario I came up with
    (http://lkml.kernel.org/r/20150416194245.GB20701@google.com)
    was fairly close, but it took me a lot of work to derive that.  Fixing
    any errors in it and putting it in the bugzilla would be a big step.
    The bugzilla should have the raw data and the analysis, so someone else
    can validate the analysis and conclude that this patch is a reasonable
    fix for it.  That's currently impossible because the bugzilla really
    only contains the fix as a fait accompli.

Bjorn

* Re: [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown
  2015-05-06 21:03   ` Bjorn Helgaas
@ 2015-05-07  0:53     ` Eric W. Biederman
  2015-05-07 15:04       ` Bjorn Helgaas
  2015-05-10 11:09     ` Michael S. Tsirkin
  1 sibling, 1 reply; 30+ messages in thread
From: Eric W. Biederman @ 2015-05-07  0:53 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Michael S. Tsirkin, Yijing Wang, linux-pci, Fam Zheng, Yinghai Lu

Bjorn Helgaas <bhelgaas@google.com> writes:

> On Sun, Apr 26, 2015 at 08:50:06AM +0200, Michael S. Tsirkin wrote:
>> On Fri, Apr 10, 2015 at 05:54:19PM -0500, Bjorn Helgaas wrote:
>> > Hi Michael,
>> > 
>> > I put your patches on my pci/msi branch and I hope to merge them for v4.1.
>> > I didn't apply the acks from Fam and Eric because I made changes to those
>> > patches that weren't completely trivial.  I think the end result is
>> > equivalent, though.  The diff attached to this cover letter is the
>> > difference between your v5 series and this v6 series.
>> > 
>> > As far as I'm concerned, this is ready to go except that I would like a
>> > little more info about the virtio kernel hang to include in the changelog
>> > for "PCI/MSI: Don't disable MSI/MSI-X at shutdown".
>> 
>> 
>> Hi Bjorn,
>> do you have everything you need to merge this?
>
> No.  I made the minor changelog edits you suggested and the result is on
> my pci/msi-v7 branch.  But I still have these open issues:
>
>   - The last thing I heard from Eric was that "not disabling MSI/MSI-X at
>     shutdown is the wrong fix, and someone needs to fix a buggy driver."
>     I want to hear Eric say "OK, we need to leave MSI/MSI-X enabled at
>     shutdown for this case."

So far this just sounds like a device that needs a shutdown method.

>   - One changelog says "Stop disabling MSIs at shutdown to avoid
>     conflicting with drivers."  But I don't know what the conflict is.
>
>   - The bugzilla has no dmesg log or detailed analysis.  Fam said the
>     scenario I came up with
>     (http://lkml.kernel.org/r/20150416194245.GB20701@google.com)
>     was fairly close, but it took me a lot of work to derive that.  Fixing
>     any errors in it and putting it in the bugzilla would be a big step.
>     The bugzilla should have the raw data and the analysis, so someone else
>     can validate the analysis and conclude that this patch is a reasonable
>     fix for it.  That's currently impossible because the bugzilla really
>     only contains the fix as a fait accompli.

What I saw in the bugzilla was:

An interrupt was stuck on, and being reasserted as quickly as we could
call iret for that interrupt.

We did not disable that interrupt because irq debugging was explicitly
disabled on the kernel command line.

The irq was asserted because the device did not have a shutdown
method to stop the device doing things.

There is an argument that disabling bus mastering should have disabled
whatever the interrupting condition was (and thus there may also be a
bug in the qemu device emulation).

So I read this as the driver, and maybe the "hardware", being buggy, not
the Linux PCI layer.
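
The point about irq debugging can be illustrated with a rough userspace
model of spurious-interrupt detection.  This is a sketch under stated
assumptions, not the kernel implementation: struct model_irq and the
model_*() helpers are hypothetical, and the window and threshold values are
assumptions chosen to resemble the kernel's detector.  It shows that an
unhandled interrupt storm is cut off once the detector trips, but continues
indefinitely when detection is disabled (the "noirqdebug" case).

```c
#include <assert.h>
#include <stdbool.h>

#define IRQ_WINDOW	100000u	/* interrupts per evaluation window (assumed) */
#define UNHANDLED_LIMIT	 99900u	/* unhandled count that trips the detector (assumed) */

/* Hypothetical per-IRQ accounting state. */
struct model_irq {
	unsigned int count;	/* interrupts seen in the current window */
	unsigned int unhandled;	/* of those, how many no ISR claimed */
	bool disabled;		/* line has been shut off by the detector */
};

/* Account for one interrupt; 'handled' says whether any ISR claimed it. */
static void model_note_interrupt(struct model_irq *irq, bool handled,
				 bool noirqdebug)
{
	if (noirqdebug || irq->disabled)
		return;		/* detection is off or already tripped */
	if (!handled)
		irq->unhandled++;
	if (++irq->count < IRQ_WINDOW)
		return;
	if (irq->unhandled > UNHANDLED_LIMIT)
		irq->disabled = true;
	irq->count = 0;
	irq->unhandled = 0;
}

/*
 * Deliver up to 'n' unhandled interrupts and return how many were
 * delivered before the line was disabled (or 'n' if it never was).
 */
static unsigned int model_storm(struct model_irq *irq, unsigned int n,
				bool noirqdebug)
{
	unsigned int delivered = 0;

	while (delivered < n && !irq->disabled) {
		model_note_interrupt(irq, false, noirqdebug);
		delivered++;
	}
	return delivered;
}
```

With detection enabled the model stops the storm after one full window;
with it disabled every one of the 'n' interrupts is delivered, which
matches the endless do_IRQ() loop described in the bugzilla.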

Eric


* Re: [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown
  2015-05-07  0:53     ` Eric W. Biederman
@ 2015-05-07 15:04       ` Bjorn Helgaas
  2015-05-10 11:05         ` Michael S. Tsirkin
  0 siblings, 1 reply; 30+ messages in thread
From: Bjorn Helgaas @ 2015-05-07 15:04 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Michael S. Tsirkin, Yijing Wang, linux-pci, Fam Zheng, Yinghai Lu

On Wed, May 06, 2015 at 07:53:48PM -0500, Eric W. Biederman wrote:
> Bjorn Helgaas <bhelgaas@google.com> writes:
> 
> > On Sun, Apr 26, 2015 at 08:50:06AM +0200, Michael S. Tsirkin wrote:
> >> On Fri, Apr 10, 2015 at 05:54:19PM -0500, Bjorn Helgaas wrote:
> >> > Hi Michael,
> >> > 
> >> > I put your patches on my pci/msi branch and I hope to merge them for v4.1.
> >> > I didn't apply the acks from Fam and Eric because I made changes to those
> >> > patches that weren't completely trivial.  I think the end result is
> >> > equivalent, though.  The diff attached to this cover letter is the
> >> > difference between your v5 series and this v6 series.
> >> > 
> >> > As far as I'm concerned, this is ready to go except that I would like a
> >> > little more info about the virtio kernel hang to include in the changelog
> >> > for "PCI/MSI: Don't disable MSI/MSI-X at shutdown".
> >> 
> >> 
> >> Hi Bjorn,
> >> do you have everything you need to merge this?
> >
> > No.  I made the minor changelog edits you suggested and the result is on
> > my pci/msi-v7 branch.  But I still have these open issues:
> >
> >   - The last thing I heard from Eric was that "not disabling MSI/MSI-X at
> >     shutdown is the wrong fix, and someone needs to fix a buggy driver."
> >     I want to hear Eric say "OK, we need to leave MSI/MSI-X enabled at
> >     shutdown for this case."
> 
> So far this just sounds like a device that needs a shutdown method.
> 
> >   - One changelog says "Stop disabling MSIs at shutdown to avoid
> >     conflicting with drivers."  But I don't know what the conflict is.
> >
> >   - The bugzilla has no dmesg log or detailed analysis.  Fam said the
> >     scenario I came up with
> >     (http://lkml.kernel.org/r/20150416194245.GB20701@google.com)
> >     was fairly close, but it took me a lot of work to derive that.  Fixing
> >     any errors in it and putting it in the bugzilla would be a big step.
> >     The bugzilla should have the raw data and the analysis, so someone else
> >     can validate the analysis and conclude that this patch is a reasonable
> >     fix for it.  That's currently impossible because the bugzilla really
> >     only contains the fix as a fait accompli.
> 
> What I saw in the bugzilla was:
> 
> An interrupt was stuck on, and being reasserted as quickly as we could
> call iret for that interrupt.
> 
> We did not disable that interrupt because irq debugging was explicitly
> disabled on the kernel command line.
> 
> The irq was asserted because the device did not have a shutdown
> method to stop the device doing things.
> 
> There is an argument that disabling bus mastering should have disabled
> whatever the interrupting condition was (and thus there may also be a
> bug in the qemu device emulation).
> 
> So I read this as the driver and maybe the "hardware" is buggy not that
> the linux pci layer is buggy.

OK, it sounds like we don't have consensus on this issue yet.

The rest of the series, which disables MSIs at boot time even without
CONFIG_PCI_MSI=y in the new kernel and does some cleanup, seems worthwhile
and non-controversial, so I reordered it and applied the following patches
to pci/msi for v4.2:

  PCI/MSI: Rename msi_set_enable(), msix_clear_and_set_ctrl()
  PCI/MSI: Export pci_msi_set_enable(), pci_msix_clear_and_set_ctrl()
  PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI
  virtio_pci: drop pci_msi_off() call during probe
  ntb: Drop pci_msi_off() call during probe
  PCI/MSI: Drop pci_msi_off() calls from quirks
  PCI/MSI: Remove unused pci_msi_off()
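
For readers unfamiliar with what "disable MSIs at boot time" touches, here
is a minimal userspace sketch of the read-modify-write on the 16-bit
Message Control word.  The two enable-bit values are the ones defined in
the PCI register headers; the model_*() helpers are hypothetical stand-ins
that operate on a plain integer instead of config space, so this is an
illustration rather than the kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_MSI_FLAGS_ENABLE	0x0001	/* MSI Enable bit in Message Control */
#define PCI_MSIX_FLAGS_ENABLE	0x8000	/* MSI-X Enable bit in Message Control */

/* Clear the bits in 'clear', then set the bits in 'set'. */
static uint16_t model_clear_and_set(uint16_t ctrl, uint16_t clear,
				    uint16_t set)
{
	ctrl &= (uint16_t)~clear;
	ctrl |= set;
	return ctrl;
}

/* Disable MSI: clear only the MSI Enable bit, keep everything else. */
static uint16_t model_msi_disable(uint16_t ctrl)
{
	return model_clear_and_set(ctrl, PCI_MSI_FLAGS_ENABLE, 0);
}

/* Disable MSI-X: clear only the MSI-X Enable bit, keep everything else. */
static uint16_t model_msix_disable(uint16_t ctrl)
{
	return model_clear_and_set(ctrl, PCI_MSIX_FLAGS_ENABLE, 0);
}
```

If the enable bit is already clear, as it should be after power-on reset,
both helpers return the control word unchanged, which is why the boot-time
disable is normally a no-op.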

I removed the "stable" annotations because I don't have a clear report of
a bug that this fixes.  If it does fix a bug, please point me to a
bugzilla, and I can add the stable annotations back.

That leaves these two:

  PCI/MSI: Don't disable MSI/MSI-X at shutdown
  PCI/MSI: Make pci_msi_shutdown(), pci_msix_shutdown() static

I'm going to ignore them until you guys figure out what to do.

Bjorn

* Re: [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown
  2015-05-07 15:04       ` Bjorn Helgaas
@ 2015-05-10 11:05         ` Michael S. Tsirkin
  0 siblings, 0 replies; 30+ messages in thread
From: Michael S. Tsirkin @ 2015-05-10 11:05 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Eric W. Biederman, Michael S. Tsirkin, Yijing Wang, linux-pci,
	Fam Zheng, Yinghai Lu

On Thu, May 07, 2015 at 10:04:12AM -0500, Bjorn Helgaas wrote:
> On Wed, May 06, 2015 at 07:53:48PM -0500, Eric W. Biederman wrote:
> > Bjorn Helgaas <bhelgaas@google.com> writes:
> > 
> > > On Sun, Apr 26, 2015 at 08:50:06AM +0200, Michael S. Tsirkin wrote:
> > >> On Fri, Apr 10, 2015 at 05:54:19PM -0500, Bjorn Helgaas wrote:
> > >> > Hi Michael,
> > >> > 
> > >> > I put your patches on my pci/msi branch and I hope to merge them for v4.1.
> > >> > I didn't apply the acks from Fam and Eric because I made changes to those
> > >> > patches that weren't completely trivial.  I think the end result is
> > >> > equivalent, though.  The diff attached to this cover letter is the
> > >> > difference between your v5 series and this v6 series.
> > >> > 
> > >> > As far as I'm concerned, this is ready to go except that I would like a
> > >> > little more info about the virtio kernel hang to include in the changelog
> > >> > for "PCI/MSI: Don't disable MSI/MSI-X at shutdown".
> > >> 
> > >> 
> > >> Hi Bjorn,
> > >> do you have everything you need to merge this?
> > >
> > > No.  I made the minor changelog edits you suggested and the result is on
> > > my pci/msi-v7 branch.  But I still have these open issues:
> > >
> > >   - The last thing I heard from Eric was that "not disabling MSI/MSI-X at
> > >     shutdown is the wrong fix, and someone needs to fix a buggy driver."
> > >     I want to hear Eric say "OK, we need to leave MSI/MSI-X enabled at
> > >     shutdown for this case."
> > 
> > So far this just sounds like a device that needs a shutdown method.
> > 
> > >   - One changelog says "Stop disabling MSIs at shutdown to avoid
> > >     conflicting with drivers."  But I don't know what the conflict is.
> > >
> > >   - The bugzilla has no dmesg log or detailed analysis.  Fam said the
> > >     scenario I came up with
> > >     (http://lkml.kernel.org/r/20150416194245.GB20701@google.com)
> > >     was fairly close, but it took me a lot of work to derive that.  Fixing
> > >     any errors in it and putting it in the bugzilla would be a big step.
> > >     The bugzilla should have the raw data and the analysis, so someone else
> > >     can validate the analysis and conclude that this patch is a reasonable
> > >     fix for it.  That's currently impossible because the bugzilla really
> > >     only contains the fix as a fait accompli.
> > 
> > What I saw in the bugzilla was:
> > 
> > An interrupt was stuck on, and being reasserted as quickly as we could
> > call iret for that interrupt.
> > 
> > We did not disable that interrupt because irq debugging was explicitly
> > disabled on the kernel command line.
> > 
> > The irq was asserted because the device did not have a shutdown
> > method to stop the device doing things.
> > 
> > There is an argument that disabling bus mastering should have disabled
> > whatever the interrupting condition was (and thus there may also be a
> > bug in the qemu device emulation).
> > 
> > So I read this as the driver and maybe the "hardware" is buggy not that
> > the linux pci layer is buggy.
> 
> OK, it sounds like we don't have consensus on this issue yet.
> 
> The rest of the series, which disables MSIs at boot time even without
> CONFIG_PCI_MSI=y in the new kernel and does some cleanup, seems worthwhile
> and non-controversial, so I reordered it and applied the following patches
> to pci/msi for v4.2:
> 
>   PCI/MSI: Rename msi_set_enable(), msix_clear_and_set_ctrl()
>   PCI/MSI: Export pci_msi_set_enable(), pci_msix_clear_and_set_ctrl()
>   PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI
>   virtio_pci: drop pci_msi_off() call during probe
>   ntb: Drop pci_msi_off() call during probe
>   PCI/MSI: Drop pci_msi_off() calls from quirks
>   PCI/MSI: Remove unused pci_msi_off()
> 
> I removed the "stable" annotations because I don't have a clear report of
> a bug that this fixes.  If it does fix a bug, please point me to a
> bugzilla, and I can add the stable annotations back.
> 
> That leaves these two:
> 
>   PCI/MSI: Don't disable MSI/MSI-X at shutdown
>   PCI/MSI: Make pci_msi_shutdown(), pci_msix_shutdown() static
> 
> I'm going to ignore them until you guys figure out what to do.
> 
> Bjorn

Thanks!
I will repost the omitted patches adding extra info in the commit
log, so we can restart the discussion.

-- 
MST

* Re: [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown
  2015-05-06 21:03   ` Bjorn Helgaas
  2015-05-07  0:53     ` Eric W. Biederman
@ 2015-05-10 11:09     ` Michael S. Tsirkin
  2015-05-10 11:42       ` Bjorn Helgaas
  1 sibling, 1 reply; 30+ messages in thread
From: Michael S. Tsirkin @ 2015-05-10 11:09 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Michael S. Tsirkin, Yijing Wang, linux-pci, Fam Zheng,
	Yinghai Lu, Eric W. Biederman

On Wed, May 06, 2015 at 04:03:27PM -0500, Bjorn Helgaas wrote:
> On Sun, Apr 26, 2015 at 08:50:06AM +0200, Michael S. Tsirkin wrote:
> > On Fri, Apr 10, 2015 at 05:54:19PM -0500, Bjorn Helgaas wrote:
> > > Hi Michael,
> > > 
> > > I put your patches on my pci/msi branch and I hope to merge them for v4.1.
> > > I didn't apply the acks from Fam and Eric because I made changes to those
> > > patches that weren't completely trivial.  I think the end result is
> > > equivalent, though.  The diff attached to this cover letter is the
> > > difference between your v5 series and this v6 series.
> > > 
> > > As far as I'm concerned, this is ready to go except that I would like a
> > > little more info about the virtio kernel hang to include in the changelog
> > > for "PCI/MSI: Don't disable MSI/MSI-X at shutdown".
> > 
> > 
> > Hi Bjorn,
> > do you have everything you need to merge this?
> 
> No.  I made the minor changelog edits you suggested and the result is on
> my pci/msi-v7 branch.  But I still have these open issues:
> 
>   - The last thing I heard from Eric was that "not disabling MSI/MSI-X at
>     shutdown is the wrong fix, and someone needs to fix a buggy driver."
>     I want to hear Eric say "OK, we need to leave MSI/MSI-X enabled at
>     shutdown for this case."
> 
>   - One changelog says "Stop disabling MSIs at shutdown to avoid
>     conflicting with drivers."  But I don't know what the conflict is.

OK, I'll try to clarify.

>   - The bugzilla has no dmesg log or detailed analysis.  Fam said the
>     scenario I came up with
>     (http://lkml.kernel.org/r/20150416194245.GB20701@google.com)
>     was fairly close, but it took me a lot of work to derive that.  Fixing
>     any errors in it and putting it in the bugzilla would be a big step.
>     The bugzilla should have the raw data and the analysis, so someone else
>     can validate the analysis and conclude that this patch is a reasonable
>     fix for it.  That's currently impossible because the bugzilla really
>     only contains the fix as a fait accompli.
> 
> Bjorn

I think it's easier to just have all the info in the commit log, so I'll
do that for the next version.  I'm not too worried about adding these
patches to stable, so I think adding bugzilla for this isn't a must,
right?

-- 
MST

* Re: [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown
  2015-05-10 11:09     ` Michael S. Tsirkin
@ 2015-05-10 11:42       ` Bjorn Helgaas
  0 siblings, 0 replies; 30+ messages in thread
From: Bjorn Helgaas @ 2015-05-10 11:42 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Michael S. Tsirkin, Yijing Wang, linux-pci, Fam Zheng,
	Yinghai Lu, Eric W. Biederman

On Sun, May 10, 2015 at 6:09 AM, Michael S. Tsirkin <mst@redhat.com> wrote:

> I think it's easier to just have all the info in the commit log, so I'll
> do that for the next version.  I'm not too worried about adding these
> patches to stable, so I think adding bugzilla for this isn't a must,
> right?

Right.  If it all fits in the changelog, so much the better.  I just
use bugzilla as a place to archive things too big to reasonably put in
the changelog.

Bjorn

end of thread (newest message: 2015-05-10 11:42 UTC)

Thread overview: 30+ messages
2015-04-10 22:54 [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown Bjorn Helgaas
2015-04-10 22:54 ` [PATCH v6 01/10] PCI/MSI: Rename msi_set_enable(), msix_clear_and_set_ctrl() Bjorn Helgaas
2015-04-11  7:30   ` Greg KH
2015-04-11 16:01     ` Bjorn Helgaas
2015-04-10 22:54 ` [PATCH v6 02/10] PCI/MSI: Export pci_msi_set_enable(), pci_msix_clear_and_set_ctrl() Bjorn Helgaas
2015-04-10 22:54 ` [PATCH v6 03/10] PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI Bjorn Helgaas
2015-04-10 22:54 ` [PATCH v6 04/10] PCI/MSI: Don't disable MSI/MSI-X at shutdown Bjorn Helgaas
2015-04-13  9:37   ` Fam Zheng
2015-04-13 15:41     ` Bjorn Helgaas
2015-04-13 16:45       ` Eric W. Biederman
2015-04-14  9:44         ` Michael S. Tsirkin
2015-04-16 19:42         ` Bjorn Helgaas
2015-04-17  1:05           ` Fam Zheng
2015-04-14  9:47       ` Michael S. Tsirkin
2015-04-14 10:45         ` Fam Zheng
2015-04-14 10:49           ` Michael S. Tsirkin
2015-04-16  7:30   ` Michael S. Tsirkin
2015-04-10 22:54 ` [PATCH v6 05/10] PCI/MSI: Make pci_msi_shutdown(), pci_msix_shutdown() static Bjorn Helgaas
2015-04-10 22:55 ` [PATCH v6 06/10] virtio_pci: drop pci_msi_off() call during probe Bjorn Helgaas
2015-04-10 22:55 ` [PATCH v6 07/10] ntb: Drop " Bjorn Helgaas
2015-04-10 22:55 ` [PATCH v6 08/10] mic: " Bjorn Helgaas
2015-04-10 22:55 ` [PATCH v6 09/10] PCI/MSI: Drop pci_msi_off() calls from quirks Bjorn Helgaas
2015-04-10 22:55 ` [PATCH v6 10/10] PCI/MSI: Remove unused pci_msi_off() Bjorn Helgaas
2015-04-26  6:50 ` [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown Michael S. Tsirkin
2015-05-06 21:03   ` Bjorn Helgaas
2015-05-07  0:53     ` Eric W. Biederman
2015-05-07 15:04       ` Bjorn Helgaas
2015-05-10 11:05         ` Michael S. Tsirkin
2015-05-10 11:09     ` Michael S. Tsirkin
2015-05-10 11:42       ` Bjorn Helgaas
