linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2] Improvement on ioapic hotplug
       [not found] <201702280904.A2EXAFMQ%fengguang.wu@intel.com>
@ 2017-02-28 13:34 ` Rui Wang
  2017-02-28 13:34   ` [PATCH v2 1/2] x86/PCI: Implement pcibios_release_device to release IRQ from IOAPIC Rui Wang
  2017-02-28 13:34   ` [PATCH v2 2/2] x86/ioapic: Split IOAPIC hot-removal into two steps Rui Wang
  0 siblings, 2 replies; 5+ messages in thread
From: Rui Wang @ 2017-02-28 13:34 UTC (permalink / raw)
  To: helgaas, tglx, rjw
  Cc: tony.luck, bhelgaas, linux-acpi, linux-pci, linux-kernel,
	rui.y.wang, x86, fengguang.wu, kbuild-all

The revert of 991de2e59090 ("PCI, x86: Implement pcibios_alloc_irq()
and pcibios_free_irq()") causes a problem for IOAPIC hotplug. If
a device under the IOAPIC doesn't call pci_disable_device(), then
the hot-removal of the IOAPIC causes kernel stack dump.

This patchset can fix the problem. IOAPIC hot-removal works correctly
after applying this patchset.

v2: 0001 Fixed a missing stub function causing compiling error on i386 
    0002 Fixed a typo on the subject line.

Rui Wang (2):
  x86/PCI: Implement pcibios_release_device to release IRQ from IOAPIC
  x86/ioapic: Split IOAPIC hot-removal into two steps

 arch/x86/pci/common.c   |  9 +++++++++
 drivers/acpi/internal.h |  2 ++
 drivers/acpi/ioapic.c   | 22 ++++++++++++++++------
 drivers/acpi/pci_root.c |  4 ++--
 4 files changed, 29 insertions(+), 8 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v2 1/2] x86/PCI: Implement pcibios_release_device to release IRQ from IOAPIC
  2017-02-28 13:34 ` [PATCH v2 0/2] Improvement on ioapic hotplug Rui Wang
@ 2017-02-28 13:34   ` Rui Wang
  2017-03-16 17:24     ` Bjorn Helgaas
  2017-02-28 13:34   ` [PATCH v2 2/2] x86/ioapic: Split IOAPIC hot-removal into two steps Rui Wang
  1 sibling, 1 reply; 5+ messages in thread
From: Rui Wang @ 2017-02-28 13:34 UTC (permalink / raw)
  To: helgaas, tglx, rjw
  Cc: tony.luck, bhelgaas, linux-acpi, linux-pci, linux-kernel,
	rui.y.wang, x86, fengguang.wu, kbuild-all

The revert of 991de2e59090 ("PCI, x86: Implement pcibios_alloc_irq()
and pcibios_free_irq()") causes a problem for IOAPIC hotplug. The
problem is that IRQs are allocated and freed in pci_enable_device()
and pci_disable_device(). But there are some drivers which don't call
pci_disable_device(), and they have good reasons not calling it, so
if they're using IOAPIC their IRQs won't have a chance to be released
from the IOAPIC. When this happens IOAPIC hot-removal fails with a
kernel stack dump and an error message like this:

[149335.697989] pin16 on IOAPIC2 is still in use.

It turns out that we can fix it in a different way without moving IRQ
allocation into pcibios_alloc_irq(), thus avoiding the regression of
991de2e59090. We can keep the allocation and freeing of IRQs as is
within pci_enable_device()/pci_disable_device(), without breaking any
previous assumption of the rest of the system, keeping compatibility
with both the legacy and the modern drivers. We can accomplish this by
implementing the existing __weak hook of pcibios_release_device() thus
when a pci device is about to be deleted we get notified in the hook
and take the chance to release its IRQ, if any, from the IOAPIC.

Besides implementing pcibios_release_device(), the hot-removal of
IOAPIC needs to be broken into two parts: the PCI part and the ACPI
part. The PCI part releases PCI resources before the PCI bus is gone,
and the ACPI part is moved to a stage later than the hot-removal of
the PCI root bus, so we have the chance to hook every PCI device's
pcibios_release_device(), before we remove the IOAPIC.

This patch implements pcibios_release_device() on x86 to release any
IRQ not released by the driver, so that the IOAPIC can then be safely
hot-removed.

v2: Fixed a typo (pcibios_release_device)

Signed-off-by: Rui Wang <rui.y.wang@intel.com>
---
 arch/x86/pci/common.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 0cb52ae..190e718 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -735,6 +735,15 @@ void pcibios_disable_device (struct pci_dev *dev)
 		pcibios_disable_irq(dev);
 }
 
+#ifdef CONFIG_ACPI_HOTPLUG_IOAPIC
+void pcibios_release_device(struct pci_dev *dev)
+{
+	if (atomic_dec_return(&dev->enable_cnt) >= 0)
+		pcibios_disable_device(dev);
+
+}
+#endif
+
 int pci_ext_cfg_avail(void)
 {
 	if (raw_pci_ext_ops)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 2/2] x86/ioapic: Split IOAPIC hot-removal into two steps
  2017-02-28 13:34 ` [PATCH v2 0/2] Improvement on ioapic hotplug Rui Wang
  2017-02-28 13:34   ` [PATCH v2 1/2] x86/PCI: Implement pcibios_release_device to release IRQ from IOAPIC Rui Wang
@ 2017-02-28 13:34   ` Rui Wang
  2017-03-16 17:48     ` Bjorn Helgaas
  1 sibling, 1 reply; 5+ messages in thread
From: Rui Wang @ 2017-02-28 13:34 UTC (permalink / raw)
  To: helgaas, tglx, rjw
  Cc: tony.luck, bhelgaas, linux-acpi, linux-pci, linux-kernel,
	rui.y.wang, x86, fengguang.wu, kbuild-all

The hot-removal of IOAPIC is broken into two parts: the PCI part and
the ACPI part. The PCI part releases PCI resources before the PCI bus
is gone, and the ACPI part is moved to a stage later than the hot-removal
of the PCI root bus, so that the IOAPIC driver is able to hook the
pcibios_release_device() of every PCI device under the same parent root
bus, before the IOAPIC is hot-removed. This makes it possible for the
IOAPIC to free any IRQ resource previously unable to get freed.

v2: Fixed compiling error on i386 (missing stub of pci_ioapic_remove())

Signed-off-by: Rui Wang <rui.y.wang@intel.com>
---
 drivers/acpi/internal.h |  2 ++
 drivers/acpi/ioapic.c   | 22 ++++++++++++++++------
 drivers/acpi/pci_root.c |  4 ++--
 3 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
index 219b90b..f159001 100644
--- a/drivers/acpi/internal.h
+++ b/drivers/acpi/internal.h
@@ -41,8 +41,10 @@ static inline void acpi_amba_init(void) {}
 void acpi_container_init(void);
 void acpi_memory_hotplug_init(void);
 #ifdef	CONFIG_ACPI_HOTPLUG_IOAPIC
+void pci_ioapic_remove(struct acpi_pci_root *root);
 int acpi_ioapic_remove(struct acpi_pci_root *root);
 #else
+static inline void pci_ioapic_remove(struct acpi_pci_root *root) { return; }
 static inline int acpi_ioapic_remove(struct acpi_pci_root *root) { return 0; }
 #endif
 #ifdef CONFIG_ACPI_DOCK
diff --git a/drivers/acpi/ioapic.c b/drivers/acpi/ioapic.c
index 6d7ce6e..1120dfd6 100644
--- a/drivers/acpi/ioapic.c
+++ b/drivers/acpi/ioapic.c
@@ -206,24 +206,34 @@ int acpi_ioapic_add(acpi_handle root_handle)
 	return ACPI_SUCCESS(status) && ACPI_SUCCESS(retval) ? 0 : -ENODEV;
 }
 
-int acpi_ioapic_remove(struct acpi_pci_root *root)
+void pci_ioapic_remove(struct acpi_pci_root *root)
 {
-	int retval = 0;
 	struct acpi_pci_ioapic *ioapic, *tmp;
 
 	mutex_lock(&ioapic_list_lock);
 	list_for_each_entry_safe(ioapic, tmp, &ioapic_list, list) {
 		if (root->device->handle != ioapic->root_handle)
 			continue;
-
-		if (acpi_unregister_ioapic(ioapic->handle, ioapic->gsi_base))
-			retval = -EBUSY;
-
 		if (ioapic->pdev) {
 			pci_release_region(ioapic->pdev, 0);
 			pci_disable_device(ioapic->pdev);
 			pci_dev_put(ioapic->pdev);
 		}
+	}
+	mutex_unlock(&ioapic_list_lock);
+}
+
+int acpi_ioapic_remove(struct acpi_pci_root *root)
+{
+	int retval = 0;
+	struct acpi_pci_ioapic *ioapic, *tmp;
+
+	mutex_lock(&ioapic_list_lock);
+	list_for_each_entry_safe(ioapic, tmp, &ioapic_list, list) {
+		if (root->device->handle != ioapic->root_handle)
+			continue;
+		if (acpi_unregister_ioapic(ioapic->handle, ioapic->gsi_base))
+			retval = -EBUSY;
 		if (ioapic->res.flags && ioapic->res.parent)
 			release_resource(&ioapic->res);
 		list_del(&ioapic->list);
diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index bf601d4..919be0a 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -648,12 +648,12 @@ static void acpi_pci_root_remove(struct acpi_device *device)
 
 	pci_stop_root_bus(root->bus);
 
-	WARN_ON(acpi_ioapic_remove(root));
-
+	pci_ioapic_remove(root);
 	device_set_run_wake(root->bus->bridge, false);
 	pci_acpi_remove_bus_pm_notifier(device);
 
 	pci_remove_root_bus(root->bus);
+	WARN_ON(acpi_ioapic_remove(root));
 
 	dmar_device_remove(device->handle);
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 1/2] x86/PCI: Implement pcibios_release_device to release IRQ from IOAPIC
  2017-02-28 13:34   ` [PATCH v2 1/2] x86/PCI: Implement pcibios_release_device to release IRQ from IOAPIC Rui Wang
@ 2017-03-16 17:24     ` Bjorn Helgaas
  0 siblings, 0 replies; 5+ messages in thread
From: Bjorn Helgaas @ 2017-03-16 17:24 UTC (permalink / raw)
  To: Rui Wang
  Cc: tglx, rjw, tony.luck, bhelgaas, linux-acpi, linux-pci,
	linux-kernel, x86, fengguang.wu, kbuild-all

Hi Rui,

On Tue, Feb 28, 2017 at 09:34:28PM +0800, Rui Wang wrote:
> The revert of 991de2e59090 ("PCI, x86: Implement pcibios_alloc_irq()
> and pcibios_free_irq()") causes a problem for IOAPIC hotplug. The
> problem is that IRQs are allocated and freed in pci_enable_device()
> and pci_disable_device(). But there are some drivers which don't call
> pci_disable_device(), and they have good reasons not calling it, so

Can you elaborate a little on what these drivers are and what the
reasons for not calling pci_disable_device() are?  Without a pointer
to details, this is just a teaser :)

Maybe this has to do with the "there are too many odd BIOS and bridge
setups" comment in pci_device_remove()?  If so, I think it's a bit of
a stretch to say they have good reasons to not call it, because nobody
knows what those reasons are.

> if they're using IOAPIC their IRQs won't have a chance to be released
> from the IOAPIC. When this happens IOAPIC hot-removal fails with a
> kernel stack dump and an error message like this:
> 
> [149335.697989] pin16 on IOAPIC2 is still in use.

Are there any bug reports for this that we should reference?  Pointing
to a bugzilla with more details would help avoid situations like the
vague "odd BIOS and bridge setups" comment above.

> It turns out that we can fix it in a different way without moving IRQ
> allocation into pcibios_alloc_irq(), thus avoiding the regression of
> 991de2e59090. We can keep the allocation and freeing of IRQs as is
> within pci_enable_device()/pci_disable_device(), without breaking any
> previous assumption of the rest of the system, keeping compatibility
> with both the legacy and the modern drivers. We can accomplish this by
> implementing the existing __weak hook of pcibios_release_device() thus
> when a pci device is about to be deleted we get notified in the hook
> and take the chance to release its IRQ, if any, from the IOAPIC.

I'm not a huge fan of using the release method because it makes the
code flow more obscure -- the release method is called semi-magically
when a reference count goes to zero, and it's primarily a mechanism
for dropping references and deallocating storage.

That's not to say we *can't* do it in pcibios_release_device(); it's
just that I'd prefer a more direct approach if we can find one.

If I understand correctly, after 991de2e59090, we did the IRQ setup
and teardown in the driver attach (pci_device_probe()) and remove
(pci_device_remove()) path.  The problem with that approach was that
we set up the IRQ for the device but not for any upstream bridges, so
some drivers didn't get the interrupts they expected.

After we reverted 991de2e59090 (with 6c777e8799a9), we now do IRQ
setup and teardown in the pci_enable_device() and pci_disable_device()
paths.  The pci_enable_device() path enables all upstream bridges and
sets up their IRQs.  The problem with this approach is that some
drivers don't call pci_disable_device() to teardown the IRQ, and a
subsequent hotplug removal of the IOAPIC fails because an interrupt is
still in use.

I think it makes sense to set up the IRQ in pci_enable_device() rather
than in pci_device_probe().  Doing it at probe-time is a little
aggressive because some drivers may not need the IRQ at all.

Could we do a hybrid, i.e., do the setup in pci_enable_device() and
the teardown in pci_device_remove()?  I think it makes sense that if a
driver has set up the IRQ, we should make sure it's torn down when we
unbind the driver from the device (or possibly early, if the driver
calls pci_disable_device()).

> Besides implementing pcibios_release_device(), the hot-removal of
> IOAPIC needs to be broken into two parts: the PCI part and the ACPI
> part. The PCI part releases PCI resources before the PCI bus is gone,
> and the ACPI part is moved to a stage later than the hot-removal of
> the PCI root bus, so we have the chance to hook every PCI device's
> pcibios_release_device(), before we remove the IOAPIC.
> 
> This patch implements pcibios_release_device() on x86 to release any
> IRQ not released by the driver, so that the IOAPIC can then be safely
> hot-removed.
> 
> v2: Fixed a typo (pcibios_release_device)
> 
> Signed-off-by: Rui Wang <rui.y.wang@intel.com>
> ---
>  arch/x86/pci/common.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
> index 0cb52ae..190e718 100644
> --- a/arch/x86/pci/common.c
> +++ b/arch/x86/pci/common.c
> @@ -735,6 +735,15 @@ void pcibios_disable_device (struct pci_dev *dev)
>  		pcibios_disable_irq(dev);
>  }
>  
> +#ifdef CONFIG_ACPI_HOTPLUG_IOAPIC
> +void pcibios_release_device(struct pci_dev *dev)
> +{
> +	if (atomic_dec_return(&dev->enable_cnt) >= 0)
> +		pcibios_disable_device(dev);
> +
> +}
> +#endif
> +
>  int pci_ext_cfg_avail(void)
>  {
>  	if (raw_pci_ext_ops)
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 2/2] x86/ioapic: Split IOAPIC hot-removal into two steps
  2017-02-28 13:34   ` [PATCH v2 2/2] x86/ioapic: Split IOAPIC hot-removal into two steps Rui Wang
@ 2017-03-16 17:48     ` Bjorn Helgaas
  0 siblings, 0 replies; 5+ messages in thread
From: Bjorn Helgaas @ 2017-03-16 17:48 UTC (permalink / raw)
  To: Rui Wang
  Cc: tglx, rjw, tony.luck, bhelgaas, linux-acpi, linux-pci,
	linux-kernel, x86, fengguang.wu, kbuild-all

On Tue, Feb 28, 2017 at 09:34:29PM +0800, Rui Wang wrote:
> The hot-removal of IOAPIC is broken into two parts: the PCI part and
> the ACPI part. The PCI part releases PCI resources before the PCI bus
> is gone, and the ACPI part is moved to a stage later than the hot-removal
> of the PCI root bus, so that the IOAPIC driver is able to hook the
> pcibios_release_device() of every PCI device under the same parent root
> bus, before the IOAPIC is hot-removed. This makes it possible for the
> IOAPIC to free any IRQ resource previously unable to get freed.
> 
> v2: Fixed compiling error on i386 (missing stub of pci_ioapic_remove())
> 
> Signed-off-by: Rui Wang <rui.y.wang@intel.com>

Minor comments below, but

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

> ---
>  drivers/acpi/internal.h |  2 ++
>  drivers/acpi/ioapic.c   | 22 ++++++++++++++++------
>  drivers/acpi/pci_root.c |  4 ++--
>  3 files changed, 20 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
> index 219b90b..f159001 100644
> --- a/drivers/acpi/internal.h
> +++ b/drivers/acpi/internal.h
> @@ -41,8 +41,10 @@ static inline void acpi_amba_init(void) {}
>  void acpi_container_init(void);
>  void acpi_memory_hotplug_init(void);
>  #ifdef	CONFIG_ACPI_HOTPLUG_IOAPIC
> +void pci_ioapic_remove(struct acpi_pci_root *root);
>  int acpi_ioapic_remove(struct acpi_pci_root *root);
>  #else
> +static inline void pci_ioapic_remove(struct acpi_pci_root *root) { return; }

Superfluous "return;" here.

>  static inline int acpi_ioapic_remove(struct acpi_pci_root *root) { return 0; }
>  #endif
>  #ifdef CONFIG_ACPI_DOCK
> diff --git a/drivers/acpi/ioapic.c b/drivers/acpi/ioapic.c
> index 6d7ce6e..1120dfd6 100644
> --- a/drivers/acpi/ioapic.c
> +++ b/drivers/acpi/ioapic.c
> @@ -206,24 +206,34 @@ int acpi_ioapic_add(acpi_handle root_handle)
>  	return ACPI_SUCCESS(status) && ACPI_SUCCESS(retval) ? 0 : -ENODEV;
>  }
>  
> -int acpi_ioapic_remove(struct acpi_pci_root *root)
> +void pci_ioapic_remove(struct acpi_pci_root *root)
>  {
> -	int retval = 0;
>  	struct acpi_pci_ioapic *ioapic, *tmp;
>  
>  	mutex_lock(&ioapic_list_lock);
>  	list_for_each_entry_safe(ioapic, tmp, &ioapic_list, list) {

I don't think it's necessary to use the "safe" iterator here, since
you're not modifying ioapic_list in the loop any more.

>  		if (root->device->handle != ioapic->root_handle)
>  			continue;
> -
> -		if (acpi_unregister_ioapic(ioapic->handle, ioapic->gsi_base))
> -			retval = -EBUSY;
> -
>  		if (ioapic->pdev) {
>  			pci_release_region(ioapic->pdev, 0);
>  			pci_disable_device(ioapic->pdev);
>  			pci_dev_put(ioapic->pdev);
>  		}
> +	}
> +	mutex_unlock(&ioapic_list_lock);
> +}
> +
> +int acpi_ioapic_remove(struct acpi_pci_root *root)
> +{
> +	int retval = 0;
> +	struct acpi_pci_ioapic *ioapic, *tmp;
> +
> +	mutex_lock(&ioapic_list_lock);
> +	list_for_each_entry_safe(ioapic, tmp, &ioapic_list, list) {
> +		if (root->device->handle != ioapic->root_handle)
> +			continue;
> +		if (acpi_unregister_ioapic(ioapic->handle, ioapic->gsi_base))
> +			retval = -EBUSY;
>  		if (ioapic->res.flags && ioapic->res.parent)
>  			release_resource(&ioapic->res);
>  		list_del(&ioapic->list);
> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> index bf601d4..919be0a 100644
> --- a/drivers/acpi/pci_root.c
> +++ b/drivers/acpi/pci_root.c
> @@ -648,12 +648,12 @@ static void acpi_pci_root_remove(struct acpi_device *device)
>  
>  	pci_stop_root_bus(root->bus);
>  
> -	WARN_ON(acpi_ioapic_remove(root));
> -
> +	pci_ioapic_remove(root);
>  	device_set_run_wake(root->bus->bridge, false);
>  	pci_acpi_remove_bus_pm_notifier(device);
>  
>  	pci_remove_root_bus(root->bus);
> +	WARN_ON(acpi_ioapic_remove(root));

You didn't change this, but my personal preference is not to call
functions with side-effects inside a WARN_ON() or BUG_ON().  The
acpi_ioapic_remove() call is essential here, and I think it gets
obscured a bit by being inside WARN_ON().

>  	dmar_device_remove(device->handle);
>  
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-03-16 17:48 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <201702280904.A2EXAFMQ%fengguang.wu@intel.com>
2017-02-28 13:34 ` [PATCH v2 0/2] Improvement on ioapic hotplug Rui Wang
2017-02-28 13:34   ` [PATCH v2 1/2] x86/PCI: Implement pcibios_release_device to release IRQ from IOAPIC Rui Wang
2017-03-16 17:24     ` Bjorn Helgaas
2017-02-28 13:34   ` [PATCH v2 2/2] x86/ioapic: Split IOAPIC hot-removal into two steps Rui Wang
2017-03-16 17:48     ` Bjorn Helgaas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).