All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug
@ 2012-04-16 16:28 Jiang Liu
  2012-04-16 16:28 ` [PATCH RFC 01/17] PCI: introduce pci_bus_get()/pci_bus_put() to hide PCI implementation details Jiang Liu
                   ` (17 more replies)
  0 siblings, 18 replies; 21+ messages in thread
From: Jiang Liu @ 2012-04-16 16:28 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Greg Kroah-Hartman, Dely Sy, Scott Murray
  Cc: Jiang Liu, Jiang Liu, Keping Chen, linux-pci

There are multiple ways to trigger PCI hotplug requests concurrently,
such as:
1. Sysfs interfaces exported by the PCI core subsystem
2. Sysfs interfaces exported by the PCI hotplug subsystem
3. PCI hotplug events triggered by PCI Hotplug Controllers
4. ACPI hotplug events for PCI host bridges
5. Driver binding/unbinding events

The PCI core subsystem doesn't support concurrent hotplug operations yet,
so all PCI hotplug requests should be globally serialized. This patchset
introduces a global recursive rwsem to serialize all PCI hotplug operations.

Following PCI hotplug drivers/interfaces have been enhanced with this
1. Sysfs interfaces exported by the PCI core subsystem
2. Sysfs interfaces exported by the PCI hotplug subsystem
3. pciehp
4. shpchp
5. cpcihp_generic and cpcihp_zt5550
6. fakephp

But there are still several TODOs:
1) all other PCI hotplug driver in drivers/pci/hotplug directory
2) SR-IOV
3) acpiphp (plan to do this based on Yinghai's PCI root bus hotplug gate)
4) pci_root (plan to do this based on Yinghai's PCI root bus hotplug gate)

Basic test has been done as below, will find more hardwares to do more tests.
Start three scripts on an Intel Atom system to currently execute:
1) remove/rescan PCI devices by sysfs interfaces exported by PCI core subsystem
2) remove/rescan PCI devices by sysfs interfaces exported by fakephp driver
3) load/unload fakephp driver
The test has run about four hours without failure.

Jiang Liu (17):
  PCI: introduce pci_bus_get()/pci_bus_put() to hide PCI implementation
    details
  PCI: introduce recursive rwsem to serialize PCI hotplug operations
  PCI: replace pci_remove_rescan_mutex with the PCI hotplug lock
  PCI: serialize hotplug operations triggered by PCI hotplug sysfs
    interfaces
  PCI: correctly flush workqueue when destroy pcie hotplug controller
  PCI: prepare for serializing hotplug operations triggered by pciehp
    driver
  PCI: serialize hotplug operaitons triggered by the pciehp driver
  PCI: fix two race windows when probing/removing SHPC controller
  PCI: correctly flush workqueues and timer when destroy SHPC
    controller
  PCI: serialize hotplug operaitons triggered by the shpchp driver
  PCI: release IO resource in error handling path in
    cpcihp_generic_init()
  PCI: clean up all resources in error handling path in
    zt5550_hc_init_one()
  PCI: trivial code clean up in cpci_hotplug_core.c
  PCI: fix race windows when shutting down cpcihp controller
  PCI: hold a reference count to the PCI bus used by cpcihp drivers
  PCI: serialize PCI hotplug operations triggered by cpcihp drivers
  PCI: serialize PCI hotplug operations triggered by fakephp drivers

 drivers/pci/bus.c                       |   15 +++++
 drivers/pci/hotplug.c                   |   55 +++++++++++++++++
 drivers/pci/hotplug/cpci_hotplug_core.c |   53 +++++++++-------
 drivers/pci/hotplug/cpcihp_generic.c    |   30 +++++++---
 drivers/pci/hotplug/cpcihp_zt5550.c     |   21 +++++--
 drivers/pci/hotplug/fakephp.c           |   38 ++++++++++--
 drivers/pci/hotplug/pci_hotplug_core.c  |   26 ++++++--
 drivers/pci/hotplug/pciehp.h            |    5 +-
 drivers/pci/hotplug/pciehp_core.c       |   25 ++++++--
 drivers/pci/hotplug/pciehp_ctrl.c       |   56 ++++++++++++++++-
 drivers/pci/hotplug/pciehp_hpc.c        |   18 +++++-
 drivers/pci/hotplug/shpchp.h            |    3 +
 drivers/pci/hotplug/shpchp_core.c       |   11 ++--
 drivers/pci/hotplug/shpchp_ctrl.c       |   32 ++++++++++
 drivers/pci/hotplug/shpchp_hpc.c        |   36 +++++++-----
 drivers/pci/pci-sysfs.c                 |  100 +++++++++++++++++-------------
 drivers/pci/remove.c                    |    1 +
 include/linux/pci.h                     |   16 +++++
 18 files changed, 415 insertions(+), 126 deletions(-)

-- 
1.7.5.4


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH RFC 01/17] PCI: introduce pci_bus_get()/pci_bus_put() to hide PCI implementation details
  2012-04-16 16:28 [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Jiang Liu
@ 2012-04-16 16:28 ` Jiang Liu
  2012-04-16 16:28 ` [PATCH RFC 02/17] PCI: introduce recursive rwsem to serialize PCI hotplug operations Jiang Liu
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Jiang Liu @ 2012-04-16 16:28 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas
  Cc: Jiang Liu, Jiang Liu, Keping Chen, linux-pci

Introduce pci_bus_get()/pci_bus_put() to hide PCI implementation details.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
---
 drivers/pci/bus.c   |   15 +++++++++++++++
 include/linux/pci.h |    2 ++
 2 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 4ce5ef2..50f9c5d 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -35,6 +35,21 @@ void pci_add_resource_offset(struct list_head *resources, struct resource *res,
 }
 EXPORT_SYMBOL(pci_add_resource_offset);
 
+struct pci_bus *pci_bus_get(struct pci_bus *bus)
+{
+	if (bus)
+		get_device(&bus->dev);
+	return bus;
+}
+EXPORT_SYMBOL(pci_bus_get);
+
+void pci_bus_put(struct pci_bus *bus)
+{
+	if (bus)
+		put_device(&bus->dev);
+}
+EXPORT_SYMBOL(pci_bus_put);
+
 void pci_add_resource(struct list_head *resources, struct resource *res)
 {
 	pci_add_resource_offset(resources, res, 0);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index e444f5b..0603a60 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -912,6 +912,8 @@ int pci_request_selected_regions_exclusive(struct pci_dev *, int, const char *);
 void pci_release_selected_regions(struct pci_dev *, int);
 
 /* drivers/pci/bus.c */
+struct pci_bus *pci_bus_get(struct pci_bus *bus);
+void pci_bus_put(struct pci_bus *bus);
 void pci_add_resource(struct list_head *resources, struct resource *res);
 void pci_add_resource_offset(struct list_head *resources, struct resource *res,
 			     resource_size_t offset);
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC 02/17] PCI: introduce recursive rwsem to serialize PCI hotplug operations
  2012-04-16 16:28 [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Jiang Liu
  2012-04-16 16:28 ` [PATCH RFC 01/17] PCI: introduce pci_bus_get()/pci_bus_put() to hide PCI implementation details Jiang Liu
@ 2012-04-16 16:28 ` Jiang Liu
  2012-04-16 16:28 ` [PATCH RFC 03/17] PCI: replace pci_remove_rescan_mutex with the PCI hotplug lock Jiang Liu
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Jiang Liu @ 2012-04-16 16:28 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas
  Cc: Jiang Liu, Jiang Liu, Keping Chen, linux-pci

There are multiple ways to trigger PCI hotplug requests concurrently,
such as:
1. Sysfs interfaces exported by the PCI core subsystem
2. Sysfs interfaces exported by the PCI hotplug subsystem
3. PCI hotplug events triggered by PCI Hotplug Controllers
4. ACPI hotplug events for PCI host bridges
5. Driver binding/unbinding events

The PCI core subsystem doesn't support concurrent hotplug operations yet,
so all PCI hotplug requests should be globally serialized. This patch
introduces several new interfaces to serialize PCI hotplug operations.

pci_hotplug_try_enter(): try to acquire write lock
pci_hotplug_enter(): acquire write lock
pci_hotplug_exit(): release write lock
pci_hotplug_disable(): acquire read lock
pci_hotplug_enable(): release read lock

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
---
 drivers/pci/hotplug.c                  |   55 ++++++++++++++++++++++++++++++++
 drivers/pci/hotplug/pci_hotplug_core.c |    8 ++--
 include/linux/pci.h                    |   14 ++++++++
 3 files changed, 73 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/hotplug.c b/drivers/pci/hotplug.c
index 2b5352a..975bd3d 100644
--- a/drivers/pci/hotplug.c
+++ b/drivers/pci/hotplug.c
@@ -1,8 +1,63 @@
 #include <linux/kernel.h>
 #include <linux/pci.h>
 #include <linux/module.h>
+#include <linux/rwsem.h>
 #include "pci.h"
 
+/* Recursive mutex for PCI hotplug operations. */
+static DECLARE_RWSEM(pci_hotplug_rwsem);
+static struct task_struct *pci_hotplug_mutex_owner;
+static int pci_hotplug_mutex_recursive;
+
+/*
+ * trylock for writing -- returns 1 if successful, 0 if contention
+ */
+int pci_hotplug_try_enter(void)
+{
+	if (current != pci_hotplug_mutex_owner) {
+		if (down_write_trylock(&pci_hotplug_rwsem) == 0)
+			return 0;
+		pci_hotplug_mutex_owner = current;
+	}
+	pci_hotplug_mutex_recursive++;
+
+	return 1;
+}
+EXPORT_SYMBOL(pci_hotplug_try_enter);
+
+void pci_hotplug_enter(void)
+{
+	if (current != pci_hotplug_mutex_owner) {
+		down_write(&pci_hotplug_rwsem);
+		pci_hotplug_mutex_owner = current;
+	}
+	pci_hotplug_mutex_recursive++;
+
+}
+EXPORT_SYMBOL(pci_hotplug_enter);
+
+void pci_hotplug_exit(void)
+{
+	BUG_ON(pci_hotplug_mutex_owner != current);
+	if (--pci_hotplug_mutex_recursive == 0) {
+		pci_hotplug_mutex_owner = NULL;
+		up_write(&pci_hotplug_rwsem);
+	}
+}
+EXPORT_SYMBOL(pci_hotplug_exit);
+
+void pci_hotplug_enable(void)
+{
+	up_read(&pci_hotplug_rwsem);
+}
+EXPORT_SYMBOL(pci_hotplug_enable);
+
+void pci_hotplug_disable(void)
+{
+	down_read(&pci_hotplug_rwsem);
+}
+EXPORT_SYMBOL(pci_hotplug_disable);
+
 int pci_uevent(struct device *dev, struct kobj_uevent_env *env)
 {
 	struct pci_dev *pdev;
diff --git a/drivers/pci/hotplug/pci_hotplug_core.c b/drivers/pci/hotplug/pci_hotplug_core.c
index 202f4a9..1572665 100644
--- a/drivers/pci/hotplug/pci_hotplug_core.c
+++ b/drivers/pci/hotplug/pci_hotplug_core.c
@@ -537,7 +537,7 @@ int __must_check pci_hp_change_slot_info(struct hotplug_slot *hotplug,
 	return 0;
 }
 
-static int __init pci_hotplug_init (void)
+static int __init pci_hp_init(void)
 {
 	int result;
 
@@ -553,13 +553,13 @@ err_cpci:
 	return result;
 }
 
-static void __exit pci_hotplug_exit (void)
+static void __exit pci_hp_exit(void)
 {
 	cpci_hotplug_exit();
 }
 
-module_init(pci_hotplug_init);
-module_exit(pci_hotplug_exit);
+module_init(pci_hp_init);
+module_exit(pci_hp_exit);
 
 MODULE_AUTHOR(DRIVER_AUTHOR);
 MODULE_DESCRIPTION(DRIVER_DESC);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 0603a60..1c5f153 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -884,6 +884,20 @@ unsigned int pci_rescan_bus_bridge_resize(struct pci_dev *bridge);
 unsigned int pci_rescan_bus(struct pci_bus *bus);
 #endif
 
+#ifdef CONFIG_HOTPLUG
+extern int pci_hotplug_try_enter(void);
+extern void pci_hotplug_enter(void);
+extern void pci_hotplug_exit(void);
+extern void pci_hotplug_disable(void);
+extern void pci_hotplug_enable(void);
+#else
+static inline int pci_hotplug_try_enter(void) { return 1; }
+static inline void pci_hotplug_enter(void) {}
+static inline void pci_hotplug_exit(void) {}
+static inline void pci_hotplug_enable(void) {}
+static inline void pci_hotplug_disable(void) {}
+#endif
+
 /* Vital product data routines */
 ssize_t pci_read_vpd(struct pci_dev *dev, loff_t pos, size_t count, void *buf);
 ssize_t pci_write_vpd(struct pci_dev *dev, loff_t pos, size_t count, const void *buf);
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC 03/17] PCI: replace pci_remove_rescan_mutex with the PCI hotplug lock
  2012-04-16 16:28 [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Jiang Liu
  2012-04-16 16:28 ` [PATCH RFC 01/17] PCI: introduce pci_bus_get()/pci_bus_put() to hide PCI implementation details Jiang Liu
  2012-04-16 16:28 ` [PATCH RFC 02/17] PCI: introduce recursive rwsem to serialize PCI hotplug operations Jiang Liu
@ 2012-04-16 16:28 ` Jiang Liu
  2012-04-16 16:28 ` [PATCH RFC 04/17] PCI: serialize hotplug operations triggered by PCI hotplug sysfs interfaces Jiang Liu
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Jiang Liu @ 2012-04-16 16:28 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas
  Cc: Jiang Liu, Jiang Liu, Keping Chen, linux-pci

Replace pci_remove_rescan_mutex with the PCI hotplug lock, which is used to
globally serialize all PCI hotplug operations.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
---
 drivers/pci/pci-sysfs.c |  100 +++++++++++++++++++++++++++--------------------
 drivers/pci/remove.c    |    1 +
 2 files changed, 58 insertions(+), 43 deletions(-)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index a55e248..ecf197d 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -283,7 +283,31 @@ msi_bus_store(struct device *dev, struct device_attribute *attr,
 }
 
 #ifdef CONFIG_HOTPLUG
-static DEFINE_MUTEX(pci_remove_rescan_mutex);
+/*
+ * Schedule a device callback to avoid deadlock in case of
+ * 1) An attribute method tries to deregister the sysfs file itself
+ * 2) Another thread is trying to remove the sysfs file, which have
+ *    already held the PCI hotplug lock by invoking pci_hotplug_enter().
+ */
+static int schedule_hp_callback(struct device *dev,
+				const char *buf, size_t count,
+				void (*func)(struct device *))
+{
+	int ret = 0;
+	unsigned long val;
+
+	if (kstrtoul(buf, 0, &val) < 0)
+		return -EINVAL;
+
+	if (val) {
+		ret = device_schedule_callback(dev, func);
+		if (ret)
+			count = ret;
+	}
+
+	return count;
+}
+
 static ssize_t bus_rescan_store(struct bus_type *bus, const char *buf,
 				size_t count)
 {
@@ -294,10 +318,10 @@ static ssize_t bus_rescan_store(struct bus_type *bus, const char *buf,
 		return -EINVAL;
 
 	if (val) {
-		mutex_lock(&pci_remove_rescan_mutex);
+		pci_hotplug_enter();
 		while ((b = pci_find_next_bus(b)) != NULL)
 			pci_rescan_bus(b);
-		mutex_unlock(&pci_remove_rescan_mutex);
+		pci_hotplug_exit();
 	}
 	return count;
 }
@@ -307,72 +331,62 @@ struct bus_attribute pci_bus_attrs[] = {
 	__ATTR_NULL
 };
 
-static ssize_t
-dev_rescan_store(struct device *dev, struct device_attribute *attr,
-		 const char *buf, size_t count)
+static void dev_rescan_callback(struct device *dev)
 {
-	unsigned long val;
 	struct pci_dev *pdev = to_pci_dev(dev);
 
-	if (strict_strtoul(buf, 0, &val) < 0)
-		return -EINVAL;
-
-	if (val) {
-		mutex_lock(&pci_remove_rescan_mutex);
+	pci_hotplug_enter();
+	/* Skip if the device has been stopped. */
+	if (pdev->is_added && pdev->bus)
 		pci_rescan_bus(pdev->bus);
-		mutex_unlock(&pci_remove_rescan_mutex);
-	}
-	return count;
+	pci_hotplug_exit();
+}
+
+static ssize_t
+dev_rescan_store(struct device *dev, struct device_attribute *attr,
+		 const char *buf, size_t count)
+{
+	return schedule_hp_callback(dev, buf, count, dev_rescan_callback);
 }
 
 static void remove_callback(struct device *dev)
 {
 	struct pci_dev *pdev = to_pci_dev(dev);
 
-	mutex_lock(&pci_remove_rescan_mutex);
-	pci_stop_and_remove_bus_device(pdev);
-	mutex_unlock(&pci_remove_rescan_mutex);
+	pci_hotplug_enter();
+	/* Skip if the bus has been removed. */
+	if (pdev->bus_list.next)
+		pci_stop_and_remove_bus_device(pdev);
+	pci_hotplug_exit();
 }
 
 static ssize_t
 remove_store(struct device *dev, struct device_attribute *dummy,
 	     const char *buf, size_t count)
 {
-	int ret = 0;
-	unsigned long val;
-
-	if (strict_strtoul(buf, 0, &val) < 0)
-		return -EINVAL;
-
-	/* An attribute cannot be unregistered by one of its own methods,
-	 * so we have to use this roundabout approach.
-	 */
-	if (val)
-		ret = device_schedule_callback(dev, remove_callback);
-	if (ret)
-		count = ret;
-	return count;
+	return schedule_hp_callback(dev, buf, count, remove_callback);
 }
 
-static ssize_t
-dev_bus_rescan_store(struct device *dev, struct device_attribute *attr,
-		 const char *buf, size_t count)
+static void dev_bus_rescan_callback(struct device *dev)
 {
-	unsigned long val;
 	struct pci_bus *bus = to_pci_bus(dev);
 
-	if (strict_strtoul(buf, 0, &val) < 0)
-		return -EINVAL;
-
-	if (val) {
-		mutex_lock(&pci_remove_rescan_mutex);
+	pci_hotplug_enter();
+	/* Skip if the bus has been stopped. */
+	if (bus->is_added) {
 		if (!pci_is_root_bus(bus) && list_empty(&bus->devices))
 			pci_rescan_bus_bridge_resize(bus->self);
 		else
 			pci_rescan_bus(bus);
-		mutex_unlock(&pci_remove_rescan_mutex);
 	}
-	return count;
+	pci_hotplug_exit();
+}
+
+static ssize_t
+dev_bus_rescan_store(struct device *dev, struct device_attribute *attr,
+		 const char *buf, size_t count)
+{
+	return schedule_hp_callback(dev, buf, count, dev_bus_rescan_callback);
 }
 
 #endif
diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index fd77e2b..0b4342b 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -72,6 +72,7 @@ void pci_remove_bus(struct pci_bus *pci_bus)
 	if (!pci_bus->is_added)
 		return;
 
+	pci_bus->is_added = 0;
 	pci_remove_legacy_files(pci_bus);
 	device_unregister(&pci_bus->dev);
 }
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC 04/17] PCI: serialize hotplug operations triggered by PCI hotplug sysfs interfaces
  2012-04-16 16:28 [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (2 preceding siblings ...)
  2012-04-16 16:28 ` [PATCH RFC 03/17] PCI: replace pci_remove_rescan_mutex with the PCI hotplug lock Jiang Liu
@ 2012-04-16 16:28 ` Jiang Liu
  2012-04-16 16:28 ` [PATCH RFC 05/17] PCI: correctly flush workqueue when destroy pcie hotplug controller Jiang Liu
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Jiang Liu @ 2012-04-16 16:28 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Greg Kroah-Hartman,
	Scott Murray
  Cc: Jiang Liu, Jiang Liu, Keping Chen, linux-pci

Use PCI hotplug lock to globally serialize hotplug operations triggered by
PCI hotplug sysfs interfaces.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
---
 drivers/pci/hotplug/pci_hotplug_core.c |   18 ++++++++++++++++--
 1 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/hotplug/pci_hotplug_core.c b/drivers/pci/hotplug/pci_hotplug_core.c
index 1572665..9bbbe3e 100644
--- a/drivers/pci/hotplug/pci_hotplug_core.c
+++ b/drivers/pci/hotplug/pci_hotplug_core.c
@@ -39,6 +39,7 @@
 #include <linux/mutex.h>
 #include <linux/pci.h>
 #include <linux/pci_hotplug.h>
+#include <linux/delay.h>
 #include <asm/uaccess.h>
 #include "../pci.h"
 
@@ -121,6 +122,17 @@ static ssize_t power_write_file(struct pci_slot *pci_slot, const char *buf,
 		retval = -ENODEV;
 		goto exit;
 	}
+
+	/* Avoid deadlock with pci_hp_deregister() */
+	while (!pci_hotplug_try_enter()) {
+		/* Check whether the slot has been deregistered. */
+		if (list_empty(&slot->slot_list)) {
+			retval = -ENODEV;
+			goto exit_put;
+		}
+		msleep(1);
+	}
+
 	switch (power) {
 		case 0:
 			if (slot->ops->disable_slot)
@@ -136,8 +148,10 @@ static ssize_t power_write_file(struct pci_slot *pci_slot, const char *buf,
 			err ("Illegal value specified for power\n");
 			retval = -EINVAL;
 	}
-	module_put(slot->ops->owner);
 
+	pci_hotplug_exit();
+exit_put:
+	module_put(slot->ops->owner);
 exit:	
 	if (retval)
 		return retval;
@@ -500,7 +514,7 @@ int pci_hp_deregister(struct hotplug_slot *hotplug)
 		return -ENODEV;
 	}
 
-	list_del(&hotplug->slot_list);
+	list_del_init(&hotplug->slot_list);
 
 	slot = hotplug->pci_slot;
 	fs_remove_slot(slot);
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC 05/17] PCI: correctly flush workqueue when destroy pcie hotplug controller
  2012-04-16 16:28 [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (3 preceding siblings ...)
  2012-04-16 16:28 ` [PATCH RFC 04/17] PCI: serialize hotplug operations triggered by PCI hotplug sysfs interfaces Jiang Liu
@ 2012-04-16 16:28 ` Jiang Liu
  2012-04-16 16:29 ` [PATCH RFC 06/17] PCI: prepare for serializing hotplug operations triggered by pciehp driver Jiang Liu
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Jiang Liu @ 2012-04-16 16:28 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Dan Zink,
	Greg Kroah-Hartman, Dely Sy
  Cc: Jiang Liu, Jiang Liu, Keping Chen, linux-pci

When destroying a PCIe hotplug controller, all work items associated with
that controller should be flushed.  Function pcie_cleanup_slot() calls
cancel_delayed_work() and flush_workqueue() to achieve that.
Function flush_workqueue() will flush all work items already submitted,
but new work items submitted by those already submitted work items may
still be in live state when returning from flush_workqueue().

For the extreme case, pciehp driver may expierence following calling path:
1) pcie_isr() -> pciehp_handle_xxx() -> queue_interrupt_event()->queue_work()
2) interrupt_event_handler() -> handle_button_press_event() ->
   queue_delayed_work()
3) pciehp_queue_pushbutton_work() -> queue_work()

So enhance pcie_cleanup_slot() to correctly flush workqueue when destroying
PCIe hotplug controller.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
---
 drivers/pci/hotplug/pciehp_hpc.c |   14 +++++++++++++-
 1 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index a960fae..98b775f 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -889,8 +889,20 @@ static int pcie_init_slot(struct controller *ctrl)
 static void pcie_cleanup_slot(struct controller *ctrl)
 {
 	struct slot *slot = ctrl->slot;
-	cancel_delayed_work(&slot->work);
+
+	/*
+	 * Following workqueue flushing logic is to deal with the special
+	 * call path:
+	 * 1) pcie_isr() -> pciehp_handle_xxx() ->
+	 *    queue_interrupt_event(pciehp_wq_event)->queue_work(pciehp_wq)
+	 * 2) interrupt_event_handler() -> handle_button_press_event() ->
+	 *    queue_delayed_work(pciehp_wq)
+	 * 3) pciehp_queue_pushbutton_work() -> queue_work(pciehp_wq)
+	 */
 	flush_workqueue(pciehp_wq);
+	cancel_delayed_work_sync(&slot->work);
+	flush_workqueue(pciehp_wq);
+
 	kfree(slot);
 }
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC 06/17] PCI: prepare for serializing hotplug operations triggered by pciehp driver
  2012-04-16 16:28 [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (4 preceding siblings ...)
  2012-04-16 16:28 ` [PATCH RFC 05/17] PCI: correctly flush workqueue when destroy pcie hotplug controller Jiang Liu
@ 2012-04-16 16:29 ` Jiang Liu
  2012-04-16 16:29 ` [PATCH RFC 07/17] PCI: serialize hotplug operaitons triggered by the " Jiang Liu
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Jiang Liu @ 2012-04-16 16:29 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Dan Zink,
	Greg Kroah-Hartman, Dely Sy
  Cc: Jiang Liu, Jiang Liu, Keping Chen, linux-pci

Split pciehp_wq into two workqueues to avoid possible deadlock issues
when using PCI hotplug lock to serialize hotplug operations triggered
by pciehp driver.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
---
 drivers/pci/hotplug/pciehp.h      |    3 ++-
 drivers/pci/hotplug/pciehp_core.c |   18 +++++++++++++-----
 drivers/pci/hotplug/pciehp_ctrl.c |    8 ++++----
 drivers/pci/hotplug/pciehp_hpc.c  |   11 ++++++-----
 4 files changed, 25 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
index 4b7cce1..c8a1a27 100644
--- a/drivers/pci/hotplug/pciehp.h
+++ b/drivers/pci/hotplug/pciehp.h
@@ -44,7 +44,8 @@ extern bool pciehp_poll_mode;
 extern int pciehp_poll_time;
 extern bool pciehp_debug;
 extern bool pciehp_force;
-extern struct workqueue_struct *pciehp_wq;
+extern struct workqueue_struct *pciehp_wq_event;
+extern struct workqueue_struct *pciehp_wq_power;
 
 #define dbg(format, arg...)						\
 do {									\
diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c
index 365c6b9..4ceefe3 100644
--- a/drivers/pci/hotplug/pciehp_core.c
+++ b/drivers/pci/hotplug/pciehp_core.c
@@ -42,7 +42,8 @@ bool pciehp_debug;
 bool pciehp_poll_mode;
 int pciehp_poll_time;
 bool pciehp_force;
-struct workqueue_struct *pciehp_wq;
+struct workqueue_struct *pciehp_wq_event;
+struct workqueue_struct *pciehp_wq_power;
 
 #define DRIVER_VERSION	"0.4"
 #define DRIVER_AUTHOR	"Dan Zink <dan.zink@compaq.com>, Greg Kroah-Hartman <greg@kroah.com>, Dely Sy <dely.l.sy@intel.com>"
@@ -340,16 +341,22 @@ static int __init pcied_init(void)
 {
 	int retval = 0;
 
-	pciehp_wq = alloc_workqueue("pciehp", 0, 0);
-	if (!pciehp_wq)
+	pciehp_wq_event = alloc_workqueue("pciehp_event", 0, 0);
+	if (!pciehp_wq_event)
 		return -ENOMEM;
+	pciehp_wq_power = alloc_workqueue("pciehp_power", 0, 0);
+	if (!pciehp_wq_power) {
+		destroy_workqueue(pciehp_wq_event);
+		return -ENOMEM;
+	}
 
 	pciehp_firmware_init();
 	retval = pcie_port_service_register(&hpdriver_portdrv);
  	dbg("pcie_port_service_register = %d\n", retval);
   	info(DRIVER_DESC " version: " DRIVER_VERSION "\n");
  	if (retval) {
-		destroy_workqueue(pciehp_wq);
+		destroy_workqueue(pciehp_wq_event);
+		destroy_workqueue(pciehp_wq_power);
 		dbg("Failure to register service\n");
 	}
 	return retval;
@@ -359,7 +366,8 @@ static void __exit pcied_cleanup(void)
 {
 	dbg("unload_pciehpd()\n");
 	pcie_port_service_unregister(&hpdriver_portdrv);
-	destroy_workqueue(pciehp_wq);
+	destroy_workqueue(pciehp_wq_event);
+	destroy_workqueue(pciehp_wq_power);
 	info(DRIVER_DESC " version: " DRIVER_VERSION " unloaded\n");
 }
 
diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
index 27f4429..8f4d261 100644
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -49,7 +49,7 @@ static int queue_interrupt_event(struct slot *p_slot, u32 event_type)
 	info->p_slot = p_slot;
 	INIT_WORK(&info->work, interrupt_event_handler);
 
-	queue_work(pciehp_wq, &info->work);
+	queue_work(pciehp_wq_event, &info->work);
 
 	return 0;
 }
@@ -344,7 +344,7 @@ void pciehp_queue_pushbutton_work(struct work_struct *work)
 		kfree(info);
 		goto out;
 	}
-	queue_work(pciehp_wq, &info->work);
+	queue_work(pciehp_wq_power, &info->work);
  out:
 	mutex_unlock(&p_slot->lock);
 }
@@ -377,7 +377,7 @@ static void handle_button_press_event(struct slot *p_slot)
 		if (ATTN_LED(ctrl))
 			pciehp_set_attention_status(p_slot, 0);
 
-		queue_delayed_work(pciehp_wq, &p_slot->work, 5*HZ);
+		queue_delayed_work(pciehp_wq_event, &p_slot->work, 5*HZ);
 		break;
 	case BLINKINGOFF_STATE:
 	case BLINKINGON_STATE:
@@ -439,7 +439,7 @@ static void handle_surprise_event(struct slot *p_slot)
 	else
 		p_slot->state = POWERON_STATE;
 
-	queue_work(pciehp_wq, &info->work);
+	queue_work(pciehp_wq_power, &info->work);
 }
 
 static void interrupt_event_handler(struct work_struct *work)
diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index 98b775f..d5c826d 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -894,14 +894,15 @@ static void pcie_cleanup_slot(struct controller *ctrl)
 	 * Following workqueue flushing logic is to deal with the special
 	 * call path:
 	 * 1) pcie_isr() -> pciehp_handle_xxx() ->
-	 *    queue_interrupt_event(pciehp_wq_event)->queue_work(pciehp_wq)
+	 *    queue_interrupt_event(pciehp_wq_event)->
+	 *    queue_work(pciehp_wq_event)
 	 * 2) interrupt_event_handler() -> handle_button_press_event() ->
-	 *    queue_delayed_work(pciehp_wq)
-	 * 3) pciehp_queue_pushbutton_work() -> queue_work(pciehp_wq)
+	 *    queue_delayed_work(pciehp_wq_event)
+	 * 3) pciehp_queue_pushbutton_work() -> queue_work(pciehp_wq_power)
 	 */
-	flush_workqueue(pciehp_wq);
+	flush_workqueue(pciehp_wq_event);
 	cancel_delayed_work_sync(&slot->work);
-	flush_workqueue(pciehp_wq);
+	flush_workqueue(pciehp_wq_power);
 
 	kfree(slot);
 }
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC 07/17] PCI: serialize hotplug operaitons triggered by the pciehp driver
  2012-04-16 16:28 [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (5 preceding siblings ...)
  2012-04-16 16:29 ` [PATCH RFC 06/17] PCI: prepare for serializing hotplug operations triggered by pciehp driver Jiang Liu
@ 2012-04-16 16:29 ` Jiang Liu
  2012-04-16 16:29 ` [PATCH RFC 08/17] PCI: fix two race windows when probing/removing SHPC controller Jiang Liu
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Jiang Liu @ 2012-04-16 16:29 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Dan Zink,
	Greg Kroah-Hartman, Dely Sy
  Cc: Jiang Liu, Jiang Liu, Keping Chen, linux-pci

Use PCI hotplug lock to serialize hotplug operations triggered by the
pciehp driver.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
---
 drivers/pci/hotplug/pciehp.h      |    2 +
 drivers/pci/hotplug/pciehp_core.c |    7 ++++-
 drivers/pci/hotplug/pciehp_ctrl.c |   48 +++++++++++++++++++++++++++++++++++++
 drivers/pci/hotplug/pciehp_hpc.c  |    1 +
 4 files changed, 57 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
index c8a1a27..6078eea 100644
--- a/drivers/pci/hotplug/pciehp.h
+++ b/drivers/pci/hotplug/pciehp.h
@@ -117,6 +117,7 @@ struct controller {
 #define BLINKINGOFF_STATE		2
 #define POWERON_STATE			3
 #define POWEROFF_STATE			4
+#define SHUTDOWN_STATE			5
 
 #define ATTN_BUTTN(ctrl)	((ctrl)->slot_cap & PCI_EXP_SLTCAP_ABP)
 #define POWER_CTRL(ctrl)	((ctrl)->slot_cap & PCI_EXP_SLTCAP_PCP)
@@ -160,6 +161,7 @@ void pciehp_green_led_off(struct slot *slot);
 void pciehp_green_led_blink(struct slot *slot);
 int pciehp_check_link_status(struct controller *ctrl);
 void pciehp_release_ctrl(struct controller *ctrl);
+void pciehp_shutdown_slot(struct slot *slot);
 
 static inline const char *slot_name(struct slot *slot)
 {
diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c
index 4ceefe3..2195f67 100644
--- a/drivers/pci/hotplug/pciehp_core.c
+++ b/drivers/pci/hotplug/pciehp_core.c
@@ -268,8 +268,11 @@ static int pciehp_probe(struct pcie_device *dev)
 	slot = ctrl->slot;
 	pciehp_get_adapter_status(slot, &occupied);
 	pciehp_get_power_status(slot, &poweron);
-	if (occupied && pciehp_force)
+	if (occupied && pciehp_force) {
+		pci_hotplug_enter();
 		pciehp_enable_slot(slot);
+		pci_hotplug_exit();
+	}
 	/* If empty slot's power status is on, turn power off */
 	if (!occupied && poweron && POWER_CTRL(ctrl))
 		pciehp_power_off_slot(slot);
@@ -313,11 +316,13 @@ static int pciehp_resume (struct pcie_device *dev)
 		slot = ctrl->slot;
 
 		/* Check if slot is occupied */
+		pci_hotplug_enter();
 		pciehp_get_adapter_status(slot, &status);
 		if (status)
 			pciehp_enable_slot(slot);
 		else
 			pciehp_disable_slot(slot);
+		pci_hotplug_exit();
 	}
 	return 0;
 }
diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
index 8f4d261..e859100 100644
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -31,6 +31,7 @@
 #include <linux/kernel.h>
 #include <linux/types.h>
 #include <linux/slab.h>
+#include <linux/delay.h>
 #include <linux/pci.h>
 #include "../pci.h"
 #include "pciehp.h"
@@ -290,6 +291,30 @@ static void pciehp_power_thread(struct work_struct *work)
 	struct power_work_info *info =
 		container_of(work, struct power_work_info, work);
 	struct slot *p_slot = info->p_slot;
+	bool shutdown;
+
+	/*
+	 * Break out deadlock issues caused by following scenario:
+	 * Thread A:
+	 *	1) acquire the PCI hotplug lock
+	 *	2) remove the PCI device associated with this PCIe HPC
+	 *	3) call pciehp_remove() for this PCIe HPC
+	 *	4) call flush_workqueue(pciehp_wq_power) to flush queued works
+	 *      5) wait until all queued works have done
+	 * Thread B is a workqueue worker thread:
+	 *	1) call pciehp_power_thread() to handle hotplug requests
+	 *	2) try to acquire the PCI hotplug lock
+	 * Please refer to pciehp_shutdown_slot() for the counterpart.
+	 */
+	while (!pci_hotplug_try_enter()) {
+		mutex_lock(&p_slot->lock);
+		shutdown = p_slot->state == SHUTDOWN_STATE;
+		mutex_unlock(&p_slot->lock);
+		if (shutdown)
+			goto out;
+		else
+			mdelay(1);
+	}
 
 	mutex_lock(&p_slot->lock);
 	switch (p_slot->state) {
@@ -315,6 +340,9 @@ static void pciehp_power_thread(struct work_struct *work)
 	}
 	mutex_unlock(&p_slot->lock);
 
+	pci_hotplug_exit();
+
+out:
 	kfree(info);
 }
 
@@ -623,3 +651,23 @@ int pciehp_sysfs_disable_slot(struct slot *p_slot)
 
 	return retval;
 }
+
+void pciehp_shutdown_slot(struct slot *slot)
+{
+	u8 getstatus;
+	struct controller *ctrl = slot->ctrl;
+
+	mutex_lock(&slot->lock);
+	slot->state = SHUTDOWN_STATE;
+	mutex_unlock(&slot->lock);
+
+	if (ATTN_LED(ctrl))
+		pciehp_set_attention_status(slot, 0);
+	if (PWR_LED(ctrl)) {
+		pciehp_get_power_status(slot, &getstatus);
+		if (getstatus)
+			pciehp_green_led_on(slot);
+		else
+			pciehp_green_led_off(slot);
+	}
+}
diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index d5c826d..c492b2c 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -902,6 +902,7 @@ static void pcie_cleanup_slot(struct controller *ctrl)
 	 */
 	flush_workqueue(pciehp_wq_event);
 	cancel_delayed_work_sync(&slot->work);
+	pciehp_shutdown_slot(slot);
 	flush_workqueue(pciehp_wq_power);
 
 	kfree(slot);
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC 08/17] PCI: fix two race windows when probing/removing SHPC controller
  2012-04-16 16:28 [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (6 preceding siblings ...)
  2012-04-16 16:29 ` [PATCH RFC 07/17] PCI: serialize hotplug operaitons triggered by the " Jiang Liu
@ 2012-04-16 16:29 ` Jiang Liu
  2012-04-16 16:29 ` [PATCH RFC 09/17] PCI: correctly flush workqueues and timer when destroy " Jiang Liu
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Jiang Liu @ 2012-04-16 16:29 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Dan Zink,
	Greg Kroah-Hartman, Dely Sy
  Cc: Jiang Liu, Jiang Liu, Keping Chen, linux-pci

With current shpchp implementation, interrupt is enabled before corresponding
slot data structures are created. If interrupt happens immediately after
enabling the hotplug interrupt, shpchp_find_slot() may return NULL, thus
causes NULL pointer dereference. So create slot data structures before
enabling interrupt.

And cleanup_slots() is called to destroy slot data structures before disabling
interrupt, which may cause invalid memory access in shpc_isr().  So call
cleanup_slots() after disabling interrupt/removing the poll timer.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
---
 drivers/pci/hotplug/shpchp.h      |    1 +
 drivers/pci/hotplug/shpchp_core.c |    6 +++---
 drivers/pci/hotplug/shpchp_hpc.c  |   36 +++++++++++++++++++++---------------
 3 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/drivers/pci/hotplug/shpchp.h b/drivers/pci/hotplug/shpchp.h
index ca64932..6691dc4 100644
--- a/drivers/pci/hotplug/shpchp.h
+++ b/drivers/pci/hotplug/shpchp.h
@@ -182,6 +182,7 @@ extern int shpchp_unconfigure_device(struct slot *p_slot);
 extern void cleanup_slots(struct controller *ctrl);
 extern void shpchp_queue_pushbutton_work(struct work_struct *work);
 extern int shpc_init( struct controller *ctrl, struct pci_dev *pdev);
+extern void shpc_enable_intr(struct controller *ctrl);
 
 static inline const char *slot_name(struct slot *slot)
 {
diff --git a/drivers/pci/hotplug/shpchp_core.c b/drivers/pci/hotplug/shpchp_core.c
index 7414fd9..aaa66be 100644
--- a/drivers/pci/hotplug/shpchp_core.c
+++ b/drivers/pci/hotplug/shpchp_core.c
@@ -318,14 +318,14 @@ static int shpc_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto err_out_release_ctlr;
 	}
 
+	shpc_enable_intr(ctrl);
+
 	rc = shpchp_create_ctrl_files(ctrl);
 	if (rc)
-		goto err_cleanup_slots;
+		goto err_out_release_ctlr;
 
 	return 0;
 
-err_cleanup_slots:
-	cleanup_slots(ctrl);
 err_out_release_ctlr:
 	ctrl->hpc_ops->release_ctlr(ctrl);
 err_out_free_ctrl:
diff --git a/drivers/pci/hotplug/shpchp_hpc.c b/drivers/pci/hotplug/shpchp_hpc.c
index 75ba231..2bc6182 100644
--- a/drivers/pci/hotplug/shpchp_hpc.c
+++ b/drivers/pci/hotplug/shpchp_hpc.c
@@ -592,6 +592,13 @@ static void hpc_release_ctlr(struct controller *ctrl)
 		shpc_writel(ctrl, SLOT_REG(i), slot_reg);
 	}
 
+	if (shpchp_poll_mode)
+		del_timer(&ctrl->poll_timer);
+	else {
+		free_irq(ctrl->pci_dev->irq, ctrl);
+		pci_disable_msi(ctrl->pci_dev);
+	}
+
 	cleanup_slots(ctrl);
 
 	/*
@@ -603,13 +610,6 @@ static void hpc_release_ctlr(struct controller *ctrl)
 	serr_int &= ~SERR_INTR_RSVDZ_MASK;
 	shpc_writel(ctrl, SERR_INTR_ENABLE, serr_int);
 
-	if (shpchp_poll_mode)
-		del_timer(&ctrl->poll_timer);
-	else {
-		free_irq(ctrl->pci_dev->irq, ctrl);
-		pci_disable_msi(ctrl->pci_dev);
-	}
-
 	iounmap(ctrl->creg);
 	release_mem_region(ctrl->mmio_base, ctrl->mmio_size);
 }
@@ -1081,6 +1081,20 @@ int shpc_init(struct controller *ctrl, struct pci_dev *pdev)
 	shpc_get_max_bus_speed(ctrl);
 	shpc_get_cur_bus_speed(ctrl);
 
+	return 0;
+
+	/* We end up here for the many possible ways to fail this API.  */
+abort_iounmap:
+	iounmap(ctrl->creg);
+abort:
+	return rc;
+}
+
+void shpc_enable_intr(struct controller *ctrl)
+{
+	u8 hp_slot;
+	u32 tempdword, slot_reg;
+
 	/*
 	 * Unmask all event interrupts of all slots
 	 */
@@ -1102,12 +1116,4 @@ int shpc_init(struct controller *ctrl, struct pci_dev *pdev)
 		tempdword = shpc_readl(ctrl, SERR_INTR_ENABLE);
 		ctrl_dbg(ctrl, "SERR_INTR_ENABLE = %x\n", tempdword);
 	}
-
-	return 0;
-
-	/* We end up here for the many possible ways to fail this API.  */
-abort_iounmap:
-	iounmap(ctrl->creg);
-abort:
-	return rc;
 }
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC 09/17] PCI: correctly flush workqueues and timer when destroy SHPC controller
  2012-04-16 16:28 [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (7 preceding siblings ...)
  2012-04-16 16:29 ` [PATCH RFC 08/17] PCI: fix two race windows when probing/removing SHPC controller Jiang Liu
@ 2012-04-16 16:29 ` Jiang Liu
  2012-04-16 16:29 ` [PATCH RFC 10/17] PCI: serialize hotplug operaitons triggered by the shpchp driver Jiang Liu
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Jiang Liu @ 2012-04-16 16:29 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Dan Zink,
	Greg Kroah-Hartman, Dely Sy
  Cc: Jiang Liu, Jiang Liu, Keping Chen, linux-pci

del_timer() only deactivates a timer but doesn't wait for the handler
to finish, so use del_timer_sync() to deactivate a timer and wait for
the handler to finish in hpc_release_ctrl().

This patch also tune the workqueue flush logic to correctly flush all
work items.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
---
 drivers/pci/hotplug/shpchp_core.c |    2 +-
 drivers/pci/hotplug/shpchp_hpc.c  |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/hotplug/shpchp_core.c b/drivers/pci/hotplug/shpchp_core.c
index aaa66be..19762b3 100644
--- a/drivers/pci/hotplug/shpchp_core.c
+++ b/drivers/pci/hotplug/shpchp_core.c
@@ -173,8 +173,8 @@ void cleanup_slots(struct controller *ctrl)
 	list_for_each_safe(tmp, next, &ctrl->slot_list) {
 		slot = list_entry(tmp, struct slot, slot_list);
 		list_del(&slot->slot_list);
-		cancel_delayed_work(&slot->work);
 		flush_workqueue(shpchp_wq);
+		cancel_delayed_work_sync(&slot->work);
 		flush_workqueue(shpchp_ordered_wq);
 		pci_hp_deregister(slot->hotplug_slot);
 	}
diff --git a/drivers/pci/hotplug/shpchp_hpc.c b/drivers/pci/hotplug/shpchp_hpc.c
index 2bc6182..ff63887 100644
--- a/drivers/pci/hotplug/shpchp_hpc.c
+++ b/drivers/pci/hotplug/shpchp_hpc.c
@@ -593,7 +593,7 @@ static void hpc_release_ctlr(struct controller *ctrl)
 	}
 
 	if (shpchp_poll_mode)
-		del_timer(&ctrl->poll_timer);
+		del_timer_sync(&ctrl->poll_timer);
 	else {
 		free_irq(ctrl->pci_dev->irq, ctrl);
 		pci_disable_msi(ctrl->pci_dev);
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC 10/17] PCI: serialize hotplug operaitons triggered by the shpchp driver
  2012-04-16 16:28 [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (8 preceding siblings ...)
  2012-04-16 16:29 ` [PATCH RFC 09/17] PCI: correctly flush workqueues and timer when destroy " Jiang Liu
@ 2012-04-16 16:29 ` Jiang Liu
  2012-04-16 16:29 ` [PATCH RFC 11/17] PCI: release IO resource in error handling path in cpcihp_generic_init() Jiang Liu
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Jiang Liu @ 2012-04-16 16:29 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Dan Zink,
	Greg Kroah-Hartman, Dely Sy
  Cc: Jiang Liu, Jiang Liu, Keping Chen, linux-pci

Use PCI hotplug lock to serialize hotplug operations triggered by the
shpchp driver.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
---
 drivers/pci/hotplug/shpchp.h      |    2 ++
 drivers/pci/hotplug/shpchp_core.c |    3 ++-
 drivers/pci/hotplug/shpchp_ctrl.c |   32 ++++++++++++++++++++++++++++++++
 3 files changed, 36 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/hotplug/shpchp.h b/drivers/pci/hotplug/shpchp.h
index 6691dc4..07f3b2d 100644
--- a/drivers/pci/hotplug/shpchp.h
+++ b/drivers/pci/hotplug/shpchp.h
@@ -157,6 +157,7 @@ struct controller {
 #define BLINKINGOFF_STATE		2
 #define POWERON_STATE			3
 #define POWEROFF_STATE			4
+#define SHUTDOWN_STATE			5
 
 /* Error messages */
 #define INTERLOCK_OPEN			0x00000002
@@ -179,6 +180,7 @@ extern u8 shpchp_handle_presence_change(u8 hp_slot, struct controller *ctrl);
 extern u8 shpchp_handle_power_fault(u8 hp_slot, struct controller *ctrl);
 extern int shpchp_configure_device(struct slot *p_slot);
 extern int shpchp_unconfigure_device(struct slot *p_slot);
+extern void shpchp_shutdown_slot(struct slot *slot);
 extern void cleanup_slots(struct controller *ctrl);
 extern void shpchp_queue_pushbutton_work(struct work_struct *work);
 extern int shpc_init( struct controller *ctrl, struct pci_dev *pdev);
diff --git a/drivers/pci/hotplug/shpchp_core.c b/drivers/pci/hotplug/shpchp_core.c
index 19762b3..71cc5f2 100644
--- a/drivers/pci/hotplug/shpchp_core.c
+++ b/drivers/pci/hotplug/shpchp_core.c
@@ -172,10 +172,11 @@ void cleanup_slots(struct controller *ctrl)
 
 	list_for_each_safe(tmp, next, &ctrl->slot_list) {
 		slot = list_entry(tmp, struct slot, slot_list);
-		list_del(&slot->slot_list);
 		flush_workqueue(shpchp_wq);
 		cancel_delayed_work_sync(&slot->work);
+		shpchp_shutdown_slot(slot);
 		flush_workqueue(shpchp_ordered_wq);
+		list_del(&slot->slot_list);
 		pci_hp_deregister(slot->hotplug_slot);
 	}
 }
diff --git a/drivers/pci/hotplug/shpchp_ctrl.c b/drivers/pci/hotplug/shpchp_ctrl.c
index b00b09b..7b8e3a6 100644
--- a/drivers/pci/hotplug/shpchp_ctrl.c
+++ b/drivers/pci/hotplug/shpchp_ctrl.c
@@ -31,6 +31,7 @@
 #include <linux/kernel.h>
 #include <linux/types.h>
 #include <linux/slab.h>
+#include <linux/delay.h>
 #include <linux/pci.h>
 #include "../pci.h"
 #include "shpchp.h"
@@ -406,6 +407,18 @@ static void shpchp_pushbutton_thread(struct work_struct *work)
 	struct pushbutton_work_info *info =
 		container_of(work, struct pushbutton_work_info, work);
 	struct slot *p_slot = info->p_slot;
+	bool shutdown;
+
+	/* Break possible deadlock by using pci_hotplug_try_enter() */
+	while (!pci_hotplug_try_enter()) {
+		mutex_lock(&p_slot->lock);
+		shutdown = p_slot->state == SHUTDOWN_STATE;
+		mutex_unlock(&p_slot->lock);
+		if (shutdown)
+			goto out;
+		else
+			mdelay(1);
+	}
 
 	mutex_lock(&p_slot->lock);
 	switch (p_slot->state) {
@@ -427,6 +440,9 @@ static void shpchp_pushbutton_thread(struct work_struct *work)
 	}
 	mutex_unlock(&p_slot->lock);
 
+	pci_hotplug_exit();
+
+out:
 	kfree(info);
 }
 
@@ -729,3 +745,19 @@ int shpchp_sysfs_disable_slot(struct slot *p_slot)
 
 	return retval;
 }
+
+void shpchp_shutdown_slot(struct slot *slot)
+{
+	u8 getstatus;
+
+	mutex_lock(&slot->lock);
+	slot->state = SHUTDOWN_STATE;
+	mutex_unlock(&slot->lock);
+
+	slot->hpc_ops->set_attention_status(slot, 0);
+	slot->hpc_ops->get_power_status(slot, &getstatus);
+	if (getstatus)
+		slot->hpc_ops->green_led_on(slot);
+	else
+		slot->hpc_ops->green_led_off(slot);
+}
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC 11/17] PCI: release IO resource in error handling path in cpcihp_generic_init()
  2012-04-16 16:28 [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (9 preceding siblings ...)
  2012-04-16 16:29 ` [PATCH RFC 10/17] PCI: serialize hotplug operaitons triggered by the shpchp driver Jiang Liu
@ 2012-04-16 16:29 ` Jiang Liu
  2012-04-16 16:29 ` [PATCH RFC 12/17] PCI: clean up all resources in error handling path in zt5550_hc_init_one() Jiang Liu
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Jiang Liu @ 2012-04-16 16:29 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Scott Murray
  Cc: Jiang Liu, Jiang Liu, Keping Chen, linux-pci

Release IO resource in error handling path in function cpcihp_generic_init().

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
---
 drivers/pci/hotplug/cpcihp_generic.c |   25 ++++++++++++++++---------
 1 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/hotplug/cpcihp_generic.c b/drivers/pci/hotplug/cpcihp_generic.c
index 81af764..518f387 100644
--- a/drivers/pci/hotplug/cpcihp_generic.c
+++ b/drivers/pci/hotplug/cpcihp_generic.c
@@ -157,16 +157,19 @@ static int __init cpcihp_generic_init(void)
 	bus = pci_find_bus(0, bridge_busnr);
 	if (!bus) {
 		err("Invalid bus number %d", bridge_busnr);
-		return -EINVAL;
-	}
-	dev = pci_get_slot(bus, PCI_DEVFN(bridge_slot, 0));
-	if(!dev || dev->hdr_type != PCI_HEADER_TYPE_BRIDGE) {
-		err("Invalid bridge device %s", bridge);
+	} else {
+		dev = pci_get_slot(bus, PCI_DEVFN(bridge_slot, 0));
+		if (!dev || dev->hdr_type != PCI_HEADER_TYPE_BRIDGE) {
+			err("Invalid bridge device %s", bridge);
+			bus = NULL;
+		} else
+			bus = dev->subordinate;
 		pci_dev_put(dev);
-		return -EINVAL;
 	}
-	bus = dev->subordinate;
-	pci_dev_put(dev);
+	if (!bus) {
+		status = -EINVAL;
+		goto init_find_bus_error;
+	}
 
 	memset(&generic_hpc, 0, sizeof (struct cpci_hp_controller));
 	generic_hpc_ops.query_enum = query_enum;
@@ -175,7 +178,8 @@ static int __init cpcihp_generic_init(void)
 	status = cpci_hp_register_controller(&generic_hpc);
 	if(status != 0) {
 		err("Could not register cPCI hotplug controller");
-		return -ENODEV;
+		status = -ENODEV;
+		goto init_find_bus_error;
 	}
 	dbg("registered controller");
 
@@ -193,10 +197,13 @@ static int __init cpcihp_generic_init(void)
 	}
 	dbg("started cpci hp system");
 	return 0;
+
 init_start_error:
 	cpci_hp_unregister_bus(bus);
 init_bus_register_error:
 	cpci_hp_unregister_controller(&generic_hpc);
+init_find_bus_error:
+	release_region(port, 1);
 	err("status = %d", status);
 	return status;
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC 12/17] PCI: clean up all resources in error handling path in zt5550_hc_init_one()
  2012-04-16 16:28 [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (10 preceding siblings ...)
  2012-04-16 16:29 ` [PATCH RFC 11/17] PCI: release IO resource in error handling path in cpcihp_generic_init() Jiang Liu
@ 2012-04-16 16:29 ` Jiang Liu
  2012-04-16 16:29 ` [PATCH RFC 13/17] PCI: trivial code clean up in cpci_hotplug_core.c Jiang Liu
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Jiang Liu @ 2012-04-16 16:29 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Scott Murray
  Cc: Jiang Liu, Jiang Liu, Keping Chen, linux-pci

Clean up all resources in error handling path in function zt5550_hc_init_one().

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
---
 drivers/pci/hotplug/cpcihp_zt5550.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/hotplug/cpcihp_zt5550.c b/drivers/pci/hotplug/cpcihp_zt5550.c
index 6bf8d2a..8a6f968 100644
--- a/drivers/pci/hotplug/cpcihp_zt5550.c
+++ b/drivers/pci/hotplug/cpcihp_zt5550.c
@@ -257,11 +257,13 @@ static int zt5550_hc_init_one (struct pci_dev *pdev, const struct pci_device_id
 	if(status != 0) {
 		err("could not started cPCI hotplug system");
 		cpci_hp_unregister_bus(bus0);
-		goto init_register_error;
+		goto init_start_error;
 	}
 	dbg("started cpci hp system");
 
 	return 0;
+init_start_error:
+	cpci_hp_unregister_bus(bus0);
 init_register_error:
 	cpci_hp_unregister_controller(&zt5550_hpc);
 init_hc_error:
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC 13/17] PCI: trivial code clean up in cpci_hotplug_core.c
  2012-04-16 16:28 [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (11 preceding siblings ...)
  2012-04-16 16:29 ` [PATCH RFC 12/17] PCI: clean up all resources in error handling path in zt5550_hc_init_one() Jiang Liu
@ 2012-04-16 16:29 ` Jiang Liu
  2012-04-16 16:29 ` [PATCH RFC 14/17] PCI: fix race windows when shutting down cpcihp controller Jiang Liu
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Jiang Liu @ 2012-04-16 16:29 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Scott Murray
  Cc: Jiang Liu, Jiang Liu, Keping Chen, linux-pci

1) get rid of redundant lock operations in cpcihp_core.
2) return suitable error code instead of -1.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
---
 drivers/pci/hotplug/cpci_hotplug_core.c |   16 +++-------------
 1 files changed, 3 insertions(+), 13 deletions(-)

diff --git a/drivers/pci/hotplug/cpci_hotplug_core.c b/drivers/pci/hotplug/cpci_hotplug_core.c
index 3fadf2f..7898023 100644
--- a/drivers/pci/hotplug/cpci_hotplug_core.c
+++ b/drivers/pci/hotplug/cpci_hotplug_core.c
@@ -304,7 +304,7 @@ cpci_hp_unregister_bus(struct pci_bus *bus)
 	down_write(&list_rwsem);
 	if (!slots) {
 		up_write(&list_rwsem);
-		return -1;
+		return -ENODEV;
 	}
 	list_for_each_entry_safe(slot, tmp, &slot_list, slot_list) {
 		if (slot->bus == bus) {
@@ -357,11 +357,6 @@ init_slots(int clear_ins)
 	struct pci_dev* dev;
 
 	dbg("%s - enter", __func__);
-	down_read(&list_rwsem);
-	if (!slots) {
-		up_read(&list_rwsem);
-		return -1;
-	}
 	list_for_each_entry(slot, &slot_list, slot_list) {
 		dbg("%s - looking at slot %s", __func__, slot_name(slot));
 		if (clear_ins && cpci_check_and_clear_ins(slot))
@@ -376,7 +371,6 @@ init_slots(int clear_ins)
 			slot->dev = dev;
 		}
 	}
-	up_read(&list_rwsem);
 	dbg("%s - exit", __func__);
 	return 0;
 }
@@ -585,7 +579,7 @@ cpci_hp_register_controller(struct cpci_hp_controller *new_controller)
 	int status = 0;
 
 	if (controller)
-		return -1;
+		return -EBUSY;
 	if (!(new_controller && new_controller->ops))
 		return -EINVAL;
 	if (new_controller->irq) {
@@ -620,15 +614,11 @@ cleanup_slots(void)
 	 * and free up all memory that we had allocated.
 	 */
 	down_write(&list_rwsem);
-	if (!slots)
-		goto cleanup_null;
 	list_for_each_entry_safe(slot, tmp, &slot_list, slot_list) {
 		list_del(&slot->slot_list);
 		pci_hp_deregister(slot->hotplug_slot);
 	}
-cleanup_null:
 	up_write(&list_rwsem);
-	return;
 }
 
 int
@@ -663,9 +653,9 @@ cpci_hp_start(void)
 		up_read(&list_rwsem);
 		return -ENODEV;
 	}
+	status = init_slots(first);
 	up_read(&list_rwsem);
 
-	status = init_slots(first);
 	if (first)
 		first = 0;
 	if (status)
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC 14/17] PCI: fix race windows when shutting down cpcihp controller
  2012-04-16 16:28 [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (12 preceding siblings ...)
  2012-04-16 16:29 ` [PATCH RFC 13/17] PCI: trivial code clean up in cpci_hotplug_core.c Jiang Liu
@ 2012-04-16 16:29 ` Jiang Liu
  2012-04-16 16:29 ` [PATCH RFC 15/17] PCI: hold a reference count to the PCI bus used by cpcihp drivers Jiang Liu
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Jiang Liu @ 2012-04-16 16:29 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Scott Murray
  Cc: Jiang Liu, Jiang Liu, Keping Chen, linux-pci

When cpci_hp_stop() is called to disabled cpcihp controller, it will disable
interrupt for that controller. But there's small window for event_thread()
to reenable the interrupt again. So stop the worker thread before disabling
the interrupt.

If check_slots() returns error, ther working thread (cpci_thread) will exit.
Later when cpci_stop_thread() or cpci_hp_intr() tries to access cpci_thread,
it may have already been destroyed. So hold a reference count to cpci_thread
to avoid invalid memory access.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
---
 drivers/pci/hotplug/cpci_hotplug_core.c |   27 +++++++++++++++++----------
 1 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/drivers/pci/hotplug/cpci_hotplug_core.c b/drivers/pci/hotplug/cpci_hotplug_core.c
index 7898023..68e43c7 100644
--- a/drivers/pci/hotplug/cpci_hotplug_core.c
+++ b/drivers/pci/hotplug/cpci_hotplug_core.c
@@ -60,7 +60,6 @@ static atomic_t extracting;
 int cpci_debug;
 static struct cpci_hp_controller *controller;
 static struct task_struct *cpci_thread;
-static int thread_finished;
 
 static int enable_slot(struct hotplug_slot *slot);
 static int disable_slot(struct hotplug_slot *slot);
@@ -341,7 +340,8 @@ cpci_hp_intr(int irq, void *data)
 	controller->ops->disable_irq();
 
 	/* Trigger processing by the event thread */
-	wake_up_process(cpci_thread);
+	if (cpci_thread)
+		wake_up_process(cpci_thread);
 	return IRQ_HANDLED;
 }
 
@@ -508,7 +508,6 @@ event_thread(void *data)
 				msleep(500);
 			} else if (rc < 0) {
 				dbg("%s - error checking slots", __func__);
-				thread_finished = 1;
 				goto out;
 			}
 		} while (atomic_read(&extracting) && !kthread_should_stop());
@@ -540,7 +539,6 @@ poll_thread(void *data)
 					msleep(500);
 				} else if (rc < 0) {
 					dbg("%s - error checking slots", __func__);
-					thread_finished = 1;
 					goto out;
 				}
 			} while (atomic_read(&extracting) && !kthread_should_stop());
@@ -562,15 +560,24 @@ cpci_start_thread(void)
 		err("Can't start up our thread");
 		return PTR_ERR(cpci_thread);
 	}
-	thread_finished = 0;
+	get_task_struct(cpci_thread);
+
 	return 0;
 }
 
 static void
 cpci_stop_thread(void)
 {
-	kthread_stop(cpci_thread);
-	thread_finished = 1;
+	struct task_struct *tp;
+
+	if (cpci_thread) {
+		local_irq_disable();
+		tp = cpci_thread;
+		cpci_thread = NULL;
+		local_irq_enable();
+		kthread_stop(tp);
+		put_task_struct(tp);
+	}
 }
 
 int
@@ -627,8 +634,7 @@ cpci_hp_unregister_controller(struct cpci_hp_controller *old_controller)
 	int status = 0;
 
 	if (controller) {
-		if (!thread_finished)
-			cpci_stop_thread();
+		cpci_stop_thread();
 		if (controller->irq)
 			free_irq(controller->irq, controller->dev_id);
 		controller = NULL;
@@ -680,12 +686,13 @@ cpci_hp_stop(void)
 {
 	if (!controller)
 		return -ENODEV;
+	cpci_stop_thread();
 	if (controller->irq) {
 		/* Stop enum interrupt processing */
 		dbg("%s - disabling irq", __func__);
 		controller->ops->disable_irq();
+		synchronize_irq(controller->irq);
 	}
-	cpci_stop_thread();
 	return 0;
 }
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC 15/17] PCI: hold a reference count to the PCI bus used by cpcihp drivers
  2012-04-16 16:28 [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (13 preceding siblings ...)
  2012-04-16 16:29 ` [PATCH RFC 14/17] PCI: fix race windows when shutting down cpcihp controller Jiang Liu
@ 2012-04-16 16:29 ` Jiang Liu
  2012-04-16 16:29 ` [PATCH RFC 16/17] PCI: serialize PCI hotplug operations triggered " Jiang Liu
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: Jiang Liu @ 2012-04-16 16:29 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Scott Murray
  Cc: Jiang Liu, Jiang Liu, Keping Chen, linux-pci

The PCI bus used by cpcihp drivers may be removed at runtime through
/sys/devices/pcissss:bb/ssss\:bb\:dd.f/.../remove interface, so hold
a reference count to the PCI bus to avoid invalid memory access.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
---
 drivers/pci/hotplug/cpcihp_generic.c |    7 +++++--
 drivers/pci/hotplug/cpcihp_zt5550.c  |    7 +++++--
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/hotplug/cpcihp_generic.c b/drivers/pci/hotplug/cpcihp_generic.c
index 518f387..f002be5 100644
--- a/drivers/pci/hotplug/cpcihp_generic.c
+++ b/drivers/pci/hotplug/cpcihp_generic.c
@@ -163,7 +163,7 @@ static int __init cpcihp_generic_init(void)
 			err("Invalid bridge device %s", bridge);
 			bus = NULL;
 		} else
-			bus = dev->subordinate;
+			bus = pci_bus_get(dev->subordinate);
 		pci_dev_put(dev);
 	}
 	if (!bus) {
@@ -179,7 +179,7 @@ static int __init cpcihp_generic_init(void)
 	if(status != 0) {
 		err("Could not register cPCI hotplug controller");
 		status = -ENODEV;
-		goto init_find_bus_error;
+		goto init_register_controller_error;
 	}
 	dbg("registered controller");
 
@@ -202,6 +202,8 @@ init_start_error:
 	cpci_hp_unregister_bus(bus);
 init_bus_register_error:
 	cpci_hp_unregister_controller(&generic_hpc);
+init_register_controller_error:
+	pci_bus_put(bus);
 init_find_bus_error:
 	release_region(port, 1);
 	err("status = %d", status);
@@ -214,6 +216,7 @@ static void __exit cpcihp_generic_exit(void)
 	cpci_hp_stop();
 	cpci_hp_unregister_bus(bus);
 	cpci_hp_unregister_controller(&generic_hpc);
+	pci_bus_put(bus);
 	release_region(port, 1);
 }
 
diff --git a/drivers/pci/hotplug/cpcihp_zt5550.c b/drivers/pci/hotplug/cpcihp_zt5550.c
index 8a6f968..6606520 100644
--- a/drivers/pci/hotplug/cpcihp_zt5550.c
+++ b/drivers/pci/hotplug/cpcihp_zt5550.c
@@ -241,9 +241,9 @@ static int zt5550_hc_init_one (struct pci_dev *pdev, const struct pci_device_id
 	if(!(bus0_dev = pci_get_device(PCI_VENDOR_ID_DEC,
 					 PCI_DEVICE_ID_DEC_21154, NULL))) {
 		status = -ENODEV;
-		goto init_register_error;
+		goto init_find_bus_error;
 	}
-	bus0 = bus0_dev->subordinate;
+	bus0 = pci_bus_get(bus0_dev->subordinate);
 	pci_dev_put(bus0_dev);
 
 	status = cpci_hp_register_bus(bus0, 0x0a, 0x0f);
@@ -265,6 +265,8 @@ static int zt5550_hc_init_one (struct pci_dev *pdev, const struct pci_device_id
 init_start_error:
 	cpci_hp_unregister_bus(bus0);
 init_register_error:
+	pci_bus_put(bus0);
+init_find_bus_error:
 	cpci_hp_unregister_controller(&zt5550_hpc);
 init_hc_error:
 	err("status = %d", status);
@@ -278,6 +280,7 @@ static void __devexit zt5550_hc_remove_one(struct pci_dev *pdev)
 	cpci_hp_stop();
 	cpci_hp_unregister_bus(bus0);
 	cpci_hp_unregister_controller(&zt5550_hpc);
+	pci_bus_put(bus0);
 	zt5550_hc_cleanup();
 }
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC 16/17] PCI: serialize PCI hotplug operations triggered by cpcihp drivers
  2012-04-16 16:28 [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (14 preceding siblings ...)
  2012-04-16 16:29 ` [PATCH RFC 15/17] PCI: hold a reference count to the PCI bus used by cpcihp drivers Jiang Liu
@ 2012-04-16 16:29 ` Jiang Liu
  2012-04-16 16:29 ` [PATCH RFC 17/17] PCI: serialize PCI hotplug operations triggered by fakephp drivers Jiang Liu
  2012-04-16 21:33 ` [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Greg KH
  17 siblings, 0 replies; 21+ messages in thread
From: Jiang Liu @ 2012-04-16 16:29 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Scott Murray
  Cc: Jiang Liu, Jiang Liu, Keping Chen, linux-pci

Use PCI hotplug lock to globally serialize hotplug operations triggered
by cpcihp_xxx drivers.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
---
 drivers/pci/hotplug/cpci_hotplug_core.c |   10 ++++++++++
 drivers/pci/hotplug/cpcihp_generic.c    |    2 ++
 drivers/pci/hotplug/cpcihp_zt5550.c     |   12 ++++++++----
 3 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/hotplug/cpci_hotplug_core.c b/drivers/pci/hotplug/cpci_hotplug_core.c
index 68e43c7..bbac305 100644
--- a/drivers/pci/hotplug/cpci_hotplug_core.c
+++ b/drivers/pci/hotplug/cpci_hotplug_core.c
@@ -502,6 +502,10 @@ event_thread(void *data)
 		if (kthread_should_stop())
 			break;
 		do {
+			if (!pci_hotplug_try_enter()) {
+				msleep(1);
+				continue;
+			}
 			rc = check_slots();
 			if (rc > 0) {
 				/* Give userspace a chance to handle extraction */
@@ -510,6 +514,7 @@ event_thread(void *data)
 				dbg("%s - error checking slots", __func__);
 				goto out;
 			}
+			pci_hotplug_exit();
 		} while (atomic_read(&extracting) && !kthread_should_stop());
 		if (kthread_should_stop())
 			break;
@@ -533,6 +538,10 @@ poll_thread(void *data)
 			break;
 		if (controller->ops->query_enum()) {
 			do {
+				if (!pci_hotplug_try_enter()) {
+					msleep(1);
+					continue;
+				}
 				rc = check_slots();
 				if (rc > 0) {
 					/* Give userspace a chance to handle extraction */
@@ -541,6 +550,7 @@ poll_thread(void *data)
 					dbg("%s - error checking slots", __func__);
 					goto out;
 				}
+				pci_hotplug_exit();
 			} while (atomic_read(&extracting) && !kthread_should_stop());
 		}
 		msleep(100);
diff --git a/drivers/pci/hotplug/cpcihp_generic.c b/drivers/pci/hotplug/cpcihp_generic.c
index f002be5..91d27d5 100644
--- a/drivers/pci/hotplug/cpcihp_generic.c
+++ b/drivers/pci/hotplug/cpcihp_generic.c
@@ -154,6 +154,7 @@ static int __init cpcihp_generic_init(void)
 	if(!r)
 		return -EBUSY;
 
+	pci_hotplug_disable();
 	bus = pci_find_bus(0, bridge_busnr);
 	if (!bus) {
 		err("Invalid bus number %d", bridge_busnr);
@@ -166,6 +167,7 @@ static int __init cpcihp_generic_init(void)
 			bus = pci_bus_get(dev->subordinate);
 		pci_dev_put(dev);
 	}
+	pci_hotplug_enable();
 	if (!bus) {
 		status = -EINVAL;
 		goto init_find_bus_error;
diff --git a/drivers/pci/hotplug/cpcihp_zt5550.c b/drivers/pci/hotplug/cpcihp_zt5550.c
index 6606520..1f81d85 100644
--- a/drivers/pci/hotplug/cpcihp_zt5550.c
+++ b/drivers/pci/hotplug/cpcihp_zt5550.c
@@ -238,13 +238,17 @@ static int zt5550_hc_init_one (struct pci_dev *pdev, const struct pci_device_id
 	dbg("registered controller");
 
 	/* Look for first device matching cPCI bus's bridge vendor and device IDs */
-	if(!(bus0_dev = pci_get_device(PCI_VENDOR_ID_DEC,
-					 PCI_DEVICE_ID_DEC_21154, NULL))) {
+	pci_hotplug_enter();
+	bus0_dev = pci_get_device(PCI_VENDOR_ID_DEC,
+				  PCI_DEVICE_ID_DEC_21154, NULL);
+	if (bus0_dev)
+		bus0 = pci_bus_get(bus0_dev->subordinate);
+	pci_dev_put(bus0_dev);
+	pci_hotplug_exit();
+	if (!bus0) {
 		status = -ENODEV;
 		goto init_find_bus_error;
 	}
-	bus0 = pci_bus_get(bus0_dev->subordinate);
-	pci_dev_put(bus0_dev);
 
 	status = cpci_hp_register_bus(bus0, 0x0a, 0x0f);
 	if(status != 0) {
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC 17/17] PCI: serialize PCI hotplug operations triggered by fakephp drivers
  2012-04-16 16:28 [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (15 preceding siblings ...)
  2012-04-16 16:29 ` [PATCH RFC 16/17] PCI: serialize PCI hotplug operations triggered " Jiang Liu
@ 2012-04-16 16:29 ` Jiang Liu
  2012-04-16 21:33 ` [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Greg KH
  17 siblings, 0 replies; 21+ messages in thread
From: Jiang Liu @ 2012-04-16 16:29 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas
  Cc: Jiang Liu, Jiang Liu, Keping Chen, linux-pci

Use PCI hotplug lock to globally serialize hotplug operations triggered
by fakephp driver.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
---
 drivers/pci/hotplug/fakephp.c |   38 ++++++++++++++++++++++++++++++++------
 1 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/hotplug/fakephp.c b/drivers/pci/hotplug/fakephp.c
index a019c9a..ee6c79e 100644
--- a/drivers/pci/hotplug/fakephp.c
+++ b/drivers/pci/hotplug/fakephp.c
@@ -38,9 +38,24 @@ static ssize_t legacy_show(struct kobject *kobj, struct attribute *attr,
 	return 2;
 }
 
+static void rescan_callback(void *data)
+{
+	struct legacy_slot *slot = data;
+
+	pci_hotplug_enter();
+	if (!list_empty(&slot->list))
+		pci_rescan_bus(slot->dev->bus);
+	pci_hotplug_exit();
+}
+
 static void remove_callback(void *data)
 {
-	pci_stop_and_remove_bus_device((struct pci_dev *)data);
+	struct legacy_slot *slot = data;
+
+	pci_hotplug_enter();
+	if (!list_empty(&slot->list))
+		pci_stop_and_remove_bus_device(slot->dev);
+	pci_hotplug_exit();
 }
 
 static ssize_t legacy_store(struct kobject *kobj, struct attribute *attr,
@@ -53,10 +68,11 @@ static ssize_t legacy_store(struct kobject *kobj, struct attribute *attr,
 		return -EINVAL;
 
 	if (val)
-		pci_rescan_bus(slot->dev->bus);
+		sysfs_schedule_callback(&slot->kobj, rescan_callback,
+					slot, THIS_MODULE);
 	else
-		sysfs_schedule_callback(&slot->dev->dev.kobj, remove_callback,
-					slot->dev, THIS_MODULE);
+		sysfs_schedule_callback(&slot->kobj, remove_callback,
+					slot, THIS_MODULE);
 	return len;
 }
 
@@ -107,20 +123,25 @@ static int legacy_notify(struct notifier_block *nb,
 	struct pci_dev *pdev = to_pci_dev(data);
 
 	if (action == BUS_NOTIFY_ADD_DEVICE) {
+		pci_hotplug_enter();
 		legacy_add_slot(pdev);
+		pci_hotplug_exit();
 	} else if (action == BUS_NOTIFY_DEL_DEVICE) {
 		struct legacy_slot *slot;
 
+		pci_hotplug_enter();
 		list_for_each_entry(slot, &legacy_list, list)
 			if (slot->dev == pdev)
 				goto found;
 
+		pci_hotplug_exit();
 		dev_warn(&pdev->dev, "Missing legacy fake slot?");
 		return -ENODEV;
 found:
 		kobject_del(&slot->kobj);
-		list_del(&slot->list);
+		list_del_init(&slot->list);
 		kobject_put(&slot->kobj);
+		pci_hotplug_exit();
 	}
 
 	return 0;
@@ -135,11 +156,14 @@ static int __init init_legacy(void)
 	struct pci_dev *pdev = NULL;
 
 	/* Add existing devices */
+	pci_hotplug_disable();
 	for_each_pci_dev(pdev)
 		legacy_add_slot(pdev);
 
 	/* Be alerted of any new ones */
 	bus_register_notifier(&pci_bus_type, &legacy_notifier);
+	pci_hotplug_enable();
+
 	return 0;
 }
 module_init(init_legacy);
@@ -150,11 +174,13 @@ static void __exit remove_legacy(void)
 
 	bus_unregister_notifier(&pci_bus_type, &legacy_notifier);
 
+	pci_hotplug_disable();
 	list_for_each_entry_safe(slot, tmp, &legacy_list, list) {
-		list_del(&slot->list);
+		list_del_init(&slot->list);
 		kobject_del(&slot->kobj);
 		kobject_put(&slot->kobj);
 	}
+	pci_hotplug_enable();
 }
 module_exit(remove_legacy);
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug
  2012-04-16 16:28 [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (16 preceding siblings ...)
  2012-04-16 16:29 ` [PATCH RFC 17/17] PCI: serialize PCI hotplug operations triggered by fakephp drivers Jiang Liu
@ 2012-04-16 21:33 ` Greg KH
  2012-04-17 11:57   ` Jiang Liu
  2012-04-17 14:53   ` Jiang Liu
  17 siblings, 2 replies; 21+ messages in thread
From: Greg KH @ 2012-04-16 21:33 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Yinghai Lu, Kenji Kaneshige, Dely Sy, Scott Murray, Jiang Liu,
	Keping Chen, linux-pci

On Tue, Apr 17, 2012 at 12:28:54AM +0800, Jiang Liu wrote:
> There are multiple ways to trigger PCI hotplug requests concurrently,
> such as:
> 1. Sysfs interfaces exported by the PCI core subsystem

Which ones?

> 2. Sysfs interfaces exported by the PCI hotplug subsystem

Which ones?

> 3. PCI hotplug events triggered by PCI Hotplug Controllers
> 4. ACPI hotplug events for PCI host bridges

Those are both the same.

> 5. Driver binding/unbinding events

Not really a "hotplug" event, that's something that all drivers in the
kernel support.

And in the end, they all propagate down to the driver core to be the
same thing that the PCI driver sees.

> The PCI core subsystem doesn't support concurrent hotplug operations yet,
> so all PCI hotplug requests should be globally serialized.

Why do you think they are not?  These should all be serialized today,
with the bus lock down in the driver core.  How is this failing?

> This patchset
> introduces a global recursive rwsem to serialize all PCI hotplug operations.

Ick, why?  What's wrong with the lock we are already taking?  And why
would you need a rwsem anyway?

> Following PCI hotplug drivers/interfaces have been enhanced with this
> 1. Sysfs interfaces exported by the PCI core subsystem
> 2. Sysfs interfaces exported by the PCI hotplug subsystem
> 3. pciehp
> 4. shpchp
> 5. cpcihp_generic and cpcihp_zt5550
> 6. fakephp

You are doing something wrong if you require this to be fixed up in each
individual pci hotplug driver.  Fix this in the PCI core, if needed.
But again, I don't see why it is needed.

> But there are still several TODOs:
> 1) all other PCI hotplug driver in drivers/pci/hotplug directory
> 2) SR-IOV
> 3) acpiphp (plan to do this based on Yinghai's PCI root bus hotplug gate)
> 4) pci_root (plan to do this based on Yinghai's PCI root bus hotplug gate)
> 
> Basic test has been done as below, will find more hardwares to do more tests.
> Start three scripts on an Intel Atom system to currently execute:
> 1) remove/rescan PCI devices by sysfs interfaces exported by PCI core subsystem
> 2) remove/rescan PCI devices by sysfs interfaces exported by fakephp driver
> 3) load/unload fakephp driver
> The test has run about four hours without failure.

And it fails without this?  How does it?

And really, fakephp?  Come on, what happens in the "real world" with
real pci hotplug systems/devices that this patch set is trying to solve?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug
  2012-04-16 21:33 ` [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Greg KH
@ 2012-04-17 11:57   ` Jiang Liu
  2012-04-17 14:53   ` Jiang Liu
  1 sibling, 0 replies; 21+ messages in thread
From: Jiang Liu @ 2012-04-17 11:57 UTC (permalink / raw)
  To: Greg KH
  Cc: Jiang Liu, Yinghai Lu, Kenji Kaneshige, Dely Sy, Scott Murray,
	Keping Chen, linux-pci

Hi Greg,
     Thanks for your comments! There's some information missed
in the introduction letter, so I will give some background here.
     This patchset was developed based on Yinghai's tree, which
enables PCI root bus hotplug on x86/IA64 platforms. Yinghai's gate
has made following changes:
1)  Move code for PCI host bridge hotplug from acpiphp driver to
     pci_root driver, now acpiphp driver only handles PCI device
     hotplug and pci_root driver handles pci host bridge hotplug.
2)  When attaching pci_root driver to an ACPI host bridge device,
     it will enumerate all PCI devices under that host bridge.
3)  When detaching pci_root driver from an ACPI host bridge device,
     it will destroy all PCI devices under that host bridge.
     With Yinghai's gate, binding/unbinding pci_root driver has
the same effect as PCI root bridge hotplug. So originally this
patchset is to support Yinghai's work.

     Recently I have rebased the patchset to the mainline tree
according to Kenji's suggestion, because the mainline kernel has
the same issues. But I forget to change the comments after rebase,
which leads to misunderstanding here. Sorry!

On 2012-4-17 5:33, Greg KH wrote:
> On Tue, Apr 17, 2012 at 12:28:54AM +0800, Jiang Liu wrote:
>> There are multiple ways to trigger PCI hotplug requests concurrently,
>> such as:
>> 1. Sysfs interfaces exported by the PCI core subsystem
>
> Which ones?
With 3.3 kernel, PCI core logic provides following interfaces for
device hotplug,
/sys/devices/pcissss:bb/ssss:bb:dd.f/.../remove
/sys/devices/pcissss:bb/ssss:bb:dd.f/.../rescan
/sys/devices/pcissss:bb/ssss:bb:dd.f/.../pci_bus/ssss:bb/rescan
/sys/bus/pci/rescan

>
>> 2. Sysfs interfaces exported by the PCI hotplug subsystem
>
> Which ones?
/sys/bus/pci/slots/5/power

>> 3. PCI hotplug events triggered by PCI Hotplug Controllers
>> 4. ACPI hotplug events for PCI host bridges
>
> Those are both the same.
As mentioned above, it will be split into two parts:
acpiphp for PCI device hotplug
pci_root for PCI host bridge hotplug

>
>> 5. Driver binding/unbinding events
>
> Not really a "hotplug" event, that's something that all drivers in the
> kernel support.
>
> And in the end, they all propagate down to the driver core to be the
> same thing that the PCI driver sees.
Here I mean binding/unbinding pci_root driver, which will enumerate or
destroy all PCI devices under the corresponding PCI root bridge.

>
>> The PCI core subsystem doesn't support concurrent hotplug operations yet,
>> so all PCI hotplug requests should be globally serialized.
>
> Why do you think they are not?  These should all be serialized today,
> with the bus lock down in the driver core.  How is this failing?
According to my understanding, pci_bus_sem only protects several lists
in struct pci_bus, such as children, devices, but it doesn't protect
the pci_bus or pci_dev structure themselves.

Let's take pci_remove_bus_device() as an example, which are used by
PCI hotplug drivers to hot-remove PCI devices.
pci_remove_bus_device()
     ->pci_stop_bus_device()
         ->pci_stop_bus_device()
             ->pci_stop_bus_devices()
	    ->pci_stop_dev()
                 ->pci_proc_detach_device()
                 ->pci_remove_sysfs_dev_files()
	        ->device_unregister()
                 ->pcie_aspm_exit_link_state()
     ->__pci_remove_bus_device()
         ->__pci_remove_behind_bridge()
         ->pci_remove_bus()
             ->device_unregister()
         ->pci_destroy_dev()
             ->pci_free_resources()
             ->pci_dev_put()

Currently all these are free running without any protection, so can't
support re-entrance. There are similar issues on hot-adding side.
For example, all these are free running too.
pci_rescan_bus()
     ->pci_scan_child_bus()
         ->pci_scan_slot()
             ->pci_scan_single_device()
                 ->pci_scan_device()

It also may cause trouble if  pci_remove_bus_device() and
pci_rescan_bus() are called concurrently.

So the pci_bus_sem isn't strong enough to protect the PCI core
from concurrently hotplug operations.

>
>> This patchset
>> introduces a global recursive rwsem to serialize all PCI hotplug operations.
>
> Ick, why?  What's wrong with the lock we are already taking?  And why
> would you need a rwsem anyway?
The writer side is for PCI device/root bridge hotplug operations.

The reader side is to disable PCI device/root bridge hotplug
temporarily. For example, pci_find_bus()/pci_find_next_bus() returns
a pci_bus without holding a reference count on that pci_bus structure.
That may cause invalid memory access if the pci_bus is hot-removed.
I have proposed a patchset to fix the issue by holding a reference count
on the returned pci_bus, but Bjorn thought that patchset is too heavy.
So I plan to use the reader lock to protect those critical sections
from PCI hotplug operations.

The recursive lock algorithm is to avoid deadlocks.

>
>> Following PCI hotplug drivers/interfaces have been enhanced with this
>> 1. Sysfs interfaces exported by the PCI core subsystem
>> 2. Sysfs interfaces exported by the PCI hotplug subsystem
>> 3. pciehp
>> 4. shpchp
>> 5. cpcihp_generic and cpcihp_zt5550
>> 6. fakephp
>
> You are doing something wrong if you require this to be fixed up in each
> individual pci hotplug driver.  Fix this in the PCI core, if needed.
> But again, I don't see why it is needed.
This is really a good question. I have thought to solve this issue by
changing the PCI core logic, but it found it's hard by that way.
Essentially we needs some interfaces like pci_branch_lock()/unlock() to
lock/unlock a PCI sub-hierarchy, I feel it's a big task to make branch
lock interface deadlock free. So I did some tradeoff here to use a
global lock to serialize all PCI hotplug operations.

The changes to each PCI hotplug driver are mainly for two reasons:
1) Fix minor bugs/race conditions in existing PCI hotplug drivers.
2) Break a deadlock scenario as below.

Thread A				Thread B/ISR
1) Acquire the global lock
2) Remove a PCI subtree
3)					Hotplug interrupt happens
4)					ISR/worker tries to acquire
                                         the global lock
5) Try to unbind the PCI hotplug driver
6) Wait for the ISR/worker to finish
    tasks
7) Deadlock then

>
>> But there are still several TODOs:
>> 1) all other PCI hotplug driver in drivers/pci/hotplug directory
>> 2) SR-IOV
>> 3) acpiphp (plan to do this based on Yinghai's PCI root bus hotplug gate)
>> 4) pci_root (plan to do this based on Yinghai's PCI root bus hotplug gate)
>>
>> Basic test has been done as below, will find more hardwares to do more tests.
>> Start three scripts on an Intel Atom system to currently execute:
>> 1) remove/rescan PCI devices by sysfs interfaces exported by PCI core subsystem
>> 2) remove/rescan PCI devices by sysfs interfaces exported by fakephp driver
>> 3) load/unload fakephp driver
>> The test has run about four hours without failure.
>
> And it fails without this?  How does it?
Without the patchset, running following scripts concurrently will make
system reboot automatically. It's tested on an IBM x3850 system with
v3.3 kernel.

[root@IBM3850 tests]# cat hotplug.sh
#!/bin/bash
while true; do
         echo 0 > /sys/bus/pci/slots/0000:8b:00.0/power
         echo 1 > /sys/bus/pci/slots/0000:8b:00.1/power
         sleep 0.01
done
[root@IBM3850 tests]# cat sysfs.sh
#!/bin/bash
while true; do
         echo 1 > /sys/devices/pci0000:80/0000:80:03.0/remove
         echo 1 > /sys/bus/pci/rescan
         sleep 0.01
done

>
> And really, fakephp?  Come on, what happens in the "real world" with
> real pci hotplug systems/devices that this patch set is trying to solve?
I'm verifying this patchset at home last night, and have no platform
supporting PCI hotplug at hand, so just use fakephp to prove the
recursive lock algorithm.

Today we have reproduced the issue on a real platform by using
acpiphp driver. It's an IA64 platform running Suse 11SP1 (official
2.6.32.12 kernel). The test script is:

#!/bin/bash

for ((i=0;i<100;i++))
do
	echo 1 > /sys/bus/pci/devices/0000\:43\:00.0/remove
	echo 0 > /sys/bus/pci/slots/3/power
	sleep 1
	echo 1 > /sys/bus/pci/slots/3/power
done

And the bug report is:

------------[ cut here ]------------
WARNING: at fs/sysfs/group.c:138 sysfs_remove_group+0x210/0x240()
Hardware name: H8900
sysfs group a0000001012014f0 not found for kobject '0000:45:00.1'
Modules linked in: acpiphp(N) ipv6(N) cpufreq_conservative(N) 
cpufreq_userspace( 
                                            N) cpufreq_powersave(N) 
acpi_cpufreq(N) binfmt_misc(N) fuse(N) nls_iso8859_1(N) 
 
           loop(N) dm_mod(N) tpm_tis(N) tpm(N) ppdev(N) shpchp(N) 
tpm_bios(N) serio_raw(N) 
                                                   qla2xxx(N) 
i2c_i801(N) scsi_transport_fc(N) pci_hotplug(N) scsi_tgt(N) iTCO_wdt( 
 
                        N) sg(N) iTCO_vendor_support(N) i2c_core(N) 
mptctl(N) igb(N) parport_pc(N) parpo 
                                                              rt(N) 
button(N) container(N) usbhid(N) hid(N) uhci_hcd(N) ehci_hcd(N) 
usbcore(N) 
                                     sd_mod(N) crc_t10dif(N) ext3(N) 
mbcache(N) jbd(N) fan(N) processor(N) ide_pci_g 
 
  eneric(N) ide_core(N) ata_piix(N) libata(N) mptsas(N) mptscsih(N) 
mptbase(N) scs 
                                        i_transport_sas(N) scsi_mod(N) 
thermal(N) thermal_sys(N) hwmon(N)
Supported: Yes

Call Trace:
  [<a000000100017640>] show_stack+0x80/0xa0
                                 sp=e000002f4421fc00 bsp=e000002f44211678
  [<a0000001008cfd10>] dump_stack+0x30/0x50
                                 sp=e000002f4421fdd0 bsp=e000002f44211660
  [<a0000001000b9bc0>] warn_slowpath_common+0xc0/0x120
                                 sp=e000002f4421fdd0 bsp=e000002f44211628
  [<a0000001000b9d10>] warn_slowpath_fmt+0x90/0xc0
                                 sp=e000002f4421fdd0 bsp=e000002f442115c0
  [<a000000100331690>] sysfs_remove_group+0x210/0x240
                                 sp=e000002f4421fe10 bsp=e000002f44211590
  [<a000000100636190>] dpm_sysfs_remove+0x30/0x60
                                 sp=e000002f4421fe10 bsp=e000002f44211570
  [<a0000001006236c0>] device_del+0x80/0x460
                                 sp=e000002f4421fe10 bsp=e000002f44211528
  [<a000000100623ae0>] device_unregister+0x40/0x140
                                 sp=e000002f4421fe10 bsp=e000002f44211508
  [<a0000001004d2320>] pci_stop_bus_device+0x160/0x200
                                 sp=e000002f4421fe10 bsp=e000002f442114d8
  [<a000000223104e70>] acpiphp_disable_slot+0x170/0x580 [acpiphp]
                                 sp=e000002f4421fe10 bsp=e000002f44211470
  [<a000000223100b70>] disable_slot+0x50/0x160 [acpiphp]
                                 sp=e000002f4421fe20 bsp=e000002f44211448
  [<a00000021e960e60>] power_write_file+0x240/0x340 [pci_hotplug]
                                 sp=e000002f4421fe20 bsp=e000002f44211418
  [<a0000001004e5e00>] pci_slot_attr_store+0x60/0xa0
                                 sp=e000002f4421fe20 bsp=e000002f442113d8
  [<a00000010032a260>] sysfs_write_file+0x240/0x340
                                 sp=e000002f4421fe20 bsp=e000002f44211380
  [<a000000100232910>] vfs_write+0x1b0/0x3c0
                                 sp=e000002f4421fe20 bsp=e000002f44211330
  [<a000000100232ce0>] sys_write+0x80/0x100
                                 sp=e000002f4421fe20 bsp=e000002f442112b8
  [<a00000010000c9c0>] ia64_ret_from_syscall+0x0/0x20
                                 sp=e000002f4421fe30 bsp=e000002f442112b8
  [<a000000000010720>] __kernel_syscall_via_break+0x0/0x20
                                 sp=e000002f44220000 bsp=e000002f442112b8
---[ end trace bd659e9a3f4f6279 ]---
offline_pci.sh[6450]: NaT consumption 17179869216 [1]
Modules linked in: acpiphp(N) ipv6(N) cpufreq_conservative(N) 
cpufreq_userspace( 
                                            N) cpufreq_powersave(N) 
acpi_cpufreq(N) binfmt_misc(N) fuse(N) nls_iso8859_1(N) 
 
           loop(N) dm_mod(N) tpm_tis(N) tpm(N) ppdev(N) shpchp(N) 
tpm_bios(N) serio_raw(N) 
                                                   qla2xxx(N) 
i2c_i801(N) scsi_transport_fc(N) pci_hotplug(N) scsi_tgt(N) iTCO_wdt( 
 
                        N) sg(N) iTCO_vendor_support(N) i2c_core(N) 
mptctl(N) igb(N) parport_pc(N) parpo 
                                                              rt(N) 
button(N) container(N) usbhid(N) hid(N) uhci_hcd(N) ehci_hcd(N) 
usbcore(N) 
                                     sd_mod(N) crc_t10dif(N) ext3(N) 
mbcache(N) jbd(N) fan(N) processor(N) ide_pci_g 
 
  eneric(N) ide_core(N) ata_piix(N) libata(N) mptsas(N) mptscsih(N) 
mptbase(N) scs 
                                        i_transport_sas(N) scsi_mod(N) 
thermal(N) thermal_sys(N) hwmon(N)
Supported: Yes

Pid: 6450, CPU 11, comm:       offline_pci.sh
psr : 0000101009526030 ifs : 8000000000000389 ip  : [<a0000001008a9870>] 
    Tain 
                                  ted: G        W N  (2.6.32.12-yyz)
ip is at klist_put+0x30/0x160
unat: 0000000000000000 pfs : 0000000000000206 rsc : 0000000000000003
rnat: 8000000000000711 bsps: 0000000000000000 pr  : 65519aa656999969
ldrs: 0000000000000000 ccv : 0000000040000000 fpsr: 0009804c0270033f
csd : 0000000000000000 ssd : 0000000000000000
b0  : a0000001008a9a50 b6  : a0000001004b1320 b7  : a00000010000d170
qla2xxx 0000:45:00.1: PCI INT B disabled
f6  : 000000000000000000000 f7  : 1003e9e3779b97f4a7c16
f8  : 1003e0a00000000001072 f9  : 1003effffffffffffffee
f10 : 1003e0000000000000023 f11 : 1003e8208208208208209
r1  : a0000001015c8460 r2  : 0000000000000000 r3  : a0000001013e75b0
r8  : 0000000000000001 r9  : a0000001013e75b0 r10 : a0000001013e8ed8
r11 : 0000000000000000 r12 : e000002f4421fe10 r13 : e000002f44210000
r14 : 0000000000000020 r15 : 0000000000004000 r16 : 0000000000000009
r17 : 0000000000000200 r18 : 0000000040000000 r19 : 0000000040000000
r20 : 0000000040000200 r21 : 0000000040000000 r22 : 000000000001ae13
r23 : 0000000000100000 r24 : a0000001029780f0 r25 : 000000000001ae10
r26 : 000000000001ae10 r27 : 0000000000100000 r28 : 0000000000000034
r29 : 0000000000000034 r30 : a0000001029780f1 r31 : 000000000001ae11

Call Trace:
  [<a000000100017640>] show_stack+0x80/0xa0
                                 sp=e000002f4421f850 bsp=e000002f442116f8
  [<a000000100017ca0>] show_regs+0x640/0x920
                                 sp=e000002f4421fa20 bsp=e000002f442116a0
  [<a000000100028c70>] die+0x190/0x2e0
                                 sp=e000002f4421fa30 bsp=e000002f44211660
  [<a000000100028e10>] die_if_kernel+0x50/0x80
                                 sp=e000002f4421fa30 bsp=e000002f44211630
  [<a0000001008d8d70>] ia64_fault+0xf0/0x1640
                                 sp=e000002f4421fa30 bsp=e000002f442115d8
  [<a00000010000cb60>] ia64_native_leave_kernel+0x0/0x270
                                 sp=e000002f4421fc40 bsp=e000002f442115d8
  [<a0000001008a9870>] klist_put+0x30/0x160
                                 sp=e000002f4421fe10 bsp=e000002f44211590
  [<a0000001008a9a50>] klist_del+0x30/0x60
                                 sp=e000002f4421fe10 bsp=e000002f44211570
  [<a0000001006236e0>] device_del+0xa0/0x460
                                 sp=e000002f4421fe10 bsp=e000002f44211528
  [<a000000100623ae0>] device_unregister+0x40/0x140
                                 sp=e000002f4421fe10 bsp=e000002f44211508
  [<a0000001004d2320>] pci_stop_bus_device+0x160/0x200
                                 sp=e000002f4421fe10 bsp=e000002f442114d8
  [<a000000223104e70>] acpiphp_disable_slot+0x170/0x580 [acpiphp]
                                 sp=e000002f4421fe10 bsp=e000002f44211470
  [<a000000223100b70>] disable_slot+0x50/0x160 [acpiphp]
                                 sp=e000002f4421fe20 bsp=e000002f44211448
  [<a00000021e960e60>] power_write_file+0x240/0x340 [pci_hotplug]
                                 sp=e000002f4421fe20 bsp=e000002f44211418
  [<a0000001004e5e00>] pci_slot_attr_store+0x60/0xa0
                                 sp=e000002f4421fe20 bsp=e000002f442113d8
  [<a00000010032a260>] sysfs_write_file+0x240/0x340
                                 sp=e000002f4421fe20 bsp=e000002f44211380
  [<a000000100232910>] vfs_write+0x1b0/0x3c0
                                 sp=e000002f4421fe20 bsp=e000002f44211330
  [<a000000100232ce0>] sys_write+0x80/0x100
                                 sp=e000002f4421fe20 bsp=e000002f442112b8
  [<a00000010000c9c0>] ia64_ret_from_syscall+0x0/0x20
                                 sp=e000002f4421fe30 bsp=e000002f442112b8
  [<a000000000010720>] __kernel_syscall_via_break+0x0/0x20
                                 sp=e000002f44220000 bsp=e000002f442112b8
Disabling lock debugging due to kernel taint

The second case is to test fakephp with following scripts
#!/bin/bash
while true; do
     echo 0 > /sys/bus/pci/slots/0000:8b:00.0/power
     echo 1 > /sys/bus/pci/slots/0000:8b:00.1/power
     sleep 0.01
done

>
> thanks,
>
> greg k-h
>
> .
>



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug
  2012-04-16 21:33 ` [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Greg KH
  2012-04-17 11:57   ` Jiang Liu
@ 2012-04-17 14:53   ` Jiang Liu
  1 sibling, 0 replies; 21+ messages in thread
From: Jiang Liu @ 2012-04-17 14:53 UTC (permalink / raw)
  To: Greg KH
  Cc: Yinghai Lu, Kenji Kaneshige, Dely Sy, Scott Murray, Jiang Liu,
	Keping Chen, linux-pci

Hi Greg,
	More logs for your reference below.
	Thanks!
	gerry

On 04/17/2012 05:33 AM, Greg KH wrote:
> On Tue, Apr 17, 2012 at 12:28:54AM +0800, Jiang Liu wrote:
>> There are multiple ways to trigger PCI hotplug requests concurrently,
>> such as:
>> 1. Sysfs interfaces exported by the PCI core subsystem
> 
> Which ones?
> 
>> 2. Sysfs interfaces exported by the PCI hotplug subsystem
> 
> Which ones?
> 
>> 3. PCI hotplug events triggered by PCI Hotplug Controllers
>> 4. ACPI hotplug events for PCI host bridges
> 
> Those are both the same.
> 
>> 5. Driver binding/unbinding events
> 
> Not really a "hotplug" event, that's something that all drivers in the
> kernel support.
> 
> And in the end, they all propagate down to the driver core to be the
> same thing that the PCI driver sees.
> 
>> The PCI core subsystem doesn't support concurrent hotplug operations yet,
>> so all PCI hotplug requests should be globally serialized.
> 
> Why do you think they are not?  These should all be serialized today,
> with the bus lock down in the driver core.  How is this failing?
> 
>> This patchset
>> introduces a global recursive rwsem to serialize all PCI hotplug operations.
> 
> Ick, why?  What's wrong with the lock we are already taking?  And why
> would you need a rwsem anyway?
> 
>> Following PCI hotplug drivers/interfaces have been enhanced with this
>> 1. Sysfs interfaces exported by the PCI core subsystem
>> 2. Sysfs interfaces exported by the PCI hotplug subsystem
>> 3. pciehp
>> 4. shpchp
>> 5. cpcihp_generic and cpcihp_zt5550
>> 6. fakephp
> 
> You are doing something wrong if you require this to be fixed up in each
> individual pci hotplug driver.  Fix this in the PCI core, if needed.
> But again, I don't see why it is needed.
> 
>> But there are still several TODOs:
>> 1) all other PCI hotplug driver in drivers/pci/hotplug directory
>> 2) SR-IOV
>> 3) acpiphp (plan to do this based on Yinghai's PCI root bus hotplug gate)
>> 4) pci_root (plan to do this based on Yinghai's PCI root bus hotplug gate)
>>
>> Basic test has been done as below, will find more hardwares to do more tests.
>> Start three scripts on an Intel Atom system to currently execute:
>> 1) remove/rescan PCI devices by sysfs interfaces exported by PCI core subsystem
>> 2) remove/rescan PCI devices by sysfs interfaces exported by fakephp driver
>> 3) load/unload fakephp driver
>> The test has run about four hours without failure.
> 
> And it fails without this?  How does it?
It's generated by executing following two scripts concurrently.
gerry@cat:~/tests$ cat hotplug 
#!/bin/bash

while true; do
        echo 0 > /sys/bus/pci/slots/0000\:00\:1c.0/power
        echo 0 > /sys/bus/pci/slots/0000\:00\:1c.1/power
        echo 0 > /sys/bus/pci/slots/0000\:00\:1c.2/power
        echo 1 > /sys/bus/pci/slots/0000\:00\:1c.3/power
        sleep 0.01
done;
gerry@cat:~/tests$ cat sysfs 
#!/bin/bash

while true; do
        echo 1 > /sys/devices/pci0000:00/0000:00:1c.0/remove
        echo 1 > /sys/devices/pci0000:00/0000:00:1c.1/remove
        echo 1 > /sys/devices/pci0000:00/0000:00:1c.2/remove
        echo 1 > /sys/devices/pci0000:00/pci_bus/0000:00/rescan
        sleep 0.01
done;


[  431.767731] ------------[ cut here ]------------
[  431.767744] WARNING: at fs/sysfs/dir.c:508 sysfs_add_one+0xb8/0xe0()
[  431.767749] Hardware name: To Be Filled By O.E.M.
[  431.767754] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:1c.2'
[  431.767759] Modules linked in: shpchp fakephp r8169
[  431.767774] Pid: 3276, comm: hotplug Tainted: G      D W    3.4.0-rc2+ #20
[  431.767779] Call Trace:
[  431.767791]  [<ffffffff81036eea>] warn_slowpath_common+0x7a/0xb0
[  431.767800]  [<ffffffff81036fc1>] warn_slowpath_fmt+0x41/0x50
[  431.767808]  [<ffffffff811a6c68>] sysfs_add_one+0xb8/0xe0
[  431.767817]  [<ffffffff811a6df6>] create_dir+0x76/0xd0
[  431.767825]  [<ffffffff811a719e>] sysfs_create_dir+0x7e/0xc0
[  431.767836]  [<ffffffff812da298>] kobject_add_internal+0xb8/0x210
[  431.767846]  [<ffffffff812da767>] kobject_add+0x67/0xc0
[  431.767856]  [<ffffffff817541bc>] ? klist_init+0x3c/0x60
[  431.767866]  [<ffffffff813f590d>] device_add+0xed/0x680
[  431.767875]  [<ffffffff812f632f>] pci_bus_add_device+0x1f/0x50
[  431.767884]  [<ffffffff812f6541>] pci_bus_add_devices+0x41/0x130
[  431.767893]  [<ffffffff81757bf7>] pci_rescan_bus+0xa7/0xc0
[  431.767903]  [<ffffffffa000e066>] legacy_store+0x66/0x80 [fakephp]
[  431.767913]  [<ffffffff811a517e>] ? sysfs_write_file+0xde/0x180
[  431.767922]  [<ffffffff811a5197>] sysfs_write_file+0xf7/0x180
[  431.767932]  [<ffffffff811347e1>] vfs_write+0xb1/0x180
[  431.767941]  [<ffffffff81134b08>] sys_write+0x48/0x90
[  431.767950]  [<ffffffff8178a0e2>] system_call_fastpath+0x16/0x1b
[  431.767957] ---[ end trace f99f468d766f03f8 ]---
[  431.767996] kobject_add_internal failed for 0000:00:1c.2 with -EEXIST, don't try to register things with the same n.
[  431.768060] Pid: 3276, comm: hotplug Tainted: G      D W    3.4.0-rc2+ #20
[  431.768066] Call Trace:
[  431.768077]  [<ffffffff812da33c>] kobject_add_internal+0x15c/0x210
[  431.768085]  [<ffffffff812da767>] kobject_add+0x67/0xc0
[  431.768093]  [<ffffffff817541bc>] ? klist_init+0x3c/0x60
[  431.768102]  [<ffffffff813f590d>] device_add+0xed/0x680
[  431.768111]  [<ffffffff812f632f>] pci_bus_add_device+0x1f/0x50
[  431.768120]  [<ffffffff812f6541>] pci_bus_add_devices+0x41/0x130
[  431.768129]  [<ffffffff81757bf7>] pci_rescan_bus+0xa7/0xc0
[  431.768140]  [<ffffffffa000e066>] legacy_store+0x66/0x80 [fakephp]
[  431.768150]  [<ffffffff811a517e>] ? sysfs_write_file+0xde/0x180
[  431.768160]  [<ffffffff811a5197>] sysfs_write_file+0xf7/0x180
[  431.768169]  [<ffffffff811347e1>] vfs_write+0xb1/0x180
[  431.768178]  [<ffffffff81134b08>] sys_write+0x48/0x90
[  431.768187]  [<ffffffff8178a0e2>] system_call_fastpath+0x16/0x1b
[  431.768205] pci 0000:00:1c.2: Error adding device, continuing


[  431.768229] ------------[ cut here ]------------
[  431.768234] kernel BUG at drivers/pci/bus.c:230!
[  431.768240] invalid opcode: 0000 [#2] SMP 
[  431.768249] CPU 1 
[  431.768252] Modules linked in: shpchp fakephp r8169
[  431.768266] 
[  431.768272] Pid: 3276, comm: hotplug Tainted: G      D W    3.4.0-rc2+ #20 To Be Filled By O.E.M. To Be Filled By O.
[  431.768288] RIP: 0010:[<ffffffff812f6628>]  [<ffffffff812f6628>] pci_bus_add_devices+0x128/0x130
[  431.768300] RSP: 0018:ffff880037aabe08  EFLAGS: 00010246
[  431.768306] RAX: 0000000000000047 RBX: ffff88003cac4800 RCX: 0000000000000001
[  431.768312] RDX: ffffffff81037f09 RSI: 0000000000000001 RDI: ffff88003cbf4c00
[  431.768319] RBP: ffff880037aabe28 R08: 0000000000000001 R09: 0000000000000000
[  431.768325] R10: 0000000000000000 R11: 0000000000000001 R12: ffff88003cbf4428
[  431.768332] R13: ffff88003cbf4c00 R14: ffff88003cbf4428 R15: ffff88003cbf4428
[  431.768339] FS:  00007f4c8d018720(0000) GS:ffff88003d800000(0000) knlGS:0000000000000000
[  431.768345] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  431.768351] CR2: 000000000046f0e0 CR3: 000000003052b000 CR4: 00000000000007e0
[  431.768357] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  431.768364] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  431.768370] Process hotplug (pid: 3276, threadinfo ffff880037aaa000, task ffff88003aae0000)
[  431.768376] Stack:
[  431.768380]  ffff880037aabe28 ffff88003cbf4400 ffff880037aabe38 0000000000000005
[  431.768394]  ffff880037aabe78 ffffffff81757bf7 ffff880037aabe38 ffff880037aabe38
[  431.768407]  0000000000000000 0000000000000002 ffff88003064b2a0 ffff88002fabac80
[  431.768421] Call Trace:
[  431.768431]  [<ffffffff81757bf7>] pci_rescan_bus+0xa7/0xc0
[  431.768442]  [<ffffffffa000e066>] legacy_store+0x66/0x80 [fakephp]
[  431.768452]  [<ffffffff811a517e>] ? sysfs_write_file+0xde/0x180
[  431.768462]  [<ffffffff811a5197>] sysfs_write_file+0xf7/0x180
[  431.768472]  [<ffffffff811347e1>] vfs_write+0xb1/0x180
[  431.768481]  [<ffffffff81134b08>] sys_write+0x48/0x90
[  431.768491]  [<ffffffff8178a0e2>] system_call_fastpath+0x16/0x1b
[  431.768496] Code: 8b 43 10 48 c7 c7 c0 fe c4 81 48 8b 50 
[  431.768537] i915 0000:00:02.0: BAR 6: [??? 0x00000000 flags 0x2] has bogus alignment
[  431.768544] 20 4c 89 68 20 48 83 c0 18 49 89 45 00 49 89 55 08 4c 89 2a e8 2d 91 d6 ff e9 74 ff ff ff <0f> 0b 90 90 
[  431.768623] RIP  [<ffffffff812f6628>] pci_bus_add_devices+0x128/0x130
[  431.768633]  RSP <ffff880037aabe08>
[  431.768640] ---[ end trace f99f468d766f03f9 ]---


[  266.858024] Pid: 864, comm: kworker/u:3 Tainted: G        W    3.4.0-rc2+ #20 To Be Filled By O.E.M. To Be Filled B.
[  266.858024] RIP: 0010:[<ffffffff81753e78>]  [<ffffffff81753e78>] klist_put+0x28/0xa0
[  266.858024] RSP: 0018:ffff88003b301c70  EFLAGS: 00010246
[  266.858024] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000001
[  266.858024] RDX: ffffffff81037b65 RSI: 0000000000000001 RDI: 0000000000000000
[  266.858024] RBP: ffff88003b301c90 R08: 0000000000000001 R09: 0000000000000000
[  266.858024] R10: 0000000000000000 R11: 0000000000000001 R12: ffff88003ca60668
[  266.858024] R13: ffff88003c40c828 R14: 0000000000000001 R15: ffffffff811a5220
[  266.858024] FS:  0000000000000000(0000) GS:ffff88003da00000(0000) knlGS:0000000000000000
[  266.858024] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  266.858024] CR2: 0000000000000060 CR3: 0000000001c0b000 CR4: 00000000000007e0
[  266.858024] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  266.858024] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  266.858024] Process kworker/u:3 (pid: 864, threadinfo ffff88003b300000, task ffff88003ca5df00)
[  266.858024] Stack:
[  266.858024]  ffff88003c429890 ffff88003c40c400 ffff88003c40c828 ffffffff81fa8840
[  266.858024]  ffff88003b301ca0 ffffffff81753f2e ffff88003b301cd0 ffffffff813f4fc9
[  266.858024]  ffff88003b301cd0 ffff88003c429890 ffff88003c40c828 ffff88003c40c828
[  266.858024] Call Trace:
[  266.858024]  [<ffffffff81753f2e>] klist_del+0xe/0x10
[  266.858024]  [<ffffffff813f4fc9>] device_del+0x59/0x1c0
[  266.858024]  [<ffffffff813f5141>] device_unregister+0x11/0x20
[  266.858024]  [<ffffffff812f7f9c>] pci_stop_bus_device+0x8c/0xa0
[  266.858024]  [<ffffffff812f8141>] pci_stop_and_remove_bus_device+0x11/0x20
[  266.858024]  [<ffffffff812fe946>] remove_callback+0x26/0x40
[  266.858024]  [<ffffffff811a5233>] sysfs_schedule_callback_work+0x13/0x80
[  266.858024]  [<ffffffff81053462>] process_one_work+0x192/0x570
[  266.858024]  [<ffffffff810533f6>] ? process_one_work+0x126/0x570
[  266.858024]  [<ffffffff81054e7f>] worker_thread+0x15f/0x350
[  266.858024]  [<ffffffff81054d20>] ? manage_workers.isra.27+0x220/0x220
[  266.858024]  [<ffffffff81059f4d>] kthread+0x9d/0xb0
[  266.858024]  [<ffffffff8178b3d4>] kernel_thread_helper+0x4/0x10
[  266.858024]  [<ffffffff81059eb0>] ? __init_kthread_worker+0x70/0x70
[  266.858024]  [<ffffffff8178b3d0>] ? gs_change+0xb/0xb
[  266.858024] Code: 5d c3 90 55 48 89 e5 48 83 ec 20 4c 89 65 e8 4c 89 75 f8 49 89 fc 48 89 5d e0 4c 89 6d f0 41 89 f 
[  266.858024] RIP  [<ffffffff81753e78>] klist_put+0x28/0xa0
[  266.858024]  RSP <ffff88003b301c70>
[  266.858024] CR2: 0000000000000060
[  266.858458] ---[ end trace 7358104716347b8e ]---


[  266.860122] BUG: unable to handle kernel paging request at fffffffffffffff8
[  266.860137] IP: [<ffffffff8105a41b>] kthread_data+0xb/0x20
[  266.860155] PGD 1c0d067 PUD 1c0e067 PMD 0 
[  266.860170] Oops: 0000 [#2] SMP 
[  266.860183] CPU 2 
[  266.860188] Modules linked in: fakephp r8169
[  266.860201] 
[  266.860210] Pid: 864, comm: kworker/u:3 Tainted: G      D W    3.4.0-rc2+ #20 To Be Filled By O.E.M. To Be Filled B.
[  266.860228] RIP: 0010:[<ffffffff8105a41b>]  [<ffffffff8105a41b>] kthread_data+0xb/0x20
[  266.860244] RSP: 0018:ffff88003b301868  EFLAGS: 00010096
[  266.860251] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000002
[  266.860259] RDX: ffffffff81fa9440 RSI: 0000000000000002 RDI: ffff88003ca5df00
[  266.860267] RBP: ffff88003b301868 R08: 0000000000989680 R09: 0000000000000000
[  266.860274] R10: 0000000000000400 R11: 0000000000000003 R12: 0000000000000002
[  266.860283] R13: ffff88003ca5e278 R14: ffff88003c9b8000 R15: ffff88003ca5e180
[  266.860291] FS:  0000000000000000(0000) GS:ffff88003da00000(0000) knlGS:0000000000000000
[  266.860300] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  266.860307] CR2: fffffffffffffff8 CR3: 00000000304dc000 CR4: 00000000000007e0
[  266.860315] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  266.860322] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  266.860331] Process kworker/u:3 (pid: 864, threadinfo ffff88003b300000, task ffff88003ca5df00)
[  266.860338] Stack:
[  266.860344]  ffff88003b301888 ffffffff81055810 ffff88003b301888 ffff88003dbd2900
[  266.860364]  ffff88003b301908 ffffffff81780878 ffff880000000000 ffffffff810bda82
[  266.860380]  ffff88003b301fd8 ffff88003ca5df00 ffff88003b301fd8 ffff88003b301fd8
[  266.860397] Call Trace:
[  266.860413]  [<ffffffff81055810>] wq_worker_sleeping+0x10/0xa0
[  266.860428]  [<ffffffff81780878>] __schedule+0x538/0x7c0
[  266.860443]  [<ffffffff810bda82>] ? call_rcu_sched+0x12/0x20
[  266.860456]  [<ffffffff81780de4>] schedule+0x24/0x70
[  266.860470]  [<ffffffff8103b8b0>] do_exit+0x600/0x9d0
[  266.860483]  [<ffffffff81039065>] ? kmsg_dump+0x105/0x160
[  266.860496]  [<ffffffff817834ae>] oops_end+0x9e/0xe0
[  266.860507]  [<ffffffff81037f09>] ? vprintk+0x329/0x510
[  266.860520]  [<ffffffff81774c5e>] no_context+0x271/0x280
[  266.860532]  [<ffffffff81774e33>] __bad_area_nosemaphore+0x1c6/0x1e5
[  266.860545]  [<ffffffff81037b65>] ? console_unlock+0x1e5/0x260
[  266.860558]  [<ffffffff81774e60>] bad_area_nosemaphore+0xe/0x10
[  266.860571]  [<ffffffff81785dfe>] do_page_fault+0x30e/0x500
[  266.860586]  [<ffffffff811a8e9f>] ? sysfs_remove_group+0xdf/0xf0
[  266.860598]  [<ffffffff81775339>] ? printk+0x3c/0x3e
[  266.860613]  [<ffffffff811a5220>] ? sysfs_write_file+0x180/0x180
[  266.860626]  [<ffffffff817829bf>] page_fault+0x1f/0x30
[  266.860638]  [<ffffffff811a5220>] ? sysfs_write_file+0x180/0x180
[  266.860652]  [<ffffffff81037b65>] ? console_unlock+0x1e5/0x260
[  266.860664]  [<ffffffff81753e78>] ? klist_put+0x28/0xa0
[  266.860676]  [<ffffffff81753f2e>] klist_del+0xe/0x10
[  266.860690]  [<ffffffff813f4fc9>] device_del+0x59/0x1c0
[  266.860703]  [<ffffffff813f5141>] device_unregister+0x11/0x20
[  266.860716]  [<ffffffff812f7f9c>] pci_stop_bus_device+0x8c/0xa0
[  266.860729]  [<ffffffff812f8141>] pci_stop_and_remove_bus_device+0x11/0x20
[  266.860741]  [<ffffffff812fe946>] remove_callback+0x26/0x40
[  266.860754]  [<ffffffff811a5233>] sysfs_schedule_callback_work+0x13/0x80
[  266.860769]  [<ffffffff81053462>] process_one_work+0x192/0x570
[  266.860781]  [<ffffffff810533f6>] ? process_one_work+0x126/0x570
[  266.860795]  [<ffffffff81054e7f>] worker_thread+0x15f/0x350
[  266.860808]  [<ffffffff81054d20>] ? manage_workers.isra.27+0x220/0x220
[  266.860821]  [<ffffffff81059f4d>] kthread+0x9d/0xb0
[  266.860834]  [<ffffffff8178b3d4>] kernel_thread_helper+0x4/0x10
[  266.860846]  [<ffffffff81059eb0>] ? __init_kthread_worker+0x70/0x70
[  266.860857]  [<ffffffff8178b3d0>] ? gs_change+0xb/0xb
[  266.860863] Code: eb 90 be 57 01 00 00 48 c7 c7 96 17 a1 81 e8 1d cb fd ff e9 77 fe ff ff 0f 1f 84 00 00 00 00 00 4 
[  266.861014] RIP  [<ffffffff8105a41b>] kthread_data+0xb/0x20
[  266.861014]  RSP <ffff88003b301868>
[  266.861014] CR2: fffffffffffffff8
[  266.861014] ---[ end trace 7358104716347b8f ]---
[  266.861014] Fixing recursive fault but reboot is needed!


> 
> And really, fakephp?  Come on, what happens in the "real world" with
> real pci hotplug systems/devices that this patch set is trying to solve?
> 
> thanks,
> 
> greg k-h


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2012-04-17 14:53 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-04-16 16:28 [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Jiang Liu
2012-04-16 16:28 ` [PATCH RFC 01/17] PCI: introduce pci_bus_get()/pci_bus_put() to hide PCI implementation details Jiang Liu
2012-04-16 16:28 ` [PATCH RFC 02/17] PCI: introduce recursive rwsem to serialize PCI hotplug operations Jiang Liu
2012-04-16 16:28 ` [PATCH RFC 03/17] PCI: replace pci_remove_rescan_mutex with the PCI hotplug lock Jiang Liu
2012-04-16 16:28 ` [PATCH RFC 04/17] PCI: serialize hotplug operations triggered by PCI hotplug sysfs interfaces Jiang Liu
2012-04-16 16:28 ` [PATCH RFC 05/17] PCI: correctly flush workqueue when destroy pcie hotplug controller Jiang Liu
2012-04-16 16:29 ` [PATCH RFC 06/17] PCI: prepare for serializing hotplug operations triggered by pciehp driver Jiang Liu
2012-04-16 16:29 ` [PATCH RFC 07/17] PCI: serialize hotplug operaitons triggered by the " Jiang Liu
2012-04-16 16:29 ` [PATCH RFC 08/17] PCI: fix two race windows when probing/removing SHPC controller Jiang Liu
2012-04-16 16:29 ` [PATCH RFC 09/17] PCI: correctly flush workqueues and timer when destroy " Jiang Liu
2012-04-16 16:29 ` [PATCH RFC 10/17] PCI: serialize hotplug operaitons triggered by the shpchp driver Jiang Liu
2012-04-16 16:29 ` [PATCH RFC 11/17] PCI: release IO resource in error handling path in cpcihp_generic_init() Jiang Liu
2012-04-16 16:29 ` [PATCH RFC 12/17] PCI: clean up all resources in error handling path in zt5550_hc_init_one() Jiang Liu
2012-04-16 16:29 ` [PATCH RFC 13/17] PCI: trivial code clean up in cpci_hotplug_core.c Jiang Liu
2012-04-16 16:29 ` [PATCH RFC 14/17] PCI: fix race windows when shutting down cpcihp controller Jiang Liu
2012-04-16 16:29 ` [PATCH RFC 15/17] PCI: hold a reference count to the PCI bus used by cpcihp drivers Jiang Liu
2012-04-16 16:29 ` [PATCH RFC 16/17] PCI: serialize PCI hotplug operations triggered " Jiang Liu
2012-04-16 16:29 ` [PATCH RFC 17/17] PCI: serialize PCI hotplug operations triggered by fakephp drivers Jiang Liu
2012-04-16 21:33 ` [PATCH RFC 00/17] Introduce a global lock to serialize all PCI hotplug Greg KH
2012-04-17 11:57   ` Jiang Liu
2012-04-17 14:53   ` Jiang Liu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.