linux-pci.vger.kernel.org archive mirror
* [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations
@ 2012-08-07 16:10 Jiang Liu
  2012-08-07 16:10 ` [RFC PATCH v1 01/22] PCI: use pci_get_domain_bus_and_slot() to avoid race conditions Jiang Liu
                   ` (22 more replies)
  0 siblings, 23 replies; 51+ messages in thread
From: Jiang Liu @ 2012-08-07 16:10 UTC (permalink / raw)
  To: Bjorn Helgaas, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige
  Cc: Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci

From: Jiang Liu <liuj97@gmail.com>

This is the second attempt at resolving race conditions when hot-plugging PCI
devices/host bridges. Instead of using a global lock to serialize all hotplug
operations as in the previous version, this version introduces a state machine
and a bit lock mechanism for PCI buses to serialize hotplug operations. For
discussion of the previous version, please refer to:
http://comments.gmane.org/gmane.linux.kernel.pci/15007

This patch set is still at an early stage, so I'm sending it out to request
comments. Any comments are welcome, especially on whether this is the
right way to solve these race conditions.

patch 1-5:
	Prepare for the upcoming PCI bus lock.
patch 6-7:
	Core of the new PCI bus lock mechanism.
patch 8-13:
	Enhance the PCI core to support the PCI bus lock mechanism.
patch 14-18:
	Enhance several PCI hotplug drivers to use the PCI bus lock to
	serialize hotplug operations.
patch 19-20:
	Enable the PCI bus lock mechanism for x86 and IA64. Other
	architectures still need to be enabled.
patch 21-22:
	Clean up unused code.

There are multiple methods to trigger PCI hotplug requests/operations
concurrently, such as:
1. Sysfs interfaces exported by the PCI core subsystem
	/sys/devices/pcissss:bb/ssss:bb:dd.f/.../remove
	/sys/devices/pcissss:bb/ssss:bb:dd.f/.../rescan
	/sys/devices/pcissss:bb/ssss:bb:dd.f/.../pci_bus/ssss:bb/rescan
	/sys/bus/pci/rescan
2. Sysfs interfaces exported by the PCI hotplug subsystem
	/sys/bus/pci/slots/xx/power
3. PCI hotplug events triggered by PCI Hotplug Controllers
4. ACPI hotplug events for PCI host bridges
5. Driver binding/unbinding events
	binding/unbinding pci drivers with SR-IOV support

With the current implementation, the PCI core subsystem doesn't yet support
concurrent hotplug operations. The existing pci_bus_sem lock only
protects several lists in struct pci_bus, such as the children and
devices lists; it doesn't protect the pci_bus or pci_dev structures
themselves.

Take pci_remove_bus_device() as an example, which is used by
PCI hotplug drivers to hot-remove PCI devices.  Currently the whole call
chain below runs without any protection, so it is not reentrant.
pci_remove_bus_device()
    ->pci_stop_bus_device()
        ->pci_stop_bus_device()
            ->pci_stop_bus_devices()
        ->pci_stop_dev()

Jiang Liu (22):
  PCI: use pci_get_domain_bus_and_slot() to avoid race conditions
  PCI: trivial cleanups for drivers/pci/remove.c
  PCI: change PCI device management code to better follow device model
  PCI: split PCI bus device registration into two stages
  PCI: introduce pci_bus_{get|put}() to manage PCI bus reference count
  PCI: use a global lock to serialize PCI root bridge hotplug
    operations
  PCI: introduce PCI bus lock to serialize PCI hotplug operations
  PCI: introduce hotplug safe search interfaces for PCI bus/device
  PCI: enhance PCI probe logic to support PCI bus lock mechanism
  PCI: enhance PCI bus specific logic to support PCI bus lock mechanism
  PCI: enhance PCI resource assignment logic to support PCI bus lock
    mechanism
  PCI: enhance PCI remove logic to support PCI bus lock mechanism
  PCI: make each PCI device hold a reference to its parent PCI bus
  PCI/sysfs: use PCI bus lock to avoid race conditions
  PCI/eeepc: use PCI bus lock to avoid race conditions
  PCI/asus-wmi: use PCI bus lock to avoid race conditions
  PCI/pciehp: use PCI bus lock to avoid race conditions
  PCI/acpiphp: use PCI bus lock to avoid race conditions
  PCI/x86: enable PCI bus lock mechanism for x86 platforms
  PCI/IA64: enable PCI bus lock mechanism for IA64 platforms
  PCI: cleanups for PCI bus lock implementation
  PCI: unexport pci_root_buses

 arch/ia64/pci/pci.c                  |    2 +
 arch/ia64/sn/kernel/io_common.c      |    4 +-
 arch/ia64/sn/kernel/io_init.c        |    1 +
 arch/ia64/sn/pci/tioca_provider.c    |    4 +-
 arch/x86/pci/acpi.c                  |    6 +-
 arch/x86/pci/common.c                |   12 +++
 drivers/acpi/pci_root.c              |    8 +-
 drivers/edac/i7core_edac.c           |   16 ++-
 drivers/gpu/drm/drm_fops.c           |    6 +-
 drivers/gpu/vga/vgaarb.c             |   15 +--
 drivers/pci/bus.c                    |  188 +++++++++++++++++++++++++++++-----
 drivers/pci/host-bridge.c            |   19 ++++
 drivers/pci/hotplug/acpiphp_glue.c   |   13 ++-
 drivers/pci/hotplug/cpcihp_generic.c |    8 +-
 drivers/pci/hotplug/pciehp_pci.c     |   15 +++
 drivers/pci/hotplug/sgi_hotplug.c    |    2 +
 drivers/pci/iov.c                    |   11 +-
 drivers/pci/pci-sysfs.c              |   37 ++++---
 drivers/pci/probe.c                  |   83 +++++++++++----
 drivers/pci/remove.c                 |  176 +++++++++++++++++--------------
 drivers/pci/search.c                 |   53 ++++++++--
 drivers/pci/setup-bus.c              |   65 +++++++++---
 drivers/pci/xen-pcifront.c           |   10 +-
 drivers/platform/x86/asus-wmi.c      |   23 ++++-
 drivers/platform/x86/eeepc-laptop.c  |   20 ++--
 include/linux/pci.h                  |   56 +++++++++-
 26 files changed, 629 insertions(+), 224 deletions(-)

-- 
1.7.9.5



* [RFC PATCH v1 01/22] PCI: use pci_get_domain_bus_and_slot() to avoid race conditions
  2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
@ 2012-08-07 16:10 ` Jiang Liu
  2012-09-11 22:00   ` Bjorn Helgaas
  2012-08-07 16:10 ` [RFC PATCH v1 02/22] PCI: trivial cleanups for drivers/pci/remove.c Jiang Liu
                   ` (21 subsequent siblings)
  22 siblings, 1 reply; 51+ messages in thread
From: Jiang Liu @ 2012-08-07 16:10 UTC (permalink / raw)
  To: Bjorn Helgaas, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige
  Cc: Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci, Jiang Liu

A typical pattern for looking up a PCI device under a specific
PCI bus (domain, busno) is:
struct pci_bus *pci_bus = pci_find_bus(domain, busno);
struct pci_dev *pci_dev = pci_get_slot(pci_bus, devfn);

This code has a race window between pci_find_bus() and pci_get_slot():
a PCI hotplug operation may remove the pci_bus in between the two calls.
So use the hotplug-safe interface pci_get_domain_bus_and_slot() instead,
which also reduces code complexity.
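
To illustrate why a single combined lookup closes the window, here is a
minimal userspace sketch. It is not kernel code: every toy_* name is
hypothetical, and the model only assumes that the combined lookup resolves
the bus and the device in one step and returns a counted reference, so the
caller never holds a bare bus pointer that a hot-remove could invalidate.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model; every toy_* name is hypothetical, not a kernel API. */
struct toy_dev {
	int devfn;
	int refcount;
};

struct toy_bus {
	int domain;
	int busnr;
	int present;		/* cleared when the bus is "hot-removed" */
	struct toy_dev devs[2];
};

static struct toy_bus toy_buses[] = {
	{ 0, 0, 1, { { 0x00, 1 }, { 0x08, 1 } } },
	{ 0, 1, 1, { { 0x10, 1 }, { 0x18, 1 } } },
};

#define TOY_NR_BUSES (sizeof(toy_buses) / sizeof(toy_buses[0]))

/*
 * Resolve (domain, busnr, devfn) in one call and return the device with
 * its reference count raised, or NULL if the bus or device is absent.
 * No intermediate bus pointer is ever handed back to the caller.
 */
static struct toy_dev *toy_get_domain_bus_and_slot(int domain, int busnr,
						   int devfn)
{
	size_t i, j;

	for (i = 0; i < TOY_NR_BUSES; i++) {
		struct toy_bus *b = &toy_buses[i];

		if (!b->present || b->domain != domain || b->busnr != busnr)
			continue;
		for (j = 0; j < 2; j++) {
			if (b->devs[j].devfn == devfn) {
				b->devs[j].refcount++;
				return &b->devs[j];
			}
		}
	}
	return NULL;
}

/* Drop the reference taken by toy_get_domain_bus_and_slot(). */
static void toy_dev_put(struct toy_dev *dev)
{
	if (dev)
		dev->refcount--;
}
```

As with the real interface, a caller of the toy lookup is expected to drop
the reference when done; the converted call sites below keep their existing
pci_dev_put() calls for the same reason.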

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 arch/ia64/sn/kernel/io_common.c      |    4 +---
 drivers/gpu/vga/vgaarb.c             |   15 +++------------
 drivers/pci/hotplug/cpcihp_generic.c |    8 ++------
 drivers/pci/iov.c                    |    8 ++------
 drivers/pci/xen-pcifront.c           |   10 ++--------
 5 files changed, 10 insertions(+), 35 deletions(-)

diff --git a/arch/ia64/sn/kernel/io_common.c b/arch/ia64/sn/kernel/io_common.c
index fbb5f2f..8630875 100644
--- a/arch/ia64/sn/kernel/io_common.c
+++ b/arch/ia64/sn/kernel/io_common.c
@@ -229,7 +229,6 @@ void sn_pci_fixup_slot(struct pci_dev *dev, struct pcidev_info *pcidev_info,
 {
 	int segment = pci_domain_nr(dev->bus);
 	struct pcibus_bussoft *bs;
-	struct pci_bus *host_pci_bus;
 	struct pci_dev *host_pci_dev;
 	unsigned int bus_no, devfn;
 
@@ -245,8 +244,7 @@ void sn_pci_fixup_slot(struct pci_dev *dev, struct pcidev_info *pcidev_info,
 
 	bus_no = (pcidev_info->pdi_slot_host_handle >> 32) & 0xff;
 	devfn = pcidev_info->pdi_slot_host_handle & 0xffffffff;
- 	host_pci_bus = pci_find_bus(segment, bus_no);
- 	host_pci_dev = pci_get_slot(host_pci_bus, devfn);
+	host_pci_dev = pci_get_domain_bus_and_slot(segment, bus_no, devfn);
 
 	pcidev_info->host_pci_dev = host_pci_dev;
 	pcidev_info->pdi_linux_pcidev = dev;
diff --git a/drivers/gpu/vga/vgaarb.c b/drivers/gpu/vga/vgaarb.c
index 3df8fc0..b6852b7 100644
--- a/drivers/gpu/vga/vgaarb.c
+++ b/drivers/gpu/vga/vgaarb.c
@@ -1066,7 +1066,6 @@ static ssize_t vga_arb_write(struct file *file, const char __user * buf,
 		}
 
 	} else if (strncmp(curr_pos, "target ", 7) == 0) {
-		struct pci_bus *pbus;
 		unsigned int domain, bus, devfn;
 		struct vga_device *vgadev;
 
@@ -1085,19 +1084,11 @@ static ssize_t vga_arb_write(struct file *file, const char __user * buf,
 			pr_debug("vgaarb: %s ==> %x:%x:%x.%x\n", curr_pos,
 				domain, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
 
-			pbus = pci_find_bus(domain, bus);
-			pr_debug("vgaarb: pbus %p\n", pbus);
-			if (pbus == NULL) {
-				pr_err("vgaarb: invalid PCI domain and/or bus address %x:%x\n",
-					domain, bus);
-				ret_val = -ENODEV;
-				goto done;
-			}
-			pdev = pci_get_slot(pbus, devfn);
+			pdev = pci_get_domain_bus_and_slot(domain, bus, devfn);
 			pr_debug("vgaarb: pdev %p\n", pdev);
 			if (!pdev) {
-				pr_err("vgaarb: invalid PCI address %x:%x\n",
-					bus, devfn);
+				pr_err("vgaarb: invalid PCI address %x:%x:%x\n",
+					domain, bus, devfn);
 				ret_val = -ENODEV;
 				goto done;
 			}
diff --git a/drivers/pci/hotplug/cpcihp_generic.c b/drivers/pci/hotplug/cpcihp_generic.c
index 81af764..a6a71c4 100644
--- a/drivers/pci/hotplug/cpcihp_generic.c
+++ b/drivers/pci/hotplug/cpcihp_generic.c
@@ -154,12 +154,8 @@ static int __init cpcihp_generic_init(void)
 	if(!r)
 		return -EBUSY;
 
-	bus = pci_find_bus(0, bridge_busnr);
-	if (!bus) {
-		err("Invalid bus number %d", bridge_busnr);
-		return -EINVAL;
-	}
-	dev = pci_get_slot(bus, PCI_DEVFN(bridge_slot, 0));
+	dev = pci_get_domain_bus_and_slot(0, bridge_busnr,
+					  PCI_DEVFN(bridge_slot, 0));
 	if(!dev || dev->hdr_type != PCI_HEADER_TYPE_BRIDGE) {
 		err("Invalid bridge device %s", bridge);
 		pci_dev_put(dev);
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 74bbaf8..c7d2969 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -152,15 +152,11 @@ failed1:
 static void virtfn_remove(struct pci_dev *dev, int id, int reset)
 {
 	char buf[VIRTFN_ID_LEN];
-	struct pci_bus *bus;
 	struct pci_dev *virtfn;
 	struct pci_sriov *iov = dev->sriov;
 
-	bus = pci_find_bus(pci_domain_nr(dev->bus), virtfn_bus(dev, id));
-	if (!bus)
-		return;
-
-	virtfn = pci_get_slot(bus, virtfn_devfn(dev, id));
+	virtfn = pci_get_domain_bus_and_slot(pci_domain_nr(dev->bus),
+			virtfn_bus(dev, id), virtfn_devfn(dev, id));
 	if (!virtfn)
 		return;
 
diff --git a/drivers/pci/xen-pcifront.c b/drivers/pci/xen-pcifront.c
index d6cc62c..def8d0b 100644
--- a/drivers/pci/xen-pcifront.c
+++ b/drivers/pci/xen-pcifront.c
@@ -982,7 +982,6 @@ static int pcifront_detach_devices(struct pcifront_device *pdev)
 	int err = 0;
 	int i, num_devs;
 	unsigned int domain, bus, slot, func;
-	struct pci_bus *pci_bus;
 	struct pci_dev *pci_dev;
 	char str[64];
 
@@ -1032,13 +1031,8 @@ static int pcifront_detach_devices(struct pcifront_device *pdev)
 			goto out;
 		}
 
-		pci_bus = pci_find_bus(domain, bus);
-		if (!pci_bus) {
-			dev_dbg(&pdev->xdev->dev, "Cannot get bus %04x:%02x\n",
-				domain, bus);
-			continue;
-		}
-		pci_dev = pci_get_slot(pci_bus, PCI_DEVFN(slot, func));
+		pci_dev = pci_get_domain_bus_and_slot(domain, bus,
+				PCI_DEVFN(slot, func));
 		if (!pci_dev) {
 			dev_dbg(&pdev->xdev->dev,
 				"Cannot get PCI device %04x:%02x:%02x.%d\n",
-- 
1.7.9.5



* [RFC PATCH v1 02/22] PCI: trivial cleanups for drivers/pci/remove.c
  2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
  2012-08-07 16:10 ` [RFC PATCH v1 01/22] PCI: use pci_get_domain_bus_and_slot() to avoid race conditions Jiang Liu
@ 2012-08-07 16:10 ` Jiang Liu
  2012-09-11 22:03   ` Bjorn Helgaas
  2012-08-07 16:10 ` [RFC PATCH v1 03/22] PCI: change PCI device management code to better follow device model Jiang Liu
                   ` (20 subsequent siblings)
  22 siblings, 1 reply; 51+ messages in thread
From: Jiang Liu @ 2012-08-07 16:10 UTC (permalink / raw)
  To: Bjorn Helgaas, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige
  Cc: Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci, Jiang Liu

Trivial cleanups for drivers/pci/remove.c:
1) move the comment for pci_stop_and_remove_bus_device() to the right place
2) rename __pci_remove_behind_bridge() to pci_remove_behind_bridge()

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/remove.c |   33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index 04a4861..33b6318 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -78,25 +78,14 @@ void pci_remove_bus(struct pci_bus *pci_bus)
 }
 EXPORT_SYMBOL(pci_remove_bus);
 
-static void __pci_remove_behind_bridge(struct pci_dev *dev);
-/**
- * pci_stop_and_remove_bus_device - remove a PCI device and any children
- * @dev: the device to remove
- *
- * Remove a PCI device from the device lists, informing the drivers
- * that the device has been removed.  We also remove any subordinate
- * buses and children in a depth-first manner.
- *
- * For each device we remove, delete the device structure from the
- * device lists, remove the /proc entry, and notify userspace
- * (/sbin/hotplug).
- */
+static void pci_remove_behind_bridge(struct pci_dev *dev);
+
 void __pci_remove_bus_device(struct pci_dev *dev)
 {
 	if (dev->subordinate) {
 		struct pci_bus *b = dev->subordinate;
 
-		__pci_remove_behind_bridge(dev);
+		pci_remove_behind_bridge(dev);
 		pci_remove_bus(b);
 		dev->subordinate = NULL;
 	}
@@ -105,13 +94,25 @@ void __pci_remove_bus_device(struct pci_dev *dev)
 }
 EXPORT_SYMBOL(__pci_remove_bus_device);
 
+/**
+ * pci_stop_and_remove_bus_device - remove a PCI device and any children
+ * @dev: the device to remove
+ *
+ * Remove a PCI device from the device lists, informing the drivers
+ * that the device has been removed.  We also remove any subordinate
+ * buses and children in a depth-first manner.
+ *
+ * For each device we remove, delete the device structure from the
+ * device lists, remove the /proc entry, and notify userspace
+ * (/sbin/hotplug).
+ */
 void pci_stop_and_remove_bus_device(struct pci_dev *dev)
 {
 	pci_stop_bus_device(dev);
 	__pci_remove_bus_device(dev);
 }
 
-static void __pci_remove_behind_bridge(struct pci_dev *dev)
+static void pci_remove_behind_bridge(struct pci_dev *dev)
 {
 	struct list_head *l, *n;
 
@@ -141,7 +142,7 @@ static void pci_stop_behind_bridge(struct pci_dev *dev)
 void pci_stop_and_remove_behind_bridge(struct pci_dev *dev)
 {
 	pci_stop_behind_bridge(dev);
-	__pci_remove_behind_bridge(dev);
+	pci_remove_behind_bridge(dev);
 }
 
 static void pci_stop_bus_devices(struct pci_bus *bus)
-- 
1.7.9.5



* [RFC PATCH v1 03/22] PCI: change PCI device management code to better follow device model
  2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
  2012-08-07 16:10 ` [RFC PATCH v1 01/22] PCI: use pci_get_domain_bus_and_slot() to avoid race conditions Jiang Liu
  2012-08-07 16:10 ` [RFC PATCH v1 02/22] PCI: trivial cleanups for drivers/pci/remove.c Jiang Liu
@ 2012-08-07 16:10 ` Jiang Liu
  2012-09-11 22:03   ` Bjorn Helgaas
  2012-08-07 16:10 ` [RFC PATCH v1 04/22] PCI: split PCI bus device registration into two stages Jiang Liu
                   ` (19 subsequent siblings)
  22 siblings, 1 reply; 51+ messages in thread
From: Jiang Liu @ 2012-08-07 16:10 UTC (permalink / raw)
  To: Bjorn Helgaas, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige
  Cc: Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci, Jiang Liu

According to the device model documentation, device objects should be
added and removed symmetrically.

/**
 * device_del - delete device from system.
 * @dev: device.
 *
 * This is the first part of the device unregistration
 * sequence. This removes the device from the lists we control
 * from here, has it removed from the other driver model
 * subsystems it was added to in device_add(), and removes it
 * from the kobject hierarchy.
 *
 * NOTE: this should be called manually _iff_ device_add() was
 * also called manually.
 */

The rule here is to either use
1) device_register()/device_unregister()
or
2) device_initialize()/device_add()/device_del()/put_device().

So change PCI core to follow the rule and get rid of the redundant
pci_dev_get()/pci_dev_put() pair.
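
The reference counting behind this rule can be sketched in userspace as
below. This is a simplified model, not the driver core: all toy_* names are
hypothetical, and it ignores the internal references the real device_add()
takes. It only assumes that initialization leaves one reference, that
unregistering is a delete followed by a put, and that the object is released
when the count reaches zero.

```c
#include <assert.h>

/* Toy model of a refcounted device; toy_* names are hypothetical. */
struct toy_device {
	int refcount;
	int added;
	int released;
};

static void toy_device_initialize(struct toy_device *d)
{
	d->refcount = 1;	/* the initial reference */
	d->added = 0;
	d->released = 0;
}

static void toy_device_add(struct toy_device *d) { d->added = 1; }
static void toy_device_del(struct toy_device *d) { d->added = 0; }
static void toy_get_device(struct toy_device *d) { d->refcount++; }

static void toy_put_device(struct toy_device *d)
{
	if (--d->refcount == 0)
		d->released = 1;	/* the release callback would run here */
}

/* Unregister is modelled as del followed by put. */
static void toy_device_unregister(struct toy_device *d)
{
	toy_device_del(d);
	toy_put_device(d);
}

/* Sequence before this patch: an extra get/put pair around unregister. */
static void toy_old_lifecycle(struct toy_device *d)
{
	toy_device_initialize(d);
	toy_get_device(d);		/* models the removed pci_dev_get() */
	toy_device_add(d);
	toy_device_unregister(d);
	toy_put_device(d);		/* models the old pci_dev_put() */
}

/* Sequence after this patch: initialize/add paired with del/put. */
static void toy_new_lifecycle(struct toy_device *d)
{
	toy_device_initialize(d);
	toy_device_add(d);
	toy_device_del(d);
	toy_put_device(d);
}
```

Both sequences end with the count at zero in this model; the second one
simply gets there without the extra reference.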

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/probe.c  |    1 -
 drivers/pci/remove.c |    4 ++--
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 0840409..dacca26 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1294,7 +1294,6 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
 {
 	device_initialize(&dev->dev);
 	dev->dev.release = pci_release_dev;
-	pci_dev_get(dev);
 
 	dev->dev.dma_mask = &dev->dma_mask;
 	dev->dev.dma_parms = &dev->dma_parms;
diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index 33b6318..b9ac765 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -22,7 +22,7 @@ static void pci_stop_dev(struct pci_dev *dev)
 	if (dev->is_added) {
 		pci_proc_detach_device(dev);
 		pci_remove_sysfs_dev_files(dev);
-		device_unregister(&dev->dev);
+		device_del(&dev->dev);
 		dev->is_added = 0;
 	}
 
@@ -40,7 +40,7 @@ static void pci_destroy_dev(struct pci_dev *dev)
 	up_write(&pci_bus_sem);
 
 	pci_free_resources(dev);
-	pci_dev_put(dev);
+	put_device(&dev->dev);
 }
 
 /**
-- 
1.7.9.5



* [RFC PATCH v1 04/22] PCI: split PCI bus device registration into two stages
  2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
                   ` (2 preceding siblings ...)
  2012-08-07 16:10 ` [RFC PATCH v1 03/22] PCI: change PCI device management code to better follow device model Jiang Liu
@ 2012-08-07 16:10 ` Jiang Liu
  2012-08-07 16:10 ` [RFC PATCH v1 05/22] PCI: introduce pci_bus_{get|put}() to manage PCI bus reference count Jiang Liu
                   ` (18 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Jiang Liu @ 2012-08-07 16:10 UTC (permalink / raw)
  To: Bjorn Helgaas, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige
  Cc: Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci, Jiang Liu

When handling the BUS_NOTIFY_ADD_DEVICE event for a new PCI bridge
device, the notification handler can't take a reference on the new
PCI bus because the device object for the new bus
(pci_dev->subordinate->dev) hasn't been initialized yet.

Split the PCI bus device registration into two stages as below,
so that the event handler can take a reference on the new PCI bus
when handling the BUS_NOTIFY_ADD_DEVICE event.

1) device_initialize(&pci_dev->dev)
2) device_initialize(&pci_dev->subordinate->dev)
3) notify BUS_NOTIFY_ADD_DEVICE event for pci_dev
4) device_add(&pci_dev->dev)
5) device_add(&pci_dev->subordinate->dev)
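
The ordering above can be sketched with a small userspace model. All toy_*
names are hypothetical. The model makes one loud simplification: taking a
reference on an uninitialized object returns -1 so the failure is visible,
whereas the real get_device() has no such check and would simply operate on
uninitialized memory.

```c
#include <assert.h>

/* Toy refcounted object; toy_* names are hypothetical, not kernel APIs. */
struct toy_obj {
	int initialized;
	int added;
	int refcount;
};

static void toy_initialize(struct toy_obj *o)
{
	o->initialized = 1;
	o->refcount = 1;
}

/* Take a reference; the -1 on an uninitialized object is a model-only check. */
static int toy_get(struct toy_obj *o)
{
	if (!o->initialized)
		return -1;
	o->refcount++;
	return 0;
}

static void toy_add(struct toy_obj *o)
{
	o->added = 1;
}

/* Stand-in for a BUS_NOTIFY_ADD_DEVICE handler that pins the new bus. */
static int toy_notify_add(struct toy_obj *bus)
{
	return toy_get(bus);
}

/*
 * Register a bridge and its subordinate bus. With two_stage set, the bus
 * is initialized before the notification (steps 1-5 above); otherwise it
 * is initialized only when it is added, as before this patch.
 * Returns what the notification handler saw.
 */
static int toy_register_bridge(struct toy_obj *bridge, struct toy_obj *bus,
			       int two_stage)
{
	int rc;

	toy_initialize(bridge);
	if (two_stage)
		toy_initialize(bus);
	rc = toy_notify_add(bus);
	toy_add(bridge);
	if (!two_stage)
		toy_initialize(bus);
	toy_add(bus);
	return rc;
}
```

In this model the handler succeeds only in the two-stage case, which is the
behaviour the split is meant to enable.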

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/bus.c    |    2 +-
 drivers/pci/probe.c  |    3 ++-
 drivers/pci/remove.c |   10 +++++-----
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 4ce5ef2..e2a0c52 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -187,7 +187,7 @@ int pci_bus_add_child(struct pci_bus *bus)
 	if (bus->bridge)
 		bus->dev.parent = bus->bridge;
 
-	retval = device_register(&bus->dev);
+	retval = device_add(&bus->dev);
 	if (retval)
 		return retval;
 
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index dacca26..ad77ae5 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -456,6 +456,7 @@ static struct pci_bus * pci_alloc_bus(void)
 		INIT_LIST_HEAD(&b->resources);
 		b->max_bus_speed = PCI_SPEED_UNKNOWN;
 		b->cur_bus_speed = PCI_SPEED_UNKNOWN;
+		device_initialize(&b->dev);
 	}
 	return b;
 }
@@ -1672,7 +1673,7 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
 	b->dev.class = &pcibus_class;
 	b->dev.parent = b->bridge;
 	dev_set_name(&b->dev, "%04x:%02x", pci_domain_nr(b), bus);
-	error = device_register(&b->dev);
+	error = device_add(&b->dev);
 	if (error)
 		goto class_dev_reg_err;
 
diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index b9ac765..ba03059 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -70,11 +70,11 @@ void pci_remove_bus(struct pci_bus *pci_bus)
 	list_del(&pci_bus->node);
 	pci_bus_release_busn_res(pci_bus);
 	up_write(&pci_bus_sem);
-	if (!pci_bus->is_added)
-		return;
-
-	pci_remove_legacy_files(pci_bus);
-	device_unregister(&pci_bus->dev);
+	if (pci_bus->is_added) {
+		pci_remove_legacy_files(pci_bus);
+		device_del(&pci_bus->dev);
+	}
+	put_device(&pci_bus->dev);
 }
 EXPORT_SYMBOL(pci_remove_bus);
 
-- 
1.7.9.5



* [RFC PATCH v1 05/22] PCI: introduce pci_bus_{get|put}() to manage PCI bus reference count
  2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
                   ` (3 preceding siblings ...)
  2012-08-07 16:10 ` [RFC PATCH v1 04/22] PCI: split PCI bus device registration into two stages Jiang Liu
@ 2012-08-07 16:10 ` Jiang Liu
  2012-08-07 16:10 ` [RFC PATCH v1 06/22] PCI: use a global lock to serialize PCI root bridge hotplug operations Jiang Liu
                   ` (17 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Jiang Liu @ 2012-08-07 16:10 UTC (permalink / raw)
  To: Bjorn Helgaas, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige
  Cc: Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci, Jiang Liu

Sometimes PCI hotplug drivers need to hold a reference count on a PCI bus,
so introduce pci_bus_{get|put}() to manage PCI bus reference count.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/bus.c   |   15 +++++++++++++++
 include/linux/pci.h |    2 ++
 2 files changed, 17 insertions(+)

diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index e2a0c52..0e18270 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -325,6 +325,21 @@ void pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
 }
 EXPORT_SYMBOL_GPL(pci_walk_bus);
 
+struct pci_bus *pci_bus_get(struct pci_bus *bus)
+{
+	if (bus)
+		get_device(&bus->dev);
+	return bus;
+}
+EXPORT_SYMBOL(pci_bus_get);
+
+void pci_bus_put(struct pci_bus *bus)
+{
+	if (bus)
+		put_device(&bus->dev);
+}
+EXPORT_SYMBOL(pci_bus_put);
+
 EXPORT_SYMBOL(pci_bus_alloc_resource);
 EXPORT_SYMBOL_GPL(pci_bus_add_device);
 EXPORT_SYMBOL(pci_bus_add_devices);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 95479cd..21fa79e 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -955,6 +955,8 @@ int pci_request_selected_regions_exclusive(struct pci_dev *, int, const char *);
 void pci_release_selected_regions(struct pci_dev *, int);
 
 /* drivers/pci/bus.c */
+struct pci_bus *pci_bus_get(struct pci_bus *bus);
+void pci_bus_put(struct pci_bus *bus);
 void pci_add_resource(struct list_head *resources, struct resource *res);
 void pci_add_resource_offset(struct list_head *resources, struct resource *res,
 			     resource_size_t offset);
-- 
1.7.9.5



* [RFC PATCH v1 06/22] PCI: use a global lock to serialize PCI root bridge hotplug operations
  2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
                   ` (4 preceding siblings ...)
  2012-08-07 16:10 ` [RFC PATCH v1 05/22] PCI: introduce pci_bus_{get|put}() to manage PCI bus reference count Jiang Liu
@ 2012-08-07 16:10 ` Jiang Liu
  2012-09-11 22:57   ` Bjorn Helgaas
  2012-08-07 16:10 ` [RFC PATCH v1 07/22] PCI: introduce PCI bus lock to serialize PCI " Jiang Liu
                   ` (16 subsequent siblings)
  22 siblings, 1 reply; 51+ messages in thread
From: Jiang Liu @ 2012-08-07 16:10 UTC (permalink / raw)
  To: Bjorn Helgaas, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige
  Cc: Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci, Jiang Liu

Currently there's no mechanism to protect the global pci_root_buses list
from changes at runtime. As a result, PCI root bridge hotplug
operations, which dynamically change the pci_root_buses list, may cause
invalid memory accesses.

So introduce a global lock to serialize accesses to the pci_root_buses
list and serialize PCI host bridge hotplug operations.

Be careful: never try to acquire this global lock from PCI device drivers,
because that may cause deadlocks.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/acpi/pci_root.c           |    8 +++++++-
 drivers/edac/i7core_edac.c        |   16 +++++++---------
 drivers/gpu/drm/drm_fops.c        |    6 +++++-
 drivers/pci/host-bridge.c         |   19 +++++++++++++++++++
 drivers/pci/hotplug/sgi_hotplug.c |    2 ++
 drivers/pci/pci-sysfs.c           |    2 ++
 drivers/pci/probe.c               |    5 ++++-
 drivers/pci/search.c              |    9 ++++++++-
 include/linux/pci.h               |    8 ++++++++
 9 files changed, 62 insertions(+), 13 deletions(-)

diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index 7aff631..6bd0e32 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -463,6 +463,8 @@ static int __devinit acpi_pci_root_add(struct acpi_device *device)
 	if (!root)
 		return -ENOMEM;
 
+	pci_host_bridge_hotplug_lock();
+
 	segment = 0;
 	status = acpi_evaluate_integer(device->handle, METHOD_NAME__SEG, NULL,
 				       &segment);
@@ -516,7 +518,6 @@ static int __devinit acpi_pci_root_add(struct acpi_device *device)
 	 * TBD: Need PCI interface for enumeration/configuration of roots.
 	 */
 
-	/* TBD: Locking */
 	list_add_tail(&root->node, &acpi_pci_roots);
 
 	printk(KERN_INFO PREFIX "%s [%s] (domain %04x %pR)\n",
@@ -622,11 +623,14 @@ static int __devinit acpi_pci_root_add(struct acpi_device *device)
 	if (device->wakeup.flags.run_wake)
 		device_set_run_wake(root->bus->bridge, true);
 
+	pci_host_bridge_hotplug_unlock();
+
 	return 0;
 
 end:
 	if (!list_empty(&root->node))
 		list_del(&root->node);
+	pci_host_bridge_hotplug_unlock();
 	kfree(root);
 	return result;
 }
@@ -643,8 +647,10 @@ static int acpi_pci_root_remove(struct acpi_device *device, int type)
 {
 	struct acpi_pci_root *root = acpi_driver_data(device);
 
+	pci_host_bridge_hotplug_lock();
 	device_set_run_wake(root->bus->bridge, false);
 	pci_acpi_remove_bus_pm_notifier(device);
+	pci_host_bridge_hotplug_unlock();
 
 	kfree(root);
 	return 0;
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index d27778f..8e6f177 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -1196,6 +1196,7 @@ static void __init i7core_xeon_pci_fixup(const struct pci_id_table *table)
 	 * aren't announced by acpi. So, we need to use a legacy scan probing
 	 * to detect them
 	 */
+	pci_host_bridge_hotplug_lock();
 	while (table && table->descr) {
 		pdev = pci_get_device(PCI_VENDOR_ID_INTEL, table->descr[0].dev_id, NULL);
 		if (unlikely(!pdev)) {
@@ -1205,19 +1206,16 @@ static void __init i7core_xeon_pci_fixup(const struct pci_id_table *table)
 		pci_dev_put(pdev);
 		table++;
 	}
+	pci_host_bridge_hotplug_unlock();
 }
 
 static unsigned i7core_pci_lastbus(void)
 {
-	int last_bus = 0, bus;
-	struct pci_bus *b = NULL;
-
-	while ((b = pci_find_next_bus(b)) != NULL) {
-		bus = b->number;
-		debugf0("Found bus %d\n", bus);
-		if (bus > last_bus)
-			last_bus = bus;
-	}
+	int last_bus = 0;
+
+	for (last_bus = 255; last_bus >= 0; last_bus--)
+		if (pci_find_bus(0, last_bus))
+			break;
 
 	debugf0("Last bus %d\n", last_bus);
 
diff --git a/drivers/gpu/drm/drm_fops.c b/drivers/gpu/drm/drm_fops.c
index 123de28..f559b5b 100644
--- a/drivers/gpu/drm/drm_fops.c
+++ b/drivers/gpu/drm/drm_fops.c
@@ -344,9 +344,13 @@ static int drm_open_helper(struct inode *inode, struct file *filp,
 			pci_dev_put(pci_dev);
 		}
 		if (!dev->hose) {
-			struct pci_bus *b = pci_bus_b(pci_root_buses.next);
+			struct pci_bus *b;
+
+			pci_host_bridge_hotplug_lock();
+			b = pci_find_next_bus(NULL);
 			if (b)
 				dev->hose = b->sysdata;
+			pci_host_bridge_hotplug_unlock();
 		}
 	}
 #endif
diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c
index a68dc61..28d5557 100644
--- a/drivers/pci/host-bridge.c
+++ b/drivers/pci/host-bridge.c
@@ -94,3 +94,22 @@ void pcibios_bus_to_resource(struct pci_dev *dev, struct resource *res,
 	res->end = region->end + offset;
 }
 EXPORT_SYMBOL(pcibios_bus_to_resource);
+
+static DEFINE_MUTEX(pci_host_bridge_lock);
+
+/*
+ * Get the lock to serialize PCI host bridge hotplug operations.
+ * It can't be called from PCI device drivers, otherwise it may cause
+ * deadlocks when removing a host bridge.
+ */
+void pci_host_bridge_hotplug_lock(void)
+{
+	mutex_lock(&pci_host_bridge_lock);
+}
+EXPORT_SYMBOL(pci_host_bridge_hotplug_lock);
+
+void pci_host_bridge_hotplug_unlock(void)
+{
+	mutex_unlock(&pci_host_bridge_lock);
+}
+EXPORT_SYMBOL(pci_host_bridge_hotplug_unlock);
diff --git a/drivers/pci/hotplug/sgi_hotplug.c b/drivers/pci/hotplug/sgi_hotplug.c
index f64ca92..e0e5d13 100644
--- a/drivers/pci/hotplug/sgi_hotplug.c
+++ b/drivers/pci/hotplug/sgi_hotplug.c
@@ -690,6 +690,7 @@ static int __init sn_pci_hotplug_init(void)
 
 	INIT_LIST_HEAD(&sn_hp_list);
 
+	pci_host_bridge_hotplug_lock();
 	while ((pci_bus = pci_find_next_bus(pci_bus))) {
 		if (!pci_bus->sysdata)
 			continue;
@@ -709,6 +710,7 @@ static int __init sn_pci_hotplug_init(void)
 			break;
 		}
 	}
+	pci_host_bridge_hotplug_unlock();
 
 	return registered == 1 ? 0 : -ENODEV;
 }
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 86c63fe..99fefbe 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -295,10 +295,12 @@ static ssize_t bus_rescan_store(struct bus_type *bus, const char *buf,
 		return -EINVAL;
 
 	if (val) {
+		pci_host_bridge_hotplug_lock();
 		mutex_lock(&pci_remove_rescan_mutex);
 		while ((b = pci_find_next_bus(b)) != NULL)
 			pci_rescan_bus(b);
 		mutex_unlock(&pci_remove_rescan_mutex);
+		pci_host_bridge_hotplug_unlock();
 	}
 	return count;
 }
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index ad77ae5..1f64e8d 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -23,7 +23,10 @@ struct resource busn_resource = {
 	.flags	= IORESOURCE_BUS,
 };
 
-/* Ugh.  Need to stop exporting this to modules. */
+/*
+ * Ugh.  Need to stop exporting this to modules.
+ * Protected by pci_host_bridge_hotplug_{lock|unlock}().
+ */
 LIST_HEAD(pci_root_buses);
 EXPORT_SYMBOL(pci_root_buses);
 
diff --git a/drivers/pci/search.c b/drivers/pci/search.c
index 993d4a0..f1147a7 100644
--- a/drivers/pci/search.c
+++ b/drivers/pci/search.c
@@ -100,6 +100,13 @@ struct pci_bus * pci_find_bus(int domain, int busnr)
  * initiated by passing %NULL as the @from argument.  Otherwise if
  * @from is not %NULL, searches continue from next device on the
  * global list.
+ *
+ * Please don't call this function at runtime if possible.
+ * It's designed to be called at boot time only because it's unsafe
+ * to PCI root bridge hotplug operations. But some drivers do invoke
+ * it at runtime and it's hard to fix those drivers. In such cases,
+ * use pci_host_bridge_hotplug_{lock|unlock}() to protect the PCI root
+ * bus list, but you need to be really careful to avoid deadlock.
  */
 struct pci_bus * 
 pci_find_next_bus(const struct pci_bus *from)
@@ -115,6 +122,7 @@ pci_find_next_bus(const struct pci_bus *from)
 	up_read(&pci_bus_sem);
 	return b;
 }
+EXPORT_SYMBOL(pci_find_next_bus);
 
 /**
  * pci_get_slot - locate PCI device for a given PCI slot
@@ -353,7 +361,6 @@ EXPORT_SYMBOL(pci_dev_present);
 
 /* For boot time work */
 EXPORT_SYMBOL(pci_find_bus);
-EXPORT_SYMBOL(pci_find_next_bus);
 /* For everyone */
 EXPORT_SYMBOL(pci_get_device);
 EXPORT_SYMBOL(pci_get_subsys);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 21fa79e..e02f130 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -388,6 +388,8 @@ struct pci_host_bridge {
 void pci_set_host_bridge_release(struct pci_host_bridge *bridge,
 		     void (*release_fn)(struct pci_host_bridge *),
 		     void *release_data);
+void pci_host_bridge_hotplug_lock(void);
+void pci_host_bridge_hotplug_unlock(void);
 
 /*
  * The first PCI_BRIDGE_RESOURCE_NUM PCI bus resources (those that correspond
@@ -1359,6 +1361,12 @@ static inline void pci_unblock_cfg_access(struct pci_dev *dev)
 static inline struct pci_bus *pci_find_next_bus(const struct pci_bus *from)
 { return NULL; }
 
+static inline void pci_host_bridge_hotplug_lock(void)
+{ }
+
+static inline void pci_host_bridge_hotplug_unlock(void)
+{ }
+
 static inline struct pci_dev *pci_get_slot(struct pci_bus *bus,
 						unsigned int devfn)
 { return NULL; }
-- 
1.7.9.5



* [RFC PATCH v1 07/22] PCI: introduce PCI bus lock to serialize PCI hotplug operations
  2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
                   ` (5 preceding siblings ...)
  2012-08-07 16:10 ` [RFC PATCH v1 06/22] PCI: use a global lock to serialize PCI root bridge hotplug operations Jiang Liu
@ 2012-08-07 16:10 ` Jiang Liu
  2012-09-11 23:24   ` Bjorn Helgaas
  2012-08-07 16:10 ` [RFC PATCH v1 08/22] PCI: introduce hotplug safe search interfaces for PCI bus/device Jiang Liu
                   ` (15 subsequent siblings)
  22 siblings, 1 reply; 51+ messages in thread
From: Jiang Liu @ 2012-08-07 16:10 UTC (permalink / raw)
  To: Bjorn Helgaas, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige
  Cc: Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci, Jiang Liu

There are multiple ways to trigger concurrent PCI hotplug operations for
a specific PCI bus, but there is currently no way to serialize those
operations, which breaks the PCI hotplug logic. This patch introduces
a bus lock mechanism and a state machine for PCI buses to serialize PCI
hotplug operations.

The state machine for PCI buses is:
          __________________________     ______________
          |                        v     |            v
INITIALIZED->REGISTERED->WORKING->STOPPING->STOPPED->REMOVED->DESTROYED
                     |_________________________^

PCI buses form a hierarchy, so the following locking rules must be obeyed:
1) A PCI bus must be locked when changing any of its child devices.
2) A PCI bus must be locked when changing its state.
3) The global PCI host bridge hotplug lock must be held when hotplugging
   PCI root buses.

The lock interfaces, coordinated with the state machine, are used to
avoid race conditions when hotplugging PCI devices/host bridges.
A typical usage is (lock bus if it's in WORKING state, and then do hotplug):
if (pci_bus_lock_states(bus, PCI_BUS_STATE_WORKING) > 0) {
	do_pci_hotplug();
	pci_bus_unlock(bus);
}

The PCI_BUS_LOCK config option is a temporary solution to avoid breaking
bisect. It will be removed once all architectures have been converted to
the new PCI bus lock mechanism.
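
For readers who want to experiment with the idea outside the kernel, here is
a minimal userspace sketch of a state word that carries both the state bits
and a lock bit, updated with compare-and-exchange. It is only a model under
stated assumptions: the model_* names are hypothetical, C11 <stdatomic.h>
stands in for the kernel's atomic_t, and the lock function is a non-blocking
try-lock rather than the wait-queue based pci_bus_lock_states() in this patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* State bits mirror the values in this patch; the lock bit shares the word. */
enum {
	BUS_INITIALIZED = 0x1,
	BUS_REGISTERED  = 0x2,
	BUS_WORKING     = 0x4,
	BUS_STOPPING    = 0x8,
	BUS_STATE_MASK  = 0x7F,
	BUS_LOCK        = 0x10000,
};

/* Hypothetical stand-in for the 'state' member added to struct pci_bus. */
struct model_bus {
	atomic_int state;
};

/*
 * Non-blocking model of pci_bus_lock_states(): returns the state (> 0) when
 * the lock is taken, 0 when the caller would have to wait, and -1 when the
 * bus has already moved past every requested state.
 */
static int model_bus_trylock_states(struct model_bus *bus, int states)
{
	int t = atomic_load(&bus->state);

	for (;;) {
		if ((t & BUS_STATE_MASK) > states)
			return -1;
		if (!(t & states) || (t & BUS_LOCK))
			return 0;
		/* On failure, t is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&bus->state, &t, t | BUS_LOCK))
			return t & BUS_STATE_MASK;
	}
}

/* Clear the lock bit; the state bits are left untouched. */
static void model_bus_unlock(struct model_bus *bus)
{
	atomic_fetch_and(&bus->state, ~BUS_LOCK);
}

/* Move from old to new_state while holding the lock; optionally unlock. */
static void model_bus_change_state(struct model_bus *bus, int old,
				   int new_state, bool unlock)
{
	int expected = old | BUS_LOCK;
	int desired = new_state | (unlock ? 0 : BUS_LOCK);
	bool ok = atomic_compare_exchange_strong(&bus->state, &expected,
						 desired);

	assert(ok);	/* caller must hold the lock and pass the right old state */
	(void)ok;
}
```

Keeping the lock bit in the same word as the state lets a single
compare-and-exchange check the state and take the lock at once, so there is
no window in which the state can change between the check and the lock.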

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/Kconfig |    4 +++
 drivers/pci/bus.c   |   86 +++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/pci.h |   44 ++++++++++++++++++++++++++
 3 files changed, 134 insertions(+)

diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 848bfb8..a6df8b1 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -120,3 +120,7 @@ config PCI_IOAPIC
 config PCI_LABEL
 	def_bool y if (DMI || ACPI)
 	select NLS
+
+config PCI_BUS_LOCK
+	bool
+	default n
diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 0e18270..aa25fcf 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -15,9 +15,12 @@
 #include <linux/proc_fs.h>
 #include <linux/init.h>
 #include <linux/slab.h>
+#include <linux/sched.h>
 
 #include "pci.h"
 
+static DECLARE_WAIT_QUEUE_HEAD(pci_bus_state_wait_queue);
+
 void pci_add_resource_offset(struct list_head *resources, struct resource *res,
 			     resource_size_t offset)
 {
@@ -340,6 +343,89 @@ void pci_bus_put(struct pci_bus *bus)
 }
 EXPORT_SYMBOL(pci_bus_put);
 
+static bool pci_bus_wait_for_states(struct pci_bus *bus, int states)
+{
+	int t = atomic_read(&bus->state);
+
+	/* Bus state is bigger than any of the requested states. */
+	if ((t & PCI_BUS_STATE_MASK) > states)
+		return true;
+
+	/* Bus is in one of the requested states and unlocked. */
+	if ((t & states) && !(t & PCI_BUS_STATE_LOCK))
+		return true;
+
+	return false;
+}
+
+/*
+ * Wait for the bus to reach one of the requested states and then lock it.
+ * Return the current bus state if the bus was locked successfully, or -EINVAL
+ * if the bus has already moved past all of the requested states.
+ */
+int pci_bus_lock_states(struct pci_bus *bus, int states)
+{
+	int t;
+
+	BUG_ON(states & ~PCI_BUS_STATE_MASK);
+	do {
+		do {
+			wait_event(pci_bus_state_wait_queue,
+				   pci_bus_wait_for_states(bus, states));
+			t = atomic_read(&bus->state);
+			if ((t & PCI_BUS_STATE_MASK) > states)
+				return -EINVAL;
+		} while (!(t & states));
+
+		t &= ~PCI_BUS_STATE_LOCK;
+	} while (atomic_cmpxchg(&bus->state, t , t | PCI_BUS_STATE_LOCK) != t);
+
+	return t & PCI_BUS_STATE_MASK;
+}
+EXPORT_SYMBOL(pci_bus_lock_states);
+
+/* Unlock the bus and wake up waiters, must be called with the bus locked. */
+void pci_bus_unlock(struct pci_bus *bus)
+{
+	int t;
+
+	BUG_ON(!pci_bus_is_locked(bus));
+	do {
+		t = atomic_read(&bus->state);
+	} while (atomic_cmpxchg(&bus->state,
+				t, t & ~PCI_BUS_STATE_LOCK) != t);
+
+	if (waitqueue_active(&pci_bus_state_wait_queue))
+		wake_up_all(&pci_bus_state_wait_queue);
+}
+EXPORT_SYMBOL(pci_bus_unlock);
+
+/*
+ * Change the bus from old state to new state. It must be called with the bus
+ * locked, and the new state must be bigger than the old state.
+ */
+void pci_bus_change_state(struct pci_bus *bus, int old, int new, bool unlock)
+{
+	int t;
+
+	BUG_ON(!pci_bus_is_locked(bus));
+	BUG_ON(new < old || pci_bus_get_state(bus) != old ||
+	       (new & ~PCI_BUS_STATE_MASK));
+
+	old |= PCI_BUS_STATE_LOCK;
+	if (!unlock)
+		new |= PCI_BUS_STATE_LOCK;
+
+	do {
+		t = atomic_read(&bus->state);
+		t &= ~(PCI_BUS_STATE_MASK | PCI_BUS_STATE_LOCK);
+	} while (atomic_cmpxchg(&bus->state, t | old, t | new) != (t | old));
+
+	if (waitqueue_active(&pci_bus_state_wait_queue))
+		wake_up_all(&pci_bus_state_wait_queue);
+}
+EXPORT_SYMBOL(pci_bus_change_state);
+
 EXPORT_SYMBOL(pci_bus_alloc_resource);
 EXPORT_SYMBOL_GPL(pci_bus_add_device);
 EXPORT_SYMBOL(pci_bus_add_devices);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index e02f130..e2ef517 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -443,8 +443,52 @@ struct pci_bus {
 	struct bin_attribute	*legacy_io; /* legacy I/O for this bus */
 	struct bin_attribute	*legacy_mem; /* legacy mem */
 	unsigned int		is_added:1;
+	atomic_t		state;
 };
 
+/*
+ * State machine for PCI buses.
+ *          __________________________     ______________
+ *          |                        v     |            v
+ * INITIALIZED->REGISTERED->WORKING->STOPPING->STOPPED->REMOVED->DESTROYED
+ *                     |_________________________^
+ */
+#define	PCI_BUS_STATE_UNKNOWN		0x0	/* invalid state */
+#define	PCI_BUS_STATE_INITIALIZED	0x1	/* device_initialize called */
+#define	PCI_BUS_STATE_REGISTERED	0x2	/* device_add called */
+#define	PCI_BUS_STATE_WORKING		0x4	/* working state */
+#define	PCI_BUS_STATE_STOPPING		0x8	/* stopping devices */
+#define	PCI_BUS_STATE_STOPPED		0x10	/* device_del called */
+#define	PCI_BUS_STATE_REMOVED		0x20	/* bus deleted  */
+#define	PCI_BUS_STATE_DESTROYED		0x40	/* invalid state */
+#define	PCI_BUS_STATE_MASK		0x7F
+
+#ifdef	CONFIG_PCI_BUS_LOCK
+#define	PCI_BUS_STATE_LOCK		0x10000	/* for pci core only */
+
+static inline bool pci_bus_is_locked(struct pci_bus *bus)
+{
+	return !!(atomic_read(&bus->state) & PCI_BUS_STATE_LOCK);
+}
+#else /* CONFIG_PCI_BUS_LOCK */
+#define	PCI_BUS_STATE_LOCK		0x0000	/* for pci core only */
+
+static inline bool pci_bus_is_locked(struct pci_bus *bus)
+{
+	return true;
+}
+#endif /* CONFIG_PCI_BUS_LOCK */
+
+static inline int pci_bus_get_state(struct pci_bus *bus)
+{
+	return atomic_read(&bus->state) & PCI_BUS_STATE_MASK;
+}
+
+extern int pci_bus_lock_states(struct pci_bus *bus, int states);
+extern void pci_bus_unlock(struct pci_bus *bus);
+extern void pci_bus_change_state(struct pci_bus *bus, int old, int new,
+				 bool unlock);
+
 #define pci_bus_b(n)	list_entry(n, struct pci_bus, node)
 #define to_pci_bus(n)	container_of(n, struct pci_bus, dev)
 
-- 
1.7.9.5



* [RFC PATCH v1 08/22] PCI: introduce hotplug safe search interfaces for PCI bus/device
  2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
                   ` (6 preceding siblings ...)
  2012-08-07 16:10 ` [RFC PATCH v1 07/22] PCI: introduce PCI bus lock to serialize PCI " Jiang Liu
@ 2012-08-07 16:10 ` Jiang Liu
  2012-08-07 16:10 ` [RFC PATCH v1 09/22] PCI: enhance PCI probe logic to support PCI bus lock mechanism Jiang Liu
                   ` (14 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Jiang Liu @ 2012-08-07 16:10 UTC (permalink / raw)
  To: Bjorn Helgaas, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige
  Cc: Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci, Jiang Liu

Function pci_find_bus() is not hotplug safe because it doesn't hold a
reference on the returned bus, so the bus may be destroyed by hotplug
operations just after pci_find_bus() returns.

This patch introduces a hotplug-safe interface to get and lock a specific
PCI bus. It also provides several helper interfaces to reduce code complexity.
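
The get/lock and unlock/put pairing can be illustrated with a small
single-threaded userspace model. This is only a sketch: the demo_* names and
the static bus table are hypothetical, and where the real
pci_bus_lock_states() would sleep until the bus is unlocked, the model simply
fails. What it does show is the rule the helpers enforce: a lookup that
succeeds takes a reference, and a failed lock attempt must drop it again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_bus; not the kernel structure. */
struct demo_bus {
	int domain;
	int busnr;
	int refcount;	/* models the struct device reference count */
	bool working;	/* models PCI_BUS_STATE_WORKING */
	bool locked;	/* models PCI_BUS_STATE_LOCK */
};

static struct demo_bus demo_buses[] = {
	{ .domain = 0, .busnr = 0, .refcount = 1, .working = true },
	{ .domain = 0, .busnr = 1, .refcount = 1, .working = false },
};

/* Like pci_get_bus(): look the bus up and take a reference on it. */
static struct demo_bus *demo_get_bus(int domain, int busnr)
{
	size_t i;

	for (i = 0; i < sizeof(demo_buses) / sizeof(demo_buses[0]); i++) {
		struct demo_bus *bus = &demo_buses[i];

		if (bus->domain == domain && bus->busnr == busnr) {
			bus->refcount++;
			return bus;
		}
	}
	return NULL;
}

/* Like pci_get_and_lock_bus(): drop the reference again if locking fails. */
static struct demo_bus *demo_get_and_lock_bus(int domain, int busnr)
{
	struct demo_bus *bus = demo_get_bus(domain, busnr);

	if (bus && (!bus->working || bus->locked)) {
		bus->refcount--;
		return NULL;
	}
	if (bus)
		bus->locked = true;
	return bus;
}

/* Like pci_unlock_and_put_bus(): accepts NULL so callers need no check. */
static void demo_unlock_and_put_bus(struct demo_bus *bus)
{
	if (bus) {
		assert(bus->locked);
		bus->locked = false;
		bus->refcount--;
	}
}
```

Because the unlock/put helper tolerates NULL, a caller can use a single exit
path whether or not the lookup succeeded.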

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/bus.c    |   34 ++++++++++++++++++++++++++++++++++
 drivers/pci/search.c |   44 ++++++++++++++++++++++++++++++++++----------
 include/linux/pci.h  |   10 ++++++++++
 3 files changed, 78 insertions(+), 10 deletions(-)

diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index aa25fcf..b6aacaa 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -426,6 +426,40 @@ void pci_bus_change_state(struct pci_bus *bus, int old, int new, bool unlock)
 }
 EXPORT_SYMBOL(pci_bus_change_state);
 
+struct pci_bus *__pci_get_and_lock_bus(int domain, int busnr, int states)
+{
+	struct pci_bus *bus;
+
+	bus = pci_get_bus(domain, busnr);
+	if (bus && pci_bus_lock_states(bus, states) < 0) {
+		pci_bus_put(bus);
+		bus = NULL;
+	}
+
+	return bus;
+}
+EXPORT_SYMBOL(__pci_get_and_lock_bus);
+
+struct pci_bus *pci_lock_subordinate(struct pci_dev *dev, int states)
+{
+	struct pci_bus *bus = dev->subordinate;
+
+	if (bus && pci_bus_lock_states(bus, states) > 0)
+		return bus;
+
+	return NULL;
+}
+EXPORT_SYMBOL(pci_lock_subordinate);
+
+void pci_unlock_and_put_bus(struct pci_bus *bus)
+{
+	if (bus) {
+		pci_bus_unlock(bus);
+		pci_bus_put(bus);
+	}
+}
+EXPORT_SYMBOL(pci_unlock_and_put_bus);
+
 EXPORT_SYMBOL(pci_bus_alloc_resource);
 EXPORT_SYMBOL_GPL(pci_bus_add_device);
 EXPORT_SYMBOL(pci_bus_add_devices);
diff --git a/drivers/pci/search.c b/drivers/pci/search.c
index f1147a7..c0a8a2b 100644
--- a/drivers/pci/search.c
+++ b/drivers/pci/search.c
@@ -69,6 +69,35 @@ static struct pci_bus *pci_do_find_bus(struct pci_bus *bus, unsigned char busnr)
 }
 
 /**
+ * pci_get_bus - get PCI bus from a given domain and bus number
+ * @domain: number of PCI domain to search
+ * @busnr: number of desired PCI bus
+ *
+ * Given a PCI bus number and domain number, the desired PCI bus is located
+ * in the global list of PCI buses. If the bus is found, a reference count
+ * is held on the returned PCI bus. If no bus is found, %NULL is returned.
+ */
+struct pci_bus *pci_get_bus(int domain, int busnr)
+{
+	struct pci_bus *bus;
+	struct pci_bus *tmp_bus = NULL;
+
+	down_read(&pci_bus_sem);
+	list_for_each_entry(bus, &pci_root_buses, node)
+		if (pci_domain_nr(bus) == domain) {
+			tmp_bus = pci_do_find_bus(bus, busnr);
+			if (tmp_bus) {
+				pci_bus_get(tmp_bus);
+				break;
+			}
+		}
+	up_read(&pci_bus_sem);
+
+	return tmp_bus;
+}
+EXPORT_SYMBOL(pci_get_bus);
+
+/**
  * pci_find_bus - locate PCI bus from a given domain and bus number
  * @domain: number of PCI domain to search
  * @busnr: number of desired PCI bus
@@ -79,17 +108,12 @@ static struct pci_bus *pci_do_find_bus(struct pci_bus *bus, unsigned char busnr)
  */
 struct pci_bus * pci_find_bus(int domain, int busnr)
 {
-	struct pci_bus *bus = NULL;
-	struct pci_bus *tmp_bus;
+	struct pci_bus *bus;
 
-	while ((bus = pci_find_next_bus(bus)) != NULL)  {
-		if (pci_domain_nr(bus) != domain)
-			continue;
-		tmp_bus = pci_do_find_bus(bus, busnr);
-		if (tmp_bus)
-			return tmp_bus;
-	}
-	return NULL;
+	bus = pci_get_bus(domain, busnr);
+	pci_bus_put(bus);
+
+	return bus;
 }
 
 /**
diff --git a/include/linux/pci.h b/include/linux/pci.h
index e2ef517..9e52e88 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -488,6 +488,15 @@ extern int pci_bus_lock_states(struct pci_bus *bus, int states);
 extern void pci_bus_unlock(struct pci_bus *bus);
 extern void pci_bus_change_state(struct pci_bus *bus, int old, int new,
 				 bool unlock);
+extern struct pci_bus *pci_lock_subordinate(struct pci_dev *dev, int states);
+extern struct pci_bus *__pci_get_and_lock_bus(int domain, int busnr,
+					      int states);
+extern void pci_unlock_and_put_bus(struct pci_bus *bus);
+
+static inline struct pci_bus *pci_get_and_lock_bus(int domain, int busnr)
+{
+	return __pci_get_and_lock_bus(domain, busnr, PCI_BUS_STATE_WORKING);
+}
 
 #define pci_bus_b(n)	list_entry(n, struct pci_bus, node)
 #define to_pci_bus(n)	container_of(n, struct pci_bus, dev)
@@ -734,6 +743,7 @@ void pcibios_bus_to_resource(struct pci_dev *dev, struct resource *res,
 			     struct pci_bus_region *region);
 void pcibios_scan_specific_bus(int busn);
 extern struct pci_bus *pci_find_bus(int domain, int busnr);
+struct pci_bus *pci_get_bus(int domain, int busnr);
 void pci_bus_add_devices(const struct pci_bus *bus);
 struct pci_bus *pci_scan_bus_parented(struct device *parent, int bus,
 				      struct pci_ops *ops, void *sysdata);
-- 
1.7.9.5



* [RFC PATCH v1 09/22] PCI: enhance PCI probe logic to support PCI bus lock mechanism
  2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
                   ` (7 preceding siblings ...)
  2012-08-07 16:10 ` [RFC PATCH v1 08/22] PCI: introduce hotplug safe search interfaces for PCI bus/device Jiang Liu
@ 2012-08-07 16:10 ` Jiang Liu
  2012-08-07 16:10 ` [RFC PATCH v1 10/22] PCI: enhance PCI bus specific " Jiang Liu
                   ` (13 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Jiang Liu @ 2012-08-07 16:10 UTC (permalink / raw)
  To: Bjorn Helgaas, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige
  Cc: Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci, Jiang Liu

This patch enhances PCI probe logic to support PCI bus lock mechanism.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/probe.c |   65 +++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 50 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 1f64e8d..e6b40d0 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -460,6 +460,8 @@ static struct pci_bus * pci_alloc_bus(void)
 		b->max_bus_speed = PCI_SPEED_UNKNOWN;
 		b->cur_bus_speed = PCI_SPEED_UNKNOWN;
 		device_initialize(&b->dev);
+		atomic_set(&b->state,
+			   PCI_BUS_STATE_INITIALIZED | PCI_BUS_STATE_LOCK);
 	}
 	return b;
 }
@@ -753,14 +755,21 @@ int __devinit pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max,
 		 * However, we continue to descend down the hierarchy and
 		 * scan remaining child buses.
 		 */
-		child = pci_find_bus(pci_domain_nr(bus), secondary);
-		if (!child) {
+		child = __pci_get_and_lock_bus(pci_domain_nr(bus), secondary,
+					       PCI_BUS_STATE_MASK);
+		if (child) {
+			if (pci_bus_get_state(child) > PCI_BUS_STATE_WORKING) {
+				pci_unlock_and_put_bus(child);
+				goto out;
+			}
+		} else {
 			child = pci_add_new_bus(bus, dev, secondary);
 			if (!child)
 				goto out;
 			child->primary = primary;
 			pci_bus_insert_busn_res(child, secondary, subordinate);
 			child->bridge_ctl = bctl;
+			pci_bus_get(child);
 		}
 
 		cmax = pci_scan_child_bus(child);
@@ -792,12 +801,19 @@ int __devinit pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max,
 		/* Prevent assigning a bus number that already exists.
 		 * This can happen when a bridge is hot-plugged, so in
 		 * this case we only re-scan this bus. */
-		child = pci_find_bus(pci_domain_nr(bus), max+1);
-		if (!child) {
+		child = __pci_get_and_lock_bus(pci_domain_nr(bus), max+1,
+					       PCI_BUS_STATE_MASK);
+		if (child) {
+			if (pci_bus_get_state(child) > PCI_BUS_STATE_WORKING) {
+				pci_unlock_and_put_bus(child);
+				goto out;
+			}
+		} else {
 			child = pci_add_new_bus(bus, dev, ++max);
 			if (!child)
 				goto out;
 			pci_bus_insert_busn_res(child, max, 0xff);
+			pci_bus_get(child);
 		}
 		buses = (buses & 0xff000000)
 		      | ((unsigned int)(child->primary)     <<  0)
@@ -896,6 +912,8 @@ int __devinit pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max,
 		bus = bus->parent;
 	}
 
+	pci_unlock_and_put_bus(child);
+
 out:
 	pci_write_config_word(dev, PCI_BRIDGE_CONTROL, bctl);
 
@@ -1605,11 +1623,14 @@ unsigned int __devinit pci_scan_child_bus(struct pci_bus *bus)
 	 * After performing arch-dependent fixup of the bus, look behind
 	 * all PCI-to-PCI bridges on this bus.
 	 */
-	if (!bus->is_added) {
+	if (pci_bus_get_state(bus) < PCI_BUS_STATE_WORKING) {
 		dev_dbg(&bus->dev, "fixups for bus\n");
 		pcibios_fixup_bus(bus);
-		if (pci_is_root_bus(bus))
+		if (pci_is_root_bus(bus)) {
+			pci_bus_change_state(bus, PCI_BUS_STATE_REGISTERED,
+					     PCI_BUS_STATE_WORKING, false);
 			bus->is_added = 1;
+		}
 	}
 
 	for (pass=0; pass < 2; pass++)
@@ -1630,6 +1651,11 @@ unsigned int __devinit pci_scan_child_bus(struct pci_bus *bus)
 	return max;
 }
 
+/*
+ * Create a PCI root bus and return with the new root bus locked.
+ * Caller needs to call pci_bus_unlock() to unlock the new root bus after
+ * scanning and configuring children under the new root bus.
+ */
 struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
 		struct pci_ops *ops, void *sysdata, struct list_head *resources)
 {
@@ -1716,6 +1742,9 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
 	list_add_tail(&b->node, &pci_root_buses);
 	up_write(&pci_bus_sem);
 
+	pci_bus_change_state(b, PCI_BUS_STATE_INITIALIZED,
+			     PCI_BUS_STATE_REGISTERED, false);
+
 	return b;
 
 class_dev_reg_err:
@@ -1724,7 +1753,7 @@ class_dev_reg_err:
 bridge_dev_reg_err:
 	kfree(bridge);
 err_out:
-	kfree(b);
+	pci_unlock_and_put_bus(b);
 	return NULL;
 }
 
@@ -1827,6 +1856,8 @@ struct pci_bus * __devinit pci_scan_root_bus(struct device *parent, int bus,
 		pci_bus_update_busn_res_end(b, max);
 
 	pci_bus_add_devices(b);
+	pci_bus_unlock(b);
+
 	return b;
 }
 EXPORT_SYMBOL(pci_scan_root_bus);
@@ -1842,10 +1873,12 @@ struct pci_bus * __devinit pci_scan_bus_parented(struct device *parent,
 	pci_add_resource(&resources, &iomem_resource);
 	pci_add_resource(&resources, &busn_resource);
 	b = pci_create_root_bus(parent, bus, ops, sysdata, &resources);
-	if (b)
+	if (b) {
 		pci_scan_child_bus(b);
-	else
+		pci_bus_unlock(b);
+	} else {
 		pci_free_resource_list(&resources);
+	}
 	return b;
 }
 EXPORT_SYMBOL(pci_scan_bus_parented);
@@ -1863,6 +1896,7 @@ struct pci_bus * __devinit pci_scan_bus(int bus, struct pci_ops *ops,
 	if (b) {
 		pci_scan_child_bus(b);
 		pci_bus_add_devices(b);
+		pci_bus_unlock(b);
 	} else {
 		pci_free_resource_list(&resources);
 	}
@@ -1884,14 +1918,15 @@ EXPORT_SYMBOL(pci_scan_bus);
  */
 unsigned int __ref pci_rescan_bus_bridge_resize(struct pci_dev *bridge)
 {
-	unsigned int max;
+	unsigned int max = -1;
 	struct pci_bus *bus = bridge->subordinate;
 
-	max = pci_scan_child_bus(bus);
-
-	pci_assign_unassigned_bridge_resources(bridge);
-
-	pci_bus_add_devices(bus);
+	if (pci_bus_lock_states(bus, PCI_BUS_STATE_WORKING) > 0) {
+		max = pci_scan_child_bus(bus);
+		pci_assign_unassigned_bridge_resources(bridge);
+		pci_bus_add_devices(bus);
+		pci_bus_unlock(bus);
+	}
 
 	return max;
 }
-- 
1.7.9.5



* [RFC PATCH v1 10/22] PCI: enhance PCI bus specific logic to support PCI bus lock mechanism
  2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
                   ` (8 preceding siblings ...)
  2012-08-07 16:10 ` [RFC PATCH v1 09/22] PCI: enhance PCI probe logic to support PCI bus lock mechanism Jiang Liu
@ 2012-08-07 16:10 ` Jiang Liu
  2012-08-07 16:10 ` [RFC PATCH v1 11/22] PCI: enhance PCI resource assignment " Jiang Liu
                   ` (12 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Jiang Liu @ 2012-08-07 16:10 UTC (permalink / raw)
  To: Bjorn Helgaas, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige
  Cc: Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci, Jiang Liu

This patch enhances PCI bus specific logic to support PCI bus lock mechanism.
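
The callers below pass "PCI_BUS_STATE_STOPPING - 1" as the set of acceptable
states. Since each state is a single bit and the bits ascend in life-cycle
order, that expression is a mask of every state before STOPPING. The
following userspace sketch checks that arithmetic; the constants are copied
from patch 07, while the demo_* helpers are hypothetical and only mirror the
two comparisons made in pci_bus_lock_states().

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from include/linux/pci.h as added by patch 07. */
#define PCI_BUS_STATE_INITIALIZED	0x1
#define PCI_BUS_STATE_REGISTERED	0x2
#define PCI_BUS_STATE_WORKING		0x4
#define PCI_BUS_STATE_STOPPING		0x8
#define PCI_BUS_STATE_STOPPED		0x10

/* STOPPING - 1 == 0x7: every state that precedes STOPPING. */
_Static_assert((PCI_BUS_STATE_STOPPING - 1) ==
	       (PCI_BUS_STATE_INITIALIZED | PCI_BUS_STATE_REGISTERED |
		PCI_BUS_STATE_WORKING),
	       "mask must cover exactly the pre-STOPPING states");

/* True if a bus in @state matches one of the requested @states. */
static bool demo_state_requested(int state, int states)
{
	return (state & states) != 0;
}

/* True if @state is already past every requested state (-EINVAL case). */
static bool demo_state_too_late(int state, int states)
{
	return state > states;
}
```

With this mask, a bus that has entered STOPPING or any later state is
rejected immediately instead of being waited for, which is what lets the
tree walks in this patch skip subordinate buses that are being removed.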

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/bus.c |   54 ++++++++++++++++++++++++++++++-----------------------
 1 file changed, 31 insertions(+), 23 deletions(-)

diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index b6aacaa..371f20a 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -185,19 +185,20 @@ int pci_bus_add_device(struct pci_dev *dev)
  */
 int pci_bus_add_child(struct pci_bus *bus)
 {
-	int retval;
-
-	if (bus->bridge)
-		bus->dev.parent = bus->bridge;
-
-	retval = device_add(&bus->dev);
-	if (retval)
-		return retval;
-
-	bus->is_added = 1;
-
-	/* Create legacy_io and legacy_mem files for this bus */
-	pci_create_legacy_files(bus);
+	int retval = -EBUSY;
+
+	if (pci_bus_get_state(bus) == PCI_BUS_STATE_INITIALIZED) {
+		if (bus->bridge)
+			bus->dev.parent = bus->bridge;
+		retval = device_add(&bus->dev);
+		if (retval == 0) {
+			/* Create legacy_io and legacy_mem files for this bus */
+			pci_create_legacy_files(bus);
+			pci_bus_change_state(bus, PCI_BUS_STATE_INITIALIZED,
+					PCI_BUS_STATE_WORKING, false);
+			bus->is_added = 1;
+		}
+	}
 
 	return retval;
 }
@@ -232,13 +233,14 @@ void pci_bus_add_devices(const struct pci_bus *bus)
 	list_for_each_entry(dev, &bus->devices, bus_list) {
 		BUG_ON(!dev->is_added);
 
-		child = dev->subordinate;
+		child = pci_lock_subordinate(dev, PCI_BUS_STATE_STOPPING - 1);
+		if (!child)
+			continue;
+
 		/*
 		 * If there is an unattached subordinate bus, attach
 		 * it and then scan for unattached PCI devices.
 		 */
-		if (!child)
-			continue;
 		if (list_empty(&child->node)) {
 			down_write(&pci_bus_sem);
 			list_add_tail(&child->node, &dev->bus->children);
@@ -250,28 +252,34 @@ void pci_bus_add_devices(const struct pci_bus *bus)
 		 * register the bus with sysfs as the parent is now
 		 * properly registered.
 		 */
-		if (child->is_added)
-			continue;
-		retval = pci_bus_add_child(child);
-		if (retval)
-			dev_err(&dev->dev, "Error adding bus, continuing\n");
+		if (pci_bus_get_state(child) == PCI_BUS_STATE_INITIALIZED) {
+			retval = pci_bus_add_child(child);
+			if (retval)
+				dev_err(&dev->dev,
+					"Error adding bus, continuing\n");
+		}
+
+		pci_bus_unlock(child);
 	}
 }
 
 void pci_enable_bridges(struct pci_bus *bus)
 {
 	struct pci_dev *dev;
+	struct pci_bus *child;
 	int retval;
 
 	list_for_each_entry(dev, &bus->devices, bus_list) {
-		if (dev->subordinate) {
+		child = pci_lock_subordinate(dev, PCI_BUS_STATE_STOPPING - 1);
+		if (child) {
 			if (!pci_is_enabled(dev)) {
 				retval = pci_enable_device(dev);
 				if (retval)
 					dev_err(&dev->dev, "Error enabling bridge (%d), continuing\n", retval);
 				pci_set_master(dev);
 			}
-			pci_enable_bridges(dev->subordinate);
+			pci_enable_bridges(child);
+			pci_bus_unlock(child);
 		}
 	}
 }
-- 
1.7.9.5



* [RFC PATCH v1 11/22] PCI: enhance PCI resource assignment logic to support PCI bus lock mechanism
  2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
                   ` (9 preceding siblings ...)
  2012-08-07 16:10 ` [RFC PATCH v1 10/22] PCI: enhance PCI bus specific " Jiang Liu
@ 2012-08-07 16:10 ` Jiang Liu
  2012-08-07 16:10 ` [RFC PATCH v1 12/22] PCI: enhance PCI remove " Jiang Liu
                   ` (11 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Jiang Liu @ 2012-08-07 16:10 UTC (permalink / raw)
  To: Bjorn Helgaas, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige
  Cc: Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci, Jiang Liu

This patch enhances PCI resource assignment logic to support PCI bus lock
mechanism.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/setup-bus.c |   65 +++++++++++++++++++++++++++++++++++------------
 1 file changed, 49 insertions(+), 16 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 192172c..08bac37 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -47,6 +47,7 @@ static void free_list(struct list_head *head)
 
 	list_for_each_entry_safe(dev_res, tmp, head, list) {
 		list_del(&dev_res->list);
+		pci_dev_put(dev_res->dev);
 		kfree(dev_res);
 	}
 }
@@ -73,7 +74,7 @@ static int add_to_list(struct list_head *head,
 	}
 
 	tmp->res = res;
-	tmp->dev = dev;
+	tmp->dev = pci_dev_get(dev);
 	tmp->start = res->start;
 	tmp->end = res->end;
 	tmp->flags = res->flags;
@@ -93,6 +94,7 @@ static void remove_from_list(struct list_head *head,
 	list_for_each_entry_safe(dev_res, tmp, head, list) {
 		if (dev_res->res == res) {
 			list_del(&dev_res->list);
+			pci_dev_put(dev_res->dev);
 			kfree(dev_res);
 			break;
 		}
@@ -151,7 +153,7 @@ static void pdev_sort_resources(struct pci_dev *dev, struct list_head *head)
 			panic("pdev_sort_resources(): "
 			      "kmalloc() failed!\n");
 		tmp->res = r;
-		tmp->dev = dev;
+		tmp->dev = pci_dev_get(dev);
 
 		/* fallback is smallest one or list is empty*/
 		n = head;
@@ -257,6 +259,7 @@ static void reassign_resources_sorted(struct list_head *realloc_head,
 		}
 out:
 		list_del(&add_res->list);
+		pci_dev_put(add_res->dev);
 		kfree(add_res);
 	}
 }
@@ -982,15 +985,16 @@ handle_done:
 	;
 }
 
-void __ref __pci_bus_size_bridges(struct pci_bus *bus,
+static void __ref __pci_bus_size_bridges(struct pci_bus *bus,
 			struct list_head *realloc_head)
 {
 	struct pci_dev *dev;
+	struct pci_bus *b;
 	unsigned long mask, prefmask;
 	resource_size_t additional_mem_size = 0, additional_io_size = 0;
 
 	list_for_each_entry(dev, &bus->devices, bus_list) {
-		struct pci_bus *b = dev->subordinate;
+		b = pci_lock_subordinate(dev, PCI_BUS_STATE_STOPPING - 1);
 		if (!b)
 			continue;
 
@@ -1004,6 +1008,8 @@ void __ref __pci_bus_size_bridges(struct pci_bus *bus,
 			__pci_bus_size_bridges(b, realloc_head);
 			break;
 		}
+
+		pci_bus_unlock(b);
 	}
 
 	/* The root bus? */
@@ -1063,7 +1069,7 @@ static void __ref __pci_bus_assign_resources(const struct pci_bus *bus,
 	pbus_assign_resources_sorted(bus, realloc_head, fail_head);
 
 	list_for_each_entry(dev, &bus->devices, bus_list) {
-		b = dev->subordinate;
+		b = pci_lock_subordinate(dev, PCI_BUS_STATE_STOPPING - 1);
 		if (!b)
 			continue;
 
@@ -1084,6 +1090,8 @@ static void __ref __pci_bus_assign_resources(const struct pci_bus *bus,
 				 "%04x:%02x\n", pci_domain_nr(b), b->number);
 			break;
 		}
+
+		pci_bus_unlock(b);
 	}
 }
 
@@ -1098,11 +1106,11 @@ static void __ref __pci_bridge_assign_resources(const struct pci_dev *bridge,
 					 struct list_head *fail_head)
 {
 	struct pci_bus *b;
+	struct pci_dev *dev = (struct pci_dev *)bridge;
 
-	pdev_assign_resources_sorted((struct pci_dev *)bridge,
-					 add_head, fail_head);
+	pdev_assign_resources_sorted(dev, add_head, fail_head);
 
-	b = bridge->subordinate;
+	b = pci_lock_subordinate(dev, PCI_BUS_STATE_STOPPING - 1);
 	if (!b)
 		return;
 
@@ -1122,7 +1130,10 @@ static void __ref __pci_bridge_assign_resources(const struct pci_dev *bridge,
 			 "%04x:%02x\n", pci_domain_nr(b), b->number);
 		break;
 	}
+
+	pci_bus_unlock(b);
 }
+
 static void pci_bridge_release_resources(struct pci_bus *bus,
 					  unsigned long type)
 {
@@ -1169,6 +1180,7 @@ enum release_type {
 	leaf_only,
 	whole_subtree,
 };
+
 /*
  * try to release pci bridge resources that is from leaf bridge,
  * so we can allocate big new one later
@@ -1190,9 +1202,14 @@ static void __ref pci_bus_release_bridge_resources(struct pci_bus *bus,
 		if ((dev->class >> 8) != PCI_CLASS_BRIDGE_PCI)
 			continue;
 
-		if (rel_type == whole_subtree)
+		if (rel_type != whole_subtree)
+			continue;
+
+		if (pci_bus_lock_states(b, PCI_BUS_STATE_STOPPING - 1) > 0) {
 			pci_bus_release_bridge_resources(b, type,
 						 whole_subtree);
+			pci_bus_unlock(b);
+		}
 	}
 
 	if (pci_is_root_bus(bus))
@@ -1253,6 +1270,7 @@ static int __init pci_bus_get_depth(struct pci_bus *bus)
 
 	return depth;
 }
+
 static int __init pci_get_max_depth(void)
 {
 	int depth = 0;
@@ -1285,6 +1303,7 @@ enum enable_type {
 };
 
 static enum enable_type pci_realloc_enable __initdata = undefined;
+
 void __init pci_realloc_get_opt(char *str)
 {
 	if (!strncmp(str, "off", 3))
@@ -1292,6 +1311,7 @@ void __init pci_realloc_get_opt(char *str)
 	else if (!strncmp(str, "on", 2))
 		pci_realloc_enable = user_enabled;
 }
+
 static bool __init pci_realloc_enabled(void)
 {
 	return pci_realloc_enable >= user_enabled;
@@ -1428,7 +1448,7 @@ enable_and_dump:
 
 void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
 {
-	struct pci_bus *parent = bridge->subordinate;
+	struct pci_bus *parent;
 	LIST_HEAD(add_list); /* list of resources that
 					want additional resources */
 	int tried_times = 0;
@@ -1438,6 +1458,10 @@ void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
 	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
 				  IORESOURCE_PREFETCH;
 
+	parent = pci_lock_subordinate(bridge, PCI_BUS_STATE_STOPPING - 1);
+	if (!parent)
+		return;
+
 again:
 	__pci_bus_size_bridges(parent, &add_list);
 	__pci_bridge_assign_resources(bridge, &add_list, &fail_head);
@@ -1464,8 +1488,13 @@ again:
 		struct pci_bus *bus = fail_res->dev->bus;
 		unsigned long flags = fail_res->flags;
 
+		if (bus != parent && pci_bus_lock_states(bus,
+					PCI_BUS_STATE_STOPPING - 1) < 0)
+			continue;
 		pci_bus_release_bridge_resources(bus, flags & type_mask,
 						 whole_subtree);
+		if (bus != parent)
+			pci_bus_unlock(bus);
 	}
 	/* restore size and flags */
 	list_for_each_entry(fail_res, &fail_head, list) {
@@ -1485,6 +1514,7 @@ enable_all:
 	retval = pci_reenable_device(bridge);
 	pci_set_master(bridge);
 	pci_enable_bridges(parent);
+	pci_bus_unlock(parent);
 }
 EXPORT_SYMBOL_GPL(pci_assign_unassigned_bridge_resources);
 
@@ -1502,19 +1532,22 @@ unsigned int __ref pci_rescan_bus(struct pci_bus *bus)
 {
 	unsigned int max;
 	struct pci_dev *dev;
+	struct pci_bus *b;
 	LIST_HEAD(add_list); /* list of resources that
 					want additional resources */
 
 	max = pci_scan_child_bus(bus);
 
-	down_read(&pci_bus_sem);
 	list_for_each_entry(dev, &bus->devices, bus_list)
 		if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE ||
-		    dev->hdr_type == PCI_HEADER_TYPE_CARDBUS)
-			if (dev->subordinate)
-				__pci_bus_size_bridges(dev->subordinate,
-							 &add_list);
-	up_read(&pci_bus_sem);
+		    dev->hdr_type == PCI_HEADER_TYPE_CARDBUS) {
+			b = pci_lock_subordinate(dev,
+					PCI_BUS_STATE_STOPPING - 1);
+			if (b) {
+				__pci_bus_size_bridges(b, &add_list);
+				pci_bus_unlock(b);
+			}
+		}
 	__pci_bus_assign_resources(bus, &add_list, NULL);
 	BUG_ON(!list_empty(&add_list));
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH v1 12/22] PCI: enhance PCI remove logic to support PCI bus lock mechanism
  2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
                   ` (10 preceding siblings ...)
  2012-08-07 16:10 ` [RFC PATCH v1 11/22] PCI: enhance PCI resource assignment " Jiang Liu
@ 2012-08-07 16:10 ` Jiang Liu
  2012-08-07 16:10 ` [RFC PATCH v1 13/22] PCI: make each PCI device hold a reference to its parent PCI bus Jiang Liu
                   ` (10 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Jiang Liu @ 2012-08-07 16:10 UTC (permalink / raw)
  To: Bjorn Helgaas, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige
  Cc: Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci, Jiang Liu

Enhance the PCI remove logic to support the PCI bus lock mechanism.
This patch implements the major part of the PCI bus state machine.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
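Note: the state transitions that pci_stop_bus_device() performs below can
be sketched as a small user-space model. This is only an illustration under
stated assumptions: the model_* names are hypothetical, and the
one-bit-per-state encoding in lifecycle order is inferred from the
"PCI_BUS_STATE_xxx - 1" masks used in this series, not copied from the
patch that defines the real values.

```c
#include <assert.h>

/*
 * Assumption: one bit per state, declared in lifecycle order. The real
 * PCI_BUS_STATE_* values come from an earlier patch and may differ.
 */
enum model_bus_state {
	MODEL_STATE_INITIALIZED	= 1 << 0,
	MODEL_STATE_REGISTERED	= 1 << 1,
	MODEL_STATE_WORKING	= 1 << 2,
	MODEL_STATE_STOPPING	= 1 << 3,
	MODEL_STATE_STOPPED	= 1 << 4,
	MODEL_STATE_REMOVED	= 1 << 5,
	MODEL_STATE_DESTROYED	= 1 << 6,
};

/* Hypothetical stand-in for struct pci_bus. */
struct model_bus {
	unsigned int state;	/* exactly one MODEL_STATE_* bit */
	int children_stopped;	/* counts simulated child walks */
};

/* Models the switch in pci_stop_bus_device(); returns the new state. */
static unsigned int model_stop_bus(struct model_bus *bus)
{
	/* Like locking with the mask PCI_BUS_STATE_REMOVED - 1. */
	if (!(bus->state & (MODEL_STATE_REMOVED - 1)))
		return bus->state;

	switch (bus->state) {
	case MODEL_STATE_INITIALIZED:
		/* Never registered: no children are walked. */
		bus->state = MODEL_STATE_STOPPING;
		break;
	case MODEL_STATE_WORKING:
	case MODEL_STATE_REGISTERED:
		bus->state = MODEL_STATE_STOPPING;
		bus->children_stopped++;	/* stands in for the child walk */
		bus->state = MODEL_STATE_STOPPED;
		break;
	case MODEL_STATE_STOPPING:
	case MODEL_STATE_STOPPED:
		/* Already stopping or stopped: nothing to do. */
		break;
	default:
		assert(!"state outside the model");
	}
	return bus->state;
}
```

With this encoding, a mask such as MODEL_STATE_STOPPING - 1 selects every
state that precedes STOPPING, which is how the masks in this series appear
to be used.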
 drivers/pci/remove.c |  146 +++++++++++++++++++++++++++++---------------------
 1 file changed, 85 insertions(+), 61 deletions(-)

diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index ba03059..a26a841 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -26,21 +26,25 @@ static void pci_stop_dev(struct pci_dev *dev)
 		dev->is_added = 0;
 	}
 
+	/* TODO: check whether it's safe to call aspm here */
 	if (dev->bus->self)
 		pcie_aspm_exit_link_state(dev);
 }
 
 static void pci_destroy_dev(struct pci_dev *dev)
 {
-	/* Remove the device from the device lists, and prevent any further
-	 * list accesses from this device */
 	down_write(&pci_bus_sem);
-	list_del(&dev->bus_list);
-	dev->bus_list.next = dev->bus_list.prev = NULL;
-	up_write(&pci_bus_sem);
-
-	pci_free_resources(dev);
-	put_device(&dev->dev);
+	if (dev->bus_list.next == NULL) {
+		up_write(&pci_bus_sem);
+	} else {
+		/* Remove the device from the device lists, and prevent any
+		 * further list accesses from this device */
+		list_del(&dev->bus_list);
+		dev->bus_list.next = dev->bus_list.prev = NULL;
+		up_write(&pci_bus_sem);
+		pci_free_resources(dev);
+		put_device(&dev->dev);
+	}
 }
 
 /**
@@ -64,29 +68,44 @@ int pci_remove_device_safe(struct pci_dev *dev)
 
 void pci_remove_bus(struct pci_bus *pci_bus)
 {
-	pci_proc_detach_bus(pci_bus);
+	int state = pci_bus_get_state(pci_bus);
 
-	down_write(&pci_bus_sem);
-	list_del(&pci_bus->node);
-	pci_bus_release_busn_res(pci_bus);
-	up_write(&pci_bus_sem);
-	if (pci_bus->is_added) {
+	switch (state) {
+	case PCI_BUS_STATE_STOPPED:
+	case PCI_BUS_STATE_REGISTERED:
 		pci_remove_legacy_files(pci_bus);
 		device_del(&pci_bus->dev);
+	case PCI_BUS_STATE_STOPPING:
+	case PCI_BUS_STATE_INITIALIZED:
+		pci_proc_detach_bus(pci_bus);
+		down_write(&pci_bus_sem);
+		list_del(&pci_bus->node);
+		pci_bus_release_busn_res(pci_bus);
+		up_write(&pci_bus_sem);
+		pci_bus_change_state(pci_bus, state,
+				     PCI_BUS_STATE_REMOVED, true);
+		pci_bus_put(pci_bus);
+		break;
+	case PCI_BUS_STATE_REMOVED:
+		pci_bus_unlock(pci_bus);
+		break;
+	default:
+		BUG_ON(state);
+		break;
 	}
-	put_device(&pci_bus->dev);
 }
 EXPORT_SYMBOL(pci_remove_bus);
 
-static void pci_remove_behind_bridge(struct pci_dev *dev);
-
 void __pci_remove_bus_device(struct pci_dev *dev)
 {
-	if (dev->subordinate) {
-		struct pci_bus *b = dev->subordinate;
+	struct list_head *l, *n;
+	struct pci_bus *bus;
 
-		pci_remove_behind_bridge(dev);
-		pci_remove_bus(b);
+	bus = pci_lock_subordinate(dev, PCI_BUS_STATE_DESTROYED - 1);
+	if (bus) {
+		list_for_each_safe(l, n, &bus->devices)
+			__pci_remove_bus_device(pci_dev_b(l));
+		pci_remove_bus(bus);
 		dev->subordinate = NULL;
 	}
 
@@ -111,24 +130,7 @@ void pci_stop_and_remove_bus_device(struct pci_dev *dev)
 	pci_stop_bus_device(dev);
 	__pci_remove_bus_device(dev);
 }
-
-static void pci_remove_behind_bridge(struct pci_dev *dev)
-{
-	struct list_head *l, *n;
-
-	if (dev->subordinate)
-		list_for_each_safe(l, n, &dev->subordinate->devices)
-			__pci_remove_bus_device(pci_dev_b(l));
-}
-
-static void pci_stop_behind_bridge(struct pci_dev *dev)
-{
-	struct list_head *l, *n;
-
-	if (dev->subordinate)
-		list_for_each_safe(l, n, &dev->subordinate->devices)
-			pci_stop_bus_device(pci_dev_b(l));
-}
+EXPORT_SYMBOL(pci_stop_and_remove_bus_device);
 
 /**
  * pci_stop_and_remove_behind_bridge - stop and remove all devices behind
@@ -141,27 +143,17 @@ static void pci_stop_behind_bridge(struct pci_dev *dev)
  */
 void pci_stop_and_remove_behind_bridge(struct pci_dev *dev)
 {
-	pci_stop_behind_bridge(dev);
-	pci_remove_behind_bridge(dev);
-}
-
-static void pci_stop_bus_devices(struct pci_bus *bus)
-{
 	struct list_head *l, *n;
+	struct pci_bus *bus;
 
-	/*
-	 * VFs could be removed by pci_stop_and_remove_bus_device() in the
-	 *  pci_stop_bus_devices() code path for PF.
-	 *  aka, bus->devices get updated in the process.
-	 * but VFs are inserted after PFs when SRIOV is enabled for PF,
-	 * We can iterate the list backwards to get prev valid PF instead
-	 *  of removed VF.
-	 */
-	list_for_each_prev_safe(l, n, &bus->devices) {
-		struct pci_dev *dev = pci_dev_b(l);
-		pci_stop_bus_device(dev);
+	bus = pci_lock_subordinate(dev, PCI_BUS_STATE_REMOVED - 1);
+	if (bus) {
+		list_for_each_safe(l, n, &bus->devices)
+			pci_stop_and_remove_bus_device(pci_dev_b(l));
+		pci_bus_unlock(bus);
 	}
 }
+EXPORT_SYMBOL(pci_stop_and_remove_behind_bridge);
 
 /**
  * pci_stop_bus_device - stop a PCI device and any children
@@ -173,12 +165,44 @@ static void pci_stop_bus_devices(struct pci_bus *bus)
  */
 void pci_stop_bus_device(struct pci_dev *dev)
 {
-	if (dev->subordinate)
-		pci_stop_bus_devices(dev->subordinate);
+	int state;
+	struct pci_bus *bus;
+	struct list_head *l, *n;
+
+	bus = pci_lock_subordinate(dev, PCI_BUS_STATE_REMOVED - 1);
+	if (!bus)
+		goto out;
+
+	state = pci_bus_get_state(bus);
+	switch (state) {
+	case PCI_BUS_STATE_INITIALIZED:
+		pci_bus_change_state(bus, state, PCI_BUS_STATE_STOPPING, true);
+		break;
+	case PCI_BUS_STATE_WORKING:
+	case PCI_BUS_STATE_REGISTERED:
+		pci_bus_change_state(bus, state, PCI_BUS_STATE_STOPPING, false);
+		/*
+		 * Stopping a PF may remove its VFs through
+		 * pci_stop_and_remove_bus_device(), so bus->devices can
+		 * change during this walk. VFs are inserted after their
+		 * PF when SR-IOV is enabled, so iterate backwards: the
+		 * saved previous entry is then a PF that is still valid,
+		 * not a VF that has just been removed.
+		 */
+		list_for_each_prev_safe(l, n, &bus->devices)
+			pci_stop_bus_device(pci_dev_b(l));
+		pci_bus_change_state(bus, PCI_BUS_STATE_STOPPING,
+				     PCI_BUS_STATE_STOPPED, true);
+		break;
+	case PCI_BUS_STATE_STOPPING:
+	case PCI_BUS_STATE_STOPPED:
+		pci_bus_unlock(bus);
+		break;
+	default:
+		BUG_ON(state);
+	}
 
+out:
 	pci_stop_dev(dev);
 }
-
-EXPORT_SYMBOL(pci_stop_and_remove_bus_device);
-EXPORT_SYMBOL(pci_stop_and_remove_behind_bridge);
 EXPORT_SYMBOL_GPL(pci_stop_bus_device);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH v1 13/22] PCI: make each PCI device hold a reference to its parent PCI bus
  2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
                   ` (11 preceding siblings ...)
  2012-08-07 16:10 ` [RFC PATCH v1 12/22] PCI: enhance PCI remove " Jiang Liu
@ 2012-08-07 16:10 ` Jiang Liu
  2012-08-07 16:10 ` [RFC PATCH v1 14/22] PCI/sysfs: use PCI bus lock to avoid race conditions Jiang Liu
                   ` (9 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Jiang Liu @ 2012-08-07 16:10 UTC (permalink / raw)
  To: Bjorn Helgaas, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige
  Cc: Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci, Jiang Liu

Make each PCI device hold a reference to its parent PCI bus, so that
dev->bus stays valid for the lifetime of the device and the following
call cannot access freed memory:
pci_bus_lock_states(dev->bus, PCI_BUS_STATE_xxxx);

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
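Note: the ownership rule this patch introduces can be sketched in user
space as follows. It is a minimal model, not kernel code: the model_*
types and helpers are hypothetical, and a plain integer stands in for the
real reference count. The only behaviour taken from the patch is that the
"get" helper returns the bus it was given, so it can be used directly in
the assignment to dev->bus.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct pci_bus and struct pci_dev. */
struct model_bus {
	int refcount;
};

struct model_dev {
	struct model_bus *bus;
};

/* Take a reference and return the bus, like pci_bus_get() in the patch. */
static struct model_bus *model_bus_get(struct model_bus *bus)
{
	if (bus)
		bus->refcount++;
	return bus;
}

/* Drop a reference taken with model_bus_get(). */
static void model_bus_put(struct model_bus *bus)
{
	if (bus)
		bus->refcount--;
}

/* The device pins its parent bus for as long as the device exists. */
static struct model_dev *model_dev_create(struct model_bus *bus)
{
	struct model_dev *dev = malloc(sizeof(*dev));

	if (!dev)
		return NULL;
	dev->bus = model_bus_get(bus);
	return dev;
}

/* Release the parent bus only when the device itself is destroyed. */
static void model_dev_destroy(struct model_dev *dev)
{
	model_bus_put(dev->bus);
	free(dev);
}
```

Because every get in model_dev_create() is paired with a put in
model_dev_destroy(), the count of the bus returns to its starting value
once all devices on it are gone.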
 drivers/pci/iov.c    |    3 ++-
 drivers/pci/probe.c  |    2 +-
 drivers/pci/remove.c |    1 +
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index c7d2969..40f5f52 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -92,7 +92,8 @@ static int virtfn_add(struct pci_dev *dev, int id, int reset)
 		kfree(virtfn);
 		mutex_unlock(&iov->dev->sriov->lock);
 		return -ENOMEM;
-	}
+	} else
+		pci_bus_get(virtfn->bus);
 	virtfn->devfn = virtfn_devfn(dev, id);
 	virtfn->vendor = dev->vendor;
 	pci_read_config_word(dev, iov->pos + PCI_SRIOV_VF_DID, &virtfn->device);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index e6b40d0..47bf071 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1272,7 +1272,7 @@ static struct pci_dev *pci_scan_device(struct pci_bus *bus, int devfn)
 	if (!dev)
 		return NULL;
 
-	dev->bus = bus;
+	dev->bus = pci_bus_get(bus);
 	dev->devfn = devfn;
 	dev->vendor = l & 0xffff;
 	dev->device = (l >> 16) & 0xffff;
diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index a26a841..0d947c0 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -43,6 +43,7 @@ static void pci_destroy_dev(struct pci_dev *dev)
 		dev->bus_list.next = dev->bus_list.prev = NULL;
 		up_write(&pci_bus_sem);
 		pci_free_resources(dev);
+		pci_bus_put(dev->bus);
 		put_device(&dev->dev);
 	}
 }
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH v1 14/22] PCI/sysfs: use PCI bus lock to avoid race conditions
  2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
                   ` (12 preceding siblings ...)
  2012-08-07 16:10 ` [RFC PATCH v1 13/22] PCI: make each PCI device hold a reference to its parent PCI bus Jiang Liu
@ 2012-08-07 16:10 ` Jiang Liu
  2012-08-07 16:10 ` [RFC PATCH v1 15/22] PCI/eeepc: " Jiang Liu
                   ` (8 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Jiang Liu @ 2012-08-07 16:10 UTC (permalink / raw)
  To: Bjorn Helgaas, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige
  Cc: Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci, Jiang Liu

This patch uses the PCI bus lock mechanism to avoid race conditions when
hotplugging PCI devices or host bridges through the PCI sysfs interfaces.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
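Note: every sysfs handler below follows the same "lock in an allowed
state, operate, unlock" pattern. The sketch below models it in user space.
All model_* names are hypothetical, and the return convention (positive on
success, negative on failure) is an assumption inferred from the "> 0" and
"< 0" checks in this series. The real pci_bus_lock_states() may also wait
for the lock, whereas this model fails at once.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed one-bit-per-state encoding; only two states are modelled. */
enum {
	MODEL_WORKING	= 1 << 2,
	MODEL_STOPPING	= 1 << 3,
};

/* Hypothetical stand-in for struct pci_bus. */
struct model_bus {
	unsigned int state;
	bool locked;
	int rescans;
};

/* Lock the bus only if its current state is in @mask. */
static int model_lock_states(struct model_bus *bus, unsigned int mask)
{
	if (bus->locked || !(bus->state & mask))
		return -1;
	bus->locked = true;
	return 1;
}

static void model_unlock(struct model_bus *bus)
{
	bus->locked = false;
}

/* Rescan only a bus that is still in the working state. */
static int model_rescan(struct model_bus *bus)
{
	if (model_lock_states(bus, MODEL_WORKING) < 0)
		return -1;	/* bus is going away: skip it */

	bus->rescans++;		/* stands in for pci_rescan_bus() */
	model_unlock(bus);
	return 0;
}
```

A bus that has already entered the stopping state is skipped without being
touched, which is the race the handlers are meant to close.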
 drivers/pci/pci-sysfs.c |   26 +++++++++++++++++++++-----
 drivers/pci/probe.c     |   17 +++++++++++------
 2 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 99fefbe..11043b4 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -298,7 +298,10 @@ static ssize_t bus_rescan_store(struct bus_type *bus, const char *buf,
 		pci_host_bridge_hotplug_lock();
 		mutex_lock(&pci_remove_rescan_mutex);
 		while ((b = pci_find_next_bus(b)) != NULL)
-			pci_rescan_bus(b);
+			if (pci_bus_lock_states(b, PCI_BUS_STATE_WORKING) > 0) {
+				pci_rescan_bus(b);
+				pci_bus_unlock(b);
+			}
 		mutex_unlock(&pci_remove_rescan_mutex);
 		pci_host_bridge_hotplug_unlock();
 	}
@@ -321,8 +324,14 @@ dev_rescan_store(struct device *dev, struct device_attribute *attr,
 		return -EINVAL;
 
 	if (val) {
+		struct pci_bus *bus = pdev->bus;
+
 		mutex_lock(&pci_remove_rescan_mutex);
-		pci_rescan_bus(pdev->bus);
+		if (pci_bus_lock_states(bus, PCI_BUS_STATE_WORKING) > 0) {
+			if (pdev->is_added)
+				pci_rescan_bus(bus);
+			pci_bus_unlock(bus);
+		}
 		mutex_unlock(&pci_remove_rescan_mutex);
 	}
 	return count;
@@ -331,9 +340,14 @@ dev_rescan_store(struct device *dev, struct device_attribute *attr,
 static void remove_callback(struct device *dev)
 {
 	struct pci_dev *pdev = to_pci_dev(dev);
+	struct pci_bus *bus = pdev->bus;
 
 	mutex_lock(&pci_remove_rescan_mutex);
-	pci_stop_and_remove_bus_device(pdev);
+	if (pci_bus_lock_states(bus, PCI_BUS_STATE_WORKING) > 0) {
+		pci_bus_get(bus);
+		pci_stop_and_remove_bus_device(pdev);
+		pci_unlock_and_put_bus(bus);
+	}
 	mutex_unlock(&pci_remove_rescan_mutex);
 }
 
@@ -369,10 +383,12 @@ dev_bus_rescan_store(struct device *dev, struct device_attribute *attr,
 
 	if (val) {
 		mutex_lock(&pci_remove_rescan_mutex);
-		if (!pci_is_root_bus(bus) && list_empty(&bus->devices))
+		if (!pci_is_root_bus(bus))
 			pci_rescan_bus_bridge_resize(bus->self);
-		else
+		else if (pci_bus_lock_states(bus, PCI_BUS_STATE_WORKING) > 0) {
 			pci_rescan_bus(bus);
+			pci_bus_unlock(bus);
+		}
 		mutex_unlock(&pci_remove_rescan_mutex);
 	}
 	return count;
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 47bf071..da6f04c 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1919,12 +1919,17 @@ EXPORT_SYMBOL(pci_scan_bus);
 unsigned int __ref pci_rescan_bus_bridge_resize(struct pci_dev *bridge)
 {
 	unsigned int max = -1;
-	struct pci_bus *bus = bridge->subordinate;
-
-	if (pci_bus_lock_states(bus, PCI_BUS_STATE_WORKING) > 0) {
-		max = pci_scan_child_bus(bus);
-		pci_assign_unassigned_bridge_resources(bridge);
-		pci_bus_add_devices(bus);
+	struct pci_bus *bus;
+
+	bus = pci_lock_subordinate(bridge, PCI_BUS_STATE_WORKING);
+	if (bus) {
+		if (list_empty(&bus->devices)) {
+			max = pci_scan_child_bus(bus);
+			pci_assign_unassigned_bridge_resources(bridge);
+			pci_bus_add_devices(bus);
+		} else {
+			pci_rescan_bus(bus);
+		}
 		pci_bus_unlock(bus);
 	}
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH v1 15/22] PCI/eeepc: use PCI bus lock to avoid race conditions
  2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
                   ` (13 preceding siblings ...)
  2012-08-07 16:10 ` [RFC PATCH v1 14/22] PCI/sysfs: use PCI bus lock to avoid race conditions Jiang Liu
@ 2012-08-07 16:10 ` Jiang Liu
  2012-09-11 23:18   ` Bjorn Helgaas
  2012-08-07 16:10 ` [RFC PATCH v1 16/22] PCI/asus-wmi: use PCI bus lock to avoid race conditions Jiang Liu
                   ` (7 subsequent siblings)
  22 siblings, 1 reply; 51+ messages in thread
From: Jiang Liu @ 2012-08-07 16:10 UTC (permalink / raw)
  To: Bjorn Helgaas, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige
  Cc: Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci, Jiang Liu

This patch uses the PCI bus lock mechanism to avoid race conditions when
hotplugging PCI devices through the eeepc driver. It also fixes a PCI device
reference count leak: acpi_get_pci_dev() returns the device with a reference
held, and that reference was never dropped.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
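Note: the new out_unlock_bus/out_put_dev labels form a cleanup ladder in
which each exit undoes only what has been acquired so far. A minimal
user-space sketch of that ordering follows. The model_* names and the
fail_at parameter are hypothetical and exist only to exercise each exit
path.

```c
#include <assert.h>

/* Hypothetical resource tracker: one device reference, one bus lock. */
struct model_res {
	int dev_refs;
	int bus_locked;
};

/* fail_at: 0 = success, 1 = bus lock fails, 2 = config read fails. */
static int model_hotplug(struct model_res *r, int fail_at)
{
	int ret = -1;

	r->dev_refs++;		/* the lookup returns a referenced device */

	if (fail_at == 1)
		goto out_put_dev;	/* lock not taken: only drop the ref */
	r->bus_locked = 1;

	if (fail_at == 2)
		goto out_unlock_bus;	/* lock held: release it first */

	ret = 0;

out_unlock_bus:
	r->bus_locked = 0;
out_put_dev:
	r->dev_refs--;
	return ret;
}
```

Whichever path is taken, the reference count and the lock state end up
where they started, which is the property the relabelled gotos in this
patch are meant to provide.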
 drivers/platform/x86/eeepc-laptop.c |   20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/platform/x86/eeepc-laptop.c b/drivers/platform/x86/eeepc-laptop.c
index dab91b4..25c4176 100644
--- a/drivers/platform/x86/eeepc-laptop.c
+++ b/drivers/platform/x86/eeepc-laptop.c
@@ -606,16 +606,16 @@ static void eeepc_rfkill_hotplug(struct eeepc_laptop *eeepc, acpi_handle handle)
 			goto out_unlock;
 		}
 
-		bus = port->subordinate;
+		bus = pci_lock_subordinate(port, PCI_BUS_STATE_WORKING);
 
 		if (!bus) {
 			pr_warn("Unable to find PCI bus 1?\n");
-			goto out_unlock;
+			goto out_put_dev;
 		}
 
 		if (pci_bus_read_config_dword(bus, 0, PCI_VENDOR_ID, &l)) {
 			pr_err("Unable to read PCI config space?\n");
-			goto out_unlock;
+			goto out_unlock_bus;
 		}
 
 		absent = (l == 0xffffffff);
@@ -627,7 +627,7 @@ static void eeepc_rfkill_hotplug(struct eeepc_laptop *eeepc, acpi_handle handle)
 				absent ? "absent" : "present");
 			pr_warn("skipped wireless hotplug as probably "
 				"inappropriate for this model\n");
-			goto out_unlock;
+			goto out_unlock_bus;
 		}
 
 		if (!blocked) {
@@ -635,7 +635,7 @@ static void eeepc_rfkill_hotplug(struct eeepc_laptop *eeepc, acpi_handle handle)
 			if (dev) {
 				/* Device already present */
 				pci_dev_put(dev);
-				goto out_unlock;
+				goto out_unlock_bus;
 			}
 			dev = pci_scan_single_device(bus, 0);
 			if (dev) {
@@ -650,6 +650,11 @@ static void eeepc_rfkill_hotplug(struct eeepc_laptop *eeepc, acpi_handle handle)
 				pci_dev_put(dev);
 			}
 		}
+
+out_unlock_bus:
+		pci_bus_unlock(bus);
+out_put_dev:
+		pci_dev_put(port);
 	}
 
 out_unlock:
@@ -757,7 +762,7 @@ static struct hotplug_slot_ops eeepc_hotplug_slot_ops = {
 static int eeepc_setup_pci_hotplug(struct eeepc_laptop *eeepc)
 {
 	int ret = -ENOMEM;
-	struct pci_bus *bus = pci_find_bus(0, 1);
+	struct pci_bus *bus = pci_get_bus(0, 1);
 
 	if (!bus) {
 		pr_err("Unable to find wifi PCI bus\n");
@@ -785,6 +790,8 @@ static int eeepc_setup_pci_hotplug(struct eeepc_laptop *eeepc)
 		goto error_register;
 	}
 
+	pci_bus_put(bus);
+
 	return 0;
 
 error_register:
@@ -793,6 +800,7 @@ error_info:
 	kfree(eeepc->hotplug_slot);
 	eeepc->hotplug_slot = NULL;
 error_slot:
+	pci_bus_put(bus);
 	return ret;
 }
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH v1 16/22] PCI/asus-wmi: use PCI bus lock to avoid race conditions
  2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
                   ` (14 preceding siblings ...)
  2012-08-07 16:10 ` [RFC PATCH v1 15/22] PCI/eeepc: " Jiang Liu
@ 2012-08-07 16:10 ` Jiang Liu
  2012-08-07 16:10 ` [RFC PATCH v1 17/22] PCI/pciehp: " Jiang Liu
                   ` (6 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Jiang Liu @ 2012-08-07 16:10 UTC (permalink / raw)
  To: Bjorn Helgaas, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige
  Cc: Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci, Jiang Liu

This patch uses the PCI bus lock mechanism to avoid race conditions when
hotplugging PCI devices through the asus-wmi driver.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/platform/x86/asus-wmi.c |   23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/platform/x86/asus-wmi.c b/drivers/platform/x86/asus-wmi.c
index 77aadde..9972bc7 100644
--- a/drivers/platform/x86/asus-wmi.c
+++ b/drivers/platform/x86/asus-wmi.c
@@ -533,15 +533,20 @@ static void asus_rfkill_hotplug(struct asus_wmi *asus)
 		rfkill_set_sw_state(asus->wlan.rfkill, blocked);
 
 	if (asus->hotplug_slot) {
-		bus = pci_find_bus(0, 1);
+		bus = pci_get_bus(0, 1);
 		if (!bus) {
 			pr_warn("Unable to find PCI bus 1?\n");
 			goto out_unlock;
 		}
 
+		if (pci_bus_lock_states(bus, PCI_BUS_STATE_WORKING) < 0) {
+			pr_warn("Unable to lock PCI bus 1?\n");
+			goto out_put_bus;
+		}
+
 		if (pci_bus_read_config_dword(bus, 0, PCI_VENDOR_ID, &l)) {
 			pr_err("Unable to read PCI config space?\n");
-			goto out_unlock;
+			goto out_unlock_bus;
 		}
 		absent = (l == 0xffffffff);
 
@@ -552,7 +557,7 @@ static void asus_rfkill_hotplug(struct asus_wmi *asus)
 				absent ? "absent" : "present");
 			pr_warn("skipped wireless hotplug as probably "
 				"inappropriate for this model\n");
-			goto out_unlock;
+			goto out_unlock_bus;
 		}
 
 		if (!blocked) {
@@ -560,7 +565,7 @@ static void asus_rfkill_hotplug(struct asus_wmi *asus)
 			if (dev) {
 				/* Device already present */
 				pci_dev_put(dev);
-				goto out_unlock;
+				goto out_unlock_bus;
 			}
 			dev = pci_scan_single_device(bus, 0);
 			if (dev) {
@@ -575,6 +580,11 @@ static void asus_rfkill_hotplug(struct asus_wmi *asus)
 				pci_dev_put(dev);
 			}
 		}
+
+out_unlock_bus:
+		pci_bus_unlock(bus);
+out_put_bus:
+		pci_bus_put(bus);
 	}
 
 out_unlock:
@@ -670,7 +680,7 @@ static void asus_hotplug_work(struct work_struct *work)
 static int asus_setup_pci_hotplug(struct asus_wmi *asus)
 {
 	int ret = -ENOMEM;
-	struct pci_bus *bus = pci_find_bus(0, 1);
+	struct pci_bus *bus = pci_get_bus(0, 1);
 
 	if (!bus) {
 		pr_err("Unable to find wifi PCI bus\n");
@@ -705,6 +715,8 @@ static int asus_setup_pci_hotplug(struct asus_wmi *asus)
 		goto error_register;
 	}
 
+	pci_bus_put(bus);
+
 	return 0;
 
 error_register:
@@ -715,6 +727,7 @@ error_info:
 error_slot:
 	destroy_workqueue(asus->hotplug_workqueue);
 error_workqueue:
+	pci_bus_put(bus);
 	return ret;
 }
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH v1 17/22] PCI/pciehp: use PCI bus lock to avoid race conditions
  2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
                   ` (15 preceding siblings ...)
  2012-08-07 16:10 ` [RFC PATCH v1 16/22] PCI/asus-wmi: use PCI bus lock to avoid race conditions Jiang Liu
@ 2012-08-07 16:10 ` Jiang Liu
  2012-08-07 16:10 ` [RFC PATCH v1 18/22] PCI/acpiphp: " Jiang Liu
                   ` (5 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Jiang Liu @ 2012-08-07 16:10 UTC (permalink / raw)
  To: Bjorn Helgaas, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige
  Cc: Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci, Jiang Liu

This patch uses the PCI bus lock mechanism to avoid race conditions when
hotplugging PCI devices through the pciehp driver.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/hotplug/pciehp_pci.c |   15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
index 09cecaf..9536a9e 100644
--- a/drivers/pci/hotplug/pciehp_pci.c
+++ b/drivers/pci/hotplug/pciehp_pci.c
@@ -42,18 +42,25 @@ int pciehp_configure_device(struct slot *p_slot)
 	int num, fn;
 	struct controller *ctrl = p_slot->ctrl;
 
+	if (pci_bus_lock_states(parent, PCI_BUS_STATE_WORKING) < 0) {
+		ctrl_dbg(ctrl, "Port has been removed\n");
+		return -EINVAL;
+	}
+
 	dev = pci_get_slot(parent, PCI_DEVFN(0, 0));
 	if (dev) {
 		ctrl_err(ctrl, "Device %s already exists "
 			 "at %04x:%02x:00, cannot hot-add\n", pci_name(dev),
 			 pci_domain_nr(parent), parent->number);
 		pci_dev_put(dev);
+		pci_bus_unlock(parent);
 		return -EINVAL;
 	}
 
 	num = pci_scan_slot(parent, PCI_DEVFN(0, 0));
 	if (num == 0) {
 		ctrl_err(ctrl, "No new device found\n");
+		pci_bus_unlock(parent);
 		return -ENODEV;
 	}
 
@@ -82,6 +89,7 @@ int pciehp_configure_device(struct slot *p_slot)
 	}
 
 	pci_bus_add_devices(parent);
+	pci_bus_unlock(parent);
 
 	return 0;
 }
@@ -96,6 +104,11 @@ int pciehp_unconfigure_device(struct slot *p_slot)
 	u16 command;
 	struct controller *ctrl = p_slot->ctrl;
 
+	if (pci_bus_lock_states(parent, PCI_BUS_STATE_WORKING) < 0) {
+		ctrl_dbg(ctrl, "Port has been removed\n");
+		return -EINVAL;
+	}
+
 	ctrl_dbg(ctrl, "%s: domain:bus:dev = %04x:%02x:00\n",
 		 __func__, pci_domain_nr(parent), parent->number);
 	ret = pciehp_get_adapter_status(p_slot, &presence);
@@ -131,5 +144,7 @@ int pciehp_unconfigure_device(struct slot *p_slot)
 		pci_dev_put(temp);
 	}
 
+	pci_bus_unlock(parent);
+
 	return rc;
 }
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH v1 18/22] PCI/acpiphp: use PCI bus lock to avoid race conditions
  2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
                   ` (16 preceding siblings ...)
  2012-08-07 16:10 ` [RFC PATCH v1 17/22] PCI/pciehp: " Jiang Liu
@ 2012-08-07 16:10 ` Jiang Liu
  2012-08-07 16:10 ` [RFC PATCH v1 19/22] PCI/x86: enable PCI bus lock mechanism for x86 platforms Jiang Liu
                   ` (4 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Jiang Liu @ 2012-08-07 16:10 UTC (permalink / raw)
  To: Bjorn Helgaas, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige
  Cc: Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci

From: Jiang Liu <liuj97@gmail.com>

This patch uses the PCI bus lock mechanism to avoid race conditions when
hotplugging PCI devices or host bridges through the acpiphp driver.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/hotplug/acpiphp_glue.c |   13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
index 73af337..0ea7ab1 100644
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -800,11 +800,14 @@ static int __ref enable_device(struct acpiphp_slot *slot)
 	if (slot->flags & SLOT_ENABLED)
 		goto err_exit;
 
+	if (pci_bus_lock_states(bus, PCI_BUS_STATE_WORKING) < 0)
+		return -EINVAL;
+
 	num = pci_scan_slot(bus, PCI_DEVFN(slot->device, 0));
 	if (num == 0) {
 		/* Maybe only part of funcs are added. */
 		dbg("No new device found\n");
-		goto err_exit;
+		goto out_unlock;
 	}
 
 	max = acpiphp_max_busnr(bus);
@@ -862,8 +865,10 @@ static int __ref enable_device(struct acpiphp_slot *slot)
 		pci_dev_put(dev);
 	}
 
+out_unlock:
+	pci_bus_unlock(bus);
 
- err_exit:
+err_exit:
 	return retval;
 }
 
@@ -906,6 +911,9 @@ static int disable_device(struct acpiphp_slot *slot)
 	struct pci_dev *pdev;
 	struct pci_bus *bus = slot->bridge->pci_bus;
 
+	if (pci_bus_lock_states(bus, PCI_BUS_STATE_WORKING) < 0)
+		goto err_exit;
+
 	/* The slot will be enabled when func 0 is added, so check
 	   func 0 before disable the slot. */
 	pdev = pci_get_slot(bus, PCI_DEVFN(slot->device, 0));
@@ -943,6 +951,7 @@ static int disable_device(struct acpiphp_slot *slot)
 	}
 
 	slot->flags &= (~SLOT_ENABLED);
+	pci_bus_unlock(bus);
 
 err_exit:
 	return 0;
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH v1 19/22] PCI/x86: enable PCI bus lock mechanism for x86 platforms
  2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
                   ` (17 preceding siblings ...)
  2012-08-07 16:10 ` [RFC PATCH v1 18/22] PCI/acpiphp: " Jiang Liu
@ 2012-08-07 16:10 ` Jiang Liu
  2012-09-11 23:22   ` Bjorn Helgaas
  2012-08-07 16:11 ` [RFC PATCH v1 20/22] PCI/IA64: enable PCI bus lock mechanism for IA64 platforms Jiang Liu
                   ` (3 subsequent siblings)
  22 siblings, 1 reply; 51+ messages in thread
From: Jiang Liu @ 2012-08-07 16:10 UTC (permalink / raw)
  To: Bjorn Helgaas, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige
  Cc: Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci, Jiang Liu

This patch turns on the PCI bus lock mechanism for x86 platforms. It also
enhances the x86-specific PCI implementation to support the PCI bus lock.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 arch/x86/pci/acpi.c   |    6 +++++-
 arch/x86/pci/common.c |   12 ++++++++++++
 drivers/pci/Kconfig   |    3 +--
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
index 2bb885a..c68dbdf 100644
--- a/arch/x86/pci/acpi.c
+++ b/arch/x86/pci/acpi.c
@@ -414,7 +414,8 @@ struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_pci_root *root)
 	 * Maybe the desired pci bus has been already scanned. In such case
 	 * it is unnecessary to scan the pci bus with the given domain,busnum.
 	 */
-	bus = pci_find_bus(domain, busnum);
+	bus = __pci_get_and_lock_bus(domain, busnum,
+				     PCI_BUS_STATE_STOPPING - 1);
 	if (bus) {
 		/*
 		 * If the desired bus exits, the content of bus->sysdata will
@@ -449,6 +450,7 @@ struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_pci_root *root)
 			pci_free_resource_list(&resources);
 			__release_pci_root_info(info);
 		}
+		pci_bus_get(bus);
 	}
 
 	/* After the PCI-E bus has been walked and all devices discovered,
@@ -475,6 +477,8 @@ struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_pci_root *root)
 #endif
 	}
 
+	pci_unlock_and_put_bus(bus);
+
 	return bus;
 }
 
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 0ad990a..8b7ae63 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -667,6 +667,18 @@ struct pci_bus * __devinit pci_scan_bus_with_sysdata(int busno)
 	return pci_scan_bus_on_node(busno, &pci_root_ops, -1);
 }
 
+static DEFINE_MUTEX(pci_root_bus_mutex);
+
+void arch_pci_lock_host_bridge_hotplug(void)
+{
+	mutex_lock(&pci_root_bus_mutex);
+}
+
+void arch_pci_unlock_host_bridge_hotplug(void)
+{
+	mutex_unlock(&pci_root_bus_mutex);
+}
+
 /*
  * NUMA info for PCI busses
  *
diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index a6df8b1..1bbe924 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -122,5 +122,4 @@ config PCI_LABEL
 	select NLS
 
 config PCI_BUS_LOCK
-	bool
-	default n
+	def_bool y if X86
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH v1 20/22] PCI/IA64: enable PCI bus lock mechanism for IA64 platforms
  2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
                   ` (18 preceding siblings ...)
  2012-08-07 16:10 ` [RFC PATCH v1 19/22] PCI/x86: enable PCI bus lock mechanism for x86 platforms Jiang Liu
@ 2012-08-07 16:11 ` Jiang Liu
  2012-08-07 16:11 ` [RFC PATCH v1 21/22] PCI: cleanups for PCI bus lock implementation Jiang Liu
                   ` (2 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Jiang Liu @ 2012-08-07 16:11 UTC (permalink / raw)
  To: Bjorn Helgaas, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige
  Cc: Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci

From: Jiang Liu <liuj97@gmail.com>

This patch turns on PCI bus lock mechanism for IA64 platforms.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 arch/ia64/pci/pci.c               |    2 ++
 arch/ia64/sn/kernel/io_init.c     |    1 +
 arch/ia64/sn/pci/tioca_provider.c |    4 +++-
 drivers/pci/Kconfig               |    2 +-
 4 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/ia64/pci/pci.c b/arch/ia64/pci/pci.c
index d173a88..259a2e1 100644
--- a/arch/ia64/pci/pci.c
+++ b/arch/ia64/pci/pci.c
@@ -387,6 +387,8 @@ pci_acpi_scan_root(struct acpi_pci_root *root)
 	}
 
 	pci_scan_child_bus(pbus);
+	pci_bus_unlock(pbus);
+
 	return pbus;
 
 out3:
diff --git a/arch/ia64/sn/kernel/io_init.c b/arch/ia64/sn/kernel/io_init.c
index 238e2c5..67e8ce9 100644
--- a/arch/ia64/sn/kernel/io_init.c
+++ b/arch/ia64/sn/kernel/io_init.c
@@ -329,6 +329,7 @@ sn_pci_controller_fixup(int segment, int busnum, struct pci_bus *bus)
  		goto error_return; /* error, or bus already scanned */
 
 	bus->sysdata = controller;
+	pci_bus_unlock(bus);
 
 	return;
 
diff --git a/arch/ia64/sn/pci/tioca_provider.c b/arch/ia64/sn/pci/tioca_provider.c
index a70b11f..79f8226 100644
--- a/arch/ia64/sn/pci/tioca_provider.c
+++ b/arch/ia64/sn/pci/tioca_provider.c
@@ -624,7 +624,7 @@ tioca_bus_fixup(struct pcibus_bussoft *prom_bussoft, struct pci_controller *cont
 	    nasid_to_cnodeid(tioca_common->ca_closest_nasid);
 	tioca_common->ca_kernel_private = (u64) tioca_kern;
 
-	bus = pci_find_bus(tioca_common->ca_common.bs_persist_segment,
+	bus = pci_get_bus(tioca_common->ca_common.bs_persist_segment,
 		tioca_common->ca_common.bs_persist_busnum);
 	BUG_ON(!bus);
 	tioca_kern->ca_devices = &bus->devices;
@@ -634,6 +634,7 @@ tioca_bus_fixup(struct pcibus_bussoft *prom_bussoft, struct pci_controller *cont
 	if (tioca_gart_init(tioca_kern) < 0) {
 		kfree(tioca_kern);
 		kfree(tioca_common);
+		pci_bus_put(bus);
 		return NULL;
 	}
 
@@ -654,6 +655,7 @@ tioca_bus_fixup(struct pcibus_bussoft *prom_bussoft, struct pci_controller *cont
 
 	/* Setup locality information */
 	controller->node = tioca_kern->ca_closest_node;
+	pci_bus_put(bus);
 	return tioca_common;
 }
 
diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 1bbe924..5a796c0 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -122,4 +122,4 @@ config PCI_LABEL
 	select NLS
 
 config PCI_BUS_LOCK
-	def_bool y if X86
+	def_bool y if (X86 || IA64)
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH v1 21/22] PCI: cleanups for PCI bus lock implementation
  2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
                   ` (19 preceding siblings ...)
  2012-08-07 16:11 ` [RFC PATCH v1 20/22] PCI/IA64: enable PCI bus lock mechanism for IA64 platforms Jiang Liu
@ 2012-08-07 16:11 ` Jiang Liu
  2012-09-11 23:21   ` Bjorn Helgaas
  2012-08-07 16:11 ` [RFC PATCH v1 22/22] PCI: unexport pci_root_buses Jiang Liu
  2012-08-07 18:11 ` [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Don Dutile
  22 siblings, 1 reply; 51+ messages in thread
From: Jiang Liu @ 2012-08-07 16:11 UTC (permalink / raw)
  To: Bjorn Helgaas, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige
  Cc: Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci, Jiang Liu

Now that all architectures have been converted to the new PCI bus lock
mechanism, clean up the unused code.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/Kconfig     |    3 ---
 drivers/pci/bus.c       |    1 -
 drivers/pci/pci-sysfs.c |    9 ---------
 drivers/pci/probe.c     |    4 +---
 include/linux/pci.h     |   10 ----------
 5 files changed, 1 insertion(+), 26 deletions(-)

diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 5a796c0..848bfb8 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -120,6 +120,3 @@ config PCI_IOAPIC
 config PCI_LABEL
 	def_bool y if (DMI || ACPI)
 	select NLS
-
-config PCI_BUS_LOCK
-	def_bool y if (X86 || IA64)
diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 371f20a..308c376 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -196,7 +196,6 @@ int pci_bus_add_child(struct pci_bus *bus)
 			pci_create_legacy_files(bus);
 			pci_bus_change_state(bus, PCI_BUS_STATE_INITIALIZED,
 					PCI_BUS_STATE_WORKING, false);
-			bus->is_added = 1;
 		}
 	}
 
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 11043b4..a5a4195 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -284,7 +284,6 @@ msi_bus_store(struct device *dev, struct device_attribute *attr,
 }
 
 #ifdef CONFIG_HOTPLUG
-static DEFINE_MUTEX(pci_remove_rescan_mutex);
 static ssize_t bus_rescan_store(struct bus_type *bus, const char *buf,
 				size_t count)
 {
@@ -296,13 +295,11 @@ static ssize_t bus_rescan_store(struct bus_type *bus, const char *buf,
 
 	if (val) {
 		pci_host_bridge_hotplug_lock();
-		mutex_lock(&pci_remove_rescan_mutex);
 		while ((b = pci_find_next_bus(b)) != NULL)
 			if (pci_bus_lock_states(b, PCI_BUS_STATE_WORKING) > 0) {
 				pci_rescan_bus(b);
 				pci_bus_unlock(b);
 			}
-		mutex_unlock(&pci_remove_rescan_mutex);
 		pci_host_bridge_hotplug_unlock();
 	}
 	return count;
@@ -326,13 +323,11 @@ dev_rescan_store(struct device *dev, struct device_attribute *attr,
 	if (val) {
 		struct pci_bus *bus = pdev->bus;
 
-		mutex_lock(&pci_remove_rescan_mutex);
 		if (pci_bus_lock_states(bus, PCI_BUS_STATE_WORKING) > 0) {
 			if (pdev->is_added)
 				pci_rescan_bus(bus);
 			pci_bus_unlock(bus);
 		}
-		mutex_unlock(&pci_remove_rescan_mutex);
 	}
 	return count;
 }
@@ -342,13 +337,11 @@ static void remove_callback(struct device *dev)
 	struct pci_dev *pdev = to_pci_dev(dev);
 	struct pci_bus *bus = pdev->bus;
 
-	mutex_lock(&pci_remove_rescan_mutex);
 	if (pci_bus_lock_states(bus, PCI_BUS_STATE_WORKING) > 0) {
 		pci_bus_get(bus);
 		pci_stop_and_remove_bus_device(pdev);
 		pci_unlock_and_put_bus(bus);
 	}
-	mutex_unlock(&pci_remove_rescan_mutex);
 }
 
 static ssize_t
@@ -382,14 +375,12 @@ dev_bus_rescan_store(struct device *dev, struct device_attribute *attr,
 		return -EINVAL;
 
 	if (val) {
-		mutex_lock(&pci_remove_rescan_mutex);
 		if (!pci_is_root_bus(bus))
 			pci_rescan_bus_bridge_resize(bus->self);
 		else if (pci_bus_lock_states(bus, PCI_BUS_STATE_WORKING) > 0) {
 			pci_rescan_bus(bus);
 			pci_bus_unlock(bus);
 		}
-		mutex_unlock(&pci_remove_rescan_mutex);
 	}
 	return count;
 }
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index da6f04c..09517c3 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1626,11 +1626,9 @@ unsigned int __devinit pci_scan_child_bus(struct pci_bus *bus)
 	if (pci_bus_get_state(bus) < PCI_BUS_STATE_WORKING) {
 		dev_dbg(&bus->dev, "fixups for bus\n");
 		pcibios_fixup_bus(bus);
-		if (pci_is_root_bus(bus)) {
+		if (pci_is_root_bus(bus))
 			pci_bus_change_state(bus, PCI_BUS_STATE_REGISTERED,
 					     PCI_BUS_STATE_WORKING, false);
-			bus->is_added = 1;
-		}
 	}
 
 	for (pass=0; pass < 2; pass++)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 9e52e88..0e50ec8 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -442,7 +442,6 @@ struct pci_bus {
 	struct device		dev;
 	struct bin_attribute	*legacy_io; /* legacy I/O for this bus */
 	struct bin_attribute	*legacy_mem; /* legacy mem */
-	unsigned int		is_added:1;
 	atomic_t		state;
 };
 
@@ -463,21 +462,12 @@ struct pci_bus {
 #define	PCI_BUS_STATE_DESTROYED		0x40	/* invalid state */
 #define	PCI_BUS_STATE_MASK		0x7F
 
-#ifdef	CONFIG_PCI_BUS_LOCK
 #define	PCI_BUS_STATE_LOCK		0x10000	/* for pci core only */
 
 static inline bool pci_bus_is_locked(struct pci_bus *bus)
 {
 	return !!(atomic_read(&bus->state) & PCI_BUS_STATE_LOCK);
 }
-#else /* CONFIG_PCI_BUS_LOCK */
-#define	PCI_BUS_STATE_LOCK		0x0000	/* for pci core only */
-
-static inline bool pci_bus_is_locked(struct pci_bus *bus)
-{
-	return true;
-}
-#endif /* CONFIG_PCI_BUS_LOCK */
 
 static inline int pci_bus_get_state(struct pci_bus *bus)
 {
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 51+ messages in thread
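
For readers following the include/linux/pci.h hunk above: the bus state and the lock bit share one atomic word, which is why pci_bus_is_locked() and pci_bus_get_state() are simple mask operations. Below is a minimal user-space sketch of that idea using C11 atomics. It is not the kernel implementation. Every model_* name is hypothetical, the value chosen for MODEL_STATE_WORKING is an assumption (the patch excerpt does not show it), and the try-lock semantics are an assumption too, since the real pci_bus_lock_states() is not shown here and may wait rather than fail.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Values taken from the patch excerpt. */
#define MODEL_STATE_DESTROYED	0x40	/* PCI_BUS_STATE_DESTROYED */
#define MODEL_STATE_MASK	0x7F	/* PCI_BUS_STATE_MASK */
#define MODEL_STATE_LOCK	0x10000	/* PCI_BUS_STATE_LOCK */
/* Assumed value, for illustration only. */
#define MODEL_STATE_WORKING	0x08

/* Hypothetical stand-in for the "atomic_t state" member of struct pci_bus. */
struct model_bus {
	atomic_int state;
};

static inline bool model_bus_is_locked(struct model_bus *bus)
{
	return !!(atomic_load(&bus->state) & MODEL_STATE_LOCK);
}

static inline int model_bus_get_state(struct model_bus *bus)
{
	return atomic_load(&bus->state) & MODEL_STATE_MASK;
}

/*
 * Try to set the lock bit, but only if the bus is currently in one of
 * the states given in @allowed.
 * Returns 1 on success, 0 if the state is not allowed, -1 if the bus
 * is already locked.
 */
static inline int model_bus_trylock_states(struct model_bus *bus, int allowed)
{
	int old = atomic_load(&bus->state);

	for (;;) {
		if (old & MODEL_STATE_LOCK)
			return -1;
		if (!(old & MODEL_STATE_MASK & allowed))
			return 0;
		/* On failure the CAS reloads "old" and we re-check. */
		if (atomic_compare_exchange_weak(&bus->state, &old,
						 old | MODEL_STATE_LOCK))
			return 1;
	}
}

static inline void model_bus_unlock(struct model_bus *bus)
{
	atomic_fetch_and(&bus->state, ~MODEL_STATE_LOCK);
}
```

The "> 0" check mirrors how the sysfs callers in the patch test the return value of pci_bus_lock_states() before rescanning or removing, and then drop the lock with pci_bus_unlock().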

* [RFC PATCH v1 22/22] PCI: unexport pci_root_buses
  2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
                   ` (20 preceding siblings ...)
  2012-08-07 16:11 ` [RFC PATCH v1 21/22] PCI: cleanups for PCI bus lock implementation Jiang Liu
@ 2012-08-07 16:11 ` Jiang Liu
  2012-08-07 18:11 ` [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Don Dutile
  22 siblings, 0 replies; 51+ messages in thread
From: Jiang Liu @ 2012-08-07 16:11 UTC (permalink / raw)
  To: Bjorn Helgaas, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige
  Cc: Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci, Jiang Liu

No module refers to pci_root_buses any more, so unexport it.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/probe.c |    2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 09517c3..dd48d7f 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -24,11 +24,9 @@ struct resource busn_resource = {
 };
 
 /*
- * Ugh.  Need to stop exporting this to modules.
  * Protected by pci_host_bridge_hotplug_{lock|unlock}().
  */
 LIST_HEAD(pci_root_buses);
-EXPORT_SYMBOL(pci_root_buses);
 
 static LIST_HEAD(pci_domain_busn_res_list);
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations
  2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
                   ` (21 preceding siblings ...)
  2012-08-07 16:11 ` [RFC PATCH v1 22/22] PCI: unexport pci_root_buses Jiang Liu
@ 2012-08-07 18:11 ` Don Dutile
  2012-08-08 15:49   ` Jiang Liu
  22 siblings, 1 reply; 51+ messages in thread
From: Don Dutile @ 2012-08-07 18:11 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Bjorn Helgaas, Yinghai Lu, Greg KH, Kenji Kaneshige, Taku Izumi,
	Rafael J . Wysocki, Yijing Wang, Xinwei Hu, linux-kernel,
	linux-pci

On 08/07/2012 12:10 PM, Jiang Liu wrote:
> From: Jiang Liu<liuj97@gmail.com>
>
> This is the second take to resolve race conditions when hot-plugging PCI
> devices/host bridges. Instead of using a global lock to serialize all hotplug
> operations as in previous version, now we introduce a state machine and bit
> lock mechanism for PCI buses to serialize hotplug operations. For discussions
> related to previous version, please refer to:
> http://comments.gmane.org/gmane.linux.kernel.pci/15007
>
> This patch-set is still in early stages, so sending it out just requesting
> for comments. Any comments are welcomed, especially about whether it's the
> right/suitable way to solve these race condition issues.
>
> patch 1-5:
> 	Preparing for coming PCI bus lock
> patch 6-7:
> 	Core of the new PCI bus lock mechanism.
> patch 8-13:
> 	Enhance PCI core to support PCI bus lock mechanism.
> patch 14-18:
> 	Enhance several PCI hotplug drivers to use PCI bus lock to serialize
> 	hotplug operations.
> patch 19-20:
> 	Enable PCI bus lock mechanism for x86 and IA64, still need to enable
> 	PCI bus lock for other archs.
> patch 21-22:
> 	Cleanups for unused code.
>
> There are multiple methods to trigger PCI hotplug requests/operations
> concurrently, such as:
> 1. Sysfs interfaces exported by the PCI core subsystem
> 	/sys/devices/pcissss:bb/ssss:bb:dd.f/.../remove
> 	/sys/devices/pcissss:bb/ssss:bb:dd.f/.../rescan
> 	/sys/devices/pcissss:bb/ssss:bb:dd.f/.../pci_bus/ssss:bb/rescan
> 	/sys/bus/pci/rescan
> 2. Sysfs interfaces exported by the PCI hotplug subsystem
> 	/sys/bus/pci/slots/xx/power
> 3. PCI hotplug events triggered by PCI Hotplug Controllers
> 4. ACPI hotplug events for PCI host bridges
> 5. Driver binding/unbinding events
> 	binding/unbinding pci drivers with SR-IOV support
>
6. PCI reset
    --> a PCIe device-level reset is done by KVM when it assigns a device
        to a guest.  a PCI config-save before reset, and PCI config-restore after reset
        is done in this case.
    --> VF devices are interesting, since they are reset, then bound to
        pci-stub driver.  when more than 1 VF is enabled in a PF,
        and several device-assignments are done simultaneously, you
        get a storm of reset (save/restore pci cfg space), and pci-stub binding
        (pci cfg read for resource allocation/deallocation), and depending on
        the hw design: an AER caused by the FLR reset -- not supposed to, but
        hw has bugs too! ;-)
    PCI locking is 'challenged' in the above scenario.

   So, I ask: have you tried your patch set doing something like:
     a) modprobe an SRIOV device with > 1 vf enabled
   you may also have to do:
     b) while assigning another SRIOV device's VF to another KVM guest

Unfortunately, the PCI cfg-space locking, esp. on x86 (ok, I'll say it:
damn, mutually exclusive, IO-port-based cfg registers), doesn't lend itself
to this multi-task, dynamic PCI scenario.
Much less complicated on linearly-mapped, PCI-mmconf-only accesses.

- Don

> With current implementation, the PCI core subsystem doesn't support
> concurrent hotplug operations yet. The existing pci_bus_sem lock only
> protects several lists in struct pci_bus, such as children list,
> devices list, but it doesn't protect the pci_bus or pci_dev structure
> themselves.
>
> Let's take pci_remove_bus_device() as an example, which is used by
> PCI hotplug drivers to hot-remove PCI devices.  Currently all of these
> run without any protection, so they are not reentrant.
> pci_remove_bus_device()
>      ->pci_stop_bus_device()
>          ->pci_stop_bus_device()
>              ->pci_stop_bus_devices()
>          ->pci_stop_dev()
>
> Jiang Liu (22):
>    PCI: use pci_get_domain_bus_and_slot() to avoid race conditions
>    PCI: trivial cleanups for drivers/pci/remove.c
>    PCI: change PCI device management code to better follow device model
>    PCI: split PCI bus device registration into two stages
>    PCI: introduce pci_bus_{get|put}() to manage PCI bus reference count
>    PCI: use a global lock to serialize PCI root bridge hotplug
>      operations
>    PCI: introduce PCI bus lock to serialize PCI hotplug operations
>    PCI: introduce hotplug safe search interfaces for PCI bus/device
>    PCI: enhance PCI probe logic to support PCI bus lock mechanism
>    PCI: enhance PCI bus specific logic to support PCI bus lock mechanism
>    PCI: enhance PCI resource assignment logic to support PCI bus lock
>      mechanism
>    PCI: enhance PCI remove logic to support PCI bus lock mechanism
>    PCI: make each PCI device hold a reference to its parent PCI bus
>    PCI/sysfs: use PCI bus lock to avoid race conditions
>    PCI/eeepc: use PCI bus lock to avoid race conditions
>    PCI/asus-wmi: use PCI bus lock to avoid race conditions
>    PCI/pciehp: use PCI bus lock to avoid race conditions
>    PCI/acpiphp: use PCI bus lock to avoid race conditions
>    PCI/x86: enable PCI bus lock mechanism for x86 platforms
>    PCI/IA64: enable PCI bus lock mechanism for IA64 platforms
>    PCI: cleanups for PCI bus lock implementation
>    PCI: unexport pci_root_buses
>
>   arch/ia64/pci/pci.c                  |    2 +
>   arch/ia64/sn/kernel/io_common.c      |    4 +-
>   arch/ia64/sn/kernel/io_init.c        |    1 +
>   arch/ia64/sn/pci/tioca_provider.c    |    4 +-
>   arch/x86/pci/acpi.c                  |    6 +-
>   arch/x86/pci/common.c                |   12 +++
>   drivers/acpi/pci_root.c              |    8 +-
>   drivers/edac/i7core_edac.c           |   16 ++-
>   drivers/gpu/drm/drm_fops.c           |    6 +-
>   drivers/gpu/vga/vgaarb.c             |   15 +--
>   drivers/pci/bus.c                    |  188 +++++++++++++++++++++++++++++-----
>   drivers/pci/host-bridge.c            |   19 ++++
>   drivers/pci/hotplug/acpiphp_glue.c   |   13 ++-
>   drivers/pci/hotplug/cpcihp_generic.c |    8 +-
>   drivers/pci/hotplug/pciehp_pci.c     |   15 +++
>   drivers/pci/hotplug/sgi_hotplug.c    |    2 +
>   drivers/pci/iov.c                    |   11 +-
>   drivers/pci/pci-sysfs.c              |   37 ++++---
>   drivers/pci/probe.c                  |   83 +++++++++++----
>   drivers/pci/remove.c                 |  176 +++++++++++++++++--------------
>   drivers/pci/search.c                 |   53 ++++++++--
>   drivers/pci/setup-bus.c              |   65 +++++++++---
>   drivers/pci/xen-pcifront.c           |   10 +-
>   drivers/platform/x86/asus-wmi.c      |   23 ++++-
>   drivers/platform/x86/eeepc-laptop.c  |   20 ++--
>   include/linux/pci.h                  |   56 +++++++++-
>   26 files changed, 629 insertions(+), 224 deletions(-)
>


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations
  2012-08-07 18:11 ` [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Don Dutile
@ 2012-08-08 15:49   ` Jiang Liu
  0 siblings, 0 replies; 51+ messages in thread
From: Jiang Liu @ 2012-08-08 15:49 UTC (permalink / raw)
  To: Don Dutile
  Cc: Bjorn Helgaas, Yinghai Lu, Greg KH, Kenji Kaneshige, Taku Izumi,
	Rafael J . Wysocki, Yijing Wang, Xinwei Hu, linux-kernel,
	linux-pci

On 08/08/2012 02:11 AM, Don Dutile wrote:
> On 08/07/2012 12:10 PM, Jiang Liu wrote:
>> From: Jiang Liu<liuj97@gmail.com>
>>
>> This is the second take to resolve race conditions when hot-plugging PCI
>> devices/host bridges. Instead of using a global lock to serialize all hotplug
>> operations as in previous version, now we introduce a state machine and bit
>> lock mechanism for PCI buses to serialize hotplug operations. For discussions
>> related to previous version, please refer to:
>> http://comments.gmane.org/gmane.linux.kernel.pci/15007
>>
>> This patch-set is still in early stages, so sending it out just requesting
>> for comments. Any comments are welcomed, especially about whether it's the
>> right/suitable way to solve these race condition issues.
>>
>> patch 1-5:
>>     Preparing for coming PCI bus lock
>> patch 6-7:
>>     Core of the new PCI bus lock mechanism.
>> patch 8-13:
>>     Enhance PCI core to support PCI bus lock mechanism.
>> patch 14-18:
>>     Enhance several PCI hotplug drivers to use PCI bus lock to serialize
>>     hotplug operations.
>> patch 19-20:
>>     Enable PCI bus lock mechanism for x86 and IA64, still need to enable
>>     PCI bus lock for other archs.
>> patch 21-22:
>>     Cleanups for unused code.
>>
>> There are multiple methods to trigger PCI hotplug requests/operations
>> concurrently, such as:
>> 1. Sysfs interfaces exported by the PCI core subsystem
>>     /sys/devices/pcissss:bb/ssss:bb:dd.f/.../remove
>>     /sys/devices/pcissss:bb/ssss:bb:dd.f/.../rescan
>>     /sys/devices/pcissss:bb/ssss:bb:dd.f/.../pci_bus/ssss:bb/rescan
>>     /sys/bus/pci/rescan
>> 2. Sysfs interfaces exported by the PCI hotplug subsystem
>>     /sys/bus/pci/slots/xx/power
>> 3. PCI hotplug events triggered by PCI Hotplug Controllers
>> 4. ACPI hotplug events for PCI host bridges
>> 5. Driver binding/unbinding events
>>     binding/unbinding pci drivers with SR-IOV support
>>
> 6. PCI reset
>    --> a PCIe device-level reset is done by KVM when it assigns a device
>        to a guest.  a PCI config-save before reset, and PCI config-restore after reset
>        is done in this case.
>    --> VF devices are interesting, since they are reset, then bound to
>        pci-stub driver.  when more than 1 VF is enabled in a PF,
>        and several device-assignments are done simultaneously, you
>        get a storm of reset (save/restore pci cfg space), and pci-stub binding
>        (pci cfg read for resource allocation/deallocation), and depending on
>        the hw design: an AER caused by the FLR reset -- not supposed to, but
>        hw has bugs too! ;-)
>    PCI locking is 'challenged' in the above scenario.
> 
>   So, I ask: have you tried your patch set doing something like:
>     a) modprobe an SRIOV device with > 1 vf enabled
>   you may also have to do:
>     b) while assigning another SRIOV device's VF to another KVM guest
> 
> Unfortunately, the PCI cfg-space locking, esp. on x86 (ok, I'll say it:
> damn, mutually exclusive, IO-port-based cfg registers), doesn't lend itself
> to this multi-task, dynamic PCI scenario.
> Much less complicated on linearly-mapped, PCI-mmconf-only accesses.
> 
> - Don
Hi Don,
	Thanks for your comments. Haven't done such tests for SR-IOV yet. We will
try to find some NICs with SR-IOV capability for testing and will send the result
to you once done.
	Regards!
	Gerry


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH v1 01/22] PCI: use pci_get_domain_bus_and_slot() to avoid race conditions
  2012-08-07 16:10 ` [RFC PATCH v1 01/22] PCI: use pci_get_domain_bus_and_slot() to avoid race conditions Jiang Liu
@ 2012-09-11 22:00   ` Bjorn Helgaas
  2012-09-12  8:37     ` Jiang Liu
  0 siblings, 1 reply; 51+ messages in thread
From: Bjorn Helgaas @ 2012-09-11 22:00 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige, Jiang Liu,
	Taku Izumi, Rafael J . Wysocki, Yijing Wang, Xinwei Hu,
	linux-kernel, linux-pci

On Tue, Aug 7, 2012 at 10:10 AM, Jiang Liu <liuj97@gmail.com> wrote:
> There's a typical usage pattern to search a PCI device under a specific
> PCI bus (domain, busno) as below:
> struct pci_bus *pci_bus = pci_find_bus(domain, busno);
> struct pci_dev *pci_dev = pci_get_slot(pci_bus, devfn);
>
> The above code has a race window between pci_find_bus() and pci_get_slot()
> if PCI hotplug operations happen between them which removes the pci_bus.
> So use PCI hotplug safe interface pci_get_domain_bus_and_slot() instead,
> which also reduces code complexity.

This makes sense to me.  If we support hotplug, it's fundamentally
unsafe to keep a struct pci_bus * without having a reference or some
way to make sure it doesn't go away.  I think we should try to
eradicate all uses of pci_find_bus() and pci_find_next_bus().
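
To make the race window concrete, here is a small user-space model of the two lookup styles. It is only a sketch under stated assumptions: all model_* names are hypothetical, the list layout and the single mutex are stand-ins, and it does not reflect how the kernel actually implements pci_get_domain_bus_and_slot(). The point it illustrates is that the combined lookup finds the device and takes its reference inside one critical section, so no unreferenced bus pointer is ever handed back to the caller.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct pci_dev and struct pci_bus. */
struct model_dev {
	int devfn;
	int refcount;
	struct model_dev *next;
};

struct model_bus {
	int domain;
	int busnr;
	struct model_dev *devices;
	struct model_bus *next;
};

/* One lock protecting the bus list, device lists and refcounts. */
static pthread_mutex_t model_lock = PTHREAD_MUTEX_INITIALIZER;
static struct model_bus *model_buses;

/*
 * Look up (domain, busnr, devfn) and take a reference on the device
 * while still holding the lock.  A concurrent removal cannot free the
 * bus between "find the bus" and "find the device", because both steps
 * happen in the same critical section.
 */
struct model_dev *model_get_domain_bus_and_slot(int domain, int busnr,
						int devfn)
{
	struct model_dev *found = NULL;
	struct model_bus *b;
	struct model_dev *d;

	pthread_mutex_lock(&model_lock);
	for (b = model_buses; b && !found; b = b->next) {
		if (b->domain != domain || b->busnr != busnr)
			continue;
		for (d = b->devices; d; d = d->next) {
			if (d->devfn == devfn) {
				d->refcount++;	/* caller owns this reference */
				found = d;
				break;
			}
		}
	}
	pthread_mutex_unlock(&model_lock);

	return found;
}

/* Drop a reference obtained above; a NULL device is ignored. */
void model_dev_put(struct model_dev *dev)
{
	if (!dev)
		return;
	pthread_mutex_lock(&model_lock);
	dev->refcount--;
	pthread_mutex_unlock(&model_lock);
}
```

With the two-step pattern quoted above, the caller would hold a bare bus pointer between the two calls with nothing pinning it, which is exactly the window being closed.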

> Signed-off-by: Jiang Liu <liuj97@gmail.com>
> ---
>  arch/ia64/sn/kernel/io_common.c      |    4 +---
>  drivers/gpu/vga/vgaarb.c             |   15 +++------------
>  drivers/pci/hotplug/cpcihp_generic.c |    8 ++------
>  drivers/pci/iov.c                    |    8 ++------
>  drivers/pci/xen-pcifront.c           |   10 ++--------
>  5 files changed, 10 insertions(+), 35 deletions(-)
>
> diff --git a/arch/ia64/sn/kernel/io_common.c b/arch/ia64/sn/kernel/io_common.c
> index fbb5f2f..8630875 100644
> --- a/arch/ia64/sn/kernel/io_common.c
> +++ b/arch/ia64/sn/kernel/io_common.c
> @@ -229,7 +229,6 @@ void sn_pci_fixup_slot(struct pci_dev *dev, struct pcidev_info *pcidev_info,
>  {
>         int segment = pci_domain_nr(dev->bus);
>         struct pcibus_bussoft *bs;
> -       struct pci_bus *host_pci_bus;
>         struct pci_dev *host_pci_dev;
>         unsigned int bus_no, devfn;
>
> @@ -245,8 +244,7 @@ void sn_pci_fixup_slot(struct pci_dev *dev, struct pcidev_info *pcidev_info,
>
>         bus_no = (pcidev_info->pdi_slot_host_handle >> 32) & 0xff;
>         devfn = pcidev_info->pdi_slot_host_handle & 0xffffffff;
> -       host_pci_bus = pci_find_bus(segment, bus_no);
> -       host_pci_dev = pci_get_slot(host_pci_bus, devfn);
> +       host_pci_dev = pci_get_domain_bus_and_slot(segment, bus_no, devfn);
>
>         pcidev_info->host_pci_dev = host_pci_dev;
>         pcidev_info->pdi_linux_pcidev = dev;
> diff --git a/drivers/gpu/vga/vgaarb.c b/drivers/gpu/vga/vgaarb.c
> index 3df8fc0..b6852b7 100644
> --- a/drivers/gpu/vga/vgaarb.c
> +++ b/drivers/gpu/vga/vgaarb.c
> @@ -1066,7 +1066,6 @@ static ssize_t vga_arb_write(struct file *file, const char __user * buf,
>                 }
>
>         } else if (strncmp(curr_pos, "target ", 7) == 0) {
> -               struct pci_bus *pbus;
>                 unsigned int domain, bus, devfn;
>                 struct vga_device *vgadev;
>
> @@ -1085,19 +1084,11 @@ static ssize_t vga_arb_write(struct file *file, const char __user * buf,
>                         pr_debug("vgaarb: %s ==> %x:%x:%x.%x\n", curr_pos,
>                                 domain, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
>
> -                       pbus = pci_find_bus(domain, bus);
> -                       pr_debug("vgaarb: pbus %p\n", pbus);
> -                       if (pbus == NULL) {
> -                               pr_err("vgaarb: invalid PCI domain and/or bus address %x:%x\n",
> -                                       domain, bus);
> -                               ret_val = -ENODEV;
> -                               goto done;
> -                       }
> -                       pdev = pci_get_slot(pbus, devfn);
> +                       pdev = pci_get_domain_bus_and_slot(domain, bus, devfn);
>                         pr_debug("vgaarb: pdev %p\n", pdev);
>                         if (!pdev) {
> -                               pr_err("vgaarb: invalid PCI address %x:%x\n",
> -                                       bus, devfn);
> +                               pr_err("vgaarb: invalid PCI address %x:%x:%x\n",
> +                                       domain, bus, devfn);
>                                 ret_val = -ENODEV;
>                                 goto done;
>                         }
> diff --git a/drivers/pci/hotplug/cpcihp_generic.c b/drivers/pci/hotplug/cpcihp_generic.c
> index 81af764..a6a71c4 100644
> --- a/drivers/pci/hotplug/cpcihp_generic.c
> +++ b/drivers/pci/hotplug/cpcihp_generic.c
> @@ -154,12 +154,8 @@ static int __init cpcihp_generic_init(void)
>         if(!r)
>                 return -EBUSY;
>
> -       bus = pci_find_bus(0, bridge_busnr);
> -       if (!bus) {
> -               err("Invalid bus number %d", bridge_busnr);
> -               return -EINVAL;
> -       }
> -       dev = pci_get_slot(bus, PCI_DEVFN(bridge_slot, 0));
> +       dev = pci_get_domain_bus_and_slot(0, bridge_busnr,
> +                                         PCI_DEVFN(bridge_slot, 0));
>         if(!dev || dev->hdr_type != PCI_HEADER_TYPE_BRIDGE) {
>                 err("Invalid bridge device %s", bridge);
>                 pci_dev_put(dev);
> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
> index 74bbaf8..c7d2969 100644
> --- a/drivers/pci/iov.c
> +++ b/drivers/pci/iov.c
> @@ -152,15 +152,11 @@ failed1:
>  static void virtfn_remove(struct pci_dev *dev, int id, int reset)
>  {
>         char buf[VIRTFN_ID_LEN];
> -       struct pci_bus *bus;
>         struct pci_dev *virtfn;
>         struct pci_sriov *iov = dev->sriov;
>
> -       bus = pci_find_bus(pci_domain_nr(dev->bus), virtfn_bus(dev, id));
> -       if (!bus)
> -               return;
> -
> -       virtfn = pci_get_slot(bus, virtfn_devfn(dev, id));
> +       virtfn = pci_get_domain_bus_and_slot(pci_domain_nr(dev->bus),
> +                       virtfn_bus(dev, id), virtfn_devfn(dev, id));
>         if (!virtfn)
>                 return;
>
> diff --git a/drivers/pci/xen-pcifront.c b/drivers/pci/xen-pcifront.c
> index d6cc62c..def8d0b 100644
> --- a/drivers/pci/xen-pcifront.c
> +++ b/drivers/pci/xen-pcifront.c
> @@ -982,7 +982,6 @@ static int pcifront_detach_devices(struct pcifront_device *pdev)
>         int err = 0;
>         int i, num_devs;
>         unsigned int domain, bus, slot, func;
> -       struct pci_bus *pci_bus;
>         struct pci_dev *pci_dev;
>         char str[64];
>
> @@ -1032,13 +1031,8 @@ static int pcifront_detach_devices(struct pcifront_device *pdev)
>                         goto out;
>                 }
>
> -               pci_bus = pci_find_bus(domain, bus);
> -               if (!pci_bus) {
> -                       dev_dbg(&pdev->xdev->dev, "Cannot get bus %04x:%02x\n",
> -                               domain, bus);
> -                       continue;
> -               }
> -               pci_dev = pci_get_slot(pci_bus, PCI_DEVFN(slot, func));
> +               pci_dev = pci_get_domain_bus_and_slot(domain, bus,
> +                               PCI_DEVFN(slot, func));
>                 if (!pci_dev) {
>                         dev_dbg(&pdev->xdev->dev,
>                                 "Cannot get PCI device %04x:%02x:%02x.%d\n",
> --
> 1.7.9.5
>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH v1 02/22] PCI: trivial cleanups for drivers/pci/remove.c
  2012-08-07 16:10 ` [RFC PATCH v1 02/22] PCI: trivial cleanups for drivers/pci/remove.c Jiang Liu
@ 2012-09-11 22:03   ` Bjorn Helgaas
  2012-09-12  8:50     ` Jiang Liu
  0 siblings, 1 reply; 51+ messages in thread
From: Bjorn Helgaas @ 2012-09-11 22:03 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige, Jiang Liu,
	Taku Izumi, Rafael J . Wysocki, Yijing Wang, Xinwei Hu,
	linux-kernel, linux-pci

On Tue, Aug 7, 2012 at 10:10 AM, Jiang Liu <liuj97@gmail.com> wrote:
> Trivial cleanups for drivers/pci/remove.c:
> 1) move the comment for pci_stop_and_remove_bus_device() to the right place
> 2) rename __pci_remove_behind_bridge() to pci_remove_behind_bridge()

This seems fine, but I think my pci/bjorn-cleanup-remove branch subsumes it.

> Signed-off-by: Jiang Liu <liuj97@gmail.com>
> ---
>  drivers/pci/remove.c |   33 +++++++++++++++++----------------
>  1 file changed, 17 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
> index 04a4861..33b6318 100644
> --- a/drivers/pci/remove.c
> +++ b/drivers/pci/remove.c
> @@ -78,25 +78,14 @@ void pci_remove_bus(struct pci_bus *pci_bus)
>  }
>  EXPORT_SYMBOL(pci_remove_bus);
>
> -static void __pci_remove_behind_bridge(struct pci_dev *dev);
> -/**
> - * pci_stop_and_remove_bus_device - remove a PCI device and any children
> - * @dev: the device to remove
> - *
> - * Remove a PCI device from the device lists, informing the drivers
> - * that the device has been removed.  We also remove any subordinate
> - * buses and children in a depth-first manner.
> - *
> - * For each device we remove, delete the device structure from the
> - * device lists, remove the /proc entry, and notify userspace
> - * (/sbin/hotplug).
> - */
> +static void pci_remove_behind_bridge(struct pci_dev *dev);
> +
>  void __pci_remove_bus_device(struct pci_dev *dev)
>  {
>         if (dev->subordinate) {
>                 struct pci_bus *b = dev->subordinate;
>
> -               __pci_remove_behind_bridge(dev);
> +               pci_remove_behind_bridge(dev);
>                 pci_remove_bus(b);
>                 dev->subordinate = NULL;
>         }
> @@ -105,13 +94,25 @@ void __pci_remove_bus_device(struct pci_dev *dev)
>  }
>  EXPORT_SYMBOL(__pci_remove_bus_device);
>
> +/**
> + * pci_stop_and_remove_bus_device - remove a PCI device and any children
> + * @dev: the device to remove
> + *
> + * Remove a PCI device from the device lists, informing the drivers
> + * that the device has been removed.  We also remove any subordinate
> + * buses and children in a depth-first manner.
> + *
> + * For each device we remove, delete the device structure from the
> + * device lists, remove the /proc entry, and notify userspace
> + * (/sbin/hotplug).
> + */
>  void pci_stop_and_remove_bus_device(struct pci_dev *dev)
>  {
>         pci_stop_bus_device(dev);
>         __pci_remove_bus_device(dev);
>  }
>
> -static void __pci_remove_behind_bridge(struct pci_dev *dev)
> +static void pci_remove_behind_bridge(struct pci_dev *dev)
>  {
>         struct list_head *l, *n;
>
> @@ -141,7 +142,7 @@ static void pci_stop_behind_bridge(struct pci_dev *dev)
>  void pci_stop_and_remove_behind_bridge(struct pci_dev *dev)
>  {
>         pci_stop_behind_bridge(dev);
> -       __pci_remove_behind_bridge(dev);
> +       pci_remove_behind_bridge(dev);
>  }
>
>  static void pci_stop_bus_devices(struct pci_bus *bus)
> --
> 1.7.9.5
>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH v1 03/22] PCI: change PCI device management code to better follow device model
  2012-08-07 16:10 ` [RFC PATCH v1 03/22] PCI: change PCI device management code to better follow device model Jiang Liu
@ 2012-09-11 22:03   ` Bjorn Helgaas
  0 siblings, 0 replies; 51+ messages in thread
From: Bjorn Helgaas @ 2012-09-11 22:03 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige, Jiang Liu,
	Taku Izumi, Rafael J . Wysocki, Yijing Wang, Xinwei Hu,
	linux-kernel, linux-pci

On Tue, Aug 7, 2012 at 10:10 AM, Jiang Liu <liuj97@gmail.com> wrote:
> According to device model documentation, the way to add/remove device
> object should be symmetric.
>
> /**
>  * device_del - delete device from system.
>  * @dev: device.
>  *
>  * This is the first part of the device unregistration
>  * sequence. This removes the device from the lists we control
>  * from here, has it removed from the other driver model
>  * subsystems it was added to in device_add(), and removes it
>  * from the kobject hierarchy.
>  *
>  * NOTE: this should be called manually _iff_ device_add() was
>  * also called manually.
>  */
>
> The rule here is to either use
> 1) device_register()/device_unregister()
> or
> 2) device_initialize()/device_add()/device_del()/put_device().
>
> So change PCI core to follow the rule and get rid of the redundant
> pci_dev_get()/pci_dev_put() pair.

Seems OK to me.
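
As a rough illustration of the pairing rule quoted above, here is a
userspace model (not kernel code) of the second pattern. The struct and
the model_*() helpers are hypothetical stand-ins for struct device and
device_initialize()/device_add()/device_del()/put_device(). The only point
is that the reference created at initialization is the one dropped by the
final put, so no extra get/put pair is needed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct device; models only what matters here. */
struct model_dev {
	int refcount;	/* models the embedded kobject reference count */
	bool added;	/* true between model_add() and model_del() */
	bool released;	/* set when the release callback would run */
};

/* Like device_initialize(): the object starts with one reference. */
void model_initialize(struct model_dev *d)
{
	d->refcount = 1;
	d->added = false;
	d->released = false;
}

/* Like device_add(): make the object visible; no net reference change. */
void model_add(struct model_dev *d)
{
	d->added = true;
}

/* Like device_del(): hide the object again; the memory stays valid. */
void model_del(struct model_dev *d)
{
	d->added = false;
}

/* Like put_device(): drop one reference, release on the last one. */
void model_put(struct model_dev *d)
{
	if (--d->refcount == 0)
		d->released = true;
}
```

In this model, initialize/add/del/put leaves the count at zero exactly
once, which is the sequence the patch moves pci_device_add(),
pci_stop_dev() and pci_destroy_dev() to.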

> Signed-off-by: Jiang Liu <liuj97@gmail.com>
> ---
>  drivers/pci/probe.c  |    1 -
>  drivers/pci/remove.c |    4 ++--
>  2 files changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 0840409..dacca26 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1294,7 +1294,6 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
>  {
>         device_initialize(&dev->dev);
>         dev->dev.release = pci_release_dev;
> -       pci_dev_get(dev);
>
>         dev->dev.dma_mask = &dev->dma_mask;
>         dev->dev.dma_parms = &dev->dma_parms;
> diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
> index 33b6318..b9ac765 100644
> --- a/drivers/pci/remove.c
> +++ b/drivers/pci/remove.c
> @@ -22,7 +22,7 @@ static void pci_stop_dev(struct pci_dev *dev)
>         if (dev->is_added) {
>                 pci_proc_detach_device(dev);
>                 pci_remove_sysfs_dev_files(dev);
> -               device_unregister(&dev->dev);
> +               device_del(&dev->dev);
>                 dev->is_added = 0;
>         }
>
> @@ -40,7 +40,7 @@ static void pci_destroy_dev(struct pci_dev *dev)
>         up_write(&pci_bus_sem);
>
>         pci_free_resources(dev);
> -       pci_dev_put(dev);
> +       put_device(&dev->dev);
>  }
>
>  /**
> --
> 1.7.9.5
>

* Re: [RFC PATCH v1 06/22] PCI: use a global lock to serialize PCI root bridge hotplug operations
  2012-08-07 16:10 ` [RFC PATCH v1 06/22] PCI: use a global lock to serialize PCI root bridge hotplug operations Jiang Liu
@ 2012-09-11 22:57   ` Bjorn Helgaas
  2012-09-12 15:42     ` Jiang Liu
  0 siblings, 1 reply; 51+ messages in thread
From: Bjorn Helgaas @ 2012-09-11 22:57 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige, Jiang Liu,
	Taku Izumi, Rafael J . Wysocki, Yijing Wang, Xinwei Hu,
	linux-kernel, linux-pci

On Tue, Aug 7, 2012 at 10:10 AM, Jiang Liu <liuj97@gmail.com> wrote:
> Currently there's no mechanism to protect the global pci_root_buses list
> from dynamic change at runtime. That means, PCI root bridge hotplug
> operations, which dynamically change the pci_root_buses list, may cause
> invalid memory accesses.
>
> So introduce a global lock to serialize accesses to the pci_root_buses
> list and serialize PCI host bridge hotplug operations.
>
> Be careful, never try to acquire this global lock from PCI device drivers,
> that may cause deadlocks.
>
> Signed-off-by: Jiang Liu <liuj97@gmail.com>
> ---
>  drivers/acpi/pci_root.c           |    8 +++++++-
>  drivers/edac/i7core_edac.c        |   16 +++++++---------
>  drivers/gpu/drm/drm_fops.c        |    6 +++++-
>  drivers/pci/host-bridge.c         |   19 +++++++++++++++++++
>  drivers/pci/hotplug/sgi_hotplug.c |    2 ++
>  drivers/pci/pci-sysfs.c           |    2 ++
>  drivers/pci/probe.c               |    5 ++++-
>  drivers/pci/search.c              |    9 ++++++++-
>  include/linux/pci.h               |    8 ++++++++
>  9 files changed, 62 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> index 7aff631..6bd0e32 100644
> --- a/drivers/acpi/pci_root.c
> +++ b/drivers/acpi/pci_root.c
> @@ -463,6 +463,8 @@ static int __devinit acpi_pci_root_add(struct acpi_device *device)
>         if (!root)
>                 return -ENOMEM;
>
> +       pci_host_bridge_hotplug_lock();

Here's where I get lost.  This is an ACPI driver's .add() routine,
which is analogous to a PCI driver's .probe() routine.  PCI driver
.probe() routines don't need to be concerned with PCI device hotplug.
All the hotplug-related locking is handled by the PCI core, not by
individual drivers.  So why do we need it here?

I'm not suggesting that the existing locking is correct.  I'm just not
convinced this is the right way to fix it.

The commit log says we need protection for the global pci_root_buses
list.  But even with this whole series, we still traverse the list
without protection in places like pcibios_resource_survey() and
pci_assign_unassigned_resources().

Maybe we can make progress on this by identifying specific failures
that can happen in a couple of these paths, e.g., acpi_pci_root_add()
and i7core_xeon_pci_fixup().  If we look at those paths, we might find a
way to fix this in a more general fashion than throwing in lock/unlock
pairs.

It might also help to know what the rule is for when we need to use
pci_host_bridge_hotplug_lock() and pci_host_bridge_hotplug_unlock().
Apparently it is not as simple as protecting every reference to the
pci_root_buses list.

> diff --git a/drivers/gpu/drm/drm_fops.c b/drivers/gpu/drm/drm_fops.c
> index 123de28..f559b5b 100644
> --- a/drivers/gpu/drm/drm_fops.c
> +++ b/drivers/gpu/drm/drm_fops.c
> @@ -344,9 +344,13 @@ static int drm_open_helper(struct inode *inode, struct file *filp,
>                         pci_dev_put(pci_dev);
>                 }
>                 if (!dev->hose) {
> -                       struct pci_bus *b = pci_bus_b(pci_root_buses.next);
> +                       struct pci_bus *b;
> +
> +                       pci_host_bridge_hotplug_lock();
> +                       b = pci_find_next_bus(NULL);

Here's another case I don't understand.  We know already that
pci_find_next_bus() is unsafe with respect to hotplug because it
doesn't hold a reference on the struct pci_bus it returns.  Can't we
replace this with some variety of pci_get_next_bus() that *does*
acquire a reference?
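
To make the pci_get_next_bus() idea concrete, here is a minimal userspace
sketch of a reference-holding iterator. Everything in it is an assumption
for illustration: struct model_bus, root_buses, list_lock and the
model_*() helpers are hypothetical, and no such interface is shown in this
series. The sketch takes a reference on the bus it returns and drops the
one held on @from, so the caller never holds an unreferenced pointer.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct pci_bus; not kernel code. */
struct model_bus {
	struct model_bus *next;	/* next entry on the root bus list */
	int refcount;		/* protected by list_lock in this model */
	int busnr;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
struct model_bus *root_buses;	/* head of the model root bus list */

/* Drop a reference; needed when a caller stops iterating early. */
void model_bus_put(struct model_bus *b)
{
	if (!b)
		return;
	pthread_mutex_lock(&list_lock);
	b->refcount--;
	pthread_mutex_unlock(&list_lock);
}

/*
 * Return the bus after @from (the first bus if @from is NULL) with a
 * reference held, and drop the reference the caller held on @from.
 */
struct model_bus *model_get_next_bus(struct model_bus *from)
{
	struct model_bus *b;

	pthread_mutex_lock(&list_lock);
	b = from ? from->next : root_buses;
	if (b)
		b->refcount++;
	if (from)
		from->refcount--;
	pthread_mutex_unlock(&list_lock);
	return b;
}
```

The model does not free a bus when its count reaches zero and does not
handle @from being unlinked in the middle of an iteration; a real
implementation would have to deal with both.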

Actually, I looked at the callers of pci_find_next_bus(), and most of
them are unsafe in an even deeper way: they're doing device setup in
initcalls, so that setup won't be done for hot-added devices.  For
example, I can pick on sba_init() because I think I wrote it back in
the dark ages.  sba_init() is a subsys_initcall that calls
sba_connect_bus() for every bus we know about at boot-time, and it
sets the host bridge's iommu pointer.  If we were to hot-add a host
bridge, we would never set the iommu pointer.

I'm not sure why you didn't add a pci_host_bridge_hotplug_lock() in
the sba_init() path, since it looks similar to the drm_open_helper()
path above.  But in any case, I think that would be the wrong thing to
do because it would fix the superficial problem while leaving the
deeper problem of host bridge hot-add not setting the iommu pointer.

>                         if (b)
>                                 dev->hose = b->sysdata;
> +                       pci_host_bridge_hotplug_unlock();
>                 }
>         }
>  #endif
...
> diff --git a/drivers/pci/search.c b/drivers/pci/search.c
> index 993d4a0..f1147a7 100644
> --- a/drivers/pci/search.c
> +++ b/drivers/pci/search.c
> @@ -100,6 +100,13 @@ struct pci_bus * pci_find_bus(int domain, int busnr)
>   * initiated by passing %NULL as the @from argument.  Otherwise if
>   * @from is not %NULL, searches continue from next device on the
>   * global list.
> + *
> + * Please don't call this function at runtime if possible.
> + * It's designed to be called at boot time only because it's unsafe
> + * to PCI root bridge hotplug operations. But some drivers do invoke
> + * it at runtime and it's hard to fix those drivers. In such cases,
> + * use pci_host_bridge_hotplug_{lock|unlock}() to protect the PCI root
> + * bus list, but you need to be really careful to avoid deadlock.

I'm not convinced that it's too hard to fix these drivers :)  There
are only six callers, and the only ones that could possibly be at
runtime are drm_open_helper(), sn_pci_hotplug_init(), and
bus_rescan_store().

Bjorn

* Re: [RFC PATCH v1 15/22] PCI/eeepc: use PCI bus lock to avoid race conditions
  2012-08-07 16:10 ` [RFC PATCH v1 15/22] PCI/eeepc: " Jiang Liu
@ 2012-09-11 23:18   ` Bjorn Helgaas
  2012-09-12 14:24     ` [PATCH] eeepc-laptop: fix device reference count leakage in eeepc_rfkill_hotplug() Jiang Liu
  0 siblings, 1 reply; 51+ messages in thread
From: Bjorn Helgaas @ 2012-09-11 23:18 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige, Jiang Liu,
	Taku Izumi, Rafael J . Wysocki, Yijing Wang, Xinwei Hu,
	linux-kernel, linux-pci

On Tue, Aug 7, 2012 at 10:10 AM, Jiang Liu <liuj97@gmail.com> wrote:
> This patch uses PCI bus lock mechanism to avoid race conditions when doing
> PCI device hotplug through eeepc driver.

> It also fixes a PCI device reference
> count leakage issue because acpi_get_pci_dev() holds a reference to the
> device returned.

Can you split this refcount fix out as a separate patch?  That looks
pretty straightforward, unlike the bus lock stuff.

> Signed-off-by: Jiang Liu <liuj97@gmail.com>
> ---
>  drivers/platform/x86/eeepc-laptop.c |   20 ++++++++++++++------
>  1 file changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/platform/x86/eeepc-laptop.c b/drivers/platform/x86/eeepc-laptop.c
> index dab91b4..25c4176 100644
> --- a/drivers/platform/x86/eeepc-laptop.c
> +++ b/drivers/platform/x86/eeepc-laptop.c
> @@ -606,16 +606,16 @@ static void eeepc_rfkill_hotplug(struct eeepc_laptop *eeepc, acpi_handle handle)
>                         goto out_unlock;
>                 }
>
> -               bus = port->subordinate;
> +               bus = pci_lock_subordinate(port, PCI_BUS_STATE_WORKING);
>
>                 if (!bus) {
>                         pr_warn("Unable to find PCI bus 1?\n");
> -                       goto out_unlock;
> +                       goto out_put_dev;
>                 }
>
>                 if (pci_bus_read_config_dword(bus, 0, PCI_VENDOR_ID, &l)) {
>                         pr_err("Unable to read PCI config space?\n");
> -                       goto out_unlock;
> +                       goto out_unlock_bus;
>                 }
>
>                 absent = (l == 0xffffffff);
> @@ -627,7 +627,7 @@ static void eeepc_rfkill_hotplug(struct eeepc_laptop *eeepc, acpi_handle handle)
>                                 absent ? "absent" : "present");
>                         pr_warn("skipped wireless hotplug as probably "
>                                 "inappropriate for this model\n");
> -                       goto out_unlock;
> +                       goto out_unlock_bus;
>                 }
>
>                 if (!blocked) {
> @@ -635,7 +635,7 @@ static void eeepc_rfkill_hotplug(struct eeepc_laptop *eeepc, acpi_handle handle)
>                         if (dev) {
>                                 /* Device already present */
>                                 pci_dev_put(dev);
> -                               goto out_unlock;
> +                               goto out_unlock_bus;
>                         }
>                         dev = pci_scan_single_device(bus, 0);
>                         if (dev) {
> @@ -650,6 +650,11 @@ static void eeepc_rfkill_hotplug(struct eeepc_laptop *eeepc, acpi_handle handle)
>                                 pci_dev_put(dev);
>                         }
>                 }
> +
> +out_unlock_bus:
> +               pci_bus_unlock(bus);
> +out_put_dev:
> +               pci_dev_put(port);
>         }
>
>  out_unlock:
> @@ -757,7 +762,7 @@ static struct hotplug_slot_ops eeepc_hotplug_slot_ops = {
>  static int eeepc_setup_pci_hotplug(struct eeepc_laptop *eeepc)
>  {
>         int ret = -ENOMEM;
> -       struct pci_bus *bus = pci_find_bus(0, 1);
> +       struct pci_bus *bus = pci_get_bus(0, 1);
>
>         if (!bus) {
>                 pr_err("Unable to find wifi PCI bus\n");
> @@ -785,6 +790,8 @@ static int eeepc_setup_pci_hotplug(struct eeepc_laptop *eeepc)
>                 goto error_register;
>         }
>
> +       pci_bus_put(bus);
> +
>         return 0;
>
>  error_register:
> @@ -793,6 +800,7 @@ error_info:
>         kfree(eeepc->hotplug_slot);
>         eeepc->hotplug_slot = NULL;
>  error_slot:
> +       pci_bus_put(bus);
>         return ret;
>  }
>
> --
> 1.7.9.5
>

* Re: [RFC PATCH v1 21/22] PCI: cleanups for PCI bus lock implementation
  2012-08-07 16:11 ` [RFC PATCH v1 21/22] PCI: cleanups for PCI bus lock implementation Jiang Liu
@ 2012-09-11 23:21   ` Bjorn Helgaas
  2012-09-12  8:58     ` Jiang Liu
  0 siblings, 1 reply; 51+ messages in thread
From: Bjorn Helgaas @ 2012-09-11 23:21 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige, Jiang Liu,
	Taku Izumi, Rafael J . Wysocki, Yijing Wang, Xinwei Hu,
	linux-kernel, linux-pci

On Tue, Aug 7, 2012 at 10:11 AM, Jiang Liu <liuj97@gmail.com> wrote:
> Now all Archs have been converted to the new PCI bus lock mechanism,
> so clean up unused code.

When you say "all arches," do you really mean "x86 and ia64"?  I
assume all the other architectures have similar issues.  Or is this
somehow ACPI-specific?

> Signed-off-by: Jiang Liu <liuj97@gmail.com>
> ---
>  drivers/pci/Kconfig     |    3 ---
>  drivers/pci/bus.c       |    1 -
>  drivers/pci/pci-sysfs.c |    9 ---------
>  drivers/pci/probe.c     |    4 +---
>  include/linux/pci.h     |   10 ----------
>  5 files changed, 1 insertion(+), 26 deletions(-)
>
> diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
> index 5a796c0..848bfb8 100644
> --- a/drivers/pci/Kconfig
> +++ b/drivers/pci/Kconfig
> @@ -120,6 +120,3 @@ config PCI_IOAPIC
>  config PCI_LABEL
>         def_bool y if (DMI || ACPI)
>         select NLS
> -
> -config PCI_BUS_LOCK
> -       def_bool y if (X86 || IA64)
> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> index 371f20a..308c376 100644
> --- a/drivers/pci/bus.c
> +++ b/drivers/pci/bus.c
> @@ -196,7 +196,6 @@ int pci_bus_add_child(struct pci_bus *bus)
>                         pci_create_legacy_files(bus);
>                         pci_bus_change_state(bus, PCI_BUS_STATE_INITIALIZED,
>                                         PCI_BUS_STATE_WORKING, false);
> -                       bus->is_added = 1;
>                 }
>         }
>
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 11043b4..a5a4195 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -284,7 +284,6 @@ msi_bus_store(struct device *dev, struct device_attribute *attr,
>  }
>
>  #ifdef CONFIG_HOTPLUG
> -static DEFINE_MUTEX(pci_remove_rescan_mutex);
>  static ssize_t bus_rescan_store(struct bus_type *bus, const char *buf,
>                                 size_t count)
>  {
> @@ -296,13 +295,11 @@ static ssize_t bus_rescan_store(struct bus_type *bus, const char *buf,
>
>         if (val) {
>                 pci_host_bridge_hotplug_lock();
> -               mutex_lock(&pci_remove_rescan_mutex);
>                 while ((b = pci_find_next_bus(b)) != NULL)
>                         if (pci_bus_lock_states(b, PCI_BUS_STATE_WORKING) > 0) {
>                                 pci_rescan_bus(b);
>                                 pci_bus_unlock(b);
>                         }
> -               mutex_unlock(&pci_remove_rescan_mutex);
>                 pci_host_bridge_hotplug_unlock();
>         }
>         return count;
> @@ -326,13 +323,11 @@ dev_rescan_store(struct device *dev, struct device_attribute *attr,
>         if (val) {
>                 struct pci_bus *bus = pdev->bus;
>
> -               mutex_lock(&pci_remove_rescan_mutex);
>                 if (pci_bus_lock_states(bus, PCI_BUS_STATE_WORKING) > 0) {
>                         if (pdev->is_added)
>                                 pci_rescan_bus(bus);
>                         pci_bus_unlock(bus);
>                 }
> -               mutex_unlock(&pci_remove_rescan_mutex);
>         }
>         return count;
>  }
> @@ -342,13 +337,11 @@ static void remove_callback(struct device *dev)
>         struct pci_dev *pdev = to_pci_dev(dev);
>         struct pci_bus *bus = pdev->bus;
>
> -       mutex_lock(&pci_remove_rescan_mutex);
>         if (pci_bus_lock_states(bus, PCI_BUS_STATE_WORKING) > 0) {
>                 pci_bus_get(bus);
>                 pci_stop_and_remove_bus_device(pdev);
>                 pci_unlock_and_put_bus(bus);
>         }
> -       mutex_unlock(&pci_remove_rescan_mutex);
>  }
>
>  static ssize_t
> @@ -382,14 +375,12 @@ dev_bus_rescan_store(struct device *dev, struct device_attribute *attr,
>                 return -EINVAL;
>
>         if (val) {
> -               mutex_lock(&pci_remove_rescan_mutex);
>                 if (!pci_is_root_bus(bus))
>                         pci_rescan_bus_bridge_resize(bus->self);
>                 else if (pci_bus_lock_states(bus, PCI_BUS_STATE_WORKING) > 0) {
>                         pci_rescan_bus(bus);
>                         pci_bus_unlock(bus);
>                 }
> -               mutex_unlock(&pci_remove_rescan_mutex);
>         }
>         return count;
>  }
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index da6f04c..09517c3 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1626,11 +1626,9 @@ unsigned int __devinit pci_scan_child_bus(struct pci_bus *bus)
>         if (pci_bus_get_state(bus) < PCI_BUS_STATE_WORKING) {
>                 dev_dbg(&bus->dev, "fixups for bus\n");
>                 pcibios_fixup_bus(bus);
> -               if (pci_is_root_bus(bus)) {
> +               if (pci_is_root_bus(bus))
>                         pci_bus_change_state(bus, PCI_BUS_STATE_REGISTERED,
>                                              PCI_BUS_STATE_WORKING, false);
> -                       bus->is_added = 1;
> -               }
>         }
>
>         for (pass=0; pass < 2; pass++)
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 9e52e88..0e50ec8 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -442,7 +442,6 @@ struct pci_bus {
>         struct device           dev;
>         struct bin_attribute    *legacy_io; /* legacy I/O for this bus */
>         struct bin_attribute    *legacy_mem; /* legacy mem */
> -       unsigned int            is_added:1;
>         atomic_t                state;
>  };
>
> @@ -463,21 +462,12 @@ struct pci_bus {
>  #define        PCI_BUS_STATE_DESTROYED         0x40    /* invalid state */
>  #define        PCI_BUS_STATE_MASK              0x7F
>
> -#ifdef CONFIG_PCI_BUS_LOCK
>  #define        PCI_BUS_STATE_LOCK              0x10000 /* for pci core only */
>
>  static inline bool pci_bus_is_locked(struct pci_bus *bus)
>  {
>         return !!(atomic_read(&bus->state) & PCI_BUS_STATE_LOCK);
>  }
> -#else /* CONFIG_PCI_BUS_LOCK */
> -#define        PCI_BUS_STATE_LOCK              0x0000  /* for pci core only */
> -
> -static inline bool pci_bus_is_locked(struct pci_bus *bus)
> -{
> -       return true;
> -}
> -#endif /* CONFIG_PCI_BUS_LOCK */
>
>  static inline int pci_bus_get_state(struct pci_bus *bus)
>  {
> --
> 1.7.9.5
>

* Re: [RFC PATCH v1 19/22] PCI/x86: enable PCI bus lock mechanism for x86 platforms
  2012-08-07 16:10 ` [RFC PATCH v1 19/22] PCI/x86: enable PCI bus lock mechanism for x86 platforms Jiang Liu
@ 2012-09-11 23:22   ` Bjorn Helgaas
  2012-09-12  9:56     ` Jiang Liu
  0 siblings, 1 reply; 51+ messages in thread
From: Bjorn Helgaas @ 2012-09-11 23:22 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige, Jiang Liu,
	Taku Izumi, Rafael J . Wysocki, Yijing Wang, Xinwei Hu,
	linux-kernel, linux-pci

On Tue, Aug 7, 2012 at 10:10 AM, Jiang Liu <liuj97@gmail.com> wrote:
> This patch turns on PCI bus lock mechanism for x86 platforms. It also
> enhances x86 specific PCI implementation to support PCI bus lock.
>
> Signed-off-by: Jiang Liu <liuj97@gmail.com>
> ---
>  arch/x86/pci/acpi.c   |    6 +++++-
>  arch/x86/pci/common.c |   12 ++++++++++++
>  drivers/pci/Kconfig   |    3 +--
>  3 files changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
> index 2bb885a..c68dbdf 100644
> --- a/arch/x86/pci/acpi.c
> +++ b/arch/x86/pci/acpi.c
> @@ -414,7 +414,8 @@ struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_pci_root *root)
>          * Maybe the desired pci bus has been already scanned. In such case
>          * it is unnecessary to scan the pci bus with the given domain,busnum.
>          */
> -       bus = pci_find_bus(domain, busnum);
> +       bus = __pci_get_and_lock_bus(domain, busnum,
> +                                    PCI_BUS_STATE_STOPPING - 1);
>         if (bus) {
>                 /*
>                  * If the desired bus exits, the content of bus->sysdata will
> @@ -449,6 +450,7 @@ struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_pci_root *root)
>                         pci_free_resource_list(&resources);
>                         __release_pci_root_info(info);
>                 }
> +               pci_bus_get(bus);
>         }
>
>         /* After the PCI-E bus has been walked and all devices discovered,
> @@ -475,6 +477,8 @@ struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_pci_root *root)
>  #endif
>         }
>
> +       pci_unlock_and_put_bus(bus);
> +
>         return bus;
>  }
>
> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
> index 0ad990a..8b7ae63 100644
> --- a/arch/x86/pci/common.c
> +++ b/arch/x86/pci/common.c
> @@ -667,6 +667,18 @@ struct pci_bus * __devinit pci_scan_bus_with_sysdata(int busno)
>         return pci_scan_bus_on_node(busno, &pci_root_ops, -1);
>  }
>
> +static DEFINE_MUTEX(pci_root_bus_mutex);
> +
> +void arch_pci_lock_host_bridge_hotplug(void)
> +{
> +       mutex_lock(&pci_root_bus_mutex);
> +}
> +
> +void arch_pci_unlock_host_bridge_hotplug(void)
> +{
> +       mutex_unlock(&pci_root_bus_mutex);
> +}

Are these left over from previous work?  I don't see any reference to
them elsewhere in your patch series.

>  /*
>   * NUMA info for PCI busses
>   *
> diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
> index a6df8b1..1bbe924 100644
> --- a/drivers/pci/Kconfig
> +++ b/drivers/pci/Kconfig
> @@ -122,5 +122,4 @@ config PCI_LABEL
>         select NLS
>
>  config PCI_BUS_LOCK
> -       bool
> -       default n
> +       def_bool y if X86
> --
> 1.7.9.5
>

* Re: [RFC PATCH v1 07/22] PCI: introduce PCI bus lock to serialize PCI hotplug operations
  2012-08-07 16:10 ` [RFC PATCH v1 07/22] PCI: introduce PCI bus lock to serialize PCI " Jiang Liu
@ 2012-09-11 23:24   ` Bjorn Helgaas
  0 siblings, 0 replies; 51+ messages in thread
From: Bjorn Helgaas @ 2012-09-11 23:24 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige, Jiang Liu,
	Taku Izumi, Rafael J . Wysocki, Yijing Wang, Xinwei Hu,
	linux-kernel, linux-pci

On Tue, Aug 7, 2012 at 10:10 AM, Jiang Liu <liuj97@gmail.com> wrote:
> There are multiple ways to trigger concurrent PCI hotplug operations for
> a specific PCI bus, but there is no way yet to serialize those
> operations, which breaks the PCI hotplug logic. This patch introduces
> a bus lock mechanism and state machine for PCI buses to serialize PCI
> hotplug operations.
>
> The state machine for PCI buses is:
>           __________________________     ______________
>           |                        v     |            v
> INITIALIZED->REGISTERED->WORKING->STOPPING->STOPPED->REMOVED->DESTROYED
>                      |_________________________^
>
> PCI buses form a hierarchy, so the following locking rules must be obeyed:
> 1) The PCI bus must be locked when changing any child devices of it.
> 2) The PCI bus must be locked when changing its state
> 3) The global PCI host bridge hotplug lock must be held when hotplugging
>    PCI root buses
>
> The lock interfaces, coordinated with the state machine, will be used to
> avoid race conditions when hotplugging PCI devices/host bridges.
> A typical usage is (lock bus if it's in WORKING state, and then do hotplug):
> if (pci_bus_lock_states(bus, PCI_BUS_STATE_WORKING) > 0) {
>         do_pci_hotplug();
>         pci_bus_unlock(bus);
> }
>
> The PCI_BUS_LOCK config option is a temporary solution to avoid breaking
> bisect, it will be removed when all Archs have been converted to the new
> PCI bus lock mechanism.

I'm going to wait until I understand the global lock issues better
before I even look at these PCI bus lock patches.
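
For readers following the quoted commit log, the core of the scheme is a
lock bit kept in the same atomic word as a one-hot state value. The sketch
below is a simplified userspace model using C11 atomics, not the patch
itself: the helper names are hypothetical, and it is a non-blocking
trylock, whereas the quoted pci_bus_lock_states() sleeps on a wait queue
until the bus reaches one of the requested states.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* One-hot state values and lock bit, mirroring the quoted definitions. */
#define STATE_WORKING	0x4
#define STATE_STOPPING	0x8
#define STATE_MASK	0x7F
#define STATE_LOCK	0x10000

/* Set the lock bit if the bus is in one of @states and is unlocked. */
bool bus_trylock_states(atomic_int *state, int states)
{
	int t = atomic_load(state);

	do {
		if (!(t & states) || (t & STATE_LOCK))
			return false;
		/* On failure the CAS reloads t, and the checks run again. */
	} while (!atomic_compare_exchange_weak(state, &t, t | STATE_LOCK));

	return true;
}

/* Clear the lock bit; the caller must hold the lock. */
void bus_unlock(atomic_int *state)
{
	atomic_fetch_and(state, ~STATE_LOCK);
}

/* Move a locked bus from @old to @new_state, keeping the lock bit set. */
bool bus_change_state(atomic_int *state, int old, int new_state)
{
	int expected = old | STATE_LOCK;

	return atomic_compare_exchange_strong(state, &expected,
					      new_state | STATE_LOCK);
}
```

Because the states are one-hot, a caller can pass a mask such as
STATE_WORKING | STATE_STOPPING to accept either state, which is how the
quoted "states" argument appears to be used.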

> Signed-off-by: Jiang Liu <liuj97@gmail.com>
> ---
>  drivers/pci/Kconfig |    4 +++
>  drivers/pci/bus.c   |   86 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/pci.h |   44 ++++++++++++++++++++++++++
>  3 files changed, 134 insertions(+)
>
> diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
> index 848bfb8..a6df8b1 100644
> --- a/drivers/pci/Kconfig
> +++ b/drivers/pci/Kconfig
> @@ -120,3 +120,7 @@ config PCI_IOAPIC
>  config PCI_LABEL
>         def_bool y if (DMI || ACPI)
>         select NLS
> +
> +config PCI_BUS_LOCK
> +       bool
> +       default n
> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> index 0e18270..aa25fcf 100644
> --- a/drivers/pci/bus.c
> +++ b/drivers/pci/bus.c
> @@ -15,9 +15,12 @@
>  #include <linux/proc_fs.h>
>  #include <linux/init.h>
>  #include <linux/slab.h>
> +#include <linux/sched.h>
>
>  #include "pci.h"
>
> +static DECLARE_WAIT_QUEUE_HEAD(pci_bus_state_wait_queue);
> +
>  void pci_add_resource_offset(struct list_head *resources, struct resource *res,
>                              resource_size_t offset)
>  {
> @@ -340,6 +343,89 @@ void pci_bus_put(struct pci_bus *bus)
>  }
>  EXPORT_SYMBOL(pci_bus_put);
>
> +static bool pci_bus_wait_for_states(struct pci_bus *bus, int states)
> +{
> +       int t = atomic_read(&bus->state);
> +
> +       /* Bus state is bigger than any of the requested states. */
> +       if ((t & PCI_BUS_STATE_MASK) > states)
> +               return true;
> +
> +       /* Bus is in one of the requested states and unlocked. */
> +       if ((t & states) && !(t & PCI_BUS_STATE_LOCK))
> +               return true;
> +
> +       return false;
> +}
> +
> +/*
> + * Wait for the bus to reach one of the requested states and then lock it.
> + * Return current bus state if succeed to lock the bus, and return -EINVAL
> + * if current bus state is already bigger than any of the requested states.
> + */
> +int pci_bus_lock_states(struct pci_bus *bus, int states)
> +{
> +       int t;
> +
> +       BUG_ON(states & ~PCI_BUS_STATE_MASK);
> +       do {
> +               do {
> +                       wait_event(pci_bus_state_wait_queue,
> +                                  pci_bus_wait_for_states(bus, states));
> +                       t = atomic_read(&bus->state);
> +                       if ((t & PCI_BUS_STATE_MASK) > states)
> +                               return -EINVAL;
> +               } while (!(t & states));
> +
> +               t &= ~PCI_BUS_STATE_LOCK;
> +       } while (atomic_cmpxchg(&bus->state, t , t | PCI_BUS_STATE_LOCK) != t);
> +
> +       return t & PCI_BUS_STATE_MASK;
> +}
> +EXPORT_SYMBOL(pci_bus_lock_states);
> +
> +/* Unlock the bus and wake up waiters, must be called with the bus locked. */
> +void pci_bus_unlock(struct pci_bus *bus)
> +{
> +       int t;
> +
> +       BUG_ON(!pci_bus_is_locked(bus));
> +       do {
> +               t = atomic_read(&bus->state);
> +       } while (atomic_cmpxchg(&bus->state,
> +                               t, t & ~PCI_BUS_STATE_LOCK) != t);
> +
> +       if (waitqueue_active(&pci_bus_state_wait_queue))
> +               wake_up_all(&pci_bus_state_wait_queue);
> +}
> +EXPORT_SYMBOL(pci_bus_unlock);
> +
> +/*
> + * Change the bus from old state to new state. It must be called with the bus
> + * locked, and the new state must be bigger than the old state.
> + */
> +void pci_bus_change_state(struct pci_bus *bus, int old, int new, bool unlock)
> +{
> +       int t;
> +
> +       BUG_ON(!pci_bus_is_locked(bus));
> +       BUG_ON(new < old || pci_bus_get_state(bus) != old ||
> +              (new & ~PCI_BUS_STATE_MASK));
> +
> +       old |= PCI_BUS_STATE_LOCK;
> +       if (!unlock)
> +               new |= PCI_BUS_STATE_LOCK;
> +
> +       do {
> +               t = atomic_read(&bus->state);
> +               t &= ~(PCI_BUS_STATE_MASK | PCI_BUS_STATE_LOCK);
> +       } while (atomic_cmpxchg(&bus->state, t | old, t | new) != (t | old));
> +
> +       if (waitqueue_active(&pci_bus_state_wait_queue))
> +               wake_up_all(&pci_bus_state_wait_queue);
> +}
> +EXPORT_SYMBOL(pci_bus_change_state);
> +
>  EXPORT_SYMBOL(pci_bus_alloc_resource);
>  EXPORT_SYMBOL_GPL(pci_bus_add_device);
>  EXPORT_SYMBOL(pci_bus_add_devices);
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index e02f130..e2ef517 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -443,8 +443,52 @@ struct pci_bus {
>         struct bin_attribute    *legacy_io; /* legacy I/O for this bus */
>         struct bin_attribute    *legacy_mem; /* legacy mem */
>         unsigned int            is_added:1;
> +       atomic_t                state;
>  };
>
> +/*
> + * State machine for PCI buses.
> + *          __________________________     ______________
> + *          |                        v     |            v
> + * INITIALIZED->REGISTERED->WORKING->STOPPING->STOPPED->REMOVED->DESTROYED
> + *                     |_________________________^
> + */
> +#define        PCI_BUS_STATE_UNKNOWN           0x0     /* invalid state */
> +#define        PCI_BUS_STATE_INITIALIZED       0x1     /* device_initialize called */
> +#define        PCI_BUS_STATE_REGISTERED        0x2     /* device_add called */
> +#define        PCI_BUS_STATE_WORKING           0x4     /* working state */
> +#define        PCI_BUS_STATE_STOPPING          0x8     /* stopping devices */
> +#define        PCI_BUS_STATE_STOPPED           0x10    /* device_del called */
> +#define        PCI_BUS_STATE_REMOVED           0x20    /* bus deleted  */
> +#define        PCI_BUS_STATE_DESTROYED         0x40    /* invalid state */
> +#define        PCI_BUS_STATE_MASK              0x7F
> +
> +#ifdef CONFIG_PCI_BUS_LOCK
> +#define        PCI_BUS_STATE_LOCK              0x10000 /* for pci core only */
> +
> +static inline bool pci_bus_is_locked(struct pci_bus *bus)
> +{
> +       return !!(atomic_read(&bus->state) & PCI_BUS_STATE_LOCK);
> +}
> +#else /* CONFIG_PCI_BUS_LOCK */
> +#define        PCI_BUS_STATE_LOCK              0x0000  /* for pci core only */
> +
> +static inline bool pci_bus_is_locked(struct pci_bus *bus)
> +{
> +       return true;
> +}
> +#endif /* CONFIG_PCI_BUS_LOCK */
> +
> +static inline int pci_bus_get_state(struct pci_bus *bus)
> +{
> +       return atomic_read(&bus->state) & PCI_BUS_STATE_MASK;
> +}
> +
> +extern int pci_bus_lock_states(struct pci_bus *bus, int states);
> +extern void pci_bus_unlock(struct pci_bus *bus);
> +extern void pci_bus_change_state(struct pci_bus *bus, int old, int new,
> +                                bool unlock);
> +
>  #define pci_bus_b(n)   list_entry(n, struct pci_bus, node)
>  #define to_pci_bus(n)  container_of(n, struct pci_bus, dev)
>
> --
> 1.7.9.5
>

* Re: [RFC PATCH v1 01/22] PCI: use pci_get_domain_bus_and_slot() to avoid race conditions
  2012-09-11 22:00   ` Bjorn Helgaas
@ 2012-09-12  8:37     ` Jiang Liu
  0 siblings, 0 replies; 51+ messages in thread
From: Jiang Liu @ 2012-09-12  8:37 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Jiang Liu, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige,
	Taku Izumi, Rafael J . Wysocki, Yijing Wang, Xinwei Hu,
	linux-kernel, linux-pci

On 2012-9-12 6:00, Bjorn Helgaas wrote:
> On Tue, Aug 7, 2012 at 10:10 AM, Jiang Liu <liuj97@gmail.com> wrote:
>> There's a typical usage pattern to search a PCI device under a specific
>> PCI bus (domain, busno) as below:
>> struct pci_bus *pci_bus = pci_find_bus(domain, busno);
>> struct pci_dev *pci_dev = pci_get_slot(pci_bus, devfn);
>>
>> The above code has a race window between pci_find_bus() and pci_get_slot()
>> if PCI hotplug operations happen between them which removes the pci_bus.
>> So use PCI hotplug safe interface pci_get_domain_bus_and_slot() instead,
>> which also reduces code complexity.
> 
> This makes sense to me.  If we support hotplug, it's fundamentally
> unsafe to keep a struct pci_bus * without having a reference or some
> way to make sure it doesn't go away.  I think we should try to
> eradicate all uses of pci_find_bus() and pci_find_next_bus().
Hi Bjorn,
	I have split these changes out into a separate patch set at
http://www.spinics.net/lists/linux-pci/msg17227.html.
	In my latest PCI bus lock patch set, I have introduced a new
interface, pci_get_bus(), to replace pci_find_bus(). pci_find_next_bus()
is a little harder to deal with and still needs more work.
	Thanks!
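
For illustration, the race described in the quoted commit message can be
modeled in user space. This is a sketch only, not kernel code: every model_*
name below is hypothetical, and a pthread mutex stands in for whatever
locking the PCI core really uses. The point it shows is that the bus lookup
and the device lookup happen in one critical section, and the caller only
ever receives a referenced device, never a bare bus pointer that a
concurrent hot-remove could invalidate.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* User-space model only; these are not the kernel's structures. */
struct model_dev {
	int devfn;
	int refcount;
	struct model_dev *next;
};

struct model_bus {
	int domain;
	int busnr;
	struct model_dev *devices;
	struct model_bus *next;
};

static pthread_mutex_t model_lock = PTHREAD_MUTEX_INITIALIZER;
static struct model_bus *model_buses;

/* Publish a bus (hot-add in the model). */
void model_add_bus(struct model_bus *bus)
{
	pthread_mutex_lock(&model_lock);
	bus->next = model_buses;
	model_buses = bus;
	pthread_mutex_unlock(&model_lock);
}

/* Unlink a bus (hot-remove in the model). */
void model_remove_bus(struct model_bus *bus)
{
	struct model_bus **pp;

	pthread_mutex_lock(&model_lock);
	for (pp = &model_buses; *pp; pp = &(*pp)->next) {
		if (*pp == bus) {
			*pp = bus->next;
			bus->next = NULL;
			break;
		}
	}
	pthread_mutex_unlock(&model_lock);
}

/*
 * Find the bus and the device in one critical section and return the
 * device with an extra reference, so no bare bus pointer ever escapes.
 */
struct model_dev *model_get_domain_bus_and_slot(int domain, int busnr,
						int devfn)
{
	struct model_dev *found = NULL;
	struct model_bus *bus;
	struct model_dev *dev;

	pthread_mutex_lock(&model_lock);
	for (bus = model_buses; bus && !found; bus = bus->next) {
		if (bus->domain != domain || bus->busnr != busnr)
			continue;
		for (dev = bus->devices; dev; dev = dev->next) {
			if (dev->devfn == devfn) {
				dev->refcount++;
				found = dev;
				break;
			}
		}
	}
	pthread_mutex_unlock(&model_lock);
	return found;
}

/* Drop a reference taken by the lookup; NULL is accepted and ignored. */
void model_dev_put(struct model_dev *dev)
{
	if (!dev)
		return;
	pthread_mutex_lock(&model_lock);
	assert(dev->refcount > 0);
	dev->refcount--;
	pthread_mutex_unlock(&model_lock);
}
```

A caller pairs each successful model_get_domain_bus_and_slot() with one
model_dev_put(), in the same way the converted call sites in the patch pair
the lookup with pci_dev_put().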

> 
>> Signed-off-by: Jiang Liu <liuj97@gmail.com>
>> ---
>>  arch/ia64/sn/kernel/io_common.c      |    4 +---
>>  drivers/gpu/vga/vgaarb.c             |   15 +++------------
>>  drivers/pci/hotplug/cpcihp_generic.c |    8 ++------
>>  drivers/pci/iov.c                    |    8 ++------
>>  drivers/pci/xen-pcifront.c           |   10 ++--------
>>  5 files changed, 10 insertions(+), 35 deletions(-)
>>
>> diff --git a/arch/ia64/sn/kernel/io_common.c b/arch/ia64/sn/kernel/io_common.c
>> index fbb5f2f..8630875 100644
>> --- a/arch/ia64/sn/kernel/io_common.c
>> +++ b/arch/ia64/sn/kernel/io_common.c
>> @@ -229,7 +229,6 @@ void sn_pci_fixup_slot(struct pci_dev *dev, struct pcidev_info *pcidev_info,
>>  {
>>         int segment = pci_domain_nr(dev->bus);
>>         struct pcibus_bussoft *bs;
>> -       struct pci_bus *host_pci_bus;
>>         struct pci_dev *host_pci_dev;
>>         unsigned int bus_no, devfn;
>>
>> @@ -245,8 +244,7 @@ void sn_pci_fixup_slot(struct pci_dev *dev, struct pcidev_info *pcidev_info,
>>
>>         bus_no = (pcidev_info->pdi_slot_host_handle >> 32) & 0xff;
>>         devfn = pcidev_info->pdi_slot_host_handle & 0xffffffff;
>> -       host_pci_bus = pci_find_bus(segment, bus_no);
>> -       host_pci_dev = pci_get_slot(host_pci_bus, devfn);
>> +       host_pci_dev = pci_get_domain_bus_and_slot(segment, bus_no, devfn);
>>
>>         pcidev_info->host_pci_dev = host_pci_dev;
>>         pcidev_info->pdi_linux_pcidev = dev;
>> diff --git a/drivers/gpu/vga/vgaarb.c b/drivers/gpu/vga/vgaarb.c
>> index 3df8fc0..b6852b7 100644
>> --- a/drivers/gpu/vga/vgaarb.c
>> +++ b/drivers/gpu/vga/vgaarb.c
>> @@ -1066,7 +1066,6 @@ static ssize_t vga_arb_write(struct file *file, const char __user * buf,
>>                 }
>>
>>         } else if (strncmp(curr_pos, "target ", 7) == 0) {
>> -               struct pci_bus *pbus;
>>                 unsigned int domain, bus, devfn;
>>                 struct vga_device *vgadev;
>>
>> @@ -1085,19 +1084,11 @@ static ssize_t vga_arb_write(struct file *file, const char __user * buf,
>>                         pr_debug("vgaarb: %s ==> %x:%x:%x.%x\n", curr_pos,
>>                                 domain, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
>>
>> -                       pbus = pci_find_bus(domain, bus);
>> -                       pr_debug("vgaarb: pbus %p\n", pbus);
>> -                       if (pbus == NULL) {
>> -                               pr_err("vgaarb: invalid PCI domain and/or bus address %x:%x\n",
>> -                                       domain, bus);
>> -                               ret_val = -ENODEV;
>> -                               goto done;
>> -                       }
>> -                       pdev = pci_get_slot(pbus, devfn);
>> +                       pdev = pci_get_domain_bus_and_slot(domain, bus, devfn);
>>                         pr_debug("vgaarb: pdev %p\n", pdev);
>>                         if (!pdev) {
>> -                               pr_err("vgaarb: invalid PCI address %x:%x\n",
>> -                                       bus, devfn);
>> +                               pr_err("vgaarb: invalid PCI address %x:%x:%x\n",
>> +                                       domain, bus, devfn);
>>                                 ret_val = -ENODEV;
>>                                 goto done;
>>                         }
>> diff --git a/drivers/pci/hotplug/cpcihp_generic.c b/drivers/pci/hotplug/cpcihp_generic.c
>> index 81af764..a6a71c4 100644
>> --- a/drivers/pci/hotplug/cpcihp_generic.c
>> +++ b/drivers/pci/hotplug/cpcihp_generic.c
>> @@ -154,12 +154,8 @@ static int __init cpcihp_generic_init(void)
>>         if(!r)
>>                 return -EBUSY;
>>
>> -       bus = pci_find_bus(0, bridge_busnr);
>> -       if (!bus) {
>> -               err("Invalid bus number %d", bridge_busnr);
>> -               return -EINVAL;
>> -       }
>> -       dev = pci_get_slot(bus, PCI_DEVFN(bridge_slot, 0));
>> +       dev = pci_get_domain_bus_and_slot(0, bridge_busnr,
>> +                                         PCI_DEVFN(bridge_slot, 0));
>>         if(!dev || dev->hdr_type != PCI_HEADER_TYPE_BRIDGE) {
>>                 err("Invalid bridge device %s", bridge);
>>                 pci_dev_put(dev);
>> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>> index 74bbaf8..c7d2969 100644
>> --- a/drivers/pci/iov.c
>> +++ b/drivers/pci/iov.c
>> @@ -152,15 +152,11 @@ failed1:
>>  static void virtfn_remove(struct pci_dev *dev, int id, int reset)
>>  {
>>         char buf[VIRTFN_ID_LEN];
>> -       struct pci_bus *bus;
>>         struct pci_dev *virtfn;
>>         struct pci_sriov *iov = dev->sriov;
>>
>> -       bus = pci_find_bus(pci_domain_nr(dev->bus), virtfn_bus(dev, id));
>> -       if (!bus)
>> -               return;
>> -
>> -       virtfn = pci_get_slot(bus, virtfn_devfn(dev, id));
>> +       virtfn = pci_get_domain_bus_and_slot(pci_domain_nr(dev->bus),
>> +                       virtfn_bus(dev, id), virtfn_devfn(dev, id));
>>         if (!virtfn)
>>                 return;
>>
>> diff --git a/drivers/pci/xen-pcifront.c b/drivers/pci/xen-pcifront.c
>> index d6cc62c..def8d0b 100644
>> --- a/drivers/pci/xen-pcifront.c
>> +++ b/drivers/pci/xen-pcifront.c
>> @@ -982,7 +982,6 @@ static int pcifront_detach_devices(struct pcifront_device *pdev)
>>         int err = 0;
>>         int i, num_devs;
>>         unsigned int domain, bus, slot, func;
>> -       struct pci_bus *pci_bus;
>>         struct pci_dev *pci_dev;
>>         char str[64];
>>
>> @@ -1032,13 +1031,8 @@ static int pcifront_detach_devices(struct pcifront_device *pdev)
>>                         goto out;
>>                 }
>>
>> -               pci_bus = pci_find_bus(domain, bus);
>> -               if (!pci_bus) {
>> -                       dev_dbg(&pdev->xdev->dev, "Cannot get bus %04x:%02x\n",
>> -                               domain, bus);
>> -                       continue;
>> -               }
>> -               pci_dev = pci_get_slot(pci_bus, PCI_DEVFN(slot, func));
>> +               pci_dev = pci_get_domain_bus_and_slot(domain, bus,
>> +                               PCI_DEVFN(slot, func));
>>                 if (!pci_dev) {
>>                         dev_dbg(&pdev->xdev->dev,
>>                                 "Cannot get PCI device %04x:%02x:%02x.%d\n",
>> --
>> 1.7.9.5
>>
> 
> .
> 




* Re: [RFC PATCH v1 02/22] PCI: trivial cleanups for drivers/pci/remove.c
  2012-09-11 22:03   ` Bjorn Helgaas
@ 2012-09-12  8:50     ` Jiang Liu
  0 siblings, 0 replies; 51+ messages in thread
From: Jiang Liu @ 2012-09-12  8:50 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Jiang Liu, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige,
	Taku Izumi, Rafael J . Wysocki, Yijing Wang, Xinwei Hu,
	linux-kernel, linux-pci

On 2012-9-12 6:03, Bjorn Helgaas wrote:
> On Tue, Aug 7, 2012 at 10:10 AM, Jiang Liu <liuj97@gmail.com> wrote:
>> Trivial cleanups for drivers/pci/remove.c:
>> 1) move the comment for pci_stop_and_remove_bus_device() to the right place
>> 2) rename __pci_remove_behind_bridge() to pci_remove_behind_bridge()
> 
> This seems fine, but I think my pci/bjorn-cleanup-remove branch subsumes it.
Hi Bjorn,
	I have rebased my latest patchset onto your pci-next branch, so this
patch has been dropped.

> 
>> Signed-off-by: Jiang Liu <liuj97@gmail.com>
>> ---
>>  drivers/pci/remove.c |   33 +++++++++++++++++----------------
>>  1 file changed, 17 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
>> index 04a4861..33b6318 100644
>> --- a/drivers/pci/remove.c
>> +++ b/drivers/pci/remove.c
>> @@ -78,25 +78,14 @@ void pci_remove_bus(struct pci_bus *pci_bus)
>>  }
>>  EXPORT_SYMBOL(pci_remove_bus);
>>
>> -static void __pci_remove_behind_bridge(struct pci_dev *dev);
>> -/**
>> - * pci_stop_and_remove_bus_device - remove a PCI device and any children
>> - * @dev: the device to remove
>> - *
>> - * Remove a PCI device from the device lists, informing the drivers
>> - * that the device has been removed.  We also remove any subordinate
>> - * buses and children in a depth-first manner.
>> - *
>> - * For each device we remove, delete the device structure from the
>> - * device lists, remove the /proc entry, and notify userspace
>> - * (/sbin/hotplug).
>> - */
>> +static void pci_remove_behind_bridge(struct pci_dev *dev);
>> +
>>  void __pci_remove_bus_device(struct pci_dev *dev)
>>  {
>>         if (dev->subordinate) {
>>                 struct pci_bus *b = dev->subordinate;
>>
>> -               __pci_remove_behind_bridge(dev);
>> +               pci_remove_behind_bridge(dev);
>>                 pci_remove_bus(b);
>>                 dev->subordinate = NULL;
>>         }
>> @@ -105,13 +94,25 @@ void __pci_remove_bus_device(struct pci_dev *dev)
>>  }
>>  EXPORT_SYMBOL(__pci_remove_bus_device);
>>
>> +/**
>> + * pci_stop_and_remove_bus_device - remove a PCI device and any children
>> + * @dev: the device to remove
>> + *
>> + * Remove a PCI device from the device lists, informing the drivers
>> + * that the device has been removed.  We also remove any subordinate
>> + * buses and children in a depth-first manner.
>> + *
>> + * For each device we remove, delete the device structure from the
>> + * device lists, remove the /proc entry, and notify userspace
>> + * (/sbin/hotplug).
>> + */
>>  void pci_stop_and_remove_bus_device(struct pci_dev *dev)
>>  {
>>         pci_stop_bus_device(dev);
>>         __pci_remove_bus_device(dev);
>>  }
>>
>> -static void __pci_remove_behind_bridge(struct pci_dev *dev)
>> +static void pci_remove_behind_bridge(struct pci_dev *dev)
>>  {
>>         struct list_head *l, *n;
>>
>> @@ -141,7 +142,7 @@ static void pci_stop_behind_bridge(struct pci_dev *dev)
>>  void pci_stop_and_remove_behind_bridge(struct pci_dev *dev)
>>  {
>>         pci_stop_behind_bridge(dev);
>> -       __pci_remove_behind_bridge(dev);
>> +       pci_remove_behind_bridge(dev);
>>  }
>>
>>  static void pci_stop_bus_devices(struct pci_bus *bus)
>> --
>> 1.7.9.5
>>
> 
> .
> 




* Re: [RFC PATCH v1 21/22] PCI: cleanups for PCI bus lock implementation
  2012-09-11 23:21   ` Bjorn Helgaas
@ 2012-09-12  8:58     ` Jiang Liu
  0 siblings, 0 replies; 51+ messages in thread
From: Jiang Liu @ 2012-09-12  8:58 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Jiang Liu, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige,
	Taku Izumi, Rafael J . Wysocki, Yijing Wang, Xinwei Hu,
	linux-kernel, linux-pci

On 2012-9-12 7:21, Bjorn Helgaas wrote:
> On Tue, Aug 7, 2012 at 10:11 AM, Jiang Liu <liuj97@gmail.com> wrote:
>> Now all Archs have been converted to the new PCI bus lock mechanism,
>> so clean up unused code.
> 
> When you say "all arches," do you really mean "x86 and ia64"?  I
> assume all the other architectures have similar issues.  Or is this
> somehow ACPI-specific?
Hi Bjorn,
	This is still an RFC, posted to give an overview of the whole patch set.
What I mean here is that once all arches have been converted, we will apply
the following patches :) I'm currently working on converting all arches, and
will post the latest version once it passes basic tests.
	Thanks!

> 
>> Signed-off-by: Jiang Liu <liuj97@gmail.com>
>> ---
>>  drivers/pci/Kconfig     |    3 ---
>>  drivers/pci/bus.c       |    1 -
>>  drivers/pci/pci-sysfs.c |    9 ---------
>>  drivers/pci/probe.c     |    4 +---
>>  include/linux/pci.h     |   10 ----------
>>  5 files changed, 1 insertion(+), 26 deletions(-)
>>
>> diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
>> index 5a796c0..848bfb8 100644
>> --- a/drivers/pci/Kconfig
>> +++ b/drivers/pci/Kconfig
>> @@ -120,6 +120,3 @@ config PCI_IOAPIC
>>  config PCI_LABEL
>>         def_bool y if (DMI || ACPI)
>>         select NLS
>> -
>> -config PCI_BUS_LOCK
>> -       def_bool y if (X86 || IA64)
>> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
>> index 371f20a..308c376 100644
>> --- a/drivers/pci/bus.c
>> +++ b/drivers/pci/bus.c
>> @@ -196,7 +196,6 @@ int pci_bus_add_child(struct pci_bus *bus)
>>                         pci_create_legacy_files(bus);
>>                         pci_bus_change_state(bus, PCI_BUS_STATE_INITIALIZED,
>>                                         PCI_BUS_STATE_WORKING, false);
>> -                       bus->is_added = 1;
>>                 }
>>         }
>>
>> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
>> index 11043b4..a5a4195 100644
>> --- a/drivers/pci/pci-sysfs.c
>> +++ b/drivers/pci/pci-sysfs.c
>> @@ -284,7 +284,6 @@ msi_bus_store(struct device *dev, struct device_attribute *attr,
>>  }
>>
>>  #ifdef CONFIG_HOTPLUG
>> -static DEFINE_MUTEX(pci_remove_rescan_mutex);
>>  static ssize_t bus_rescan_store(struct bus_type *bus, const char *buf,
>>                                 size_t count)
>>  {
>> @@ -296,13 +295,11 @@ static ssize_t bus_rescan_store(struct bus_type *bus, const char *buf,
>>
>>         if (val) {
>>                 pci_host_bridge_hotplug_lock();
>> -               mutex_lock(&pci_remove_rescan_mutex);
>>                 while ((b = pci_find_next_bus(b)) != NULL)
>>                         if (pci_bus_lock_states(b, PCI_BUS_STATE_WORKING) > 0) {
>>                                 pci_rescan_bus(b);
>>                                 pci_bus_unlock(b);
>>                         }
>> -               mutex_unlock(&pci_remove_rescan_mutex);
>>                 pci_host_bridge_hotplug_unlock();
>>         }
>>         return count;
>> @@ -326,13 +323,11 @@ dev_rescan_store(struct device *dev, struct device_attribute *attr,
>>         if (val) {
>>                 struct pci_bus *bus = pdev->bus;
>>
>> -               mutex_lock(&pci_remove_rescan_mutex);
>>                 if (pci_bus_lock_states(bus, PCI_BUS_STATE_WORKING) > 0) {
>>                         if (pdev->is_added)
>>                                 pci_rescan_bus(bus);
>>                         pci_bus_unlock(bus);
>>                 }
>> -               mutex_unlock(&pci_remove_rescan_mutex);
>>         }
>>         return count;
>>  }
>> @@ -342,13 +337,11 @@ static void remove_callback(struct device *dev)
>>         struct pci_dev *pdev = to_pci_dev(dev);
>>         struct pci_bus *bus = pdev->bus;
>>
>> -       mutex_lock(&pci_remove_rescan_mutex);
>>         if (pci_bus_lock_states(bus, PCI_BUS_STATE_WORKING) > 0) {
>>                 pci_bus_get(bus);
>>                 pci_stop_and_remove_bus_device(pdev);
>>                 pci_unlock_and_put_bus(bus);
>>         }
>> -       mutex_unlock(&pci_remove_rescan_mutex);
>>  }
>>
>>  static ssize_t
>> @@ -382,14 +375,12 @@ dev_bus_rescan_store(struct device *dev, struct device_attribute *attr,
>>                 return -EINVAL;
>>
>>         if (val) {
>> -               mutex_lock(&pci_remove_rescan_mutex);
>>                 if (!pci_is_root_bus(bus))
>>                         pci_rescan_bus_bridge_resize(bus->self);
>>                 else if (pci_bus_lock_states(bus, PCI_BUS_STATE_WORKING) > 0) {
>>                         pci_rescan_bus(bus);
>>                         pci_bus_unlock(bus);
>>                 }
>> -               mutex_unlock(&pci_remove_rescan_mutex);
>>         }
>>         return count;
>>  }
>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
>> index da6f04c..09517c3 100644
>> --- a/drivers/pci/probe.c
>> +++ b/drivers/pci/probe.c
>> @@ -1626,11 +1626,9 @@ unsigned int __devinit pci_scan_child_bus(struct pci_bus *bus)
>>         if (pci_bus_get_state(bus) < PCI_BUS_STATE_WORKING) {
>>                 dev_dbg(&bus->dev, "fixups for bus\n");
>>                 pcibios_fixup_bus(bus);
>> -               if (pci_is_root_bus(bus)) {
>> +               if (pci_is_root_bus(bus))
>>                         pci_bus_change_state(bus, PCI_BUS_STATE_REGISTERED,
>>                                              PCI_BUS_STATE_WORKING, false);
>> -                       bus->is_added = 1;
>> -               }
>>         }
>>
>>         for (pass=0; pass < 2; pass++)
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index 9e52e88..0e50ec8 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -442,7 +442,6 @@ struct pci_bus {
>>         struct device           dev;
>>         struct bin_attribute    *legacy_io; /* legacy I/O for this bus */
>>         struct bin_attribute    *legacy_mem; /* legacy mem */
>> -       unsigned int            is_added:1;
>>         atomic_t                state;
>>  };
>>
>> @@ -463,21 +462,12 @@ struct pci_bus {
>>  #define        PCI_BUS_STATE_DESTROYED         0x40    /* invalid state */
>>  #define        PCI_BUS_STATE_MASK              0x7F
>>
>> -#ifdef CONFIG_PCI_BUS_LOCK
>>  #define        PCI_BUS_STATE_LOCK              0x10000 /* for pci core only */
>>
>>  static inline bool pci_bus_is_locked(struct pci_bus *bus)
>>  {
>>         return !!(atomic_read(&bus->state) & PCI_BUS_STATE_LOCK);
>>  }
>> -#else /* CONFIG_PCI_BUS_LOCK */
>> -#define        PCI_BUS_STATE_LOCK              0x0000  /* for pci core only */
>> -
>> -static inline bool pci_bus_is_locked(struct pci_bus *bus)
>> -{
>> -       return true;
>> -}
>> -#endif /* CONFIG_PCI_BUS_LOCK */
>>
>>  static inline int pci_bus_get_state(struct pci_bus *bus)
>>  {
>> --
>> 1.7.9.5
>>
> 
> .
> 




* Re: [RFC PATCH v1 19/22] PCI/x86: enable PCI bus lock mechanism for x86 platforms
  2012-09-11 23:22   ` Bjorn Helgaas
@ 2012-09-12  9:56     ` Jiang Liu
  0 siblings, 0 replies; 51+ messages in thread
From: Jiang Liu @ 2012-09-12  9:56 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Jiang Liu, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige,
	Taku Izumi, Rafael J . Wysocki, Yijing Wang, Xinwei Hu,
	linux-kernel, linux-pci

On 2012-9-12 7:22, Bjorn Helgaas wrote:
> On Tue, Aug 7, 2012 at 10:10 AM, Jiang Liu <liuj97@gmail.com> wrote:
>> This patch turns on PCI bus lock mechanism for x86 platforms. It also
>> enhances x86 specific PCI implementation to support PCI bus lock.
>>
>> Signed-off-by: Jiang Liu <liuj97@gmail.com>
>> ---
>>  arch/x86/pci/acpi.c   |    6 +++++-
>>  arch/x86/pci/common.c |   12 ++++++++++++
>>  drivers/pci/Kconfig   |    3 +--
>>  3 files changed, 18 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
>> index 2bb885a..c68dbdf 100644
>> --- a/arch/x86/pci/acpi.c
>> +++ b/arch/x86/pci/acpi.c
>> @@ -414,7 +414,8 @@ struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_pci_root *root)
>>          * Maybe the desired pci bus has been already scanned. In such case
>>          * it is unnecessary to scan the pci bus with the given domain,busnum.
>>          */
>> -       bus = pci_find_bus(domain, busnum);
>> +       bus = __pci_get_and_lock_bus(domain, busnum,
>> +                                    PCI_BUS_STATE_STOPPING - 1);
>>         if (bus) {
>>                 /*
>>                  * If the desired bus exits, the content of bus->sysdata will
>> @@ -449,6 +450,7 @@ struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_pci_root *root)
>>                         pci_free_resource_list(&resources);
>>                         __release_pci_root_info(info);
>>                 }
>> +               pci_bus_get(bus);
>>         }
>>
>>         /* After the PCI-E bus has been walked and all devices discovered,
>> @@ -475,6 +477,8 @@ struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_pci_root *root)
>>  #endif
>>         }
>>
>> +       pci_unlock_and_put_bus(bus);
>> +
>>         return bus;
>>  }
>>
>> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
>> index 0ad990a..8b7ae63 100644
>> --- a/arch/x86/pci/common.c
>> +++ b/arch/x86/pci/common.c
>> @@ -667,6 +667,18 @@ struct pci_bus * __devinit pci_scan_bus_with_sysdata(int busno)
>>         return pci_scan_bus_on_node(busno, &pci_root_ops, -1);
>>  }
>>
>> +static DEFINE_MUTEX(pci_root_bus_mutex);
>> +
>> +void arch_pci_lock_host_bridge_hotplug(void)
>> +{
>> +       mutex_lock(&pci_root_bus_mutex);
>> +}
>> +
>> +void arch_pci_unlock_host_bridge_hotplug(void)
>> +{
>> +       mutex_unlock(&pci_root_bus_mutex);
>> +}
> 
> Are these left over from previous work?  I don't see any reference to
> them elsewhere in your patch series.
Hi Bjorn,
	Eagle eyes! Yes, these are left over from previous work; I have since
switched to another mechanism for root bus locks.
	--Gerry

> 
>>  /*
>>   * NUMA info for PCI busses
>>   *
>> diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
>> index a6df8b1..1bbe924 100644
>> --- a/drivers/pci/Kconfig
>> +++ b/drivers/pci/Kconfig
>> @@ -122,5 +122,4 @@ config PCI_LABEL
>>         select NLS
>>
>>  config PCI_BUS_LOCK
>> -       bool
>> -       default n
>> +       def_bool y if X86
>> --
>> 1.7.9.5
>>
> 
> .
> 




* [PATCH] eeepc-laptop: fix device reference count leakage in eeepc_rfkill_hotplug()
  2012-09-11 23:18   ` Bjorn Helgaas
@ 2012-09-12 14:24     ` Jiang Liu
  2012-09-12 19:59       ` Bjorn Helgaas
  0 siblings, 1 reply; 51+ messages in thread
From: Jiang Liu @ 2012-09-12 14:24 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Jiang Liu, huxinwei, Jiang Liu, linux-pci

Fix a device reference count leakage issue in function
eeepc_rfkill_hotplug().
---
 drivers/platform/x86/eeepc-laptop.c |   10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/platform/x86/eeepc-laptop.c b/drivers/platform/x86/eeepc-laptop.c
index dab91b4..5ca2641 100644
--- a/drivers/platform/x86/eeepc-laptop.c
+++ b/drivers/platform/x86/eeepc-laptop.c
@@ -610,12 +610,12 @@ static void eeepc_rfkill_hotplug(struct eeepc_laptop *eeepc, acpi_handle handle)
 
 		if (!bus) {
 			pr_warn("Unable to find PCI bus 1?\n");
-			goto out_unlock;
+			goto out_put_dev;
 		}
 
 		if (pci_bus_read_config_dword(bus, 0, PCI_VENDOR_ID, &l)) {
 			pr_err("Unable to read PCI config space?\n");
-			goto out_unlock;
+			goto out_put_dev;
 		}
 
 		absent = (l == 0xffffffff);
@@ -627,7 +627,7 @@ static void eeepc_rfkill_hotplug(struct eeepc_laptop *eeepc, acpi_handle handle)
 				absent ? "absent" : "present");
 			pr_warn("skipped wireless hotplug as probably "
 				"inappropriate for this model\n");
-			goto out_unlock;
+			goto out_put_dev;
 		}
 
 		if (!blocked) {
@@ -635,7 +635,7 @@ static void eeepc_rfkill_hotplug(struct eeepc_laptop *eeepc, acpi_handle handle)
 			if (dev) {
 				/* Device already present */
 				pci_dev_put(dev);
-				goto out_unlock;
+				goto out_put_dev;
 			}
 			dev = pci_scan_single_device(bus, 0);
 			if (dev) {
@@ -650,6 +650,8 @@ static void eeepc_rfkill_hotplug(struct eeepc_laptop *eeepc, acpi_handle handle)
 				pci_dev_put(dev);
 			}
 		}
+out_put_dev:
+		pci_dev_put(port);
 	}
 
 out_unlock:
-- 
1.7.9.5



* Re: [RFC PATCH v1 06/22] PCI: use a global lock to serialize PCI root bridge hotplug operations
  2012-09-11 22:57   ` Bjorn Helgaas
@ 2012-09-12 15:42     ` Jiang Liu
  2012-09-12 16:51       ` Bjorn Helgaas
  0 siblings, 1 reply; 51+ messages in thread
From: Jiang Liu @ 2012-09-12 15:42 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige, Jiang Liu,
	Taku Izumi, Rafael J . Wysocki, Yijing Wang, Xinwei Hu,
	linux-kernel, linux-pci

On 09/12/2012 06:57 AM, Bjorn Helgaas wrote:
> On Tue, Aug 7, 2012 at 10:10 AM, Jiang Liu <liuj97@gmail.com> wrote:
>> Currently there's no mechanism to protect the global pci_root_buses list
>> from dynamic change at runtime. That means, PCI root bridge hotplug
>> operations, which dynamically change the pci_root_buses list, may cause
>> invalid memory accesses.
>>
>> So introduce a global lock to serialize accesses to the pci_root_buses
>> list and serialize PCI host bridge hotplug operations.
>>
>> Be careful, never try to acquire this global lock from PCI device drivers,
>> that may cause deadlocks.
>>
>> Signed-off-by: Jiang Liu <liuj97@gmail.com>
>> ---
>>  drivers/acpi/pci_root.c           |    8 +++++++-
>>  drivers/edac/i7core_edac.c        |   16 +++++++---------
>>  drivers/gpu/drm/drm_fops.c        |    6 +++++-
>>  drivers/pci/host-bridge.c         |   19 +++++++++++++++++++
>>  drivers/pci/hotplug/sgi_hotplug.c |    2 ++
>>  drivers/pci/pci-sysfs.c           |    2 ++
>>  drivers/pci/probe.c               |    5 ++++-
>>  drivers/pci/search.c              |    9 ++++++++-
>>  include/linux/pci.h               |    8 ++++++++
>>  9 files changed, 62 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
>> index 7aff631..6bd0e32 100644
>> --- a/drivers/acpi/pci_root.c
>> +++ b/drivers/acpi/pci_root.c
>> @@ -463,6 +463,8 @@ static int __devinit acpi_pci_root_add(struct acpi_device *device)
>>         if (!root)
>>                 return -ENOMEM;
>>
>> +       pci_host_bridge_hotplug_lock();
> 
> Here's where I get lost.  This is an ACPI driver's .add() routine,
> which is analogous to a PCI driver's .probe() routine.  PCI driver
> .probe() routines don't need to be concerned with PCI device hotplug.
> All the hotplug-related locking is handled by the PCI core, not by
> individual drivers.  So why do we need it here?
> 
> I'm not suggesting that the existing locking is correct.  I'm just not
> convinced this is the right way to fix it.
> 
> The commit log says we need protection for the global pci_root_buses
> list.  But even with this whole series, we still traverse the list
> without protection in places like pcibios_resource_survey() and
> pci_assign_unassigned_resources().
> 
> Maybe we can make progress on this by identifying specific failures
> that can happen in a couple of these paths, e.g., acpi_pci_root_add()
> and i7core_xeon_pci_fixup().  If we look at those paths, we might find a
> way to fix this in a more general fashion than throwing in lock/unlock
> pairs.
> 
> It might also help to know what the rule is for when we need to use
> pci_host_bridge_hotplug_lock() and pci_host_bridge_hotplug_unlock().
> Apparently it is not as simple as protecting every reference to the
> pci_root_buses list.
Hi Bjorn,
	Protecting the pci_root_buses list is really challenging work :)
The root of the problem is the pci_find_next_bus() interface, which was
designed to be called at boot time only. I have tried several other
solutions, but they all failed.
	First I tried a "pci_get_next_bus()" that holds a reference to the
returned root bus "pci_bus". That doesn't help, because pci_bus can still
be removed from the pci_root_buses list even while you hold a reference to
it. The next call to pci_get_next_bus(pci_bus) then runs into trouble
because pci_bus->node.next is no longer valid.
	Then I tried RCU, which also failed because callers of
pci_get_next_bus() may sleep.
	Finally I settled on the global host bridge hotplug lock. The rules
for locking are:
	1) No locking is needed when accessing the pci_root_buses list during
system initialization, which is single-threaded and sees no hotplug
operations. (This means system initialization, not driver initialization,
because a driver's initialization code may run at runtime when the driver
is loaded.)
	2) The global lock must be acquired when accessing the pci_root_buses
list at runtime.

	I have made several passes over the code to identify runtime accesses
to the pci_root_buses list, but I may still have missed some :(

	I think the best solution is to get rid of pci_find_next_bus()
altogether, but I'm not sure whether we can achieve that.
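
One possible alternative to an iterator that is resumed across calls is a
walker that visits the whole list inside a single critical section. The
user-space sketch below is an assumption for illustration only: the
model_root_* names are hypothetical and are not interfaces from this series.
A sleeping lock (here a pthread mutex) is used so that callbacks may block,
which is the property that ruled out RCU above. Because the caller never
keeps a list position between calls, a stale node.next cannot be followed.

```c
#include <pthread.h>
#include <stddef.h>

/* User-space stand-in for a root bus on the global list. */
struct model_root_bus {
	int busnr;
	struct model_root_bus *next;
};

typedef int (*model_root_bus_fn)(struct model_root_bus *bus, void *data);

static pthread_mutex_t model_root_lock = PTHREAD_MUTEX_INITIALIZER;
static struct model_root_bus *model_root_buses;

/* Host bridge hot-add in the model. */
void model_root_bus_add(struct model_root_bus *bus)
{
	pthread_mutex_lock(&model_root_lock);
	bus->next = model_root_buses;
	model_root_buses = bus;
	pthread_mutex_unlock(&model_root_lock);
}

/* Host bridge hot-remove in the model. */
void model_root_bus_remove(struct model_root_bus *bus)
{
	struct model_root_bus **pp;

	pthread_mutex_lock(&model_root_lock);
	for (pp = &model_root_buses; *pp; pp = &(*pp)->next) {
		if (*pp == bus) {
			*pp = bus->next;
			bus->next = NULL;
			break;
		}
	}
	pthread_mutex_unlock(&model_root_lock);
}

/*
 * Call fn for every root bus with the list lock held.  The walk stops
 * early and returns fn's value if fn returns nonzero.  fn must not add
 * or remove root buses, since that would deadlock on the lock held here.
 */
int model_for_each_root_bus(model_root_bus_fn fn, void *data)
{
	struct model_root_bus *bus;
	int ret = 0;

	pthread_mutex_lock(&model_root_lock);
	for (bus = model_root_buses; bus; bus = bus->next) {
		ret = fn(bus, data);
		if (ret)
			break;
	}
	pthread_mutex_unlock(&model_root_lock);
	return ret;
}

/* Example callback: count the buses visited. */
int model_count_root_bus(struct model_root_bus *bus, void *data)
{
	(void)bus;
	(*(int *)data)++;
	return 0;
}
```

The restriction on the callback mirrors the warning in the commit log that
PCI device drivers must never try to acquire the global lock themselves.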

> 
>> diff --git a/drivers/gpu/drm/drm_fops.c b/drivers/gpu/drm/drm_fops.c
>> index 123de28..f559b5b 100644
>> --- a/drivers/gpu/drm/drm_fops.c
>> +++ b/drivers/gpu/drm/drm_fops.c
>> @@ -344,9 +344,13 @@ static int drm_open_helper(struct inode *inode, struct file *filp,
>>                         pci_dev_put(pci_dev);
>>                 }
>>                 if (!dev->hose) {
>> -                       struct pci_bus *b = pci_bus_b(pci_root_buses.next);
>> +                       struct pci_bus *b;
>> +
>> +                       pci_host_bridge_hotplug_lock();
>> +                       b = pci_find_next_bus(NULL);
> 
> Here's another case I don't understand.  We know already that
> pci_find_next_bus() is unsafe with respect to hotplug because it
> doesn't hold a reference on the struct pci_bus it returns.  Can't we
> replace this with some variety of pci_get_next_bus() that *does*
> acquire a reference?
> 
> Actually, I looked at the callers of pci_find_next_bus(), and most of
> them are unsafe in an even deeper way: they're doing device setup in
> initcalls, so that setup won't be done for hot-added devices.  For
> example, I can pick on sba_init() because I think I wrote it back in
> the dark ages.  sba_init() is a subsys_initcall that calls
> sba_connect_bus() for every bus we know about at boot-time, and it
> sets the host bridge's iommu pointer.  If we were to hot-add a host
> bridge, we would never set the iommu pointer.
That's a more fundamental issue, and another big topic for us :(

> 
> I'm not sure why you didn't add a pci_host_bridge_hotplug_lock() in
> the sba_init() path, since it looks similar to the drm_open_helper()
> path above.  But in any case, I think that would be the wrong thing to
> do because it would fix the superficial problem while leaving the
> deeper problem of host bridge hot-add not setting the iommu pointer.
sba_init() is called during system initialization through subsys_initcall(),
so it needs no extra protection.

>>                         if (b)
>>                                 dev->hose = b->sysdata;
>> +                       pci_host_bridge_hotplug_unlock();
>>                 }
>>         }
>>  #endif
> ...
>> diff --git a/drivers/pci/search.c b/drivers/pci/search.c
>> index 993d4a0..f1147a7 100644
>> --- a/drivers/pci/search.c
>> +++ b/drivers/pci/search.c
>> @@ -100,6 +100,13 @@ struct pci_bus * pci_find_bus(int domain, int busnr)
>>   * initiated by passing %NULL as the @from argument.  Otherwise if
>>   * @from is not %NULL, searches continue from next device on the
>>   * global list.
>> + *
>> + * Please don't call this function at runtime if possible.
>> + * It's designed to be called at boot time only because it's unsafe
>> + * to PCI root bridge hotplug operations. But some drivers do invoke
>> + * it at runtime and it's hard to fix those drivers. In such cases,
>> + * use pci_host_bridge_hotplug_{lock|unlock}() to protect the PCI root
>> + * bus list, but you need to be really careful to avoid deadlock.
> 
> I'm not convinced that it's too hard to fix these drivers :)  There
> are only six callers, and the only ones that could possibly be at
> runtime are drm_open_helper(), sn_pci_hotplug_init(), and
> bus_rescan_store().
The same issue applies to i7core_xeon_pci_fixup() in the i7core_edac driver.
I will think about this approach.

--Gerry

> 
> Bjorn
> 



* Re: [RFC PATCH v1 06/22] PCI: use a global lock to serialize PCI root bridge hotplug operations
  2012-09-12 15:42     ` Jiang Liu
@ 2012-09-12 16:51       ` Bjorn Helgaas
  2012-09-13 16:00         ` [PATCH 1/2] PCI: introduce root bridge hotplug safe interfaces to walk root buses Jiang Liu
                           ` (3 more replies)
  0 siblings, 4 replies; 51+ messages in thread
From: Bjorn Helgaas @ 2012-09-12 16:51 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige, Jiang Liu,
	Taku Izumi, Rafael J . Wysocki, Yijing Wang, Xinwei Hu,
	linux-kernel, linux-pci

On Wed, Sep 12, 2012 at 9:42 AM, Jiang Liu <liuj97@gmail.com> wrote:
> On 09/12/2012 06:57 AM, Bjorn Helgaas wrote:
>> On Tue, Aug 7, 2012 at 10:10 AM, Jiang Liu <liuj97@gmail.com> wrote:
>>> Currently there's no mechanism to protect the global pci_root_buses list
>>> from dynamic change at runtime. That means, PCI root bridge hotplug
>>> operations, which dynamically change the pci_root_buses list, may cause
>>> invalid memory accesses.
>>>
>>> So introduce a global lock to serialize accesses to the pci_root_buses
>>> list and serialize PCI host bridge hotplug operations.

>>> @@ -463,6 +463,8 @@ static int __devinit acpi_pci_root_add(struct acpi_device *device)
>>>         if (!root)
>>>                 return -ENOMEM;
>>>
>>> +       pci_host_bridge_hotplug_lock();
>>
>> Here's where I get lost.  This is an ACPI driver's .add() routine,
>> which is analogous to a PCI driver's .probe() routine.  PCI driver
>> .probe() routines don't need to be concerned with PCI device hotplug.
>> All the hotplug-related locking is handled by the PCI core, not by
>> individual drivers.  So why do we need it here?
>>
>> I'm not suggesting that the existing locking is correct.  I'm just not
>> convinced this is the right way to fix it.
>>
>> The commit log says we need protection for the global pci_root_buses
>> list.  But even with this whole series, we still traverse the list
>> without protection in places like pcibios_resource_survey() and
>> pci_assign_unassigned_resources().
>>
>> Maybe we can make progress on this by identifying specific failures
>> that can happen in a couple of these paths, e.g., acpi_pci_root_add()
>> and i7core_xeon_pci_fixup().  If we look at those paths, we might find
>> a way to fix this in a more general fashion than throwing in lock/unlock
>> pairs.
>>
>> It might also help to know what the rule is for when we need to use
>> pci_host_bridge_hotplug_lock() and pci_host_bridge_hotplug_unlock().
>> Apparently it is not as simple as protecting every reference to the
>> pci_root_buses list.
> Hi Bjorn,
>         It's really challenging work to protect the pci_root_buses list :)

Yes.  IIRC, your last patch was to unexport pci_root_buses, which I
think is a great idea.

> All evils are caused by the pci_find_next_bus() interface, which is designed
> to be called at boot time only. I have tried several other solutions but
> failed.
>         First I tried "pci_get_next_bus()" which holds a reference to the
> returned root bus "pci_bus". But that doesn't help because pci_bus could
> be removed from the pci_root_buses list even if you hold a reference to
> pci_bus. And it will cause trouble when you call pci_get_next_bus(pci_bus)
> again because pci_bus->node.next is invalid now.

That sounds like a bug.  If an interface returns a structure after
acquiring a reference, the caller should be able to rely on the
structure remaining valid.  Adding extra locks doesn't feel like the
right solution for that problem.
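
The failure mode described above (a held reference keeps the structure
allocated but does not keep its list linkage valid) can be illustrated with a
minimal userspace sketch. Everything named demo_* is hypothetical and only
models the situation; clearing the pointers in demo_list_del() stands in for
the pointer poisoning done by the kernel's list_del().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal circular doubly linked list, loosely modeled on struct list_head. */
struct demo_list {
	struct demo_list *next, *prev;
};

/* Hypothetical stand-in for struct pci_bus: list linkage plus a refcount. */
struct demo_bus {
	struct demo_list node;
	int refcount;
	int number;
};

static void demo_list_init(struct demo_list *head)
{
	head->next = head->prev = head;
}

static void demo_list_add_tail(struct demo_list *n, struct demo_list *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink the node and invalidate its pointers. */
static void demo_list_del(struct demo_list *n)
{
	assert(n->next && n->prev);
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = NULL;
}

static struct demo_bus *demo_bus_new(int number)
{
	struct demo_bus *b = calloc(1, sizeof(*b));

	if (b) {
		b->refcount = 1;	/* reference owned by the list */
		b->number = number;
	}
	return b;
}

static struct demo_bus *demo_bus_get(struct demo_bus *b)
{
	b->refcount++;
	return b;
}

static void demo_bus_put(struct demo_bus *b)
{
	if (--b->refcount == 0)
		free(b);
}

/*
 * Returns 1 when the held bus is still allocated but can no longer be
 * used as a cursor for a "find next" style walk, -1 on allocation failure.
 */
static int demo_stale_cursor(void)
{
	struct demo_list head;
	struct demo_bus *a = demo_bus_new(0), *b = demo_bus_new(1);
	struct demo_bus *held;
	int alive, linked;

	if (!a || !b) {
		free(a);
		free(b);
		return -1;
	}
	demo_list_init(&head);
	demo_list_add_tail(&a->node, &head);
	demo_list_add_tail(&b->node, &head);

	held = demo_bus_get(a);		/* caller's reference */

	demo_list_del(&a->node);	/* concurrent hot-remove, modeled inline */
	demo_bus_put(a);		/* the list drops its reference */

	alive = (held->refcount == 1);	/* memory is still valid */
	linked = (held->node.next != NULL);	/* but the cursor is not */

	demo_bus_put(held);
	demo_list_del(&b->node);
	demo_bus_put(b);
	return alive && !linked;
}
```

Calling demo_stale_cursor() shows the object surviving the removal while its
node.next is no longer usable, which is why a get-style iterator has to do
more than take a reference.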

In the big picture, I'm not sure how much sense all the
pci_find_bus(), pci_find_next_bus(), pci_get_bus(),
pci_get_next_bus(), etc., interfaces really make.  There really aren't
very many callers, and most of them look a bit hacky to me.  Usually
they're quirks trying to locate a device or drivers for device A
trying to locate companion device B or something similar.  I wonder if
we could figure out some entirely new interface that wouldn't involve
traversing so much of the hierarchy and therefore could be safer.

>         Then I tried RCU, which also failed because callers of
> pci_get_next_bus() may sleep.
>         And finally the global host bridge hotplug lock solution. The rules
> for locking are:
>         1) No need for locking when accessing the pci_root_buses list at
> system initialization stages. (It's system initialization instead of driver
> initialization here because a driver's initialization code may be called
> at runtime when loading the driver.) It's single-threaded and there is no
> hotplug during system initialization stages.
>         2) The global lock should be acquired when accessing the
> pci_root_buses list at runtime.
>
>         I have done several rounds of scanning to identify accesses to
> the pci_root_buses list at runtime. But there may still be something missed :(

That's part of what makes me uneasy.  We have to look at a lot of code
outside drivers/pci to analyze correctness, which is difficult.  It
would be much better if we could do something in the core, where we
only have to analyze drivers/pci.  I know this is probably much harder
and probably involves replacing or removing some of these interfaces
that cause problems.

>         I think the best solution is to get rid of pci_find_next_bus(),
> but I'm not sure whether we could achieve that.

>> Actually, I looked at the callers of pci_find_next_bus(), and most of
>> them are unsafe in an even deeper way: they're doing device setup in
>> initcalls, so that setup won't be done for hot-added devices.  For
>> example, I can pick on sba_init() because I think I wrote it back in
>> the dark ages.  sba_init() is a subsys_initcall that calls
>> sba_connect_bus() for every bus we know about at boot-time, and it
>> sets the host bridge's iommu pointer.  If we were to hot-add a host
>> bridge, we would never set the iommu pointer.

> That's a more fundamental issue, another big topic for us:(

>> I'm not sure why you didn't add a pci_host_bridge_hotplug_lock() in
>> the sba_init() path, since it looks similar to the drm_open_helper()
>> path above.  But in any case, I think that would be the wrong thing to
>> do because it would fix the superficial problem while leaving the
>> deeper problem of host bridge hot-add not setting the iommu pointer.

> sba_init() is called during system initialization through subsys_initcall,
> so it needs no extra protection.

OK, I see your reasoning.  But I don't agree :)  All the users of an
interface should use the same locking scheme, even if they're at
boot-time where we "know" we don't need it.  It's too hard to analyze
differences, and code gets copied from one place to somewhere else
where it might not be appropriate.

Bjorn


* Re: [PATCH] eeepc-laptop: fix device reference count leakage in eeepc_rfkill_hotplug()
  2012-09-12 14:24     ` [PATCH] eeepc-laptop: fix device reference count leakage in eeepc_rfkill_hotplug() Jiang Liu
@ 2012-09-12 19:59       ` Bjorn Helgaas
  0 siblings, 0 replies; 51+ messages in thread
From: Bjorn Helgaas @ 2012-09-12 19:59 UTC (permalink / raw)
  To: Jiang Liu; +Cc: huxinwei, Jiang Liu, linux-pci, Corentin Chary, Matthew Garrett

On Wed, Sep 12, 2012 at 8:24 AM, Jiang Liu <liuj97@gmail.com> wrote:
> Fix a device reference count leakage issue in function
> eeepc_rfkill_hotplug().

This looks good to me, but it needs a Signed-off-by and probably
should go through something other than the PCI tree.

Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>

> ---
>  drivers/platform/x86/eeepc-laptop.c |   10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/platform/x86/eeepc-laptop.c b/drivers/platform/x86/eeepc-laptop.c
> index dab91b4..5ca2641 100644
> --- a/drivers/platform/x86/eeepc-laptop.c
> +++ b/drivers/platform/x86/eeepc-laptop.c
> @@ -610,12 +610,12 @@ static void eeepc_rfkill_hotplug(struct eeepc_laptop *eeepc, acpi_handle handle)
>
>                 if (!bus) {
>                         pr_warn("Unable to find PCI bus 1?\n");
> -                       goto out_unlock;
> +                       goto out_put_dev;
>                 }
>
>                 if (pci_bus_read_config_dword(bus, 0, PCI_VENDOR_ID, &l)) {
>                         pr_err("Unable to read PCI config space?\n");
> -                       goto out_unlock;
> +                       goto out_put_dev;
>                 }
>
>                 absent = (l == 0xffffffff);
> @@ -627,7 +627,7 @@ static void eeepc_rfkill_hotplug(struct eeepc_laptop *eeepc, acpi_handle handle)
>                                 absent ? "absent" : "present");
>                         pr_warn("skipped wireless hotplug as probably "
>                                 "inappropriate for this model\n");
> -                       goto out_unlock;
> +                       goto out_put_dev;
>                 }
>
>                 if (!blocked) {
> @@ -635,7 +635,7 @@ static void eeepc_rfkill_hotplug(struct eeepc_laptop *eeepc, acpi_handle handle)
>                         if (dev) {
>                                 /* Device already present */
>                                 pci_dev_put(dev);
> -                               goto out_unlock;
> +                               goto out_put_dev;
>                         }
>                         dev = pci_scan_single_device(bus, 0);
>                         if (dev) {
> @@ -650,6 +650,8 @@ static void eeepc_rfkill_hotplug(struct eeepc_laptop *eeepc, acpi_handle handle)
>                                 pci_dev_put(dev);
>                         }
>                 }
> +out_put_dev:
> +               pci_dev_put(port);
>         }
>
>  out_unlock:
> --
> 1.7.9.5
>


* [PATCH 1/2] PCI: introduce root bridge hotplug safe interfaces to walk root buses
  2012-09-12 16:51       ` Bjorn Helgaas
@ 2012-09-13 16:00         ` Jiang Liu
  2012-09-13 17:40           ` Bjorn Helgaas
  2012-09-13 16:00         ` [PATCH 2/2] PCI: remove host bridge hotplug unsafe interface pci_get_next_bus() Jiang Liu
                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 51+ messages in thread
From: Jiang Liu @ 2012-09-13 16:00 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Jiang Liu, linux-pci, Yinghai Lu

This patch introduces two root bridge hotplug safe interfaces to walk
all root buses. Function pci_get_root_buses() takes a snapshot of the
pci_root_buses list and holds a reference count on each root bus.
pci_{get|put}_root_buses() are used to replace the hotplug unsafe interface
pci_find_next_bus().

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
---
Hi Bjorn,
	How about this solution? We could avoid the global lock by
taking a snapshot of the pci_root_buses list.
	These two patches just pass basic compilation tests on x86:)
	Regards!
	Gerry
---
 arch/ia64/hp/common/sba_iommu.c   |   10 +++++++---
 arch/ia64/sn/kernel/io_common.c   |   12 ++++++-----
 arch/sparc/kernel/pci.c           |   15 ++++++++------
 arch/x86/pci/common.c             |   10 ++++++++--
 drivers/edac/i7core_edac.c        |    9 ++++++---
 drivers/gpu/drm/drm_fops.c        |   10 +++++++---
 drivers/pci/hotplug/sgi_hotplug.c |    7 ++++++-
 drivers/pci/pci-sysfs.c           |    9 ++++++---
 drivers/pci/search.c              |   40 +++++++++++++++++++++++++++++++++++++
 include/linux/pci.h               |   11 ++++++++++
 10 files changed, 107 insertions(+), 26 deletions(-)

diff --git a/arch/ia64/hp/common/sba_iommu.c b/arch/ia64/hp/common/sba_iommu.c
index bcda5b2..2903c58 100644
--- a/arch/ia64/hp/common/sba_iommu.c
+++ b/arch/ia64/hp/common/sba_iommu.c
@@ -2155,9 +2155,13 @@ sba_init(void)
 
 #ifdef CONFIG_PCI
 	{
-		struct pci_bus *b = NULL;
-		while ((b = pci_find_next_bus(b)) != NULL)
-			sba_connect_bus(b);
+		int i, count;
+		struct pci_bus **buses = NULL;
+
+		buses = pci_get_root_buses(&count);
+		for (i = 0; i < count; i++)
+			sba_connect_bus(buses[i]);
+		pci_put_root_buses(buses, count);
 	}
 #endif
 
diff --git a/arch/ia64/sn/kernel/io_common.c b/arch/ia64/sn/kernel/io_common.c
index 8630875..f667971 100644
--- a/arch/ia64/sn/kernel/io_common.c
+++ b/arch/ia64/sn/kernel/io_common.c
@@ -516,7 +516,8 @@ arch_initcall(sn_io_early_init);
 int __init
 sn_io_late_init(void)
 {
-	struct pci_bus *bus;
+	int i, count;
+	struct pci_bus **buses;
 	struct pcibus_bussoft *bussoft;
 	cnodeid_t cnode;
 	nasid_t nasid;
@@ -530,9 +531,9 @@ sn_io_late_init(void)
 	 * PIC, TIOCP, TIOCE (TIOCA does it during bus fixup using
 	 * info from the PROM).
 	 */
-	bus = NULL;
-	while ((bus = pci_find_next_bus(bus)) != NULL) {
-		bussoft = SN_PCIBUS_BUSSOFT(bus);
+	buses = pci_get_root_buses(&count);
+	for (i = 0; i < count; i++) {
+		bussoft = SN_PCIBUS_BUSSOFT(buses[i]);
 		nasid = NASID_GET(bussoft->bs_base);
 		cnode = nasid_to_cnodeid(nasid);
 		if ((bussoft->bs_asic_type == PCIIO_ASIC_TYPE_TIOCP) ||
@@ -547,9 +548,10 @@ sn_io_late_init(void)
 				       "to find near node with CPUs for "
 				       "node %d, err=%d\n", cnode, e);
 			}
-			PCI_CONTROLLER(bus)->node = near_cnode;
+			PCI_CONTROLLER(buses[i])->node = near_cnode;
 		}
 	}
+	pci_put_root_buses(buses, count);
 
 	sn_ioif_inited = 1;	/* SN I/O infrastructure now initialized */
 
diff --git a/arch/sparc/kernel/pci.c b/arch/sparc/kernel/pci.c
index 0e8f43a..00d2a83 100644
--- a/arch/sparc/kernel/pci.c
+++ b/arch/sparc/kernel/pci.c
@@ -1005,23 +1005,26 @@ static void __devinit pci_bus_slot_names(struct device_node *node,
 
 static int __init of_pci_slot_init(void)
 {
-	struct pci_bus *pbus = NULL;
+	int i, count;
+	struct pci_bus **buses;
 
-	while ((pbus = pci_find_next_bus(pbus)) != NULL) {
+	buses = pci_get_root_buses(&count);
+	for (i = 0; i < count; i++) {
 		struct device_node *node;
 
-		if (pbus->self) {
+		if (buses[i]->self) {
 			/* PCI->PCI bridge */
-			node = pbus->self->dev.of_node;
+			node = buses[i]->self->dev.of_node;
 		} else {
-			struct pci_pbm_info *pbm = pbus->sysdata;
+			struct pci_pbm_info *pbm = buses[i]->sysdata;
 
 			/* Host PCI controller */
 			node = pbm->op->dev.of_node;
 		}
 
-		pci_bus_slot_names(node, pbus);
+		pci_bus_slot_names(node, buses[i]);
 	}
+	pci_put_root_buses(buses, count);
 
 	return 0;
 }
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 720e973f..986be6e 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -446,14 +446,20 @@ void __init dmi_check_pciprobe(void)
 
 struct pci_bus * __devinit pcibios_scan_root(int busnum)
 {
-	struct pci_bus *bus = NULL;
+	int i, count;
+	struct pci_bus *bus;
+	struct pci_bus **buses;
 
-	while ((bus = pci_find_next_bus(bus)) != NULL) {
+	buses = pci_get_root_buses(&count);
+	for (i = 0; i < count; i++) {
+		bus = buses[i];
 		if (bus->number == busnum) {
+			pci_put_root_buses(buses, count);
 			/* Already scanned */
 			return bus;
 		}
 	}
+	pci_put_root_buses(buses, count);
 
 	return pci_scan_bus_on_node(busnum, &pci_root_ops,
 					get_mp_bus_to_node(busnum));
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index 3672101..4f4b221 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -1294,14 +1294,17 @@ static void __init i7core_xeon_pci_fixup(const struct pci_id_table *table)
 static unsigned i7core_pci_lastbus(void)
 {
 	int last_bus = 0, bus;
-	struct pci_bus *b = NULL;
+	int i, count;
+	struct pci_bus **buses;
 
-	while ((b = pci_find_next_bus(b)) != NULL) {
-		bus = b->number;
+	buses = pci_get_root_buses(&count);
+	for (i = 0; i < count; i++) {
+		bus = buses[i]->number;
 		edac_dbg(0, "Found bus %d\n", bus);
 		if (bus > last_bus)
 			last_bus = bus;
 	}
+	pci_put_root_buses(buses, count);
 
 	edac_dbg(0, "Last bus %d\n", last_bus);
 
diff --git a/drivers/gpu/drm/drm_fops.c b/drivers/gpu/drm/drm_fops.c
index 5062eec..1a9955f 100644
--- a/drivers/gpu/drm/drm_fops.c
+++ b/drivers/gpu/drm/drm_fops.c
@@ -340,9 +340,13 @@ static int drm_open_helper(struct inode *inode, struct file *filp,
 			pci_dev_put(pci_dev);
 		}
 		if (!dev->hose) {
-			struct pci_bus *b = pci_bus_b(pci_root_buses.next);
-			if (b)
-				dev->hose = b->sysdata;
+			int count;
+			struct pci_bus **buses;
+
+			buses = pci_get_root_buses(&count);
+			if (count > 0)
+				dev->hose = buses[0]->sysdata;
+			pci_put_root_buses(buses, count);
 		}
 	}
 #endif
diff --git a/drivers/pci/hotplug/sgi_hotplug.c b/drivers/pci/hotplug/sgi_hotplug.c
index f64ca92..6ec3ecb 100644
--- a/drivers/pci/hotplug/sgi_hotplug.c
+++ b/drivers/pci/hotplug/sgi_hotplug.c
@@ -679,8 +679,10 @@ alloc_err:
 static int __init sn_pci_hotplug_init(void)
 {
 	struct pci_bus *pci_bus = NULL;
+	struct pci_bus **buses;
 	int rc;
 	int registered = 0;
+	int i, count;
 
 	if (!sn_prom_feature_available(PRF_HOTPLUG_SUPPORT)) {
 		printk(KERN_ERR "%s: PROM version does not support hotplug.\n",
@@ -690,7 +692,9 @@ static int __init sn_pci_hotplug_init(void)
 
 	INIT_LIST_HEAD(&sn_hp_list);
 
-	while ((pci_bus = pci_find_next_bus(pci_bus))) {
+	buses = pci_get_root_buses(&count);
+	for (i = 0; i < count; i++) {
+		pci_bus = buses[i];
 		if (!pci_bus->sysdata)
 			continue;
 
@@ -709,6 +713,7 @@ static int __init sn_pci_hotplug_init(void)
 			break;
 		}
 	}
+	pci_put_root_buses(buses, count);
 
 	return registered == 1 ? 0 : -ENODEV;
 }
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 6869009..f8e9309 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -290,15 +290,18 @@ static ssize_t bus_rescan_store(struct bus_type *bus, const char *buf,
 				size_t count)
 {
 	unsigned long val;
-	struct pci_bus *b = NULL;
+	int i, num;
+	struct pci_bus **buses;
 
 	if (strict_strtoul(buf, 0, &val) < 0)
 		return -EINVAL;
 
 	if (val) {
 		mutex_lock(&pci_remove_rescan_mutex);
-		while ((b = pci_find_next_bus(b)) != NULL)
-			pci_rescan_bus(b);
+		buses = pci_get_root_buses(&num);
+		for (i = 0; i < num; i++)
+			pci_rescan_bus(buses[i]);
+		pci_put_root_buses(buses, num);
 		mutex_unlock(&pci_remove_rescan_mutex);
 	}
 	return count;
diff --git a/drivers/pci/search.c b/drivers/pci/search.c
index 8f68dbe..8b20a33 100644
--- a/drivers/pci/search.c
+++ b/drivers/pci/search.c
@@ -140,6 +140,46 @@ pci_find_next_bus(const struct pci_bus *from)
 	return b;
 }
 
+struct pci_bus **
+pci_get_root_buses(int *bus_num)
+{
+	int count;
+	struct pci_bus *bus;
+	struct pci_bus **buses = NULL;
+
+	down_read(&pci_bus_sem);
+
+	count = 0;
+	list_for_each_entry(bus, &pci_root_buses, node)
+		count++;
+
+	if (count)
+		buses = kmalloc(sizeof(*buses) * count, GFP_KERNEL);
+
+	if (buses) {
+		count = 0;
+		list_for_each_entry(bus, &pci_root_buses, node)
+			buses[count++] = pci_bus_get(bus);
+		*bus_num = count;
+	} else
+		*bus_num = 0;
+
+	up_read(&pci_bus_sem);
+
+	return buses;
+}
+EXPORT_SYMBOL(pci_get_root_buses);
+
+void pci_put_root_buses(struct pci_bus **buses, int count)
+{
+	int i;
+
+	for (i = 0; i < count; i++)
+		pci_bus_put(buses[i]);
+	kfree(buses);
+}
+EXPORT_SYMBOL(pci_put_root_buses);
+
 /**
  * pci_get_slot - locate PCI device for a given PCI slot
  * @bus: PCI bus on which desired PCI device resides
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 98de988..bc1ab5f 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -757,6 +757,8 @@ int pci_find_next_ext_capability(struct pci_dev *dev, int pos, int cap);
 int pci_find_ht_capability(struct pci_dev *dev, int ht_cap);
 int pci_find_next_ht_capability(struct pci_dev *dev, int pos, int ht_cap);
 struct pci_bus *pci_find_next_bus(const struct pci_bus *from);
+struct pci_bus ** pci_get_root_buses(int *bus_num);
+void pci_put_root_buses(struct pci_bus **buses, int count);
 
 struct pci_dev *pci_get_device(unsigned int vendor, unsigned int device,
 				struct pci_dev *from);
@@ -1402,6 +1404,15 @@ static inline void pci_unblock_cfg_access(struct pci_dev *dev)
 static inline struct pci_bus *pci_find_next_bus(const struct pci_bus *from)
 { return NULL; }
 
+static inline struct pci_bus ** pci_get_root_buses(int *bus_num)
+{
+	*bus_num = 0;
+	return NULL;
+}
+
+static inline void pci_put_root_buses(struct pci_bus **buses, int count)
+{ }
+
 static inline struct pci_dev *pci_get_slot(struct pci_bus *bus,
 						unsigned int devfn)
 { return NULL; }
-- 
1.7.9.5
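
For readers following the thread, the snapshot idea in the patch above can be
modeled in userspace as below. This is only a sketch under stated
assumptions: the demo_* names are hypothetical, a singly linked list replaces
pci_root_buses, and the locking that the patch does with pci_bus_sem is left
out and only noted in a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct pci_bus. */
struct demo_bus {
	struct demo_bus *next;
	int refcount;
	int number;
};

static struct demo_bus *demo_root_buses;

/*
 * Copy the current root bus pointers into an array and take a reference
 * on each. A real implementation would hold a lock across both passes.
 */
static struct demo_bus **demo_get_root_buses(int *count)
{
	struct demo_bus *b;
	struct demo_bus **buses = NULL;
	int n = 0;

	for (b = demo_root_buses; b; b = b->next)
		n++;
	if (n)
		buses = malloc(n * sizeof(*buses));

	n = 0;
	if (buses) {
		for (b = demo_root_buses; b; b = b->next) {
			b->refcount++;
			buses[n++] = b;
		}
	}
	*count = n;
	return buses;
}

/* Drop the snapshot's references and free the array. */
static void demo_put_root_buses(struct demo_bus **buses, int count)
{
	int i;

	for (i = 0; i < count; i++) {
		assert(buses[i]->refcount > 0);
		if (--buses[i]->refcount == 0)
			free(buses[i]);
	}
	free(buses);
}

static struct demo_bus *demo_bus_new(int number, struct demo_bus *next)
{
	struct demo_bus *b = malloc(sizeof(*b));

	if (b) {
		b->next = next;
		b->refcount = 1;	/* reference owned by the list */
		b->number = number;
	}
	return b;
}

/*
 * Take a snapshot, remove the first bus from the list, then walk the
 * snapshot anyway. Returns the sum of the bus numbers seen, or -1 on
 * allocation failure.
 */
static int demo_snapshot_survives_removal(void)
{
	struct demo_bus *b1 = demo_bus_new(4, NULL);
	struct demo_bus *b0 = demo_bus_new(3, b1);
	struct demo_bus **buses;
	int i, count, sum = 0;

	if (!b0 || !b1) {
		free(b0);
		free(b1);
		return -1;
	}
	demo_root_buses = b0;

	buses = demo_get_root_buses(&count);
	if (count != 2) {
		free(b0);
		free(b1);
		demo_root_buses = NULL;
		return -1;
	}

	/* Hot-remove b0: unlink it and drop the list's reference. */
	demo_root_buses = b0->next;
	b0->next = NULL;
	b0->refcount--;

	/* The walk uses array indices, not list pointers. */
	for (i = 0; i < count; i++)
		sum += buses[i]->number;

	demo_put_root_buses(buses, count);	/* frees b0 */
	demo_root_buses = NULL;
	free(b1);
	return sum;
}
```

The walk never touches the list linkage after the snapshot is taken, so a
removal in between only affects whether the caller sees a bus that has
already gone away, not whether the traversal itself is safe.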



* [PATCH 2/2] PCI: remove host bridge hotplug unsafe interface pci_get_next_bus()
  2012-09-12 16:51       ` Bjorn Helgaas
  2012-09-13 16:00         ` [PATCH 1/2] PCI: introduce root bridge hotplug safe interfaces to walk root buses Jiang Liu
@ 2012-09-13 16:00         ` Jiang Liu
  2012-09-17 15:51         ` [RFC PATCH v1 06/22] PCI: use a global lock to serialize PCI root bridge hotplug operations Jiang Liu
  2012-09-20 18:49         ` Paul E. McKenney
  3 siblings, 0 replies; 51+ messages in thread
From: Jiang Liu @ 2012-09-13 16:00 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Jiang Liu, linux-pci, Yinghai Lu

Remove the host bridge hotplug unsafe interface pci_find_next_bus(); it has
been replaced by pci_{get|put}_root_buses().

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
---
 drivers/pci/search.c |   25 -------------------------
 include/linux/pci.h  |    4 ----
 2 files changed, 29 deletions(-)

diff --git a/drivers/pci/search.c b/drivers/pci/search.c
index 8b20a33..7aadd45 100644
--- a/drivers/pci/search.c
+++ b/drivers/pci/search.c
@@ -116,30 +116,6 @@ struct pci_bus * pci_find_bus(int domain, int busnr)
 	return bus;
 }
 
-/**
- * pci_find_next_bus - begin or continue searching for a PCI bus
- * @from: Previous PCI bus found, or %NULL for new search.
- *
- * Iterates through the list of known PCI busses.  A new search is
- * initiated by passing %NULL as the @from argument.  Otherwise if
- * @from is not %NULL, searches continue from next device on the
- * global list.
- */
-struct pci_bus * 
-pci_find_next_bus(const struct pci_bus *from)
-{
-	struct list_head *n;
-	struct pci_bus *b = NULL;
-
-	WARN_ON(in_interrupt());
-	down_read(&pci_bus_sem);
-	n = from ? from->node.next : pci_root_buses.next;
-	if (n != &pci_root_buses)
-		b = pci_bus_b(n);
-	up_read(&pci_bus_sem);
-	return b;
-}
-
 struct pci_bus **
 pci_get_root_buses(int *bus_num)
 {
@@ -396,7 +372,6 @@ EXPORT_SYMBOL(pci_dev_present);
 
 /* For boot time work */
 EXPORT_SYMBOL(pci_find_bus);
-EXPORT_SYMBOL(pci_find_next_bus);
 /* For everyone */
 EXPORT_SYMBOL(pci_get_device);
 EXPORT_SYMBOL(pci_get_subsys);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index bc1ab5f..ea78235 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -756,7 +756,6 @@ int pci_find_ext_capability(struct pci_dev *dev, int cap);
 int pci_find_next_ext_capability(struct pci_dev *dev, int pos, int cap);
 int pci_find_ht_capability(struct pci_dev *dev, int ht_cap);
 int pci_find_next_ht_capability(struct pci_dev *dev, int pos, int ht_cap);
-struct pci_bus *pci_find_next_bus(const struct pci_bus *from);
 struct pci_bus ** pci_get_root_buses(int *bus_num);
 void pci_put_root_buses(struct pci_bus **buses, int count);
 
@@ -1401,9 +1400,6 @@ static inline int pci_block_cfg_access_in_atomic(struct pci_dev *dev)
 static inline void pci_unblock_cfg_access(struct pci_dev *dev)
 { }
 
-static inline struct pci_bus *pci_find_next_bus(const struct pci_bus *from)
-{ return NULL; }
-
 static inline struct pci_bus ** pci_get_root_buses(int *bus_num)
 {
 	*bus_num = 0;
-- 
1.7.9.5



* Re: [PATCH 1/2] PCI: introduce root bridge hotplug safe interfaces to walk root buses
  2012-09-13 16:00         ` [PATCH 1/2] PCI: introduce root bridge hotplug safe interfaces to walk root buses Jiang Liu
@ 2012-09-13 17:40           ` Bjorn Helgaas
  2012-09-17 15:55             ` Jiang Liu
  0 siblings, 1 reply; 51+ messages in thread
From: Bjorn Helgaas @ 2012-09-13 17:40 UTC (permalink / raw)
  To: Jiang Liu; +Cc: Jiang Liu, linux-pci, Yinghai Lu

On Thu, Sep 13, 2012 at 10:00 AM, Jiang Liu <liuj97@gmail.com> wrote:
> This patch introduces two root bridge hotplug safe interfaces to walk
> all root buses. Function pci_get_root_buses() takes a snapshot of the
> pci_root_buses list and holds a reference count on each root bus.
> pci_{get|put}_root_buses() are used to replace the hotplug unsafe interface
> pci_find_next_bus().

Honestly, I think the whole idea of walking these lists is wrong, and
adding safer interfaces just perpetuates the idea that it's OK to walk
them.

We should be doing the setup in the device add path instead.  I know
we have other issues with that in some cases, but I'd like to at least
move in that direction.

For example, sba_init() is a problem because it's an ACPI driver, and
we currently enumerate PCI devices before binding most ACPI drivers.
That's broken -- in that particular case, there's an HWP0001 IOMMU
device that encloses the PNP0A03 PCI host bridge.  Currently we bind
the PNP0A03 driver first, enumerate the PCI devices below it, then
bind the HWP0001 driver (sba_init).  Obviously that's backwards and
the HWP0001 driver should have been bound first, then the PNP0A03
driver.  But I don't think we're ready to make that shift yet (though
it'd be nice if somebody were working on it).

I wonder if we could add some kind of iterator that does the list
traversals in the PCI core and calls a callback for every device?  I
think that would work for sba_init(), but I don't know about the
others.  This would still be ugly in that the iterator would have to
hold some sort of hotplug lock while doing for_each_pci_dev() and the
callers, e.g., sba_init(), are not solving the problem for hot-added
devices, but at least the locking would be in the core and the drivers
would stop depending on the lists themselves.
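
A rough userspace sketch of what such a core-owned iterator might look like
follows. It walks root buses rather than devices, and every demo_* name is
hypothetical; the "lock" is a plain flag standing in for whatever hotplug
lock the core would really take. The shape is similar to the existing
pci_walk_bus(), which already takes a callback and a void * cookie.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_bus. */
struct demo_bus {
	struct demo_bus *next;
	int number;
	void *iommu;
};

/* Both of these would be private to the core, not visible to drivers. */
static struct demo_bus *demo_root_buses;
static int demo_hotplug_lock_held;	/* models a real sleeping lock */

static void demo_hotplug_lock(void)
{
	demo_hotplug_lock_held = 1;
}

static void demo_hotplug_unlock(void)
{
	demo_hotplug_lock_held = 0;
}

/* Append a bus to the tail of the core's list. */
static void demo_add_root_bus(struct demo_bus *bus)
{
	struct demo_bus **p = &demo_root_buses;

	while (*p)
		p = &(*p)->next;
	bus->next = NULL;
	*p = bus;
}

/*
 * Call fn for every root bus with the hotplug lock held. A nonzero
 * return from fn stops the walk and is passed back to the caller.
 */
static int demo_for_each_root_bus(int (*fn)(struct demo_bus *, void *),
				  void *data)
{
	struct demo_bus *b;
	int ret = 0;

	demo_hotplug_lock();
	for (b = demo_root_buses; b; b = b->next) {
		ret = fn(b, data);
		if (ret)
			break;
	}
	demo_hotplug_unlock();
	return ret;
}

/* Callback in the spirit of sba_connect_bus(): record an IOMMU cookie. */
static int demo_connect_bus(struct demo_bus *bus, void *data)
{
	assert(demo_hotplug_lock_held);	/* the core took the lock for us */
	bus->iommu = data;
	return 0;
}

/* Register two buses, connect them all, and count how many were set up. */
static int demo_connect_all(void)
{
	static struct demo_bus b0, b1;
	static int iommu_cookie;
	struct demo_bus *b;
	int connected = 0;

	demo_root_buses = NULL;
	b0.number = 0;
	b0.iommu = NULL;
	b1.number = 1;
	b1.iommu = NULL;
	demo_add_root_bus(&b0);
	demo_add_root_bus(&b1);

	if (demo_for_each_root_bus(demo_connect_bus, &iommu_cookie))
		return -1;

	for (b = demo_root_buses; b; b = b->next)
		if (b->iommu == &iommu_cookie)
			connected++;
	return connected;
}
```

The caller never sees the list or the lock, so the locking rules only have to
be audited in one place. As noted above, this still does nothing for buses
that are hot-added after the walk.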

> Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
> ---
> Hi Bjorn,
>         How about this solution? We could avoid the global lock by
> taking a snapshot of the pci_root_buses list.
>         These two patches just pass basic compilation tests on x86:)
>         Regards!
>         Gerry
> ---
>  arch/ia64/hp/common/sba_iommu.c   |   10 +++++++---
>  arch/ia64/sn/kernel/io_common.c   |   12 ++++++-----
>  arch/sparc/kernel/pci.c           |   15 ++++++++------
>  arch/x86/pci/common.c             |   10 ++++++++--
>  drivers/edac/i7core_edac.c        |    9 ++++++---
>  drivers/gpu/drm/drm_fops.c        |   10 +++++++---
>  drivers/pci/hotplug/sgi_hotplug.c |    7 ++++++-
>  drivers/pci/pci-sysfs.c           |    9 ++++++---
>  drivers/pci/search.c              |   40 +++++++++++++++++++++++++++++++++++++
>  include/linux/pci.h               |   11 ++++++++++
>  10 files changed, 107 insertions(+), 26 deletions(-)
>
> diff --git a/arch/ia64/hp/common/sba_iommu.c b/arch/ia64/hp/common/sba_iommu.c
> index bcda5b2..2903c58 100644
> --- a/arch/ia64/hp/common/sba_iommu.c
> +++ b/arch/ia64/hp/common/sba_iommu.c
> @@ -2155,9 +2155,13 @@ sba_init(void)
>
>  #ifdef CONFIG_PCI
>         {
> -               struct pci_bus *b = NULL;
> -               while ((b = pci_find_next_bus(b)) != NULL)
> -                       sba_connect_bus(b);
> +               int i, count;
> +               struct pci_bus **buses = NULL;
> +
> +               buses = pci_get_root_buses(&count);
> +               for (i = 0; i < count; i++)
> +                       sba_connect_bus(buses[i]);
> +               pci_put_root_buses(buses, count);
>         }
>  #endif
>
> diff --git a/arch/ia64/sn/kernel/io_common.c b/arch/ia64/sn/kernel/io_common.c
> index 8630875..f667971 100644
> --- a/arch/ia64/sn/kernel/io_common.c
> +++ b/arch/ia64/sn/kernel/io_common.c
> @@ -516,7 +516,8 @@ arch_initcall(sn_io_early_init);
>  int __init
>  sn_io_late_init(void)
>  {
> -       struct pci_bus *bus;
> +       int i, count;
> +       struct pci_bus **buses;
>         struct pcibus_bussoft *bussoft;
>         cnodeid_t cnode;
>         nasid_t nasid;
> @@ -530,9 +531,9 @@ sn_io_late_init(void)
>          * PIC, TIOCP, TIOCE (TIOCA does it during bus fixup using
>          * info from the PROM).
>          */
> -       bus = NULL;
> -       while ((bus = pci_find_next_bus(bus)) != NULL) {
> -               bussoft = SN_PCIBUS_BUSSOFT(bus);
> +       buses = pci_get_root_buses(&count);
> +       for (i = 0; i < count; i++) {
> +               bussoft = SN_PCIBUS_BUSSOFT(buses[i]);
>                 nasid = NASID_GET(bussoft->bs_base);
>                 cnode = nasid_to_cnodeid(nasid);
>                 if ((bussoft->bs_asic_type == PCIIO_ASIC_TYPE_TIOCP) ||
> @@ -547,9 +548,10 @@ sn_io_late_init(void)
>                                        "to find near node with CPUs for "
>                                        "node %d, err=%d\n", cnode, e);
>                         }
> -                       PCI_CONTROLLER(bus)->node = near_cnode;
> +                       PCI_CONTROLLER(buses[i])->node = near_cnode;
>                 }
>         }
> +       pci_put_root_buses(buses, count);
>
>         sn_ioif_inited = 1;     /* SN I/O infrastructure now initialized */
>
> diff --git a/arch/sparc/kernel/pci.c b/arch/sparc/kernel/pci.c
> index 0e8f43a..00d2a83 100644
> --- a/arch/sparc/kernel/pci.c
> +++ b/arch/sparc/kernel/pci.c
> @@ -1005,23 +1005,26 @@ static void __devinit pci_bus_slot_names(struct device_node *node,
>
>  static int __init of_pci_slot_init(void)
>  {
> -       struct pci_bus *pbus = NULL;
> +       int i, count;
> +       struct pci_bus **buses;
>
> -       while ((pbus = pci_find_next_bus(pbus)) != NULL) {
> +       buses = pci_get_root_buses(&count);
> +       for (i = 0; i < count; i++) {
>                 struct device_node *node;
>
> -               if (pbus->self) {
> +               if (buses[i]->self) {
>                         /* PCI->PCI bridge */
> -                       node = pbus->self->dev.of_node;
> +                       node = buses[i]->self->dev.of_node;
>                 } else {
> -                       struct pci_pbm_info *pbm = pbus->sysdata;
> +                       struct pci_pbm_info *pbm = buses[i]->sysdata;
>
>                         /* Host PCI controller */
>                         node = pbm->op->dev.of_node;
>                 }
>
> -               pci_bus_slot_names(node, pbus);
> +               pci_bus_slot_names(node, buses[i]);
>         }
> +       pci_put_root_buses(buses, count);
>
>         return 0;
>  }
> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
> index 720e973f..986be6e 100644
> --- a/arch/x86/pci/common.c
> +++ b/arch/x86/pci/common.c
> @@ -446,14 +446,20 @@ void __init dmi_check_pciprobe(void)
>
>  struct pci_bus * __devinit pcibios_scan_root(int busnum)
>  {
> -       struct pci_bus *bus = NULL;
> +       int i, count;
> +       struct pci_bus *bus;
> +       struct pci_bus **buses;
>
> -       while ((bus = pci_find_next_bus(bus)) != NULL) {
> +       buses = pci_get_root_buses(&count);
> +       for (i = 0; i < count; i++) {
> +               bus = buses[i];
>                 if (bus->number == busnum) {
> +                       pci_put_root_buses(buses, count);
>                         /* Already scanned */
>                         return bus;
>                 }
>         }
> +       pci_put_root_buses(buses, count);
>
>         return pci_scan_bus_on_node(busnum, &pci_root_ops,
>                                         get_mp_bus_to_node(busnum));
> diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
> index 3672101..4f4b221 100644
> --- a/drivers/edac/i7core_edac.c
> +++ b/drivers/edac/i7core_edac.c
> @@ -1294,14 +1294,17 @@ static void __init i7core_xeon_pci_fixup(const struct pci_id_table *table)
>  static unsigned i7core_pci_lastbus(void)
>  {
>         int last_bus = 0, bus;
> -       struct pci_bus *b = NULL;
> +       int i, count;
> +       struct pci_bus **buses;
>
> -       while ((b = pci_find_next_bus(b)) != NULL) {
> -               bus = b->number;
> +       buses = pci_get_root_buses(&count);
> +       for (i = 0; i < count; i++) {
> +               bus = buses[i]->number;
>                 edac_dbg(0, "Found bus %d\n", bus);
>                 if (bus > last_bus)
>                         last_bus = bus;
>         }
> +       pci_put_root_buses(buses, count);
>
>         edac_dbg(0, "Last bus %d\n", last_bus);
>
> diff --git a/drivers/gpu/drm/drm_fops.c b/drivers/gpu/drm/drm_fops.c
> index 5062eec..1a9955f 100644
> --- a/drivers/gpu/drm/drm_fops.c
> +++ b/drivers/gpu/drm/drm_fops.c
> @@ -340,9 +340,13 @@ static int drm_open_helper(struct inode *inode, struct file *filp,
>                         pci_dev_put(pci_dev);
>                 }
>                 if (!dev->hose) {
> -                       struct pci_bus *b = pci_bus_b(pci_root_buses.next);
> -                       if (b)
> -                               dev->hose = b->sysdata;
> +                       int count;
> +                       struct pci_bus **buses;
> +
> +                       buses = pci_get_root_buses(&count);
> +                       if (count > 0)
> +                               dev->hose = buses[0]->sysdata;
> +                       pci_put_root_buses(buses, count);
>                 }
>         }
>  #endif
> diff --git a/drivers/pci/hotplug/sgi_hotplug.c b/drivers/pci/hotplug/sgi_hotplug.c
> index f64ca92..6ec3ecb 100644
> --- a/drivers/pci/hotplug/sgi_hotplug.c
> +++ b/drivers/pci/hotplug/sgi_hotplug.c
> @@ -679,8 +679,10 @@ alloc_err:
>  static int __init sn_pci_hotplug_init(void)
>  {
>         struct pci_bus *pci_bus = NULL;
> +       struct pci_bus **buses;
>         int rc;
>         int registered = 0;
> +       int i, count;
>
>         if (!sn_prom_feature_available(PRF_HOTPLUG_SUPPORT)) {
>                 printk(KERN_ERR "%s: PROM version does not support hotplug.\n",
> @@ -690,7 +692,9 @@ static int __init sn_pci_hotplug_init(void)
>
>         INIT_LIST_HEAD(&sn_hp_list);
>
> -       while ((pci_bus = pci_find_next_bus(pci_bus))) {
> +       buses = pci_get_root_buses(&count);
> +       for (i = 0; i < count; i++) {
> +               pci_bus = buses[i];
>                 if (!pci_bus->sysdata)
>                         continue;
>
> @@ -709,6 +713,7 @@ static int __init sn_pci_hotplug_init(void)
>                         break;
>                 }
>         }
> +       pci_put_root_buses(buses, count);
>
>         return registered == 1 ? 0 : -ENODEV;
>  }
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 6869009..f8e9309 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -290,15 +290,18 @@ static ssize_t bus_rescan_store(struct bus_type *bus, const char *buf,
>                                 size_t count)
>  {
>         unsigned long val;
> -       struct pci_bus *b = NULL;
> +       int i, num;
> +       struct pci_bus **buses;
>
>         if (strict_strtoul(buf, 0, &val) < 0)
>                 return -EINVAL;
>
>         if (val) {
>                 mutex_lock(&pci_remove_rescan_mutex);
> -               while ((b = pci_find_next_bus(b)) != NULL)
> -                       pci_rescan_bus(b);
> +               buses = pci_get_root_buses(&num);
> +               for (i = 0; i < num; i++)
> +                       pci_rescan_bus(buses[i]);
> +               pci_put_root_buses(buses, num);
>                 mutex_unlock(&pci_remove_rescan_mutex);
>         }
>         return count;
> diff --git a/drivers/pci/search.c b/drivers/pci/search.c
> index 8f68dbe..8b20a33 100644
> --- a/drivers/pci/search.c
> +++ b/drivers/pci/search.c
> @@ -140,6 +140,46 @@ pci_find_next_bus(const struct pci_bus *from)
>         return b;
>  }
>
> +struct pci_bus **
> +pci_get_root_buses(int *bus_num)
> +{
> +       int count;
> +       struct pci_bus *bus;
> +       struct pci_bus **buses = NULL;
> +
> +       down_read(&pci_bus_sem);
> +
> +       count = 0;
> +       list_for_each_entry(bus, &pci_root_buses, node)
> +               count++;
> +
> +       if (count)
> +               buses = kmalloc(sizeof(*buses) * count, GFP_KERNEL);
> +
> +       if (buses) {
> +               count = 0;
> +               list_for_each_entry(bus, &pci_root_buses, node)
> +                       buses[count++] = pci_bus_get(bus);
> +               *bus_num = count;
> +       } else
> +               *bus_num = 0;
> +
> +       up_read(&pci_bus_sem);
> +
> +       return buses;
> +}
> +EXPORT_SYMBOL(pci_get_root_buses);
> +
> +void pci_put_root_buses(struct pci_bus **buses, int count)
> +{
> +       int i;
> +
> +       for (i = 0; i < count; i++)
> +               pci_bus_put(buses[i]);
> +       kfree(buses);
> +}
> +EXPORT_SYMBOL(pci_put_root_buses);
> +
>  /**
>   * pci_get_slot - locate PCI device for a given PCI slot
>   * @bus: PCI bus on which desired PCI device resides
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 98de988..bc1ab5f 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -757,6 +757,8 @@ int pci_find_next_ext_capability(struct pci_dev *dev, int pos, int cap);
>  int pci_find_ht_capability(struct pci_dev *dev, int ht_cap);
>  int pci_find_next_ht_capability(struct pci_dev *dev, int pos, int ht_cap);
>  struct pci_bus *pci_find_next_bus(const struct pci_bus *from);
> +struct pci_bus ** pci_get_root_buses(int *bus_num);
> +void pci_put_root_buses(struct pci_bus **buses, int count);
>
>  struct pci_dev *pci_get_device(unsigned int vendor, unsigned int device,
>                                 struct pci_dev *from);
> @@ -1402,6 +1404,15 @@ static inline void pci_unblock_cfg_access(struct pci_dev *dev)
>  static inline struct pci_bus *pci_find_next_bus(const struct pci_bus *from)
>  { return NULL; }
>
> +static inline struct pci_bus ** pci_get_root_buses(int *bus_num)
> +{
> +       *bus_num = 0;
> +       return NULL;
> +}
> +
> +static inline void pci_put_root_buses(struct pci_bus **buses, int count)
> +{ }
> +
>  static inline struct pci_dev *pci_get_slot(struct pci_bus *bus,
>                                                 unsigned int devfn)
>  { return NULL; }
> --
> 1.7.9.5
>
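
The snapshot approach in the patch quoted above can be modelled outside the kernel. The following is a minimal user-space sketch, not the kernel code: `struct bus`, `bus_get()`, `bus_put()`, `get_root_buses()`, `put_root_buses()` and `last_bus()` are hypothetical stand-ins for `struct pci_bus`, `pci_bus_get()`, `pci_bus_put()`, `pci_get_root_buses()`, `pci_put_root_buses()` and `i7core_pci_lastbus()`, and the `pci_bus_sem` locking is elided (comments mark where it would go).

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct pci_bus: a number and a refcount. */
struct bus {
	int number;
	int refcount;
	struct bus *next;	/* singly linked "root bus" list */
};

static struct bus *bus_get(struct bus *b) { b->refcount++; return b; }
static void bus_put(struct bus *b) { b->refcount--; }

/*
 * Snapshot the list: count, allocate, then fill, taking one reference
 * per entry. In the kernel the whole body would run under
 * down_read(&pci_bus_sem); the lock is elided in this sketch.
 */
static struct bus **get_root_buses(struct bus *head, int *count)
{
	struct bus **snap = NULL;
	struct bus *b;
	int n = 0;

	for (b = head; b; b = b->next)
		n++;
	if (n)
		snap = malloc(n * sizeof(*snap));

	*count = 0;
	if (snap) {
		for (b = head; b; b = b->next)
			snap[(*count)++] = bus_get(b);
	}
	return snap;
}

/* Drop every reference taken by get_root_buses() and free the array. */
static void put_root_buses(struct bus **snap, int count)
{
	int i;

	for (i = 0; i < count; i++)
		bus_put(snap[i]);
	free(snap);
}

/* Mirrors the i7core_pci_lastbus() conversion: highest bus number seen. */
static int last_bus(struct bus *head)
{
	int i, count, last = 0;
	struct bus **snap = get_root_buses(head, &count);

	for (i = 0; i < count; i++)
		if (snap[i]->number > last)
			last = snap[i]->number;
	put_root_buses(snap, count);
	return last;
}
```

As in the quoted patch, an allocation failure is reported the same way as an empty list (NULL with a count of 0), so a caller such as pcibios_scan_root() cannot tell the two cases apart.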

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH v1 06/22] PCI: use a global lock to serialize PCI root bridge hotplug operations
  2012-09-12 16:51       ` Bjorn Helgaas
  2012-09-13 16:00         ` [PATCH 1/2] PCI: introduce root bridge hotplug safe interfaces to walk root buses Jiang Liu
  2012-09-13 16:00         ` [PATCH 2/2] PCI: remove host bridge hotplug unsafe interface pci_get_next_bus() Jiang Liu
@ 2012-09-17 15:51         ` Jiang Liu
  2012-09-20 18:49         ` Paul E. McKenney
  3 siblings, 0 replies; 51+ messages in thread
From: Jiang Liu @ 2012-09-17 15:51 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige, Jiang Liu,
	Taku Izumi, Rafael J . Wysocki, Yijing Wang, Xinwei Hu,
	linux-kernel, linux-pci

>>> I'm not sure why you didn't add a pci_host_bridge_hotplug_lock() in
>>> the sba_init() path, since it looks similar to the drm_open_helper()
>>> path above.  But in any case, I think that would be the wrong thing to
>>> do because it would fix the superficial problem while leaving the
>>> deeper problem of host bridge hot-add not setting the iommu pointer.
> 
>> sba_init is called during system initialization stages through subsys_initcall,
>> so no extra protection for it.
> 
> OK, I see your reasoning.  But I don't agree :)  All the users of an
> interface should use the same locking scheme, even if they're at
> boot-time where we "know" we don't need it.  It's too hard to analyze
> differences, and code gets copied from one place to somewhere else
> where it might not be appropriate.
Hi Bjorn,
	"All the users of an interface should use the same locking scheme" is a
really good design rule to follow. It's all too easy for somebody to remove the
__init modifier without noticing the design limitation.
	--Gerry

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/2] PCI: introduce root bridge hotplug safe interfaces to walk root buses
  2012-09-13 17:40           ` Bjorn Helgaas
@ 2012-09-17 15:55             ` Jiang Liu
  2012-09-17 16:24               ` Bjorn Helgaas
  0 siblings, 1 reply; 51+ messages in thread
From: Jiang Liu @ 2012-09-17 15:55 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Jiang Liu, linux-pci, Yinghai Lu

On 09/14/2012 01:40 AM, Bjorn Helgaas wrote:
> On Thu, Sep 13, 2012 at 10:00 AM, Jiang Liu <liuj97@gmail.com> wrote:
>> This patch introduces two root bridge hotplug safe interfaces to walk
>> all root buses. Function pci_get_root_buses() takes a snapshot of the
>> pci_root_buses list and holds a reference count to each root bus.
>> pci_{get|put}_root_buses are used to replace hotplug unsafe interface
>> pci_find_next_bus().
> 
> Honestly, I think the whole idea of walking these lists is wrong, and
> adding safer interfaces just perpetuates the idea that it's OK to walk
> them.
> 
> We should be doing the setup in the device add path instead.  I know
> we have other issues with that in some cases, but I'd like to at least
> move in that direction.
> 
> For example, sba_init() is a problem because it's an ACPI driver, and
> we currently enumerate PCI devices before binding most ACPI drivers.
> That's broken -- in that particular case, there's an HWP0001 IOMMU
> device that encloses the PNP0A03 PCI host bridge.  Currently we bind
> the PNP0A03 driver first, enumerate the PCI devices below it, then
> bind the HWP0001 driver (sba_init).  Obviously that's backwards and
> the HWP0001 driver should have been bound first, then the PNP0A03
> driver.  But I don't think we're ready to make that shift yet (though
> it'd be nice if somebody were working on it).
I remember there were some discussions on the mailing list about the divergence
between the boot and hotplug paths. But it's a little hard for me to work on
this, since I only have experience with PCI on IA64 and x86 :(

> 
> I wonder if we could add some kind of iterator that does the list
> traversals in the PCI core and calls a callback for every device?  I
> think that would work for sba_init(), but I don't know about the
> others.  This would still be ugly in that the iterator would have to
> hold some sort of hotplug lock while doing for_each_pci_dev() and the
> callers, e.g., sba_init(), are not solving the problem for hot-added
> devices, but at least the locking would be in the core and the drivers
> would stop depending on the lists themselves.
I will try the iterator first, hope we could find a solution here.
--Gerry

> 


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/2] PCI: introduce root bridge hotplug safe interfaces to walk root buses
  2012-09-17 15:55             ` Jiang Liu
@ 2012-09-17 16:24               ` Bjorn Helgaas
  2012-09-18 21:39                 ` Bjorn Helgaas
  0 siblings, 1 reply; 51+ messages in thread
From: Bjorn Helgaas @ 2012-09-17 16:24 UTC (permalink / raw)
  To: Jiang Liu; +Cc: Jiang Liu, linux-pci, Yinghai Lu

On Mon, Sep 17, 2012 at 9:55 AM, Jiang Liu <liuj97@gmail.com> wrote:
> On 09/14/2012 01:40 AM, Bjorn Helgaas wrote:
>> On Thu, Sep 13, 2012 at 10:00 AM, Jiang Liu <liuj97@gmail.com> wrote:
>>> This patch introduces two root bridge hotplug safe interfaces to walk
>>> all root buses. Function pci_get_root_buses() takes a snapshot of the
>>> pci_root_buses list and holds a reference count to each root bus.
>>> pci_{get|put}_root_buses are used to replace hotplug unsafe interface
>>> pci_find_next_bus().
>>
>> Honestly, I think the whole idea of walking these lists is wrong, and
>> adding safer interfaces just perpetuates the idea that it's OK to walk
>> them.
>>
>> We should be doing the setup in the device add path instead.  I know
>> we have other issues with that in some cases, but I'd like to at least
>> move in that direction.
>>
>> For example, sba_init() is a problem because it's an ACPI driver, and
>> we currently enumerate PCI devices before binding most ACPI drivers.
>> That's broken -- in that particular case, there's an HWP0001 IOMMU
>> device that encloses the PNP0A03 PCI host bridge.  Currently we bind
>> the PNP0A03 driver first, enumerate the PCI devices below it, then
>> bind the HWP0001 driver (sba_init).  Obviously that's backwards and
>> the HWP0001 driver should have been bound first, then the PNP0A03
>> driver.  But I don't think we're ready to make that shift yet (though
>> it'd be nice if somebody were working on it).
> I remember there were some discussions on the mailing list about the divergence
> between the boot and hotplug paths. But it's a little hard for me to work on
> this, since I only have experience with PCI on IA64 and x86 :(
>
>>
>> I wonder if we could add some kind of iterator that does the list
>> traversals in the PCI core and calls a callback for every device?  I
>> think that would work for sba_init(), but I don't know about the
>> others.  This would still be ugly in that the iterator would have to
>> hold some sort of hotplug lock while doing for_each_pci_dev() and the
>> callers, e.g., sba_init(), are not solving the problem for hot-added
>> devices, but at least the locking would be in the core and the drivers
>> would stop depending on the lists themselves.

> I will try the iterator first, hope we could find a solution here.

A plain iterator only handles devices that already exist.  But I
wonder if it would work to have an interface like "call this callback
for every device that exists already *and* for every device that's
hot-added in the future."  The bus notifiers are close to this, e.g.,
"bus_register_notifier(&pci_bus_type, ...)" handles this for hot-added
devices.  A little glue around it could take care of doing it for
already-existing devices as well.
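
The "existing devices *and* future hot-added devices" idea can be sketched as a register-then-replay helper. This is a user-space model under stated assumptions, not a kernel interface: `struct core`, `core_for_each_dev_and_future()`, `core_add_dev()` and `count_dev()` are hypothetical names, devices are plain integers, and the hotplug lock that would have to cover both the registration and the replay is only indicated in a comment.

```c
#include <stddef.h>

#define MAX_DEVS 8
#define MAX_SUBS 4

/* Hypothetical callback type: invoked once per device. */
typedef void (*dev_cb)(int dev_id, void *data);

struct subscriber {
	dev_cb cb;
	void *data;
};

/* Hypothetical "core": the known devices plus the registered callbacks. */
struct core {
	int devs[MAX_DEVS];
	int ndevs;
	struct subscriber subs[MAX_SUBS];
	int nsubs;
};

/*
 * Register a callback, then replay it for every device that already
 * exists. A real core would do both steps under one hotplug lock so
 * that no device is missed or reported twice.
 */
static int core_for_each_dev_and_future(struct core *c, dev_cb cb, void *data)
{
	int i;

	if (c->nsubs == MAX_SUBS)
		return -1;
	c->subs[c->nsubs].cb = cb;
	c->subs[c->nsubs].data = data;
	c->nsubs++;

	for (i = 0; i < c->ndevs; i++)
		cb(c->devs[i], data);
	return 0;
}

/* Hot-add path: record the device and notify every subscriber. */
static int core_add_dev(struct core *c, int dev_id)
{
	int i;

	if (c->ndevs == MAX_DEVS)
		return -1;
	c->devs[c->ndevs++] = dev_id;

	for (i = 0; i < c->nsubs; i++)
		c->subs[i].cb(dev_id, c->subs[i].data);
	return 0;
}

/* Example callback: count the devices it has been shown. */
static void count_dev(int dev_id, void *data)
{
	(void)dev_id;
	(*(int *)data)++;
}
```

With this shape a caller never touches the device list itself: it sees the devices present at registration time and every device added afterwards through the same callback.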

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/2] PCI: introduce root bridge hotplug safe interfaces to walk root buses
  2012-09-17 16:24               ` Bjorn Helgaas
@ 2012-09-18 21:39                 ` Bjorn Helgaas
  2012-09-21 16:07                   ` [PATCH v4] PCI: introduce two interfaces to walk PCI buses Jiang Liu
  0 siblings, 1 reply; 51+ messages in thread
From: Bjorn Helgaas @ 2012-09-18 21:39 UTC (permalink / raw)
  To: Jiang Liu; +Cc: Jiang Liu, linux-pci, Yinghai Lu

On Mon, Sep 17, 2012 at 10:24 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Mon, Sep 17, 2012 at 9:55 AM, Jiang Liu <liuj97@gmail.com> wrote:
>> On 09/14/2012 01:40 AM, Bjorn Helgaas wrote:
>>> On Thu, Sep 13, 2012 at 10:00 AM, Jiang Liu <liuj97@gmail.com> wrote:
>>>> This patch introduces two root bridge hotplug safe interfaces to walk
>>>> all root buses. Function pci_get_root_buses() takes a snapshot of the
>>>> pci_root_buses list and holds a reference count to each root bus.
>>>> pci_{get|put}_root_buses are used to replace hotplug unsafe interface
>>>> pci_find_next_bus().
>>>
>>> Honestly, I think the whole idea of walking these lists is wrong, and
>>> adding safer interfaces just perpetuates the idea that it's OK to walk
>>> them.
>>>
>>> We should be doing the setup in the device add path instead.  I know
>>> we have other issues with that in some cases, but I'd like to at least
>>> move in that direction.
>>>
>>> For example, sba_init() is a problem because it's an ACPI driver, and
>>> we currently enumerate PCI devices before binding most ACPI drivers.
>>> That's broken -- in that particular case, there's an HWP0001 IOMMU
>>> device that encloses the PNP0A03 PCI host bridge.  Currently we bind
>>> the PNP0A03 driver first, enumerate the PCI devices below it, then
>>> bind the HWP0001 driver (sba_init).  Obviously that's backwards and
>>> the HWP0001 driver should have been bound first, then the PNP0A03
>>> driver.  But I don't think we're ready to make that shift yet (though
>>> it'd be nice if somebody were working on it).
>> I remember there were some discussions on the mailing list about the divergence
>> between the boot and hotplug paths. But it's a little hard for me to work on
>> this, since I only have experience with PCI on IA64 and x86 :(
>>
>>>
>>> I wonder if we could add some kind of iterator that does the list
>>> traversals in the PCI core and calls a callback for every device?  I
>>> think that would work for sba_init(), but I don't know about the
>>> others.  This would still be ugly in that the iterator would have to
>>> hold some sort of hotplug lock while doing for_each_pci_dev() and the
>>> callers, e.g., sba_init(), are not solving the problem for hot-added
>>> devices, but at least the locking would be in the core and the drivers
>>> would stop depending on the lists themselves.
>
>> I will try the iterator first, hope we could find a solution here.
>
> A plain iterator only handles devices that already exist.  But I
> wonder if it would work to have an interface like "call this callback
> for every device that exists already *and* for every device that's
> hot-added in the future."  The bus notifiers are close to this, e.g.,
> "bus_register_notifier(&pci_bus_type, ...)" handles this for hot-added
> devices.  A little glue around it could take care of doing it for
> already-existing devices as well.

BTW, while reviewing Yinghai's vga patch, I found a case in
vga_arb_device_init() that does exactly this: registers a notifier to
catch future hot-added devices, then calls the notifier "add" function
for every existing device.  So that's another place that could use
something like this.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH v1 06/22] PCI: use a global lock to serialize PCI root bridge hotplug operations
  2012-09-12 16:51       ` Bjorn Helgaas
                           ` (2 preceding siblings ...)
  2012-09-17 15:51         ` [RFC PATCH v1 06/22] PCI: use a global lock to serialize PCI root bridge hotplug operations Jiang Liu
@ 2012-09-20 18:49         ` Paul E. McKenney
  3 siblings, 0 replies; 51+ messages in thread
From: Paul E. McKenney @ 2012-09-20 18:49 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Jiang Liu, Don Dutile, Yinghai Lu, Greg KH, Kenji Kaneshige,
	Jiang Liu, Taku Izumi, Rafael J . Wysocki, Yijing Wang,
	Xinwei Hu, linux-kernel, linux-pci

On Wed, Sep 12, 2012 at 10:51:05AM -0600, Bjorn Helgaas wrote:
> On Wed, Sep 12, 2012 at 9:42 AM, Jiang Liu <liuj97@gmail.com> wrote:
> > On 09/12/2012 06:57 AM, Bjorn Helgaas wrote:
> >> On Tue, Aug 7, 2012 at 10:10 AM, Jiang Liu <liuj97@gmail.com> wrote:
> >>> Currently there's no mechanism to protect the global pci_root_buses list
> >>> from dynamic change at runtime. That means, PCI root bridge hotplug
> >>> operations, which dynamically change the pci_root_buses list, may cause
> >>> invalid memory accesses.
> >>>
> >>> So introduce a global lock to serialize accesses to the pci_root_buses
> >>> list and serialize PCI host bridge hotplug operations.
> 
> >>> @@ -463,6 +463,8 @@ static int __devinit acpi_pci_root_add(struct acpi_device *device)
> >>>         if (!root)
> >>>                 return -ENOMEM;
> >>>
> >>> +       pci_host_bridge_hotplug_lock();
> >>
> >> Here's where I get lost.  This is an ACPI driver's .add() routine,
> >> which is analogous to a PCI driver's .probe() routine.  PCI driver
> >> .probe() routines don't need to be concerned with PCI device hotplug.
> >> All the hotplug-related locking is handled by the PCI core, not by
> >> individual drivers.  So why do we need it here?
> >>
> >> I'm not suggesting that the existing locking is correct.  I'm just not
> >> convinced this is the right way to fix it.
> >>
> >> The commit log says we need protection for the global pci_root_buses
> >> list.  But even with this whole series, we still traverse the list
> >> without protection in places like pcibios_resource_survey() and
> >> pci_assign_unassigned_resources().
> >>
> >> Maybe we can make progress on this by identifying specific failures
> >> that can happen in a couple of these paths, e.g., acpi_pci_root_add()
> >> and i7core_xeon_pci_fixup().  If we look at those paths, we might find a
> >> way to fix this in a more general fashion than throwing in lock/unlock
> >> pairs.
> >>
> >> It might also help to know what the rule is for when we need to use
> >> pci_host_bridge_hotplug_lock() and pci_host_bridge_hotplug_unlock().
> >> Apparently it is not as simple as protecting every reference to the
> >> pci_root_buses list.
> > Hi Bjorn,
> >         It's really challenging work to protect the pci_root_buses list :)
> 
> Yes.  IIRC, your last patch was to unexport pci_root_buses, which I
> think is a great idea.
> 
> > All evils are caused by the pci_find_next_bus() interface, which is designed
> > to be called at boot time only. I have tried several other solutions but
> > failed.
> >         First I tried "pci_get_next_bus()" which holds a reference to the
> > returned root bus "pci_bus". But that doesn't help because pci_bus could
> > be removed from the pci_root_buses list even if you hold a reference to
> > pci_bus. And it will cause trouble when you call pci_get_next_bus(pci_bus)
> > again because pci_bus->node.next is invalid now.
> 
> That sounds like a bug.  If an interface returns a structure after
> acquiring a reference, the caller should be able to rely on the
> structure remaining valid.  Adding extra locks doesn't feel like the
> right solution for that problem.
> 
> In the big picture, I'm not sure how much sense all the
> pci_find_bus(), pci_find_next_bus(), pci_get_bus(),
> pci_get_next_bus(), etc., interfaces really make.  There really aren't
> very many callers, and most of them look a bit hacky to me.  Usually
> they're quirks trying to locate a device or drivers for device A
> trying to locate companion device B or something similar.  I wonder if
> we could figure out some entirely new interface that wouldn't involve
> traversing so much of the hierarchy and therefore could be safer.
> 
> >         Then I tried RCU and also failed because caller of pci_get_next_bus()
> > may sleep.

On the unlikely off-chance that it helps, SRCU does allow sleeping
readers.

							Thanx, Paul

> >         And at last the global host bridge hotplug lock solution. The rules
> > for locking are:
> >         1) No need for locking when accessing the pci_root_buses list at
> > system initialization stages. (It's system initialization instead of driver
> > initialization here because driver's initialization code may be called
> > at runtime when loading the driver.) It's single-threaded and no hotplug
> > during system initialization stages.
> >         2) Should acquire the global lock when accessing the pci_root_buses
> > list at runtime.
> >
> >         I have done several rounds of scanning to identify accesses to
> > the pci_root_buses list at runtime. But there may still be something missed :(
> 
> That's part of what makes me uneasy.  We have to look at a lot of code
> outside drivers/pci to analyze correctness, which is difficult.  It
> would be much better if we could do something in the core, where we
> only have to analyze drivers/pci.  I know this is probably much harder
> and probably involves replacing or removing some of these interfaces
> that cause problems.
> 
> >         I think the best solution is to get rid of pci_find_next_bus(),
> > but I'm not sure whether we can achieve that.
> 
> >> Actually, I looked at the callers of pci_find_next_bus(), and most of
> >> them are unsafe in an even deeper way: they're doing device setup in
> >> initcalls, so that setup won't be done for hot-added devices.  For
> >> example, I can pick on sba_init() because I think I wrote it back in
> >> the dark ages.  sba_init() is a subsys_initcall that calls
> >> sba_connect_bus() for every bus we know about at boot-time, and it
> >> sets the host bridge's iommu pointer.  If we were to hot-add a host
> >> bridge, we would never set the iommu pointer.
> 
> > That's a more fundamental issue, another big topic for us:(
> 
> >> I'm not sure why you didn't add a pci_host_bridge_hotplug_lock() in
> >> the sba_init() path, since it looks similar to the drm_open_helper()
> >> path above.  But in any case, I think that would be the wrong thing to
> >> do because it would fix the superficial problem while leaving the
> >> deeper problem of host bridge hot-add not setting the iommu pointer.
> 
> > sba_init is called during system initialization stages through subsys_initcall,
> > so no extra protection for it.
> 
> OK, I see your reasoning.  But I don't agree :)  All the users of an
> interface should use the same locking scheme, even if they're at
> boot-time where we "know" we don't need it.  It's too hard to analyze
> differences, and code gets copied from one place to somewhere else
> where it might not be appropriate.
> 
> Bjorn


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v4] PCI: introduce two interfaces to walk PCI buses
  2012-09-18 21:39                 ` Bjorn Helgaas
@ 2012-09-21 16:07                   ` Jiang Liu
  2012-09-26 20:14                     ` Bjorn Helgaas
  0 siblings, 1 reply; 51+ messages in thread
From: Jiang Liu @ 2012-09-21 16:07 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Jiang Liu, Yinghai Lu, Kenji Kaneshige, Yijing Wang, Jiang Liu,
	linux-kernel, linux-pci

pci_find_next_bus() is not hotplug safe, so introduce PCI hotplug-safe
interfaces to walk PCI buses. To avoid some deadlock scenarios,
two interfaces are introduced.

The first one is pci_for_each_bus(), which walks all PCI buses while
holding a read lock on pci_bus_sem.

The second one is pci_for_each_started_bus(), which walks all started
PCI buses without holding any global locks. Started PCI buses are those
which have been added to the device tree by calling device_add().
---
Hi Bjorn,
	How about this PCI bus iterator design? It's a little ugly that
we need two interfaces to work around some deadlock scenarios.
I plan to split the task into two parts:
	1) a hotplug-safe PCI bus iterator to replace pci_find_next_bus()
	2) handling hotplug notifications to update bus-related state.

My patchset at http://www.spinics.net/lists/linux-pci/msg17515.html has
partially solved issue 2 above for x86/ACPI, and I will add support
for other platforms.
	Thanks!
	Gerry
---
 drivers/pci/bus.c   |   42 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/pci.h |    1 +
 2 files changed, 43 insertions(+)

diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index e16a8f0f..21b0ade 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -327,6 +327,48 @@ void pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
 }
 EXPORT_SYMBOL_GPL(pci_walk_bus);
 
+static int pci_bus_iter(struct pci_bus *bus,
+			int (* cb)(struct pci_bus *, void *), void *data)
+{
+	int rc;
+
+	rc = cb(bus, data);
+	if (rc == 0)
+		list_for_each_entry(bus, &bus->children, node) {
+			rc = pci_bus_iter(bus, cb, data);
+			if (rc)
+				break;
+		}
+
+	return rc;
+}
+
+/** pci_for_each_bus - walk all PCI buses and call the provided callback.
+ *  @cb   callback to be called for each bus found
+ *  @data arbitrary pointer to be passed to callback.
+ *
+ *  Walk all PCI buses and call the provided callback with pci_bus_sem held.
+ *
+ *  We check the return of @cb each time. If it returns anything
+ *  other than 0, we break out.
+ */
+int pci_for_each_bus(int (* cb)(struct pci_bus *, void *), void *data)
+{
+	int rc = 0;
+	struct pci_bus *bus;
+
+	down_read(&pci_bus_sem);
+	list_for_each_entry(bus, &pci_root_buses, node) {
+		rc = pci_bus_iter(bus, cb, data);
+		if (rc)
+			break;
+	}
+	up_read(&pci_bus_sem);
+
+	return rc;
+}
+EXPORT_SYMBOL(pci_for_each_bus);
+
 struct pci_bus *pci_bus_get(struct pci_bus *bus)
 {
 	if (bus)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 3c5017d..1423a24 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1060,6 +1060,7 @@ int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max,
 
 void pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
 		  void *userdata);
+int pci_for_each_bus(int (* cb)(struct pci_bus *, void *), void *data);
 int pci_cfg_space_size_ext(struct pci_dev *dev);
 int pci_cfg_space_size(struct pci_dev *dev);
 unsigned char pci_bus_max_busnr(struct pci_bus *bus);
-- 
1.7.9.5
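
The recursive walk in the patch above can be modelled in user space as follows. This is a sketch under stated assumptions: `struct bus`, `bus_iter()`, `for_each_bus()` and `find_bus()` are hypothetical stand-ins, the tree uses plain child/sibling pointers instead of `struct list_head`, and the `pci_bus_sem` read lock is only indicated in comments. The sketch keeps the child cursor in its own variable: list_for_each_entry() evaluates its head argument on every pass, so passing `bus` as the cursor while `&bus->children` is the head, as pci_bus_iter() in the hunk above does, changes the head as the cursor moves.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct pci_bus using plain pointers. */
struct bus {
	int number;
	struct bus *first_child;
	struct bus *next_sibling;
};

typedef int (*bus_cb)(struct bus *bus, void *data);

/*
 * Pre-order walk of one bus and everything below it. The child cursor
 * is a separate variable, so the parent whose child list is being
 * walked is never overwritten during the loop.
 */
static int bus_iter(struct bus *bus, bus_cb cb, void *data)
{
	struct bus *child;
	int rc = cb(bus, data);

	if (rc)
		return rc;
	for (child = bus->first_child; child; child = child->next_sibling) {
		rc = bus_iter(child, cb, data);
		if (rc)
			break;
	}
	return rc;
}

/* Walk every root and its subtree; stop at the first non-zero return. */
static int for_each_bus(struct bus *roots, bus_cb cb, void *data)
{
	struct bus *root;
	int rc = 0;

	/* In the kernel, down_read(&pci_bus_sem) would be taken here. */
	for (root = roots; root; root = root->next_sibling) {
		rc = bus_iter(root, cb, data);
		if (rc)
			break;
	}
	/* ... and up_read(&pci_bus_sem) would be called here. */
	return rc;
}

/* Example callback: count visits and stop once a bus number matches. */
struct find {
	int target;
	int visited;
};

static int find_bus(struct bus *bus, void *data)
{
	struct find *f = data;

	f->visited++;
	return bus->number == f->target;
}
```

Because the callback runs with the lock held in this design, it must not call anything that takes the same lock for writing, which is presumably one of the deadlock scenarios the changelog alludes to.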


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH v4] PCI: introduce two interfaces to walk PCI buses
  2012-09-21 16:07                   ` [PATCH v4] PCI: introduce two interfaces to walk PCI buses Jiang Liu
@ 2012-09-26 20:14                     ` Bjorn Helgaas
  0 siblings, 0 replies; 51+ messages in thread
From: Bjorn Helgaas @ 2012-09-26 20:14 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Jiang Liu, Yinghai Lu, Kenji Kaneshige, Yijing Wang,
	linux-kernel, linux-pci

On Fri, Sep 21, 2012 at 10:07 AM, Jiang Liu <liuj97@gmail.com> wrote:
> pci_find_next_bus() is not hotplug safe, so introduce PCI hotplug-safe
> interfaces to walk PCI buses. To avoid some deadlock scenarios,
> two interfaces are introduced.
>
> The first one is pci_for_each_bus(), which walks all PCI buses while
> holding a read lock on pci_bus_sem.
>
> The second one is pci_for_each_started_bus(), which walks all started
> PCI buses without holding any global locks. Started PCI buses are those
> which have been added to the device tree by calling device_add().
> ---
> Hi Bjorn,
>         How about this PCI bus iterator design? It's a little ugly that
> we need two interfaces to work around some deadlock scenarios.
> I plan to split the task into two parts:
>         1) a hotplug-safe PCI bus iterator to replace pci_find_next_bus()
>         2) handling hotplug notifications to update bus-related state.
>
> My patchset at http://www.spinics.net/lists/linux-pci/msg17515.html has
> partially solved issue 2 above for x86/ACPI, and I will add support
> for other platforms.
>         Thanks!
>         Gerry

I like this interface:

    int pci_for_each_bus(int (* cb)(struct pci_bus *, void *), void *data)

quite a bit because it has the potential for removing all the list
knowledge from the callers, but there are two things I don't like:

  1) The interface would allow the PCI core to call the callback for
future hot-added buses, but your implementation doesn't have that yet.

  2) I'd rather have something like "pci_for_each_dev()"  so this is
device-based instead of bus-based.  The struct pci_bus is of limited
usefulness outside the core, except as a container for a set of
devices.  So the users of pci_for_each_bus() would often iterate
through that set of devices, and given the complications of SR-IOV
virtual buses, I don't think it's really clear how to do that
iteration correctly.

> ---
>  drivers/pci/bus.c   |   42 ++++++++++++++++++++++++++++++++++++++++++
>  include/linux/pci.h |    1 +
>  2 files changed, 43 insertions(+)
>
> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> index e16a8f0f..21b0ade 100644
> --- a/drivers/pci/bus.c
> +++ b/drivers/pci/bus.c
> @@ -327,6 +327,48 @@ void pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
>  }
>  EXPORT_SYMBOL_GPL(pci_walk_bus);
>
> +static int pci_bus_iter(struct pci_bus *bus,
> +                       int (* cb)(struct pci_bus *, void *), void *data)
> +{
> +       int rc;
> +
> +       rc = cb(bus, data);
> +       if (rc == 0)
> +               list_for_each_entry(bus, &bus->children, node) {
> +                       rc = pci_bus_iter(bus, cb, data);
> +                       if (rc)
> +                               break;
> +               }
> +
> +       return rc;
> +}
> +
> +/** pci_for_each_bus - walk all PCI buses and call the provided callback.
> + *  @cb   callback to be called for each bus found
> + *  @data arbitrary pointer to be passed to callback.
> + *
> + *  Walk all PCI buses and call the provided callback with pci_bus_sem held.
> + *
> + *  We check the return of @cb each time. If it returns anything
> + *  other than 0, we break out.
> + */
> +int pci_for_each_bus(int (* cb)(struct pci_bus *, void *), void *data)
> +{
> +       int rc = 0;
> +       struct pci_bus *bus;
> +
> +       down_read(&pci_bus_sem);
> +       list_for_each_entry(bus, &pci_root_buses, node) {
> +               rc = pci_bus_iter(bus, cb, data);
> +               if (rc)
> +                       break;
> +       }
> +       up_read(&pci_bus_sem);
> +
> +       return rc;
> +}
> +EXPORT_SYMBOL(pci_for_each_bus);
> +
>  struct pci_bus *pci_bus_get(struct pci_bus *bus)
>  {
>         if (bus)
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 3c5017d..1423a24 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1060,6 +1060,7 @@ int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max,
>
>  void pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
>                   void *userdata);
> +int pci_for_each_bus(int (* cb)(struct pci_bus *, void *), void *data);
>  int pci_cfg_space_size_ext(struct pci_dev *dev);
>  int pci_cfg_space_size(struct pci_dev *dev);
>  unsigned char pci_bus_max_busnr(struct pci_bus *bus);
> --
> 1.7.9.5
>
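
One detail in the quoted pci_bus_iter() may be worth double-checking: it reuses `bus` as the list_for_each_entry() cursor while `&bus->children` is also the list head. The macro re-evaluates its head expression on every pass, so once the cursor advances, the termination test is made against the current child's list instead of the parent's. The user-space sketch below shows the same recursive walk with a separate cursor. All names (toy_bus, toy_bus_iter, toy_demo_visits) are hypothetical, and plain pointers stand in for the kernel's list_head.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bus node; NOT the kernel's struct pci_bus. */
struct toy_bus {
	int number;
	struct toy_bus *first_child;
	struct toy_bus *next_sibling;
};

typedef int (*toy_bus_cb)(struct toy_bus *bus, void *data);

/*
 * Pre-order walk: call @cb on @bus, then on each child subtree.
 * A nonzero return from @cb stops the walk and is passed back up.
 */
static int toy_bus_iter(struct toy_bus *bus, toy_bus_cb cb, void *data)
{
	struct toy_bus *child;	/* separate cursor; 'bus' keeps naming the parent */
	int rc;

	assert(cb != NULL);

	rc = cb(bus, data);
	if (rc)
		return rc;

	for (child = bus->first_child; child; child = child->next_sibling) {
		rc = toy_bus_iter(child, cb, data);
		if (rc)
			break;
	}
	return rc;
}

struct visit_log {
	int visited;	/* number of buses the callback has seen */
	int stop_at;	/* bus number at which to stop, or -1 for never */
};

/* Example callback: count the visit, ask to stop at log->stop_at. */
static int log_bus(struct toy_bus *bus, void *data)
{
	struct visit_log *log = data;

	log->visited++;
	return bus->number == log->stop_at;
}

/* Tree: bus 0 -> { bus 1 -> { bus 2 }, bus 3 }. Returns how many buses were visited. */
static int toy_demo_visits(int stop_at)
{
	struct toy_bus b2 = { 2, NULL, NULL };
	struct toy_bus b3 = { 3, NULL, NULL };
	struct toy_bus b1 = { 1, &b2, &b3 };
	struct toy_bus b0 = { 0, &b1, NULL };
	struct visit_log log = { 0, stop_at };

	toy_bus_iter(&b0, log_bus, &log);
	return log.visited;
}
```

With stop_at set to -1 all four buses are visited in the order 0, 1, 2, 3. With stop_at set to 2 the walk ends after three visits, which matches the "break out on nonzero" behaviour described in the quoted kernel-doc comment.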

Thread overview: 51+ messages
2012-08-07 16:10 [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Jiang Liu
2012-08-07 16:10 ` [RFC PATCH v1 01/22] PCI: use pci_get_domain_bus_and_slot() to avoid race conditions Jiang Liu
2012-09-11 22:00   ` Bjorn Helgaas
2012-09-12  8:37     ` Jiang Liu
2012-08-07 16:10 ` [RFC PATCH v1 02/22] PCI: trivial cleanups for drivers/pci/remove.c Jiang Liu
2012-09-11 22:03   ` Bjorn Helgaas
2012-09-12  8:50     ` Jiang Liu
2012-08-07 16:10 ` [RFC PATCH v1 03/22] PCI: change PCI device management code to better follow device model Jiang Liu
2012-09-11 22:03   ` Bjorn Helgaas
2012-08-07 16:10 ` [RFC PATCH v1 04/22] PCI: split PCI bus device registration into two stages Jiang Liu
2012-08-07 16:10 ` [RFC PATCH v1 05/22] PCI: introduce pci_bus_{get|put}() to manage PCI bus reference count Jiang Liu
2012-08-07 16:10 ` [RFC PATCH v1 06/22] PCI: use a global lock to serialize PCI root bridge hotplug operations Jiang Liu
2012-09-11 22:57   ` Bjorn Helgaas
2012-09-12 15:42     ` Jiang Liu
2012-09-12 16:51       ` Bjorn Helgaas
2012-09-13 16:00         ` [PATCH 1/2] PCI: introduce root bridge hotplug safe interfaces to walk root buses Jiang Liu
2012-09-13 17:40           ` Bjorn Helgaas
2012-09-17 15:55             ` Jiang Liu
2012-09-17 16:24               ` Bjorn Helgaas
2012-09-18 21:39                 ` Bjorn Helgaas
2012-09-21 16:07                   ` [PATCH v4] PCI: introduce two interfaces to walk PCI buses Jiang Liu
2012-09-26 20:14                     ` Bjorn Helgaas
2012-09-13 16:00         ` [PATCH 2/2] PCI: remove host bridge hotplug unsafe interface pci_get_next_bus() Jiang Liu
2012-09-17 15:51         ` [RFC PATCH v1 06/22] PCI: use a global lock to serialize PCI root bridge hotplug operations Jiang Liu
2012-09-20 18:49         ` Paul E. McKenney
2012-08-07 16:10 ` [RFC PATCH v1 07/22] PCI: introduce PCI bus lock to serialize PCI " Jiang Liu
2012-09-11 23:24   ` Bjorn Helgaas
2012-08-07 16:10 ` [RFC PATCH v1 08/22] PCI: introduce hotplug safe search interfaces for PCI bus/device Jiang Liu
2012-08-07 16:10 ` [RFC PATCH v1 09/22] PCI: enhance PCI probe logic to support PCI bus lock mechanism Jiang Liu
2012-08-07 16:10 ` [RFC PATCH v1 10/22] PCI: enhance PCI bus specific " Jiang Liu
2012-08-07 16:10 ` [RFC PATCH v1 11/22] PCI: enhance PCI resource assignment " Jiang Liu
2012-08-07 16:10 ` [RFC PATCH v1 12/22] PCI: enhance PCI remove " Jiang Liu
2012-08-07 16:10 ` [RFC PATCH v1 13/22] PCI: make each PCI device hold a reference to its parent PCI bus Jiang Liu
2012-08-07 16:10 ` [RFC PATCH v1 14/22] PCI/sysfs: use PCI bus lock to avoid race conditions Jiang Liu
2012-08-07 16:10 ` [RFC PATCH v1 15/22] PCI/eeepc: " Jiang Liu
2012-09-11 23:18   ` Bjorn Helgaas
2012-09-12 14:24     ` [PATCH] eeepc-laptop: fix device reference count leakage in eeepc_rfkill_hotplug() Jiang Liu
2012-09-12 19:59       ` Bjorn Helgaas
2012-08-07 16:10 ` [RFC PATCH v1 16/22] PCI/asus-wmi: use PCI bus lock to avoid race conditions Jiang Liu
2012-08-07 16:10 ` [RFC PATCH v1 17/22] PCI/pciehp: " Jiang Liu
2012-08-07 16:10 ` [RFC PATCH v1 18/22] PCI/acpiphp: " Jiang Liu
2012-08-07 16:10 ` [RFC PATCH v1 19/22] PCI/x86: enable PCI bus lock mechanism for x86 platforms Jiang Liu
2012-09-11 23:22   ` Bjorn Helgaas
2012-09-12  9:56     ` Jiang Liu
2012-08-07 16:11 ` [RFC PATCH v1 20/22] PCI/IA64: enable PCI bus lock mechanism for IA64 platforms Jiang Liu
2012-08-07 16:11 ` [RFC PATCH v1 21/22] PCI: cleanups for PCI bus lock implementation Jiang Liu
2012-09-11 23:21   ` Bjorn Helgaas
2012-09-12  8:58     ` Jiang Liu
2012-08-07 16:11 ` [RFC PATCH v1 22/22] PCI: unexport pci_root_buses Jiang Liu
2012-08-07 18:11 ` [RFC PATCH v1 00/22] introduce PCI bus lock to serialize PCI hotplug operations Don Dutile
2012-08-08 15:49   ` Jiang Liu
