linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/19] Introduce a global lock to serialize all PCI hotplug
@ 2012-04-27 15:16 Jiang Liu
  2012-04-27 15:16 ` [PATCH v2 01/19] PCI: introduce pci_bus_get()/pci_bus_put() to hide PCI implementation details Jiang Liu
                   ` (18 more replies)
  0 siblings, 19 replies; 28+ messages in thread
From: Jiang Liu @ 2012-04-27 15:16 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile, Greg KH
  Cc: Jiang Liu, Keping Chen, linux-kernel, linux-pci

From: Jiang Liu <liuj97@gmail.com>

There are multiple methods to trigger PCI hotplug requests/operations
concurrently, such as:
1. Sysfs interfaces exported by the PCI core subsystem
	/sys/devices/pcissss:bb/ssss:bb:dd.f/.../remove
	/sys/devices/pcissss:bb/ssss:bb:dd.f/.../rescan
	/sys/devices/pcissss:bb/ssss:bb:dd.f/.../pci_bus/ssss:bb/rescan
	/sys/bus/pci/rescan
2. Sysfs interfaces exported by the PCI hotplug subsystem
	/sys/bus/pci/slots/xx/power
3. PCI hotplug events triggered by PCI Hotplug Controllers
4. ACPI hotplug events for PCI host bridges
5. Driver binding/unbinding events
	binding/unbinding pci drivers with SR-IOV support

With current implementation, the PCI core subsystem doesn't support
concurrent hotplug operations yet. The existing pci_bus_sem lock only
protects several lists in struct pci_bus, such as children list,
devices list, but it doesn't protect the pci_bus or pci_dev structure
themselves.

Let's take pci_remove_bus_device() as an example, which are used by
PCI hotplug drivers to hot-remove PCI devices.  Currently all these
are free running without any protection, so it can't support reentrance.
pci_remove_bus_device()
    ->pci_stop_bus_device()
        ->pci_stop_bus_device()
            ->pci_stop_bus_devices()
        ->pci_stop_dev()
                ->pci_proc_detach_device()
                ->pci_remove_sysfs_dev_files()
            ->device_unregister()
                ->pcie_aspm_exit_link_state()
    ->__pci_remove_bus_device()
        ->__pci_remove_behind_bridge()
        ->pci_remove_bus()
            ->device_unregister()
        ->pci_destroy_dev()
            ->pci_free_resources()
            ->pci_dev_put()

There are similar issues on hot-adding side.  It may also cause trouble
if pci_remove_bus_device() and pci_rescan_bus() are called concurrently.

So this patchset introduces a recursive rwsem to globally serialize all
PCI hotplug operations. It also fixes some minor bugs in current code.
Following PCI hotplug drivers/interfaces have been enhanced with this
new lock.
1. Sysfs interfaces exported by the PCI core subsystem
2. Sysfs interfaces exported by the PCI hotplug subsystem
3. pciehp
4. shpchp
5. cpcihp_generic and cpcihp_zt5550
6. fakephp

We have tested pciehp and fakephp, but lack of hardware platforms to test
other affected hotplug drivers.

There are still several tasks on the TODO list:
1) all other PCI hotplug driver in drivers/pci/hotplug directory
2) SR-IOV
3) acpiphp (plan to do this based on Yinghai's PCI root bus hotplug gate)
4) pci_root (plan to do this based on Yinghai's PCI root bus hotplug gate)

V2: refine commit messages and hide sys interface 'remove' and 'rescan' of
    SR-IOV virtural devices

Jiang Liu (18):
  PCI: introduce pci_bus_get()/pci_bus_put() to hide PCI implementation
    details
  PCI: introduce recursive rwsem to serialize PCI hotplug operations
  PCI: replace pci_remove_rescan_mutex with the PCI hotplug lock
  PCI: serialize hotplug operations triggered by PCI hotplug sysfs
    interfaces
  PCI: correctly flush workqueue when destroy pcie hotplug controller
  PCI: prepare for serializing hotplug operations triggered by pciehp
    driver
  PCI: serialize hotplug operaitons triggered by the pciehp driver
  PCI: fix two race windows when probing/removing SHPC controller
  PCI: correctly flush workqueues and timer when destroy SHPC
    controller
  PCI: serialize hotplug operaitons triggered by the shpchp driver
  PCI: release IO resource in error handling path in
    cpcihp_generic_init()
  PCI: clean up all resources in error handling path in
    zt5550_hc_init_one()
  PCI: trivial code clean up in cpci_hotplug_core.c
  PCI: fix race windows when shutting down cpcihp controller
  PCI: hold a reference count to the PCI bus used by cpcihp drivers
  PCI: serialize PCI hotplug operations triggered by cpcihp drivers
  PCI: serialize PCI hotplug operations triggered by fakephp drivers
  PCI: hide sys interface 'remove' and 'rescan' for SR-IOV virtual
    devices

Yinghai Lu (1):
  PCI, sysfs: Use device_type and attr_groups with pci dev

 drivers/pci/bus.c                       |   15 +++
 drivers/pci/hotplug.c                   |   55 ++++++++++
 drivers/pci/hotplug/cpci_hotplug_core.c |   53 ++++++-----
 drivers/pci/hotplug/cpcihp_generic.c    |   30 ++++--
 drivers/pci/hotplug/cpcihp_zt5550.c     |   21 +++-
 drivers/pci/hotplug/fakephp.c           |   38 ++++++-
 drivers/pci/hotplug/pci_hotplug_core.c  |   26 ++++-
 drivers/pci/hotplug/pciehp.h            |    5 +-
 drivers/pci/hotplug/pciehp_core.c       |   25 ++++-
 drivers/pci/hotplug/pciehp_ctrl.c       |   56 ++++++++++-
 drivers/pci/hotplug/pciehp_hpc.c        |   18 +++-
 drivers/pci/hotplug/shpchp.h            |    3 +
 drivers/pci/hotplug/shpchp_core.c       |   11 +-
 drivers/pci/hotplug/shpchp_ctrl.c       |   32 ++++++
 drivers/pci/hotplug/shpchp_hpc.c        |   36 ++++---
 drivers/pci/pci-sysfs.c                 |  164 ++++++++++++++++++++++---------
 drivers/pci/pci.h                       |    1 +
 drivers/pci/probe.c                     |    1 +
 drivers/pci/remove.c                    |    1 +
 include/linux/pci.h                     |   20 ++++-
 20 files changed, 480 insertions(+), 131 deletions(-)

-- 
1.7.5.4


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v2 01/19] PCI: introduce pci_bus_get()/pci_bus_put() to hide PCI implementation details
  2012-04-27 15:16 [PATCH v2 00/19] Introduce a global lock to serialize all PCI hotplug Jiang Liu
@ 2012-04-27 15:16 ` Jiang Liu
  2012-04-27 15:16 ` [PATCH v2 02/19] PCI: introduce recursive rwsem to serialize PCI hotplug operations Jiang Liu
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Jiang Liu @ 2012-04-27 15:16 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile, Greg KH
  Cc: Jiang Liu, Keping Chen, linux-kernel, linux-pci, Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

Introduce pci_bus_get()/pci_bus_put() to hide PCI implementation details.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/bus.c   |   15 +++++++++++++++
 include/linux/pci.h |    2 ++
 2 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 4ce5ef2..50f9c5d 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -35,6 +35,21 @@ void pci_add_resource_offset(struct list_head *resources, struct resource *res,
 }
 EXPORT_SYMBOL(pci_add_resource_offset);
 
+struct pci_bus *pci_bus_get(struct pci_bus *bus)
+{
+	if (bus)
+		get_device(&bus->dev);
+	return bus;
+}
+EXPORT_SYMBOL(pci_bus_get);
+
+void pci_bus_put(struct pci_bus *bus)
+{
+	if (bus)
+		put_device(&bus->dev);
+}
+EXPORT_SYMBOL(pci_bus_put);
+
 void pci_add_resource(struct list_head *resources, struct resource *res)
 {
 	pci_add_resource_offset(resources, res, 0);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index e444f5b..0603a60 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -912,6 +912,8 @@ int pci_request_selected_regions_exclusive(struct pci_dev *, int, const char *);
 void pci_release_selected_regions(struct pci_dev *, int);
 
 /* drivers/pci/bus.c */
+struct pci_bus *pci_bus_get(struct pci_bus *bus);
+void pci_bus_put(struct pci_bus *bus);
 void pci_add_resource(struct list_head *resources, struct resource *res);
 void pci_add_resource_offset(struct list_head *resources, struct resource *res,
 			     resource_size_t offset);
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 02/19] PCI: introduce recursive rwsem to serialize PCI hotplug operations
  2012-04-27 15:16 [PATCH v2 00/19] Introduce a global lock to serialize all PCI hotplug Jiang Liu
  2012-04-27 15:16 ` [PATCH v2 01/19] PCI: introduce pci_bus_get()/pci_bus_put() to hide PCI implementation details Jiang Liu
@ 2012-04-27 15:16 ` Jiang Liu
  2012-05-02  5:00   ` Greg KH
  2012-04-27 15:16 ` [PATCH v2 03/19] PCI: replace pci_remove_rescan_mutex with the PCI hotplug lock Jiang Liu
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 28+ messages in thread
From: Jiang Liu @ 2012-04-27 15:16 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile, Greg KH
  Cc: Jiang Liu, Keping Chen, linux-kernel, linux-pci, Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

There are multiple ways to trigger PCI hotplug requests concurrently,
such as:
1. Sysfs interfaces exported by the PCI core subsystem
2. Sysfs interfaces exported by the PCI hotplug subsystem
3. PCI hotplug events triggered by PCI Hotplug Controllers
4. ACPI hotplug events for PCI host bridges
5. Driver binding/unbinding events

The PCI core subsystem doesn't support concurrent hotplug operations yet,
so all PCI hotplug requests should be globally serialized. This patch
introduces several new interfaces to serialize PCI hotplug operations.

pci_hotplug_try_enter(): try to acquire write lock
pci_hotplug_enter(): acquire write lock
pci_hotplug_exit(): release write lock
pci_hotplug_disable(): acquire read lock
pci_hotplug_enable(): release read lock

Today we have reproduced the issue on a real platform by using
acpiphp driver. It's an IA64 platform running Suse 11SP1 (official
2.6.32.12 kernel). The test script is:

This issue could be reproduced on an IA64 platform with Suse 11SP1
(official 2.6.32.12 kernel) and acpiphp driver.
---------------------------------------------------------------------
#!/bin/bash

for ((i=0;i<100;i++))
do
    echo 1 > /sys/bus/pci/devices/0000\:43\:00.0/remove
    echo 0 > /sys/bus/pci/slots/3/power
    sleep 1
    echo 1 > /sys/bus/pci/slots/3/power
done

And the bug report is:

------------[ cut here ]------------
WARNING: at fs/sysfs/group.c:138 sysfs_remove_group+0x210/0x240()
Hardware name: H8900
sysfs group a0000001012014f0 not found for kobject '0000:45:00.1'
Modules linked in: acpiphp(N) ipv6(N) cpufreq_conservative(N) cpufreq_userspace(
N) cpufreq_powersave(N) acpi_cpufreq(N) binfmt_misc(N) fuse(N) nls_iso8859_1(N)
loop(N) dm_mod(N) tpm_tis(N) tpm(N) ppdev(N) shpchp(N) tpm_bios(N) serio_raw(N)
qla2xxx(N) i2c_i801(N) scsi_transport_fc(N) pci_hotplug(N) scsi_tgt(N) iTCO_wdt(
N) sg(N) iTCO_vendor_support(N) i2c_core(N) mptctl(N) igb(N) parport_pc(N) parpo
rt(N) button(N) container(N) usbhid(N) hid(N) uhci_hcd(N) ehci_hcd(N) usbcore(N)
sd_mod(N) crc_t10dif(N) ext3(N) mbcache(N) jbd(N) fan(N) processor(N) ide_pci_g
eneric(N) ide_core(N) ata_piix(N) libata(N) mptsas(N) mptscsih(N) mptbase(N) scs
i_transport_sas(N) scsi_mod(N) thermal(N) thermal_sys(N) hwmon(N)
Supported: Yes

Call Trace:
 [<a000000100017640>] show_stack+0x80/0xa0
                                sp=e000002f4421fc00 bsp=e000002f44211678
 [<a0000001008cfd10>] dump_stack+0x30/0x50
                                sp=e000002f4421fdd0 bsp=e000002f44211660
 [<a0000001000b9bc0>] warn_slowpath_common+0xc0/0x120
                                sp=e000002f4421fdd0 bsp=e000002f44211628
 [<a0000001000b9d10>] warn_slowpath_fmt+0x90/0xc0
                                sp=e000002f4421fdd0 bsp=e000002f442115c0
 [<a000000100331690>] sysfs_remove_group+0x210/0x240
                                sp=e000002f4421fe10 bsp=e000002f44211590
 [<a000000100636190>] dpm_sysfs_remove+0x30/0x60
                                sp=e000002f4421fe10 bsp=e000002f44211570
 [<a0000001006236c0>] device_del+0x80/0x460
                                sp=e000002f4421fe10 bsp=e000002f44211528
 [<a000000100623ae0>] device_unregister+0x40/0x140
                                sp=e000002f4421fe10 bsp=e000002f44211508
 [<a0000001004d2320>] pci_stop_bus_device+0x160/0x200
                                sp=e000002f4421fe10 bsp=e000002f442114d8
 [<a000000223104e70>] acpiphp_disable_slot+0x170/0x580 [acpiphp]
                                sp=e000002f4421fe10 bsp=e000002f44211470
 [<a000000223100b70>] disable_slot+0x50/0x160 [acpiphp]
                                sp=e000002f4421fe20 bsp=e000002f44211448
 [<a00000021e960e60>] power_write_file+0x240/0x340 [pci_hotplug]
                                sp=e000002f4421fe20 bsp=e000002f44211418
 [<a0000001004e5e00>] pci_slot_attr_store+0x60/0xa0
                                sp=e000002f4421fe20 bsp=e000002f442113d8
 [<a00000010032a260>] sysfs_write_file+0x240/0x340
                                sp=e000002f4421fe20 bsp=e000002f44211380
 [<a000000100232910>] vfs_write+0x1b0/0x3c0
                                sp=e000002f4421fe20 bsp=e000002f44211330
 [<a000000100232ce0>] sys_write+0x80/0x100
                                sp=e000002f4421fe20 bsp=e000002f442112b8
 [<a00000010000c9c0>] ia64_ret_from_syscall+0x0/0x20
                                sp=e000002f4421fe30 bsp=e000002f442112b8
 [<a000000000010720>] __kernel_syscall_via_break+0x0/0x20
                                sp=e000002f44220000 bsp=e000002f442112b8
---[ end trace bd659e9a3f4f6279 ]---
offline_pci.sh[6450]: NaT consumption 17179869216 [1]
Modules linked in: acpiphp(N) ipv6(N) cpufreq_conservative(N) cpufreq_userspace(
N) cpufreq_powersave(N) acpi_cpufreq(N) binfmt_misc(N) fuse(N) nls_iso8859_1(N)
loop(N) dm_mod(N) tpm_tis(N) tpm(N) ppdev(N) shpchp(N) tpm_bios(N) serio_raw(N)                                        qla2xxx(N) i2c_i801(N) scsi_transport_fc(N) pci_hotplug(N) scsi_tgt(N) iTCO_wdt(
N) sg(N) iTCO_vendor_support(N) i2c_core(N) mptctl(N) igb(N) parport_pc(N) parpo                                       rt(N) button(N) container(N) usbhid(N) hid(N) uhci_hcd(N) ehci_hcd(N) usbcore(N)
sd_mod(N) crc_t10dif(N) ext3(N) mbcache(N) jbd(N) fan(N) processor(N) ide_pci_g
eneric(N) ide_core(N) ata_piix(N) libata(N) mptsas(N) mptscsih(N) mptbase(N) scs
i_transport_sas(N) scsi_mod(N) thermal(N) thermal_sys(N) hwmon(N)
Supported: Yes

Pid: 6450, CPU 11, comm:       offline_pci.sh
psr : 0000101009526030 ifs : 8000000000000389 ip  : [<a0000001008a9870>]    Tain
ted: G        W N  (2.6.32.12-yyz)
ip is at klist_put+0x30/0x160
unat: 0000000000000000 pfs : 0000000000000206 rsc : 0000000000000003
rnat: 8000000000000711 bsps: 0000000000000000 pr  : 65519aa656999969
ldrs: 0000000000000000 ccv : 0000000040000000 fpsr: 0009804c0270033f
csd : 0000000000000000 ssd : 0000000000000000
b0  : a0000001008a9a50 b6  : a0000001004b1320 b7  : a00000010000d170
qla2xxx 0000:45:00.1: PCI INT B disabled
f6  : 000000000000000000000 f7  : 1003e9e3779b97f4a7c16
f8  : 1003e0a00000000001072 f9  : 1003effffffffffffffee
f10 : 1003e0000000000000023 f11 : 1003e8208208208208209
r1  : a0000001015c8460 r2  : 0000000000000000 r3  : a0000001013e75b0
r8  : 0000000000000001 r9  : a0000001013e75b0 r10 : a0000001013e8ed8
r11 : 0000000000000000 r12 : e000002f4421fe10 r13 : e000002f44210000
r14 : 0000000000000020 r15 : 0000000000004000 r16 : 0000000000000009
r17 : 0000000000000200 r18 : 0000000040000000 r19 : 0000000040000000
r20 : 0000000040000200 r21 : 0000000040000000 r22 : 000000000001ae13
r23 : 0000000000100000 r24 : a0000001029780f0 r25 : 000000000001ae10
r26 : 000000000001ae10 r27 : 0000000000100000 r28 : 0000000000000034
r29 : 0000000000000034 r30 : a0000001029780f1 r31 : 000000000001ae11

Call Trace:
 [<a000000100017640>] show_stack+0x80/0xa0
                                sp=e000002f4421f850 bsp=e000002f442116f8
 [<a000000100017ca0>] show_regs+0x640/0x920
                                sp=e000002f4421fa20 bsp=e000002f442116a0
 [<a000000100028c70>] die+0x190/0x2e0
                                sp=e000002f4421fa30 bsp=e000002f44211660
 [<a000000100028e10>] die_if_kernel+0x50/0x80
                                sp=e000002f4421fa30 bsp=e000002f44211630
 [<a0000001008d8d70>] ia64_fault+0xf0/0x1640
                                sp=e000002f4421fa30 bsp=e000002f442115d8
 [<a00000010000cb60>] ia64_native_leave_kernel+0x0/0x270
                                sp=e000002f4421fc40 bsp=e000002f442115d8
 [<a0000001008a9870>] klist_put+0x30/0x160
                                sp=e000002f4421fe10 bsp=e000002f44211590
 [<a0000001008a9a50>] klist_del+0x30/0x60
                                sp=e000002f4421fe10 bsp=e000002f44211570
 [<a0000001006236e0>] device_del+0xa0/0x460
                                sp=e000002f4421fe10 bsp=e000002f44211528
 [<a000000100623ae0>] device_unregister+0x40/0x140
                                sp=e000002f4421fe10 bsp=e000002f44211508
 [<a0000001004d2320>] pci_stop_bus_device+0x160/0x200
                                sp=e000002f4421fe10 bsp=e000002f442114d8
 [<a000000223104e70>] acpiphp_disable_slot+0x170/0x580 [acpiphp]
                                sp=e000002f4421fe10 bsp=e000002f44211470
 [<a000000223100b70>] disable_slot+0x50/0x160 [acpiphp]
                                sp=e000002f4421fe20 bsp=e000002f44211448
 [<a00000021e960e60>] power_write_file+0x240/0x340 [pci_hotplug]
                                sp=e000002f4421fe20 bsp=e000002f44211418
 [<a0000001004e5e00>] pci_slot_attr_store+0x60/0xa0
                                sp=e000002f4421fe20 bsp=e000002f442113d8
 [<a00000010032a260>] sysfs_write_file+0x240/0x340
                                sp=e000002f4421fe20 bsp=e000002f44211380
 [<a000000100232910>] vfs_write+0x1b0/0x3c0
                                sp=e000002f4421fe20 bsp=e000002f44211330
 [<a000000100232ce0>] sys_write+0x80/0x100
                                sp=e000002f4421fe20 bsp=e000002f442112b8
 [<a00000010000c9c0>] ia64_ret_from_syscall+0x0/0x20
                                sp=e000002f4421fe30 bsp=e000002f442112b8
 [<a000000000010720>] __kernel_syscall_via_break+0x0/0x20
                                sp=e000002f44220000 bsp=e000002f442112b8
Disabling lock debugging due to kernel taint

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/hotplug.c                  |   55 ++++++++++++++++++++++++++++++++
 drivers/pci/hotplug/pci_hotplug_core.c |    8 ++--
 include/linux/pci.h                    |   14 ++++++++
 3 files changed, 73 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/hotplug.c b/drivers/pci/hotplug.c
index 2b5352a..975bd3d 100644
--- a/drivers/pci/hotplug.c
+++ b/drivers/pci/hotplug.c
@@ -1,8 +1,63 @@
 #include <linux/kernel.h>
 #include <linux/pci.h>
 #include <linux/module.h>
+#include <linux/rwsem.h>
 #include "pci.h"
 
+/* Recursive mutex for PCI hotplug operations. */
+static DECLARE_RWSEM(pci_hotplug_rwsem);
+static struct task_struct *pci_hotplug_mutex_owner;
+static int pci_hotplug_mutex_recursive;
+
+/*
+ * trylock for writing -- returns 1 if successful, 0 if contention
+ */
+int pci_hotplug_try_enter(void)
+{
+	if (current != pci_hotplug_mutex_owner) {
+		if (down_write_trylock(&pci_hotplug_rwsem) == 0)
+			return 0;
+		pci_hotplug_mutex_owner = current;
+	}
+	pci_hotplug_mutex_recursive++;
+
+	return 1;
+}
+EXPORT_SYMBOL(pci_hotplug_try_enter);
+
+void pci_hotplug_enter(void)
+{
+	if (current != pci_hotplug_mutex_owner) {
+		down_write(&pci_hotplug_rwsem);
+		pci_hotplug_mutex_owner = current;
+	}
+	pci_hotplug_mutex_recursive++;
+
+}
+EXPORT_SYMBOL(pci_hotplug_enter);
+
+void pci_hotplug_exit(void)
+{
+	BUG_ON(pci_hotplug_mutex_owner != current);
+	if (--pci_hotplug_mutex_recursive == 0) {
+		pci_hotplug_mutex_owner = NULL;
+		up_write(&pci_hotplug_rwsem);
+	}
+}
+EXPORT_SYMBOL(pci_hotplug_exit);
+
+void pci_hotplug_enable(void)
+{
+	up_read(&pci_hotplug_rwsem);
+}
+EXPORT_SYMBOL(pci_hotplug_enable);
+
+void pci_hotplug_disable(void)
+{
+	down_read(&pci_hotplug_rwsem);
+}
+EXPORT_SYMBOL(pci_hotplug_disable);
+
 int pci_uevent(struct device *dev, struct kobj_uevent_env *env)
 {
 	struct pci_dev *pdev;
diff --git a/drivers/pci/hotplug/pci_hotplug_core.c b/drivers/pci/hotplug/pci_hotplug_core.c
index 202f4a9..1572665 100644
--- a/drivers/pci/hotplug/pci_hotplug_core.c
+++ b/drivers/pci/hotplug/pci_hotplug_core.c
@@ -537,7 +537,7 @@ int __must_check pci_hp_change_slot_info(struct hotplug_slot *hotplug,
 	return 0;
 }
 
-static int __init pci_hotplug_init (void)
+static int __init pci_hp_init(void)
 {
 	int result;
 
@@ -553,13 +553,13 @@ err_cpci:
 	return result;
 }
 
-static void __exit pci_hotplug_exit (void)
+static void __exit pci_hp_exit(void)
 {
 	cpci_hotplug_exit();
 }
 
-module_init(pci_hotplug_init);
-module_exit(pci_hotplug_exit);
+module_init(pci_hp_init);
+module_exit(pci_hp_exit);
 
 MODULE_AUTHOR(DRIVER_AUTHOR);
 MODULE_DESCRIPTION(DRIVER_DESC);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 0603a60..1c5f153 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -884,6 +884,20 @@ unsigned int pci_rescan_bus_bridge_resize(struct pci_dev *bridge);
 unsigned int pci_rescan_bus(struct pci_bus *bus);
 #endif
 
+#ifdef CONFIG_HOTPLUG
+extern int pci_hotplug_try_enter(void);
+extern void pci_hotplug_enter(void);
+extern void pci_hotplug_exit(void);
+extern void pci_hotplug_disable(void);
+extern void pci_hotplug_enable(void);
+#else
+static inline int pci_hotplug_try_enter(void) { return 1; }
+static inline void pci_hotplug_enter(void) {}
+static inline void pci_hotplug_exit(void) {}
+static inline void pci_hotplug_enable(void) {}
+static inline void pci_hotplug_disable(void) {}
+#endif
+
 /* Vital product data routines */
 ssize_t pci_read_vpd(struct pci_dev *dev, loff_t pos, size_t count, void *buf);
 ssize_t pci_write_vpd(struct pci_dev *dev, loff_t pos, size_t count, const void *buf);
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 03/19] PCI: replace pci_remove_rescan_mutex with the PCI hotplug lock
  2012-04-27 15:16 [PATCH v2 00/19] Introduce a global lock to serialize all PCI hotplug Jiang Liu
  2012-04-27 15:16 ` [PATCH v2 01/19] PCI: introduce pci_bus_get()/pci_bus_put() to hide PCI implementation details Jiang Liu
  2012-04-27 15:16 ` [PATCH v2 02/19] PCI: introduce recursive rwsem to serialize PCI hotplug operations Jiang Liu
@ 2012-04-27 15:16 ` Jiang Liu
  2012-04-27 15:16 ` [PATCH v2 04/19] PCI: serialize hotplug operations triggered by PCI hotplug sysfs interfaces Jiang Liu
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Jiang Liu @ 2012-04-27 15:16 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile, Greg KH
  Cc: Jiang Liu, Keping Chen, linux-kernel, linux-pci, Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

Replace pci_remove_rescan_mutex with the PCI hotplug lock, which is used to
globally serialize all PCI hotplug operations.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/pci-sysfs.c |  100 +++++++++++++++++++++++++++--------------------
 drivers/pci/remove.c    |    1 +
 2 files changed, 58 insertions(+), 43 deletions(-)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index a55e248..ecf197d 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -283,7 +283,31 @@ msi_bus_store(struct device *dev, struct device_attribute *attr,
 }
 
 #ifdef CONFIG_HOTPLUG
-static DEFINE_MUTEX(pci_remove_rescan_mutex);
+/*
+ * Schedule a device callback to avoid deadlock in case of
+ * 1) An attribute method tries to deregister the sysfs file itself
+ * 2) Another thread is trying to remove the sysfs file, which have
+ *    already held the PCI hotplug lock by invoking pci_hotplug_enter().
+ */
+static int schedule_hp_callback(struct device *dev,
+				const char *buf, size_t count,
+				void (*func)(struct device *))
+{
+	int ret = 0;
+	unsigned long val;
+
+	if (kstrtoul(buf, 0, &val) < 0)
+		return -EINVAL;
+
+	if (val) {
+		ret = device_schedule_callback(dev, func);
+		if (ret)
+			count = ret;
+	}
+
+	return count;
+}
+
 static ssize_t bus_rescan_store(struct bus_type *bus, const char *buf,
 				size_t count)
 {
@@ -294,10 +318,10 @@ static ssize_t bus_rescan_store(struct bus_type *bus, const char *buf,
 		return -EINVAL;
 
 	if (val) {
-		mutex_lock(&pci_remove_rescan_mutex);
+		pci_hotplug_enter();
 		while ((b = pci_find_next_bus(b)) != NULL)
 			pci_rescan_bus(b);
-		mutex_unlock(&pci_remove_rescan_mutex);
+		pci_hotplug_exit();
 	}
 	return count;
 }
@@ -307,72 +331,62 @@ struct bus_attribute pci_bus_attrs[] = {
 	__ATTR_NULL
 };
 
-static ssize_t
-dev_rescan_store(struct device *dev, struct device_attribute *attr,
-		 const char *buf, size_t count)
+static void dev_rescan_callback(struct device *dev)
 {
-	unsigned long val;
 	struct pci_dev *pdev = to_pci_dev(dev);
 
-	if (strict_strtoul(buf, 0, &val) < 0)
-		return -EINVAL;
-
-	if (val) {
-		mutex_lock(&pci_remove_rescan_mutex);
+	pci_hotplug_enter();
+	/* Skip if the device has been stopped. */
+	if (pdev->is_added && pdev->bus)
 		pci_rescan_bus(pdev->bus);
-		mutex_unlock(&pci_remove_rescan_mutex);
-	}
-	return count;
+	pci_hotplug_exit();
+}
+
+static ssize_t
+dev_rescan_store(struct device *dev, struct device_attribute *attr,
+		 const char *buf, size_t count)
+{
+	return schedule_hp_callback(dev, buf, count, dev_rescan_callback);
 }
 
 static void remove_callback(struct device *dev)
 {
 	struct pci_dev *pdev = to_pci_dev(dev);
 
-	mutex_lock(&pci_remove_rescan_mutex);
-	pci_stop_and_remove_bus_device(pdev);
-	mutex_unlock(&pci_remove_rescan_mutex);
+	pci_hotplug_enter();
+	/* Skip if the bus has been removed. */
+	if (pdev->bus_list.next)
+		pci_stop_and_remove_bus_device(pdev);
+	pci_hotplug_exit();
 }
 
 static ssize_t
 remove_store(struct device *dev, struct device_attribute *dummy,
 	     const char *buf, size_t count)
 {
-	int ret = 0;
-	unsigned long val;
-
-	if (strict_strtoul(buf, 0, &val) < 0)
-		return -EINVAL;
-
-	/* An attribute cannot be unregistered by one of its own methods,
-	 * so we have to use this roundabout approach.
-	 */
-	if (val)
-		ret = device_schedule_callback(dev, remove_callback);
-	if (ret)
-		count = ret;
-	return count;
+	return schedule_hp_callback(dev, buf, count, remove_callback);
 }
 
-static ssize_t
-dev_bus_rescan_store(struct device *dev, struct device_attribute *attr,
-		 const char *buf, size_t count)
+static void dev_bus_rescan_callback(struct device *dev)
 {
-	unsigned long val;
 	struct pci_bus *bus = to_pci_bus(dev);
 
-	if (strict_strtoul(buf, 0, &val) < 0)
-		return -EINVAL;
-
-	if (val) {
-		mutex_lock(&pci_remove_rescan_mutex);
+	pci_hotplug_enter();
+	/* Skip if the bus has been stopped. */
+	if (bus->is_added) {
 		if (!pci_is_root_bus(bus) && list_empty(&bus->devices))
 			pci_rescan_bus_bridge_resize(bus->self);
 		else
 			pci_rescan_bus(bus);
-		mutex_unlock(&pci_remove_rescan_mutex);
 	}
-	return count;
+	pci_hotplug_exit();
+}
+
+static ssize_t
+dev_bus_rescan_store(struct device *dev, struct device_attribute *attr,
+		 const char *buf, size_t count)
+{
+	return schedule_hp_callback(dev, buf, count, dev_bus_rescan_callback);
 }
 
 #endif
diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index fd77e2b..0b4342b 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -72,6 +72,7 @@ void pci_remove_bus(struct pci_bus *pci_bus)
 	if (!pci_bus->is_added)
 		return;
 
+	pci_bus->is_added = 0;
 	pci_remove_legacy_files(pci_bus);
 	device_unregister(&pci_bus->dev);
 }
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 04/19] PCI: serialize hotplug operations triggered by PCI hotplug sysfs interfaces
  2012-04-27 15:16 [PATCH v2 00/19] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (2 preceding siblings ...)
  2012-04-27 15:16 ` [PATCH v2 03/19] PCI: replace pci_remove_rescan_mutex with the PCI hotplug lock Jiang Liu
@ 2012-04-27 15:16 ` Jiang Liu
  2012-05-02  5:06   ` Greg KH
  2012-04-27 15:16 ` [PATCH v2 05/19] PCI: correctly flush workqueue when destroy pcie hotplug controller Jiang Liu
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 28+ messages in thread
From: Jiang Liu @ 2012-04-27 15:16 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile, Greg KH
  Cc: Jiang Liu, Keping Chen, linux-kernel, linux-pci, Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

Use PCI hotplug lock to globally serialize hotplug operations triggered by
PCI hotplug sysfs interfaces.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/hotplug/pci_hotplug_core.c |   18 ++++++++++++++++--
 1 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/hotplug/pci_hotplug_core.c b/drivers/pci/hotplug/pci_hotplug_core.c
index 1572665..9bbbe3e 100644
--- a/drivers/pci/hotplug/pci_hotplug_core.c
+++ b/drivers/pci/hotplug/pci_hotplug_core.c
@@ -39,6 +39,7 @@
 #include <linux/mutex.h>
 #include <linux/pci.h>
 #include <linux/pci_hotplug.h>
+#include <linux/delay.h>
 #include <asm/uaccess.h>
 #include "../pci.h"
 
@@ -121,6 +122,17 @@ static ssize_t power_write_file(struct pci_slot *pci_slot, const char *buf,
 		retval = -ENODEV;
 		goto exit;
 	}
+
+	/* Avoid deadlock with pci_hp_deregister() */
+	while (!pci_hotplug_try_enter()) {
+		/* Check whether the slot has been deregistered. */
+		if (list_empty(&slot->slot_list)) {
+			retval = -ENODEV;
+			goto exit_put;
+		}
+		msleep(1);
+	}
+
 	switch (power) {
 		case 0:
 			if (slot->ops->disable_slot)
@@ -136,8 +148,10 @@ static ssize_t power_write_file(struct pci_slot *pci_slot, const char *buf,
 			err ("Illegal value specified for power\n");
 			retval = -EINVAL;
 	}
-	module_put(slot->ops->owner);
 
+	pci_hotplug_exit();
+exit_put:
+	module_put(slot->ops->owner);
 exit:	
 	if (retval)
 		return retval;
@@ -500,7 +514,7 @@ int pci_hp_deregister(struct hotplug_slot *hotplug)
 		return -ENODEV;
 	}
 
-	list_del(&hotplug->slot_list);
+	list_del_init(&hotplug->slot_list);
 
 	slot = hotplug->pci_slot;
 	fs_remove_slot(slot);
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 05/19] PCI: correctly flush workqueue when destroy pcie hotplug controller
  2012-04-27 15:16 [PATCH v2 00/19] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (3 preceding siblings ...)
  2012-04-27 15:16 ` [PATCH v2 04/19] PCI: serialize hotplug operations triggered by PCI hotplug sysfs interfaces Jiang Liu
@ 2012-04-27 15:16 ` Jiang Liu
  2012-05-02  5:08   ` Greg KH
  2012-04-27 15:16 ` [PATCH v2 06/19] PCI: prepare for serializing hotplug operations triggered by pciehp driver Jiang Liu
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 28+ messages in thread
From: Jiang Liu @ 2012-04-27 15:16 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile, Greg KH
  Cc: Jiang Liu, Keping Chen, linux-kernel, linux-pci, Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

When destroying a PCIe hotplug controller, all work items associated with
that controller should be flushed.  Function pcie_cleanup_slot() calls
cancel_delayed_work() and flush_workqueue() to achieve that.
Function flush_workqueue() will flush all work items already submitted,
but new work items submitted by those already submitted work items may
still be in live state when returning from flush_workqueue().

For the extreme case, pciehp driver may expierence following calling path:
1) pcie_isr() -> pciehp_handle_xxx() -> queue_interrupt_event()->queue_work()
2) interrupt_event_handler() -> handle_button_press_event() ->
   queue_delayed_work()
3) pciehp_queue_pushbutton_work() -> queue_work()

So enhance pcie_cleanup_slot() to correctly flush workqueue when destroying
PCIe hotplug controller.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/hotplug/pciehp_hpc.c |   14 +++++++++++++-
 1 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index a960fae..98b775f 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -889,8 +889,20 @@ static int pcie_init_slot(struct controller *ctrl)
 static void pcie_cleanup_slot(struct controller *ctrl)
 {
 	struct slot *slot = ctrl->slot;
-	cancel_delayed_work(&slot->work);
+
+	/*
+	 * Following workqueue flushing logic is to deal with the special
+	 * call path:
+	 * 1) pcie_isr() -> pciehp_handle_xxx() ->
+	 *    queue_interrupt_event(pciehp_wq_event)->queue_work(pciehp_wq)
+	 * 2) interrupt_event_handler() -> handle_button_press_event() ->
+	 *    queue_delayed_work(pciehp_wq)
+	 * 3) pciehp_queue_pushbutton_work() -> queue_work(pciehp_wq)
+	 */
 	flush_workqueue(pciehp_wq);
+	cancel_delayed_work_sync(&slot->work);
+	flush_workqueue(pciehp_wq);
+
 	kfree(slot);
 }
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 06/19] PCI: prepare for serializing hotplug operations triggered by pciehp driver
  2012-04-27 15:16 [PATCH v2 00/19] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (4 preceding siblings ...)
  2012-04-27 15:16 ` [PATCH v2 05/19] PCI: correctly flush workqueue when destroy pcie hotplug controller Jiang Liu
@ 2012-04-27 15:16 ` Jiang Liu
  2012-05-02  5:10   ` Greg KH
  2012-04-27 15:16 ` [PATCH v2 07/19] PCI: serialize hotplug operaitons triggered by the " Jiang Liu
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 28+ messages in thread
From: Jiang Liu @ 2012-04-27 15:16 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile, Greg KH
  Cc: Jiang Liu, Keping Chen, linux-kernel, linux-pci, Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

Split pciehp_wq into two workqueues to avoid possible deadlock issues
when using PCI hotplug lock to serialize hotplug operations triggered
by pciehp driver.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/hotplug/pciehp.h      |    3 ++-
 drivers/pci/hotplug/pciehp_core.c |   18 +++++++++++++-----
 drivers/pci/hotplug/pciehp_ctrl.c |    8 ++++----
 drivers/pci/hotplug/pciehp_hpc.c  |   11 ++++++-----
 4 files changed, 25 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
index 4b7cce1..c8a1a27 100644
--- a/drivers/pci/hotplug/pciehp.h
+++ b/drivers/pci/hotplug/pciehp.h
@@ -44,7 +44,8 @@ extern bool pciehp_poll_mode;
 extern int pciehp_poll_time;
 extern bool pciehp_debug;
 extern bool pciehp_force;
-extern struct workqueue_struct *pciehp_wq;
+extern struct workqueue_struct *pciehp_wq_event;
+extern struct workqueue_struct *pciehp_wq_power;
 
 #define dbg(format, arg...)						\
 do {									\
diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c
index 365c6b9..4ceefe3 100644
--- a/drivers/pci/hotplug/pciehp_core.c
+++ b/drivers/pci/hotplug/pciehp_core.c
@@ -42,7 +42,8 @@ bool pciehp_debug;
 bool pciehp_poll_mode;
 int pciehp_poll_time;
 bool pciehp_force;
-struct workqueue_struct *pciehp_wq;
+struct workqueue_struct *pciehp_wq_event;
+struct workqueue_struct *pciehp_wq_power;
 
 #define DRIVER_VERSION	"0.4"
 #define DRIVER_AUTHOR	"Dan Zink <dan.zink@compaq.com>, Greg Kroah-Hartman <greg@kroah.com>, Dely Sy <dely.l.sy@intel.com>"
@@ -340,16 +341,22 @@ static int __init pcied_init(void)
 {
 	int retval = 0;
 
-	pciehp_wq = alloc_workqueue("pciehp", 0, 0);
-	if (!pciehp_wq)
+	pciehp_wq_event = alloc_workqueue("pciehp_event", 0, 0);
+	if (!pciehp_wq_event)
 		return -ENOMEM;
+	pciehp_wq_power = alloc_workqueue("pciehp_power", 0, 0);
+	if (!pciehp_wq_power) {
+		destroy_workqueue(pciehp_wq_event);
+		return -ENOMEM;
+	}
 
 	pciehp_firmware_init();
 	retval = pcie_port_service_register(&hpdriver_portdrv);
  	dbg("pcie_port_service_register = %d\n", retval);
   	info(DRIVER_DESC " version: " DRIVER_VERSION "\n");
  	if (retval) {
-		destroy_workqueue(pciehp_wq);
+		destroy_workqueue(pciehp_wq_event);
+		destroy_workqueue(pciehp_wq_power);
 		dbg("Failure to register service\n");
 	}
 	return retval;
@@ -359,7 +366,8 @@ static void __exit pcied_cleanup(void)
 {
 	dbg("unload_pciehpd()\n");
 	pcie_port_service_unregister(&hpdriver_portdrv);
-	destroy_workqueue(pciehp_wq);
+	destroy_workqueue(pciehp_wq_event);
+	destroy_workqueue(pciehp_wq_power);
 	info(DRIVER_DESC " version: " DRIVER_VERSION " unloaded\n");
 }
 
diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
index 27f4429..8f4d261 100644
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -49,7 +49,7 @@ static int queue_interrupt_event(struct slot *p_slot, u32 event_type)
 	info->p_slot = p_slot;
 	INIT_WORK(&info->work, interrupt_event_handler);
 
-	queue_work(pciehp_wq, &info->work);
+	queue_work(pciehp_wq_event, &info->work);
 
 	return 0;
 }
@@ -344,7 +344,7 @@ void pciehp_queue_pushbutton_work(struct work_struct *work)
 		kfree(info);
 		goto out;
 	}
-	queue_work(pciehp_wq, &info->work);
+	queue_work(pciehp_wq_power, &info->work);
  out:
 	mutex_unlock(&p_slot->lock);
 }
@@ -377,7 +377,7 @@ static void handle_button_press_event(struct slot *p_slot)
 		if (ATTN_LED(ctrl))
 			pciehp_set_attention_status(p_slot, 0);
 
-		queue_delayed_work(pciehp_wq, &p_slot->work, 5*HZ);
+		queue_delayed_work(pciehp_wq_event, &p_slot->work, 5*HZ);
 		break;
 	case BLINKINGOFF_STATE:
 	case BLINKINGON_STATE:
@@ -439,7 +439,7 @@ static void handle_surprise_event(struct slot *p_slot)
 	else
 		p_slot->state = POWERON_STATE;
 
-	queue_work(pciehp_wq, &info->work);
+	queue_work(pciehp_wq_power, &info->work);
 }
 
 static void interrupt_event_handler(struct work_struct *work)
diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index 98b775f..d5c826d 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -894,14 +894,15 @@ static void pcie_cleanup_slot(struct controller *ctrl)
 	 * Following workqueue flushing logic is to deal with the special
 	 * call path:
 	 * 1) pcie_isr() -> pciehp_handle_xxx() ->
-	 *    queue_interrupt_event(pciehp_wq_event)->queue_work(pciehp_wq)
+	 *    queue_interrupt_event(pciehp_wq_event)->
+	 *    queue_work(pciehp_wq_event)
 	 * 2) interrupt_event_handler() -> handle_button_press_event() ->
-	 *    queue_delayed_work(pciehp_wq)
-	 * 3) pciehp_queue_pushbutton_work() -> queue_work(pciehp_wq)
+	 *    queue_delayed_work(pciehp_wq_event)
+	 * 3) pciehp_queue_pushbutton_work() -> queue_work(pciehp_wq_power)
 	 */
-	flush_workqueue(pciehp_wq);
+	flush_workqueue(pciehp_wq_event);
 	cancel_delayed_work_sync(&slot->work);
-	flush_workqueue(pciehp_wq);
+	flush_workqueue(pciehp_wq_power);
 
 	kfree(slot);
 }
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 07/19] PCI: serialize hotplug operaitons triggered by the pciehp driver
  2012-04-27 15:16 [PATCH v2 00/19] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (5 preceding siblings ...)
  2012-04-27 15:16 ` [PATCH v2 06/19] PCI: prepare for serializing hotplug operations triggered by pciehp driver Jiang Liu
@ 2012-04-27 15:16 ` Jiang Liu
  2012-04-27 15:16 ` [PATCH v2 08/19] PCI: fix two race windows when probing/removing SHPC controller Jiang Liu
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Jiang Liu @ 2012-04-27 15:16 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile, Greg KH
  Cc: Jiang Liu, Keping Chen, linux-kernel, linux-pci, Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

Use PCI hotplug lock to serialize hotplug operations triggered by the
pciehp driver. It solves following crash issues.

test scripts:
[root]cat offline.sh
#!/bin/bash

for((i=0;i<100000;i++))
do
	`echo 0 > /sys/bus/pci/slots/16/power`
	`echo 1 > /sys/bus/pci/slots/16/power`
done

[root]cat remove.sh
#!/bin/bash

for((i=0;i<100000;i++))
do
	`echo 1 > /sys/bus/pci/devices/0000:0f:00.0/remove`
	`echo 1 > /sys/bus/pci/rescan`
done

--------------------------------------------
CPU 11
Modules linked in: pciehp pci_hotplug ipv6 cpufreq_conservative cpufreq_userspac

Pid: 8675, comm: offline.sh Not tainted 3.4.0-rc3-yijing+ #7 Huawei Technologies
CPU 11
Modules linked in: pciehp pci_hotplug ipv6 cpufreq_conservative cpufreq_userspac

Pid: 8675, comm: offline.sh Not tainted 3.4.0-rc3-yijing+ #7 Huawei Technologies
RIP: 0010:[<ffffffff81141f90>]  [<ffffffff81141f90>] sysfs_name_hash+0x17/0xaa
RSP: 0018:ffff880c1a487c60  EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff880c1c230c40 RCX: ffffffffffffffff
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff880c1a487c68 R08: 0000000000000000 R09: 0000000000000000
R10: ffffffff811bd401 R11: ffff880c1a4d5d40 R12: ffff880c1babd558
R13: 0000000000000000 R14: 0000000000000000 R15: ffff880c1a487dbf
FS:  00007f00722e36f0(0000) GS:ffff880c3fce0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000c1b594000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process offline.sh (pid: 8675, threadinfo ffff880c1a486000, task ffff880c1b84000
Stack:
 0000000000000000 ffff880c1a487ca8 ffffffff811421fc ffff880c1a487cd8
 ffff880c1c230c40 0000000000000000 0000000000000000 ffff880c1a487cb8
 ffff880c1a487dbf ffff880c1a487ce8 ffffffff81140e99 ffff880c1c230c40
Call Trace:
 [<ffffffff811421fc>] sysfs_find_dirent+0x78/0xd4
 [<ffffffff81140e99>] sysfs_hash_and_remove+0x5e/0x8e
 [<ffffffff81143ffc>] sysfs_remove_bin_file+0x12/0x14
 [<ffffffff811bd030>] pci_remove_resource_files+0x30/0x6b
 [<ffffffff811bd454>] pci_remove_sysfs_dev_files+0x9d/0x110
 [<ffffffff811b748e>] pci_stop_bus_device+0x52/0x80
 [<ffffffff811b766c>] pci_stop_and_remove_bus_device+0x11/0x21
 [<ffffffffa02164ae>] pciehp_unconfigure_device+0x102/0x166 [pciehp]
 [<ffffffffa0216934>] ? pciehp_get_power_status+0x33/0xc3 [pciehp]
 [<ffffffffa02157aa>] pciehp_disable_slot+0x106/0x163 [pciehp]
 [<ffffffffa0215866>] pciehp_sysfs_disable_slot+0x5f/0xee [pciehp]
 [<ffffffffa0215649>] disable_slot+0x52/0x56 [pciehp]
 [<ffffffffa01f640d>] power_write_file+0x8c/0xcd [pci_hotplug]
 [<ffffffff811bf10e>] pci_slot_attr_store+0x24/0x26
 [<ffffffff811417c5>] sysfs_write_file+0xdc/0x111
 [<ffffffff810eb55c>] vfs_write+0xae/0x151
 [<ffffffff810eb6c3>] sys_write+0x47/0x6e
 [<ffffffff8131fce2>] system_call_fastpath+0x16/0x1b
Code: 48 8b 47 78 55 48 89 e5 0f b7 40 60 c9 c1 e8 0d 83 e0 01 c3 55 49 89 f8 31
RIP  [<ffffffff81141f90>] sysfs_name_hash+0x17/0xaa
 RSP <ffff880c1a487c60>
CR2: 0000000000000000
---[ end trace af1f0f7871dbd9bc ]---

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/hotplug/pciehp.h      |    2 +
 drivers/pci/hotplug/pciehp_core.c |    7 ++++-
 drivers/pci/hotplug/pciehp_ctrl.c |   48 +++++++++++++++++++++++++++++++++++++
 drivers/pci/hotplug/pciehp_hpc.c  |    1 +
 4 files changed, 57 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
index c8a1a27..6078eea 100644
--- a/drivers/pci/hotplug/pciehp.h
+++ b/drivers/pci/hotplug/pciehp.h
@@ -117,6 +117,7 @@ struct controller {
 #define BLINKINGOFF_STATE		2
 #define POWERON_STATE			3
 #define POWEROFF_STATE			4
+#define SHUTDOWN_STATE			5
 
 #define ATTN_BUTTN(ctrl)	((ctrl)->slot_cap & PCI_EXP_SLTCAP_ABP)
 #define POWER_CTRL(ctrl)	((ctrl)->slot_cap & PCI_EXP_SLTCAP_PCP)
@@ -160,6 +161,7 @@ void pciehp_green_led_off(struct slot *slot);
 void pciehp_green_led_blink(struct slot *slot);
 int pciehp_check_link_status(struct controller *ctrl);
 void pciehp_release_ctrl(struct controller *ctrl);
+void pciehp_shutdown_slot(struct slot *slot);
 
 static inline const char *slot_name(struct slot *slot)
 {
diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c
index 4ceefe3..2195f67 100644
--- a/drivers/pci/hotplug/pciehp_core.c
+++ b/drivers/pci/hotplug/pciehp_core.c
@@ -268,8 +268,11 @@ static int pciehp_probe(struct pcie_device *dev)
 	slot = ctrl->slot;
 	pciehp_get_adapter_status(slot, &occupied);
 	pciehp_get_power_status(slot, &poweron);
-	if (occupied && pciehp_force)
+	if (occupied && pciehp_force) {
+		pci_hotplug_enter();
 		pciehp_enable_slot(slot);
+		pci_hotplug_exit();
+	}
 	/* If empty slot's power status is on, turn power off */
 	if (!occupied && poweron && POWER_CTRL(ctrl))
 		pciehp_power_off_slot(slot);
@@ -313,11 +316,13 @@ static int pciehp_resume (struct pcie_device *dev)
 		slot = ctrl->slot;
 
 		/* Check if slot is occupied */
+		pci_hotplug_enter();
 		pciehp_get_adapter_status(slot, &status);
 		if (status)
 			pciehp_enable_slot(slot);
 		else
 			pciehp_disable_slot(slot);
+		pci_hotplug_exit();
 	}
 	return 0;
 }
diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
index 8f4d261..b83f2cc 100644
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -31,6 +31,7 @@
 #include <linux/kernel.h>
 #include <linux/types.h>
 #include <linux/slab.h>
+#include <linux/delay.h>
 #include <linux/pci.h>
 #include "../pci.h"
 #include "pciehp.h"
@@ -290,6 +291,30 @@ static void pciehp_power_thread(struct work_struct *work)
 	struct power_work_info *info =
 		container_of(work, struct power_work_info, work);
 	struct slot *p_slot = info->p_slot;
+	bool shutdown;
+
+	/*
+	 * Break out deadlock issues caused by following scenario:
+	 * Thread A:
+	 *	1) acquire the PCI hotplug lock
+	 *	2) remove the PCI device associated with this PCIe HPC
+	 *	3) call pciehp_remove() for this PCIe HPC
+	 *	4) call flush_workqueue(pciehp_wq_power) to flush queued works
+	 *      5) wait until all queued works done
+	 * Thread B is a workqueue worker thread:
+	 *	1) call pciehp_power_thread() to handle hotplug requests
+	 *	2) try to acquire the PCI hotplug lock
+	 * Please refer to pciehp_shutdown_slot() for the counterpart.
+	 */
+	while (!pci_hotplug_try_enter()) {
+		mutex_lock(&p_slot->lock);
+		shutdown = p_slot->state == SHUTDOWN_STATE;
+		mutex_unlock(&p_slot->lock);
+		if (shutdown)
+			goto out;
+		else
+			mdelay(1);
+	}
 
 	mutex_lock(&p_slot->lock);
 	switch (p_slot->state) {
@@ -315,6 +340,9 @@ static void pciehp_power_thread(struct work_struct *work)
 	}
 	mutex_unlock(&p_slot->lock);
 
+	pci_hotplug_exit();
+
+out:
 	kfree(info);
 }
 
@@ -623,3 +651,23 @@ int pciehp_sysfs_disable_slot(struct slot *p_slot)
 
 	return retval;
 }
+
+void pciehp_shutdown_slot(struct slot *slot)
+{
+	u8 getstatus;
+	struct controller *ctrl = slot->ctrl;
+
+	mutex_lock(&slot->lock);
+	slot->state = SHUTDOWN_STATE;
+	mutex_unlock(&slot->lock);
+
+	if (ATTN_LED(ctrl))
+		pciehp_set_attention_status(slot, 0);
+	if (PWR_LED(ctrl)) {
+		pciehp_get_power_status(slot, &getstatus);
+		if (getstatus)
+			pciehp_green_led_on(slot);
+		else
+			pciehp_green_led_off(slot);
+	}
+}
diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index d5c826d..c492b2c 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -902,6 +902,7 @@ static void pcie_cleanup_slot(struct controller *ctrl)
 	 */
 	flush_workqueue(pciehp_wq_event);
 	cancel_delayed_work_sync(&slot->work);
+	pciehp_shutdown_slot(slot);
 	flush_workqueue(pciehp_wq_power);
 
 	kfree(slot);
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 08/19] PCI: fix two race windows when probing/removing SHPC controller
  2012-04-27 15:16 [PATCH v2 00/19] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (6 preceding siblings ...)
  2012-04-27 15:16 ` [PATCH v2 07/19] PCI: serialize hotplug operaitons triggered by the " Jiang Liu
@ 2012-04-27 15:16 ` Jiang Liu
  2012-04-27 15:16 ` [PATCH v2 09/19] PCI: correctly flush workqueues and timer when destroy " Jiang Liu
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Jiang Liu @ 2012-04-27 15:16 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile, Greg KH
  Cc: Jiang Liu, Keping Chen, linux-kernel, linux-pci, Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

With current shpchp implementation, interrupt is enabled before corresponding
slot data structures are created. If interrupt happens immediately after
enabling the hotplug interrupt, shpchp_find_slot() may return NULL, thus
causes NULL pointer dereference. So create slot data structures before
enabling interrupt.

And cleanup_slots() is called to destroy slot data structures before disabling
interrupt, which may cause invalid memory access in shpc_isr().  So call
cleanup_slots() after disabling interrupt/removing the poll timer.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/hotplug/shpchp.h      |    1 +
 drivers/pci/hotplug/shpchp_core.c |    6 +++---
 drivers/pci/hotplug/shpchp_hpc.c  |   36 +++++++++++++++++++++---------------
 3 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/drivers/pci/hotplug/shpchp.h b/drivers/pci/hotplug/shpchp.h
index ca64932..6691dc4 100644
--- a/drivers/pci/hotplug/shpchp.h
+++ b/drivers/pci/hotplug/shpchp.h
@@ -182,6 +182,7 @@ extern int shpchp_unconfigure_device(struct slot *p_slot);
 extern void cleanup_slots(struct controller *ctrl);
 extern void shpchp_queue_pushbutton_work(struct work_struct *work);
 extern int shpc_init( struct controller *ctrl, struct pci_dev *pdev);
+extern void shpc_enable_intr(struct controller *ctrl);
 
 static inline const char *slot_name(struct slot *slot)
 {
diff --git a/drivers/pci/hotplug/shpchp_core.c b/drivers/pci/hotplug/shpchp_core.c
index 7414fd9..aaa66be 100644
--- a/drivers/pci/hotplug/shpchp_core.c
+++ b/drivers/pci/hotplug/shpchp_core.c
@@ -318,14 +318,14 @@ static int shpc_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto err_out_release_ctlr;
 	}
 
+	shpc_enable_intr(ctrl);
+
 	rc = shpchp_create_ctrl_files(ctrl);
 	if (rc)
-		goto err_cleanup_slots;
+		goto err_out_release_ctlr;
 
 	return 0;
 
-err_cleanup_slots:
-	cleanup_slots(ctrl);
 err_out_release_ctlr:
 	ctrl->hpc_ops->release_ctlr(ctrl);
 err_out_free_ctrl:
diff --git a/drivers/pci/hotplug/shpchp_hpc.c b/drivers/pci/hotplug/shpchp_hpc.c
index 75ba231..2bc6182 100644
--- a/drivers/pci/hotplug/shpchp_hpc.c
+++ b/drivers/pci/hotplug/shpchp_hpc.c
@@ -592,6 +592,13 @@ static void hpc_release_ctlr(struct controller *ctrl)
 		shpc_writel(ctrl, SLOT_REG(i), slot_reg);
 	}
 
+	if (shpchp_poll_mode)
+		del_timer(&ctrl->poll_timer);
+	else {
+		free_irq(ctrl->pci_dev->irq, ctrl);
+		pci_disable_msi(ctrl->pci_dev);
+	}
+
 	cleanup_slots(ctrl);
 
 	/*
@@ -603,13 +610,6 @@ static void hpc_release_ctlr(struct controller *ctrl)
 	serr_int &= ~SERR_INTR_RSVDZ_MASK;
 	shpc_writel(ctrl, SERR_INTR_ENABLE, serr_int);
 
-	if (shpchp_poll_mode)
-		del_timer(&ctrl->poll_timer);
-	else {
-		free_irq(ctrl->pci_dev->irq, ctrl);
-		pci_disable_msi(ctrl->pci_dev);
-	}
-
 	iounmap(ctrl->creg);
 	release_mem_region(ctrl->mmio_base, ctrl->mmio_size);
 }
@@ -1081,6 +1081,20 @@ int shpc_init(struct controller *ctrl, struct pci_dev *pdev)
 	shpc_get_max_bus_speed(ctrl);
 	shpc_get_cur_bus_speed(ctrl);
 
+	return 0;
+
+	/* We end up here for the many possible ways to fail this API.  */
+abort_iounmap:
+	iounmap(ctrl->creg);
+abort:
+	return rc;
+}
+
+void shpc_enable_intr(struct controller *ctrl)
+{
+	u8 hp_slot;
+	u32 tempdword, slot_reg;
+
 	/*
 	 * Unmask all event interrupts of all slots
 	 */
@@ -1102,12 +1116,4 @@ int shpc_init(struct controller *ctrl, struct pci_dev *pdev)
 		tempdword = shpc_readl(ctrl, SERR_INTR_ENABLE);
 		ctrl_dbg(ctrl, "SERR_INTR_ENABLE = %x\n", tempdword);
 	}
-
-	return 0;
-
-	/* We end up here for the many possible ways to fail this API.  */
-abort_iounmap:
-	iounmap(ctrl->creg);
-abort:
-	return rc;
 }
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 09/19] PCI: correctly flush workqueues and timer when destroy SHPC controller
  2012-04-27 15:16 [PATCH v2 00/19] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (7 preceding siblings ...)
  2012-04-27 15:16 ` [PATCH v2 08/19] PCI: fix two race windows when probing/removing SHPC controller Jiang Liu
@ 2012-04-27 15:16 ` Jiang Liu
  2012-04-27 15:16 ` [PATCH v2 10/19] PCI: serialize hotplug operaitons triggered by the shpchp driver Jiang Liu
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Jiang Liu @ 2012-04-27 15:16 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile, Greg KH
  Cc: Jiang Liu, Keping Chen, linux-kernel, linux-pci, Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

del_timer() only deactivates a timer but doesn't wait for the handler
to finish, so use del_timer_sync() to deactivate a timer and wait for
the handler to finish in hpc_release_ctrl().

This patch also tune the workqueue flush logic to correctly flush all
work items.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/hotplug/shpchp_core.c |    2 +-
 drivers/pci/hotplug/shpchp_hpc.c  |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/hotplug/shpchp_core.c b/drivers/pci/hotplug/shpchp_core.c
index aaa66be..19762b3 100644
--- a/drivers/pci/hotplug/shpchp_core.c
+++ b/drivers/pci/hotplug/shpchp_core.c
@@ -173,8 +173,8 @@ void cleanup_slots(struct controller *ctrl)
 	list_for_each_safe(tmp, next, &ctrl->slot_list) {
 		slot = list_entry(tmp, struct slot, slot_list);
 		list_del(&slot->slot_list);
-		cancel_delayed_work(&slot->work);
 		flush_workqueue(shpchp_wq);
+		cancel_delayed_work_sync(&slot->work);
 		flush_workqueue(shpchp_ordered_wq);
 		pci_hp_deregister(slot->hotplug_slot);
 	}
diff --git a/drivers/pci/hotplug/shpchp_hpc.c b/drivers/pci/hotplug/shpchp_hpc.c
index 2bc6182..ff63887 100644
--- a/drivers/pci/hotplug/shpchp_hpc.c
+++ b/drivers/pci/hotplug/shpchp_hpc.c
@@ -593,7 +593,7 @@ static void hpc_release_ctlr(struct controller *ctrl)
 	}
 
 	if (shpchp_poll_mode)
-		del_timer(&ctrl->poll_timer);
+		del_timer_sync(&ctrl->poll_timer);
 	else {
 		free_irq(ctrl->pci_dev->irq, ctrl);
 		pci_disable_msi(ctrl->pci_dev);
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 10/19] PCI: serialize hotplug operaitons triggered by the shpchp driver
  2012-04-27 15:16 [PATCH v2 00/19] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (8 preceding siblings ...)
  2012-04-27 15:16 ` [PATCH v2 09/19] PCI: correctly flush workqueues and timer when destroy " Jiang Liu
@ 2012-04-27 15:16 ` Jiang Liu
  2012-04-27 15:16 ` [PATCH v2 11/19] PCI: release IO resource in error handling path in cpcihp_generic_init() Jiang Liu
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Jiang Liu @ 2012-04-27 15:16 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile, Greg KH
  Cc: Jiang Liu, Keping Chen, linux-kernel, linux-pci, Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

Use PCI hotplug lock to serialize hotplug operations triggered by the
shpchp driver.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/hotplug/shpchp.h      |    2 ++
 drivers/pci/hotplug/shpchp_core.c |    3 ++-
 drivers/pci/hotplug/shpchp_ctrl.c |   32 ++++++++++++++++++++++++++++++++
 3 files changed, 36 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/hotplug/shpchp.h b/drivers/pci/hotplug/shpchp.h
index 6691dc4..07f3b2d 100644
--- a/drivers/pci/hotplug/shpchp.h
+++ b/drivers/pci/hotplug/shpchp.h
@@ -157,6 +157,7 @@ struct controller {
 #define BLINKINGOFF_STATE		2
 #define POWERON_STATE			3
 #define POWEROFF_STATE			4
+#define SHUTDOWN_STATE			5
 
 /* Error messages */
 #define INTERLOCK_OPEN			0x00000002
@@ -179,6 +180,7 @@ extern u8 shpchp_handle_presence_change(u8 hp_slot, struct controller *ctrl);
 extern u8 shpchp_handle_power_fault(u8 hp_slot, struct controller *ctrl);
 extern int shpchp_configure_device(struct slot *p_slot);
 extern int shpchp_unconfigure_device(struct slot *p_slot);
+extern void shpchp_shutdown_slot(struct slot *slot);
 extern void cleanup_slots(struct controller *ctrl);
 extern void shpchp_queue_pushbutton_work(struct work_struct *work);
 extern int shpc_init( struct controller *ctrl, struct pci_dev *pdev);
diff --git a/drivers/pci/hotplug/shpchp_core.c b/drivers/pci/hotplug/shpchp_core.c
index 19762b3..71cc5f2 100644
--- a/drivers/pci/hotplug/shpchp_core.c
+++ b/drivers/pci/hotplug/shpchp_core.c
@@ -172,10 +172,11 @@ void cleanup_slots(struct controller *ctrl)
 
 	list_for_each_safe(tmp, next, &ctrl->slot_list) {
 		slot = list_entry(tmp, struct slot, slot_list);
-		list_del(&slot->slot_list);
 		flush_workqueue(shpchp_wq);
 		cancel_delayed_work_sync(&slot->work);
+		shpchp_shutdown_slot(slot);
 		flush_workqueue(shpchp_ordered_wq);
+		list_del(&slot->slot_list);
 		pci_hp_deregister(slot->hotplug_slot);
 	}
 }
diff --git a/drivers/pci/hotplug/shpchp_ctrl.c b/drivers/pci/hotplug/shpchp_ctrl.c
index b00b09b..7b8e3a6 100644
--- a/drivers/pci/hotplug/shpchp_ctrl.c
+++ b/drivers/pci/hotplug/shpchp_ctrl.c
@@ -31,6 +31,7 @@
 #include <linux/kernel.h>
 #include <linux/types.h>
 #include <linux/slab.h>
+#include <linux/delay.h>
 #include <linux/pci.h>
 #include "../pci.h"
 #include "shpchp.h"
@@ -406,6 +407,18 @@ static void shpchp_pushbutton_thread(struct work_struct *work)
 	struct pushbutton_work_info *info =
 		container_of(work, struct pushbutton_work_info, work);
 	struct slot *p_slot = info->p_slot;
+	bool shutdown;
+
+	/* Break possible deadlock by using pci_hotplug_try_enter() */
+	while (!pci_hotplug_try_enter()) {
+		mutex_lock(&p_slot->lock);
+		shutdown = p_slot->state == SHUTDOWN_STATE;
+		mutex_unlock(&p_slot->lock);
+		if (shutdown)
+			goto out;
+		else
+			mdelay(1);
+	}
 
 	mutex_lock(&p_slot->lock);
 	switch (p_slot->state) {
@@ -427,6 +440,9 @@ static void shpchp_pushbutton_thread(struct work_struct *work)
 	}
 	mutex_unlock(&p_slot->lock);
 
+	pci_hotplug_exit();
+
+out:
 	kfree(info);
 }
 
@@ -729,3 +745,19 @@ int shpchp_sysfs_disable_slot(struct slot *p_slot)
 
 	return retval;
 }
+
+void shpchp_shutdown_slot(struct slot *slot)
+{
+	u8 getstatus;
+
+	mutex_lock(&slot->lock);
+	slot->state = SHUTDOWN_STATE;
+	mutex_unlock(&slot->lock);
+
+	slot->hpc_ops->set_attention_status(slot, 0);
+	slot->hpc_ops->get_power_status(slot, &getstatus);
+	if (getstatus)
+		slot->hpc_ops->green_led_on(slot);
+	else
+		slot->hpc_ops->green_led_off(slot);
+}
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 11/19] PCI: release IO resource in error handling path in cpcihp_generic_init()
  2012-04-27 15:16 [PATCH v2 00/19] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (9 preceding siblings ...)
  2012-04-27 15:16 ` [PATCH v2 10/19] PCI: serialize hotplug operaitons triggered by the shpchp driver Jiang Liu
@ 2012-04-27 15:16 ` Jiang Liu
  2012-04-27 15:16 ` [PATCH v2 12/19] PCI: clean up all resources in error handling path in zt5550_hc_init_one() Jiang Liu
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Jiang Liu @ 2012-04-27 15:16 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile, Greg KH
  Cc: Jiang Liu, Keping Chen, linux-kernel, linux-pci, Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

Release IO resource in error handling path in function cpcihp_generic_init().

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/hotplug/cpcihp_generic.c |   25 ++++++++++++++++---------
 1 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/hotplug/cpcihp_generic.c b/drivers/pci/hotplug/cpcihp_generic.c
index 81af764..518f387 100644
--- a/drivers/pci/hotplug/cpcihp_generic.c
+++ b/drivers/pci/hotplug/cpcihp_generic.c
@@ -157,16 +157,19 @@ static int __init cpcihp_generic_init(void)
 	bus = pci_find_bus(0, bridge_busnr);
 	if (!bus) {
 		err("Invalid bus number %d", bridge_busnr);
-		return -EINVAL;
-	}
-	dev = pci_get_slot(bus, PCI_DEVFN(bridge_slot, 0));
-	if(!dev || dev->hdr_type != PCI_HEADER_TYPE_BRIDGE) {
-		err("Invalid bridge device %s", bridge);
+	} else {
+		dev = pci_get_slot(bus, PCI_DEVFN(bridge_slot, 0));
+		if (!dev || dev->hdr_type != PCI_HEADER_TYPE_BRIDGE) {
+			err("Invalid bridge device %s", bridge);
+			bus = NULL;
+		} else
+			bus = dev->subordinate;
 		pci_dev_put(dev);
-		return -EINVAL;
 	}
-	bus = dev->subordinate;
-	pci_dev_put(dev);
+	if (!bus) {
+		status = -EINVAL;
+		goto init_find_bus_error;
+	}
 
 	memset(&generic_hpc, 0, sizeof (struct cpci_hp_controller));
 	generic_hpc_ops.query_enum = query_enum;
@@ -175,7 +178,8 @@ static int __init cpcihp_generic_init(void)
 	status = cpci_hp_register_controller(&generic_hpc);
 	if(status != 0) {
 		err("Could not register cPCI hotplug controller");
-		return -ENODEV;
+		status = -ENODEV;
+		goto init_find_bus_error;
 	}
 	dbg("registered controller");
 
@@ -193,10 +197,13 @@ static int __init cpcihp_generic_init(void)
 	}
 	dbg("started cpci hp system");
 	return 0;
+
 init_start_error:
 	cpci_hp_unregister_bus(bus);
 init_bus_register_error:
 	cpci_hp_unregister_controller(&generic_hpc);
+init_find_bus_error:
+	release_region(port, 1);
 	err("status = %d", status);
 	return status;
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 12/19] PCI: clean up all resources in error handling path in zt5550_hc_init_one()
  2012-04-27 15:16 [PATCH v2 00/19] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (10 preceding siblings ...)
  2012-04-27 15:16 ` [PATCH v2 11/19] PCI: release IO resource in error handling path in cpcihp_generic_init() Jiang Liu
@ 2012-04-27 15:16 ` Jiang Liu
  2012-04-27 15:16 ` [PATCH v2 13/19] PCI: trivial code clean up in cpci_hotplug_core.c Jiang Liu
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Jiang Liu @ 2012-04-27 15:16 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile, Greg KH
  Cc: Jiang Liu, Keping Chen, linux-kernel, linux-pci, Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

Clean up all resources in error handling path in function zt5550_hc_init_one().

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/hotplug/cpcihp_zt5550.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/hotplug/cpcihp_zt5550.c b/drivers/pci/hotplug/cpcihp_zt5550.c
index 6bf8d2a..8a6f968 100644
--- a/drivers/pci/hotplug/cpcihp_zt5550.c
+++ b/drivers/pci/hotplug/cpcihp_zt5550.c
@@ -257,11 +257,13 @@ static int zt5550_hc_init_one (struct pci_dev *pdev, const struct pci_device_id
 	if(status != 0) {
 		err("could not started cPCI hotplug system");
 		cpci_hp_unregister_bus(bus0);
-		goto init_register_error;
+		goto init_start_error;
 	}
 	dbg("started cpci hp system");
 
 	return 0;
+init_start_error:
+	cpci_hp_unregister_bus(bus0);
 init_register_error:
 	cpci_hp_unregister_controller(&zt5550_hpc);
 init_hc_error:
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 13/19] PCI: trivial code clean up in cpci_hotplug_core.c
  2012-04-27 15:16 [PATCH v2 00/19] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (11 preceding siblings ...)
  2012-04-27 15:16 ` [PATCH v2 12/19] PCI: clean up all resources in error handling path in zt5550_hc_init_one() Jiang Liu
@ 2012-04-27 15:16 ` Jiang Liu
  2012-04-27 15:16 ` [PATCH v2 14/19] PCI: fix race windows when shutting down cpcihp controller Jiang Liu
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Jiang Liu @ 2012-04-27 15:16 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile, Greg KH
  Cc: Jiang Liu, Keping Chen, linux-kernel, linux-pci, Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

1) get rid of redundant lock operations in cpcihp_core.
2) return suitable error code instead of -1.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/hotplug/cpci_hotplug_core.c |   16 +++-------------
 1 files changed, 3 insertions(+), 13 deletions(-)

diff --git a/drivers/pci/hotplug/cpci_hotplug_core.c b/drivers/pci/hotplug/cpci_hotplug_core.c
index 3fadf2f..7898023 100644
--- a/drivers/pci/hotplug/cpci_hotplug_core.c
+++ b/drivers/pci/hotplug/cpci_hotplug_core.c
@@ -304,7 +304,7 @@ cpci_hp_unregister_bus(struct pci_bus *bus)
 	down_write(&list_rwsem);
 	if (!slots) {
 		up_write(&list_rwsem);
-		return -1;
+		return -ENODEV;
 	}
 	list_for_each_entry_safe(slot, tmp, &slot_list, slot_list) {
 		if (slot->bus == bus) {
@@ -357,11 +357,6 @@ init_slots(int clear_ins)
 	struct pci_dev* dev;
 
 	dbg("%s - enter", __func__);
-	down_read(&list_rwsem);
-	if (!slots) {
-		up_read(&list_rwsem);
-		return -1;
-	}
 	list_for_each_entry(slot, &slot_list, slot_list) {
 		dbg("%s - looking at slot %s", __func__, slot_name(slot));
 		if (clear_ins && cpci_check_and_clear_ins(slot))
@@ -376,7 +371,6 @@ init_slots(int clear_ins)
 			slot->dev = dev;
 		}
 	}
-	up_read(&list_rwsem);
 	dbg("%s - exit", __func__);
 	return 0;
 }
@@ -585,7 +579,7 @@ cpci_hp_register_controller(struct cpci_hp_controller *new_controller)
 	int status = 0;
 
 	if (controller)
-		return -1;
+		return -EBUSY;
 	if (!(new_controller && new_controller->ops))
 		return -EINVAL;
 	if (new_controller->irq) {
@@ -620,15 +614,11 @@ cleanup_slots(void)
 	 * and free up all memory that we had allocated.
 	 */
 	down_write(&list_rwsem);
-	if (!slots)
-		goto cleanup_null;
 	list_for_each_entry_safe(slot, tmp, &slot_list, slot_list) {
 		list_del(&slot->slot_list);
 		pci_hp_deregister(slot->hotplug_slot);
 	}
-cleanup_null:
 	up_write(&list_rwsem);
-	return;
 }
 
 int
@@ -663,9 +653,9 @@ cpci_hp_start(void)
 		up_read(&list_rwsem);
 		return -ENODEV;
 	}
+	status = init_slots(first);
 	up_read(&list_rwsem);
 
-	status = init_slots(first);
 	if (first)
 		first = 0;
 	if (status)
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 14/19] PCI: fix race windows when shutting down cpcihp controller
  2012-04-27 15:16 [PATCH v2 00/19] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (12 preceding siblings ...)
  2012-04-27 15:16 ` [PATCH v2 13/19] PCI: trivial code clean up in cpci_hotplug_core.c Jiang Liu
@ 2012-04-27 15:16 ` Jiang Liu
  2012-04-27 15:16 ` [PATCH v2 15/19] PCI: hold a reference count to the PCI bus used by cpcihp drivers Jiang Liu
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Jiang Liu @ 2012-04-27 15:16 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile, Greg KH
  Cc: Jiang Liu, Keping Chen, linux-kernel, linux-pci, Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

When cpci_hp_stop() is called to disabled cpcihp controller, it will disable
interrupt for that controller. But there's small window for event_thread()
to reenable the interrupt again. So stop the worker thread before disabling
the interrupt.

If check_slots() returns error, ther working thread (cpci_thread) will exit.
Later when cpci_stop_thread() or cpci_hp_intr() tries to access cpci_thread,
it may have already been destroyed. So hold a reference count to cpci_thread
to avoid invalid memory access.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/hotplug/cpci_hotplug_core.c |   27 +++++++++++++++++----------
 1 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/drivers/pci/hotplug/cpci_hotplug_core.c b/drivers/pci/hotplug/cpci_hotplug_core.c
index 7898023..68e43c7 100644
--- a/drivers/pci/hotplug/cpci_hotplug_core.c
+++ b/drivers/pci/hotplug/cpci_hotplug_core.c
@@ -60,7 +60,6 @@ static atomic_t extracting;
 int cpci_debug;
 static struct cpci_hp_controller *controller;
 static struct task_struct *cpci_thread;
-static int thread_finished;
 
 static int enable_slot(struct hotplug_slot *slot);
 static int disable_slot(struct hotplug_slot *slot);
@@ -341,7 +340,8 @@ cpci_hp_intr(int irq, void *data)
 	controller->ops->disable_irq();
 
 	/* Trigger processing by the event thread */
-	wake_up_process(cpci_thread);
+	if (cpci_thread)
+		wake_up_process(cpci_thread);
 	return IRQ_HANDLED;
 }
 
@@ -508,7 +508,6 @@ event_thread(void *data)
 				msleep(500);
 			} else if (rc < 0) {
 				dbg("%s - error checking slots", __func__);
-				thread_finished = 1;
 				goto out;
 			}
 		} while (atomic_read(&extracting) && !kthread_should_stop());
@@ -540,7 +539,6 @@ poll_thread(void *data)
 					msleep(500);
 				} else if (rc < 0) {
 					dbg("%s - error checking slots", __func__);
-					thread_finished = 1;
 					goto out;
 				}
 			} while (atomic_read(&extracting) && !kthread_should_stop());
@@ -562,15 +560,24 @@ cpci_start_thread(void)
 		err("Can't start up our thread");
 		return PTR_ERR(cpci_thread);
 	}
-	thread_finished = 0;
+	get_task_struct(cpci_thread);
+
 	return 0;
 }
 
 static void
 cpci_stop_thread(void)
 {
-	kthread_stop(cpci_thread);
-	thread_finished = 1;
+	struct task_struct *tp;
+
+	if (cpci_thread) {
+		local_irq_disable();
+		tp = cpci_thread;
+		cpci_thread = NULL;
+		local_irq_enable();
+		kthread_stop(tp);
+		put_task_struct(tp);
+	}
 }
 
 int
@@ -627,8 +634,7 @@ cpci_hp_unregister_controller(struct cpci_hp_controller *old_controller)
 	int status = 0;
 
 	if (controller) {
-		if (!thread_finished)
-			cpci_stop_thread();
+		cpci_stop_thread();
 		if (controller->irq)
 			free_irq(controller->irq, controller->dev_id);
 		controller = NULL;
@@ -680,12 +686,13 @@ cpci_hp_stop(void)
 {
 	if (!controller)
 		return -ENODEV;
+	cpci_stop_thread();
 	if (controller->irq) {
 		/* Stop enum interrupt processing */
 		dbg("%s - disabling irq", __func__);
 		controller->ops->disable_irq();
+		synchronize_irq(controller->irq);
 	}
-	cpci_stop_thread();
 	return 0;
 }
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 15/19] PCI: hold a reference count to the PCI bus used by cpcihp drivers
  2012-04-27 15:16 [PATCH v2 00/19] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (13 preceding siblings ...)
  2012-04-27 15:16 ` [PATCH v2 14/19] PCI: fix race windows when shutting down cpcihp controller Jiang Liu
@ 2012-04-27 15:16 ` Jiang Liu
  2012-04-27 15:16 ` [PATCH v2 16/19] PCI: serialize PCI hotplug operations triggered " Jiang Liu
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Jiang Liu @ 2012-04-27 15:16 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile, Greg KH
  Cc: Jiang Liu, Keping Chen, linux-kernel, linux-pci, Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

The PCI bus used by cpcihp drivers may be removed at runtime through
/sys/devices/pcissss:bb/ssss\:bb\:dd.f/.../remove interface, so hold
a reference count to the PCI bus to avoid invalid memory access.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/hotplug/cpcihp_generic.c |    7 +++++--
 drivers/pci/hotplug/cpcihp_zt5550.c  |    7 +++++--
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/hotplug/cpcihp_generic.c b/drivers/pci/hotplug/cpcihp_generic.c
index 518f387..f002be5 100644
--- a/drivers/pci/hotplug/cpcihp_generic.c
+++ b/drivers/pci/hotplug/cpcihp_generic.c
@@ -163,7 +163,7 @@ static int __init cpcihp_generic_init(void)
 			err("Invalid bridge device %s", bridge);
 			bus = NULL;
 		} else
-			bus = dev->subordinate;
+			bus = pci_bus_get(dev->subordinate);
 		pci_dev_put(dev);
 	}
 	if (!bus) {
@@ -179,7 +179,7 @@ static int __init cpcihp_generic_init(void)
 	if(status != 0) {
 		err("Could not register cPCI hotplug controller");
 		status = -ENODEV;
-		goto init_find_bus_error;
+		goto init_register_controller_error;
 	}
 	dbg("registered controller");
 
@@ -202,6 +202,8 @@ init_start_error:
 	cpci_hp_unregister_bus(bus);
 init_bus_register_error:
 	cpci_hp_unregister_controller(&generic_hpc);
+init_register_controller_error:
+	pci_bus_put(bus);
 init_find_bus_error:
 	release_region(port, 1);
 	err("status = %d", status);
@@ -214,6 +216,7 @@ static void __exit cpcihp_generic_exit(void)
 	cpci_hp_stop();
 	cpci_hp_unregister_bus(bus);
 	cpci_hp_unregister_controller(&generic_hpc);
+	pci_bus_put(bus);
 	release_region(port, 1);
 }
 
diff --git a/drivers/pci/hotplug/cpcihp_zt5550.c b/drivers/pci/hotplug/cpcihp_zt5550.c
index 8a6f968..6606520 100644
--- a/drivers/pci/hotplug/cpcihp_zt5550.c
+++ b/drivers/pci/hotplug/cpcihp_zt5550.c
@@ -241,9 +241,9 @@ static int zt5550_hc_init_one (struct pci_dev *pdev, const struct pci_device_id
 	if(!(bus0_dev = pci_get_device(PCI_VENDOR_ID_DEC,
 					 PCI_DEVICE_ID_DEC_21154, NULL))) {
 		status = -ENODEV;
-		goto init_register_error;
+		goto init_find_bus_error;
 	}
-	bus0 = bus0_dev->subordinate;
+	bus0 = pci_bus_get(bus0_dev->subordinate);
 	pci_dev_put(bus0_dev);
 
 	status = cpci_hp_register_bus(bus0, 0x0a, 0x0f);
@@ -265,6 +265,8 @@ static int zt5550_hc_init_one (struct pci_dev *pdev, const struct pci_device_id
 init_start_error:
 	cpci_hp_unregister_bus(bus0);
 init_register_error:
+	pci_bus_put(bus0);
+init_find_bus_error:
 	cpci_hp_unregister_controller(&zt5550_hpc);
 init_hc_error:
 	err("status = %d", status);
@@ -278,6 +280,7 @@ static void __devexit zt5550_hc_remove_one(struct pci_dev *pdev)
 	cpci_hp_stop();
 	cpci_hp_unregister_bus(bus0);
 	cpci_hp_unregister_controller(&zt5550_hpc);
+	pci_bus_put(bus0);
 	zt5550_hc_cleanup();
 }
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 16/19] PCI: serialize PCI hotplug operations triggered by cpcihp drivers
  2012-04-27 15:16 [PATCH v2 00/19] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (14 preceding siblings ...)
  2012-04-27 15:16 ` [PATCH v2 15/19] PCI: hold a reference count to the PCI bus used by cpcihp drivers Jiang Liu
@ 2012-04-27 15:16 ` Jiang Liu
  2012-04-27 15:16 ` [PATCH v2 17/19] PCI: serialize PCI hotplug operations triggered by fakephp drivers Jiang Liu
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Jiang Liu @ 2012-04-27 15:16 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile, Greg KH
  Cc: Jiang Liu, Keping Chen, linux-kernel, linux-pci, Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

Use PCI hotplug lock to globally serialize hotplug operations triggered
by cpcihp_xxx drivers.

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/hotplug/cpci_hotplug_core.c |   10 ++++++++++
 drivers/pci/hotplug/cpcihp_generic.c    |    2 ++
 drivers/pci/hotplug/cpcihp_zt5550.c     |   12 ++++++++----
 3 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/hotplug/cpci_hotplug_core.c b/drivers/pci/hotplug/cpci_hotplug_core.c
index 68e43c7..bbac305 100644
--- a/drivers/pci/hotplug/cpci_hotplug_core.c
+++ b/drivers/pci/hotplug/cpci_hotplug_core.c
@@ -502,6 +502,10 @@ event_thread(void *data)
 		if (kthread_should_stop())
 			break;
 		do {
+			if (!pci_hotplug_try_enter()) {
+				msleep(1);
+				continue;
+			}
 			rc = check_slots();
 			if (rc > 0) {
 				/* Give userspace a chance to handle extraction */
@@ -510,6 +514,7 @@ event_thread(void *data)
 				dbg("%s - error checking slots", __func__);
 				goto out;
 			}
+			pci_hotplug_exit();
 		} while (atomic_read(&extracting) && !kthread_should_stop());
 		if (kthread_should_stop())
 			break;
@@ -533,6 +538,10 @@ poll_thread(void *data)
 			break;
 		if (controller->ops->query_enum()) {
 			do {
+				if (!pci_hotplug_try_enter()) {
+					msleep(1);
+					continue;
+				}
 				rc = check_slots();
 				if (rc > 0) {
 					/* Give userspace a chance to handle extraction */
@@ -541,6 +550,7 @@ poll_thread(void *data)
 					dbg("%s - error checking slots", __func__);
 					goto out;
 				}
+				pci_hotplug_exit();
 			} while (atomic_read(&extracting) && !kthread_should_stop());
 		}
 		msleep(100);
diff --git a/drivers/pci/hotplug/cpcihp_generic.c b/drivers/pci/hotplug/cpcihp_generic.c
index f002be5..91d27d5 100644
--- a/drivers/pci/hotplug/cpcihp_generic.c
+++ b/drivers/pci/hotplug/cpcihp_generic.c
@@ -154,6 +154,7 @@ static int __init cpcihp_generic_init(void)
 	if(!r)
 		return -EBUSY;
 
+	pci_hotplug_disable();
 	bus = pci_find_bus(0, bridge_busnr);
 	if (!bus) {
 		err("Invalid bus number %d", bridge_busnr);
@@ -166,6 +167,7 @@ static int __init cpcihp_generic_init(void)
 			bus = pci_bus_get(dev->subordinate);
 		pci_dev_put(dev);
 	}
+	pci_hotplug_enable();
 	if (!bus) {
 		status = -EINVAL;
 		goto init_find_bus_error;
diff --git a/drivers/pci/hotplug/cpcihp_zt5550.c b/drivers/pci/hotplug/cpcihp_zt5550.c
index 6606520..1f81d85 100644
--- a/drivers/pci/hotplug/cpcihp_zt5550.c
+++ b/drivers/pci/hotplug/cpcihp_zt5550.c
@@ -238,13 +238,17 @@ static int zt5550_hc_init_one (struct pci_dev *pdev, const struct pci_device_id
 	dbg("registered controller");
 
 	/* Look for first device matching cPCI bus's bridge vendor and device IDs */
-	if(!(bus0_dev = pci_get_device(PCI_VENDOR_ID_DEC,
-					 PCI_DEVICE_ID_DEC_21154, NULL))) {
+	pci_hotplug_enter();
+	bus0_dev = pci_get_device(PCI_VENDOR_ID_DEC,
+				  PCI_DEVICE_ID_DEC_21154, NULL);
+	if (bus0_dev)
+		bus0 = pci_bus_get(bus0_dev->subordinate);
+	pci_dev_put(bus0_dev);
+	pci_hotplug_exit();
+	if (!bus0) {
 		status = -ENODEV;
 		goto init_find_bus_error;
 	}
-	bus0 = pci_bus_get(bus0_dev->subordinate);
-	pci_dev_put(bus0_dev);
 
 	status = cpci_hp_register_bus(bus0, 0x0a, 0x0f);
 	if(status != 0) {
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 17/19] PCI: serialize PCI hotplug operations triggered by fakephp drivers
  2012-04-27 15:16 [PATCH v2 00/19] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (15 preceding siblings ...)
  2012-04-27 15:16 ` [PATCH v2 16/19] PCI: serialize PCI hotplug operations triggered " Jiang Liu
@ 2012-04-27 15:16 ` Jiang Liu
  2012-04-27 15:16 ` [PATCH v2 18/19] PCI, sysfs: Use device_type and attr_groups with pci dev Jiang Liu
  2012-04-27 15:17 ` [PATCH v2 19/19] PCI: hide sys interface 'remove' and 'rescan' for SR-IOV virtual devices Jiang Liu
  18 siblings, 0 replies; 28+ messages in thread
From: Jiang Liu @ 2012-04-27 15:16 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile, Greg KH
  Cc: Jiang Liu, Keping Chen, linux-kernel, linux-pci, Jiang Liu

From: Jiang Liu <jiang.liu@huawei.com>

Use PCI hotplug lock to globally serialize hotplug operations triggered
by fakephp driver. This patch solves following crash.

[ 1426.145264] IP: [<ffffffff812f811b>] __pci_remove_bus_device+0x4b/0xc0
[ 1426.145264] PGD 30463067 PUD 38f9e067 PMD 0
[ 1426.145264] Oops: 0002 [#1] SMP
[ 1426.145264] CPU 0
[ 1426.145264] Modules linked in: fakephp shpchp r8169 [last unloaded: fakephp]
[ 1426.145264]
[ 1426.145264] Pid: 2086, comm: kworker/u:0 Tainted: G        W    3.4.0-rc2+ #19 To Be Filled By O.E.M. To Be Filled .
[ 1426.145264] RIP: 0010:[<ffffffff812f811b>]  [<ffffffff812f811b>] __pci_remove_bus_device+0x4b/0xc0
[ 1426.145264] RSP: 0018:ffff88002e851d10  EFLAGS: 00010282
[ 1426.145264] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000001880
[ 1426.145264] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff81c4fec0
[ 1426.145264] RBP: ffff88002e851d20 R08: 0000000000000000 R09: 0000000000000000
[ 1426.145264] R10: 00000000000003c7 R11: 0001f630d1b3ac30 R12: ffff880030db3800
[ 1426.145264] R13: ffff880030443400 R14: ffffffff81fa8840 R15: ffffffff811a5220
[ 1426.145264] FS:  0000000000000000(0000) GS:ffff88003d600000(0000) knlGS:0000000000000000
[ 1426.145264] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1426.145264] CR2: 0000000000000008 CR3: 0000000030ff8000 CR4: 00000000000007f0
[ 1426.145264] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1426.145264] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1426.145264] Process kworker/u:0 (pid: 2086, threadinfo ffff88002e850000, task ffff880037b38000)
[ 1426.145264] Stack:
[ 1426.145264]  ffff880030db3800 ffff88002aa1c530 ffff88002e851d40 ffffffff812f81a9
[ 1426.145264]  0000000000000000 ffff88002a81b900 ffff88002e851d60 ffffffffa17ec0a4
[ 1426.145264]  ffffffff81fa8840 ffff88002aa1c530 ffff88002e851d80 ffffffff811a5233
[ 1426.145264] Call Trace:
[ 1426.145264]  [<ffffffff812f81a9>] pci_stop_and_remove_bus_device+0x19/0x20
[ 1426.145264]  [<ffffffffa17ec0a4>] remove_callback+0x24/0x30 [fakephp]
[ 1426.145264]  [<ffffffff811a5233>] sysfs_schedule_callback_work+0x13/0x80
[ 1426.145264]  [<ffffffff81053462>] process_one_work+0x192/0x570
[ 1426.145264]  [<ffffffff810533f6>] ? process_one_work+0x126/0x570
[ 1426.145264]  [<ffffffff81054e7f>] worker_thread+0x15f/0x350
[ 1426.145264]  [<ffffffff81054d20>] ? manage_workers.isra.27+0x220/0x220
[ 1426.145264]  [<ffffffff81059f4d>] kthread+0x9d/0xb0
[ 1426.145264]  [<ffffffff8178b594>] kernel_thread_helper+0x4/0x10
[ 1426.145264]  [<ffffffff81059eb0>] ? __init_kthread_worker+0x70/0x70
[ 1426.145264]  [<ffffffff8178b590>] ? gs_change+0xb/0xb
[ 1426.145264] Code: 0c ff ff ff 49 c7 44 24 18 00 00 00 00 48 c7 c7 c0 fe c4 81 31 db e8 d5 7f 48 00 49 8b 14 24 49 8
[ 1426.145264] RIP  [<ffffffff812f811b>] __pci_remove_bus_device+0x4b/0xc0
[ 1426.145264]  RSP <ffff88002e851d10>
[ 1426.145264] CR2: 0000000000000008
[ 1426.426612] ---[ end trace 5217fdeceed9de00 ]---
[ 1426.431546] BUG: unable to handle kernel paging request at fffffffffffffff8
[ 1426.432252] IP: [<ffffffff8105a41b>] kthread_data+0xb/0x20
[ 1426.432252] PGD 1c0d067 PUD 1c0e067 PMD 0
[ 1426.432252] Oops: 0000 [#2] SMP
[ 1426.432252] CPU 0
[ 1426.432252] Modules linked in: fakephp shpchp r8169 [last unloaded: fakephp]
[ 1426.432252]
[ 1426.432252] Pid: 2086, comm: kworker/u:0 Tainted: G      D W    3.4.0-rc2+ #19 To Be Filled By O.E.M. To Be Filled .
[ 1426.432252] RIP: 0010:[<ffffffff8105a41b>]  [<ffffffff8105a41b>] kthread_data+0xb/0x20
[ 1426.432252] RSP: 0018:ffff88002e851908  EFLAGS: 00010096
[ 1426.432252] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 1426.432252] RDX: ffffffff81fa9440 RSI: 0000000000000000 RDI: ffff880037b38000
[ 1426.432252] RBP: ffff88002e851908 R08: 0000000000989680 R09: 0000000000000000
[ 1426.432252] R10: 0000000000000400 R11: 0000000000000004 R12: 0000000000000000
[ 1426.432252] R13: ffff880037b38378 R14: ffff88003c9b8000 R15: ffff880037b38280
[ 1426.432252] FS:  0000000000000000(0000) GS:ffff88003d600000(0000) knlGS:0000000000000000
[ 1426.432252] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1426.432252] CR2: fffffffffffffff8 CR3: 0000000030ff8000 CR4: 00000000000007f0
[ 1426.432252] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1426.432252] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1426.432252] Process kworker/u:0 (pid: 2086, threadinfo ffff88002e850000, task ffff880037b38000)
[ 1426.432252] Stack:
[ 1426.432252]  ffff88002e851928 ffffffff81055810 ffff88002e851928 ffff88003d7d2900
[ 1426.432252]  ffff88002e8519a8 ffffffff81780a38 ffff880000000000 ffffffff810bda82
[ 1426.432252]  ffff88002e851fd8 ffff880037b38000 ffff88002e851fd8 ffff88002e851fd8
[ 1426.432252] Call Trace:
[ 1426.432252]  [<ffffffff81055810>] wq_worker_sleeping+0x10/0xa0
[ 1426.432252]  [<ffffffff81780a38>] __schedule+0x538/0x7c0
[ 1426.432252]  [<ffffffff810bda82>] ? call_rcu_sched+0x12/0x20
[ 1426.432252]  [<ffffffff81780fa4>] schedule+0x24/0x70
[ 1426.432252]  [<ffffffff8103b8b0>] do_exit+0x600/0x9d0
[ 1426.432252]  [<ffffffff81039065>] ? kmsg_dump+0x105/0x160
[ 1426.432252]  [<ffffffff8178366e>] oops_end+0x9e/0xe0
[ 1426.432252]  [<ffffffff81037b65>] ? console_unlock+0x1e5/0x260
[ 1426.432252]  [<ffffffff81774e1e>] no_context+0x271/0x280
[ 1426.432252]  [<ffffffff810884b8>] ? __lock_acquire.isra.31+0x298/0x960
[ 1426.432252]  [<ffffffff81774ff3>] __bad_area_nosemaphore+0x1c6/0x1e5
[ 1426.432252]  [<ffffffff8106cd15>] ? sched_clock_local+0x25/0x90
[ 1426.432252]  [<ffffffff81775020>] bad_area_nosemaphore+0xe/0x10
[ 1426.432252]  [<ffffffff81785fbe>] do_page_fault+0x30e/0x500
[ 1426.432252]  [<ffffffff8106cea8>] ? sched_clock_cpu+0xa8/0x120
[ 1426.432252]  [<ffffffff810884b8>] ? __lock_acquire.isra.31+0x298/0x960
[ 1426.432252]  [<ffffffff810884b8>] ? __lock_acquire.isra.31+0x298/0x960
[ 1426.432252]  [<ffffffff8106cd15>] ? sched_clock_local+0x25/0x90
[ 1426.432252]  [<ffffffff812f810b>] ? __pci_remove_bus_device+0x3b/0xc0
[ 1426.432252]  [<ffffffff811a5220>] ? sysfs_write_file+0x180/0x180
[ 1426.432252]  [<ffffffff81782b7f>] page_fault+0x1f/0x30
[ 1426.432252]  [<ffffffff811a5220>] ? sysfs_write_file+0x180/0x180
[ 1426.432252]  [<ffffffff812f811b>] ? __pci_remove_bus_device+0x4b/0xc0
[ 1426.432252]  [<ffffffff812f81a9>] pci_stop_and_remove_bus_device+0x19/0x20
[ 1426.432252]  [<ffffffffa17ec0a4>] remove_callback+0x24/0x30 [fakephp]
[ 1426.432252]  [<ffffffff811a5233>] sysfs_schedule_callback_work+0x13/0x80
[ 1426.432252]  [<ffffffff81053462>] process_one_work+0x192/0x570
[ 1426.432252]  [<ffffffff810533f6>] ? process_one_work+0x126/0x570
[ 1426.432252]  [<ffffffff81054e7f>] worker_thread+0x15f/0x350
[ 1426.432252]  [<ffffffff81054d20>] ? manage_workers.isra.27+0x220/0x220
[ 1426.432252]  [<ffffffff81059f4d>] kthread+0x9d/0xb0
[ 1426.432252]  [<ffffffff8178b594>] kernel_thread_helper+0x4/0x10
[ 1426.432252]  [<ffffffff81059eb0>] ? __init_kthread_worker+0x70/0x70
[ 1426.432252]  [<ffffffff8178b590>] ? gs_change+0xb/0xb
[ 1426.432252] Code: eb 90 be 57 01 00 00 48 c7 c7 86 19 a1 81 e8 1d cb fd ff e9 77 fe ff ff 0f 1f 84 00 00 00 00 00 4
[ 1426.432252] RIP  [<ffffffff8105a41b>] kthread_data+0xb/0x20
[ 1426.432252]  RSP <ffff88002e851908>
[ 1426.432252] CR2: fffffffffffffff8
[ 1426.432252] ---[ end trace 5217fdeceed9de01 ]---
[ 1426.432252] Fixing recursive fault but reboot is needed!
[ 1428.998901] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 2
[ 1428.998901] panic occurred, switching back to text console

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/hotplug/fakephp.c |   38 ++++++++++++++++++++++++++++++++------
 1 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/hotplug/fakephp.c b/drivers/pci/hotplug/fakephp.c
index a019c9a..ee6c79e 100644
--- a/drivers/pci/hotplug/fakephp.c
+++ b/drivers/pci/hotplug/fakephp.c
@@ -38,9 +38,24 @@ static ssize_t legacy_show(struct kobject *kobj, struct attribute *attr,
 	return 2;
 }
 
+static void rescan_callback(void *data)
+{
+	struct legacy_slot *slot = data;
+
+	pci_hotplug_enter();
+	if (!list_empty(&slot->list))
+		pci_rescan_bus(slot->dev->bus);
+	pci_hotplug_exit();
+}
+
 static void remove_callback(void *data)
 {
-	pci_stop_and_remove_bus_device((struct pci_dev *)data);
+	struct legacy_slot *slot = data;
+
+	pci_hotplug_enter();
+	if (!list_empty(&slot->list))
+		pci_stop_and_remove_bus_device(slot->dev);
+	pci_hotplug_exit();
 }
 
 static ssize_t legacy_store(struct kobject *kobj, struct attribute *attr,
@@ -53,10 +68,11 @@ static ssize_t legacy_store(struct kobject *kobj, struct attribute *attr,
 		return -EINVAL;
 
 	if (val)
-		pci_rescan_bus(slot->dev->bus);
+		sysfs_schedule_callback(&slot->kobj, rescan_callback,
+					slot, THIS_MODULE);
 	else
-		sysfs_schedule_callback(&slot->dev->dev.kobj, remove_callback,
-					slot->dev, THIS_MODULE);
+		sysfs_schedule_callback(&slot->kobj, remove_callback,
+					slot, THIS_MODULE);
 	return len;
 }
 
@@ -107,20 +123,25 @@ static int legacy_notify(struct notifier_block *nb,
 	struct pci_dev *pdev = to_pci_dev(data);
 
 	if (action == BUS_NOTIFY_ADD_DEVICE) {
+		pci_hotplug_enter();
 		legacy_add_slot(pdev);
+		pci_hotplug_exit();
 	} else if (action == BUS_NOTIFY_DEL_DEVICE) {
 		struct legacy_slot *slot;
 
+		pci_hotplug_enter();
 		list_for_each_entry(slot, &legacy_list, list)
 			if (slot->dev == pdev)
 				goto found;
 
+		pci_hotplug_exit();
 		dev_warn(&pdev->dev, "Missing legacy fake slot?");
 		return -ENODEV;
 found:
 		kobject_del(&slot->kobj);
-		list_del(&slot->list);
+		list_del_init(&slot->list);
 		kobject_put(&slot->kobj);
+		pci_hotplug_exit();
 	}
 
 	return 0;
@@ -135,11 +156,14 @@ static int __init init_legacy(void)
 	struct pci_dev *pdev = NULL;
 
 	/* Add existing devices */
+	pci_hotplug_disable();
 	for_each_pci_dev(pdev)
 		legacy_add_slot(pdev);
 
 	/* Be alerted of any new ones */
 	bus_register_notifier(&pci_bus_type, &legacy_notifier);
+	pci_hotplug_enable();
+
 	return 0;
 }
 module_init(init_legacy);
@@ -150,11 +174,13 @@ static void __exit remove_legacy(void)
 
 	bus_unregister_notifier(&pci_bus_type, &legacy_notifier);
 
+	pci_hotplug_disable();
 	list_for_each_entry_safe(slot, tmp, &legacy_list, list) {
-		list_del(&slot->list);
+		list_del_init(&slot->list);
 		kobject_del(&slot->kobj);
 		kobject_put(&slot->kobj);
 	}
+	pci_hotplug_enable();
 }
 module_exit(remove_legacy);
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 18/19] PCI, sysfs: Use device_type and attr_groups with pci dev
  2012-04-27 15:16 [PATCH v2 00/19] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (16 preceding siblings ...)
  2012-04-27 15:16 ` [PATCH v2 17/19] PCI: serialize PCI hotplug operations triggered by fakephp drivers Jiang Liu
@ 2012-04-27 15:16 ` Jiang Liu
  2012-04-27 15:17 ` [PATCH v2 19/19] PCI: hide sys interface 'remove' and 'rescan' for SR-IOV virtual devices Jiang Liu
  18 siblings, 0 replies; 28+ messages in thread
From: Jiang Liu @ 2012-04-27 15:16 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile, Greg KH
  Cc: Keping Chen, linux-kernel, linux-pci, Jiang Liu

From: Yinghai Lu <yinghai@kernel.org>

From: Yinghai Lu <yinghai@kernel.org>

We want to create rescan in sys only for pci bridge instead of all pci dev.

We could use attribute_groups/is_visible method to do that.

Now pci dev does not use device type yet. So add pci_dev_type to take
attr_groups with it.

Add pci_dev_bridge_attrs_are_visible() to control attr_bridge_group only create
attr for bridge.

This is the framework related change, later could attr to bridge_attr_group,
to make those attr only show up on pci bridge device.

Also We could add more attr groups with is_visible to reduce messness in
	pci_create_sysfs_dev_files. ( at least for boot_vga one )

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/pci-sysfs.c |   30 ++++++++++++++++++++++++++++++
 drivers/pci/pci.h       |    1 +
 drivers/pci/probe.c     |    1 +
 3 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index ecf197d..bc3c422 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1349,3 +1349,33 @@ static int __init pci_sysfs_init(void)
 }
 
 late_initcall(pci_sysfs_init);
+
+static struct attribute *pci_dev_bridge_attrs[] = {
+	NULL,
+};
+
+static umode_t pci_dev_bridge_attrs_are_visible(struct kobject *kobj,
+						struct attribute *a, int n)
+{
+	struct device *dev = container_of(kobj, struct device, kobj);
+	struct pci_dev *pdev = to_pci_dev(dev);
+
+	if (!pdev->subordinate)
+		return 0;
+
+	return a->mode;
+}
+
+static struct attribute_group pci_dev_bridge_attr_group = {
+	.attrs = pci_dev_bridge_attrs,
+	.is_visible = pci_dev_bridge_attrs_are_visible,
+};
+
+static const struct attribute_group *pci_dev_attr_groups[] = {
+	&pci_dev_bridge_attr_group,
+	NULL,
+};
+
+struct device_type pci_dev_type = {
+	.groups = pci_dev_attr_groups,
+};
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index e494347..13b1159 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -162,6 +162,7 @@ static inline int pci_no_d1d2(struct pci_dev *dev)
 }
 extern struct device_attribute pci_dev_attrs[];
 extern struct device_attribute pcibus_dev_attrs[];
+extern struct device_type pci_dev_type;
 #ifdef CONFIG_HOTPLUG
 extern struct bus_attribute pci_bus_attrs[];
 #else
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 5e1ca3c..b1473d3 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1000,6 +1000,7 @@ int pci_setup_device(struct pci_dev *dev)
 	dev->sysdata = dev->bus->sysdata;
 	dev->dev.parent = dev->bus->bridge;
 	dev->dev.bus = &pci_bus_type;
+	dev->dev.type = &pci_dev_type;
 	dev->hdr_type = hdr_type & 0x7f;
 	dev->multifunction = !!(hdr_type & 0x80);
 	dev->error_state = pci_channel_io_normal;
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 19/19] PCI: hide sys interface 'remove' and 'rescan' for SR-IOV virtual devices
  2012-04-27 15:16 [PATCH v2 00/19] Introduce a global lock to serialize all PCI hotplug Jiang Liu
                   ` (17 preceding siblings ...)
  2012-04-27 15:16 ` [PATCH v2 18/19] PCI, sysfs: Use device_type and attr_groups with pci dev Jiang Liu
@ 2012-04-27 15:17 ` Jiang Liu
  18 siblings, 0 replies; 28+ messages in thread
From: Jiang Liu @ 2012-04-27 15:17 UTC (permalink / raw)
  To: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile, Greg KH
  Cc: Jiang Liu, Keping Chen, linux-kernel, linux-pci

From: Jiang Liu <liuj97@gmail.com>

From: Jiang Liu <liuj97@gmail.com>

All SR-IOV virtual PCI devices should be managed by corresponding physical
device drivers. And the PCI core shouldn't create or destroy virtual PCI
devices directly without cordination with physical device drivers.
Otherwise it may cause system crashes like below.  So hide the remove and
rescan sys interfaces for SR-IOV virtual PCI devices.

Running following two scripts may trigger system dump on a system with
Intel 82576 NIC.

[root@localhost tests]# cat mod.sh
#!/bin/bash
while true; do
        modprobe igb max_vfs=2
        sleep 0.01
        rmmod igb
done
[root@localhost tests]# cat remove_virt.sh
#!/bin/bash
while true; do
        echo 1 > /sys/devices/pci0000:40/0000:40:03.0/0000:41:00.0/0000:42:02.0/0000:44:10.0/remove
        echo 1 > /sys/devices/pci0000:40/0000:40:03.0/0000:41:00.0/0000:42:02.0/0000:44:10.1/remove
        echo 1 > /sys/devices/pci0000:40/0000:40:03.0/0000:41:00.0/0000:42:02.0/0000:44:10.2/remove
        echo 1 > /sys/devices/pci0000:40/0000:40:03.0/0000:41:00.0/0000:42:02.0/0000:43:10.3/rescan
        sleep 0.01
done

------------[ cut here ]------------
WARNING: at fs/sysfs/dir.c:481 sysfs_add_one+0xb8/0xd0()
Hardware name: FBSA
sysfs: cannot create duplicate filename '/devices/pci0000:40/0000:40:03.0/0000:41:00.0/0000:42:02.0/0000:43:00.0/virtfn0'
Modules linked in: igb(+) igbvf fuse ebtable_nat ebtables xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat bridge autofs4 sunrpc 8021q fcoe libfcoe garp stp llc libfc scsi_transport_fc scsi_tgt cpufreq_ondemand acpi_cpufreq freq_table mperf xt_physdev ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log dm_mod kvm_intel kvm uinput wmi pcspkr sg iTCO_wdt iTCO_vendor_support e1000e i2c_i801 i2c_core ioatdma ixgbe dca mdio ext4 mbcache jbd2 sd_mod crc_t10dif ahci libahci [last unloaded: igb]
Pid: 6297, comm: work_for_cpu Not tainted 3.2.0IOAT+ #1
Call Trace:
 [<ffffffff81060c1f>] warn_slowpath_common+0x7f/0xc0
 [<ffffffff81060d16>] warn_slowpath_fmt+0x46/0x50
 [<ffffffff811d0428>] sysfs_add_one+0xb8/0xd0
 [<ffffffff811d15ab>] sysfs_do_create_link+0x13b/0x210
 [<ffffffff81242330>] ? sprintf+0x40/0x50
 [<ffffffff811d16b3>] sysfs_create_link+0x13/0x20
 [<ffffffff81272d33>] virtfn_add+0x283/0x430
 [<ffffffff81273252>] pci_enable_sriov+0x232/0x4c0
 [<ffffffffa049ae1b>] igb_probe+0x6b4/0x1212 [igb]
 [<ffffffff81321aa2>] ? __pm_runtime_set_status+0x172/0x210
 [<ffffffff8125dc0f>] local_pci_probe+0x5f/0xd0
 [<ffffffff8107ab60>] ? move_linked_works+0x90/0x90
 [<ffffffff8107ab78>] do_work_for_cpu+0x18/0x30
 [<ffffffff810829e6>] kthread+0x96/0xa0
 [<ffffffff814e4ab4>] kernel_thread_helper+0x4/0x10
 [<ffffffff81082950>] ? kthread_worker_fn+0x1a0/0x1a0
 [<ffffffff814e4ab0>] ? gs_change+0x13/0x13
---[ end trace 7c33eee57d617c55 ]---
libfcoe_device_notification: NETDEV_UNREGISTER eth3
Trying to free nonexistent resource <00000000e1660000-00000000e1663fff>
Trying to free nonexistent resource <00000000e1640000-00000000e1643fff>
BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: [<ffffffff8124a529>] __list_del_entry+0x29/0xd0
PGD 3fdf7e3067 PUD 3fdf45d067 PMD 0
Oops: 0000 [#1] SMP
CPU 8
Modules linked in: igb(+) igbvf fuse ebtable_nat ebtables xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat bridge autofs4 sunrpc 8021q fcoe libfcoe garp stp llc libfc scsi_transport_fc scsi_tgt cpufreq_ondemand acpi_cpufreq freq_table mperf xt_physdev ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log dm_mod kvm_intel kvm uinput wmi pcspkr sg iTCO_wdt iTCO_vendor_support e1000e i2c_i801 i2c_core ioatdma ixgbe dca mdio ext4 mbcache jbd2 sd_mod crc_t10dif ahci libahci [last unloaded: igb]

Pid: 6297, comm: work_for_cpu Tainted: G        W    3.2.0IOAT+ #1 INSYDE FBSA/Type2 - Board Product Name1
RIP: 0010:[<ffffffff8124a529>]  [<ffffffff8124a529>] __list_del_entry+0x29/0xd0
RSP: 0018:ffff883fdb499cb0  EFLAGS: 00010207
RAX: 0000000000000000 RBX: ffff881fdde1e000 RCX: dead000000200200
RDX: 0000000000000000 RSI: ffffffff81238d10 RDI: ffff881fdde1e000
RBP: ffff883fdb499cb0 R08: ffff881fdde1e0a8 R09: 0000000000000000
R10: 00000000000009c5 R11: 0000000000000000 R12: 0000000000000011
R13: ffff881fdde1e000 R14: ffff883fdb499d30 R15: ffff883fe07ae0a0
FS:  0000000000000000(0000) GS:ffff88203fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000003fdf88c000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process work_for_cpu (pid: 6297, threadinfo ffff883fdb498000, task ffff883fda832a70)
Stack:
 ffff883fdb499cd0 ffffffff8124a5e1 ffff883fe07ae000 0000000000000000
 ffff883fdb499d00 ffffffff81259467 ffff883fdb499d00 ffff881fdde1e000
 ffff883fe07ae000 ffff882fe0f87b40 ffff883fdb499d80 ffffffff81272d64
Call Trace:
 [<ffffffff8124a5e1>] list_del+0x11/0x40
 [<ffffffff81259467>] pci_remove_bus_device+0x57/0xd0
 [<ffffffff81272d64>] virtfn_add+0x2b4/0x430
 [<ffffffff81273252>] pci_enable_sriov+0x232/0x4c0
 [<ffffffffa049ae1b>] igb_probe+0x6b4/0x1212 [igb]
 [<ffffffff81321aa2>] ? __pm_runtime_set_status+0x172/0x210
 [<ffffffff8125dc0f>] local_pci_probe+0x5f/0xd0
 [<ffffffff8107ab60>] ? move_linked_works+0x90/0x90
 [<ffffffff8107ab78>] do_work_for_cpu+0x18/0x30
 [<ffffffff810829e6>] kthread+0x96/0xa0
 [<ffffffff814e4ab4>] kernel_thread_helper+0x4/0x10
 [<ffffffff81082950>] ? kthread_worker_fn+0x1a0/0x1a0
 [<ffffffff814e4ab0>] ? gs_change+0x13/0x13
Code: 90 90 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de 48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00 ad de 48 39 c8 74 7a <4c> 8b 00 4c 39 c7 75 53 4c 8b 42 08 4c 39 c7 75 2b 48 89 42 08
RIP  [<ffffffff8124a529>] __list_del_entry+0x29/0xd0
 RSP <ffff883fdb499cb0>
CR2: 0000000000000000
---[ end trace 7c33eee57d617c56 ]---

Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 drivers/pci/pci-sysfs.c |   34 ++++++++++++++++++++++++++++++----
 include/linux/pci.h     |    4 +++-
 2 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index bc3c422..348995d 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -367,6 +367,9 @@ remove_store(struct device *dev, struct device_attribute *dummy,
 	return schedule_hp_callback(dev, buf, count, remove_callback);
 }
 
+static struct device_attribute pci_dev_remove_attr =
+	__ATTR(remove, (S_IWUSR|S_IWGRP), NULL, remove_store);
+
 static void dev_bus_rescan_callback(struct device *dev)
 {
 	struct pci_bus *bus = to_pci_bus(dev);
@@ -389,6 +392,8 @@ dev_bus_rescan_store(struct device *dev, struct device_attribute *attr,
 	return schedule_hp_callback(dev, buf, count, dev_bus_rescan_callback);
 }
 
+static struct device_attribute pci_dev_rescan_attr =
+	__ATTR(rescan, (S_IWUSR|S_IWGRP), NULL, dev_rescan_store);
 #endif
 
 struct device_attribute pci_dev_attrs[] = {
@@ -411,10 +416,6 @@ struct device_attribute pci_dev_attrs[] = {
 	__ATTR(broken_parity_status,(S_IRUGO|S_IWUSR),
 		broken_parity_status_show,broken_parity_status_store),
 	__ATTR(msi_bus, 0644, msi_bus_show, msi_bus_store),
-#ifdef CONFIG_HOTPLUG
-	__ATTR(remove, (S_IWUSR|S_IWGRP), NULL, remove_store),
-	__ATTR(rescan, (S_IWUSR|S_IWGRP), NULL, dev_rescan_store),
-#endif
 	__ATTR_NULL,
 };
 
@@ -1350,6 +1351,30 @@ static int __init pci_sysfs_init(void)
 
 late_initcall(pci_sysfs_init);
 
+static struct attribute *pci_dev_phys_attrs[] = {
+#ifdef CONFIG_HOTPLUG
+	&pci_dev_remove_attr.attr,
+	&pci_dev_rescan_attr.attr,
+#endif
+	NULL
+};
+
+static umode_t pci_dev_phys_attrs_are_visible(struct kobject *kobj,
+					      struct attribute *a, int n)
+{
+	struct device *dev = container_of(kobj, struct device, kobj);
+
+	if (dev_is_pf(dev))
+		return a->mode;
+
+	return 0;
+}
+
+static struct attribute_group pci_dev_phys_attr_group = {
+	.attrs = pci_dev_phys_attrs,
+	.is_visible = pci_dev_phys_attrs_are_visible,
+};
+
 static struct attribute *pci_dev_bridge_attrs[] = {
 	NULL,
 };
@@ -1373,6 +1398,7 @@ static struct attribute_group pci_dev_bridge_attr_group = {
 
 static const struct attribute_group *pci_dev_attr_groups[] = {
 	&pci_dev_bridge_attr_group,
+	&pci_dev_phys_attr_group,
 	NULL,
 };
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 1c5f153..6c2c5c9 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -692,7 +692,8 @@ extern void pci_stop_bus_device(struct pci_dev *dev);
 void pci_setup_cardbus(struct pci_bus *bus);
 extern void pci_sort_breadthfirst(void);
 #define dev_is_pci(d) ((d)->bus == &pci_bus_type)
-#define dev_is_pf(d) ((dev_is_pci(d) ? to_pci_dev(d)->is_physfn : false))
+#define dev_is_vf(d) ((dev_is_pci(d) ? to_pci_dev(d)->is_virtfn : false))
+#define dev_is_pf(d) (!dev_is_vf(d))
 #define dev_num_vf(d) ((dev_is_pci(d) ? pci_num_vf(to_pci_dev(d)) : 0))
 
 /* Generic PCI functions exported to card drivers */
@@ -1343,6 +1344,7 @@ static inline int pci_domain_nr(struct pci_bus *bus)
 
 #define dev_is_pci(d) (false)
 #define dev_is_pf(d) (false)
+#define dev_is_vf(d) (false)
 #define dev_num_vf(d) (0)
 #endif /* CONFIG_PCI */
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 02/19] PCI: introduce recursive rwsem to serialize PCI hotplug operations
  2012-04-27 15:16 ` [PATCH v2 02/19] PCI: introduce recursive rwsem to serialize PCI hotplug operations Jiang Liu
@ 2012-05-02  5:00   ` Greg KH
  2012-05-02  7:25     ` Jiang Liu
  0 siblings, 1 reply; 28+ messages in thread
From: Greg KH @ 2012-05-02  5:00 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile,
	Jiang Liu, Keping Chen, linux-kernel, linux-pci

On Fri, Apr 27, 2012 at 11:16:43PM +0800, Jiang Liu wrote:
> From: Jiang Liu <jiang.liu@huawei.com>
> 
> There are multiple ways to trigger PCI hotplug requests concurrently,
> such as:
> 1. Sysfs interfaces exported by the PCI core subsystem
> 2. Sysfs interfaces exported by the PCI hotplug subsystem
> 3. PCI hotplug events triggered by PCI Hotplug Controllers
> 4. ACPI hotplug events for PCI host bridges
> 5. Driver binding/unbinding events
> 
> The PCI core subsystem doesn't support concurrent hotplug operations yet,
> so all PCI hotplug requests should be globally serialized. This patch
> introduces several new interfaces to serialize PCI hotplug operations.
> 
> pci_hotplug_try_enter(): try to acquire write lock

Ick, no, why would you ever want to do that?

> pci_hotplug_enter(): acquire write lock
> pci_hotplug_exit(): release write lock
> pci_hotplug_disable(): acquire read lock
> pci_hotplug_enable(): release read lock

No, the pci hotplug core should not need a rwsem, just a simple lock, if
that:
	pci_hotplug_lock()
	pci_hotplug_unlock()
and that's it.

Really you should not need these functions, the pci hotplug core should
handle it for you, and the drivers should not care at all.  That's the
"proper" way to fix this up, serialize stuff within the pci core, not
the individual drivers.

> Today we have reproduced the issue on a real platform by using
> acpiphp driver. It's an IA64 platform running Suse 11SP1 (official
> 2.6.32.12 kernel). The test script is:
> 
> This issue could be reproduced on an IA64 platform with Suse 11SP1
> (official 2.6.32.12 kernel) and acpiphp driver.

You do realize just how old that kernel is right?  Please don't assume
that this kernel version is relevant to us all these years later.

> +/*
> + * trylock for writing -- returns 1 if successful, 0 if contention
> + */
> +int pci_hotplug_try_enter(void)
> +{
> +	if (current != pci_hotplug_mutex_owner) {
> +		if (down_write_trylock(&pci_hotplug_rwsem) == 0)
> +			return 0;
> +		pci_hotplug_mutex_owner = current;
> +	}
> +	pci_hotplug_mutex_recursive++;
> +
> +	return 1;
> +}
> +EXPORT_SYMBOL(pci_hotplug_try_enter);

Don't try to invent new lock types like this, you are bound to get it
wrong.  And don't create recursive locks, fix the code up to never need
it.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 04/19] PCI: serialize hotplug operations triggered by PCI hotplug sysfs interfaces
  2012-04-27 15:16 ` [PATCH v2 04/19] PCI: serialize hotplug operations triggered by PCI hotplug sysfs interfaces Jiang Liu
@ 2012-05-02  5:06   ` Greg KH
  2012-05-02  7:20     ` Jiang Liu
  0 siblings, 1 reply; 28+ messages in thread
From: Greg KH @ 2012-05-02  5:06 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile,
	Jiang Liu, Keping Chen, linux-kernel, linux-pci

On Fri, Apr 27, 2012 at 11:16:45PM +0800, Jiang Liu wrote:
> +	/* Avoid deadlock with pci_hp_deregister() */
> +	while (!pci_hotplug_try_enter()) {
> +		/* Check whether the slot has been deregistered. */
> +		if (list_empty(&slot->slot_list)) {
> +			retval = -ENODEV;
> +			goto exit_put;
> +		}
> +		msleep(1);
> +	}

Oh my.

Wow.

{sigh}

ick.

My eyes hurt.

And your cpu load just went crazy.

You can now handle all of the nasty emails from sysadmins asking why
their systems look like their load is high for no good reason.

Not to mention all of the other issues here.

My statement about not inventing new lock types has now been proven
true.

The fact that this would even be a chunk of code that was proposed to be
merged makes me weep.

You owe me a bottle of whatever you are drinking if you expect me to
continue to review these patch submissions.

I'm stopping here in the series, please rework this whole thing in a
major way.

greg k-h

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 05/19] PCI: correctly flush workqueue when destroy pcie hotplug controller
  2012-04-27 15:16 ` [PATCH v2 05/19] PCI: correctly flush workqueue when destroy pcie hotplug controller Jiang Liu
@ 2012-05-02  5:08   ` Greg KH
  0 siblings, 0 replies; 28+ messages in thread
From: Greg KH @ 2012-05-02  5:08 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile,
	Jiang Liu, Keping Chen, linux-kernel, linux-pci

On Fri, Apr 27, 2012 at 11:16:46PM +0800, Jiang Liu wrote:
> From: Jiang Liu <jiang.liu@huawei.com>
> 
> When destroying a PCIe hotplug controller, all work items associated with
> that controller should be flushed.  Function pcie_cleanup_slot() calls
> cancel_delayed_work() and flush_workqueue() to achieve that.
> Function flush_workqueue() will flush all work items already submitted,
> but new work items submitted by those already submitted work items may
> still be in live state when returning from flush_workqueue().
> 
> For the extreme case, pciehp driver may expierence following calling path:
> 1) pcie_isr() -> pciehp_handle_xxx() -> queue_interrupt_event()->queue_work()
> 2) interrupt_event_handler() -> handle_button_press_event() ->
>    queue_delayed_work()
> 3) pciehp_queue_pushbutton_work() -> queue_work()
> 
> So enhance pcie_cleanup_slot() to correctly flush workqueue when destroying
> PCIe hotplug controller.
> 
> Signed-off-by: Jiang Liu <liuj97@gmail.com>
> ---
>  drivers/pci/hotplug/pciehp_hpc.c |   14 +++++++++++++-
>  1 files changed, 13 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
> index a960fae..98b775f 100644
> --- a/drivers/pci/hotplug/pciehp_hpc.c
> +++ b/drivers/pci/hotplug/pciehp_hpc.c
> @@ -889,8 +889,20 @@ static int pcie_init_slot(struct controller *ctrl)
>  static void pcie_cleanup_slot(struct controller *ctrl)
>  {
>  	struct slot *slot = ctrl->slot;
> -	cancel_delayed_work(&slot->work);
> +
> +	/*
> +	 * Following workqueue flushing logic is to deal with the special
> +	 * call path:
> +	 * 1) pcie_isr() -> pciehp_handle_xxx() ->
> +	 *    queue_interrupt_event(pciehp_wq_event)->queue_work(pciehp_wq)
> +	 * 2) interrupt_event_handler() -> handle_button_press_event() ->
> +	 *    queue_delayed_work(pciehp_wq)
> +	 * 3) pciehp_queue_pushbutton_work() -> queue_work(pciehp_wq)
> +	 */
>  	flush_workqueue(pciehp_wq);
> +	cancel_delayed_work_sync(&slot->work);
> +	flush_workqueue(pciehp_wq);

Why just flush it twice?  Why not three times?

Like the magic "sync && sync && sync" invocation long typed by
sysadmins.

This really does not feel right, sorry.

Odds are the whole workqueue path needs to be reworked here, right?

greg k-h

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 06/19] PCI: prepare for serializing hotplug operations triggered by pciehp driver
  2012-04-27 15:16 ` [PATCH v2 06/19] PCI: prepare for serializing hotplug operations triggered by pciehp driver Jiang Liu
@ 2012-05-02  5:10   ` Greg KH
  0 siblings, 0 replies; 28+ messages in thread
From: Greg KH @ 2012-05-02  5:10 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas, Don Dutile,
	Jiang Liu, Keping Chen, linux-kernel, linux-pci

On Fri, Apr 27, 2012 at 11:16:47PM +0800, Jiang Liu wrote:
> From: Jiang Liu <jiang.liu@huawei.com>
> 
> Split pciehp_wq into two workqueues to avoid possible deadlock issues
> when using PCI hotplug lock to serialize hotplug operations triggered
> by pciehp driver.

Why two workqueues?  What is deadlocking?  And why name one "power" and
one "event"?  What do they now do differently?  How are you serializing
events across the workqueues as they can be doing the same thing at the
same time to the same hardware now, right?

What am I missing?

Ick, I said I wasn't going to read anymore, I'm really stopping now.
Sorry, it's like watching a train-wreck, you just can't turn away...

greg k-h

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 04/19] PCI: serialize hotplug operations triggered by PCI hotplug sysfs interfaces
  2012-05-02  5:06   ` Greg KH
@ 2012-05-02  7:20     ` Jiang Liu
  2012-05-02 21:48       ` Greg KH
  0 siblings, 1 reply; 28+ messages in thread
From: Jiang Liu @ 2012-05-02  7:20 UTC (permalink / raw)
  To: Greg KH
  Cc: Jiang Liu, Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas,
	Don Dutile, Keping Chen, linux-kernel, linux-pci

On 2012-5-2 13:06, Greg KH wrote:
> On Fri, Apr 27, 2012 at 11:16:45PM +0800, Jiang Liu wrote:
>> +	/* Avoid deadlock with pci_hp_deregister() */
>> +	while (!pci_hotplug_try_enter()) {
>> +		/* Check whether the slot has been deregistered. */
>> +		if (list_empty(&slot->slot_list)) {
>> +			retval = -ENODEV;
>> +			goto exit_put;
>> +		}
>> +		msleep(1);
>> +	}
>
> Oh my.
>
> Wow.
>
> {sigh}
>
> ick.
>
> My eyes hurt.
>
> And your cpu load just went crazy.
>
> You can now handle all of the nasty emails from sysadmins asking why
> their systems look like their load is high for no good reason.
My bad, should use schedule_timeout_interruptible() instead of
msleep(1) here to avoid busy waiting.

>
> Not to mention all of the other issues here.
>
> My statement about not inventing new lock types has now been proven
> true.
>
> The fact that this would even be a chunk of code that was proposed to be
> merged makes me weep.
>
> You owe me a bottle of whatever you are drinking if you expect me to
> continue to review these patch submissions.
>
> I'm stopping here in the series, please rework this whole thing in a
> major way.
>
> greg k-h
>
>



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 02/19] PCI: introduce recursive rwsem to serialize PCI hotplug operations
  2012-05-02  5:00   ` Greg KH
@ 2012-05-02  7:25     ` Jiang Liu
  2012-05-02 21:46       ` Greg KH
  0 siblings, 1 reply; 28+ messages in thread
From: Jiang Liu @ 2012-05-02  7:25 UTC (permalink / raw)
  To: Greg KH
  Cc: Jiang Liu, Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas,
	Don Dutile, Keping Chen, linux-kernel, linux-pci

On 2012-5-2 13:00, Greg KH wrote:
> On Fri, Apr 27, 2012 at 11:16:43PM +0800, Jiang Liu wrote:
>> From: Jiang Liu<jiang.liu@huawei.com>
>>
>> There are multiple ways to trigger PCI hotplug requests concurrently,
>> such as:
>> 1. Sysfs interfaces exported by the PCI core subsystem
>> 2. Sysfs interfaces exported by the PCI hotplug subsystem
>> 3. PCI hotplug events triggered by PCI Hotplug Controllers
>> 4. ACPI hotplug events for PCI host bridges
>> 5. Driver binding/unbinding events
>>
>> The PCI core subsystem doesn't support concurrent hotplug operations yet,
>> so all PCI hotplug requests should be globally serialized. This patch
>> introduces several new interfaces to serialize PCI hotplug operations.
>>
>> pci_hotplug_try_enter(): try to acquire write lock
>
> Ick, no, why would you ever want to do that?
>
>> pci_hotplug_enter(): acquire write lock
>> pci_hotplug_exit(): release write lock
>> pci_hotplug_disable(): acquire read lock
>> pci_hotplug_enable(): release read lock
>
> No, the pci hotplug core should not need a rwsem, just a simple lock, if
> that:
> 	pci_hotplug_lock()
> 	pci_hotplug_unlock()
> and that's it.
>
> Really you should not need these functions, the pci hotplug core should
> handle it for you, and the drivers should not care at all.  That's the
> "proper" way to fix this up, serialize stuff within the pci core, not
> the individual drivers.
>
>> Today we have reproduced the issue on a real platform by using
>> acpiphp driver. It's an IA64 platform running Suse 11SP1 (official
>> 2.6.32.12 kernel). The test script is:
>>
>> This issue could be reproduced on an IA64 platform with Suse 11SP1
>> (official 2.6.32.12 kernel) and acpiphp driver.
>
> You do realize just how old that kernel is right?  Please don't assume
> that this kernel version is relevant to us all these years later.
This could be reproduced with latest kernel too, such as 3.3. We
use 2.6.32.12 because the all 3.x series kernels can't boot on
our IA64 platform due to an issue in scheduler. We have reproduced
similar issue on x86 platforms with 3.x kernels.

>
>> +/*
>> + * trylock for writing -- returns 1 if successful, 0 if contention
>> + */
>> +int pci_hotplug_try_enter(void)
>> +{
>> +	if (current != pci_hotplug_mutex_owner) {
>> +		if (down_write_trylock(&pci_hotplug_rwsem) == 0)
>> +			return 0;
>> +		pci_hotplug_mutex_owner = current;
>> +	}
>> +	pci_hotplug_mutex_recursive++;
>> +
>> +	return 1;
>> +}
>> +EXPORT_SYMBOL(pci_hotplug_try_enter);
>
> Don't try to invent new lock types like this, you are bound to get it
> wrong.  And don't create recursive locks, fix the code up to never need
> it.
>
> thanks,
>
> greg k-h
>
> .
>



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 02/19] PCI: introduce recursive rwsem to serialize PCI hotplug operations
  2012-05-02  7:25     ` Jiang Liu
@ 2012-05-02 21:46       ` Greg KH
  0 siblings, 0 replies; 28+ messages in thread
From: Greg KH @ 2012-05-02 21:46 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Jiang Liu, Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas,
	Don Dutile, Keping Chen, linux-kernel, linux-pci

On Wed, May 02, 2012 at 03:25:21PM +0800, Jiang Liu wrote:
> On 2012-5-2 13:00, Greg KH wrote:
> >On Fri, Apr 27, 2012 at 11:16:43PM +0800, Jiang Liu wrote:
> >>From: Jiang Liu<jiang.liu@huawei.com>
> >>
> >>There are multiple ways to trigger PCI hotplug requests concurrently,
> >>such as:
> >>1. Sysfs interfaces exported by the PCI core subsystem
> >>2. Sysfs interfaces exported by the PCI hotplug subsystem
> >>3. PCI hotplug events triggered by PCI Hotplug Controllers
> >>4. ACPI hotplug events for PCI host bridges
> >>5. Driver binding/unbinding events
> >>
> >>The PCI core subsystem doesn't support concurrent hotplug operations yet,
> >>so all PCI hotplug requests should be globally serialized. This patch
> >>introduces several new interfaces to serialize PCI hotplug operations.
> >>
> >>pci_hotplug_try_enter(): try to acquire write lock
> >
> >Ick, no, why would you ever want to do that?
> >
> >>pci_hotplug_enter(): acquire write lock
> >>pci_hotplug_exit(): release write lock
> >>pci_hotplug_disable(): acquire read lock
> >>pci_hotplug_enable(): release read lock
> >
> >No, the pci hotplug core should not need a rwsem, just a simple lock, if
> >that:
> >	pci_hotplug_lock()
> >	pci_hotplug_unlock()
> >and that's it.
> >
> >Really you should not need these functions, the pci hotplug core should
> >handle it for you, and the drivers should not care at all.  That's the
> >"proper" way to fix this up, serialize stuff within the pci core, not
> >the individual drivers.
> >
> >>Today we have reproduced the issue on a real platform by using
> >>acpiphp driver. It's an IA64 platform running Suse 11SP1 (official
> >>2.6.32.12 kernel). The test script is:
> >>
> >>This issue could be reproduced on an IA64 platform with Suse 11SP1
> >>(official 2.6.32.12 kernel) and acpiphp driver.
> >
> >You do realize just how old that kernel is right?  Please don't assume
> >that this kernel version is relevant to us all these years later.
> This could be reproduced with latest kernel too, such as 3.3. We
> use 2.6.32.12 because the all 3.x series kernels can't boot on
> our IA64 platform due to an issue in scheduler. We have reproduced
> similar issue on x86 platforms with 3.x kernels.

That's wonderful, and is relevant, unlike what you originally wrote.

Please always write relevant changelog comments, otherwise they are...

greg k-h

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 04/19] PCI: serialize hotplug operations triggered by PCI hotplug sysfs interfaces
  2012-05-02  7:20     ` Jiang Liu
@ 2012-05-02 21:48       ` Greg KH
  0 siblings, 0 replies; 28+ messages in thread
From: Greg KH @ 2012-05-02 21:48 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Jiang Liu, Yinghai Lu, Kenji Kaneshige, Bjorn Helgaas,
	Don Dutile, Keping Chen, linux-kernel, linux-pci

On Wed, May 02, 2012 at 03:20:15PM +0800, Jiang Liu wrote:
> On 2012-5-2 13:06, Greg KH wrote:
> >On Fri, Apr 27, 2012 at 11:16:45PM +0800, Jiang Liu wrote:
> >>+	/* Avoid deadlock with pci_hp_deregister() */
> >>+	while (!pci_hotplug_try_enter()) {
> >>+		/* Check whether the slot has been deregistered. */
> >>+		if (list_empty(&slot->slot_list)) {
> >>+			retval = -ENODEV;
> >>+			goto exit_put;
> >>+		}
> >>+		msleep(1);
> >>+	}
> >
> >Oh my.
> >
> >Wow.
> >
> >{sigh}
> >
> >ick.
> >
> >My eyes hurt.
> >
> >And your cpu load just went crazy.
> >
> >You can now handle all of the nasty emails from sysadmins asking why
> >their systems look like their load is high for no good reason.
> My bad, should use schedule_timeout_interruptible() instead of
> msleep(1) here to avoid busy waiting.

Um, no, that's not my main point here at all.

My main point is this type of loop is the wrong thing to do in the first
place.

Please redo this whole thing, as explained in my first email in this
thread.

greg k-h

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2012-05-02 21:48 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-04-27 15:16 [PATCH v2 00/19] Introduce a global lock to serialize all PCI hotplug Jiang Liu
2012-04-27 15:16 ` [PATCH v2 01/19] PCI: introduce pci_bus_get()/pci_bus_put() to hide PCI implementation details Jiang Liu
2012-04-27 15:16 ` [PATCH v2 02/19] PCI: introduce recursive rwsem to serialize PCI hotplug operations Jiang Liu
2012-05-02  5:00   ` Greg KH
2012-05-02  7:25     ` Jiang Liu
2012-05-02 21:46       ` Greg KH
2012-04-27 15:16 ` [PATCH v2 03/19] PCI: replace pci_remove_rescan_mutex with the PCI hotplug lock Jiang Liu
2012-04-27 15:16 ` [PATCH v2 04/19] PCI: serialize hotplug operations triggered by PCI hotplug sysfs interfaces Jiang Liu
2012-05-02  5:06   ` Greg KH
2012-05-02  7:20     ` Jiang Liu
2012-05-02 21:48       ` Greg KH
2012-04-27 15:16 ` [PATCH v2 05/19] PCI: correctly flush workqueue when destroy pcie hotplug controller Jiang Liu
2012-05-02  5:08   ` Greg KH
2012-04-27 15:16 ` [PATCH v2 06/19] PCI: prepare for serializing hotplug operations triggered by pciehp driver Jiang Liu
2012-05-02  5:10   ` Greg KH
2012-04-27 15:16 ` [PATCH v2 07/19] PCI: serialize hotplug operaitons triggered by the " Jiang Liu
2012-04-27 15:16 ` [PATCH v2 08/19] PCI: fix two race windows when probing/removing SHPC controller Jiang Liu
2012-04-27 15:16 ` [PATCH v2 09/19] PCI: correctly flush workqueues and timer when destroy " Jiang Liu
2012-04-27 15:16 ` [PATCH v2 10/19] PCI: serialize hotplug operaitons triggered by the shpchp driver Jiang Liu
2012-04-27 15:16 ` [PATCH v2 11/19] PCI: release IO resource in error handling path in cpcihp_generic_init() Jiang Liu
2012-04-27 15:16 ` [PATCH v2 12/19] PCI: clean up all resources in error handling path in zt5550_hc_init_one() Jiang Liu
2012-04-27 15:16 ` [PATCH v2 13/19] PCI: trivial code clean up in cpci_hotplug_core.c Jiang Liu
2012-04-27 15:16 ` [PATCH v2 14/19] PCI: fix race windows when shutting down cpcihp controller Jiang Liu
2012-04-27 15:16 ` [PATCH v2 15/19] PCI: hold a reference count to the PCI bus used by cpcihp drivers Jiang Liu
2012-04-27 15:16 ` [PATCH v2 16/19] PCI: serialize PCI hotplug operations triggered " Jiang Liu
2012-04-27 15:16 ` [PATCH v2 17/19] PCI: serialize PCI hotplug operations triggered by fakephp drivers Jiang Liu
2012-04-27 15:16 ` [PATCH v2 18/19] PCI, sysfs: Use device_type and attr_groups with pci dev Jiang Liu
2012-04-27 15:17 ` [PATCH v2 19/19] PCI: hide sys interface 'remove' and 'rescan' for SR-IOV virtual devices Jiang Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).