linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] pci: fix invalid list entry warning for double pci device removing via sysfs
@ 2013-10-30 10:14 Gu Zheng
  2013-10-30 17:08 ` Bjorn Helgaas
  2013-10-31  6:45 ` Yinghai Lu
  0 siblings, 2 replies; 6+ messages in thread
From: Gu Zheng @ 2013-10-30 10:14 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Yinghai Lu, linux-pci, linux-kernel

When concurent removing pci devices which are in the same pci subtree
via sysfs, such as:
echo -n 1 > /sys/bus/pci/devices/0000\:10\:00.0/remove ; echo -n 1 >
/sys/bus/pci/devices/0000\:1a\:01.0/remove
(1a:01.0 device is downstream from the 10:00.0 bridge)

the following warning will show:
[ 1799.280918] ------------[ cut here ]------------
[ 1799.336199] WARNING: CPU: 7 PID: 126 at lib/list_debug.c:53 __list_del_entry+0x63/0xd0()
[ 1799.433093] list_del corruption, ffff8807b4a7c000->next is LIST_POISON1 (dead000000100100)
[ 1799.532110] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables sg dm_mirror dm_region_hash dm_log dm_mod vfat fat e1000e igb iTCO_wdt iTCO_vendor_support ioatdma ptp i7core_edac ipmi_si edac_core lpc_ich pps_core i2c_i801 pcspkr mfd_core dca ipmi_msghandler acpi_cpufreq xfs libcrc32c sd_mod crc_t10dif crct10dif_common i2c_algo_bit drm_kms_helper ttm drm mptsas scsi_transport_sas mptscsih i2c_core megaraid_sas mptbase
[ 1800.276623] CPU: 7 PID: 126 Comm: kworker/u512:1 Tainted: G        W    3.12.0-rc5+ #196
[ 1800.508918] Workqueue: sysfsd sysfs_schedule_callback_work
[ 1800.574703]  0000000000000009 ffff8807adbadbd8 ffffffff8168b26c ffff8807c27d08a8
[ 1800.663860]  ffff8807adbadc28 ffff8807adbadc18 ffffffff810711dc ffff8807adbadc68
[ 1800.753130]  ffff8807b4a7c000 ffff8807b4a7c000 ffff8807ad089c00 0000000000000000
[ 1800.842282] Call Trace:
[ 1800.871651]  [<ffffffff8168b26c>] dump_stack+0x55/0x76
[ 1800.933301]  [<ffffffff810711dc>] warn_slowpath_common+0x8c/0xc0
[ 1801.005283]  [<ffffffff810712c6>] warn_slowpath_fmt+0x46/0x50
[ 1801.074081]  [<ffffffff8135a343>] __list_del_entry+0x63/0xd0
[ 1801.141839]  [<ffffffff8135a3c1>] list_del+0x11/0x40
[ 1801.201320]  [<ffffffff813734da>] pci_remove_bus_device+0x6a/0xe0
[ 1801.274279]  [<ffffffff8137356e>] pci_stop_and_remove_bus_device+0x1e/0x30
[ 1801.356606]  [<ffffffff8137b20b>] remove_callback+0x2b/0x40
[ 1801.423412]  [<ffffffff81251848>] sysfs_schedule_callback_work+0x18/0x60
[ 1801.503744]  [<ffffffff8108eab5>] process_one_work+0x1f5/0x540
[ 1801.573640]  [<ffffffff8108ea53>] ? process_one_work+0x193/0x540
[ 1801.645616]  [<ffffffff8108f2ac>] worker_thread+0x11c/0x370
[ 1801.712337]  [<ffffffff8108f190>] ? rescuer_thread+0x350/0x350
[ 1801.782178]  [<ffffffff8109731d>] kthread+0xed/0x100
[ 1801.841661]  [<ffffffff81097230>] ? kthread_create_on_node+0x160/0x160
[ 1801.919919]  [<ffffffff8169cc3c>] ret_from_fork+0x7c/0xb0
[ 1801.984608]  [<ffffffff81097230>] ? kthread_create_on_node+0x160/0x160
[ 1802.062825] ---[ end trace d77f2054de000fb7 ]---

This issue is related to the bug 54411:
https://bugzilla.kernel.org/show_bug.cgi?id=54411

Since we added the pci_bus reference management, the bug becomes a
invalid list entry warning as descripted above. Beacuse it still
trys to delete an deleted pci device from device list.

So here we make stop device actually detach driver only, and remove
device will do device_del instead, and move bus_list change and pci device
resource free into pci_release_dev, so that it'll consistent with
bus reference managment, and the device only can be deleted from device
list in pci_release_dev just for one time.

Besides, it also makes hostbridge to use device_unregister to be pair
with device_register before.

This patch is based on Yinghai's privious similar patch:
http://lkml.org/lkml/2013/5/13/658

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
---
 drivers/pci/probe.c  | 23 +++++++++++++++++++++--
 drivers/pci/remove.c | 32 ++++++++------------------------
 2 files changed, 29 insertions(+), 26 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 7ef0f86..c639f44 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1156,6 +1156,21 @@ static void pci_release_capabilities(struct pci_dev *dev)
 	pci_free_cap_save_buffers(dev);
 }
 
+static void pci_free_resources(struct pci_dev *dev)
+{
+	int i;
+
+	msi_remove_pci_irq_vectors(dev);
+
+	pci_cleanup_rom(dev);
+	for (i = 0; i < PCI_NUM_RESOURCES; i++) {
+		struct resource *res = dev->resource + i;
+
+		if (res->parent)
+			release_resource(res);
+	}
+}
+
 /**
  * pci_release_dev - free a pci device structure when all users of it are finished.
  * @dev: device that's been disconnected
@@ -1165,9 +1180,13 @@ static void pci_release_capabilities(struct pci_dev *dev)
  */
 static void pci_release_dev(struct device *dev)
 {
-	struct pci_dev *pci_dev;
+	struct pci_dev *pci_dev = to_pci_dev(dev);
+
+	down_write(&pci_bus_sem);
+	list_del(&pci_dev->bus_list);
+	up_write(&pci_bus_sem);
+	pci_free_resources(pci_dev);
 
-	pci_dev = to_pci_dev(dev);
 	pci_release_capabilities(pci_dev);
 	pci_release_of_node(pci_dev);
 	pcibios_release_device(pci_dev);
diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index 8fc54b7..5659283 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -3,20 +3,6 @@
 #include <linux/pci-aspm.h>
 #include "pci.h"
 
-static void pci_free_resources(struct pci_dev *dev)
-{
-	int i;
-
- 	msi_remove_pci_irq_vectors(dev);
-
-	pci_cleanup_rom(dev);
-	for (i = 0; i < PCI_NUM_RESOURCES; i++) {
-		struct resource *res = dev->resource + i;
-		if (res->parent)
-			release_resource(res);
-	}
-}
-
 static void pci_stop_dev(struct pci_dev *dev)
 {
 	pci_pme_active(dev, false);
@@ -24,8 +10,7 @@ static void pci_stop_dev(struct pci_dev *dev)
 	if (dev->is_added) {
 		pci_proc_detach_device(dev);
 		pci_remove_sysfs_dev_files(dev);
-		device_del(&dev->dev);
-		dev->is_added = 0;
+		device_release_driver(&dev->dev);
 	}
 
 	if (dev->bus->self)
@@ -34,12 +19,11 @@ static void pci_stop_dev(struct pci_dev *dev)
 
 static void pci_destroy_dev(struct pci_dev *dev)
 {
-	down_write(&pci_bus_sem);
-	list_del(&dev->bus_list);
-	up_write(&pci_bus_sem);
-
-	pci_free_resources(dev);
-	put_device(&dev->dev);
+	if (dev->is_added) {
+		device_del(&dev->dev);
+		put_device(&dev->dev);
+		dev->is_added = 0;
+	}
 }
 
 void pci_remove_bus(struct pci_bus *bus)
@@ -126,7 +110,7 @@ void pci_stop_root_bus(struct pci_bus *bus)
 		pci_stop_bus_device(child);
 
 	/* stop the host bridge */
-	device_del(&host_bridge->dev);
+	device_release_driver(&host_bridge->dev);
 }
 
 void pci_remove_root_bus(struct pci_bus *bus)
@@ -145,5 +129,5 @@ void pci_remove_root_bus(struct pci_bus *bus)
 	host_bridge->bus = NULL;
 
 	/* remove the host bridge */
-	put_device(&host_bridge->dev);
+	device_unregister(&host_bridge->dev);
 }
-- 
1.8.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] pci: fix invalid list entry warning for double pci device removing via sysfs
  2013-10-30 10:14 [PATCH] pci: fix invalid list entry warning for double pci device removing via sysfs Gu Zheng
@ 2013-10-30 17:08 ` Bjorn Helgaas
  2013-10-31 10:11   ` Gu Zheng
  2013-10-31  6:45 ` Yinghai Lu
  1 sibling, 1 reply; 6+ messages in thread
From: Bjorn Helgaas @ 2013-10-30 17:08 UTC (permalink / raw)
  To: Gu Zheng; +Cc: Yinghai Lu, linux-pci, linux-kernel

On Wed, Oct 30, 2013 at 06:14:20PM +0800, Gu Zheng wrote:
> When concurent removing pci devices which are in the same pci subtree
> via sysfs, such as:
> echo -n 1 > /sys/bus/pci/devices/0000\:10\:00.0/remove ; echo -n 1 >
> /sys/bus/pci/devices/0000\:1a\:01.0/remove
> (1a:01.0 device is downstream from the 10:00.0 bridge)
> 
> the following warning will show:
> [ 1799.280918] ------------[ cut here ]------------
> [ 1799.336199] WARNING: CPU: 7 PID: 126 at lib/list_debug.c:53 __list_del_entry+0x63/0xd0()
> [ 1799.433093] list_del corruption, ffff8807b4a7c000->next is LIST_POISON1 (dead000000100100)
> [ 1799.532110] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables sg dm_mirror dm_region_hash dm_log dm_mod vfat fat e1000e igb iTCO_wdt iTCO_vendor_support ioatdma ptp i7core_edac ipmi_si edac_core lpc_ich pps_core i2c_i801 pcspkr mfd_core dca ipmi_msghandler acpi_cpufreq xfs libcrc32c sd_mod crc_t10dif crct10dif_common i2c_algo_bit drm_kms_helper ttm drm mptsas scsi_transport_sas mptscsih i2c_core megaraid_sas mptbase
> [ 1800.276623] CPU: 7 PID: 126 Comm: kworker/u512:1 Tainted: G        W    3.12.0-rc5+ #196
> [ 1800.508918] Workqueue: sysfsd sysfs_schedule_callback_work
> [ 1800.574703]  0000000000000009 ffff8807adbadbd8 ffffffff8168b26c ffff8807c27d08a8
> [ 1800.663860]  ffff8807adbadc28 ffff8807adbadc18 ffffffff810711dc ffff8807adbadc68
> [ 1800.753130]  ffff8807b4a7c000 ffff8807b4a7c000 ffff8807ad089c00 0000000000000000
> [ 1800.842282] Call Trace:
> [ 1800.871651]  [<ffffffff8168b26c>] dump_stack+0x55/0x76
> [ 1800.933301]  [<ffffffff810711dc>] warn_slowpath_common+0x8c/0xc0
> [ 1801.005283]  [<ffffffff810712c6>] warn_slowpath_fmt+0x46/0x50
> [ 1801.074081]  [<ffffffff8135a343>] __list_del_entry+0x63/0xd0
> [ 1801.141839]  [<ffffffff8135a3c1>] list_del+0x11/0x40
> [ 1801.201320]  [<ffffffff813734da>] pci_remove_bus_device+0x6a/0xe0
> [ 1801.274279]  [<ffffffff8137356e>] pci_stop_and_remove_bus_device+0x1e/0x30
> [ 1801.356606]  [<ffffffff8137b20b>] remove_callback+0x2b/0x40
> [ 1801.423412]  [<ffffffff81251848>] sysfs_schedule_callback_work+0x18/0x60
> [ 1801.503744]  [<ffffffff8108eab5>] process_one_work+0x1f5/0x540
> [ 1801.573640]  [<ffffffff8108ea53>] ? process_one_work+0x193/0x540
> [ 1801.645616]  [<ffffffff8108f2ac>] worker_thread+0x11c/0x370
> [ 1801.712337]  [<ffffffff8108f190>] ? rescuer_thread+0x350/0x350
> [ 1801.782178]  [<ffffffff8109731d>] kthread+0xed/0x100
> [ 1801.841661]  [<ffffffff81097230>] ? kthread_create_on_node+0x160/0x160
> [ 1801.919919]  [<ffffffff8169cc3c>] ret_from_fork+0x7c/0xb0
> [ 1801.984608]  [<ffffffff81097230>] ? kthread_create_on_node+0x160/0x160
> [ 1802.062825] ---[ end trace d77f2054de000fb7 ]---
> 
> This issue is related to the bug 54411:
> https://bugzilla.kernel.org/show_bug.cgi?id=54411
> 
> Since we added the pci_bus reference management, the bug becomes a
> invalid list entry warning as descripted above. Beacuse it still
> trys to delete an deleted pci device from device list.
> 
> So here we make stop device actually detach driver only, and remove
> device will do device_del instead, and move bus_list change and pci device
> resource free into pci_release_dev, so that it'll consistent with
> bus reference managment, and the device only can be deleted from device
> list in pci_release_dev just for one time.

You have a good argument for moving the &dev->bus_list maintenance from
pci_destroy_dev() to pci_release_dev(), because I think we can call
pci_destroy_dev() twice for the same device, and we don't want to do the
list_del() twice.

I suspect we want to move other stuff, too, but you haven't explained
why we need the device_del(), device_release_driver(), and
pci_free_resources() changes.  Possibly similar arguments apply.  I'd
rather see separate patches to move these pieces so we can think about
them individually.

> Besides, it also makes hostbridge to use device_unregister to be pair
> with device_register before.

Please split the host bridge changes into a separate patch.  One logical
change, one patch.  If they *can* be split up, please split them up.

> This patch is based on Yinghai's privious similar patch:
> http://lkml.org/lkml/2013/5/13/658

There are seven patches in that series.  I don't know which one you're
referring to.

Please run a spell checker on your changelog.  There are many spelling
errors.

> Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
> ---
>  drivers/pci/probe.c  | 23 +++++++++++++++++++++--
>  drivers/pci/remove.c | 32 ++++++++------------------------
>  2 files changed, 29 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 7ef0f86..c639f44 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1156,6 +1156,21 @@ static void pci_release_capabilities(struct pci_dev *dev)
>  	pci_free_cap_save_buffers(dev);
>  }
>  
> +static void pci_free_resources(struct pci_dev *dev)
> +{
> +	int i;
> +
> +	msi_remove_pci_irq_vectors(dev);
> +
> +	pci_cleanup_rom(dev);
> +	for (i = 0; i < PCI_NUM_RESOURCES; i++) {
> +		struct resource *res = dev->resource + i;
> +
> +		if (res->parent)
> +			release_resource(res);
> +	}
> +}
> +
>  /**
>   * pci_release_dev - free a pci device structure when all users of it are finished.
>   * @dev: device that's been disconnected
> @@ -1165,9 +1180,13 @@ static void pci_release_capabilities(struct pci_dev *dev)
>   */
>  static void pci_release_dev(struct device *dev)
>  {
> -	struct pci_dev *pci_dev;
> +	struct pci_dev *pci_dev = to_pci_dev(dev);
> +
> +	down_write(&pci_bus_sem);
> +	list_del(&pci_dev->bus_list);
> +	up_write(&pci_bus_sem);
> +	pci_free_resources(pci_dev);
>  
> -	pci_dev = to_pci_dev(dev);
>  	pci_release_capabilities(pci_dev);
>  	pci_release_of_node(pci_dev);
>  	pcibios_release_device(pci_dev);
> diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
> index 8fc54b7..5659283 100644
> --- a/drivers/pci/remove.c
> +++ b/drivers/pci/remove.c
> @@ -3,20 +3,6 @@
>  #include <linux/pci-aspm.h>
>  #include "pci.h"
>  
> -static void pci_free_resources(struct pci_dev *dev)
> -{
> -	int i;
> -
> - 	msi_remove_pci_irq_vectors(dev);
> -
> -	pci_cleanup_rom(dev);
> -	for (i = 0; i < PCI_NUM_RESOURCES; i++) {
> -		struct resource *res = dev->resource + i;
> -		if (res->parent)
> -			release_resource(res);
> -	}
> -}
> -
>  static void pci_stop_dev(struct pci_dev *dev)
>  {
>  	pci_pme_active(dev, false);
> @@ -24,8 +10,7 @@ static void pci_stop_dev(struct pci_dev *dev)
>  	if (dev->is_added) {
>  		pci_proc_detach_device(dev);
>  		pci_remove_sysfs_dev_files(dev);
> -		device_del(&dev->dev);
> -		dev->is_added = 0;
> +		device_release_driver(&dev->dev);
>  	}
>  
>  	if (dev->bus->self)
> @@ -34,12 +19,11 @@ static void pci_stop_dev(struct pci_dev *dev)
>  
>  static void pci_destroy_dev(struct pci_dev *dev)
>  {
> -	down_write(&pci_bus_sem);
> -	list_del(&dev->bus_list);
> -	up_write(&pci_bus_sem);
> -
> -	pci_free_resources(dev);
> -	put_device(&dev->dev);
> +	if (dev->is_added) {
> +		device_del(&dev->dev);
> +		put_device(&dev->dev);
> +		dev->is_added = 0;

This is wrong.  You can't reference dev->is_added after put_device()
because you have to assume you are holding the only reference, and the
pci_dev may be deallocated before put_device() returns.

> +	}
>  }
>  
>  void pci_remove_bus(struct pci_bus *bus)
> @@ -126,7 +110,7 @@ void pci_stop_root_bus(struct pci_bus *bus)
>  		pci_stop_bus_device(child);
>  
>  	/* stop the host bridge */
> -	device_del(&host_bridge->dev);
> +	device_release_driver(&host_bridge->dev);
>  }
>  
>  void pci_remove_root_bus(struct pci_bus *bus)
> @@ -145,5 +129,5 @@ void pci_remove_root_bus(struct pci_bus *bus)
>  	host_bridge->bus = NULL;
>  
>  	/* remove the host bridge */
> -	put_device(&host_bridge->dev);
> +	device_unregister(&host_bridge->dev);
>  }
> -- 
> 1.8.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] pci: fix invalid list entry warning for double pci device removing via sysfs
  2013-10-30 10:14 [PATCH] pci: fix invalid list entry warning for double pci device removing via sysfs Gu Zheng
  2013-10-30 17:08 ` Bjorn Helgaas
@ 2013-10-31  6:45 ` Yinghai Lu
  2013-10-31 10:12   ` Gu Zheng
  2013-10-31 13:06   ` Bjorn Helgaas
  1 sibling, 2 replies; 6+ messages in thread
From: Yinghai Lu @ 2013-10-31  6:45 UTC (permalink / raw)
  To: Gu Zheng; +Cc: Bjorn Helgaas, linux-pci, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4115 bytes --]

On Wed, Oct 30, 2013 at 3:14 AM, Gu Zheng <guz.fnst@cn.fujitsu.com> wrote:
> When concurent removing pci devices which are in the same pci subtree
> via sysfs, such as:
> echo -n 1 > /sys/bus/pci/devices/0000\:10\:00.0/remove ; echo -n 1 >
> /sys/bus/pci/devices/0000\:1a\:01.0/remove
> (1a:01.0 device is downstream from the 10:00.0 bridge)
>
> the following warning will show:
> [ 1799.280918] ------------[ cut here ]------------
> [ 1799.336199] WARNING: CPU: 7 PID: 126 at lib/list_debug.c:53 __list_del_entry+0x63/0xd0()
> [ 1799.433093] list_del corruption, ffff8807b4a7c000->next is LIST_POISON1 (dead000000100100)
> [ 1799.532110] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables sg dm_mirror dm_region_hash dm_log dm_mod vfat fat e1000e igb iTCO_wdt iTCO_vendor_support ioatdma ptp i7core_edac ipmi_si edac_core lpc_ich pps_core i2c_i801 pcspkr mfd_core dca ipmi_msghandler acpi_cpufreq xfs libcrc32c sd_mod crc_t10dif crct10dif_common i2c_algo_bit drm_kms_helper ttm drm mptsas scsi_transport_sas mptscsih i2c_core megaraid_sas mptbase
> [ 1800.276623] CPU: 7 PID: 126 Comm: kworker/u512:1 Tainted: G        W    3.12.0-rc5+ #196
> [ 1800.508918] Workqueue: sysfsd sysfs_schedule_callback_work
> [ 1800.574703]  0000000000000009 ffff8807adbadbd8 ffffffff8168b26c ffff8807c27d08a8
> [ 1800.663860]  ffff8807adbadc28 ffff8807adbadc18 ffffffff810711dc ffff8807adbadc68
> [ 1800.753130]  ffff8807b4a7c000 ffff8807b4a7c000 ffff8807ad089c00 0000000000000000
> [ 1800.842282] Call Trace:
> [ 1800.871651]  [<ffffffff8168b26c>] dump_stack+0x55/0x76
> [ 1800.933301]  [<ffffffff810711dc>] warn_slowpath_common+0x8c/0xc0
> [ 1801.005283]  [<ffffffff810712c6>] warn_slowpath_fmt+0x46/0x50
> [ 1801.074081]  [<ffffffff8135a343>] __list_del_entry+0x63/0xd0
> [ 1801.141839]  [<ffffffff8135a3c1>] list_del+0x11/0x40
> [ 1801.201320]  [<ffffffff813734da>] pci_remove_bus_device+0x6a/0xe0
> [ 1801.274279]  [<ffffffff8137356e>] pci_stop_and_remove_bus_device+0x1e/0x30
> [ 1801.356606]  [<ffffffff8137b20b>] remove_callback+0x2b/0x40
> [ 1801.423412]  [<ffffffff81251848>] sysfs_schedule_callback_work+0x18/0x60
> [ 1801.503744]  [<ffffffff8108eab5>] process_one_work+0x1f5/0x540
> [ 1801.573640]  [<ffffffff8108ea53>] ? process_one_work+0x193/0x540
> [ 1801.645616]  [<ffffffff8108f2ac>] worker_thread+0x11c/0x370
> [ 1801.712337]  [<ffffffff8108f190>] ? rescuer_thread+0x350/0x350
> [ 1801.782178]  [<ffffffff8109731d>] kthread+0xed/0x100
> [ 1801.841661]  [<ffffffff81097230>] ? kthread_create_on_node+0x160/0x160
> [ 1801.919919]  [<ffffffff8169cc3c>] ret_from_fork+0x7c/0xb0
> [ 1801.984608]  [<ffffffff81097230>] ? kthread_create_on_node+0x160/0x160
> [ 1802.062825] ---[ end trace d77f2054de000fb7 ]---
>
> This issue is related to the bug 54411:
> https://bugzilla.kernel.org/show_bug.cgi?id=54411
>
> Since we added the pci_bus reference management, the bug becomes a
> invalid list entry warning as descripted above. Beacuse it still
> trys to delete an deleted pci device from device list.
>
> So here we make stop device actually detach driver only, and remove
> device will do device_del instead, and move bus_list change and pci device
> resource free into pci_release_dev, so that it'll consistent with
> bus reference managment, and the device only can be deleted from device
> list in pci_release_dev just for one time.
>
> Besides, it also makes hostbridge to use device_unregister to be pair
> with device_register before.
>
> This patch is based on Yinghai's privious similar patch:
> http://lkml.org/lkml/2013/5/13/658

I have updated version, please check if attached patches fix the problem.

virtfn_release.patch
fix_cx3_hotplug_2.patch
fix_racing_removing_6_1.patch
fix_racing_removing_6_2.patch
fix_racing_removing_6_3.patch

Thanks

Yinghai

[-- Attachment #2: virtfn_release.patch --]
[-- Type: text/x-patch, Size: 4589 bytes --]

Subject: [PATCH] PCI: Release PF ref during removing VF
From: Yinghai Lu <yinghai@kernel.org>

commit dc087f2f6a2925e81831f3016b9cbb6e470e7423
(PCI: Simplify IOV implementation and fix reference count races)
broke the pcie native hotplug with SRIOV enabled: PF is not freed
during hot-remove, and later hot-add do not work as old pci_dev
is still around, and can not create new pci_dev.

That commit need VFs to be removed via pci_disable_sriov/virtfn_remove
to make sure ref to PF is put back.

So we can not call stop_and_remove for VF before PF, that will
make VF get removed directly before PF's driver try to use
pci_disable_sriov/virtfn_remove to do it.

Solution would be
1. separating stop and remove in two iterations.
2. or make pci_stop_and_remove_bus_device could be used on VF.

This patch is second solution:
Separate PF ref releasing from virtfn_remove, all that during
pci_destroy_dev for VFs.

-v2: fix warning when SRIOV is not defined.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/iov.c    |   34 +++++++++++++++++++++++++++++-----
 drivers/pci/pci.h    |    4 ++++
 drivers/pci/remove.c |   17 +++++++++++++++++
 3 files changed, 50 insertions(+), 5 deletions(-)

Index: linux-2.6/drivers/pci/iov.c
===================================================================
--- linux-2.6.orig/drivers/pci/iov.c
+++ linux-2.6/drivers/pci/iov.c
@@ -132,9 +132,37 @@ failed:
 	return rc;
 }
 
-static void virtfn_remove(struct pci_dev *dev, int id, int reset)
+void virtfn_release(struct pci_dev *virtfn)
 {
+	int i;
+	struct pci_dev *dev;
 	char buf[VIRTFN_ID_LEN];
+
+	if (!virtfn->is_virtfn)
+		return;
+
+	dev = virtfn->physfn;
+
+	/*
+	 * Remove link to virtfn for PF.
+	 * pci_disable_sriov() could be called after pci_stop_dev() is
+	 * called for PF. So check sd to avoid spurious sfsfs warning.
+	 */
+	if (dev->dev.kobj.sd)
+		for (i = 0; i < dev->sriov->num_VFs; i++)
+			if ((virtfn_bus(dev, i) == virtfn->bus->number) &&
+			    (virtfn_devfn(dev, i) == virtfn->devfn)) {
+				sprintf(buf, "virtfn%u", i);
+				sysfs_remove_link(&dev->dev.kobj, buf);
+				break;
+			}
+
+	virtfn_remove_bus(dev->bus, virtfn->bus);
+	pci_dev_put(dev);
+}
+
+static void virtfn_remove(struct pci_dev *dev, int id, int reset)
+{
 	struct pci_dev *virtfn;
 	struct pci_sriov *iov = dev->sriov;
 
@@ -149,8 +177,6 @@ static void virtfn_remove(struct pci_dev
 		__pci_reset_function(virtfn);
 	}
 
-	sprintf(buf, "virtfn%u", id);
-	sysfs_remove_link(&dev->dev.kobj, buf);
 	/*
 	 * pci_stop_dev() could have been called for this virtfn already,
 	 * so the directory for the virtfn may have been removed before.
@@ -161,12 +187,10 @@ static void virtfn_remove(struct pci_dev
 
 	mutex_lock(&iov->dev->sriov->lock);
 	pci_stop_and_remove_bus_device(virtfn);
-	virtfn_remove_bus(dev->bus, virtfn->bus);
 	mutex_unlock(&iov->dev->sriov->lock);
 
 	/* balance pci_get_domain_bus_and_slot() */
 	pci_dev_put(virtfn);
-	pci_dev_put(dev);
 }
 
 static int sriov_migration(struct pci_dev *dev)
Index: linux-2.6/drivers/pci/pci.h
===================================================================
--- linux-2.6.orig/drivers/pci/pci.h
+++ linux-2.6/drivers/pci/pci.h
@@ -259,6 +259,7 @@ int pci_iov_resource_bar(struct pci_dev
 resource_size_t pci_sriov_resource_alignment(struct pci_dev *dev, int resno);
 void pci_restore_iov_state(struct pci_dev *dev);
 int pci_iov_bus_range(struct pci_bus *bus);
+void virtfn_release(struct pci_dev *dev);
 
 #else
 static inline int pci_iov_init(struct pci_dev *dev)
@@ -281,6 +282,9 @@ static inline int pci_iov_bus_range(stru
 {
 	return 0;
 }
+static inline void virtfn_release(struct pci_dev *dev)
+{
+}
 
 #endif /* CONFIG_PCI_IOV */
 
Index: linux-2.6/drivers/pci/remove.c
===================================================================
--- linux-2.6.orig/drivers/pci/remove.c
+++ linux-2.6/drivers/pci/remove.c
@@ -38,6 +38,8 @@ static void pci_destroy_dev(struct pci_d
 	list_del(&dev->bus_list);
 	up_write(&pci_bus_sem);
 
+	virtfn_release(dev);
+
 	pci_free_resources(dev);
 	put_device(&dev->dev);
 }
@@ -107,8 +109,23 @@ static void pci_remove_bus_device(struct
  */
 void pci_stop_and_remove_bus_device(struct pci_dev *dev)
 {
+#ifdef CONFIG_PCI_ATS
+	struct pci_sriov *iov = NULL;
+	int locked = 0;
+
+	if (dev->is_virtfn) {
+		iov = dev->physfn->sriov;
+		locked = mutex_trylock(&iov->dev->sriov->lock);
+	}
+#endif
+
 	pci_stop_bus_device(dev);
 	pci_remove_bus_device(dev);
+
+#ifdef CONFIG_PCI_ATS
+	if (locked)
+		mutex_unlock(&iov->dev->sriov->lock);
+#endif
 }
 EXPORT_SYMBOL(pci_stop_and_remove_bus_device);
 

[-- Attachment #3: fix_cx3_hotplug_2.patch --]
[-- Type: text/x-patch, Size: 1575 bytes --]

Subject: [PATCH v2 3/3] PCI: Stop sriov after stop PF if PF's driver skip that
From: Yinghai Lu <yinghai@kernel.org>

After commit dc087f2f6a2925e81831f3016b9cbb6e470e7423
(PCI: Simplify IOV implementation and fix reference count races)
VF need to be removed via virtfn_remove to make sure ref to PF
is put back.

Some driver (like ixgbe) does not call pci_disable_sriov() if
sriov is enabled via /sys/.../sriov_numvfs setting.
ixgbe does allow driver for PF get detached, but still have VFs
around.

But how about PF get removed via /sys or pciehp finally?

During hot-remove, VF will still hold one ref to PF and it
prevent PF to be removed.
That make the next hot-add fails, as old PF dev struct is still around.

We need to add pci_disable_sriov() calling during stop PF .

Need this one for v3.11

-v2: Accoring to Bjorn, move that calling to pci_stop_dev.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Donald Dutile <ddutile@redhat.com>
Cc: Greg Rose <gregory.v.rose@intel.com>

---
 drivers/pci/remove.c |    2 ++
 1 file changed, 2 insertions(+)

Index: linux-2.6/drivers/pci/remove.c
===================================================================
--- linux-2.6.orig/drivers/pci/remove.c
+++ linux-2.6/drivers/pci/remove.c
@@ -25,6 +25,8 @@ static void pci_stop_dev(struct pci_dev
 		pci_proc_detach_device(dev);
 		pci_remove_sysfs_dev_files(dev);
 		device_del(&dev->dev);
+		/* remove VF, if PF driver skip that */
+		pci_disable_sriov(dev);
 		dev->is_added = 0;
 	}
 

[-- Attachment #4: fix_racing_removing_6_1.patch --]
[-- Type: text/x-patch, Size: 1220 bytes --]

Subject: [PATCH 1/7] PCI: move back pci_proc_attach_devices calling
From: Yinghai Lu <yinghai@kernel.org>

We stop detach proc when pci_stop_device.
So should attach that during pci_bus_add_device.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/bus.c   |    1 +
 drivers/pci/probe.c |    2 --
 2 files changed, 1 insertion(+), 2 deletions(-)

Index: linux-2.6/drivers/pci/probe.c
===================================================================
--- linux-2.6.orig/drivers/pci/probe.c
+++ linux-2.6/drivers/pci/probe.c
@@ -1382,8 +1382,6 @@ void pci_device_add(struct pci_dev *dev,
 	dev->match_driver = false;
 	ret = device_add(&dev->dev);
 	WARN_ON(ret < 0);
-
-	pci_proc_attach_device(dev);
 }
 
 struct pci_dev *__ref pci_scan_single_device(struct pci_bus *bus, int devfn)
Index: linux-2.6/drivers/pci/bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/bus.c
+++ linux-2.6/drivers/pci/bus.c
@@ -175,6 +175,7 @@ int pci_bus_add_device(struct pci_dev *d
 	 * are not assigned yet for some devices.
 	 */
 	pci_fixup_device(pci_fixup_final, dev);
+	pci_proc_attach_device(dev);
 	pci_create_sysfs_dev_files(dev);
 
 	dev->match_driver = true;

[-- Attachment #5: fix_racing_removing_6_2.patch --]
[-- Type: text/x-patch, Size: 2578 bytes --]

Subject: [PATCH 2/7] PCI: move resources and bus_list releasing to pci_release_dev
From: Yinghai Lu <yinghai@kernel.org>

We should not release resource in pci_destroy that is too early
as there could be still other use hold reference.

release them or remove it from bus devices list at last
in pci_release_dev instead.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/probe.c  |   25 +++++++++++++++++++++++--
 drivers/pci/remove.c |   21 ---------------------
 2 files changed, 23 insertions(+), 23 deletions(-)

Index: linux-2.6/drivers/pci/probe.c
===================================================================
--- linux-2.6.orig/drivers/pci/probe.c
+++ linux-2.6/drivers/pci/probe.c
@@ -1155,6 +1155,20 @@ static void pci_release_capabilities(str
 	pci_free_cap_save_buffers(dev);
 }
 
+static void pci_free_resources(struct pci_dev *dev)
+{
+	int i;
+
+	msi_remove_pci_irq_vectors(dev);
+
+	pci_cleanup_rom(dev);
+	for (i = 0; i < PCI_NUM_RESOURCES; i++) {
+		struct resource *res = dev->resource + i;
+		if (res->parent)
+			release_resource(res);
+	}
+}
+
 /**
  * pci_release_dev - free a pci device structure when all users of it are finished.
  * @dev: device that's been disconnected
@@ -1164,9 +1178,16 @@ static void pci_release_capabilities(str
  */
 static void pci_release_dev(struct device *dev)
 {
-	struct pci_dev *pci_dev;
+	struct pci_dev *pci_dev = to_pci_dev(dev);
+
+	down_write(&pci_bus_sem);
+	list_del(&pci_dev->bus_list);
+	up_write(&pci_bus_sem);
+
+	virtfn_release(pci_dev);
+
+	pci_free_resources(pci_dev);
 
-	pci_dev = to_pci_dev(dev);
 	pci_release_capabilities(pci_dev);
 	pci_release_of_node(pci_dev);
 	pcibios_release_device(pci_dev);
Index: linux-2.6/drivers/pci/remove.c
===================================================================
--- linux-2.6.orig/drivers/pci/remove.c
+++ linux-2.6/drivers/pci/remove.c
@@ -3,20 +3,6 @@
 #include <linux/pci-aspm.h>
 #include "pci.h"
 
-static void pci_free_resources(struct pci_dev *dev)
-{
-	int i;
-
- 	msi_remove_pci_irq_vectors(dev);
-
-	pci_cleanup_rom(dev);
-	for (i = 0; i < PCI_NUM_RESOURCES; i++) {
-		struct resource *res = dev->resource + i;
-		if (res->parent)
-			release_resource(res);
-	}
-}
-
 static void pci_stop_dev(struct pci_dev *dev)
 {
 	pci_pme_active(dev, false);
@@ -36,13 +22,6 @@ static void pci_stop_dev(struct pci_dev
 
 static void pci_destroy_dev(struct pci_dev *dev)
 {
-	down_write(&pci_bus_sem);
-	list_del(&dev->bus_list);
-	up_write(&pci_bus_sem);
-
-	virtfn_release(dev);
-
-	pci_free_resources(dev);
 	put_device(&dev->dev);
 }
 

[-- Attachment #6: fix_racing_removing_6_3.patch --]
[-- Type: text/x-patch, Size: 1544 bytes --]

Subject: [PATCH 3/7] PCI: Destroy pci dev only once
From: Yinghai Lu <yinghai@kernel.org>

Mutliple removing via /sys will call pci_destroy_dev two times.

Add is_removed to record if pci_destroy_dev is called already.

During second calling, still have extra dev ref hold via
device_schedule_call, so we are safe to check dev->is_removed.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/remove.c |    5 ++++-
 include/linux/pci.h  |    1 +
 2 files changed, 5 insertions(+), 1 deletion(-)

Index: linux-2.6/drivers/pci/remove.c
===================================================================
--- linux-2.6.orig/drivers/pci/remove.c
+++ linux-2.6/drivers/pci/remove.c
@@ -22,7 +22,10 @@ static void pci_stop_dev(struct pci_dev
 
 static void pci_destroy_dev(struct pci_dev *dev)
 {
-	put_device(&dev->dev);
+	if (!dev->is_removed) {
+		dev->is_removed = 1;
+		put_device(&dev->dev);
+	}
 }
 
 void pci_remove_bus(struct pci_bus *bus)
Index: linux-2.6/include/linux/pci.h
===================================================================
--- linux-2.6.orig/include/linux/pci.h
+++ linux-2.6/include/linux/pci.h
@@ -308,6 +308,7 @@ struct pci_dev {
 	unsigned int	multifunction:1;/* Part of multi-function device */
 	/* keep track of device state */
 	unsigned int	is_added:1;
+	unsigned int	is_removed:1;	/* pci_destroy_dev is called */
 	unsigned int	is_busmaster:1; /* device is busmaster */
 	unsigned int	no_msi:1;	/* device may not use msi */
 	unsigned int	block_cfg_access:1;	/* config space access is blocked */

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] pci: fix invalid list entry warning for double pci device removing via sysfs
  2013-10-30 17:08 ` Bjorn Helgaas
@ 2013-10-31 10:11   ` Gu Zheng
  0 siblings, 0 replies; 6+ messages in thread
From: Gu Zheng @ 2013-10-31 10:11 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Yinghai Lu, linux-pci, linux-kernel

Hi Bjorn,
Thanks for your comments.:)
Please refer to inline.

Best regards,
Gu

On 10/31/2013 01:08 AM, Bjorn Helgaas wrote:

> On Wed, Oct 30, 2013 at 06:14:20PM +0800, Gu Zheng wrote:
>> When concurent removing pci devices which are in the same pci subtree
>> via sysfs, such as:
>> echo -n 1 > /sys/bus/pci/devices/0000\:10\:00.0/remove ; echo -n 1 >
>> /sys/bus/pci/devices/0000\:1a\:01.0/remove
>> (1a:01.0 device is downstream from the 10:00.0 bridge)
>>
>> the following warning will show:
>> [ 1799.280918] ------------[ cut here ]------------
>> [ 1799.336199] WARNING: CPU: 7 PID: 126 at lib/list_debug.c:53 __list_del_entry+0x63/0xd0()
>> [ 1799.433093] list_del corruption, ffff8807b4a7c000->next is LIST_POISON1 (dead000000100100)
>> [ 1799.532110] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables sg dm_mirror dm_region_hash dm_log dm_mod vfat fat e1000e igb iTCO_wdt iTCO_vendor_support ioatdma ptp i7core_edac ipmi_si edac_core lpc_ich pps_core i2c_i801 pcspkr mfd_core dca ipmi_msghandler acpi_cpufreq xfs libcrc32c sd_mod crc_t10dif crct10dif_common i2c_algo_bit drm_kms_helper ttm drm mptsas scsi_transport_sas mptscsih i2c_core megaraid_sas mptbase
>> [ 1800.276623] CPU: 7 PID: 126 Comm: kworker/u512:1 Tainted: G        W    3.12.0-rc5+ #196
>> [ 1800.508918] Workqueue: sysfsd sysfs_schedule_callback_work
>> [ 1800.574703]  0000000000000009 ffff8807adbadbd8 ffffffff8168b26c ffff8807c27d08a8
>> [ 1800.663860]  ffff8807adbadc28 ffff8807adbadc18 ffffffff810711dc ffff8807adbadc68
>> [ 1800.753130]  ffff8807b4a7c000 ffff8807b4a7c000 ffff8807ad089c00 0000000000000000
>> [ 1800.842282] Call Trace:
>> [ 1800.871651]  [<ffffffff8168b26c>] dump_stack+0x55/0x76
>> [ 1800.933301]  [<ffffffff810711dc>] warn_slowpath_common+0x8c/0xc0
>> [ 1801.005283]  [<ffffffff810712c6>] warn_slowpath_fmt+0x46/0x50
>> [ 1801.074081]  [<ffffffff8135a343>] __list_del_entry+0x63/0xd0
>> [ 1801.141839]  [<ffffffff8135a3c1>] list_del+0x11/0x40
>> [ 1801.201320]  [<ffffffff813734da>] pci_remove_bus_device+0x6a/0xe0
>> [ 1801.274279]  [<ffffffff8137356e>] pci_stop_and_remove_bus_device+0x1e/0x30
>> [ 1801.356606]  [<ffffffff8137b20b>] remove_callback+0x2b/0x40
>> [ 1801.423412]  [<ffffffff81251848>] sysfs_schedule_callback_work+0x18/0x60
>> [ 1801.503744]  [<ffffffff8108eab5>] process_one_work+0x1f5/0x540
>> [ 1801.573640]  [<ffffffff8108ea53>] ? process_one_work+0x193/0x540
>> [ 1801.645616]  [<ffffffff8108f2ac>] worker_thread+0x11c/0x370
>> [ 1801.712337]  [<ffffffff8108f190>] ? rescuer_thread+0x350/0x350
>> [ 1801.782178]  [<ffffffff8109731d>] kthread+0xed/0x100
>> [ 1801.841661]  [<ffffffff81097230>] ? kthread_create_on_node+0x160/0x160
>> [ 1801.919919]  [<ffffffff8169cc3c>] ret_from_fork+0x7c/0xb0
>> [ 1801.984608]  [<ffffffff81097230>] ? kthread_create_on_node+0x160/0x160
>> [ 1802.062825] ---[ end trace d77f2054de000fb7 ]---
>>
>> This issue is related to the bug 54411:
>> https://bugzilla.kernel.org/show_bug.cgi?id=54411
>>
>> Since we added the pci_bus reference management, the bug becomes a
>> invalid list entry warning as descripted above. Beacuse it still
>> trys to delete an deleted pci device from device list.
>>
>> So here we make stop device actually detach driver only, and remove
>> device will do device_del instead, and move bus_list change and pci device
>> resource free into pci_release_dev, so that it'll consistent with
>> bus reference managment, and the device only can be deleted from device
>> list in pci_release_dev just for one time.
> 
> You have a good argument for moving the &dev->bus_list maintenance from
> pci_destroy_dev() to pci_release_dev(), because I think we can call
> pci_destroy_dev() twice for the same device, and we don't want to do the
> list_del() twice.
> 
> I suspect we want to move other stuff, too, but you haven't explained
> why we need the device_del(), device_release_driver(), and
> pci_free_resources() changes.  Possibly similar arguments apply.  I'd
> rather see separate patches to move these pieces so we can think about
> them individually.

Sorry for missing explanation.
The reason of moving pci_free_resources() into pci_release_dev is that the original
place(pci_destory_dev()) is too early as there may be still others holding reference
as the case mentioned above.
Changing device_del() and device_release_driver() in order to make pci_stop_dev
just do one thing--release driver, but it seems this change is nonsense.:(

> 
>> Besides, it also makes hostbridge to use device_unregister to be pair
>> with device_register before.
> 
> Please split the host bridge changes into a separate patch.  One logical
> change, one patch.  If they *can* be split up, please split them up.

Agree all. Thanks for your suggestion, got it.

> 
>> This patch is based on Yinghai's privious similar patch:
>> http://lkml.org/lkml/2013/5/13/658
> 
> There are seven patches in that series.  I don't know which one you're
> referring to.

[2/7]~[4/7]

> 
> Please run a spell checker on your changelog.  There are many spelling
> errors.

Got it, Thanks very much.

> 
>> Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
>> ---
>>  drivers/pci/probe.c  | 23 +++++++++++++++++++++--
>>  drivers/pci/remove.c | 32 ++++++++------------------------
>>  2 files changed, 29 insertions(+), 26 deletions(-)
>>
>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
>> index 7ef0f86..c639f44 100644
>> --- a/drivers/pci/probe.c
>> +++ b/drivers/pci/probe.c
>> @@ -1156,6 +1156,21 @@ static void pci_release_capabilities(struct pci_dev *dev)
>>  	pci_free_cap_save_buffers(dev);
>>  }
>>  
>> +static void pci_free_resources(struct pci_dev *dev)
>> +{
>> +	int i;
>> +
>> +	msi_remove_pci_irq_vectors(dev);
>> +
>> +	pci_cleanup_rom(dev);
>> +	for (i = 0; i < PCI_NUM_RESOURCES; i++) {
>> +		struct resource *res = dev->resource + i;
>> +
>> +		if (res->parent)
>> +			release_resource(res);
>> +	}
>> +}
>> +
>>  /**
>>   * pci_release_dev - free a pci device structure when all users of it are finished.
>>   * @dev: device that's been disconnected
>> @@ -1165,9 +1180,13 @@ static void pci_release_capabilities(struct pci_dev *dev)
>>   */
>>  static void pci_release_dev(struct device *dev)
>>  {
>> -	struct pci_dev *pci_dev;
>> +	struct pci_dev *pci_dev = to_pci_dev(dev);
>> +
>> +	down_write(&pci_bus_sem);
>> +	list_del(&pci_dev->bus_list);
>> +	up_write(&pci_bus_sem);
>> +	pci_free_resources(pci_dev);
>>  
>> -	pci_dev = to_pci_dev(dev);
>>  	pci_release_capabilities(pci_dev);
>>  	pci_release_of_node(pci_dev);
>>  	pcibios_release_device(pci_dev);
>> diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
>> index 8fc54b7..5659283 100644
>> --- a/drivers/pci/remove.c
>> +++ b/drivers/pci/remove.c
>> @@ -3,20 +3,6 @@
>>  #include <linux/pci-aspm.h>
>>  #include "pci.h"
>>  
>> -static void pci_free_resources(struct pci_dev *dev)
>> -{
>> -	int i;
>> -
>> - 	msi_remove_pci_irq_vectors(dev);
>> -
>> -	pci_cleanup_rom(dev);
>> -	for (i = 0; i < PCI_NUM_RESOURCES; i++) {
>> -		struct resource *res = dev->resource + i;
>> -		if (res->parent)
>> -			release_resource(res);
>> -	}
>> -}
>> -
>>  static void pci_stop_dev(struct pci_dev *dev)
>>  {
>>  	pci_pme_active(dev, false);
>> @@ -24,8 +10,7 @@ static void pci_stop_dev(struct pci_dev *dev)
>>  	if (dev->is_added) {
>>  		pci_proc_detach_device(dev);
>>  		pci_remove_sysfs_dev_files(dev);
>> -		device_del(&dev->dev);
>> -		dev->is_added = 0;
>> +		device_release_driver(&dev->dev);
>>  	}
>>  
>>  	if (dev->bus->self)
>> @@ -34,12 +19,11 @@ static void pci_stop_dev(struct pci_dev *dev)
>>  
>>  static void pci_destroy_dev(struct pci_dev *dev)
>>  {
>> -	down_write(&pci_bus_sem);
>> -	list_del(&dev->bus_list);
>> -	up_write(&pci_bus_sem);
>> -
>> -	pci_free_resources(dev);
>> -	put_device(&dev->dev);
>> +	if (dev->is_added) {
>> +		device_del(&dev->dev);
>> +		put_device(&dev->dev);
>> +		dev->is_added = 0;
> 
> This is wrong.  You can't reference dev->is_added after put_device()
> because you have to assume you are holding the only reference, and the
> pci_dev may be deallocated before put_device() returns.

Yes, you're right, it's a low mistake here, dev may be invalid after calling
put_device().

> 
>> +	}
>>  }
>>  
>>  void pci_remove_bus(struct pci_bus *bus)
>> @@ -126,7 +110,7 @@ void pci_stop_root_bus(struct pci_bus *bus)
>>  		pci_stop_bus_device(child);
>>  
>>  	/* stop the host bridge */
>> -	device_del(&host_bridge->dev);
>> +	device_release_driver(&host_bridge->dev);
>>  }
>>  
>>  void pci_remove_root_bus(struct pci_bus *bus)
>> @@ -145,5 +129,5 @@ void pci_remove_root_bus(struct pci_bus *bus)
>>  	host_bridge->bus = NULL;
>>  
>>  	/* remove the host bridge */
>> -	put_device(&host_bridge->dev);
>> +	device_unregister(&host_bridge->dev);
>>  }
>> -- 
>> 1.8.1
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] pci: fix invalid list entry warning for double pci device removing via sysfs
  2013-10-31  6:45 ` Yinghai Lu
@ 2013-10-31 10:12   ` Gu Zheng
  2013-10-31 13:06   ` Bjorn Helgaas
  1 sibling, 0 replies; 6+ messages in thread
From: Gu Zheng @ 2013-10-31 10:12 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Bjorn Helgaas, linux-pci, linux-kernel

Hi Yinghai,

On 10/31/2013 02:45 PM, Yinghai Lu wrote:

> On Wed, Oct 30, 2013 at 3:14 AM, Gu Zheng <guz.fnst@cn.fujitsu.com> wrote:
>> When concurent removing pci devices which are in the same pci subtree
>> via sysfs, such as:
>> echo -n 1 > /sys/bus/pci/devices/0000\:10\:00.0/remove ; echo -n 1 >
>> /sys/bus/pci/devices/0000\:1a\:01.0/remove
>> (1a:01.0 device is downstream from the 10:00.0 bridge)
>>
>> the following warning will show:
>> [ 1799.280918] ------------[ cut here ]------------
>> [ 1799.336199] WARNING: CPU: 7 PID: 126 at lib/list_debug.c:53 __list_del_entry+0x63/0xd0()
>> [ 1799.433093] list_del corruption, ffff8807b4a7c000->next is LIST_POISON1 (dead000000100100)
>> [ 1799.532110] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables sg dm_mirror dm_region_hash dm_log dm_mod vfat fat e1000e igb iTCO_wdt iTCO_vendor_support ioatdma ptp i7core_edac ipmi_si edac_core lpc_ich pps_core i2c_i801 pcspkr mfd_core dca ipmi_msghandler acpi_cpufreq xfs libcrc32c sd_mod crc_t10dif crct10dif_common i2c_algo_bit drm_kms_helper ttm drm mptsas scsi_transport_sas mptscsih i2c_core megaraid_sas mptbase
>> [ 1800.276623] CPU: 7 PID: 126 Comm: kworker/u512:1 Tainted: G        W    3.12.0-rc5+ #196
>> [ 1800.508918] Workqueue: sysfsd sysfs_schedule_callback_work
>> [ 1800.574703]  0000000000000009 ffff8807adbadbd8 ffffffff8168b26c ffff8807c27d08a8
>> [ 1800.663860]  ffff8807adbadc28 ffff8807adbadc18 ffffffff810711dc ffff8807adbadc68
>> [ 1800.753130]  ffff8807b4a7c000 ffff8807b4a7c000 ffff8807ad089c00 0000000000000000
>> [ 1800.842282] Call Trace:
>> [ 1800.871651]  [<ffffffff8168b26c>] dump_stack+0x55/0x76
>> [ 1800.933301]  [<ffffffff810711dc>] warn_slowpath_common+0x8c/0xc0
>> [ 1801.005283]  [<ffffffff810712c6>] warn_slowpath_fmt+0x46/0x50
>> [ 1801.074081]  [<ffffffff8135a343>] __list_del_entry+0x63/0xd0
>> [ 1801.141839]  [<ffffffff8135a3c1>] list_del+0x11/0x40
>> [ 1801.201320]  [<ffffffff813734da>] pci_remove_bus_device+0x6a/0xe0
>> [ 1801.274279]  [<ffffffff8137356e>] pci_stop_and_remove_bus_device+0x1e/0x30
>> [ 1801.356606]  [<ffffffff8137b20b>] remove_callback+0x2b/0x40
>> [ 1801.423412]  [<ffffffff81251848>] sysfs_schedule_callback_work+0x18/0x60
>> [ 1801.503744]  [<ffffffff8108eab5>] process_one_work+0x1f5/0x540
>> [ 1801.573640]  [<ffffffff8108ea53>] ? process_one_work+0x193/0x540
>> [ 1801.645616]  [<ffffffff8108f2ac>] worker_thread+0x11c/0x370
>> [ 1801.712337]  [<ffffffff8108f190>] ? rescuer_thread+0x350/0x350
>> [ 1801.782178]  [<ffffffff8109731d>] kthread+0xed/0x100
>> [ 1801.841661]  [<ffffffff81097230>] ? kthread_create_on_node+0x160/0x160
>> [ 1801.919919]  [<ffffffff8169cc3c>] ret_from_fork+0x7c/0xb0
>> [ 1801.984608]  [<ffffffff81097230>] ? kthread_create_on_node+0x160/0x160
>> [ 1802.062825] ---[ end trace d77f2054de000fb7 ]---
>>
>> This issue is related to the bug 54411:
>> https://bugzilla.kernel.org/show_bug.cgi?id=54411
>>
>> Since we added the pci_bus reference management, the bug becomes a
>> invalid list entry warning as descripted above. Beacuse it still
>> trys to delete an deleted pci device from device list.
>>
>> So here we make stop device actually detach driver only, and remove
>> device will do device_del instead, and move bus_list change and pci device
>> resource free into pci_release_dev, so that it'll consistent with
>> bus reference managment, and the device only can be deleted from device
>> list in pci_release_dev just for one time.
>>
>> Besides, it also makes hostbridge to use device_unregister to be pair
>> with device_register before.
>>
>> This patch is based on Yinghai's privious similar patch:
>> http://lkml.org/lkml/2013/5/13/658
> 
> I have updated version, please check if attached patches fix the problem.
> 
> virtfn_release.patch
> fix_cx3_hotplug_2.patch
> fix_racing_removing_6_1.patch
> fix_racing_removing_6_2.patch
> fix_racing_removing_6_3.patch

It works well on latest Linus' tree, thanks.

Regards,
Gu 

> 
> Thanks
> 
> Yinghai



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] pci: fix invalid list entry warning for double pci device removing via sysfs
  2013-10-31  6:45 ` Yinghai Lu
  2013-10-31 10:12   ` Gu Zheng
@ 2013-10-31 13:06   ` Bjorn Helgaas
  1 sibling, 0 replies; 6+ messages in thread
From: Bjorn Helgaas @ 2013-10-31 13:06 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Gu Zheng, linux-pci, linux-kernel

On Thu, Oct 31, 2013 at 12:45 AM, Yinghai Lu <yinghai@kernel.org> wrote:

> I have updated version, please check if attached patches fix the problem.
>
> virtfn_release.patch
> fix_cx3_hotplug_2.patch
> fix_racing_removing_6_1.patch
> fix_racing_removing_6_2.patch
> fix_racing_removing_6_3.patch

Ignored because it's a hassle for me to deal with attachments.  They
don't show up in patchwork, it's hard for reviewers to comment on
them, and I have to manually download the attachments and keep track
of what order they need to go in.

Bjorn

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2013-10-31 13:07 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-10-30 10:14 [PATCH] pci: fix invalid list entry warning for double pci device removing via sysfs Gu Zheng
2013-10-30 17:08 ` Bjorn Helgaas
2013-10-31 10:11   ` Gu Zheng
2013-10-31  6:45 ` Yinghai Lu
2013-10-31 10:12   ` Gu Zheng
2013-10-31 13:06   ` Bjorn Helgaas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).