linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2] ACPI: container hot remove support.
@ 2012-10-24  6:05 Tang Chen
  2012-10-24  6:05 ` [PATCH v2 1/2] Use kacpi_hotplug_wq to handle container hotplug event Tang Chen
  2012-10-24  6:05 ` [PATCH v2 2/2] Improve container_notify_cb() to support container hot-remove Tang Chen
  0 siblings, 2 replies; 15+ messages in thread
From: Tang Chen @ 2012-10-24  6:05 UTC (permalink / raw)
  To: yinghai, bhelgaas, lenb, jiang.liu, izumi.taku, isimatu.yasuaki,
	mihailm, linux-acpi, linux-pci, linux-kernel

Hi,

The container hotplug handler container_notify_cb() didn't implement
the hot-remove functionality. So, these 2 patches implement it like
the following way:

patch 1. Do not use kacpid_wq/kacpid_notify_wq to handle container hotplug event,
         use kacpi_hotplug_wq instead to avoid deadlock.
         Doing this is to reuse acpi_bus_hot_remove_device() in container
         hot-remove handling.

patch 2. Introduce a new function container_device_remove() to handle
         ACPI_NOTIFY_EJECT_REQUEST event for container.


change log v1 -> v2:

1. In patch1: Based on the lastest for-pci-split-pci-root-hp-2 branch from Lu Yinghai, 
   use alloc_acpi_hp_work() to add container hotplug work into kacpi_hotplug_wq.

2. In patch2: Allocate ej_event after container is stopped, so that we don't need to
   kfree the ej_event if stopping container failed.


This is based on Lu Yinghai's job.
git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-pci-split-pci-root-hp-2

Tang Chen (2):
  Use kacpi_hotplug_wq to handle container hotplug event.
  Improve container_notify_cb() to support container hot-remove.

 drivers/acpi/container.c |   75 +++++++++++++++++++++++++++++++++++++++------
 1 files changed, 65 insertions(+), 10 deletions(-)


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2 1/2] Use kacpi_hotplug_wq to handle container hotplug event.
  2012-10-24  6:05 [PATCH v2 0/2] ACPI: container hot remove support Tang Chen
@ 2012-10-24  6:05 ` Tang Chen
  2012-10-24  6:54   ` Yasuaki Ishimatsu
  2012-10-24  6:05 ` [PATCH v2 2/2] Improve container_notify_cb() to support container hot-remove Tang Chen
  1 sibling, 1 reply; 15+ messages in thread
From: Tang Chen @ 2012-10-24  6:05 UTC (permalink / raw)
  To: yinghai, bhelgaas, lenb, jiang.liu, izumi.taku, isimatu.yasuaki,
	mihailm, linux-acpi, linux-pci, linux-kernel

As the comments in __acpi_os_execute() said:

	We can't run hotplug code in keventd_wq/kacpid_wq/kacpid_notify_wq
	because the hotplug code may call driver .remove() functions,
	which invoke flush_scheduled_work/acpi_os_wait_events_complete
	to flush these workqueues.

we should keep the hotplug code in kacpi_hotplug_wq.

But we have the following call series in kernel now:
	acpi_ev_queue_notify_request()
	|--> acpi_os_execute()
	     |--> __acpi_os_execute(type, function, context, 0)

The last parameter 0 makes the container_notify_cb() executed in
kacpi_notify_wq or kacpid_wq. So, we need to put the real hotplug code
into kacpi_hotplug_wq.

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
---
 drivers/acpi/container.c |   17 ++++++++++++++++-
 1 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/drivers/acpi/container.c b/drivers/acpi/container.c
index 69e2d6b..d300e03 100644
--- a/drivers/acpi/container.c
+++ b/drivers/acpi/container.c
@@ -35,6 +35,7 @@
 #include <acpi/acpi_bus.h>
 #include <acpi/acpi_drivers.h>
 #include <acpi/container.h>
+#include <acpi/acpiosxf.h>
 
 #define PREFIX "ACPI: "
 
@@ -165,14 +166,21 @@ static int container_device_add(struct acpi_device **device, acpi_handle handle)
 	return result;
 }
 
-static void container_notify_cb(acpi_handle handle, u32 type, void *context)
+static void __container_notify_cb(struct work_struct *work)
 {
 	struct acpi_device *device = NULL;
 	int result;
 	int present;
 	acpi_status status;
+	struct acpi_hp_work *hp_work;
+	acpi_handle handle;
+	u32 type;
 	u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
 
+	hp_work = container_of(work, struct acpi_hp_work, work);
+	handle = hp_work->handle;
+	type = hp_work->type;
+
 	switch (type) {
 	case ACPI_NOTIFY_BUS_CHECK:
 		/* Fall through */
@@ -224,6 +232,13 @@ static void container_notify_cb(acpi_handle handle, u32 type, void *context)
 	return;
 }
 
+static void container_notify_cb(acpi_handle handle, u32 type,
+				void *context)
+{
+	alloc_acpi_hp_work(handle, type, context,
+			   __container_notify_cb);
+}
+
 static acpi_status
 container_walk_namespace_cb(acpi_handle handle,
 			    u32 lvl, void *context, void **rv)
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 2/2] Improve container_notify_cb() to support container hot-remove.
  2012-10-24  6:05 [PATCH v2 0/2] ACPI: container hot remove support Tang Chen
  2012-10-24  6:05 ` [PATCH v2 1/2] Use kacpi_hotplug_wq to handle container hotplug event Tang Chen
@ 2012-10-24  6:05 ` Tang Chen
  2012-10-24 17:14   ` Toshi Kani
  1 sibling, 1 reply; 15+ messages in thread
From: Tang Chen @ 2012-10-24  6:05 UTC (permalink / raw)
  To: yinghai, bhelgaas, lenb, jiang.liu, izumi.taku, isimatu.yasuaki,
	mihailm, linux-acpi, linux-pci, linux-kernel

This patch introduces a new function container_device_remove() to do
the container hot-remove job. It works like the following:

1. call acpi_bus_trim(device, 0) to stop the container device.
2. generate the KOBJ_OFFLINE uevent. (Did I do this correct ?)
3. call acpi_bus_hot_remove_device(), which will call acpi_bus_trim(device, 1)
   to remove the container.

This patch is based on Lu Yinghai's work.
git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-pci-split-pci-root-hp-2

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
---
 drivers/acpi/container.c |   58 ++++++++++++++++++++++++++++++++++++++-------
 1 files changed, 49 insertions(+), 9 deletions(-)

diff --git a/drivers/acpi/container.c b/drivers/acpi/container.c
index d300e03..9676578 100644
--- a/drivers/acpi/container.c
+++ b/drivers/acpi/container.c
@@ -166,6 +166,35 @@ static int container_device_add(struct acpi_device **device, acpi_handle handle)
 	return result;
 }
 
+static int container_device_remove(struct acpi_device *device)
+{
+	int ret;
+	struct acpi_eject_event *ej_event;
+
+	/* stop container device at first */
+	ret = acpi_bus_trim(device, 0);
+	printk(KERN_WARNING "acpi_bus_trim stop return %x\n", ret);
+	if (ret)
+		return ret;
+
+	/* event originated from ACPI eject notification */
+	device->flags.eject_pending = 1;
+
+	/* send the uevent before remove the device */
+	kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
+
+	ej_event = kmalloc(sizeof(*ej_event), GFP_KERNEL);
+	if (!ej_event)
+		return -ENOMEM;
+
+	ej_event->device = device;
+	ej_event->event = ACPI_NOTIFY_EJECT_REQUEST;
+
+	acpi_bus_hot_remove_device(ej_event);
+
+	return 0;
+}
+
 static void __container_notify_cb(struct work_struct *work)
 {
 	struct acpi_device *device = NULL;
@@ -181,6 +210,9 @@ static void __container_notify_cb(struct work_struct *work)
 	handle = hp_work->handle;
 	type = hp_work->type;
 
+	present = is_device_present(handle);
+	status = acpi_bus_get_device(handle, &device);
+
 	switch (type) {
 	case ACPI_NOTIFY_BUS_CHECK:
 		/* Fall through */
@@ -189,13 +221,14 @@ static void __container_notify_cb(struct work_struct *work)
 		       (type == ACPI_NOTIFY_BUS_CHECK) ?
 		       "ACPI_NOTIFY_BUS_CHECK" : "ACPI_NOTIFY_DEVICE_CHECK");
 
-		present = is_device_present(handle);
-		status = acpi_bus_get_device(handle, &device);
 		if (!present) {
 			if (ACPI_SUCCESS(status)) {
 				/* device exist and this is a remove request */
-				device->flags.eject_pending = 1;
-				kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
+				result = container_device_remove(device);
+				if (result) {
+					printk(KERN_WARNING "Failed to remove container\n");
+					break;
+				}
 				return;
 			}
 			break;
@@ -215,12 +248,19 @@ static void __container_notify_cb(struct work_struct *work)
 		break;
 
 	case ACPI_NOTIFY_EJECT_REQUEST:
-		if (!acpi_bus_get_device(handle, &device) && device) {
-			device->flags.eject_pending = 1;
-			kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
-			return;
+		printk(KERN_WARNING "Container driver received %s event\n",
+			"ACPI_NOTIFY_EJECT_REQUEST");
+
+		if (!present || ACPI_FAILURE(status) || !device)
+			break;
+
+		result = container_device_remove(device);
+		if (result) {
+			printk(KERN_WARNING "Failed to remove container\n");
+			break;
 		}
-		break;
+
+		return;
 
 	default:
 		/* non-hotplug event; possibly handled by other handler */
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/2] Use kacpi_hotplug_wq to handle container hotplug event.
  2012-10-24  6:05 ` [PATCH v2 1/2] Use kacpi_hotplug_wq to handle container hotplug event Tang Chen
@ 2012-10-24  6:54   ` Yasuaki Ishimatsu
  2012-10-24  7:24     ` Tang Chen
  2012-10-24  7:42     ` Tang Chen
  0 siblings, 2 replies; 15+ messages in thread
From: Yasuaki Ishimatsu @ 2012-10-24  6:54 UTC (permalink / raw)
  To: Tang Chen
  Cc: yinghai, bhelgaas, lenb, jiang.liu, izumi.taku, mihailm,
	linux-acpi, linux-pci, linux-kernel

Hi Tang,

2012/10/24 15:05, Tang Chen wrote:
> As the comments in __acpi_os_execute() said:
> 
> 	We can't run hotplug code in keventd_wq/kacpid_wq/kacpid_notify_wq
> 	because the hotplug code may call driver .remove() functions,
> 	which invoke flush_scheduled_work/acpi_os_wait_events_complete
> 	to flush these workqueues.
> 
> we should keep the hotplug code in kacpi_hotplug_wq.
> 
> But we have the following call series in kernel now:
> 	acpi_ev_queue_notify_request()
> 	|--> acpi_os_execute()
> 	     |--> __acpi_os_execute(type, function, context, 0)
> 
> The last parameter 0 makes the container_notify_cb() executed in
> kacpi_notify_wq or kacpid_wq. So, we need to put the real hotplug code
> into kacpi_hotplug_wq.

I cannot understand the purpose of the patch.
Is the patch a bug fix patch? If yes, what problem happens?

Thanks,
Yasuaki Ishimatsu

> 
> Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
> ---
>   drivers/acpi/container.c |   17 ++++++++++++++++-
>   1 files changed, 16 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/acpi/container.c b/drivers/acpi/container.c
> index 69e2d6b..d300e03 100644
> --- a/drivers/acpi/container.c
> +++ b/drivers/acpi/container.c
> @@ -35,6 +35,7 @@
>   #include <acpi/acpi_bus.h>
>   #include <acpi/acpi_drivers.h>
>   #include <acpi/container.h>
> +#include <acpi/acpiosxf.h>
>   
>   #define PREFIX "ACPI: "
>   
> @@ -165,14 +166,21 @@ static int container_device_add(struct acpi_device **device, acpi_handle handle)
>   	return result;
>   }
>   
> -static void container_notify_cb(acpi_handle handle, u32 type, void *context)
> +static void __container_notify_cb(struct work_struct *work)
>   {
>   	struct acpi_device *device = NULL;
>   	int result;
>   	int present;
>   	acpi_status status;
> +	struct acpi_hp_work *hp_work;
> +	acpi_handle handle;
> +	u32 type;
>   	u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
>   
> +	hp_work = container_of(work, struct acpi_hp_work, work);
> +	handle = hp_work->handle;
> +	type = hp_work->type;
> +
>   	switch (type) {
>   	case ACPI_NOTIFY_BUS_CHECK:
>   		/* Fall through */
> @@ -224,6 +232,13 @@ static void container_notify_cb(acpi_handle handle, u32 type, void *context)
>   	return;
>   }
>   
> +static void container_notify_cb(acpi_handle handle, u32 type,
> +				void *context)
> +{
> +	alloc_acpi_hp_work(handle, type, context,
> +			   __container_notify_cb);
> +}
> +
>   static acpi_status
>   container_walk_namespace_cb(acpi_handle handle,
>   			    u32 lvl, void *context, void **rv)
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/2] Use kacpi_hotplug_wq to handle container hotplug event.
  2012-10-24  6:54   ` Yasuaki Ishimatsu
@ 2012-10-24  7:24     ` Tang Chen
  2012-10-24  8:09       ` Yasuaki Ishimatsu
  2012-10-24  7:42     ` Tang Chen
  1 sibling, 1 reply; 15+ messages in thread
From: Tang Chen @ 2012-10-24  7:24 UTC (permalink / raw)
  To: Yasuaki Ishimatsu
  Cc: yinghai, bhelgaas, lenb, jiang.liu, izumi.taku, mihailm,
	linux-acpi, linux-pci, linux-kernel

On 10/24/2012 02:54 PM, Yasuaki Ishimatsu wrote:
> Hi Tang,
> 
> 2012/10/24 15:05, Tang Chen wrote:
>> As the comments in __acpi_os_execute() said:
>>
>> 	We can't run hotplug code in keventd_wq/kacpid_wq/kacpid_notify_wq
>> 	because the hotplug code may call driver .remove() functions,
>> 	which invoke flush_scheduled_work/acpi_os_wait_events_complete
>> 	to flush these workqueues.
>>
>> we should keep the hotplug code in kacpi_hotplug_wq.
>>
>> But we have the following call series in kernel now:
>> 	acpi_ev_queue_notify_request()
>> 	|-->  acpi_os_execute()
>> 	     |-->  __acpi_os_execute(type, function, context, 0)
>>
>> The last parameter 0 makes the container_notify_cb() executed in
>> kacpi_notify_wq or kacpid_wq. So, we need to put the real hotplug code
>> into kacpi_hotplug_wq.
> 
> I cannot understand the purpose of the patch.
> Is the patch a bug fix patch? If yes, what problem happens?

Hi Yasuaki-san,

Actually, it is a problem. But container hot-remove was not implemented
in container_notify_cb(), so this problem would never be triggered. So I
cannot say it is a bug in kernel.

The problem is here:

acpi_pci_root_remove() will finally call acpi_os_wait_events_complete():

void acpi_os_wait_events_complete(void)
{
        flush_workqueue(kacpid_wq);
        flush_workqueue(kacpi_notify_wq);
}

which means it will flush kacpid_wq and kacpi_notify_wq. So the current
work should not be in these 2 workqueue, otherwise it will cause
deadlock: the worker will wait for itself to complete.

But unfortunately, in the beginning, we have:

	acpi_ev_queue_notify_request()
	|-->  acpi_os_execute()
	     |-->  __acpi_os_execute(type, function, context, 0)

Please refer to the code, you will see the last parameter 0 will make
the hotplug call serial in kacpid_wq or kacpi_notify_wq. And it is hard
coded in kernel. I don't know why and I don't how to fix it.

So I made this patch, and want to see what you guys think about it. :)


The deadlock call trace is like below:


[  302.383606] =============================================
[  302.448094] [ INFO: possible recursive locking detected ]
[  302.512578] 3.6.0-rc5-luyh-hostbridge-hotplug+ #13 Not tainted
[  302.582252] ---------------------------------------------
[  302.646736] kworker/0:2/1412 is trying to acquire lock:
[  302.709143]  (kacpi_notify){++++.+}, at: [<ffffffff81091300>]
flush_workqueue+0x0/0x5c0
[  302.805222]
[  302.805222] but task is already holding lock:
[  302.874898]  (kacpi_notify){++++.+}, at: [<ffffffff81090528>]
process_one_work+0x1b8/0x680
[  302.974083]
[  302.974083] other info that might help us debug this:
[  303.052067]  Possible unsafe locking scenario:
[  303.052067]
[  303.122785]        CPU0
[  303.151965]        ----
[  303.181150]   lock(kacpi_notify);
[  303.220935]   lock(kacpi_notify);
[  303.260721]
[  303.260721]  *** DEADLOCK ***
[  303.260721]
[  303.331434]  May be due to missing lock nesting notation
[  303.331434]
[  303.412529] 4 locks held by kworker/0:2/1412:
[  303.464553]  #0:  (kacpi_notify){++++.+}, at: [<ffffffff81090528>]
process_one_work+0x1b8/0x680
[  303.569042]  #1:  ((&dpc->work)#2){+.+.+.}, at: [<ffffffff81090528>]
process_one_work+0x1b8/0x680
[  303.675718]  #2:  (&__lockdep_no_validate__){......}, at:
[<ffffffff8143cca7>] device_release_driver+0x27/0x50
[  303.795782]  #3:  (pci_acpi_pm_notify_mtx){+.+.+.}, at:
[<ffffffff81385443>] remove_pm_notifier+0x33/0x90
[  303.910662]
[  303.910662] stack backtrace:
[  303.962687] Pid: 1412, comm: kworker/0:2 Not tainted
3.6.0-rc5-luyh-hostbridge-hotplug+ #13
[  304.062470] Call Trace:
[  304.091666]  [<ffffffff810da704>] print_deadlock_bug+0xf4/0x100
[  304.162384]  [<ffffffff810dc6a9>] validate_chain+0x549/0x7e0
[  304.229987]  [<ffffffff810dcc36>] __lock_acquire+0x2f6/0x4f0
[  304.297587]  [<ffffffff810dba65>] ? debug_check_no_locks_freed+0xa5/0xf0
[  304.377650]  [<ffffffff810dcecd>] lock_acquire+0x9d/0x190
[  304.442141]  [<ffffffff81091300>] ? flush_workqueue_prep_cwqs+0x260/0x260
[  304.523242]  [<ffffffff810d8759>] ? lockdep_init_map+0x59/0x150
[  304.593963]  [<ffffffff810914af>] flush_workqueue+0x1af/0x5c0
[  304.662605]  [<ffffffff81091300>] ? flush_workqueue_prep_cwqs+0x260/0x260
[  304.743713]  [<ffffffff810a6ab8>] ? complete+0x28/0x60
[  304.805084]  [<ffffffff810a6ab8>] ? complete+0x28/0x60
[  304.866457]  [<ffffffff810db925>] ? trace_hardirqs_on_caller+0x105/0x190
[  304.946515]  [<ffffffff810a6ab8>] ? complete+0x28/0x60
[  305.007891]  [<ffffffff81385443>] ? remove_pm_notifier+0x33/0x90
[  305.079649]  [<ffffffff813854e0>] ?
pci_acpi_remove_bus_pm_notifier+0x20/0x20
[  305.164905]  [<ffffffff813a340e>] acpi_os_wait_events_complete+0x21/0x23
[  305.244970]  [<ffffffff813b7b3c>] acpi_remove_notify_handler+0x47/0x183
[  305.323994]  [<ffffffff813854e0>] ?
pci_acpi_remove_bus_pm_notifier+0x20/0x20
[  305.409251]  [<ffffffff81385481>] remove_pm_notifier+0x71/0x90
[  305.478931]  [<ffffffff813854d5>]
pci_acpi_remove_bus_pm_notifier+0x15/0x20
[  305.562111]  [<ffffffff813aac25>] acpi_pci_root_remove+0x83/0xec
[  305.633869]  [<ffffffff813a69fc>] acpi_device_remove+0x90/0xb2
[  305.703548]  [<ffffffff8143cb2c>] __device_release_driver+0x7c/0xf0
[  305.778422]  [<ffffffff8143ccaf>] device_release_driver+0x2f/0x50
[  305.851219]  [<ffffffff813a79b5>] acpi_bus_remove+0x32/0x4d
[  305.917785]  [<ffffffff813a7a57>] acpi_bus_trim+0x87/0xee
[  305.982276]  [<ffffffff813d888c>] container_notify_cb+0x1c5/0x221
[  306.055075]  [<ffffffff813b6386>] acpi_ev_notify_dispatch+0x41/0x5f
[  306.129951]  [<ffffffff813a3437>] acpi_os_execute_deferred+0x27/0x34
[  306.205861]  [<ffffffff81090589>] process_one_work+0x219/0x680
[  306.275543]  [<ffffffff81090528>] ? process_one_work+0x1b8/0x680
[  306.347299]  [<ffffffff813a3410>] ?
acpi_os_wait_events_complete+0x23/0x23
[  306.429436]  [<ffffffff810923be>] worker_thread+0x12e/0x320
[  306.496001]  [<ffffffff81092290>] ? manage_workers+0x110/0x110
[  306.565680]  [<ffffffff81098396>] kthread+0xc6/0xd0
[  306.623937]  [<ffffffff8167c3c4>] kernel_thread_helper+0x4/0x10
[  306.694656]  [<ffffffff81671e30>] ? retint_restore_args+0x13/0x13
[  306.767451]  [<ffffffff810982d0>] ? __init_kthread_worker+0x70/0x70
[  306.842323]  [<ffffffff8167c3c0>] ? gs_change+0x13/0x13



[  519.588281] INFO: task kworker/0:0:4 blocked for more than 120 seconds.
[  519.667375] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  519.761130] kworker/0:0     D ffff8801bdcb7b60  5608     4      2
0x00000000
[  519.846044]  ffff8801bdcb7a50 0000000000000046 ffff8801bdcb7a50
ffff8801bdcb7fd8
[  519.935363]  ffff8801bdcb6000 ffff8801bdcb6010 ffff8801bdcb6000
ffff8801bdcb6000
[  520.024650]  ffff8801bdcb7fd8 ffff8801bdcb6000 ffffffff81c13420
ffff8801bde18000
[  520.114003] Call Trace:
[  520.143402]  [<ffffffff8166ff49>] schedule+0x29/0x70
[  520.202939]  [<ffffffff8166dd55>] schedule_timeout+0x235/0x2d0
[  520.272834]  [<ffffffff810d9377>] ? __lock_acquired+0x347/0x380
[  520.343789]  [<ffffffff8166fd0f>] ? wait_for_common+0x4f/0x180
[  520.413583]  [<ffffffff8166fde3>] ? wait_for_common+0x123/0x180
[  520.484526]  [<ffffffff8166fdeb>] wait_for_common+0x12b/0x180
[  520.553422]  [<ffffffff810b0b60>] ? try_to_wake_up+0x2f0/0x2f0
[  520.623342]  [<ffffffff810db9bd>] ? trace_hardirqs_on+0xd/0x10
[  520.693270]  [<ffffffff8166ff1d>] wait_for_completion+0x1d/0x20
[  520.764219]  [<ffffffff81091587>] flush_workqueue+0x287/0x5c0
[  520.833087]  [<ffffffff81091300>] ? flush_workqueue_prep_cwqs+0x260/0x260
[  520.914421]  [<ffffffff813a340e>] acpi_os_wait_events_complete+0x21/0x23
[  520.994730]  [<ffffffff813a3430>] acpi_os_execute_deferred+0x20/0x34
[  521.070882]  [<ffffffff81090589>] process_one_work+0x219/0x680
[  521.140790]  [<ffffffff81090528>] ? process_one_work+0x1b8/0x680
[  521.212780]  [<ffffffff810922e9>] ? worker_thread+0x59/0x320
[  521.280609]  [<ffffffff813a3410>] ?
acpi_os_wait_events_complete+0x23/0x23
[  521.362994]  [<ffffffff810923be>] worker_thread+0x12e/0x320
[  521.429815]  [<ffffffff81092290>] ? manage_workers+0x110/0x110
[  521.499739]  [<ffffffff81098396>] kthread+0xc6/0xd0
[  521.558261]  [<ffffffff8167c3c4>] kernel_thread_helper+0x4/0x10
[  521.629217]  [<ffffffff81671e30>] ? retint_restore_args+0x13/0x13
[  521.702220]  [<ffffffff810982d0>] ? __init_kthread_worker+0x70/0x70
[  521.777303]  [<ffffffff8167c3c0>] ? gs_change+0x13/0x13
[  521.839913] INFO: lockdep is turned off.











^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/2] Use kacpi_hotplug_wq to handle container hotplug event.
  2012-10-24  6:54   ` Yasuaki Ishimatsu
  2012-10-24  7:24     ` Tang Chen
@ 2012-10-24  7:42     ` Tang Chen
  1 sibling, 0 replies; 15+ messages in thread
From: Tang Chen @ 2012-10-24  7:42 UTC (permalink / raw)
  To: Yasuaki Ishimatsu
  Cc: yinghai, bhelgaas, lenb, jiang.liu, izumi.taku, mihailm,
	linux-acpi, linux-pci, linux-kernel

Hi Ishimatsu-san:

By the way, if you want to reproduce this problem, just
modify my patch1 to call __container_notify_cb() directly
in container_notify_cb(). And apply my patch2.

Then, you add a container, and remove it.
The deadlock will be triggered.

And this patch is based on Lu Yinghai's tree:
git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git
for-pci-split-pci-root-hp-2


On 10/24/2012 02:54 PM, Yasuaki Ishimatsu wrote:
> Hi Tang,
> 
> 2012/10/24 15:05, Tang Chen wrote:
>> As the comments in __acpi_os_execute() said:
>>
>> 	We can't run hotplug code in keventd_wq/kacpid_wq/kacpid_notify_wq
>> 	because the hotplug code may call driver .remove() functions,
>> 	which invoke flush_scheduled_work/acpi_os_wait_events_complete
>> 	to flush these workqueues.
>>
>> we should keep the hotplug code in kacpi_hotplug_wq.
>>
>> But we have the following call series in kernel now:
>> 	acpi_ev_queue_notify_request()
>> 	|-->  acpi_os_execute()
>> 	     |-->  __acpi_os_execute(type, function, context, 0)
>>
>> The last parameter 0 makes the container_notify_cb() executed in
>> kacpi_notify_wq or kacpid_wq. So, we need to put the real hotplug code
>> into kacpi_hotplug_wq.
> 
> I cannot understand the purpose of the patch.
> Is the patch a bug fix patch? If yes, what problem happens?
> 
> Thanks,
> Yasuaki Ishimatsu
> 
>>
>> Signed-off-by: Tang Chen<tangchen@cn.fujitsu.com>
>> ---
>>    drivers/acpi/container.c |   17 ++++++++++++++++-
>>    1 files changed, 16 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/acpi/container.c b/drivers/acpi/container.c
>> index 69e2d6b..d300e03 100644
>> --- a/drivers/acpi/container.c
>> +++ b/drivers/acpi/container.c
>> @@ -35,6 +35,7 @@
>>    #include<acpi/acpi_bus.h>
>>    #include<acpi/acpi_drivers.h>
>>    #include<acpi/container.h>
>> +#include<acpi/acpiosxf.h>
>>
>>    #define PREFIX "ACPI: "
>>
>> @@ -165,14 +166,21 @@ static int container_device_add(struct acpi_device **device, acpi_handle handle)
>>    	return result;
>>    }
>>
>> -static void container_notify_cb(acpi_handle handle, u32 type, void *context)
>> +static void __container_notify_cb(struct work_struct *work)
>>    {
>>    	struct acpi_device *device = NULL;
>>    	int result;
>>    	int present;
>>    	acpi_status status;
>> +	struct acpi_hp_work *hp_work;
>> +	acpi_handle handle;
>> +	u32 type;
>>    	u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
>>
>> +	hp_work = container_of(work, struct acpi_hp_work, work);
>> +	handle = hp_work->handle;
>> +	type = hp_work->type;
>> +
>>    	switch (type) {
>>    	case ACPI_NOTIFY_BUS_CHECK:
>>    		/* Fall through */
>> @@ -224,6 +232,13 @@ static void container_notify_cb(acpi_handle handle, u32 type, void *context)
>>    	return;
>>    }
>>
>> +static void container_notify_cb(acpi_handle handle, u32 type,
>> +				void *context)
>> +{
>> +	alloc_acpi_hp_work(handle, type, context,
>> +			   __container_notify_cb);
>> +}
>> +
>>    static acpi_status
>>    container_walk_namespace_cb(acpi_handle handle,
>>    			    u32 lvl, void *context, void **rv)
>>
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/2] Use kacpi_hotplug_wq to handle container hotplug event.
  2012-10-24  7:24     ` Tang Chen
@ 2012-10-24  8:09       ` Yasuaki Ishimatsu
  0 siblings, 0 replies; 15+ messages in thread
From: Yasuaki Ishimatsu @ 2012-10-24  8:09 UTC (permalink / raw)
  To: Tang Chen
  Cc: yinghai, bhelgaas, lenb, jiang.liu, izumi.taku, mihailm,
	linux-acpi, linux-pci, linux-kernel

Hi Tang,

2012/10/24 16:24, Tang Chen wrote:
> On 10/24/2012 02:54 PM, Yasuaki Ishimatsu wrote:
>> Hi Tang,
>>
>> 2012/10/24 15:05, Tang Chen wrote:
>>> As the comments in __acpi_os_execute() said:
>>>
>>> 	We can't run hotplug code in keventd_wq/kacpid_wq/kacpid_notify_wq
>>> 	because the hotplug code may call driver .remove() functions,
>>> 	which invoke flush_scheduled_work/acpi_os_wait_events_complete
>>> 	to flush these workqueues.
>>>
>>> we should keep the hotplug code in kacpi_hotplug_wq.
>>>
>>> But we have the following call series in kernel now:
>>> 	acpi_ev_queue_notify_request()
>>> 	|-->  acpi_os_execute()
>>> 	     |-->  __acpi_os_execute(type, function, context, 0)
>>>
>>> The last parameter 0 makes the container_notify_cb() executed in
>>> kacpi_notify_wq or kacpid_wq. So, we need to put the real hotplug code
>>> into kacpi_hotplug_wq.
>>
>> I cannot understand the purpose of the patch.
>> Is the patch a bug fix patch? If yes, what problem happens?
> 
> Hi Yasuaki-san,
> 
> Actually, it is a problem. But container hot-remove was not implemented
> in container_notify_cb(), so this problem would never be triggered. So I
> cannot say it is a bug in kernel.
> 
> The problem is here:
> 
> acpi_pci_root_remove() will finally call acpi_os_wait_events_complete():
> 
> void acpi_os_wait_events_complete(void)
> {
>          flush_workqueue(kacpid_wq);
>          flush_workqueue(kacpi_notify_wq);
> }
> 
> which means it will flush kacpid_wq and kacpi_notify_wq. So the current
> work should not be in these 2 workqueue, otherwise it will cause
> deadlock: the worker will wait for itself to complete.
> 
> But unfortunately, in the beginning, we have:
> 
> 	acpi_ev_queue_notify_request()
> 	|-->  acpi_os_execute()
> 	     |-->  __acpi_os_execute(type, function, context, 0)
> 
> Please refer to the code, you will see the last parameter 0 will make
> the hotplug call serial in kacpid_wq or kacpi_notify_wq. And it is hard
> coded in kernel. I don't know why and I don't how to fix it.
> 
> So I made this patch, and want to see what you guys think about it. :)
> 
> 
> The deadlock call trace is like below:
> 
> 
> [  302.383606] =============================================
> [  302.448094] [ INFO: possible recursive locking detected ]
> [  302.512578] 3.6.0-rc5-luyh-hostbridge-hotplug+ #13 Not tainted
> [  302.582252] ---------------------------------------------
> [  302.646736] kworker/0:2/1412 is trying to acquire lock:
> [  302.709143]  (kacpi_notify){++++.+}, at: [<ffffffff81091300>]
> flush_workqueue+0x0/0x5c0
> [  302.805222]
> [  302.805222] but task is already holding lock:
> [  302.874898]  (kacpi_notify){++++.+}, at: [<ffffffff81090528>]
> process_one_work+0x1b8/0x680
> [  302.974083]
> [  302.974083] other info that might help us debug this:
> [  303.052067]  Possible unsafe locking scenario:
> [  303.052067]
> [  303.122785]        CPU0
> [  303.151965]        ----
> [  303.181150]   lock(kacpi_notify);
> [  303.220935]   lock(kacpi_notify);
> [  303.260721]
> [  303.260721]  *** DEADLOCK ***
> [  303.260721]
> [  303.331434]  May be due to missing lock nesting notation
> [  303.331434]
> [  303.412529] 4 locks held by kworker/0:2/1412:
> [  303.464553]  #0:  (kacpi_notify){++++.+}, at: [<ffffffff81090528>]
> process_one_work+0x1b8/0x680
> [  303.569042]  #1:  ((&dpc->work)#2){+.+.+.}, at: [<ffffffff81090528>]
> process_one_work+0x1b8/0x680
> [  303.675718]  #2:  (&__lockdep_no_validate__){......}, at:
> [<ffffffff8143cca7>] device_release_driver+0x27/0x50
> [  303.795782]  #3:  (pci_acpi_pm_notify_mtx){+.+.+.}, at:
> [<ffffffff81385443>] remove_pm_notifier+0x33/0x90
> [  303.910662]
> [  303.910662] stack backtrace:
> [  303.962687] Pid: 1412, comm: kworker/0:2 Not tainted
> 3.6.0-rc5-luyh-hostbridge-hotplug+ #13
> [  304.062470] Call Trace:
> [  304.091666]  [<ffffffff810da704>] print_deadlock_bug+0xf4/0x100
> [  304.162384]  [<ffffffff810dc6a9>] validate_chain+0x549/0x7e0
> [  304.229987]  [<ffffffff810dcc36>] __lock_acquire+0x2f6/0x4f0
> [  304.297587]  [<ffffffff810dba65>] ? debug_check_no_locks_freed+0xa5/0xf0
> [  304.377650]  [<ffffffff810dcecd>] lock_acquire+0x9d/0x190
> [  304.442141]  [<ffffffff81091300>] ? flush_workqueue_prep_cwqs+0x260/0x260
> [  304.523242]  [<ffffffff810d8759>] ? lockdep_init_map+0x59/0x150
> [  304.593963]  [<ffffffff810914af>] flush_workqueue+0x1af/0x5c0
> [  304.662605]  [<ffffffff81091300>] ? flush_workqueue_prep_cwqs+0x260/0x260
> [  304.743713]  [<ffffffff810a6ab8>] ? complete+0x28/0x60
> [  304.805084]  [<ffffffff810a6ab8>] ? complete+0x28/0x60
> [  304.866457]  [<ffffffff810db925>] ? trace_hardirqs_on_caller+0x105/0x190
> [  304.946515]  [<ffffffff810a6ab8>] ? complete+0x28/0x60
> [  305.007891]  [<ffffffff81385443>] ? remove_pm_notifier+0x33/0x90
> [  305.079649]  [<ffffffff813854e0>] ?
> pci_acpi_remove_bus_pm_notifier+0x20/0x20
> [  305.164905]  [<ffffffff813a340e>] acpi_os_wait_events_complete+0x21/0x23
> [  305.244970]  [<ffffffff813b7b3c>] acpi_remove_notify_handler+0x47/0x183
> [  305.323994]  [<ffffffff813854e0>] ?
> pci_acpi_remove_bus_pm_notifier+0x20/0x20
> [  305.409251]  [<ffffffff81385481>] remove_pm_notifier+0x71/0x90
> [  305.478931]  [<ffffffff813854d5>]
> pci_acpi_remove_bus_pm_notifier+0x15/0x20
> [  305.562111]  [<ffffffff813aac25>] acpi_pci_root_remove+0x83/0xec
> [  305.633869]  [<ffffffff813a69fc>] acpi_device_remove+0x90/0xb2
> [  305.703548]  [<ffffffff8143cb2c>] __device_release_driver+0x7c/0xf0
> [  305.778422]  [<ffffffff8143ccaf>] device_release_driver+0x2f/0x50
> [  305.851219]  [<ffffffff813a79b5>] acpi_bus_remove+0x32/0x4d
> [  305.917785]  [<ffffffff813a7a57>] acpi_bus_trim+0x87/0xee
> [  305.982276]  [<ffffffff813d888c>] container_notify_cb+0x1c5/0x221
> [  306.055075]  [<ffffffff813b6386>] acpi_ev_notify_dispatch+0x41/0x5f
> [  306.129951]  [<ffffffff813a3437>] acpi_os_execute_deferred+0x27/0x34
> [  306.205861]  [<ffffffff81090589>] process_one_work+0x219/0x680
> [  306.275543]  [<ffffffff81090528>] ? process_one_work+0x1b8/0x680
> [  306.347299]  [<ffffffff813a3410>] ?
> acpi_os_wait_events_complete+0x23/0x23
> [  306.429436]  [<ffffffff810923be>] worker_thread+0x12e/0x320
> [  306.496001]  [<ffffffff81092290>] ? manage_workers+0x110/0x110
> [  306.565680]  [<ffffffff81098396>] kthread+0xc6/0xd0
> [  306.623937]  [<ffffffff8167c3c4>] kernel_thread_helper+0x4/0x10
> [  306.694656]  [<ffffffff81671e30>] ? retint_restore_args+0x13/0x13
> [  306.767451]  [<ffffffff810982d0>] ? __init_kthread_worker+0x70/0x70
> [  306.842323]  [<ffffffff8167c3c0>] ? gs_change+0x13/0x13
> 
> 
> 
> [  519.588281] INFO: task kworker/0:0:4 blocked for more than 120 seconds.
> [  519.667375] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [  519.761130] kworker/0:0     D ffff8801bdcb7b60  5608     4      2
> 0x00000000
> [  519.846044]  ffff8801bdcb7a50 0000000000000046 ffff8801bdcb7a50
> ffff8801bdcb7fd8
> [  519.935363]  ffff8801bdcb6000 ffff8801bdcb6010 ffff8801bdcb6000
> ffff8801bdcb6000
> [  520.024650]  ffff8801bdcb7fd8 ffff8801bdcb6000 ffffffff81c13420
> ffff8801bde18000
> [  520.114003] Call Trace:
> [  520.143402]  [<ffffffff8166ff49>] schedule+0x29/0x70
> [  520.202939]  [<ffffffff8166dd55>] schedule_timeout+0x235/0x2d0
> [  520.272834]  [<ffffffff810d9377>] ? __lock_acquired+0x347/0x380
> [  520.343789]  [<ffffffff8166fd0f>] ? wait_for_common+0x4f/0x180
> [  520.413583]  [<ffffffff8166fde3>] ? wait_for_common+0x123/0x180
> [  520.484526]  [<ffffffff8166fdeb>] wait_for_common+0x12b/0x180
> [  520.553422]  [<ffffffff810b0b60>] ? try_to_wake_up+0x2f0/0x2f0
> [  520.623342]  [<ffffffff810db9bd>] ? trace_hardirqs_on+0xd/0x10
> [  520.693270]  [<ffffffff8166ff1d>] wait_for_completion+0x1d/0x20
> [  520.764219]  [<ffffffff81091587>] flush_workqueue+0x287/0x5c0
> [  520.833087]  [<ffffffff81091300>] ? flush_workqueue_prep_cwqs+0x260/0x260
> [  520.914421]  [<ffffffff813a340e>] acpi_os_wait_events_complete+0x21/0x23
> [  520.994730]  [<ffffffff813a3430>] acpi_os_execute_deferred+0x20/0x34
> [  521.070882]  [<ffffffff81090589>] process_one_work+0x219/0x680
> [  521.140790]  [<ffffffff81090528>] ? process_one_work+0x1b8/0x680
> [  521.212780]  [<ffffffff810922e9>] ? worker_thread+0x59/0x320
> [  521.280609]  [<ffffffff813a3410>] ?
> acpi_os_wait_events_complete+0x23/0x23
> [  521.362994]  [<ffffffff810923be>] worker_thread+0x12e/0x320
> [  521.429815]  [<ffffffff81092290>] ? manage_workers+0x110/0x110
> [  521.499739]  [<ffffffff81098396>] kthread+0xc6/0xd0
> [  521.558261]  [<ffffffff8167c3c4>] kernel_thread_helper+0x4/0x10
> [  521.629217]  [<ffffffff81671e30>] ? retint_restore_args+0x13/0x13
> [  521.702220]  [<ffffffff810982d0>] ? __init_kthread_worker+0x70/0x70
> [  521.777303]  [<ffffffff8167c3c0>] ? gs_change+0x13/0x13
> [  521.839913] INFO: lockdep is turned off.
> 

Thank you for your explanation.
I understang the purpose of the patch.

Thanks,
Yasuaki Ishimatsu



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/2] Improve container_notify_cb() to support container hot-remove.
  2012-10-24  6:05 ` [PATCH v2 2/2] Improve container_notify_cb() to support container hot-remove Tang Chen
@ 2012-10-24 17:14   ` Toshi Kani
  2012-10-24 18:02     ` Bjorn Helgaas
  2012-10-25  1:31     ` Tang Chen
  0 siblings, 2 replies; 15+ messages in thread
From: Toshi Kani @ 2012-10-24 17:14 UTC (permalink / raw)
  To: Tang Chen
  Cc: yinghai, bhelgaas, lenb, jiang.liu, izumi.taku, isimatu.yasuaki,
	mihailm, linux-acpi, linux-pci, linux-kernel

On Wed, 2012-10-24 at 14:05 +0800, Tang Chen wrote:
> This patch introduces a new function container_device_remove() to do
> the container hot-remove job. It works like the following:
> 
> 1. call acpi_bus_trim(device, 0) to stop the container device.
> 2. generate the KOBJ_OFFLINE uevent. (Did I do this correct ?)
> 3. call acpi_bus_hot_remove_device(), which will call acpi_bus_trim(device, 1)
>    to remove the container.
> 
> This patch is based on Lu Yinghai's work.
> git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-pci-split-pci-root-hp-2
> 
> Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
> ---
>  drivers/acpi/container.c |   58 ++++++++++++++++++++++++++++++++++++++-------
>  1 files changed, 49 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/acpi/container.c b/drivers/acpi/container.c
> index d300e03..9676578 100644
> --- a/drivers/acpi/container.c
> +++ b/drivers/acpi/container.c
> @@ -166,6 +166,35 @@ static int container_device_add(struct acpi_device **device, acpi_handle handle)
>  	return result;
>  }
>  
> +static int container_device_remove(struct acpi_device *device)
> +{
> +	int ret;
> +	struct acpi_eject_event *ej_event;
> +
> +	/* stop container device at first */
> +	ret = acpi_bus_trim(device, 0);

Hi Tang,

Why do you need to call acpi_bus_trim(device,0) to stop the container
device first?

> +	printk(KERN_WARNING "acpi_bus_trim stop return %x\n", ret);

Do you need this message in the normal case?  If so, I'd suggest to use
pr_debug().

> +	if (ret)
> +		return ret;
> +
> +	/* event originated from ACPI eject notification */
> +	device->flags.eject_pending = 1;

You do not need to set the eject_pending flag when the handler calls
acpi_bus_hot_remove_device().  It was set before because the handler did
not initiate the hot-remove operation.

> +
> +	/* send the uevent before remove the device */
> +	kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
> +
> +	ej_event = kmalloc(sizeof(*ej_event), GFP_KERNEL);
> +	if (!ej_event)
> +		return -ENOMEM;
> +
> +	ej_event->device = device;
> +	ej_event->event = ACPI_NOTIFY_EJECT_REQUEST;
> +
> +	acpi_bus_hot_remove_device(ej_event);
> +
> +	return 0;
> +}
> +
>  static void __container_notify_cb(struct work_struct *work)
>  {
>  	struct acpi_device *device = NULL;
> @@ -181,6 +210,9 @@ static void __container_notify_cb(struct work_struct *work)
>  	handle = hp_work->handle;
>  	type = hp_work->type;
>  
> +	present = is_device_present(handle);
> +	status = acpi_bus_get_device(handle, &device);
> +
>  	switch (type) {
>  	case ACPI_NOTIFY_BUS_CHECK:
>  		/* Fall through */
> @@ -189,13 +221,14 @@ static void __container_notify_cb(struct work_struct *work)
>  		       (type == ACPI_NOTIFY_BUS_CHECK) ?
>  		       "ACPI_NOTIFY_BUS_CHECK" : "ACPI_NOTIFY_DEVICE_CHECK");
>  
> -		present = is_device_present(handle);
> -		status = acpi_bus_get_device(handle, &device);
>  		if (!present) {
>  			if (ACPI_SUCCESS(status)) {
>  				/* device exist and this is a remove request */
> -				device->flags.eject_pending = 1;
> -				kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
> +				result = container_device_remove(device);
> +				if (result) {
> +					printk(KERN_WARNING "Failed to remove container\n");
> +					break;
> +				}
>  				return;
>  			}
>  			break;
> @@ -215,12 +248,19 @@ static void __container_notify_cb(struct work_struct *work)
>  		break;
>  
>  	case ACPI_NOTIFY_EJECT_REQUEST:
> -		if (!acpi_bus_get_device(handle, &device) && device) {
> -			device->flags.eject_pending = 1;
> -			kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
> -			return;
> +		printk(KERN_WARNING "Container driver received %s event\n",
> +			"ACPI_NOTIFY_EJECT_REQUEST");

Same as other comment.  Suggest to use pr_debug().

> +
> +		if (!present || ACPI_FAILURE(status) || !device)
> +			break;
> +
> +		result = container_device_remove(device);
> +		if (result) {
> +			printk(KERN_WARNING "Failed to remove container\n");

Please use pr_warn().

Thanks,
-Toshi

> +			break;
>  		}
> -		break;
> +
> +		return;
>  
>  	default:
>  		/* non-hotplug event; possibly handled by other handler */



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/2] Improve container_notify_cb() to support container hot-remove.
  2012-10-24 17:14   ` Toshi Kani
@ 2012-10-24 18:02     ` Bjorn Helgaas
  2012-10-25  0:53       ` Jiang Liu
  2012-10-25  1:31     ` Tang Chen
  1 sibling, 1 reply; 15+ messages in thread
From: Bjorn Helgaas @ 2012-10-24 18:02 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Tang Chen, yinghai, lenb, jiang.liu, izumi.taku, isimatu.yasuaki,
	mihailm, linux-acpi, linux-pci, linux-kernel

On Wed, Oct 24, 2012 at 11:14 AM, Toshi Kani <toshi.kani@hp.com> wrote:
> On Wed, 2012-10-24 at 14:05 +0800, Tang Chen wrote:

>> +             result = container_device_remove(device);
>> +             if (result) {
>> +                     printk(KERN_WARNING "Failed to remove container\n");
>
> Please use pr_warn().

I think you should use dev_warn() and similar when possible.  In a
year or two, the text "Failed to remove container" all by itself in a
dmesg log is not going to mean anything to anybody except you, and it
doesn't give any clue where to start looking for issues.

I also have a larger question.  I'm not sure if
drivers/acpi/container.c does anything important in the first place.
The code in it looks an awful lot like the code in
drivers/acpi/processor_driver.c, drivers/acpi/acpi_memhotplug.c, and
drivers/pci/hotplug/acpiphp_glue.c.

To me, it looks like container.c (as well as the other places I
mentioned) is an attempt to compensate for the lack of hotplug support
in the ACPI core.  If the ACPI core had generic hotplug support, e.g.,
if it handled BUS_CHECK, DEVICE_CHECK, and EJECT_REQUEST notifications
and turned those into the appropriate calls to driver .add()/.start()
and .remove() methods, would we need container.c at all?

Bjorn

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/2] Improve container_notify_cb() to support container hot-remove.
  2012-10-24 18:02     ` Bjorn Helgaas
@ 2012-10-25  0:53       ` Jiang Liu
  0 siblings, 0 replies; 15+ messages in thread
From: Jiang Liu @ 2012-10-25  0:53 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Toshi Kani, Tang Chen, yinghai, lenb, izumi.taku,
	isimatu.yasuaki, mihailm, linux-acpi, linux-pci, linux-kernel

On 2012-10-25 2:02, Bjorn Helgaas wrote:
> On Wed, Oct 24, 2012 at 11:14 AM, Toshi Kani <toshi.kani@hp.com> wrote:
>> On Wed, 2012-10-24 at 14:05 +0800, Tang Chen wrote:
> 
>>> +             result = container_device_remove(device);
>>> +             if (result) {
>>> +                     printk(KERN_WARNING "Failed to remove container\n");
>>
>> Please use pr_warn().
> 
> I think you should use dev_warn() and similar when possible.  In a
> year or two, the text "Failed to remove container" all by itself in a
> dmesg log is not going to mean anything to anybody except you, and it
> doesn't give any clue where to start looking for issues.
> 
> I also have a larger question.  I'm not sure if
> drivers/acpi/container.c does anything important in the first place.
> The code in it looks an awful lot like the code in
> drivers/acpi/processor_driver.c, drivers/acpi/acpi_memhotplug.c, and
> drivers/pci/hotplug/acpiphp_glue.c.
> 
> To me, it looks like container.c (as well as the other places I
> mentioned) is an attempt to compensate for the lack of hotplug support
> in the ACPI core.  If the ACPI core had generic hotplug support, e.g.,
> if it handled BUS_CHECK, DEVICE_CHECK, and EJECT_REQUEST notifications
> and turned those into the appropriate calls to driver .add()/.start()
> and .remove() methods, would we need container.c at all?
Hi Bjorn,
	It's true that container driver is just for hotplug. We are working
on a hotplug framework which will consolidate all hotplug logic into one
driver.
	--Gerry

> 
> Bjorn
> 
> .
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/2] Improve container_notify_cb() to support container hot-remove.
  2012-10-24 17:14   ` Toshi Kani
  2012-10-24 18:02     ` Bjorn Helgaas
@ 2012-10-25  1:31     ` Tang Chen
  2012-10-25  1:47       ` Jiang Liu
  1 sibling, 1 reply; 15+ messages in thread
From: Tang Chen @ 2012-10-25  1:31 UTC (permalink / raw)
  To: Toshi Kani
  Cc: yinghai, bhelgaas, lenb, jiang.liu, izumi.taku, isimatu.yasuaki,
	mihailm, linux-acpi, linux-pci, linux-kernel

Hi Toshi,

On 10/25/2012 01:14 AM, Toshi Kani wrote:
> On Wed, 2012-10-24 at 14:05 +0800, Tang Chen wrote:
>> +static int container_device_remove(struct acpi_device *device)
>> +{
>> +	int ret;
>> +	struct acpi_eject_event *ej_event;
>> +
>> +	/* stop container device at first */
>> +	ret = acpi_bus_trim(device, 0);
>
> Hi Tang,
>
> Why do you need to call acpi_bus_trim(device,0) to stop the container
> device first?

This issue was introduced by Lu Yinghai, I think he could give a better
answer than me. :)
Please refer to the following url:

http://www.spinics.net/lists/linux-pci/msg17667.html

However, this is not applied into the pci tree yet.

>
>> +	printk(KERN_WARNING "acpi_bus_trim stop return %x\n", ret);
>
> Do you need this message in the normal case?  If so, I'd suggest to use
> pr_debug().
>
>> +	if (ret)
>> +		return ret;
>> +
>> +	/* event originated from ACPI eject notification */
>> +	device->flags.eject_pending = 1;
>
> You do not need to set the eject_pending flag when the handler calls
> acpi_bus_hot_remove_device().  It was set before because the handler did
> not initiate the hot-remove operation.

I just set it to keep the logic the same as before.
And thanks for telling me this. :)

>
...
>> +		printk(KERN_WARNING "Container driver received %s event\n",
>> +			"ACPI_NOTIFY_EJECT_REQUEST");
>
> Same as other comment.  Suggest to use pr_debug().

OK. :)

>
>> +
>> +		if (!present || ACPI_FAILURE(status) || !device)
>> +			break;
>> +
>> +		result = container_device_remove(device);
>> +		if (result) {
>> +			printk(KERN_WARNING "Failed to remove container\n");
>
> Please use pr_warn().
>
> Thanks,
> -Toshi
>
>> +			break;
>>   		}
>> -		break;
>> +
>> +		return;
>>
>>   	default:
>>   		/* non-hotplug event; possibly handled by other handler */
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/2] Improve container_notify_cb() to support container hot-remove.
  2012-10-25  1:31     ` Tang Chen
@ 2012-10-25  1:47       ` Jiang Liu
  2012-10-25 17:20         ` Toshi Kani
  0 siblings, 1 reply; 15+ messages in thread
From: Jiang Liu @ 2012-10-25  1:47 UTC (permalink / raw)
  To: Tang Chen
  Cc: Toshi Kani, yinghai, bhelgaas, lenb, izumi.taku, isimatu.yasuaki,
	mihailm, linux-acpi, linux-pci, linux-kernel

On 2012-10-25 9:31, Tang Chen wrote:
> Hi Toshi,
> 
> On 10/25/2012 01:14 AM, Toshi Kani wrote:
>> On Wed, 2012-10-24 at 14:05 +0800, Tang Chen wrote:
>>> +static int container_device_remove(struct acpi_device *device)
>>> +{
>>> +    int ret;
>>> +    struct acpi_eject_event *ej_event;
>>> +
>>> +    /* stop container device at first */
>>> +    ret = acpi_bus_trim(device, 0);
>>
>> Hi Tang,
>>
>> Why do you need to call acpi_bus_trim(device,0) to stop the container
>> device first?
> 
> This issue was introduced by Lu Yinghai, I think he could give a better
> answer than me. :)
> Please refer to the following url:
> 
> http://www.spinics.net/lists/linux-pci/msg17667.html
> 
> However, this is not applied into the pci tree yet.
We have worked out a patch set to clean up the logic for PCI/ACPI binding
relationship. It updates PCI/ACPI binding relationship by registering bus
notification onto pci_bus_type instead of hooking into the ACPI/glue.c.

To accommodate that patch set, the ACPI device destroy process has been
split into two steps:
1) acpi_bus_trim(device,0) to unbind ACPI drivers
2) acpi_bus_trim(device,1) to destroy ACPI devices

> 
>>
>>> +    printk(KERN_WARNING "acpi_bus_trim stop return %x\n", ret);
>>
>> Do you need this message in the normal case?  If so, I'd suggest to use
>> pr_debug().
>>
>>> +    if (ret)
>>> +        return ret;
>>> +
>>> +    /* event originated from ACPI eject notification */
>>> +    device->flags.eject_pending = 1;
>>
>> You do not need to set the eject_pending flag when the handler calls
>> acpi_bus_hot_remove_device().  It was set before because the handler did
>> not initiate the hot-remove operation.
> 
> I just set it to keep the logic the same as before.
> And thanks for telling me this. :)
> 
>>
> ...
>>> +        printk(KERN_WARNING "Container driver received %s event\n",
>>> +            "ACPI_NOTIFY_EJECT_REQUEST");
>>
>> Same as other comment.  Suggest to use pr_debug().
> 
> OK. :)
> 
>>
>>> +
>>> +        if (!present || ACPI_FAILURE(status) || !device)
>>> +            break;
>>> +
>>> +        result = container_device_remove(device);
>>> +        if (result) {
>>> +            printk(KERN_WARNING "Failed to remove container\n");
>>
>> Please use pr_warn().
>>
>> Thanks,
>> -Toshi
>>
>>> +            break;
>>>           }
>>> -        break;
>>> +
>>> +        return;
>>>
>>>       default:
>>>           /* non-hotplug event; possibly handled by other handler */
>>
>>
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
> 
> 
> .
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/2] Improve container_notify_cb() to support container hot-remove.
  2012-10-25  1:47       ` Jiang Liu
@ 2012-10-25 17:20         ` Toshi Kani
  2012-10-26  5:43           ` Tang Chen
  0 siblings, 1 reply; 15+ messages in thread
From: Toshi Kani @ 2012-10-25 17:20 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Tang Chen, yinghai, bhelgaas, lenb, izumi.taku, isimatu.yasuaki,
	mihailm, linux-acpi, linux-pci, linux-kernel

On Thu, 2012-10-25 at 09:47 +0800, Jiang Liu wrote:
> On 2012-10-25 9:31, Tang Chen wrote:
> > Hi Toshi,
> > 
> > On 10/25/2012 01:14 AM, Toshi Kani wrote:
> >> On Wed, 2012-10-24 at 14:05 +0800, Tang Chen wrote:
> >>> +static int container_device_remove(struct acpi_device *device)
> >>> +{
> >>> +    int ret;
> >>> +    struct acpi_eject_event *ej_event;
> >>> +
> >>> +    /* stop container device at first */
> >>> +    ret = acpi_bus_trim(device, 0);
> >>
> >> Hi Tang,
> >>
> >> Why do you need to call acpi_bus_trim(device,0) to stop the container
> >> device first?
> > 
> > This issue was introduced by Lu Yinghai, I think he could give a better
> > answer than me. :)
> > Please refer to the following url:
> > 
> > http://www.spinics.net/lists/linux-pci/msg17667.html
> > 
> > However, this is not applied into the pci tree yet.
> We have worked out a patch set to clean up the logic for PCI/ACPI binding
> relationship. It updates PCI/ACPI binding relationship by registering bus
> notification onto pci_bus_type instead of hooking into the ACPI/glue.c.

Thanks for the info and pointer.  Tang, I'd suggest you add such info to
the comment so that others know that this step is needed for removing
PCI bridges.  It helps us to know where to look at...

> To accommodate that patch set, the ACPI device destroy process has been
> split into two steps:
> 1) acpi_bus_trim(device,0) to unbind ACPI drivers

Does this step also detach PCI drivers from PCI cards as well?

Thanks,
-Toshi


> 2) acpi_bus_trim(device,1) to destroy ACPI devices
> 
> > 
> >>
> >>> +    printk(KERN_WARNING "acpi_bus_trim stop return %x\n", ret);
> >>
> >> Do you need this message in the normal case?  If so, I'd suggest to use
> >> pr_debug().
> >>
> >>> +    if (ret)
> >>> +        return ret;
> >>> +
> >>> +    /* event originated from ACPI eject notification */
> >>> +    device->flags.eject_pending = 1;
> >>
> >> You do not need to set the eject_pending flag when the handler calls
> >> acpi_bus_hot_remove_device().  It was set before because the handler did
> >> not initiate the hot-remove operation.
> > 
> > I just set it to keep the logic the same as before.
> > And thanks for telling me this. :)
> > 
> >>
> > ...
> >>> +        printk(KERN_WARNING "Container driver received %s event\n",
> >>> +            "ACPI_NOTIFY_EJECT_REQUEST");
> >>
> >> Same as other comment.  Suggest to use pr_debug().
> > 
> > OK. :)
> > 
> >>
> >>> +
> >>> +        if (!present || ACPI_FAILURE(status) || !device)
> >>> +            break;
> >>> +
> >>> +        result = container_device_remove(device);
> >>> +        if (result) {
> >>> +            printk(KERN_WARNING "Failed to remove container\n");
> >>
> >> Please use pr_warn().
> >>
> >> Thanks,
> >> -Toshi
> >>
> >>> +            break;
> >>>           }
> >>> -        break;
> >>> +
> >>> +        return;
> >>>
> >>>       default:
> >>>           /* non-hotplug event; possibly handled by other handler */
> >>
> >>
> >> -- 
> >> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> >> the body of a message to majordomo@vger.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>
> > 
> > 
> > .
> > 
> 
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/2] Improve container_notify_cb() to support container hot-remove.
  2012-10-25 17:20         ` Toshi Kani
@ 2012-10-26  5:43           ` Tang Chen
  2012-10-26 20:02             ` Toshi Kani
  0 siblings, 1 reply; 15+ messages in thread
From: Tang Chen @ 2012-10-26  5:43 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Jiang Liu, yinghai, bhelgaas, lenb, izumi.taku, isimatu.yasuaki,
	mihailm, linux-acpi, linux-pci, linux-kernel

Hi Toshi,

On 10/26/2012 01:20 AM, Toshi Kani wrote:
...
>>>> Why do you need to call acpi_bus_trim(device,0) to stop the container
>>>> device first?
>>>
>>> This issue was introduced by Lu Yinghai, I think he could give a better
>>> answer than me. :)
>>> Please refer to the following url:
>>>
>>> http://www.spinics.net/lists/linux-pci/msg17667.html
>>>
>>> However, this is not applied into the pci tree yet.
>> We have worked out a patch set to clean up the logic for PCI/ACPI binding
>> relationship. It updates PCI/ACPI binding relationship by registering bus
>> notification onto pci_bus_type instead of hooking into the ACPI/glue.c.
>
> Thanks for the info and pointer.  Tang, I'd suggest you add such info to
> the comment so that others know that this step is needed for removing
> PCI bridges.  It helps us to know where to look at...

OK, I'll add it in the next version. :)

>
>> To accommodate that patch set, the ACPI device destroy process has been
>> split into two steps:
>> 1) acpi_bus_trim(device,0) to unbind ACPI drivers
>
> Does this step also detach PCI drivers from PCI cards as well?

Yes, it calls device_release_driver() to release the device driver.

     device_release_driver()
       |->__device_release_driver()
            |->dev->driver = NULL;

Thanks. :)

>
> Thanks,
> -Toshi
>
>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/2] Improve container_notify_cb() to support container hot-remove.
  2012-10-26  5:43           ` Tang Chen
@ 2012-10-26 20:02             ` Toshi Kani
  0 siblings, 0 replies; 15+ messages in thread
From: Toshi Kani @ 2012-10-26 20:02 UTC (permalink / raw)
  To: Tang Chen
  Cc: Jiang Liu, yinghai, bhelgaas, lenb, izumi.taku, isimatu.yasuaki,
	mihailm, linux-acpi, linux-pci, linux-kernel

On Fri, 2012-10-26 at 13:43 +0800, Tang Chen wrote:
> Hi Toshi,
> 
> On 10/26/2012 01:20 AM, Toshi Kani wrote:
> ...
> >>>> Why do you need to call acpi_bus_trim(device,0) to stop the container
> >>>> device first?
> >>>
> >>> This issue was introduced by Lu Yinghai, I think he could give a better
> >>> answer than me. :)
> >>> Please refer to the following url:
> >>>
> >>> http://www.spinics.net/lists/linux-pci/msg17667.html
> >>>
> >>> However, this is not applied into the pci tree yet.
> >> We have worked out a patch set to clean up the logic for PCI/ACPI binding
> >> relationship. It updates PCI/ACPI binding relationship by registering bus
> >> notification onto pci_bus_type instead of hooking into the ACPI/glue.c.
> >
> > Thanks for the info and pointer.  Tang, I'd suggest you add such info to
> > the comment so that others know that this step is needed for removing
> > PCI bridges.  It helps us to know where to look at...
> 
> OK, I'll add it in the next version. :)
> 
> >
> >> To accommodate that patch set, the ACPI device destroy process has been
> >> split into two steps:
> >> 1) acpi_bus_trim(device,0) to unbind ACPI drivers
> >
> > Does this step also detach PCI drivers from PCI cards as well?
> 
> Yes, it calls device_release_driver() to release the device driver.
> 
>      device_release_driver()
>        |->__device_release_driver()
>             |->dev->driver = NULL;

I see.  Thanks for the info.
-Toshi


> 
> Thanks. :)
> 
> >
> > Thanks,
> > -Toshi
> >
> >
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2012-10-26 20:09 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-10-24  6:05 [PATCH v2 0/2] ACPI: container hot remove support Tang Chen
2012-10-24  6:05 ` [PATCH v2 1/2] Use kacpi_hotplug_wq to handle container hotplug event Tang Chen
2012-10-24  6:54   ` Yasuaki Ishimatsu
2012-10-24  7:24     ` Tang Chen
2012-10-24  8:09       ` Yasuaki Ishimatsu
2012-10-24  7:42     ` Tang Chen
2012-10-24  6:05 ` [PATCH v2 2/2] Improve container_notify_cb() to support container hot-remove Tang Chen
2012-10-24 17:14   ` Toshi Kani
2012-10-24 18:02     ` Bjorn Helgaas
2012-10-25  0:53       ` Jiang Liu
2012-10-25  1:31     ` Tang Chen
2012-10-25  1:47       ` Jiang Liu
2012-10-25 17:20         ` Toshi Kani
2012-10-26  5:43           ` Tang Chen
2012-10-26 20:02             ` Toshi Kani

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).