* [PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
From: Rafael J. Wysocki @ 2013-02-13  0:19 UTC
  To: ACPI Devel Mailing List
  Cc: LKML, Bjorn Helgaas, Jiang Liu, Yinghai Lu, Toshi Kani,
	Yasuaki Ishimatsu, Myron Stowe, linux-pci

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

This changeset is aimed at fixing a few different but related
problems in the ACPI hotplug infrastructure.

First of all, since notify handlers may be run in parallel with
acpi_bus_scan(), acpi_bus_trim() and acpi_bus_hot_remove_device()
and some of them are installed for ACPI handles that have no struct
acpi_device objects attached (i.e. before those objects are created),
those notify handlers have to take acpi_scan_lock to prevent races
from taking place (e.g. a struct acpi_device is found to be present
for the given ACPI handle, but right after that it is removed by
acpi_bus_trim() running in parallel to the given notify handler).
Moreover, since some of them call acpi_bus_scan() and
acpi_bus_trim(), this leads to the conclusion that acpi_scan_lock
should be acquired by the callers of these two functions rather than by
these functions themselves.

For these reasons, make all notify handlers that can handle device
addition and eject events take acpi_scan_lock and remove the
acpi_scan_lock locking from acpi_bus_scan() and acpi_bus_trim().
Accordingly, update all of their users to make sure that they
are always called under acpi_scan_lock.
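
As a rough user-space illustration of the resulting locking rule (this is
not kernel code: scan_lock, scan_lock_held, node_present, bus_scan(),
bus_trim() and notify_handler() are made-up stand-ins, and a pthread mutex
plays the role of acpi_scan_lock), the caller-held pattern looks roughly
like this:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for acpi_scan_lock and the device tree state. */
static pthread_mutex_t scan_lock = PTHREAD_MUTEX_INITIALIZER;
static bool scan_lock_held;	/* debugging aid, written only under scan_lock */
static bool node_present;	/* models "a device node exists for the handle" */

static void scan_lock_acquire(void)
{
	pthread_mutex_lock(&scan_lock);
	scan_lock_held = true;
}

static void scan_lock_release(void)
{
	scan_lock_held = false;
	pthread_mutex_unlock(&scan_lock);
}

/* Like acpi_bus_scan() after this change: takes no lock itself. */
static void bus_scan(void)
{
	assert(scan_lock_held);	/* the caller must hold the lock */
	node_present = true;
}

/* Like acpi_bus_trim() after this change: takes no lock itself. */
static void bus_trim(void)
{
	assert(scan_lock_held);	/* the caller must hold the lock */
	node_present = false;
}

/*
 * A notify handler: the lookup and the scan/trim it leads to sit in one
 * critical section, so the node cannot go away between the two steps.
 */
static void notify_handler(bool eject)
{
	scan_lock_acquire();
	if (eject) {
		if (node_present)
			bus_trim();
	} else if (!node_present) {
		bus_scan();
	}
	scan_lock_release();
}
```

Calling notify_handler(false) and then notify_handler(true) adds and
removes the modelled node.  Had bus_scan() and bus_trim() taken the lock
themselves, the handler could not have held it across the lookup without
deadlocking on a non-recursive mutex.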

Furthermore, since eject operations are carried out asynchronously
with respect to the notify events that trigger them, with the help
of acpi_bus_hot_remove_device(), even if notify handlers take the
ACPI scan lock, it still is possible that, for example,
acpi_bus_trim() will run between acpi_bus_hot_remove_device() and
the notify handler that scheduled its execution and that
acpi_bus_trim() will remove the device node passed to
acpi_bus_hot_remove_device() for ejection.  In that case, the struct
acpi_device object obtained by acpi_bus_hot_remove_device() will be
invalid and not-so-funny things will ensue.  To protect against that,
make the users of acpi_bus_hot_remove_device() run get_device() on
ACPI device node objects that are about to be passed to it and make
acpi_bus_hot_remove_device() run put_device() on them and check if
their ACPI handles are not NULL (make acpi_device_unregister() clear
the device nodes' ACPI handles for that check to work).
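
A minimal single-threaded model of that scheme may help (everything here
is hypothetical: struct node, node_get(), node_put(), node_unregister() and
deferred_eject() only mimic get_device(), put_device(),
acpi_device_unregister() and acpi_bus_hot_remove_device(), and a plain int
stands in for the real atomic reference count):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of an ACPI device node. */
struct node {
	int refcount;
	void *handle;	/* NULL once the node has been unregistered */
};

static int nodes_freed;	/* counts how many nodes were really released */

static struct node *node_create(void *handle)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return NULL;
	n->refcount = 1;	/* reference owned by the device tree */
	n->handle = handle;
	return n;
}

static void node_get(struct node *n)
{
	n->refcount++;
}

static void node_put(struct node *n)
{
	if (--n->refcount == 0) {
		nodes_freed++;
		free(n);
	}
}

/* Mimics unregistration: clear the handle, drop the tree's reference. */
static void node_unregister(struct node *n)
{
	n->handle = NULL;
	node_put(n);
}

/*
 * Mimics the deferred eject routine.  Returns true if the eject was
 * carried out, false if the node had been unregistered in the meantime.
 * Either way the reference taken by whoever scheduled it is dropped.
 */
static bool deferred_eject(struct node *n)
{
	bool ejected = false;

	if (n->handle) {
		node_unregister(n);
		ejected = true;
	}
	node_put(n);
	return ejected;
}
```

If a trim wins the race (node_get(), then node_unregister(), then
deferred_eject()), the extra reference keeps the memory valid until the
deferred routine has seen the NULL handle and backed off.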

Finally, observe that acpi_os_hotplug_execute() actually can fail,
in which case its caller ought to free memory allocated for the
context object to prevent leaks from happening.  It also needs to
run put_device() on the device node that it ran get_device() on
previously in that case.  Modify the code accordingly.
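
The cleanup rule can be sketched in user space as follows (again purely
illustrative: execute_deferred() is an invented stand-in for
acpi_os_hotplug_execute() that either fails or, for simplicity, runs the
work immediately, and device_refs and live_contexts are counters added
only so the bookkeeping can be checked):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

static int device_refs = 1;	/* models the device node's reference count */
static int live_contexts;	/* context objects allocated and not yet freed */

struct eject_ctx {
	int event;
};

/* The deferred work: on success it owns the context and the reference. */
static void eject_work(void *context)
{
	device_refs--;
	free(context);
	live_contexts--;
}

/*
 * Invented stand-in for queueing deferred work.  Returns 0 on success
 * (and runs fn right away in this model) or -1 if queueing failed, in
 * which case fn is never called.
 */
static int execute_deferred(void (*fn)(void *), void *context, bool fail)
{
	if (fail)
		return -1;
	fn(context);
	return 0;
}

static int request_eject(bool fail)
{
	struct eject_ctx *ctx = malloc(sizeof(*ctx));

	if (!ctx)
		return -1;
	live_contexts++;
	ctx->event = 3;		/* arbitrary value for the model */
	device_refs++;		/* reference handed to the deferred work */

	if (execute_deferred(eject_work, ctx, fail)) {
		/* The work will never run, so undo both steps here. */
		device_refs--;
		free(ctx);
		live_contexts--;
		return -1;
	}
	return 0;
}
```

Whether queueing fails or succeeds, device_refs ends up back at 1 and
live_contexts at 0.  Without the failure branch the context would leak and
the node would keep an extra reference forever.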

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

On top of linux-pm.git/linux-next.

Thanks,
Rafael

---
 drivers/acpi/acpi_memhotplug.c     |   56 +++++++++++++++++++-----------
 drivers/acpi/container.c           |   10 +++--
 drivers/acpi/dock.c                |   19 ++++++++--
 drivers/acpi/processor_driver.c    |   24 +++++++++---
 drivers/acpi/scan.c                |   69 +++++++++++++++++++++++++------------
 drivers/pci/hotplug/acpiphp_glue.c |    6 +++
 drivers/pci/hotplug/sgi_hotplug.c  |    5 ++
 include/acpi/acpi_bus.h            |    3 +
 8 files changed, 138 insertions(+), 54 deletions(-)

Index: test/drivers/acpi/scan.c
===================================================================
--- test.orig/drivers/acpi/scan.c
+++ test/drivers/acpi/scan.c
@@ -42,6 +42,18 @@ struct acpi_device_bus_id{
 	struct list_head node;
 };
 
+void acpi_scan_lock_acquire(void)
+{
+	mutex_lock(&acpi_scan_lock);
+}
+EXPORT_SYMBOL_GPL(acpi_scan_lock_acquire);
+
+void acpi_scan_lock_release(void)
+{
+	mutex_unlock(&acpi_scan_lock);
+}
+EXPORT_SYMBOL_GPL(acpi_scan_lock_release);
+
 int acpi_scan_add_handler(struct acpi_scan_handler *handler)
 {
 	if (!handler || !handler->attach)
@@ -95,8 +107,6 @@ acpi_device_modalias_show(struct device
 }
 static DEVICE_ATTR(modalias, 0444, acpi_device_modalias_show, NULL);
 
-static void __acpi_bus_trim(struct acpi_device *start);
-
 /**
  * acpi_bus_hot_remove_device: hot-remove a device and its children
  * @context: struct acpi_eject_event pointer (freed in this func)
@@ -107,7 +117,7 @@ static void __acpi_bus_trim(struct acpi_
  */
 void acpi_bus_hot_remove_device(void *context)
 {
-	struct acpi_eject_event *ej_event = (struct acpi_eject_event *) context;
+	struct acpi_eject_event *ej_event = context;
 	struct acpi_device *device = ej_event->device;
 	acpi_handle handle = device->handle;
 	acpi_handle temp;
@@ -118,11 +128,19 @@ void acpi_bus_hot_remove_device(void *co
 
 	mutex_lock(&acpi_scan_lock);
 
+	/* If there is no handle, the device node has been unregistered. */
+	if (!device->handle) {
+		dev_dbg(&device->dev, "ACPI handle missing\n");
+		put_device(&device->dev);
+		goto out;
+	}
+
 	ACPI_DEBUG_PRINT((ACPI_DB_INFO,
 		"Hot-removing device %s...\n", dev_name(&device->dev)));
 
-	__acpi_bus_trim(device);
-	/* Device node has been released. */
+	acpi_bus_trim(device);
+	/* Device node has been unregistered. */
+	put_device(&device->dev);
 	device = NULL;
 
 	if (ACPI_SUCCESS(acpi_get_handle(handle, "_LCK", &temp))) {
@@ -151,6 +169,7 @@ void acpi_bus_hot_remove_device(void *co
 					  ost_code, NULL);
 	}
 
+ out:
 	mutex_unlock(&acpi_scan_lock);
 	kfree(context);
 	return;
@@ -212,6 +231,7 @@ acpi_eject_store(struct device *d, struc
 		goto err;
 	}
 
+	get_device(&acpi_device->dev);
 	ej_event->device = acpi_device;
 	if (acpi_device->flags.eject_pending) {
 		/* event originated from ACPI eject notification */
@@ -224,7 +244,11 @@ acpi_eject_store(struct device *d, struc
 			ej_event->event, ACPI_OST_SC_EJECT_IN_PROGRESS, NULL);
 	}
 
-	acpi_os_hotplug_execute(acpi_bus_hot_remove_device, (void *)ej_event);
+	status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device, ej_event);
+	if (ACPI_FAILURE(status)) {
+		put_device(&acpi_device->dev);
+		kfree(ej_event);
+	}
 err:
 	return ret;
 }
@@ -779,6 +803,7 @@ static void acpi_device_unregister(struc
 	 * no more references.
 	 */
 	acpi_device_set_power(device, ACPI_STATE_D3_COLD);
+	device->handle = NULL;
 	put_device(&device->dev);
 }
 
@@ -1626,14 +1651,14 @@ static acpi_status acpi_bus_device_attac
  * there has been a real error.  There just have been no suitable ACPI objects
  * in the table trunk from which the kernel could create a device and add an
  * appropriate driver.
+ *
+ * Must be called under acpi_scan_lock.
  */
 int acpi_bus_scan(acpi_handle handle)
 {
 	void *device = NULL;
 	int error = 0;
 
-	mutex_lock(&acpi_scan_lock);
-
 	if (ACPI_SUCCESS(acpi_bus_check_add(handle, 0, NULL, &device)))
 		acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
 				    acpi_bus_check_add, NULL, NULL, &device);
@@ -1644,7 +1669,6 @@ int acpi_bus_scan(acpi_handle handle)
 		acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
 				    acpi_bus_device_attach, NULL, NULL, NULL);
 
-	mutex_unlock(&acpi_scan_lock);
 	return error;
 }
 EXPORT_SYMBOL(acpi_bus_scan);
@@ -1681,7 +1705,13 @@ static acpi_status acpi_bus_remove(acpi_
 	return AE_OK;
 }
 
-static void __acpi_bus_trim(struct acpi_device *start)
+/**
+ * acpi_bus_trim - Remove ACPI device node and all of its descendants
+ * @start: Root of the ACPI device nodes subtree to remove.
+ *
+ * Must be called under acpi_scan_lock.
+ */
+void acpi_bus_trim(struct acpi_device *start)
 {
 	/*
 	 * Execute acpi_bus_device_detach() as a post-order callback to detach
@@ -1698,13 +1728,6 @@ static void __acpi_bus_trim(struct acpi_
 			    acpi_bus_remove, NULL, NULL);
 	acpi_bus_remove(start->handle, 0, NULL, NULL);
 }
-
-void acpi_bus_trim(struct acpi_device *start)
-{
-	mutex_lock(&acpi_scan_lock);
-	__acpi_bus_trim(start);
-	mutex_unlock(&acpi_scan_lock);
-}
 EXPORT_SYMBOL_GPL(acpi_bus_trim);
 
 static int acpi_bus_scan_fixed(void)
@@ -1762,23 +1785,27 @@ int __init acpi_scan_init(void)
 	acpi_csrt_init();
 	acpi_container_init();
 
+	mutex_lock(&acpi_scan_lock);
 	/*
 	 * Enumerate devices in the ACPI namespace.
 	 */
 	result = acpi_bus_scan(ACPI_ROOT_OBJECT);
 	if (result)
-		return result;
+		goto out;
 
 	result = acpi_bus_get_device(ACPI_ROOT_OBJECT, &acpi_root);
 	if (result)
-		return result;
+		goto out;
 
 	result = acpi_bus_scan_fixed();
 	if (result) {
 		acpi_device_unregister(acpi_root);
-		return result;
+		goto out;
 	}
 
 	acpi_update_all_gpes();
-	return 0;
+
+ out:
+	mutex_unlock(&acpi_scan_lock);
+	return result;
 }
Index: test/include/acpi/acpi_bus.h
===================================================================
--- test.orig/include/acpi/acpi_bus.h
+++ test/include/acpi/acpi_bus.h
@@ -395,6 +395,9 @@ int acpi_bus_receive_event(struct acpi_b
 static inline int acpi_bus_generate_proc_event(struct acpi_device *device, u8 type, int data)
 	{ return 0; }
 #endif
+
+void acpi_scan_lock_acquire(void);
+void acpi_scan_lock_release(void);
 int acpi_scan_add_handler(struct acpi_scan_handler *handler);
 int acpi_bus_register_driver(struct acpi_driver *driver);
 void acpi_bus_unregister_driver(struct acpi_driver *driver);
Index: test/drivers/acpi/acpi_memhotplug.c
===================================================================
--- test.orig/drivers/acpi/acpi_memhotplug.c
+++ test/drivers/acpi/acpi_memhotplug.c
@@ -153,14 +153,16 @@ acpi_memory_get_device_resources(struct
 	return 0;
 }
 
-static int
-acpi_memory_get_device(acpi_handle handle,
-		       struct acpi_memory_device **mem_device)
+static int acpi_memory_get_device(acpi_handle handle,
+				  struct acpi_memory_device **mem_device)
 {
 	struct acpi_device *device = NULL;
-	int result;
+	int result = 0;
+
+	acpi_scan_lock_acquire();
 
-	if (!acpi_bus_get_device(handle, &device) && device)
+	acpi_bus_get_device(handle, &device);
+	if (device)
 		goto end;
 
 	/*
@@ -169,23 +171,28 @@ acpi_memory_get_device(acpi_handle handl
 	 */
 	result = acpi_bus_scan(handle);
 	if (result) {
-		acpi_handle_warn(handle, "Cannot add acpi bus\n");
-		return -EINVAL;
+		acpi_handle_warn(handle, "ACPI namespace scan failed\n");
+		result = -EINVAL;
+		goto out;
 	}
 	result = acpi_bus_get_device(handle, &device);
 	if (result) {
 		acpi_handle_warn(handle, "Missing device object\n");
-		return -EINVAL;
+		result = -EINVAL;
+		goto out;
 	}
 
-      end:
+ end:
 	*mem_device = acpi_driver_data(device);
 	if (!(*mem_device)) {
 		dev_err(&device->dev, "driver data not found\n");
-		return -ENODEV;
+		result = -ENODEV;
+		goto out;
 	}
 
-	return 0;
+ out:
+	acpi_scan_lock_release();
+	return result;
 }
 
 static int acpi_memory_check_device(struct acpi_memory_device *mem_device)
@@ -305,6 +312,7 @@ static void acpi_memory_device_notify(ac
 	struct acpi_device *device;
 	struct acpi_eject_event *ej_event = NULL;
 	u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
+	acpi_status status;
 
 	switch (event) {
 	case ACPI_NOTIFY_BUS_CHECK:
@@ -327,29 +335,40 @@ static void acpi_memory_device_notify(ac
 		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
 				  "\nReceived EJECT REQUEST notification for device\n"));
 
+		status = AE_ERROR;
+		acpi_scan_lock_acquire();
+
 		if (acpi_bus_get_device(handle, &device)) {
 			acpi_handle_err(handle, "Device doesn't exist\n");
-			break;
+			goto unlock;
 		}
 		mem_device = acpi_driver_data(device);
 		if (!mem_device) {
 			acpi_handle_err(handle, "Driver Data is NULL\n");
-			break;
+			goto unlock;
 		}
 
 		ej_event = kmalloc(sizeof(*ej_event), GFP_KERNEL);
 		if (!ej_event) {
 			pr_err(PREFIX "No memory, dropping EJECT\n");
-			break;
+			goto unlock;
 		}
 
+		get_device(&device->dev);
 		ej_event->device = device;
 		ej_event->event = ACPI_NOTIFY_EJECT_REQUEST;
-		acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
-					(void *)ej_event);
+		/* The eject is carried out asynchronously. */
+		status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
+						 ej_event);
+		if (ACPI_FAILURE(status)) {
+			put_device(&device->dev);
+			kfree(ej_event);
+		}
 
-		/* eject is performed asynchronously */
-		return;
+ unlock:
+		acpi_scan_lock_release();
+		if (ACPI_SUCCESS(status))
+			return;
 	default:
 		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
 				  "Unsupported event [0x%x]\n", event));
@@ -360,7 +379,6 @@ static void acpi_memory_device_notify(ac
 
 	/* Inform firmware that the hotplug operation has completed */
 	(void) acpi_evaluate_hotplug_ost(handle, event, ost_code, NULL);
-	return;
 }
 
 static void acpi_memory_device_free(struct acpi_memory_device *mem_device)
Index: test/drivers/acpi/processor_driver.c
===================================================================
--- test.orig/drivers/acpi/processor_driver.c
+++ test/drivers/acpi/processor_driver.c
@@ -683,8 +683,11 @@ static void acpi_processor_hotplug_notif
 	struct acpi_device *device = NULL;
 	struct acpi_eject_event *ej_event = NULL;
 	u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
+	acpi_status status;
 	int result;
 
+	acpi_scan_lock_acquire();
+
 	switch (event) {
 	case ACPI_NOTIFY_BUS_CHECK:
 	case ACPI_NOTIFY_DEVICE_CHECK:
@@ -733,25 +736,32 @@ static void acpi_processor_hotplug_notif
 			break;
 		}
 
+		get_device(&device->dev);
 		ej_event->device = device;
 		ej_event->event = ACPI_NOTIFY_EJECT_REQUEST;
-		acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
-					(void *)ej_event);
-
-		/* eject is performed asynchronously */
-		return;
+		/* The eject is carried out asynchronously. */
+		status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
+						 ej_event);
+		if (ACPI_FAILURE(status)) {
+			put_device(&device->dev);
+			kfree(ej_event);
+			break;
+		}
+		goto out;
 
 	default:
 		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
 				  "Unsupported event [0x%x]\n", event));
 
 		/* non-hotplug event; possibly handled by other handler */
-		return;
+		goto out;
 	}
 
 	/* Inform firmware that the hotplug operation has completed */
 	(void) acpi_evaluate_hotplug_ost(handle, event, ost_code, NULL);
-	return;
+
+ out:
+	acpi_scan_lock_release();
 }
 
 static acpi_status is_processor_device(acpi_handle handle)
Index: test/drivers/acpi/container.c
===================================================================
--- test.orig/drivers/acpi/container.c
+++ test/drivers/acpi/container.c
@@ -88,6 +88,8 @@ static void container_notify_cb(acpi_han
 	acpi_status status;
 	u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
 
+	acpi_scan_lock_acquire();
+
 	switch (type) {
 	case ACPI_NOTIFY_BUS_CHECK:
 		/* Fall through */
@@ -130,18 +132,20 @@ static void container_notify_cb(acpi_han
 		if (!acpi_bus_get_device(handle, &device) && device) {
 			device->flags.eject_pending = 1;
 			kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
-			return;
+			goto out;
 		}
 		break;
 
 	default:
 		/* non-hotplug event; possibly handled by other handler */
-		return;
+		goto out;
 	}
 
 	/* Inform firmware that the hotplug operation has completed */
 	(void) acpi_evaluate_hotplug_ost(handle, type, ost_code, NULL);
-	return;
+
+ out:
+	acpi_scan_lock_release();
 }
 
 static bool is_container(acpi_handle handle)
Index: test/drivers/acpi/dock.c
===================================================================
--- test.orig/drivers/acpi/dock.c
+++ test/drivers/acpi/dock.c
@@ -744,7 +744,9 @@ static void acpi_dock_deferred_cb(void *
 {
 	struct dock_data *data = context;
 
+	acpi_scan_lock_acquire();
 	dock_notify(data->handle, data->event, data->ds);
+	acpi_scan_lock_release();
 	kfree(data);
 }
 
@@ -757,20 +759,31 @@ static int acpi_dock_notifier_call(struc
 	if (event != ACPI_NOTIFY_BUS_CHECK && event != ACPI_NOTIFY_DEVICE_CHECK
 	   && event != ACPI_NOTIFY_EJECT_REQUEST)
 		return 0;
+
+	acpi_scan_lock_acquire();
+
 	list_for_each_entry(dock_station, &dock_stations, sibling) {
 		if (dock_station->handle == handle) {
 			struct dock_data *dd;
+			acpi_status status;
 
 			dd = kmalloc(sizeof(*dd), GFP_KERNEL);
 			if (!dd)
-				return 0;
+				break;
+
 			dd->handle = handle;
 			dd->event = event;
 			dd->ds = dock_station;
-			acpi_os_hotplug_execute(acpi_dock_deferred_cb, dd);
-			return 0 ;
+			status = acpi_os_hotplug_execute(acpi_dock_deferred_cb,
+							 dd);
+			if (ACPI_FAILURE(status))
+				kfree(dd);
+
+			break;
 		}
 	}
+
+	acpi_scan_lock_release();
 	return 0;
 }
 
Index: test/drivers/pci/hotplug/acpiphp_glue.c
===================================================================
--- test.orig/drivers/pci/hotplug/acpiphp_glue.c
+++ test/drivers/pci/hotplug/acpiphp_glue.c
@@ -1218,6 +1218,8 @@ static void _handle_hotplug_event_bridge
 	handle = hp_work->handle;
 	type = hp_work->type;
 
+	acpi_scan_lock_acquire();
+
 	if (acpi_bus_get_device(handle, &device)) {
 		/* This bridge must have just been physically inserted */
 		handle_bridge_insertion(handle, type);
@@ -1295,6 +1297,7 @@ static void _handle_hotplug_event_bridge
 	}
 
 out:
+	acpi_scan_lock_release();
 	kfree(hp_work); /* allocated in handle_hotplug_event_bridge */
 }
 
@@ -1341,6 +1344,8 @@ static void _handle_hotplug_event_func(s
 
 	func = (struct acpiphp_func *)context;
 
+	acpi_scan_lock_acquire();
+
 	switch (type) {
 	case ACPI_NOTIFY_BUS_CHECK:
 		/* bus re-enumerate */
@@ -1371,6 +1376,7 @@ static void _handle_hotplug_event_func(s
 		break;
 	}
 
+	acpi_scan_lock_release();
 	kfree(hp_work); /* allocated in handle_hotplug_event_func */
 }
 
Index: test/drivers/pci/hotplug/sgi_hotplug.c
===================================================================
--- test.orig/drivers/pci/hotplug/sgi_hotplug.c
+++ test/drivers/pci/hotplug/sgi_hotplug.c
@@ -425,6 +425,7 @@ static int enable_slot(struct hotplug_sl
 			pdevice = NULL;
 		}
 
+		acpi_scan_lock_acquire();
 		/*
 		 * Walk the rootbus node's immediate children looking for
 		 * the slot's device node(s). There can be more than
@@ -458,6 +459,7 @@ static int enable_slot(struct hotplug_sl
 				}
 			}
 		}
+		acpi_scan_lock_release();
 	}
 
 	/* Call the driver for the new device */
@@ -508,6 +510,7 @@ static int disable_slot(struct hotplug_s
 		/* Get the rootbus node pointer */
 		phandle = PCI_CONTROLLER(slot->pci_bus)->acpi_handle;
 
+		acpi_scan_lock_acquire();
 		/*
 		 * Walk the rootbus node's immediate children looking for
 		 * the slot's device node(s). There can be more than
@@ -538,7 +541,7 @@ static int disable_slot(struct hotplug_s
 					acpi_bus_trim(device);
 			}
 		}
-
+		acpi_scan_lock_release();
 	}
 
 	/* Free the SN resources assigned to the Linux device.*/

* Re: [PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
From: Yinghai Lu @ 2013-02-13  1:55 UTC
  To: Rafael J. Wysocki
  Cc: ACPI Devel Mailing List, LKML, Bjorn Helgaas, Jiang Liu,
	Toshi Kani, Yasuaki Ishimatsu, Myron Stowe, linux-pci

On Tue, Feb 12, 2013 at 4:19 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> This changeset is aimed at fixing a few different but related
> problems in the ACPI hotplug infrastructure.
>
> First of all, since notify handlers may be run in parallel with
> acpi_bus_scan(), acpi_bus_trim() and acpi_bus_hot_remove_device()
> and some of them are installed for ACPI handles that have no struct
> acpi_device objects attached (i.e. before those objects are created),
> those notify handlers have to take acpi_scan_lock to prevent races
> from taking place (e.g. a struct acpi_device is found to be present
> for the given ACPI handle, but right after that it is removed by
> acpi_bus_trim() running in parallel to the given notify handler).
> Moreover, since some of them call acpi_bus_scan() and
> acpi_bus_trim(), this leads to the conclusion that acpi_scan_lock
> should be acquired by the callers of these two functions rather than by
> these functions themselves.
>
> For these reasons, make all notify handlers that can handle device
> addition and eject events take acpi_scan_lock and remove the
> acpi_scan_lock locking from acpi_bus_scan() and acpi_bus_trim().
> Accordingly, update all of their users to make sure that they
> are always called under acpi_scan_lock.
>
> Furthermore, since eject operations are carried out asynchronously
> with respect to the notify events that trigger them, with the help
> of acpi_bus_hot_remove_device(), even if notify handlers take the
> ACPI scan lock, it still is possible that, for example,
> acpi_bus_trim() will run between acpi_bus_hot_remove_device() and
> the notify handler that scheduled its execution and that
> acpi_bus_trim() will remove the device node passed to
> acpi_bus_hot_remove_device() for ejection.  In that case, the struct
> acpi_device object obtained by acpi_bus_hot_remove_device() will be
> invalid and not-so-funny things will ensue.  To protect against that,
> make the users of acpi_bus_hot_remove_device() run get_device() on
> ACPI device node objects that are about to be passed to it and make
> acpi_bus_hot_remove_device() run put_device() on them and check if
> their ACPI handles are not NULL (make acpi_device_unregister() clear
> the device nodes' ACPI handles for that check to work).
>
> Finally, observe that acpi_os_hotplug_execute() actually can fail,
> in which case its caller ought to free memory allocated for the
> context object to prevent leaks from happening.  It also needs to
> run put_device() on the device node that it ran get_device() on
> previously in that case.  Modify the code accordingly.
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Acked-by: Yinghai Lu <yinghai@kernel.org>

* Re: [PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
From: Yasuaki Ishimatsu @ 2013-02-13  3:08 UTC
  To: Rafael J. Wysocki
  Cc: ACPI Devel Mailing List, LKML, Bjorn Helgaas, Jiang Liu,
	Yinghai Lu, Toshi Kani, Myron Stowe, linux-pci

Hi Rafael,

The patch seems good.
There is a comment below.

2013/02/13 9:19, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> This changeset is aimed at fixing a few different but related
> problems in the ACPI hotplug infrastructure.
>
> First of all, since notify handlers may be run in parallel with
> acpi_bus_scan(), acpi_bus_trim() and acpi_bus_hot_remove_device()
> and some of them are installed for ACPI handles that have no struct
> acpi_device objects attached (i.e. before those objects are created),
> those notify handlers have to take acpi_scan_lock to prevent races
> from taking place (e.g. a struct acpi_device is found to be present
> for the given ACPI handle, but right after that it is removed by
> acpi_bus_trim() running in parallel to the given notify handler).
> Moreover, since some of them call acpi_bus_scan() and
> acpi_bus_trim(), this leads to the conclusion that acpi_scan_lock
> should be acquired by the callers of these two functions rather than by
> these functions themselves.
>
> For these reasons, make all notify handlers that can handle device
> addition and eject events take acpi_scan_lock and remove the
> acpi_scan_lock locking from acpi_bus_scan() and acpi_bus_trim().
> Accordingly, update all of their users to make sure that they
> are always called under acpi_scan_lock.
>
> Furthermore, since eject operations are carried out asynchronously
> with respect to the notify events that trigger them, with the help
> of acpi_bus_hot_remove_device(), even if notify handlers take the
> ACPI scan lock, it still is possible that, for example,
> acpi_bus_trim() will run between acpi_bus_hot_remove_device() and
> the notify handler that scheduled its execution and that
> acpi_bus_trim() will remove the device node passed to
> acpi_bus_hot_remove_device() for ejection.  In that case, the struct
> acpi_device object obtained by acpi_bus_hot_remove_device() will be
> invalid and not-so-funny things will ensue.  To protect against that,
> make the users of acpi_bus_hot_remove_device() run get_device() on
> ACPI device node objects that are about to be passed to it and make
> acpi_bus_hot_remove_device() run put_device() on them and check if
> their ACPI handles are not NULL (make acpi_device_unregister() clear
> the device nodes' ACPI handles for that check to work).
>
> Finally, observe that acpi_os_hotplug_execute() actually can fail,
> in which case its caller ought to free memory allocated for the
> context object to prevent leaks from happening.  It also needs to
> run put_device() on the device node that it ran get_device() on
> previously in that case.  Modify the code accordingly.
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>
> On top of linux-pm.git/linux-next.
>
> Thanks,
> Rafael
>
> ---
>   drivers/acpi/acpi_memhotplug.c     |   56 +++++++++++++++++++-----------
>   drivers/acpi/container.c           |   10 +++--
>   drivers/acpi/dock.c                |   19 ++++++++--
>   drivers/acpi/processor_driver.c    |   24 +++++++++---
>   drivers/acpi/scan.c                |   69 +++++++++++++++++++++++++------------
>   drivers/pci/hotplug/acpiphp_glue.c |    6 +++
>   drivers/pci/hotplug/sgi_hotplug.c  |    5 ++
>   include/acpi/acpi_bus.h            |    3 +
>   8 files changed, 138 insertions(+), 54 deletions(-)
>
> Index: test/drivers/acpi/scan.c
> ===================================================================
> --- test.orig/drivers/acpi/scan.c
> +++ test/drivers/acpi/scan.c
> @@ -42,6 +42,18 @@ struct acpi_device_bus_id{
>   	struct list_head node;
>   };
>
> +void acpi_scan_lock_acquire(void)
> +{
> +	mutex_lock(&acpi_scan_lock);
> +}
> +EXPORT_SYMBOL_GPL(acpi_scan_lock_acquire);
> +
> +void acpi_scan_lock_release(void)
> +{
> +	mutex_unlock(&acpi_scan_lock);
> +}
> +EXPORT_SYMBOL_GPL(acpi_scan_lock_release);
> +
>   int acpi_scan_add_handler(struct acpi_scan_handler *handler)
>   {
>   	if (!handler || !handler->attach)
> @@ -95,8 +107,6 @@ acpi_device_modalias_show(struct device
>   }
>   static DEVICE_ATTR(modalias, 0444, acpi_device_modalias_show, NULL);
>
> -static void __acpi_bus_trim(struct acpi_device *start);
> -
>   /**
>    * acpi_bus_hot_remove_device: hot-remove a device and its children
>    * @context: struct acpi_eject_event pointer (freed in this func)
> @@ -107,7 +117,7 @@ static void __acpi_bus_trim(struct acpi_
>    */
>   void acpi_bus_hot_remove_device(void *context)
>   {
> -	struct acpi_eject_event *ej_event = (struct acpi_eject_event *) context;
> +	struct acpi_eject_event *ej_event = context;
>   	struct acpi_device *device = ej_event->device;
>   	acpi_handle handle = device->handle;
>   	acpi_handle temp;
> @@ -118,11 +128,19 @@ void acpi_bus_hot_remove_device(void *co
>
>   	mutex_lock(&acpi_scan_lock);
>
> +	/* If there is no handle, the device node has been unregistered. */
> +	if (!device->handle) {
> +		dev_dbg(&device->dev, "ACPI handle missing\n");
> +		put_device(&device->dev);
> +		goto out;
> +	}
> +
>   	ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>   		"Hot-removing device %s...\n", dev_name(&device->dev)));
>
> -	__acpi_bus_trim(device);
> -	/* Device node has been released. */
> +	acpi_bus_trim(device);
> +	/* Device node has been unregistered. */
> +	put_device(&device->dev);
>   	device = NULL;
>
>   	if (ACPI_SUCCESS(acpi_get_handle(handle, "_LCK", &temp))) {
> @@ -151,6 +169,7 @@ void acpi_bus_hot_remove_device(void *co
>   					  ost_code, NULL);
>   	}
>
> + out:
>   	mutex_unlock(&acpi_scan_lock);
>   	kfree(context);
>   	return;
> @@ -212,6 +231,7 @@ acpi_eject_store(struct device *d, struc
>   		goto err;
>   	}
>
> +	get_device(&acpi_device->dev);
>   	ej_event->device = acpi_device;
>   	if (acpi_device->flags.eject_pending) {
>   		/* event originated from ACPI eject notification */
> @@ -224,7 +244,11 @@ acpi_eject_store(struct device *d, struc
>   			ej_event->event, ACPI_OST_SC_EJECT_IN_PROGRESS, NULL);
>   	}
>
> -	acpi_os_hotplug_execute(acpi_bus_hot_remove_device, (void *)ej_event);
> +	status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device, ej_event);
> +	if (ACPI_FAILURE(status)) {
> +		put_device(&acpi_device->dev);
> +		kfree(ej_event);
> +	}
>   err:
>   	return ret;
>   }
> @@ -779,6 +803,7 @@ static void acpi_device_unregister(struc
>   	 * no more references.
>   	 */
>   	acpi_device_set_power(device, ACPI_STATE_D3_COLD);
> +	device->handle = NULL;
>   	put_device(&device->dev);
>   }
>
> @@ -1626,14 +1651,14 @@ static acpi_status acpi_bus_device_attac
>    * there has been a real error.  There just have been no suitable ACPI objects
>    * in the table trunk from which the kernel could create a device and add an
>    * appropriate driver.
> + *
> + * Must be called under acpi_scan_lock.
>    */
>   int acpi_bus_scan(acpi_handle handle)
>   {
>   	void *device = NULL;
>   	int error = 0;
>
> -	mutex_lock(&acpi_scan_lock);
> -
>   	if (ACPI_SUCCESS(acpi_bus_check_add(handle, 0, NULL, &device)))
>   		acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
>   				    acpi_bus_check_add, NULL, NULL, &device);
> @@ -1644,7 +1669,6 @@ int acpi_bus_scan(acpi_handle handle)
>   		acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
>   				    acpi_bus_device_attach, NULL, NULL, NULL);
>
> -	mutex_unlock(&acpi_scan_lock);
>   	return error;
>   }
>   EXPORT_SYMBOL(acpi_bus_scan);
> @@ -1681,7 +1705,13 @@ static acpi_status acpi_bus_remove(acpi_
>   	return AE_OK;
>   }
>
> -static void __acpi_bus_trim(struct acpi_device *start)
> +/**
> + * acpi_bus_trim - Remove ACPI device node and all of its descendants
> + * @start: Root of the ACPI device nodes subtree to remove.
> + *
> + * Must be called under acpi_scan_lock.
> + */
> +void acpi_bus_trim(struct acpi_device *start)
>   {
>   	/*
>   	 * Execute acpi_bus_device_detach() as a post-order callback to detach
> @@ -1698,13 +1728,6 @@ static void __acpi_bus_trim(struct acpi_
>   			    acpi_bus_remove, NULL, NULL);
>   	acpi_bus_remove(start->handle, 0, NULL, NULL);
>   }
> -
> -void acpi_bus_trim(struct acpi_device *start)
> -{
> -	mutex_lock(&acpi_scan_lock);
> -	__acpi_bus_trim(start);
> -	mutex_unlock(&acpi_scan_lock);
> -}
>   EXPORT_SYMBOL_GPL(acpi_bus_trim);
>
>   static int acpi_bus_scan_fixed(void)
> @@ -1762,23 +1785,27 @@ int __init acpi_scan_init(void)
>   	acpi_csrt_init();
>   	acpi_container_init();
>
> +	mutex_lock(&acpi_scan_lock);
>   	/*
>   	 * Enumerate devices in the ACPI namespace.
>   	 */
>   	result = acpi_bus_scan(ACPI_ROOT_OBJECT);
>   	if (result)
> -		return result;
> +		goto out;
>
>   	result = acpi_bus_get_device(ACPI_ROOT_OBJECT, &acpi_root);
>   	if (result)
> -		return result;
> +		goto out;
>
>   	result = acpi_bus_scan_fixed();
>   	if (result) {
>   		acpi_device_unregister(acpi_root);
> -		return result;
> +		goto out;
>   	}
>
>   	acpi_update_all_gpes();
> -	return 0;
> +
> + out:
> +	mutex_unlock(&acpi_scan_lock);
> +	return result;
>   }
> Index: test/include/acpi/acpi_bus.h
> ===================================================================
> --- test.orig/include/acpi/acpi_bus.h
> +++ test/include/acpi/acpi_bus.h
> @@ -395,6 +395,9 @@ int acpi_bus_receive_event(struct acpi_b
>   static inline int acpi_bus_generate_proc_event(struct acpi_device *device, u8 type, int data)
>   	{ return 0; }
>   #endif
> +
> +void acpi_scan_lock_acquire(void);
> +void acpi_scan_lock_release(void);
>   int acpi_scan_add_handler(struct acpi_scan_handler *handler);
>   int acpi_bus_register_driver(struct acpi_driver *driver);
>   void acpi_bus_unregister_driver(struct acpi_driver *driver);
> Index: test/drivers/acpi/acpi_memhotplug.c
> ===================================================================
> --- test.orig/drivers/acpi/acpi_memhotplug.c
> +++ test/drivers/acpi/acpi_memhotplug.c
> @@ -153,14 +153,16 @@ acpi_memory_get_device_resources(struct
>   	return 0;
>   }
>
> -static int
> -acpi_memory_get_device(acpi_handle handle,
> -		       struct acpi_memory_device **mem_device)
> +static int acpi_memory_get_device(acpi_handle handle,
> +				  struct acpi_memory_device **mem_device)
>   {
>   	struct acpi_device *device = NULL;
> -	int result;
> +	int result = 0;
> +
> +	acpi_scan_lock_acquire();
>
> -	if (!acpi_bus_get_device(handle, &device) && device)
> +	acpi_bus_get_device(handle, &device);
> +	if (device)
>   		goto end;
>
>   	/*
> @@ -169,23 +171,28 @@ acpi_memory_get_device(acpi_handle handl
>   	 */
>   	result = acpi_bus_scan(handle);
>   	if (result) {
> -		acpi_handle_warn(handle, "Cannot add acpi bus\n");
> -		return -EINVAL;
> +		acpi_handle_warn(handle, "ACPI namespace scan failed\n");
> +		result = -EINVAL;
> +		goto out;
>   	}
>   	result = acpi_bus_get_device(handle, &device);
>   	if (result) {
>   		acpi_handle_warn(handle, "Missing device object\n");
> -		return -EINVAL;
> +		result = -EINVAL;
> +		goto out;
>   	}
>
> -      end:
> + end:
>   	*mem_device = acpi_driver_data(device);
>   	if (!(*mem_device)) {
>   		dev_err(&device->dev, "driver data not found\n");
> -		return -ENODEV;
> +		result = -ENODEV;
> +		goto out;
>   	}
>
> -	return 0;
> + out:
> +	acpi_scan_lock_release();
> +	return result;
>   }
>
>   static int acpi_memory_check_device(struct acpi_memory_device *mem_device)
> @@ -305,6 +312,7 @@ static void acpi_memory_device_notify(ac
>   	struct acpi_device *device;
>   	struct acpi_eject_event *ej_event = NULL;
>   	u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
> +	acpi_status status;
>
>   	switch (event) {
>   	case ACPI_NOTIFY_BUS_CHECK:
> @@ -327,29 +335,40 @@ static void acpi_memory_device_notify(ac
>   		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>   				  "\nReceived EJECT REQUEST notification for device\n"));
>
> +		status = AE_ERROR;
> +		acpi_scan_lock_acquire();
> +
>   		if (acpi_bus_get_device(handle, &device)) {
>   			acpi_handle_err(handle, "Device doesn't exist\n");
> -			break;
> +			goto unlock;
>   		}
>   		mem_device = acpi_driver_data(device);
>   		if (!mem_device) {
>   			acpi_handle_err(handle, "Driver Data is NULL\n");
> -			break;
> +			goto unlock;
>   		}
>
>   		ej_event = kmalloc(sizeof(*ej_event), GFP_KERNEL);
>   		if (!ej_event) {
>   			pr_err(PREFIX "No memory, dropping EJECT\n");
> -			break;
> +			goto unlock;
>   		}
>
> +		get_device(&device->dev);
>   		ej_event->device = device;
>   		ej_event->event = ACPI_NOTIFY_EJECT_REQUEST;
> -		acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
> -					(void *)ej_event);
> +		/* The eject is carried out asynchronously. */
> +		status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
> +						 ej_event);
> +		if (ACPI_FAILURE(status)) {
> +			put_device(&device->dev);
> +			kfree(ej_event);
> +		}
>
> -		/* eject is performed asynchronously */
> -		return;
> + unlock:
> +		acpi_scan_lock_release();
> +		if (ACPI_SUCCESS(status))
> +			return;
>   	default:
>   		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>   				  "Unsupported event [0x%x]\n", event));
> @@ -360,7 +379,6 @@ static void acpi_memory_device_notify(ac
>
>   	/* Inform firmware that the hotplug operation has completed */
>   	(void) acpi_evaluate_hotplug_ost(handle, event, ost_code, NULL);
> -	return;
>   }
>
>   static void acpi_memory_device_free(struct acpi_memory_device *mem_device)
> Index: test/drivers/acpi/processor_driver.c
> ===================================================================
> --- test.orig/drivers/acpi/processor_driver.c
> +++ test/drivers/acpi/processor_driver.c
> @@ -683,8 +683,11 @@ static void acpi_processor_hotplug_notif
>   	struct acpi_device *device = NULL;
>   	struct acpi_eject_event *ej_event = NULL;
>   	u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
> +	acpi_status status;
>   	int result;
>
> +	acpi_scan_lock_acquire();
> +
>   	switch (event) {
>   	case ACPI_NOTIFY_BUS_CHECK:
>   	case ACPI_NOTIFY_DEVICE_CHECK:
> @@ -733,25 +736,32 @@ static void acpi_processor_hotplug_notif
>   			break;
>   		}
>
> +		get_device(&device->dev);
>   		ej_event->device = device;
>   		ej_event->event = ACPI_NOTIFY_EJECT_REQUEST;
> -		acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
> -					(void *)ej_event);
> -
> -		/* eject is performed asynchronously */
> -		return;
> +		/* The eject is carried out asynchronously. */
> +		status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
> +						 ej_event);
> +		if (ACPI_FAILURE(status)) {
> +			put_device(&device->dev);
> +			kfree(ej_event);
> +			break;
> +		}
> +		goto out;
>
>   	default:
>   		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>   				  "Unsupported event [0x%x]\n", event));
>
>   		/* non-hotplug event; possibly handled by other handler */
> -		return;
> +		goto out;
>   	}
>
>   	/* Inform firmware that the hotplug operation has completed */
>   	(void) acpi_evaluate_hotplug_ost(handle, event, ost_code, NULL);
> -	return;
> +
> + out:

> +	acpi_scan_lock_release();;

There is an extra ";" here.

Thanks,
Yasuaki Ishimatsu

>   }
>
>   static acpi_status is_processor_device(acpi_handle handle)
> Index: test/drivers/acpi/container.c
> ===================================================================
> --- test.orig/drivers/acpi/container.c
> +++ test/drivers/acpi/container.c
> @@ -88,6 +88,8 @@ static void container_notify_cb(acpi_han
>   	acpi_status status;
>   	u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
>
> +	acpi_scan_lock_acquire();
> +
>   	switch (type) {
>   	case ACPI_NOTIFY_BUS_CHECK:
>   		/* Fall through */
> @@ -130,18 +132,20 @@ static void container_notify_cb(acpi_han
>   		if (!acpi_bus_get_device(handle, &device) && device) {
>   			device->flags.eject_pending = 1;
>   			kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
> -			return;
> +			goto out;
>   		}
>   		break;
>
>   	default:
>   		/* non-hotplug event; possibly handled by other handler */
> -		return;
> +		goto out;
>   	}
>
>   	/* Inform firmware that the hotplug operation has completed */
>   	(void) acpi_evaluate_hotplug_ost(handle, type, ost_code, NULL);
> -	return;
> +
> + out:
> +	acpi_scan_lock_release();
>   }
>
>   static bool is_container(acpi_handle handle)
> Index: test/drivers/acpi/dock.c
> ===================================================================
> --- test.orig/drivers/acpi/dock.c
> +++ test/drivers/acpi/dock.c
> @@ -744,7 +744,9 @@ static void acpi_dock_deferred_cb(void *
>   {
>   	struct dock_data *data = context;
>
> +	acpi_scan_lock_acquire();
>   	dock_notify(data->handle, data->event, data->ds);
> +	acpi_scan_lock_release();
>   	kfree(data);
>   }
>
> @@ -757,20 +759,31 @@ static int acpi_dock_notifier_call(struc
>   	if (event != ACPI_NOTIFY_BUS_CHECK && event != ACPI_NOTIFY_DEVICE_CHECK
>   	   && event != ACPI_NOTIFY_EJECT_REQUEST)
>   		return 0;
> +
> +	acpi_scan_lock_acquire();
> +
>   	list_for_each_entry(dock_station, &dock_stations, sibling) {
>   		if (dock_station->handle == handle) {
>   			struct dock_data *dd;
> +			acpi_status status;
>
>   			dd = kmalloc(sizeof(*dd), GFP_KERNEL);
>   			if (!dd)
> -				return 0;
> +				break;
> +
>   			dd->handle = handle;
>   			dd->event = event;
>   			dd->ds = dock_station;
> -			acpi_os_hotplug_execute(acpi_dock_deferred_cb, dd);
> -			return 0 ;
> +			status = acpi_os_hotplug_execute(acpi_dock_deferred_cb,
> +							 dd);
> +			if (ACPI_FAILURE(status))
> +				kfree(dd);
> +
> +			break;
>   		}
>   	}
> +
> +	acpi_scan_lock_release();
>   	return 0;
>   }
>
> Index: test/drivers/pci/hotplug/acpiphp_glue.c
> ===================================================================
> --- test.orig/drivers/pci/hotplug/acpiphp_glue.c
> +++ test/drivers/pci/hotplug/acpiphp_glue.c
> @@ -1218,6 +1218,8 @@ static void _handle_hotplug_event_bridge
>   	handle = hp_work->handle;
>   	type = hp_work->type;
>
> +	acpi_scan_lock_acquire();
> +
>   	if (acpi_bus_get_device(handle, &device)) {
>   		/* This bridge must have just been physically inserted */
>   		handle_bridge_insertion(handle, type);
> @@ -1295,6 +1297,7 @@ static void _handle_hotplug_event_bridge
>   	}
>
>   out:
> +	acpi_scan_lock_release();
>   	kfree(hp_work); /* allocated in handle_hotplug_event_bridge */
>   }
>
> @@ -1341,6 +1344,8 @@ static void _handle_hotplug_event_func(s
>
>   	func = (struct acpiphp_func *)context;
>
> +	acpi_scan_lock_acquire();
> +
>   	switch (type) {
>   	case ACPI_NOTIFY_BUS_CHECK:
>   		/* bus re-enumerate */
> @@ -1371,6 +1376,7 @@ static void _handle_hotplug_event_func(s
>   		break;
>   	}
>
> +	acpi_scan_lock_release();
>   	kfree(hp_work); /* allocated in handle_hotplug_event_func */
>   }
>
> Index: test/drivers/pci/hotplug/sgi_hotplug.c
> ===================================================================
> --- test.orig/drivers/pci/hotplug/sgi_hotplug.c
> +++ test/drivers/pci/hotplug/sgi_hotplug.c
> @@ -425,6 +425,7 @@ static int enable_slot(struct hotplug_sl
>   			pdevice = NULL;
>   		}
>
> +		acpi_scan_lock_acquire();
>   		/*
>   		 * Walk the rootbus node's immediate children looking for
>   		 * the slot's device node(s). There can be more than
> @@ -458,6 +459,7 @@ static int enable_slot(struct hotplug_sl
>   				}
>   			}
>   		}
> +		acpi_scan_lock_release();
>   	}
>
>   	/* Call the driver for the new device */
> @@ -508,6 +510,7 @@ static int disable_slot(struct hotplug_s
>   		/* Get the rootbus node pointer */
>   		phandle = PCI_CONTROLLER(slot->pci_bus)->acpi_handle;
>
> +		acpi_scan_lock_acquire();
>   		/*
>   		 * Walk the rootbus node's immediate children looking for
>   		 * the slot's device node(s). There can be more than
> @@ -538,7 +541,7 @@ static int disable_slot(struct hotplug_s
>   					acpi_bus_trim(device);
>   			}
>   		}
> -
> +		acpi_scan_lock_release();
>   	}
>
>   	/* Free the SN resources assigned to the Linux device.*/
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
@ 2013-02-13  3:08   ` Yasuaki Ishimatsu
  0 siblings, 0 replies; 35+ messages in thread
From: Yasuaki Ishimatsu @ 2013-02-13  3:08 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: ACPI Devel Maling List, LKML, Bjorn Helgaas, Jiang Liu,
	Yinghai Lu, Toshi Kani, Myron Stowe, linux-pci

Hi Rafael,

The patch looks good to me.
I have one comment below.

2013/02/13 9:19, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> This changeset is aimed at fixing a few different but related
> problems in the ACPI hotplug infrastructure.
>
> First of all, since notify handlers may be run in parallel with
> acpi_bus_scan(), acpi_bus_trim() and acpi_bus_hot_remove_device()
> and some of them are installed for ACPI handles that have no struct
> acpi_device objects attached (i.e. before those objects are created),
> those notify handlers have to take acpi_scan_lock to prevent races
> from taking place (e.g. a struct acpi_device is found to be present
> for the given ACPI handle, but right after that it is removed by
> acpi_bus_trim() running in parallel to the given notify handler).
> Moreover, since some of them call acpi_bus_scan() and
> acpi_bus_trim(), this leads to the conclusion that acpi_scan_lock
> should be acquired by the callers of these two functions rather than by
> these functions themselves.
>
> For these reasons, make all notify handlers that can handle device
> addition and eject events take acpi_scan_lock and remove the
> acpi_scan_lock locking from acpi_bus_scan() and acpi_bus_trim().
> Accordingly, update all of their users to make sure that they
> are always called under acpi_scan_lock.
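
For anyone following the locking change, here is a minimal userspace sketch
of the "callers take the lock" pattern described above. It is not kernel
code: a pthread mutex stands in for acpi_scan_lock, and scan_lock_held,
node_present, bus_scan(), bus_trim() and the two notify_*() helpers are
hypothetical names made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
static pthread_mutex_t scan_lock = PTHREAD_MUTEX_INITIALIZER;
static bool scan_lock_held;   /* debugging aid, only touched under scan_lock */
static bool node_present;     /* "a device node exists for this handle" */

static void scan_lock_acquire(void)
{
	pthread_mutex_lock(&scan_lock);
	scan_lock_held = true;
}

static void scan_lock_release(void)
{
	scan_lock_held = false;
	pthread_mutex_unlock(&scan_lock);
}

/* Shaped like acpi_bus_scan() after the patch: the caller holds the lock. */
static int bus_scan(void)
{
	assert(scan_lock_held);
	node_present = true;
	return 0;
}

/* Shaped like acpi_bus_trim() after the patch: the caller holds the lock. */
static void bus_trim(void)
{
	assert(scan_lock_held);
	node_present = false;
}

/* The lookup and the scan sit in one critical section, so no trim can
 * slip in between "is the node there?" and "create it". */
static int notify_device_check(void)
{
	int ret = 0;

	scan_lock_acquire();
	if (!node_present)
		ret = bus_scan();
	scan_lock_release();
	return ret;
}

static void notify_eject(void)
{
	scan_lock_acquire();
	if (node_present)
		bus_trim();
	scan_lock_release();
}
```

If bus_scan() and bus_trim() took the mutex themselves, the handlers could
not hold it across the preceding check without deadlocking on a
non-recursive mutex, which is the reason given above for moving the locking
into the callers.
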
>
> Furthermore, since eject operations are carried out asynchronously
> with respect to the notify events that trigger them, with the help
> of acpi_bus_hot_remove_device(), even if notify handlers take the
> ACPI scan lock, it still is possible that, for example,
> acpi_bus_trim() will run between acpi_bus_hot_remove_device() and
> the notify handler that scheduled its execution and that
> acpi_bus_trim() will remove the device node passed to
> acpi_bus_hot_remove_device() for ejection.  In that case, the struct
> acpi_device object obtained by acpi_bus_hot_remove_device() will be
> invalid and not-so-funny things will ensue.  To protect against that,
> make the users of acpi_bus_hot_remove_device() run get_device() on
> ACPI device node objects that are about to be passed to it and make
> acpi_bus_hot_remove_device() run put_device() on them and check if
> their ACPI handles are not NULL (make acpi_device_unregister() clear
> the device nodes' ACPI handles for that check to work).
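
The reference-counting scheme in this paragraph can be sketched in plain C
as follows. This is only an illustration under simplifying assumptions:
struct node, node_get(), node_put(), node_unregister() and deferred_eject()
are hypothetical, the counter is a plain int (the kernel's struct device
uses an atomic kref), and every call is assumed to be serialized by the
scan lock.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified device node; not the kernel's struct acpi_device. */
struct node {
	int refcount;   /* stand-in for the reference count in struct device */
	void *handle;   /* NULL once the node has been unregistered */
};

static int nodes_freed;   /* number of released nodes, for illustration only */

static struct node *node_create(void *handle)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return NULL;
	n->refcount = 1;        /* reference owned by the namespace */
	n->handle = handle;
	return n;
}

static void node_get(struct node *n)
{
	n->refcount++;
}

static void node_put(struct node *n)
{
	if (--n->refcount == 0) {
		free(n);
		nodes_freed++;
	}
}

/* Like acpi_device_unregister() in the patch: clear the handle first,
 * then drop the namespace's reference. */
static void node_unregister(struct node *n)
{
	n->handle = NULL;
	node_put(n);
}

/* Deferred eject: returns 1 if it removed the node, 0 if somebody else
 * had already unregistered it. Either way the extra reference is dropped. */
static int deferred_eject(struct node *n)
{
	int removed = 0;

	if (n->handle) {
		node_unregister(n);
		removed = 1;
	}
	node_put(n);            /* reference taken when the eject was scheduled */
	return removed;
}
```

The extra reference keeps the memory valid until the deferred eject runs,
and the NULL handle tells it that a concurrent trim has already removed the
node, so the two parts of the fix depend on each other.
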
>
> Finally, observe that acpi_os_hotplug_execute() actually can fail,
> in which case its caller ought to free memory allocated for the
> context object to prevent leaks from happening.  It also needs to
> run put_device() on the device node that it ran get_device() on
> previously in that case.  Modify the code accordingly.
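
The failure path described here, reduced to a userspace sketch. All names
are hypothetical; in particular schedule_work() is not
acpi_os_hotplug_execute(), and the assumption that a successful call takes
over ownership of the context is modelled by running the callback
immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical context object, loosely modelled on struct acpi_eject_event. */
struct eject_event {
	int *refcount;          /* reference count of the pinned device node */
};

static int live_events;         /* allocated-but-not-freed contexts */

/* Runs the eject, then releases what the requester handed over. */
static void eject_worker(void *context)
{
	struct eject_event *ev = context;

	(*ev->refcount)--;      /* put_device() counterpart */
	free(ev);
	live_events--;
}

/*
 * Hypothetical scheduler. Assumption: on success it owns the context
 * (here it simply runs the callback at once); on failure the callback
 * never runs.
 */
static bool schedule_work(void (*fn)(void *), void *context, bool fail)
{
	if (fail)
		return false;
	fn(context);
	return true;
}

static int request_eject(int *refcount, bool fail)
{
	struct eject_event *ev = malloc(sizeof(*ev));

	if (!ev)
		return -1;
	live_events++;

	(*refcount)++;          /* get_device() counterpart */
	ev->refcount = refcount;

	if (!schedule_work(eject_worker, ev, fail)) {
		/* Nobody else will clean up: undo both steps here. */
		(*refcount)--;
		free(ev);
		live_events--;
		return -1;
	}
	return 0;
}
```

Whichever way the scheduling call goes, the reference count and the
allocation count end up balanced, which is what the added error handling in
the patch is meant to guarantee.
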
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>
> On top of linux-pm.git/linux-next.
>
> Thanks,
> Rafael
>
> ---
>   drivers/acpi/acpi_memhotplug.c     |   56 +++++++++++++++++++-----------
>   drivers/acpi/container.c           |   10 +++--
>   drivers/acpi/dock.c                |   19 ++++++++--
>   drivers/acpi/processor_driver.c    |   24 +++++++++---
>   drivers/acpi/scan.c                |   69 +++++++++++++++++++++++++------------
>   drivers/pci/hotplug/acpiphp_glue.c |    6 +++
>   drivers/pci/hotplug/sgi_hotplug.c  |    5 ++
>   include/acpi/acpi_bus.h            |    3 +
>   8 files changed, 138 insertions(+), 54 deletions(-)
>
> Index: test/drivers/acpi/scan.c
> ===================================================================
> --- test.orig/drivers/acpi/scan.c
> +++ test/drivers/acpi/scan.c
> @@ -42,6 +42,18 @@ struct acpi_device_bus_id{
>   	struct list_head node;
>   };
>
> +void acpi_scan_lock_acquire(void)
> +{
> +	mutex_lock(&acpi_scan_lock);
> +}
> +EXPORT_SYMBOL_GPL(acpi_scan_lock_acquire);
> +
> +void acpi_scan_lock_release(void)
> +{
> +	mutex_unlock(&acpi_scan_lock);
> +}
> +EXPORT_SYMBOL_GPL(acpi_scan_lock_release);
> +
>   int acpi_scan_add_handler(struct acpi_scan_handler *handler)
>   {
>   	if (!handler || !handler->attach)
> @@ -95,8 +107,6 @@ acpi_device_modalias_show(struct device
>   }
>   static DEVICE_ATTR(modalias, 0444, acpi_device_modalias_show, NULL);
>
> -static void __acpi_bus_trim(struct acpi_device *start);
> -
>   /**
>    * acpi_bus_hot_remove_device: hot-remove a device and its children
>    * @context: struct acpi_eject_event pointer (freed in this func)
> @@ -107,7 +117,7 @@ static void __acpi_bus_trim(struct acpi_
>    */
>   void acpi_bus_hot_remove_device(void *context)
>   {
> -	struct acpi_eject_event *ej_event = (struct acpi_eject_event *) context;
> +	struct acpi_eject_event *ej_event = context;
>   	struct acpi_device *device = ej_event->device;
>   	acpi_handle handle = device->handle;
>   	acpi_handle temp;
> @@ -118,11 +128,19 @@ void acpi_bus_hot_remove_device(void *co
>
>   	mutex_lock(&acpi_scan_lock);
>
> +	/* If there is no handle, the device node has been unregistered. */
> +	if (!device->handle) {
> +		dev_dbg(&device->dev, "ACPI handle missing\n");
> +		put_device(&device->dev);
> +		goto out;
> +	}
> +
>   	ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>   		"Hot-removing device %s...\n", dev_name(&device->dev)));
>
> -	__acpi_bus_trim(device);
> -	/* Device node has been released. */
> +	acpi_bus_trim(device);
> +	/* Device node has been unregistered. */
> +	put_device(&device->dev);
>   	device = NULL;
>
>   	if (ACPI_SUCCESS(acpi_get_handle(handle, "_LCK", &temp))) {
> @@ -151,6 +169,7 @@ void acpi_bus_hot_remove_device(void *co
>   					  ost_code, NULL);
>   	}
>
> + out:
>   	mutex_unlock(&acpi_scan_lock);
>   	kfree(context);
>   	return;
> @@ -212,6 +231,7 @@ acpi_eject_store(struct device *d, struc
>   		goto err;
>   	}
>
> +	get_device(&acpi_device->dev);
>   	ej_event->device = acpi_device;
>   	if (acpi_device->flags.eject_pending) {
>   		/* event originated from ACPI eject notification */
> @@ -224,7 +244,11 @@ acpi_eject_store(struct device *d, struc
>   			ej_event->event, ACPI_OST_SC_EJECT_IN_PROGRESS, NULL);
>   	}
>
> -	acpi_os_hotplug_execute(acpi_bus_hot_remove_device, (void *)ej_event);
> +	status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device, ej_event);
> +	if (ACPI_FAILURE(status)) {
> +		put_device(&acpi_device->dev);
> +		kfree(ej_event);
> +	}
>   err:
>   	return ret;
>   }
> @@ -779,6 +803,7 @@ static void acpi_device_unregister(struc
>   	 * no more references.
>   	 */
>   	acpi_device_set_power(device, ACPI_STATE_D3_COLD);
> +	device->handle = NULL;
>   	put_device(&device->dev);
>   }
>
> @@ -1626,14 +1651,14 @@ static acpi_status acpi_bus_device_attac
>    * there has been a real error.  There just have been no suitable ACPI objects
>    * in the table trunk from which the kernel could create a device and add an
>    * appropriate driver.
> + *
> + * Must be called under acpi_scan_lock.
>    */
>   int acpi_bus_scan(acpi_handle handle)
>   {
>   	void *device = NULL;
>   	int error = 0;
>
> -	mutex_lock(&acpi_scan_lock);
> -
>   	if (ACPI_SUCCESS(acpi_bus_check_add(handle, 0, NULL, &device)))
>   		acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
>   				    acpi_bus_check_add, NULL, NULL, &device);
> @@ -1644,7 +1669,6 @@ int acpi_bus_scan(acpi_handle handle)
>   		acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
>   				    acpi_bus_device_attach, NULL, NULL, NULL);
>
> -	mutex_unlock(&acpi_scan_lock);
>   	return error;
>   }
>   EXPORT_SYMBOL(acpi_bus_scan);
> @@ -1681,7 +1705,13 @@ static acpi_status acpi_bus_remove(acpi_
>   	return AE_OK;
>   }
>
> -static void __acpi_bus_trim(struct acpi_device *start)
> +/**
> + * acpi_bus_trim - Remove ACPI device node and all of its descendants
> + * @start: Root of the ACPI device nodes subtree to remove.
> + *
> + * Must be called under acpi_scan_lock.
> + */
> +void acpi_bus_trim(struct acpi_device *start)
>   {
>   	/*
>   	 * Execute acpi_bus_device_detach() as a post-order callback to detach
> @@ -1698,13 +1728,6 @@ static void __acpi_bus_trim(struct acpi_
>   			    acpi_bus_remove, NULL, NULL);
>   	acpi_bus_remove(start->handle, 0, NULL, NULL);
>   }
> -
> -void acpi_bus_trim(struct acpi_device *start)
> -{
> -	mutex_lock(&acpi_scan_lock);
> -	__acpi_bus_trim(start);
> -	mutex_unlock(&acpi_scan_lock);
> -}
>   EXPORT_SYMBOL_GPL(acpi_bus_trim);
>
>   static int acpi_bus_scan_fixed(void)
> @@ -1762,23 +1785,27 @@ int __init acpi_scan_init(void)
>   	acpi_csrt_init();
>   	acpi_container_init();
>
> +	mutex_lock(&acpi_scan_lock);
>   	/*
>   	 * Enumerate devices in the ACPI namespace.
>   	 */
>   	result = acpi_bus_scan(ACPI_ROOT_OBJECT);
>   	if (result)
> -		return result;
> +		goto out;
>
>   	result = acpi_bus_get_device(ACPI_ROOT_OBJECT, &acpi_root);
>   	if (result)
> -		return result;
> +		goto out;
>
>   	result = acpi_bus_scan_fixed();
>   	if (result) {
>   		acpi_device_unregister(acpi_root);
> -		return result;
> +		goto out;
>   	}
>
>   	acpi_update_all_gpes();
> -	return 0;
> +
> + out:
> +	mutex_unlock(&acpi_scan_lock);
> +	return result;
>   }
> Index: test/include/acpi/acpi_bus.h
> ===================================================================
> --- test.orig/include/acpi/acpi_bus.h
> +++ test/include/acpi/acpi_bus.h
> @@ -395,6 +395,9 @@ int acpi_bus_receive_event(struct acpi_b
>   static inline int acpi_bus_generate_proc_event(struct acpi_device *device, u8 type, int data)
>   	{ return 0; }
>   #endif
> +
> +void acpi_scan_lock_acquire(void);
> +void acpi_scan_lock_release(void);
>   int acpi_scan_add_handler(struct acpi_scan_handler *handler);
>   int acpi_bus_register_driver(struct acpi_driver *driver);
>   void acpi_bus_unregister_driver(struct acpi_driver *driver);
> Index: test/drivers/acpi/acpi_memhotplug.c
> ===================================================================
> --- test.orig/drivers/acpi/acpi_memhotplug.c
> +++ test/drivers/acpi/acpi_memhotplug.c
> @@ -153,14 +153,16 @@ acpi_memory_get_device_resources(struct
>   	return 0;
>   }
>
> -static int
> -acpi_memory_get_device(acpi_handle handle,
> -		       struct acpi_memory_device **mem_device)
> +static int acpi_memory_get_device(acpi_handle handle,
> +				  struct acpi_memory_device **mem_device)
>   {
>   	struct acpi_device *device = NULL;
> -	int result;
> +	int result = 0;
> +
> +	acpi_scan_lock_acquire();
>
> -	if (!acpi_bus_get_device(handle, &device) && device)
> +	acpi_bus_get_device(handle, &device);
> +	if (device)
>   		goto end;
>
>   	/*
> @@ -169,23 +171,28 @@ acpi_memory_get_device(acpi_handle handl
>   	 */
>   	result = acpi_bus_scan(handle);
>   	if (result) {
> -		acpi_handle_warn(handle, "Cannot add acpi bus\n");
> -		return -EINVAL;
> +		acpi_handle_warn(handle, "ACPI namespace scan failed\n");
> +		result = -EINVAL;
> +		goto out;
>   	}
>   	result = acpi_bus_get_device(handle, &device);
>   	if (result) {
>   		acpi_handle_warn(handle, "Missing device object\n");
> -		return -EINVAL;
> +		result = -EINVAL;
> +		goto out;
>   	}
>
> -      end:
> + end:
>   	*mem_device = acpi_driver_data(device);
>   	if (!(*mem_device)) {
>   		dev_err(&device->dev, "driver data not found\n");
> -		return -ENODEV;
> +		result = -ENODEV;
> +		goto out;
>   	}
>
> -	return 0;
> + out:
> +	acpi_scan_lock_release();
> +	return result;
>   }
>
>   static int acpi_memory_check_device(struct acpi_memory_device *mem_device)
> @@ -305,6 +312,7 @@ static void acpi_memory_device_notify(ac
>   	struct acpi_device *device;
>   	struct acpi_eject_event *ej_event = NULL;
>   	u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
> +	acpi_status status;
>
>   	switch (event) {
>   	case ACPI_NOTIFY_BUS_CHECK:
> @@ -327,29 +335,40 @@ static void acpi_memory_device_notify(ac
>   		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>   				  "\nReceived EJECT REQUEST notification for device\n"));
>
> +		status = AE_ERROR;
> +		acpi_scan_lock_acquire();
> +
>   		if (acpi_bus_get_device(handle, &device)) {
>   			acpi_handle_err(handle, "Device doesn't exist\n");
> -			break;
> +			goto unlock;
>   		}
>   		mem_device = acpi_driver_data(device);
>   		if (!mem_device) {
>   			acpi_handle_err(handle, "Driver Data is NULL\n");
> -			break;
> +			goto unlock;
>   		}
>
>   		ej_event = kmalloc(sizeof(*ej_event), GFP_KERNEL);
>   		if (!ej_event) {
>   			pr_err(PREFIX "No memory, dropping EJECT\n");
> -			break;
> +			goto unlock;
>   		}
>
> +		get_device(&device->dev);
>   		ej_event->device = device;
>   		ej_event->event = ACPI_NOTIFY_EJECT_REQUEST;
> -		acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
> -					(void *)ej_event);
> +		/* The eject is carried out asynchronously. */
> +		status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
> +						 ej_event);
> +		if (ACPI_FAILURE(status)) {
> +			put_device(&device->dev);
> +			kfree(ej_event);
> +		}
>
> -		/* eject is performed asynchronously */
> -		return;
> + unlock:
> +		acpi_scan_lock_release();
> +		if (ACPI_SUCCESS(status))
> +			return;
>   	default:
>   		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>   				  "Unsupported event [0x%x]\n", event));
> @@ -360,7 +379,6 @@ static void acpi_memory_device_notify(ac
>
>   	/* Inform firmware that the hotplug operation has completed */
>   	(void) acpi_evaluate_hotplug_ost(handle, event, ost_code, NULL);
> -	return;
>   }
>
>   static void acpi_memory_device_free(struct acpi_memory_device *mem_device)
> Index: test/drivers/acpi/processor_driver.c
> ===================================================================
> --- test.orig/drivers/acpi/processor_driver.c
> +++ test/drivers/acpi/processor_driver.c
> @@ -683,8 +683,11 @@ static void acpi_processor_hotplug_notif
>   	struct acpi_device *device = NULL;
>   	struct acpi_eject_event *ej_event = NULL;
>   	u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
> +	acpi_status status;
>   	int result;
>
> +	acpi_scan_lock_acquire();
> +
>   	switch (event) {
>   	case ACPI_NOTIFY_BUS_CHECK:
>   	case ACPI_NOTIFY_DEVICE_CHECK:
> @@ -733,25 +736,32 @@ static void acpi_processor_hotplug_notif
>   			break;
>   		}
>
> +		get_device(&device->dev);
>   		ej_event->device = device;
>   		ej_event->event = ACPI_NOTIFY_EJECT_REQUEST;
> -		acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
> -					(void *)ej_event);
> -
> -		/* eject is performed asynchronously */
> -		return;
> +		/* The eject is carried out asynchronously. */
> +		status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
> +						 ej_event);
> +		if (ACPI_FAILURE(status)) {
> +			put_device(&device->dev);
> +			kfree(ej_event);
> +			break;
> +		}
> +		goto out;
>
>   	default:
>   		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>   				  "Unsupported event [0x%x]\n", event));
>
>   		/* non-hotplug event; possibly handled by other handler */
> -		return;
> +		goto out;
>   	}
>
>   	/* Inform firmware that the hotplug operation has completed */
>   	(void) acpi_evaluate_hotplug_ost(handle, event, ost_code, NULL);
> -	return;
> +
> + out:

> +	acpi_scan_lock_release();;

There is an extra ";" here.

Thanks,
Yasuaki Ishimatsu

>   }
>
>   static acpi_status is_processor_device(acpi_handle handle)
> Index: test/drivers/acpi/container.c
> ===================================================================
> --- test.orig/drivers/acpi/container.c
> +++ test/drivers/acpi/container.c
> @@ -88,6 +88,8 @@ static void container_notify_cb(acpi_han
>   	acpi_status status;
>   	u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
>
> +	acpi_scan_lock_acquire();
> +
>   	switch (type) {
>   	case ACPI_NOTIFY_BUS_CHECK:
>   		/* Fall through */
> @@ -130,18 +132,20 @@ static void container_notify_cb(acpi_han
>   		if (!acpi_bus_get_device(handle, &device) && device) {
>   			device->flags.eject_pending = 1;
>   			kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
> -			return;
> +			goto out;
>   		}
>   		break;
>
>   	default:
>   		/* non-hotplug event; possibly handled by other handler */
> -		return;
> +		goto out;
>   	}
>
>   	/* Inform firmware that the hotplug operation has completed */
>   	(void) acpi_evaluate_hotplug_ost(handle, type, ost_code, NULL);
> -	return;
> +
> + out:
> +	acpi_scan_lock_release();
>   }
>
>   static bool is_container(acpi_handle handle)
> Index: test/drivers/acpi/dock.c
> ===================================================================
> --- test.orig/drivers/acpi/dock.c
> +++ test/drivers/acpi/dock.c
> @@ -744,7 +744,9 @@ static void acpi_dock_deferred_cb(void *
>   {
>   	struct dock_data *data = context;
>
> +	acpi_scan_lock_acquire();
>   	dock_notify(data->handle, data->event, data->ds);
> +	acpi_scan_lock_release();
>   	kfree(data);
>   }
>
> @@ -757,20 +759,31 @@ static int acpi_dock_notifier_call(struc
>   	if (event != ACPI_NOTIFY_BUS_CHECK && event != ACPI_NOTIFY_DEVICE_CHECK
>   	   && event != ACPI_NOTIFY_EJECT_REQUEST)
>   		return 0;
> +
> +	acpi_scan_lock_acquire();
> +
>   	list_for_each_entry(dock_station, &dock_stations, sibling) {
>   		if (dock_station->handle == handle) {
>   			struct dock_data *dd;
> +			acpi_status status;
>
>   			dd = kmalloc(sizeof(*dd), GFP_KERNEL);
>   			if (!dd)
> -				return 0;
> +				break;
> +
>   			dd->handle = handle;
>   			dd->event = event;
>   			dd->ds = dock_station;
> -			acpi_os_hotplug_execute(acpi_dock_deferred_cb, dd);
> -			return 0 ;
> +			status = acpi_os_hotplug_execute(acpi_dock_deferred_cb,
> +							 dd);
> +			if (ACPI_FAILURE(status))
> +				kfree(dd);
> +
> +			break;
>   		}
>   	}
> +
> +	acpi_scan_lock_release();
>   	return 0;
>   }
>
> Index: test/drivers/pci/hotplug/acpiphp_glue.c
> ===================================================================
> --- test.orig/drivers/pci/hotplug/acpiphp_glue.c
> +++ test/drivers/pci/hotplug/acpiphp_glue.c
> @@ -1218,6 +1218,8 @@ static void _handle_hotplug_event_bridge
>   	handle = hp_work->handle;
>   	type = hp_work->type;
>
> +	acpi_scan_lock_acquire();
> +
>   	if (acpi_bus_get_device(handle, &device)) {
>   		/* This bridge must have just been physically inserted */
>   		handle_bridge_insertion(handle, type);
> @@ -1295,6 +1297,7 @@ static void _handle_hotplug_event_bridge
>   	}
>
>   out:
> +	acpi_scan_lock_release();
>   	kfree(hp_work); /* allocated in handle_hotplug_event_bridge */
>   }
>
> @@ -1341,6 +1344,8 @@ static void _handle_hotplug_event_func(s
>
>   	func = (struct acpiphp_func *)context;
>
> +	acpi_scan_lock_acquire();
> +
>   	switch (type) {
>   	case ACPI_NOTIFY_BUS_CHECK:
>   		/* bus re-enumerate */
> @@ -1371,6 +1376,7 @@ static void _handle_hotplug_event_func(s
>   		break;
>   	}
>
> +	acpi_scan_lock_release();
>   	kfree(hp_work); /* allocated in handle_hotplug_event_func */
>   }
>
> Index: test/drivers/pci/hotplug/sgi_hotplug.c
> ===================================================================
> --- test.orig/drivers/pci/hotplug/sgi_hotplug.c
> +++ test/drivers/pci/hotplug/sgi_hotplug.c
> @@ -425,6 +425,7 @@ static int enable_slot(struct hotplug_sl
>   			pdevice = NULL;
>   		}
>
> +		acpi_scan_lock_acquire();
>   		/*
>   		 * Walk the rootbus node's immediate children looking for
>   		 * the slot's device node(s). There can be more than
> @@ -458,6 +459,7 @@ static int enable_slot(struct hotplug_sl
>   				}
>   			}
>   		}
> +		acpi_scan_lock_release();
>   	}
>
>   	/* Call the driver for the new device */
> @@ -508,6 +510,7 @@ static int disable_slot(struct hotplug_s
>   		/* Get the rootbus node pointer */
>   		phandle = PCI_CONTROLLER(slot->pci_bus)->acpi_handle;
>
> +		acpi_scan_lock_acquire();
>   		/*
>   		 * Walk the rootbus node's immediate children looking for
>   		 * the slot's device node(s). There can be more than
> @@ -538,7 +541,7 @@ static int disable_slot(struct hotplug_s
>   					acpi_bus_trim(device);
>   			}
>   		}
> -
> +		acpi_scan_lock_release();
>   	}
>
>   	/* Free the SN resources assigned to the Linux device.*/
>



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
  2013-02-13  3:08   ` Yasuaki Ishimatsu
@ 2013-02-13  3:31     ` Yasuaki Ishimatsu
  -1 siblings, 0 replies; 35+ messages in thread
From: Yasuaki Ishimatsu @ 2013-02-13  3:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: ACPI Devel Maling List, LKML, Bjorn Helgaas, Jiang Liu,
	Yinghai Lu, Toshi Kani, Myron Stowe, linux-pci

Hi Rafael,

I have another comment, this time on container.c.

2013/02/13 12:08, Yasuaki Ishimatsu wrote:
> Hi Rafael,
>
> The patch looks good to me.
> I have one comment below.
>
> 2013/02/13 9:19, Rafael J. Wysocki wrote:
>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>
>> This changeset is aimed at fixing a few different but related
>> problems in the ACPI hotplug infrastructure.
>>
>> First of all, since notify handlers may be run in parallel with
>> acpi_bus_scan(), acpi_bus_trim() and acpi_bus_hot_remove_device()
>> and some of them are installed for ACPI handles that have no struct
>> acpi_device objects attached (i.e. before those objects are created),
>> those notify handlers have to take acpi_scan_lock to prevent races
>> from taking place (e.g. a struct acpi_device is found to be present
>> for the given ACPI handle, but right after that it is removed by
>> acpi_bus_trim() running in parallel to the given notify handler).
>> Moreover, since some of them call acpi_bus_scan() and
>> acpi_bus_trim(), this leads to the conclusion that acpi_scan_lock
>> should be acquired by the callers of these two functions rather than by
>> these functions themselves.
>>
>> For these reasons, make all notify handlers that can handle device
>> addition and eject events take acpi_scan_lock and remove the
>> acpi_scan_lock locking from acpi_bus_scan() and acpi_bus_trim().
>> Accordingly, update all of their users to make sure that they
>> are always called under acpi_scan_lock.
>>
>> Furthermore, since eject operations are carried out asynchronously
>> with respect to the notify events that trigger them, with the help
>> of acpi_bus_hot_remove_device(), even if notify handlers take the
>> ACPI scan lock, it still is possible that, for example,
>> acpi_bus_trim() will run between acpi_bus_hot_remove_device() and
>> the notify handler that scheduled its execution and that
>> acpi_bus_trim() will remove the device node passed to
>> acpi_bus_hot_remove_device() for ejection.  In that case, the struct
>> acpi_device object obtained by acpi_bus_hot_remove_device() will be
>> invalid and not-so-funny things will ensue.  To protect against that,
>> make the users of acpi_bus_hot_remove_device() run get_device() on
>> ACPI device node objects that are about to be passed to it and make
>> acpi_bus_hot_remove_device() run put_device() on them and check if
>> their ACPI handles are not NULL (make acpi_device_unregister() clear
>> the device nodes' ACPI handles for that check to work).
>>
>> Finally, observe that acpi_os_hotplug_execute() actually can fail,
>> in which case its caller ought to free memory allocated for the
>> context object to prevent leaks from happening.  It also needs to
>> run put_device() on the device node that it ran get_device() on
>> previously in that case.  Modify the code accordingly.
>>
>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> ---
>>
>> On top of linux-pm.git/linux-next.
>>
>> Thanks,
>> Rafael
>>
>> ---
>>   drivers/acpi/acpi_memhotplug.c     |   56 +++++++++++++++++++-----------
>>   drivers/acpi/container.c           |   10 +++--
>>   drivers/acpi/dock.c                |   19 ++++++++--
>>   drivers/acpi/processor_driver.c    |   24 +++++++++---
>>   drivers/acpi/scan.c                |   69 +++++++++++++++++++++++++------------
>>   drivers/pci/hotplug/acpiphp_glue.c |    6 +++
>>   drivers/pci/hotplug/sgi_hotplug.c  |    5 ++
>>   include/acpi/acpi_bus.h            |    3 +
>>   8 files changed, 138 insertions(+), 54 deletions(-)
>>
>> Index: test/drivers/acpi/scan.c
>> ===================================================================
>> --- test.orig/drivers/acpi/scan.c
>> +++ test/drivers/acpi/scan.c
>> @@ -42,6 +42,18 @@ struct acpi_device_bus_id{
>>       struct list_head node;
>>   };
>>
>> +void acpi_scan_lock_acquire(void)
>> +{
>> +    mutex_lock(&acpi_scan_lock);
>> +}
>> +EXPORT_SYMBOL_GPL(acpi_scan_lock_acquire);
>> +
>> +void acpi_scan_lock_release(void)
>> +{
>> +    mutex_unlock(&acpi_scan_lock);
>> +}
>> +EXPORT_SYMBOL_GPL(acpi_scan_lock_release);
>> +
>>   int acpi_scan_add_handler(struct acpi_scan_handler *handler)
>>   {
>>       if (!handler || !handler->attach)
>> @@ -95,8 +107,6 @@ acpi_device_modalias_show(struct device
>>   }
>>   static DEVICE_ATTR(modalias, 0444, acpi_device_modalias_show, NULL);
>>
>> -static void __acpi_bus_trim(struct acpi_device *start);
>> -
>>   /**
>>    * acpi_bus_hot_remove_device: hot-remove a device and its children
>>    * @context: struct acpi_eject_event pointer (freed in this func)
>> @@ -107,7 +117,7 @@ static void __acpi_bus_trim(struct acpi_
>>    */
>>   void acpi_bus_hot_remove_device(void *context)
>>   {
>> -    struct acpi_eject_event *ej_event = (struct acpi_eject_event *) context;
>> +    struct acpi_eject_event *ej_event = context;
>>       struct acpi_device *device = ej_event->device;
>>       acpi_handle handle = device->handle;
>>       acpi_handle temp;
>> @@ -118,11 +128,19 @@ void acpi_bus_hot_remove_device(void *co
>>
>>       mutex_lock(&acpi_scan_lock);
>>
>> +    /* If there is no handle, the device node has been unregistered. */
>> +    if (!device->handle) {
>> +        dev_dbg(&device->dev, "ACPI handle missing\n");
>> +        put_device(&device->dev);
>> +        goto out;
>> +    }
>> +
>>       ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>>           "Hot-removing device %s...\n", dev_name(&device->dev)));
>>
>> -    __acpi_bus_trim(device);
>> -    /* Device node has been released. */
>> +    acpi_bus_trim(device);
>> +    /* Device node has been unregistered. */
>> +    put_device(&device->dev);
>>       device = NULL;
>>
>>       if (ACPI_SUCCESS(acpi_get_handle(handle, "_LCK", &temp))) {
>> @@ -151,6 +169,7 @@ void acpi_bus_hot_remove_device(void *co
>>                         ost_code, NULL);
>>       }
>>
>> + out:
>>       mutex_unlock(&acpi_scan_lock);
>>       kfree(context);
>>       return;
>> @@ -212,6 +231,7 @@ acpi_eject_store(struct device *d, struc
>>           goto err;
>>       }
>>
>> +    get_device(&acpi_device->dev);
>>       ej_event->device = acpi_device;
>>       if (acpi_device->flags.eject_pending) {
>>           /* event originated from ACPI eject notification */
>> @@ -224,7 +244,11 @@ acpi_eject_store(struct device *d, struc
>>               ej_event->event, ACPI_OST_SC_EJECT_IN_PROGRESS, NULL);
>>       }
>>
>> -    acpi_os_hotplug_execute(acpi_bus_hot_remove_device, (void *)ej_event);
>> +    status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device, ej_event);
>> +    if (ACPI_FAILURE(status)) {
>> +        put_device(&acpi_device->dev);
>> +        kfree(ej_event);
>> +    }
>>   err:
>>       return ret;
>>   }
>> @@ -779,6 +803,7 @@ static void acpi_device_unregister(struc
>>        * no more references.
>>        */
>>       acpi_device_set_power(device, ACPI_STATE_D3_COLD);
>> +    device->handle = NULL;
>>       put_device(&device->dev);
>>   }
>>
>> @@ -1626,14 +1651,14 @@ static acpi_status acpi_bus_device_attac
>>    * there has been a real error.  There just have been no suitable ACPI objects
>>    * in the table trunk from which the kernel could create a device and add an
>>    * appropriate driver.
>> + *
>> + * Must be called under acpi_scan_lock.
>>    */
>>   int acpi_bus_scan(acpi_handle handle)
>>   {
>>       void *device = NULL;
>>       int error = 0;
>>
>> -    mutex_lock(&acpi_scan_lock);
>> -
>>       if (ACPI_SUCCESS(acpi_bus_check_add(handle, 0, NULL, &device)))
>>           acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
>>                       acpi_bus_check_add, NULL, NULL, &device);
>> @@ -1644,7 +1669,6 @@ int acpi_bus_scan(acpi_handle handle)
>>           acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
>>                       acpi_bus_device_attach, NULL, NULL, NULL);
>>
>> -    mutex_unlock(&acpi_scan_lock);
>>       return error;
>>   }
>>   EXPORT_SYMBOL(acpi_bus_scan);
>> @@ -1681,7 +1705,13 @@ static acpi_status acpi_bus_remove(acpi_
>>       return AE_OK;
>>   }
>>
>> -static void __acpi_bus_trim(struct acpi_device *start)
>> +/**
>> + * acpi_bus_trim - Remove ACPI device node and all of its descendants
>> + * @start: Root of the ACPI device nodes subtree to remove.
>> + *
>> + * Must be called under acpi_scan_lock.
>> + */
>> +void acpi_bus_trim(struct acpi_device *start)
>>   {
>>       /*
>>        * Execute acpi_bus_device_detach() as a post-order callback to detach
>> @@ -1698,13 +1728,6 @@ static void __acpi_bus_trim(struct acpi_
>>                   acpi_bus_remove, NULL, NULL);
>>       acpi_bus_remove(start->handle, 0, NULL, NULL);
>>   }
>> -
>> -void acpi_bus_trim(struct acpi_device *start)
>> -{
>> -    mutex_lock(&acpi_scan_lock);
>> -    __acpi_bus_trim(start);
>> -    mutex_unlock(&acpi_scan_lock);
>> -}
>>   EXPORT_SYMBOL_GPL(acpi_bus_trim);
>>
>>   static int acpi_bus_scan_fixed(void)
>> @@ -1762,23 +1785,27 @@ int __init acpi_scan_init(void)
>>       acpi_csrt_init();
>>       acpi_container_init();
>>
>> +    mutex_lock(&acpi_scan_lock);
>>       /*
>>        * Enumerate devices in the ACPI namespace.
>>        */
>>       result = acpi_bus_scan(ACPI_ROOT_OBJECT);
>>       if (result)
>> -        return result;
>> +        goto out;
>>
>>       result = acpi_bus_get_device(ACPI_ROOT_OBJECT, &acpi_root);
>>       if (result)
>> -        return result;
>> +        goto out;
>>
>>       result = acpi_bus_scan_fixed();
>>       if (result) {
>>           acpi_device_unregister(acpi_root);
>> -        return result;
>> +        goto out;
>>       }
>>
>>       acpi_update_all_gpes();
>> -    return 0;
>> +
>> + out:
>> +    mutex_unlock(&acpi_scan_lock);
>> +    return result;
>>   }
>> Index: test/include/acpi/acpi_bus.h
>> ===================================================================
>> --- test.orig/include/acpi/acpi_bus.h
>> +++ test/include/acpi/acpi_bus.h
>> @@ -395,6 +395,9 @@ int acpi_bus_receive_event(struct acpi_b
>>   static inline int acpi_bus_generate_proc_event(struct acpi_device *device, u8 type, int data)
>>       { return 0; }
>>   #endif
>> +
>> +void acpi_scan_lock_acquire(void);
>> +void acpi_scan_lock_release(void);
>>   int acpi_scan_add_handler(struct acpi_scan_handler *handler);
>>   int acpi_bus_register_driver(struct acpi_driver *driver);
>>   void acpi_bus_unregister_driver(struct acpi_driver *driver);
>> Index: test/drivers/acpi/acpi_memhotplug.c
>> ===================================================================
>> --- test.orig/drivers/acpi/acpi_memhotplug.c
>> +++ test/drivers/acpi/acpi_memhotplug.c
>> @@ -153,14 +153,16 @@ acpi_memory_get_device_resources(struct
>>       return 0;
>>   }
>>
>> -static int
>> -acpi_memory_get_device(acpi_handle handle,
>> -               struct acpi_memory_device **mem_device)
>> +static int acpi_memory_get_device(acpi_handle handle,
>> +                  struct acpi_memory_device **mem_device)
>>   {
>>       struct acpi_device *device = NULL;
>> -    int result;
>> +    int result = 0;
>> +
>> +    acpi_scan_lock_acquire();
>>
>> -    if (!acpi_bus_get_device(handle, &device) && device)
>> +    acpi_bus_get_device(handle, &device);
>> +    if (device)
>>           goto end;
>>
>>       /*
>> @@ -169,23 +171,28 @@ acpi_memory_get_device(acpi_handle handl
>>        */
>>       result = acpi_bus_scan(handle);
>>       if (result) {
>> -        acpi_handle_warn(handle, "Cannot add acpi bus\n");
>> -        return -EINVAL;
>> +        acpi_handle_warn(handle, "ACPI namespace scan failed\n");
>> +        result = -EINVAL;
>> +        goto out;
>>       }
>>       result = acpi_bus_get_device(handle, &device);
>>       if (result) {
>>           acpi_handle_warn(handle, "Missing device object\n");
>> -        return -EINVAL;
>> +        result = -EINVAL;
>> +        goto out;
>>       }
>>
>> -      end:
>> + end:
>>       *mem_device = acpi_driver_data(device);
>>       if (!(*mem_device)) {
>>           dev_err(&device->dev, "driver data not found\n");
>> -        return -ENODEV;
>> +        result = -ENODEV;
>> +        goto out;
>>       }
>>
>> -    return 0;
>> + out:
>> +    acpi_scan_lock_release();
>> +    return result;
>>   }
>>
>>   static int acpi_memory_check_device(struct acpi_memory_device *mem_device)
>> @@ -305,6 +312,7 @@ static void acpi_memory_device_notify(ac
>>       struct acpi_device *device;
>>       struct acpi_eject_event *ej_event = NULL;
>>       u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
>> +    acpi_status status;
>>
>>       switch (event) {
>>       case ACPI_NOTIFY_BUS_CHECK:
>> @@ -327,29 +335,40 @@ static void acpi_memory_device_notify(ac
>>           ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>>                     "\nReceived EJECT REQUEST notification for device\n"));
>>
>> +        status = AE_ERROR;
>> +        acpi_scan_lock_acquire();
>> +
>>           if (acpi_bus_get_device(handle, &device)) {
>>               acpi_handle_err(handle, "Device doesn't exist\n");
>> -            break;
>> +            goto unlock;
>>           }
>>           mem_device = acpi_driver_data(device);
>>           if (!mem_device) {
>>               acpi_handle_err(handle, "Driver Data is NULL\n");
>> -            break;
>> +            goto unlock;
>>           }
>>
>>           ej_event = kmalloc(sizeof(*ej_event), GFP_KERNEL);
>>           if (!ej_event) {
>>               pr_err(PREFIX "No memory, dropping EJECT\n");
>> -            break;
>> +            goto unlock;
>>           }
>>
>> +        get_device(&device->dev);
>>           ej_event->device = device;
>>           ej_event->event = ACPI_NOTIFY_EJECT_REQUEST;
>> -        acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
>> -                    (void *)ej_event);
>> +        /* The eject is carried out asynchronously. */
>> +        status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
>> +                         ej_event);
>> +        if (ACPI_FAILURE(status)) {
>> +            put_device(&device->dev);
>> +            kfree(ej_event);
>> +        }
>>
>> -        /* eject is performed asynchronously */
>> -        return;
>> + unlock:
>> +        acpi_scan_lock_release();
>> +        if (ACPI_SUCCESS(status))
>> +            return;
>>       default:
>>           ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>>                     "Unsupported event [0x%x]\n", event));
>> @@ -360,7 +379,6 @@ static void acpi_memory_device_notify(ac
>>
>>       /* Inform firmware that the hotplug operation has completed */
>>       (void) acpi_evaluate_hotplug_ost(handle, event, ost_code, NULL);
>> -    return;
>>   }
>>
>>   static void acpi_memory_device_free(struct acpi_memory_device *mem_device)
>> Index: test/drivers/acpi/processor_driver.c
>> ===================================================================
>> --- test.orig/drivers/acpi/processor_driver.c
>> +++ test/drivers/acpi/processor_driver.c
>> @@ -683,8 +683,11 @@ static void acpi_processor_hotplug_notif
>>       struct acpi_device *device = NULL;
>>       struct acpi_eject_event *ej_event = NULL;
>>       u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
>> +    acpi_status status;
>>       int result;
>>
>> +    acpi_scan_lock_acquire();
>> +
>>       switch (event) {
>>       case ACPI_NOTIFY_BUS_CHECK:
>>       case ACPI_NOTIFY_DEVICE_CHECK:
>> @@ -733,25 +736,32 @@ static void acpi_processor_hotplug_notif
>>               break;
>>           }
>>
>> +        get_device(&device->dev);
>>           ej_event->device = device;
>>           ej_event->event = ACPI_NOTIFY_EJECT_REQUEST;
>> -        acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
>> -                    (void *)ej_event);
>> -
>> -        /* eject is performed asynchronously */
>> -        return;
>> +        /* The eject is carried out asynchronously. */
>> +        status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
>> +                         ej_event);
>> +        if (ACPI_FAILURE(status)) {
>> +            put_device(&device->dev);
>> +            kfree(ej_event);
>> +            break;
>> +        }
>> +        goto out;
>>
>>       default:
>>           ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>>                     "Unsupported event [0x%x]\n", event));
>>
>>           /* non-hotplug event; possibly handled by other handler */
>> -        return;
>> +        goto out;
>>       }
>>
>>       /* Inform firmware that the hotplug operation has completed */
>>       (void) acpi_evaluate_hotplug_ost(handle, event, ost_code, NULL);
>> -    return;
>> +
>> + out:
>
>> +    acpi_scan_lock_release();;
>
> extra ";"
>
> Thanks,
> Yasuaki Ishimatsu
>
>>   }
>>
>>   static acpi_status is_processor_device(acpi_handle handle)
>> Index: test/drivers/acpi/container.c
>> ===================================================================
>> --- test.orig/drivers/acpi/container.c
>> +++ test/drivers/acpi/container.c
>> @@ -88,6 +88,8 @@ static void container_notify_cb(acpi_han
>>       acpi_status status;
>>       u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
>>
>> +    acpi_scan_lock_acquire();
>> +
>>       switch (type) {
>>       case ACPI_NOTIFY_BUS_CHECK:
>>           /* Fall through */

		present = is_device_present(handle);
		status = acpi_bus_get_device(handle, &device);
		if (!present) {
			if (ACPI_SUCCESS(status)) {
				/* device exist and this is a remove request */
				device->flags.eject_pending = 1;
				kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
				return;

This should be "goto out" instead of return. With this patch the function
takes acpi_scan_lock on entry, so a plain return here would leave the lock
held.

			}
			break;
		}
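
To illustrate the point, here is a minimal userspace sketch of the
single-exit pattern, not the real container.c code. The names
scan_lock_acquire(), scan_lock_release(), notify_cb() and the enum values
are hypothetical stand-ins, and the lock is modelled as a plain flag rather
than a mutex so that a missed unlock trips an assertion instead of
deadlocking.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for acpi_scan_lock: a flag instead of a real mutex. */
static bool scan_lock_held;
static int ost_reports;	/* counts calls to the stubbed-out _OST report */

static void scan_lock_acquire(void)
{
	assert(!scan_lock_held);	/* a real mutex would deadlock here */
	scan_lock_held = true;
}

static void scan_lock_release(void)
{
	assert(scan_lock_held);
	scan_lock_held = false;
}

enum notify_type { NOTIFY_DEVICE_CHECK, NOTIFY_EJECT_REQUEST, NOTIFY_OTHER };

/* Simplified model of a notify handler; arguments are illustrative only. */
static void notify_cb(enum notify_type type, bool present, bool have_device)
{
	scan_lock_acquire();

	switch (type) {
	case NOTIFY_DEVICE_CHECK:
		if (!present) {
			if (have_device)
				goto out;	/* remove request: a bare return would skip the unlock */
			break;
		}
		break;
	case NOTIFY_EJECT_REQUEST:
		if (have_device)
			goto out;	/* eject is left to user space */
		break;
	default:
		goto out;	/* non-hotplug event */
	}

	ost_reports++;	/* stands in for acpi_evaluate_hotplug_ost() */
out:
	scan_lock_release();
}
```

Every path that used to return early now jumps to the one label that drops
the lock, so a second call to notify_cb() can always take it again.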

Thanks,
Yasuaki Ishimatsu

>> @@ -130,18 +132,20 @@ static void container_notify_cb(acpi_han
>>           if (!acpi_bus_get_device(handle, &device) && device) {
>>               device->flags.eject_pending = 1;
>>               kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
>> -            return;
>> +            goto out;
>>           }
>>           break;
>>
>>       default:
>>           /* non-hotplug event; possibly handled by other handler */
>> -        return;
>> +        goto out;
>>       }
>>
>>       /* Inform firmware that the hotplug operation has completed */
>>       (void) acpi_evaluate_hotplug_ost(handle, type, ost_code, NULL);
>> -    return;
>> +
>> + out:
>> +    acpi_scan_lock_release();
>>   }
>>
>>   static bool is_container(acpi_handle handle)
>> Index: test/drivers/acpi/dock.c
>> ===================================================================
>> --- test.orig/drivers/acpi/dock.c
>> +++ test/drivers/acpi/dock.c
>> @@ -744,7 +744,9 @@ static void acpi_dock_deferred_cb(void *
>>   {
>>       struct dock_data *data = context;
>>
>> +    acpi_scan_lock_acquire();
>>       dock_notify(data->handle, data->event, data->ds);
>> +    acpi_scan_lock_release();
>>       kfree(data);
>>   }
>>
>> @@ -757,20 +759,31 @@ static int acpi_dock_notifier_call(struc
>>       if (event != ACPI_NOTIFY_BUS_CHECK && event != ACPI_NOTIFY_DEVICE_CHECK
>>          && event != ACPI_NOTIFY_EJECT_REQUEST)
>>           return 0;
>> +
>> +    acpi_scan_lock_acquire();
>> +
>>       list_for_each_entry(dock_station, &dock_stations, sibling) {
>>           if (dock_station->handle == handle) {
>>               struct dock_data *dd;
>> +            acpi_status status;
>>
>>               dd = kmalloc(sizeof(*dd), GFP_KERNEL);
>>               if (!dd)
>> -                return 0;
>> +                break;
>> +
>>               dd->handle = handle;
>>               dd->event = event;
>>               dd->ds = dock_station;
>> -            acpi_os_hotplug_execute(acpi_dock_deferred_cb, dd);
>> -            return 0 ;
>> +            status = acpi_os_hotplug_execute(acpi_dock_deferred_cb,
>> +                             dd);
>> +            if (ACPI_FAILURE(status))
>> +                kfree(dd);
>> +
>> +            break;
>>           }
>>       }
>> +
>> +    acpi_scan_lock_release();
>>       return 0;
>>   }
>>
>> Index: test/drivers/pci/hotplug/acpiphp_glue.c
>> ===================================================================
>> --- test.orig/drivers/pci/hotplug/acpiphp_glue.c
>> +++ test/drivers/pci/hotplug/acpiphp_glue.c
>> @@ -1218,6 +1218,8 @@ static void _handle_hotplug_event_bridge
>>       handle = hp_work->handle;
>>       type = hp_work->type;
>>
>> +    acpi_scan_lock_acquire();
>> +
>>       if (acpi_bus_get_device(handle, &device)) {
>>           /* This bridge must have just been physically inserted */
>>           handle_bridge_insertion(handle, type);
>> @@ -1295,6 +1297,7 @@ static void _handle_hotplug_event_bridge
>>       }
>>
>>   out:
>> +    acpi_scan_lock_release();
>>       kfree(hp_work); /* allocated in handle_hotplug_event_bridge */
>>   }
>>
>> @@ -1341,6 +1344,8 @@ static void _handle_hotplug_event_func(s
>>
>>       func = (struct acpiphp_func *)context;
>>
>> +    acpi_scan_lock_acquire();
>> +
>>       switch (type) {
>>       case ACPI_NOTIFY_BUS_CHECK:
>>           /* bus re-enumerate */
>> @@ -1371,6 +1376,7 @@ static void _handle_hotplug_event_func(s
>>           break;
>>       }
>>
>> +    acpi_scan_lock_release();
>>       kfree(hp_work); /* allocated in handle_hotplug_event_func */
>>   }
>>
>> Index: test/drivers/pci/hotplug/sgi_hotplug.c
>> ===================================================================
>> --- test.orig/drivers/pci/hotplug/sgi_hotplug.c
>> +++ test/drivers/pci/hotplug/sgi_hotplug.c
>> @@ -425,6 +425,7 @@ static int enable_slot(struct hotplug_sl
>>               pdevice = NULL;
>>           }
>>
>> +        acpi_scan_lock_acquire();
>>           /*
>>            * Walk the rootbus node's immediate children looking for
>>            * the slot's device node(s). There can be more than
>> @@ -458,6 +459,7 @@ static int enable_slot(struct hotplug_sl
>>                   }
>>               }
>>           }
>> +        acpi_scan_lock_release();
>>       }
>>
>>       /* Call the driver for the new device */
>> @@ -508,6 +510,7 @@ static int disable_slot(struct hotplug_s
>>           /* Get the rootbus node pointer */
>>           phandle = PCI_CONTROLLER(slot->pci_bus)->acpi_handle;
>>
>> +        acpi_scan_lock_acquire();
>>           /*
>>            * Walk the rootbus node's immediate children looking for
>>            * the slot's device node(s). There can be more than
>> @@ -538,7 +541,7 @@ static int disable_slot(struct hotplug_s
>>                       acpi_bus_trim(device);
>>               }
>>           }
>> -
>> +        acpi_scan_lock_release();
>>       }
>>
>>       /* Free the SN resources assigned to the Linux device.*/
>>
>
>




* Re: [PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
@ 2013-02-13  3:31     ` Yasuaki Ishimatsu
  0 siblings, 0 replies; 35+ messages in thread
From: Yasuaki Ishimatsu @ 2013-02-13  3:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: ACPI Devel Maling List, LKML, Bjorn Helgaas, Jiang Liu,
	Yinghai Lu, Toshi Kani, Myron Stowe, linux-pci

Hi Rafael,

I have another comment, this time on container.c.

2013/02/13 12:08, Yasuaki Ishimatsu wrote:
> Hi Rafael,
>
> The patch seems good.
> There is a comment below.
>
> 2013/02/13 9:19, Rafael J. Wysocki wrote:
>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>
>> This changeset is aimed at fixing a few different but related
>> problems in the ACPI hotplug infrastructure.
>>
>> First of all, since notify handlers may be run in parallel with
>> acpi_bus_scan(), acpi_bus_trim() and acpi_bus_hot_remove_device()
>> and some of them are installed for ACPI handles that have no struct
>> acpi_device objects attached (i.e. before those objects are created),
>> those notify handlers have to take acpi_scan_lock to prevent races
>> from taking place (e.g. a struct acpi_device is found to be present
>> for the given ACPI handle, but right after that it is removed by
>> acpi_bus_trim() running in parallel to the given notify handler).
>> Moreover, since some of them call acpi_bus_scan() and
>> acpi_bus_trim(), this leads to the conclusion that acpi_scan_lock
>> should be acquired by the callers of these two functions rather than by
>> these functions themselves.
>>
>> For these reasons, make all notify handlers that can handle device
>> addition and eject events take acpi_scan_lock and remove the
>> acpi_scan_lock locking from acpi_bus_scan() and acpi_bus_trim().
>> Accordingly, update all of their users to make sure that they
>> are always called under acpi_scan_lock.
>>
>> Furthermore, since eject operations are carried out asynchronously
>> with respect to the notify events that trigger them, with the help
>> of acpi_bus_hot_remove_device(), even if notify handlers take the
>> ACPI scan lock, it still is possible that, for example,
>> acpi_bus_trim() will run between acpi_bus_hot_remove_device() and
>> the notify handler that scheduled its execution and that
>> acpi_bus_trim() will remove the device node passed to
>> acpi_bus_hot_remove_device() for ejection.  In that case, the struct
>> acpi_device object obtained by acpi_bus_hot_remove_device() will be
>> invalid and not-so-funny things will ensue.  To protect against that,
>> make the users of acpi_bus_hot_remove_device() run get_device() on
>> ACPI device node objects that are about to be passed to it and make
>> acpi_bus_hot_remove_device() run put_device() on them and check if
>> their ACPI handles are not NULL (make acpi_device_unregister() clear
>> the device nodes' ACPI handles for that check to work).
>>
>> Finally, observe that acpi_os_hotplug_execute() actually can fail,
>> in which case its caller ought to free memory allocated for the
>> context object to prevent leaks from happening.  It also needs to
>> run put_device() on the device node that it ran get_device() on
>> previously in that case.  Modify the code accordingly.
>>
>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> ---
>>
>> On top of linux-pm.git/linux-next.
>>
>> Thanks,
>> Rafael
>>
>> ---
>>   drivers/acpi/acpi_memhotplug.c     |   56 +++++++++++++++++++-----------
>>   drivers/acpi/container.c           |   10 +++--
>>   drivers/acpi/dock.c                |   19 ++++++++--
>>   drivers/acpi/processor_driver.c    |   24 +++++++++---
>>   drivers/acpi/scan.c                |   69 +++++++++++++++++++++++++------------
>>   drivers/pci/hotplug/acpiphp_glue.c |    6 +++
>>   drivers/pci/hotplug/sgi_hotplug.c  |    5 ++
>>   include/acpi/acpi_bus.h            |    3 +
>>   8 files changed, 138 insertions(+), 54 deletions(-)
>>
>> Index: test/drivers/acpi/scan.c
>> ===================================================================
>> --- test.orig/drivers/acpi/scan.c
>> +++ test/drivers/acpi/scan.c
>> @@ -42,6 +42,18 @@ struct acpi_device_bus_id{
>>       struct list_head node;
>>   };
>>
>> +void acpi_scan_lock_acquire(void)
>> +{
>> +    mutex_lock(&acpi_scan_lock);
>> +}
>> +EXPORT_SYMBOL_GPL(acpi_scan_lock_acquire);
>> +
>> +void acpi_scan_lock_release(void)
>> +{
>> +    mutex_unlock(&acpi_scan_lock);
>> +}
>> +EXPORT_SYMBOL_GPL(acpi_scan_lock_release);
>> +
>>   int acpi_scan_add_handler(struct acpi_scan_handler *handler)
>>   {
>>       if (!handler || !handler->attach)
>> @@ -95,8 +107,6 @@ acpi_device_modalias_show(struct device
>>   }
>>   static DEVICE_ATTR(modalias, 0444, acpi_device_modalias_show, NULL);
>>
>> -static void __acpi_bus_trim(struct acpi_device *start);
>> -
>>   /**
>>    * acpi_bus_hot_remove_device: hot-remove a device and its children
>>    * @context: struct acpi_eject_event pointer (freed in this func)
>> @@ -107,7 +117,7 @@ static void __acpi_bus_trim(struct acpi_
>>    */
>>   void acpi_bus_hot_remove_device(void *context)
>>   {
>> -    struct acpi_eject_event *ej_event = (struct acpi_eject_event *) context;
>> +    struct acpi_eject_event *ej_event = context;
>>       struct acpi_device *device = ej_event->device;
>>       acpi_handle handle = device->handle;
>>       acpi_handle temp;
>> @@ -118,11 +128,19 @@ void acpi_bus_hot_remove_device(void *co
>>
>>       mutex_lock(&acpi_scan_lock);
>>
>> +    /* If there is no handle, the device node has been unregistered. */
>> +    if (!device->handle) {
>> +        dev_dbg(&device->dev, "ACPI handle missing\n");
>> +        put_device(&device->dev);
>> +        goto out;
>> +    }
>> +
>>       ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>>           "Hot-removing device %s...\n", dev_name(&device->dev)));
>>
>> -    __acpi_bus_trim(device);
>> -    /* Device node has been released. */
>> +    acpi_bus_trim(device);
>> +    /* Device node has been unregistered. */
>> +    put_device(&device->dev);
>>       device = NULL;
>>
>>       if (ACPI_SUCCESS(acpi_get_handle(handle, "_LCK", &temp))) {
>> @@ -151,6 +169,7 @@ void acpi_bus_hot_remove_device(void *co
>>                         ost_code, NULL);
>>       }
>>
>> + out:
>>       mutex_unlock(&acpi_scan_lock);
>>       kfree(context);
>>       return;
>> @@ -212,6 +231,7 @@ acpi_eject_store(struct device *d, struc
>>           goto err;
>>       }
>>
>> +    get_device(&acpi_device->dev);
>>       ej_event->device = acpi_device;
>>       if (acpi_device->flags.eject_pending) {
>>           /* event originated from ACPI eject notification */
>> @@ -224,7 +244,11 @@ acpi_eject_store(struct device *d, struc
>>               ej_event->event, ACPI_OST_SC_EJECT_IN_PROGRESS, NULL);
>>       }
>>
>> -    acpi_os_hotplug_execute(acpi_bus_hot_remove_device, (void *)ej_event);
>> +    status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device, ej_event);
>> +    if (ACPI_FAILURE(status)) {
>> +        put_device(&acpi_device->dev);
>> +        kfree(ej_event);
>> +    }
>>   err:
>>       return ret;
>>   }
>> @@ -779,6 +803,7 @@ static void acpi_device_unregister(struc
>>        * no more references.
>>        */
>>       acpi_device_set_power(device, ACPI_STATE_D3_COLD);
>> +    device->handle = NULL;
>>       put_device(&device->dev);
>>   }
>>
>> @@ -1626,14 +1651,14 @@ static acpi_status acpi_bus_device_attac
>>    * there has been a real error.  There just have been no suitable ACPI objects
>>    * in the table trunk from which the kernel could create a device and add an
>>    * appropriate driver.
>> + *
>> + * Must be called under acpi_scan_lock.
>>    */
>>   int acpi_bus_scan(acpi_handle handle)
>>   {
>>       void *device = NULL;
>>       int error = 0;
>>
>> -    mutex_lock(&acpi_scan_lock);
>> -
>>       if (ACPI_SUCCESS(acpi_bus_check_add(handle, 0, NULL, &device)))
>>           acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
>>                       acpi_bus_check_add, NULL, NULL, &device);
>> @@ -1644,7 +1669,6 @@ int acpi_bus_scan(acpi_handle handle)
>>           acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
>>                       acpi_bus_device_attach, NULL, NULL, NULL);
>>
>> -    mutex_unlock(&acpi_scan_lock);
>>       return error;
>>   }
>>   EXPORT_SYMBOL(acpi_bus_scan);
>> @@ -1681,7 +1705,13 @@ static acpi_status acpi_bus_remove(acpi_
>>       return AE_OK;
>>   }
>>
>> -static void __acpi_bus_trim(struct acpi_device *start)
>> +/**
>> + * acpi_bus_trim - Remove ACPI device node and all of its descendants
>> + * @start: Root of the ACPI device nodes subtree to remove.
>> + *
>> + * Must be called under acpi_scan_lock.
>> + */
>> +void acpi_bus_trim(struct acpi_device *start)
>>   {
>>       /*
>>        * Execute acpi_bus_device_detach() as a post-order callback to detach
>> @@ -1698,13 +1728,6 @@ static void __acpi_bus_trim(struct acpi_
>>                   acpi_bus_remove, NULL, NULL);
>>       acpi_bus_remove(start->handle, 0, NULL, NULL);
>>   }
>> -
>> -void acpi_bus_trim(struct acpi_device *start)
>> -{
>> -    mutex_lock(&acpi_scan_lock);
>> -    __acpi_bus_trim(start);
>> -    mutex_unlock(&acpi_scan_lock);
>> -}
>>   EXPORT_SYMBOL_GPL(acpi_bus_trim);
>>
>>   static int acpi_bus_scan_fixed(void)
>> @@ -1762,23 +1785,27 @@ int __init acpi_scan_init(void)
>>       acpi_csrt_init();
>>       acpi_container_init();
>>
>> +    mutex_lock(&acpi_scan_lock);
>>       /*
>>        * Enumerate devices in the ACPI namespace.
>>        */
>>       result = acpi_bus_scan(ACPI_ROOT_OBJECT);
>>       if (result)
>> -        return result;
>> +        goto out;
>>
>>       result = acpi_bus_get_device(ACPI_ROOT_OBJECT, &acpi_root);
>>       if (result)
>> -        return result;
>> +        goto out;
>>
>>       result = acpi_bus_scan_fixed();
>>       if (result) {
>>           acpi_device_unregister(acpi_root);
>> -        return result;
>> +        goto out;
>>       }
>>
>>       acpi_update_all_gpes();
>> -    return 0;
>> +
>> + out:
>> +    mutex_unlock(&acpi_scan_lock);
>> +    return result;
>>   }
>> Index: test/include/acpi/acpi_bus.h
>> ===================================================================
>> --- test.orig/include/acpi/acpi_bus.h
>> +++ test/include/acpi/acpi_bus.h
>> @@ -395,6 +395,9 @@ int acpi_bus_receive_event(struct acpi_b
>>   static inline int acpi_bus_generate_proc_event(struct acpi_device *device, u8 type, int data)
>>       { return 0; }
>>   #endif
>> +
>> +void acpi_scan_lock_acquire(void);
>> +void acpi_scan_lock_release(void);
>>   int acpi_scan_add_handler(struct acpi_scan_handler *handler);
>>   int acpi_bus_register_driver(struct acpi_driver *driver);
>>   void acpi_bus_unregister_driver(struct acpi_driver *driver);
>> Index: test/drivers/acpi/acpi_memhotplug.c
>> ===================================================================
>> --- test.orig/drivers/acpi/acpi_memhotplug.c
>> +++ test/drivers/acpi/acpi_memhotplug.c
>> @@ -153,14 +153,16 @@ acpi_memory_get_device_resources(struct
>>       return 0;
>>   }
>>
>> -static int
>> -acpi_memory_get_device(acpi_handle handle,
>> -               struct acpi_memory_device **mem_device)
>> +static int acpi_memory_get_device(acpi_handle handle,
>> +                  struct acpi_memory_device **mem_device)
>>   {
>>       struct acpi_device *device = NULL;
>> -    int result;
>> +    int result = 0;
>> +
>> +    acpi_scan_lock_acquire();
>>
>> -    if (!acpi_bus_get_device(handle, &device) && device)
>> +    acpi_bus_get_device(handle, &device);
>> +    if (device)
>>           goto end;
>>
>>       /*
>> @@ -169,23 +171,28 @@ acpi_memory_get_device(acpi_handle handl
>>        */
>>       result = acpi_bus_scan(handle);
>>       if (result) {
>> -        acpi_handle_warn(handle, "Cannot add acpi bus\n");
>> -        return -EINVAL;
>> +        acpi_handle_warn(handle, "ACPI namespace scan failed\n");
>> +        result = -EINVAL;
>> +        goto out;
>>       }
>>       result = acpi_bus_get_device(handle, &device);
>>       if (result) {
>>           acpi_handle_warn(handle, "Missing device object\n");
>> -        return -EINVAL;
>> +        result = -EINVAL;
>> +        goto out;
>>       }
>>
>> -      end:
>> + end:
>>       *mem_device = acpi_driver_data(device);
>>       if (!(*mem_device)) {
>>           dev_err(&device->dev, "driver data not found\n");
>> -        return -ENODEV;
>> +        result = -ENODEV;
>> +        goto out;
>>       }
>>
>> -    return 0;
>> + out:
>> +    acpi_scan_lock_release();
>> +    return result;
>>   }
>>
>>   static int acpi_memory_check_device(struct acpi_memory_device *mem_device)
>> @@ -305,6 +312,7 @@ static void acpi_memory_device_notify(ac
>>       struct acpi_device *device;
>>       struct acpi_eject_event *ej_event = NULL;
>>       u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
>> +    acpi_status status;
>>
>>       switch (event) {
>>       case ACPI_NOTIFY_BUS_CHECK:
>> @@ -327,29 +335,40 @@ static void acpi_memory_device_notify(ac
>>           ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>>                     "\nReceived EJECT REQUEST notification for device\n"));
>>
>> +        status = AE_ERROR;
>> +        acpi_scan_lock_acquire();
>> +
>>           if (acpi_bus_get_device(handle, &device)) {
>>               acpi_handle_err(handle, "Device doesn't exist\n");
>> -            break;
>> +            goto unlock;
>>           }
>>           mem_device = acpi_driver_data(device);
>>           if (!mem_device) {
>>               acpi_handle_err(handle, "Driver Data is NULL\n");
>> -            break;
>> +            goto unlock;
>>           }
>>
>>           ej_event = kmalloc(sizeof(*ej_event), GFP_KERNEL);
>>           if (!ej_event) {
>>               pr_err(PREFIX "No memory, dropping EJECT\n");
>> -            break;
>> +            goto unlock;
>>           }
>>
>> +        get_device(&device->dev);
>>           ej_event->device = device;
>>           ej_event->event = ACPI_NOTIFY_EJECT_REQUEST;
>> -        acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
>> -                    (void *)ej_event);
>> +        /* The eject is carried out asynchronously. */
>> +        status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
>> +                         ej_event);
>> +        if (ACPI_FAILURE(status)) {
>> +            put_device(&device->dev);
>> +            kfree(ej_event);
>> +        }
>>
>> -        /* eject is performed asynchronously */
>> -        return;
>> + unlock:
>> +        acpi_scan_lock_release();
>> +        if (ACPI_SUCCESS(status))
>> +            return;
>>       default:
>>           ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>>                     "Unsupported event [0x%x]\n", event));
>> @@ -360,7 +379,6 @@ static void acpi_memory_device_notify(ac
>>
>>       /* Inform firmware that the hotplug operation has completed */
>>       (void) acpi_evaluate_hotplug_ost(handle, event, ost_code, NULL);
>> -    return;
>>   }
>>
>>   static void acpi_memory_device_free(struct acpi_memory_device *mem_device)
>> Index: test/drivers/acpi/processor_driver.c
>> ===================================================================
>> --- test.orig/drivers/acpi/processor_driver.c
>> +++ test/drivers/acpi/processor_driver.c
>> @@ -683,8 +683,11 @@ static void acpi_processor_hotplug_notif
>>       struct acpi_device *device = NULL;
>>       struct acpi_eject_event *ej_event = NULL;
>>       u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
>> +    acpi_status status;
>>       int result;
>>
>> +    acpi_scan_lock_acquire();
>> +
>>       switch (event) {
>>       case ACPI_NOTIFY_BUS_CHECK:
>>       case ACPI_NOTIFY_DEVICE_CHECK:
>> @@ -733,25 +736,32 @@ static void acpi_processor_hotplug_notif
>>               break;
>>           }
>>
>> +        get_device(&device->dev);
>>           ej_event->device = device;
>>           ej_event->event = ACPI_NOTIFY_EJECT_REQUEST;
>> -        acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
>> -                    (void *)ej_event);
>> -
>> -        /* eject is performed asynchronously */
>> -        return;
>> +        /* The eject is carried out asynchronously. */
>> +        status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
>> +                         ej_event);
>> +        if (ACPI_FAILURE(status)) {
>> +            put_device(&device->dev);
>> +            kfree(ej_event);
>> +            break;
>> +        }
>> +        goto out;
>>
>>       default:
>>           ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>>                     "Unsupported event [0x%x]\n", event));
>>
>>           /* non-hotplug event; possibly handled by other handler */
>> -        return;
>> +        goto out;
>>       }
>>
>>       /* Inform firmware that the hotplug operation has completed */
>>       (void) acpi_evaluate_hotplug_ost(handle, event, ost_code, NULL);
>> -    return;
>> +
>> + out:
>
>> +    acpi_scan_lock_release();;
>
> extra ";"
>
> Thanks,
> Yasuaki Ishimatsu
>
>>   }
>>
>>   static acpi_status is_processor_device(acpi_handle handle)
>> Index: test/drivers/acpi/container.c
>> ===================================================================
>> --- test.orig/drivers/acpi/container.c
>> +++ test/drivers/acpi/container.c
>> @@ -88,6 +88,8 @@ static void container_notify_cb(acpi_han
>>       acpi_status status;
>>       u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
>>
>> +    acpi_scan_lock_acquire();
>> +
>>       switch (type) {
>>       case ACPI_NOTIFY_BUS_CHECK:
>>           /* Fall through */

                present = is_device_present(handle);
                status = acpi_bus_get_device(handle, &device);
                if (!present) {
                        if (ACPI_SUCCESS(status)) {
                                /* device exist and this is a remove request */
                                device->flags.eject_pending = 1;
                                kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);

                                return;

It should use "goto out" instead of "return" here.

                        }
                        break;
                }

Thanks,
Yasuaki Ishimatsu
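
To illustrate the point about "goto out": once the handler takes the scan
lock at the top, every exit has to pass through the label that releases it.
The following is only a userspace sketch, not the kernel code. The pthread
mutex scan_lock, the EV_* constants, the ost_reports counter and the helper
names are all hypothetical stand-ins for acpi_scan_lock, the ACPI notify
values, acpi_evaluate_hotplug_ost() and container_notify_cb().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for acpi_scan_lock. */
static pthread_mutex_t scan_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical event codes standing in for the ACPI notify values. */
enum { EV_BUS_CHECK, EV_EJECT_REQUEST, EV_OTHER };

/* Counts how often the "inform firmware" step at the end is reached. */
static int ost_reports;

/* Models the shape of container_notify_cb(): all paths leave via "out". */
static void notify_cb(int type, bool present, bool have_device)
{
	pthread_mutex_lock(&scan_lock);

	switch (type) {
	case EV_BUS_CHECK:
		if (!present) {
			if (have_device)
				goto out; /* a bare return would leave the lock held */
			break;
		}
		break;
	case EV_EJECT_REQUEST:
		if (have_device)
			goto out;
		break;
	default:
		/* non-hotplug event: nothing to report, but still unlock */
		goto out;
	}

	ost_reports++; /* stands in for acpi_evaluate_hotplug_ost() */

out:
	pthread_mutex_unlock(&scan_lock);
}

/* Returns true if scan_lock is free at the time of the call. */
static bool scan_lock_is_free(void)
{
	if (pthread_mutex_trylock(&scan_lock) != 0)
		return false;
	pthread_mutex_unlock(&scan_lock);
	return true;
}

/* Usage: the early-exit path must not leak the lock. */
static void demo(void)
{
	notify_cb(EV_BUS_CHECK, false, true);
	assert(scan_lock_is_free());
	assert(ost_reports == 0);
}
```

With a plain "return" in place of the first "goto out", the next caller of
notify_cb() would block forever on the mutex, which is the problem with the
remaining "return" pointed out above.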

>> @@ -130,18 +132,20 @@ static void container_notify_cb(acpi_han
>>           if (!acpi_bus_get_device(handle, &device) && device) {
>>               device->flags.eject_pending = 1;
>>               kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
>> -            return;
>> +            goto out;
>>           }
>>           break;
>>
>>       default:
>>           /* non-hotplug event; possibly handled by other handler */
>> -        return;
>> +        goto out;
>>       }
>>
>>       /* Inform firmware that the hotplug operation has completed */
>>       (void) acpi_evaluate_hotplug_ost(handle, type, ost_code, NULL);
>> -    return;
>> +
>> + out:
>> +    acpi_scan_lock_release();
>>   }
>>
>>   static bool is_container(acpi_handle handle)
>> Index: test/drivers/acpi/dock.c
>> ===================================================================
>> --- test.orig/drivers/acpi/dock.c
>> +++ test/drivers/acpi/dock.c
>> @@ -744,7 +744,9 @@ static void acpi_dock_deferred_cb(void *
>>   {
>>       struct dock_data *data = context;
>>
>> +    acpi_scan_lock_acquire();
>>       dock_notify(data->handle, data->event, data->ds);
>> +    acpi_scan_lock_release();
>>       kfree(data);
>>   }
>>
>> @@ -757,20 +759,31 @@ static int acpi_dock_notifier_call(struc
>>       if (event != ACPI_NOTIFY_BUS_CHECK && event != ACPI_NOTIFY_DEVICE_CHECK
>>          && event != ACPI_NOTIFY_EJECT_REQUEST)
>>           return 0;
>> +
>> +    acpi_scan_lock_acquire();
>> +
>>       list_for_each_entry(dock_station, &dock_stations, sibling) {
>>           if (dock_station->handle == handle) {
>>               struct dock_data *dd;
>> +            acpi_status status;
>>
>>               dd = kmalloc(sizeof(*dd), GFP_KERNEL);
>>               if (!dd)
>> -                return 0;
>> +                break;
>> +
>>               dd->handle = handle;
>>               dd->event = event;
>>               dd->ds = dock_station;
>> -            acpi_os_hotplug_execute(acpi_dock_deferred_cb, dd);
>> -            return 0 ;
>> +            status = acpi_os_hotplug_execute(acpi_dock_deferred_cb,
>> +                             dd);
>> +            if (ACPI_FAILURE(status))
>> +                kfree(dd);
>> +
>> +            break;
>>           }
>>       }
>> +
>> +    acpi_scan_lock_release();
>>       return 0;
>>   }
>>
>> Index: test/drivers/pci/hotplug/acpiphp_glue.c
>> ===================================================================
>> --- test.orig/drivers/pci/hotplug/acpiphp_glue.c
>> +++ test/drivers/pci/hotplug/acpiphp_glue.c
>> @@ -1218,6 +1218,8 @@ static void _handle_hotplug_event_bridge
>>       handle = hp_work->handle;
>>       type = hp_work->type;
>>
>> +    acpi_scan_lock_acquire();
>> +
>>       if (acpi_bus_get_device(handle, &device)) {
>>           /* This bridge must have just been physically inserted */
>>           handle_bridge_insertion(handle, type);
>> @@ -1295,6 +1297,7 @@ static void _handle_hotplug_event_bridge
>>       }
>>
>>   out:
>> +    acpi_scan_lock_release();
>>       kfree(hp_work); /* allocated in handle_hotplug_event_bridge */
>>   }
>>
>> @@ -1341,6 +1344,8 @@ static void _handle_hotplug_event_func(s
>>
>>       func = (struct acpiphp_func *)context;
>>
>> +    acpi_scan_lock_acquire();
>> +
>>       switch (type) {
>>       case ACPI_NOTIFY_BUS_CHECK:
>>           /* bus re-enumerate */
>> @@ -1371,6 +1376,7 @@ static void _handle_hotplug_event_func(s
>>           break;
>>       }
>>
>> +    acpi_scan_lock_release();
>>       kfree(hp_work); /* allocated in handle_hotplug_event_func */
>>   }
>>
>> Index: test/drivers/pci/hotplug/sgi_hotplug.c
>> ===================================================================
>> --- test.orig/drivers/pci/hotplug/sgi_hotplug.c
>> +++ test/drivers/pci/hotplug/sgi_hotplug.c
>> @@ -425,6 +425,7 @@ static int enable_slot(struct hotplug_sl
>>               pdevice = NULL;
>>           }
>>
>> +        acpi_scan_lock_acquire();
>>           /*
>>            * Walk the rootbus node's immediate children looking for
>>            * the slot's device node(s). There can be more than
>> @@ -458,6 +459,7 @@ static int enable_slot(struct hotplug_sl
>>                   }
>>               }
>>           }
>> +        acpi_scan_lock_release();
>>       }
>>
>>       /* Call the driver for the new device */
>> @@ -508,6 +510,7 @@ static int disable_slot(struct hotplug_s
>>           /* Get the rootbus node pointer */
>>           phandle = PCI_CONTROLLER(slot->pci_bus)->acpi_handle;
>>
>> +        acpi_scan_lock_acquire();
>>           /*
>>            * Walk the rootbus node's immediate children looking for
>>            * the slot's device node(s). There can be more than
>> @@ -538,7 +541,7 @@ static int disable_slot(struct hotplug_s
>>                       acpi_bus_trim(device);
>>               }
>>           }
>> -
>> +        acpi_scan_lock_release();
>>       }
>>
>>       /* Free the SN resources assigned to the Linux device.*/
>>

* Re: [PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
  2013-02-13  1:55 ` Yinghai Lu
@ 2013-02-13 13:08   ` Rafael J. Wysocki
  0 siblings, 0 replies; 35+ messages in thread
From: Rafael J. Wysocki @ 2013-02-13 13:08 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: ACPI Devel Maling List, LKML, Bjorn Helgaas, Jiang Liu,
	Toshi Kani, Yasuaki Ishimatsu, Myron Stowe, linux-pci

On Tuesday, February 12, 2013 05:55:26 PM Yinghai Lu wrote:
> On Tue, Feb 12, 2013 at 4:19 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > This changeset is aimed at fixing a few different but related
> > problems in the ACPI hotplug infrastructure.
> >
> > First of all, since notify handlers may be run in parallel with
> > acpi_bus_scan(), acpi_bus_trim() and acpi_bus_hot_remove_device()
> > and some of them are installed for ACPI handles that have no struct
> > acpi_device objects attached (i.e. before those objects are created),
> > those notify handlers have to take acpi_scan_lock to prevent races
> > from taking place (e.g. a struct acpi_device is found to be present
> > for the given ACPI handle, but right after that it is removed by
> > acpi_bus_trim() running in parallel to the given notify handler).
> > Moreover, since some of them call acpi_bus_scan() and
> > acpi_bus_trim(), this leads to the conclusion that acpi_scan_lock
> > should be acquired by the callers of these two functions rather than by
> > these functions themselves.
> >
> > For these reasons, make all notify handlers that can handle device
> > addition and eject events take acpi_scan_lock and remove the
> > acpi_scan_lock locking from acpi_bus_scan() and acpi_bus_trim().
> > Accordingly, update all of their users to make sure that they
> > are always called under acpi_scan_lock.
> >
> > Furthermore, since eject operations are carried out asynchronously
> > with respect to the notify events that trigger them, with the help
> > of acpi_bus_hot_remove_device(), even if notify handlers take the
> > ACPI scan lock, it still is possible that, for example,
> > acpi_bus_trim() will run between acpi_bus_hot_remove_device() and
> > the notify handler that scheduled its execution and that
> > acpi_bus_trim() will remove the device node passed to
> > acpi_bus_hot_remove_device() for ejection.  In that case, the struct
> > acpi_device object obtained by acpi_bus_hot_remove_device() will be
> > invalid and not-so-funny things will ensue.  To protect against that,
> > make the users of acpi_bus_hot_remove_device() run get_device() on
> > ACPI device node objects that are about to be passed to it and make
> > acpi_bus_hot_remove_device() run put_device() on them and check if
> > their ACPI handles are not NULL (make acpi_device_unregister() clear
> > the device nodes' ACPI handles for that check to work).
> >
> > Finally, observe that acpi_os_hotplug_execute() actually can fail,
> > in which case its caller ought to free memory allocated for the
> > context object to prevent leaks from happening.  It also needs to
> > run put_device() on the device node that it ran get_device() on
> > previously in that case.  Modify the code accordingly.
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Acked-by: Yinghai Lu <yinghai@kernel.org>

Thanks!
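
For reference, the reference-counting and error-handling rule from the
changelog can be sketched in userspace roughly as follows. This is a minimal
model under stated assumptions, not the kernel implementation: struct node,
node_get()/node_put(), schedule_work_sim(), request_eject() and run_pending()
are hypothetical names standing in for struct acpi_device,
get_device()/put_device(), acpi_os_hotplug_execute(), the notify handlers
and the hotplug workqueue.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct acpi_device. */
struct node {
	int refcount;
	void *handle; /* NULL once the node has been unregistered */
};

/* Hypothetical stand-in for struct acpi_eject_event. */
struct eject_event {
	struct node *device;
};

static void node_get(struct node *n) { n->refcount++; }
static void node_put(struct node *n) { n->refcount--; }

typedef void (*work_fn)(void *);
static work_fn pending_fn;
static void *pending_ctx;
static int ejected;

/* Models acpi_os_hotplug_execute(): queues one work item or fails. */
static bool schedule_work_sim(work_fn fn, void *ctx, bool fail)
{
	if (fail)
		return false;
	pending_fn = fn;
	pending_ctx = ctx;
	return true;
}

/* Models acpi_bus_hot_remove_device(): skip nodes that are already gone. */
static void hot_remove(void *context)
{
	struct eject_event *ev = context;
	struct node *dev = ev->device;

	if (dev->handle) {
		ejected++;
		dev->handle = NULL; /* unregistering clears the handle */
	}
	node_put(dev); /* drop the reference taken by the requester */
	free(ev);
}

/* Models a notify handler scheduling an asynchronous eject. */
static bool request_eject(struct node *dev, bool fail)
{
	struct eject_event *ev = malloc(sizeof(*ev));

	if (!ev)
		return false;

	node_get(dev); /* keep the node alive until hot_remove() runs */
	ev->device = dev;
	if (!schedule_work_sim(hot_remove, ev, fail)) {
		/* The work will never run: undo both steps here. */
		node_put(dev);
		free(ev);
		return false;
	}
	return true;
}

/* Runs the queued work item, if any. */
static void run_pending(void)
{
	work_fn fn = pending_fn;

	pending_fn = NULL;
	if (fn)
		fn(pending_ctx);
}
```

The point of the model is that the reference count returns to its initial
value on every path, and that a node whose handle was cleared between
request_eject() and run_pending() is not ejected a second time.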



* Re: [PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
  2013-02-13  3:31     ` Yasuaki Ishimatsu
  (?)
@ 2013-02-13 13:12     ` Rafael J. Wysocki
  -1 siblings, 0 replies; 35+ messages in thread
From: Rafael J. Wysocki @ 2013-02-13 13:12 UTC (permalink / raw)
  To: Yasuaki Ishimatsu
  Cc: ACPI Devel Maling List, LKML, Bjorn Helgaas, Jiang Liu,
	Yinghai Lu, Toshi Kani, Myron Stowe, linux-pci

On Wednesday, February 13, 2013 12:31:05 PM Yasuaki Ishimatsu wrote:
> Hi Rafael,
> 
> I have another comment at container.c.
> 
> 2013/02/13 12:08, Yasuaki Ishimatsu wrote:
> > Hi Rafael,
> >
> > The patch seems good.
> > There is a comment below.
> >
> > 2013/02/13 9:19, Rafael J. Wysocki wrote:
> >> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>
> >> This changeset is aimed at fixing a few different but related
> >> problems in the ACPI hotplug infrastructure.
> >>
> >> First of all, since notify handlers may be run in parallel with
> >> acpi_bus_scan(), acpi_bus_trim() and acpi_bus_hot_remove_device()
> >> and some of them are installed for ACPI handles that have no struct
> >> acpi_device objects attached (i.e. before those objects are created),
> >> those notify handlers have to take acpi_scan_lock to prevent races
> >> from taking place (e.g. a struct acpi_device is found to be present
> >> for the given ACPI handle, but right after that it is removed by
> >> acpi_bus_trim() running in parallel to the given notify handler).
> >> Moreover, since some of them call acpi_bus_scan() and
> >> acpi_bus_trim(), this leads to the conclusion that acpi_scan_lock
> >> should be acquired by the callers of these two functions rather than by
> >> these functions themselves.
> >>
> >> For these reasons, make all notify handlers that can handle device
> >> addition and eject events take acpi_scan_lock and remove the
> >> acpi_scan_lock locking from acpi_bus_scan() and acpi_bus_trim().
> >> Accordingly, update all of their users to make sure that they
> >> are always called under acpi_scan_lock.
> >>
> >> Furthermore, since eject operations are carried out asynchronously
> >> with respect to the notify events that trigger them, with the help
> >> of acpi_bus_hot_remove_device(), even if notify handlers take the
> >> ACPI scan lock, it still is possible that, for example,
> >> acpi_bus_trim() will run between acpi_bus_hot_remove_device() and
> >> the notify handler that scheduled its execution and that
> >> acpi_bus_trim() will remove the device node passed to
> >> acpi_bus_hot_remove_device() for ejection.  In that case, the struct
> >> acpi_device object obtained by acpi_bus_hot_remove_device() will be
> >> invalid and not-so-funny things will ensue.  To protect against that,
> >> make the users of acpi_bus_hot_remove_device() run get_device() on
> >> ACPI device node objects that are about to be passed to it and make
> >> acpi_bus_hot_remove_device() run put_device() on them and check if
> >> their ACPI handles are not NULL (make acpi_device_unregister() clear
> >> the device nodes' ACPI handles for that check to work).
> >>
> >> Finally, observe that acpi_os_hotplug_execute() actually can fail,
> >> in which case its caller ought to free memory allocated for the
> >> context object to prevent leaks from happening.  It also needs to
> >> run put_device() on the device node that it ran get_device() on
> >> previously in that case.  Modify the code accordingly.
> >>
> >> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >> ---
> >>
> >> On top of linux-pm.git/linux-next.
> >>
> >> Thanks,
> >> Rafael
> >>
> >> ---
> >>   drivers/acpi/acpi_memhotplug.c     |   56 +++++++++++++++++++-----------
> >>   drivers/acpi/container.c           |   10 +++--
> >>   drivers/acpi/dock.c                |   19 ++++++++--
> >>   drivers/acpi/processor_driver.c    |   24 +++++++++---
> >>   drivers/acpi/scan.c                |   69 +++++++++++++++++++++++++------------
> >>   drivers/pci/hotplug/acpiphp_glue.c |    6 +++
> >>   drivers/pci/hotplug/sgi_hotplug.c  |    5 ++
> >>   include/acpi/acpi_bus.h            |    3 +
> >>   8 files changed, 138 insertions(+), 54 deletions(-)
> >>
> >> Index: test/drivers/acpi/scan.c
> >> ===================================================================
> >> --- test.orig/drivers/acpi/scan.c
> >> +++ test/drivers/acpi/scan.c
> >> @@ -42,6 +42,18 @@ struct acpi_device_bus_id{
> >>       struct list_head node;
> >>   };
> >>
> >> +void acpi_scan_lock_acquire(void)
> >> +{
> >> +    mutex_lock(&acpi_scan_lock);
> >> +}
> >> +EXPORT_SYMBOL_GPL(acpi_scan_lock_acquire);
> >> +
> >> +void acpi_scan_lock_release(void)
> >> +{
> >> +    mutex_unlock(&acpi_scan_lock);
> >> +}
> >> +EXPORT_SYMBOL_GPL(acpi_scan_lock_release);
> >> +
> >>   int acpi_scan_add_handler(struct acpi_scan_handler *handler)
> >>   {
> >>       if (!handler || !handler->attach)
> >> @@ -95,8 +107,6 @@ acpi_device_modalias_show(struct device
> >>   }
> >>   static DEVICE_ATTR(modalias, 0444, acpi_device_modalias_show, NULL);
> >>
> >> -static void __acpi_bus_trim(struct acpi_device *start);
> >> -
> >>   /**
> >>    * acpi_bus_hot_remove_device: hot-remove a device and its children
> >>    * @context: struct acpi_eject_event pointer (freed in this func)
> >> @@ -107,7 +117,7 @@ static void __acpi_bus_trim(struct acpi_
> >>    */
> >>   void acpi_bus_hot_remove_device(void *context)
> >>   {
> >> -    struct acpi_eject_event *ej_event = (struct acpi_eject_event *) context;
> >> +    struct acpi_eject_event *ej_event = context;
> >>       struct acpi_device *device = ej_event->device;
> >>       acpi_handle handle = device->handle;
> >>       acpi_handle temp;
> >> @@ -118,11 +128,19 @@ void acpi_bus_hot_remove_device(void *co
> >>
> >>       mutex_lock(&acpi_scan_lock);
> >>
> >> +    /* If there is no handle, the device node has been unregistered. */
> >> +    if (!device->handle) {
> >> +        dev_dbg(&device->dev, "ACPI handle missing\n");
> >> +        put_device(&device->dev);
> >> +        goto out;
> >> +    }
> >> +
> >>       ACPI_DEBUG_PRINT((ACPI_DB_INFO,
> >>           "Hot-removing device %s...\n", dev_name(&device->dev)));
> >>
> >> -    __acpi_bus_trim(device);
> >> -    /* Device node has been released. */
> >> +    acpi_bus_trim(device);
> >> +    /* Device node has been unregistered. */
> >> +    put_device(&device->dev);
> >>       device = NULL;
> >>
> >>       if (ACPI_SUCCESS(acpi_get_handle(handle, "_LCK", &temp))) {
> >> @@ -151,6 +169,7 @@ void acpi_bus_hot_remove_device(void *co
> >>                         ost_code, NULL);
> >>       }
> >>
> >> + out:
> >>       mutex_unlock(&acpi_scan_lock);
> >>       kfree(context);
> >>       return;
> >> @@ -212,6 +231,7 @@ acpi_eject_store(struct device *d, struc
> >>           goto err;
> >>       }
> >>
> >> +    get_device(&acpi_device->dev);
> >>       ej_event->device = acpi_device;
> >>       if (acpi_device->flags.eject_pending) {
> >>           /* event originated from ACPI eject notification */
> >> @@ -224,7 +244,11 @@ acpi_eject_store(struct device *d, struc
> >>               ej_event->event, ACPI_OST_SC_EJECT_IN_PROGRESS, NULL);
> >>       }
> >>
> >> -    acpi_os_hotplug_execute(acpi_bus_hot_remove_device, (void *)ej_event);
> >> +    status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device, ej_event);
> >> +    if (ACPI_FAILURE(status)) {
> >> +        put_device(&acpi_device->dev);
> >> +        kfree(ej_event);
> >> +    }
> >>   err:
> >>       return ret;
> >>   }
> >> @@ -779,6 +803,7 @@ static void acpi_device_unregister(struc
> >>        * no more references.
> >>        */
> >>       acpi_device_set_power(device, ACPI_STATE_D3_COLD);
> >> +    device->handle = NULL;
> >>       put_device(&device->dev);
> >>   }
> >>
> >> @@ -1626,14 +1651,14 @@ static acpi_status acpi_bus_device_attac
> >>    * there has been a real error.  There just have been no suitable ACPI objects
> >>    * in the table trunk from which the kernel could create a device and add an
> >>    * appropriate driver.
> >> + *
> >> + * Must be called under acpi_scan_lock.
> >>    */
> >>   int acpi_bus_scan(acpi_handle handle)
> >>   {
> >>       void *device = NULL;
> >>       int error = 0;
> >>
> >> -    mutex_lock(&acpi_scan_lock);
> >> -
> >>       if (ACPI_SUCCESS(acpi_bus_check_add(handle, 0, NULL, &device)))
> >>           acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
> >>                       acpi_bus_check_add, NULL, NULL, &device);
> >> @@ -1644,7 +1669,6 @@ int acpi_bus_scan(acpi_handle handle)
> >>           acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
> >>                       acpi_bus_device_attach, NULL, NULL, NULL);
> >>
> >> -    mutex_unlock(&acpi_scan_lock);
> >>       return error;
> >>   }
> >>   EXPORT_SYMBOL(acpi_bus_scan);
> >> @@ -1681,7 +1705,13 @@ static acpi_status acpi_bus_remove(acpi_
> >>       return AE_OK;
> >>   }
> >>
> >> -static void __acpi_bus_trim(struct acpi_device *start)
> >> +/**
> >> + * acpi_bus_trim - Remove ACPI device node and all of its descendants
> >> + * @start: Root of the ACPI device nodes subtree to remove.
> >> + *
> >> + * Must be called under acpi_scan_lock.
> >> + */
> >> +void acpi_bus_trim(struct acpi_device *start)
> >>   {
> >>       /*
> >>        * Execute acpi_bus_device_detach() as a post-order callback to detach
> >> @@ -1698,13 +1728,6 @@ static void __acpi_bus_trim(struct acpi_
> >>                   acpi_bus_remove, NULL, NULL);
> >>       acpi_bus_remove(start->handle, 0, NULL, NULL);
> >>   }
> >> -
> >> -void acpi_bus_trim(struct acpi_device *start)
> >> -{
> >> -    mutex_lock(&acpi_scan_lock);
> >> -    __acpi_bus_trim(start);
> >> -    mutex_unlock(&acpi_scan_lock);
> >> -}
> >>   EXPORT_SYMBOL_GPL(acpi_bus_trim);
> >>
> >>   static int acpi_bus_scan_fixed(void)
> >> @@ -1762,23 +1785,27 @@ int __init acpi_scan_init(void)
> >>       acpi_csrt_init();
> >>       acpi_container_init();
> >>
> >> +    mutex_lock(&acpi_scan_lock);
> >>       /*
> >>        * Enumerate devices in the ACPI namespace.
> >>        */
> >>       result = acpi_bus_scan(ACPI_ROOT_OBJECT);
> >>       if (result)
> >> -        return result;
> >> +        goto out;
> >>
> >>       result = acpi_bus_get_device(ACPI_ROOT_OBJECT, &acpi_root);
> >>       if (result)
> >> -        return result;
> >> +        goto out;
> >>
> >>       result = acpi_bus_scan_fixed();
> >>       if (result) {
> >>           acpi_device_unregister(acpi_root);
> >> -        return result;
> >> +        goto out;
> >>       }
> >>
> >>       acpi_update_all_gpes();
> >> -    return 0;
> >> +
> >> + out:
> >> +    mutex_unlock(&acpi_scan_lock);
> >> +    return result;
> >>   }
> >> Index: test/include/acpi/acpi_bus.h
> >> ===================================================================
> >> --- test.orig/include/acpi/acpi_bus.h
> >> +++ test/include/acpi/acpi_bus.h
> >> @@ -395,6 +395,9 @@ int acpi_bus_receive_event(struct acpi_b
> >>   static inline int acpi_bus_generate_proc_event(struct acpi_device *device, u8 type, int data)
> >>       { return 0; }
> >>   #endif
> >> +
> >> +void acpi_scan_lock_acquire(void);
> >> +void acpi_scan_lock_release(void);
> >>   int acpi_scan_add_handler(struct acpi_scan_handler *handler);
> >>   int acpi_bus_register_driver(struct acpi_driver *driver);
> >>   void acpi_bus_unregister_driver(struct acpi_driver *driver);
> >> Index: test/drivers/acpi/acpi_memhotplug.c
> >> ===================================================================
> >> --- test.orig/drivers/acpi/acpi_memhotplug.c
> >> +++ test/drivers/acpi/acpi_memhotplug.c
> >> @@ -153,14 +153,16 @@ acpi_memory_get_device_resources(struct
> >>       return 0;
> >>   }
> >>
> >> -static int
> >> -acpi_memory_get_device(acpi_handle handle,
> >> -               struct acpi_memory_device **mem_device)
> >> +static int acpi_memory_get_device(acpi_handle handle,
> >> +                  struct acpi_memory_device **mem_device)
> >>   {
> >>       struct acpi_device *device = NULL;
> >> -    int result;
> >> +    int result = 0;
> >> +
> >> +    acpi_scan_lock_acquire();
> >>
> >> -    if (!acpi_bus_get_device(handle, &device) && device)
> >> +    acpi_bus_get_device(handle, &device);
> >> +    if (device)
> >>           goto end;
> >>
> >>       /*
> >> @@ -169,23 +171,28 @@ acpi_memory_get_device(acpi_handle handl
> >>        */
> >>       result = acpi_bus_scan(handle);
> >>       if (result) {
> >> -        acpi_handle_warn(handle, "Cannot add acpi bus\n");
> >> -        return -EINVAL;
> >> +        acpi_handle_warn(handle, "ACPI namespace scan failed\n");
> >> +        result = -EINVAL;
> >> +        goto out;
> >>       }
> >>       result = acpi_bus_get_device(handle, &device);
> >>       if (result) {
> >>           acpi_handle_warn(handle, "Missing device object\n");
> >> -        return -EINVAL;
> >> +        result = -EINVAL;
> >> +        goto out;
> >>       }
> >>
> >> -      end:
> >> + end:
> >>       *mem_device = acpi_driver_data(device);
> >>       if (!(*mem_device)) {
> >>           dev_err(&device->dev, "driver data not found\n");
> >> -        return -ENODEV;
> >> +        result = -ENODEV;
> >> +        goto out;
> >>       }
> >>
> >> -    return 0;
> >> + out:
> >> +    acpi_scan_lock_release();
> >> +    return result;
> >>   }
> >>
> >>   static int acpi_memory_check_device(struct acpi_memory_device *mem_device)
> >> @@ -305,6 +312,7 @@ static void acpi_memory_device_notify(ac
> >>       struct acpi_device *device;
> >>       struct acpi_eject_event *ej_event = NULL;
> >>       u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
> >> +    acpi_status status;
> >>
> >>       switch (event) {
> >>       case ACPI_NOTIFY_BUS_CHECK:
> >> @@ -327,29 +335,40 @@ static void acpi_memory_device_notify(ac
> >>           ACPI_DEBUG_PRINT((ACPI_DB_INFO,
> >>                     "\nReceived EJECT REQUEST notification for device\n"));
> >>
> >> +        status = AE_ERROR;
> >> +        acpi_scan_lock_acquire();
> >> +
> >>           if (acpi_bus_get_device(handle, &device)) {
> >>               acpi_handle_err(handle, "Device doesn't exist\n");
> >> -            break;
> >> +            goto unlock;
> >>           }
> >>           mem_device = acpi_driver_data(device);
> >>           if (!mem_device) {
> >>               acpi_handle_err(handle, "Driver Data is NULL\n");
> >> -            break;
> >> +            goto unlock;
> >>           }
> >>
> >>           ej_event = kmalloc(sizeof(*ej_event), GFP_KERNEL);
> >>           if (!ej_event) {
> >>               pr_err(PREFIX "No memory, dropping EJECT\n");
> >> -            break;
> >> +            goto unlock;
> >>           }
> >>
> >> +        get_device(&device->dev);
> >>           ej_event->device = device;
> >>           ej_event->event = ACPI_NOTIFY_EJECT_REQUEST;
> >> -        acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
> >> -                    (void *)ej_event);
> >> +        /* The eject is carried out asynchronously. */
> >> +        status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
> >> +                         ej_event);
> >> +        if (ACPI_FAILURE(status)) {
> >> +            put_device(&device->dev);
> >> +            kfree(ej_event);
> >> +        }
> >>
> >> -        /* eject is performed asynchronously */
> >> -        return;
> >> + unlock:
> >> +        acpi_scan_lock_release();
> >> +        if (ACPI_SUCCESS(status))
> >> +            return;
> >>       default:
> >>           ACPI_DEBUG_PRINT((ACPI_DB_INFO,
> >>                     "Unsupported event [0x%x]\n", event));
> >> @@ -360,7 +379,6 @@ static void acpi_memory_device_notify(ac
> >>
> >>       /* Inform firmware that the hotplug operation has completed */
> >>       (void) acpi_evaluate_hotplug_ost(handle, event, ost_code, NULL);
> >> -    return;
> >>   }
> >>
> >>   static void acpi_memory_device_free(struct acpi_memory_device *mem_device)
> >> Index: test/drivers/acpi/processor_driver.c
> >> ===================================================================
> >> --- test.orig/drivers/acpi/processor_driver.c
> >> +++ test/drivers/acpi/processor_driver.c
> >> @@ -683,8 +683,11 @@ static void acpi_processor_hotplug_notif
> >>       struct acpi_device *device = NULL;
> >>       struct acpi_eject_event *ej_event = NULL;
> >>       u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
> >> +    acpi_status status;
> >>       int result;
> >>
> >> +    acpi_scan_lock_acquire();
> >> +
> >>       switch (event) {
> >>       case ACPI_NOTIFY_BUS_CHECK:
> >>       case ACPI_NOTIFY_DEVICE_CHECK:
> >> @@ -733,25 +736,32 @@ static void acpi_processor_hotplug_notif
> >>               break;
> >>           }
> >>
> >> +        get_device(&device->dev);
> >>           ej_event->device = device;
> >>           ej_event->event = ACPI_NOTIFY_EJECT_REQUEST;
> >> -        acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
> >> -                    (void *)ej_event);
> >> -
> >> -        /* eject is performed asynchronously */
> >> -        return;
> >> +        /* The eject is carried out asynchronously. */
> >> +        status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
> >> +                         ej_event);
> >> +        if (ACPI_FAILURE(status)) {
> >> +            put_device(&device->dev);
> >> +            kfree(ej_event);
> >> +            break;
> >> +        }
> >> +        goto out;
> >>
> >>       default:
> >>           ACPI_DEBUG_PRINT((ACPI_DB_INFO,
> >>                     "Unsupported event [0x%x]\n", event));
> >>
> >>           /* non-hotplug event; possibly handled by other handler */
> >> -        return;
> >> +        goto out;
> >>       }
> >>
> >>       /* Inform firmware that the hotplug operation has completed */
> >>       (void) acpi_evaluate_hotplug_ost(handle, event, ost_code, NULL);
> >> -    return;
> >> +
> >> + out:
> >
> >> +    acpi_scan_lock_release();;
> >
> > extra ";"
> >
> > Thanks,
> > Yasuaki Ishimatsu
> >
> >>   }
> >>
> >>   static acpi_status is_processor_device(acpi_handle handle)
> >> Index: test/drivers/acpi/container.c
> >> ===================================================================
> >> --- test.orig/drivers/acpi/container.c
> >> +++ test/drivers/acpi/container.c
> >> @@ -88,6 +88,8 @@ static void container_notify_cb(acpi_han
> >>       acpi_status status;
> >>       u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
> >>
> >> +    acpi_scan_lock_acquire();
> >> +
> >>       switch (type) {
> >>       case ACPI_NOTIFY_BUS_CHECK:
> >>           /* Fall through */
> 
>                 present = is_device_present(handle);
>                 status = acpi_bus_get_device(handle, &device);
>                 if (!present) {
>                         if (ACPI_SUCCESS(status)) {
>                                 /* device exist and this is a remove request */
>                                 device->flags.eject_pending = 1;
>                                 kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
> 
>                                 return;
> 
> It should use "goto out" instead of return.
> 
>                         }
>                         break;
>                 }

Indeed, I have overlooked that.  Thanks for spotting it!

I'll send an update with this issue fixed (and your comment from the previous
message addressed) shortly.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
  2013-02-13  0:19 [PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks Rafael J. Wysocki
  2013-02-13  1:55 ` Yinghai Lu
  2013-02-13  3:08   ` Yasuaki Ishimatsu
@ 2013-02-13 13:16 ` Rafael J. Wysocki
  2013-02-13 17:43   ` Toshi Kani
  2013-02-14 20:05   ` Yinghai Lu
  2 siblings, 2 replies; 35+ messages in thread
From: Rafael J. Wysocki @ 2013-02-13 13:16 UTC (permalink / raw)
  To: ACPI Devel Maling List
  Cc: LKML, Bjorn Helgaas, Jiang Liu, Yinghai Lu, Toshi Kani,
	Yasuaki Ishimatsu, Myron Stowe, linux-pci

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

This changeset is aimed at fixing a few different but related
problems in the ACPI hotplug infrastructure.

First of all, since notify handlers may be run in parallel with
acpi_bus_scan(), acpi_bus_trim() and acpi_bus_hot_remove_device()
and some of them are installed for ACPI handles that have no struct
acpi_device objects attached (i.e. before those objects are created),
those notify handlers have to take acpi_scan_lock to prevent races
from taking place (e.g. a struct acpi_device is found to be present
for the given ACPI handle, but right after that it is removed by
acpi_bus_trim() running in parallel to the given notify handler).
Moreover, since some of them call acpi_bus_scan() and
acpi_bus_trim(), this leads to the conclusion that acpi_scan_lock
should be acquired by the callers of these two functions rather than by
these functions themselves.

For these reasons, make all notify handlers that can handle device
addition and eject events take acpi_scan_lock and remove the
acpi_scan_lock locking from acpi_bus_scan() and acpi_bus_trim().
Accordingly, update all of their users to make sure that they
are always called under acpi_scan_lock.
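
As a rough userspace illustration of this caller-holds-the-lock convention
(not part of the patch; every demo_* name below is made up, and a pthread
mutex merely stands in for acpi_scan_lock), the lookup and the removal can
share one critical section only if the removal routine does not take the
lock on its own:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Userspace stand-ins only; none of this is the kernel's code. */
static pthread_mutex_t demo_scan_lock = PTHREAD_MUTEX_INITIALIZER;
static int demo_lock_held;

struct demo_node {
	int present;
};

static struct demo_node demo_root = { .present = 1 };

static void demo_scan_lock_acquire(void)
{
	pthread_mutex_lock(&demo_scan_lock);
	demo_lock_held = 1;
}

static void demo_scan_lock_release(void)
{
	demo_lock_held = 0;
	pthread_mutex_unlock(&demo_scan_lock);
}

/* Counterpart of a lookup by handle: 0 on success, -1 if the node is gone. */
static int demo_get_device(struct demo_node **out)
{
	if (!demo_root.present)
		return -1;
	*out = &demo_root;
	return 0;
}

/* Counterpart of a trim routine that relies on its caller for locking. */
static void demo_trim(struct demo_node *node)
{
	assert(demo_lock_held);
	node->present = 0;
}

/*
 * Notify-handler shape: the existence check and the removal sit in one
 * critical section, so the node cannot vanish between the two steps.
 */
static int demo_notify_eject(void)
{
	struct demo_node *node = NULL;
	int ret = -1;

	demo_scan_lock_acquire();
	if (!demo_get_device(&node)) {
		demo_trim(node);
		ret = 0;
	}
	demo_scan_lock_release();
	return ret;
}
```

In this sketch demo_notify_eject() returns 0 the first time and -1 once the
node is gone, and calling demo_trim() without the lock trips the assertion.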

Furthermore, since eject operations are carried out asynchronously
with respect to the notify events that trigger them, with the help
of acpi_bus_hot_remove_device(), even if notify handlers take the
ACPI scan lock, it still is possible that, for example,
acpi_bus_trim() will run between acpi_bus_hot_remove_device() and
the notify handler that scheduled its execution and that
acpi_bus_trim() will remove the device node passed to
acpi_bus_hot_remove_device() for ejection.  In that case, the struct
acpi_device object obtained by acpi_bus_hot_remove_device() will be
invalid and not-so-funny things will ensue.  To protect against that,
make the users of acpi_bus_hot_remove_device() run get_device() on
ACPI device node objects that are about to be passed to it and make
acpi_bus_hot_remove_device() run put_device() on them and check if
their ACPI handles are not NULL (make acpi_device_unregister() clear
the device nodes' ACPI handles for that check to work).
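
The idea can be sketched in plain userspace C (an illustration only, not
the kernel implementation: demo_dev, demo_get(), demo_put() and the other
demo_* names are hypothetical stand-ins for struct acpi_device,
get_device(), put_device() and friends).  The extra reference keeps the
memory valid, and the cleared handle tells the deferred routine that the
node has already been unregistered:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_dev {
	int refcount;
	void *handle;	/* cleared when the node is unregistered */
};

static int demo_freed;	/* number of nodes actually freed */

static struct demo_dev *demo_dev_create(void *handle)
{
	struct demo_dev *d = malloc(sizeof(*d));

	if (!d)
		return NULL;
	d->refcount = 1;	/* reference owned by the registration */
	d->handle = handle;
	return d;
}

static void demo_get(struct demo_dev *d)
{
	d->refcount++;
}

static void demo_put(struct demo_dev *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0) {
		free(d);
		demo_freed++;
	}
}

/* Unregistration clears the handle and drops the registration reference. */
static void demo_unregister(struct demo_dev *d)
{
	d->handle = NULL;
	demo_put(d);
}

/* Deferred removal: returns 1 if it removed the node, 0 if it was gone. */
static int demo_hot_remove(struct demo_dev *d)
{
	int removed = 0;

	if (d->handle) {
		demo_unregister(d);
		removed = 1;
	}
	demo_put(d);	/* drop the reference taken when scheduling */
	return removed;
}

/* A trim sneaks in between scheduling and execution of the removal. */
static int demo_race_scenario(void)
{
	static int fake_handle;
	struct demo_dev *d = demo_dev_create(&fake_handle);

	if (!d)
		return -1;
	demo_get(d);		/* scheduler pins the node */
	demo_unregister(d);	/* concurrent trim removes it first */
	return demo_hot_remove(d);	/* memory still valid, handle is NULL */
}
```

In demo_race_scenario() the deferred routine finds a NULL handle, skips the
removal and drops the last reference, so the node is freed exactly once.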

Finally, observe that acpi_os_hotplug_execute() actually can fail,
in which case its caller ought to free memory allocated for the
context object to prevent leaks from happening.  It also needs to
run put_device() on the device node that it ran get_device() on
previously in that case.  Modify the code accordingly.
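
A minimal userspace sketch of that error path follows (an illustration
under assumptions: demo_schedule() is a hypothetical stand-in for
acpi_os_hotplug_execute() that runs the callback immediately instead of
queuing it, and demo_pins stands in for the device reference count):

```c
#include <assert.h>
#include <stdlib.h>

static int demo_pins;	/* stands in for references held on the device */

struct demo_event {
	int event;
};

/* On success the callback owns the context and the pinned reference. */
static void demo_callback(void *ctx)
{
	struct demo_event *ev = ctx;

	assert(demo_pins > 0);
	demo_pins--;	/* put_device() counterpart */
	free(ev);
}

/* Hypothetical scheduler: fails on request, otherwise runs fn at once. */
static int demo_schedule(void (*fn)(void *), void *ctx, int should_fail)
{
	if (should_fail)
		return -1;
	fn(ctx);
	return 0;
}

static int demo_request_eject(int should_fail)
{
	struct demo_event *ev = malloc(sizeof(*ev));

	if (!ev)
		return -1;

	demo_pins++;	/* get_device() counterpart */
	ev->event = 3;	/* arbitrary event code for the sketch */

	if (demo_schedule(demo_callback, ev, should_fail)) {
		/* Nobody else will clean up, so the caller has to. */
		demo_pins--;
		free(ev);
		return -1;
	}
	return 0;
}
```

Whether scheduling succeeds or fails, demo_pins ends up back at zero and
the context is freed exactly once.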

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
---

This includes fixes for two issues spotted by Yasuaki Ishimatsu.

Thanks,
Rafael

---
 drivers/acpi/acpi_memhotplug.c     |   56 +++++++++++++++++++-----------
 drivers/acpi/container.c           |   12 ++++--
 drivers/acpi/dock.c                |   19 ++++++++--
 drivers/acpi/processor_driver.c    |   24 +++++++++---
 drivers/acpi/scan.c                |   69 +++++++++++++++++++++++++------------
 drivers/pci/hotplug/acpiphp_glue.c |    6 +++
 drivers/pci/hotplug/sgi_hotplug.c  |    5 ++
 include/acpi/acpi_bus.h            |    3 +
 8 files changed, 139 insertions(+), 55 deletions(-)

Index: test/drivers/acpi/scan.c
===================================================================
--- test.orig/drivers/acpi/scan.c
+++ test/drivers/acpi/scan.c
@@ -42,6 +42,18 @@ struct acpi_device_bus_id{
 	struct list_head node;
 };
 
+void acpi_scan_lock_acquire(void)
+{
+	mutex_lock(&acpi_scan_lock);
+}
+EXPORT_SYMBOL_GPL(acpi_scan_lock_acquire);
+
+void acpi_scan_lock_release(void)
+{
+	mutex_unlock(&acpi_scan_lock);
+}
+EXPORT_SYMBOL_GPL(acpi_scan_lock_release);
+
 int acpi_scan_add_handler(struct acpi_scan_handler *handler)
 {
 	if (!handler || !handler->attach)
@@ -95,8 +107,6 @@ acpi_device_modalias_show(struct device
 }
 static DEVICE_ATTR(modalias, 0444, acpi_device_modalias_show, NULL);
 
-static void __acpi_bus_trim(struct acpi_device *start);
-
 /**
  * acpi_bus_hot_remove_device: hot-remove a device and its children
  * @context: struct acpi_eject_event pointer (freed in this func)
@@ -107,7 +117,7 @@ static void __acpi_bus_trim(struct acpi_
  */
 void acpi_bus_hot_remove_device(void *context)
 {
-	struct acpi_eject_event *ej_event = (struct acpi_eject_event *) context;
+	struct acpi_eject_event *ej_event = context;
 	struct acpi_device *device = ej_event->device;
 	acpi_handle handle = device->handle;
 	acpi_handle temp;
@@ -118,11 +128,19 @@ void acpi_bus_hot_remove_device(void *co
 
 	mutex_lock(&acpi_scan_lock);
 
+	/* If there is no handle, the device node has been unregistered. */
+	if (!device->handle) {
+		dev_dbg(&device->dev, "ACPI handle missing\n");
+		put_device(&device->dev);
+		goto out;
+	}
+
 	ACPI_DEBUG_PRINT((ACPI_DB_INFO,
 		"Hot-removing device %s...\n", dev_name(&device->dev)));
 
-	__acpi_bus_trim(device);
-	/* Device node has been released. */
+	acpi_bus_trim(device);
+	/* Device node has been unregistered. */
+	put_device(&device->dev);
 	device = NULL;
 
 	if (ACPI_SUCCESS(acpi_get_handle(handle, "_LCK", &temp))) {
@@ -151,6 +169,7 @@ void acpi_bus_hot_remove_device(void *co
 					  ost_code, NULL);
 	}
 
+ out:
 	mutex_unlock(&acpi_scan_lock);
 	kfree(context);
 	return;
@@ -212,6 +231,7 @@ acpi_eject_store(struct device *d, struc
 		goto err;
 	}
 
+	get_device(&acpi_device->dev);
 	ej_event->device = acpi_device;
 	if (acpi_device->flags.eject_pending) {
 		/* event originated from ACPI eject notification */
@@ -224,7 +244,11 @@ acpi_eject_store(struct device *d, struc
 			ej_event->event, ACPI_OST_SC_EJECT_IN_PROGRESS, NULL);
 	}
 
-	acpi_os_hotplug_execute(acpi_bus_hot_remove_device, (void *)ej_event);
+	status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device, ej_event);
+	if (ACPI_FAILURE(status)) {
+		put_device(&acpi_device->dev);
+		kfree(ej_event);
+	}
 err:
 	return ret;
 }
@@ -779,6 +803,7 @@ static void acpi_device_unregister(struc
 	 * no more references.
 	 */
 	acpi_device_set_power(device, ACPI_STATE_D3_COLD);
+	device->handle = NULL;
 	put_device(&device->dev);
 }
 
@@ -1626,14 +1651,14 @@ static acpi_status acpi_bus_device_attac
  * there has been a real error.  There just have been no suitable ACPI objects
  * in the table trunk from which the kernel could create a device and add an
  * appropriate driver.
+ *
+ * Must be called under acpi_scan_lock.
  */
 int acpi_bus_scan(acpi_handle handle)
 {
 	void *device = NULL;
 	int error = 0;
 
-	mutex_lock(&acpi_scan_lock);
-
 	if (ACPI_SUCCESS(acpi_bus_check_add(handle, 0, NULL, &device)))
 		acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
 				    acpi_bus_check_add, NULL, NULL, &device);
@@ -1644,7 +1669,6 @@ int acpi_bus_scan(acpi_handle handle)
 		acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
 				    acpi_bus_device_attach, NULL, NULL, NULL);
 
-	mutex_unlock(&acpi_scan_lock);
 	return error;
 }
 EXPORT_SYMBOL(acpi_bus_scan);
@@ -1681,7 +1705,13 @@ static acpi_status acpi_bus_remove(acpi_
 	return AE_OK;
 }
 
-static void __acpi_bus_trim(struct acpi_device *start)
+/**
+ * acpi_bus_trim - Remove ACPI device node and all of its descendants
+ * @start: Root of the ACPI device nodes subtree to remove.
+ *
+ * Must be called under acpi_scan_lock.
+ */
+void acpi_bus_trim(struct acpi_device *start)
 {
 	/*
 	 * Execute acpi_bus_device_detach() as a post-order callback to detach
@@ -1698,13 +1728,6 @@ static void __acpi_bus_trim(struct acpi_
 			    acpi_bus_remove, NULL, NULL);
 	acpi_bus_remove(start->handle, 0, NULL, NULL);
 }
-
-void acpi_bus_trim(struct acpi_device *start)
-{
-	mutex_lock(&acpi_scan_lock);
-	__acpi_bus_trim(start);
-	mutex_unlock(&acpi_scan_lock);
-}
 EXPORT_SYMBOL_GPL(acpi_bus_trim);
 
 static int acpi_bus_scan_fixed(void)
@@ -1761,23 +1784,27 @@ int __init acpi_scan_init(void)
 	acpi_csrt_init();
 	acpi_container_init();
 
+	mutex_lock(&acpi_scan_lock);
 	/*
 	 * Enumerate devices in the ACPI namespace.
 	 */
 	result = acpi_bus_scan(ACPI_ROOT_OBJECT);
 	if (result)
-		return result;
+		goto out;
 
 	result = acpi_bus_get_device(ACPI_ROOT_OBJECT, &acpi_root);
 	if (result)
-		return result;
+		goto out;
 
 	result = acpi_bus_scan_fixed();
 	if (result) {
 		acpi_device_unregister(acpi_root);
-		return result;
+		goto out;
 	}
 
 	acpi_update_all_gpes();
-	return 0;
+
+ out:
+	mutex_unlock(&acpi_scan_lock);
+	return result;
 }
Index: test/include/acpi/acpi_bus.h
===================================================================
--- test.orig/include/acpi/acpi_bus.h
+++ test/include/acpi/acpi_bus.h
@@ -395,6 +395,9 @@ int acpi_bus_receive_event(struct acpi_b
 static inline int acpi_bus_generate_proc_event(struct acpi_device *device, u8 type, int data)
 	{ return 0; }
 #endif
+
+void acpi_scan_lock_acquire(void);
+void acpi_scan_lock_release(void);
 int acpi_scan_add_handler(struct acpi_scan_handler *handler);
 int acpi_bus_register_driver(struct acpi_driver *driver);
 void acpi_bus_unregister_driver(struct acpi_driver *driver);
Index: test/drivers/acpi/acpi_memhotplug.c
===================================================================
--- test.orig/drivers/acpi/acpi_memhotplug.c
+++ test/drivers/acpi/acpi_memhotplug.c
@@ -153,14 +153,16 @@ acpi_memory_get_device_resources(struct
 	return 0;
 }
 
-static int
-acpi_memory_get_device(acpi_handle handle,
-		       struct acpi_memory_device **mem_device)
+static int acpi_memory_get_device(acpi_handle handle,
+				  struct acpi_memory_device **mem_device)
 {
 	struct acpi_device *device = NULL;
-	int result;
+	int result = 0;
+
+	acpi_scan_lock_acquire();
 
-	if (!acpi_bus_get_device(handle, &device) && device)
+	acpi_bus_get_device(handle, &device);
+	if (device)
 		goto end;
 
 	/*
@@ -169,23 +171,28 @@ acpi_memory_get_device(acpi_handle handl
 	 */
 	result = acpi_bus_scan(handle);
 	if (result) {
-		acpi_handle_warn(handle, "Cannot add acpi bus\n");
-		return -EINVAL;
+		acpi_handle_warn(handle, "ACPI namespace scan failed\n");
+		result = -EINVAL;
+		goto out;
 	}
 	result = acpi_bus_get_device(handle, &device);
 	if (result) {
 		acpi_handle_warn(handle, "Missing device object\n");
-		return -EINVAL;
+		result = -EINVAL;
+		goto out;
 	}
 
-      end:
+ end:
 	*mem_device = acpi_driver_data(device);
 	if (!(*mem_device)) {
 		dev_err(&device->dev, "driver data not found\n");
-		return -ENODEV;
+		result = -ENODEV;
+		goto out;
 	}
 
-	return 0;
+ out:
+	acpi_scan_lock_release();
+	return result;
 }
 
 static int acpi_memory_check_device(struct acpi_memory_device *mem_device)
@@ -305,6 +312,7 @@ static void acpi_memory_device_notify(ac
 	struct acpi_device *device;
 	struct acpi_eject_event *ej_event = NULL;
 	u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
+	acpi_status status;
 
 	switch (event) {
 	case ACPI_NOTIFY_BUS_CHECK:
@@ -327,29 +335,40 @@ static void acpi_memory_device_notify(ac
 		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
 				  "\nReceived EJECT REQUEST notification for device\n"));
 
+		status = AE_ERROR;
+		acpi_scan_lock_acquire();
+
 		if (acpi_bus_get_device(handle, &device)) {
 			acpi_handle_err(handle, "Device doesn't exist\n");
-			break;
+			goto unlock;
 		}
 		mem_device = acpi_driver_data(device);
 		if (!mem_device) {
 			acpi_handle_err(handle, "Driver Data is NULL\n");
-			break;
+			goto unlock;
 		}
 
 		ej_event = kmalloc(sizeof(*ej_event), GFP_KERNEL);
 		if (!ej_event) {
 			pr_err(PREFIX "No memory, dropping EJECT\n");
-			break;
+			goto unlock;
 		}
 
+		get_device(&device->dev);
 		ej_event->device = device;
 		ej_event->event = ACPI_NOTIFY_EJECT_REQUEST;
-		acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
-					(void *)ej_event);
+		/* The eject is carried out asynchronously. */
+		status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
+						 ej_event);
+		if (ACPI_FAILURE(status)) {
+			put_device(&device->dev);
+			kfree(ej_event);
+		}
 
-		/* eject is performed asynchronously */
-		return;
+ unlock:
+		acpi_scan_lock_release();
+		if (ACPI_SUCCESS(status))
+			return;
 	default:
 		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
 				  "Unsupported event [0x%x]\n", event));
@@ -360,7 +379,6 @@ static void acpi_memory_device_notify(ac
 
 	/* Inform firmware that the hotplug operation has completed */
 	(void) acpi_evaluate_hotplug_ost(handle, event, ost_code, NULL);
-	return;
 }
 
 static void acpi_memory_device_free(struct acpi_memory_device *mem_device)
Index: test/drivers/acpi/processor_driver.c
===================================================================
--- test.orig/drivers/acpi/processor_driver.c
+++ test/drivers/acpi/processor_driver.c
@@ -683,8 +683,11 @@ static void acpi_processor_hotplug_notif
 	struct acpi_device *device = NULL;
 	struct acpi_eject_event *ej_event = NULL;
 	u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
+	acpi_status status;
 	int result;
 
+	acpi_scan_lock_acquire();
+
 	switch (event) {
 	case ACPI_NOTIFY_BUS_CHECK:
 	case ACPI_NOTIFY_DEVICE_CHECK:
@@ -733,25 +736,32 @@ static void acpi_processor_hotplug_notif
 			break;
 		}
 
+		get_device(&device->dev);
 		ej_event->device = device;
 		ej_event->event = ACPI_NOTIFY_EJECT_REQUEST;
-		acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
-					(void *)ej_event);
-
-		/* eject is performed asynchronously */
-		return;
+		/* The eject is carried out asynchronously. */
+		status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
+						 ej_event);
+		if (ACPI_FAILURE(status)) {
+			put_device(&device->dev);
+			kfree(ej_event);
+			break;
+		}
+		goto out;
 
 	default:
 		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
 				  "Unsupported event [0x%x]\n", event));
 
 		/* non-hotplug event; possibly handled by other handler */
-		return;
+		goto out;
 	}
 
 	/* Inform firmware that the hotplug operation has completed */
 	(void) acpi_evaluate_hotplug_ost(handle, event, ost_code, NULL);
-	return;
+
+ out:
+	acpi_scan_lock_release();
 }
 
 static acpi_status is_processor_device(acpi_handle handle)
Index: test/drivers/acpi/container.c
===================================================================
--- test.orig/drivers/acpi/container.c
+++ test/drivers/acpi/container.c
@@ -88,6 +88,8 @@ static void container_notify_cb(acpi_han
 	acpi_status status;
 	u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
 
+	acpi_scan_lock_acquire();
+
 	switch (type) {
 	case ACPI_NOTIFY_BUS_CHECK:
 		/* Fall through */
@@ -103,7 +105,7 @@ static void container_notify_cb(acpi_han
 				/* device exist and this is a remove request */
 				device->flags.eject_pending = 1;
 				kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
-				return;
+				goto out;
 			}
 			break;
 		}
@@ -130,18 +132,20 @@ static void container_notify_cb(acpi_han
 		if (!acpi_bus_get_device(handle, &device) && device) {
 			device->flags.eject_pending = 1;
 			kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
-			return;
+			goto out;
 		}
 		break;
 
 	default:
 		/* non-hotplug event; possibly handled by other handler */
-		return;
+		goto out;
 	}
 
 	/* Inform firmware that the hotplug operation has completed */
 	(void) acpi_evaluate_hotplug_ost(handle, type, ost_code, NULL);
-	return;
+
+ out:
+	acpi_scan_lock_release();
 }
 
 static bool is_container(acpi_handle handle)
Index: test/drivers/acpi/dock.c
===================================================================
--- test.orig/drivers/acpi/dock.c
+++ test/drivers/acpi/dock.c
@@ -744,7 +744,9 @@ static void acpi_dock_deferred_cb(void *
 {
 	struct dock_data *data = context;
 
+	acpi_scan_lock_acquire();
 	dock_notify(data->handle, data->event, data->ds);
+	acpi_scan_lock_release();
 	kfree(data);
 }
 
@@ -757,20 +759,31 @@ static int acpi_dock_notifier_call(struc
 	if (event != ACPI_NOTIFY_BUS_CHECK && event != ACPI_NOTIFY_DEVICE_CHECK
 	   && event != ACPI_NOTIFY_EJECT_REQUEST)
 		return 0;
+
+	acpi_scan_lock_acquire();
+
 	list_for_each_entry(dock_station, &dock_stations, sibling) {
 		if (dock_station->handle == handle) {
 			struct dock_data *dd;
+			acpi_status status;
 
 			dd = kmalloc(sizeof(*dd), GFP_KERNEL);
 			if (!dd)
-				return 0;
+				break;
+
 			dd->handle = handle;
 			dd->event = event;
 			dd->ds = dock_station;
-			acpi_os_hotplug_execute(acpi_dock_deferred_cb, dd);
-			return 0 ;
+			status = acpi_os_hotplug_execute(acpi_dock_deferred_cb,
+							 dd);
+			if (ACPI_FAILURE(status))
+				kfree(dd);
+
+			break;
 		}
 	}
+
+	acpi_scan_lock_release();
 	return 0;
 }
 
Index: test/drivers/pci/hotplug/acpiphp_glue.c
===================================================================
--- test.orig/drivers/pci/hotplug/acpiphp_glue.c
+++ test/drivers/pci/hotplug/acpiphp_glue.c
@@ -1218,6 +1218,8 @@ static void _handle_hotplug_event_bridge
 	handle = hp_work->handle;
 	type = hp_work->type;
 
+	acpi_scan_lock_acquire();
+
 	if (acpi_bus_get_device(handle, &device)) {
 		/* This bridge must have just been physically inserted */
 		handle_bridge_insertion(handle, type);
@@ -1295,6 +1297,7 @@ static void _handle_hotplug_event_bridge
 	}
 
 out:
+	acpi_scan_lock_release();
 	kfree(hp_work); /* allocated in handle_hotplug_event_bridge */
 }
 
@@ -1341,6 +1344,8 @@ static void _handle_hotplug_event_func(s
 
 	func = (struct acpiphp_func *)context;
 
+	acpi_scan_lock_acquire();
+
 	switch (type) {
 	case ACPI_NOTIFY_BUS_CHECK:
 		/* bus re-enumerate */
@@ -1371,6 +1376,7 @@ static void _handle_hotplug_event_func(s
 		break;
 	}
 
+	acpi_scan_lock_release();
 	kfree(hp_work); /* allocated in handle_hotplug_event_func */
 }
 
Index: test/drivers/pci/hotplug/sgi_hotplug.c
===================================================================
--- test.orig/drivers/pci/hotplug/sgi_hotplug.c
+++ test/drivers/pci/hotplug/sgi_hotplug.c
@@ -425,6 +425,7 @@ static int enable_slot(struct hotplug_sl
 			pdevice = NULL;
 		}
 
+		acpi_scan_lock_acquire();
 		/*
 		 * Walk the rootbus node's immediate children looking for
 		 * the slot's device node(s). There can be more than
@@ -458,6 +459,7 @@ static int enable_slot(struct hotplug_sl
 				}
 			}
 		}
+		acpi_scan_lock_release();
 	}
 
 	/* Call the driver for the new device */
@@ -508,6 +510,7 @@ static int disable_slot(struct hotplug_s
 		/* Get the rootbus node pointer */
 		phandle = PCI_CONTROLLER(slot->pci_bus)->acpi_handle;
 
+		acpi_scan_lock_acquire();
 		/*
 		 * Walk the rootbus node's immediate children looking for
 		 * the slot's device node(s). There can be more than
@@ -538,7 +541,7 @@ static int disable_slot(struct hotplug_s
 					acpi_bus_trim(device);
 			}
 		}
-
+		acpi_scan_lock_release();
 	}
 
 	/* Free the SN resources assigned to the Linux device.*/


* Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
  2013-02-13 13:16 ` [Update][PATCH] " Rafael J. Wysocki
@ 2013-02-13 17:43   ` Toshi Kani
  2013-02-13 20:52     ` Rafael J. Wysocki
  2013-02-14 20:05   ` Yinghai Lu
  1 sibling, 1 reply; 35+ messages in thread
From: Toshi Kani @ 2013-02-13 17:43 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: ACPI Devel Maling List, LKML, Bjorn Helgaas, Jiang Liu,
	Yinghai Lu, Yasuaki Ishimatsu, Myron Stowe, linux-pci

On Wed, 2013-02-13 at 14:16 +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> This changeset is aimed at fixing a few different but related
> problems in the ACPI hotplug infrastructure.
> 
> First of all, since notify handlers may be run in parallel with
> acpi_bus_scan(), acpi_bus_trim() and acpi_bus_hot_remove_device()
> and some of them are installed for ACPI handles that have no struct
> acpi_device objects attached (i.e. before those objects are created),
> those notify handlers have to take acpi_scan_lock to prevent races
> from taking place (e.g. a struct acpi_device is found to be present
> for the given ACPI handle, but right after that it is removed by
> acpi_bus_trim() running in parallel to the given notify handler).
> Moreover, since some of them call acpi_bus_scan() and
> acpi_bus_trim(), this leads to the conclusion that acpi_scan_lock
> should be acquired by the callers of these two functions rather than by
> these functions themselves.
> 
> For these reasons, make all notify handlers that can handle device
> addition and eject events take acpi_scan_lock and remove the
> acpi_scan_lock locking from acpi_bus_scan() and acpi_bus_trim().
> Accordingly, update all of their users to make sure that they
> are always called under acpi_scan_lock.
> 
> Furthermore, since eject operations are carried out asynchronously
> with respect to the notify events that trigger them, with the help
> of acpi_bus_hot_remove_device(), even if notify handlers take the
> ACPI scan lock, it still is possible that, for example,
> acpi_bus_trim() will run between acpi_bus_hot_remove_device() and
> the notify handler that scheduled its execution and that
> acpi_bus_trim() will remove the device node passed to
> acpi_bus_hot_remove_device() for ejection.  In that case, the struct
> acpi_device object obtained by acpi_bus_hot_remove_device() will be
> invalid and not-so-funny things will ensue.  To protect against that,
> make the users of acpi_bus_hot_remove_device() run get_device() on
> ACPI device node objects that are about to be passed to it and make
> acpi_bus_hot_remove_device() run put_device() on them and check if
> their ACPI handles are not NULL (make acpi_device_unregister() clear
> the device nodes' ACPI handles for that check to work).
> 
> Finally, observe that acpi_os_hotplug_execute() actually can fail,
> in which case its caller ought to free memory allocated for the
> context object to prevent leaks from happening.  It also needs to
> run put_device() on the device node that it ran get_device() on
> previously in that case.  Modify the code accordingly.

I am concerned with this approach.  ACPICA calls notify handlers through
kacpi_notify_wq, which has the max active set to 1.  We then use
kacpi_hotplug_wq (which also has the max active set to 1) so that a
hotplug procedure does not block the notify handlers since they can be
used for non-hotplug events as well.  Acquiring the scan lock in a
notify handler means that a hotplug procedure can block any notify
events.

So, I'd prefer the following approach.

 - Change all hot-plug procedures (i.e. both add and delete) to proceed
under kacpi_hotplug_wq by calling acpi_os_hotplug_execute(). This
serializes all hotplug procedures, and prevents blocking other notify
events.  (Ideally, we should also run all online/offline procedures
under the same work-queue, just like my RFC patchset did, but this is a
different topic for now.)

 - Revert 5993c4670 unless this change is absolutely necessary.  From
the change log, it is not clear to me why this change was needed.  It
changed acpi_bus_hot_remove_device() to take an acpi_device, instead of
an acpi_handle, which introduced a race condition with acpi_device.
acpi_bus_hot_remove_device() should take an acpi_handle, and then obtain
its acpi_device from the acpi_handle since this function is serialized.

 - Remove sanity checks with an acpi_device in the notify handlers,
which have a race condition with acpi_device.  These type-specific
checks will need to be removed when we have a common notify handler
anyway.  The notify handler can continue to check the status of ACPI
device object with an acpi_handle.  Type-specific sanity checks /
validations can be performed within a hotplug procedure, instead.
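
A minimal userspace sketch of the handle-based variant (everything here is
hypothetical: small integers stand in for ACPI handles, a plain FIFO array
stands in for kacpi_hotplug_wq, and the demo_* names are invented).  The
work item carries only the handle, and the device lookup happens when the
item runs, after all earlier items have finished:

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_WORK	8
#define DEMO_MAX_HANDLE	4

struct demo_work {
	void (*fn)(int handle);
	int handle;
};

/* Simple non-circular FIFO; enough for a short demonstration. */
static struct demo_work demo_queue[DEMO_MAX_WORK];
static size_t demo_head, demo_tail;

/* demo_present[h] != 0 means a device node exists for handle h. */
static int demo_present[DEMO_MAX_HANDLE] = { 1, 1, 1, 1 };
static int demo_removed, demo_skipped;

/* Queue a work item; returns -1 when the queue is full. */
static int demo_hotplug_execute(void (*fn)(int), int handle)
{
	if (demo_tail == DEMO_MAX_WORK)
		return -1;
	demo_queue[demo_tail].fn = fn;
	demo_queue[demo_tail].handle = handle;
	demo_tail++;
	return 0;
}

/* Run queued items strictly one at a time, in submission order. */
static void demo_run_queue(void)
{
	while (demo_head < demo_tail) {
		struct demo_work *w = &demo_queue[demo_head++];

		w->fn(w->handle);
	}
}

/* Removal procedure: the lookup is done here, not by the submitter. */
static void demo_remove(int handle)
{
	if (handle < 0 || handle >= DEMO_MAX_HANDLE ||
	    !demo_present[handle]) {
		demo_skipped++;
		return;
	}
	demo_present[handle] = 0;
	demo_removed++;
}
```

Queuing two removals for the same handle and then draining the queue
removes the node once; the second item finds nothing and backs off without
touching a stale pointer.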

Thanks,
-Toshi


* Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
  2013-02-13 17:43   ` Toshi Kani
@ 2013-02-13 20:52     ` Rafael J. Wysocki
  2013-02-13 23:09       ` Toshi Kani
  0 siblings, 1 reply; 35+ messages in thread
From: Rafael J. Wysocki @ 2013-02-13 20:52 UTC (permalink / raw)
  To: Toshi Kani
  Cc: ACPI Devel Maling List, LKML, Bjorn Helgaas, Jiang Liu,
	Yinghai Lu, Yasuaki Ishimatsu, Myron Stowe, linux-pci

On Wednesday, February 13, 2013 10:43:58 AM Toshi Kani wrote:
> On Wed, 2013-02-13 at 14:16 +0100, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > This changeset is aimed at fixing a few different but related
> > problems in the ACPI hotplug infrastructure.
> > 
> > First of all, since notify handlers may be run in parallel with
> > acpi_bus_scan(), acpi_bus_trim() and acpi_bus_hot_remove_device()
> > and some of them are installed for ACPI handles that have no struct
> > acpi_device objects attached (i.e. before those objects are created),
> > those notify handlers have to take acpi_scan_lock to prevent races
> > from taking place (e.g. a struct acpi_device is found to be present
> > for the given ACPI handle, but right after that it is removed by
> > acpi_bus_trim() running in parallel to the given notify handler).
> > Moreover, since some of them call acpi_bus_scan() and
> > acpi_bus_trim(), this leads to the conclusion that acpi_scan_lock
> > should be acquired by the callers of these two functions rather than by
> > these functions themselves.
> > 
> > For these reasons, make all notify handlers that can handle device
> > addition and eject events take acpi_scan_lock and remove the
> > acpi_scan_lock locking from acpi_bus_scan() and acpi_bus_trim().
> > Accordingly, update all of their users to make sure that they
> > are always called under acpi_scan_lock.
> > 
> > Furthermore, since eject operations are carried out asynchronously
> > with respect to the notify events that trigger them, with the help
> > of acpi_bus_hot_remove_device(), even if notify handlers take the
> > ACPI scan lock, it still is possible that, for example,
> > acpi_bus_trim() will run between acpi_bus_hot_remove_device() and
> > the notify handler that scheduled its execution and that
> > acpi_bus_trim() will remove the device node passed to
> > acpi_bus_hot_remove_device() for ejection.  In that case, the struct
> > acpi_device object obtained by acpi_bus_hot_remove_device() will be
> > invalid and not-so-funny things will ensue.  To protect against that,
> > make the users of acpi_bus_hot_remove_device() run get_device() on
> > ACPI device node objects that are about to be passed to it and make
> > acpi_bus_hot_remove_device() run put_device() on them and check if
> > their ACPI handles are not NULL (make acpi_device_unregister() clear
> > the device nodes' ACPI handles for that check to work).
> > 
> > Finally, observe that acpi_os_hotplug_execute() actually can fail,
> > in which case its caller ought to free memory allocated for the
> > context object to prevent leaks from happening.  It also needs to
> > run put_device() on the device node that it ran get_device() on
> > previously in that case.  Modify the code accordingly.
> 
> I am concerned with this approach.  ACPICA calls notify handlers through
> kacpi_notify_wq, which has the max active set to 1.  We then use
> kacpi_hotplug_wq (which also has the max active set to 1) so that a
> hotplug procedure does not block the notify handlers since they can be
> used for non-hotplug events as well.

In fact we use kacpi_hotplug_wq for a different reason.  Please read the
comment in __acpi_os_execute() for more details.

> Acquiring the scan lock in a notify handler means that a hotplug procedure
> can block any notify events.

Yes, it can.

> So, I'd prefer the following approach.
> 
>  - Change all hot-plug procedures (i.e. both add and delete) to proceed
> under kacpi_hotplug_wq by calling acpi_os_hotplug_execute(). This
> serializes all hotplug procedures, and prevents blocking other notify
> events.

Yes, we can do that.  I was thinking about doing that change, but not in v3.9.
There are simply too many notify handlers already there to do that so late in
the cycle.  And doing that for acpiphp, for example, won't be straightforward
at all.

Please think about the $subject patch as a temporary measure until we can do
something better (which we need to do anyway to reduce code duplication among
other things).

> (Ideally, we should also run all online/offline procedures
> under a same work-queue, just like my RFC patchset did, but this is a
> different topic for now.)

No, I don't think it is appropriate to run online/offline from _any_
workqueue.  In my opinion they should be run from user space.

>  - Revert 5993c4670 unless this change is absolutely necessary.  From
> the change log, it is not clear to me why this change was needed.  It
> changed acpi_bus_hot_remove_device() to take an acpi_device, instead of
> an acpi_handle, which introduced a race condition with acpi_device.
> acpi_bus_hot_remove_device() should take an acpi_handle, and then obtain
> its acpi_device from the acpi_handle since this function is serialized.

I thought about that, but actually there's no guarantee that the handle
will be valid after _EJ0 as far as I can say.  So the race condition is
going to be there anyway and using struct acpi_device just makes it easier
to avoid it.

>  - Remove sanity checks with an acpi_device in the notify handlers,
> which have a race condition with acpi_device.  These type-specific
> checks will need to be removed when we have a common notify handler
> anyway.  The notify handler can continue to check the status of ACPI
> device object with an acpi_handle.  Type-specific sanity checks /
> validations can be performed within a hotplug procedure, instead.

Well, the sanest approach here would be to queue up a work item on
kacpi_hotplug_wq if the event is of a "hotplug" type and let that work
item do all checks, run acpi_bus_scan() etc.  But not in v3.9.

For v3.9, the most straightforward and least intrusive change we can do
is the $subject patch as far as I can say.  If you can suggest something
less intrusive and more straightforward, please do.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
  2013-02-13 20:52     ` Rafael J. Wysocki
@ 2013-02-13 23:09       ` Toshi Kani
  2013-02-13 23:42         ` Rafael J. Wysocki
  0 siblings, 1 reply; 35+ messages in thread
From: Toshi Kani @ 2013-02-13 23:09 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: ACPI Devel Maling List, LKML, Bjorn Helgaas, Jiang Liu,
	Yinghai Lu, Yasuaki Ishimatsu, Myron Stowe, linux-pci

On Wed, 2013-02-13 at 21:52 +0100, Rafael J. Wysocki wrote:
> On Wednesday, February 13, 2013 10:43:58 AM Toshi Kani wrote:
> > On Wed, 2013-02-13 at 14:16 +0100, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > 
> > > This changeset is aimed at fixing a few different but related
> > > problems in the ACPI hotplug infrastructure.
> > > 
> > > First of all, since notify handlers may be run in parallel with
> > > acpi_bus_scan(), acpi_bus_trim() and acpi_bus_hot_remove_device()
> > > and some of them are installed for ACPI handles that have no struct
> > > acpi_device objects attached (i.e. before those objects are created),
> > > those notify handlers have to take acpi_scan_lock to prevent races
> > > from taking place (e.g. a struct acpi_device is found to be present
> > > for the given ACPI handle, but right after that it is removed by
> > > acpi_bus_trim() running in parallel to the given notify handler).
> > > Moreover, since some of them call acpi_bus_scan() and
> > > acpi_bus_trim(), this leads to the conclusion that acpi_scan_lock
> > > should be acquired by the callers of these two functions rather than by
> > > these functions themselves.
> > > 
> > > For these reasons, make all notify handlers that can handle device
> > > addition and eject events take acpi_scan_lock and remove the
> > > acpi_scan_lock locking from acpi_bus_scan() and acpi_bus_trim().
> > > Accordingly, update all of their users to make sure that they
> > > are always called under acpi_scan_lock.
> > > 
> > > Furthermore, since eject operations are carried out asynchronously
> > > with respect to the notify events that trigger them, with the help
> > > of acpi_bus_hot_remove_device(), even if notify handlers take the
> > > ACPI scan lock, it still is possible that, for example,
> > > acpi_bus_trim() will run between acpi_bus_hot_remove_device() and
> > > the notify handler that scheduled its execution and that
> > > acpi_bus_trim() will remove the device node passed to
> > > acpi_bus_hot_remove_device() for ejection.  In that case, the struct
> > > acpi_device object obtained by acpi_bus_hot_remove_device() will be
> > > invalid and not-so-funny things will ensue.  To protect against that,
> > > make the users of acpi_bus_hot_remove_device() run get_device() on
> > > ACPI device node objects that are about to be passed to it and make
> > > acpi_bus_hot_remove_device() run put_device() on them and check if
> > > their ACPI handles are not NULL (make acpi_device_unregister() clear
> > > the device nodes' ACPI handles for that check to work).
> > > 
> > > Finally, observe that acpi_os_hotplug_execute() actually can fail,
> > > in which case its caller ought to free memory allocated for the
> > > context object to prevent leaks from happening.  It also needs to
> > > run put_device() on the device node that it ran get_device() on
> > > previously in that case.  Modify the code accordingly.
> > 
> > I am concerned with this approach.  ACPICA calls notify handlers through
> > kacpi_notify_wq, which has the max active set to 1.  We then use
> > kacpi_hotplug_wq (which also has the max active set to 1) so that a
> > hotplug procedure does not block the notify handlers since they can be
> > used for non-hotplug events as well.
> 
> In fact we use kacpi_hotplug_wq for a different reason.  Please read the
> comment in __acpi_os_execute() for more details.

Yes, I am aware of the issue as well.

> > Acquiring the scan lock in a notify handler means that a hotplug procedure
> > can block any notify events.
> 
> Yes, it can.
> 
> > So, I'd prefer the following approach.
> > 
> >  - Change all hot-plug procedures (i.e. both add and delete) to proceed
> > under kacpi_hotplug_wq by calling acpi_os_hotplug_execute(). This
> > serializes all hotplug procedures, and prevents blocking other notify
> > events.
> 
> Yes, we can do that.  I was thinking about doing that change, but not in v3.9.
> There are simply too many notify handlers already there to do that so late in
> the cycle.  And doing that for acpiphp, for example, won't be straightforward
> at all.

Right.  I was not suggesting this approach for v3.9.

> Please think about the $subject patch as a temporary measure until we can do
> something better (which we need to do anyway to reduce code duplication among
> other things).

I am fine with the scan lock as long as it is internal.  This patch
publishes the locking interfaces to other modules, which made me worried
that this might become a long term solution.  If we need to fix this
issue for v3.9, I am OK with it as you clarified this as a temporary
solution.

> > (Ideally, we should also run all online/offline procedures
> > under a same work-queue, just like my RFC patchset did, but this is a
> > different topic for now.)
> 
> No, I don't think it is appropriate to run online/offline from _any_
> workqueue.  In my opinion they should be run from user space.

I think there are pros and cons for this.  If we use a user thread to
run online/offline procedure, we can return a result directly.  However,
if an operation takes a long time, it will block the user thread until
it is done.  In addition, we have race conditions between hotplug and
online/offline operations.  So, we may need to come up with other type
of locking if we do not use a workqueue to address it.  Having both the
scan lock and other lock in the callers would not be good.

> >  - Revert 5993c4670 unless this change is absolutely necessary.  From
> > the change log, it is not clear to me why this change was needed.  It
> > changed acpi_bus_hot_remove_device() to take an acpi_device, instead of
> > an acpi_handle, which introduced a race condition with acpi_device.
> > acpi_bus_hot_remove_device() should take an acpi_handle, and then obtain
> > its acpi_device from the acpi_handle since this function is serialized.
> 
> I thought about that, but actually there's no guarantee that the handle
> will be valid after _EJ0 as far as I can say.  So the race condition is
> going to be there anyway and using struct acpi_device just makes it easier
> to avoid it.

In theory, yes, a stale handle could be a problem, if _EJ0 performs
unload table and if ACPICA frees up its internal data structure pointed
by the handle as a result.  But we should not see such issue now since
we do not support dynamic ACPI namespace yet.

> >  - Remove sanity checks with an acpi_device in the notify handlers,
> > which have a race condition with acpi_device.  These type-specific
> > checks will need to be removed when we have a common notify handler
> > anyway.  The notify handler can continue to check the status of ACPI
> > device object with an acpi_handle.  Type-specific sanity checks /
> > validations can be performed within a hotplug procedure, instead.
> 
> Well, the sanest approach here would be to queue up a work item on
> kacpi_hotplug_wq if the event is of a "hotplug" type and let that work
> item do all checks, run acpi_bus_scan() etc.  But not in v3.9.

Agreed, and that's what I meant.

> For v3.9, the most straightforward and least intrusive change we can do
> is the $subject patch as far as I can say.  If you can suggest something
> less intrusive and more straightforward, please do.

My suggestion is to keep the scan lock internal for v3.9 and implement a
new hotplug framework (i.e. the one with user space approach) for v3.10
with a proper locking mechanism.  But, since you clarified this as a
temporary solution, I am OK with it if we need to fix it now.

Thanks,
-Toshi


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
  2013-02-13 23:09       ` Toshi Kani
@ 2013-02-13 23:42         ` Rafael J. Wysocki
  2013-02-14  0:16           ` Toshi Kani
  2013-02-14  2:31             ` Moore, Robert
  0 siblings, 2 replies; 35+ messages in thread
From: Rafael J. Wysocki @ 2013-02-13 23:42 UTC (permalink / raw)
  To: Toshi Kani
  Cc: ACPI Devel Maling List, LKML, Bjorn Helgaas, Jiang Liu,
	Yinghai Lu, Yasuaki Ishimatsu, Myron Stowe, linux-pci, Bob Moore

On Wednesday, February 13, 2013 04:09:29 PM Toshi Kani wrote:
> On Wed, 2013-02-13 at 21:52 +0100, Rafael J. Wysocki wrote:
> > On Wednesday, February 13, 2013 10:43:58 AM Toshi Kani wrote:
> > > On Wed, 2013-02-13 at 14:16 +0100, Rafael J. Wysocki wrote:
> > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > 
> > > > This changeset is aimed at fixing a few different but related
> > > > problems in the ACPI hotplug infrastructure.
> > > > 
> > > > First of all, since notify handlers may be run in parallel with
> > > > acpi_bus_scan(), acpi_bus_trim() and acpi_bus_hot_remove_device()
> > > > and some of them are installed for ACPI handles that have no struct
> > > > acpi_device objects attached (i.e. before those objects are created),
> > > > those notify handlers have to take acpi_scan_lock to prevent races
> > > > from taking place (e.g. a struct acpi_device is found to be present
> > > > for the given ACPI handle, but right after that it is removed by
> > > > acpi_bus_trim() running in parallel to the given notify handler).
> > > > Moreover, since some of them call acpi_bus_scan() and
> > > > acpi_bus_trim(), this leads to the conclusion that acpi_scan_lock
> > > > should be acquired by the callers of these two functions rather than by
> > > > these functions themselves.
> > > > 
> > > > For these reasons, make all notify handlers that can handle device
> > > > addition and eject events take acpi_scan_lock and remove the
> > > > acpi_scan_lock locking from acpi_bus_scan() and acpi_bus_trim().
> > > > Accordingly, update all of their users to make sure that they
> > > > are always called under acpi_scan_lock.
> > > > 
> > > > Furthermore, since eject operations are carried out asynchronously
> > > > with respect to the notify events that trigger them, with the help
> > > > of acpi_bus_hot_remove_device(), even if notify handlers take the
> > > > ACPI scan lock, it still is possible that, for example,
> > > > acpi_bus_trim() will run between acpi_bus_hot_remove_device() and
> > > > the notify handler that scheduled its execution and that
> > > > acpi_bus_trim() will remove the device node passed to
> > > > acpi_bus_hot_remove_device() for ejection.  In that case, the struct
> > > > acpi_device object obtained by acpi_bus_hot_remove_device() will be
> > > > invalid and not-so-funny things will ensue.  To protect against that,
> > > > make the users of acpi_bus_hot_remove_device() run get_device() on
> > > > ACPI device node objects that are about to be passed to it and make
> > > > acpi_bus_hot_remove_device() run put_device() on them and check if
> > > > their ACPI handles are not NULL (make acpi_device_unregister() clear
> > > > the device nodes' ACPI handles for that check to work).
> > > > 
> > > > Finally, observe that acpi_os_hotplug_execute() actually can fail,
> > > > in which case its caller ought to free memory allocated for the
> > > > context object to prevent leaks from happening.  It also needs to
> > > > run put_device() on the device node that it ran get_device() on
> > > > previously in that case.  Modify the code accordingly.
> > > 
> > > I am concerned with this approach.  ACPICA calls notify handlers through
> > > kacpi_notify_wq, which has the max active set to 1.  We then use
> > > kacpi_hotplug_wq (which also has the max active set to 1) so that a
> > > hotplug procedure does not block the notify handlers since they can be
> > > used for non-hotplug events as well.
> > 
> > In fact we use kacpi_hotplug_wq for a different reason.  Please read the
> > comment in __acpi_os_execute() for more details.
> 
> Yes, I am aware of the issue as well.
> 
> > > Acquiring the scan lock in a notify handler means that a hotplug procedure
> > > can block any notify events.
> > 
> > Yes, it can.
> > 
> > > So, I'd prefer the following approach.
> > > 
> > >  - Change all hot-plug procedures (i.e. both add and delete) to proceed
> > > under kacpi_hotplug_wq by calling acpi_os_hotplug_execute(). This
> > > serializes all hotplug procedures, and prevents blocking other notify
> > > events.
> > 
> > Yes, we can do that.  I was thinking about doing that change, but not in v3.9.
> > There are simply too many notify handlers already there to do that so late in
> > the cycle.  And doing that for acpiphp, for example, won't be straightforward
> > at all.
> 
> Right.  I was not suggesting this approach for v3.9.

OK

> > Please think about the $subject patch as a temporary measure until we can do
> > something better (which we need to do anyway to reduce code duplication among
> > other things).
> 
> I am fine with the scan lock as long as it is internal.  This patch
> publishes the locking interfaces to other modules, which made me worried
> that this might become a long term solution.  If we need to fix this
> issue for v3.9, I am OK with it as you clarified this as a temporary
> solution.

Yes, I'm not going to allow anyone to use acpi_scan_lock anywhere else. :-)
I'm also going to make it internal again in v3.10 if possible.

> > > (Ideally, we should also run all online/offline procedures
> > > under a same work-queue, just like my RFC patchset did, but this is a
> > > different topic for now.)
> > 
> > No, I don't think it is appropriate to run online/offline from _any_
> > workqueue.  In my opinion they should be run from user space.
> 
> I think there are pros and cons for this.  If we use a user thread to
> run online/offline procedure, we can return a result directly.  However,
> if an operation takes a long time, it will block the user thread until
> it is done.  In addition, we have race conditions between hotplug and
> online/offline operations.  So, we may need to come up with other type
> of locking if we do not use a workqueue to address it.  Having both the
> scan lock and other lock in the callers would not be good.

Well, we definitely need to make offline/online and hotplug mutually exclusive,
this way or another.  I'm hoping that the scan lock will be sufficient for
that, but I may be wrong.

> > >  - Revert 5993c4670 unless this change is absolutely necessary.  From
> > > the change log, it is not clear to me why this change was needed.  It
> > > changed acpi_bus_hot_remove_device() to take an acpi_device, instead of
> > > an acpi_handle, which introduced a race condition with acpi_device.
> > > acpi_bus_hot_remove_device() should take an acpi_handle, and then obtain
> > > its acpi_device from the acpi_handle since this function is serialized.
> > 
> > I thought about that, but actually there's no guarantee that the handle
> > will be valid after _EJ0 as far as I can say.  So the race condition is
> > going to be there anyway and using struct acpi_device just makes it easier
> > to avoid it.
> 
> In theory, yes, a stale handle could be a problem, if _EJ0 performs
> unload table and if ACPICA frees up its internal data structure pointed
> by the handle as a result.  But we should not see such issue now since
> we do not support dynamic ACPI namespace yet.

I'm waiting for information from Bob about that.  If we can assume ACPI handles
to be always valid, that will simplify things quite a bit.

> > >  - Remove sanity checks with an acpi_device in the notify handlers,
> > > which have a race condition with acpi_device.  These type-specific
> > > checks will need to be removed when we have a common notify handler
> > > anyway.  The notify handler can continue to check the status of ACPI
> > > device object with an acpi_handle.  Type-specific sanity checks /
> > > validations can be performed within a hotplug procedure, instead.
> > 
> > Well, the sanest approach here would be to queue up a work item on
> > kacpi_hotplug_wq if the event is of a "hotplug" type and let that work
> > item do all checks, run acpi_bus_scan() etc.  But not in v3.9.
> 
> Agreed, and that's what I meant.
> 
> > For v3.9, the most straightforward and least intrusive change we can do
> > is the $subject patch as far as I can say.  If you can suggest something
> > less intrusive and more straightforward, please do.
> 
> My suggestion is to keep the scan lock internal for v3.9 and implement a
> new hotplug framework (i.e. the one with user space approach) for v3.10
> with a proper locking mechanism.  But, since you clarified this as a
> temporary solution, I am OK with it if we need to fix it now.

Well, honestly, I wouldn't have posted this patch if I hadn't thought that we
needed a fix for v3.9. :-)

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
  2013-02-13 23:42         ` Rafael J. Wysocki
@ 2013-02-14  0:16           ` Toshi Kani
  2013-02-14  2:31             ` Moore, Robert
  1 sibling, 0 replies; 35+ messages in thread
From: Toshi Kani @ 2013-02-14  0:16 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: ACPI Devel Maling List, LKML, Bjorn Helgaas, Jiang Liu,
	Yinghai Lu, Yasuaki Ishimatsu, Myron Stowe, linux-pci, Bob Moore

On Thu, 2013-02-14 at 00:42 +0100, Rafael J. Wysocki wrote:
> On Wednesday, February 13, 2013 04:09:29 PM Toshi Kani wrote:
> > On Wed, 2013-02-13 at 21:52 +0100, Rafael J. Wysocki wrote:
> > > On Wednesday, February 13, 2013 10:43:58 AM Toshi Kani wrote:
> > > > On Wed, 2013-02-13 at 14:16 +0100, Rafael J. Wysocki wrote:
> > > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > > 
> > > > > This changeset is aimed at fixing a few different but related
> > > > > problems in the ACPI hotplug infrastructure.
> > > > > 
> > > > > First of all, since notify handlers may be run in parallel with
> > > > > acpi_bus_scan(), acpi_bus_trim() and acpi_bus_hot_remove_device()
> > > > > and some of them are installed for ACPI handles that have no struct
> > > > > acpi_device objects attached (i.e. before those objects are created),
> > > > > those notify handlers have to take acpi_scan_lock to prevent races
> > > > > from taking place (e.g. a struct acpi_device is found to be present
> > > > > for the given ACPI handle, but right after that it is removed by
> > > > > acpi_bus_trim() running in parallel to the given notify handler).
> > > > > Moreover, since some of them call acpi_bus_scan() and
> > > > > acpi_bus_trim(), this leads to the conclusion that acpi_scan_lock
> > > > > should be acquired by the callers of these two functions rather than by
> > > > > these functions themselves.
> > > > > 
> > > > > For these reasons, make all notify handlers that can handle device
> > > > > addition and eject events take acpi_scan_lock and remove the
> > > > > acpi_scan_lock locking from acpi_bus_scan() and acpi_bus_trim().
> > > > > Accordingly, update all of their users to make sure that they
> > > > > are always called under acpi_scan_lock.
> > > > > 
> > > > > Furthermore, since eject operations are carried out asynchronously
> > > > > with respect to the notify events that trigger them, with the help
> > > > > of acpi_bus_hot_remove_device(), even if notify handlers take the
> > > > > ACPI scan lock, it still is possible that, for example,
> > > > > acpi_bus_trim() will run between acpi_bus_hot_remove_device() and
> > > > > the notify handler that scheduled its execution and that
> > > > > acpi_bus_trim() will remove the device node passed to
> > > > > acpi_bus_hot_remove_device() for ejection.  In that case, the struct
> > > > > acpi_device object obtained by acpi_bus_hot_remove_device() will be
> > > > > invalid and not-so-funny things will ensue.  To protect against that,
> > > > > make the users of acpi_bus_hot_remove_device() run get_device() on
> > > > > ACPI device node objects that are about to be passed to it and make
> > > > > acpi_bus_hot_remove_device() run put_device() on them and check if
> > > > > their ACPI handles are not NULL (make acpi_device_unregister() clear
> > > > > the device nodes' ACPI handles for that check to work).
> > > > > 
> > > > > Finally, observe that acpi_os_hotplug_execute() actually can fail,
> > > > > in which case its caller ought to free memory allocated for the
> > > > > context object to prevent leaks from happening.  It also needs to
> > > > > run put_device() on the device node that it ran get_device() on
> > > > > previously in that case.  Modify the code accordingly.
> > > > 
> > > > I am concerned with this approach.  ACPICA calls notify handlers through
> > > > kacpi_notify_wq, which has the max active set to 1.  We then use
> > > > kacpi_hotplug_wq (which also has the max active set to 1) so that a
> > > > hotplug procedure does not block the notify handlers since they can be
> > > > used for non-hotplug events as well.
> > > 
> > > In fact we use kacpi_hotplug_wq for a different reason.  Please read the
> > > comment in __acpi_os_execute() for more details.
> > 
> > Yes, I am aware of the issue as well.
> > 
> > > > Acquiring the scan lock in a notify handler means that a hotplug procedure
> > > > can block any notify events.
> > > 
> > > Yes, it can.
> > > 
> > > > So, I'd prefer the following approach.
> > > > 
> > > >  - Change all hot-plug procedures (i.e. both add and delete) to proceed
> > > > under kacpi_hotplug_wq by calling acpi_os_hotplug_execute(). This
> > > > serializes all hotplug procedures, and prevents blocking other notify
> > > > events.
> > > 
> > > Yes, we can do that.  I was thinking about doing that change, but not in v3.9.
> > > There are simply too many notify handlers already there to do that so late in
> > > the cycle.  And doing that for acpiphp, for example, won't be straightforward
> > > at all.
> > 
> > Right.  I was not suggesting this approach for v3.9.
> 
> OK
> 
> > > Please think about the $subject patch as a temporary measure until we can do
> > > something better (which we need to do anyway to reduce code duplication among
> > > other things).
> > 
> > I am fine with the scan lock as long as it is internal.  This patch
> > publishes the locking interfaces to other modules, which made me worried
> > that this might become a long term solution.  If we need to fix this
> > issue for v3.9, I am OK with it as you clarified this as a temporary
> > solution.
> 
> Yes, I'm not going to allow anyone to use acpi_scan_lock anywhere else. :-)
> I'm also going to make it internal again in v3.10 if possible.

Sounds good. :-)

> > > > (Ideally, we should also run all online/offline procedures
> > > > under a same work-queue, just like my RFC patchset did, but this is a
> > > > different topic for now.)
> > > 
> > > No, I don't think it is appropriate to run online/offline from _any_
> > > workqueue.  In my opinion they should be run from user space.
> > 
> > I think there are pros and cons for this.  If we use a user thread to
> > run online/offline procedure, we can return a result directly.  However,
> > if an operation takes a long time, it will block the user thread until
> > it is done.  In addition, we have race conditions between hotplug and
> > online/offline operations.  So, we may need to come up with other type
> > of locking if we do not use a workqueue to address it.  Having both the
> > scan lock and other lock in the callers would not be good.
> 
> Well, we definitely need to make offline/online and hotplug mutually exclusive,
> this way or another.  I'm hoping that the scan lock will be sufficient for
> that, but I may be wrong.

Online/offline operations are independent from ACPI, so I do not think
the scan lock will be effective on this.

> > > >  - Revert 5993c4670 unless this change is absolutely necessary.  From
> > > > the change log, it is not clear to me why this change was needed.  It
> > > > changed acpi_bus_hot_remove_device() to take an acpi_device, instead of
> > > > an acpi_handle, which introduced a race condition with acpi_device.
> > > > acpi_bus_hot_remove_device() should take an acpi_handle, and then obtain
> > > > its acpi_device from the acpi_handle since this function is serialized.
> > > 
> > > I thought about that, but actually there's no guarantee that the handle
> > > will be valid after _EJ0 as far as I can say.  So the race condition is
> > > going to be there anyway and using struct acpi_device just makes it easier
> > > to avoid it.
> > 
> > In theory, yes, a stale handle could be a problem, if _EJ0 performs
> > unload table and if ACPICA frees up its internal data structure pointed
> > by the handle as a result.  But we should not see such issue now since
> > we do not support dynamic ACPI namespace yet.
> 
> I'm waiting for information from Bob about that.  If we can assume ACPI handles
> to be always valid, that will simplify things quite a bit.

Thanks for checking on this.

> > > >  - Remove sanity checks with an acpi_device in the notify handlers,
> > > > which have a race condition with acpi_device.  These type-specific
> > > > checks will need to be removed when we have a common notify handler
> > > > anyway.  The notify handler can continue to check the status of ACPI
> > > > device object with an acpi_handle.  Type-specific sanity checks /
> > > > validations can be performed within a hotplug procedure, instead.
> > > 
> > > Well, the sanest approach here would be to queue up a work item on
> > > kacpi_hotplug_wq if the event is of a "hotplug" type and let that work
> > > item do all checks, run acpi_bus_scan() etc.  But not in v3.9.
> > 
> > Agreed, and that's what I meant.
> > 
> > > For v3.9, the most straightforward and least intrusive change we can do
> > > is the $subject patch as far as I can say.  If you can suggest something
> > > less intrusive and more straightforward, please do.
> > 
> > My suggestion is to keep the scan lock internal for v3.9 and implement a
> > new hotplug framework (i.e. the one with user space approach) for v3.10
> > with a proper locking mechanism.  But, since you clarified this as a
> > temporary solution, I am OK with it if we need to fix it now.
> 
> Well, honestly, I wouldn't have posted this patch if I hadn't thought that we
> needed a fix for v3.9. :-)

Very true. :-)

Thanks,
-Toshi


^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
  2013-02-13 23:42         ` Rafael J. Wysocki
  2013-02-14  0:16           ` Toshi Kani
@ 2013-02-14  2:31             ` Moore, Robert
  1 sibling, 0 replies; 35+ messages in thread
From: Moore, Robert @ 2013-02-14  2:31 UTC (permalink / raw)
  To: Rafael J. Wysocki, Toshi Kani
  Cc: ACPI Devel Maling List, LKML, Bjorn Helgaas, Jiang Liu,
	Yinghai Lu, Yasuaki Ishimatsu, Myron Stowe, linux-pci

> > > I thought about that, but actually there's no guarantee that the
> > > handle will be valid after _EJ0 as far as I can say.  So the race
> > > condition is going to be there anyway and using struct acpi_device
> > > just makes it easier to avoid it.
> >
> > In theory, yes, a stale handle could be a problem, if _EJ0 performs
> > unload table and if ACPICA frees up its internal data structure
> > pointed by the handle as a result.  But we should not see such issue
> > now since we do not support dynamic ACPI namespace yet.
> 
> I'm waiting for information from Bob about that.  If we can assume ACPI
> handles to be always valid, that will simplify things quite a bit.

If a table is unloaded, all the namespace nodes for that table are removed from the namespace, and thus any ACPI_HANDLE pointers go stale and invalid.

Bob
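
Since handles can go stale, the patch description quoted in this thread
pins the struct acpi_device with get_device(), has
acpi_device_unregister() clear the node's handle, and lets
acpi_bus_hot_remove_device() check for a NULL handle before it drops
the reference.  The sketch below models that pattern in plain userspace
C.  It is only an illustration under stated assumptions, not the kernel
code: model_device, eject_ctx and all model_*() functions are
hypothetical names, and the queue_ok flag merely simulates
acpi_os_hotplug_execute() succeeding or failing.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct acpi_device; not a kernel type. */
struct model_device {
	int refcount;	/* models the embedded struct device refcount */
	void *handle;	/* models the ACPI handle; NULL once unregistered */
};

/* Hypothetical context object handed to the deferred eject work. */
struct eject_ctx {
	struct model_device *dev;
};

static int released;	/* counts objects whose last reference was dropped */

static struct model_device *model_get(struct model_device *dev)
{
	dev->refcount++;
	return dev;
}

static void model_put(struct model_device *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0) {
		released++;
		free(dev);
	}
}

/* Creates a device node holding one "registration" reference. */
static struct model_device *model_register(void *handle)
{
	struct model_device *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	dev->refcount = 1;
	dev->handle = handle;
	return dev;
}

/* Models unregistration: clear the handle, drop the registration ref. */
static void model_unregister(struct model_device *dev)
{
	dev->handle = NULL;
	model_put(dev);
}

/*
 * Models a notify handler scheduling an eject.  The device is pinned
 * before queuing; if queuing fails, both the reference and the context
 * object are released so that nothing leaks.
 */
static struct eject_ctx *model_schedule_eject(struct model_device *dev,
					      int queue_ok)
{
	struct eject_ctx *ctx = malloc(sizeof(*ctx));

	if (!ctx)
		return NULL;
	ctx->dev = model_get(dev);
	if (!queue_ok) {
		model_put(dev);
		free(ctx);
		return NULL;
	}
	return ctx;
}

/*
 * Models the deferred eject work.  Returns 1 if the eject went ahead,
 * 0 if the node had already been removed (handle cleared) meanwhile.
 */
static int model_hot_remove(struct eject_ctx *ctx)
{
	struct model_device *dev = ctx->dev;
	int ejected = 0;

	if (dev->handle) {
		/* Real code would trim the node and evaluate _EJ0 here. */
		model_unregister(dev);
		ejected = 1;
	}
	model_put(dev);		/* drop the scheduler's reference */
	free(ctx);
	return ejected;
}
```

In this model, if model_unregister() runs between
model_schedule_eject() and model_hot_remove(), the object is still
allocated when the deferred work inspects it, the eject is skipped
because the handle is NULL, and the final model_put() frees it.  The
model deliberately leaves out the locking (acpi_scan_lock) that the
real code relies on.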


^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
@ 2013-02-14  2:31             ` Moore, Robert
  0 siblings, 0 replies; 35+ messages in thread
From: Moore, Robert @ 2013-02-14  2:31 UTC (permalink / raw)
  To: Rafael J. Wysocki, Toshi Kani
  Cc: ACPI Devel Maling List, LKML, Bjorn Helgaas, Jiang Liu,
	Yinghai Lu, Yasuaki Ishimatsu, Myron Stowe, linux-pci

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 997 bytes --]

> > > I thought about that, but actually there's no guarantee that the
> > > handle will be valid after _EJ0 as far as I can say.  So the race
> > > condition is going to be there anyway and using struct acpi_device
> > > just makes it easier to avoid it.
> >
> > In theory, yes, a stale handle could be a problem, if _EJ0 performs
> > unload table and if ACPICA frees up its internal data structure
> > pointed by the handle as a result.  But we should not see such issue
> > now since we do not support dynamic ACPI namespace yet.
> 
> I'm waiting for information from Bob about that.  If we can assume ACPI
> handles to be always valid, that will simplify things quite a bit.

If a table is unloaded, all the namespace nodes for that table are removed from the namespace, and thus any ACPI_HANDLE pointers go stale and invalid.

Bob

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
  2013-02-14  2:31             ` Moore, Robert
  (?)
  (?)
@ 2013-02-14 12:03             ` Rafael J. Wysocki
  2013-02-14 20:45                 ` Moore, Robert
  -1 siblings, 1 reply; 35+ messages in thread
From: Rafael J. Wysocki @ 2013-02-14 12:03 UTC (permalink / raw)
  To: Moore, Robert
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, Bjorn Helgaas,
	Jiang Liu, Yinghai Lu, Yasuaki Ishimatsu, Myron Stowe, linux-pci

On Thursday, February 14, 2013 02:31:22 AM Moore, Robert wrote:
> > > > I thought about that, but actually there's no guarantee that the
> > > > handle will be valid after _EJ0 as far as I can tell.  So the race
> > > > condition is going to be there anyway and using struct acpi_device
> > > > just makes it easier to avoid it.
> > >
> > > In theory, yes, a stale handle could be a problem if _EJ0 unloads a
> > > table and ACPICA frees the internal data structure that the handle
> > > points to as a result.  But we should not see such an issue now,
> > > since we do not support a dynamic ACPI namespace yet.
> > 
> > I'm waiting for information from Bob about that.  If we can assume ACPI
> > handles to be always valid, that will simplify things quite a bit.
> 
> If a table is unloaded, all the namespace nodes for that table are removed
> from the namespace, and thus any ACPI_HANDLE pointers go stale and invalid.

OK, thanks!

To me this means that we cannot assume a handle to stay valid between
a notify handler and acpi_bus_hot_remove_device() run from a workqueue.

Is there a mechanism in ACPICA to ensure that a handle won't become stale while
a notify handler is running for it or is the OS responsible for ensuring that
_EJ0 won't be run in parallel with notify handlers for device objects being
ejected?

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
  2013-02-13 13:16 ` [Update][PATCH] " Rafael J. Wysocki
  2013-02-13 17:43   ` Toshi Kani
@ 2013-02-14 20:05   ` Yinghai Lu
  2013-02-14 20:17     ` Rafael J. Wysocki
  1 sibling, 1 reply; 35+ messages in thread
From: Yinghai Lu @ 2013-02-14 20:05 UTC (permalink / raw)
  To: Rafael J. Wysocki, Stephen Rothwell
  Cc: ACPI Devel Maling List, LKML, Bjorn Helgaas, Jiang Liu,
	Toshi Kani, Yasuaki Ishimatsu, Myron Stowe, linux-pci

[-- Attachment #1: Type: text/plain, Size: 23801 bytes --]

On Wed, Feb 13, 2013 at 5:16 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> This changeset is aimed at fixing a few different but related
> problems in the ACPI hotplug infrastructure.
>
> First of all, since notify handlers may be run in parallel with
> acpi_bus_scan(), acpi_bus_trim() and acpi_bus_hot_remove_device()
> and some of them are installed for ACPI handles that have no struct
> acpi_device objects attached (i.e. before those objects are created),
> those notify handlers have to take acpi_scan_lock to prevent races
> from taking place (e.g. a struct acpi_device is found to be present
> for the given ACPI handle, but right after that it is removed by
> acpi_bus_trim() running in parallel to the given notify handler).
> Moreover, since some of them call acpi_bus_scan() and
> acpi_bus_trim(), this leads to the conclusion that acpi_scan_lock
> should be acquired by the callers of these two functions rather than
> by these functions themselves.
>
> For these reasons, make all notify handlers that can handle device
> addition and eject events take acpi_scan_lock and remove the
> acpi_scan_lock locking from acpi_bus_scan() and acpi_bus_trim().
> Accordingly, update all of their users to make sure that they
> are always called under acpi_scan_lock.
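
The locking rule described above can be sketched in userspace C.  This is
only an illustration under stated assumptions: a pthread mutex stands in
for acpi_scan_lock, and scan_lock_held, device_present, bus_scan(),
bus_trim() and the two notify_*() helpers are hypothetical names, not
kernel code.  The point it shows is that the presence check and the
scan/trim must sit in one critical section owned by the caller.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-ins; none of these are the kernel's definitions. */
static pthread_mutex_t scan_lock = PTHREAD_MUTEX_INITIALIZER;
static bool scan_lock_held;	/* debugging aid, written only under scan_lock */
static bool device_present;	/* models "a device node exists for the handle" */

static void scan_lock_acquire(void)
{
	pthread_mutex_lock(&scan_lock);
	scan_lock_held = true;
}

static void scan_lock_release(void)
{
	scan_lock_held = false;
	pthread_mutex_unlock(&scan_lock);
}

/* Models acpi_bus_scan() after the patch: no locking inside. */
static void bus_scan(void)
{
	assert(scan_lock_held);	/* the caller must hold the lock */
	device_present = true;
}

/* Models acpi_bus_trim() after the patch: same contract. */
static void bus_trim(void)
{
	assert(scan_lock_held);	/* the caller must hold the lock */
	device_present = false;
}

/*
 * "Device check" notify handler: the presence test and the scan run in
 * one critical section, so a concurrent trim cannot slip in between.
 * Returns true if a scan was actually carried out.
 */
static bool notify_device_check(void)
{
	bool scanned = false;

	scan_lock_acquire();
	if (!device_present) {
		bus_scan();
		scanned = true;
	}
	scan_lock_release();
	return scanned;
}

/* "Eject" notify handler, following the same pattern. */
static bool notify_eject(void)
{
	bool trimmed = false;

	scan_lock_acquire();
	if (device_present) {
		bus_trim();
		trimmed = true;
	}
	scan_lock_release();
	return trimmed;
}
```

If bus_scan() took the (non-recursive) mutex itself, as acpi_bus_scan()
did before the patch, notify_device_check() would either deadlock or have
to drop the lock between the check and the scan, which reopens the race.
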
>
> Furthermore, since eject operations are carried out asynchronously
> with respect to the notify events that trigger them, with the help
> of acpi_bus_hot_remove_device(), even if notify handlers take the
> ACPI scan lock, it still is possible that, for example,
> acpi_bus_trim() will run between acpi_bus_hot_remove_device() and
> the notify handler that scheduled its execution and that
> acpi_bus_trim() will remove the device node passed to
> acpi_bus_hot_remove_device() for ejection.  In that case, the struct
> acpi_device object obtained by acpi_bus_hot_remove_device() will be
> invalid and not-so-funny things will ensue.  To protect against that,
> make the users of acpi_bus_hot_remove_device() run get_device() on
> ACPI device node objects that are about to be passed to it and make
> acpi_bus_hot_remove_device() run put_device() on them and check if
> their ACPI handles are not NULL (make acpi_device_unregister() clear
> the device nodes' ACPI handles for that check to work).
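
A minimal userspace sketch of the reference-counting scheme in the
paragraph above, under loud assumptions: struct node, node_get(),
node_put(), node_unregister() and deferred_eject() are hypothetical
stand-ins for struct acpi_device, get_device(), put_device(),
acpi_device_unregister() and acpi_bus_hot_remove_device().  The counter is
a plain int here, which is only acceptable because the sketch is
single-threaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a device node; not struct acpi_device. */
struct node {
	int refcount;	/* models the embedded struct device refcount */
	void *handle;	/* models the ACPI handle; NULL once unregistered */
};

static int nodes_freed;	/* bookkeeping for the usage example only */

static struct node *node_create(void *handle)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return NULL;
	n->refcount = 1;	/* reference owned by the registry */
	n->handle = handle;
	return n;
}

static void node_get(struct node *n)
{
	n->refcount++;
}

static void node_put(struct node *n)
{
	if (--n->refcount == 0) {
		nodes_freed++;
		free(n);
	}
}

/* Clear the handle first, then drop the registry's reference. */
static void node_unregister(struct node *n)
{
	n->handle = NULL;
	node_put(n);
}

/*
 * Deferred eject.  Returns true if the eject was carried out and false
 * if the node had already been unregistered by somebody else.  Either
 * way, the reference taken when the eject was scheduled is dropped.
 */
static bool deferred_eject(struct node *n)
{
	bool ejected = false;

	if (n->handle) {
		node_unregister(n);	/* stands in for the trim */
		ejected = true;
	}
	node_put(n);
	return ejected;
}
```

The extra reference taken before scheduling keeps the memory alive, so
the NULL-handle test in deferred_eject() reads a valid object even when a
trim ran first.  Without node_get(), that test would be a use-after-free.
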
>
> Finally, observe that acpi_os_hotplug_execute() actually can fail,
> in which case its caller ought to free memory allocated for the
> context object to prevent leaks from happening.  It also needs to
> run put_device() on the device node that it ran get_device() on
> previously in that case.  Modify the code accordingly.
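
The failure path described above could look roughly like this in a
userspace model.  Everything here is an assumption made for illustration:
queue_work() is a hypothetical stand-in for acpi_os_hotplug_execute() that
fails on request and otherwise runs the callback inline, and an int
pointer models the device node's reference count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical context object handed to the deferred worker. */
struct eject_ctx {
	int *dev_refcount;	/* models the device node's refcount */
};

static int ctx_live;	/* contexts allocated and not yet freed */

/*
 * Stand-in for acpi_os_hotplug_execute().  It can fail, in which case
 * the callback never runs.  On success it runs the callback inline for
 * simplicity; the real function defers it to a workqueue.
 */
static bool queue_work(void (*fn)(void *), void *ctx, bool simulate_failure)
{
	if (simulate_failure)
		return false;
	fn(ctx);
	return true;
}

/* Once queued, the worker owns both the context and the reference. */
static void eject_worker(void *data)
{
	struct eject_ctx *ctx = data;

	(*ctx->dev_refcount)--;	/* put_device() analogue */
	free(ctx);
	ctx_live--;
}

/* Returns true if the eject was handed over to the worker. */
static bool request_eject(int *dev_refcount, bool simulate_failure)
{
	struct eject_ctx *ctx = malloc(sizeof(*ctx));

	if (!ctx)
		return false;
	ctx_live++;

	(*dev_refcount)++;	/* get_device() analogue */
	ctx->dev_refcount = dev_refcount;

	if (!queue_work(eject_worker, ctx, simulate_failure)) {
		/* The worker will never run, so undo both steps here. */
		(*dev_refcount)--;
		free(ctx);
		ctx_live--;
		return false;
	}
	return true;
}
```

The ownership rule is the design choice worth noting: the context and the
reference belong to the caller until queueing succeeds, and to the worker
afterwards, so exactly one side releases them on every path.
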
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Acked-by: Yinghai Lu <yinghai@kernel.org>
> ---
>
> This includes fixes for two issues spotted by Yasuaki Ishimatsu.
>

This one will cause conflicts between pci/next and pm/linux-next.

Please check whether the attached fix is right.

Thanks

Yinghai

> Thanks,
> Rafael
>
> ---
>  drivers/acpi/acpi_memhotplug.c     |   56 +++++++++++++++++++-----------
>  drivers/acpi/container.c           |   12 ++++--
>  drivers/acpi/dock.c                |   19 ++++++++--
>  drivers/acpi/processor_driver.c    |   24 +++++++++---
>  drivers/acpi/scan.c                |   69 +++++++++++++++++++++++++------------
>  drivers/pci/hotplug/acpiphp_glue.c |    6 +++
>  drivers/pci/hotplug/sgi_hotplug.c  |    5 ++
>  include/acpi/acpi_bus.h            |    3 +
>  8 files changed, 139 insertions(+), 55 deletions(-)
>
> Index: test/drivers/acpi/scan.c
> ===================================================================
> --- test.orig/drivers/acpi/scan.c
> +++ test/drivers/acpi/scan.c
> @@ -42,6 +42,18 @@ struct acpi_device_bus_id{
>         struct list_head node;
>  };
>
> +void acpi_scan_lock_acquire(void)
> +{
> +       mutex_lock(&acpi_scan_lock);
> +}
> +EXPORT_SYMBOL_GPL(acpi_scan_lock_acquire);
> +
> +void acpi_scan_lock_release(void)
> +{
> +       mutex_unlock(&acpi_scan_lock);
> +}
> +EXPORT_SYMBOL_GPL(acpi_scan_lock_release);
> +
>  int acpi_scan_add_handler(struct acpi_scan_handler *handler)
>  {
>         if (!handler || !handler->attach)
> @@ -95,8 +107,6 @@ acpi_device_modalias_show(struct device
>  }
>  static DEVICE_ATTR(modalias, 0444, acpi_device_modalias_show, NULL);
>
> -static void __acpi_bus_trim(struct acpi_device *start);
> -
>  /**
>   * acpi_bus_hot_remove_device: hot-remove a device and its children
>   * @context: struct acpi_eject_event pointer (freed in this func)
> @@ -107,7 +117,7 @@ static void __acpi_bus_trim(struct acpi_
>   */
>  void acpi_bus_hot_remove_device(void *context)
>  {
> -       struct acpi_eject_event *ej_event = (struct acpi_eject_event *) context;
> +       struct acpi_eject_event *ej_event = context;
>         struct acpi_device *device = ej_event->device;
>         acpi_handle handle = device->handle;
>         acpi_handle temp;
> @@ -118,11 +128,19 @@ void acpi_bus_hot_remove_device(void *co
>
>         mutex_lock(&acpi_scan_lock);
>
> +       /* If there is no handle, the device node has been unregistered. */
> +       if (!device->handle) {
> +               dev_dbg(&device->dev, "ACPI handle missing\n");
> +               put_device(&device->dev);
> +               goto out;
> +       }
> +
>         ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>                 "Hot-removing device %s...\n", dev_name(&device->dev)));
>
> -       __acpi_bus_trim(device);
> -       /* Device node has been released. */
> +       acpi_bus_trim(device);
> +       /* Device node has been unregistered. */
> +       put_device(&device->dev);
>         device = NULL;
>
>         if (ACPI_SUCCESS(acpi_get_handle(handle, "_LCK", &temp))) {
> @@ -151,6 +169,7 @@ void acpi_bus_hot_remove_device(void *co
>                                           ost_code, NULL);
>         }
>
> + out:
>         mutex_unlock(&acpi_scan_lock);
>         kfree(context);
>         return;
> @@ -212,6 +231,7 @@ acpi_eject_store(struct device *d, struc
>                 goto err;
>         }
>
> +       get_device(&acpi_device->dev);
>         ej_event->device = acpi_device;
>         if (acpi_device->flags.eject_pending) {
>                 /* event originated from ACPI eject notification */
> @@ -224,7 +244,11 @@ acpi_eject_store(struct device *d, struc
>                         ej_event->event, ACPI_OST_SC_EJECT_IN_PROGRESS, NULL);
>         }
>
> -       acpi_os_hotplug_execute(acpi_bus_hot_remove_device, (void *)ej_event);
> +       status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device, ej_event);
> +       if (ACPI_FAILURE(status)) {
> +               put_device(&acpi_device->dev);
> +               kfree(ej_event);
> +       }
>  err:
>         return ret;
>  }
> @@ -779,6 +803,7 @@ static void acpi_device_unregister(struc
>          * no more references.
>          */
>         acpi_device_set_power(device, ACPI_STATE_D3_COLD);
> +       device->handle = NULL;
>         put_device(&device->dev);
>  }
>
> @@ -1626,14 +1651,14 @@ static acpi_status acpi_bus_device_attac
>   * there has been a real error.  There just have been no suitable ACPI objects
>   * in the table trunk from which the kernel could create a device and add an
>   * appropriate driver.
> + *
> + * Must be called under acpi_scan_lock.
>   */
>  int acpi_bus_scan(acpi_handle handle)
>  {
>         void *device = NULL;
>         int error = 0;
>
> -       mutex_lock(&acpi_scan_lock);
> -
>         if (ACPI_SUCCESS(acpi_bus_check_add(handle, 0, NULL, &device)))
>                 acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
>                                     acpi_bus_check_add, NULL, NULL, &device);
> @@ -1644,7 +1669,6 @@ int acpi_bus_scan(acpi_handle handle)
>                 acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
>                                     acpi_bus_device_attach, NULL, NULL, NULL);
>
> -       mutex_unlock(&acpi_scan_lock);
>         return error;
>  }
>  EXPORT_SYMBOL(acpi_bus_scan);
> @@ -1681,7 +1705,13 @@ static acpi_status acpi_bus_remove(acpi_
>         return AE_OK;
>  }
>
> -static void __acpi_bus_trim(struct acpi_device *start)
> +/**
> + * acpi_bus_trim - Remove ACPI device node and all of its descendants
> + * @start: Root of the ACPI device nodes subtree to remove.
> + *
> + * Must be called under acpi_scan_lock.
> + */
> +void acpi_bus_trim(struct acpi_device *start)
>  {
>         /*
>          * Execute acpi_bus_device_detach() as a post-order callback to detach
> @@ -1698,13 +1728,6 @@ static void __acpi_bus_trim(struct acpi_
>                             acpi_bus_remove, NULL, NULL);
>         acpi_bus_remove(start->handle, 0, NULL, NULL);
>  }
> -
> -void acpi_bus_trim(struct acpi_device *start)
> -{
> -       mutex_lock(&acpi_scan_lock);
> -       __acpi_bus_trim(start);
> -       mutex_unlock(&acpi_scan_lock);
> -}
>  EXPORT_SYMBOL_GPL(acpi_bus_trim);
>
>  static int acpi_bus_scan_fixed(void)
> @@ -1761,23 +1784,27 @@ int __init acpi_scan_init(void)
>         acpi_csrt_init();
>         acpi_container_init();
>
> +       mutex_lock(&acpi_scan_lock);
>         /*
>          * Enumerate devices in the ACPI namespace.
>          */
>         result = acpi_bus_scan(ACPI_ROOT_OBJECT);
>         if (result)
> -               return result;
> +               goto out;
>
>         result = acpi_bus_get_device(ACPI_ROOT_OBJECT, &acpi_root);
>         if (result)
> -               return result;
> +               goto out;
>
>         result = acpi_bus_scan_fixed();
>         if (result) {
>                 acpi_device_unregister(acpi_root);
> -               return result;
> +               goto out;
>         }
>
>         acpi_update_all_gpes();
> -       return 0;
> +
> + out:
> +       mutex_unlock(&acpi_scan_lock);
> +       return result;
>  }
> Index: test/include/acpi/acpi_bus.h
> ===================================================================
> --- test.orig/include/acpi/acpi_bus.h
> +++ test/include/acpi/acpi_bus.h
> @@ -395,6 +395,9 @@ int acpi_bus_receive_event(struct acpi_b
>  static inline int acpi_bus_generate_proc_event(struct acpi_device *device, u8 type, int data)
>         { return 0; }
>  #endif
> +
> +void acpi_scan_lock_acquire(void);
> +void acpi_scan_lock_release(void);
>  int acpi_scan_add_handler(struct acpi_scan_handler *handler);
>  int acpi_bus_register_driver(struct acpi_driver *driver);
>  void acpi_bus_unregister_driver(struct acpi_driver *driver);
> Index: test/drivers/acpi/acpi_memhotplug.c
> ===================================================================
> --- test.orig/drivers/acpi/acpi_memhotplug.c
> +++ test/drivers/acpi/acpi_memhotplug.c
> @@ -153,14 +153,16 @@ acpi_memory_get_device_resources(struct
>         return 0;
>  }
>
> -static int
> -acpi_memory_get_device(acpi_handle handle,
> -                      struct acpi_memory_device **mem_device)
> +static int acpi_memory_get_device(acpi_handle handle,
> +                                 struct acpi_memory_device **mem_device)
>  {
>         struct acpi_device *device = NULL;
> -       int result;
> +       int result = 0;
> +
> +       acpi_scan_lock_acquire();
>
> -       if (!acpi_bus_get_device(handle, &device) && device)
> +       acpi_bus_get_device(handle, &device);
> +       if (device)
>                 goto end;
>
>         /*
> @@ -169,23 +171,28 @@ acpi_memory_get_device(acpi_handle handl
>          */
>         result = acpi_bus_scan(handle);
>         if (result) {
> -               acpi_handle_warn(handle, "Cannot add acpi bus\n");
> -               return -EINVAL;
> +               acpi_handle_warn(handle, "ACPI namespace scan failed\n");
> +               result = -EINVAL;
> +               goto out;
>         }
>         result = acpi_bus_get_device(handle, &device);
>         if (result) {
>                 acpi_handle_warn(handle, "Missing device object\n");
> -               return -EINVAL;
> +               result = -EINVAL;
> +               goto out;
>         }
>
> -      end:
> + end:
>         *mem_device = acpi_driver_data(device);
>         if (!(*mem_device)) {
>                 dev_err(&device->dev, "driver data not found\n");
> -               return -ENODEV;
> +               result = -ENODEV;
> +               goto out;
>         }
>
> -       return 0;
> + out:
> +       acpi_scan_lock_release();
> +       return result;
>  }
>
>  static int acpi_memory_check_device(struct acpi_memory_device *mem_device)
> @@ -305,6 +312,7 @@ static void acpi_memory_device_notify(ac
>         struct acpi_device *device;
>         struct acpi_eject_event *ej_event = NULL;
>         u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
> +       acpi_status status;
>
>         switch (event) {
>         case ACPI_NOTIFY_BUS_CHECK:
> @@ -327,29 +335,40 @@ static void acpi_memory_device_notify(ac
>                 ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>                                   "\nReceived EJECT REQUEST notification for device\n"));
>
> +               status = AE_ERROR;
> +               acpi_scan_lock_acquire();
> +
>                 if (acpi_bus_get_device(handle, &device)) {
>                         acpi_handle_err(handle, "Device doesn't exist\n");
> -                       break;
> +                       goto unlock;
>                 }
>                 mem_device = acpi_driver_data(device);
>                 if (!mem_device) {
>                         acpi_handle_err(handle, "Driver Data is NULL\n");
> -                       break;
> +                       goto unlock;
>                 }
>
>                 ej_event = kmalloc(sizeof(*ej_event), GFP_KERNEL);
>                 if (!ej_event) {
>                         pr_err(PREFIX "No memory, dropping EJECT\n");
> -                       break;
> +                       goto unlock;
>                 }
>
> +               get_device(&device->dev);
>                 ej_event->device = device;
>                 ej_event->event = ACPI_NOTIFY_EJECT_REQUEST;
> -               acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
> -                                       (void *)ej_event);
> +               /* The eject is carried out asynchronously. */
> +               status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
> +                                                ej_event);
> +               if (ACPI_FAILURE(status)) {
> +                       put_device(&device->dev);
> +                       kfree(ej_event);
> +               }
>
> -               /* eject is performed asynchronously */
> -               return;
> + unlock:
> +               acpi_scan_lock_release();
> +               if (ACPI_SUCCESS(status))
> +                       return;
>         default:
>                 ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>                                   "Unsupported event [0x%x]\n", event));
> @@ -360,7 +379,6 @@ static void acpi_memory_device_notify(ac
>
>         /* Inform firmware that the hotplug operation has completed */
>         (void) acpi_evaluate_hotplug_ost(handle, event, ost_code, NULL);
> -       return;
>  }
>
>  static void acpi_memory_device_free(struct acpi_memory_device *mem_device)
> Index: test/drivers/acpi/processor_driver.c
> ===================================================================
> --- test.orig/drivers/acpi/processor_driver.c
> +++ test/drivers/acpi/processor_driver.c
> @@ -683,8 +683,11 @@ static void acpi_processor_hotplug_notif
>         struct acpi_device *device = NULL;
>         struct acpi_eject_event *ej_event = NULL;
>         u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
> +       acpi_status status;
>         int result;
>
> +       acpi_scan_lock_acquire();
> +
>         switch (event) {
>         case ACPI_NOTIFY_BUS_CHECK:
>         case ACPI_NOTIFY_DEVICE_CHECK:
> @@ -733,25 +736,32 @@ static void acpi_processor_hotplug_notif
>                         break;
>                 }
>
> +               get_device(&device->dev);
>                 ej_event->device = device;
>                 ej_event->event = ACPI_NOTIFY_EJECT_REQUEST;
> -               acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
> -                                       (void *)ej_event);
> -
> -               /* eject is performed asynchronously */
> -               return;
> +               /* The eject is carried out asynchronously. */
> +               status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
> +                                                ej_event);
> +               if (ACPI_FAILURE(status)) {
> +                       put_device(&device->dev);
> +                       kfree(ej_event);
> +                       break;
> +               }
> +               goto out;
>
>         default:
>                 ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>                                   "Unsupported event [0x%x]\n", event));
>
>                 /* non-hotplug event; possibly handled by other handler */
> -               return;
> +               goto out;
>         }
>
>         /* Inform firmware that the hotplug operation has completed */
>         (void) acpi_evaluate_hotplug_ost(handle, event, ost_code, NULL);
> -       return;
> +
> + out:
> +       acpi_scan_lock_release();
>  }
>
>  static acpi_status is_processor_device(acpi_handle handle)
> Index: test/drivers/acpi/container.c
> ===================================================================
> --- test.orig/drivers/acpi/container.c
> +++ test/drivers/acpi/container.c
> @@ -88,6 +88,8 @@ static void container_notify_cb(acpi_han
>         acpi_status status;
>         u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
>
> +       acpi_scan_lock_acquire();
> +
>         switch (type) {
>         case ACPI_NOTIFY_BUS_CHECK:
>                 /* Fall through */
> @@ -103,7 +105,7 @@ static void container_notify_cb(acpi_han
>                                 /* device exist and this is a remove request */
>                                 device->flags.eject_pending = 1;
>                                 kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
> -                               return;
> +                               goto out;
>                         }
>                         break;
>                 }
> @@ -130,18 +132,20 @@ static void container_notify_cb(acpi_han
>                 if (!acpi_bus_get_device(handle, &device) && device) {
>                         device->flags.eject_pending = 1;
>                         kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
> -                       return;
> +                       goto out;
>                 }
>                 break;
>
>         default:
>                 /* non-hotplug event; possibly handled by other handler */
> -               return;
> +               goto out;
>         }
>
>         /* Inform firmware that the hotplug operation has completed */
>         (void) acpi_evaluate_hotplug_ost(handle, type, ost_code, NULL);
> -       return;
> +
> + out:
> +       acpi_scan_lock_release();
>  }
>
>  static bool is_container(acpi_handle handle)
> Index: test/drivers/acpi/dock.c
> ===================================================================
> --- test.orig/drivers/acpi/dock.c
> +++ test/drivers/acpi/dock.c
> @@ -744,7 +744,9 @@ static void acpi_dock_deferred_cb(void *
>  {
>         struct dock_data *data = context;
>
> +       acpi_scan_lock_acquire();
>         dock_notify(data->handle, data->event, data->ds);
> +       acpi_scan_lock_release();
>         kfree(data);
>  }
>
> @@ -757,20 +759,31 @@ static int acpi_dock_notifier_call(struc
>         if (event != ACPI_NOTIFY_BUS_CHECK && event != ACPI_NOTIFY_DEVICE_CHECK
>            && event != ACPI_NOTIFY_EJECT_REQUEST)
>                 return 0;
> +
> +       acpi_scan_lock_acquire();
> +
>         list_for_each_entry(dock_station, &dock_stations, sibling) {
>                 if (dock_station->handle == handle) {
>                         struct dock_data *dd;
> +                       acpi_status status;
>
>                         dd = kmalloc(sizeof(*dd), GFP_KERNEL);
>                         if (!dd)
> -                               return 0;
> +                               break;
> +
>                         dd->handle = handle;
>                         dd->event = event;
>                         dd->ds = dock_station;
> -                       acpi_os_hotplug_execute(acpi_dock_deferred_cb, dd);
> -                       return 0 ;
> +                       status = acpi_os_hotplug_execute(acpi_dock_deferred_cb,
> +                                                        dd);
> +                       if (ACPI_FAILURE(status))
> +                               kfree(dd);
> +
> +                       break;
>                 }
>         }
> +
> +       acpi_scan_lock_release();
>         return 0;
>  }
>
> Index: test/drivers/pci/hotplug/acpiphp_glue.c
> ===================================================================
> --- test.orig/drivers/pci/hotplug/acpiphp_glue.c
> +++ test/drivers/pci/hotplug/acpiphp_glue.c
> @@ -1218,6 +1218,8 @@ static void _handle_hotplug_event_bridge
>         handle = hp_work->handle;
>         type = hp_work->type;
>
> +       acpi_scan_lock_acquire();
> +
>         if (acpi_bus_get_device(handle, &device)) {
>                 /* This bridge must have just been physically inserted */
>                 handle_bridge_insertion(handle, type);
> @@ -1295,6 +1297,7 @@ static void _handle_hotplug_event_bridge
>         }
>
>  out:
> +       acpi_scan_lock_release();
>         kfree(hp_work); /* allocated in handle_hotplug_event_bridge */
>  }
>
> @@ -1341,6 +1344,8 @@ static void _handle_hotplug_event_func(s
>
>         func = (struct acpiphp_func *)context;
>
> +       acpi_scan_lock_acquire();
> +
>         switch (type) {
>         case ACPI_NOTIFY_BUS_CHECK:
>                 /* bus re-enumerate */
> @@ -1371,6 +1376,7 @@ static void _handle_hotplug_event_func(s
>                 break;
>         }
>
> +       acpi_scan_lock_release();
>         kfree(hp_work); /* allocated in handle_hotplug_event_func */
>  }
>
> Index: test/drivers/pci/hotplug/sgi_hotplug.c
> ===================================================================
> --- test.orig/drivers/pci/hotplug/sgi_hotplug.c
> +++ test/drivers/pci/hotplug/sgi_hotplug.c
> @@ -425,6 +425,7 @@ static int enable_slot(struct hotplug_sl
>                         pdevice = NULL;
>                 }
>
> +               acpi_scan_lock_acquire();
>                 /*
>                  * Walk the rootbus node's immediate children looking for
>                  * the slot's device node(s). There can be more than
> @@ -458,6 +459,7 @@ static int enable_slot(struct hotplug_sl
>                                 }
>                         }
>                 }
> +               acpi_scan_lock_release();
>         }
>
>         /* Call the driver for the new device */
> @@ -508,6 +510,7 @@ static int disable_slot(struct hotplug_s
>                 /* Get the rootbus node pointer */
>                 phandle = PCI_CONTROLLER(slot->pci_bus)->acpi_handle;
>
> +               acpi_scan_lock_acquire();
>                 /*
>                  * Walk the rootbus node's immediate children looking for
>                  * the slot's device node(s). There can be more than
> @@ -538,7 +541,7 @@ static int disable_slot(struct hotplug_s
>                                         acpi_bus_trim(device);
>                         }
>                 }
> -
> +               acpi_scan_lock_release();
>         }
>
>         /* Free the SN resources assigned to the Linux device.*/
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

[-- Attachment #2: acpi_pci_merge_fix.patch --]
[-- Type: application/octet-stream, Size: 2702 bytes --]

---
 drivers/acpi/internal.h            |    4 ----
 drivers/acpi/scan.c                |    4 ----
 drivers/pci/hotplug/acpiphp_glue.c |   29 -----------------------------
 3 files changed, 37 deletions(-)

Index: linux-2.6/drivers/acpi/internal.h
===================================================================
--- linux-2.6.orig/drivers/acpi/internal.h
+++ linux-2.6/drivers/acpi/internal.h
@@ -94,11 +94,7 @@ struct acpi_ec {
 
 extern struct acpi_ec *first_ec;
 
-<<<<<<< HEAD
-int acpi_pci_root_init(void);
 void acpi_pci_root_hp_init(void);
-=======
->>>>>>> pm/linux-next
 int acpi_ec_init(void);
 int acpi_ec_ecdt_probe(void);
 int acpi_boot_ec_enable(void);
Index: linux-2.6/drivers/acpi/scan.c
===================================================================
--- linux-2.6.orig/drivers/acpi/scan.c
+++ linux-2.6/drivers/acpi/scan.c
@@ -1804,13 +1804,9 @@ int __init acpi_scan_init(void)
 
 	acpi_update_all_gpes();
 
-<<<<<<< HEAD
 	acpi_pci_root_hp_init();
 
-	return 0;
-=======
  out:
 	mutex_unlock(&acpi_scan_lock);
 	return result;
->>>>>>> pm/linux-next
 }
Index: linux-2.6/drivers/pci/hotplug/acpiphp_glue.c
===================================================================
--- linux-2.6.orig/drivers/pci/hotplug/acpiphp_glue.c
+++ linux-2.6/drivers/pci/hotplug/acpiphp_glue.c
@@ -1135,30 +1135,10 @@ static void _handle_hotplug_event_bridge
 	hp_work = container_of(work, struct acpi_hp_work, work);
 	handle = hp_work->handle;
 	type = hp_work->type;
-<<<<<<< HEAD
 	bridge = (struct acpiphp_bridge *)hp_work->context;
-=======
 
 	acpi_scan_lock_acquire();
 
-	if (acpi_bus_get_device(handle, &device)) {
-		/* This bridge must have just been physically inserted */
-		handle_bridge_insertion(handle, type);
-		goto out;
-	}
-
-	bridge = acpiphp_handle_to_bridge(handle);
-	if (type == ACPI_NOTIFY_BUS_CHECK) {
-		acpi_walk_namespace(ACPI_TYPE_DEVICE, handle, ACPI_UINT32_MAX,
-			count_sub_bridges, NULL, &num_sub_bridges, NULL);
-	}
-
-	if (!bridge && !num_sub_bridges) {
-		err("cannot get bridge info\n");
-		goto out;
-	}
->>>>>>> pm/linux-next
-
 	acpi_get_name(handle, ACPI_FULL_PATHNAME, &buffer);
 
 	switch (type) {
@@ -1213,11 +1193,7 @@ static void _handle_hotplug_event_bridge
 		break;
 	}
 
-<<<<<<< HEAD
-=======
-out:
 	acpi_scan_lock_release();
->>>>>>> pm/linux-next
 	kfree(hp_work); /* allocated in handle_hotplug_event_bridge */
 }
 
@@ -1260,13 +1236,8 @@ static void _handle_hotplug_event_func(s
 
 	acpi_get_name(handle, ACPI_FULL_PATHNAME, &buffer);
 
-<<<<<<< HEAD
-=======
-	func = (struct acpiphp_func *)context;
-
 	acpi_scan_lock_acquire();
 
->>>>>>> pm/linux-next
 	switch (type) {
 	case ACPI_NOTIFY_BUS_CHECK:
 		/* bus re-enumerate */

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
  2013-02-14 20:05   ` Yinghai Lu
@ 2013-02-14 20:17     ` Rafael J. Wysocki
  0 siblings, 0 replies; 35+ messages in thread
From: Rafael J. Wysocki @ 2013-02-14 20:17 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Stephen Rothwell, ACPI Devel Maling List, LKML, Bjorn Helgaas,
	Jiang Liu, Toshi Kani, Yasuaki Ishimatsu, Myron Stowe, linux-pci

On Thursday, February 14, 2013 12:05:43 PM Yinghai Lu wrote:
> On Wed, Feb 13, 2013 at 5:16 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > This changeset is aimed at fixing a few different but related
> > problems in the ACPI hotplug infrastructure.
> >
> > First of all, since notify handlers may be run in parallel with
> > acpi_bus_scan(), acpi_bus_trim() and acpi_bus_hot_remove_device()
> > and some of them are installed for ACPI handles that have no struct
> > acpi_device objects attached (i.e. before those objects are created),
> > those notify handlers have to take acpi_scan_lock to prevent races
> > from taking place (e.g. a struct acpi_device is found to be present
> > for the given ACPI handle, but right after that it is removed by
> > acpi_bus_trim() running in parallel to the given notify handler).
> > Moreover, since some of them call acpi_bus_scan() and
> > acpi_bus_trim(), this leads to the conclusion that acpi_scan_lock
> > should be acquired by the callers of these two functions rather than by
> > these functions themselves.
> >
> > For these reasons, make all notify handlers that can handle device
> > addition and eject events take acpi_scan_lock and remove the
> > acpi_scan_lock locking from acpi_bus_scan() and acpi_bus_trim().
> > Accordingly, update all of their users to make sure that they
> > are always called under acpi_scan_lock.
> >
> > Furthermore, since eject operations are carried out asynchronously
> > with respect to the notify events that trigger them, with the help
> > of acpi_bus_hot_remove_device(), even if notify handlers take the
> > ACPI scan lock, it still is possible that, for example,
> > acpi_bus_trim() will run between acpi_bus_hot_remove_device() and
> > the notify handler that scheduled its execution and that
> > acpi_bus_trim() will remove the device node passed to
> > acpi_bus_hot_remove_device() for ejection.  In that case, the struct
> > acpi_device object obtained by acpi_bus_hot_remove_device() will be
> > invalid and not-so-funny things will ensue.  To protect against that,
> > make the users of acpi_bus_hot_remove_device() run get_device() on
> > ACPI device node objects that are about to be passed to it and make
> > acpi_bus_hot_remove_device() run put_device() on them and check if
> > their ACPI handles are not NULL (make acpi_device_unregister() clear
> > the device nodes' ACPI handles for that check to work).
> >
> > Finally, observe that acpi_os_hotplug_execute() actually can fail,
> > in which case its caller ought to free memory allocated for the
> > context object to prevent leaks from happening.  It also needs to
> > run put_device() on the device node that it ran get_device() on
> > previously in that case.  Modify the code accordingly.
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > Acked-by: Yinghai Lu <yinghai@kernel.org>
> > ---
> >
> > This includes fixes for two issues spotted by Yasuaki Ishimatsu.
> >
> 
> This one will make pci/next and pm/linux-next conflict.
> 
> Please check if the attached fix is right.

Looks correct to me.
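
For readers following along, the last two points of the changelog quoted
above (take a reference before scheduling the asynchronous eject, and undo
everything if scheduling fails) can be illustrated with a minimal userspace
sketch.  This is not the kernel code: struct node, struct eject_ctx,
queue_eject() and the other names are hypothetical stand-ins for
struct acpi_device, the context object, acpi_os_hotplug_execute() and
get_device()/put_device().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted device node. */
struct node {
	int refcount;
};

/* Hypothetical context object handed to the deferred eject work. */
struct eject_ctx {
	struct node *dev;
};

int live_contexts;	/* number of contexts allocated and not yet freed */

void node_get(struct node *n)
{
	n->refcount++;
}

void node_put(struct node *n)
{
	assert(n->refcount > 0);
	n->refcount--;
}

/* Stand-in for queueing the work; 'can_queue' simulates success/failure. */
bool queue_eject(struct eject_ctx *ctx, bool can_queue)
{
	(void)ctx;
	return can_queue;
}

/* Deferred side: it owns both the context and the reference. */
void run_eject(struct eject_ctx *ctx)
{
	node_put(ctx->dev);
	free(ctx);
	live_contexts--;
}

/* Notify-handler side: returns the queued context, or NULL on failure. */
struct eject_ctx *schedule_eject(struct node *dev, bool can_queue)
{
	struct eject_ctx *ctx = malloc(sizeof(*ctx));

	if (!ctx)
		return NULL;
	live_contexts++;

	node_get(dev);		/* keep the node alive until the work runs */
	ctx->dev = dev;

	if (!queue_eject(ctx, can_queue)) {
		/* The work will never run, so undo both steps here. */
		node_put(dev);
		free(ctx);
		live_contexts--;
		return NULL;
	}
	return ctx;
}
```

The point of the sketch is only the ownership rule: whoever fails to hand
the context over is responsible for dropping the reference and freeing the
memory.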

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
  2013-02-14 12:03             ` Rafael J. Wysocki
  2013-02-14 20:45                 ` Moore, Robert
@ 2013-02-14 20:45                 ` Moore, Robert
  0 siblings, 0 replies; 35+ messages in thread
From: Moore, Robert @ 2013-02-14 20:45 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, Bjorn Helgaas,
	Jiang Liu, Yinghai Lu, Yasuaki Ishimatsu, Myron Stowe, linux-pci



> -----Original Message-----
> From: Rafael J. Wysocki [mailto:rjw@sisk.pl]
> Sent: Thursday, February 14, 2013 4:04 AM
> To: Moore, Robert
> Cc: Toshi Kani; ACPI Devel Maling List; LKML; Bjorn Helgaas; Jiang Liu;
> Yinghai Lu; Yasuaki Ishimatsu; Myron Stowe; linux-pci@vger.kernel.org
> Subject: Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and
> memory leaks
> 
> On Thursday, February 14, 2013 02:31:22 AM Moore, Robert wrote:
> > > > > I thought about that, but actually there's no guarantee that the
> > > > > handle will be valid after _EJ0 as far as I can say.  So the
> > > > > race condition is going to be there anyway and using struct
> > > > > acpi_device just makes it easier to avoid it.
> > > >
> > > > In theory, yes, a stale handle could be a problem, if _EJ0
> > > > performs unload table and if ACPICA frees up its internal data
> > > > structure pointed by the handle as a result.  But we should not
> > > > see such issue now since we do not support dynamic ACPI namespace
> yet.
> > >
> > > I'm waiting for information from Bob about that.  If we can assume
> > > ACPI handles to be always valid, that will simplify things quite a
> bit.
> >
> > If a table is unloaded, all the namespace nodes for that table are
> > removed from the namespace, and thus any ACPI_HANDLE pointers go stale
> and invalid.
> 
> OK, thanks!
> 
> To me this means that we cannot assume a handle to stay valid between a
> notify handler and acpi_bus_hot_remove_device() run from a workqueue.
> 
> Is there a mechanism in ACPICA to ensure that a handle won't become stale
> while a notify handler is running for it or is the OS responsible for
> ensuring that
> _EJ0 won't be run in parallel with notify handlers for device objects
> being ejected?
> 

It is up to the host.
Bob


> Rafael
> 
> 
> --
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
  2013-02-14 20:45                 ` Moore, Robert
  (?)
  (?)
@ 2013-02-14 20:59                 ` Rafael J. Wysocki
  2013-02-14 23:45                     ` Moore, Robert
  -1 siblings, 1 reply; 35+ messages in thread
From: Rafael J. Wysocki @ 2013-02-14 20:59 UTC (permalink / raw)
  To: Moore, Robert
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, Bjorn Helgaas,
	Jiang Liu, Yinghai Lu, Yasuaki Ishimatsu, Myron Stowe, linux-pci

On Thursday, February 14, 2013 08:45:14 PM Moore, Robert wrote:
> 
> > -----Original Message-----
> > From: Rafael J. Wysocki [mailto:rjw@sisk.pl]
> > Sent: Thursday, February 14, 2013 4:04 AM
> > To: Moore, Robert
> > Cc: Toshi Kani; ACPI Devel Maling List; LKML; Bjorn Helgaas; Jiang Liu;
> > Yinghai Lu; Yasuaki Ishimatsu; Myron Stowe; linux-pci@vger.kernel.org
> > Subject: Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and
> > memory leaks
> > 
> > On Thursday, February 14, 2013 02:31:22 AM Moore, Robert wrote:
> > > > > > I thought about that, but actually there's no guarantee that the
> > > > > > handle will be valid after _EJ0 as far as I can say.  So the
> > > > > > race condition is going to be there anyway and using struct
> > > > > > acpi_device just makes it easier to avoid it.
> > > > >
> > > > > In theory, yes, a stale handle could be a problem, if _EJ0
> > > > > performs unload table and if ACPICA frees up its internal data
> > > > > structure pointed by the handle as a result.  But we should not
> > > > > see such issue now since we do not support dynamic ACPI namespace
> > yet.
> > > >
> > > > I'm waiting for information from Bob about that.  If we can assume
> > > > ACPI handles to be always valid, that will simplify things quite a
> > bit.
> > >
> > > If a table is unloaded, all the namespace nodes for that table are
> > > removed from the namespace, and thus any ACPI_HANDLE pointers go stale
> > and invalid.
> > 
> > OK, thanks!
> > 
> > To me this means that we cannot assume a handle to stay valid between a
> > notify handler and acpi_bus_hot_remove_device() run from a workqueue.
> > 
> > Is there a mechanism in ACPICA to ensure that a handle won't become stale
> > while a notify handler is running for it or is the OS responsible for
> > ensuring that
> > _EJ0 won't be run in parallel with notify handlers for device objects
> > being ejected?
> > 
> 
> It is up to the host.

I was afraid that that might be the case. :-)

So far the (Linux) host has been happily ignoring that potential problem, so
I guess it can still be ignored for a while, although we'll need to address it
eventually.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
  2013-02-14 20:59                 ` Rafael J. Wysocki
  2013-02-14 23:45                     ` Moore, Robert
@ 2013-02-14 23:45                     ` Moore, Robert
  0 siblings, 0 replies; 35+ messages in thread
From: Moore, Robert @ 2013-02-14 23:45 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, Bjorn Helgaas,
	Jiang Liu, Yinghai Lu, Yasuaki Ishimatsu, Myron Stowe, linux-pci



> -----Original Message-----
> From: Rafael J. Wysocki [mailto:rjw@sisk.pl]
> Sent: Thursday, February 14, 2013 12:59 PM
> To: Moore, Robert
> Cc: Toshi Kani; ACPI Devel Maling List; LKML; Bjorn Helgaas; Jiang Liu;
> Yinghai Lu; Yasuaki Ishimatsu; Myron Stowe; linux-pci@vger.kernel.org
> Subject: Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and
> memory leaks
> 
> On Thursday, February 14, 2013 08:45:14 PM Moore, Robert wrote:
> >
> > > -----Original Message-----
> > > From: Rafael J. Wysocki [mailto:rjw@sisk.pl]
> > > Sent: Thursday, February 14, 2013 4:04 AM
> > > To: Moore, Robert
> > > Cc: Toshi Kani; ACPI Devel Maling List; LKML; Bjorn Helgaas; Jiang
> > > Liu; Yinghai Lu; Yasuaki Ishimatsu; Myron Stowe;
> > > linux-pci@vger.kernel.org
> > > Subject: Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues
> > > and memory leaks
> > >
> > > On Thursday, February 14, 2013 02:31:22 AM Moore, Robert wrote:
> > > > > > > I thought about that, but actually there's no guarantee that
> > > > > > > the handle will be valid after _EJ0 as far as I can say.  So
> > > > > > > the race condition is going to be there anyway and using
> > > > > > > struct acpi_device just makes it easier to avoid it.
> > > > > >
> > > > > > In theory, yes, a stale handle could be a problem, if _EJ0
> > > > > > performs unload table and if ACPICA frees up its internal data
> > > > > > structure pointed by the handle as a result.  But we should
> > > > > > not see such issue now since we do not support dynamic ACPI
> > > > > > namespace
> > > yet.
> > > > >
> > > > > I'm waiting for information from Bob about that.  If we can
> > > > > assume ACPI handles to be always valid, that will simplify
> > > > > things quite a
> > > bit.
> > > >
> > > > If a table is unloaded, all the namespace nodes for that table are
> > > > removed from the namespace, and thus any ACPI_HANDLE pointers go
> > > > stale
> > > and invalid.
> > >
> > > OK, thanks!
> > >
> > > To me this means that we cannot assume a handle to stay valid
> > > between a notify handler and acpi_bus_hot_remove_device() run from a
> workqueue.
> > >
> > > Is there a mechanism in ACPICA to ensure that a handle won't become
> > > stale while a notify handler is running for it or is the OS
> > > responsible for ensuring that
> > > _EJ0 won't be run in parallel with notify handlers for device
> > > objects being ejected?
> > >
> >
> > It is up to the host.
> 
> I was afraid that that might be the case. :-)
> 
> So far the (Linux) host has been happily ignoring that potential problem,
> so I guess it can still be ignored for a while, although we'll need to
> address it eventually at one point.

I would think it should be fairly simple to set up a mechanism to either tell
the driver or for the driver to figure it out -- such that the driver knows
that all handles associated with the device are now invalid. Another way to
look at it is that when the device is re-installed, the driver should
reinitialize such that it obtains new handles for the devices and subobjects
in question.
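
One possible shape for such a mechanism, purely as an illustration, is a
generation counter kept by the host: every eject bumps the counter, and a
driver compares the generation it cached with the current one before using
a handle.  The sketch below is plain userspace C under that assumption;
struct device_slot, struct cached_handle and all function names are
hypothetical and do not correspond to any ACPICA or Linux interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical host-owned slot that outlives any single device instance. */
struct device_slot {
	unsigned int generation;	/* bumped on every eject */
	bool present;
};

/* Hypothetical driver-side cache of a handle into that slot. */
struct cached_handle {
	struct device_slot *slot;
	unsigned int generation;	/* generation seen when obtained */
};

/* Obtain a fresh handle; the caller should do this after (re)insertion. */
struct cached_handle handle_get(struct device_slot *slot)
{
	struct cached_handle h = { .slot = slot, .generation = slot->generation };

	return h;
}

/* A handle is usable only if the device is present and was not ejected
 * since the handle was obtained. */
bool handle_is_current(const struct cached_handle *h)
{
	return h->slot->present && h->generation == h->slot->generation;
}

/* Eject invalidates every handle obtained so far. */
void slot_eject(struct device_slot *slot)
{
	slot->present = false;
	slot->generation++;
}

/* Re-insertion makes the slot usable again, but only for new handles. */
void slot_insert(struct device_slot *slot)
{
	slot->present = true;
}
```

With this scheme a handle obtained before an eject stays invalid even after
the device is re-inserted, which matches the idea that the driver has to
reinitialize and obtain new handles.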

Bob






> 
> Thanks,
> Rafael
> 
> 
> --
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
  2013-02-14 23:45                     ` Moore, Robert
  (?)
  (?)
@ 2013-02-15  0:23                     ` Rafael J. Wysocki
  2013-02-15  0:28                       ` Toshi Kani
  -1 siblings, 1 reply; 35+ messages in thread
From: Rafael J. Wysocki @ 2013-02-15  0:23 UTC (permalink / raw)
  To: Moore, Robert
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, Bjorn Helgaas,
	Jiang Liu, Yinghai Lu, Yasuaki Ishimatsu, Myron Stowe, linux-pci

On Thursday, February 14, 2013 11:45:27 PM Moore, Robert wrote:
> 
> > -----Original Message-----
> > From: Rafael J. Wysocki [mailto:rjw@sisk.pl]
> > Sent: Thursday, February 14, 2013 12:59 PM
> > To: Moore, Robert
> > Cc: Toshi Kani; ACPI Devel Maling List; LKML; Bjorn Helgaas; Jiang Liu;
> > Yinghai Lu; Yasuaki Ishimatsu; Myron Stowe; linux-pci@vger.kernel.org
> > Subject: Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and
> > memory leaks
> > 
> > On Thursday, February 14, 2013 08:45:14 PM Moore, Robert wrote:
> > >
> > > > -----Original Message-----
> > > > From: Rafael J. Wysocki [mailto:rjw@sisk.pl]
> > > > Sent: Thursday, February 14, 2013 4:04 AM
> > > > To: Moore, Robert
> > > > Cc: Toshi Kani; ACPI Devel Maling List; LKML; Bjorn Helgaas; Jiang
> > > > Liu; Yinghai Lu; Yasuaki Ishimatsu; Myron Stowe;
> > > > linux-pci@vger.kernel.org
> > > > Subject: Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues
> > > > and memory leaks
> > > >
> > > > On Thursday, February 14, 2013 02:31:22 AM Moore, Robert wrote:
> > > > > > > > I thought about that, but actually there's no guarantee that
> > > > > > > > the handle will be valid after _EJ0 as far as I can say.  So
> > > > > > > > the race condition is going to be there anyway and using
> > > > > > > > struct acpi_device just makes it easier to avoid it.
> > > > > > >
> > > > > > > In theory, yes, a stale handle could be a problem, if _EJ0
> > > > > > > performs unload table and if ACPICA frees up its internal data
> > > > > > > structure pointed by the handle as a result.  But we should
> > > > > > > not see such issue now since we do not support dynamic ACPI
> > > > > > > namespace
> > > > yet.
> > > > > >
> > > > > > I'm waiting for information from Bob about that.  If we can
> > > > > > assume ACPI handles to be always valid, that will simplify
> > > > > > things quite a
> > > > bit.
> > > > >
> > > > > If a table is unloaded, all the namespace nodes for that table are
> > > > > removed from the namespace, and thus any ACPI_HANDLE pointers go
> > > > > stale
> > > > and invalid.
> > > >
> > > > OK, thanks!
> > > >
> > > > To me this means that we cannot assume a handle to stay valid
> > > > between a notify handler and acpi_bus_hot_remove_device() run from a
> > workqueue.
> > > >
> > > > Is there a mechanism in ACPICA to ensure that a handle won't become
> > > > stale while a notify handler is running for it or is the OS
> > > > responsible for ensuring that
> > > > _EJ0 won't be run in parallel with notify handlers for device
> > > > objects being ejected?
> > > >
> > >
> > > It is up to the host.
> > 
> > I was afraid that that might be the case. :-)
> > 
> > So far the (Linux) host has been happily ignoring that potential problem,
> > so I guess it can still be ignored for a while, although we'll need to
> > address it eventually at one point.
> 
> I would think it should be fairly simple to set up a mechanism to either tell
> the driver or for the driver to figure it out -- such that the driver knows
> that all handles associated with the device are now invalid. Another way
> to look at it is that when the device is re-installed, the driver should
> reinitialize such that it obtains new handles for the devices and subobjects
> in question.

Unfortunately, there is quite a strong assumption in our code that ACPI
handles will not become stale before the device objects associated with them
are removed.  For this reason, we need to know in advance which handles will
become stale as a result of a table unload and remove their device objects
beforehand.

Moreover, when there's a notify handler installed for a given ACPI handle
and that handle becomes stale while the notify handler is running, we'll be
in trouble.  To avoid that we need to ensure that table unloads and notifies
will always be mutually exclusive.
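
To make the mutual exclusion requirement concrete, here is a minimal
userspace sketch, assuming a single lock that both the table unload path and
the notify path have to take.  Nothing in it is existing ACPICA or kernel
code: ns_lock, ns_table and the ns_*() functions are hypothetical, and a
slot index stands in for an ACPI handle so that the lookup itself happens
under the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_NODES 4

/* Hypothetical namespace node; it only counts handled notifications. */
struct ns_node {
	int notifications;
};

/* Hypothetical lock serializing table unloads against notify handlers. */
pthread_mutex_t ns_lock = PTHREAD_MUTEX_INITIALIZER;
struct ns_node *ns_table[MAX_NODES];

/* "Table load": create the node for a slot.  Returns 0 on success. */
int ns_load(int slot)
{
	int ret = -1;

	pthread_mutex_lock(&ns_lock);
	if (slot >= 0 && slot < MAX_NODES && !ns_table[slot]) {
		ns_table[slot] = calloc(1, sizeof(*ns_table[slot]));
		if (ns_table[slot])
			ret = 0;
	}
	pthread_mutex_unlock(&ns_lock);
	return ret;
}

/* "Table unload": the node is freed only while no handler is running. */
void ns_unload(int slot)
{
	pthread_mutex_lock(&ns_lock);
	if (slot >= 0 && slot < MAX_NODES) {
		free(ns_table[slot]);
		ns_table[slot] = NULL;
	}
	pthread_mutex_unlock(&ns_lock);
}

/* Notify path: look the node up and use it under the same lock.
 * Returns the number of notifications handled so far, or -1 if the
 * node is gone (or the slot is out of range). */
int ns_notify(int slot)
{
	int ret = -1;

	pthread_mutex_lock(&ns_lock);
	if (slot >= 0 && slot < MAX_NODES && ns_table[slot])
		ret = ++ns_table[slot]->notifications;
	pthread_mutex_unlock(&ns_lock);
	return ret;
}
```

Because the handler never holds a raw pointer outside the lock, an unload
can only happen before the lookup or after the handler is done, never in
between.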

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
  2013-02-15  0:23                     ` Rafael J. Wysocki
@ 2013-02-15  0:28                       ` Toshi Kani
  2013-02-15 12:49                         ` Rafael J. Wysocki
  0 siblings, 1 reply; 35+ messages in thread
From: Toshi Kani @ 2013-02-15  0:28 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Moore, Robert, ACPI Devel Maling List, LKML, Bjorn Helgaas,
	Jiang Liu, Yinghai Lu, Yasuaki Ishimatsu, Myron Stowe, linux-pci

On Fri, 2013-02-15 at 01:23 +0100, Rafael J. Wysocki wrote:
> On Thursday, February 14, 2013 11:45:27 PM Moore, Robert wrote:
> > 
> > > -----Original Message-----
> > > From: Rafael J. Wysocki [mailto:rjw@sisk.pl]
> > > Sent: Thursday, February 14, 2013 12:59 PM
> > > To: Moore, Robert
> > > Cc: Toshi Kani; ACPI Devel Maling List; LKML; Bjorn Helgaas; Jiang Liu;
> > > Yinghai Lu; Yasuaki Ishimatsu; Myron Stowe; linux-pci@vger.kernel.org
> > > Subject: Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and
> > > memory leaks
> > > 
> > > On Thursday, February 14, 2013 08:45:14 PM Moore, Robert wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Rafael J. Wysocki [mailto:rjw@sisk.pl]
> > > > > Sent: Thursday, February 14, 2013 4:04 AM
> > > > > To: Moore, Robert
> > > > > Cc: Toshi Kani; ACPI Devel Maling List; LKML; Bjorn Helgaas; Jiang
> > > > > Liu; Yinghai Lu; Yasuaki Ishimatsu; Myron Stowe;
> > > > > linux-pci@vger.kernel.org
> > > > > Subject: Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues
> > > > > and memory leaks
> > > > >
> > > > > On Thursday, February 14, 2013 02:31:22 AM Moore, Robert wrote:
> > > > > > > > > I thought about that, but actually there's no guarantee that
> > > > > > > > > the handle will be valid after _EJ0 as far as I can say.  So
> > > > > > > > > the race condition is going to be there anyway and using
> > > > > > > > > struct acpi_device just makes it easier to avoid it.
> > > > > > > >
> > > > > > > > In theory, yes, a stale handle could be a problem, if _EJ0
> > > > > > > > performs unload table and if ACPICA frees up its internal data
> > > > > > > > structure pointed by the handle as a result.  But we should
> > > > > > > > not see such issue now since we do not support dynamic ACPI
> > > > > > > > namespace
> > > > > yet.
> > > > > > >
> > > > > > > I'm waiting for information from Bob about that.  If we can
> > > > > > > assume ACPI handles to be always valid, that will simplify
> > > > > > > things quite a
> > > > > bit.
> > > > > >
> > > > > > If a table is unloaded, all the namespace nodes for that table are
> > > > > > removed from the namespace, and thus any ACPI_HANDLE pointers go
> > > > > > stale
> > > > > and invalid.
> > > > >
> > > > > OK, thanks!
> > > > >
> > > > > To me this means that we cannot assume a handle to stay valid
> > > > > between a notify handler and acpi_bus_hot_remove_device() run from a
> > > workqueue.
> > > > >
> > > > > Is there a mechanism in ACPICA to ensure that a handle won't become
> > > > > stale while a notify handler is running for it or is the OS
> > > > > responsible for ensuring that
> > > > > _EJ0 won't be run in parallel with notify handlers for device
> > > > > objects being ejected?
> > > > >
> > > >
> > > > It is up to the host.
> > > 
> > > I was afraid that that might be the case. :-)
> > > 
> > > So far the (Linux) host has been happily ignoring that potential problem,
> > > so I guess it can still be ignored for a while, although we'll need to
> > > address it eventually at one point.
> > 
> > I would think it should be fairly simple to setup a mechanism to either tell
> > the driver or for the driver to figure it out -- such that the driver knows
> > that all handles associated with the device are now invalid. Another way
> > to look at it is that when the device is re-installed, the driver should
> > reinitialize such that it obtains new handles for the devices and subobjects
> > in question.
> 
> Unfortunately, there is quite strong assumption in our code that ACPI handles
> will not become stale before the device objects associated with them are
> removed.  For this reason, we need to know in advance which handles will
> become stale as a result of a table unload and remove their device objects
> beforehand.
> 
> Moreover, when there's a notify handler installed for a given ACPI handle
> and that handle becomes stale while the notify handler is running, we'll be
> in trouble.  To avoid that we need to ensure that table unloads and notifies
> will always be mutually exclusive.

I wonder if we can make acpi_ns_validate_handle() actually verify whether a
given handle is valid.  This way, ACPICA can fail gracefully
(AE_BAD_PARAMETER) when a stale handle is passed to the interfaces.

Thanks,
-Toshi

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
  2013-02-15  0:28                       ` Toshi Kani
@ 2013-02-15 12:49                         ` Rafael J. Wysocki
  2013-02-15 15:18                           ` Toshi Kani
  0 siblings, 1 reply; 35+ messages in thread
From: Rafael J. Wysocki @ 2013-02-15 12:49 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Moore, Robert, ACPI Devel Maling List, LKML, Bjorn Helgaas,
	Jiang Liu, Yinghai Lu, Yasuaki Ishimatsu, Myron Stowe, linux-pci

On Thursday, February 14, 2013 05:28:02 PM Toshi Kani wrote:
> On Fri, 2013-02-15 at 01:23 +0100, Rafael J. Wysocki wrote:
> > On Thursday, February 14, 2013 11:45:27 PM Moore, Robert wrote:
> > > 
> > > > -----Original Message-----
> > > > From: Rafael J. Wysocki [mailto:rjw@sisk.pl]
> > > > Sent: Thursday, February 14, 2013 12:59 PM
> > > > To: Moore, Robert
> > > > Cc: Toshi Kani; ACPI Devel Maling List; LKML; Bjorn Helgaas; Jiang Liu;
> > > > Yinghai Lu; Yasuaki Ishimatsu; Myron Stowe; linux-pci@vger.kernel.org
> > > > Subject: Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and
> > > > memory leaks
> > > > 
> > > > On Thursday, February 14, 2013 08:45:14 PM Moore, Robert wrote:
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Rafael J. Wysocki [mailto:rjw@sisk.pl]
> > > > > > Sent: Thursday, February 14, 2013 4:04 AM
> > > > > > To: Moore, Robert
> > > > > > Cc: Toshi Kani; ACPI Devel Maling List; LKML; Bjorn Helgaas; Jiang
> > > > > > Liu; Yinghai Lu; Yasuaki Ishimatsu; Myron Stowe;
> > > > > > linux-pci@vger.kernel.org
> > > > > > Subject: Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues
> > > > > > and memory leaks
> > > > > >
> > > > > > On Thursday, February 14, 2013 02:31:22 AM Moore, Robert wrote:
> > > > > > > > > > I thought about that, but actually there's no guarantee that
> > > > > > > > > > the handle will be valid after _EJ0 as far as I can say.  So
> > > > > > > > > > the race condition is going to be there anyway and using
> > > > > > > > > > struct acpi_device just makes it easier to avoid it.
> > > > > > > > >
> > > > > > > > > In theory, yes, a stale handle could be a problem, if _EJ0
> > > > > > > > > performs unload table and if ACPICA frees up its internal data
> > > > > > > > > structure pointed by the handle as a result.  But we should
> > > > > > > > > not see such issue now since we do not support dynamic ACPI
> > > > > > > > > namespace
> > > > > > yet.
> > > > > > > >
> > > > > > > > I'm waiting for information from Bob about that.  If we can
> > > > > > > > assume ACPI handles to be always valid, that will simplify
> > > > > > > > things quite a
> > > > > > bit.
> > > > > > >
> > > > > > > If a table is unloaded, all the namespace nodes for that table are
> > > > > > > removed from the namespace, and thus any ACPI_HANDLE pointers go
> > > > > > > stale
> > > > > > and invalid.
> > > > > >
> > > > > > OK, thanks!
> > > > > >
> > > > > > To me this means that we cannot assume a handle to stay valid
> > > > > > between a notify handler and acpi_bus_hot_remove_device() run from a
> > > > workqueue.
> > > > > >
> > > > > > Is there a mechanism in ACPICA to ensure that a handle won't become
> > > > > > stale while a notify handler is running for it or is the OS
> > > > > > responsible for ensuring that
> > > > > > _EJ0 won't be run in parallel with notify handlers for device
> > > > > > objects being ejected?
> > > > > >
> > > > >
> > > > > It is up to the host.
> > > > 
> > > > I was afraid that that might be the case. :-)
> > > > 
> > > > So far the (Linux) host has been happily ignoring that potential problem,
> > > > so I guess it can still be ignored for a while, although we'll need to
> > > > address it eventually at one point.
> > > 
> > > I would think it should be fairly simple to setup a mechanism to either tell
> > > the driver or for the driver to figure it out -- such that the driver knows
> > > that all handles associated with the device are now invalid. Another way
> > > to look at it is that when the device is re-installed, the driver should
> > > reinitialize such that it obtains new handles for the devices and subobjects
> > > in question.
> > 
> > Unfortunately, there is quite strong assumption in our code that ACPI handles
> > will not become stale before the device objects associated with them are
> > removed.  For this reason, we need to know in advance which handles will
> > become stale as a result of a table unload and remove their device objects
> > beforehand.
> > 
> > Moreover, when there's a notify handler installed for a given ACPI handle
> > and that handle becomes stale while the notify handler is running, we'll be
> > in trouble.  To avoid that we need to ensure that table unloads and notifies
> > will always be mutually exclusive.
> 
> I wonder if we can make acpi_ns_validate_handle() to actually be able to
> verify if a given handle is valid.  This way, ACPICA can fail gracefully
> (AE_BAD_PARAMETER) when a stale handle is passed to the interfaces.

That'd be good, but to implement it, I think, it would be necessary to
introduce some reference counting of namespace objects such that the given
object would only be deleted after the last reference to it had been dropped.
On table unload it would just be marked as invalid, but it would stay in
memory as long as there were any references to it.

So, for example, a notify handler would start from something like
acpi_add_reference(handle), which would guarantee that the object pointed to by
handle would stay in memory, and it would finish by doing
acpi_drop_reference(handle) or a work item scheduled by it would do that.

We do that for objects based on struct device and it works well.
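
To make the idea concrete, here is a minimal userspace sketch of that scheme.
Everything in it is hypothetical: struct ref_node merely stands in for a
namespace node, and acpi_add_reference() / acpi_drop_reference() do not exist
in ACPICA today.  The counter is a plain integer for brevity; real code would
need an atomic counter or a lock around it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a namespace node; not the real ACPICA layout. */
struct ref_node {
	unsigned int refs;	/* outstanding references, namespace's own included */
	bool valid;		/* cleared when the owning table is unloaded */
};

unsigned int nodes_freed;	/* lets the sketch observe when memory is released */

struct ref_node *ref_node_create(void)
{
	struct ref_node *node = calloc(1, sizeof(*node));

	if (node) {
		node->refs = 1;		/* reference held by the namespace itself */
		node->valid = true;
	}
	return node;
}

/* Hypothetical acpi_add_reference(): keep the node in memory. */
void acpi_add_reference(struct ref_node *node)
{
	node->refs++;
}

/* Hypothetical acpi_drop_reference(): free the node on the last drop. */
void acpi_drop_reference(struct ref_node *node)
{
	assert(node->refs > 0);
	if (--node->refs == 0) {
		free(node);
		nodes_freed++;
	}
}

/* Table unload: mark the node invalid and drop the namespace's reference. */
void ref_node_unload(struct ref_node *node)
{
	node->valid = false;
	acpi_drop_reference(node);
}

/* Returns 0 for a live node and -1 for a stale one that is still pinned. */
int ref_node_validate(const struct ref_node *node)
{
	return node->valid ? 0 : -1;
}
```

With this in place, a notify handler that took a reference before the unload
can still call ref_node_validate() safely afterwards and get a clean failure
instead of touching freed memory.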

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
  2013-02-15 12:49                         ` Rafael J. Wysocki
@ 2013-02-15 15:18                           ` Toshi Kani
  2013-02-15 16:33                               ` Moore, Robert
  0 siblings, 1 reply; 35+ messages in thread
From: Toshi Kani @ 2013-02-15 15:18 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Moore, Robert, ACPI Devel Maling List, LKML, Bjorn Helgaas,
	Jiang Liu, Yinghai Lu, Yasuaki Ishimatsu, Myron Stowe, linux-pci

On Fri, 2013-02-15 at 13:49 +0100, Rafael J. Wysocki wrote:
> On Thursday, February 14, 2013 05:28:02 PM Toshi Kani wrote:
> > On Fri, 2013-02-15 at 01:23 +0100, Rafael J. Wysocki wrote:
> > > On Thursday, February 14, 2013 11:45:27 PM Moore, Robert wrote:
> > > > 
> > > > > -----Original Message-----
> > > > > From: Rafael J. Wysocki [mailto:rjw@sisk.pl]
> > > > > Sent: Thursday, February 14, 2013 12:59 PM
> > > > > To: Moore, Robert
> > > > > Cc: Toshi Kani; ACPI Devel Maling List; LKML; Bjorn Helgaas; Jiang Liu;
> > > > > Yinghai Lu; Yasuaki Ishimatsu; Myron Stowe; linux-pci@vger.kernel.org
> > > > > Subject: Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and
> > > > > memory leaks
> > > > > 
> > > > > On Thursday, February 14, 2013 08:45:14 PM Moore, Robert wrote:
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Rafael J. Wysocki [mailto:rjw@sisk.pl]
> > > > > > > Sent: Thursday, February 14, 2013 4:04 AM
> > > > > > > To: Moore, Robert
> > > > > > > Cc: Toshi Kani; ACPI Devel Maling List; LKML; Bjorn Helgaas; Jiang
> > > > > > > Liu; Yinghai Lu; Yasuaki Ishimatsu; Myron Stowe;
> > > > > > > linux-pci@vger.kernel.org
> > > > > > > Subject: Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues
> > > > > > > and memory leaks
> > > > > > >
> > > > > > > On Thursday, February 14, 2013 02:31:22 AM Moore, Robert wrote:
> > > > > > > > > > > I thought about that, but actually there's no guarantee that
> > > > > > > > > > > the handle will be valid after _EJ0 as far as I can say.  So
> > > > > > > > > > > the race condition is going to be there anyway and using
> > > > > > > > > > > struct acpi_device just makes it easier to avoid it.
> > > > > > > > > >
> > > > > > > > > > In theory, yes, a stale handle could be a problem, if _EJ0
> > > > > > > > > > performs unload table and if ACPICA frees up its internal data
> > > > > > > > > > structure pointed by the handle as a result.  But we should
> > > > > > > > > > not see such issue now since we do not support dynamic ACPI
> > > > > > > > > > namespace
> > > > > > > yet.
> > > > > > > > >
> > > > > > > > > I'm waiting for information from Bob about that.  If we can
> > > > > > > > > assume ACPI handles to be always valid, that will simplify
> > > > > > > > > things quite a
> > > > > > > bit.
> > > > > > > >
> > > > > > > > If a table is unloaded, all the namespace nodes for that table are
> > > > > > > > removed from the namespace, and thus any ACPI_HANDLE pointers go
> > > > > > > > stale
> > > > > > > and invalid.
> > > > > > >
> > > > > > > OK, thanks!
> > > > > > >
> > > > > > > To me this means that we cannot assume a handle to stay valid
> > > > > > > between a notify handler and acpi_bus_hot_remove_device() run from a
> > > > > workqueue.
> > > > > > >
> > > > > > > Is there a mechanism in ACPICA to ensure that a handle won't become
> > > > > > > stale while a notify handler is running for it or is the OS
> > > > > > > responsible for ensuring that
> > > > > > > _EJ0 won't be run in parallel with notify handlers for device
> > > > > > > objects being ejected?
> > > > > > >
> > > > > >
> > > > > > It is up to the host.
> > > > > 
> > > > > I was afraid that that might be the case. :-)
> > > > > 
> > > > > So far the (Linux) host has been happily ignoring that potential problem,
> > > > > so I guess it can still be ignored for a while, although we'll need to
> > > > > address it eventually at one point.
> > > > 
> > > > I would think it should be fairly simple to setup a mechanism to either tell
> > > > the driver or for the driver to figure it out -- such that the driver knows
> > > > that all handles associated with the device are now invalid. Another way
> > > > to look at it is that when the device is re-installed, the driver should
> > > > reinitialize such that it obtains new handles for the devices and subobjects
> > > > in question.
> > > 
> > > Unfortunately, there is quite strong assumption in our code that ACPI handles
> > > will not become stale before the device objects associated with them are
> > > removed.  For this reason, we need to know in advance which handles will
> > > become stale as a result of a table unload and remove their device objects
> > > beforehand.
> > > 
> > > Moreover, when there's a notify handler installed for a given ACPI handle
> > > and that handle becomes stale while the notify handler is running, we'll be
> > > in trouble.  To avoid that we need to ensure that table unloads and notifies
> > > will always be mutually exclusive.
> > 
> > I wonder if we can make acpi_ns_validate_handle() to actually be able to
> > verify if a given handle is valid.  This way, ACPICA can fail gracefully
> > (AE_BAD_PARAMETER) when a stale handle is passed to the interfaces.
> 
> That'd be good, but to implement it, I think, it would be necessary to
> introduce some reference counting of namespace objects such that the given
> object would only be deleted after the last reference to it had been dropped.
> On table unload it would just be marked as invalid, but it would stay in
> memory as long as there were any references to it.
> 
> So, for example, a notify handler would start from something like
> acpi_add_reference(handle), which would guarantee that the object pointed to by
> handle would stay in memory, and it would finish by doing
> acpi_drop_reference(handle) or a work item scheduled by it would do that.
> 
> We do that for objects based on struct device and it works well.

There is another way to implement it.  Since acpi_handle is defined as an
opaque value, this could be changed to an index to an array of pointers,
instead of a direct pointer.  Then we can safely invalidate an index by
invalidating the pointer associated with the index.
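
A rough userspace sketch of the indirection, just to show the shape of it.
All names here (handle_idx, handle_table, handle_alloc() and so on) are made
up for illustration and are not ACPICA interfaces; the fixed-size table and
the lack of locking are simplifications.

```c
#include <assert.h>
#include <stddef.h>

#define HANDLE_TABLE_SIZE 16

/* Hypothetical opaque handle: an index rather than a direct pointer. */
typedef unsigned int handle_idx;
#define HANDLE_INVALID ((handle_idx)-1)

/* Hypothetical stand-in for a namespace node. */
struct tbl_node {
	int id;
};

static struct tbl_node *handle_table[HANDLE_TABLE_SIZE];

/* Bind a node to the first free slot and hand out the slot index. */
handle_idx handle_alloc(struct tbl_node *node)
{
	handle_idx i;

	if (!node)
		return HANDLE_INVALID;

	for (i = 0; i < HANDLE_TABLE_SIZE; i++) {
		if (!handle_table[i]) {
			handle_table[i] = node;
			return i;
		}
	}
	return HANDLE_INVALID;	/* table full */
}

/* Translate a handle; NULL means out of range or invalidated. */
struct tbl_node *handle_resolve(handle_idx h)
{
	if (h >= HANDLE_TABLE_SIZE)
		return NULL;
	return handle_table[h];
}

/* On table unload: clear the slot so that later lookups fail cleanly. */
void handle_invalidate(handle_idx h)
{
	if (h < HANDLE_TABLE_SIZE)
		handle_table[h] = NULL;
}
```

One caveat with this simple form: once a slot is reused, an old index would
resolve to the new node, so a real implementation would probably need a
generation count alongside the index.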

Thanks,
-Toshi

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
  2013-02-15 15:18                           ` Toshi Kani
  2013-02-15 16:33                               ` Moore, Robert
@ 2013-02-15 16:33                               ` Moore, Robert
  0 siblings, 0 replies; 35+ messages in thread
From: Moore, Robert @ 2013-02-15 16:33 UTC (permalink / raw)
  To: Toshi Kani, Rafael J. Wysocki
  Cc: ACPI Devel Maling List, LKML, Bjorn Helgaas, Jiang Liu,
	Yinghai Lu, Yasuaki Ishimatsu, Myron Stowe, linux-pci, Zheng, Lv,
	Brown, Len


> > > > > > > > Is there a mechanism in ACPICA to ensure that a handle
> > > > > > > > won't become stale while a notify handler is running for
> > > > > > > > it or is the OS responsible for ensuring that
> > > > > > > > _EJ0 won't be run in parallel with notify handlers for
> > > > > > > > device objects being ejected?
> > > > > > > >
> > > > > > >
> > > > > > > It is up to the host.
> > > > > >
> > > > > > I was afraid that that might be the case. :-)
> > > > > >
> > > > > > So far the (Linux) host has been happily ignoring that
> > > > > > potential problem, so I guess it can still be ignored for a
> > > > > > while, although we'll need to address it eventually at one
> point.
> > > > >
> > > > > I would think it should be fairly simple to setup a mechanism to
> > > > > either tell the driver or for the driver to figure it out --
> > > > > such that the driver knows that all handles associated with the
> > > > > device are now invalid. Another way to look at it is that when
> > > > > the device is re-installed, the driver should reinitialize such
> > > > > that it obtains new handles for the devices and subobjects in
> question.
> > > >
> > > > Unfortunately, there is quite strong assumption in our code that
> > > > ACPI handles will not become stale before the device objects
> > > > associated with them are removed.  For this reason, we need to
> > > > know in advance which handles will become stale as a result of a
> > > > table unload and remove their device objects beforehand.
> > > >
> > > > Moreover, when there's a notify handler installed for a given ACPI
> > > > handle and that handle becomes stale while the notify handler is
> > > > running, we'll be in trouble.  To avoid that we need to ensure
> > > > that table unloads and notifies will always be mutually exclusive.
> > >
> > > I wonder if we can make acpi_ns_validate_handle() to actually be
> > > able to verify if a given handle is valid.  This way, ACPICA can
> > > fail gracefully
> > > > (AE_BAD_PARAMETER) when a stale handle is passed to the interfaces.
> >
> > That'd be good, but to implement it, I think, it would be necessary to
> > introduce some reference counting of namespace objects such that the
> > given object would only be deleted after the last reference to it had
> been dropped.
> > On table unload it would just be marked as invalid, but it would stay
> > in memory as long as there were any references to it.
> >
> > So, for example, a notify handler would start from something like
> > acpi_add_reference(handle), which would guarantee that the object
> > pointed to by handle would stay in memory, and it would finish by
> > doing
> > acpi_drop_reference(handle) or a work item scheduled by it would do
> that.
> >
> > We do that for objects based on struct device and it works well.
> 
> There is other way to implement it.  Since acpi_handle is defined as an
> opaque value, this could be changed to an index to an array of pointers,
> instead of a direct pointer.  Then we can safely invalidate an index by
> invalidating the pointer associated with the index.



We have of course thought about adding a mechanism to validate/invalidate ACPICA namespace handles. In fact, this is why the ACPI_HANDLE data type was introduced in the first place, so that if we ever wanted or needed to implement something like this, it would not break a lot of existing code.

However, we have never had a real need to implement such a mechanism, nor has there ever been a request from any of the operating systems that run ACPICA.

The existing model is that the host has knowledge of what objects will go away when a table is unloaded, and can simply assume that all related ACPI handles will go bad after the unload (and table unload is the only case where parts of the namespace go away, as far as I remember). Upon a reload of the table, the host knows to reinitialize all handles associated with the table/device.

ACPICA has a table handler mechanism, where the host handler is invoked *before* an ACPI table is actually unloaded, so the host can prepare for the unload. For example, before returning from the table handler, the host may want to synchronize by waiting for any outstanding notifies to complete, then simply ignoring any further notifies from any devices associated with the table.
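
As a rough illustration of that host-side synchronization, here is a userspace sketch using pthreads. The names (struct table_sync, notify_enter(), notify_exit(), table_unload_prepare()) are hypothetical and are not ACPICA or Linux interfaces; the assumption is that the host's table handler would call table_unload_prepare() before returning, and that each notify handler brackets its work with notify_enter()/notify_exit().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical per-table state kept by the host. */
struct table_sync {
	pthread_mutex_t lock;
	pthread_cond_t idle;		/* signalled when in_flight drops to 0 */
	unsigned int in_flight;		/* notify handlers currently running */
	bool unloading;			/* set once unload preparation has begun */
};

#define TABLE_SYNC_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, false }

/* Start of a notify handler; false means the notify must be ignored. */
bool notify_enter(struct table_sync *s)
{
	bool ok;

	pthread_mutex_lock(&s->lock);
	ok = !s->unloading;
	if (ok)
		s->in_flight++;
	pthread_mutex_unlock(&s->lock);
	return ok;
}

/* End of a notify handler; wakes a waiting unload when the last one exits. */
void notify_exit(struct table_sync *s)
{
	pthread_mutex_lock(&s->lock);
	assert(s->in_flight > 0);
	if (--s->in_flight == 0)
		pthread_cond_broadcast(&s->idle);
	pthread_mutex_unlock(&s->lock);
}

/* Called before the unload: refuse new notifies, then drain running ones. */
void table_unload_prepare(struct table_sync *s)
{
	pthread_mutex_lock(&s->lock);
	s->unloading = true;
	while (s->in_flight > 0)
		pthread_cond_wait(&s->idle, &s->lock);
	pthread_mutex_unlock(&s->lock);
}
```

Note that the sketch stops accepting new notifies first and only then waits for the running ones, since otherwise the wait might never finish under a steady stream of notifies.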

In summary, it has always been felt that a mechanism like this is not worth its fairly large implementation overhead, and that it is not really needed.

I suggest that the implementation of eject support proceed by using the existing mechanisms such as the table handler. If additional support/interfaces are needed in ACPICA, we can discuss it. However, just about the last thing I would like to do is add a level of indirection between the ACPI_HANDLE and the ACPI_NAMESPACE_NODE -- which would require a large, global change to ACPICA that would be only applicable for a single rather rare issue, the unloading of an ACPI table. Just the fact that we are discussing this in 2013 and ACPICA has been running since 1999 should confirm the rarity of this case and/or that the existing mechanism has been sufficient for other hosts that run ACPICA.

Bob





^ permalink raw reply	[flat|nested] 35+ messages in thread

aG9zdCBoYW5kbGVyIGlzIGludm9rZWQgKmJlZm9yZSogYW4gQUNQSSB0YWJsZSBpcyBhY3R1YWxs
eSB1bmxvYWRlZCwgc28gdGhlIGhvc3QgY2FuIHByZXBhcmUgZm9yIHRoZSB1bmxvYWQuIEZvciBl
eGFtcGxlLCBiZWZvcmUgcmV0dXJuaW5nIGZyb20gdGhlIHRhYmxlIGhhbmRsZXIsIHRoZSBob3N0
IG1heSB3YW50IHRvIHN5bmNocm9uaXplIGJ5IHdhaXRpbmcgZm9yIGFueSBvdXRzdGFuZGluZyBu
b3RpZmllcyB0byBjb21wbGV0ZSwgdGhlbiBzaW1wbHkgaWdub3JpbmcgYW55IGZ1cnRoZXIgbm90
aWZpZXMgZnJvbSBhbnkgZGV2aWNlcyBhc3NvY2lhdGVkIHdpdGggdGhlIHRhYmxlLg0KDQpJbiBz
dW1tYXJ5LCBpdCBoYXMgYWx3YXlzIGJlZW4gZmVsdCB0aGF0IHRoZSBmYWlybHkgbGFyZ2Ugb3Zl
cmhlYWQgb2YgaW1wbGVtZW50aW5nIGEgbWVjaGFuaXNtIGxpa2UgdGhpcyBpcyBub3Qgd29ydGgg
dGhlIGNvc3QsIGFzIHdlbGwgYXMgbm90IHJlYWxseSBuZWVkZWQuDQoNCkkgc3VnZ2VzdCB0aGF0
IHRoZSBpbXBsZW1lbnRhdGlvbiBvZiBlamVjdCBzdXBwb3J0IHByb2NlZWQgYnkgdXNpbmcgdGhl
IGV4aXN0aW5nIG1lY2hhbmlzbXMgc3VjaCBhcyB0aGUgdGFibGUgaGFuZGxlci4gSWYgYWRkaXRp
b25hbCBzdXBwb3J0L2ludGVyZmFjZXMgYXJlIG5lZWRlZCBpbiBBQ1BJQ0EsIHdlIGNhbiBkaXNj
dXNzIGl0LiBIb3dldmVyLCBqdXN0IGFib3V0IHRoZSBsYXN0IHRoaW5nIEkgd291bGQgbGlrZSB0
byBkbyBpcyBhZGQgYSBsZXZlbCBvZiBpbmRpcmVjdGlvbiBiZXR3ZWVuIHRoZSBBQ1BJX0hBTkRM
RSBhbmQgdGhlIEFDUElfTkFNRVNQQUNFX05PREUgLS0gd2hpY2ggd291bGQgcmVxdWlyZSBhIGxh
cmdlLCBnbG9iYWwgY2hhbmdlIHRvIEFDUElDQSB0aGF0IHdvdWxkIGJlIG9ubHkgYXBwbGljYWJs
ZSBmb3IgYSBzaW5nbGUgcmF0aGVyIHJhcmUgaXNzdWUsIHRoZSB1bmxvYWRpbmcgb2YgYW4gQUNQ
SSB0YWJsZS4gSnVzdCB0aGUgZmFjdCB0aGF0IHdlIGFyZSBkaXNjdXNzaW5nIHRoaXMgaW4gMjAx
MyBhbmQgQUNQSUNBIGhhcyBiZWVuIHJ1bm5pbmcgc2luY2UgMTk5OSBzaG91bGQgY29uZmlybSB0
aGUgcmFyaXR5IG9mIHRoaXMgY2FzZSBhbmQvb3IgdGhhdCB0aGUgZXhpc3RpbmcgbWVjaGFuaXNt
IGhhcyBiZWVuIHN1ZmZpY2llbnQgZm9yIG90aGVyIGhvc3RzIHRoYXQgcnVuIEFDUElDQS4NCg0K
Qm9iDQoNCg0KDQoNCg==


* Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
  2013-02-15 16:33                               ` Moore, Robert
@ 2013-02-15 17:22                               ` Toshi Kani
From: Toshi Kani @ 2013-02-15 17:22 UTC (permalink / raw)
  To: Moore, Robert
  Cc: Rafael J. Wysocki, ACPI Devel Maling List, LKML, Bjorn Helgaas,
	Jiang Liu, Yinghai Lu, Yasuaki Ishimatsu, Myron Stowe, linux-pci,
	Zheng, Lv, Brown, Len

On Fri, 2013-02-15 at 16:33 +0000, Moore, Robert wrote:
> > > > > > > > > Is there a mechanism in ACPICA to ensure that a handle
> > > > > > > > > won't become stale while a notify handler is running for
> > > > > > > > > it or is the OS responsible for ensuring that
> > > > > > > > > _EJ0 won't be run in parallel with notify handlers for
> > > > > > > > > device objects being ejected?
> > > > > > > > >
> > > > > > > >
> > > > > > > > It is up to the host.
> > > > > > >
> > > > > > > I was afraid that that might be the case. :-)
> > > > > > >
> > > > > > > So far the (Linux) host has been happily ignoring that
> > > > > > > potential problem, so I guess it can still be ignored for a
> > > > > > > while, although we'll need to address it eventually at one
> > point.
> > > > > >
> > > > > > I would think it should be fairly simple to setup a mechanism to
> > > > > > either tell the driver or for the driver to figure it out --
> > > > > > such that the driver knows that all handles associated with the
> > > > > > device are now invalid. Another way to look at it is that when
> > > > > > the device is re-installed, the driver should reinitialize such
> > > > > > that it obtains new handles for the devices and subobjects in
> > question.
> > > > >
> > > > > Unfortunately, there is quite a strong assumption in our code that
> > > > > ACPI handles will not become stale before the device objects
> > > > > associated with them are removed.  For this reason, we need to
> > > > > know in advance which handles will become stale as a result of a
> > > > > table unload and remove their device objects beforehand.
> > > > >
> > > > > Moreover, when there's a notify handler installed for a given ACPI
> > > > > handle and that handle becomes stale while the notify handler is
> > > > > running, we'll be in trouble.  To avoid that we need to ensure
> > > > > that table unloads and notifies will always be mutually exclusive.
> > > >
> > > > I wonder if we can make acpi_ns_validate_handle() to actually be
> > > > able to verify if a given handle is valid.  This way, ACPICA can
> > > > fail gracefully
> > > > (AE_BAD_PARAMETER) when a stale handle is passed to the interfaces.
> > >
> > > That'd be good, but to implement it, I think, it would be necessary to
> > > introduce some reference counting of namespace objects such that the
> > > given object would only be deleted after the last reference to it had
> > been dropped.
> > > On table unload it would just be marked as invalid, but it would stay
> > > in memory as long as there were any references to it.
> > >
> > > So, for example, a notify handler would start from something like
> > > acpi_add_reference(handle), which would guarantee that the object
> > > pointed to by handle would stay in memory, and it would finish by
> > > doing
> > > acpi_drop_reference(handle) or a work item scheduled by it would do
> > that.
> > >
> > > We do that for objects based on struct device and it works well.
> > 
> > There is another way to implement it.  Since acpi_handle is defined as an
> > opaque value, this could be changed to an index to an array of pointers,
> > instead of a direct pointer.  Then we can safely invalidate an index by
> > invalidating the pointer associated with the index.
> 
> 
> 
> We have of course thought about adding a mechanism to validate/invalidate acpica namespace handles. In fact, this is why the ACPI_HANDLE data type was introduced in the first place, so that if we ever wanted or needed to implement something like this, it would not break a lot of existing code.
> 
> However, we have never had a real need to implement such a mechanism, nor has there ever been a request from any of the operating systems that run ACPICA.
> 
> The existing model is that the host has knowledge of what objects will go away when a table is unloaded, and can simply assume that all related acpi handles will go bad after the unload (and table unload is the only case where parts of the namespace go away, as far as I remember). Upon a reload of the table, the host knows to reinitialize all handles associated with the table/device.
> 
> ACPICA has a table handler mechanism, where the host handler is invoked *before* an ACPI table is actually unloaded, so the host can prepare for the unload. For example, before returning from the table handler, the host may want to synchronize by waiting for any outstanding notifies to complete, then simply ignoring any further notifies from any devices associated with the table.
> 
> In summary, it has always been felt that the fairly large overhead of implementing a mechanism like this is not worth the cost, as well as not really needed.
> 
> I suggest that the implementation of eject support proceed by using the existing mechanisms such as the table handler. If additional support/interfaces are needed in ACPICA, we can discuss it. However, just about the last thing I would like to do is add a level of indirection between the ACPI_HANDLE and the ACPI_NAMESPACE_NODE -- which would require a large, global change to ACPICA that would be only applicable for a single rather rare issue, the unloading of an ACPI table. Just the fact that we are discussing this in 2013 and ACPICA has been running since 1999 should confirm the rarity of this case and/or that the existing mechanism has been sufficient for other hosts that run ACPICA.
> 

Thanks for the info.  I understand that making such changes requires a
lot of effort.  This is just brainstorming, and as you said, I do not
think there is any platform that can cause this issue on Linux today.
We are still in the process of handling table load/unload properly in
the kernel.  Given your input, Rafael's approach of using reference
counting on struct device seems to be the best choice for us.
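
A minimal sketch of the reference-counting pattern discussed above: an
object is marked invalid on table unload but stays in memory until the
last reference is dropped.  Every name here (demo_obj, demo_get,
demo_put, demo_unload and so on) is hypothetical and chosen for
illustration only; none of this is ACPICA or kernel API, and the plain
int counter stands in for whatever atomic or locked counter real code
would need.

```c
#include <stdbool.h>
#include <stdlib.h>

/* All names here are hypothetical; this is not ACPICA or kernel API. */
struct demo_obj {
	int refcount;	/* real code would need an atomic counter or a lock */
	bool valid;	/* cleared when the backing table is unloaded */
};

static int demo_release_calls;	/* lets a caller observe when memory is freed */

static struct demo_obj *demo_create(void)
{
	struct demo_obj *obj = malloc(sizeof(*obj));

	if (!obj)
		return NULL;
	obj->refcount = 1;	/* reference owned by the namespace itself */
	obj->valid = true;
	return obj;
}

/* Take a reference; the object cannot be freed until it is dropped. */
static struct demo_obj *demo_get(struct demo_obj *obj)
{
	obj->refcount++;
	return obj;
}

/* Drop a reference and free the object when the last one goes away. */
static void demo_put(struct demo_obj *obj)
{
	if (--obj->refcount == 0) {
		demo_release_calls++;
		free(obj);
	}
}

static bool demo_is_valid(const struct demo_obj *obj)
{
	return obj->valid;
}

/* Table unload: mark the object invalid and drop the namespace reference. */
static void demo_unload(struct demo_obj *obj)
{
	obj->valid = false;
	demo_put(obj);
}
```

In this sketch a notify handler would call demo_get() on entry, check
demo_is_valid() before touching the object, and call demo_put() when it
(or the work item it schedules) is done.  The lookup that produces the
pointer in the first place would still have to be protected against a
concurrent unload, which the sketch does not attempt to show.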

BTW, I did work on another OS that supports table load/unload (which might
have been the first OS to support this feature).  It protects against this
race condition by serializing the OS and FW with _OST.
However, we cannot expect all platforms to do the same for Linux.

Thanks,
-Toshi
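
For reference, the handle-indirection alternative raised in the quoted
discussion (a handle that is an index into an array of pointers rather
than a direct pointer) could look roughly like the sketch below.  This
is only an illustration of the idea that was considered and declined
for ACPICA; demo_handle, demo_table and the demo_handle_*() functions
are hypothetical names, and the fixed-size table has no locking.

```c
#include <stddef.h>

/* Hypothetical names throughout; none of this is ACPICA API. */
typedef int demo_handle;	/* opaque index instead of a raw pointer */

#define DEMO_MAX_HANDLES 16
#define DEMO_BAD_HANDLE  ((demo_handle)-1)

struct demo_node {		/* stand-in for a namespace node */
	const char *name;
};

static struct demo_node *demo_table[DEMO_MAX_HANDLES];

/* Publish a node and return the index that callers keep as their handle. */
static demo_handle demo_handle_alloc(struct demo_node *node)
{
	for (int i = 0; i < DEMO_MAX_HANDLES; i++) {
		if (!demo_table[i]) {
			demo_table[i] = node;
			return i;
		}
	}
	return DEMO_BAD_HANDLE;
}

/* Clearing the slot makes every outstanding copy of the handle stale. */
static void demo_handle_invalidate(demo_handle h)
{
	if (h >= 0 && h < DEMO_MAX_HANDLES)
		demo_table[h] = NULL;
}

/* Returns the node, or NULL so that the caller can fail gracefully. */
static struct demo_node *demo_handle_resolve(demo_handle h)
{
	if (h < 0 || h >= DEMO_MAX_HANDLES)
		return NULL;
	return demo_table[h];
}
```

A real design would also need a generation counter (or similar) so
that a reused slot is not mistaken for the old object, plus
synchronization between resolve and invalidate.  That extra machinery
is part of the cost weighed in the discussion above.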




