* [PATCH 0/3 RFC] Driver core / ACPI: Add offline/online for graceful hot-removal of devices
@ 2013-04-29 12:23 Rafael J. Wysocki
  2013-04-29 12:26 ` [PATCH 1/3 RFC] Driver core: Add offline/online device operations Rafael J. Wysocki
                   ` (3 more replies)
  0 siblings, 4 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-04-29 12:23 UTC
  To: Greg Kroah-Hartman, Toshi Kani
  Cc: ACPI Devel Maling List, LKML, isimatu.yasuaki, vasilis.liaskovitis

Hi,

It has been argued a number of times that in some cases, if a device cannot
be gracefully removed from the system, it shouldn't be removed from it at all,
because that may lead to a kernel crash.  In particular, that will happen if a
memory module holding kernel memory is removed, but also removing the last CPU
in the system may not be a good idea.  [And I can imagine a few other cases
like that.]

The kernel currently only supports "forced" hot-remove, which cannot be stopped
once started, so users have no choice but to try to hot-remove stuff and see
whether or not that crashes the kernel, which is kind of unpleasant.  That seems
to be based on the "the user knows better" argument according to which users
triggering device hot-removal should really know what they are doing, so the
kernel doesn't have to worry about that.  However, for instance, this pretty
much isn't the case for memory modules, because the users have no way to see
whether or not any kernel memory has been allocated from a given module.

There have been a few attempts to address this issue, but none of them has
gained broader acceptance.  The following 3 patches are the heart of a new
proposal which is based on the idea of introducing device_offline() and
device_online() operations along the lines of the existing CPU offline/online
mechanism (or, rather, to extend the CPU offline/online so that analogous
operations are available for other devices).  The way it is supposed to work is
that device_offline() will fail if the given device cannot be gracefully
removed from the system (in the kernel's view).  Once it succeeds, though, the
device won't be used any more until either it is removed, or device_online() is
run for it.  That will allow the ACPI device hot-remove code, for example,
to avoid triggering a non-reversible removal procedure for devices that cannot
be removed gracefully.
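
The intended semantics can be illustrated with a small user-space sketch.  It
is not the kernel code from the patches: struct model_dev, its 'pinned' flag
and the model_*() helpers are hypothetical names chosen for illustration, and
-ENODEV for "device is offline" is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a hot-removable device. */
struct model_dev {
	bool offline;	/* set after a successful model_offline() */
	bool pinned;	/* e.g. a memory module holding kernel memory */
};

/* Fails with -EBUSY if the device cannot be gracefully removed. */
static int model_offline(struct model_dev *dev)
{
	assert(dev != NULL);
	if (dev->pinned)
		return -EBUSY;
	dev->offline = true;
	return 0;
}

/* Makes the device usable again after a successful model_offline(). */
static int model_online(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->offline = false;
	return 0;
}

/* Any use of an offline device is refused until it is put back online. */
static int model_use(const struct model_dev *dev)
{
	assert(dev != NULL);
	return dev->offline ? -ENODEV : 0;
}
```

In this model a pinned device never leaves the online state, so a caller can
test for graceful removability without committing to anything irreversible.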

Patch [1/3] introduces device_offline() and device_online() as outlined above.
The .offline() and .online() callbacks are only added at the bus type level for
now, because that should be sufficient to cover the memory and CPU use cases.
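
As a rough user-space illustration of "bus type level" callbacks, the sketch
below mirrors the device_supports_offline() test from patch [1/3]: a device
only takes part if its bus provides both callbacks.  The model_* types and the
two example buses are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_dev;

/* Hypothetical bus type carrying the two optional callbacks. */
struct model_bus {
	int (*online)(struct model_dev *dev);
	int (*offline)(struct model_dev *dev);
};

struct model_dev {
	const struct model_bus *bus;	/* may be NULL */
};

/* Same condition as device_supports_offline() in patch [1/3]. */
static bool model_supports_offline(const struct model_dev *dev)
{
	assert(dev != NULL);
	return dev->bus && dev->bus->offline && dev->bus->online;
}

/* Trivial callbacks; a real bus would do the actual work here. */
static int model_cpu_online(struct model_dev *dev)
{
	(void)dev;
	return 0;
}

static int model_cpu_offline(struct model_dev *dev)
{
	(void)dev;
	return 0;
}

/* A bus that opts in by providing both callbacks. */
static const struct model_bus model_cpu_bus = {
	.online = model_cpu_online,
	.offline = model_cpu_offline,
};

/* A bus that provides neither, so its devices are left alone. */
static const struct model_bus model_plain_bus = { NULL, NULL };
```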

Patch [2/3] modifies the CPU hotplug support code to use device_offline() and
device_online() to support the sysfs 'online' attribute for CPUs.
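
The user-visible behavior of the 'online' attribute can be sketched in user
space as follows.  It follows the '0'/'1' parsing of store_online() in patch
[1/3], but struct model_cpu and the model_*() functions are hypothetical, the
real callbacks are replaced by a flag update, and the empty-write check is an
extra guard that is not in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a CPU device; only the 'offline' flag is modeled. */
struct model_cpu {
	bool offline;
};

/* What a read of 'online' would report: 1 if online, 0 if offline. */
static int model_show_online(const struct model_cpu *cpu)
{
	assert(cpu != NULL);
	return !cpu->offline;
}

/*
 * What a write to 'online' would do: '0' offlines, '1' onlines, anything
 * else is rejected.  Returns the byte count on success, as sysfs store
 * methods do, or a negative errno.
 */
static long model_store_online(struct model_cpu *cpu, const char *buf,
			       size_t count)
{
	assert(cpu != NULL && buf != NULL);
	if (count == 0)		/* extra guard, not present in the patch */
		return -EINVAL;

	switch (buf[0]) {
	case '0':
		cpu->offline = true;
		break;
	case '1':
		cpu->offline = false;
		break;
	default:
		return -EINVAL;
	}
	return (long)count;
}
```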

Patch [3/3] changes the ACPI device hot-remove code to use device_offline()
for checking if graceful removal of devices is possible.  The way it does that
is to walk the list of "physical" companion devices for each struct acpi_device
involved in the operation and call device_offline() for each of them.  If any
of the device_offline() calls fails (and the hot-removal is not "forced", which
is an option), the removal procedure (which is not reversible) is simply not
carried out.
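
The offline-everything-or-roll-back logic can be sketched in user space like
this.  It loosely follows the put_online bookkeeping of patch [3/3], but it
works on a flat array instead of the ACPI namespace, and struct model_node and
the model_*() helpers are hypothetical names used only here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one "physical" companion device. */
struct model_node {
	bool offline;
	bool pinned;		/* cannot be gracefully removed */
	bool put_online;	/* offlined by this attempt; undo on failure */
};

/* Returns 0 on success, 1 if already offline, -EBUSY if it must stay. */
static int model_offline(struct model_node *n)
{
	if (n->offline)
		return 1;
	if (n->pinned)
		return -EBUSY;
	n->offline = true;
	return 0;
}

/*
 * Try to offline every node.  Without 'force', the first failure aborts
 * the attempt and the nodes offlined so far are put back online.  Returns
 * 0 if the removal may proceed, -EBUSY otherwise.
 */
static int model_prepare_removal(struct model_node *nodes, size_t n, bool force)
{
	size_t i;
	int ret = 0;

	assert(nodes != NULL || n == 0);
	for (i = 0; i < n; i++) {
		ret = model_offline(&nodes[i]);
		if (force)
			continue;	/* failures are ignored when forced */
		if (ret < 0)
			break;
		nodes[i].put_online = (ret == 0);
	}
	if (force || ret >= 0)
		return 0;

	/* Roll back only what this attempt took offline. */
	for (i = 0; i < n; i++) {
		if (nodes[i].put_online) {
			nodes[i].offline = false;
			nodes[i].put_online = false;
		}
	}
	return -EBUSY;
}
```

Nodes that were already offline before the attempt are deliberately left
offline by the rollback, since this attempt did not change their state.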

Of some concern is that device_offline() (and possibly device_online()) is
called under physical_node_lock of the corresponding struct acpi_device, which
introduces an ordering dependency between that lock and device locks for the
"physical" devices, but I didn't see any cleaner way to do that (I guess it
is avoidable at the expense of added complexity, but for now it's just better
to make the code as clean as possible IMO).

The next step will be to modify the ACPI processor driver to use the new
mechanism.  Unfortunately, this isn't really straightforward, because it
requires untangling some event handling functionality from hotplug support
code, but I don't see any fundamental obstacles to that at the moment.  Then,
the same approach may be applied to memory hotplug and possibly other devices
in the future.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* [PATCH 1/3 RFC] Driver core: Add offline/online device operations
  2013-04-29 12:23 [PATCH 0/3 RFC] Driver core / ACPI: Add offline/online for graceful hot-removal of devices Rafael J. Wysocki
@ 2013-04-29 12:26 ` Rafael J. Wysocki
  2013-04-29 23:10   ` Greg Kroah-Hartman
  2013-04-30 23:38   ` Toshi Kani
  2013-04-29 12:28 ` [PATCH 2/3 RFC] Driver core: Use generic offline/online for CPU offline/online Rafael J. Wysocki
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-04-29 12:26 UTC
  To: Greg Kroah-Hartman
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

In some cases, graceful hot-removal of devices is not possible,
although in principle the devices in question support hotplug.
For example, that may happen for the last CPU in the system or
for memory modules holding kernel memory.

In those cases it is nice to be able to check if the given device
can be safely hot-removed before triggering a removal procedure
that cannot be aborted or reversed.  Unfortunately, however, the
kernel currently doesn't provide any support for that.

To address that deficiency, introduce support for offline and
online operations that can be performed on devices, respectively,
before a hot-removal and in case it is necessary (or convenient)
to put a device back online after a successful offline (that has not
been followed by removal).  The idea is that the offline will fail
whenever the given device cannot be gracefully removed from the
system and it will not be allowed to use the device after a
successful offline (until a subsequent online) in analogy with the
existing CPU offline/online mechanism.

For now, the offline and online operations are introduced at the
bus type level, as that should be sufficient for the most urgent use
cases (CPUs and memory modules).  In the future, however, the
approach may be extended to cover some more complicated device
offline/online scenarios involving device drivers etc.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 Documentation/ABI/testing/sysfs-devices-online |   19 +++
 drivers/base/core.c                            |  134 +++++++++++++++++++++++++
 include/linux/device.h                         |   21 +++
 3 files changed, 174 insertions(+)

Index: linux-pm/include/linux/device.h
===================================================================
--- linux-pm.orig/include/linux/device.h
+++ linux-pm/include/linux/device.h
@@ -70,6 +70,10 @@ extern void bus_remove_file(struct bus_t
  *		the specific driver's probe to initial the matched device.
  * @remove:	Called when a device removed from this bus.
  * @shutdown:	Called at shut-down time to quiesce the device.
+ *
+ * @online:	Called to put the device back online (after offlining it).
+ * @offline:	Called to put the device offline for hot-removal. May fail.
+ *
  * @suspend:	Called when a device on this bus wants to go to sleep mode.
  * @resume:	Called to bring a device on this bus out of sleep mode.
  * @pm:		Power management operations of this bus, callback the specific
@@ -103,6 +107,9 @@ struct bus_type {
 	int (*remove)(struct device *dev);
 	void (*shutdown)(struct device *dev);
 
+	int (*online)(struct device *dev);
+	int (*offline)(struct device *dev);
+
 	int (*suspend)(struct device *dev, pm_message_t state);
 	int (*resume)(struct device *dev);
 
@@ -646,6 +653,8 @@ struct acpi_dev_node {
  * @release:	Callback to free the device after all references have
  * 		gone away. This should be set by the allocator of the
  * 		device (i.e. the bus driver that discovered the device).
+ * @offline_disabled: If set, the device is permanently online.
+ * @offline:	Set after successful invocation of bus type's .offline().
  *
  * At the lowest level, every device in a Linux system is represented by an
  * instance of struct device. The device structure contains the information
@@ -718,6 +727,9 @@ struct device {
 
 	void	(*release)(struct device *dev);
 	struct iommu_group	*iommu_group;
+
+	bool			offline_disabled:1;
+	bool			offline:1;
 };
 
 static inline struct device *kobj_to_dev(struct kobject *kobj)
@@ -853,6 +865,15 @@ extern const char *device_get_devnode(st
 extern void *dev_get_drvdata(const struct device *dev);
 extern int dev_set_drvdata(struct device *dev, void *data);
 
+static inline bool device_supports_offline(struct device *dev)
+{
+	return dev->bus && dev->bus->offline && dev->bus->online;
+}
+
+extern void lock_device_offline(void);
+extern void unlock_device_offline(void);
+extern int device_offline(struct device *dev);
+extern int device_online(struct device *dev);
 /*
  * Root device objects for grouping under /sys/devices
  */
Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -397,6 +397,40 @@ static ssize_t store_uevent(struct devic
 static struct device_attribute uevent_attr =
 	__ATTR(uevent, S_IRUGO | S_IWUSR, show_uevent, store_uevent);
 
+static ssize_t show_online(struct device *dev, struct device_attribute *attr,
+			   char *buf)
+{
+	bool ret;
+
+	lock_device_offline();
+	ret = !dev->offline;
+	unlock_device_offline();
+	return sprintf(buf, "%u\n", ret);
+}
+
+static ssize_t store_online(struct device *dev, struct device_attribute *attr,
+			    const char *buf, size_t count)
+{
+	int ret;
+
+	lock_device_offline();
+	switch (buf[0]) {
+	case '0':
+		ret = device_offline(dev);
+		break;
+	case '1':
+		ret = device_online(dev);
+		break;
+	default:
+		ret = -EINVAL;
+	}
+	unlock_device_offline();
+	return ret < 0 ? ret : count;
+}
+
+static struct device_attribute online_attr =
+	__ATTR(online, S_IRUGO | S_IWUSR, show_online, store_online);
+
 static int device_add_attributes(struct device *dev,
 				 struct device_attribute *attrs)
 {
@@ -510,6 +544,12 @@ static int device_add_attrs(struct devic
 	if (error)
 		goto err_remove_type_groups;
 
+	if (device_supports_offline(dev) && !dev->offline_disabled) {
+		error = device_create_file(dev, &online_attr);
+		if (error)
+			goto err_remove_type_groups;
+	}
+
 	return 0;
 
  err_remove_type_groups:
@@ -530,6 +570,7 @@ static void device_remove_attrs(struct d
 	struct class *class = dev->class;
 	const struct device_type *type = dev->type;
 
+	device_remove_file(dev, &online_attr);
 	device_remove_groups(dev, dev->groups);
 
 	if (type)
@@ -1415,6 +1456,99 @@ EXPORT_SYMBOL_GPL(put_device);
 EXPORT_SYMBOL_GPL(device_create_file);
 EXPORT_SYMBOL_GPL(device_remove_file);
 
+static DEFINE_MUTEX(device_offline_lock);
+
+void lock_device_offline(void)
+{
+	mutex_lock(&device_offline_lock);
+}
+
+void unlock_device_offline(void)
+{
+	mutex_unlock(&device_offline_lock);
+}
+
+static int device_check_offline(struct device *dev, void *not_used)
+{
+	int ret;
+
+	ret = device_for_each_child(dev, NULL, device_check_offline);
+	if (ret)
+		return ret;
+
+	return device_supports_offline(dev) && !dev->offline ? -EBUSY : 0;
+}
+
+/**
+ * device_offline - Prepare the device for hot-removal.
+ * @dev: Device to be put offline.
+ *
+ * Execute the device bus type's .offline() callback, if present, to prepare
+ * the device for a subsequent hot-removal.  If that succeeds, the device must
+ * not be used until either it is removed or its bus type's .online() callback
+ * is executed.
+ *
+ * Call under device_offline_lock.
+ */
+int device_offline(struct device *dev)
+{
+	int ret;
+
+	if (dev->offline_disabled)
+		return -EPERM;
+
+	ret = device_for_each_child(dev, NULL, device_check_offline);
+	if (ret)
+		return ret;
+
+	device_lock(dev);
+	if (device_supports_offline(dev)) {
+		if (dev->offline) {
+			ret = 1;
+		} else {
+			ret = dev->bus->offline(dev);
+			if (!ret) {
+				kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
+				dev->offline = true;
+			}
+		}
+	}
+	device_unlock(dev);
+
+	return ret;
+}
+
+/**
+ * device_online - Put the device back online after successful device_offline().
+ * @dev: Device to be put back online.
+ *
+ * If device_offline() has been successfully executed for @dev, but the device
+ * has not been removed subsequently, execute its bus type's .online() callback
+ * to indicate that the device can be used again.
+ *
+ * Call under device_offline_lock.
+ */
+int device_online(struct device *dev)
+{
+	int ret = 0;
+
+	device_lock(dev);
+	if (device_supports_offline(dev)) {
+		if (dev->offline) {
+			ret = dev->bus->online(dev);
+			if (!ret) {
+				kobject_uevent(&dev->kobj, KOBJ_ONLINE);
+				dev->offline = false;
+			}
+		} else {
+			ret = 1;
+		}
+	}
+	device_unlock(dev);
+
+	return ret;
+}
+
 struct root_device {
 	struct device dev;
 	struct module *owner;
Index: linux-pm/Documentation/ABI/testing/sysfs-devices-online
===================================================================
--- /dev/null
+++ linux-pm/Documentation/ABI/testing/sysfs-devices-online
@@ -0,0 +1,19 @@
+What:		/sys/devices/.../online
+Date:		April 2013
+Contact:	Rafael J. Wysocki <rafael.j.wysocki@intel.com>
+Description:
+		The /sys/devices/.../online attribute is only present for
+		devices whose bus types provide .online() and .offline()
+		callbacks.  The number read from it (0 or 1) is the inverse of
+		the device's 'offline' field.  If that number is 1 and 0
+		is written to this file, the device bus type's .offline()
+		callback is executed for the device and (if successful) its
+		'offline' field is updated accordingly.  In turn, if that number
+		is 0 and 1 is written to this file, the device bus type's
+		.online() callback is executed for the device and (if
+		successful) its 'offline' field is updated as appropriate.
+
+		After a successful execution of the bus type's .offline()
+		callback the device cannot be used for any purpose until either
+		it is removed (i.e. device_del() is called for it), or its bus
+		type's .online() is executed successfully.



* [PATCH 2/3 RFC] Driver core: Use generic offline/online for CPU offline/online
  2013-04-29 12:23 [PATCH 0/3 RFC] Driver core / ACPI: Add offline/online for graceful hot-removal of devices Rafael J. Wysocki
  2013-04-29 12:26 ` [PATCH 1/3 RFC] Driver core: Add offline/online device operations Rafael J. Wysocki
@ 2013-04-29 12:28 ` Rafael J. Wysocki
  2013-04-29 23:11   ` Greg Kroah-Hartman
  2013-04-30 23:42   ` Toshi Kani
  2013-04-29 12:29 ` [PATCH 3/3 RFC] ACPI / hotplug: Use device offline/online for graceful hot-removal Rafael J. Wysocki
  2013-05-02 12:26 ` [PATCH 0/4] Driver core / ACPI: Add offline/online for graceful hot-removal of devices Rafael J. Wysocki
  3 siblings, 2 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-04-29 12:28 UTC
  To: Greg Kroah-Hartman
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Rework the CPU hotplug code in drivers/base/cpu.c to use the
generic offline/online support introduced previously instead of
its own CPU-specific code.

For this purpose, modify cpu_subsys to provide offline and online
callbacks when CONFIG_HOTPLUG_CPU is set and remove the code handling
the CPU-specific 'online' sysfs attribute.

This modification is not supposed to change the user-observable
behavior of the kernel (i.e. the 'online' attribute will be present
in exactly the same place in sysfs and should trigger exactly the
same actions as before).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/cpu.c |   62 ++++++++++++-----------------------------------------
 1 file changed, 15 insertions(+), 47 deletions(-)

Index: linux-pm/drivers/base/cpu.c
===================================================================
--- linux-pm.orig/drivers/base/cpu.c
+++ linux-pm/drivers/base/cpu.c
@@ -16,66 +16,25 @@
 
 #include "base.h"
 
-struct bus_type cpu_subsys = {
-	.name = "cpu",
-	.dev_name = "cpu",
-};
-EXPORT_SYMBOL_GPL(cpu_subsys);
-
 static DEFINE_PER_CPU(struct device *, cpu_sys_devices);
 
 #ifdef CONFIG_HOTPLUG_CPU
-static ssize_t show_online(struct device *dev,
-			   struct device_attribute *attr,
-			   char *buf)
+static int cpu_subsys_online(struct device *dev)
 {
-	struct cpu *cpu = container_of(dev, struct cpu, dev);
-
-	return sprintf(buf, "%u\n", !!cpu_online(cpu->dev.id));
+	return cpu_up(dev->id);
 }
 
-static ssize_t __ref store_online(struct device *dev,
-				  struct device_attribute *attr,
-				  const char *buf, size_t count)
+static int cpu_subsys_offline(struct device *dev)
 {
-	struct cpu *cpu = container_of(dev, struct cpu, dev);
-	ssize_t ret;
-
-	cpu_hotplug_driver_lock();
-	switch (buf[0]) {
-	case '0':
-		ret = cpu_down(cpu->dev.id);
-		if (!ret)
-			kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
-		break;
-	case '1':
-		ret = cpu_up(cpu->dev.id);
-		if (!ret)
-			kobject_uevent(&dev->kobj, KOBJ_ONLINE);
-		break;
-	default:
-		ret = -EINVAL;
-	}
-	cpu_hotplug_driver_unlock();
-
-	if (ret >= 0)
-		ret = count;
-	return ret;
+	return cpu_down(dev->id);
 }
-static DEVICE_ATTR(online, 0644, show_online, store_online);
 
-static void __cpuinit register_cpu_control(struct cpu *cpu)
-{
-	device_create_file(&cpu->dev, &dev_attr_online);
-}
 void unregister_cpu(struct cpu *cpu)
 {
 	int logical_cpu = cpu->dev.id;
 
 	unregister_cpu_under_node(logical_cpu, cpu_to_node(logical_cpu));
 
-	device_remove_file(&cpu->dev, &dev_attr_online);
-
 	device_unregister(&cpu->dev);
 	per_cpu(cpu_sys_devices, logical_cpu) = NULL;
 	return;
@@ -108,6 +67,16 @@ static inline void register_cpu_control(
 }
 #endif /* CONFIG_HOTPLUG_CPU */
 
+struct bus_type cpu_subsys = {
+	.name = "cpu",
+	.dev_name = "cpu",
+#ifdef CONFIG_HOTPLUG_CPU
+	.online = cpu_subsys_online,
+	.offline = cpu_subsys_offline,
+#endif
+};
+EXPORT_SYMBOL_GPL(cpu_subsys);
+
 #ifdef CONFIG_KEXEC
 #include <linux/kexec.h>
 
@@ -245,12 +214,11 @@ int __cpuinit register_cpu(struct cpu *c
 	cpu->dev.id = num;
 	cpu->dev.bus = &cpu_subsys;
 	cpu->dev.release = cpu_device_release;
+	cpu->dev.offline_disabled = !cpu->hotpluggable;
 #ifdef CONFIG_ARCH_HAS_CPU_AUTOPROBE
 	cpu->dev.bus->uevent = arch_cpu_uevent;
 #endif
 	error = device_register(&cpu->dev);
-	if (!error && cpu->hotpluggable)
-		register_cpu_control(cpu);
 	if (!error)
 		per_cpu(cpu_sys_devices, num) = &cpu->dev;
 	if (!error)


* [PATCH 3/3 RFC] ACPI / hotplug: Use device offline/online for graceful hot-removal
  2013-04-29 12:23 [PATCH 0/3 RFC] Driver core / ACPI: Add offline/online for graceful hot-removal of devices Rafael J. Wysocki
  2013-04-29 12:26 ` [PATCH 1/3 RFC] Driver core: Add offline/online device operations Rafael J. Wysocki
  2013-04-29 12:28 ` [PATCH 2/3 RFC] Driver core: Use generic offline/online for CPU offline/online Rafael J. Wysocki
@ 2013-04-29 12:29 ` Rafael J. Wysocki
  2013-04-30 23:49   ` Toshi Kani
  2013-05-02 12:26 ` [PATCH 0/4] Driver core / ACPI: Add offline/online for graceful hot-removal of devices Rafael J. Wysocki
  3 siblings, 1 reply; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-04-29 12:29 UTC
  To: Greg Kroah-Hartman
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Modify the generic ACPI hotplug code to be able to check if devices
scheduled for hot-removal may be gracefully removed from the system
using the device offline/online mechanism introduced previously.

Namely, make acpi_scan_hot_remove(), which handles device hot-removal,
call device_offline() for all physical companions of the ACPI device
nodes involved in the operation and check the results.  If any of
the device_offline() calls fails, the function will not progress to
the removal phase (which cannot be aborted), unless its (new) force
argument is set.  If an offline fails and force is not set, it will
put the devices it has offlined back online.

In support of the 'forced' hot-removal, add a new sysfs attribute
'force_remove' that will reside in every ACPI hotplug profile
present under /sys/firmware/acpi/hotplug/.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 Documentation/ABI/testing/sysfs-firmware-acpi |    9 +-
 drivers/acpi/internal.h                       |    2 
 drivers/acpi/scan.c                           |   97 ++++++++++++++++++++++++--
 drivers/acpi/sysfs.c                          |   27 +++++++
 include/acpi/acpi_bus.h                       |    3 
 5 files changed, 131 insertions(+), 7 deletions(-)

Index: linux-pm/drivers/acpi/sysfs.c
===================================================================
--- linux-pm.orig/drivers/acpi/sysfs.c
+++ linux-pm/drivers/acpi/sysfs.c
@@ -745,8 +745,35 @@ static struct kobj_attribute hotplug_ena
 	__ATTR(enabled, S_IRUGO | S_IWUSR, hotplug_enabled_show,
 		hotplug_enabled_store);
 
+static ssize_t hotplug_force_remove_show(struct kobject *kobj,
+					 struct kobj_attribute *attr, char *buf)
+{
+	struct acpi_hotplug_profile *hotplug = to_acpi_hotplug_profile(kobj);
+
+	return sprintf(buf, "%d\n", hotplug->force_remove);
+}
+
+static ssize_t hotplug_force_remove_store(struct kobject *kobj,
+					  struct kobj_attribute *attr,
+					  const char *buf, size_t size)
+{
+	struct acpi_hotplug_profile *hotplug = to_acpi_hotplug_profile(kobj);
+	unsigned int val;
+
+	if (kstrtouint(buf, 10, &val) || val > 1)
+		return -EINVAL;
+
+	acpi_scan_hotplug_force_remove(hotplug, val);
+	return size;
+}
+
+static struct kobj_attribute hotplug_force_remove_attr =
+	__ATTR(force_remove, S_IRUGO | S_IWUSR, hotplug_force_remove_show,
+		hotplug_force_remove_store);
+
 static struct attribute *hotplug_profile_attrs[] = {
 	&hotplug_enabled_attr.attr,
+	&hotplug_force_remove_attr.attr,
 	NULL
 };
 
Index: linux-pm/drivers/acpi/internal.h
===================================================================
--- linux-pm.orig/drivers/acpi/internal.h
+++ linux-pm/drivers/acpi/internal.h
@@ -52,6 +52,8 @@ void acpi_sysfs_add_hotplug_profile(stru
 int acpi_scan_add_handler_with_hotplug(struct acpi_scan_handler *handler,
 				       const char *hotplug_profile_name);
 void acpi_scan_hotplug_enabled(struct acpi_hotplug_profile *hotplug, bool val);
+void acpi_scan_hotplug_force_remove(struct acpi_hotplug_profile *hotplug,
+				    bool val);
 
 #ifdef CONFIG_DEBUG_FS
 extern struct dentry *acpi_debugfs_dir;
Index: linux-pm/include/acpi/acpi_bus.h
===================================================================
--- linux-pm.orig/include/acpi/acpi_bus.h
+++ linux-pm/include/acpi/acpi_bus.h
@@ -97,6 +97,7 @@ enum acpi_hotplug_mode {
 struct acpi_hotplug_profile {
 	struct kobject kobj;
 	bool enabled:1;
+	bool force_remove:1;
 	enum acpi_hotplug_mode mode;
 };
 
@@ -286,6 +287,7 @@ struct acpi_device_physical_node {
 	u8 node_id;
 	struct list_head node;
 	struct device *dev;
+	bool put_online:1;
 };
 
 /* set maximum of physical nodes to 32 for expansibility */
@@ -346,6 +348,7 @@ struct acpi_bus_event {
 struct acpi_eject_event {
 	struct acpi_device	*device;
 	u32		event;
+	bool		force;
 };
 
 struct acpi_hp_work {
Index: linux-pm/drivers/acpi/scan.c
===================================================================
--- linux-pm.orig/drivers/acpi/scan.c
+++ linux-pm/drivers/acpi/scan.c
@@ -120,7 +120,61 @@ acpi_device_modalias_show(struct device
 }
 static DEVICE_ATTR(modalias, 0444, acpi_device_modalias_show, NULL);
 
-static int acpi_scan_hot_remove(struct acpi_device *device)
+static acpi_status acpi_bus_offline_companions(acpi_handle handle, u32 lvl,
+					       void *data, void **ret_p)
+{
+	struct acpi_device *device = NULL;
+	struct acpi_device_physical_node *pn;
+	bool force = *((bool *)data);
+	acpi_status status = AE_OK;
+
+	if (acpi_bus_get_device(handle, &device))
+		return AE_OK;
+
+	mutex_lock(&device->physical_node_lock);
+
+	list_for_each_entry(pn, &device->physical_node_list, node) {
+		int ret;
+
+		ret = device_offline(pn->dev);
+		if (force)
+			continue;
+
+		if (ret < 0) {
+			status = AE_ERROR;
+			break;
+		}
+		pn->put_online = !ret;
+	}
+
+	mutex_unlock(&device->physical_node_lock);
+
+	return status;
+}
+
+static acpi_status acpi_bus_online_companions(acpi_handle handle, u32 lvl,
+					      void *data, void **ret_p)
+{
+	struct acpi_device *device = NULL;
+	struct acpi_device_physical_node *pn;
+
+	if (acpi_bus_get_device(handle, &device))
+		return AE_OK;
+
+	mutex_lock(&device->physical_node_lock);
+
+	list_for_each_entry(pn, &device->physical_node_list, node)
+		if (pn->put_online) {
+			device_online(pn->dev);
+			pn->put_online = false;
+		}
+
+	mutex_unlock(&device->physical_node_lock);
+
+	return AE_OK;
+}
+
+static int acpi_scan_hot_remove(struct acpi_device *device, bool force)
 {
 	acpi_handle handle = device->handle;
 	acpi_handle not_used;
@@ -136,10 +190,30 @@ static int acpi_scan_hot_remove(struct a
 		return -EINVAL;
 	}
 
+	lock_device_offline();
+
+	status = acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
+				     NULL, acpi_bus_offline_companions, &force,
+				     NULL);
+	if (ACPI_SUCCESS(status) || force)
+		status = acpi_bus_offline_companions(handle, 0, &force, NULL);
+
+	if (ACPI_FAILURE(status) && !force) {
+		acpi_bus_online_companions(handle, 0, NULL, NULL);
+		acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
+				    acpi_bus_online_companions, NULL, NULL,
+				    NULL);
+		unlock_device_offline();
+		return -EBUSY;
+	}
+
 	ACPI_DEBUG_PRINT((ACPI_DB_INFO,
 		"Hot-removing device %s...\n", dev_name(&device->dev)));
 
 	acpi_bus_trim(device);
+
+	unlock_device_offline();
+
 	/* Device node has been unregistered. */
 	put_device(&device->dev);
 	device = NULL;
@@ -214,7 +288,8 @@ static void acpi_bus_device_eject(void *
 		int error;
 
 		get_device(&device->dev);
-		error = acpi_scan_hot_remove(device);
+		error = acpi_scan_hot_remove(device,
+					     handler->hotplug.force_remove);
 		if (error)
 			goto err_out;
 	}
@@ -353,7 +428,7 @@ void acpi_bus_hot_remove_device(void *co
 
 	mutex_lock(&acpi_scan_lock);
 
-	error = acpi_scan_hot_remove(device);
+	error = acpi_scan_hot_remove(device, ej_event->force);
 	if (error && handle)
 		acpi_evaluate_hotplug_ost(handle, ej_event->event,
 					  ACPI_OST_SC_NON_SPECIFIC_FAILURE,
@@ -422,7 +497,7 @@ acpi_eject_store(struct device *d, struc
 		/* Eject initiated by user space. */
 		ost_source = ACPI_OST_EC_OSPM_EJECT;
 	}
-	ej_event = kmalloc(sizeof(*ej_event), GFP_KERNEL);
+	ej_event = kzalloc(sizeof(*ej_event), GFP_KERNEL);
 	if (!ej_event) {
 		ret = -ENOMEM;
 		goto err_out;
@@ -431,6 +506,9 @@ acpi_eject_store(struct device *d, struc
 				  ACPI_OST_SC_EJECT_IN_PROGRESS, NULL);
 	ej_event->device = acpi_device;
 	ej_event->event = ost_source;
+	if (acpi_device->handler)
+		ej_event->force = acpi_device->handler->hotplug.force_remove;
+
 	get_device(&acpi_device->dev);
 	status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device, ej_event);
 	if (ACPI_FAILURE(status)) {
@@ -1769,9 +1847,18 @@ void acpi_scan_hotplug_enabled(struct ac
 		return;
 
 	mutex_lock(&acpi_scan_lock);
-
 	hotplug->enabled = val;
+	mutex_unlock(&acpi_scan_lock);
+}
 
+void acpi_scan_hotplug_force_remove(struct acpi_hotplug_profile *hotplug,
+				    bool val)
+{
+	if (!!hotplug->force_remove == !!val)
+		return;
+
+	mutex_lock(&acpi_scan_lock);
+	hotplug->force_remove = val;
 	mutex_unlock(&acpi_scan_lock);
 }
 
Index: linux-pm/Documentation/ABI/testing/sysfs-firmware-acpi
===================================================================
--- linux-pm.orig/Documentation/ABI/testing/sysfs-firmware-acpi
+++ linux-pm/Documentation/ABI/testing/sysfs-firmware-acpi
@@ -40,8 +40,13 @@ Description:
 			effectively disables hotplug for the correspoinding
 			class of devices.
 
-		The value of the above attribute is an integer number: 1 (set)
-		or 0 (unset).  Attempts to write any other values to it will
+		force_remove: If set, the ACPI core will force hot-removal
+			for the given class of devices regardless of whether or
+			not they may be gracefully removed from the system
+			(according to the kernel).
+
+		The values of the above attributes are integer numbers: 1 (set)
+		or 0 (unset).  Attempts to write any other values to them will
 		cause -EINVAL to be returned.
 
 What:		/sys/firmware/acpi/interrupts/



* Re: [PATCH 1/3 RFC] Driver core: Add offline/online device operations
  2013-04-29 12:26 ` [PATCH 1/3 RFC] Driver core: Add offline/online device operations Rafael J. Wysocki
@ 2013-04-29 23:10   ` Greg Kroah-Hartman
  2013-04-30 11:59     ` Rafael J. Wysocki
  2013-04-30 23:38   ` Toshi Kani
  1 sibling, 1 reply; 105+ messages in thread
From: Greg Kroah-Hartman @ 2013-04-29 23:10 UTC
  To: Rafael J. Wysocki
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis

On Mon, Apr 29, 2013 at 02:26:56PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> In some cases, graceful hot-removal of devices is not possible,
> although in principle the devices in question support hotplug.
> For example, that may happen for the last CPU in the system or
> for memory modules holding kernel memory.
> 
> In those cases it is nice to be able to check if the given device
> can be safely hot-removed before triggering a removal procedure
> that cannot be aborted or reversed.  Unfortunately, however, the
> kernel currently doesn't provide any support for that.
> 
> To address that deficiency, introduce support for offline and
> online operations that can be performed on devices, respectively,
> before a hot-removal and in case it is necessary (or convenient)
> to put a device back online after a successful offline (that has not
> been followed by removal).  The idea is that the offline will fail
> whenever the given device cannot be gracefully removed from the
> system and it will not be allowed to use the device after a
> successful offline (until a subsequent online) in analogy with the
> existing CPU offline/online mechanism.
> 
> For now, the offline and online operations are introduced at the
> bus type level, as that should be sufficient for the most urgent use
> cases (CPUs and memory modules).  In the future, however, the
> approach may be extended to cover some more complicated device
> offline/online scenarios involving device drivers etc.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  Documentation/ABI/testing/sysfs-devices-online |   19 +++
>  drivers/base/core.c                            |  134 +++++++++++++++++++++++++
>  include/linux/device.h                         |   21 +++
>  3 files changed, 174 insertions(+)
> 
> Index: linux-pm/include/linux/device.h
> ===================================================================
> --- linux-pm.orig/include/linux/device.h
> +++ linux-pm/include/linux/device.h
> @@ -70,6 +70,10 @@ extern void bus_remove_file(struct bus_t
>   *		the specific driver's probe to initial the matched device.
>   * @remove:	Called when a device removed from this bus.
>   * @shutdown:	Called at shut-down time to quiesce the device.
> + *
> + * @online:	Called to put the device back online (after offlining it).
> + * @offline:	Called to put the device offline for hot-removal. May fail.
> + *
>   * @suspend:	Called when a device on this bus wants to go to sleep mode.
>   * @resume:	Called to bring a device on this bus out of sleep mode.
>   * @pm:		Power management operations of this bus, callback the specific
> @@ -103,6 +107,9 @@ struct bus_type {
>  	int (*remove)(struct device *dev);
>  	void (*shutdown)(struct device *dev);
>  
> +	int (*online)(struct device *dev);
> +	int (*offline)(struct device *dev);
> +
>  	int (*suspend)(struct device *dev, pm_message_t state);
>  	int (*resume)(struct device *dev);
>  
> @@ -646,6 +653,8 @@ struct acpi_dev_node {
>   * @release:	Callback to free the device after all references have
>   * 		gone away. This should be set by the allocator of the
>   * 		device (i.e. the bus driver that discovered the device).
> + * @offline_disabled: If set, the device is permanently online.
> + * @offline:	Set after successful invocation of bus type's .offline().
>   *
>   * At the lowest level, every device in a Linux system is represented by an
>   * instance of struct device. The device structure contains the information
> @@ -718,6 +727,9 @@ struct device {
>  
>  	void	(*release)(struct device *dev);
>  	struct iommu_group	*iommu_group;
> +
> +	bool			offline_disabled:1;
> +	bool			offline:1;
>  };
>  
>  static inline struct device *kobj_to_dev(struct kobject *kobj)
> @@ -853,6 +865,15 @@ extern const char *device_get_devnode(st
>  extern void *dev_get_drvdata(const struct device *dev);
>  extern int dev_set_drvdata(struct device *dev, void *data);
>  
> +static inline bool device_supports_offline(struct device *dev)
> +{
> +	return dev->bus && dev->bus->offline && dev->bus->online;

Wouldn't it be easier for us to also check offline_disabled here as
well?  That would save the extra check when we go to create the sysfs
file.


> +}
> +
> +extern void lock_device_offline(void);
> +extern void unlock_device_offline(void);
> +extern int device_offline(struct device *dev);
> +extern int device_online(struct device *dev);
>  /*
>   * Root device objects for grouping under /sys/devices
>   */
> Index: linux-pm/drivers/base/core.c
> ===================================================================
> --- linux-pm.orig/drivers/base/core.c
> +++ linux-pm/drivers/base/core.c
> @@ -397,6 +397,40 @@ static ssize_t store_uevent(struct devic
>  static struct device_attribute uevent_attr =
>  	__ATTR(uevent, S_IRUGO | S_IWUSR, show_uevent, store_uevent);
>  
> +static ssize_t show_online(struct device *dev, struct device_attribute *attr,
> +			   char *buf)
> +{
> +	bool ret;
> +
> +	lock_device_offline();
> +	ret = !dev->offline;
> +	unlock_device_offline();
> +	return sprintf(buf, "%u\n", ret);
> +}
> +
> +static ssize_t store_online(struct device *dev, struct device_attribute *attr,
> +			    const char *buf, size_t count)
> +{
> +	int ret;
> +
> +	lock_device_offline();
> +	switch (buf[0]) {
> +	case '0':
> +		ret = device_offline(dev);
> +		break;
> +	case '1':
> +		ret = device_online(dev);
> +		break;

Should we also accept 'y', 'Y', 'n', and 'N', like most boolean sysfs
files do?  I think we even have a kernel helper function for it
somewhere...

> +	default:
> +		ret = -EINVAL;
> +	}
> +	unlock_device_offline();
> +	return ret < 0 ? ret : count;
> +}
> +
> +static struct device_attribute online_attr =
> +	__ATTR(online, S_IRUGO | S_IWUSR, show_online, store_online);
> +
>  static int device_add_attributes(struct device *dev,
>  				 struct device_attribute *attrs)
>  {
> @@ -510,6 +544,12 @@ static int device_add_attrs(struct devic
>  	if (error)
>  		goto err_remove_type_groups;
>  
> +	if (device_supports_offline(dev) && !dev->offline_disabled) {
> +		error = device_create_file(dev, &online_attr);
> +		if (error)
> +			goto err_remove_type_groups;
> +	}
> +
>  	return 0;
>  
>   err_remove_type_groups:
> @@ -530,6 +570,7 @@ static void device_remove_attrs(struct d
>  	struct class *class = dev->class;
>  	const struct device_type *type = dev->type;
>  
> +	device_remove_file(dev, &online_attr);
>  	device_remove_groups(dev, dev->groups);
>  
>  	if (type)
> @@ -1415,6 +1456,99 @@ EXPORT_SYMBOL_GPL(put_device);
>  EXPORT_SYMBOL_GPL(device_create_file);
>  EXPORT_SYMBOL_GPL(device_remove_file);
>  
> +static DEFINE_MUTEX(device_offline_lock);
> +
> +void lock_device_offline(void)
> +{
> +	mutex_lock(&device_offline_lock);
> +}
> +
> +void unlock_device_offline(void)
> +{
> +	mutex_unlock(&device_offline_lock);
> +}

Why have functions?  Why not just do the mutex_lock/unlock instead
everywhere?

> +static int device_check_offline(struct device *dev, void *not_used)
> +{
> +	int ret;
> +
> +	ret = device_for_each_child(dev, NULL, device_check_offline);
> +	if (ret)
> +		return ret;
> +
> +	return device_supports_offline(dev) && !dev->offline ? -EBUSY : 0;
> +}
> +
> +/**
> + * device_offline - Prepare the device for hot-removal.
> + * @dev: Device to be put offline.
> + *
> + * Execute the device bus type's .offline() callback, if present, to prepare
> + * the device for a subsequent hot-removal.  If that succeeds, the device must
> + * not be used until either it is removed or its bus type's .online() callback
> + * is executed.
> + *
> + * Call under device_offline_lock.
> + */
> +int device_offline(struct device *dev)
> +{
> +	int ret;
> +
> +	if (dev->offline_disabled)
> +		return -EPERM;
> +
> +	ret = device_for_each_child(dev, NULL, device_check_offline);
> +	if (ret)
> +		return ret;
> +
> +	device_lock(dev);
> +	if (device_supports_offline(dev)) {
> +		if (dev->offline) {
> +			ret = 1;
> +		} else {
> +			ret = dev->bus->offline(dev);
> +			if (!ret) {
> +				kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
> +				dev->offline = true;
> +			}
> +		}
> +	}
> +	device_unlock(dev);
> +
> +	return ret;
> +}
> +
> +/**
> + * device_online - Put the device back online after successful device_offline().
> + * @dev: Device to be put back online.
> + *
> + * If device_offline() has been successfully executed for @dev, but the device
> + * has not been removed subsequently, execute its bus type's .online() callback
> + * to indicate that the device can be used again.
> + *
> + * Call under device_offline_lock.
> + */
> +int device_online(struct device *dev)
> +{
> +	int ret = 0;
> +
> +	device_lock(dev);
> +	if (device_supports_offline(dev)) {
> +		if (dev->offline) {
> +			ret = dev->bus->online(dev);
> +			if (!ret) {
> +				kobject_uevent(&dev->kobj, KOBJ_ONLINE);
> +				dev->offline = false;
> +			}
> +		} else {
> +			ret = 1;
> +		}
> +	}
> +	device_unlock(dev);
> +
> +	return ret;
> +}

We don't grab the offline lock for when we go offline/online?  I like
the device_lock() call.  I don't understand what the offline locking is
supposed to be protecting as you don't use it here.  Will it make more
sense in the rest of the patches?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/3 RFC] Driver core: Use generic offline/online for CPU offline/online
  2013-04-29 12:28 ` [PATCH 2/3 RFC] Driver core: Use generic offline/online for CPU offline/online Rafael J. Wysocki
@ 2013-04-29 23:11   ` Greg Kroah-Hartman
  2013-04-30 12:01     ` Rafael J. Wysocki
  2013-04-30 23:42   ` Toshi Kani
  1 sibling, 1 reply; 105+ messages in thread
From: Greg Kroah-Hartman @ 2013-04-29 23:11 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis

On Mon, Apr 29, 2013 at 02:28:02PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Rework the CPU hotplug code in drivers/base/cpu.c to use the
> generic offline/online support introduced previously instead of
> its own CPU-specific code.
> 
> For this purpose, modify cpu_subsys to provide offline and online
> callbacks for CONFIG_HOTPLUG_CPU set and remove the code handling
> the CPU-specific 'online' sysfs attribute.
> 
> This modification is not supposed to change the user-observable
> behavior of the kernel (i.e. the 'online' attribute will be present
> in exactly the same place in sysfs and should trigger exactly the
> same actions as before).
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/base/cpu.c |   62 ++++++++++++-----------------------------------------
>  1 file changed, 15 insertions(+), 47 deletions(-)

Very nice, I like reductions like this :)

greg k-h

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 1/3 RFC] Driver core: Add offline/online device operations
  2013-04-29 23:10   ` Greg Kroah-Hartman
@ 2013-04-30 11:59     ` Rafael J. Wysocki
  2013-04-30 15:32       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-04-30 11:59 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis

On Monday, April 29, 2013 04:10:19 PM Greg Kroah-Hartman wrote:
> On Mon, Apr 29, 2013 at 02:26:56PM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > In some cases, graceful hot-removal of devices is not possible,
> > although in principle the devices in question support hotplug.
> > For example, that may happen for the last CPU in the system or
> > for memory modules holding kernel memory.
> > 
> > In those cases it is nice to be able to check if the given device
> > can be safely hot-removed before triggering a removal procedure
> > that cannot be aborted or reversed.  Unfortunately, however, the
> > kernel currently doesn't provide any support for that.
> > 
> > To address that deficiency, introduce support for offline and
> > online operations that can be performed on devices, respectively,
> > before a hot-removal and in case when it is necessary (or convenient)
> > to put a device back online after a successful offline (that has not
> > been followed by removal).  The idea is that the offline will fail
> > whenever the given device cannot be gracefully removed from the
> > system and it will not be allowed to use the device after a
> > successful offline (until a subsequent online) in analogy with the
> > existing CPU offline/online mechanism.
> > 
> > For now, the offline and online operations are introduced at the
> > bus type level, as that should be sufficient for the most urgent use
> > cases (CPUs and memory modules).  In the future, however, the
> > approach may be extended to cover some more complicated device
> > offline/online scenarios involving device drivers etc.
> > 
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  Documentation/ABI/testing/sysfs-devices-online |   19 +++
> >  drivers/base/core.c                            |  134 +++++++++++++++++++++++++
> >  include/linux/device.h                         |   21 +++
> >  3 files changed, 174 insertions(+)
> > 
> > Index: linux-pm/include/linux/device.h
> > ===================================================================
> > --- linux-pm.orig/include/linux/device.h
> > +++ linux-pm/include/linux/device.h
> > @@ -70,6 +70,10 @@ extern void bus_remove_file(struct bus_t
> >   *		the specific driver's probe to initial the matched device.
> >   * @remove:	Called when a device removed from this bus.
> >   * @shutdown:	Called at shut-down time to quiesce the device.
> > + *
> > + * @online:	Called to put the device back online (after offlining it).
> > + * @offline:	Called to put the device offline for hot-removal. May fail.
> > + *
> >   * @suspend:	Called when a device on this bus wants to go to sleep mode.
> >   * @resume:	Called to bring a device on this bus out of sleep mode.
> >   * @pm:		Power management operations of this bus, callback the specific
> > @@ -103,6 +107,9 @@ struct bus_type {
> >  	int (*remove)(struct device *dev);
> >  	void (*shutdown)(struct device *dev);
> >  
> > +	int (*online)(struct device *dev);
> > +	int (*offline)(struct device *dev);
> > +
> >  	int (*suspend)(struct device *dev, pm_message_t state);
> >  	int (*resume)(struct device *dev);
> >  
> > @@ -646,6 +653,8 @@ struct acpi_dev_node {
> >   * @release:	Callback to free the device after all references have
> >   * 		gone away. This should be set by the allocator of the
> >   * 		device (i.e. the bus driver that discovered the device).
> > + * @offline_disabled: If set, the device is permanently online.
> > + * @offline:	Set after successful invocation of bus type's .offline().
> >   *
> >   * At the lowest level, every device in a Linux system is represented by an
> >   * instance of struct device. The device structure contains the information
> > @@ -718,6 +727,9 @@ struct device {
> >  
> >  	void	(*release)(struct device *dev);
> >  	struct iommu_group	*iommu_group;
> > +
> > +	bool			offline_disabled:1;
> > +	bool			offline:1;
> >  };
> >  
> >  static inline struct device *kobj_to_dev(struct kobject *kobj)
> > @@ -853,6 +865,15 @@ extern const char *device_get_devnode(st
> >  extern void *dev_get_drvdata(const struct device *dev);
> >  extern int dev_set_drvdata(struct device *dev, void *data);
> >  
> > +static inline bool device_supports_offline(struct device *dev)
> > +{
> > +	return dev->bus && dev->bus->offline && dev->bus->online;
> 
> Wouldn't it be easier for us to also check offline_disabled here as
> well?  That would save the extra check when we go to create the sysfs
> file.

Yes, it would, but I want device_offline() to return an error when
offline_disabled is set even though device_supports_offline() returns 'true'.
If that check were folded into device_supports_offline(), device_offline()
would return 0 in that case.
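
To make the difference concrete, here is a minimal userspace sketch (not
kernel code).  The struct definitions are cut-down stand-ins, and the names
supports_offline_folded(), offline_as_posted() and offline_folded() are
hypothetical, introduced only for this illustration; child checks, locking
and uevents are left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct device;

/* Cut-down stand-ins for the kernel structures; only the fields used here. */
struct bus_type {
	int (*online)(struct device *dev);
	int (*offline)(struct device *dev);
};

struct device {
	struct bus_type *bus;
	bool offline_disabled;
	bool offline;
};

/* Trivial bus callback that always succeeds. */
static int noop_cb(struct device *dev)
{
	(void)dev;
	return 0;
}

/* As in the patch: offline_disabled is deliberately not checked here. */
static bool supports_offline(const struct device *dev)
{
	return dev->bus && dev->bus->offline && dev->bus->online;
}

/* Hypothetical variant with the offline_disabled check folded in. */
static bool supports_offline_folded(const struct device *dev)
{
	return !dev->offline_disabled && supports_offline(dev);
}

/* Shared tail of both variants: run the bus callback, record success. */
static int do_offline(struct device *dev)
{
	int ret;

	if (dev->offline)
		return 1;

	ret = dev->bus->offline(dev);
	if (!ret)
		dev->offline = true;
	return ret;
}

/* Simplified device_offline() as posted: explicit -EPERM up front. */
static int offline_as_posted(struct device *dev)
{
	if (dev->offline_disabled)
		return -EPERM;

	return supports_offline(dev) ? do_offline(dev) : 0;
}

/* With the folded helper the request is silently treated as a no-op. */
static int offline_folded(struct device *dev)
{
	return supports_offline_folded(dev) ? do_offline(dev) : 0;
}
```

For a device with offline_disabled set on a bus that provides both callbacks,
offline_as_posted() reports -EPERM, while offline_folded() returns 0 without
having offlined anything, which is the outcome I want to avoid.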

> > +}
> > +
> > +extern void lock_device_offline(void);
> > +extern void unlock_device_offline(void);
> > +extern int device_offline(struct device *dev);
> > +extern int device_online(struct device *dev);
> >  /*
> >   * Root device objects for grouping under /sys/devices
> >   */
> > Index: linux-pm/drivers/base/core.c
> > ===================================================================
> > --- linux-pm.orig/drivers/base/core.c
> > +++ linux-pm/drivers/base/core.c
> > @@ -397,6 +397,40 @@ static ssize_t store_uevent(struct devic
> >  static struct device_attribute uevent_attr =
> >  	__ATTR(uevent, S_IRUGO | S_IWUSR, show_uevent, store_uevent);
> >  
> > +static ssize_t show_online(struct device *dev, struct device_attribute *attr,
> > +			   char *buf)
> > +{
> > +	bool ret;
> > +
> > +	lock_device_offline();
> > +	ret = !dev->offline;
> > +	unlock_device_offline();
> > +	return sprintf(buf, "%u\n", ret);
> > +}
> > +
> > +static ssize_t store_online(struct device *dev, struct device_attribute *attr,
> > +			    const char *buf, size_t count)
> > +{
> > +	int ret;
> > +
> > +	lock_device_offline();
> > +	switch (buf[0]) {
> > +	case '0':
> > +		ret = device_offline(dev);
> > +		break;
> > +	case '1':
> > +		ret = device_online(dev);
> > +		break;
> 
> Should we also accept 'y', 'Y', 'n', and 'N', like most boolean sysfs
> files do?  I think we even have a kernel helper function for it
> somewhere...

Yes, we do, but it doesn't accept '0' as false. :-)

Well, I suppose I can modify that function and use it here.  What do you think?

> > +	default:
> > +		ret = -EINVAL;
> > +	}
> > +	unlock_device_offline();
> > +	return ret < 0 ? ret : count;
> > +}
> > +
> > +static struct device_attribute online_attr =
> > +	__ATTR(online, S_IRUGO | S_IWUSR, show_online, store_online);
> > +
> >  static int device_add_attributes(struct device *dev,
> >  				 struct device_attribute *attrs)
> >  {
> > @@ -510,6 +544,12 @@ static int device_add_attrs(struct devic
> >  	if (error)
> >  		goto err_remove_type_groups;
> >  
> > +	if (device_supports_offline(dev) && !dev->offline_disabled) {
> > +		error = device_create_file(dev, &online_attr);
> > +		if (error)
> > +			goto err_remove_type_groups;
> > +	}
> > +
> >  	return 0;
> >  
> >   err_remove_type_groups:
> > @@ -530,6 +570,7 @@ static void device_remove_attrs(struct d
> >  	struct class *class = dev->class;
> >  	const struct device_type *type = dev->type;
> >  
> > +	device_remove_file(dev, &online_attr);
> >  	device_remove_groups(dev, dev->groups);
> >  
> >  	if (type)
> > @@ -1415,6 +1456,99 @@ EXPORT_SYMBOL_GPL(put_device);
> >  EXPORT_SYMBOL_GPL(device_create_file);
> >  EXPORT_SYMBOL_GPL(device_remove_file);
> >  
> > +static DEFINE_MUTEX(device_offline_lock);
> > +
> > +void lock_device_offline(void)
> > +{
> > +	mutex_lock(&device_offline_lock);
> > +}
> > +
> > +void unlock_device_offline(void)
> > +{
> > +	mutex_unlock(&device_offline_lock);
> > +}
> 
> Why have functions?  Why not just do the mutex_lock/unlock instead
> everywhere?

Ah, that's something I forgot to write about in the changelog.

Patch [3/3] depends on that, because it has to take device_offline_lock around
a larger piece of code.  Specifically, it needs to put acpi_bus_trim() under
that lock too to avoid situations in which a previously offlined device would
be onlined from user space right before (or worse yet during) acpi_bus_trim()
(which would then remove it without offlining).

It is not necessary in [1/3], so I can move it to [3/3] if that's better.
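
The race can be modelled in userspace roughly as follows.  This is only a
sketch under stated assumptions: a pthread mutex stands in for
device_offline_lock, setting a 'removed' flag stands in for acpi_bus_trim(),
and struct model_dev, model_store_online() and model_hot_remove() are
hypothetical names that do not appear in any of the patches.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Stand-in for device_offline_lock. */
static pthread_mutex_t offline_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical model of a hot-removable device. */
struct model_dev {
	bool offline;
	bool removed;
};

/* Models user space writing '1' to the 'online' attribute. */
static int model_store_online(struct model_dev *dev)
{
	int ret = 0;

	pthread_mutex_lock(&offline_lock);
	if (dev->removed)
		ret = -ENODEV;
	else
		dev->offline = false;
	pthread_mutex_unlock(&offline_lock);

	return ret;
}

/*
 * Models the hot-removal path.  The offline check and the removal sit in
 * one critical section, so model_store_online() cannot slip in between
 * them and bring the device back online just before it goes away.
 */
static int model_hot_remove(struct model_dev *dev)
{
	int ret = 0;

	pthread_mutex_lock(&offline_lock);
	if (!dev->offline)
		ret = -EBUSY;
	else
		dev->removed = true;	/* stands in for acpi_bus_trim() */
	pthread_mutex_unlock(&offline_lock);

	return ret;
}
```

If the removal path dropped the lock between the check and the removal, a
concurrent model_store_online() could online the device in that window and
it would then be removed without having been offlined.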

> > +static int device_check_offline(struct device *dev, void *not_used)
> > +{
> > +	int ret;
> > +
> > +	ret = device_for_each_child(dev, NULL, device_check_offline);
> > +	if (ret)
> > +		return ret;
> > +
> > +	return device_supports_offline(dev) && !dev->offline ? -EBUSY : 0;
> > +}
> > +
> > +/**
> > + * device_offline - Prepare the device for hot-removal.
> > + * @dev: Device to be put offline.
> > + *
> > + * Execute the device bus type's .offline() callback, if present, to prepare
> > + * the device for a subsequent hot-removal.  If that succeeds, the device must
> > + * not be used until either it is removed or its bus type's .online() callback
> > + * is executed.
> > + *
> > + * Call under device_offline_lock.
> > + */
> > +int device_offline(struct device *dev)
> > +{
> > +	int ret;
> > +
> > +	if (dev->offline_disabled)
> > +		return -EPERM;
> > +
> > +	ret = device_for_each_child(dev, NULL, device_check_offline);
> > +	if (ret)
> > +		return ret;
> > +
> > +	device_lock(dev);
> > +	if (device_supports_offline(dev)) {
> > +		if (dev->offline) {
> > +			ret = 1;
> > +		} else {
> > +			ret = dev->bus->offline(dev);
> > +			if (!ret) {
> > +				kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
> > +				dev->offline = true;
> > +			}
> > +		}
> > +	}
> > +	device_unlock(dev);
> > +
> > +	return ret;
> > +}
> > +
> > +/**
> > + * device_online - Put the device back online after successful device_offline().
> > + * @dev: Device to be put back online.
> > + *
> > + * If device_offline() has been successfully executed for @dev, but the device
> > + * has not been removed subsequently, execute its bus type's .online() callback
> > + * to indicate that the device can be used again.
> > + *
> > + * Call under device_offline_lock.
> > + */
> > +int device_online(struct device *dev)
> > +{
> > +	int ret = 0;
> > +
> > +	device_lock(dev);
> > +	if (device_supports_offline(dev)) {
> > +		if (dev->offline) {
> > +			ret = dev->bus->online(dev);
> > +			if (!ret) {
> > +				kobject_uevent(&dev->kobj, KOBJ_ONLINE);
> > +				dev->offline = false;
> > +			}
> > +		} else {
> > +			ret = 1;
> > +		}
> > +	}
> > +	device_unlock(dev);
> > +
> > +	return ret;
> > +}
> 
> We don't grab the offline lock for when we go offline/online?  I like
> the device_lock() call.  I don't understand what the offline locking is
> supposed to be protecting as you don't use it here.  Will it make more
> sense in the rest of the patches?

Yes, like I said above, it's only needed by patch [3/3], so I can move it
there.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/3 RFC] Driver core: Use generic offline/online for CPU offline/online
  2013-04-29 23:11   ` Greg Kroah-Hartman
@ 2013-04-30 12:01     ` Rafael J. Wysocki
  2013-04-30 15:27       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-04-30 12:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis

On Monday, April 29, 2013 04:11:06 PM Greg Kroah-Hartman wrote:
> On Mon, Apr 29, 2013 at 02:28:02PM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Rework the CPU hotplug code in drivers/base/cpu.c to use the
> > generic offline/online support introduced previously instead of
> > its own CPU-specific code.
> > 
> > For this purpose, modify cpu_subsys to provide offline and online
> > callbacks for CONFIG_HOTPLUG_CPU set and remove the code handling
> > the CPU-specific 'online' sysfs attribute.
> > 
> > This modification is not supposed to change the user-observable
> > behavior of the kernel (i.e. the 'online' attribute will be present
> > in exactly the same place in sysfs and should trigger exactly the
> > same actions as before).
> > 
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  drivers/base/cpu.c |   62 ++++++++++++-----------------------------------------
> >  1 file changed, 15 insertions(+), 47 deletions(-)
> 
> Very nice, I like reductions like this :)

Thanks!

So I guess the patches make sense to you overall?

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/3 RFC] Driver core: Use generic offline/online for CPU offline/online
  2013-04-30 12:01     ` Rafael J. Wysocki
@ 2013-04-30 15:27       ` Greg Kroah-Hartman
  2013-04-30 20:06         ` Rafael J. Wysocki
  0 siblings, 1 reply; 105+ messages in thread
From: Greg Kroah-Hartman @ 2013-04-30 15:27 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis

On Tue, Apr 30, 2013 at 02:01:10PM +0200, Rafael J. Wysocki wrote:
> On Monday, April 29, 2013 04:11:06 PM Greg Kroah-Hartman wrote:
> > On Mon, Apr 29, 2013 at 02:28:02PM +0200, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > 
> > > Rework the CPU hotplug code in drivers/base/cpu.c to use the
> > > generic offline/online support introduced previously instead of
> > > its own CPU-specific code.
> > > 
> > > For this purpose, modify cpu_subsys to provide offline and online
> > > callbacks for CONFIG_HOTPLUG_CPU set and remove the code handling
> > > the CPU-specific 'online' sysfs attribute.
> > > 
> > > This modification is not supposed to change the user-observable
> > > behavior of the kernel (i.e. the 'online' attribute will be present
> > > in exactly the same place in sysfs and should trigger exactly the
> > > same actions as before).
> > > 
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > ---
> > >  drivers/base/cpu.c |   62 ++++++++++++-----------------------------------------
> > >  1 file changed, 15 insertions(+), 47 deletions(-)
> > 
> > Very nice, I like reductions like this :)
> 
> Thanks!
> 
> So I guess the patches make sense to you overall?

Overall, yes, I like them a lot.

greg k-h

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 1/3 RFC] Driver core: Add offline/online device operations
  2013-04-30 11:59     ` Rafael J. Wysocki
@ 2013-04-30 15:32       ` Greg Kroah-Hartman
  2013-04-30 20:05         ` Rafael J. Wysocki
  0 siblings, 1 reply; 105+ messages in thread
From: Greg Kroah-Hartman @ 2013-04-30 15:32 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis

On Tue, Apr 30, 2013 at 01:59:55PM +0200, Rafael J. Wysocki wrote:
> On Monday, April 29, 2013 04:10:19 PM Greg Kroah-Hartman wrote:
> > On Mon, Apr 29, 2013 at 02:26:56PM +0200, Rafael J. Wysocki wrote:
> > > +static inline bool device_supports_offline(struct device *dev)
> > > +{
> > > +	return dev->bus && dev->bus->offline && dev->bus->online;
> > 
> > Wouldn't it be easier for us to also check offline_disabled here as
> > well?  That would save the extra check when we go to create the sysfs
> > file.
> 
> Yes, it would, but I want device_offline() to return an error when
> offline_disabled is set even though device_supports_offline() returns 'true'.
> If that check were folded into device_supports_offline(), device_offline()
> would return 0 in that case.

Ok, that makes sense.

> > > +static ssize_t store_online(struct device *dev, struct device_attribute *attr,
> > > +			    const char *buf, size_t count)
> > > +{
> > > +	int ret;
> > > +
> > > +	lock_device_offline();
> > > +	switch (buf[0]) {
> > > +	case '0':
> > > +		ret = device_offline(dev);
> > > +		break;
> > > +	case '1':
> > > +		ret = device_online(dev);
> > > +		break;
> > 
> > Should we also accept 'y', 'Y', 'n', and 'N', like most boolean sysfs
> > files do?  I think we even have a kernel helper function for it
> > somewhere...
> 
> Yes, we do, but it doesn't accept '0' as false. :-)

It doesn't?  That's crazy, and should be fixed.

> Well, I suppose I can modify that function and use it here.  What do
> you think?

Yes please.

> > > +static DEFINE_MUTEX(device_offline_lock);
> > > +
> > > +void lock_device_offline(void)
> > > +{
> > > +	mutex_lock(&device_offline_lock);
> > > +}
> > > +
> > > +void unlock_device_offline(void)
> > > +{
> > > +	mutex_unlock(&device_offline_lock);
> > > +}
> > 
> > Why have functions?  Why not just do the mutex_lock/unlock instead
> > everywhere?
> 
> Ah, that's something I forgot to write about in the changelog.
> 
> Patch [3/3] depends on that, because it has to take device_offline_lock around
> a larger piece of code.  Specifically, it needs to put acpi_bus_trim() under
> that lock too to avoid situations in which a previously offlined device would
> be onlined from user space right before (or worse yet during) acpi_bus_trim()
> (which would then remove it without offlining).
> 
> It is not necessary in [1/3], so I can move it to [3/3] if that's better.

No, that makes sense, but doesn't that mean you need to export the
symbols as well?  Oh, never mind, acpi can't be a module, that's fine.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 1/3 RFC] Driver core: Add offline/online device operations
  2013-04-30 15:32       ` Greg Kroah-Hartman
@ 2013-04-30 20:05         ` Rafael J. Wysocki
  0 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-04-30 20:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis

On Tuesday, April 30, 2013 08:32:28 AM Greg Kroah-Hartman wrote:
> On Tue, Apr 30, 2013 at 01:59:55PM +0200, Rafael J. Wysocki wrote:
> > On Monday, April 29, 2013 04:10:19 PM Greg Kroah-Hartman wrote:
> > > On Mon, Apr 29, 2013 at 02:26:56PM +0200, Rafael J. Wysocki wrote:
> > > > +static inline bool device_supports_offline(struct device *dev)
> > > > +{
> > > > +	return dev->bus && dev->bus->offline && dev->bus->online;
> > > 
> > > Wouldn't it be easier for us to also check offline_disabled here as
> > > well?  That would save the extra check when we go to create the sysfs
> > > file.
> > 
> > Yes, it would, but I want device_offline() to return an error when
> > offline_disabled is set even though device_supports_offline() returns 'true'.
> > If that check were folded into device_supports_offline(), device_offline()
> > would return 0 in that case.
> 
> Ok, that makes sense.
> 
> > > > +static ssize_t store_online(struct device *dev, struct device_attribute *attr,
> > > > +			    const char *buf, size_t count)
> > > > +{
> > > > +	int ret;
> > > > +
> > > > +	lock_device_offline();
> > > > +	switch (buf[0]) {
> > > > +	case '0':
> > > > +		ret = device_offline(dev);
> > > > +		break;
> > > > +	case '1':
> > > > +		ret = device_online(dev);
> > > > +		break;
> > > 
> > > Should we also accept 'y', 'Y', 'n', and 'N', like most boolean sysfs
> > > files do?  I think we even have a kernel helper function for it
> > > somewhere...
> > 
> > Yes, we do, but it doesn't accept '0' as false. :-)
> 
> It doesn't?  That's crazy, and should be fixed.
> 
> > Well, I suppose I can modify that function and use it here.  What do
> > you think?
> 
> Yes please.

In fact, the function is OK, but http://lxr.free-electrons.com/source/lib/string.c#L549
shows it incorrectly.  I'll use strtobool() going forward.
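
For reference, the parsing behaves roughly like the userspace model below.
This is a sketch, not the lib/string.c code: model_strtobool() and
model_online_request() are hypothetical names, and the model assumes (as I
read the current source) that only the first character of the input is
examined.

```c
#include <errno.h>
#include <stdbool.h>

/* Userspace model of the helper: the first character decides the result. */
static int model_strtobool(const char *s, bool *res)
{
	switch (s[0]) {
	case 'y':
	case 'Y':
	case '1':
		*res = true;
		break;
	case 'n':
	case 'N':
	case '0':
		*res = false;
		break;
	default:
		return -EINVAL;
	}
	return 0;
}

/*
 * Hypothetical front end for store_online(): returns 1 for an online
 * request, 0 for an offline request and -EINVAL for anything else.
 */
static int model_online_request(const char *buf)
{
	bool val;
	int ret;

	ret = model_strtobool(buf, &val);
	if (ret < 0)
		return ret;

	return val ? 1 : 0;
}
```

With something like that in place the open-coded switch over '0' and '1' in
store_online() goes away, and 'y', 'Y', 'n' and 'N' are accepted as well.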

> > > > +static DEFINE_MUTEX(device_offline_lock);
> > > > +
> > > > +void lock_device_offline(void)
> > > > +{
> > > > +	mutex_lock(&device_offline_lock);
> > > > +}
> > > > +
> > > > +void unlock_device_offline(void)
> > > > +{
> > > > +	mutex_unlock(&device_offline_lock);
> > > > +}
> > > 
> > > Why have functions?  Why not just do the mutex_lock/unlock instead
> > > everywhere?
> > 
> > Ah, that's something I forgot to write about in the changelog.
> > 
> > Patch [3/3] depends on that, because it has to take device_offline_lock around
> > a larger piece of code.  Specifically, it needs to put acpi_bus_trim() under
> > that lock too to avoid situations in which a previously offlined device would
> > be onlined from user space right before (or worse yet during) acpi_bus_trim()
> > (which would then remove it without offlining).
> > 
> > It is not necessary in [1/3], so I can move it to [3/3] if that's better.
> 
> No, that makes sense, but doesn't that mean you need to export the
> symbols as well?  Oh, never mind, acpi can't be a module, that's fine.

Yup.  The exports may be added when someone needs them.

At the moment I'm working on untangling the ACPI processor driver which is
somewhat more complicated than I'd expected (oh well).  When that's done, I'll
post a more complete series of patches.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/3 RFC] Driver core: Use generic offline/online for CPU offline/online
  2013-04-30 15:27       ` Greg Kroah-Hartman
@ 2013-04-30 20:06         ` Rafael J. Wysocki
  0 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-04-30 20:06 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis

On Tuesday, April 30, 2013 08:27:53 AM Greg Kroah-Hartman wrote:
> On Tue, Apr 30, 2013 at 02:01:10PM +0200, Rafael J. Wysocki wrote:
> > On Monday, April 29, 2013 04:11:06 PM Greg Kroah-Hartman wrote:
> > > On Mon, Apr 29, 2013 at 02:28:02PM +0200, Rafael J. Wysocki wrote:
> > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > 
> > > > Rework the CPU hotplug code in drivers/base/cpu.c to use the
> > > > generic offline/online support introduced previously instead of
> > > > its own CPU-specific code.
> > > > 
> > > > For this purpose, modify cpu_subsys to provide offline and online
> > > > callbacks for CONFIG_HOTPLUG_CPU set and remove the code handling
> > > > the CPU-specific 'online' sysfs attribute.
> > > > 
> > > > This modification is not supposed to change the user-observable
> > > > behavior of the kernel (i.e. the 'online' attribute will be present
> > > > in exactly the same place in sysfs and should trigger exactly the
> > > > same actions as before).
> > > > 
> > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > ---
> > > >  drivers/base/cpu.c |   62 ++++++++++++-----------------------------------------
> > > >  1 file changed, 15 insertions(+), 47 deletions(-)
> > > 
> > > Very nice, I like reductions like this :)
> > 
> > Thanks!
> > 
> > So I guess the patches make sense to you overall?
> 
> Overall, yes, I like them a lot.

Cool, thanks! :-)

Rafael


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 1/3 RFC] Driver core: Add offline/online device operations
  2013-04-29 12:26 ` [PATCH 1/3 RFC] Driver core: Add offline/online device operations Rafael J. Wysocki
  2013-04-29 23:10   ` Greg Kroah-Hartman
@ 2013-04-30 23:38   ` Toshi Kani
  2013-05-02  0:58     ` Rafael J. Wysocki
  1 sibling, 1 reply; 105+ messages in thread
From: Toshi Kani @ 2013-04-30 23:38 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg Kroah-Hartman, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis

On Mon, 2013-04-29 at 14:26 +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> In some cases, graceful hot-removal of devices is not possible,
> although in principle the devices in question support hotplug.
> For example, that may happen for the last CPU in the system or
> for memory modules holding kernel memory.
> 
> In those cases it is nice to be able to check if the given device
> can be safely hot-removed before triggering a removal procedure
> that cannot be aborted or reversed.  Unfortunately, however, the
> kernel currently doesn't provide any support for that.
> 
> To address that deficiency, introduce support for offline and
> online operations that can be performed on devices, respectively,
> before a hot-removal and in case when it is necessary (or convenient)
> to put a device back online after a successful offline (that has not
> been followed by removal).  The idea is that the offline will fail
> whenever the given device cannot be gracefully removed from the
> system and it will not be allowed to use the device after a
> successful offline (until a subsequent online) in analogy with the
> existing CPU offline/online mechanism.
> 
> For now, the offline and online operations are introduced at the
> bus type level, as that should be sufficient for the most urgent use
> cases (CPUs and memory modules).  In the future, however, the
> approach may be extended to cover some more complicated device
> offline/online scenarios involving device drivers etc.

I like this approach much better than the user space approach we
considered before. :)  My comments below.

> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  Documentation/ABI/testing/sysfs-devices-online |   19 +++
>  drivers/base/core.c                            |  134 +++++++++++++++++++++++++
>  include/linux/device.h                         |   21 +++
>  3 files changed, 174 insertions(+)
> 
> Index: linux-pm/include/linux/device.h
> ===================================================================
> --- linux-pm.orig/include/linux/device.h
> +++ linux-pm/include/linux/device.h
> @@ -70,6 +70,10 @@ extern void bus_remove_file(struct bus_t
>   *		the specific driver's probe to initial the matched device.
>   * @remove:	Called when a device removed from this bus.
>   * @shutdown:	Called at shut-down time to quiesce the device.
> + *
> + * @online:	Called to put the device back online (after offlining it).
> + * @offline:	Called to put the device offline for hot-removal. May fail.
> + *
>   * @suspend:	Called when a device on this bus wants to go to sleep mode.
>   * @resume:	Called to bring a device on this bus out of sleep mode.
>   * @pm:		Power management operations of this bus, callback the specific
> @@ -103,6 +107,9 @@ struct bus_type {
>  	int (*remove)(struct device *dev);
>  	void (*shutdown)(struct device *dev);
>  
> +	int (*online)(struct device *dev);
> +	int (*offline)(struct device *dev);
> +
>  	int (*suspend)(struct device *dev, pm_message_t state);
>  	int (*resume)(struct device *dev);
>  
> @@ -646,6 +653,8 @@ struct acpi_dev_node {
>   * @release:	Callback to free the device after all references have
>   * 		gone away. This should be set by the allocator of the
>   * 		device (i.e. the bus driver that discovered the device).
> + * @offline_disabled: If set, the device is permanently online.
> + * @offline:	Set after successful invocation of bus type's .offline().
>   *
>   * At the lowest level, every device in a Linux system is represented by an
>   * instance of struct device. The device structure contains the information
> @@ -718,6 +727,9 @@ struct device {
>  
>  	void	(*release)(struct device *dev);
>  	struct iommu_group	*iommu_group;
> +
> +	bool			offline_disabled:1;
> +	bool			offline:1;
>  };
>  
>  static inline struct device *kobj_to_dev(struct kobject *kobj)
> @@ -853,6 +865,15 @@ extern const char *device_get_devnode(st
>  extern void *dev_get_drvdata(const struct device *dev);
>  extern int dev_set_drvdata(struct device *dev, void *data);
>  
> +static inline bool device_supports_offline(struct device *dev)
> +{
> +	return dev->bus && dev->bus->offline && dev->bus->online;
> +}
> +
> +extern void lock_device_offline(void);
> +extern void unlock_device_offline(void);
> +extern int device_offline(struct device *dev);
> +extern int device_online(struct device *dev);
>  /*
>   * Root device objects for grouping under /sys/devices
>   */
> Index: linux-pm/drivers/base/core.c
> ===================================================================
> --- linux-pm.orig/drivers/base/core.c
> +++ linux-pm/drivers/base/core.c
> @@ -397,6 +397,40 @@ static ssize_t store_uevent(struct devic
>  static struct device_attribute uevent_attr =
>  	__ATTR(uevent, S_IRUGO | S_IWUSR, show_uevent, store_uevent);
>  
> +static ssize_t show_online(struct device *dev, struct device_attribute *attr,
> +			   char *buf)
> +{
> +	bool ret;
> +
> +	lock_device_offline();
> +	ret = !dev->offline;
> +	unlock_device_offline();
> +	return sprintf(buf, "%u\n", ret);
> +}
> +
> +static ssize_t store_online(struct device *dev, struct device_attribute *attr,
> +			    const char *buf, size_t count)
> +{
> +	int ret;
> +
> +	lock_device_offline();
> +	switch (buf[0]) {
> +	case '0':
> +		ret = device_offline(dev);
> +		break;
> +	case '1':
> +		ret = device_online(dev);
> +		break;

memblk has multiple types of online operations specific to memory
devices, such as "online_kernel" and "online_movable".  As memblk needs
to be integrated into this framework for addressing the crash issue, we
need to think about how they can be generalized into this operation.

> +	default:
> +		ret = -EINVAL;
> +	}
> +	unlock_device_offline();
> +	return ret < 0 ? ret : count;
> +}
> +
> +static struct device_attribute online_attr =
> +	__ATTR(online, S_IRUGO | S_IWUSR, show_online, store_online);
> +
>  static int device_add_attributes(struct device *dev,
>  				 struct device_attribute *attrs)
>  {
> @@ -510,6 +544,12 @@ static int device_add_attrs(struct devic
>  	if (error)
>  		goto err_remove_type_groups;
>  
> +	if (device_supports_offline(dev) && !dev->offline_disabled) {
> +		error = device_create_file(dev, &online_attr);
> +		if (error)
> +			goto err_remove_type_groups;
> +	}
> +
>  	return 0;
>  
>   err_remove_type_groups:
> @@ -530,6 +570,7 @@ static void device_remove_attrs(struct d
>  	struct class *class = dev->class;
>  	const struct device_type *type = dev->type;
>  
> +	device_remove_file(dev, &online_attr);
>  	device_remove_groups(dev, dev->groups);
>  
>  	if (type)
> @@ -1415,6 +1456,99 @@ EXPORT_SYMBOL_GPL(put_device);
>  EXPORT_SYMBOL_GPL(device_create_file);
>  EXPORT_SYMBOL_GPL(device_remove_file);
>  
> +static DEFINE_MUTEX(device_offline_lock);
> +
> +void lock_device_offline(void)
> +{
> +	mutex_lock(&device_offline_lock);
> +}
> +
> +void unlock_device_offline(void)
> +{
> +	mutex_unlock(&device_offline_lock);
> +}
> +
> +static int device_check_offline(struct device *dev, void *not_used)
> +{
> +	int ret;
> +
> +	ret = device_for_each_child(dev, NULL, device_check_offline);
> +	if (ret)
> +		return ret;
> +
> +	return device_supports_offline(dev) && !dev->offline ? -EBUSY : 0;
> +}
> +
> +/**
> + * device_offline - Prepare the device for hot-removal.
> + * @dev: Device to be put offline.
> + *
> + * Execute the device bus type's .offline() callback, if present, to prepare
> + * the device for a subsequent hot-removal.  If that succeeds, the device must
> + * not be used until either it is removed or its bus type's .online() callback
> + * is executed.
> + *
> + * Call under device_offline_lock.
> + */
> +int device_offline(struct device *dev)
> +{
> +	int ret;
> +
> +	if (dev->offline_disabled)
> +		return -EPERM;
> +
> +	ret = device_for_each_child(dev, NULL, device_check_offline);
> +	if (ret)
> +		return ret;
> +
> +	device_lock(dev);
> +	if (device_supports_offline(dev)) {
> +		if (dev->offline) {
> +			ret = 1;
> +		} else {
> +			ret = dev->bus->offline(dev);
> +			if (!ret) {
> +				kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
> +				dev->offline = true;

Shouldn't this offline flag be set before sending KOBJ_OFFLINE?
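
A minimal userspace sketch of the ordering concern, assuming a listener
inspects the flag at the moment the event is delivered.  struct model_dev
and model_uevent() are hypothetical stand-ins, not kernel APIs, and the
model ignores the locking done by the real code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device; only the flag matters here. */
struct model_dev {
	bool offline;
	bool flag_seen_by_listener;	/* flag value observed at event time */
};

/* Hypothetical stand-in for kobject_uevent(): snapshot the flag. */
static void model_uevent(struct model_dev *dev)
{
	dev->flag_seen_by_listener = dev->offline;
}

/* Order as posted in the patch: notify first, then set the flag. */
static void offline_notify_first(struct model_dev *dev)
{
	assert(dev != NULL);
	model_uevent(dev);
	dev->offline = true;
}

/* Order suggested in the review: set the flag, then notify. */
static void offline_flag_first(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->offline = true;
	model_uevent(dev);
}
```

In this model only offline_flag_first() lets the listener observe the
updated flag when the event arrives.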

> +			}
> +		}
> +	}
> +	device_unlock(dev);
> +
> +	return ret;
> +}
> +
> +/**
> + * device_online - Put the device back online after successful device_offline().
> + * @dev: Device to be put back online.
> + *
> + * If device_offline() has been successfully executed for @dev, but the device
> + * has not been removed subsequently, execute its bus type's .online() callback
> + * to indicate that the device can be used again.

There is another use case for online().  When a device such as a CPU is
hot-added, it is added in the offline state.  I am not sure why, but it
has been this way.  So, we need to call online() to make a new device
available for use after a hot-add.

> + *
> + * Call under device_offline_lock.
> + */
> +int device_online(struct device *dev)
> +{
> +	int ret = 0;
> +
> +	device_lock(dev);
> +	if (device_supports_offline(dev)) {
> +		if (dev->offline) {
> +			ret = dev->bus->online(dev);
> +			if (!ret) {
> +				kobject_uevent(&dev->kobj, KOBJ_ONLINE);
> +				dev->offline = false;

Same comment as KOBJ_OFFLINE.

> +			}
> +		} else {
> +			ret = 1;

This case has a problem in the hot-add use case I mentioned above.  When
a new device is added, dev->offline is initialized to 0.  So,
device_online() thinks the device is online already and returns 1
without calling the bus type's .online() callback.
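
The hot-add problem can be sketched in userspace as follows.  The model
mirrors only the control flow of device_online() from the patch; struct
model_dev, model_bus_online() and model_hot_add() are hypothetical names,
and letting the hot-add path choose the initial flag value is an
assumption made for illustration, not something the patch does:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device. */
struct model_dev {
	bool offline;
	int online_calls;	/* times the bus-level online callback ran */
};

/* Hypothetical bus-level .online() callback that always succeeds. */
static int model_bus_online(struct model_dev *dev)
{
	dev->online_calls++;
	return 0;
}

/* Same branching as device_online() in the patch, minus locking/uevents. */
static int model_device_online(struct model_dev *dev)
{
	int ret;

	assert(dev != NULL);
	if (!dev->offline)
		return 1;	/* treated as already online: callback skipped */

	ret = model_bus_online(dev);
	if (!ret)
		dev->offline = false;
	return ret;
}

/* Hypothetical hot-add step that decides the initial flag value. */
static void model_hot_add(struct model_dev *dev, bool starts_offline)
{
	dev->offline = starts_offline;
	dev->online_calls = 0;
}
```

With starts_offline == false the callback is never reached; with
starts_offline == true the first model_device_online() call runs it.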

Thanks,
-Toshi


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/3 RFC] Driver core: Use generic offline/online for CPU offline/online
  2013-04-29 12:28 ` [PATCH 2/3 RFC] Driver core: Use generic offline/online for CPU offline/online Rafael J. Wysocki
  2013-04-29 23:11   ` Greg Kroah-Hartman
@ 2013-04-30 23:42   ` Toshi Kani
  2013-05-01 14:49     ` Rafael J. Wysocki
  1 sibling, 1 reply; 105+ messages in thread
From: Toshi Kani @ 2013-04-30 23:42 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg Kroah-Hartman, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis

On Mon, 2013-04-29 at 14:28 +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Rework the CPU hotplug code in drivers/base/cpu.c to use the
> generic offline/online support introduced previously instead of
> its own CPU-specific code.
> 
> For this purpose, modify cpu_subsys to provide offline and online
> callbacks for CONFIG_HOTPLUG_CPU set and remove the code handling
> the CPU-specific 'online' sysfs attribute.
> 
> This modification is not supposed to change the user-observable
> behavior of the kernel (i.e. the 'online' attribute will be present
> in exactly the same place in sysfs and should trigger exactly the
> same actions as before).
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/base/cpu.c |   62 ++++++++++++-----------------------------------------
>  1 file changed, 15 insertions(+), 47 deletions(-)
> 
> Index: linux-pm/drivers/base/cpu.c
> ===================================================================
> --- linux-pm.orig/drivers/base/cpu.c
> +++ linux-pm/drivers/base/cpu.c
> @@ -16,66 +16,25 @@
>  
>  #include "base.h"
>  
> -struct bus_type cpu_subsys = {
> -	.name = "cpu",
> -	.dev_name = "cpu",
> -};
> -EXPORT_SYMBOL_GPL(cpu_subsys);
> -
>  static DEFINE_PER_CPU(struct device *, cpu_sys_devices);
>  
>  #ifdef CONFIG_HOTPLUG_CPU
> -static ssize_t show_online(struct device *dev,
> -			   struct device_attribute *attr,
> -			   char *buf)
> +static int cpu_subsys_online(struct device *dev)
>  {
> -	struct cpu *cpu = container_of(dev, struct cpu, dev);
> -
> -	return sprintf(buf, "%u\n", !!cpu_online(cpu->dev.id));
> +	return cpu_up(dev->id);
>  }
>  
> -static ssize_t __ref store_online(struct device *dev,
> -				  struct device_attribute *attr,
> -				  const char *buf, size_t count)
> +static int cpu_subsys_offline(struct device *dev)
>  {
> -	struct cpu *cpu = container_of(dev, struct cpu, dev);
> -	ssize_t ret;
> -
> -	cpu_hotplug_driver_lock();

By replacing cpu_hotplug_driver_lock() with lock_device_offline() in
patch 1/3, it no longer protects against other places that still use
cpu_hotplug_driver_lock(), such as save_mc_for_early().

Thanks,
-Toshi


> -	switch (buf[0]) {
> -	case '0':
> -		ret = cpu_down(cpu->dev.id);
> -		if (!ret)
> -			kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
> -		break;
> -	case '1':
> -		ret = cpu_up(cpu->dev.id);
> -		if (!ret)
> -			kobject_uevent(&dev->kobj, KOBJ_ONLINE);
> -		break;
> -	default:
> -		ret = -EINVAL;
> -	}
> -	cpu_hotplug_driver_unlock();
> -
> -	if (ret >= 0)
> -		ret = count;
> -	return ret;
> +	return cpu_down(dev->id);
>  }
> -static DEVICE_ATTR(online, 0644, show_online, store_online);



^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 3/3 RFC] ACPI / hotplug: Use device offline/online for graceful hot-removal
  2013-04-29 12:29 ` [PATCH 3/3 RFC] ACPI / hotplug: Use device offline/online for graceful hot-removal Rafael J. Wysocki
@ 2013-04-30 23:49   ` Toshi Kani
  2013-05-01 15:05     ` Rafael J. Wysocki
  0 siblings, 1 reply; 105+ messages in thread
From: Toshi Kani @ 2013-04-30 23:49 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg Kroah-Hartman, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis

On Mon, 2013-04-29 at 14:29 +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Modify the generic ACPI hotplug code to be able to check if devices
> scheduled for hot-removal may be gracefully removed from the system
> using the device offline/online mechanism introduced previously.
> 
> Namely, make acpi_scan_hot_remove() which handles device hot-removal
> call device_offline() for all physical companions of the ACPI device
> nodes involved in the operation and check the results.  If any of
> the device_offline() calls fails, the function will not progress to
> the removal phase (which cannot be aborted), unless its (new) force
> argument is set (in case of a failing offline it will put the devices
> offlined by it back online).
> 
> In support of the 'forced' hot-removal, add a new sysfs attribute
> 'force_remove' that will reside in every ACPI hotplug profile
> present under /sys/firmware/acpi/hotplug/.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  Documentation/ABI/testing/sysfs-firmware-acpi |    9 +-
>  drivers/acpi/internal.h                       |    2 
>  drivers/acpi/scan.c                           |   97 ++++++++++++++++++++++++--
>  drivers/acpi/sysfs.c                          |   27 +++++++
>  include/acpi/acpi_bus.h                       |    3 
>  5 files changed, 131 insertions(+), 7 deletions(-)
> 
 :
> Index: linux-pm/drivers/acpi/scan.c
> ===================================================================
> --- linux-pm.orig/drivers/acpi/scan.c
> +++ linux-pm/drivers/acpi/scan.c
> @@ -120,7 +120,61 @@ acpi_device_modalias_show(struct device
>  }
>  static DEVICE_ATTR(modalias, 0444, acpi_device_modalias_show, NULL);
>  
> -static int acpi_scan_hot_remove(struct acpi_device *device)
> +static acpi_status acpi_bus_offline_companions(acpi_handle handle, u32 lvl,
> +					       void *data, void **ret_p)
> +{
> +	struct acpi_device *device = NULL;
> +	struct acpi_device_physical_node *pn;
> +	bool force = *((bool *)data);
> +	acpi_status status = AE_OK;
> +
> +	if (acpi_bus_get_device(handle, &device))
> +		return AE_OK;
> +
> +	mutex_lock(&device->physical_node_lock);
> +
> +	list_for_each_entry(pn, &device->physical_node_list, node) {

I do not think physical_node_list is set for ACPI processor devices, so
this code is NOP at this point.  I think properly initializing
physical_node_list for CPU and memblk is one of the key items in this
approach.

> +		int ret;
> +
> +		ret = device_offline(pn->dev);
> +		if (force)
> +			continue;
> +
> +		if (ret < 0) {
> +			status = AE_ERROR;
> +			break;
> +		}
> +		pn->put_online = !ret;
> +	}
> +
> +	mutex_unlock(&device->physical_node_lock);
> +
> +	return status;
> +}
> +
> +static acpi_status acpi_bus_online_companions(acpi_handle handle, u32 lvl,
> +					      void *data, void **ret_p)
> +{
> +	struct acpi_device *device = NULL;
> +	struct acpi_device_physical_node *pn;
> +
> +	if (acpi_bus_get_device(handle, &device))
> +		return AE_OK;
> +
> +	mutex_lock(&device->physical_node_lock);
> +
> +	list_for_each_entry(pn, &device->physical_node_list, node)
> +		if (pn->put_online) {
> +			device_online(pn->dev);
> +			pn->put_online = false;
> +		}
> +
> +	mutex_unlock(&device->physical_node_lock);
> +
> +	return AE_OK;
> +}
> +
> +static int acpi_scan_hot_remove(struct acpi_device *device, bool force)
>  {
>  	acpi_handle handle = device->handle;
>  	acpi_handle not_used;
> @@ -136,10 +190,30 @@ static int acpi_scan_hot_remove(struct a
>  		return -EINVAL;
>  	}
>  
> +	lock_device_offline();
> +
> +	status = acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
> +				     NULL, acpi_bus_offline_companions, &force,
> +				     NULL);
> +	if (ACPI_SUCCESS(status) || force)
> +		status = acpi_bus_offline_companions(handle, 0, &force, NULL);
> +
> +	if (ACPI_FAILURE(status) && !force) {
> +		acpi_bus_online_companions(handle, 0, NULL, NULL);
> +		acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
> +				    acpi_bus_online_companions, NULL, NULL,
> +				    NULL);
> +		unlock_device_offline();

Don't we need put_device(&device->dev) here?
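
A minimal userspace sketch of the balanced error path, assuming the
caller holds a reference on the device when acpi_scan_hot_remove() is
entered.  struct model_acpi_dev and the model_*() helpers are
hypothetical stand-ins, and offline_ok abstracts the result of the
offline walk:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct acpi_device with a bare refcount. */
struct model_acpi_dev {
	int refcount;
};

static void model_get_device(struct model_acpi_dev *dev)
{
	dev->refcount++;
}

static void model_put_device(struct model_acpi_dev *dev)
{
	assert(dev->refcount > 0);
	dev->refcount--;
}

/* Consumes the caller's reference on every return path. */
static int model_hot_remove(struct model_acpi_dev *dev, bool offline_ok,
			    bool force)
{
	assert(dev != NULL);

	if (!offline_ok && !force) {
		/* Devices would be put back online here, then bail out. */
		model_put_device(dev);
		return -EBUSY;
	}

	/* The removal phase would run here. */
	model_put_device(dev);
	return 0;
}
```

Without the model_put_device() call in the -EBUSY branch, the reference
taken by the caller would never be dropped.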

Thanks,
-Toshi


> +		return -EBUSY;
> +	}
> +



^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/3 RFC] Driver core: Use generic offline/online for CPU offline/online
  2013-04-30 23:42   ` Toshi Kani
@ 2013-05-01 14:49     ` Rafael J. Wysocki
  2013-05-01 20:07       ` Toshi Kani
  0 siblings, 1 reply; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-01 14:49 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Greg Kroah-Hartman, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis

On Tuesday, April 30, 2013 05:42:06 PM Toshi Kani wrote:
> On Mon, 2013-04-29 at 14:28 +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Rework the CPU hotplug code in drivers/base/cpu.c to use the
> > generic offline/online support introduced previously instead of
> > its own CPU-specific code.
> > 
> > For this purpose, modify cpu_subsys to provide offline and online
> > callbacks for CONFIG_HOTPLUG_CPU set and remove the code handling
> > the CPU-specific 'online' sysfs attribute.
> > 
> > This modification is not supposed to change the user-observable
> > behavior of the kernel (i.e. the 'online' attribute will be present
> > in exactly the same place in sysfs and should trigger exactly the
> > same actions as before).
> > 
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  drivers/base/cpu.c |   62 ++++++++++++-----------------------------------------
> >  1 file changed, 15 insertions(+), 47 deletions(-)
> > 
> > Index: linux-pm/drivers/base/cpu.c
> > ===================================================================
> > --- linux-pm.orig/drivers/base/cpu.c
> > +++ linux-pm/drivers/base/cpu.c
> > @@ -16,66 +16,25 @@
> >  
> >  #include "base.h"
> >  
> > -struct bus_type cpu_subsys = {
> > -	.name = "cpu",
> > -	.dev_name = "cpu",
> > -};
> > -EXPORT_SYMBOL_GPL(cpu_subsys);
> > -
> >  static DEFINE_PER_CPU(struct device *, cpu_sys_devices);
> >  
> >  #ifdef CONFIG_HOTPLUG_CPU
> > -static ssize_t show_online(struct device *dev,
> > -			   struct device_attribute *attr,
> > -			   char *buf)
> > +static int cpu_subsys_online(struct device *dev)
> >  {
> > -	struct cpu *cpu = container_of(dev, struct cpu, dev);
> > -
> > -	return sprintf(buf, "%u\n", !!cpu_online(cpu->dev.id));
> > +	return cpu_up(dev->id);
> >  }
> >  
> > -static ssize_t __ref store_online(struct device *dev,
> > -				  struct device_attribute *attr,
> > -				  const char *buf, size_t count)
> > +static int cpu_subsys_offline(struct device *dev)
> >  {
> > -	struct cpu *cpu = container_of(dev, struct cpu, dev);
> > -	ssize_t ret;
> > -
> > -	cpu_hotplug_driver_lock();
> 
> By replacing cpu_hotplug_driver_lock() with lock_device_offline() in
> patch 1/3, it no longer protects against other places that still use
> cpu_hotplug_driver_lock(), such as save_mc_for_early().

Yes.

What about taking cpu_hotplug_driver_lock() around cpu_up() and
cpu_down() in cpu_subsys_online() and cpu_subsys_offline()?

Alternatively, I can just replace cpu_hotplug_driver_lock() with
lock_device_offline() everywhere.
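
The first option could look roughly like the userspace sketch below.
All model_*() names are hypothetical stand-ins for the kernel functions
of similar name, and the lock is modeled as a simple depth counter
rather than a real mutex:

```c
#include <assert.h>

static int driver_lock_depth;	/* > 0 while the modeled lock is held */
static int calls_under_lock;	/* cpu_up/cpu_down calls made with it held */

static void model_cpu_hotplug_driver_lock(void)
{
	driver_lock_depth++;
}

static void model_cpu_hotplug_driver_unlock(void)
{
	assert(driver_lock_depth > 0);
	driver_lock_depth--;
}

/* Stand-ins for cpu_up()/cpu_down(): record whether the lock was held. */
static int model_cpu_up(unsigned int cpu)
{
	(void)cpu;
	if (driver_lock_depth > 0)
		calls_under_lock++;
	return 0;
}

static int model_cpu_down(unsigned int cpu)
{
	(void)cpu;
	if (driver_lock_depth > 0)
		calls_under_lock++;
	return 0;
}

/* Bus-level callbacks taking the driver lock around the real work. */
static int model_cpu_subsys_online(unsigned int cpu)
{
	int ret;

	model_cpu_hotplug_driver_lock();
	ret = model_cpu_up(cpu);
	model_cpu_hotplug_driver_unlock();
	return ret;
}

static int model_cpu_subsys_offline(unsigned int cpu)
{
	int ret;

	model_cpu_hotplug_driver_lock();
	ret = model_cpu_down(cpu);
	model_cpu_hotplug_driver_unlock();
	return ret;
}
```

This keeps the existing users of cpu_hotplug_driver_lock() serialized
against the new callbacks without renaming or merging the two locks.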

Thanks,
Rafael


> > -	switch (buf[0]) {
> > -	case '0':
> > -		ret = cpu_down(cpu->dev.id);
> > -		if (!ret)
> > -			kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
> > -		break;
> > -	case '1':
> > -		ret = cpu_up(cpu->dev.id);
> > -		if (!ret)
> > -			kobject_uevent(&dev->kobj, KOBJ_ONLINE);
> > -		break;
> > -	default:
> > -		ret = -EINVAL;
> > -	}
> > -	cpu_hotplug_driver_unlock();
> > -
> > -	if (ret >= 0)
> > -		ret = count;
> > -	return ret;
> > +	return cpu_down(dev->id);
> >  }
> > -static DEVICE_ATTR(online, 0644, show_online, store_online);
> 
> 
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 3/3 RFC] ACPI / hotplug: Use device offline/online for graceful hot-removal
  2013-04-30 23:49   ` Toshi Kani
@ 2013-05-01 15:05     ` Rafael J. Wysocki
  2013-05-01 20:20       ` Toshi Kani
  0 siblings, 1 reply; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-01 15:05 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Greg Kroah-Hartman, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis

On Tuesday, April 30, 2013 05:49:38 PM Toshi Kani wrote:
> On Mon, 2013-04-29 at 14:29 +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Modify the generic ACPI hotplug code to be able to check if devices
> > scheduled for hot-removal may be gracefully removed from the system
> > using the device offline/online mechanism introduced previously.
> > 
> > Namely, make acpi_scan_hot_remove() which handles device hot-removal
> > call device_offline() for all physical companions of the ACPI device
> > nodes involved in the operation and check the results.  If any of
> > the device_offline() calls fails, the function will not progress to
> > the removal phase (which cannot be aborted), unless its (new) force
> > argument is set (in case of a failing offline it will put the devices
> > offlined by it back online).
> > 
> > In support of the 'forced' hot-removal, add a new sysfs attribute
> > 'force_remove' that will reside in every ACPI hotplug profile
> > present under /sys/firmware/acpi/hotplug/.
> > 
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  Documentation/ABI/testing/sysfs-firmware-acpi |    9 +-
> >  drivers/acpi/internal.h                       |    2 
> >  drivers/acpi/scan.c                           |   97 ++++++++++++++++++++++++--
> >  drivers/acpi/sysfs.c                          |   27 +++++++
> >  include/acpi/acpi_bus.h                       |    3 
> >  5 files changed, 131 insertions(+), 7 deletions(-)
> > 
>  :
> > Index: linux-pm/drivers/acpi/scan.c
> > ===================================================================
> > --- linux-pm.orig/drivers/acpi/scan.c
> > +++ linux-pm/drivers/acpi/scan.c
> > @@ -120,7 +120,61 @@ acpi_device_modalias_show(struct device
> >  }
> >  static DEVICE_ATTR(modalias, 0444, acpi_device_modalias_show, NULL);
> >  
> > -static int acpi_scan_hot_remove(struct acpi_device *device)
> > +static acpi_status acpi_bus_offline_companions(acpi_handle handle, u32 lvl,
> > +					       void *data, void **ret_p)
> > +{
> > +	struct acpi_device *device = NULL;
> > +	struct acpi_device_physical_node *pn;
> > +	bool force = *((bool *)data);
> > +	acpi_status status = AE_OK;
> > +
> > +	if (acpi_bus_get_device(handle, &device))
> > +		return AE_OK;
> > +
> > +	mutex_lock(&device->physical_node_lock);
> > +
> > +	list_for_each_entry(pn, &device->physical_node_list, node) {
> 
> I do not think physical_node_list is set for ACPI processor devices, so
> this code is NOP at this point.  I think properly initializing
> physical_node_list for CPU and memblk is one of the key items in this
> approach.

It surely is. :-)

I've almost done that for CPUs, but that still requires some more work.
Hopefully, it'll be mostly done later this week.

Memory will take some more time I guess, though.

> > +		int ret;
> > +
> > +		ret = device_offline(pn->dev);
> > +		if (force)
> > +			continue;
> > +
> > +		if (ret < 0) {
> > +			status = AE_ERROR;
> > +			break;
> > +		}
> > +		pn->put_online = !ret;
> > +	}
> > +
> > +	mutex_unlock(&device->physical_node_lock);
> > +
> > +	return status;
> > +}
> > +
> > +static acpi_status acpi_bus_online_companions(acpi_handle handle, u32 lvl,
> > +					      void *data, void **ret_p)
> > +{
> > +	struct acpi_device *device = NULL;
> > +	struct acpi_device_physical_node *pn;
> > +
> > +	if (acpi_bus_get_device(handle, &device))
> > +		return AE_OK;
> > +
> > +	mutex_lock(&device->physical_node_lock);
> > +
> > +	list_for_each_entry(pn, &device->physical_node_list, node)
> > +		if (pn->put_online) {
> > +			device_online(pn->dev);
> > +			pn->put_online = false;
> > +		}
> > +
> > +	mutex_unlock(&device->physical_node_lock);
> > +
> > +	return AE_OK;
> > +}
> > +
> > +static int acpi_scan_hot_remove(struct acpi_device *device, bool force)
> >  {
> >  	acpi_handle handle = device->handle;
> >  	acpi_handle not_used;
> > @@ -136,10 +190,30 @@ static int acpi_scan_hot_remove(struct a
> >  		return -EINVAL;
> >  	}
> >  
> > +	lock_device_offline();
> > +
> > +	status = acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
> > +				     NULL, acpi_bus_offline_companions, &force,
> > +				     NULL);
> > +	if (ACPI_SUCCESS(status) || force)
> > +		status = acpi_bus_offline_companions(handle, 0, &force, NULL);
> > +
> > +	if (ACPI_FAILURE(status) && !force) {
> > +		acpi_bus_online_companions(handle, 0, NULL, NULL);
> > +		acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
> > +				    acpi_bus_online_companions, NULL, NULL,
> > +				    NULL);
> > +		unlock_device_offline();
> 
> Don't we need put_device(&device->dev) here?

Yes, we do.  Thanks for spotting that!

Thanks for the comments.  I'll reply to your other messages later today
or tomorrow.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/3 RFC] Driver core: Use generic offline/online for CPU offline/online
  2013-05-01 14:49     ` Rafael J. Wysocki
@ 2013-05-01 20:07       ` Toshi Kani
  2013-05-02  0:26         ` Rafael J. Wysocki
  0 siblings, 1 reply; 105+ messages in thread
From: Toshi Kani @ 2013-05-01 20:07 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg Kroah-Hartman, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis

On Wed, 2013-05-01 at 16:49 +0200, Rafael J. Wysocki wrote:
> On Tuesday, April 30, 2013 05:42:06 PM Toshi Kani wrote:
> > On Mon, 2013-04-29 at 14:28 +0200, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > 
> > > Rework the CPU hotplug code in drivers/base/cpu.c to use the
> > > generic offline/online support introduced previously instead of
> > > its own CPU-specific code.
> > > 
> > > For this purpose, modify cpu_subsys to provide offline and online
> > > callbacks for CONFIG_HOTPLUG_CPU set and remove the code handling
> > > the CPU-specific 'online' sysfs attribute.
> > > 
> > > This modification is not supposed to change the user-observable
> > > behavior of the kernel (i.e. the 'online' attribute will be present
> > > in exactly the same place in sysfs and should trigger exactly the
> > > same actions as before).
> > > 
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > ---
> > >  drivers/base/cpu.c |   62 ++++++++++++-----------------------------------------
> > >  1 file changed, 15 insertions(+), 47 deletions(-)
> > > 
> > > Index: linux-pm/drivers/base/cpu.c
> > > ===================================================================
> > > --- linux-pm.orig/drivers/base/cpu.c
> > > +++ linux-pm/drivers/base/cpu.c
> > > @@ -16,66 +16,25 @@
> > >  
> > >  #include "base.h"
> > >  
> > > -struct bus_type cpu_subsys = {
> > > -	.name = "cpu",
> > > -	.dev_name = "cpu",
> > > -};
> > > -EXPORT_SYMBOL_GPL(cpu_subsys);
> > > -
> > >  static DEFINE_PER_CPU(struct device *, cpu_sys_devices);
> > >  
> > >  #ifdef CONFIG_HOTPLUG_CPU
> > > -static ssize_t show_online(struct device *dev,
> > > -			   struct device_attribute *attr,
> > > -			   char *buf)
> > > +static int cpu_subsys_online(struct device *dev)
> > >  {
> > > -	struct cpu *cpu = container_of(dev, struct cpu, dev);
> > > -
> > > -	return sprintf(buf, "%u\n", !!cpu_online(cpu->dev.id));
> > > +	return cpu_up(dev->id);
> > >  }
> > >  
> > > -static ssize_t __ref store_online(struct device *dev,
> > > -				  struct device_attribute *attr,
> > > -				  const char *buf, size_t count)
> > > +static int cpu_subsys_offline(struct device *dev)
> > >  {
> > > -	struct cpu *cpu = container_of(dev, struct cpu, dev);
> > > -	ssize_t ret;
> > > -
> > > -	cpu_hotplug_driver_lock();
> > 
> > By replacing cpu_hotplug_driver_lock() with lock_device_offline() in
> > patch 1/3, it no longer protects against other places that still use
> > cpu_hotplug_driver_lock(), such as save_mc_for_early().
> 
> Yes.
> 
> What about taking cpu_hotplug_driver_lock() around cpu_up() and
> cpu_down() in cpu_subsys_online() and cpu_subsys_offline()?

Sounds like a reasonable approach to me. 

> Alternatively, I can just replace cpu_hotplug_driver_lock() with
> lock_device_offline() everywhere.

That works too.  Not sure which way is better.  If we go with this option,
I'd suggest renaming lock_device_offline(), since the name could wrongly
imply that the lock is only used for offline, i.e. not for online.
lock_device_hotplug() might be less confusing, although we distinguish
online/offline from hotplug operations.

Thanks,
-Toshi


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 3/3 RFC] ACPI / hotplug: Use device offline/online for graceful hot-removal
  2013-05-01 15:05     ` Rafael J. Wysocki
@ 2013-05-01 20:20       ` Toshi Kani
  2013-05-02  0:53         ` Rafael J. Wysocki
  0 siblings, 1 reply; 105+ messages in thread
From: Toshi Kani @ 2013-05-01 20:20 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg Kroah-Hartman, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis

On Wed, 2013-05-01 at 17:05 +0200, Rafael J. Wysocki wrote:
> On Tuesday, April 30, 2013 05:49:38 PM Toshi Kani wrote:
> > On Mon, 2013-04-29 at 14:29 +0200, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > 
> > > Modify the generic ACPI hotplug code to be able to check if devices
> > > scheduled for hot-removal may be gracefully removed from the system
> > > using the device offline/online mechanism introduced previously.
> > > 
> > > Namely, make acpi_scan_hot_remove() which handles device hot-removal
> > > call device_offline() for all physical companions of the ACPI device
> > > nodes involved in the operation and check the results.  If any of
> > > the device_offline() calls fails, the function will not progress to
> > > the removal phase (which cannot be aborted), unless its (new) force
> > > argument is set (in case of a failing offline it will put the devices
> > > offlined by it back online).
> > > 
> > > In support of the 'forced' hot-removal, add a new sysfs attribute
> > > 'force_remove' that will reside in every ACPI hotplug profile
> > > present under /sys/firmware/acpi/hotplug/.
> > > 
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > ---
> > >  Documentation/ABI/testing/sysfs-firmware-acpi |    9 +-
> > >  drivers/acpi/internal.h                       |    2 
> > >  drivers/acpi/scan.c                           |   97 ++++++++++++++++++++++++--
> > >  drivers/acpi/sysfs.c                          |   27 +++++++
> > >  include/acpi/acpi_bus.h                       |    3 
> > >  5 files changed, 131 insertions(+), 7 deletions(-)
> > > 
> >  :
> > > Index: linux-pm/drivers/acpi/scan.c
> > > ===================================================================
> > > --- linux-pm.orig/drivers/acpi/scan.c
> > > +++ linux-pm/drivers/acpi/scan.c
> > > @@ -120,7 +120,61 @@ acpi_device_modalias_show(struct device
> > >  }
> > >  static DEVICE_ATTR(modalias, 0444, acpi_device_modalias_show, NULL);
> > >  
> > > -static int acpi_scan_hot_remove(struct acpi_device *device)
> > > +static acpi_status acpi_bus_offline_companions(acpi_handle handle, u32 lvl,
> > > +					       void *data, void **ret_p)
> > > +{
> > > +	struct acpi_device *device = NULL;
> > > +	struct acpi_device_physical_node *pn;
> > > +	bool force = *((bool *)data);
> > > +	acpi_status status = AE_OK;
> > > +
> > > +	if (acpi_bus_get_device(handle, &device))
> > > +		return AE_OK;
> > > +
> > > +	mutex_lock(&device->physical_node_lock);
> > > +
> > > +	list_for_each_entry(pn, &device->physical_node_list, node) {
> > 
> > I do not think physical_node_list is set for ACPI processor devices, so
> > this code is NOP at this point.  I think properly initializing
> > physical_node_list for CPU and memblk is one of the key items in this
> > approach.
> 
> It surely is. :-)
> 
> I've almost done that for CPUs, but that still requires some more work.
> Hopefully, it'll be mostly done later this week.

Cool!

> Memory will take some more time I guess, though.

Yes, memory has an ordering issue when using glue.c.
https://lkml.org/lkml/2013/3/26/398

Thanks,
-Toshi


* Re: [PATCH 2/3 RFC] Driver core: Use generic offline/online for CPU offline/online
  2013-05-01 20:07       ` Toshi Kani
@ 2013-05-02  0:26         ` Rafael J. Wysocki
  0 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-02  0:26 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Greg Kroah-Hartman, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis

On Wednesday, May 01, 2013 02:07:45 PM Toshi Kani wrote:
> On Wed, 2013-05-01 at 16:49 +0200, Rafael J. Wysocki wrote:
> > On Tuesday, April 30, 2013 05:42:06 PM Toshi Kani wrote:
> > > On Mon, 2013-04-29 at 14:28 +0200, Rafael J. Wysocki wrote:
> > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > 
> > > > Rework the CPU hotplug code in drivers/base/cpu.c to use the
> > > > generic offline/online support introduced previously instead of
> > > > its own CPU-specific code.
> > > > 
> > > > For this purpose, modify cpu_subsys to provide offline and online
> > > > callbacks for CONFIG_HOTPLUG_CPU set and remove the code handling
> > > > the CPU-specific 'online' sysfs attribute.
> > > > 
> > > > This modification is not supposed to change the user-observable
> > > > behavior of the kernel (i.e. the 'online' attribute will be present
> > > > in exactly the same place in sysfs and should trigger exactly the
> > > > same actions as before).
> > > > 
> > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > ---
> > > >  drivers/base/cpu.c |   62 ++++++++++++-----------------------------------------
> > > >  1 file changed, 15 insertions(+), 47 deletions(-)
> > > > 
> > > > Index: linux-pm/drivers/base/cpu.c
> > > > ===================================================================
> > > > --- linux-pm.orig/drivers/base/cpu.c
> > > > +++ linux-pm/drivers/base/cpu.c
> > > > @@ -16,66 +16,25 @@
> > > >  
> > > >  #include "base.h"
> > > >  
> > > > -struct bus_type cpu_subsys = {
> > > > -	.name = "cpu",
> > > > -	.dev_name = "cpu",
> > > > -};
> > > > -EXPORT_SYMBOL_GPL(cpu_subsys);
> > > > -
> > > >  static DEFINE_PER_CPU(struct device *, cpu_sys_devices);
> > > >  
> > > >  #ifdef CONFIG_HOTPLUG_CPU
> > > > -static ssize_t show_online(struct device *dev,
> > > > -			   struct device_attribute *attr,
> > > > -			   char *buf)
> > > > +static int cpu_subsys_online(struct device *dev)
> > > >  {
> > > > -	struct cpu *cpu = container_of(dev, struct cpu, dev);
> > > > -
> > > > -	return sprintf(buf, "%u\n", !!cpu_online(cpu->dev.id));
> > > > +	return cpu_up(dev->id);
> > > >  }
> > > >  
> > > > -static ssize_t __ref store_online(struct device *dev,
> > > > -				  struct device_attribute *attr,
> > > > -				  const char *buf, size_t count)
> > > > +static int cpu_subsys_offline(struct device *dev)
> > > >  {
> > > > -	struct cpu *cpu = container_of(dev, struct cpu, dev);
> > > > -	ssize_t ret;
> > > > -
> > > > -	cpu_hotplug_driver_lock();
> > > 
> > > By replacing cpu_hotplug_driver_lock() with lock_device_offline() in
> > > patch 1/3, it no longer protects from other places that still use
> > > > cpu_hotplug_driver_lock(), such as save_mc_for_early().
> > 
> > Yes.
> > 
> > What about taking cpu_hotplug_driver_lock() around cpu_up() and
> > cpu_down() in cpu_subsys_online() and cpu_subsys_offline()?
> 
> Sounds like a reasonable approach to me. 
> 
> > Alternatively, I can just replace cpu_hotplug_driver_lock() with
> > lock_device_offline() everywhere.
> 
> That works too.  Not sure which way is better.

It turns out that cpu_hotplug_driver_lock() is per-arch, so I'd prefer to
just take it in cpu_subsys_online() and cpu_subsys_offline(), at least for
the time being.
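
For illustration only, the first option could be modeled in user space roughly
as follows.  This is a sketch, not the patch: struct device is reduced to an
id, and cpu_hotplug_driver_lock(), cpu_up() and cpu_down() are simple stubs
standing in for the per-arch and core kernel implementations.  The counter
arch_lock_depth and the constant MODEL_NR_CPUS are assumptions made for the
model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count for the model */

/* Reduced stand-in for the kernel's struct device. */
struct device {
	unsigned int id;
};

static int arch_lock_depth;	/* models the state of the per-arch lock */
static bool cpu_is_online[MODEL_NR_CPUS] = { true, true, true, true };

/* Stubs: the real functions are provided per architecture. */
static void cpu_hotplug_driver_lock(void)   { arch_lock_depth++; }
static void cpu_hotplug_driver_unlock(void) { arch_lock_depth--; }

/* Stub for cpu_up(): insists on being called under the arch lock. */
static int cpu_up(unsigned int cpu)
{
	assert(arch_lock_depth == 1);
	if (cpu >= MODEL_NR_CPUS)
		return -EINVAL;
	cpu_is_online[cpu] = true;
	return 0;
}

/* Stub for cpu_down(): same locking requirement as cpu_up(). */
static int cpu_down(unsigned int cpu)
{
	assert(arch_lock_depth == 1);
	if (cpu >= MODEL_NR_CPUS)
		return -EINVAL;
	cpu_is_online[cpu] = false;
	return 0;
}

/* Bus-type .online() callback: take the arch lock around cpu_up(). */
int cpu_subsys_online(struct device *dev)
{
	int ret;

	cpu_hotplug_driver_lock();
	ret = cpu_up(dev->id);
	cpu_hotplug_driver_unlock();
	return ret;
}

/* Bus-type .offline() callback: take the arch lock around cpu_down(). */
int cpu_subsys_offline(struct device *dev)
{
	int ret;

	cpu_hotplug_driver_lock();
	ret = cpu_down(dev->id);
	cpu_hotplug_driver_unlock();
	return ret;
}
```

In this arrangement the generic device hotplug lock would be taken by the
caller (the sysfs 'online' handler), and the arch lock nests inside it, so
other users of cpu_hotplug_driver_lock() remain serialized against CPU
offline/online.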

> If we go with this option,
> I'd suggest renaming lock_device_offline(), since the name could mislead
> readers into thinking the lock is only used for offline, i.e. not for online.
> lock_device_hotplug() might be less confusing, although we distinguish
> online/offline and hotplug operations.

Well, I've decided to rename them to lock/unlock_device_hotplug() anyway,
because "hotplug" has been used to refer to CPU offline/online for years.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

* Re: [PATCH 3/3 RFC] ACPI / hotplug: Use device offline/online for graceful hot-removal
  2013-05-01 20:20       ` Toshi Kani
@ 2013-05-02  0:53         ` Rafael J. Wysocki
  0 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-02  0:53 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Greg Kroah-Hartman, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis

On Wednesday, May 01, 2013 02:20:12 PM Toshi Kani wrote:
> On Wed, 2013-05-01 at 17:05 +0200, Rafael J. Wysocki wrote:
> > On Tuesday, April 30, 2013 05:49:38 PM Toshi Kani wrote:
> > > On Mon, 2013-04-29 at 14:29 +0200, Rafael J. Wysocki wrote:
> > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > 
> > > > Modify the generic ACPI hotplug code to be able to check if devices
> > > > scheduled for hot-removal may be gracefully removed from the system
> > > > using the device offline/online mechanism introduced previously.
> > > > 
> > > > Namely, make acpi_scan_hot_remove() which handles device hot-removal
> > > > call device_offline() for all physical companions of the ACPI device
> > > > nodes involved in the operation and check the results.  If any of
> > > > the device_offline() calls fails, the function will not progress to
> > > > the removal phase (which cannot be aborted), unless its (new) force
> > > > argument is set (in case of a failing offline it will put the devices
> > > > offlined by it back online).
> > > > 
> > > > In support of the 'forced' hot-removal, add a new sysfs attribute
> > > > 'force_remove' that will reside in every ACPI hotplug profile
> > > > present under /sys/firmware/acpi/hotplug/.
> > > > 
> > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > ---
> > > >  Documentation/ABI/testing/sysfs-firmware-acpi |    9 +-
> > > >  drivers/acpi/internal.h                       |    2 
> > > >  drivers/acpi/scan.c                           |   97 ++++++++++++++++++++++++--
> > > >  drivers/acpi/sysfs.c                          |   27 +++++++
> > > >  include/acpi/acpi_bus.h                       |    3 
> > > >  5 files changed, 131 insertions(+), 7 deletions(-)
> > > > 
> > >  :
> > > > Index: linux-pm/drivers/acpi/scan.c
> > > > ===================================================================
> > > > --- linux-pm.orig/drivers/acpi/scan.c
> > > > +++ linux-pm/drivers/acpi/scan.c
> > > > @@ -120,7 +120,61 @@ acpi_device_modalias_show(struct device
> > > >  }
> > > >  static DEVICE_ATTR(modalias, 0444, acpi_device_modalias_show, NULL);
> > > >  
> > > > -static int acpi_scan_hot_remove(struct acpi_device *device)
> > > > +static acpi_status acpi_bus_offline_companions(acpi_handle handle, u32 lvl,
> > > > +					       void *data, void **ret_p)
> > > > +{
> > > > +	struct acpi_device *device = NULL;
> > > > +	struct acpi_device_physical_node *pn;
> > > > +	bool force = *((bool *)data);
> > > > +	acpi_status status = AE_OK;
> > > > +
> > > > +	if (acpi_bus_get_device(handle, &device))
> > > > +		return AE_OK;
> > > > +
> > > > +	mutex_lock(&device->physical_node_lock);
> > > > +
> > > > +	list_for_each_entry(pn, &device->physical_node_list, node) {
> > > 
> > > I do not think physical_node_list is set for ACPI processor devices, so
> > > this code is NOP at this point.  I think properly initializing
> > > physical_node_list for CPU and memblk is one of the key items in this
> > > approach.
> > 
> > It surely is. :-)
> > 
> > I've almost done that for CPUs, but that still requires some more work.
> > Hopefully, it'll be mostly done later this week.
> 
> Cool!
> 
> > Memory will take some more time I guess, though.
> 
> Yes, memory has an ordering issue when using glue.c.
> https://lkml.org/lkml/2013/3/26/398

Well, that may not be such a big problem.  I'll have a look at that later.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

* Re: [PATCH 1/3 RFC] Driver core: Add offline/online device operations
  2013-04-30 23:38   ` Toshi Kani
@ 2013-05-02  0:58     ` Rafael J. Wysocki
  2013-05-02 23:29       ` Toshi Kani
  0 siblings, 1 reply; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-02  0:58 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Greg Kroah-Hartman, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis

On Tuesday, April 30, 2013 05:38:38 PM Toshi Kani wrote:
> On Mon, 2013-04-29 at 14:26 +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > In some cases, graceful hot-removal of devices is not possible,
> > although in principle the devices in question support hotplug.
> > For example, that may happen for the last CPU in the system or
> > for memory modules holding kernel memory.
> > 
> > In those cases it is nice to be able to check if the given device
> > can be safely hot-removed before triggering a removal procedure
> > that cannot be aborted or reversed.  Unfortunately, however, the
> > kernel currently doesn't provide any support for that.
> > 
> > To address that deficiency, introduce support for offline and
> > online operations that can be performed on devices, respectively,
> > before a hot-removal and in case when it is necessary (or convenient)
> > to put a device back online after a successful offline (that has not
> > been followed by removal).  The idea is that the offline will fail
> > whenever the given device cannot be gracefully removed from the
> > system and it will not be allowed to use the device after a
> > successful offline (until a subsequent online) in analogy with the
> > existing CPU offline/online mechanism.
> > 
> > For now, the offline and online operations are introduced at the
> > bus type level, as that should be sufficient for the most urgent use
> > cases (CPUs and memory modules).  In the future, however, the
> > approach may be extended to cover some more complicated device
> > offline/online scenarios involving device drivers etc.
> 
> I like this approach much better than the user space approach we
> considered before. :)  My comments below.

Great! :-)

> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  Documentation/ABI/testing/sysfs-devices-online |   19 +++
> >  drivers/base/core.c                            |  134 +++++++++++++++++++++++++
> >  include/linux/device.h                         |   21 +++
> >  3 files changed, 174 insertions(+)
> > 
> > Index: linux-pm/include/linux/device.h
> > ===================================================================
> > --- linux-pm.orig/include/linux/device.h
> > +++ linux-pm/include/linux/device.h
> > @@ -70,6 +70,10 @@ extern void bus_remove_file(struct bus_t
> >   *		the specific driver's probe to initial the matched device.
> >   * @remove:	Called when a device removed from this bus.
> >   * @shutdown:	Called at shut-down time to quiesce the device.
> > + *
> > + * @online:	Called to put the device back online (after offlining it).
> > + * @offline:	Called to put the device offline for hot-removal. May fail.
> > + *
> >   * @suspend:	Called when a device on this bus wants to go to sleep mode.
> >   * @resume:	Called to bring a device on this bus out of sleep mode.
> >   * @pm:		Power management operations of this bus, callback the specific
> > @@ -103,6 +107,9 @@ struct bus_type {
> >  	int (*remove)(struct device *dev);
> >  	void (*shutdown)(struct device *dev);
> >  
> > +	int (*online)(struct device *dev);
> > +	int (*offline)(struct device *dev);
> > +
> >  	int (*suspend)(struct device *dev, pm_message_t state);
> >  	int (*resume)(struct device *dev);
> >  
> > @@ -646,6 +653,8 @@ struct acpi_dev_node {
> >   * @release:	Callback to free the device after all references have
> >   * 		gone away. This should be set by the allocator of the
> >   * 		device (i.e. the bus driver that discovered the device).
> > + * @offline_disabled: If set, the device is permanently online.
> > + * @offline:	Set after successful invocation of bus type's .offline().
> >   *
> >   * At the lowest level, every device in a Linux system is represented by an
> >   * instance of struct device. The device structure contains the information
> > @@ -718,6 +727,9 @@ struct device {
> >  
> >  	void	(*release)(struct device *dev);
> >  	struct iommu_group	*iommu_group;
> > +
> > +	bool			offline_disabled:1;
> > +	bool			offline:1;
> >  };
> >  
> >  static inline struct device *kobj_to_dev(struct kobject *kobj)
> > @@ -853,6 +865,15 @@ extern const char *device_get_devnode(st
> >  extern void *dev_get_drvdata(const struct device *dev);
> >  extern int dev_set_drvdata(struct device *dev, void *data);
> >  
> > +static inline bool device_supports_offline(struct device *dev)
> > +{
> > +	return dev->bus && dev->bus->offline && dev->bus->online;
> > +}
> > +
> > +extern void lock_device_offline(void);
> > +extern void unlock_device_offline(void);
> > +extern int device_offline(struct device *dev);
> > +extern int device_online(struct device *dev);
> >  /*
> >   * Root device objects for grouping under /sys/devices
> >   */
> > Index: linux-pm/drivers/base/core.c
> > ===================================================================
> > --- linux-pm.orig/drivers/base/core.c
> > +++ linux-pm/drivers/base/core.c
> > @@ -397,6 +397,40 @@ static ssize_t store_uevent(struct devic
> >  static struct device_attribute uevent_attr =
> >  	__ATTR(uevent, S_IRUGO | S_IWUSR, show_uevent, store_uevent);
> >  
> > +static ssize_t show_online(struct device *dev, struct device_attribute *attr,
> > +			   char *buf)
> > +{
> > +	bool ret;
> > +
> > +	lock_device_offline();
> > +	ret = !dev->offline;
> > +	unlock_device_offline();
> > +	return sprintf(buf, "%u\n", ret);
> > +}
> > +
> > +static ssize_t store_online(struct device *dev, struct device_attribute *attr,
> > +			    const char *buf, size_t count)
> > +{
> > +	int ret;
> > +
> > +	lock_device_offline();
> > +	switch (buf[0]) {
> > +	case '0':
> > +		ret = device_offline(dev);
> > +		break;
> > +	case '1':
> > +		ret = device_online(dev);
> > +		break;
> 
> memblk has multiple types of online operations specific to memory
> devices, such as "online_kernel" and "online_movable".  As memblk needs
> to be integrated into this framework for addressing the crash issue, we
> need to think about how they can be generalized into this operation.

Sure.

> > +	default:
> > +		ret = -EINVAL;
> > +	}
> > +	unlock_device_offline();
> > +	return ret < 0 ? ret : count;
> > +}
> > +
> > +static struct device_attribute online_attr =
> > +	__ATTR(online, S_IRUGO | S_IWUSR, show_online, store_online);
> > +
> >  static int device_add_attributes(struct device *dev,
> >  				 struct device_attribute *attrs)
> >  {
> > @@ -510,6 +544,12 @@ static int device_add_attrs(struct devic
> >  	if (error)
> >  		goto err_remove_type_groups;
> >  
> > +	if (device_supports_offline(dev) && !dev->offline_disabled) {
> > +		error = device_create_file(dev, &online_attr);
> > +		if (error)
> > +			goto err_remove_type_groups;
> > +	}
> > +
> >  	return 0;
> >  
> >   err_remove_type_groups:
> > @@ -530,6 +570,7 @@ static void device_remove_attrs(struct d
> >  	struct class *class = dev->class;
> >  	const struct device_type *type = dev->type;
> >  
> > +	device_remove_file(dev, &online_attr);
> >  	device_remove_groups(dev, dev->groups);
> >  
> >  	if (type)
> > @@ -1415,6 +1456,99 @@ EXPORT_SYMBOL_GPL(put_device);
> >  EXPORT_SYMBOL_GPL(device_create_file);
> >  EXPORT_SYMBOL_GPL(device_remove_file);
> >  
> > +static DEFINE_MUTEX(device_offline_lock);
> > +
> > +void lock_device_offline(void)
> > +{
> > +	mutex_lock(&device_offline_lock);
> > +}
> > +
> > +void unlock_device_offline(void)
> > +{
> > +	mutex_unlock(&device_offline_lock);
> > +}
> > +
> > +static int device_check_offline(struct device *dev, void *not_used)
> > +{
> > +	int ret;
> > +
> > +	ret = device_for_each_child(dev, NULL, device_check_offline);
> > +	if (ret)
> > +		return ret;
> > +
> > +	return device_supports_offline(dev) && !dev->offline ? -EBUSY : 0;
> > +}
> > +
> > +/**
> > + * device_offline - Prepare the device for hot-removal.
> > + * @dev: Device to be put offline.
> > + *
> > + * Execute the device bus type's .offline() callback, if present, to prepare
> > + * the device for a subsequent hot-removal.  If that succeeds, the device must
> > + * not be used until either it is removed or its bus type's .online() callback
> > + * is executed.
> > + *
> > + * Call under device_offline_lock.
> > + */
> > +int device_offline(struct device *dev)
> > +{
> > +	int ret;
> > +
> > +	if (dev->offline_disabled)
> > +		return -EPERM;
> > +
> > +	ret = device_for_each_child(dev, NULL, device_check_offline);
> > +	if (ret)
> > +		return ret;
> > +
> > +	device_lock(dev);
> > +	if (device_supports_offline(dev)) {
> > +		if (dev->offline) {
> > +			ret = 1;
> > +		} else {
> > +			ret = dev->bus->offline(dev);
> > +			if (!ret) {
> > +				kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
> > +				dev->offline = true;
> 
> Shouldn't this offline flag be set before sending KOBJ_OFFLINE?
> 
> > +			}
> > +		}
> > +	}
> > +	device_unlock(dev);
> > +
> > +	return ret;
> > +}
> > +
> > +/**
> > + * device_online - Put the device back online after successful device_offline().
> > + * @dev: Device to be put back online.
> > + *
> > + * If device_offline() has been successfully executed for @dev, but the device
> > + * has not been removed subsequently, execute its bus type's .online() callback
> > + * to indicate that the device can be used again.
> 
> There is another use-case for online().  When a device like CPU is
> hot-added, it is added in offline.  I am not sure why, but it has been
> this way.  So, we need to call online() to make a new device available
> for use after a hot-add.

Actually, in the CPU case that is left to user space as far as I can tell.
That is, the device initially appears offline and user space is supposed to
bring it online via sysfs.

> > + *
> > + * Call under device_offline_lock.
> > + */
> > +int device_online(struct device *dev)
> > +{
> > +	int ret = 0;
> > +
> > +	device_lock(dev);
> > +	if (device_supports_offline(dev)) {
> > +		if (dev->offline) {
> > +			ret = dev->bus->online(dev);
> > +			if (!ret) {
> > +				kobject_uevent(&dev->kobj, KOBJ_ONLINE);
> > +				dev->offline = false;
> 
> Same comment as KOBJ_OFFLINE.

I wonder why the ordering may be important?

> > +			}
> > +		} else {
> > +			ret = 1;
> 
> This case has a problem in the hot-add use-case I mentioned above.  When
> a new device is added, dev->offline is set to 0.  So, device_online()
> thinks it is online already.

Then whoever adds the device needs to set dev->offline.
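
A minimal user-space sketch of that point, under stated assumptions:
device_online() below follows the control flow of the patch but omits locking
and uevents, struct device is cut down to the fields involved, and
model_hot_add(), model_bus and the online_calls counter are hypothetical names
invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct device;

/* Only the two callbacks added by the patch are modeled. */
struct bus_type {
	int (*online)(struct device *dev);
	int (*offline)(struct device *dev);
};

/* Cut-down struct device; online_calls exists only for the model. */
struct device {
	struct bus_type *bus;
	bool offline;
	int online_calls;
};

static bool device_supports_offline(struct device *dev)
{
	return dev->bus && dev->bus->offline && dev->bus->online;
}

/* Same branches as device_online() in the patch, minus locks and uevents. */
int device_online(struct device *dev)
{
	int ret = 0;

	assert(dev != NULL);
	if (device_supports_offline(dev)) {
		if (dev->offline) {
			ret = dev->bus->online(dev);
			if (!ret)
				dev->offline = false;
		} else {
			ret = 1;	/* already online, nothing to do */
		}
	}
	return ret;
}

/* Hypothetical bus callbacks that always succeed. */
static int model_bus_online(struct device *dev)
{
	dev->online_calls++;
	return 0;
}

static int model_bus_offline(struct device *dev)
{
	(void)dev;
	return 0;
}

static struct bus_type model_bus = {
	.online = model_bus_online,
	.offline = model_bus_offline,
};

/*
 * Hypothetical hot-add path: the code adding the device marks it offline,
 * so that a later device_online() really invokes the bus callback.
 */
void model_hot_add(struct device *dev)
{
	dev->bus = &model_bus;
	dev->offline = true;
	dev->online_calls = 0;
}
```

If model_hot_add() left dev->offline at zero, the first device_online() call
would return 1 without ever reaching the bus type's .online() callback, which
is the problem described in the quoted comment.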

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

* [PATCH 0/4] Driver core / ACPI: Add offline/online for graceful hot-removal of devices
  2013-04-29 12:23 [PATCH 0/3 RFC] Driver core / ACPI: Add offline/online for graceful hot-removal of devices Rafael J. Wysocki
                   ` (2 preceding siblings ...)
  2013-04-29 12:29 ` [PATCH 3/3 RFC] ACPI / hotplug: Use device offline/online for graceful hot-removal Rafael J. Wysocki
@ 2013-05-02 12:26 ` Rafael J. Wysocki
  2013-05-02 12:27   ` [PATCH 1/4] Driver core: Add offline/online device operations Rafael J. Wysocki
                     ` (4 more replies)
  3 siblings, 5 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-02 12:26 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Toshi Kani
  Cc: ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis, Len Brown

Hi,

The introduction quoted below is still valid for patches [1-3/4], and patch
[4/4] reworks the ACPI processor driver to use the new code.  Some details
are given below.

On Monday, April 29, 2013 02:23:59 PM Rafael J. Wysocki wrote:
> 
> It has been argued for a number of times that in some cases, if a device cannot
> be gracefully removed from the system, it shouldn't be removed from it at all,
> because that may lead to a kernel crash.  In particular, that will happen if a
> memory module holding kernel memory is removed, but also removing the last CPU
> in the system may not be a good idea.  [And I can imagine a few other cases
> like that.]
> 
> The kernel currently only supports "forced" hot-remove which cannot be stopped
> once started, so users have no choice but to try to hot-remove stuff and see
> whether or not that crashes the kernel which is kind of unpleasant.  That seems
> to be based on the "the user knows better" argument according to which users
> triggering device hot-removal should really know what they are doing, so the
> kernel doesn't have to worry about that.  However, for instance, this pretty
> much isn't the case for memory modules, because the users have no way to see
> whether or not any kernel memory has been allocated from a given module.
> 
> There have been a few attempts to address this issue, but none of them has
> gained broader acceptance.  The following 3 patches are the heart of a new
> proposal which is based on the idea to introduce device_offline() and
> device_online() operations along the lines of the existing CPU offline/online
> mechanism (or, rather, to extend the CPU offline/online so that analogous
> operations are available for other devices).  The way it is supposed to work is
> that device_offline() will fail if the given device cannot be gracefully
> removed from the system (in the kernel's view).  Once it succeeds, though, the
> device won't be used any more until either it is removed, or device_online() is
> run for it.  That will allow the ACPI device hot-remove code, for one example,
> to avoid triggering a non-reversible removal procedure for devices that cannot
> be removed gracefully.
> 
> Patch [1/3] introduces device_offline() and device_online() as outlined above.
> The .offline() and .online() callbacks are only added at the bus type level for
> now, because that should be sufficient to cover the memory and CPU use cases.

That's [1/4] now and the changes from the previous version are:
- strtobool() is used in store_online().
- device_offline_lock has been renamed to device_hotplug_lock (and the
  functions operating on it have been renamed accordingly), following
  Toshi's advice.
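
As a rough user-space model of the first change, assuming the strtobool()
behavior of kernels from this period (only the first character of the buffer
is examined): model_strtobool(), model_store_online() and the
model_device_online flag are hypothetical names, and the real store_online()
calls device_online()/device_offline() under the hotplug lock rather than
flipping a flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Model of strtobool(): 'y', 'Y', '1' mean true; 'n', 'N', '0' mean false. */
int model_strtobool(const char *s, bool *res)
{
	assert(s != NULL && res != NULL);

	switch (s[0]) {
	case 'y':
	case 'Y':
	case '1':
		*res = true;
		return 0;
	case 'n':
	case 'N':
	case '0':
		*res = false;
		return 0;
	default:
		return -EINVAL;
	}
}

/* Stands in for the device state that the sysfs attribute controls. */
static bool model_device_online = true;

/*
 * Model of the sysfs store callback: parse first, then act.  Returns the
 * number of bytes consumed on success or a negative errno on bad input.
 */
long model_store_online(const char *buf, size_t count)
{
	bool val;
	int ret;

	ret = model_strtobool(buf, &val);
	if (ret < 0)
		return ret;	/* state is left untouched on bad input */

	model_device_online = val;
	return (long)count;
}
```

Compared with the switch on buf[0] in the RFC version, which accepted only
'0' and '1', this form also accepts 'y'/'n', and invalid input is rejected
before the hotplug lock would be taken.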

> Patch [2/3] modifies the CPU hotplug support code to use device_offline() and
> device_online() to support the sysfs 'online' attribute for CPUs.

That is [2/4] now and it takes cpu_hotplug_driver_lock() around cpu_up() and
cpu_down().

> Patch [3/3] changes the ACPI device hot-remove code to use device_offline()
> for checking if graceful removal of devices is possible.  The way it does that
> is to walk the list of "physical" companion devices for each struct acpi_device
> involved in the operation and call device_offline() for each of them.  If any
> of the device_offline() calls fails (and the hot-removal is not "forced", which
> is an option), the removal procedure (which is not reversible) is simply not
> carried out.

That's [3/4] now.  It's a bit simpler, because I decided that it would be
better to have a global 'force_remove' attribute (the semantics of the
per-profile 'force_remove' weren't clear and it didn't really add any value
over a global one).  I also added lock/unlock_device_hotplug() around
acpi_bus_scan() in acpi_scan_bus_device_check() to allow scan handlers to
update dev->offline for "physical" companion devices safely (the processor's
one added by the next patch actually does that).
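
The offline-then-check logic described above can be sketched in user space as
follows.  This is a simplified model under stated assumptions, not the code
from the patch: the companions are a flat array instead of per-node lists,
and struct model_dev, its can_offline and put_online fields, and
model_hot_remove_check() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a "physical" companion device. */
struct model_dev {
	bool offline;		/* current state */
	bool can_offline;	/* false models e.g. memory holding kernel data */
	bool put_online;	/* set if this pass took the device offline */
};

/* Returns 0 on success, 1 if already offline, -EBUSY if offline is refused. */
static int model_device_offline(struct model_dev *dev)
{
	if (dev->offline)
		return 1;
	if (!dev->can_offline)
		return -EBUSY;
	dev->offline = true;
	return 0;
}

static void model_device_online(struct model_dev *dev)
{
	dev->offline = false;
}

/*
 * Try to offline every companion.  Returns 0 if the irreversible removal
 * phase may start.  On failure without force_remove, devices that this call
 * took offline are put back online and the error is returned.
 */
int model_hot_remove_check(struct model_dev *devs, size_t n, bool force_remove)
{
	int err = 0;
	size_t i;

	assert(devs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		int ret = model_device_offline(&devs[i]);

		devs[i].put_online = (ret == 0);
		if (ret < 0 && !force_remove) {
			err = ret;
			break;
		}
	}
	if (!err)
		return 0;

	/* Roll back only what this call changed, in reverse order. */
	while (i-- > 0) {
		if (devs[i].put_online) {
			model_device_online(&devs[i]);
			devs[i].put_online = false;
		}
	}
	return err;
}
```

Devices that were already offline before the call are left alone by the
rollback, and with force_remove set a failing offline no longer blocks the
removal phase.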

> Of some concern is that device_offline() (and possibly device_online()) is
> called under physical_node_lock of the corresponding struct acpi_device, which
> introduces ordering dependency between that lock and device locks for the
> "physical" devices, but I didn't see any cleaner way to do that (I guess it
> is avoidable at the expense of added complexity, but for now it's just better
> to make the code as clean as possible IMO).

Patch [4/4] reworks the ACPI processor driver to use the common hotplug code.
It basically splits the driver into two parts as described in the changelog,
where the first part is essentially a scan handler and the second part is
a driver, but it doesn't bind to struct acpi_device any more.  Instead, it
binds to processor devices under /sys/devices/system/cpu/ (the driver itself
has a sysfs directory under /sys/bus/cpu/drivers/ which IMHO makes more sense
than having it under /sys/bus/acpi/drivers/).

The patch at https://patchwork.kernel.org/patch/2506371/ is a prerequisite
for this series, but I'm going to push it for v3.10-rc2 if no one screams
bloody murder.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

* [PATCH 1/4] Driver core: Add offline/online device operations
  2013-05-02 12:26 ` [PATCH 0/4] Driver core / ACPI: Add offline/online for graceful hot-removal of devices Rafael J. Wysocki
@ 2013-05-02 12:27   ` Rafael J. Wysocki
  2013-05-02 13:57     ` Greg Kroah-Hartman
  2013-05-02 23:11     ` Toshi Kani
  2013-05-02 12:28   ` [PATCH 2/4] Driver core: Use generic offline/online for CPU offline/online Rafael J. Wysocki
                     ` (3 subsequent siblings)
  4 siblings, 2 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-02 12:27 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis, Len Brown

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

In some cases, graceful hot-removal of devices is not possible,
although in principle the devices in question support hotplug.
For example, that may happen for the last CPU in the system or
for memory modules holding kernel memory.

In those cases it is nice to be able to check if the given device
can be gracefully hot-removed before triggering a removal procedure
that cannot be aborted or reversed.  Unfortunately, however, the
kernel currently doesn't provide any support for that.

To address that deficiency, introduce support for offline and
online operations that can be performed on devices, respectively,
before a hot-removal and in case when it is necessary (or convenient)
to put a device back online after a successful offline (that has not
been followed by removal).  The idea is that the offline will fail
whenever the given device cannot be gracefully removed from the
system and it will not be allowed to use the device after a
successful offline (until a subsequent online) in analogy with the
existing CPU offline/online mechanism.

For now, the offline and online operations are introduced at the
bus type level, as that should be sufficient for the most urgent use
cases (CPUs and memory modules).  In the future, however, the
approach may be extended to cover some more complicated device
offline/online scenarios involving device drivers etc.

The lock_device_hotplug() and unlock_device_hotplug() functions are
introduced because subsequent patches need to put larger pieces of
code under device_hotplug_lock to prevent race conditions between
device offline and removal from happening.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 Documentation/ABI/testing/sysfs-devices-online |   20 +++
 drivers/base/core.c                            |  130 +++++++++++++++++++++++++
 include/linux/device.h                         |   21 ++++
 3 files changed, 171 insertions(+)

Index: linux-pm/include/linux/device.h
===================================================================
--- linux-pm.orig/include/linux/device.h
+++ linux-pm/include/linux/device.h
@@ -70,6 +70,10 @@ extern void bus_remove_file(struct bus_t
  *		the specific driver's probe to initial the matched device.
  * @remove:	Called when a device removed from this bus.
  * @shutdown:	Called at shut-down time to quiesce the device.
+ *
+ * @online:	Called to put the device back online (after offlining it).
+ * @offline:	Called to put the device offline for hot-removal. May fail.
+ *
  * @suspend:	Called when a device on this bus wants to go to sleep mode.
  * @resume:	Called to bring a device on this bus out of sleep mode.
  * @pm:		Power management operations of this bus, callback the specific
@@ -103,6 +107,9 @@ struct bus_type {
 	int (*remove)(struct device *dev);
 	void (*shutdown)(struct device *dev);
 
+	int (*online)(struct device *dev);
+	int (*offline)(struct device *dev);
+
 	int (*suspend)(struct device *dev, pm_message_t state);
 	int (*resume)(struct device *dev);
 
@@ -646,6 +653,8 @@ struct acpi_dev_node {
  * @release:	Callback to free the device after all references have
  * 		gone away. This should be set by the allocator of the
  * 		device (i.e. the bus driver that discovered the device).
+ * @offline_disabled: If set, the device is permanently online.
+ * @offline:	Set after successful invocation of bus type's .offline().
  *
  * At the lowest level, every device in a Linux system is represented by an
  * instance of struct device. The device structure contains the information
@@ -718,6 +727,9 @@ struct device {
 
 	void	(*release)(struct device *dev);
 	struct iommu_group	*iommu_group;
+
+	bool			offline_disabled:1;
+	bool			offline:1;
 };
 
 static inline struct device *kobj_to_dev(struct kobject *kobj)
@@ -853,6 +865,15 @@ extern const char *device_get_devnode(st
 extern void *dev_get_drvdata(const struct device *dev);
 extern int dev_set_drvdata(struct device *dev, void *data);
 
+static inline bool device_supports_offline(struct device *dev)
+{
+	return dev->bus && dev->bus->offline && dev->bus->online;
+}
+
+extern void lock_device_hotplug(void);
+extern void unlock_device_hotplug(void);
+extern int device_offline(struct device *dev);
+extern int device_online(struct device *dev);
 /*
  * Root device objects for grouping under /sys/devices
  */
Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -397,6 +397,36 @@ static ssize_t store_uevent(struct devic
 static struct device_attribute uevent_attr =
 	__ATTR(uevent, S_IRUGO | S_IWUSR, show_uevent, store_uevent);
 
+static ssize_t show_online(struct device *dev, struct device_attribute *attr,
+			   char *buf)
+{
+	bool val;
+
+	lock_device_hotplug();
+	val = !dev->offline;
+	unlock_device_hotplug();
+	return sprintf(buf, "%u\n", val);
+}
+
+static ssize_t store_online(struct device *dev, struct device_attribute *attr,
+			    const char *buf, size_t count)
+{
+	bool val;
+	int ret;
+
+	ret = strtobool(buf, &val);
+	if (ret < 0)
+		return ret;
+
+	lock_device_hotplug();
+	ret = val ? device_online(dev) : device_offline(dev);
+	unlock_device_hotplug();
+	return ret < 0 ? ret : count;
+}
+
+static struct device_attribute online_attr =
+	__ATTR(online, S_IRUGO | S_IWUSR, show_online, store_online);
+
 static int device_add_attributes(struct device *dev,
 				 struct device_attribute *attrs)
 {
@@ -510,6 +540,12 @@ static int device_add_attrs(struct devic
 	if (error)
 		goto err_remove_type_groups;
 
+	if (device_supports_offline(dev) && !dev->offline_disabled) {
+		error = device_create_file(dev, &online_attr);
+		if (error)
+			goto err_remove_type_groups;
+	}
+
 	return 0;
 
  err_remove_type_groups:
@@ -530,6 +566,7 @@ static void device_remove_attrs(struct d
 	struct class *class = dev->class;
 	const struct device_type *type = dev->type;
 
+	device_remove_file(dev, &online_attr);
 	device_remove_groups(dev, dev->groups);
 
 	if (type)
@@ -1415,6 +1452,99 @@ EXPORT_SYMBOL_GPL(put_device);
 EXPORT_SYMBOL_GPL(device_create_file);
 EXPORT_SYMBOL_GPL(device_remove_file);
 
+static DEFINE_MUTEX(device_hotplug_lock);
+
+void lock_device_hotplug(void)
+{
+	mutex_lock(&device_hotplug_lock);
+}
+
+void unlock_device_hotplug(void)
+{
+	mutex_unlock(&device_hotplug_lock);
+}
+
+static int device_check_offline(struct device *dev, void *not_used)
+{
+	int ret;
+
+	ret = device_for_each_child(dev, NULL, device_check_offline);
+	if (ret)
+		return ret;
+
+	return device_supports_offline(dev) && !dev->offline ? -EBUSY : 0;
+}
+
+/**
+ * device_offline - Prepare the device for hot-removal.
+ * @dev: Device to be put offline.
+ *
+ * Execute the device bus type's .offline() callback, if present, to prepare
+ * the device for a subsequent hot-removal.  If that succeeds, the device must
+ * not be used until either it is removed or its bus type's .online() callback
+ * is executed.
+ *
+ * Call under device_hotplug_lock.
+ */
+int device_offline(struct device *dev)
+{
+	int ret;
+
+	if (dev->offline_disabled)
+		return -EPERM;
+
+	ret = device_for_each_child(dev, NULL, device_check_offline);
+	if (ret)
+		return ret;
+
+	device_lock(dev);
+	if (device_supports_offline(dev)) {
+		if (dev->offline) {
+			ret = 1;
+		} else {
+			ret = dev->bus->offline(dev);
+			if (!ret) {
+				kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
+				dev->offline = true;
+			}
+		}
+	}
+	device_unlock(dev);
+
+	return ret;
+}
+
+/**
+ * device_online - Put the device back online after successful device_offline().
+ * @dev: Device to be put back online.
+ *
+ * If device_offline() has been successfully executed for @dev, but the device
+ * has not been removed subsequently, execute its bus type's .online() callback
+ * to indicate that the device can be used again.
+ *
+ * Call under device_hotplug_lock.
+ */
+int device_online(struct device *dev)
+{
+	int ret = 0;
+
+	device_lock(dev);
+	if (device_supports_offline(dev)) {
+		if (dev->offline) {
+			ret = dev->bus->online(dev);
+			if (!ret) {
+				kobject_uevent(&dev->kobj, KOBJ_ONLINE);
+				dev->offline = false;
+			}
+		} else {
+			ret = 1;
+		}
+	}
+	device_unlock(dev);
+
+	return ret;
+}
+
 struct root_device {
 	struct device dev;
 	struct module *owner;
Index: linux-pm/Documentation/ABI/testing/sysfs-devices-online
===================================================================
--- /dev/null
+++ linux-pm/Documentation/ABI/testing/sysfs-devices-online
@@ -0,0 +1,20 @@
+What:		/sys/devices/.../online
+Date:		April 2013
+Contact:	Rafael J. Wysocki <rafael.j.wysocki@intel.com>
+Description:
+		The /sys/devices/.../online attribute is only present for
+		devices whose bus types provide .online() and .offline()
+		callbacks.  The number read from it (0 or 1) reflects the value
+		of the device's 'offline' field.  If that number is 1 and '0'
+		(or 'n', or 'N') is written to this file, the device bus type's
+		.offline() callback is executed for the device and (if
+		successful) its 'offline' field is updated accordingly.  In
+		turn, if that number is 0 and '1' (or 'y', or 'Y') is written to
+		this file, the device bus type's .online() callback is executed
+		for the device and (if successful) its 'offline' field is
+		updated as appropriate.
+
+		After a successful execution of the bus type's .offline()
+		callback the device cannot be used for any purpose until either
+		it is removed (i.e. device_del() is called for it), or its bus
+		type's .online() is executed successfully.

^ permalink raw reply	[flat|nested] 105+ messages in thread
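
The return-value convention of device_offline()/device_online() and the
way store_online() maps it onto a sysfs write result can be sketched in
userspace as below.  This is only a rough model under stated
assumptions, not the kernel code: every model_* name is hypothetical,
locking and the check of child devices are left out, and the modelled
.online() callback is assumed to always succeed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct device. */
struct model_device {
	bool offline_disabled;	/* device is permanently online */
	bool offline;		/* set after a successful .offline() */
	bool has_callbacks;	/* bus type has both .online and .offline */
	int offline_result;	/* what the modelled .offline() returns */
};

/* Accepts the same first characters as the kernel's strtobool(). */
static int model_strtobool(const char *s, bool *res)
{
	switch (s[0]) {
	case 'y': case 'Y': case '1':
		*res = true;
		return 0;
	case 'n': case 'N': case '0':
		*res = false;
		return 0;
	default:
		return -EINVAL;
	}
}

/* 0: taken offline now, 1: was offline already, < 0: refused. */
static int model_device_offline(struct model_device *dev)
{
	int ret;

	assert(dev);
	if (dev->offline_disabled)
		return -EPERM;
	if (!dev->has_callbacks)
		return 0;
	if (dev->offline)
		return 1;
	ret = dev->offline_result;
	if (!ret)
		dev->offline = true;
	return ret;
}

/* 0: brought online now, 1: was online already. */
static int model_device_online(struct model_device *dev)
{
	assert(dev);
	if (!dev->has_callbacks)
		return 0;
	if (!dev->offline)
		return 1;
	dev->offline = false;	/* assumption: .online() cannot fail here */
	return 0;
}

/* Like store_online(): only negative results reach the writer. */
static long model_store_online(struct model_device *dev, const char *buf,
			       size_t count)
{
	bool val;
	int ret = model_strtobool(buf, &val);

	if (ret < 0)
		return ret;
	ret = val ? model_device_online(dev) : model_device_offline(dev);
	return ret < 0 ? ret : (long)count;
}
```

In this model, writing '0' to a device that is already offline still
reports success to the writer, because the positive "nothing to do"
result is folded into the byte count.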

* [PATCH 2/4] Driver core: Use generic offline/online for CPU offline/online
  2013-05-02 12:26 ` [PATCH 0/4] Driver core / ACPI: Add offline/online for graceful hot-removal of devices Rafael J. Wysocki
  2013-05-02 12:27   ` [PATCH 1/4] Driver core: Add offline/online device operations Rafael J. Wysocki
@ 2013-05-02 12:28   ` Rafael J. Wysocki
  2013-05-02 13:57     ` Greg Kroah-Hartman
  2013-05-02 12:29   ` [PATCH 3/4] ACPI / hotplug: Use device offline/online for graceful hot-removal Rafael J. Wysocki
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-02 12:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis, Len Brown

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Rework the CPU hotplug code in drivers/base/cpu.c to use the
generic offline/online support introduced previously instead of
its own CPU-specific code.

For this purpose, modify cpu_subsys to provide offline and online
callbacks when CONFIG_HOTPLUG_CPU is set and remove the code handling
the CPU-specific 'online' sysfs attribute.

This modification is not supposed to change the user-observable
behavior of the kernel (i.e. the 'online' attribute will be present
in exactly the same place in sysfs and should trigger exactly the
same actions as before).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/cpu.c |   66 ++++++++++++++++-------------------------------------
 1 file changed, 20 insertions(+), 46 deletions(-)

Index: linux-pm/drivers/base/cpu.c
===================================================================
--- linux-pm.orig/drivers/base/cpu.c
+++ linux-pm/drivers/base/cpu.c
@@ -16,66 +16,35 @@
 
 #include "base.h"
 
-struct bus_type cpu_subsys = {
-	.name = "cpu",
-	.dev_name = "cpu",
-};
-EXPORT_SYMBOL_GPL(cpu_subsys);
-
 static DEFINE_PER_CPU(struct device *, cpu_sys_devices);
 
 #ifdef CONFIG_HOTPLUG_CPU
-static ssize_t show_online(struct device *dev,
-			   struct device_attribute *attr,
-			   char *buf)
+static int cpu_subsys_online(struct device *dev)
 {
-	struct cpu *cpu = container_of(dev, struct cpu, dev);
+	int ret;
 
-	return sprintf(buf, "%u\n", !!cpu_online(cpu->dev.id));
+	cpu_hotplug_driver_lock();
+	ret = cpu_up(dev->id);
+	cpu_hotplug_driver_unlock();
+	return ret;
 }
 
-static ssize_t __ref store_online(struct device *dev,
-				  struct device_attribute *attr,
-				  const char *buf, size_t count)
+static int cpu_subsys_offline(struct device *dev)
 {
-	struct cpu *cpu = container_of(dev, struct cpu, dev);
-	ssize_t ret;
+	int ret;
 
 	cpu_hotplug_driver_lock();
-	switch (buf[0]) {
-	case '0':
-		ret = cpu_down(cpu->dev.id);
-		if (!ret)
-			kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
-		break;
-	case '1':
-		ret = cpu_up(cpu->dev.id);
-		if (!ret)
-			kobject_uevent(&dev->kobj, KOBJ_ONLINE);
-		break;
-	default:
-		ret = -EINVAL;
-	}
+	ret = cpu_down(dev->id);
 	cpu_hotplug_driver_unlock();
-
-	if (ret >= 0)
-		ret = count;
 	return ret;
 }
-static DEVICE_ATTR(online, 0644, show_online, store_online);
 
-static void __cpuinit register_cpu_control(struct cpu *cpu)
-{
-	device_create_file(&cpu->dev, &dev_attr_online);
-}
 void unregister_cpu(struct cpu *cpu)
 {
 	int logical_cpu = cpu->dev.id;
 
 	unregister_cpu_under_node(logical_cpu, cpu_to_node(logical_cpu));
 
-	device_remove_file(&cpu->dev, &dev_attr_online);
-
 	device_unregister(&cpu->dev);
 	per_cpu(cpu_sys_devices, logical_cpu) = NULL;
 	return;
@@ -102,12 +71,18 @@ static DEVICE_ATTR(probe, S_IWUSR, NULL,
 static DEVICE_ATTR(release, S_IWUSR, NULL, cpu_release_store);
 #endif /* CONFIG_ARCH_CPU_PROBE_RELEASE */
 
-#else /* ... !CONFIG_HOTPLUG_CPU */
-static inline void register_cpu_control(struct cpu *cpu)
-{
-}
 #endif /* CONFIG_HOTPLUG_CPU */
 
+struct bus_type cpu_subsys = {
+	.name = "cpu",
+	.dev_name = "cpu",
+#ifdef CONFIG_HOTPLUG_CPU
+	.online = cpu_subsys_online,
+	.offline = cpu_subsys_offline,
+#endif
+};
+EXPORT_SYMBOL_GPL(cpu_subsys);
+
 #ifdef CONFIG_KEXEC
 #include <linux/kexec.h>
 
@@ -245,12 +220,11 @@ int __cpuinit register_cpu(struct cpu *c
 	cpu->dev.id = num;
 	cpu->dev.bus = &cpu_subsys;
 	cpu->dev.release = cpu_device_release;
+	cpu->dev.offline_disabled = !cpu->hotpluggable;
 #ifdef CONFIG_ARCH_HAS_CPU_AUTOPROBE
 	cpu->dev.bus->uevent = arch_cpu_uevent;
 #endif
 	error = device_register(&cpu->dev);
-	if (!error && cpu->hotpluggable)
-		register_cpu_control(cpu);
 	if (!error)
 		per_cpu(cpu_sys_devices, num) = &cpu->dev;
 	if (!error)


^ permalink raw reply	[flat|nested] 105+ messages in thread
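
Why the 'online' attribute should show up for the same CPUs as before
can be illustrated with a small userspace model of the condition used
by device_add_attrs() in patch 1/4 combined with the offline_disabled
assignment added to register_cpu() above.  All model_* names are
hypothetical and the two bus type objects merely stand for cpu_subsys
built with and without CONFIG_HOTPLUG_CPU; treat it as a sketch rather
than the actual implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_device;

/* Hypothetical stand-in for the two new struct bus_type callbacks. */
struct model_bus_type {
	int (*online)(struct model_device *dev);
	int (*offline)(struct model_device *dev);
};

struct model_device {
	const struct model_bus_type *bus;
	bool offline_disabled;
};

/* Same test as device_supports_offline(). */
static bool model_supports_offline(const struct model_device *dev)
{
	return dev->bus && dev->bus->offline && dev->bus->online;
}

/* Condition under which the 'online' file is created. */
static bool model_has_online_attr(const struct model_device *dev)
{
	assert(dev);
	return model_supports_offline(dev) && !dev->offline_disabled;
}

static int model_cpu_online(struct model_device *dev)
{
	(void)dev;
	return 0;
}

static int model_cpu_offline(struct model_device *dev)
{
	(void)dev;
	return 0;
}

/* Stands for cpu_subsys with CONFIG_HOTPLUG_CPU set. */
static const struct model_bus_type model_cpu_subsys_hotplug = {
	.online = model_cpu_online,
	.offline = model_cpu_offline,
};

/* Stands for cpu_subsys with CONFIG_HOTPLUG_CPU unset. */
static const struct model_bus_type model_cpu_subsys_static = {
	.online = NULL,
	.offline = NULL,
};

/* Mirrors "cpu->dev.offline_disabled = !cpu->hotpluggable". */
static struct model_device
model_register_cpu(const struct model_bus_type *bus, bool hotpluggable)
{
	struct model_device dev = {
		.bus = bus,
		.offline_disabled = !hotpluggable,
	};

	return dev;
}
```

Under these assumptions only a hotpluggable CPU on a hotplug-capable
bus type gets the attribute, which corresponds to the old
"!error && cpu->hotpluggable" check around register_cpu_control().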

* [PATCH 3/4] ACPI / hotplug: Use device offline/online for graceful hot-removal
  2013-05-02 12:26 ` [PATCH 0/4] Driver core / ACPI: Add offline/online for graceful hot-removal of devices Rafael J. Wysocki
  2013-05-02 12:27   ` [PATCH 1/4] Driver core: Add offline/online device operations Rafael J. Wysocki
  2013-05-02 12:28   ` [PATCH 2/4] Driver core: Use generic offline/online for CPU offline/online Rafael J. Wysocki
@ 2013-05-02 12:29   ` Rafael J. Wysocki
  2013-05-02 12:31   ` [PATCH 4/4] ACPI / processor: Use common hotplug infrastructure Rafael J. Wysocki
  2013-05-04  1:01     ` Rafael J. Wysocki
  4 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-02 12:29 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis, Len Brown

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Modify the generic ACPI hotplug code to be able to check if devices
scheduled for hot-removal may be gracefully removed from the system
using the device offline/online mechanism introduced previously.

Namely, make acpi_scan_hot_remove(), which handles device hot-removal,
call device_offline() for all physical companions of the ACPI device
nodes involved in the operation and check the results.  If any of the
device_offline() calls fails, the function will put the devices it has
offlined back online and will not progress to the removal phase (which
cannot be aborted), unless the (new) global acpi_force_hot_remove flag
is set.

In support of 'forced' device hot-removal, add a new sysfs attribute
'force_remove' that will reside under /sys/firmware/acpi/hotplug/.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 Documentation/ABI/testing/sysfs-firmware-acpi |   10 +++
 drivers/acpi/internal.h                       |    2 
 drivers/acpi/scan.c                           |   84 ++++++++++++++++++++++++++
 drivers/acpi/sysfs.c                          |   31 +++++++++
 include/acpi/acpi_bus.h                       |    1 
 5 files changed, 128 insertions(+)

Index: linux-pm/include/acpi/acpi_bus.h
===================================================================
--- linux-pm.orig/include/acpi/acpi_bus.h
+++ linux-pm/include/acpi/acpi_bus.h
@@ -286,6 +286,7 @@ struct acpi_device_physical_node {
 	u8 node_id;
 	struct list_head node;
 	struct device *dev;
+	bool put_online:1;
 };
 
 /* set maximum of physical nodes to 32 for expansibility */
Index: linux-pm/drivers/acpi/scan.c
===================================================================
--- linux-pm.orig/drivers/acpi/scan.c
+++ linux-pm/drivers/acpi/scan.c
@@ -27,6 +27,12 @@ extern struct acpi_device *acpi_root;
 
 #define ACPI_IS_ROOT_DEVICE(device)    (!(device)->parent)
 
+/*
+ * If set, devices will be hot-removed even if they cannot be put offline
+ * gracefully (from the kernel's standpoint).
+ */
+bool acpi_force_hot_remove;
+
 static const char *dummy_hid = "device";
 
 static LIST_HEAD(acpi_device_list);
@@ -120,6 +126,59 @@ acpi_device_modalias_show(struct device
 }
 static DEVICE_ATTR(modalias, 0444, acpi_device_modalias_show, NULL);
 
+static acpi_status acpi_bus_offline_companions(acpi_handle handle, u32 lvl,
+					       void *data, void **ret_p)
+{
+	struct acpi_device *device = NULL;
+	struct acpi_device_physical_node *pn;
+	acpi_status status = AE_OK;
+
+	if (acpi_bus_get_device(handle, &device))
+		return AE_OK;
+
+	mutex_lock(&device->physical_node_lock);
+
+	list_for_each_entry(pn, &device->physical_node_list, node) {
+		int ret;
+
+		ret = device_offline(pn->dev);
+		if (acpi_force_hot_remove)
+			continue;
+
+		if (ret < 0) {
+			status = AE_ERROR;
+			break;
+		}
+		pn->put_online = !ret;
+	}
+
+	mutex_unlock(&device->physical_node_lock);
+
+	return status;
+}
+
+static acpi_status acpi_bus_online_companions(acpi_handle handle, u32 lvl,
+					      void *data, void **ret_p)
+{
+	struct acpi_device *device = NULL;
+	struct acpi_device_physical_node *pn;
+
+	if (acpi_bus_get_device(handle, &device))
+		return AE_OK;
+
+	mutex_lock(&device->physical_node_lock);
+
+	list_for_each_entry(pn, &device->physical_node_list, node)
+		if (pn->put_online) {
+			device_online(pn->dev);
+			pn->put_online = false;
+		}
+
+	mutex_unlock(&device->physical_node_lock);
+
+	return AE_OK;
+}
+
 static int acpi_scan_hot_remove(struct acpi_device *device)
 {
 	acpi_handle handle = device->handle;
@@ -136,10 +195,33 @@ static int acpi_scan_hot_remove(struct a
 		return -EINVAL;
 	}
 
+	lock_device_hotplug();
+
+	status = acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
+				     NULL, acpi_bus_offline_companions, NULL,
+				     NULL);
+	if (ACPI_SUCCESS(status) || acpi_force_hot_remove)
+		status = acpi_bus_offline_companions(handle, 0, NULL, NULL);
+
+	if (ACPI_FAILURE(status) && !acpi_force_hot_remove) {
+		acpi_bus_online_companions(handle, 0, NULL, NULL);
+		acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX,
+				    acpi_bus_online_companions, NULL, NULL,
+				    NULL);
+
+		unlock_device_hotplug();
+
+		put_device(&device->dev);
+		return -EBUSY;
+	}
+
 	ACPI_DEBUG_PRINT((ACPI_DB_INFO,
 		"Hot-removing device %s...\n", dev_name(&device->dev)));
 
 	acpi_bus_trim(device);
+
+	unlock_device_hotplug();
+
 	/* Device node has been unregistered. */
 	put_device(&device->dev);
 	device = NULL;
@@ -236,6 +318,7 @@ static void acpi_scan_bus_device_check(a
 	int error;
 
 	mutex_lock(&acpi_scan_lock);
+	lock_device_hotplug();
 
 	acpi_bus_get_device(handle, &device);
 	if (device) {
@@ -259,6 +342,7 @@ static void acpi_scan_bus_device_check(a
 		kobject_uevent(&device->dev.kobj, KOBJ_ONLINE);
 
  out:
+	unlock_device_hotplug();
 	acpi_evaluate_hotplug_ost(handle, ost_source, ost_code, NULL);
 	mutex_unlock(&acpi_scan_lock);
 }
Index: linux-pm/drivers/acpi/sysfs.c
===================================================================
--- linux-pm.orig/drivers/acpi/sysfs.c
+++ linux-pm/drivers/acpi/sysfs.c
@@ -780,6 +780,33 @@ void acpi_sysfs_add_hotplug_profile(stru
 	pr_err(PREFIX "Unable to add hotplug profile '%s'\n", name);
 }
 
+static ssize_t force_remove_show(struct kobject *kobj,
+				 struct kobj_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%d\n", !!acpi_force_hot_remove);
+}
+
+static ssize_t force_remove_store(struct kobject *kobj,
+				  struct kobj_attribute *attr,
+				  const char *buf, size_t size)
+{
+	bool val;
+	int ret;
+
+	ret = strtobool(buf, &val);
+	if (ret < 0)
+		return ret;
+
+	lock_device_hotplug();
+	acpi_force_hot_remove = val;
+	unlock_device_hotplug();
+	return size;
+}
+
+static const struct kobj_attribute force_remove_attr =
+	__ATTR(force_remove, S_IRUGO | S_IWUSR, force_remove_show,
+	       force_remove_store);
+
 int __init acpi_sysfs_init(void)
 {
 	int result;
@@ -789,6 +816,10 @@ int __init acpi_sysfs_init(void)
 		return result;
 
 	hotplug_kobj = kobject_create_and_add("hotplug", acpi_kobj);
+	result = sysfs_create_file(hotplug_kobj, &force_remove_attr.attr);
+	if (result)
+		return result;
+
 	result = sysfs_create_file(acpi_kobj, &pm_profile_attr.attr);
 	return result;
 }
Index: linux-pm/drivers/acpi/internal.h
===================================================================
--- linux-pm.orig/drivers/acpi/internal.h
+++ linux-pm/drivers/acpi/internal.h
@@ -47,6 +47,8 @@ void acpi_memory_hotplug_init(void);
 static inline void acpi_memory_hotplug_init(void) {}
 #endif
 
+extern bool acpi_force_hot_remove;
+
 void acpi_sysfs_add_hotplug_profile(struct acpi_hotplug_profile *hotplug,
 				    const char *name);
 int acpi_scan_add_handler_with_hotplug(struct acpi_scan_handler *handler,
Index: linux-pm/Documentation/ABI/testing/sysfs-firmware-acpi
===================================================================
--- linux-pm.orig/Documentation/ABI/testing/sysfs-firmware-acpi
+++ linux-pm/Documentation/ABI/testing/sysfs-firmware-acpi
@@ -44,6 +44,16 @@ Description:
 		or 0 (unset).  Attempts to write any other values to it will
 		cause -EINVAL to be returned.
 
+What:		/sys/firmware/acpi/hotplug/force_remove
+Date:		May 2013
+Contact:	Rafael J. Wysocki <rafael.j.wysocki@intel.com>
+Description:
+		The number in this file (0 or 1) determines whether (1) or not
+		(0) the ACPI subsystem will allow devices to be hot-removed even
+		if they cannot be put offline gracefully (from the kernel's
+		viewpoint).  That number can be changed by writing a boolean
+		value to this file.
+
 What:		/sys/firmware/acpi/interrupts/
 Date:		February 2008
 Contact:	Len Brown <lenb@kernel.org>

^ permalink raw reply	[flat|nested] 105+ messages in thread
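
The offline-then-roll-back flow that patch 3/4 adds to
acpi_scan_hot_remove() can be sketched as a userspace model.  It is a
simplification under stated assumptions: the ACPI namespace walk is
flattened into a plain array, locking is omitted, and all model_* names
are hypothetical.  The point it tries to show is that only devices
taken offline by this very pass (device_offline() returned 0) are put
back online on failure, while devices that were offline already are
left alone.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one physical companion device. */
struct model_node {
	bool offline;		/* current state of the device */
	int offline_result;	/* what its modelled .offline() returns */
	bool put_online;	/* set only if this pass offlined it */
};

/* 0: taken offline now, 1: was offline already, < 0: failure. */
static int model_device_offline(struct model_node *n)
{
	if (n->offline)
		return 1;
	if (n->offline_result)
		return n->offline_result;
	n->offline = true;
	return 0;
}

/* Returns 0 if removal may proceed, -EBUSY after a rollback. */
static int model_hot_remove(struct model_node *nodes, size_t count,
			    bool force)
{
	bool failed = false;
	size_t i;

	assert(nodes || !count);

	for (i = 0; i < count; i++) {
		int ret = model_device_offline(&nodes[i]);

		if (force)
			continue;	/* results are ignored when forced */

		if (ret < 0) {
			failed = true;
			break;
		}
		nodes[i].put_online = !ret;
	}

	if (!failed)
		return 0;

	/* Undo only what this pass has done. */
	for (i = 0; i < count; i++) {
		if (nodes[i].put_online) {
			nodes[i].offline = false;
			nodes[i].put_online = false;
		}
	}
	return -EBUSY;
}
```

With force set the model never fails, which matches the intent of
/sys/firmware/acpi/hotplug/force_remove: devices that refuse to go
offline are removed anyway.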

* [PATCH 4/4] ACPI / processor: Use common hotplug infrastructure
  2013-05-02 12:26 ` [PATCH 0/4] Driver core / ACPI: Add offline/online for graceful hot-removal of devices Rafael J. Wysocki
                     ` (2 preceding siblings ...)
  2013-05-02 12:29   ` [PATCH 3/4] ACPI / hotplug: Use device offline/online for graceful hot-removal Rafael J. Wysocki
@ 2013-05-02 12:31   ` Rafael J. Wysocki
  2013-05-02 13:59     ` Greg Kroah-Hartman
  2013-05-02 23:20     ` Toshi Kani
  2013-05-04  1:01     ` Rafael J. Wysocki
  4 siblings, 2 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-02 12:31 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis, Len Brown

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Split the ACPI processor driver into two parts, one that is
non-modular, resides in the ACPI core and handles the enumeration
and hotplug of processors and one that implements the rest of the
existing processor driver functionality.

The non-modular part uses an ACPI scan handler object to enumerate
processors on the basis of information provided by the ACPI namespace
and to hook up with the common ACPI hotplug infrastructure.  It also
populates the ACPI handle of each processor device having a
corresponding object in the ACPI namespace, which allows the driver
proper to bind to those devices, and makes the driver bind to them
if it is readily available (i.e. loaded) when the scan handler's
.attach() routine is running.

There are a few reasons to make this change.

First, switching the ACPI processor driver to using the common ACPI
hotplug infrastructure reduces code duplication and size considerably,
even though a new file is created along with a header comment etc.

Second, since the common hotplug code attempts to offline devices
before starting the (non-reversible) removal procedure, it will abort
(and possibly roll back) hot-remove operations involving processors
if cpu_down() returns an error code for one of them instead of
continuing them blindly (if /sys/firmware/acpi/hotplug/force_remove
is unset).  That is a more desirable behavior than what the current
code does.

Finally, the separation of the scan/hotplug part from the driver
proper makes it possible to simplify the driver's .remove() routine,
because it doesn't need to worry about the possible cleanup related
to processor removal any more (the scan/hotplug part is responsible
for that now) and can handle device removal and driver removal
symmetrically (i.e. as appropriate).

Some user-visible changes in sysfs are made (for example, the
'sysdev' link from the ACPI device node to the processor device's
directory is gone and a 'physical_node' link is present instead,
a 'firmware_node' link is present in the processor device's
directory, the processor driver is now visible under
/sys/bus/cpu/drivers/ and bound to the processor device), but
that shouldn't affect the functionality that users care about
(frequency scaling, C-states and thermal management).

Tested on my venerable Toshiba Portege R500.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/Makefile           |    1 
 drivers/acpi/acpi_processor.c   |  473 +++++++++++++++++++++++
 drivers/acpi/glue.c             |    6 
 drivers/acpi/internal.h         |    3 
 drivers/acpi/processor_driver.c |  803 +++-------------------------------------
 drivers/acpi/scan.c             |    1 
 drivers/base/cpu.c              |   11 
 include/acpi/processor.h        |    5 
 8 files changed, 574 insertions(+), 729 deletions(-)

Index: linux-pm/drivers/acpi/processor_driver.c
===================================================================
--- linux-pm.orig/drivers/acpi/processor_driver.c
+++ linux-pm/drivers/acpi/processor_driver.c
@@ -1,11 +1,13 @@
 /*
- * acpi_processor.c - ACPI Processor Driver ($Revision: 71 $)
+ * processor_driver.c - ACPI Processor Driver
  *
  *  Copyright (C) 2001, 2002 Andy Grover <andrew.grover@intel.com>
  *  Copyright (C) 2001, 2002 Paul Diefenbaugh <paul.s.diefenbaugh@intel.com>
  *  Copyright (C) 2004       Dominik Brodowski <linux@brodo.de>
  *  Copyright (C) 2004  Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
  *  			- Added processor hotplug support
+ *  Copyright (C) 2013, Intel Corporation
+ *                      Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  *
  * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  *
@@ -24,52 +26,29 @@
  *  59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
  *
  * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- *  TBD:
- *	1. Make # power states dynamic.
- *	2. Support duty_cycle values that span bit 4.
- *	3. Optimize by having scheduler determine business instead of
- *	   having us try to calculate it here.
- *	4. Need C1 timing -- must modify kernel (IRQ handler) to get this.
  */
 
 #include <linux/kernel.h>
 #include <linux/module.h>
 #include <linux/init.h>
-#include <linux/types.h>
-#include <linux/pci.h>
-#include <linux/pm.h>
 #include <linux/cpufreq.h>
 #include <linux/cpu.h>
-#include <linux/dmi.h>
-#include <linux/moduleparam.h>
 #include <linux/cpuidle.h>
 #include <linux/slab.h>
 #include <linux/acpi.h>
-#include <linux/memory_hotplug.h>
 
-#include <asm/io.h>
-#include <asm/cpu.h>
-#include <asm/delay.h>
-#include <asm/uaccess.h>
-#include <asm/processor.h>
-#include <asm/smp.h>
-#include <asm/acpi.h>
-
-#include <acpi/acpi_bus.h>
-#include <acpi/acpi_drivers.h>
 #include <acpi/processor.h>
 
+#include "internal.h"
+
 #define PREFIX "ACPI: "
 
-#define ACPI_PROCESSOR_CLASS		"processor"
-#define ACPI_PROCESSOR_DEVICE_NAME	"Processor"
 #define ACPI_PROCESSOR_FILE_INFO	"info"
 #define ACPI_PROCESSOR_FILE_THROTTLING	"throttling"
 #define ACPI_PROCESSOR_FILE_LIMIT	"limit"
 #define ACPI_PROCESSOR_NOTIFY_PERFORMANCE 0x80
 #define ACPI_PROCESSOR_NOTIFY_POWER	0x81
 #define ACPI_PROCESSOR_NOTIFY_THROTTLING	0x82
-#define ACPI_PROCESSOR_DEVICE_HID	"ACPI0007"
 
 #define ACPI_PROCESSOR_LIMIT_USER	0
 #define ACPI_PROCESSOR_LIMIT_THERMAL	1
@@ -81,12 +60,8 @@ MODULE_AUTHOR("Paul Diefenbaugh");
 MODULE_DESCRIPTION("ACPI Processor Driver");
 MODULE_LICENSE("GPL");
 
-static int acpi_processor_add(struct acpi_device *device);
-static int acpi_processor_remove(struct acpi_device *device);
-static void acpi_processor_notify(struct acpi_device *device, u32 event);
-static acpi_status acpi_processor_hotadd_init(struct acpi_processor *pr);
-static int acpi_processor_handle_eject(struct acpi_processor *pr);
-static int acpi_processor_start(struct acpi_processor *pr);
+static int acpi_processor_start(struct device *dev);
+static int acpi_processor_stop(struct device *dev);
 
 static const struct acpi_device_id processor_device_ids[] = {
 	{ACPI_PROCESSOR_OBJECT_HID, 0},
@@ -95,295 +70,27 @@ static const struct acpi_device_id proce
 };
 MODULE_DEVICE_TABLE(acpi, processor_device_ids);
 
-static struct acpi_driver acpi_processor_driver = {
+static struct device_driver acpi_processor_driver = {
 	.name = "processor",
-	.class = ACPI_PROCESSOR_CLASS,
-	.ids = processor_device_ids,
-	.ops = {
-		.add = acpi_processor_add,
-		.remove = acpi_processor_remove,
-		.notify = acpi_processor_notify,
-		},
+	.bus = &cpu_subsys,
+	.acpi_match_table = processor_device_ids,
+	.probe = acpi_processor_start,
+	.remove = acpi_processor_stop,
 };
 
-#define INSTALL_NOTIFY_HANDLER		1
-#define UNINSTALL_NOTIFY_HANDLER	2
-
 DEFINE_PER_CPU(struct acpi_processor *, processors);
 EXPORT_PER_CPU_SYMBOL(processors);
 
-struct acpi_processor_errata errata __read_mostly;
-
-/* --------------------------------------------------------------------------
-                                Errata Handling
-   -------------------------------------------------------------------------- */
-
-static int acpi_processor_errata_piix4(struct pci_dev *dev)
+static void acpi_processor_notify(acpi_handle handle, u32 event, void *data)
 {
-	u8 value1 = 0;
-	u8 value2 = 0;
-
-
-	if (!dev)
-		return -EINVAL;
-
-	/*
-	 * Note that 'dev' references the PIIX4 ACPI Controller.
-	 */
-
-	switch (dev->revision) {
-	case 0:
-		ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Found PIIX4 A-step\n"));
-		break;
-	case 1:
-		ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Found PIIX4 B-step\n"));
-		break;
-	case 2:
-		ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Found PIIX4E\n"));
-		break;
-	case 3:
-		ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Found PIIX4M\n"));
-		break;
-	default:
-		ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Found unknown PIIX4\n"));
-		break;
-	}
-
-	switch (dev->revision) {
-
-	case 0:		/* PIIX4 A-step */
-	case 1:		/* PIIX4 B-step */
-		/*
-		 * See specification changes #13 ("Manual Throttle Duty Cycle")
-		 * and #14 ("Enabling and Disabling Manual Throttle"), plus
-		 * erratum #5 ("STPCLK# Deassertion Time") from the January
-		 * 2002 PIIX4 specification update.  Applies to only older
-		 * PIIX4 models.
-		 */
-		errata.piix4.throttle = 1;
-
-	case 2:		/* PIIX4E */
-	case 3:		/* PIIX4M */
-		/*
-		 * See erratum #18 ("C3 Power State/BMIDE and Type-F DMA
-		 * Livelock") from the January 2002 PIIX4 specification update.
-		 * Applies to all PIIX4 models.
-		 */
-
-		/*
-		 * BM-IDE
-		 * ------
-		 * Find the PIIX4 IDE Controller and get the Bus Master IDE
-		 * Status register address.  We'll use this later to read
-		 * each IDE controller's DMA status to make sure we catch all
-		 * DMA activity.
-		 */
-		dev = pci_get_subsys(PCI_VENDOR_ID_INTEL,
-				     PCI_DEVICE_ID_INTEL_82371AB,
-				     PCI_ANY_ID, PCI_ANY_ID, NULL);
-		if (dev) {
-			errata.piix4.bmisx = pci_resource_start(dev, 4);
-			pci_dev_put(dev);
-		}
-
-		/*
-		 * Type-F DMA
-		 * ----------
-		 * Find the PIIX4 ISA Controller and read the Motherboard
-		 * DMA controller's status to see if Type-F (Fast) DMA mode
-		 * is enabled (bit 7) on either channel.  Note that we'll
-		 * disable C3 support if this is enabled, as some legacy
-		 * devices won't operate well if fast DMA is disabled.
-		 */
-		dev = pci_get_subsys(PCI_VENDOR_ID_INTEL,
-				     PCI_DEVICE_ID_INTEL_82371AB_0,
-				     PCI_ANY_ID, PCI_ANY_ID, NULL);
-		if (dev) {
-			pci_read_config_byte(dev, 0x76, &value1);
-			pci_read_config_byte(dev, 0x77, &value2);
-			if ((value1 & 0x80) || (value2 & 0x80))
-				errata.piix4.fdma = 1;
-			pci_dev_put(dev);
-		}
-
-		break;
-	}
-
-	if (errata.piix4.bmisx)
-		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
-				  "Bus master activity detection (BM-IDE) erratum enabled\n"));
-	if (errata.piix4.fdma)
-		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
-				  "Type-F DMA livelock erratum (C3 disabled)\n"));
-
-	return 0;
-}
-
-static int acpi_processor_errata(struct acpi_processor *pr)
-{
-	int result = 0;
-	struct pci_dev *dev = NULL;
-
-
-	if (!pr)
-		return -EINVAL;
-
-	/*
-	 * PIIX4
-	 */
-	dev = pci_get_subsys(PCI_VENDOR_ID_INTEL,
-			     PCI_DEVICE_ID_INTEL_82371AB_3, PCI_ANY_ID,
-			     PCI_ANY_ID, NULL);
-	if (dev) {
-		result = acpi_processor_errata_piix4(dev);
-		pci_dev_put(dev);
-	}
-
-	return result;
-}
-
-/* --------------------------------------------------------------------------
-                                 Driver Interface
-   -------------------------------------------------------------------------- */
-
-static int acpi_processor_get_info(struct acpi_device *device)
-{
-	acpi_status status = 0;
-	union acpi_object object = { 0 };
-	struct acpi_buffer buffer = { sizeof(union acpi_object), &object };
+	struct acpi_device *device = data;
 	struct acpi_processor *pr;
-	int cpu_index, device_declaration = 0;
-	static int cpu0_initialized;
-
-	pr = acpi_driver_data(device);
-	if (!pr)
-		return -EINVAL;
-
-	if (num_online_cpus() > 1)
-		errata.smp = TRUE;
-
-	acpi_processor_errata(pr);
-
-	/*
-	 * Check to see if we have bus mastering arbitration control.  This
-	 * is required for proper C3 usage (to maintain cache coherency).
-	 */
-	if (acpi_gbl_FADT.pm2_control_block && acpi_gbl_FADT.pm2_control_length) {
-		pr->flags.bm_control = 1;
-		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
-				  "Bus mastering arbitration control present\n"));
-	} else
-		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
-				  "No bus mastering arbitration control\n"));
-
-	if (!strcmp(acpi_device_hid(device), ACPI_PROCESSOR_OBJECT_HID)) {
-		/* Declared with "Processor" statement; match ProcessorID */
-		status = acpi_evaluate_object(pr->handle, NULL, NULL, &buffer);
-		if (ACPI_FAILURE(status)) {
-			dev_err(&device->dev,
-				"Failed to evaluate processor object (0x%x)\n",
-				status);
-			return -ENODEV;
-		}
-
-		/*
-		 * TBD: Synch processor ID (via LAPIC/LSAPIC structures) on SMP.
-		 *      >>> 'acpi_get_processor_id(acpi_id, &id)' in
-		 *      arch/xxx/acpi.c
-		 */
-		pr->acpi_id = object.processor.proc_id;
-	} else {
-		/*
-		 * Declared with "Device" statement; match _UID.
-		 * Note that we don't handle string _UIDs yet.
-		 */
-		unsigned long long value;
-		status = acpi_evaluate_integer(pr->handle, METHOD_NAME__UID,
-						NULL, &value);
-		if (ACPI_FAILURE(status)) {
-			dev_err(&device->dev,
-				"Failed to evaluate processor _UID (0x%x)\n",
-				status);
-			return -ENODEV;
-		}
-		device_declaration = 1;
-		pr->acpi_id = value;
-	}
-	cpu_index = acpi_get_cpuid(pr->handle, device_declaration, pr->acpi_id);
-
-	/* Handle UP system running SMP kernel, with no LAPIC in MADT */
-	if (!cpu0_initialized && (cpu_index == -1) &&
-	    (num_online_cpus() == 1)) {
-		cpu_index = 0;
-	}
-
-	cpu0_initialized = 1;
-
-	pr->id = cpu_index;
-
-	/*
-	 *  Extra Processor objects may be enumerated on MP systems with
-	 *  less than the max # of CPUs. They should be ignored _iff
-	 *  they are physically not present.
-	 */
-	if (pr->id == -1) {
-		if (ACPI_FAILURE(acpi_processor_hotadd_init(pr)))
-			return -ENODEV;
-	}
-	/*
-	 * On some boxes several processors use the same processor bus id.
-	 * But they are located in different scope. For example:
-	 * \_SB.SCK0.CPU0
-	 * \_SB.SCK1.CPU0
-	 * Rename the processor device bus id. And the new bus id will be
-	 * generated as the following format:
-	 * CPU+CPU ID.
-	 */
-	sprintf(acpi_device_bid(device), "CPU%X", pr->id);
-	ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Processor [%d:%d]\n", pr->id,
-			  pr->acpi_id));
-
-	if (!object.processor.pblk_address)
-		ACPI_DEBUG_PRINT((ACPI_DB_INFO, "No PBLK (NULL address)\n"));
-	else if (object.processor.pblk_length != 6)
-		dev_err(&device->dev, "Invalid PBLK length [%d]\n",
-			    object.processor.pblk_length);
-	else {
-		pr->throttling.address = object.processor.pblk_address;
-		pr->throttling.duty_offset = acpi_gbl_FADT.duty_offset;
-		pr->throttling.duty_width = acpi_gbl_FADT.duty_width;
-
-		pr->pblk = object.processor.pblk_address;
-
-		/*
-		 * We don't care about error returns - we just try to mark
-		 * these reserved so that nobody else is confused into thinking
-		 * that this region might be unused..
-		 *
-		 * (In particular, allocating the IO range for Cardbus)
-		 */
-		request_region(pr->throttling.address, 6, "ACPI CPU throttle");
-	}
-
-	/*
-	 * If ACPI describes a slot number for this CPU, we can use it
-	 * ensure we get the right value in the "physical id" field
-	 * of /proc/cpuinfo
-	 */
-	status = acpi_evaluate_object(pr->handle, "_SUN", NULL, &buffer);
-	if (ACPI_SUCCESS(status))
-		arch_fix_phys_package_id(pr->id, object.integer.value);
-
-	return 0;
-}
-
-static DEFINE_PER_CPU(void *, processor_device_array);
-
-static void acpi_processor_notify(struct acpi_device *device, u32 event)
-{
-	struct acpi_processor *pr = acpi_driver_data(device);
 	int saved;
 
+	if (device->handle != handle)
+		return;
+
+	pr = acpi_driver_data(device);
 	if (!pr)
 		return;
 
@@ -420,32 +127,40 @@ static void acpi_processor_notify(struct
 	return;
 }
 
-static int acpi_cpu_soft_notify(struct notifier_block *nfb,
-		unsigned long action, void *hcpu)
+static __cpuinit int __acpi_processor_start(struct acpi_device *device);
+
+static int __cpuinit acpi_cpu_soft_notify(struct notifier_block *nfb,
+					  unsigned long action, void *hcpu)
 {
 	unsigned int cpu = (unsigned long)hcpu;
 	struct acpi_processor *pr = per_cpu(processors, cpu);
+	struct acpi_device *device;
+
+	if (!pr || acpi_bus_get_device(pr->handle, &device))
+		return NOTIFY_DONE;
 
-	if (action == CPU_ONLINE && pr) {
-		/* CPU got physically hotplugged and onlined the first time:
-		 * Initialize missing things
+	if (action == CPU_ONLINE) {
+		/*
+		 * CPU got physically hotplugged and onlined for the first time:
+		 * Initialize missing things.
 		 */
 		if (pr->flags.need_hotplug_init) {
+			int ret;
+
 			pr_info("Will online and init hotplugged CPU: %d\n",
 				pr->id);
-			WARN(acpi_processor_start(pr), "Failed to start CPU:"
-				" %d\n", pr->id);
 			pr->flags.need_hotplug_init = 0;
-		/* Normal CPU soft online event */
+			ret = __acpi_processor_start(device);
+			WARN(ret, "Failed to start CPU: %d\n", pr->id);
 		} else {
+			/* Normal CPU soft online event. */
 			acpi_processor_ppc_has_changed(pr, 0);
 			acpi_processor_hotplug(pr);
 			acpi_processor_reevaluate_tstate(pr, action);
 			acpi_processor_tstate_has_changed(pr);
 		}
-	}
-	if (action == CPU_DEAD && pr) {
-		/* invalidate the flag.throttling after one CPU is offline */
+	} else if (action == CPU_DEAD) {
+		/* Invalidate flag.throttling after the CPU is offline. */
 		acpi_processor_reevaluate_tstate(pr, action);
 	}
 	return NOTIFY_OK;
@@ -456,19 +171,18 @@ static struct notifier_block acpi_cpu_no
 	    .notifier_call = acpi_cpu_soft_notify,
 };
 
-/*
- * acpi_processor_start() is called by the cpu_hotplug_notifier func:
- * acpi_cpu_soft_notify(). Getting it __cpuinit{data} is difficult, the
- * root cause seem to be that acpi_processor_uninstall_hotplug_notify()
- * is in the module_exit (__exit) func. Allowing acpi_processor_start()
- * to not be in __cpuinit section, but being called from __cpuinit funcs
- * via __ref looks like the right thing to do here.
- */
-static __ref int acpi_processor_start(struct acpi_processor *pr)
+static __cpuinit int __acpi_processor_start(struct acpi_device *device)
 {
-	struct acpi_device *device = per_cpu(processor_device_array, pr->id);
+	struct acpi_processor *pr = acpi_driver_data(device);
+	acpi_status status;
 	int result = 0;
 
+	if (!pr)
+		return -ENODEV;
+
+	if (pr->flags.need_hotplug_init)
+		return 0;
+
 #ifdef CONFIG_CPU_FREQ
 	acpi_processor_ppc_has_changed(pr, 0);
 	acpi_processor_load_module(pr);
@@ -506,129 +220,48 @@ static __ref int acpi_processor_start(st
 		goto err_remove_sysfs_thermal;
 	}
 
-	return 0;
+	status = acpi_install_notify_handler(device->handle, ACPI_DEVICE_NOTIFY,
+					     acpi_processor_notify, device);
+	if (ACPI_SUCCESS(status))
+		return 0;
 
-err_remove_sysfs_thermal:
+	sysfs_remove_link(&pr->cdev->device.kobj, "device");
+ err_remove_sysfs_thermal:
 	sysfs_remove_link(&device->dev.kobj, "thermal_cooling");
-err_thermal_unregister:
+ err_thermal_unregister:
 	thermal_cooling_device_unregister(pr->cdev);
-err_power_exit:
+ err_power_exit:
 	acpi_processor_power_exit(pr);
-
 	return result;
 }
 
-/*
- * Do not put anything in here which needs the core to be online.
- * For example MSR access or setting up things which check for cpuinfo_x86
- * (cpu_data(cpu)) values, like CPU feature flags, family, model, etc.
- * Such things have to be put in and set up above in acpi_processor_start()
- */
-static int __cpuinit acpi_processor_add(struct acpi_device *device)
+static int __cpuinit acpi_processor_start(struct device *dev)
 {
-	struct acpi_processor *pr = NULL;
-	int result = 0;
-	struct device *dev;
+	struct acpi_device *device;
 
-	pr = kzalloc(sizeof(struct acpi_processor), GFP_KERNEL);
-	if (!pr)
-		return -ENOMEM;
+	if (acpi_bus_get_device(ACPI_HANDLE(dev), &device))
+		return -ENODEV;
 
-	if (!zalloc_cpumask_var(&pr->throttling.shared_cpu_map, GFP_KERNEL)) {
-		result = -ENOMEM;
-		goto err_free_pr;
-	}
-
-	pr->handle = device->handle;
-	strcpy(acpi_device_name(device), ACPI_PROCESSOR_DEVICE_NAME);
-	strcpy(acpi_device_class(device), ACPI_PROCESSOR_CLASS);
-	device->driver_data = pr;
-
-	result = acpi_processor_get_info(device);
-	if (result) {
-		/* Processor is physically not present */
-		return 0;
-	}
-
-#ifdef CONFIG_SMP
-	if (pr->id >= setup_max_cpus && pr->id != 0)
-		return 0;
-#endif
-
-	BUG_ON(pr->id >= nr_cpu_ids);
-
-	/*
-	 * Buggy BIOS check
-	 * ACPI id of processors can be reported wrongly by the BIOS.
-	 * Don't trust it blindly
-	 */
-	if (per_cpu(processor_device_array, pr->id) != NULL &&
-	    per_cpu(processor_device_array, pr->id) != device) {
-		dev_warn(&device->dev,
-			"BIOS reported wrong ACPI id %d for the processor\n",
-			pr->id);
-		result = -ENODEV;
-		goto err_free_cpumask;
-	}
-	per_cpu(processor_device_array, pr->id) = device;
-
-	per_cpu(processors, pr->id) = pr;
-
-	dev = get_cpu_device(pr->id);
-	if (sysfs_create_link(&device->dev.kobj, &dev->kobj, "sysdev")) {
-		result = -EFAULT;
-		goto err_clear_processor;
-	}
-
-	/*
-	 * Do not start hotplugged CPUs now, but when they
-	 * are onlined the first time
-	 */
-	if (pr->flags.need_hotplug_init)
-		return 0;
-
-	result = acpi_processor_start(pr);
-	if (result)
-		goto err_remove_sysfs;
-
-	return 0;
-
-err_remove_sysfs:
-	sysfs_remove_link(&device->dev.kobj, "sysdev");
-err_clear_processor:
-	/*
-	 * processor_device_array is not cleared to allow checks for buggy BIOS
-	 */ 
-	per_cpu(processors, pr->id) = NULL;
-err_free_cpumask:
-	free_cpumask_var(pr->throttling.shared_cpu_map);
-err_free_pr:
-	kfree(pr);
-	return result;
+	return __acpi_processor_start(device);
 }
 
-static int acpi_processor_remove(struct acpi_device *device)
+static int acpi_processor_stop(struct device *dev)
 {
-	struct acpi_processor *pr = NULL;
+	struct acpi_device *device;
+	struct acpi_processor *pr;
 
+	if (acpi_bus_get_device(ACPI_HANDLE(dev), &device))
+		return 0;
 
-	if (!device || !acpi_driver_data(device))
-		return -EINVAL;
+	acpi_remove_notify_handler(device->handle, ACPI_DEVICE_NOTIFY,
+				   acpi_processor_notify);
 
 	pr = acpi_driver_data(device);
-
-	if (pr->id >= nr_cpu_ids)
-		goto free;
-
-	if (device->removal_type == ACPI_BUS_REMOVAL_EJECT) {
-		if (acpi_processor_handle_eject(pr))
-			return -EINVAL;
-	}
+	if (!pr)
+		return 0;
 
 	acpi_processor_power_exit(pr);
 
-	sysfs_remove_link(&device->dev.kobj, "sysdev");
-
 	if (pr->cdev) {
 		sysfs_remove_link(&device->dev.kobj, "thermal_cooling");
 		sysfs_remove_link(&pr->cdev->device.kobj, "device");
@@ -637,331 +270,47 @@ static int acpi_processor_remove(struct
 	}
 
 	per_cpu(processors, pr->id) = NULL;
-	per_cpu(processor_device_array, pr->id) = NULL;
-	try_offline_node(cpu_to_node(pr->id));
-
-free:
-	free_cpumask_var(pr->throttling.shared_cpu_map);
-	kfree(pr);
-
 	return 0;
 }
 
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
-/****************************************************************************
- * 	Acpi processor hotplug support 				       	    *
- ****************************************************************************/
-
-static int is_processor_present(acpi_handle handle)
-{
-	acpi_status status;
-	unsigned long long sta = 0;
-
-
-	status = acpi_evaluate_integer(handle, "_STA", NULL, &sta);
-
-	if (ACPI_SUCCESS(status) && (sta & ACPI_STA_DEVICE_PRESENT))
-		return 1;
-
-	/*
-	 * _STA is mandatory for a processor that supports hot plug
-	 */
-	if (status == AE_NOT_FOUND)
-		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
-				"Processor does not support hot plug\n"));
-	else
-		ACPI_EXCEPTION((AE_INFO, status,
-				"Processor Device is not present"));
-	return 0;
-}
-
-static void acpi_processor_hotplug_notify(acpi_handle handle,
-					  u32 event, void *data)
-{
-	struct acpi_device *device = NULL;
-	struct acpi_eject_event *ej_event = NULL;
-	u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
-	acpi_status status;
-	int result;
-
-	acpi_scan_lock_acquire();
-
-	switch (event) {
-	case ACPI_NOTIFY_BUS_CHECK:
-	case ACPI_NOTIFY_DEVICE_CHECK:
-		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
-		"Processor driver received %s event\n",
-		       (event == ACPI_NOTIFY_BUS_CHECK) ?
-		       "ACPI_NOTIFY_BUS_CHECK" : "ACPI_NOTIFY_DEVICE_CHECK"));
-
-		if (!is_processor_present(handle))
-			break;
-
-		if (!acpi_bus_get_device(handle, &device))
-			break;
-
-		result = acpi_bus_scan(handle);
-		if (result) {
-			acpi_handle_err(handle, "Unable to add the device\n");
-			break;
-		}
-		result = acpi_bus_get_device(handle, &device);
-		if (result) {
-			acpi_handle_err(handle, "Missing device object\n");
-			break;
-		}
-		ost_code = ACPI_OST_SC_SUCCESS;
-		break;
-
-	case ACPI_NOTIFY_EJECT_REQUEST:
-		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
-				  "received ACPI_NOTIFY_EJECT_REQUEST\n"));
-
-		if (acpi_bus_get_device(handle, &device)) {
-			acpi_handle_err(handle,
-				"Device don't exist, dropping EJECT\n");
-			break;
-		}
-		if (!acpi_driver_data(device)) {
-			acpi_handle_err(handle,
-				"Driver data is NULL, dropping EJECT\n");
-			break;
-		}
-
-		ej_event = kmalloc(sizeof(*ej_event), GFP_KERNEL);
-		if (!ej_event) {
-			acpi_handle_err(handle, "No memory, dropping EJECT\n");
-			break;
-		}
-
-		get_device(&device->dev);
-		ej_event->device = device;
-		ej_event->event = ACPI_NOTIFY_EJECT_REQUEST;
-		/* The eject is carried out asynchronously. */
-		status = acpi_os_hotplug_execute(acpi_bus_hot_remove_device,
-						 ej_event);
-		if (ACPI_FAILURE(status)) {
-			put_device(&device->dev);
-			kfree(ej_event);
-			break;
-		}
-		goto out;
-
-	default:
-		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
-				  "Unsupported event [0x%x]\n", event));
-
-		/* non-hotplug event; possibly handled by other handler */
-		goto out;
-	}
-
-	/* Inform firmware that the hotplug operation has completed */
-	(void) acpi_evaluate_hotplug_ost(handle, event, ost_code, NULL);
-
- out:
-	acpi_scan_lock_release();
-}
-
-static acpi_status is_processor_device(acpi_handle handle)
-{
-	struct acpi_device_info *info;
-	char *hid;
-	acpi_status status;
-
-	status = acpi_get_object_info(handle, &info);
-	if (ACPI_FAILURE(status))
-		return status;
-
-	if (info->type == ACPI_TYPE_PROCESSOR) {
-		kfree(info);
-		return AE_OK;	/* found a processor object */
-	}
-
-	if (!(info->valid & ACPI_VALID_HID)) {
-		kfree(info);
-		return AE_ERROR;
-	}
-
-	hid = info->hardware_id.string;
-	if ((hid == NULL) || strcmp(hid, ACPI_PROCESSOR_DEVICE_HID)) {
-		kfree(info);
-		return AE_ERROR;
-	}
-
-	kfree(info);
-	return AE_OK;	/* found a processor device object */
-}
-
-static acpi_status
-processor_walk_namespace_cb(acpi_handle handle,
-			    u32 lvl, void *context, void **rv)
-{
-	acpi_status status;
-	int *action = context;
-
-	status = is_processor_device(handle);
-	if (ACPI_FAILURE(status))
-		return AE_OK;	/* not a processor; continue to walk */
-
-	switch (*action) {
-	case INSTALL_NOTIFY_HANDLER:
-		acpi_install_notify_handler(handle,
-					    ACPI_SYSTEM_NOTIFY,
-					    acpi_processor_hotplug_notify,
-					    NULL);
-		break;
-	case UNINSTALL_NOTIFY_HANDLER:
-		acpi_remove_notify_handler(handle,
-					   ACPI_SYSTEM_NOTIFY,
-					   acpi_processor_hotplug_notify);
-		break;
-	default:
-		break;
-	}
-
-	/* found a processor; skip walking underneath */
-	return AE_CTRL_DEPTH;
-}
-
-static acpi_status acpi_processor_hotadd_init(struct acpi_processor *pr)
-{
-	acpi_handle handle = pr->handle;
-
-	if (!is_processor_present(handle)) {
-		return AE_ERROR;
-	}
-
-	if (acpi_map_lsapic(handle, &pr->id))
-		return AE_ERROR;
-
-	if (arch_register_cpu(pr->id)) {
-		acpi_unmap_lsapic(pr->id);
-		return AE_ERROR;
-	}
-
-	/* CPU got hot-plugged, but cpu_data is not initialized yet
-	 * Set flag to delay cpu_idle/throttling initialization
-	 * in:
-	 * acpi_processor_add()
-	 *   acpi_processor_get_info()
-	 * and do it when the CPU gets online the first time
-	 * TBD: Cleanup above functions and try to do this more elegant.
-	 */
-	pr_info("CPU %d got hotplugged\n", pr->id);
-	pr->flags.need_hotplug_init = 1;
-
-	return AE_OK;
-}
-
-static int acpi_processor_handle_eject(struct acpi_processor *pr)
-{
-	if (cpu_online(pr->id))
-		cpu_down(pr->id);
-
-	get_online_cpus();
-	/*
-	 * The cpu might become online again at this point. So we check whether
-	 * the cpu has been onlined or not. If the cpu became online, it means
-	 * that someone wants to use the cpu. So acpi_processor_handle_eject()
-	 * returns -EAGAIN.
-	 */
-	if (unlikely(cpu_online(pr->id))) {
-		put_online_cpus();
-		pr_warn("Failed to remove CPU %d, because other task "
-			"brought the CPU back online\n", pr->id);
-		return -EAGAIN;
-	}
-	arch_unregister_cpu(pr->id);
-	acpi_unmap_lsapic(pr->id);
-	put_online_cpus();
-	return (0);
-}
-#else
-static acpi_status acpi_processor_hotadd_init(struct acpi_processor *pr)
-{
-	return AE_ERROR;
-}
-static int acpi_processor_handle_eject(struct acpi_processor *pr)
-{
-	return (-EINVAL);
-}
-#endif
-
-static
-void acpi_processor_install_hotplug_notify(void)
-{
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
-	int action = INSTALL_NOTIFY_HANDLER;
-	acpi_walk_namespace(ACPI_TYPE_ANY,
-			    ACPI_ROOT_OBJECT,
-			    ACPI_UINT32_MAX,
-			    processor_walk_namespace_cb, NULL, &action, NULL);
-#endif
-	register_hotcpu_notifier(&acpi_cpu_notifier);
-}
-
-static
-void acpi_processor_uninstall_hotplug_notify(void)
-{
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
-	int action = UNINSTALL_NOTIFY_HANDLER;
-	acpi_walk_namespace(ACPI_TYPE_ANY,
-			    ACPI_ROOT_OBJECT,
-			    ACPI_UINT32_MAX,
-			    processor_walk_namespace_cb, NULL, &action, NULL);
-#endif
-	unregister_hotcpu_notifier(&acpi_cpu_notifier);
-}
-
 /*
  * We keep the driver loaded even when ACPI is not running.
  * This is needed for the powernow-k8 driver, that works even without
  * ACPI, but needs symbols from this driver
  */
 
-static int __init acpi_processor_init(void)
+static int __init acpi_processor_driver_init(void)
 {
 	int result = 0;
 
 	if (acpi_disabled)
 		return 0;
 
-	result = acpi_bus_register_driver(&acpi_processor_driver);
+	result = driver_register(&acpi_processor_driver);
 	if (result < 0)
 		return result;
 
 	acpi_processor_syscore_init();
-
-	acpi_processor_install_hotplug_notify();
-
+	register_hotcpu_notifier(&acpi_cpu_notifier);
 	acpi_thermal_cpufreq_init();
-
 	acpi_processor_ppc_init();
-
 	acpi_processor_throttling_init();
-
 	return 0;
 }
 
-static void __exit acpi_processor_exit(void)
+static void __exit acpi_processor_driver_exit(void)
 {
 	if (acpi_disabled)
 		return;
 
 	acpi_processor_ppc_exit();
-
 	acpi_thermal_cpufreq_exit();
-
-	acpi_processor_uninstall_hotplug_notify();
-
+	unregister_hotcpu_notifier(&acpi_cpu_notifier);
 	acpi_processor_syscore_exit();
-
-	acpi_bus_unregister_driver(&acpi_processor_driver);
-
-	return;
+	driver_unregister(&acpi_processor_driver);
 }
 
-module_init(acpi_processor_init);
-module_exit(acpi_processor_exit);
+module_init(acpi_processor_driver_init);
+module_exit(acpi_processor_driver_exit);
 
 MODULE_ALIAS("processor");
Index: linux-pm/drivers/acpi/glue.c
===================================================================
--- linux-pm.orig/drivers/acpi/glue.c
+++ linux-pm/drivers/acpi/glue.c
@@ -105,7 +105,7 @@ acpi_handle acpi_get_child(acpi_handle p
 }
 EXPORT_SYMBOL(acpi_get_child);
 
-static int acpi_bind_one(struct device *dev, acpi_handle handle)
+int acpi_bind_one(struct device *dev, acpi_handle handle)
 {
 	struct acpi_device *acpi_dev;
 	acpi_status status;
@@ -188,8 +188,9 @@ static int acpi_bind_one(struct device *
 	kfree(physical_node);
 	goto err;
 }
+EXPORT_SYMBOL_GPL(acpi_bind_one);
 
-static int acpi_unbind_one(struct device *dev)
+int acpi_unbind_one(struct device *dev)
 {
 	struct acpi_device_physical_node *entry;
 	struct acpi_device *acpi_dev;
@@ -238,6 +239,7 @@ err:
 	dev_err(dev, "Oops, 'acpi_handle' corrupt\n");
 	return -EINVAL;
 }
+EXPORT_SYMBOL_GPL(acpi_unbind_one);
 
 static int acpi_platform_notify(struct device *dev)
 {
Index: linux-pm/drivers/acpi/internal.h
===================================================================
--- linux-pm.orig/drivers/acpi/internal.h
+++ linux-pm/drivers/acpi/internal.h
@@ -33,6 +33,7 @@ static inline void acpi_pci_slot_init(vo
 void acpi_pci_root_init(void);
 void acpi_pci_link_init(void);
 void acpi_pci_root_hp_init(void);
+void acpi_processor_init(void);
 void acpi_platform_init(void);
 int acpi_sysfs_init(void);
 void acpi_csrt_init(void);
@@ -79,6 +80,8 @@ void acpi_init_device_object(struct acpi
 			     int type, unsigned long long sta);
 void acpi_device_add_finalize(struct acpi_device *device);
 void acpi_free_pnp_ids(struct acpi_device_pnp *pnp);
+int acpi_bind_one(struct device *dev, acpi_handle handle);
+int acpi_unbind_one(struct device *dev);
 
 /* --------------------------------------------------------------------------
                                   Power Resource
Index: linux-pm/drivers/acpi/acpi_processor.c
===================================================================
--- /dev/null
+++ linux-pm/drivers/acpi/acpi_processor.c
@@ -0,0 +1,473 @@
+/*
+ * acpi_processor.c - ACPI processor enumeration support
+ *
+ * Copyright (C) 2001, 2002 Andy Grover <andrew.grover@intel.com>
+ * Copyright (C) 2001, 2002 Paul Diefenbaugh <paul.s.diefenbaugh@intel.com>
+ * Copyright (C) 2004       Dominik Brodowski <linux@brodo.de>
+ * Copyright (C) 2004  Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
+ * Copyright (C) 2013, Intel Corporation
+ *                     Rafael J. Wysocki <rafael.j.wysocki@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+
+#include <linux/acpi.h>
+#include <linux/device.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+
+#include <acpi/processor.h>
+
+#include <asm/cpu.h>
+
+#include "internal.h"
+
+#define _COMPONENT	ACPI_PROCESSOR_COMPONENT
+
+ACPI_MODULE_NAME("processor");
+
+/* --------------------------------------------------------------------------
+                                Errata Handling
+   -------------------------------------------------------------------------- */
+
+struct acpi_processor_errata errata __read_mostly;
+EXPORT_SYMBOL_GPL(errata);
+
+static int acpi_processor_errata_piix4(struct pci_dev *dev)
+{
+	u8 value1 = 0;
+	u8 value2 = 0;
+
+
+	if (!dev)
+		return -EINVAL;
+
+	/*
+	 * Note that 'dev' references the PIIX4 ACPI Controller.
+	 */
+
+	switch (dev->revision) {
+	case 0:
+		ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Found PIIX4 A-step\n"));
+		break;
+	case 1:
+		ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Found PIIX4 B-step\n"));
+		break;
+	case 2:
+		ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Found PIIX4E\n"));
+		break;
+	case 3:
+		ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Found PIIX4M\n"));
+		break;
+	default:
+		ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Found unknown PIIX4\n"));
+		break;
+	}
+
+	switch (dev->revision) {
+
+	case 0:		/* PIIX4 A-step */
+	case 1:		/* PIIX4 B-step */
+		/*
+		 * See specification changes #13 ("Manual Throttle Duty Cycle")
+		 * and #14 ("Enabling and Disabling Manual Throttle"), plus
+		 * erratum #5 ("STPCLK# Deassertion Time") from the January
+		 * 2002 PIIX4 specification update.  Applies only to older
+		 * PIIX4 models.
+		 */
+		errata.piix4.throttle = 1;
+
+	case 2:		/* PIIX4E */
+	case 3:		/* PIIX4M */
+		/*
+		 * See erratum #18 ("C3 Power State/BMIDE and Type-F DMA
+		 * Livelock") from the January 2002 PIIX4 specification update.
+		 * Applies to all PIIX4 models.
+		 */
+
+		/*
+		 * BM-IDE
+		 * ------
+		 * Find the PIIX4 IDE Controller and get the Bus Master IDE
+		 * Status register address.  We'll use this later to read
+		 * each IDE controller's DMA status to make sure we catch all
+		 * DMA activity.
+		 */
+		dev = pci_get_subsys(PCI_VENDOR_ID_INTEL,
+				     PCI_DEVICE_ID_INTEL_82371AB,
+				     PCI_ANY_ID, PCI_ANY_ID, NULL);
+		if (dev) {
+			errata.piix4.bmisx = pci_resource_start(dev, 4);
+			pci_dev_put(dev);
+		}
+
+		/*
+		 * Type-F DMA
+		 * ----------
+		 * Find the PIIX4 ISA Controller and read the Motherboard
+		 * DMA controller's status to see if Type-F (Fast) DMA mode
+		 * is enabled (bit 7) on either channel.  Note that we'll
+		 * disable C3 support if this is enabled, as some legacy
+		 * devices won't operate well if fast DMA is disabled.
+		 */
+		dev = pci_get_subsys(PCI_VENDOR_ID_INTEL,
+				     PCI_DEVICE_ID_INTEL_82371AB_0,
+				     PCI_ANY_ID, PCI_ANY_ID, NULL);
+		if (dev) {
+			pci_read_config_byte(dev, 0x76, &value1);
+			pci_read_config_byte(dev, 0x77, &value2);
+			if ((value1 & 0x80) || (value2 & 0x80))
+				errata.piix4.fdma = 1;
+			pci_dev_put(dev);
+		}
+
+		break;
+	}
+
+	if (errata.piix4.bmisx)
+		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
+				  "Bus master activity detection (BM-IDE) erratum enabled\n"));
+	if (errata.piix4.fdma)
+		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
+				  "Type-F DMA livelock erratum (C3 disabled)\n"));
+
+	return 0;
+}
+
+static int acpi_processor_errata(struct acpi_processor *pr)
+{
+	int result = 0;
+	struct pci_dev *dev = NULL;
+
+
+	if (!pr)
+		return -EINVAL;
+
+	/*
+	 * PIIX4
+	 */
+	dev = pci_get_subsys(PCI_VENDOR_ID_INTEL,
+			     PCI_DEVICE_ID_INTEL_82371AB_3, PCI_ANY_ID,
+			     PCI_ANY_ID, NULL);
+	if (dev) {
+		result = acpi_processor_errata_piix4(dev);
+		pci_dev_put(dev);
+	}
+
+	return result;
+}
+
+/* --------------------------------------------------------------------------
+                                Initialization
+   -------------------------------------------------------------------------- */
+
+static int acpi_processor_hotadd_init(struct acpi_processor *pr)
+{
+	unsigned long long sta;
+	acpi_status status;
+	int ret;
+
+	status = acpi_evaluate_integer(pr->handle, "_STA", NULL, &sta);
+	if (ACPI_FAILURE(status) || !(sta & ACPI_STA_DEVICE_PRESENT))
+		return -ENODEV;
+
+	ret = acpi_map_lsapic(pr->handle, &pr->id);
+	if (ret)
+		return ret;
+
+	ret = arch_register_cpu(pr->id);
+	if (ret) {
+		acpi_unmap_lsapic(pr->id);
+		return ret;
+	}
+
+	/*
+	 * CPU got hot-added, but cpu_data is not initialized yet.  Set a flag
+	 * to delay cpu_idle/throttling initialization and do it when the CPU
+	 * gets online for the first time.
+	 */
+	pr_info("CPU%d has been hot-added\n", pr->id);
+	pr->flags.need_hotplug_init = 1;
+	return 0;
+}
+
+static int acpi_processor_get_info(struct acpi_device *device)
+{
+	union acpi_object object = { 0 };
+	struct acpi_buffer buffer = { sizeof(union acpi_object), &object };
+	struct acpi_processor *pr = acpi_driver_data(device);
+	int cpu_index, device_declaration = 0;
+	acpi_status status = AE_OK;
+	static int cpu0_initialized;
+
+	if (num_online_cpus() > 1)
+		errata.smp = TRUE;
+
+	acpi_processor_errata(pr);
+
+	/*
+	 * Check to see if we have bus mastering arbitration control.  This
+	 * is required for proper C3 usage (to maintain cache coherency).
+	 */
+	if (acpi_gbl_FADT.pm2_control_block && acpi_gbl_FADT.pm2_control_length) {
+		pr->flags.bm_control = 1;
+		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
+				  "Bus mastering arbitration control present\n"));
+	} else
+		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
+				  "No bus mastering arbitration control\n"));
+
+	if (!strcmp(acpi_device_hid(device), ACPI_PROCESSOR_OBJECT_HID)) {
+		/* Declared with "Processor" statement; match ProcessorID */
+		status = acpi_evaluate_object(pr->handle, NULL, NULL, &buffer);
+		if (ACPI_FAILURE(status)) {
+			dev_err(&device->dev,
+				"Failed to evaluate processor object (0x%x)\n",
+				status);
+			return -ENODEV;
+		}
+
+		/*
+		 * TBD: Synch processor ID (via LAPIC/LSAPIC structures) on SMP.
+		 *      >>> 'acpi_get_processor_id(acpi_id, &id)' in
+		 *      arch/xxx/acpi.c
+		 */
+		pr->acpi_id = object.processor.proc_id;
+	} else {
+		/*
+		 * Declared with "Device" statement; match _UID.
+		 * Note that we don't handle string _UIDs yet.
+		 */
+		unsigned long long value;
+		status = acpi_evaluate_integer(pr->handle, METHOD_NAME__UID,
+						NULL, &value);
+		if (ACPI_FAILURE(status)) {
+			dev_err(&device->dev,
+				"Failed to evaluate processor _UID (0x%x)\n",
+				status);
+			return -ENODEV;
+		}
+		device_declaration = 1;
+		pr->acpi_id = value;
+	}
+	cpu_index = acpi_get_cpuid(pr->handle, device_declaration, pr->acpi_id);
+
+	/* Handle UP system running SMP kernel, with no LAPIC in MADT */
+	if (!cpu0_initialized && (cpu_index == -1) &&
+	    (num_online_cpus() == 1)) {
+		cpu_index = 0;
+	}
+
+	cpu0_initialized = 1;
+
+	pr->id = cpu_index;
+
+	/*
+	 * Extra Processor objects may be enumerated on MP systems with
+	 * fewer than the max # of CPUs.  They should be ignored _iff
+	 * they are physically not present.
+	 */
+	if (pr->id == -1) {
+		int ret = acpi_processor_hotadd_init(pr);
+		if (ret)
+			return ret;
+	}
+	/*
+	 * On some boxes several processors use the same processor bus id
+	 * even though they are located in different scopes.  For example:
+	 * \_SB.SCK0.CPU0
+	 * \_SB.SCK1.CPU0
+	 * Rename the processor device bus id so that it is unique.  The new
+	 * bus id is generated in the following format:
+	 * CPU+CPU ID.
+	 */
+	sprintf(acpi_device_bid(device), "CPU%X", pr->id);
+	ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Processor [%d:%d]\n", pr->id,
+			  pr->acpi_id));
+
+	if (!object.processor.pblk_address)
+		ACPI_DEBUG_PRINT((ACPI_DB_INFO, "No PBLK (NULL address)\n"));
+	else if (object.processor.pblk_length != 6)
+		dev_err(&device->dev, "Invalid PBLK length [%d]\n",
+			    object.processor.pblk_length);
+	else {
+		pr->throttling.address = object.processor.pblk_address;
+		pr->throttling.duty_offset = acpi_gbl_FADT.duty_offset;
+		pr->throttling.duty_width = acpi_gbl_FADT.duty_width;
+
+		pr->pblk = object.processor.pblk_address;
+
+		/*
+		 * We don't care about error returns - we just try to mark
+		 * these reserved so that nobody else is confused into thinking
+		 * that this region might be unused.
+		 *
+		 * (In particular, allocating the IO range for Cardbus)
+		 */
+		request_region(pr->throttling.address, 6, "ACPI CPU throttle");
+	}
+
+	/*
+	 * If ACPI describes a slot number for this CPU, we can use it to
+	 * ensure we get the right value in the "physical id" field
+	 * of /proc/cpuinfo
+	 */
+	status = acpi_evaluate_object(pr->handle, "_SUN", NULL, &buffer);
+	if (ACPI_SUCCESS(status))
+		arch_fix_phys_package_id(pr->id, object.integer.value);
+
+	return 0;
+}
+
+/*
+ * Do not put anything in here which needs the core to be online.
+ * For example MSR access or setting up things which check for cpuinfo_x86
+ * (cpu_data(cpu)) values, like CPU feature flags, family, model, etc.
+ * Such things have to be put in and set up by the processor driver's .probe().
+ */
+static DEFINE_PER_CPU(void *, processor_device_array);
+
+static int __cpuinit acpi_processor_add(struct acpi_device *device,
+					const struct acpi_device_id *id)
+{
+	struct acpi_processor *pr;
+	struct device *dev;
+	int result = 0;
+
+	pr = kzalloc(sizeof(struct acpi_processor), GFP_KERNEL);
+	if (!pr)
+		return -ENOMEM;
+
+	if (!zalloc_cpumask_var(&pr->throttling.shared_cpu_map, GFP_KERNEL)) {
+		result = -ENOMEM;
+		goto err_free_pr;
+	}
+
+	pr->handle = device->handle;
+	strcpy(acpi_device_name(device), ACPI_PROCESSOR_DEVICE_NAME);
+	strcpy(acpi_device_class(device), ACPI_PROCESSOR_CLASS);
+	device->driver_data = pr;
+
+	result = acpi_processor_get_info(device);
+	if (result) /* Processor is not physically present or unavailable */
+		return 0;
+
+#ifdef CONFIG_SMP
+	if (pr->id >= setup_max_cpus && pr->id != 0)
+		return 0;
+#endif
+
+	BUG_ON(pr->id >= nr_cpu_ids);
+
+	/*
+	 * Buggy BIOS check.
+	 * The ACPI id of a processor can be reported wrongly by the BIOS.
+	 * Don't trust it blindly.
+	 */
+	if (per_cpu(processor_device_array, pr->id) != NULL &&
+	    per_cpu(processor_device_array, pr->id) != device) {
+		dev_warn(&device->dev,
+			"BIOS reported wrong ACPI id %d for the processor\n",
+			pr->id);
+		/* Give up, but do not abort the namespace scan. */
+		goto err;
+	}
+	/*
+	 * processor_device_array is not cleared on errors to allow buggy BIOS
+	 * checks.
+	 */
+	per_cpu(processor_device_array, pr->id) = device;
+
+	dev = get_cpu_device(pr->id);
+	ACPI_HANDLE_SET(dev, pr->handle);
+	result = acpi_bind_one(dev, NULL);
+	if (result)
+		goto err;
+
+	pr->dev = dev;
+	dev->offline = pr->flags.need_hotplug_init;
+
+	/* Trigger the processor driver's .probe() if present. */
+	if (device_attach(dev) >= 0)
+		return 1;
+
+	dev_err(dev, "Processor driver could not be attached\n");
+	acpi_unbind_one(dev);
+
+ err:
+	free_cpumask_var(pr->throttling.shared_cpu_map);
+	device->driver_data = NULL;
+ err_free_pr:
+	kfree(pr);
+	return result;
+}
+
+/* --------------------------------------------------------------------------
+                                    Removal
+   -------------------------------------------------------------------------- */
+
+static void acpi_processor_remove(struct acpi_device *device)
+{
+	struct acpi_processor *pr;
+
+	if (!device || !acpi_driver_data(device))
+		return;
+
+	pr = acpi_driver_data(device);
+	if (pr->id >= nr_cpu_ids)
+		goto out;
+
+	/*
+	 * The only reason why we ever get here is CPU hot-removal.  The CPU is
+	 * already offline and the ACPI device removal locking prevents it from
+	 * being put back online at this point.
+	 *
+	 * Unbind the driver from the processor device and detach it from the
+	 * ACPI companion object.
+	 */
+	device_release_driver(pr->dev);
+	acpi_unbind_one(pr->dev);
+
+	/* Clean up. */
+	per_cpu(processor_device_array, pr->id) = NULL;
+	try_offline_node(cpu_to_node(pr->id));
+
+	/* Remove the CPU. */
+	get_online_cpus();
+	arch_unregister_cpu(pr->id);
+	acpi_unmap_lsapic(pr->id);
+	put_online_cpus();
+
+ out:
+	free_cpumask_var(pr->throttling.shared_cpu_map);
+	kfree(pr);
+}
+
+/*
+ * The following ACPI IDs are known to be suitable for representing as
+ * processor devices.
+ */
+static const struct acpi_device_id processor_device_ids[] = {
+
+	{ ACPI_PROCESSOR_OBJECT_HID, },
+	{ ACPI_PROCESSOR_DEVICE_HID, },
+
+	{ }
+};
+
+static struct acpi_scan_handler processor_handler = {
+	.ids = processor_device_ids,
+	.attach = acpi_processor_add,
+	.detach = acpi_processor_remove,
+	.hotplug = {
+		.enabled = true,
+	},
+};
+
+void __init acpi_processor_init(void)
+{
+	acpi_scan_add_handler_with_hotplug(&processor_handler, "processor");
+}
Index: linux-pm/drivers/acpi/scan.c
===================================================================
--- linux-pm.orig/drivers/acpi/scan.c
+++ linux-pm/drivers/acpi/scan.c
@@ -2124,6 +2124,7 @@ int __init acpi_scan_init(void)
 
 	acpi_pci_root_init();
 	acpi_pci_link_init();
+	acpi_processor_init();
 	acpi_platform_init();
 	acpi_lpss_init();
 	acpi_csrt_init();
Index: linux-pm/drivers/acpi/Makefile
===================================================================
--- linux-pm.orig/drivers/acpi/Makefile
+++ linux-pm/drivers/acpi/Makefile
@@ -34,6 +34,7 @@ acpi-$(CONFIG_ACPI_SLEEP)	+= proc.o
 acpi-y				+= bus.o glue.o
 acpi-y				+= scan.o
 acpi-y				+= resource.o
+acpi-y				+= acpi_processor.o
 acpi-y				+= processor_core.o
 acpi-y				+= ec.o
 acpi-$(CONFIG_ACPI_DOCK)	+= dock.o
Index: linux-pm/include/acpi/processor.h
===================================================================
--- linux-pm.orig/include/acpi/processor.h
+++ linux-pm/include/acpi/processor.h
@@ -6,6 +6,10 @@
 #include <linux/thermal.h>
 #include <asm/acpi.h>
 
+#define ACPI_PROCESSOR_CLASS		"processor"
+#define ACPI_PROCESSOR_DEVICE_NAME	"Processor"
+#define ACPI_PROCESSOR_DEVICE_HID	"ACPI0007"
+
 #define ACPI_PROCESSOR_BUSY_METRIC	10
 
 #define ACPI_PROCESSOR_MAX_POWER	8
@@ -207,6 +211,7 @@ struct acpi_processor {
 	struct acpi_processor_throttling throttling;
 	struct acpi_processor_limit limit;
 	struct thermal_cooling_device *cdev;
+	struct device *dev; /* Processor device. */
 };
 
 struct acpi_processor_errata {
Index: linux-pm/drivers/base/cpu.c
===================================================================
--- linux-pm.orig/drivers/base/cpu.c
+++ linux-pm/drivers/base/cpu.c
@@ -13,11 +13,21 @@
 #include <linux/gfp.h>
 #include <linux/slab.h>
 #include <linux/percpu.h>
+#include <linux/acpi.h>
 
 #include "base.h"
 
 static DEFINE_PER_CPU(struct device *, cpu_sys_devices);
 
+static int cpu_subsys_match(struct device *dev, struct device_driver *drv)
+{
+	/* ACPI style match is the only one that may succeed. */
+	if (acpi_driver_match_device(dev, drv))
+		return 1;
+
+	return 0;
+}
+
 #ifdef CONFIG_HOTPLUG_CPU
 static int cpu_subsys_online(struct device *dev)
 {
@@ -76,6 +86,7 @@ static DEVICE_ATTR(release, S_IWUSR, NUL
 struct bus_type cpu_subsys = {
 	.name = "cpu",
 	.dev_name = "cpu",
+	.match = cpu_subsys_match,
 #ifdef CONFIG_HOTPLUG_CPU
 	.online = cpu_subsys_online,
 	.offline = cpu_subsys_offline,



* Re: [PATCH 1/4] Driver core: Add offline/online device operations
  2013-05-02 12:27   ` [PATCH 1/4] Driver core: Add offline/online device operations Rafael J. Wysocki
@ 2013-05-02 13:57     ` Greg Kroah-Hartman
  2013-05-02 23:11     ` Toshi Kani
  1 sibling, 0 replies; 105+ messages in thread
From: Greg Kroah-Hartman @ 2013-05-02 13:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis, Len Brown

On Thu, May 02, 2013 at 02:27:30PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> In some cases, graceful hot-removal of devices is not possible,
> although in principle the devices in question support hotplug.
> For example, that may happen for the last CPU in the system or
> for memory modules holding kernel memory.
> 
> In those cases it is nice to be able to check if the given device
> can be gracefully hot-removed before triggering a removal procedure
> that cannot be aborted or reversed.  Unfortunately, however, the
> kernel currently doesn't provide any support for that.
> 
> To address that deficiency, introduce support for offline and
> online operations that can be performed on devices, respectively,
> before a hot-removal and in case when it is necessary (or convenient)
> to put a device back online after a successful offline (that has not
> been followed by removal).  The idea is that the offline will fail
> whenever the given device cannot be gracefully removed from the
> system and it will not be allowed to use the device after a
> successful offline (until a subsequent online) in analogy with the
> existing CPU offline/online mechanism.
> 
> For now, the offline and online operations are introduced at the
> bus type level, as that should be sufficient for the most urgent use
> cases (CPUs and memory modules).  In the future, however, the
> approach may be extended to cover some more complicated device
> offline/online scenarios involving device drivers etc.
> 
> The lock_device_hotplug() and unlock_device_hotplug() functions are
> introduced because subsequent patches need to put larger pieces of
> code under device_hotplug_lock to prevent race conditions between
> device offline and removal from happening.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


* Re: [PATCH 2/4] Driver core: Use generic offline/online for CPU offline/online
  2013-05-02 12:28   ` [PATCH 2/4] Driver core: Use generic offline/online for CPU offline/online Rafael J. Wysocki
@ 2013-05-02 13:57     ` Greg Kroah-Hartman
  0 siblings, 0 replies; 105+ messages in thread
From: Greg Kroah-Hartman @ 2013-05-02 13:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis, Len Brown

On Thu, May 02, 2013 at 02:28:19PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Rework the CPU hotplug code in drivers/base/cpu.c to use the
> generic offline/online support introduced previously instead of
> its own CPU-specific code.
> 
> For this purpose, modify cpu_subsys to provide offline and online
> callbacks for CONFIG_HOTPLUG_CPU set and remove the code handling
> the CPU-specific 'online' sysfs attribute.
> 
> This modification is not supposed to change the user-observable
> behavior of the kernel (i.e. the 'online' attribute will be present
> in exactly the same place in sysfs and should trigger exactly the
> same actions as before).
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


* Re: [PATCH 4/4] ACPI / processor: Use common hotplug infrastructure
  2013-05-02 12:31   ` [PATCH 4/4] ACPI / processor: Use common hotplug infrastructure Rafael J. Wysocki
@ 2013-05-02 13:59     ` Greg Kroah-Hartman
  2013-05-02 23:20     ` Toshi Kani
  1 sibling, 0 replies; 105+ messages in thread
From: Greg Kroah-Hartman @ 2013-05-02 13:59 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis, Len Brown

On Thu, May 02, 2013 at 02:31:51PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Split the ACPI processor driver into two parts, one that is
> non-modular, resides in the ACPI core and handles the enumeration
> and hotplug of processors and one that implements the rest of the
> existing processor driver functionality.
> 
> The non-modular part uses an ACPI scan handler object to enumerate
> processors on the basis of information provided by the ACPI namespace
> and to hook up with the common ACPI hotplug infrastructure.  It also
> populates the ACPI handle of each processor device having a
> corresponding object in the ACPI namespace, which allows the driver
> proper to bind to those devices, and makes the driver bind to them
> if it is readily available (i.e. loaded) when the scan handler's
> .attach() routine is running.
> 
> There are a few reasons to make this change.
> 
> First, switching the ACPI processor driver to using the common ACPI
> hotplug infrastructure reduces code duplication and size considerably,
> even though a new file is created along with a header comment etc.
> 
> Second, since the common hotplug code attempts to offline devices
> before starting the (non-reversible) removal procedure, it will abort
> (and possibly roll back) hot-remove operations involving processors
> if cpu_down() returns an error code for one of them instead of
> continuing them blindly (if /sys/firmware/acpi/hotplug/force_remove
> is unset).  That is a more desirable behavior than what the current
> code does.
> 
> Finally, the separation of the scan/hotplug part from the driver
> proper makes it possible to simplify the driver's .remove() routine,
> because it doesn't need to worry about the possible cleanup related
> to processor removal any more (the scan/hotplug part is responsible
> for that now) and can handle device removal and driver removal
> symmetrically (i.e. as appropriate).
> 
> Some user-visible changes in sysfs are made (for example, the
> 'sysdev' link from the ACPI device node to the processor device's
> directory is gone and a 'physical_node' link is present instead,
> a 'firmware_node' link is present in the processor device's
> directory, the processor driver is now visible under
> /sys/bus/cpu/drivers/ and bound to the processor device), but
> that shouldn't affect the functionality that users care about
> (frequency scaling, C-states and thermal management).
> 
> Tested on my venerable Toshiba Portege R500.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

For the driver core part:

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


* Re: [PATCH 1/4] Driver core: Add offline/online device operations
  2013-05-02 12:27   ` [PATCH 1/4] Driver core: Add offline/online device operations Rafael J. Wysocki
  2013-05-02 13:57     ` Greg Kroah-Hartman
@ 2013-05-02 23:11     ` Toshi Kani
  2013-05-02 23:36       ` Rafael J. Wysocki
  1 sibling, 1 reply; 105+ messages in thread
From: Toshi Kani @ 2013-05-02 23:11 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg Kroah-Hartman, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis, Len Brown

On Thu, 2013-05-02 at 14:27 +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> In some cases, graceful hot-removal of devices is not possible,
> although in principle the devices in question support hotplug.
> For example, that may happen for the last CPU in the system or
> for memory modules holding kernel memory.
> 
> In those cases it is nice to be able to check if the given device
> can be gracefully hot-removed before triggering a removal procedure
> that cannot be aborted or reversed.  Unfortunately, however, the
> kernel currently doesn't provide any support for that.
> 
> To address that deficiency, introduce support for offline and
> online operations that can be performed on devices, respectively,
> before a hot-removal and in case when it is necessary (or convenient)
> to put a device back online after a successful offline (that has not
> been followed by removal).  The idea is that the offline will fail
> whenever the given device cannot be gracefully removed from the
> system and it will not be allowed to use the device after a
> successful offline (until a subsequent online) in analogy with the
> existing CPU offline/online mechanism.
> 
> For now, the offline and online operations are introduced at the
> bus type level, as that should be sufficient for the most urgent use
> cases (CPUs and memory modules).  In the future, however, the
> approach may be extended to cover some more complicated device
> offline/online scenarios involving device drivers etc.
> 
> The lock_device_hotplug() and unlock_device_hotplug() functions are
> introduced because subsequent patches need to put larger pieces of
> code under device_hotplug_lock to prevent race conditions between
> device offline and removal from happening.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Looks good.  For patch 1/4 to 3/4:

Reviewed-by: Toshi Kani <toshi.kani@hp.com>

I have one minor comment below.

> ---
>  Documentation/ABI/testing/sysfs-devices-online |   20 +++
>  drivers/base/core.c                            |  130 +++++++++++++++++++++++++
>  include/linux/device.h                         |   21 ++++
>  3 files changed, 171 insertions(+)
> 
> Index: linux-pm/include/linux/device.h
> ===================================================================
> --- linux-pm.orig/include/linux/device.h
> +++ linux-pm/include/linux/device.h
> @@ -70,6 +70,10 @@ extern void bus_remove_file(struct bus_t
>   *		the specific driver's probe to initial the matched device.
>   * @remove:	Called when a device removed from this bus.
>   * @shutdown:	Called at shut-down time to quiesce the device.
> + *
> + * @online:	Called to put the device back online (after offlining it).
> + * @offline:	Called to put the device offline for hot-removal. May fail.
> + *
>   * @suspend:	Called when a device on this bus wants to go to sleep mode.
>   * @resume:	Called to bring a device on this bus out of sleep mode.
>   * @pm:		Power management operations of this bus, callback the specific
> @@ -103,6 +107,9 @@ struct bus_type {
>  	int (*remove)(struct device *dev);
>  	void (*shutdown)(struct device *dev);
>  
> +	int (*online)(struct device *dev);
> +	int (*offline)(struct device *dev);
> +
>  	int (*suspend)(struct device *dev, pm_message_t state);
>  	int (*resume)(struct device *dev);
>  
> @@ -646,6 +653,8 @@ struct acpi_dev_node {
>   * @release:	Callback to free the device after all references have
>   * 		gone away. This should be set by the allocator of the
>   * 		device (i.e. the bus driver that discovered the device).
> + * @offline_disabled: If set, the device is permanently online.
> + * @offline:	Set after successful invocation of bus type's .offline().
>   *
>   * At the lowest level, every device in a Linux system is represented by an
>   * instance of struct device. The device structure contains the information
> @@ -718,6 +727,9 @@ struct device {
>  
>  	void	(*release)(struct device *dev);
>  	struct iommu_group	*iommu_group;
> +
> +	bool			offline_disabled:1;
> +	bool			offline:1;
>  };
>  
>  static inline struct device *kobj_to_dev(struct kobject *kobj)
> @@ -853,6 +865,15 @@ extern const char *device_get_devnode(st
>  extern void *dev_get_drvdata(const struct device *dev);
>  extern int dev_set_drvdata(struct device *dev, void *data);
>  
> +static inline bool device_supports_offline(struct device *dev)

Since we renamed "offline" to "hotplug" for the lock interfaces, should
this function be renamed to device_supports_hotplug() as well?

Thanks,
-Toshi

> +{
> +	return dev->bus && dev->bus->offline && dev->bus->online;
> +}
> +


* Re: [PATCH 4/4] ACPI / processor: Use common hotplug infrastructure
  2013-05-02 12:31   ` [PATCH 4/4] ACPI / processor: Use common hotplug infrastructure Rafael J. Wysocki
  2013-05-02 13:59     ` Greg Kroah-Hartman
@ 2013-05-02 23:20     ` Toshi Kani
  2013-05-03 12:05       ` Rafael J. Wysocki
  1 sibling, 1 reply; 105+ messages in thread
From: Toshi Kani @ 2013-05-02 23:20 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg Kroah-Hartman, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis, Len Brown

On Thu, 2013-05-02 at 14:31 +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Split the ACPI processor driver into two parts, one that is
> non-modular, resides in the ACPI core and handles the enumeration
> and hotplug of processors and one that implements the rest of the
> existing processor driver functionality.
> 
> The non-modular part uses an ACPI scan handler object to enumerate
> processors on the basis of information provided by the ACPI namespace
> and to hook up with the common ACPI hotplug infrastructure.  It also
> populates the ACPI handle of each processor device having a
> corresponding object in the ACPI namespace, which allows the driver
> proper to bind to those devices, and makes the driver bind to them
> if it is readily available (i.e. loaded) when the scan handler's
> .attach() routine is running.
> 
> There are a few reasons to make this change.
> 
> First, switching the ACPI processor driver to using the common ACPI
> hotplug infrastructure reduces code duplication and size considerably,
> even though a new file is created along with a header comment etc.
> 
> Second, since the common hotplug code attempts to offline devices
> before starting the (non-reversible) removal procedure, it will abort
> (and possibly roll back) hot-remove operations involving processors
> if cpu_down() returns an error code for one of them instead of
> continuing them blindly (if /sys/firmware/acpi/hotplug/force_remove
> is unset).  That is a more desirable behavior than what the current
> code does.
> 
> Finally, the separation of the scan/hotplug part from the driver
> proper makes it possible to simplify the driver's .remove() routine,
> because it doesn't need to worry about the possible cleanup related
> to processor removal any more (the scan/hotplug part is responsible
> for that now) and can handle device removal and driver removal
> symmetrically (i.e. as appropriate).
> 
> Some user-visible changes in sysfs are made (for example, the
> 'sysdev' link from the ACPI device node to the processor device's
> directory is gone and a 'physical_node' link is present instead,
> a 'firmware_node' link is present in the processor device's
> directory, the processor driver is now visible under
> /sys/bus/cpu/drivers/ and bound to the processor device), but
> that shouldn't affect the functionality that users care about
> (frequency scaling, C-states and thermal management).

This looks very nice.  I have one question below.

> Tested on my venerable Toshiba Portege R500.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/acpi/Makefile           |    1 
>  drivers/acpi/acpi_processor.c   |  473 +++++++++++++++++++++++
>  drivers/acpi/glue.c             |    6 
>  drivers/acpi/internal.h         |    3 
>  drivers/acpi/processor_driver.c |  803 +++-------------------------------------
>  drivers/acpi/scan.c             |    1 
>  drivers/base/cpu.c              |   11 
>  include/acpi/processor.h        |    5 
>  8 files changed, 574 insertions(+), 729 deletions(-)

 :

> Index: linux-pm/drivers/base/cpu.c
> ===================================================================
> --- linux-pm.orig/drivers/base/cpu.c
> +++ linux-pm/drivers/base/cpu.c
> @@ -13,11 +13,21 @@
>  #include <linux/gfp.h>
>  #include <linux/slab.h>
>  #include <linux/percpu.h>
> +#include <linux/acpi.h>
>  
>  #include "base.h"
>  
>  static DEFINE_PER_CPU(struct device *, cpu_sys_devices);
>  
> +static int cpu_subsys_match(struct device *dev, struct device_driver *drv)
> +{
> +	/* ACPI style match is the only one that may succeed. */
> +	if (acpi_driver_match_device(dev, drv))

Can you explain why this change is needed?  Do CPU devices still behave
the same on non-ACPI systems?  

Thanks,
-Toshi


> +		return 1;
> +
> +	return 0;
> +}
> +
>  #ifdef CONFIG_HOTPLUG_CPU
>  static int cpu_subsys_online(struct device *dev)
>  {
> @@ -76,6 +86,7 @@ static DEVICE_ATTR(release, S_IWUSR, NUL
>  struct bus_type cpu_subsys = {
>  	.name = "cpu",
>  	.dev_name = "cpu",
> +	.match = cpu_subsys_match,
>  #ifdef CONFIG_HOTPLUG_CPU
>  	.online = cpu_subsys_online,
>  	.offline = cpu_subsys_offline,
> 




* Re: [PATCH 1/4] Driver core: Add offline/online device operations
  2013-05-02 23:36       ` Rafael J. Wysocki
@ 2013-05-02 23:23         ` Toshi Kani
  0 siblings, 0 replies; 105+ messages in thread
From: Toshi Kani @ 2013-05-02 23:23 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg Kroah-Hartman, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis, Len Brown

On Fri, 2013-05-03 at 01:36 +0200, Rafael J. Wysocki wrote:
> On Thursday, May 02, 2013 05:11:27 PM Toshi Kani wrote:
> > On Thu, 2013-05-02 at 14:27 +0200, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
 :
> > >  
> > > +static inline bool device_supports_offline(struct device *dev)
> > 
> > Since we renamed "offline" to "hotplug" for the lock interfaces, should
> > this function be renamed to device_supports_hotplug() as well?
> 
> Well, "offline" is more specific, as there may be devices that don't
> support offline/online, but support hotplug otherwise.  That's why I didn't
> change it.

I see.  That makes sense.

Thanks,
-Toshi



* Re: [PATCH 1/3 RFC] Driver core: Add offline/online device operations
  2013-05-02  0:58     ` Rafael J. Wysocki
@ 2013-05-02 23:29       ` Toshi Kani
  2013-05-03 11:48         ` Rafael J. Wysocki
  0 siblings, 1 reply; 105+ messages in thread
From: Toshi Kani @ 2013-05-02 23:29 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg Kroah-Hartman, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis

On Thu, 2013-05-02 at 02:58 +0200, Rafael J. Wysocki wrote:
> On Tuesday, April 30, 2013 05:38:38 PM Toshi Kani wrote:
> > On Mon, 2013-04-29 at 14:26 +0200, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
 :
> > > + */
> > > +int device_offline(struct device *dev)
> > > +{
> > > +	int ret;
> > > +
> > > +	if (dev->offline_disabled)
> > > +		return -EPERM;
> > > +
> > > +	ret = device_for_each_child(dev, NULL, device_check_offline);
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +	device_lock(dev);
> > > +	if (device_supports_offline(dev)) {
> > > +		if (dev->offline) {
> > > +			ret = 1;
> > > +		} else {
> > > +			ret = dev->bus->offline(dev);
> > > +			if (!ret) {
> > > +				kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
> > > +				dev->offline = true;
> > 
> > Shouldn't this offline flag be set before sending KOBJ_OFFLINE?
> > 
> > > +			}
> > > +		}
> > > +	}
> > > +	device_unlock(dev);
> > > +
> > > +	return ret;
> > > +}
> > > +
> > > +/**
> > > + * device_online - Put the device back online after successful device_offline().
> > > + * @dev: Device to be put back online.
> > > + *
> > > + * If device_offline() has been successfully executed for @dev, but the device
> > > + * has not been removed subsequently, execute its bus type's .online() callback
> > > + * to indicate that the device can be used again.
> > 
> > There is another use-case for online().  When a device like CPU is
> > hot-added, it is added in offline.  I am not sure why, but it has been
> > this way.  So, we need to call online() to make a new device available
> > for use after a hot-add.
> 
> Actually, in the CPU case that is left to user space as far as I can say.
> That is, the device appears initially offline and user space is supposed to
> bring it online via sysfs.
> 
> > > + *
> > > + * Call under device_offline_lock.
> > > + */
> > > +int device_online(struct device *dev)
> > > +{
> > > +	int ret = 0;
> > > +
> > > +	device_lock(dev);
> > > +	if (device_supports_offline(dev)) {
> > > +		if (dev->offline) {
> > > +			ret = dev->bus->online(dev);
> > > +			if (!ret) {
> > > +				kobject_uevent(&dev->kobj, KOBJ_ONLINE);
> > > +				dev->offline = false;
> > 
> > Same comment as KOBJ_OFFLINE.
> 
> I wonder why the ordering may be important?

I do not think it causes any race condition (so this isn't a big deal),
but it seems to make more sense to emit an ONLINE/OFFLINE event after
its object is marked online/offline.
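
To make the suggestion concrete, here is a minimal user-space sketch of the
ordering I have in mind.  It is not the kernel code: struct fake_dev,
fake_uevent_offline() and the two bus callbacks are hypothetical stand-ins
for struct device, kobject_uevent() and a bus type's .offline(), used only
to show the flag being set before the event goes out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device (illustration only). */
struct fake_dev {
	bool offline;                      /* mirrors dev->offline */
	int (*bus_offline)(struct fake_dev *dev);
	bool flag_seen_by_listener;        /* flag value when the event fired */
	int events_sent;
};

/* Stand-in for kobject_uevent(): record what a listener would observe. */
static void fake_uevent_offline(struct fake_dev *dev)
{
	dev->flag_seen_by_listener = dev->offline;
	dev->events_sent++;
}

/* Two hypothetical bus callbacks: one succeeds, one refuses. */
static int bus_offline_ok(struct fake_dev *dev)   { (void)dev; return 0; }
static int bus_offline_busy(struct fake_dev *dev) { (void)dev; return -EBUSY; }

/* Returns 0 on success, 1 if already offline, negative errno on failure. */
static int fake_device_offline(struct fake_dev *dev)
{
	int ret;

	assert(dev != NULL && dev->bus_offline != NULL);

	if (dev->offline)
		return 1;

	ret = dev->bus_offline(dev);
	if (ret)
		return ret;

	dev->offline = true;          /* mark the object first ... */
	fake_uevent_offline(dev);     /* ... then tell listeners about it */
	return 0;
}
```

With this ordering, anything that reacts to the event and then looks at the
flag sees a consistent state, and a failed offline neither sets the flag nor
emits an event.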

Thanks,
-Toshi






* Re: [PATCH 1/4] Driver core: Add offline/online device operations
  2013-05-02 23:11     ` Toshi Kani
@ 2013-05-02 23:36       ` Rafael J. Wysocki
  2013-05-02 23:23         ` Toshi Kani
  0 siblings, 1 reply; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-02 23:36 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Greg Kroah-Hartman, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis, Len Brown

On Thursday, May 02, 2013 05:11:27 PM Toshi Kani wrote:
> On Thu, 2013-05-02 at 14:27 +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > In some cases, graceful hot-removal of devices is not possible,
> > although in principle the devices in question support hotplug.
> > For example, that may happen for the last CPU in the system or
> > for memory modules holding kernel memory.
> > 
> > In those cases it is nice to be able to check if the given device
> > can be gracefully hot-removed before triggering a removal procedure
> > that cannot be aborted or reversed.  Unfortunately, however, the
> > kernel currently doesn't provide any support for that.
> > 
> > To address that deficiency, introduce support for offline and
> > online operations that can be performed on devices, respectively,
> > before a hot-removal and in case when it is necessary (or convenient)
> > to put a device back online after a successful offline (that has not
> > been followed by removal).  The idea is that the offline will fail
> > whenever the given device cannot be gracefully removed from the
> > system and it will not be allowed to use the device after a
> > successful offline (until a subsequent online) in analogy with the
> > existing CPU offline/online mechanism.
> > 
> > For now, the offline and online operations are introduced at the
> > bus type level, as that should be sufficient for the most urgent use
> > cases (CPUs and memory modules).  In the future, however, the
> > approach may be extended to cover some more complicated device
> > offline/online scenarios involving device drivers etc.
> > 
> > The lock_device_hotplug() and unlock_device_hotplug() functions are
> > introduced because subsequent patches need to put larger pieces of
> > code under device_hotplug_lock to prevent race conditions between
> > device offline and removal from happening.
> > 
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Looks good.  For patch 1/4 to 3/4:
> 
> Reviewed-by: Toshi Kani <toshi.kani@hp.com>

Thanks!

> I have one minor comment below.
> 
> > ---
> >  Documentation/ABI/testing/sysfs-devices-online |   20 +++
> >  drivers/base/core.c                            |  130 +++++++++++++++++++++++++
> >  include/linux/device.h                         |   21 ++++
> >  3 files changed, 171 insertions(+)
> > 
> > Index: linux-pm/include/linux/device.h
> > ===================================================================
> > --- linux-pm.orig/include/linux/device.h
> > +++ linux-pm/include/linux/device.h
> > @@ -70,6 +70,10 @@ extern void bus_remove_file(struct bus_t
> >   *		the specific driver's probe to initial the matched device.
> >   * @remove:	Called when a device removed from this bus.
> >   * @shutdown:	Called at shut-down time to quiesce the device.
> > + *
> > + * @online:	Called to put the device back online (after offlining it).
> > + * @offline:	Called to put the device offline for hot-removal. May fail.
> > + *
> >   * @suspend:	Called when a device on this bus wants to go to sleep mode.
> >   * @resume:	Called to bring a device on this bus out of sleep mode.
> >   * @pm:		Power management operations of this bus, callback the specific
> > @@ -103,6 +107,9 @@ struct bus_type {
> >  	int (*remove)(struct device *dev);
> >  	void (*shutdown)(struct device *dev);
> >  
> > +	int (*online)(struct device *dev);
> > +	int (*offline)(struct device *dev);
> > +
> >  	int (*suspend)(struct device *dev, pm_message_t state);
> >  	int (*resume)(struct device *dev);
> >  
> > @@ -646,6 +653,8 @@ struct acpi_dev_node {
> >   * @release:	Callback to free the device after all references have
> >   * 		gone away. This should be set by the allocator of the
> >   * 		device (i.e. the bus driver that discovered the device).
> > + * @offline_disabled: If set, the device is permanently online.
> > + * @offline:	Set after successful invocation of bus type's .offline().
> >   *
> >   * At the lowest level, every device in a Linux system is represented by an
> >   * instance of struct device. The device structure contains the information
> > @@ -718,6 +727,9 @@ struct device {
> >  
> >  	void	(*release)(struct device *dev);
> >  	struct iommu_group	*iommu_group;
> > +
> > +	bool			offline_disabled:1;
> > +	bool			offline:1;
> >  };
> >  
> >  static inline struct device *kobj_to_dev(struct kobject *kobj)
> > @@ -853,6 +865,15 @@ extern const char *device_get_devnode(st
> >  extern void *dev_get_drvdata(const struct device *dev);
> >  extern int dev_set_drvdata(struct device *dev, void *data);
> >  
> > +static inline bool device_supports_offline(struct device *dev)
> 
> Since we renamed "offline" to "hotplug" for the lock interfaces, should
> this function be renamed to device_supports_hotplug() as well?

Well, "offline" is more specific, as there may be devices that don't
support offline/online, but support hotplug otherwise.  That's why I didn't
change it.
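
As a rough illustration of that distinction, here is a small user-space
model (not the kernel code).  struct model_bus, struct model_dev, the
hot_removable field and the two example buses are made up for this sketch;
only the shape of the predicate follows device_supports_offline() from the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_dev;

/* Hypothetical stand-in for struct bus_type (illustration only). */
struct model_bus {
	int (*online)(struct model_dev *dev);
	int (*offline)(struct model_dev *dev);
	bool hot_removable;   /* made-up flag: bus supports hotplug at all */
};

struct model_dev {
	const struct model_bus *bus;
};

static int model_online(struct model_dev *dev)  { (void)dev; return 0; }
static int model_offline(struct model_dev *dev) { (void)dev; return 0; }

/* Same shape as device_supports_offline(): both callbacks must exist. */
static bool model_supports_offline(const struct model_dev *dev)
{
	return dev->bus && dev->bus->offline && dev->bus->online;
}

/* A CPU-like bus: hot-removable and provides offline/online. */
static const struct model_bus cpu_like_bus = {
	.online = model_online,
	.offline = model_offline,
	.hot_removable = true,
};

/* A bus whose devices can be unplugged but have no offline/online step. */
static const struct model_bus plug_only_bus = {
	.online = NULL,
	.offline = NULL,
	.hot_removable = true,
};
```

A device on plug_only_bus is hotpluggable in this model, yet
model_supports_offline() is false for it, which is why a name with
"hotplug" in it would say more than the check actually tests.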

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* Re: [PATCH 1/3 RFC] Driver core: Add offline/online device operations
  2013-05-02 23:29       ` Toshi Kani
@ 2013-05-03 11:48         ` Rafael J. Wysocki
  0 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-03 11:48 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Greg Kroah-Hartman, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis

On Thursday, May 02, 2013 05:29:33 PM Toshi Kani wrote:
> On Thu, 2013-05-02 at 02:58 +0200, Rafael J. Wysocki wrote:
> > On Tuesday, April 30, 2013 05:38:38 PM Toshi Kani wrote:
> > > On Mon, 2013-04-29 at 14:26 +0200, Rafael J. Wysocki wrote:
> > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>  :
> > > > + */
> > > > +int device_offline(struct device *dev)
> > > > +{
> > > > +	int ret;
> > > > +
> > > > +	if (dev->offline_disabled)
> > > > +		return -EPERM;
> > > > +
> > > > +	ret = device_for_each_child(dev, NULL, device_check_offline);
> > > > +	if (ret)
> > > > +		return ret;
> > > > +
> > > > +	device_lock(dev);
> > > > +	if (device_supports_offline(dev)) {
> > > > +		if (dev->offline) {
> > > > +			ret = 1;
> > > > +		} else {
> > > > +			ret = dev->bus->offline(dev);
> > > > +			if (!ret) {
> > > > +				kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
> > > > +				dev->offline = true;
> > > 
> > > Shouldn't this offline flag be set before sending KOBJ_OFFLINE?
> > > 
> > > > +			}
> > > > +		}
> > > > +	}
> > > > +	device_unlock(dev);
> > > > +
> > > > +	return ret;
> > > > +}
> > > > +
> > > > +/**
> > > > + * device_online - Put the device back online after successful device_offline().
> > > > + * @dev: Device to be put back online.
> > > > + *
> > > > + * If device_offline() has been successfully executed for @dev, but the device
> > > > + * has not been removed subsequently, execute its bus type's .online() callback
> > > > + * to indicate that the device can be used again.
> > > 
> > > There is another use-case for online().  When a device like CPU is
> > > hot-added, it is added in offline.  I am not sure why, but it has been
> > > this way.  So, we need to call online() to make a new device available
> > > for use after a hot-add.
> > 
> > Actually, in the CPU case that is left to user space as far as I can tell.
> > That is, the device appears initially offline and user space is supposed to
> > bring it online via sysfs.
> > 
> > > > + *
> > > > + * Call under device_offline_lock.
> > > > + */
> > > > +int device_online(struct device *dev)
> > > > +{
> > > > +	int ret = 0;
> > > > +
> > > > +	device_lock(dev);
> > > > +	if (device_supports_offline(dev)) {
> > > > +		if (dev->offline) {
> > > > +			ret = dev->bus->online(dev);
> > > > +			if (!ret) {
> > > > +				kobject_uevent(&dev->kobj, KOBJ_ONLINE);
> > > > +				dev->offline = false;
> > > 
> > > Same comment as KOBJ_OFFLINE.
> > 
> > I wonder why the ordering may be important?
> 
> I do not think it causes any race condition (so this isn't a big deal),
> but it seems to make more sense to emit an ONLINE/OFFLINE event after
> its object is marked online/offline.

Well, dev->offline only matters for device_offline() and device_online()
themselves at this time.
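
For illustration only, the ordering Toshi suggests (mark the object offline
first, then emit the event) could be modelled in user space roughly as below.
This is a sketch, not the patch: struct model_dev and all model_* helpers are
made up for the example, and the event helper merely records what a listener
would observe instead of calling kobject_uevent().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct device (not the kernel type). */
struct model_dev {
	bool offline;              /* mirrors dev->offline in the patch */
	bool state_seen_by_event;  /* flag value a listener saw at event time */
	int (*bus_offline)(struct model_dev *dev);
};

/* Stand-in for kobject_uevent(): record the flag value at emission time. */
static void model_emit_offline_event(struct model_dev *dev)
{
	dev->state_seen_by_event = dev->offline;
}

/* Sample bus callbacks: one that succeeds, one that refuses (-16 ~ -EBUSY). */
static int model_bus_offline_ok(struct model_dev *dev)   { (void)dev; return 0; }
static int model_bus_offline_busy(struct model_dev *dev) { (void)dev; return -16; }

/* Returns 0 on success, 1 if already offline, a negative value on failure. */
static int model_device_offline(struct model_dev *dev)
{
	int ret;

	assert(dev != NULL && dev->bus_offline != NULL);

	if (dev->offline)
		return 1;

	ret = dev->bus_offline(dev);
	if (ret)
		return ret;

	dev->offline = true;            /* update the state first ... */
	model_emit_offline_event(dev);  /* ... then announce it */
	return 0;
}
```

With this ordering a listener reacting to the event always finds the flag
already set; whether that matters in practice depends on who else reads the
flag.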

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 4/4] ACPI / processor: Use common hotplug infrastructure
  2013-05-02 23:20     ` Toshi Kani
@ 2013-05-03 12:05       ` Rafael J. Wysocki
  2013-05-03 12:21         ` Rafael J. Wysocki
  2013-05-03 18:27         ` Toshi Kani
  0 siblings, 2 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-03 12:05 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Greg Kroah-Hartman, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis, Len Brown

On Thursday, May 02, 2013 05:20:12 PM Toshi Kani wrote:
> On Thu, 2013-05-02 at 14:31 +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Split the ACPI processor driver into two parts, one that is
> > non-modular, resides in the ACPI core and handles the enumeration
> > and hotplug of processors and one that implements the rest of the
> > existing processor driver functionality.
> > 
> > The non-modular part uses an ACPI scan handler object to enumerate
> > processors on the basis of information provided by the ACPI namespace
> > and to hook up with the common ACPI hotplug infrastructure.  It also
> > populates the ACPI handle of each processor device having a
> > corresponding object in the ACPI namespace, which allows the driver
> > proper to bind to those devices, and makes the driver bind to them
> > if it is readily available (i.e. loaded) when the scan handler's
> > .attach() routine is running.
> > 
> > There are a few reasons to make this change.
> > 
> > First, switching the ACPI processor driver to using the common ACPI
> > hotplug infrastructure reduces code duplication and size considerably,
> > even though a new file is created along with a header comment etc.
> > 
> > Second, since the common hotplug code attempts to offline devices
> > before starting the (non-reversible) removal procedure, it will abort
> > (and possibly roll back) hot-remove operations involving processors
> > if cpu_down() returns an error code for one of them instead of
> > continuing them blindly (if /sys/firmware/acpi/hotplug/force_remove
> > is unset).  That is a more desirable behavior than what the current
> > code does.
> > 
> > Finally, the separation of the scan/hotplug part from the driver
> > proper makes it possible to simplify the driver's .remove() routine,
> > because it doesn't need to worry about the possible cleanup related
> > to processor removal any more (the scan/hotplug part is responsible
> > for that now) and can handle device removal and driver removal
> > symmetrically (i.e. as appropriate).
> > 
> > Some user-visible changes in sysfs are made (for example, the
> > 'sysdev' link from the ACPI device node to the processor device's
> > directory is gone and a 'physical_node' link is present instead,
> > a 'firmware_node' link is present in the processor device's
> > directory, the processor driver is now visible under
> > /sys/bus/cpu/drivers/ and bound to the processor device), but
> > that shouldn't affect the functionality that users care about
> > (frequency scaling, C-states and thermal management).
> 
> This looks very nice.  I have one question below.
> 
> > Tested on my venerable Toshiba Portege R500.
> > 
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  drivers/acpi/Makefile           |    1 
> >  drivers/acpi/acpi_processor.c   |  473 +++++++++++++++++++++++
> >  drivers/acpi/glue.c             |    6 
> >  drivers/acpi/internal.h         |    3 
> >  drivers/acpi/processor_driver.c |  803 +++-------------------------------------
> >  drivers/acpi/scan.c             |    1 
> >  drivers/base/cpu.c              |   11 
> >  include/acpi/processor.h        |    5 
> >  8 files changed, 574 insertions(+), 729 deletions(-)
> 
>  :
> 
> > Index: linux-pm/drivers/base/cpu.c
> > ===================================================================
> > --- linux-pm.orig/drivers/base/cpu.c
> > +++ linux-pm/drivers/base/cpu.c
> > @@ -13,11 +13,21 @@
> >  #include <linux/gfp.h>
> >  #include <linux/slab.h>
> >  #include <linux/percpu.h>
> > +#include <linux/acpi.h>
> >  
> >  #include "base.h"
> >  
> >  static DEFINE_PER_CPU(struct device *, cpu_sys_devices);
> >  
> > +static int cpu_subsys_match(struct device *dev, struct device_driver *drv)
> > +{
> > +	/* ACPI style match is the only one that may succeed. */
> > +	if (acpi_driver_match_device(dev, drv))
> 
> Can you explain why this change is needed?

This is the mechanism by which the driver core determines which driver to use
with a processor device passed to device_attach().

Basically, it walks the list of drivers whose bus type is cpu_subsys and
calls cpu_subsys->match(), which points to cpu_subsys_match(), for the device
and each of the drivers.  The result of that tells it whether or not to use
the given driver with the device.

Now, acpi_driver_match_device() returns 'true' if (a) the device has an ACPI
handle and (b) at least one of the IDs of the struct acpi_device associated
with that handle is in the driver's .acpi_match_table table.  Since the ACPI
processor's .acpi_match_table contains the same set of IDs as the table
of device IDs of processor_handler, this guarantees that the ACPI processor
driver will be used for the devices prepared by acpi_processor_add().

What it boils down to is that acpi_processor_start() is going to be called
for every device whose ACPI handle is populated by acpi_processor_add().
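
As a rough user-space model of the matching described above (not the actual
driver core or ACPI code), the walk over the bus type's drivers and the
ID-table comparison could look like the sketch below.  All model_* names and
both structures are assumptions made for the example; the real code goes
through the bus type's .match() callback and acpi_driver_match_device().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical driver: only the ID table matters for matching here. */
struct model_driver {
	const char *name;
	const char *const *acpi_ids;  /* NULL-terminated, like .acpi_match_table */
};

/* Hypothetical processor device with an optional ACPI companion. */
struct model_device {
	bool has_acpi_handle;           /* condition (a) in the text above */
	const char *const *acpi_ids;    /* NULL-terminated IDs of the companion */
	const struct model_driver *bound;
};

/* Condition (b): at least one device ID appears in the driver's table. */
static bool model_acpi_match(const struct model_device *dev,
			     const struct model_driver *drv)
{
	const char *const *d, *const *m;

	if (!dev->has_acpi_handle || !dev->acpi_ids || !drv->acpi_ids)
		return false;

	for (d = dev->acpi_ids; *d; d++)
		for (m = drv->acpi_ids; *m; m++)
			if (strcmp(*d, *m) == 0)
				return true;
	return false;
}

/* Walk the bus type's drivers; returns 1 if one was bound, 0 otherwise. */
static int model_device_attach(struct model_device *dev,
			       const struct model_driver *const *drivers,
			       size_t n)
{
	size_t i;

	assert(dev != NULL);

	for (i = 0; i < n; i++) {
		if (model_acpi_match(dev, drivers[i])) {
			dev->bound = drivers[i];
			return 1;
		}
	}
	return 0;
}
```

A device without an ACPI handle never matches in this model, which is the
property that leaves non-ACPI systems unaffected.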

> Do CPU devices still behave the same on non-ACPI systems?

Yes, they do.  The whole driver matching/binding is irrelevant to them, because
the ACPI processor driver is the only one registering itself under cpu_subsys.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 4/4] ACPI / processor: Use common hotplug infrastructure
  2013-05-03 12:05       ` Rafael J. Wysocki
@ 2013-05-03 12:21         ` Rafael J. Wysocki
  2013-05-03 18:27         ` Toshi Kani
  1 sibling, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-03 12:21 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Greg Kroah-Hartman, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis, Len Brown

On Friday, May 03, 2013 02:05:37 PM Rafael J. Wysocki wrote:
> On Thursday, May 02, 2013 05:20:12 PM Toshi Kani wrote:
> > On Thu, 2013-05-02 at 14:31 +0200, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > 
> > > Split the ACPI processor driver into two parts, one that is
> > > non-modular, resides in the ACPI core and handles the enumeration
> > > and hotplug of processors and one that implements the rest of the
> > > existing processor driver functionality.
> > > 
> > > The non-modular part uses an ACPI scan handler object to enumerate
> > > processors on the basis of information provided by the ACPI namespace
> > > and to hook up with the common ACPI hotplug infrastructure.  It also
> > > populates the ACPI handle of each processor device having a
> > > corresponding object in the ACPI namespace, which allows the driver
> > > proper to bind to those devices, and makes the driver bind to them
> > > if it is readily available (i.e. loaded) when the scan handler's
> > > .attach() routine is running.
> > > 
> > > There are a few reasons to make this change.
> > > 
> > > First, switching the ACPI processor driver to using the common ACPI
> > > hotplug infrastructure reduces code duplication and size considerably,
> > > even though a new file is created along with a header comment etc.
> > > 
> > > Second, since the common hotplug code attempts to offline devices
> > > before starting the (non-reversible) removal procedure, it will abort
> > > (and possibly roll back) hot-remove operations involving processors
> > > if cpu_down() returns an error code for one of them instead of
> > > continuing them blindly (if /sys/firmware/acpi/hotplug/force_remove
> > > is unset).  That is a more desirable behavior than what the current
> > > code does.
> > > 
> > > Finally, the separation of the scan/hotplug part from the driver
> > > proper makes it possible to simplify the driver's .remove() routine,
> > > because it doesn't need to worry about the possible cleanup related
> > > to processor removal any more (the scan/hotplug part is responsible
> > > for that now) and can handle device removal and driver removal
> > > symmetrically (i.e. as appropriate).
> > > 
> > > Some user-visible changes in sysfs are made (for example, the
> > > 'sysdev' link from the ACPI device node to the processor device's
> > > directory is gone and a 'physical_node' link is present instead,
> > > a 'firmware_node' link is present in the processor device's
> > > directory, the processor driver is now visible under
> > > /sys/bus/cpu/drivers/ and bound to the processor device), but
> > > that shouldn't affect the functionality that users care about
> > > (frequency scaling, C-states and thermal management).
> > 
> > This looks very nice.  I have one question below.
> > 
> > > Tested on my venerable Toshiba Portege R500.
> > > 
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > ---
> > >  drivers/acpi/Makefile           |    1 
> > >  drivers/acpi/acpi_processor.c   |  473 +++++++++++++++++++++++
> > >  drivers/acpi/glue.c             |    6 
> > >  drivers/acpi/internal.h         |    3 
> > >  drivers/acpi/processor_driver.c |  803 +++-------------------------------------
> > >  drivers/acpi/scan.c             |    1 
> > >  drivers/base/cpu.c              |   11 
> > >  include/acpi/processor.h        |    5 
> > >  8 files changed, 574 insertions(+), 729 deletions(-)
> > 
> >  :
> > 
> > > Index: linux-pm/drivers/base/cpu.c
> > > ===================================================================
> > > --- linux-pm.orig/drivers/base/cpu.c
> > > +++ linux-pm/drivers/base/cpu.c
> > > @@ -13,11 +13,21 @@
> > >  #include <linux/gfp.h>
> > >  #include <linux/slab.h>
> > >  #include <linux/percpu.h>
> > > +#include <linux/acpi.h>
> > >  
> > >  #include "base.h"
> > >  
> > >  static DEFINE_PER_CPU(struct device *, cpu_sys_devices);
> > >  
> > > +static int cpu_subsys_match(struct device *dev, struct device_driver *drv)
> > > +{
> > > +	/* ACPI style match is the only one that may succeed. */
> > > +	if (acpi_driver_match_device(dev, drv))
> > 
> > Can you explain why this change is needed?
> 
> This is the mechanism by which the driver core determines which driver to use
> with a processor device passed to device_attach().
> 
> Basically, it walks the list of drivers whose bus type is cpu_subsys and
> calls cpu_subsys->match(), which points to cpu_subsys_match(), for the device
> and each of the drivers.  The result of that tells it whether or not to use
> the given driver with the device.
> 
> Now, acpi_driver_match_device() returns 'true' if (a) the device has an ACPI
> handle and (b) at least one of the IDs of the struct acpi_device associated
> with that handle is in the driver's .acpi_match_table table.  Since the ACPI
> processor's .acpi_match_table contains the same set of IDs as the table
> of device IDs of processor_handler, this guarantees that the ACPI processor
> driver will be used for the devices prepared by acpi_processor_add().
> 
> What it boils down to is that acpi_processor_start() is going to be called
> for every device whose ACPI handle is populated by acpi_processor_add().

The reason why it really is needed is that the ACPI processor driver is
modular and it may or may not be present when acpi_processor_add() is running,
but acpi_processor_start() should be called for the device once the driver has
been loaded.
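
To make the ordering issue concrete, here is a small user-space sketch
(assumptions throughout, none of this is kernel code) in which binding is
attempted both when a device is added and when the driver is loaded, so the
start hook runs exactly once regardless of which happens first.  The model_*
names, the fixed-size device array and the 'started' counter are all
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_DEVS 8

/* Hypothetical processor device as prepared by the scan-handler stand-in. */
struct model_cpu_dev {
	bool has_acpi_handle;  /* would be populated by the scan handler */
	int started;           /* how many times the start hook has run */
};

/* Hypothetical bus: known devices plus whether the modular driver is loaded. */
struct model_bus {
	struct model_cpu_dev *devs[MODEL_MAX_DEVS];
	size_t ndevs;
	bool driver_loaded;
};

/* Bind (i.e. run the start hook) only if both sides are present. */
static void model_try_bind(struct model_bus *bus, struct model_cpu_dev *dev)
{
	if (bus->driver_loaded && dev->has_acpi_handle && dev->started == 0)
		dev->started++;
}

/* Device shows up: remember it and bind at once if the driver is there. */
static int model_add_device(struct model_bus *bus, struct model_cpu_dev *dev)
{
	assert(bus != NULL && dev != NULL);

	if (bus->ndevs == MODEL_MAX_DEVS)
		return -1;

	bus->devs[bus->ndevs++] = dev;
	model_try_bind(bus, dev);
	return 0;
}

/* Driver shows up later: walk the devices already present and bind them. */
static void model_load_driver(struct model_bus *bus)
{
	size_t i;

	assert(bus != NULL);

	bus->driver_loaded = true;
	for (i = 0; i < bus->ndevs; i++)
		model_try_bind(bus, bus->devs[i]);
}
```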

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 4/4] ACPI / processor: Use common hotplug infrastructure
  2013-05-03 12:05       ` Rafael J. Wysocki
  2013-05-03 12:21         ` Rafael J. Wysocki
@ 2013-05-03 18:27         ` Toshi Kani
  2013-05-03 19:31           ` Rafael J. Wysocki
  1 sibling, 1 reply; 105+ messages in thread
From: Toshi Kani @ 2013-05-03 18:27 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg Kroah-Hartman, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis, Len Brown

On Fri, 2013-05-03 at 14:05 +0200, Rafael J. Wysocki wrote:
> On Thursday, May 02, 2013 05:20:12 PM Toshi Kani wrote:
> > On Thu, 2013-05-02 at 14:31 +0200, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
 : 
> > > Index: linux-pm/drivers/base/cpu.c
> > > ===================================================================
> > > --- linux-pm.orig/drivers/base/cpu.c
> > > +++ linux-pm/drivers/base/cpu.c
> > > @@ -13,11 +13,21 @@
> > >  #include <linux/gfp.h>
> > >  #include <linux/slab.h>
> > >  #include <linux/percpu.h>
> > > +#include <linux/acpi.h>
> > >  
> > >  #include "base.h"
> > >  
> > >  static DEFINE_PER_CPU(struct device *, cpu_sys_devices);
> > >  
> > > +static int cpu_subsys_match(struct device *dev, struct device_driver *drv)
> > > +{
> > > +	/* ACPI style match is the only one that may succeed. */
> > > +	if (acpi_driver_match_device(dev, drv))
> > 
> > Can you explain why this change is needed?
> 
> This is the mechanism by which the driver core determines which driver to use
> with a processor device passed to device_attach().
> 
> Basically, it walks the list of drivers whose bus type is cpu_subsys and
> calls cpu_subsys->match(), which points to cpu_subsys_match(), for the device
> and each of the drivers.  The result of that tells it whether or not to use
> the given driver with the device.
> 
> Now, acpi_driver_match_device() returns 'true' if (a) the device has an ACPI
> handle and (b) at least one of the IDs of the struct acpi_device associated
> with that handle is in the driver's .acpi_match_table table.  Since the ACPI
> processor's .acpi_match_table contains the same set of IDs as the table
> of device IDs of processor_handler, this guarantees that the ACPI processor
> driver will be used for the devices prepared by acpi_processor_add().
> 
> What it boils down to is that acpi_processor_start() is going to be called
> for every device whose ACPI handle is populated by acpi_processor_add().
> 
> > Do CPU devices still behave the same on non-ACPI systems?
> 
> Yes, they do.  The whole driver matching/binding is irrelevant to them, because
> the ACPI processor driver is the only one registering itself under cpu_subsys.

Thanks for the detailed explanation!  I missed that the new processor
driver is registered to cpu_subsys.  I now see what you did.  This is
clever.

One minor comment.

> +static __cpuinit int __acpi_processor_start(struct acpi_device *device)
>  {
> -	struct acpi_device *device = per_cpu(processor_device_array, pr->id);
> +	struct acpi_processor *pr = acpi_driver_data(device);
> +	acpi_status status;
>  	int result = 0;
>  
> +	if (!pr)
> +		return -ENODEV;
> +
> +	if (pr->flags.need_hotplug_init)
> +		return 0;
> +

I feel the name "need_hotplug_init" is a bit misleading, since the
function actually skips the initialization when the flag is set.  It may
be nice to rename it to defer_online_init, offline or something like that.

Otherwise the changes look very good. 

Reviewed-by: Toshi Kani <toshi.kani@hp.com>

Thanks,
-Toshi





^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 4/4] ACPI / processor: Use common hotplug infrastructure
  2013-05-03 18:27         ` Toshi Kani
@ 2013-05-03 19:31           ` Rafael J. Wysocki
  2013-05-03 19:34             ` Toshi Kani
  0 siblings, 1 reply; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-03 19:31 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Greg Kroah-Hartman, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis, Len Brown

On Friday, May 03, 2013 12:27:54 PM Toshi Kani wrote:
> On Fri, 2013-05-03 at 14:05 +0200, Rafael J. Wysocki wrote:
> > On Thursday, May 02, 2013 05:20:12 PM Toshi Kani wrote:
> > > On Thu, 2013-05-02 at 14:31 +0200, Rafael J. Wysocki wrote:
> > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>  : 
> > > > Index: linux-pm/drivers/base/cpu.c
> > > > ===================================================================
> > > > --- linux-pm.orig/drivers/base/cpu.c
> > > > +++ linux-pm/drivers/base/cpu.c
> > > > @@ -13,11 +13,21 @@
> > > >  #include <linux/gfp.h>
> > > >  #include <linux/slab.h>
> > > >  #include <linux/percpu.h>
> > > > +#include <linux/acpi.h>
> > > >  
> > > >  #include "base.h"
> > > >  
> > > >  static DEFINE_PER_CPU(struct device *, cpu_sys_devices);
> > > >  
> > > > +static int cpu_subsys_match(struct device *dev, struct device_driver *drv)
> > > > +{
> > > > +	/* ACPI style match is the only one that may succeed. */
> > > > +	if (acpi_driver_match_device(dev, drv))
> > > 
> > > Can you explain why this change is needed?
> > 
> > This is the mechanism by which the driver core determines which driver to use
> > with a processor device passed to device_attach().
> > 
> > Basically, it walks the list of drivers whose bus type is cpu_subsys and
> > calls cpu_subsys->match(), which points to cpu_subsys_match(), for the device
> > and each of the drivers.  The result of that tells it whether or not to use
> > the given driver with the device.
> > 
> > Now, acpi_driver_match_device() returns 'true' if (a) the device has an ACPI
> > handle and (b) at least one of the IDs of the struct acpi_device associated
> > with that handle is in the driver's .acpi_match_table table.  Since the ACPI
> > processor's .acpi_match_table contains the same set of IDs as the table
> > of device IDs of processor_handler, this guarantees that the ACPI processor
> > driver will be used for the devices prepared by acpi_processor_add().
> > 
> > What it boils down to is that acpi_processor_start() is going to be called
> > for every device whose ACPI handle is populated by acpi_processor_add().
> > 
> > > Do CPU devices still behave the same on non-ACPI systems?
> > 
> > Yes, they do.  The whole driver matching/binding is irrelevant to them, because
> > the ACPI processor driver is the only one registering itself under cpu_subsys.
> 
> Thanks for the detailed explanation!  I missed that the new processor
> driver is registered to cpu_subsys.  I now see what you did.  This is
> clever.

Well, thanks! :-)

> One minor comment.
> 
> > +static __cpuinit int __acpi_processor_start(struct acpi_device *device)
> >  {
> > -	struct acpi_device *device = per_cpu(processor_device_array, pr->id);
> > +	struct acpi_processor *pr = acpi_driver_data(device);
> > +	acpi_status status;
> >  	int result = 0;
> >  
> > +	if (!pr)
> > +		return -ENODEV;
> > +
> > +	if (pr->flags.need_hotplug_init)
> > +		return 0;
> > +
> 
> I felt the name of "need_hotplug_init" is a bit misleading since the
> func actually skips when the need-flag is set.  It may be nice to rename
> it to defer_online_init, offline or something like that.

I just wanted to avoid making too many non-essential changes in one patch.
We can change the name of that field at any time later.

> Otherwise the changes look very good. 
> 
> Reviewed-by: Toshi Kani <toshi.kani@hp.com>

Thank you!

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 4/4] ACPI / processor: Use common hotplug infrastructure
  2013-05-03 19:31           ` Rafael J. Wysocki
@ 2013-05-03 19:34             ` Toshi Kani
  0 siblings, 0 replies; 105+ messages in thread
From: Toshi Kani @ 2013-05-03 19:34 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg Kroah-Hartman, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis, Len Brown

On Fri, 2013-05-03 at 21:31 +0200, Rafael J. Wysocki wrote:
> On Friday, May 03, 2013 12:27:54 PM Toshi Kani wrote:
> > On Fri, 2013-05-03 at 14:05 +0200, Rafael J. Wysocki wrote:
> > > On Thursday, May 02, 2013 05:20:12 PM Toshi Kani wrote:

:

> > One minor comment.
> > 
> > > +static __cpuinit int __acpi_processor_start(struct acpi_device *device)
> > >  {
> > > -	struct acpi_device *device = per_cpu(processor_device_array, pr->id);
> > > +	struct acpi_processor *pr = acpi_driver_data(device);
> > > +	acpi_status status;
> > >  	int result = 0;
> > >  
> > > +	if (!pr)
> > > +		return -ENODEV;
> > > +
> > > +	if (pr->flags.need_hotplug_init)
> > > +		return 0;
> > > +
> > 
> > I felt the name of "need_hotplug_init" is a bit misleading since the
> > func actually skips when the need-flag is set.  It may be nice to rename
> > it to defer_online_init, offline or something like that.
> 
> I just wanted to avoid making too many non-essential changes in one patch.
> We can change the name of that field at any time later.

Sounds good to me.

Thanks,
-Toshi



^ permalink raw reply	[flat|nested] 105+ messages in thread

* [PATCH 0/3 RFC] Driver core: Add offline/online callbacks for memory_subsys
  2013-05-02 12:26 ` [PATCH 0/4] Driver core / ACPI: Add offline/online for graceful hot-removal of devices Rafael J. Wysocki
@ 2013-05-04  1:01     ` Rafael J. Wysocki
  2013-05-02 12:28   ` [PATCH 2/4] Driver core: Use generic offline/online for CPU offline/online Rafael J. Wysocki
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-04  1:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis, Len Brown, linux-mm

Hi,

This is a continuation of this patchset: https://lkml.org/lkml/2013/5/2/214
and it applies on top of it or rather on top of the rebased version (with
build problems fixed) in the bleeding-edge branch of the linux-pm.git tree:

http://git.kernel.org/cgit/linux/kernel/git/rafael/linux-pm.git/log/?h=bleeding-edge

An introduction to the first part of the patchset is below; a description of
the current patches follows.

On Thursday, May 02, 2013 02:26:39 PM Rafael J. Wysocki wrote:
> On Monday, April 29, 2013 02:23:59 PM Rafael J. Wysocki wrote:
> > 
> > It has been argued a number of times that in some cases, if a device cannot
> > be gracefully removed from the system, it shouldn't be removed from it at all,
> > because that may lead to a kernel crash.  In particular, that will happen if a
> > memory module holding kernel memory is removed, but also removing the last CPU
> > in the system may not be a good idea.  [And I can imagine a few other cases
> > like that.]
> > 
> > The kernel currently only supports "forced" hot-remove which cannot be stopped
> > once started, so users have no choice but to try to hot-remove stuff and see
> > whether or not that crashes the kernel which is kind of unpleasant.  That seems
> > to be based on the "the user knows better" argument according to which users
> > triggering device hot-removal should really know what they are doing, so the
> > kernel doesn't have to worry about that.  However, for instance, this pretty
> > much isn't the case for memory modules, because the users have no way to see
> > whether or not any kernel memory has been allocated from a given module.
> > 
> > There have been a few attempts to address this issue, but none of them has
> > gained broader acceptance.  The following 3 patches are the heart of a new
> > proposal which is based on the idea to introduce device_offline() and
> > device_online() operations along the lines of the existing CPU offline/online
> > mechanism (or, rather, to extend the CPU offline/online so that analogous
> > operations are available for other devices).  The way it is supposed to work is
> > that device_offline() will fail if the given device cannot be gracefully
> > removed from the system (in the kernel's view).  Once it succeeds, though, the
> > device won't be used any more until either it is removed, or device_online() is
> > run for it.  That will allow the ACPI device hot-remove code, for one example,
> > to avoid triggering a non-reversible removal procedure for devices that cannot
> > be removed gracefully.
> > 
> > Patch [1/3] introduces device_offline() and device_online() as outlined above.
> > The .offline() and .online() callbacks are only added at the bus type level for
> > now, because that should be sufficient to cover the memory and CPU use cases.
> 
> That's [1/4] now and the changes from the previous version are:
> - strtobool() is used in store_online().
> - device_offline_lock has been renamed to device_hotplug_lock (and the
>   functions operating on it accordingly) following Toshi's advice.
> 
> > Patch [2/3] modifies the CPU hotplug support code to use device_offline() and
> > device_online() to support the sysfs 'online' attribute for CPUs.
> 
> That is [2/4] now and it takes cpu_hotplug_driver_lock() around cpu_up() and
> cpu_down().
> 
> > Patch [3/3] changes the ACPI device hot-remove code to use device_offline()
> > for checking if graceful removal of devices is possible.  The way it does that
> > is to walk the list of "physical" companion devices for each struct acpi_device
> > involved in the operation and call device_offline() for each of them.  If any
> > of the device_offline() calls fails (and the hot-removal is not "forced", which
> > is an option), the removal procedure (which is not reversible) is simply not
> > carried out.
> 
> That's current [3/4].  It's a bit simpler, because I decided that it would be
> better to have a global 'force_remove' attribute (the semantics of the
> per-profile 'force_remove' wasn't clear and it didn't really add any value over
> a global one).  I also added lock/unlock_device_hotplug() around acpi_bus_scan()
> in acpi_scan_bus_device_check() to allow scan handlers to update dev->offline
> for "physical" companion devices safely (the processor's one added by the next
> patch actually does that).
> 
> > Of some concern is that device_offline() (and possibly device_online()) is
> > called under physical_node_lock of the corresponding struct acpi_device, which
> > introduces ordering dependency between that lock and device locks for the
> > "physical" devices, but I didn't see any cleaner way to do that (I guess it
> > is avoidable at the expense of added complexity, but for now it's just better
> > to make the code as clean as possible IMO).
> 
> Patch [4/4] reworks the ACPI processor driver to use the common hotplug code.
> It basically splits the driver into two parts as described in the changelog,
> where the first part is essentially a scan handler and the second part is
> a driver, but it doesn't bind to struct acpi_device any more.  Instead, it
> binds to processor devices under /sys/devices/system/cpu/ (the driver itself
> has a sysfs directory under /sys/bus/cpu/drivers/ which IMHO makes more sense
> than having it under /sys/bus/acpi/drivers/).
> 
> The patch at https://patchwork.kernel.org/patch/2506371/ is a prerequisite
> for this series, but I'm going to push it for v3.10-rc2 if no one screams
> bloody murder.

Patch [1/3] in the current series uses acpi_bind_one() to associate memory
block devices with ACPI namespace objects representing memory modules that hold
them.  With patch [3/3] that will allow the ACPI core's device hot-remove code
to attempt to offline the memory blocks, if possible, before removing the
modules holding them from the system (and if the offlining fails, the removal
will not be carried out).

Patch [2/3] kind of prepares the (just introduced) driver core's device_online()
and device_offline() for handling memory block devices (because for those devices
there are multiple types, or levels if you will, of "online").

Finally, patch [3/3] adds .online() and .offline() callbacks to memory_subsys
that are used by the common "online" sysfs attribute and by the ACPI core's
hot-remove code, through device_online() and device_offline().
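
The intended flow (try to offline every memory block of a module, and carry
out the removal only if all of them succeed or removal is forced) can be
sketched in user space as follows.  This is an assumption-laden model, not the
patches themselves: the model_* names, the 'holds_kernel_memory' flag and the
rollback of blocks offlined earlier in the same attempt are illustrative
choices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_EBUSY (-16)  /* illustrative error value, modelled on -EBUSY */

/* Hypothetical memory block belonging to one memory module. */
struct model_block {
	bool offline;
	bool holds_kernel_memory;  /* stand-in for "cannot be offlined" */
	bool undo;                 /* offlined by the current removal attempt */
};

/* Returns 0 on success, 1 if already offline, negative if it cannot be done. */
static int model_block_offline(struct model_block *b)
{
	if (b->offline)
		return 1;
	if (b->holds_kernel_memory)
		return MODEL_EBUSY;
	b->offline = true;
	return 0;
}

static void model_block_online(struct model_block *b)
{
	b->offline = false;
}

/* Returns 0 if the removal may proceed, a negative value if it was aborted. */
static int model_hot_remove(struct model_block *blocks, size_t n,
			    bool force_remove)
{
	size_t i;
	int ret;

	assert(blocks != NULL || n == 0);

	for (i = 0; i < n; i++) {
		ret = model_block_offline(&blocks[i]);
		blocks[i].undo = (ret == 0);
		if (ret < 0 && !force_remove) {
			/* Put back online only what this attempt took offline. */
			while (i-- > 0)
				if (blocks[i].undo)
					model_block_online(&blocks[i]);
			return ret;
		}
	}
	return 0;
}
```

Blocks that were already offline before the attempt are left alone on
rollback, so a failed removal leaves the module in the state it started in.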

I hope this is not too ugly. :-)

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


^ permalink raw reply	[flat|nested] 105+ messages in thread

* [PATCH 0/3 RFC] Driver core: Add offline/online callbacks for memory_subsys
@ 2013-05-04  1:01     ` Rafael J. Wysocki
  0 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-04  1:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis, Len Brown, linux-mm

Hi,

This is a continuation of this patchset: https://lkml.org/lkml/2013/5/2/214
and it applies on top of it or rather on top of the rebased version (with
build problems fixed) in the bleeding-edge branch of the linux-pm.git tree:

http://git.kernel.org/cgit/linux/kernel/git/rafael/linux-pm.git/log/?h=bleeding-edge

An introduction to the first part of the patchset is below; a description of
the current patches follows.

On Thursday, May 02, 2013 02:26:39 PM Rafael J. Wysocki wrote:
> On Monday, April 29, 2013 02:23:59 PM Rafael J. Wysocki wrote:
> > 
> > It has been argued a number of times that in some cases, if a device cannot
> > be gracefully removed from the system, it shouldn't be removed from it at all,
> > because that may lead to a kernel crash.  In particular, that will happen if a
> > memory module holding kernel memory is removed, but also removing the last CPU
> > in the system may not be a good idea.  [And I can imagine a few other cases
> > like that.]
> > 
> > The kernel currently only supports "forced" hot-remove which cannot be stopped
> > once started, so users have no choice but to try to hot-remove stuff and see
> > whether or not that crashes the kernel which is kind of unpleasant.  That seems
> > to be based on the "the user knows better" argument according to which users
> > triggering device hot-removal should really know what they are doing, so the
> > kernel doesn't have to worry about that.  However, for instance, this pretty
> > much isn't the case for memory modules, because the users have no way to see
> > whether or not any kernel memory has been allocated from a given module.
> > 
> > There have been a few attempts to address this issue, but none of them has
> > gained broader acceptance.  The following 3 patches are the heart of a new
> > proposal which is based on the idea to introduce device_offline() and
> > device_online() operations along the lines of the existing CPU offline/online
> > mechanism (or, rather, to extend the CPU offline/online so that analogous
> > operations are available for other devices).  The way it is supposed to work is
> > that device_offline() will fail if the given device cannot be gracefully
> > removed from the system (in the kernel's view).  Once it succeeds, though, the
> > device won't be used any more until either it is removed, or device_online() is
> > run for it.  That will allow the ACPI device hot-remove code, for one example,
> > to avoid triggering a non-reversible removal procedure for devices that cannot
> > be removed gracefully.
> > 
> > Patch [1/3] introduces device_offline() and device_online() as outlined above.
> > The .offline() and .online() callbacks are only added at the bus type level for
> > now, because that should be sufficient to cover the memory and CPU use cases.
> 
> That's [1/4] now and the changes from the previous version are:
> - strtobool() is used in store_online().
> - device_offline_lock has been renamed to device_hotplug_lock (and the
>   functions operating on it accordingly) following Toshi's advice.
> 
> > Patch [2/3] modifies the CPU hotplug support code to use device_offline() and
> > device_online() to support the sysfs 'online' attribute for CPUs.
> 
> That is [2/4] now and it takes cpu_hotplug_driver_lock() around cpu_up() and
> cpu_down().
> 
> > Patch [3/3] changes the ACPI device hot-remove code to use device_offline()
> > for checking if graceful removal of devices is possible.  The way it does that
> > is to walk the list of "physical" companion devices for each struct acpi_device
> > involved in the operation and call device_offline() for each of them.  If any
> > of the device_offline() calls fails (and the hot-removal is not "forced", which
> > is an option), the removal procedure (which is not reversible) is simply not
> > carried out.
> 
> That's current [3/4].  It's a bit simpler, because I decided that it would be
> better to have a global 'force_remove' attribute (the semantics of the
> per-profile 'force_remove' wasn't clear and it didn't really add any value over
> a global one).  I also added lock/unlock_device_hotplug() around acpi_bus_scan()
> in acpi_scan_bus_device_check() to allow scan handlers to update dev->offline
> for "physical" companion devices safely (the processor's one added by the next
> patch actually does that).
> 
> > Of some concern is that device_offline() (and possibly device_online()) is
> > called under physical_node_lock of the corresponding struct acpi_device, which
> > introduces ordering dependency between that lock and device locks for the
> > "physical" devices, but I didn't see any cleaner way to do that (I guess it
> > is avoidable at the expense of added complexity, but for now it's just better
> > to make the code as clean as possible IMO).
> 
> Patch [4/4] reworks the ACPI processor driver to use the common hotplug code.
> It basically splits the driver into two parts as described in the changelog,
> where the first part is essentially a scan handler and the second part is
> a driver, but it doesn't bind to struct acpi_device any more.  Instead, it
> binds to processor devices under /sys/devices/system/cpu/ (the driver itself
> has a sysfs directory under /sys/bus/cpu/drivers/ which IMHO makes more sense
> than having it under /sys/bus/acpi/drivers/).
> 
> The patch at https://patchwork.kernel.org/patch/2506371/ is a prerequisite
> for this series, but I'm going to push it for v3.10-rc2 if no one screams
> bloody murder.

Patch [1/3] in the current series uses acpi_bind_one() to associate memory
block devices with ACPI namespace objects representing memory modules that hold
them.  With patch [3/3] that will allow the ACPI core's device hot-remove code
to attempt to offline the memory blocks, if possible, before removing the
modules holding them from the system (and if the offlining fails, the removal
will not be carried out).
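
The "offline first, remove only if that works" flow described above can be
modelled in user space roughly as follows.  This is only a sketch under stated
assumptions, not the ACPI core's code: struct mock_block and all of the mock_*
helpers are hypothetical names made up for illustration, and the rollback of
blocks that were already taken offline mirrors the put_online bookkeeping from
the earlier series.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory block device (not a kernel structure). */
struct mock_block {
	bool online;       /* current state of the block */
	bool holds_kernel; /* assumed reason for refusing to go offline */
	bool put_online;   /* set if this pass took the block offline */
};

/* Returns 0 on success, 1 if already offline, -EBUSY if offlining is refused. */
static int mock_offline(struct mock_block *b)
{
	if (!b->online)
		return 1;
	if (b->holds_kernel)
		return -EBUSY;
	b->online = false;
	return 0;
}

static void mock_online(struct mock_block *b)
{
	b->online = true;
}

/*
 * Try to offline every block of a module.  If any attempt fails, put back
 * online only the blocks taken offline by this call and return the error,
 * so the (irreversible) removal is never started.
 */
static int mock_try_remove_module(struct mock_block *blocks, size_t n)
{
	size_t i;
	int ret = 0;

	assert(blocks != NULL || n == 0);

	for (i = 0; i < n; i++) {
		ret = mock_offline(&blocks[i]);
		if (ret < 0)
			break;
		/* Remember only the blocks that were really online before. */
		blocks[i].put_online = (ret == 0);
	}
	if (i == n)
		return 0;

	/* Roll back in reverse order. */
	while (i-- > 0) {
		if (blocks[i].put_online) {
			mock_online(&blocks[i]);
			blocks[i].put_online = false;
		}
	}
	return ret;
}
```

With one block that refuses to go offline, mock_try_remove_module() returns
-EBUSY and every block that was online before the call is online again after it.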

Patch [2/3] prepares the (just introduced) driver core's device_online() and
device_offline() for handling memory block devices (because for those devices
there are multiple types, or levels if you will, of "online").

Finally, patch [3/3] adds .online() and .offline() callbacks to memory_subsys
that are used by the common "online" sysfs attribute and by the ACPI core's
hot-remove code, through device_online() and device_offline().

I hope this is not too ugly. :-)

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 105+ messages in thread

* [PATCH 1/3 RFC] ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes
  2013-05-04  1:01     ` Rafael J. Wysocki
@ 2013-05-04  1:03       ` Rafael J. Wysocki
  -1 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-04  1:03 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis, Len Brown, linux-mm

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

During ACPI memory hotplug configuration, bind memory blocks residing
in modules removable through the standard ACPI mechanism to struct
acpi_device objects associated with ACPI namespace objects
representing those modules.  Accordingly, unbind those memory blocks
from the struct acpi_device objects when the memory modules in
question are being removed.

When an "offline" operation for devices representing memory blocks is
introduced, this will allow the ACPI core's device hot-remove code to
use it to carry out remove_memory() for those memory blocks and check
the results of that before it actually removes the modules holding
them from the system.

Since walk_memory_range() is used for accessing all memory blocks
corresponding to a given ACPI namespace object, it is exported from
memory_hotplug.c so that the code in acpi_memhotplug.c can use it.
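
For illustration, the way a physical address range maps to a set of memory
blocks that a callback is then run for can be sketched in user space as below.
This is a simplified model, not the kernel's walk_memory_range(): the page
size, the block size and all mock_* names are assumptions chosen for the
example, and blocks are identified by a plain index rather than by struct
memory_block.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_PAGE_SHIFT      12                        /* assumed 4 KiB pages */
#define MOCK_PAGE_SIZE       (1ULL << MOCK_PAGE_SHIFT)
#define MOCK_PAGES_PER_BLOCK 32768UL                   /* assumed 128 MiB blocks */

/* Round an address down/up to a page frame number. */
static unsigned long mock_pfn_down(unsigned long long addr)
{
	return (unsigned long)(addr >> MOCK_PAGE_SHIFT);
}

static unsigned long mock_pfn_up(unsigned long long addr)
{
	return (unsigned long)((addr + MOCK_PAGE_SIZE - 1) >> MOCK_PAGE_SHIFT);
}

typedef int (*mock_block_fn)(unsigned long block_id, void *arg);

/*
 * Call func once for every block overlapping [start_pfn, end_pfn) and stop
 * at the first nonzero return value, which is passed back to the caller.
 */
static int mock_walk_range(unsigned long start_pfn, unsigned long end_pfn,
			   void *arg, mock_block_fn func)
{
	unsigned long id, first, last;
	int ret;

	assert(func != NULL);

	if (start_pfn >= end_pfn)
		return 0;

	first = start_pfn / MOCK_PAGES_PER_BLOCK;
	last = (end_pfn - 1) / MOCK_PAGES_PER_BLOCK;

	for (id = first; id <= last; id++) {
		ret = func(id, arg);
		if (ret)
			return ret;
	}
	return 0;
}

/* Example callback: count the blocks visited. */
static int mock_count_block(unsigned long block_id, void *arg)
{
	(void)block_id;
	++*(unsigned int *)arg;
	return 0;
}
```

Under these assumptions, a 256 MiB module starting at 4 GiB spans two blocks.
The end PFN is computed from the last byte of the range (start + length - 1)
and rounded up, the same way acpi_meminfo_end_pfn() does it in this patch.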

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/acpi_memhotplug.c |   53 ++++++++++++++++++++++++++++++++++++++---
 include/linux/memory_hotplug.h |    2 +
 mm/memory_hotplug.c            |    4 ++-
 3 files changed, 55 insertions(+), 4 deletions(-)

Index: linux-pm/mm/memory_hotplug.c
===================================================================
--- linux-pm.orig/mm/memory_hotplug.c
+++ linux-pm/mm/memory_hotplug.c
@@ -1618,6 +1618,7 @@ int offline_pages(unsigned long start_pf
 {
 	return __offline_pages(start_pfn, start_pfn + nr_pages, 120 * HZ);
 }
+#endif /* CONFIG_MEMORY_HOTREMOVE */
 
 /**
  * walk_memory_range - walks through all mem sections in [start_pfn, end_pfn)
@@ -1631,7 +1632,7 @@ int offline_pages(unsigned long start_pf
  *
  * Returns the return value of func.
  */
-static int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
+int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
 		void *arg, int (*func)(struct memory_block *, void *))
 {
 	struct memory_block *mem = NULL;
@@ -1668,6 +1669,7 @@ static int walk_memory_range(unsigned lo
 	return 0;
 }
 
+#ifdef CONFIG_MEMORY_HOTREMOVE
 /**
  * offline_memory_block_cb - callback function for offlining memory block
  * @mem: the memory block to be offlined
Index: linux-pm/include/linux/memory_hotplug.h
===================================================================
--- linux-pm.orig/include/linux/memory_hotplug.h
+++ linux-pm/include/linux/memory_hotplug.h
@@ -245,6 +245,8 @@ static inline int is_mem_section_removab
 static inline void try_offline_node(int nid) {}
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 
+extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
+		void *arg, int (*func)(struct memory_block *, void *));
 extern int mem_online_node(int nid);
 extern int add_memory(int nid, u64 start, u64 size);
 extern int arch_add_memory(int nid, u64 start, u64 size);
Index: linux-pm/drivers/acpi/acpi_memhotplug.c
===================================================================
--- linux-pm.orig/drivers/acpi/acpi_memhotplug.c
+++ linux-pm/drivers/acpi/acpi_memhotplug.c
@@ -28,6 +28,7 @@
  */
 
 #include <linux/acpi.h>
+#include <linux/memory.h>
 #include <linux/memory_hotplug.h>
 
 #include "internal.h"
@@ -166,13 +167,50 @@ static int acpi_memory_check_device(stru
 	return 0;
 }
 
+static unsigned long acpi_meminfo_start_pfn(struct acpi_memory_info *info)
+{
+	return PFN_DOWN(info->start_addr);
+}
+
+static unsigned long acpi_meminfo_end_pfn(struct acpi_memory_info *info)
+{
+	return PFN_UP(info->start_addr + info->length-1);
+}
+
+static int acpi_bind_memblk(struct memory_block *mem, void *arg)
+{
+	return acpi_bind_one(&mem->dev, (acpi_handle)arg);
+}
+
+static int acpi_bind_memory_blocks(struct acpi_memory_info *info,
+				   acpi_handle handle)
+{
+	return walk_memory_range(acpi_meminfo_start_pfn(info),
+				 acpi_meminfo_end_pfn(info), (void *)handle,
+				 acpi_bind_memblk);
+}
+
+static int acpi_unbind_memblk(struct memory_block *mem, void *arg)
+{
+	acpi_unbind_one(&mem->dev);
+	return 0;
+}
+
+static void acpi_unbind_memory_blocks(struct acpi_memory_info *info,
+				      acpi_handle handle)
+{
+	walk_memory_range(acpi_meminfo_start_pfn(info),
+			  acpi_meminfo_end_pfn(info), NULL, acpi_unbind_memblk);
+}
+
 static int acpi_memory_enable_device(struct acpi_memory_device *mem_device)
 {
+	acpi_handle handle = mem_device->device->handle;
 	int result, num_enabled = 0;
 	struct acpi_memory_info *info;
 	int node;
 
-	node = acpi_get_node(mem_device->device->handle);
+	node = acpi_get_node(handle);
 	/*
 	 * Tell the VM there is more memory here...
 	 * Note: Assume that this function returns zero on success
@@ -203,6 +241,12 @@ static int acpi_memory_enable_device(str
 		if (result && result != -EEXIST)
 			continue;
 
+		result = acpi_bind_memory_blocks(info, handle);
+		if (result) {
+			acpi_unbind_memory_blocks(info, handle);
+			return -ENODEV;
+		}
+
 		info->enabled = 1;
 
 		/*
@@ -229,10 +273,11 @@ static int acpi_memory_enable_device(str
 
 static int acpi_memory_remove_memory(struct acpi_memory_device *mem_device)
 {
+	acpi_handle handle = mem_device->device->handle;
 	int result = 0, nid;
 	struct acpi_memory_info *info, *n;
 
-	nid = acpi_get_node(mem_device->device->handle);
+	nid = acpi_get_node(handle);
 
 	list_for_each_entry_safe(info, n, &mem_device->res_list, list) {
 		if (!info->enabled)
@@ -240,6 +285,8 @@ static int acpi_memory_remove_memory(str
 
 		if (nid < 0)
 			nid = memory_add_physaddr_to_nid(info->start_addr);
+
+		acpi_unbind_memory_blocks(info, handle);
 		result = remove_memory(nid, info->start_addr, info->length);
 		if (result)
 			return result;
@@ -300,7 +347,7 @@ static int acpi_memory_device_add(struct
 	if (result) {
 		dev_err(&device->dev, "acpi_memory_enable_device() error\n");
 		acpi_memory_device_free(mem_device);
-		return -ENODEV;
+		return result;
 	}
 
 	dev_dbg(&device->dev, "Memory device configured by ACPI\n");


^ permalink raw reply	[flat|nested] 105+ messages in thread

* [PATCH 2/3 RFC] Driver core: Introduce types of device "online"
  2013-05-04  1:01     ` Rafael J. Wysocki
@ 2013-05-04  1:04       ` Rafael J. Wysocki
  -1 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-04  1:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis, Len Brown, linux-mm

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

For memory blocks there are multiple ways in which they can be
"online" that determine what can be done with the given block.

For this reason, to allow the generic device_offline() and
device_online() to be used for devices representing memory
blocks, introduce a second "online type" argument for
device_online() that will be interpreted by the bus type whose
.online() callback is executed by device_online().

Of course, that requires some changes to be made in struct device
and struct bus_type, and the code related to device_online()
and device_offline() needs to be changed as well.
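
The intended state handling can be summarized with the following user-space
sketch.  It follows the logic of the device_online() and device_offline()
changes in this patch, but struct mock_device, the mock_* functions and the
"types 1 to 3" rule in the example bus callback are assumptions made up for
illustration; locking, uevents and the device_supports_offline() check are
left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for struct device plus its bus. */
struct mock_device {
	unsigned int online_type; /* 0 if offline, otherwise bus type dependent */
	int (*online)(struct mock_device *dev, unsigned int type);
	int (*offline)(struct mock_device *dev);
};

/* Returns 0 on success, 1 if already online, negative errno on failure. */
static int mock_device_online(struct mock_device *dev, unsigned int type)
{
	int ret;

	assert(dev != NULL && dev->online != NULL);

	if (!type)
		return -EINVAL;	/* 0 is reserved for "offline" */
	if (dev->online_type)
		return 1;	/* the current type is left unchanged */

	ret = dev->online(dev, type);
	if (!ret)
		dev->online_type = type;
	return ret;
}

/* Returns 0 on success, 1 if already offline, negative errno on failure. */
static int mock_device_offline(struct mock_device *dev)
{
	int ret;

	assert(dev != NULL && dev->offline != NULL);

	if (!dev->online_type)
		return 1;

	ret = dev->offline(dev);
	if (!ret)
		dev->online_type = 0;
	return ret;
}

/* Example bus callbacks: this made-up bus accepts online types 1 to 3. */
static int mock_bus_online(struct mock_device *dev, unsigned int type)
{
	(void)dev;
	return type > 3 ? -EINVAL : 0;
}

static int mock_bus_offline(struct mock_device *dev)
{
	(void)dev;
	return 0;
}
```

One consequence visible in the sketch: a request to put an already online
device online with a different type returns 1 and does not change the type,
so switching types requires going through offline first.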

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/acpi_processor.c |    2 +-
 drivers/acpi/scan.c           |   14 ++++++++------
 drivers/base/core.c           |   36 ++++++++++++++++++++----------------
 drivers/base/cpu.c            |    5 ++++-
 include/acpi/acpi_bus.h       |    2 +-
 include/linux/device.h        |    8 ++++----
 6 files changed, 38 insertions(+), 29 deletions(-)

Index: linux-pm/include/linux/device.h
===================================================================
--- linux-pm.orig/include/linux/device.h
+++ linux-pm/include/linux/device.h
@@ -108,7 +108,7 @@ struct bus_type {
 	int (*remove)(struct device *dev);
 	void (*shutdown)(struct device *dev);
 
-	int (*online)(struct device *dev);
+	int (*online)(struct device *dev, unsigned int type);
 	int (*offline)(struct device *dev);
 
 	int (*suspend)(struct device *dev, pm_message_t state);
@@ -656,7 +656,7 @@ struct acpi_dev_node {
  * 		gone away. This should be set by the allocator of the
  * 		device (i.e. the bus driver that discovered the device).
  * @offline_disabled: If set, the device is permanently online.
- * @offline:	Set after successful invocation of bus type's .offline().
+ * @online_type: 0 if the device is offline, otherwise bus type dependent.
  *
  * At the lowest level, every device in a Linux system is represented by an
  * instance of struct device. The device structure contains the information
@@ -730,8 +730,8 @@ struct device {
 	void	(*release)(struct device *dev);
 	struct iommu_group	*iommu_group;
 
+	unsigned int 		online_type;
 	bool			offline_disabled:1;
-	bool			offline:1;
 };
 
 static inline struct device *kobj_to_dev(struct kobject *kobj)
@@ -876,7 +876,7 @@ static inline bool device_supports_offli
 extern void lock_device_hotplug(void);
 extern void unlock_device_hotplug(void);
 extern int device_offline(struct device *dev);
-extern int device_online(struct device *dev);
+extern int device_online(struct device *dev, unsigned int type);
 /*
  * Root device objects for grouping under /sys/devices
  */
Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -406,10 +406,10 @@ static struct device_attribute uevent_at
 static ssize_t show_online(struct device *dev, struct device_attribute *attr,
 			   char *buf)
 {
-	bool val;
+	unsigned int val;
 
 	lock_device_hotplug();
-	val = !dev->offline;
+	val = dev->online_type;
 	unlock_device_hotplug();
 	return sprintf(buf, "%u\n", val);
 }
@@ -417,15 +417,15 @@ static ssize_t show_online(struct device
 static ssize_t store_online(struct device *dev, struct device_attribute *attr,
 			    const char *buf, size_t count)
 {
-	bool val;
+	unsigned int val;
 	int ret;
 
-	ret = strtobool(buf, &val);
+	ret = kstrtouint(buf, 10, &val);
 	if (ret < 0)
 		return ret;
 
 	lock_device_hotplug();
-	ret = val ? device_online(dev) : device_offline(dev);
+	ret = val ? device_online(dev, val) : device_offline(dev);
 	unlock_device_hotplug();
 	return ret < 0 ? ret : count;
 }
@@ -1488,7 +1488,7 @@ static int device_check_offline(struct d
 	if (ret)
 		return ret;
 
-	return device_supports_offline(dev) && !dev->offline ? -EBUSY : 0;
+	return device_supports_offline(dev) && !!dev->online_type ? -EBUSY : 0;
 }
 
 /**
@@ -1515,14 +1515,14 @@ int device_offline(struct device *dev)
 
 	device_lock(dev);
 	if (device_supports_offline(dev)) {
-		if (dev->offline) {
-			ret = 1;
-		} else {
+		if (dev->online_type) {
 			ret = dev->bus->offline(dev);
 			if (!ret) {
 				kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
-				dev->offline = true;
+				dev->online_type = 0;
 			}
+		} else {
+			ret = 1;
 		}
 	}
 	device_unlock(dev);
@@ -1533,6 +1533,7 @@ int device_offline(struct device *dev)
 /**
  * device_online - Put the device back online after successful device_offline().
  * @dev: Device to be put back online.
+ * @type: Interpreted by the bus type, must be nonzero.
  *
  * If device_offline() has been successfully executed for @dev, but the device
  * has not been removed subsequently, execute its bus type's .online() callback
@@ -1540,20 +1541,23 @@ int device_offline(struct device *dev)
  *
  * Call under device_hotplug_lock.
  */
-int device_online(struct device *dev)
+int device_online(struct device *dev, unsigned int type)
 {
 	int ret = 0;
 
+	if (!type)
+		return -EINVAL;
+
 	device_lock(dev);
 	if (device_supports_offline(dev)) {
-		if (dev->offline) {
-			ret = dev->bus->online(dev);
+		if (dev->online_type) {
+			ret = 1;
+		} else {
+			ret = dev->bus->online(dev, type);
 			if (!ret) {
 				kobject_uevent(&dev->kobj, KOBJ_ONLINE);
-				dev->offline = false;
+				dev->online_type = type;
 			}
-		} else {
-			ret = 1;
 		}
 	}
 	device_unlock(dev);
Index: linux-pm/include/acpi/acpi_bus.h
===================================================================
--- linux-pm.orig/include/acpi/acpi_bus.h
+++ linux-pm/include/acpi/acpi_bus.h
@@ -286,7 +286,7 @@ struct acpi_device_physical_node {
 	u8 node_id;
 	struct list_head node;
 	struct device *dev;
-	bool put_online:1;
+	unsigned int online_type;
 };
 
 /* set maximum of physical nodes to 32 for expansibility */
Index: linux-pm/drivers/acpi/scan.c
===================================================================
--- linux-pm.orig/drivers/acpi/scan.c
+++ linux-pm/drivers/acpi/scan.c
@@ -141,15 +141,17 @@ static acpi_status acpi_bus_offline_comp
 	list_for_each_entry(pn, &device->physical_node_list, node) {
 		int ret;
 
+		pn->online_type = pn->dev->online_type;
 		ret = device_offline(pn->dev);
-		if (acpi_force_hot_remove)
+		if (acpi_force_hot_remove) {
+			pn->online_type = 0;
 			continue;
-
+		}
 		if (ret < 0) {
+			pn->online_type = 0;
 			status = AE_ERROR;
 			break;
 		}
-		pn->put_online = !ret;
 	}
 
 	mutex_unlock(&device->physical_node_lock);
@@ -169,9 +171,9 @@ static acpi_status acpi_bus_online_compa
 	mutex_lock(&device->physical_node_lock);
 
 	list_for_each_entry(pn, &device->physical_node_list, node)
-		if (pn->put_online) {
-			device_online(pn->dev);
-			pn->put_online = false;
+		if (pn->online_type) {
+			device_online(pn->dev, pn->online_type);
+			pn->online_type = 0;
 		}
 
 	mutex_unlock(&device->physical_node_lock);
Index: linux-pm/drivers/acpi/acpi_processor.c
===================================================================
--- linux-pm.orig/drivers/acpi/acpi_processor.c
+++ linux-pm/drivers/acpi/acpi_processor.c
@@ -395,7 +395,7 @@ static int __cpuinit acpi_processor_add(
 		goto err;
 
 	pr->dev = dev;
-	dev->offline = pr->flags.need_hotplug_init;
+	dev->online_type = !pr->flags.need_hotplug_init;
 
 	/* Trigger the processor driver's .probe() if present. */
 	if (device_attach(dev) >= 0)
Index: linux-pm/drivers/base/cpu.c
===================================================================
--- linux-pm.orig/drivers/base/cpu.c
+++ linux-pm/drivers/base/cpu.c
@@ -38,13 +38,16 @@ static void change_cpu_under_node(struct
 	cpu->node_id = to_nid;
 }
 
-static int __ref cpu_subsys_online(struct device *dev)
+static int __ref cpu_subsys_online(struct device *dev, unsigned int type)
 {
 	struct cpu *cpu = container_of(dev, struct cpu, dev);
 	int cpuid = dev->id;
 	int from_nid, to_nid;
 	int ret;
 
+	if (type > 1)
+		return -EINVAL;
+
 	cpu_hotplug_driver_lock();
 
 	from_nid = cpu_to_node(cpuid);


^ permalink raw reply	[flat|nested] 105+ messages in thread

* [PATCH 2/3 RFC] Driver core: Introduce types of device "online"
@ 2013-05-04  1:04       ` Rafael J. Wysocki
  0 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-04  1:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis, Len Brown, linux-mm

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

For memory blocks there are multiple ways in which they can be
"online" that determine what can be done with the given block.

For this reason, to allow the generic device_offline() and
device_online() to be used for devices representing memory
blocks, introduce a second "online type" argument for
device_online() that will be interpreted by the bus type whose
.online() callback is executed by device_online().

Of course, that requires some changes to be made in struct device
and struct bus_type, and the code related to device_online()
and device_offline() needs to be changed as well.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/acpi_processor.c |    2 +-
 drivers/acpi/scan.c           |   14 ++++++++------
 drivers/base/core.c           |   36 ++++++++++++++++++++----------------
 drivers/base/cpu.c            |    5 ++++-
 include/acpi/acpi_bus.h       |    2 +-
 include/linux/device.h        |    8 ++++----
 6 files changed, 38 insertions(+), 29 deletions(-)

Index: linux-pm/include/linux/device.h
===================================================================
--- linux-pm.orig/include/linux/device.h
+++ linux-pm/include/linux/device.h
@@ -108,7 +108,7 @@ struct bus_type {
 	int (*remove)(struct device *dev);
 	void (*shutdown)(struct device *dev);
 
-	int (*online)(struct device *dev);
+	int (*online)(struct device *dev, unsigned int type);
 	int (*offline)(struct device *dev);
 
 	int (*suspend)(struct device *dev, pm_message_t state);
@@ -656,7 +656,7 @@ struct acpi_dev_node {
  * 		gone away. This should be set by the allocator of the
  * 		device (i.e. the bus driver that discovered the device).
  * @offline_disabled: If set, the device is permanently online.
- * @offline:	Set after successful invocation of bus type's .offline().
+ * @online_type: 0 if the device is offline, otherwise bus type dependent.
  *
  * At the lowest level, every device in a Linux system is represented by an
  * instance of struct device. The device structure contains the information
@@ -730,8 +730,8 @@ struct device {
 	void	(*release)(struct device *dev);
 	struct iommu_group	*iommu_group;
 
+	unsigned int 		online_type;
 	bool			offline_disabled:1;
-	bool			offline:1;
 };
 
 static inline struct device *kobj_to_dev(struct kobject *kobj)
@@ -876,7 +876,7 @@ static inline bool device_supports_offli
 extern void lock_device_hotplug(void);
 extern void unlock_device_hotplug(void);
 extern int device_offline(struct device *dev);
-extern int device_online(struct device *dev);
+extern int device_online(struct device *dev, unsigned int type);
 /*
  * Root device objects for grouping under /sys/devices
  */
Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -406,10 +406,10 @@ static struct device_attribute uevent_at
 static ssize_t show_online(struct device *dev, struct device_attribute *attr,
 			   char *buf)
 {
-	bool val;
+	unsigned int val;
 
 	lock_device_hotplug();
-	val = !dev->offline;
+	val = dev->online_type;
 	unlock_device_hotplug();
 	return sprintf(buf, "%u\n", val);
 }
@@ -417,15 +417,15 @@ static ssize_t show_online(struct device
 static ssize_t store_online(struct device *dev, struct device_attribute *attr,
 			    const char *buf, size_t count)
 {
-	bool val;
+	unsigned int val;
 	int ret;
 
-	ret = strtobool(buf, &val);
+	ret = kstrtouint(buf, 10, &val);
 	if (ret < 0)
 		return ret;
 
 	lock_device_hotplug();
-	ret = val ? device_online(dev) : device_offline(dev);
+	ret = val ? device_online(dev, val) : device_offline(dev);
 	unlock_device_hotplug();
 	return ret < 0 ? ret : count;
 }
@@ -1488,7 +1488,7 @@ static int device_check_offline(struct d
 	if (ret)
 		return ret;
 
-	return device_supports_offline(dev) && !dev->offline ? -EBUSY : 0;
+	return device_supports_offline(dev) && !!dev->online_type ? -EBUSY : 0;
 }
 
 /**
@@ -1515,14 +1515,14 @@ int device_offline(struct device *dev)
 
 	device_lock(dev);
 	if (device_supports_offline(dev)) {
-		if (dev->offline) {
-			ret = 1;
-		} else {
+		if (dev->online_type) {
 			ret = dev->bus->offline(dev);
 			if (!ret) {
 				kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
-				dev->offline = true;
+				dev->online_type = 0;
 			}
+		} else {
+			ret = 1;
 		}
 	}
 	device_unlock(dev);
@@ -1533,6 +1533,7 @@ int device_offline(struct device *dev)
 /**
  * device_online - Put the device back online after successful device_offline().
  * @dev: Device to be put back online.
+ * @type: Interpreted by the bus type, must be nonzero.
  *
  * If device_offline() has been successfully executed for @dev, but the device
  * has not been removed subsequently, execute its bus type's .online() callback
@@ -1540,20 +1541,23 @@ int device_offline(struct device *dev)
  *
  * Call under device_hotplug_lock.
  */
-int device_online(struct device *dev)
+int device_online(struct device *dev, unsigned int type)
 {
 	int ret = 0;
 
+	if (!type)
+		return -EINVAL;
+
 	device_lock(dev);
 	if (device_supports_offline(dev)) {
-		if (dev->offline) {
-			ret = dev->bus->online(dev);
+		if (dev->online_type) {
+			ret = 1;
+		} else {
+			ret = dev->bus->online(dev, type);
 			if (!ret) {
 				kobject_uevent(&dev->kobj, KOBJ_ONLINE);
-				dev->offline = false;
+				dev->online_type = type;
 			}
-		} else {
-			ret = 1;
 		}
 	}
 	device_unlock(dev);
Index: linux-pm/include/acpi/acpi_bus.h
===================================================================
--- linux-pm.orig/include/acpi/acpi_bus.h
+++ linux-pm/include/acpi/acpi_bus.h
@@ -286,7 +286,7 @@ struct acpi_device_physical_node {
 	u8 node_id;
 	struct list_head node;
 	struct device *dev;
-	bool put_online:1;
+	unsigned int online_type;
 };
 
 /* set maximum of physical nodes to 32 for expansibility */
Index: linux-pm/drivers/acpi/scan.c
===================================================================
--- linux-pm.orig/drivers/acpi/scan.c
+++ linux-pm/drivers/acpi/scan.c
@@ -141,15 +141,17 @@ static acpi_status acpi_bus_offline_comp
 	list_for_each_entry(pn, &device->physical_node_list, node) {
 		int ret;
 
+		pn->online_type = pn->dev->online_type;
 		ret = device_offline(pn->dev);
-		if (acpi_force_hot_remove)
+		if (acpi_force_hot_remove) {
+			pn->online_type = 0;
 			continue;
-
+		}
 		if (ret < 0) {
+			pn->online_type = 0;
 			status = AE_ERROR;
 			break;
 		}
-		pn->put_online = !ret;
 	}
 
 	mutex_unlock(&device->physical_node_lock);
@@ -169,9 +171,9 @@ static acpi_status acpi_bus_online_compa
 	mutex_lock(&device->physical_node_lock);
 
 	list_for_each_entry(pn, &device->physical_node_list, node)
-		if (pn->put_online) {
-			device_online(pn->dev);
-			pn->put_online = false;
+		if (pn->online_type) {
+			device_online(pn->dev, pn->online_type);
+			pn->online_type = 0;
 		}
 
 	mutex_unlock(&device->physical_node_lock);
Index: linux-pm/drivers/acpi/acpi_processor.c
===================================================================
--- linux-pm.orig/drivers/acpi/acpi_processor.c
+++ linux-pm/drivers/acpi/acpi_processor.c
@@ -395,7 +395,7 @@ static int __cpuinit acpi_processor_add(
 		goto err;
 
 	pr->dev = dev;
-	dev->offline = pr->flags.need_hotplug_init;
+	dev->online_type = !pr->flags.need_hotplug_init;
 
 	/* Trigger the processor driver's .probe() if present. */
 	if (device_attach(dev) >= 0)
Index: linux-pm/drivers/base/cpu.c
===================================================================
--- linux-pm.orig/drivers/base/cpu.c
+++ linux-pm/drivers/base/cpu.c
@@ -38,13 +38,16 @@ static void change_cpu_under_node(struct
 	cpu->node_id = to_nid;
 }
 
-static int __ref cpu_subsys_online(struct device *dev)
+static int __ref cpu_subsys_online(struct device *dev, unsigned int type)
 {
 	struct cpu *cpu = container_of(dev, struct cpu, dev);
 	int cpuid = dev->id;
 	int from_nid, to_nid;
 	int ret;
 
+	if (type > 1)
+		return -EINVAL;
+
 	cpu_hotplug_driver_lock();
 
 	from_nid = cpu_to_node(cpuid);
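
For readers following the scan.c hunks above, the offline/rollback
bookkeeping can be modelled in user space roughly as below.  This is only
a sketch under stated assumptions: struct node, node_offline(),
offline_all() and online_all() are hypothetical stand-ins for the
physical-node list, device_offline() and the two ACPI walkers, and both
locking and the acpi_force_hot_remove path are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one "physical" companion device. */
struct node {
	unsigned int online_type;	/* 0 means offline, as in the patch */
	unsigned int saved_type;	/* models pn->online_type */
	int busy;			/* nonzero: offlining must fail */
};

/* Models device_offline(): <0 on error, 1 if already offline, else 0. */
static int node_offline(struct node *n)
{
	if (!n->online_type)
		return 1;
	if (n->busy)
		return -EBUSY;
	n->online_type = 0;
	return 0;
}

/* Offline every node, remembering how each one was online before. */
static int offline_all(struct node *nodes, size_t count)
{
	size_t i;

	assert(nodes != NULL);
	for (i = 0; i < count; i++) {
		nodes[i].saved_type = nodes[i].online_type;
		if (node_offline(&nodes[i]) < 0) {
			/* This node stayed online, so there is nothing to undo. */
			nodes[i].saved_type = 0;
			return -1;
		}
	}
	return 0;
}

/* Undo offline_all(): bring back only the nodes it took offline. */
static void online_all(struct node *nodes, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (nodes[i].saved_type) {
			nodes[i].online_type = nodes[i].saved_type;
			nodes[i].saved_type = 0;
		}
	}
}
```

A node that was already offline keeps saved_type == 0, so the rollback
pass leaves it alone, which is the same effect the removed put_online
flag had.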


^ permalink raw reply	[flat|nested] 105+ messages in thread

* [PATCH 3/3 RFC] Driver core: Introduce offline/online callbacks for memory blocks
  2013-05-04  1:01     ` Rafael J. Wysocki
@ 2013-05-04  1:06       ` Rafael J. Wysocki
  -1 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-04  1:06 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis, Len Brown, linux-mm

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Introduce .offline() and .online() callbacks for memory_subsys
that will allow the generic device_offline() and device_online()
to be used with device objects representing memory blocks.  That,
in turn, allows the ACPI subsystem to use device_offline() to put
removable memory blocks offline, if possible, before removing
memory modules holding them.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/memory.c          |   84 ++++++++++++++++++++++++++++++-----------
 include/linux/memory_hotplug.h |    2 
 2 files changed, 64 insertions(+), 22 deletions(-)

Index: linux-pm/drivers/base/memory.c
===================================================================
--- linux-pm.orig/drivers/base/memory.c
+++ linux-pm/drivers/base/memory.c
@@ -37,9 +37,14 @@ static inline int base_memory_block_id(i
 	return section_nr / sections_per_block;
 }
 
+static int memory_subsys_online(struct device *dev, unsigned int type);
+static int memory_subsys_offline(struct device *dev);
+
 static struct bus_type memory_subsys = {
 	.name = MEMORY_CLASS_NAME,
 	.dev_name = MEMORY_CLASS_NAME,
+	.online = memory_subsys_online,
+	.offline = memory_subsys_offline,
 };
 
 static BLOCKING_NOTIFIER_HEAD(memory_chain);
@@ -294,16 +299,7 @@ static int __memory_block_change_state(s
 	}
 
 	mem->state = to_state;
-	switch (mem->state) {
-	case MEM_OFFLINE:
-		kobject_uevent(&mem->dev.kobj, KOBJ_OFFLINE);
-		break;
-	case MEM_ONLINE:
-		kobject_uevent(&mem->dev.kobj, KOBJ_ONLINE);
-		break;
-	default:
-		break;
-	}
+
 out:
 	return ret;
 }
@@ -321,27 +317,66 @@ static int memory_block_change_state(str
 
 	return ret;
 }
+
+static int memory_subsys_online(struct device *dev, unsigned int type)
+{
+	struct memory_block *mem = container_of(dev, struct memory_block, dev);
+
+	if (type < ONLINE_KEEP || type > ONLINE_MOVABLE)
+		return -EINVAL;
+
+	return memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE, type);
+}
+
+static int memory_block_online(struct device *dev, unsigned int type)
+{
+	int ret = memory_subsys_online(dev, type);
+
+	if (!ret) {
+		dev->online_type = type;
+		kobject_uevent(&dev->kobj, KOBJ_ONLINE);
+	}
+
+	return ret;
+}
+
+static int memory_subsys_offline(struct device *dev)
+{
+	struct memory_block *mem = container_of(dev, struct memory_block, dev);
+
+	return memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
+}
+
+static int memory_block_offline(struct device *dev)
+{
+	int ret = memory_subsys_offline(dev);
+
+	if (!ret) {
+		dev->online_type = 0;
+		kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
+	}
+
+	return ret;
+}
+
 static ssize_t
 store_mem_state(struct device *dev,
 		struct device_attribute *attr, const char *buf, size_t count)
 {
-	struct memory_block *mem;
 	int ret = -EINVAL;
 
-	mem = container_of(dev, struct memory_block, dev);
+	lock_device_hotplug();
 
 	if (!strncmp(buf, "online_kernel", min_t(int, count, 13)))
-		ret = memory_block_change_state(mem, MEM_ONLINE,
-						MEM_OFFLINE, ONLINE_KERNEL);
+		ret = memory_block_online(dev, ONLINE_KERNEL);
 	else if (!strncmp(buf, "online_movable", min_t(int, count, 14)))
-		ret = memory_block_change_state(mem, MEM_ONLINE,
-						MEM_OFFLINE, ONLINE_MOVABLE);
+		ret = memory_block_online(dev, ONLINE_MOVABLE);
 	else if (!strncmp(buf, "online", min_t(int, count, 6)))
-		ret = memory_block_change_state(mem, MEM_ONLINE,
-						MEM_OFFLINE, ONLINE_KEEP);
+		ret = memory_block_online(dev, ONLINE_KEEP);
 	else if(!strncmp(buf, "offline", min_t(int, count, 7)))
-		ret = memory_block_change_state(mem, MEM_OFFLINE,
-						MEM_ONLINE, -1);
+		ret = memory_block_offline(dev);
+
+	unlock_device_hotplug();
 
 	if (ret)
 		return ret;
@@ -686,10 +721,17 @@ int offline_memory_block(struct memory_b
 {
 	int ret = 0;
 
+	lock_device_hotplug();
 	mutex_lock(&mem->state_mutex);
-	if (mem->state != MEM_OFFLINE)
+	if (mem->state != MEM_OFFLINE) {
 		ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
+		if (!ret) {
+			mem->dev.online_type = 0;
+			kobject_uevent(&mem->dev.kobj, KOBJ_OFFLINE);
+		}
+	}
 	mutex_unlock(&mem->state_mutex);
+	unlock_device_hotplug();
 
 	return ret;
 }
Index: linux-pm/include/linux/memory_hotplug.h
===================================================================
--- linux-pm.orig/include/linux/memory_hotplug.h
+++ linux-pm/include/linux/memory_hotplug.h
@@ -28,7 +28,7 @@ enum {
 
 /* Types for control the zone type of onlined memory */
 enum {
-	ONLINE_KEEP,
+	ONLINE_KEEP = 1,
 	ONLINE_KERNEL,
 	ONLINE_MOVABLE,
 };
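
As an illustration of how bus-type callbacks like the ones added here are
reached from the generic code, below is a small user-space model of the
device_online()/device_offline() dispatch with the "type" argument.  It
is a sketch, not the kernel code: the struct layouts, supports_offline(),
the model_* names and demo_bus are hypothetical, and locking, uevents and
the child-device check are omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct device;

/* Hypothetical miniature of struct bus_type with the two callbacks. */
struct bus_type {
	int (*online)(struct device *dev, unsigned int type);
	int (*offline)(struct device *dev);
};

struct device {
	const struct bus_type *bus;
	unsigned int online_type;	/* 0 = offline, nonzero = bus-defined */
};

/* Assumption: offline support means the bus provides both callbacks. */
static int supports_offline(const struct device *dev)
{
	return dev->bus && dev->bus->online && dev->bus->offline;
}

/* Returns <0 on error, 1 if the device was already offline, else 0. */
static int model_device_offline(struct device *dev)
{
	int ret = 0;

	assert(dev != NULL);
	if (supports_offline(dev)) {
		if (dev->online_type) {
			ret = dev->bus->offline(dev);
			if (!ret)
				dev->online_type = 0;
		} else {
			ret = 1;
		}
	}
	return ret;
}

/* Returns <0 on error, 1 if the device was already online, else 0. */
static int model_device_online(struct device *dev, unsigned int type)
{
	int ret = 0;

	assert(dev != NULL);
	if (!type)
		return -EINVAL;	/* 0 is reserved for "offline" */

	if (supports_offline(dev)) {
		if (dev->online_type) {
			ret = 1;
		} else {
			ret = dev->bus->online(dev, type);
			if (!ret)
				dev->online_type = type;
		}
	}
	return ret;
}

/* Example bus accepting types 1..3, like the three memory online types. */
static int demo_online(struct device *dev, unsigned int type)
{
	(void)dev;
	return (type < 1 || type > 3) ? -EINVAL : 0;
}

static int demo_offline(struct device *dev)
{
	(void)dev;
	return 0;
}

static const struct bus_type demo_bus = {
	.online = demo_online,
	.offline = demo_offline,
};
```

The point of starting the memory online types at 1 is visible here: a
zero online_type can then double as the "offline" marker without a
separate flag.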

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 105+ messages in thread

* [PATCH 0/2 v2, RFC] Driver core: Add offline/online callbacks for memory_subsys
  2013-05-04  1:01     ` Rafael J. Wysocki
@ 2013-05-04 11:11       ` Rafael J. Wysocki
  -1 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-04 11:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis, Len Brown, linux-mm

Hi,

On Saturday, May 04, 2013 03:01:23 AM Rafael J. Wysocki wrote:
> Hi,
> 
> This is a continuation of this patchset: https://lkml.org/lkml/2013/5/2/214
> and it applies on top of it or rather on top of the rebased version (with
> build problems fixed) in the bleeding-edge branch of the linux-pm.git tree:
> 
> http://git.kernel.org/cgit/linux/kernel/git/rafael/linux-pm.git/log/?h=bleeding-edge
> 
> An introduction to the first part of the patchset is below, a description of
> the current patches follows.

Actually, I'm withdrawing the previous version of this patchset (or rather
patches [2-3/3] from it), because I had a better idea in the meantime.

Patch [1/2] is the same as the previous [1/3] ->

> On Thursday, May 02, 2013 02:26:39 PM Rafael J. Wysocki wrote:
> > On Monday, April 29, 2013 02:23:59 PM Rafael J. Wysocki wrote:
> > > 
> > > It has been argued for a number of times that in some cases, if a device cannot
> > > be gracefully removed from the system, it shouldn't be removed from it at all,
> > > because that may lead to a kernel crash.  In particular, that will happen if a
> > > memory module holding kernel memory is removed, but also removing the last CPU
> > > in the system may not be a good idea.  [And I can imagine a few other cases
> > > like that.]
> > > 
> > > The kernel currently only supports "forced" hot-remove which cannot be stopped
> > > once started, so users have no choice but to try to hot-remove stuff and see
> > > whether or not that crashes the kernel which is kind of unpleasant.  That seems
> > > to be based on the "the user knows better" argument according to which users
> > > triggering device hot-removal should really know what they are doing, so the
> > > kernel doesn't have to worry about that.  However, for instance, this pretty
> > > much isn't the case for memory modules, because the users have no way to see
> > > whether or not any kernel memory has been allocated from a given module.
> > > 
> > > There have been a few attempts to address this issue, but none of them has
> > > gained broader acceptance.  The following 3 patches are the heart of a new
> > > proposal which is based on the idea to introduce device_offline() and
> > > device_online() operations along the lines of the existing CPU offline/online
> > > mechanism (or, rather, to extend the CPU offline/online so that analogous
> > > operations are available for other devices).  The way it is supposed to work is
> > > that device_offline() will fail if the given device cannot be gracefully
> > > removed from the system (in the kernel's view).  Once it succeeds, though, the
> > > device won't be used any more until either it is removed, or device_online() is
> > > run for it.  That will allow the ACPI device hot-remove code, for one example,
> > > to avoid triggering a non-reversible removal procedure for devices that cannot
> > > be removed gracefully.
> > > 
> > > Patch [1/3] introduces device_offline() and device_online() as outlined above.
> > > The .offline() and .online() callbacks are only added at the bus type level for
> > > now, because that should be sufficient to cover the memory and CPU use cases.
> > 
> > That's [1/4] now and the changes from the previous version are:
> > - strtobool() is used in store_online().
> > - device_offline_lock has been renamed to device_hotplug_lock (and the
> >   functions operating it accordingly) following the Toshi's advice.
> > 
> > > Patch [2/3] modifies the CPU hotplug support code to use device_offline() and
> > > device_online() to support the sysfs 'online' attribute for CPUs.
> > 
> > That is [2/4] now and it takes cpu_hotplug_driver_lock() around cpu_up() and
> > cpu_down().
> > 
> > > Patch [3/3] changes the ACPI device hot-remove code to use device_offline()
> > > for checking if graceful removal of devices is possible.  The way it does that
> > > is to walk the list of "physical" companion devices for each struct acpi_device
> > > involved in the operation and call device_offline() for each of them.  If any
> > > of the device_offline() calls fails (and the hot-removal is not "forced", which
> > > is an option), the removal procedure (which is not reversible) is simply not
> > > carried out.
> > 
> > That's current [3/4].  It's a bit simpler, because I decided that it would be
> > better to have a global 'force_remove' attribute (the semantics of the
> > per-profile 'force_remove' wasn't clear and it didn't really add any value over
> > a global one).  I also added lock/unlock_device_hotplug() around acpi_bus_scan()
> > in acpi_scan_bus_device_check() to allow scan handlers to update dev->offline
> > for "physical" companion devices safely (the processor's one added by the next
> > patch actually does that).
> > 
> > > Of some concern is that device_offline() (and possibly device_online()) is
> > > called under physical_node_lock of the corresponding struct acpi_device, which
> > > introduces ordering dependency between that lock and device locks for the
> > > "physical" devices, but I didn't see any cleaner way to do that (I guess it
> > > is avoidable at the expense of added complexity, but for now it's just better
> > > to make the code as clean as possible IMO).
> > 
> > Patch [4/4] reworks the ACPI processor driver to use the common hotplug code.
> > It basically splits the driver into two parts as described in the changelog,
> > where the first part is essentially a scan handler and the second part is
> > a driver, but it doesn't bind to struct acpi_device any more.  Instead, it
> > binds to processor devices under /sys/devices/system/cpu/ (the driver itself
> > has a sysfs directory under /sys/bus/cpu/drivers/ which IMHO makes more sense
> > than having it under /sys/bus/acpi/drivers/).
> > 
> > The patch at https://patchwork.kernel.org/patch/2506371/ is a prerequisite
> > for this series, but I'm going to push it for v3.10-rc2 if no one screams
> > bloody murder.

-> (this is [1/2] now):

> Patch [1/3] in the current series uses acpi_bind_one() to associate memory
> block devices with ACPI namespace objects representing memory modules that hold
> them.  With patch [3/3] that will allow the ACPI core's device hot-remove code
> to attempt to offline the memory blocks, if possible, before removing the
> modules holding them from the system (and if the offlining fails, the removal
> will not be carried out).

Patch [2/2] adds .online() and .offline() callbacks to memory_subsys
that are used by the common "online" sysfs attribute and by the ACPI core's
hot-remove code, through device_online() and device_offline().

The way it is supposed to work is that device_offline() will attempt to put
memory blocks offline, and device_online() will bring them back online,
attempting to apply the online type that was last used for them.
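
One way to picture the "last online type" behaviour described above is
the user-space sketch below.  It is not taken from the patches: struct
memblk, its fields and the memblk_* helpers are hypothetical, and falling
back to ONLINE_KEEP for a block that has never been onlined is an
assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

enum { ONLINE_KEEP = 1, ONLINE_KERNEL, ONLINE_MOVABLE };

/* Hypothetical memory block that remembers how it was last onlined. */
struct memblk {
	int online;			/* current state */
	unsigned int last_type;		/* type used by the latest online */
};

/* Online with an explicit type, e.g. requested through sysfs. */
static int memblk_online_as(struct memblk *mem, unsigned int type)
{
	assert(mem != NULL);
	if (mem->online)
		return 1;		/* already online, nothing to do */
	mem->online = 1;
	mem->last_type = type;
	return 0;
}

/* Type-less online: reuse the last type, or ONLINE_KEEP if none yet. */
static int memblk_online(struct memblk *mem)
{
	unsigned int type;

	assert(mem != NULL);
	type = mem->last_type ? mem->last_type : ONLINE_KEEP;
	return memblk_online_as(mem, type);
}

/* Offline the block; last_type is deliberately left untouched. */
static int memblk_offline(struct memblk *mem)
{
	assert(mem != NULL);
	if (!mem->online)
		return 1;		/* already offline */
	mem->online = 0;
	return 0;
}
```

With this arrangement a block onlined as movable, taken offline for a
removal attempt that fails, and then onlined again ends up movable again.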

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


^ permalink raw reply	[flat|nested] 105+ messages in thread

* [PATCH 1/2 v2, RFC] ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes
  2013-05-04 11:11       ` Rafael J. Wysocki
@ 2013-05-04 11:12         ` Rafael J. Wysocki
  -1 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-04 11:12 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis, Len Brown, linux-mm

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

During ACPI memory hotplug configuration, bind memory blocks residing
in modules removable through the standard ACPI mechanism to struct
acpi_device objects associated with ACPI namespace objects
representing those modules.  Accordingly, unbind those memory blocks
from the struct acpi_device objects when the memory modules in
question are being removed.

When an "offline" operation for devices representing memory blocks is
introduced, this will allow the ACPI core's device hot-remove code to
use it to carry out remove_memory() for those memory blocks and check
the results of that before it actually removes the modules holding
them from the system.

Since walk_memory_range() is used for accessing all memory blocks
corresponding to a given ACPI namespace object, it is exported from
memory_hotplug.c so that the code in acpi_memhotplug.c can use it.
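
The bind-all-or-unbind-all use of a range walker described above can be
sketched in user space as follows.  This is a model under assumptions,
not the kernel implementation: struct block, the blocks[] array,
walk_blocks() and the bind/unbind helpers are hypothetical, and the
walker is assumed to stop at the first nonzero callback result and hand
it back to the caller.

```c
#include <assert.h>
#include <stddef.h>

#define NBLOCKS 8

/* Hypothetical memory block: only records whether it is bound. */
struct block {
	int bound;
	int fail_bind;	/* test hook: make binding this block fail */
};

static struct block blocks[NBLOCKS];

/*
 * Call func for every block in [start, end).  Stop at the first nonzero
 * return value and pass it back; return 0 if every call succeeded.
 */
static int walk_blocks(size_t start, size_t end, void *arg,
		       int (*func)(struct block *, void *))
{
	size_t i;

	for (i = start; i < end && i < NBLOCKS; i++) {
		int ret = func(&blocks[i], arg);

		if (ret)
			return ret;
	}
	return 0;
}

static int bind_block(struct block *b, void *arg)
{
	(void)arg;
	if (b->fail_bind)
		return -1;
	b->bound = 1;
	return 0;
}

static int unbind_block(struct block *b, void *arg)
{
	(void)arg;
	b->bound = 0;
	return 0;
}

/* Bind the whole range, or leave none of it bound on failure. */
static int bind_range(size_t start, size_t end)
{
	assert(start <= end);
	if (walk_blocks(start, end, NULL, bind_block)) {
		walk_blocks(start, end, NULL, unbind_block);
		return -1;
	}
	return 0;
}
```

Because the unbind callback always returns 0, the rollback walk visits
the whole range even though the bind walk stopped early.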

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/acpi_memhotplug.c |   53 ++++++++++++++++++++++++++++++++++++++---
 include/linux/memory_hotplug.h |    2 +
 mm/memory_hotplug.c            |    4 ++-
 3 files changed, 55 insertions(+), 4 deletions(-)

Index: linux-pm/mm/memory_hotplug.c
===================================================================
--- linux-pm.orig/mm/memory_hotplug.c
+++ linux-pm/mm/memory_hotplug.c
@@ -1618,6 +1618,7 @@ int offline_pages(unsigned long start_pf
 {
 	return __offline_pages(start_pfn, start_pfn + nr_pages, 120 * HZ);
 }
+#endif /* CONFIG_MEMORY_HOTREMOVE */
 
 /**
  * walk_memory_range - walks through all mem sections in [start_pfn, end_pfn)
@@ -1631,7 +1632,7 @@ int offline_pages(unsigned long start_pf
  *
  * Returns the return value of func.
  */
-static int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
+int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
 		void *arg, int (*func)(struct memory_block *, void *))
 {
 	struct memory_block *mem = NULL;
@@ -1668,6 +1669,7 @@ static int walk_memory_range(unsigned lo
 	return 0;
 }
 
+#ifdef CONFIG_MEMORY_HOTREMOVE
 /**
  * offline_memory_block_cb - callback function for offlining memory block
  * @mem: the memory block to be offlined
Index: linux-pm/include/linux/memory_hotplug.h
===================================================================
--- linux-pm.orig/include/linux/memory_hotplug.h
+++ linux-pm/include/linux/memory_hotplug.h
@@ -245,6 +245,8 @@ static inline int is_mem_section_removab
 static inline void try_offline_node(int nid) {}
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 
+extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
+		void *arg, int (*func)(struct memory_block *, void *));
 extern int mem_online_node(int nid);
 extern int add_memory(int nid, u64 start, u64 size);
 extern int arch_add_memory(int nid, u64 start, u64 size);
Index: linux-pm/drivers/acpi/acpi_memhotplug.c
===================================================================
--- linux-pm.orig/drivers/acpi/acpi_memhotplug.c
+++ linux-pm/drivers/acpi/acpi_memhotplug.c
@@ -28,6 +28,7 @@
  */
 
 #include <linux/acpi.h>
+#include <linux/memory.h>
 #include <linux/memory_hotplug.h>
 
 #include "internal.h"
@@ -166,13 +167,50 @@ static int acpi_memory_check_device(stru
 	return 0;
 }
 
+static unsigned long acpi_meminfo_start_pfn(struct acpi_memory_info *info)
+{
+	return PFN_DOWN(info->start_addr);
+}
+
+static unsigned long acpi_meminfo_end_pfn(struct acpi_memory_info *info)
+{
+	return PFN_UP(info->start_addr + info->length - 1);
+}
+
+static int acpi_bind_memblk(struct memory_block *mem, void *arg)
+{
+	return acpi_bind_one(&mem->dev, (acpi_handle)arg);
+}
+
+static int acpi_bind_memory_blocks(struct acpi_memory_info *info,
+				   acpi_handle handle)
+{
+	return walk_memory_range(acpi_meminfo_start_pfn(info),
+				 acpi_meminfo_end_pfn(info), (void *)handle,
+				 acpi_bind_memblk);
+}
+
+static int acpi_unbind_memblk(struct memory_block *mem, void *arg)
+{
+	acpi_unbind_one(&mem->dev);
+	return 0;
+}
+
+static void acpi_unbind_memory_blocks(struct acpi_memory_info *info,
+				      acpi_handle handle)
+{
+	walk_memory_range(acpi_meminfo_start_pfn(info),
+			  acpi_meminfo_end_pfn(info), NULL, acpi_unbind_memblk);
+}
+
 static int acpi_memory_enable_device(struct acpi_memory_device *mem_device)
 {
+	acpi_handle handle = mem_device->device->handle;
 	int result, num_enabled = 0;
 	struct acpi_memory_info *info;
 	int node;
 
-	node = acpi_get_node(mem_device->device->handle);
+	node = acpi_get_node(handle);
 	/*
 	 * Tell the VM there is more memory here...
 	 * Note: Assume that this function returns zero on success
@@ -203,6 +241,12 @@ static int acpi_memory_enable_device(str
 		if (result && result != -EEXIST)
 			continue;
 
+		result = acpi_bind_memory_blocks(info, handle);
+		if (result) {
+			acpi_unbind_memory_blocks(info, handle);
+			return -ENODEV;
+		}
+
 		info->enabled = 1;
 
 		/*
@@ -229,10 +273,11 @@ static int acpi_memory_enable_device(str
 
 static int acpi_memory_remove_memory(struct acpi_memory_device *mem_device)
 {
+	acpi_handle handle = mem_device->device->handle;
 	int result = 0, nid;
 	struct acpi_memory_info *info, *n;
 
-	nid = acpi_get_node(mem_device->device->handle);
+	nid = acpi_get_node(handle);
 
 	list_for_each_entry_safe(info, n, &mem_device->res_list, list) {
 		if (!info->enabled)
@@ -240,6 +285,8 @@ static int acpi_memory_remove_memory(str
 
 		if (nid < 0)
 			nid = memory_add_physaddr_to_nid(info->start_addr);
+
+		acpi_unbind_memory_blocks(info, handle);
 		result = remove_memory(nid, info->start_addr, info->length);
 		if (result)
 			return result;
@@ -300,7 +347,7 @@ static int acpi_memory_device_add(struct
 	if (result) {
 		dev_err(&device->dev, "acpi_memory_enable_device() error\n");
 		acpi_memory_device_free(mem_device);
-		return -ENODEV;
+		return result;
 	}
 
 	dev_dbg(&device->dev, "Memory device configured by ACPI\n");

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 105+ messages in thread

* [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
  2013-05-04 11:11       ` Rafael J. Wysocki
@ 2013-05-04 11:21         ` Rafael J. Wysocki
  -1 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-04 11:21 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis, Len Brown, linux-mm

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Introduce .offline() and .online() callbacks for memory_subsys
that will allow the generic device_offline() and device_online()
to be used with device objects representing memory blocks.  That,
in turn, allows the ACPI subsystem to use device_offline() to put
removable memory blocks offline, if possible, before removing
memory modules holding them.

The 'online' sysfs attribute of memory block devices will attempt to
put them offline if 0 is written to it and will attempt to apply the
previously used online type when onlining them (i.e. when 1 is
written to it).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/memory.c  |  105 +++++++++++++++++++++++++++++++++++++------------
 include/linux/memory.h |    1 
 2 files changed, 81 insertions(+), 25 deletions(-)

Index: linux-pm/drivers/base/memory.c
===================================================================
--- linux-pm.orig/drivers/base/memory.c
+++ linux-pm/drivers/base/memory.c
@@ -37,9 +37,14 @@ static inline int base_memory_block_id(i
 	return section_nr / sections_per_block;
 }
 
+static int memory_subsys_online(struct device *dev);
+static int memory_subsys_offline(struct device *dev);
+
 static struct bus_type memory_subsys = {
 	.name = MEMORY_CLASS_NAME,
 	.dev_name = MEMORY_CLASS_NAME,
+	.online = memory_subsys_online,
+	.offline = memory_subsys_offline,
 };
 
 static BLOCKING_NOTIFIER_HEAD(memory_chain);
@@ -278,33 +283,64 @@ static int __memory_block_change_state(s
 {
 	int ret = 0;
 
-	if (mem->state != from_state_req) {
-		ret = -EINVAL;
-		goto out;
-	}
+	if (mem->state != from_state_req)
+		return -EINVAL;
 
 	if (to_state == MEM_OFFLINE)
 		mem->state = MEM_GOING_OFFLINE;
 
 	ret = memory_block_action(mem->start_section_nr, to_state, online_type);
-
 	if (ret) {
 		mem->state = from_state_req;
-		goto out;
+	} else {
+		mem->state = to_state;
+		if (to_state == MEM_ONLINE)
+			mem->last_online = online_type;
 	}
+	return ret;
+}
 
-	mem->state = to_state;
-	switch (mem->state) {
-	case MEM_OFFLINE:
-		kobject_uevent(&mem->dev.kobj, KOBJ_OFFLINE);
-		break;
-	case MEM_ONLINE:
-		kobject_uevent(&mem->dev.kobj, KOBJ_ONLINE);
-		break;
-	default:
-		break;
+static int memory_subsys_online(struct device *dev)
+{
+	struct memory_block *mem = container_of(dev, struct memory_block, dev);
+	int ret;
+
+	mutex_lock(&mem->state_mutex);
+	ret = __memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE,
+					  mem->last_online);
+	mutex_unlock(&mem->state_mutex);
+	return ret;
+}
+
+static int memory_subsys_offline(struct device *dev)
+{
+	struct memory_block *mem = container_of(dev, struct memory_block, dev);
+	int ret;
+
+	mutex_lock(&mem->state_mutex);
+	ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
+	mutex_unlock(&mem->state_mutex);
+	return ret;
+}
+
+static int __memory_block_change_state_uevent(struct memory_block *mem,
+		unsigned long to_state, unsigned long from_state_req,
+		int online_type)
+{
+	int ret = __memory_block_change_state(mem, to_state, from_state_req,
+					      online_type);
+	if (!ret) {
+		switch (mem->state) {
+		case MEM_OFFLINE:
+			kobject_uevent(&mem->dev.kobj, KOBJ_OFFLINE);
+			break;
+		case MEM_ONLINE:
+			kobject_uevent(&mem->dev.kobj, KOBJ_ONLINE);
+			break;
+		default:
+			break;
+		}
 	}
-out:
 	return ret;
 }
 
@@ -315,8 +351,8 @@ static int memory_block_change_state(str
 	int ret;
 
 	mutex_lock(&mem->state_mutex);
-	ret = __memory_block_change_state(mem, to_state, from_state_req,
-					  online_type);
+	ret = __memory_block_change_state_uevent(mem, to_state, from_state_req,
+						 online_type);
 	mutex_unlock(&mem->state_mutex);
 
 	return ret;
@@ -326,22 +362,34 @@ store_mem_state(struct device *dev,
 		struct device_attribute *attr, const char *buf, size_t count)
 {
 	struct memory_block *mem;
+	bool offline;
 	int ret = -EINVAL;
 
 	mem = container_of(dev, struct memory_block, dev);
 
-	if (!strncmp(buf, "online_kernel", min_t(int, count, 13)))
+	lock_device_hotplug();
+
+	if (!strncmp(buf, "online_kernel", min_t(int, count, 13))) {
+		offline = false;
 		ret = memory_block_change_state(mem, MEM_ONLINE,
 						MEM_OFFLINE, ONLINE_KERNEL);
-	else if (!strncmp(buf, "online_movable", min_t(int, count, 14)))
+	} else if (!strncmp(buf, "online_movable", min_t(int, count, 14))) {
+		offline = false;
 		ret = memory_block_change_state(mem, MEM_ONLINE,
 						MEM_OFFLINE, ONLINE_MOVABLE);
-	else if (!strncmp(buf, "online", min_t(int, count, 6)))
+	} else if (!strncmp(buf, "online", min_t(int, count, 6))) {
+		offline = false;
 		ret = memory_block_change_state(mem, MEM_ONLINE,
 						MEM_OFFLINE, ONLINE_KEEP);
-	else if(!strncmp(buf, "offline", min_t(int, count, 7)))
+	} else if (!strncmp(buf, "offline", min_t(int, count, 7))) {
+		offline = true;
 		ret = memory_block_change_state(mem, MEM_OFFLINE,
 						MEM_ONLINE, -1);
+	}
+	if (!ret)
+		dev->offline = offline;
+
+	unlock_device_hotplug();
 
 	if (ret)
 		return ret;
@@ -563,6 +611,7 @@ static int init_memory_block(struct memo
 			base_memory_block_id(scn_nr) * sections_per_block;
 	mem->end_section_nr = mem->start_section_nr + sections_per_block - 1;
 	mem->state = state;
+	mem->last_online = ONLINE_KEEP;
 	mem->section_count++;
 	mutex_init(&mem->state_mutex);
 	start_pfn = section_nr_to_pfn(mem->start_section_nr);
@@ -686,10 +735,16 @@ int offline_memory_block(struct memory_b
 {
 	int ret = 0;
 
+	lock_device_hotplug();
 	mutex_lock(&mem->state_mutex);
-	if (mem->state != MEM_OFFLINE)
-		ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
+	if (mem->state != MEM_OFFLINE) {
+		ret = __memory_block_change_state_uevent(mem, MEM_OFFLINE,
+							 MEM_ONLINE, -1);
+		if (!ret)
+			mem->dev.offline = true;
+	}
 	mutex_unlock(&mem->state_mutex);
+	unlock_device_hotplug();
 
 	return ret;
 }
Index: linux-pm/include/linux/memory.h
===================================================================
--- linux-pm.orig/include/linux/memory.h
+++ linux-pm/include/linux/memory.h
@@ -26,6 +26,7 @@ struct memory_block {
 	unsigned long start_section_nr;
 	unsigned long end_section_nr;
 	unsigned long state;
+	int last_online;
 	int section_count;
 
 	/*


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 0/2 v2, RFC] Driver core: Add offline/online callbacks for memory_subsys
  2013-05-04 11:11       ` Rafael J. Wysocki
@ 2013-05-06 10:48         ` Rafael J. Wysocki
  -1 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-06 10:48 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis, Len Brown, linux-mm

On Saturday, May 04, 2013 01:11:21 PM Rafael J. Wysocki wrote:
> Hi,
> 
> On Saturday, May 04, 2013 03:01:23 AM Rafael J. Wysocki wrote:
> > Hi,
> > 
> > This is a continuation of this patchset: https://lkml.org/lkml/2013/5/2/214
> > and it applies on top of it or rather on top of the rebased version (with
> > build problems fixed) in the bleeding-edge branch of the linux-pm.git tree:
> > 
> > http://git.kernel.org/cgit/linux/kernel/git/rafael/linux-pm.git/log/?h=bleeding-edge
> > 
> > An introduction to the first part of the patchset is below, a description of
> > the current patches follows.
> 
> Actually, I'm withdrawing the previous version of this patchset (or rather
> patches [2-3/3] from it), because I had a better idea in the meantime.
> 
> Patch [1/2] is the same as the previous [1/3] ->
> 
> > On Thursday, May 02, 2013 02:26:39 PM Rafael J. Wysocki wrote:
> > > On Monday, April 29, 2013 02:23:59 PM Rafael J. Wysocki wrote:
> > > > 
> > > > It has been argued for a number of times that in some cases, if a device cannot
> > > > be gracefully removed from the system, it shouldn't be removed from it at all,
> > > > because that may lead to a kernel crash.  In particular, that will happen if a
> > > > memory module holding kernel memory is removed, but also removing the last CPU
> > > > in the system may not be a good idea.  [And I can imagine a few other cases
> > > > like that.]
> > > > 
> > > > The kernel currently only supports "forced" hot-remove which cannot be stopped
> > > > once started, so users have no choice but to try to hot-remove stuff and see
> > > > whether or not that crashes the kernel which is kind of unpleasant.  That seems
> > > > to be based on the "the user knows better" argument according to which users
> > > > triggering device hot-removal should really know what they are doing, so the
> > > > kernel doesn't have to worry about that.  However, for instance, this pretty
> > > > much isn't the case for memory modules, because the users have no way to see
> > > > whether or not any kernel memory has been allocated from a given module.
> > > > 
> > > > There have been a few attempts to address this issue, but none of them has
> > > > gained broader acceptance.  The following 3 patches are the heart of a new
> > > > proposal which is based on the idea to introduce device_offline() and
> > > > device_online() operations along the lines of the existing CPU offline/online
> > > > mechanism (or, rather, to extend the CPU offline/online so that analogous
> > > > operations are available for other devices).  The way it is supposed to work is
> > > > that device_offline() will fail if the given device cannot be gracefully
> > > > removed from the system (in the kernel's view).  Once it succeeds, though, the
> > > > device won't be used any more until either it is removed, or device_online() is
> > > > run for it.  That will allow the ACPI device hot-remove code, for one example,
> > > > to avoid triggering a non-reversible removal procedure for devices that cannot
> > > > be removed gracefully.
> > > > 
> > > > Patch [1/3] introduces device_offline() and device_online() as outlined above.
> > > > The .offline() and .online() callbacks are only added at the bus type level for
> > > > now, because that should be sufficient to cover the memory and CPU use cases.
> > > 
> > > That's [1/4] now and the changes from the previous version are:
> > > - strtobool() is used in store_online().
> > > - device_offline_lock has been renamed to device_hotplug_lock (and the
> > >   functions operating it accordingly) following the Toshi's advice.
> > > 
> > > > Patch [2/3] modifies the CPU hotplug support code to use device_offline() and
> > > > device_online() to support the sysfs 'online' attribute for CPUs.
> > > 
> > > That is [2/4] now and it takes cpu_hotplug_driver_lock() around cpu_up() and
> > > cpu_down().
> > > 
> > > > Patch [3/3] changes the ACPI device hot-remove code to use device_offline()
> > > > for checking if graceful removal of devices is possible.  The way it does that
> > > > is to walk the list of "physical" companion devices for each struct acpi_device
> > > > involved in the operation and call device_offline() for each of them.  If any
> > > > of the device_offline() calls fails (and the hot-removal is not "forced", which
> > > > is an option), the removal procedure (which is not reversible) is simply not
> > > > carried out.
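
The check described in the quoted paragraph above can be pictured with a small
user-space model.  This is only a sketch, not the actual ACPI code: struct
model_dev, model_device_offline() and model_hot_remove() are hypothetical names
made up for the illustration, -EBUSY is an arbitrary error code chosen for the
model, and locking as well as any re-onlining of devices that were already put
offline are deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <errno.h>

/* Hypothetical stand-in for a "physical" companion device. */
struct model_dev {
	bool can_offline;	/* would a graceful offline succeed? */
	bool offline;		/* current state */
};

/* Model of an offline operation: fails if graceful offlining is impossible. */
static int model_device_offline(struct model_dev *dev)
{
	if (dev->offline)
		return 0;
	if (!dev->can_offline)
		return -EBUSY;	/* assumption: error code picked for the model */
	dev->offline = true;
	return 0;
}

/*
 * Try to offline every companion device.  Returns 0 if the (irreversible)
 * removal may proceed, or the first error if it must not be carried out.
 */
static int model_hot_remove(struct model_dev *devs, size_t n, bool force)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int ret = model_device_offline(&devs[i]);

		if (ret && !force)
			return ret;	/* removal is not carried out */
	}
	return 0;	/* everything is offline, or removal was forced */
}
```

With one device that cannot be offlined in the list, model_hot_remove()
returns an error unless force is set, which is the behavior the paragraph
describes for the non-"forced" case.
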
> > > 
> > > That's current [3/4].  It's a bit simpler, because I decided that it would be
> > > better to have a global 'force_remove' attribute (the semantics of the
> > > per-profile 'force_remove' wasn't clear and it didn't really add any value over
> > > a global one).  I also added lock/unlock_device_hotplug() around acpi_bus_scan()
> > > in acpi_scan_bus_device_check() to allow scan handlers to update dev->offline
> > > for "physical" companion devices safely (the processor's one added by the next
> > > patch actually does that).
> > > 
> > > > Of some concern is that device_offline() (and possibly device_online()) is
> > > > called under physical_node_lock of the corresponding struct acpi_device, which
> > > > introduces ordering dependency between that lock and device locks for the
> > > > "physical" devices, but I didn't see any cleaner way to do that (I guess it
> > > > is avoidable at the expense of added complexity, but for now it's just better
> > > > to make the code as clean as possible IMO).
> > > 
> > > Patch [4/4] reworks the ACPI processor driver to use the common hotplug code.
> > > It basically splits the driver into two parts as described in the changelog,
> > > where the first part is essentially a scan handler and the second part is
> > > a driver, but it doesn't bind to struct acpi_device any more.  Instead, it
> > > binds to processor devices under /sys/devices/system/cpu/ (the driver itself
> > > has a sysfs directory under /sys/bus/cpu/drivers/ which IMHO makes more sense
> > > than having it under /sys/bus/acpi/drivers/).
> > > 
> > > The patch at https://patchwork.kernel.org/patch/2506371/ is a prerequisite
> > > for this series, but I'm going to push it for v3.10-rc2 if no one screams
> > > bloody murder.
> 
> -> (this is [1/2] now):
> 
> > Patch [1/3] in the current series uses acpi_bind_one() to associate memory
> > block devices with ACPI namespace objects representing memory modules that hold
> > them.  With patch [3/3] that will allow the ACPI core's device hot-remove code
> > to attempt to offline the memory blocks, if possible, before removing the
> > modules holding them from the system (and if the offlining fails, the removal
> > will not be carried out).
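
A stand-alone model of the walk-and-callback pattern used for the binding may
help here.  It is a sketch under simplifying assumptions, not the kernel code:
blocks are indexed by small integers rather than PFNs, and model_block,
model_walk_range(), model_bind(), model_unbind() and model_bind_range() are
hypothetical names.  As in the patch, a failing bind causes the whole range to
be unbound again before the error is returned.

```c
#include <stddef.h>
#include <errno.h>

#define MODEL_NR_BLOCKS 8

/* Hypothetical memory block: records which handle it is bound to. */
struct model_block {
	void *handle;	/* NULL when unbound */
	int fail_bind;	/* set to force a bind failure in this model */
};

static struct model_block model_blocks[MODEL_NR_BLOCKS];

/* Call func for every block in [start, end); stop at the first error. */
static int model_walk_range(size_t start, size_t end, void *arg,
			    int (*func)(struct model_block *, void *))
{
	size_t i;

	for (i = start; i < end && i < MODEL_NR_BLOCKS; i++) {
		int ret = func(&model_blocks[i], arg);

		if (ret)
			return ret;
	}
	return 0;
}

static int model_bind(struct model_block *blk, void *arg)
{
	if (blk->fail_bind)
		return -ENODEV;
	blk->handle = arg;
	return 0;
}

static int model_unbind(struct model_block *blk, void *arg)
{
	(void)arg;	/* unused, kept to match the callback signature */
	blk->handle = NULL;
	return 0;
}

/* Bind all blocks in the range; undo everything if any bind fails. */
static int model_bind_range(size_t start, size_t end, void *handle)
{
	int ret = model_walk_range(start, end, handle, model_bind);

	if (ret)
		model_walk_range(start, end, NULL, model_unbind);
	return ret;
}
```

The point of passing the handle through the void *arg parameter is that the
walker itself stays generic and only the callbacks know what the argument
means.
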
> 
> Patch [2/2] adds .online() and .offline() callbacks to memory_subsys
> that are used by the common "online" sysfs attribute and by the ACPI core's
> hot-remove code, through device_online() and device_offline().
> 
> The way it is supposed to work is that device_offline() will attempt to put
> memory blocks offline and device_online() will online them and attempt to
> apply the last online type previously used to them.

I forgot to mention that patch [2/2] was (lightly) tested.  Unfortunately,
I don't have the hardware (or an emulator) allowing me to test patch [1/2].
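
To make the intended semantics of patch [2/2] more concrete, here is a minimal
user-space model of the state handling.  It is only a sketch: the model_*
names are hypothetical, and the model assumes that the underlying
online/offline action always succeeds, which the real code cannot assume.

```c
#include <errno.h>

enum model_state { MODEL_OFFLINE, MODEL_ONLINE };
enum model_online_type { MODEL_KEEP, MODEL_KERNEL, MODEL_MOVABLE };

/* Hypothetical memory block with the remembered online type. */
struct model_memblk {
	enum model_state state;
	enum model_online_type last_online;
};

static void model_init(struct model_memblk *mem, enum model_state state)
{
	mem->state = state;
	mem->last_online = MODEL_KEEP;	/* default until a type is used */
}

/*
 * Change the state only if the block is currently in 'from'.  On a
 * successful transition to online, remember the online type used.
 */
static int model_change_state(struct model_memblk *mem, enum model_state to,
			      enum model_state from,
			      enum model_online_type type)
{
	if (mem->state != from)
		return -EINVAL;
	mem->state = to;
	if (to == MODEL_ONLINE)
		mem->last_online = type;
	return 0;
}

/* Generic "online": reuse whatever online type was applied last time. */
static int model_online(struct model_memblk *mem)
{
	return model_change_state(mem, MODEL_ONLINE, MODEL_OFFLINE,
				  mem->last_online);
}

/* Generic "offline": the online type is irrelevant here. */
static int model_offline(struct model_memblk *mem)
{
	return model_change_state(mem, MODEL_OFFLINE, MODEL_ONLINE,
				  MODEL_KEEP);
}
```

So a block that was last onlined as "movable", then put offline, comes back
as "movable" when it is onlined again through the generic path, and a second
offline request on an already offline block fails with -EINVAL.
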

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
  2013-05-04 11:21         ` Rafael J. Wysocki
@ 2013-05-06 16:28           ` Vasilis Liaskovitis
  -1 siblings, 0 replies; 105+ messages in thread
From: Vasilis Liaskovitis @ 2013-05-06 16:28 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg Kroah-Hartman, Toshi Kani, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, Len Brown, linux-mm

Hi,

On Sat, May 04, 2013 at 01:21:16PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Introduce .offline() and .online() callbacks for memory_subsys
> that will allow the generic device_offline() and device_online()
> to be used with device objects representing memory blocks.  That,
> in turn, allows the ACPI subsystem to use device_offline() to put
> removable memory blocks offline, if possible, before removing
> memory modules holding them.
> 
> The 'online' sysfs attribute of memory block devices will attempt to
> put them offline if 0 is written to it and will attempt to apply the
> previously used online type when onlining them (i.e. when 1 is
> written to it).
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/base/memory.c  |  105 +++++++++++++++++++++++++++++++++++++------------
>  include/linux/memory.h |    1 
>  2 files changed, 81 insertions(+), 25 deletions(-)
>
[...]

> @@ -686,10 +735,16 @@ int offline_memory_block(struct memory_b
>  {
>  	int ret = 0;
>  
> +	lock_device_hotplug();
>  	mutex_lock(&mem->state_mutex);
> -	if (mem->state != MEM_OFFLINE)
> -		ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
> +	if (mem->state != MEM_OFFLINE) {
> +		ret = __memory_block_change_state_uevent(mem, MEM_OFFLINE,
> +							 MEM_ONLINE, -1);
> +		if (!ret)
> +			mem->dev.offline = true;
> +	}
>  	mutex_unlock(&mem->state_mutex);
> +	unlock_device_hotplug();

(Testing with qemu...)
offline_memory_block is called from remove_memory, which in turn is called from
acpi_memory_device_remove (detach operation) during acpi_bus_trim. We already
hold the device_hotplug lock when we trim (acpi_scan_hot_remove), so we
don't need to lock/unlock_device_hotplug in offline_memory_block.
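
A user-space sketch of why the extra locking is a problem (not kernel code; all toy_* names are hypothetical, and the lock is modeled as a plain flag so that the self-deadlock shows up as an error return instead of a hang):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a non-recursive lock such as device_hotplug_lock. */
struct toy_lock { bool held; };

/* Returns false where a real mutex would block forever on re-acquisition. */
static bool toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void toy_unlock(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

enum toy_state { TOY_ONLINE, TOY_OFFLINE };
struct toy_block { enum toy_state state; };

/* Variant that takes the lock itself, as in the quoted hunk. */
static int toy_offline_block_locking(struct toy_lock *l, struct toy_block *b)
{
	if (!toy_trylock(l))
		return -1;	/* a real mutex would deadlock here */
	b->state = TOY_OFFLINE;
	toy_unlock(l);
	return 0;
}

/* Variant that relies on the caller already holding the lock. */
static int toy_offline_block_nolock(struct toy_block *b)
{
	b->state = TOY_OFFLINE;
	return 0;
}

/* Hot-remove path: takes the lock once, then offlines while holding it. */
static int toy_hot_remove(struct toy_lock *l, struct toy_block *b, bool callee_locks)
{
	int ret;

	if (!toy_trylock(l))
		return -1;
	ret = callee_locks ? toy_offline_block_locking(l, b)
			   : toy_offline_block_nolock(b);
	toy_unlock(l);
	return ret;
}
```

With callee_locks set, toy_hot_remove() fails and the block stays online; with it cleared, the block goes offline and the lock is released exactly once.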

A more general issue is that there are now two memory offlining efforts:

1) from acpi_bus_offline_companions during device offline
2) from mm: remove_memory during device detach (offline_memory_block_cb)

The 2nd is only called if the device offline operation was already successful, so
it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
(unless the blocks were re-onlined in between).
On the other hand, the 2nd effort has some more intelligence in offlining, as it
tries to offline twice in the presence of memcg, see commits df3e1b91 or
reworked 0baeab16. Maybe we need to consolidate the logic.

remove_memory is called from device_detach, during trim that can't fail, so it
should not fail. However this function can still fail in 2 cases:
- offline_memory_block_cb
- is_memblock_offlined_cb
in the case of re-onlined memblocks in between device-offline and device detach.
This seems possible I think, since we do not hold lock_memory_hotplug for the
duration of the hot-remove operation.

thanks,

- Vasilis

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
  2013-05-04 11:21         ` Rafael J. Wysocki
@ 2013-05-06 17:20           ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 105+ messages in thread
From: Greg Kroah-Hartman @ 2013-05-06 17:20 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis, Len Brown, linux-mm

On Sat, May 04, 2013 at 01:21:16PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Introduce .offline() and .online() callbacks for memory_subsys
> that will allow the generic device_offline() and device_online()
> to be used with device objects representing memory blocks.  That,
> in turn, allows the ACPI subsystem to use device_offline() to put
> removable memory blocks offline, if possible, before removing
> memory modules holding them.
> 
> The 'online' sysfs attribute of memory block devices will attempt to
> put them offline if 0 is written to it and will attempt to apply the
> previously used online type when onlining them (i.e. when 1 is
> written to it).
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/base/memory.c  |  105 +++++++++++++++++++++++++++++++++++++------------
>  include/linux/memory.h |    1 
>  2 files changed, 81 insertions(+), 25 deletions(-)

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
  2013-05-06 17:20           ` Greg Kroah-Hartman
@ 2013-05-06 19:46             ` Rafael J. Wysocki
  -1 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-06 19:46 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis, Len Brown, linux-mm

On Monday, May 06, 2013 10:20:44 AM Greg Kroah-Hartman wrote:
> On Sat, May 04, 2013 at 01:21:16PM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Introduce .offline() and .online() callbacks for memory_subsys
> > that will allow the generic device_offline() and device_online()
> > to be used with device objects representing memory blocks.  That,
> > in turn, allows the ACPI subsystem to use device_offline() to put
> > removable memory blocks offline, if possible, before removing
> > memory modules holding them.
> > 
> > The 'online' sysfs attribute of memory block devices will attempt to
> > put them offline if 0 is written to it and will attempt to apply the
> > previously used online type when onlining them (i.e. when 1 is
> > written to it).
> > 
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  drivers/base/memory.c  |  105 +++++++++++++++++++++++++++++++++++++------------
> >  include/linux/memory.h |    1 
> >  2 files changed, 81 insertions(+), 25 deletions(-)
> 
> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Thanks!

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
  2013-05-06 16:28           ` Vasilis Liaskovitis
@ 2013-05-07  0:59             ` Rafael J. Wysocki
  -1 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-07  0:59 UTC (permalink / raw)
  To: Vasilis Liaskovitis
  Cc: Greg Kroah-Hartman, Toshi Kani, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, Len Brown, linux-mm

On Monday, May 06, 2013 06:28:12 PM Vasilis Liaskovitis wrote:
> Hi,
> 
> On Sat, May 04, 2013 at 01:21:16PM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Introduce .offline() and .online() callbacks for memory_subsys
> > that will allow the generic device_offline() and device_online()
> > to be used with device objects representing memory blocks.  That,
> > in turn, allows the ACPI subsystem to use device_offline() to put
> > removable memory blocks offline, if possible, before removing
> > memory modules holding them.
> > 
> > The 'online' sysfs attribute of memory block devices will attempt to
> > put them offline if 0 is written to it and will attempt to apply the
> > previously used online type when onlining them (i.e. when 1 is
> > written to it).
> > 
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  drivers/base/memory.c  |  105 +++++++++++++++++++++++++++++++++++++------------
> >  include/linux/memory.h |    1 
> >  2 files changed, 81 insertions(+), 25 deletions(-)
> >
> [...]
> 
> > @@ -686,10 +735,16 @@ int offline_memory_block(struct memory_b
> >  {
> >  	int ret = 0;
> >  
> > +	lock_device_hotplug();
> >  	mutex_lock(&mem->state_mutex);
> > -	if (mem->state != MEM_OFFLINE)
> > -		ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
> > +	if (mem->state != MEM_OFFLINE) {
> > +		ret = __memory_block_change_state_uevent(mem, MEM_OFFLINE,
> > +							 MEM_ONLINE, -1);
> > +		if (!ret)
> > +			mem->dev.offline = true;
> > +	}
> >  	mutex_unlock(&mem->state_mutex);
> > +	unlock_device_hotplug();
> 
> (Testing with qemu...)

Thanks!

> offline_memory_block is called from remove_memory, which in turn is called from
> acpi_memory_device_remove (detach operation) during acpi_bus_trim. We already
> hold the device_hotplug lock when we trim (acpi_scan_hot_remove), so we
> don't need to lock/unlock_device_hotplug in offline_memory_block.

Indeed.

First, it looks like offline_memory_block_cb() is the only place calling
offline_memory_block(), is that right?  I'm wondering if it would make
sense to use device_offline() in there and remove offline_memory_block()
entirely?

Second, if you ran into this issue during testing, that would mean that patch
[1/2] actually worked for you, which would be nice. :-)  Was that really the
case?

> A more general issue is that there are now two memory offlining efforts:
> 
> 1) from acpi_bus_offline_companions during device offline
> 2) from mm: remove_memory during device detach (offline_memory_block_cb)
> 
> The 2nd is only called if the device offline operation was already successful, so
> it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> (unless the blocks were re-onlined in between).

Sure, and that should be OK for now.  Changing the detach behavior is not
essential from the patch [2/2] perspective, we can do it later.

> On the other hand, the 2nd effort has some more intelligence in offlining, as it
> tries to offline twice in the presence of memcg, see commits df3e1b91 or
> reworked 0baeab16. Maybe we need to consolidate the logic.

Hmm.  Perhaps it would make sense to implement that logic in
memory_subsys_offline(), then?
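
Just to sketch what "offline twice" could look like if it were moved into one place (user-space C, not kernel code; toy_offline_with_retry() and toy_flaky_offline() are hypothetical names, and the retry policy is an assumption on my part rather than what the commits mentioned above actually do):

```c
#include <assert.h>
#include <errno.h>

typedef int (*toy_offline_fn)(void *ctx);

/* Calls try_offline() up to max_attempts times, stopping at the first success. */
static int toy_offline_with_retry(toy_offline_fn try_offline, void *ctx,
				  int max_attempts)
{
	int ret = -EINVAL;	/* returned as-is if max_attempts < 1 */
	int i;

	assert(try_offline);
	for (i = 0; i < max_attempts; i++) {
		ret = try_offline(ctx);
		if (ret == 0)
			break;
	}
	return ret;
}

/* Toy callback: fails with -EBUSY while *remaining_failures is positive. */
static int toy_flaky_offline(void *ctx)
{
	int *remaining_failures = ctx;

	if (*remaining_failures > 0) {
		(*remaining_failures)--;
		return -EBUSY;
	}
	return 0;
}
```

A block that fails once and then succeeds is handled with max_attempts == 2, while one that keeps failing still returns the last error to the caller, so the removal can be aborted.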

> remove_memory is called from device_detach, during trim that can't fail, so it
> should not fail. However this function can still fail in 2 cases:
> - offline_memory_block_cb
> - is_memblock_offlined_cb
> in the case of re-onlined memblocks in between device-offline and device detach.
> This seems possible I think, since we do not hold lock_memory_hotplug for the
> duration of the hot-remove operation.

But we do hold device_hotplug_lock, so every code path that may race with
acpi_scan_hot_remove() needs to take device_hotplug_lock as well.  Now, the
question is whether or not there are any code paths like that calling one of
the two functions above without holding device_hotplug_lock.
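
A deterministic user-space model of that race (not kernel code; the toy_* names are hypothetical, and the "racing" onliner is simply called in the middle of the hot-remove sequence instead of running in another thread):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct toy_system {
	bool hotplug_lock_held;	/* models device_hotplug_lock */
	bool block_online;
};

/*
 * An onlining path.  If it honors the hotplug lock, it cannot proceed while
 * hot-remove is in progress (real code would sleep instead of failing).
 */
static int toy_online_block(struct toy_system *s, bool takes_hotplug_lock)
{
	assert(s);
	if (takes_hotplug_lock && s->hotplug_lock_held)
		return -EBUSY;
	s->block_online = true;
	return 0;
}

/* Hot-remove: offline, then a concurrent onliner shows up, then detach. */
static int toy_hot_remove(struct toy_system *s, bool onliner_takes_lock)
{
	int ret = 0;

	assert(s);
	s->hotplug_lock_held = true;
	s->block_online = false;			/* offline step */
	(void)toy_online_block(s, onliner_takes_lock);	/* the racing path */
	if (s->block_online)
		ret = -EBUSY;	/* detach finds the block online again */
	s->hotplug_lock_held = false;
	return ret;
}
```

If the onliner takes the lock, the detach step always sees the block offline; if it does not, the detach step can fail even though the earlier offline succeeded.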

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
  2013-05-07  0:59             ` Rafael J. Wysocki
@ 2013-05-07 10:59               ` Vasilis Liaskovitis
  -1 siblings, 0 replies; 105+ messages in thread
From: Vasilis Liaskovitis @ 2013-05-07 10:59 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg Kroah-Hartman, Toshi Kani, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, Len Brown, linux-mm, wency

Hi,

On Tue, May 07, 2013 at 02:59:05AM +0200, Rafael J. Wysocki wrote:
> On Monday, May 06, 2013 06:28:12 PM Vasilis Liaskovitis wrote:
> > On Sat, May 04, 2013 at 01:21:16PM +0200, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > 
> > > Introduce .offline() and .online() callbacks for memory_subsys
> > > that will allow the generic device_offline() and device_online()
> > > to be used with device objects representing memory blocks.  That,
> > > in turn, allows the ACPI subsystem to use device_offline() to put
> > > removable memory blocks offline, if possible, before removing
> > > memory modules holding them.
> > > 
> > > The 'online' sysfs attribute of memory block devices will attempt to
> > > put them offline if 0 is written to it and will attempt to apply the
> > > previously used online type when onlining them (i.e. when 1 is
> > > written to it).
> > > 
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > ---
> > >  drivers/base/memory.c  |  105 +++++++++++++++++++++++++++++++++++++------------
> > >  include/linux/memory.h |    1 
> > >  2 files changed, 81 insertions(+), 25 deletions(-)
> > >
> > [...]
> > 
> > > @@ -686,10 +735,16 @@ int offline_memory_block(struct memory_b
> > >  {
> > >  	int ret = 0;
> > >  
> > > +	lock_device_hotplug();
> > >  	mutex_lock(&mem->state_mutex);
> > > -	if (mem->state != MEM_OFFLINE)
> > > -		ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
> > > +	if (mem->state != MEM_OFFLINE) {
> > > +		ret = __memory_block_change_state_uevent(mem, MEM_OFFLINE,
> > > +							 MEM_ONLINE, -1);
> > > +		if (!ret)
> > > +			mem->dev.offline = true;
> > > +	}
> > >  	mutex_unlock(&mem->state_mutex);
> > > +	unlock_device_hotplug();
> > 
> > (Testing with qemu...)
> 
> Thanks!
> 
> > offline_memory_block is called from remove_memory, which in turn is called from
> > acpi_memory_device_remove (detach operation) during acpi_bus_trim. We already
> > hold the device_hotplug lock when we trim (acpi_scan_hot_remove), so we
> > don't need to lock/unlock_device_hotplug in offline_memory_block.
> 
> Indeed.
> 
> First, it looks like offline_memory_block_cb() is the only place calling
> offline_memory_block(), is that right?  I'm wondering if it would make

correct.

> sense to use device_offline() in there and remove offline_memory_block()
> entirely?

possibly. Not sure if we can get hold of the struct device from
mm/memory_hotplug.c, maybe we still need the helper function that operates
directly on the memory block.

> 
> Second, if you ran into this issue during testing, that would mean that patch
> [1/2] actually worked for you, which would be nice. :-)  Was that really the
> case?

Yes, the patchset works fine once the extra lock/unlock_device_hotplug is
removed. For various DIMM hot-remove operations, I saw either successful
offlining and removal, or failed offlining and aborted removal.
You can add this to 1/2 (or, once the extra lock is removed, to 2/2 as well):

Tested-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>

> 
> > A more general issue is that there are now two memory offlining efforts:
> > 
> > 1) from acpi_bus_offline_companions during device offline
> > 2) from mm: remove_memory during device detach (offline_memory_block_cb)
> > 
> > The 2nd is only called if the device offline operation was already successful, so
> > it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> > (unless the blocks were re-onlined in between).
> 
> Sure, and that should be OK for now.  Changing the detach behavior is not
> essential from the patch [2/2] perspective, we can do it later.

yes, ok.

> 
> > On the other hand, the 2nd effort has some more intelligence in offlining, as it
> > tries to offline twice in the presence of memcg, see commits df3e1b91 or
> > reworked 0baeab16. Maybe we need to consolidate the logic.
> 
> Hmm.  Perhaps it would make sense to implement that logic in
> memory_subsys_offline(), then?

the logic tries to offline the memory blocks of the device twice, because the
first memory block might be storing information for the subsequent memblocks.

memory_subsys_offline operates on one memory block at a time. Perhaps we can get
the same effect if we do an acpi_walk of acpi_bus_offline_companions twice in
acpi_scan_hot_remove, but that is probably not a good idea, since it would
affect non-memory devices as well.

I am not sure how important this intelligence is in practice (I am not using
mem cgroups in my guest kernel tests yet).  Maybe Wen (original author) has
more details on 2-pass offlining effectiveness.

> 
> > remove_memory is called from device_detach, during trim that can't fail, so it
> > should not fail. However this function can still fail in 2 cases:
> > - offline_memory_block_cb
> > - is_memblock_offlined_cb
> > in the case of re-onlined memblocks in between device-offline and device detach.
> > This seems possible I think, since we do not hold lock_memory_hotplug for the
> > duration of the hot-remove operation.
> 
> But we do hold device_hotplug_lock, so every code path that may race with
> acpi_scan_hot_remove() needs to take device_hotplug_lock as well.  Now,
> the question is whether or not there are any code paths like that calling one of
> the two functions above without holding device_hotplug_lock?

I think you are right. The other code path I had in mind was userspace-initiated
online/offline operations from store_mem_state in drivers/base/memory.c, but we
take lock_device_hotplug in that case too, so it seems safe. If I find a problem
when stress testing the two paths simultaneously (or find another code path),
I'll update.

thanks,

- Vasilis

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
  2013-05-07 10:59               ` Vasilis Liaskovitis
@ 2013-05-07 12:11                 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-07 12:11 UTC (permalink / raw)
  To: Vasilis Liaskovitis
  Cc: Greg Kroah-Hartman, Toshi Kani, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, Len Brown, linux-mm, wency

On Tuesday, May 07, 2013 12:59:45 PM Vasilis Liaskovitis wrote:
> Hi,
> 
> On Tue, May 07, 2013 at 02:59:05AM +0200, Rafael J. Wysocki wrote:
> > On Monday, May 06, 2013 06:28:12 PM Vasilis Liaskovitis wrote:
> > > On Sat, May 04, 2013 at 01:21:16PM +0200, Rafael J. Wysocki wrote:
> > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > 
> > > > Introduce .offline() and .online() callbacks for memory_subsys
> > > > that will allow the generic device_offline() and device_online()
> > > > to be used with device objects representing memory blocks.  That,
> > > > in turn, allows the ACPI subsystem to use device_offline() to put
> > > > removable memory blocks offline, if possible, before removing
> > > > memory modules holding them.
> > > > 
> > > > The 'online' sysfs attribute of memory block devices will attempt to
> > > > put them offline if 0 is written to it and will attempt to apply the
> > > > previously used online type when onlining them (i.e. when 1 is
> > > > written to it).
> > > > 
> > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > ---
> > > >  drivers/base/memory.c  |  105 +++++++++++++++++++++++++++++++++++++------------
> > > >  include/linux/memory.h |    1 
> > > >  2 files changed, 81 insertions(+), 25 deletions(-)
> > > >
> > > [...]
> > > 
> > > > @@ -686,10 +735,16 @@ int offline_memory_block(struct memory_b
> > > >  {
> > > >  	int ret = 0;
> > > >  
> > > > +	lock_device_hotplug();
> > > >  	mutex_lock(&mem->state_mutex);
> > > > -	if (mem->state != MEM_OFFLINE)
> > > > -		ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
> > > > +	if (mem->state != MEM_OFFLINE) {
> > > > +		ret = __memory_block_change_state_uevent(mem, MEM_OFFLINE,
> > > > +							 MEM_ONLINE, -1);
> > > > +		if (!ret)
> > > > +			mem->dev.offline = true;
> > > > +	}
> > > >  	mutex_unlock(&mem->state_mutex);
> > > > +	unlock_device_hotplug();
> > > 
> > > (Testing with qemu...)
> > 
> > Thanks!
> > 
> > > offline_memory_block is called from remove_memory, which in turn is called from
> > > acpi_memory_device_remove (detach operation) during acpi_bus_trim. We already
> > > hold the device_hotplug lock when we trim (acpi_scan_hot_remove), so we
> > > don't need to lock/unlock_device_hotplug in offline_memory_block.
> > 
> > Indeed.
> > 
> > First, it looks like offline_memory_block_cb() is the only place calling
> > offline_memory_block(), is that right?  I'm wondering if it would make
> 
> correct.

Great!

> > sense to use device_offline() in there and remove offline_memory_block()
> > entirely?
> 
> possibly. Not sure if we can get hold of the struct device from
> mm/memory_hotplug.c, maybe we still need the helper function that operates
> directly on the memory block.

We can pass mem->dev to device_offline() and the locking should be fine.

> > Second, if you ran into this issue during testing, that would mean that patch
> > [1/2] actually worked for you, which would be nice. :-)  Was that really the
> > case?
> 
> Yes, the patchset works fine once the extra lock/unlock_device_hotplug is
> removed. For various DIMM hot-remove operations, I saw either successful
> offlining and removal, or failed offlining and aborted removal.
> You can add this to 1/2 (or, once the extra lock is removed, to 2/2 as well):
> 
> Tested-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>

Thanks!

The updated patch is appended for completeness.

> > 
> > > A more general issue is that there are now two memory offlining efforts:
> > > 
> > > 1) from acpi_bus_offline_companions during device offline
> > > 2) from mm: remove_memory during device detach (offline_memory_block_cb)
> > > 
> > > The 2nd is only called if the device offline operation was already successful, so
> > > it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> > > (unless the blocks were re-onlined in between).
> > 
> > Sure, and that should be OK for now.  Changing the detach behavior is not
> > essential from the patch [2/2] perspective, we can do it later.
> 
> yes, ok.
> 
> > 
> > > On the other hand, the 2nd effort has some more intelligence in offlining, as it
> > > tries to offline twice in the presence of memcg, see commits df3e1b91 or
> > > reworked 0baeab16. Maybe we need to consolidate the logic.
> > 
> > Hmm.  Perhaps it would make sense to implement that logic in
> > memory_subsys_offline(), then?
> 
> the logic tries to offline the memory blocks of the device twice, because the
> first memory block might be storing information for the subsequent memblocks.
> 
> memory_subsys_offline operates on one memory block at a time. Perhaps we can get
> the same effect if we do an acpi_walk of acpi_bus_offline_companions twice in
> acpi_scan_hot_remove but it's probably not a good idea, since that would
> affect non-memory devices as well. 
> 
> I am not sure how important this intelligence is in practice (I am not using
> mem cgroups in my guest kernel tests yet).  Maybe Wen (original author) has
> more details on 2-pass offlining effectiveness.

OK

It may be added in a separate patch in any case.

> > > remove_memory is called from device_detach, during trim that can't fail, so it
> > > should not fail. However this function can still fail in 2 cases:
> > > - offline_memory_block_cb
> > > - is_memblock_offlined_cb
> > > in the case of re-onlined memblocks in between device-offline and device detach.
> > > This seems possible I think, since we do not hold lock_memory_hotplug for the
> > > duration of the hot-remove operation.
> > 
> > But we do hold device_hotplug_lock, so every code path that may race with
> > acpi_scan_hot_remove() needs to take device_hotplug_lock as well.  Now,
> > the question is whether or not there are any code paths like that calling one of
> > the two functions above without holding device_hotplug_lock?
> 
> I think you are right. The other code path I had in mind was userspace-initiated
> online/offline operations from store_mem_state in drivers/base/memory.c, but we
> take lock_device_hotplug in that case too, so it seems safe. If I find a problem
> when stress testing the two paths simultaneously (or find another code path),
> I'll update.

OK

Thanks,
Rafael


---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: Driver core: Introduce offline/online callbacks for memory blocks

Introduce .offline() and .online() callbacks for memory_subsys
that will allow the generic device_offline() and device_online()
to be used with device objects representing memory blocks.  That,
in turn, allows the ACPI subsystem to use device_offline() to put
removable memory blocks offline, if possible, before removing
memory modules holding them.

The 'online' sysfs attribute of memory block devices will attempt to
put them offline if 0 is written to it and will attempt to apply the
previously used online type when onlining them (i.e. when 1 is
written to it).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/base/memory.c  |  105 +++++++++++++++++++++++++++++++++++++------------
 include/linux/memory.h |    1 
 2 files changed, 81 insertions(+), 25 deletions(-)

Index: linux-pm/drivers/base/memory.c
===================================================================
--- linux-pm.orig/drivers/base/memory.c
+++ linux-pm/drivers/base/memory.c
@@ -37,9 +37,14 @@ static inline int base_memory_block_id(i
 	return section_nr / sections_per_block;
 }
 
+static int memory_subsys_online(struct device *dev);
+static int memory_subsys_offline(struct device *dev);
+
 static struct bus_type memory_subsys = {
 	.name = MEMORY_CLASS_NAME,
 	.dev_name = MEMORY_CLASS_NAME,
+	.online = memory_subsys_online,
+	.offline = memory_subsys_offline,
 };
 
 static BLOCKING_NOTIFIER_HEAD(memory_chain);
@@ -278,33 +283,64 @@ static int __memory_block_change_state(s
 {
 	int ret = 0;
 
-	if (mem->state != from_state_req) {
-		ret = -EINVAL;
-		goto out;
-	}
+	if (mem->state != from_state_req)
+		return -EINVAL;
 
 	if (to_state == MEM_OFFLINE)
 		mem->state = MEM_GOING_OFFLINE;
 
 	ret = memory_block_action(mem->start_section_nr, to_state, online_type);
-
 	if (ret) {
 		mem->state = from_state_req;
-		goto out;
+	} else {
+		mem->state = to_state;
+		if (to_state == MEM_ONLINE)
+			mem->last_online = online_type;
 	}
+	return ret;
+}
 
-	mem->state = to_state;
-	switch (mem->state) {
-	case MEM_OFFLINE:
-		kobject_uevent(&mem->dev.kobj, KOBJ_OFFLINE);
-		break;
-	case MEM_ONLINE:
-		kobject_uevent(&mem->dev.kobj, KOBJ_ONLINE);
-		break;
-	default:
-		break;
+static int memory_subsys_online(struct device *dev)
+{
+	struct memory_block *mem = container_of(dev, struct memory_block, dev);
+	int ret;
+
+	mutex_lock(&mem->state_mutex);
+	ret = __memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE,
+					  mem->last_online);
+	mutex_unlock(&mem->state_mutex);
+	return ret;
+}
+
+static int memory_subsys_offline(struct device *dev)
+{
+	struct memory_block *mem = container_of(dev, struct memory_block, dev);
+	int ret;
+
+	mutex_lock(&mem->state_mutex);
+	ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
+	mutex_unlock(&mem->state_mutex);
+	return ret;
+}
+
+static int __memory_block_change_state_uevent(struct memory_block *mem,
+		unsigned long to_state, unsigned long from_state_req,
+		int online_type)
+{
+	int ret = __memory_block_change_state(mem, to_state, from_state_req,
+					      online_type);
+	if (!ret) {
+		switch (mem->state) {
+		case MEM_OFFLINE:
+			kobject_uevent(&mem->dev.kobj, KOBJ_OFFLINE);
+			break;
+		case MEM_ONLINE:
+			kobject_uevent(&mem->dev.kobj, KOBJ_ONLINE);
+			break;
+		default:
+			break;
+		}
 	}
-out:
 	return ret;
 }
 
@@ -315,8 +351,8 @@ static int memory_block_change_state(str
 	int ret;
 
 	mutex_lock(&mem->state_mutex);
-	ret = __memory_block_change_state(mem, to_state, from_state_req,
-					  online_type);
+	ret = __memory_block_change_state_uevent(mem, to_state, from_state_req,
+						 online_type);
 	mutex_unlock(&mem->state_mutex);
 
 	return ret;
@@ -326,22 +362,34 @@ store_mem_state(struct device *dev,
 		struct device_attribute *attr, const char *buf, size_t count)
 {
 	struct memory_block *mem;
+	bool offline;
 	int ret = -EINVAL;
 
 	mem = container_of(dev, struct memory_block, dev);
 
-	if (!strncmp(buf, "online_kernel", min_t(int, count, 13)))
+	lock_device_hotplug();
+
+	if (!strncmp(buf, "online_kernel", min_t(int, count, 13))) {
+		offline = false;
 		ret = memory_block_change_state(mem, MEM_ONLINE,
 						MEM_OFFLINE, ONLINE_KERNEL);
-	else if (!strncmp(buf, "online_movable", min_t(int, count, 14)))
+	} else if (!strncmp(buf, "online_movable", min_t(int, count, 14))) {
+		offline = false;
 		ret = memory_block_change_state(mem, MEM_ONLINE,
 						MEM_OFFLINE, ONLINE_MOVABLE);
-	else if (!strncmp(buf, "online", min_t(int, count, 6)))
+	} else if (!strncmp(buf, "online", min_t(int, count, 6))) {
+		offline = false;
 		ret = memory_block_change_state(mem, MEM_ONLINE,
 						MEM_OFFLINE, ONLINE_KEEP);
-	else if(!strncmp(buf, "offline", min_t(int, count, 7)))
+	} else if(!strncmp(buf, "offline", min_t(int, count, 7))) {
+		offline = true;
 		ret = memory_block_change_state(mem, MEM_OFFLINE,
 						MEM_ONLINE, -1);
+	}
+	if (!ret)
+		dev->offline = offline;
+
+	unlock_device_hotplug();
 
 	if (ret)
 		return ret;
@@ -563,6 +611,7 @@ static int init_memory_block(struct memo
 			base_memory_block_id(scn_nr) * sections_per_block;
 	mem->end_section_nr = mem->start_section_nr + sections_per_block - 1;
 	mem->state = state;
+	mem->last_online = ONLINE_KEEP;
 	mem->section_count++;
 	mutex_init(&mem->state_mutex);
 	start_pfn = section_nr_to_pfn(mem->start_section_nr);
@@ -681,14 +730,20 @@ int unregister_memory_section(struct mem
 
 /*
  * offline one memory block. If the memory block has been offlined, do nothing.
+ *
+ * Call under device_hotplug_lock.
  */
 int offline_memory_block(struct memory_block *mem)
 {
 	int ret = 0;
 
 	mutex_lock(&mem->state_mutex);
-	if (mem->state != MEM_OFFLINE)
-		ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
+	if (mem->state != MEM_OFFLINE) {
+		ret = __memory_block_change_state_uevent(mem, MEM_OFFLINE,
+							 MEM_ONLINE, -1);
+		if (!ret)
+			mem->dev.offline = true;
+	}
 	mutex_unlock(&mem->state_mutex);
 
 	return ret;
Index: linux-pm/include/linux/memory.h
===================================================================
--- linux-pm.orig/include/linux/memory.h
+++ linux-pm/include/linux/memory.h
@@ -26,6 +26,7 @@ struct memory_block {
 	unsigned long start_section_nr;
 	unsigned long end_section_nr;
 	unsigned long state;
+	int last_online;
 	int section_count;
 
 	/*


^ permalink raw reply	[flat|nested] 105+ messages in thread

 	int ret;
 
 	mutex_lock(&mem->state_mutex);
-	ret = __memory_block_change_state(mem, to_state, from_state_req,
-					  online_type);
+	ret = __memory_block_change_state_uevent(mem, to_state, from_state_req,
+						 online_type);
 	mutex_unlock(&mem->state_mutex);
 
 	return ret;
@@ -326,22 +362,34 @@ store_mem_state(struct device *dev,
 		struct device_attribute *attr, const char *buf, size_t count)
 {
 	struct memory_block *mem;
+	bool offline;
 	int ret = -EINVAL;
 
 	mem = container_of(dev, struct memory_block, dev);
 
-	if (!strncmp(buf, "online_kernel", min_t(int, count, 13)))
+	lock_device_hotplug();
+
+	if (!strncmp(buf, "online_kernel", min_t(int, count, 13))) {
+		offline = false;
 		ret = memory_block_change_state(mem, MEM_ONLINE,
 						MEM_OFFLINE, ONLINE_KERNEL);
-	else if (!strncmp(buf, "online_movable", min_t(int, count, 14)))
+	} else if (!strncmp(buf, "online_movable", min_t(int, count, 14))) {
+		offline = false;
 		ret = memory_block_change_state(mem, MEM_ONLINE,
 						MEM_OFFLINE, ONLINE_MOVABLE);
-	else if (!strncmp(buf, "online", min_t(int, count, 6)))
+	} else if (!strncmp(buf, "online", min_t(int, count, 6))) {
+		offline = false;
 		ret = memory_block_change_state(mem, MEM_ONLINE,
 						MEM_OFFLINE, ONLINE_KEEP);
-	else if(!strncmp(buf, "offline", min_t(int, count, 7)))
+	} else if(!strncmp(buf, "offline", min_t(int, count, 7))) {
+		offline = true;
 		ret = memory_block_change_state(mem, MEM_OFFLINE,
 						MEM_ONLINE, -1);
+	}
+	if (!ret)
+		dev->offline = offline;
+
+	unlock_device_hotplug();
 
 	if (ret)
 		return ret;
@@ -563,6 +611,7 @@ static int init_memory_block(struct memo
 			base_memory_block_id(scn_nr) * sections_per_block;
 	mem->end_section_nr = mem->start_section_nr + sections_per_block - 1;
 	mem->state = state;
+	mem->last_online = ONLINE_KEEP;
 	mem->section_count++;
 	mutex_init(&mem->state_mutex);
 	start_pfn = section_nr_to_pfn(mem->start_section_nr);
@@ -681,14 +730,20 @@ int unregister_memory_section(struct mem
 
 /*
  * offline one memory block. If the memory block has been offlined, do nothing.
+ *
+ * Call under device_hotplug_lock.
  */
 int offline_memory_block(struct memory_block *mem)
 {
 	int ret = 0;
 
 	mutex_lock(&mem->state_mutex);
-	if (mem->state != MEM_OFFLINE)
-		ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
+	if (mem->state != MEM_OFFLINE) {
+		ret = __memory_block_change_state_uevent(mem, MEM_OFFLINE,
+							 MEM_ONLINE, -1);
+		if (!ret)
+			mem->dev.offline = true;
+	}
 	mutex_unlock(&mem->state_mutex);
 
 	return ret;
Index: linux-pm/include/linux/memory.h
===================================================================
--- linux-pm.orig/include/linux/memory.h
+++ linux-pm/include/linux/memory.h
@@ -26,6 +26,7 @@ struct memory_block {
 	unsigned long start_section_nr;
 	unsigned long end_section_nr;
 	unsigned long state;
+	int last_online;
 	int section_count;
 
 	/*

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
  2013-05-07 12:11                 ` Rafael J. Wysocki
@ 2013-05-07 21:03                   ` Toshi Kani
  -1 siblings, 0 replies; 105+ messages in thread
From: Toshi Kani @ 2013-05-07 21:03 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Vasilis Liaskovitis, Greg Kroah-Hartman, ACPI Devel Maling List,
	LKML, isimatu.yasuaki, Len Brown, linux-mm, wency

On Tue, 2013-05-07 at 14:11 +0200, Rafael J. Wysocki wrote:
> On Tuesday, May 07, 2013 12:59:45 PM Vasilis Liaskovitis wrote:

 :

> Updated patch is appended for completeness.

Yes, this updated patch solved the locking issue.

> > > > A more general issue is that there are now two memory offlining efforts:
> > > > 
> > > > 1) from acpi_bus_offline_companions during device offline
> > > > 2) from mm: remove_memory during device detach (offline_memory_block_cb)
> > > > 
> > > > The 2nd is only called if the device offline operation was already successful, so
> > > > it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> > > > (unless the blocks were re-onlined in between).
> > > 
> > > Sure, and that should be OK for now.  Changing the detach behavior is not
> > > essential from the patch [2/2] perspective, we can do it later.
> > 
> > yes, ok.
> > 
> > > 
> > > > On the other hand, the 2nd effort has some more intelligence in offlining, as it
> > > > tries to offline twice in the presence of memcg, see commits df3e1b91 or
> > > > reworked 0baeab16. Maybe we need to consolidate the logic.
> > > 
> > > Hmm.  Perhaps it would make sense to implement that logic in
> > > memory_subsys_offline(), then?
> > 
> > the logic tries to offline the memory blocks of the device twice, because the
> > first memory block might be storing information for the subsequent memblocks.
> > 
> > memory_subsys_offline operates on one memory block at a time. Perhaps we can get
> > the same effect if we do an acpi_walk of acpi_bus_offline_companions twice in
> > acpi_scan_hot_remove but it's probably not a good idea, since that would
> > affect non-memory devices as well. 
> > 
> > I am not sure how important this intelligence is in practice (I am not using
> > mem cgroups in my guest kernel tests yet).  Maybe Wen (original author) has
> > more details on 2-pass offlining effectiveness.
> 
> OK
> 
> It may be added in a separate patch in any case.

I had the same comment as Vasilis.  And, I agree with you that we can
enhance it in separate patches.
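
The two-pass idea discussed above can be pictured with a small userspace
model.  This is only a sketch under stated assumptions, not kernel code:
struct block_model, meta_in, offline_one() and offline_pass() are
hypothetical names, and the "metadata lives in another block" rule is a
simplification of the memcg dependency mentioned in the quoted text.

```c
#include <errno.h>
#include <stdbool.h>

#define NBLOCKS 4

/* Hypothetical model of a memory block; not the kernel's struct. */
struct block_model {
	bool online;
	int meta_in;	/* index of the block holding this block's metadata, -1 if none */
};

/* A block cannot go offline while another online block keeps metadata in it. */
static int offline_one(struct block_model *blk, int idx)
{
	int i;

	if (!blk[idx].online)
		return 0;	/* already offline: nothing to do */

	for (i = 0; i < NBLOCKS; i++)
		if (i != idx && blk[i].online && blk[i].meta_in == idx)
			return -EBUSY;

	blk[idx].online = false;
	return 0;
}

/* One forward pass over all blocks; returns how many are still online. */
static int offline_pass(struct block_model *blk)
{
	int i, remaining = 0;

	for (i = 0; i < NBLOCKS; i++)
		if (offline_one(blk, i))
			remaining++;

	return remaining;
}
```

If blocks 1-3 keep their metadata in block 0, the first pass leaves
block 0 online because its dependents are still online when it is
visited; a second pass then succeeds.  A per-block callback sees only
one block at a time, which is why the retry would have to live in the
caller.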

 :

> +static int memory_subsys_offline(struct device *dev)
> +{
> +	struct memory_block *mem = container_of(dev, struct memory_block, dev);
> +	int ret;
> +
> +	mutex_lock(&mem->state_mutex);
> +	ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);

This function needs to check mem->state just like
offline_memory_block().  That is:

	int ret = 0;
		:
	if (mem->state != MEM_OFFLINE)
		ret = __memory_block_change_state(...);

Otherwise, hot-deleting memory that is already offline fails in
__memory_block_change_state() with -EINVAL, since mem->state is already
MEM_OFFLINE and does not match the requested MEM_ONLINE from-state.
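
The failure mode can be reproduced in a small userspace model.  This is
a sketch only: struct memory_block_model, change_state(),
offline_unchecked() and offline_checked() are hypothetical stand-ins,
and memory_block_action() is assumed to always succeed.

```c
#include <errno.h>

/* Simplified stand-ins for the kernel's memory block states. */
enum mem_state { MEM_ONLINE, MEM_GOING_OFFLINE, MEM_OFFLINE };

struct memory_block_model {
	enum mem_state state;
};

/* Mirrors the from_state_req check in __memory_block_change_state(). */
static int change_state(struct memory_block_model *mem,
			enum mem_state to_state, enum mem_state from_state_req)
{
	if (mem->state != from_state_req)
		return -EINVAL;

	if (to_state == MEM_OFFLINE)
		mem->state = MEM_GOING_OFFLINE;

	/* Assumption: the real memory_block_action() call succeeds here. */
	mem->state = to_state;
	return 0;
}

/* Callback as posted: no check of the current state. */
static int offline_unchecked(struct memory_block_model *mem)
{
	return change_state(mem, MEM_OFFLINE, MEM_ONLINE);
}

/* Callback with the suggested check: already offline counts as success. */
static int offline_checked(struct memory_block_model *mem)
{
	if (mem->state == MEM_OFFLINE)
		return 0;

	return change_state(mem, MEM_OFFLINE, MEM_ONLINE);
}
```

Calling offline_unchecked() on a block whose state is MEM_OFFLINE
returns -EINVAL, while offline_checked() returns 0 for the same block,
which is the behavior the suggested check is meant to give.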

With that change, for the series:
Reviewed-by: Toshi Kani <toshi.kani@hp.com>

Thanks,
-Toshi

> +	mutex_unlock(&mem->state_mutex);
> +	return ret;
> +}
> +




^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
  2013-05-07 21:03                   ` Toshi Kani
@ 2013-05-07 22:10                     ` Rafael J. Wysocki
  -1 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-07 22:10 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Vasilis Liaskovitis, Greg Kroah-Hartman, ACPI Devel Maling List,
	LKML, isimatu.yasuaki, Len Brown, linux-mm, wency

On Tuesday, May 07, 2013 03:03:49 PM Toshi Kani wrote:
> On Tue, 2013-05-07 at 14:11 +0200, Rafael J. Wysocki wrote:
> > On Tuesday, May 07, 2013 12:59:45 PM Vasilis Liaskovitis wrote:
> 
>  :
> 
> > Updated patch is appended for completeness.
> 
> Yes, this updated patch solved the locking issue.
> 
> > > > > A more general issue is that there are now two memory offlining efforts:
> > > > > 
> > > > > 1) from acpi_bus_offline_companions during device offline
> > > > > 2) from mm: remove_memory during device detach (offline_memory_block_cb)
> > > > > 
> > > > > The 2nd is only called if the device offline operation was already successful, so
> > > > > it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> > > > > (unless the blocks were re-onlined in between).
> > > > 
> > > > Sure, and that should be OK for now.  Changing the detach behavior is not
> > > > essential from the patch [2/2] perspective, we can do it later.
> > > 
> > > yes, ok.
> > > 
> > > > 
> > > > > On the other hand, the 2nd effort has some more intelligence in offlining, as it
> > > > > tries to offline twice in the presence of memcg, see commits df3e1b91 or
> > > > > reworked 0baeab16. Maybe we need to consolidate the logic.
> > > > 
> > > > Hmm.  Perhaps it would make sense to implement that logic in
> > > > memory_subsys_offline(), then?
> > > 
> > > the logic tries to offline the memory blocks of the device twice, because the
> > > first memory block might be storing information for the subsequent memblocks.
> > > 
> > > memory_subsys_offline operates on one memory block at a time. Perhaps we can get
> > > the same effect if we do an acpi_walk of acpi_bus_offline_companions twice in
> > > acpi_scan_hot_remove but it's probably not a good idea, since that would
> > > affect non-memory devices as well. 
> > > 
> > > I am not sure how important this intelligence is in practice (I am not using
> > > mem cgroups in my guest kernel tests yet).  Maybe Wen (original author) has
> > > more details on 2-pass offlining effectiveness.
> > 
> > OK
> > 
> > It may be added in a separate patch in any case.
> 
> I had the same comment as Vasilis.  And, I agree with you that we can
> enhance it in separate patches.
> 
>  :
> 
> > +static int memory_subsys_offline(struct device *dev)
> > +{
> > +	struct memory_block *mem = container_of(dev, struct memory_block, dev);
> > +	int ret;
> > +
> > +	mutex_lock(&mem->state_mutex);
> > +	ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
> 
> This function needs to check mem->state just like
> offline_memory_block().  That is:
> 
> 	int ret = 0;
> 		:
> 	if (mem->state != MEM_OFFLINE)
> 		ret = __memory_block_change_state(...);
> 
> Otherwise, memory hot-delete to an off-lined memory fails in
> __memory_block_change_state() since mem->state is already set to
> MEM_OFFLINE.
> 
> With that change, for the series:
> Reviewed-by: Toshi Kani <toshi.kani@hp.com>

OK, one more update, then (appended).

That said, I thought that the check against dev->offline in device_offline()
would be sufficient to guard against that.  Is there any "offline" code path
I didn't take into account?
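
To make the question concrete, here is a minimal userspace sketch of the
two layers involved.  It is not the driver core code: struct block_model,
bus_offline() and device_offline_model() are hypothetical names, the
return value of 1 for an already-offline device is an assumption, and
whether the two flags can actually disagree in this series is exactly
what is being asked.

```c
#include <errno.h>
#include <stdbool.h>

enum mem_state { MEM_ONLINE, MEM_OFFLINE };

/* Hypothetical model: one flag per layer. */
struct block_model {
	bool dev_offline;	/* models dev->offline (driver core's view) */
	enum mem_state state;	/* models mem->state (memory subsystem's view) */
};

/* Models a bus .offline() callback that has no mem->state check. */
static int bus_offline(struct block_model *b)
{
	if (b->state != MEM_ONLINE)
		return -EINVAL;

	b->state = MEM_OFFLINE;
	return 0;
}

/*
 * Models the dev->offline guard: the callback is skipped only when the
 * driver core's own flag says the device is already offline.
 */
static int device_offline_model(struct block_model *b)
{
	int ret;

	if (b->dev_offline)
		return 1;	/* assumed "already offline" return value */

	ret = bus_offline(b);
	if (!ret)
		b->dev_offline = true;

	return ret;
}
```

The guard is sufficient as long as dev_offline and state agree.  If some
path left state at MEM_OFFLINE with dev_offline still false, the guard
would not fire and the unchecked callback would return -EINVAL.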

Rafael


---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: Driver core: Introduce offline/online callbacks for memory blocks

Introduce .offline() and .online() callbacks for memory_subsys
that will allow the generic device_offline() and device_online()
to be used with device objects representing memory blocks.  That,
in turn, allows the ACPI subsystem to use device_offline() to put
removable memory blocks offline, if possible, before removing
memory modules holding them.

The 'online' sysfs attribute of memory block devices will attempt to
put them offline if 0 is written to it and will attempt to apply the
previously used online type when onlining them (i.e. when 1 is
written to it).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Toshi Kani <toshi.kani@hp.com>
---
 drivers/base/memory.c  |  111 +++++++++++++++++++++++++++++++++++++------------
 include/linux/memory.h |    1 
 2 files changed, 87 insertions(+), 25 deletions(-)

Index: linux-pm/drivers/base/memory.c
===================================================================
--- linux-pm.orig/drivers/base/memory.c
+++ linux-pm/drivers/base/memory.c
@@ -37,9 +37,14 @@ static inline int base_memory_block_id(i
 	return section_nr / sections_per_block;
 }
 
+static int memory_subsys_online(struct device *dev);
+static int memory_subsys_offline(struct device *dev);
+
 static struct bus_type memory_subsys = {
 	.name = MEMORY_CLASS_NAME,
 	.dev_name = MEMORY_CLASS_NAME,
+	.online = memory_subsys_online,
+	.offline = memory_subsys_offline,
 };
 
 static BLOCKING_NOTIFIER_HEAD(memory_chain);
@@ -278,33 +283,70 @@ static int __memory_block_change_state(s
 {
 	int ret = 0;
 
-	if (mem->state != from_state_req) {
-		ret = -EINVAL;
-		goto out;
-	}
+	if (mem->state != from_state_req)
+		return -EINVAL;
 
 	if (to_state == MEM_OFFLINE)
 		mem->state = MEM_GOING_OFFLINE;
 
 	ret = memory_block_action(mem->start_section_nr, to_state, online_type);
-
 	if (ret) {
 		mem->state = from_state_req;
-		goto out;
+	} else {
+		mem->state = to_state;
+		if (to_state == MEM_ONLINE)
+			mem->last_online = online_type;
 	}
+	return ret;
+}
 
-	mem->state = to_state;
-	switch (mem->state) {
-	case MEM_OFFLINE:
-		kobject_uevent(&mem->dev.kobj, KOBJ_OFFLINE);
-		break;
-	case MEM_ONLINE:
-		kobject_uevent(&mem->dev.kobj, KOBJ_ONLINE);
-		break;
-	default:
-		break;
+static int memory_subsys_online(struct device *dev)
+{
+	struct memory_block *mem = container_of(dev, struct memory_block, dev);
+	int ret;
+
+	mutex_lock(&mem->state_mutex);
+
+	ret = mem->state == MEM_ONLINE ? 0 :
+		__memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE,
+					    mem->last_online);
+
+	mutex_unlock(&mem->state_mutex);
+	return ret;
+}
+
+static int memory_subsys_offline(struct device *dev)
+{
+	struct memory_block *mem = container_of(dev, struct memory_block, dev);
+	int ret;
+
+	mutex_lock(&mem->state_mutex);
+
+	ret = mem->state == MEM_OFFLINE ? 0 :
+		__memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
+
+	mutex_unlock(&mem->state_mutex);
+	return ret;
+}
+
+static int __memory_block_change_state_uevent(struct memory_block *mem,
+		unsigned long to_state, unsigned long from_state_req,
+		int online_type)
+{
+	int ret = __memory_block_change_state(mem, to_state, from_state_req,
+					      online_type);
+	if (!ret) {
+		switch (mem->state) {
+		case MEM_OFFLINE:
+			kobject_uevent(&mem->dev.kobj, KOBJ_OFFLINE);
+			break;
+		case MEM_ONLINE:
+			kobject_uevent(&mem->dev.kobj, KOBJ_ONLINE);
+			break;
+		default:
+			break;
+		}
 	}
-out:
 	return ret;
 }
 
@@ -315,8 +357,8 @@ static int memory_block_change_state(str
 	int ret;
 
 	mutex_lock(&mem->state_mutex);
-	ret = __memory_block_change_state(mem, to_state, from_state_req,
-					  online_type);
+	ret = __memory_block_change_state_uevent(mem, to_state, from_state_req,
+						 online_type);
 	mutex_unlock(&mem->state_mutex);
 
 	return ret;
@@ -326,22 +368,34 @@ store_mem_state(struct device *dev,
 		struct device_attribute *attr, const char *buf, size_t count)
 {
 	struct memory_block *mem;
+	bool offline;
 	int ret = -EINVAL;
 
 	mem = container_of(dev, struct memory_block, dev);
 
-	if (!strncmp(buf, "online_kernel", min_t(int, count, 13)))
+	lock_device_hotplug();
+
+	if (!strncmp(buf, "online_kernel", min_t(int, count, 13))) {
+		offline = false;
 		ret = memory_block_change_state(mem, MEM_ONLINE,
 						MEM_OFFLINE, ONLINE_KERNEL);
-	else if (!strncmp(buf, "online_movable", min_t(int, count, 14)))
+	} else if (!strncmp(buf, "online_movable", min_t(int, count, 14))) {
+		offline = false;
 		ret = memory_block_change_state(mem, MEM_ONLINE,
 						MEM_OFFLINE, ONLINE_MOVABLE);
-	else if (!strncmp(buf, "online", min_t(int, count, 6)))
+	} else if (!strncmp(buf, "online", min_t(int, count, 6))) {
+		offline = false;
 		ret = memory_block_change_state(mem, MEM_ONLINE,
 						MEM_OFFLINE, ONLINE_KEEP);
-	else if(!strncmp(buf, "offline", min_t(int, count, 7)))
+	} else if(!strncmp(buf, "offline", min_t(int, count, 7))) {
+		offline = true;
 		ret = memory_block_change_state(mem, MEM_OFFLINE,
 						MEM_ONLINE, -1);
+	}
+	if (!ret)
+		dev->offline = offline;
+
+	unlock_device_hotplug();
 
 	if (ret)
 		return ret;
@@ -563,6 +617,7 @@ static int init_memory_block(struct memo
 			base_memory_block_id(scn_nr) * sections_per_block;
 	mem->end_section_nr = mem->start_section_nr + sections_per_block - 1;
 	mem->state = state;
+	mem->last_online = ONLINE_KEEP;
 	mem->section_count++;
 	mutex_init(&mem->state_mutex);
 	start_pfn = section_nr_to_pfn(mem->start_section_nr);
@@ -681,14 +736,20 @@ int unregister_memory_section(struct mem
 
 /*
  * offline one memory block. If the memory block has been offlined, do nothing.
+ *
+ * Call under device_hotplug_lock.
  */
 int offline_memory_block(struct memory_block *mem)
 {
 	int ret = 0;
 
 	mutex_lock(&mem->state_mutex);
-	if (mem->state != MEM_OFFLINE)
-		ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
+	if (mem->state != MEM_OFFLINE) {
+		ret = __memory_block_change_state_uevent(mem, MEM_OFFLINE,
+							 MEM_ONLINE, -1);
+		if (!ret)
+			mem->dev.offline = true;
+	}
 	mutex_unlock(&mem->state_mutex);
 
 	return ret;
Index: linux-pm/include/linux/memory.h
===================================================================
--- linux-pm.orig/include/linux/memory.h
+++ linux-pm/include/linux/memory.h
@@ -26,6 +26,7 @@ struct memory_block {
 	unsigned long start_section_nr;
 	unsigned long end_section_nr;
 	unsigned long state;
+	int last_online;
 	int section_count;
 
 	/*


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
  2013-05-07 22:10                     ` Rafael J. Wysocki
@ 2013-05-07 22:45                       ` Toshi Kani
  -1 siblings, 0 replies; 105+ messages in thread
From: Toshi Kani @ 2013-05-07 22:45 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Vasilis Liaskovitis, Greg Kroah-Hartman, ACPI Devel Maling List,
	LKML, isimatu.yasuaki, Len Brown, linux-mm, wency

On Wed, 2013-05-08 at 00:10 +0200, Rafael J. Wysocki wrote:
> On Tuesday, May 07, 2013 03:03:49 PM Toshi Kani wrote:
> > On Tue, 2013-05-07 at 14:11 +0200, Rafael J. Wysocki wrote:
> > > On Tuesday, May 07, 2013 12:59:45 PM Vasilis Liaskovitis wrote:
> > 
> >  :
> > 
> > > Updated patch is appended for completeness.
> > 
> > Yes, this updated patch solved the locking issue.
> > 
> > > > > > A more general issue is that there are now two memory offlining efforts:
> > > > > > 
> > > > > > 1) from acpi_bus_offline_companions during device offline
> > > > > > 2) from mm: remove_memory during device detach (offline_memory_block_cb)
> > > > > > 
> > > > > > The 2nd is only called if the device offline operation was already successful, so
> > > > > > it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> > > > > > (unless the blocks were re-onlined in between).
> > > > > 
> > > > > Sure, and that should be OK for now.  Changing the detach behavior is not
> > > > > essential from the patch [2/2] perspective, we can do it later.
> > > > 
> > > > yes, ok.
> > > > 
> > > > > 
> > > > > > On the other hand, the 2nd effort has some more intelligence in offlining, as it
> > > > > > tries to offline twice in the presence of memcg, see commits df3e1b91 or
> > > > > > reworked 0baeab16. Maybe we need to consolidate the logic.
> > > > > 
> > > > > Hmm.  Perhaps it would make sense to implement that logic in
> > > > > memory_subsys_offline(), then?
> > > > 
> > > > the logic tries to offline the memory blocks of the device twice, because the
> > > > first memory block might be storing information for the subsequent memblocks.
> > > > 
> > > > memory_subsys_offline operates on one memory block at a time. Perhaps we can get
> > > > the same effect if we do an acpi_walk of acpi_bus_offline_companions twice in
> > > > acpi_scan_hot_remove but it's probably not a good idea, since that would
> > > > affect non-memory devices as well. 
> > > > 
> > > > I am not sure how important this intelligence is in practice (I am not using
> > > > mem cgroups in my guest kernel tests yet).  Maybe Wen (original author) has
> > > > more details on 2-pass offlining effectiveness.
> > > 
> > > OK
> > > 
> > > It may be added in a separate patch in any case.
> > 
> > I had the same comment as Vasilis.  And, I agree with you that we can
> > enhance it in separate patches.
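
For reference, the two-pass idea discussed above can be sketched in user
space roughly as follows.  This is only an illustration under stated
assumptions, not the mm code: struct model_block, the hosts_metadata_for
field and both functions are hypothetical names, and "a block is busy while
a block it holds information for is still online" is an assumed stand-in for
the memcg case mentioned by Vasilis.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model: a block cannot go offline while another block
 * whose information it stores is still online.
 */
struct model_block {
	bool online;
	int hosts_metadata_for;	/* index of the dependent block, or -1 */
};

/* Try to offline one block; an already-offline block is a no-op. */
static int model_offline_block(struct model_block *blocks, size_t i)
{
	int dep = blocks[i].hosts_metadata_for;

	if (!blocks[i].online)
		return 0;
	if (dep >= 0 && blocks[dep].online)
		return -EBUSY;
	blocks[i].online = false;
	return 0;
}

/*
 * Pass 1 ignores failures so that dependent blocks get a chance to go
 * offline first; pass 2 reports the first error that is still left.
 */
static int model_offline_all_two_pass(struct model_block *blocks, size_t n)
{
	size_t i;
	int ret;

	assert(blocks != NULL);

	for (i = 0; i < n; i++)
		(void)model_offline_block(blocks, i);

	for (i = 0; i < n; i++) {
		ret = model_offline_block(blocks, i);
		if (ret)
			return ret;
	}
	return 0;
}
```

With two blocks where block 0 depends on block 1, a single in-order walk
fails on block 0, while the two-pass walk succeeds.  Longer dependency
chains would still fail in this model, so it only covers the simple case.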
> > 
> >  :
> > 
> > > +static int memory_subsys_offline(struct device *dev)
> > > +{
> > > +	struct memory_block *mem = container_of(dev, struct memory_block, dev);
> > > +	int ret;
> > > +
> > > +	mutex_lock(&mem->state_mutex);
> > > +	ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
> > 
> > This function needs to check mem->state just like
> > offline_memory_block().  That is:
> > 
> > 	int ret = 0;
> > 		:
> > 	if (mem->state != MEM_OFFLINE)
> > 		ret = __memory_block_change_state(...);
> > 
> > Otherwise, memory hot-delete to an off-lined memory fails in
> > __memory_block_change_state() since mem->state is already set to
> > MEM_OFFLINE.
> > 
> > With that change, for the series:
> > Reviewed-by: Toshi Kani <toshi.kani@hp.com>
> 
> OK, one more update, then (appended).
> 
> That said I thought that the check against dev->offline in device_offline()
> would be sufficient to guard against that.  Is there any "offline" code path
> I didn't take into account?

Oh, you are right about that.  The real problem is that dev->offline is
set to false (0) when a new memory is hot-added in off-line state.  So,
instead, dev->offline needs to be set properly.  

Thanks,
-Toshi
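
A rough user-space model of the failure described above, for illustration
only.  Everything prefixed with model_ is a hypothetical stand-in, and the
shape of the dev->offline check in device_offline() is an assumption based
on this discussion rather than a copy of the patch.  It shows a block that
is MEM_OFFLINE while its flag still says "online": the unguarded callback
fails with -EINVAL, and the mem->state guard turns the call into a no-op.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins, not the kernel definitions. */
enum mem_state { MEM_ONLINE, MEM_OFFLINE };

struct device {
	bool offline;
};

struct memory_block {
	enum mem_state state;
	struct device dev;
};

/* Refuse the transition unless the block is in the requested source state. */
static int model_change_state(struct memory_block *mem,
			      enum mem_state to_state,
			      enum mem_state from_state_req)
{
	assert(mem != NULL);

	if (mem->state != from_state_req)
		return -EINVAL;
	mem->state = to_state;
	return 0;
}

/* Offline callback without a state guard, as in the quoted hunk. */
static int model_subsys_offline(struct memory_block *mem)
{
	return model_change_state(mem, MEM_OFFLINE, MEM_ONLINE);
}

/* Offline callback with the suggested mem->state guard. */
static int model_subsys_offline_guarded(struct memory_block *mem)
{
	if (mem->state == MEM_OFFLINE)
		return 0;
	return model_change_state(mem, MEM_OFFLINE, MEM_ONLINE);
}

/* Assumed core behavior: skip the callback if the flag already says offline. */
static int model_device_offline(struct memory_block *mem,
				int (*offline)(struct memory_block *))
{
	int ret;

	if (mem->dev.offline)
		return 1;	/* nothing to do */

	ret = offline(mem);
	if (!ret)
		mem->dev.offline = true;
	return ret;
}
```

Calling model_device_offline() on a block initialized with state
MEM_OFFLINE and dev.offline false returns -EINVAL with the first callback
and 0 with the guarded one.  The guard hides the symptom; the stale flag is
still the underlying inconsistency.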


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
  2013-05-07 22:45                       ` Toshi Kani
@ 2013-05-07 23:17                         ` Rafael J. Wysocki
  -1 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-07 23:17 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Vasilis Liaskovitis, Greg Kroah-Hartman, ACPI Devel Maling List,
	LKML, isimatu.yasuaki, Len Brown, linux-mm, wency

On Tuesday, May 07, 2013 04:45:40 PM Toshi Kani wrote:
> On Wed, 2013-05-08 at 00:10 +0200, Rafael J. Wysocki wrote:
> > On Tuesday, May 07, 2013 03:03:49 PM Toshi Kani wrote:
> > > On Tue, 2013-05-07 at 14:11 +0200, Rafael J. Wysocki wrote:
> > > > On Tuesday, May 07, 2013 12:59:45 PM Vasilis Liaskovitis wrote:
> > > 
> > >  :
> > > 
> > > > Updated patch is appended for completeness.
> > > 
> > > Yes, this updated patch solved the locking issue.
> > > 
> > > > > > > A more general issue is that there are now two memory offlining efforts:
> > > > > > > 
> > > > > > > 1) from acpi_bus_offline_companions during device offline
> > > > > > > 2) from mm: remove_memory during device detach (offline_memory_block_cb)
> > > > > > > 
> > > > > > > The 2nd is only called if the device offline operation was already successful, so
> > > > > > > it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> > > > > > > (unless the blocks were re-onlined in between).
> > > > > > 
> > > > > > Sure, and that should be OK for now.  Changing the detach behavior is not
> > > > > > essential from the patch [2/2] perspective, we can do it later.
> > > > > 
> > > > > yes, ok.
> > > > > 
> > > > > > 
> > > > > > > On the other hand, the 2nd effort has some more intelligence in offlining, as it
> > > > > > > tries to offline twice in the presence of memcg, see commits df3e1b91 or
> > > > > > > reworked 0baeab16. Maybe we need to consolidate the logic.
> > > > > > 
> > > > > > Hmm.  Perhaps it would make sense to implement that logic in
> > > > > > memory_subsys_offline(), then?
> > > > > 
> > > > > the logic tries to offline the memory blocks of the device twice, because the
> > > > > first memory block might be storing information for the subsequent memblocks.
> > > > > 
> > > > > memory_subsys_offline operates on one memory block at a time. Perhaps we can get
> > > > > the same effect if we do an acpi_walk of acpi_bus_offline_companions twice in
> > > > > acpi_scan_hot_remove but it's probably not a good idea, since that would
> > > > > affect non-memory devices as well. 
> > > > > 
> > > > > I am not sure how important this intelligence is in practice (I am not using
> > > > > mem cgroups in my guest kernel tests yet).  Maybe Wen (original author) has
> > > > > more details on 2-pass offlining effectiveness.
> > > > 
> > > > OK
> > > > 
> > > > It may be added in a separate patch in any case.
> > > 
> > > I had the same comment as Vasilis.  And, I agree with you that we can
> > > enhance it in separate patches.
> > > 
> > >  :
> > > 
> > > > +static int memory_subsys_offline(struct device *dev)
> > > > +{
> > > > +	struct memory_block *mem = container_of(dev, struct memory_block, dev);
> > > > +	int ret;
> > > > +
> > > > +	mutex_lock(&mem->state_mutex);
> > > > +	ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
> > > 
> > > This function needs to check mem->state just like
> > > offline_memory_block().  That is:
> > > 
> > > 	int ret = 0;
> > > 		:
> > > 	if (mem->state != MEM_OFFLINE)
> > > 		ret = __memory_block_change_state(...);
> > > 
> > > Otherwise, memory hot-delete to an off-lined memory fails in
> > > __memory_block_change_state() since mem->state is already set to
> > > MEM_OFFLINE.
> > > 
> > > With that change, for the series:
> > > Reviewed-by: Toshi Kani <toshi.kani@hp.com>
> > 
> > OK, one more update, then (appended).
> > 
> > That said I thought that the check against dev->offline in device_offline()
> > would be sufficient to guard against that.  Is there any "offline" code path
> > I didn't take into account?
> 
> Oh, you are right about that.  The real problem is that dev->offline is
> set to false (0) when a new memory is hot-added in off-line state.  So,
> instead, dev->offline needs to be set properly.  

OK, where does that happen?

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
  2013-05-07 23:17                         ` Rafael J. Wysocki
@ 2013-05-07 23:59                           ` Toshi Kani
  -1 siblings, 0 replies; 105+ messages in thread
From: Toshi Kani @ 2013-05-07 23:59 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Vasilis Liaskovitis, Greg Kroah-Hartman, ACPI Devel Maling List,
	LKML, isimatu.yasuaki, Len Brown, linux-mm, wency

On Wed, 2013-05-08 at 01:17 +0200, Rafael J. Wysocki wrote:
> On Tuesday, May 07, 2013 04:45:40 PM Toshi Kani wrote:
> > On Wed, 2013-05-08 at 00:10 +0200, Rafael J. Wysocki wrote:
> > > On Tuesday, May 07, 2013 03:03:49 PM Toshi Kani wrote:
> > > > On Tue, 2013-05-07 at 14:11 +0200, Rafael J. Wysocki wrote:
> > > > > On Tuesday, May 07, 2013 12:59:45 PM Vasilis Liaskovitis wrote:
> > > > 
> > > >  :
> > > > 
> > > > > Updated patch is appended for completeness.
> > > > 
> > > > Yes, this updated patch solved the locking issue.
> > > > 
> > > > > > > > A more general issue is that there are now two memory offlining efforts:
> > > > > > > > 
> > > > > > > > 1) from acpi_bus_offline_companions during device offline
> > > > > > > > 2) from mm: remove_memory during device detach (offline_memory_block_cb)
> > > > > > > > 
> > > > > > > > The 2nd is only called if the device offline operation was already successful, so
> > > > > > > > it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> > > > > > > > (unless the blocks were re-onlined in between).
> > > > > > > 
> > > > > > > Sure, and that should be OK for now.  Changing the detach behavior is not
> > > > > > > essential from the patch [2/2] perspective, we can do it later.
> > > > > > 
> > > > > > yes, ok.
> > > > > > 
> > > > > > > 
> > > > > > > > On the other hand, the 2nd effort has some more intelligence in offlining, as it
> > > > > > > > tries to offline twice in the presence of memcg, see commits df3e1b91 or
> > > > > > > > reworked 0baeab16. Maybe we need to consolidate the logic.
> > > > > > > 
> > > > > > > Hmm.  Perhaps it would make sense to implement that logic in
> > > > > > > memory_subsys_offline(), then?
> > > > > > 
> > > > > > the logic tries to offline the memory blocks of the device twice, because the
> > > > > > first memory block might be storing information for the subsequent memblocks.
> > > > > > 
> > > > > > memory_subsys_offline operates on one memory block at a time. Perhaps we can get
> > > > > > the same effect if we do an acpi_walk of acpi_bus_offline_companions twice in
> > > > > > acpi_scan_hot_remove but it's probably not a good idea, since that would
> > > > > > affect non-memory devices as well. 
> > > > > > 
> > > > > > I am not sure how important this intelligence is in practice (I am not using
> > > > > > mem cgroups in my guest kernel tests yet).  Maybe Wen (original author) has
> > > > > > more details on 2-pass offlining effectiveness.
> > > > > 
> > > > > OK
> > > > > 
> > > > > It may be added in a separate patch in any case.
> > > > 
> > > > I had the same comment as Vasilis.  And, I agree with you that we can
> > > > enhance it in separate patches.
> > > > 
> > > >  :
> > > > 
> > > > > +static int memory_subsys_offline(struct device *dev)
> > > > > +{
> > > > > +	struct memory_block *mem = container_of(dev, struct memory_block, dev);
> > > > > +	int ret;
> > > > > +
> > > > > +	mutex_lock(&mem->state_mutex);
> > > > > +	ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
> > > > 
> > > > This function needs to check mem->state just like
> > > > offline_memory_block().  That is:
> > > > 
> > > > 	int ret = 0;
> > > > 		:
> > > > 	if (mem->state != MEM_OFFLINE)
> > > > 		ret = __memory_block_change_state(...);
> > > > 
> > > > Otherwise, memory hot-delete to an off-lined memory fails in
> > > > __memory_block_change_state() since mem->state is already set to
> > > > MEM_OFFLINE.
> > > > 
> > > > With that change, for the series:
> > > > Reviewed-by: Toshi Kani <toshi.kani@hp.com>
> > > 
> > > OK, one more update, then (appended).
> > > 
> > > That said I thought that the check against dev->offline in device_offline()
> > > would be sufficient to guard against that.  Is there any "offline" code path
> > > I didn't take into account?
> > 
> > Oh, you are right about that.  The real problem is that dev->offline is
> > set to false (0) when a new memory is hot-added in off-line state.  So,
> > instead, dev->offline needs to be set properly.  
> 
> OK, where does that happen?

It's a bit messy, but the following change seems to work.  A tricky part
is that online() is not called during boot, so I needed to update the
offline flag in __memory_block_change_state().

Thanks,
-Toshi

---
 drivers/base/memory.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index b9dfd34..1c8d781 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -294,8 +294,10 @@ static int __memory_block_change_state(struct memory_block *mem,
 		mem->state = from_state_req;
 	} else {
 		mem->state = to_state;
-		if (to_state == MEM_ONLINE)
+		if (to_state == MEM_ONLINE) {
 			mem->last_online = online_type;
+			mem->dev.offline = false;
+		}
 	}
 	return ret;
 }
@@ -613,6 +615,7 @@ static int init_memory_block(struct memory_block **memory,
 	mem->state = state;
 	mem->last_online = ONLINE_KEEP;
 	mem->section_count++;
+	mem->dev.offline = (state == MEM_OFFLINE) ? true : false; 
 	mutex_init(&mem->state_mutex);
 	start_pfn = section_nr_to_pfn(mem->start_section_nr);
 	mem->phys_device = arch_get_memory_phys_device(start_pfn);
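
The effect of the two hunks above can be modeled in user space roughly like
this.  It is a sketch under assumptions, not the driver code: the model_
functions and the reduced structures are hypothetical, and the flag check in
model_device_offline() is an assumed simplification of what device_offline()
does.  The point is that the flag is derived from the initial state and is
cleared on every transition to MEM_ONLINE, so it cannot go stale even when
online() is never called.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins, not the kernel definitions. */
enum mem_state { MEM_ONLINE, MEM_OFFLINE };

struct device {
	bool offline;
};

struct memory_block {
	enum mem_state state;
	struct device dev;
};

/* Second hunk: derive the flag from the state the block starts in. */
static void model_init_memory_block(struct memory_block *mem,
				    enum mem_state state)
{
	assert(mem != NULL);

	mem->state = state;
	mem->dev.offline = (state == MEM_OFFLINE);
}

/* First hunk: clear the flag whenever the block goes online. */
static int model_change_state(struct memory_block *mem,
			      enum mem_state to_state,
			      enum mem_state from_state_req)
{
	if (mem->state != from_state_req)
		return -EINVAL;

	mem->state = to_state;
	if (to_state == MEM_ONLINE)
		mem->dev.offline = false;
	return 0;
}

/* Assumed core behavior: skip the transition if the flag says offline. */
static int model_device_offline(struct memory_block *mem)
{
	int ret;

	if (mem->dev.offline)
		return 1;	/* already offline, nothing to do */

	ret = model_change_state(mem, MEM_OFFLINE, MEM_ONLINE);
	if (!ret)
		mem->dev.offline = true;
	return ret;
}
```

In this model a block created in MEM_OFFLINE is reported as already offline
instead of failing with -EINVAL, and a block onlined through
model_change_state() directly can be offlined again afterwards.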




^ permalink raw reply related	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
  2013-05-07 23:59                           ` Toshi Kani
@ 2013-05-08  0:24                             ` Rafael J. Wysocki
  -1 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-08  0:24 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Vasilis Liaskovitis, Greg Kroah-Hartman, ACPI Devel Maling List,
	LKML, isimatu.yasuaki, Len Brown, linux-mm, wency

On Tuesday, May 07, 2013 05:59:16 PM Toshi Kani wrote:
> On Wed, 2013-05-08 at 01:17 +0200, Rafael J. Wysocki wrote:
> > On Tuesday, May 07, 2013 04:45:40 PM Toshi Kani wrote:
> > > On Wed, 2013-05-08 at 00:10 +0200, Rafael J. Wysocki wrote:
> > > > On Tuesday, May 07, 2013 03:03:49 PM Toshi Kani wrote:
> > > > > On Tue, 2013-05-07 at 14:11 +0200, Rafael J. Wysocki wrote:
> > > > > > On Tuesday, May 07, 2013 12:59:45 PM Vasilis Liaskovitis wrote:
> > > > > 
> > > > >  :
> > > > > 
> > > > > > Updated patch is appended for completeness.
> > > > > 
> > > > > Yes, this updated patch solved the locking issue.
> > > > > 
> > > > > > > > > A more general issue is that there are now two memory offlining efforts:
> > > > > > > > > 
> > > > > > > > > 1) from acpi_bus_offline_companions during device offline
> > > > > > > > > 2) from mm: remove_memory during device detach (offline_memory_block_cb)
> > > > > > > > > 
> > > > > > > > > The 2nd is only called if the device offline operation was already successful, so
> > > > > > > > > it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> > > > > > > > > (unless the blocks were re-onlined in between).
> > > > > > > > 
> > > > > > > > Sure, and that should be OK for now.  Changing the detach behavior is not
> > > > > > > > essential from the patch [2/2] perspective, we can do it later.
> > > > > > > 
> > > > > > > yes, ok.
> > > > > > > 
> > > > > > > > 
> > > > > > > > > On the other hand, the 2nd effort has some more intelligence in offlining, as it
> > > > > > > > > > tries to offline twice in the presence of memcg, see commits df3e1b91 or
> > > > > > > > > reworked 0baeab16. Maybe we need to consolidate the logic.
> > > > > > > > 
> > > > > > > > Hmm.  Perhaps it would make sense to implement that logic in
> > > > > > > > memory_subsys_offline(), then?
> > > > > > > 
> > > > > > > the logic tries to offline the memory blocks of the device twice, because the
> > > > > > > first memory block might be storing information for the subsequent memblocks.
> > > > > > > 
> > > > > > > memory_subsys_offline operates on one memory block at a time. Perhaps we can get
> > > > > > > the same effect if we do an acpi_walk of acpi_bus_offline_companions twice in
> > > > > > > acpi_scan_hot_remove but it's probably not a good idea, since that would
> > > > > > > affect non-memory devices as well. 
> > > > > > > 
> > > > > > > I am not sure how important this intelligence is in practice (I am not using
> > > > > > > mem cgroups in my guest kernel tests yet).  Maybe Wen (original author) has
> > > > > > > more details on 2-pass offlining effectiveness.
> > > > > > 
> > > > > > OK
> > > > > > 
> > > > > > It may be added in a separate patch in any case.
> > > > > 
> > > > > I had the same comment as Vasilis.  And, I agree with you that we can
> > > > > enhance it in separate patches.
> > > > > 
> > > > >  :
> > > > > 
> > > > > > +static int memory_subsys_offline(struct device *dev)
> > > > > > +{
> > > > > > +	struct memory_block *mem = container_of(dev, struct memory_block, dev);
> > > > > > +	int ret;
> > > > > > +
> > > > > > +	mutex_lock(&mem->state_mutex);
> > > > > > +	ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
> > > > > 
> > > > > This function needs to check mem->state just like
> > > > > offline_memory_block().  That is:
> > > > > 
> > > > > 	int ret = 0;
> > > > > 		:
> > > > > 	if (mem->state != MEM_OFFLINE)
> > > > > 		ret = __memory_block_change_state(...);
> > > > > 
> > > > > Otherwise, memory hot-delete to an off-lined memory fails in
> > > > > __memory_block_change_state() since mem->state is already set to
> > > > > MEM_OFFLINE.
> > > > > 
> > > > > With that change, for the series:
> > > > > Reviewed-by: Toshi Kani <toshi.kani@hp.com>
> > > > 
> > > > OK, one more update, then (appended).
> > > > 
> > > > That said I thought that the check against dev->offline in device_offline()
> > > > > would be sufficient to guard against that.  Is there any "offline" code path
> > > > I didn't take into account?
> > > 
> > > Oh, you are right about that.  The real problem is that dev->offline is
> > > set to false (0) when a new memory is hot-added in off-line state.  So,
> > > instead, dev->offline needs to be set properly.  
> > 
> > OK, where does that happen?
> 
> It's a bit messy, but the following change seems to work.  A tricky part
> is that online() is not called during boot, so I needed to update the
> offline flag in __memory_block_change_state().

I wonder why? ->

> ---
>  drivers/base/memory.c |    5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index b9dfd34..1c8d781 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -294,8 +294,10 @@ static int __memory_block_change_state(struct memory_block *mem,
>  		mem->state = from_state_req;
>  	} else {
>  		mem->state = to_state;
> -		if (to_state == MEM_ONLINE)
> +		if (to_state == MEM_ONLINE) {
>  			mem->last_online = online_type;
> +			mem->dev.offline = false;
> +		}

->

__memory_block_change_state() is called by memory_subsys_online/offline()
and by __memory_block_change_state_uevent() only, so it should be sufficient
to do this under the switch () in the latter.

Still, though, __memory_block_change_state_uevent() is only called (indirectly)
from store_mem_state() and by offline_memory_block(), both of which update
dev->offline.

What's the exact scenario you needed this for?
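
For reference, the failure mode under discussion can be modeled outside the
kernel.  The sketch below is a user-space model, not kernel code: struct
model_block, model_change_state() and model_device_offline() are hypothetical
stand-ins, and the return values (1 for "already offline", -EINVAL for a state
mismatch) are assumptions made for the illustration.  It shows how a block that
is hot-added offline, but still has dev->offline clear, gets past the guard and
then fails in the state change.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Simplified stand-ins; NOT the kernel's definitions. */
enum mem_state { MEM_ONLINE, MEM_OFFLINE };

struct model_block {
	enum mem_state state;	/* models mem->state */
	bool dev_offline;	/* models mem->dev.offline */
};

/*
 * Models __memory_block_change_state(): the transition is refused unless
 * the block is currently in the requested source state.
 */
static int model_change_state(struct model_block *mem,
			      enum mem_state to, enum mem_state from_req)
{
	if (mem->state != from_req)
		return -EINVAL;
	mem->state = to;
	return 0;
}

/*
 * Models the guard in device_offline(): the offline callback is skipped
 * when the flag says the device is already offline.
 */
int model_device_offline(struct model_block *mem)
{
	int ret;

	assert(mem);
	if (mem->dev_offline)
		return 1;	/* assumed "already offline" result */

	ret = model_change_state(mem, MEM_OFFLINE, MEM_ONLINE);
	if (!ret)
		mem->dev_offline = true;
	return ret;
}
```

With state == MEM_OFFLINE and dev_offline == false the model returns -EINVAL,
whereas with dev_offline == true it returns early without touching the state.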

>  	}
>  	return ret;
>  }
> @@ -613,6 +615,7 @@ static int init_memory_block(struct memory_block **memory,
>  	mem->state = state;
>  	mem->last_online = ONLINE_KEEP;
>  	mem->section_count++;
> +	mem->dev.offline = (state == MEM_OFFLINE) ? true : false; 

You could write this as

+	mem->dev.offline = state == MEM_OFFLINE; 

Moreover, it'd be better to do it in register_memory(), I think.
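
As a small self-contained illustration of why the two spellings are
interchangeable, the following user-space sketch compares them.  The enum and
the function names are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the memory block states (values are arbitrary here). */
enum mem_state { MEM_ONLINE, MEM_OFFLINE };

/* The form used in the quoted hunk. */
bool offline_ternary(enum mem_state state)
{
	return (state == MEM_OFFLINE) ? true : false;
}

/* The shorter form: == already yields 1 or 0, which converts to true/false. */
bool offline_direct(enum mem_state state)
{
	return state == MEM_OFFLINE;
}

/* Both forms agree for every state in this model. */
void check_forms_agree(void)
{
	assert(offline_ternary(MEM_ONLINE) == offline_direct(MEM_ONLINE));
	assert(offline_ternary(MEM_OFFLINE) == offline_direct(MEM_OFFLINE));
}
```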

>  	mutex_init(&mem->state_mutex);
>  	start_pfn = section_nr_to_pfn(mem->start_section_nr);
>  	mem->phys_device = arch_get_memory_phys_device(start_pfn);

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
  2013-05-08  0:24                             ` Rafael J. Wysocki
@ 2013-05-08  0:37                               ` Toshi Kani
  -1 siblings, 0 replies; 105+ messages in thread
From: Toshi Kani @ 2013-05-08  0:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Vasilis Liaskovitis, Greg Kroah-Hartman, ACPI Devel Maling List,
	LKML, isimatu.yasuaki, Len Brown, linux-mm, wency

On Wed, 2013-05-08 at 02:24 +0200, Rafael J. Wysocki wrote:
> On Tuesday, May 07, 2013 05:59:16 PM Toshi Kani wrote:
> > On Wed, 2013-05-08 at 01:17 +0200, Rafael J. Wysocki wrote:
> > > On Tuesday, May 07, 2013 04:45:40 PM Toshi Kani wrote:
> > > > On Wed, 2013-05-08 at 00:10 +0200, Rafael J. Wysocki wrote:
> > > > > On Tuesday, May 07, 2013 03:03:49 PM Toshi Kani wrote:
> > > > > > On Tue, 2013-05-07 at 14:11 +0200, Rafael J. Wysocki wrote:
> > > > > > > On Tuesday, May 07, 2013 12:59:45 PM Vasilis Liaskovitis wrote:
> > > > > > 
> > > > > >  :
> > > > > > 
> > > > > > > > Updated patch is appended for completeness.
> > > > > > 
> > > > > > Yes, this updated patch solved the locking issue.
> > > > > > 
> > > > > > > > > > A more general issue is that there are now two memory offlining efforts:
> > > > > > > > > > 
> > > > > > > > > > 1) from acpi_bus_offline_companions during device offline
> > > > > > > > > > 2) from mm: remove_memory during device detach (offline_memory_block_cb)
> > > > > > > > > > 
> > > > > > > > > > > The 2nd is only called if the device offline operation was already successful, so
> > > > > > > > > > it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> > > > > > > > > > (unless the blocks were re-onlined in between).
> > > > > > > > > 
> > > > > > > > > Sure, and that should be OK for now.  Changing the detach behavior is not
> > > > > > > > > essential from the patch [2/2] perspective, we can do it later.
> > > > > > > > 
> > > > > > > > yes, ok.
> > > > > > > > 
> > > > > > > > > 
> > > > > > > > > > On the other hand, the 2nd effort has some more intelligence in offlining, as it
> > > > > > > > > > > tries to offline twice in the presence of memcg, see commits df3e1b91 or
> > > > > > > > > > reworked 0baeab16. Maybe we need to consolidate the logic.
> > > > > > > > > 
> > > > > > > > > Hmm.  Perhaps it would make sense to implement that logic in
> > > > > > > > > memory_subsys_offline(), then?
> > > > > > > > 
> > > > > > > > the logic tries to offline the memory blocks of the device twice, because the
> > > > > > > > first memory block might be storing information for the subsequent memblocks.
> > > > > > > > 
> > > > > > > > memory_subsys_offline operates on one memory block at a time. Perhaps we can get
> > > > > > > > the same effect if we do an acpi_walk of acpi_bus_offline_companions twice in
> > > > > > > > acpi_scan_hot_remove but it's probably not a good idea, since that would
> > > > > > > > affect non-memory devices as well. 
> > > > > > > > 
> > > > > > > > I am not sure how important this intelligence is in practice (I am not using
> > > > > > > > mem cgroups in my guest kernel tests yet).  Maybe Wen (original author) has
> > > > > > > > more details on 2-pass offlining effectiveness.
> > > > > > > 
> > > > > > > OK
> > > > > > > 
> > > > > > > It may be added in a separate patch in any case.
> > > > > > 
> > > > > > I had the same comment as Vasilis.  And, I agree with you that we can
> > > > > > enhance it in separate patches.
> > > > > > 
> > > > > >  :
> > > > > > 
> > > > > > > +static int memory_subsys_offline(struct device *dev)
> > > > > > > +{
> > > > > > > +	struct memory_block *mem = container_of(dev, struct memory_block, dev);
> > > > > > > +	int ret;
> > > > > > > +
> > > > > > > +	mutex_lock(&mem->state_mutex);
> > > > > > > +	ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
> > > > > > 
> > > > > > This function needs to check mem->state just like
> > > > > > offline_memory_block().  That is:
> > > > > > 
> > > > > > 	int ret = 0;
> > > > > > 		:
> > > > > > 	if (mem->state != MEM_OFFLINE)
> > > > > > 		ret = __memory_block_change_state(...);
> > > > > > 
> > > > > > Otherwise, memory hot-delete to an off-lined memory fails in
> > > > > > __memory_block_change_state() since mem->state is already set to
> > > > > > MEM_OFFLINE.
> > > > > > 
> > > > > > With that change, for the series:
> > > > > > Reviewed-by: Toshi Kani <toshi.kani@hp.com>
> > > > > 
> > > > > OK, one more update, then (appended).
> > > > > 
> > > > > That said I thought that the check against dev->offline in device_offline()
> > > > > > would be sufficient to guard against that.  Is there any "offline" code path
> > > > > I didn't take into account?
> > > > 
> > > > Oh, you are right about that.  The real problem is that dev->offline is
> > > > set to false (0) when a new memory is hot-added in off-line state.  So,
> > > > instead, dev->offline needs to be set properly.  
> > > 
> > > OK, where does that happen?
> > 
> > It's a bit messy, but the following change seems to work.  A tricky part
> > is that online() is not called during boot, so I needed to update the
> > offline flag in __memory_block_change_state().
> 
> I wonder why? ->
> 
> > ---
> >  drivers/base/memory.c |    5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> > index b9dfd34..1c8d781 100644
> > --- a/drivers/base/memory.c
> > +++ b/drivers/base/memory.c
> > @@ -294,8 +294,10 @@ static int __memory_block_change_state(struct memory_block *mem,
> >  		mem->state = from_state_req;
> >  	} else {
> >  		mem->state = to_state;
> > -		if (to_state == MEM_ONLINE)
> > +		if (to_state == MEM_ONLINE) {
> >  			mem->last_online = online_type;
> > +			mem->dev.offline = false;
> > +		}
> 
> ->
> 
> __memory_block_change_state() is called by memory_subsys_online/offline()
> and by __memory_block_change_state_uevent() only, so it should be sufficient
> to do this under the switch () in the latter.
> 
> Still, though, __memory_block_change_state_uevent() is only called (indirectly)
> from store_mem_state() and by offline_memory_block(), both of which update
> dev->offline.
> 
> What's the exact scenario you needed this for?

Right.  I was in a hurry and made a wrong assumption...  This change is
not necessary.

> >  	}
> >  	return ret;
> >  }
> > @@ -613,6 +615,7 @@ static int init_memory_block(struct memory_block **memory,
> >  	mem->state = state;
> >  	mem->last_online = ONLINE_KEEP;
> >  	mem->section_count++;
> > +	mem->dev.offline = (state == MEM_OFFLINE) ? true : false; 
> 
> You could write this as
> 
> +	mem->dev.offline = state == MEM_OFFLINE; 

Right.

> Moreover, it'd be better to do it in register_memory(), I think.

Yes, if we change register_memory() to have the arg state. 

Thanks,
-Toshi


> 
> >  	mutex_init(&mem->state_mutex);
> >  	start_pfn = section_nr_to_pfn(mem->start_section_nr);
> >  	mem->phys_device = arch_get_memory_phys_device(start_pfn);
> 
> Thanks,
> Rafael
> 
> 



^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
  2013-05-08  0:37                               ` Toshi Kani
@ 2013-05-08 11:53                                 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-08 11:53 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Vasilis Liaskovitis, Greg Kroah-Hartman, ACPI Devel Maling List,
	LKML, isimatu.yasuaki, Len Brown, linux-mm, wency

On Tuesday, May 07, 2013 06:37:34 PM Toshi Kani wrote:
> On Wed, 2013-05-08 at 02:24 +0200, Rafael J. Wysocki wrote:
> > On Tuesday, May 07, 2013 05:59:16 PM Toshi Kani wrote:
> > > On Wed, 2013-05-08 at 01:17 +0200, Rafael J. Wysocki wrote:
> > > > On Tuesday, May 07, 2013 04:45:40 PM Toshi Kani wrote:
> > > > > On Wed, 2013-05-08 at 00:10 +0200, Rafael J. Wysocki wrote:
> > > > > > On Tuesday, May 07, 2013 03:03:49 PM Toshi Kani wrote:
> > > > > > > On Tue, 2013-05-07 at 14:11 +0200, Rafael J. Wysocki wrote:
> > > > > > > > On Tuesday, May 07, 2013 12:59:45 PM Vasilis Liaskovitis wrote:
> > > > > > > 
> > > > > > >  :
> > > > > > > 
> > > > > > > > > Updated patch is appended for completeness.
> > > > > > > 
> > > > > > > Yes, this updated patch solved the locking issue.
> > > > > > > 
> > > > > > > > > > > A more general issue is that there are now two memory offlining efforts:
> > > > > > > > > > > 
> > > > > > > > > > > 1) from acpi_bus_offline_companions during device offline
> > > > > > > > > > > 2) from mm: remove_memory during device detach (offline_memory_block_cb)
> > > > > > > > > > > 
> > > > > > > > > > > > The 2nd is only called if the device offline operation was already successful, so
> > > > > > > > > > > it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> > > > > > > > > > > (unless the blocks were re-onlined in between).
> > > > > > > > > > 
> > > > > > > > > > Sure, and that should be OK for now.  Changing the detach behavior is not
> > > > > > > > > > essential from the patch [2/2] perspective, we can do it later.
> > > > > > > > > 
> > > > > > > > > yes, ok.
> > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > > On the other hand, the 2nd effort has some more intelligence in offlining, as it
> > > > > > > > > > > > tries to offline twice in the presence of memcg, see commits df3e1b91 or
> > > > > > > > > > > reworked 0baeab16. Maybe we need to consolidate the logic.
> > > > > > > > > > 
> > > > > > > > > > Hmm.  Perhaps it would make sense to implement that logic in
> > > > > > > > > > memory_subsys_offline(), then?
> > > > > > > > > 
> > > > > > > > > the logic tries to offline the memory blocks of the device twice, because the
> > > > > > > > > first memory block might be storing information for the subsequent memblocks.
> > > > > > > > > 
> > > > > > > > > memory_subsys_offline operates on one memory block at a time. Perhaps we can get
> > > > > > > > > the same effect if we do an acpi_walk of acpi_bus_offline_companions twice in
> > > > > > > > > acpi_scan_hot_remove but it's probably not a good idea, since that would
> > > > > > > > > affect non-memory devices as well. 
> > > > > > > > > 
> > > > > > > > > I am not sure how important this intelligence is in practice (I am not using
> > > > > > > > > mem cgroups in my guest kernel tests yet).  Maybe Wen (original author) has
> > > > > > > > > more details on 2-pass offlining effectiveness.
> > > > > > > > 
> > > > > > > > OK
> > > > > > > > 
> > > > > > > > It may be added in a separate patch in any case.
> > > > > > > 
> > > > > > > I had the same comment as Vasilis.  And, I agree with you that we can
> > > > > > > enhance it in separate patches.
> > > > > > > 
> > > > > > >  :
> > > > > > > 
> > > > > > > > +static int memory_subsys_offline(struct device *dev)
> > > > > > > > +{
> > > > > > > > +	struct memory_block *mem = container_of(dev, struct memory_block, dev);
> > > > > > > > +	int ret;
> > > > > > > > +
> > > > > > > > +	mutex_lock(&mem->state_mutex);
> > > > > > > > +	ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
> > > > > > > 
> > > > > > > This function needs to check mem->state just like
> > > > > > > offline_memory_block().  That is:
> > > > > > > 
> > > > > > > 	int ret = 0;
> > > > > > > 		:
> > > > > > > 	if (mem->state != MEM_OFFLINE)
> > > > > > > 		ret = __memory_block_change_state(...);
> > > > > > > 
> > > > > > > Otherwise, memory hot-delete to an off-lined memory fails in
> > > > > > > __memory_block_change_state() since mem->state is already set to
> > > > > > > MEM_OFFLINE.
> > > > > > > 
> > > > > > > With that change, for the series:
> > > > > > > Reviewed-by: Toshi Kani <toshi.kani@hp.com>
> > > > > > 
> > > > > > OK, one more update, then (appended).
> > > > > > 
> > > > > > That said I thought that the check against dev->offline in device_offline()
> > > > > > > would be sufficient to guard against that.  Is there any "offline" code path
> > > > > > I didn't take into account?
> > > > > 
> > > > > Oh, you are right about that.  The real problem is that dev->offline is
> > > > > set to false (0) when a new memory is hot-added in off-line state.  So,
> > > > > instead, dev->offline needs to be set properly.  
> > > > 
> > > > OK, where does that happen?
> > > 
> > > It's a bit messy, but the following change seems to work.  A tricky part
> > > is that online() is not called during boot, so I needed to update the
> > > offline flag in __memory_block_change_state().
> > 
> > I wonder why? ->
> > 
> > > ---
> > >  drivers/base/memory.c |    5 ++++-
> > >  1 file changed, 4 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> > > index b9dfd34..1c8d781 100644
> > > --- a/drivers/base/memory.c
> > > +++ b/drivers/base/memory.c
> > > @@ -294,8 +294,10 @@ static int __memory_block_change_state(struct
> > > memory_block *mem,
> > >  		mem->state = from_state_req;
> > >  	} else {
> > >  		mem->state = to_state;
> > > -		if (to_state == MEM_ONLINE)
> > > +		if (to_state == MEM_ONLINE) {
> > >  			mem->last_online = online_type;
> > > +			mem->dev.offline = false;
> > > +		}
> > 
> > ->
> > 
> > __memory_block_change_state() is called by memory_subsys_online/offline()
> > and by __memory_block_change_state_uevent() only, so it should be sufficient
> > to do this under the switch () in the latter.
> > 
> > Still, though, __memory_block_change_state_uevent() is only called (indirectly)
> > from store_mem_state() and by offline_memory_block(), both of which update
> > dev->offline.
> > 
> > What's the exact scenario you needed this for?
> 
> Right.  I was in a hurry and made a wrong assumption...  This change is
> not necessary.
> 
> > >  	}
> > >  	return ret;
> > >  }
> > > @@ -613,6 +615,7 @@ static int init_memory_block(struct memory_block
> > > **memory,
> > >  	mem->state = state;
> > >  	mem->last_online = ONLINE_KEEP;
> > >  	mem->section_count++;
> > > +	mem->dev.offline = (state == MEM_OFFLINE) ? true : false; 
> > 
> > You could write this as
> > 
> > +	mem->dev.offline = state == MEM_OFFLINE; 
> 
> Right.
> 
> > Moreover, it'd be better to do it in register_memory(), I think.
> 
> Yes, if we change register_memory() to have the arg state.

It can use mem->state, which has already been populated at this point
(and init_memory_block() is the only caller).

I've updated the patch to do that (appended).

Thanks,
Rafael


---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: Driver core: Introduce offline/online callbacks for memory blocks

Introduce .offline() and .online() callbacks for memory_subsys
that will allow the generic device_offline() and device_online()
to be used with device objects representing memory blocks.  That,
in turn, allows the ACPI subsystem to use device_offline() to put
removable memory blocks offline, if possible, before removing
memory modules holding them.

The 'online' sysfs attribute of memory block devices will attempt to
put them offline if 0 is written to it and will attempt to apply the
previously used online type when onlining them (i.e. when 1 is
written to it).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Toshi Kani <toshi.kani@hp.com>
---
 drivers/base/memory.c  |  112 ++++++++++++++++++++++++++++++++++++++-----------
 include/linux/memory.h |    1 
 2 files changed, 88 insertions(+), 25 deletions(-)

Index: linux-pm/drivers/base/memory.c
===================================================================
--- linux-pm.orig/drivers/base/memory.c
+++ linux-pm/drivers/base/memory.c
@@ -37,9 +37,14 @@ static inline int base_memory_block_id(i
 	return section_nr / sections_per_block;
 }
 
+static int memory_subsys_online(struct device *dev);
+static int memory_subsys_offline(struct device *dev);
+
 static struct bus_type memory_subsys = {
 	.name = MEMORY_CLASS_NAME,
 	.dev_name = MEMORY_CLASS_NAME,
+	.online = memory_subsys_online,
+	.offline = memory_subsys_offline,
 };
 
 static BLOCKING_NOTIFIER_HEAD(memory_chain);
@@ -88,6 +93,7 @@ int register_memory(struct memory_block
 	memory->dev.bus = &memory_subsys;
 	memory->dev.id = memory->start_section_nr / sections_per_block;
 	memory->dev.release = memory_block_release;
+	memory->dev.offline = memory->state == MEM_OFFLINE;
 
 	error = device_register(&memory->dev);
 	return error;
@@ -278,33 +284,70 @@ static int __memory_block_change_state(s
 {
 	int ret = 0;
 
-	if (mem->state != from_state_req) {
-		ret = -EINVAL;
-		goto out;
-	}
+	if (mem->state != from_state_req)
+		return -EINVAL;
 
 	if (to_state == MEM_OFFLINE)
 		mem->state = MEM_GOING_OFFLINE;
 
 	ret = memory_block_action(mem->start_section_nr, to_state, online_type);
-
 	if (ret) {
 		mem->state = from_state_req;
-		goto out;
+	} else {
+		mem->state = to_state;
+		if (to_state == MEM_ONLINE)
+			mem->last_online = online_type;
 	}
+	return ret;
+}
 
-	mem->state = to_state;
-	switch (mem->state) {
-	case MEM_OFFLINE:
-		kobject_uevent(&mem->dev.kobj, KOBJ_OFFLINE);
-		break;
-	case MEM_ONLINE:
-		kobject_uevent(&mem->dev.kobj, KOBJ_ONLINE);
-		break;
-	default:
-		break;
+static int memory_subsys_online(struct device *dev)
+{
+	struct memory_block *mem = container_of(dev, struct memory_block, dev);
+	int ret;
+
+	mutex_lock(&mem->state_mutex);
+
+	ret = mem->state == MEM_ONLINE ? 0 :
+		__memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE,
+					    mem->last_online);
+
+	mutex_unlock(&mem->state_mutex);
+	return ret;
+}
+
+static int memory_subsys_offline(struct device *dev)
+{
+	struct memory_block *mem = container_of(dev, struct memory_block, dev);
+	int ret;
+
+	mutex_lock(&mem->state_mutex);
+
+	ret = mem->state == MEM_OFFLINE ? 0 :
+		__memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
+
+	mutex_unlock(&mem->state_mutex);
+	return ret;
+}
+
+static int __memory_block_change_state_uevent(struct memory_block *mem,
+		unsigned long to_state, unsigned long from_state_req,
+		int online_type)
+{
+	int ret = __memory_block_change_state(mem, to_state, from_state_req,
+					      online_type);
+	if (!ret) {
+		switch (mem->state) {
+		case MEM_OFFLINE:
+			kobject_uevent(&mem->dev.kobj, KOBJ_OFFLINE);
+			break;
+		case MEM_ONLINE:
+			kobject_uevent(&mem->dev.kobj, KOBJ_ONLINE);
+			break;
+		default:
+			break;
+		}
 	}
-out:
 	return ret;
 }
 
@@ -315,8 +358,8 @@ static int memory_block_change_state(str
 	int ret;
 
 	mutex_lock(&mem->state_mutex);
-	ret = __memory_block_change_state(mem, to_state, from_state_req,
-					  online_type);
+	ret = __memory_block_change_state_uevent(mem, to_state, from_state_req,
+						 online_type);
 	mutex_unlock(&mem->state_mutex);
 
 	return ret;
@@ -326,22 +369,34 @@ store_mem_state(struct device *dev,
 		struct device_attribute *attr, const char *buf, size_t count)
 {
 	struct memory_block *mem;
+	bool offline;
 	int ret = -EINVAL;
 
 	mem = container_of(dev, struct memory_block, dev);
 
-	if (!strncmp(buf, "online_kernel", min_t(int, count, 13)))
+	lock_device_hotplug();
+
+	if (!strncmp(buf, "online_kernel", min_t(int, count, 13))) {
+		offline = false;
 		ret = memory_block_change_state(mem, MEM_ONLINE,
 						MEM_OFFLINE, ONLINE_KERNEL);
-	else if (!strncmp(buf, "online_movable", min_t(int, count, 14)))
+	} else if (!strncmp(buf, "online_movable", min_t(int, count, 14))) {
+		offline = false;
 		ret = memory_block_change_state(mem, MEM_ONLINE,
 						MEM_OFFLINE, ONLINE_MOVABLE);
-	else if (!strncmp(buf, "online", min_t(int, count, 6)))
+	} else if (!strncmp(buf, "online", min_t(int, count, 6))) {
+		offline = false;
 		ret = memory_block_change_state(mem, MEM_ONLINE,
 						MEM_OFFLINE, ONLINE_KEEP);
-	else if(!strncmp(buf, "offline", min_t(int, count, 7)))
+	} else if(!strncmp(buf, "offline", min_t(int, count, 7))) {
+		offline = true;
 		ret = memory_block_change_state(mem, MEM_OFFLINE,
 						MEM_ONLINE, -1);
+	}
+	if (!ret)
+		dev->offline = offline;
+
+	unlock_device_hotplug();
 
 	if (ret)
 		return ret;
@@ -563,6 +618,7 @@ static int init_memory_block(struct memo
 			base_memory_block_id(scn_nr) * sections_per_block;
 	mem->end_section_nr = mem->start_section_nr + sections_per_block - 1;
 	mem->state = state;
+	mem->last_online = ONLINE_KEEP;
 	mem->section_count++;
 	mutex_init(&mem->state_mutex);
 	start_pfn = section_nr_to_pfn(mem->start_section_nr);
@@ -681,14 +737,20 @@ int unregister_memory_section(struct mem
 
 /*
  * offline one memory block. If the memory block has been offlined, do nothing.
+ *
+ * Call under device_hotplug_lock.
  */
 int offline_memory_block(struct memory_block *mem)
 {
 	int ret = 0;
 
 	mutex_lock(&mem->state_mutex);
-	if (mem->state != MEM_OFFLINE)
-		ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
+	if (mem->state != MEM_OFFLINE) {
+		ret = __memory_block_change_state_uevent(mem, MEM_OFFLINE,
+							 MEM_ONLINE, -1);
+		if (!ret)
+			mem->dev.offline = true;
+	}
 	mutex_unlock(&mem->state_mutex);
 
 	return ret;
Index: linux-pm/include/linux/memory.h
===================================================================
--- linux-pm.orig/include/linux/memory.h
+++ linux-pm/include/linux/memory.h
@@ -26,6 +26,7 @@ struct memory_block {
 	unsigned long start_section_nr;
 	unsigned long end_section_nr;
 	unsigned long state;
+	int last_online;
 	int section_count;
 
 	/*
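
The state check that the review asked for in memory_subsys_offline() can be
modeled in user space.  What follows is only a minimal sketch under stated
assumptions, not kernel code: struct model_block, model_change_state(),
model_subsys_offline() and the fail_action knob are hypothetical names made
up for illustration, and the block action is reduced to a flag that either
succeeds or returns -EBUSY.  It shows that the strict from-state check
rejects an already-offline block with -EINVAL, while the wrapper treats that
case as a no-op and rolls the state back when the action fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel's memory block states. */
enum mem_state { MEM_ONLINE, MEM_GOING_OFFLINE, MEM_OFFLINE };

struct model_block {
	enum mem_state state;
	bool fail_action;	/* hypothetical knob: make the block action fail */
};

/* Models __memory_block_change_state(): the starting state must match. */
static int model_change_state(struct model_block *mem, enum mem_state to,
			      enum mem_state from_req)
{
	int ret;

	assert(mem);
	if (mem->state != from_req)
		return -EINVAL;

	if (to == MEM_OFFLINE)
		mem->state = MEM_GOING_OFFLINE;

	/* Stands in for memory_block_action(); fails only on request. */
	ret = mem->fail_action ? -EBUSY : 0;

	/* On failure go back to the requested starting state. */
	mem->state = ret ? from_req : to;
	return ret;
}

/* Models memory_subsys_offline(): an already-offline block is a no-op. */
static int model_subsys_offline(struct model_block *mem)
{
	return mem->state == MEM_OFFLINE ? 0 :
		model_change_state(mem, MEM_OFFLINE, MEM_ONLINE);
}
```

Calling model_change_state() directly on a block whose state is MEM_OFFLINE
returns -EINVAL, which is the failure described in the review, whereas
model_subsys_offline() returns 0 for the same block.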


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
  2013-05-08 11:53                                 ` Rafael J. Wysocki
@ 2013-05-08 14:38                                   ` Toshi Kani
  -1 siblings, 0 replies; 105+ messages in thread
From: Toshi Kani @ 2013-05-08 14:38 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Vasilis Liaskovitis, Greg Kroah-Hartman, ACPI Devel Maling List,
	LKML, isimatu.yasuaki, Len Brown, linux-mm, wency

On Wed, 2013-05-08 at 13:53 +0200, Rafael J. Wysocki wrote:
> On Tuesday, May 07, 2013 06:37:34 PM Toshi Kani wrote:
> > On Wed, 2013-05-08 at 02:24 +0200, Rafael J. Wysocki wrote:
> > > On Tuesday, May 07, 2013 05:59:16 PM Toshi Kani wrote:

 :
 
> > > Moreover, it'd be better to do it in register_memory(), I think.
> > 
> > Yes, if we change register_memory() to have the arg state.
> 
> It can use mem->state, which has already been populated at this point
> (and init_memory_block() is the only caller).

Right.

> I've updated the patch to do that (appended).

Looks good!  Thanks for the update!
-Toshi
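
The reason for initializing dev->offline from mem->state at registration time
can be illustrated with a small user-space model.  This is a sketch under
stated assumptions, not the driver core: model_device_offline() stands in for
a core helper that skips the bus callback when the offline flag is already
set (its return value of 0 in that case is an assumption of the sketch), and
model_bus_offline(), model_register_memory(), the sync_flag parameter and the
bus_calls counter are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum mem_state { MEM_ONLINE, MEM_OFFLINE };

struct model_dev {
	bool offline;		/* mirrors dev->offline */
};

struct model_block {
	enum mem_state state;	/* mirrors mem->state */
	struct model_dev dev;
	int bus_calls;		/* hypothetical: counts bus callback invocations */
};

/* Bus callback without its own state check, as in the earlier revision. */
static int model_bus_offline(struct model_block *mem)
{
	mem->bus_calls++;
	if (mem->state != MEM_ONLINE)
		return -EINVAL;
	mem->state = MEM_OFFLINE;
	return 0;
}

/* Core helper: relies on the flag to decide whether to call the bus. */
static int model_device_offline(struct model_block *mem)
{
	int ret;

	assert(mem);
	if (mem->dev.offline)
		return 0;	/* assumed result for "nothing to do" */

	ret = model_bus_offline(mem);
	if (!ret)
		mem->dev.offline = true;
	return ret;
}

/* Registration: optionally derive the flag from the block state. */
static void model_register_memory(struct model_block *mem, bool sync_flag)
{
	mem->dev.offline = sync_flag ? (mem->state == MEM_OFFLINE) : false;
	mem->bus_calls = 0;
}
```

With sync_flag false, a block registered in the MEM_OFFLINE state still has
its flag cleared, so the helper calls the bus callback and gets -EINVAL.
With sync_flag true, the flag matches the state and the callback is never
reached.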





^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
  2013-05-04 11:21         ` Rafael J. Wysocki
@ 2013-05-21  6:37           ` Tang Chen
  -1 siblings, 0 replies; 105+ messages in thread
From: Tang Chen @ 2013-05-21  6:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg Kroah-Hartman, Toshi Kani, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis, Len Brown, linux-mm

Hi Rafael,

Please see below.

On 05/04/2013 07:21 PM, Rafael J. Wysocki wrote:
......
>   static BLOCKING_NOTIFIER_HEAD(memory_chain);
> @@ -278,33 +283,64 @@ static int __memory_block_change_state(s
>   {
>   	int ret = 0;
>
> -	if (mem->state != from_state_req) {
> -		ret = -EINVAL;
> -		goto out;
> -	}
> +	if (mem->state != from_state_req)
> +		return -EINVAL;
>
>   	if (to_state == MEM_OFFLINE)
>   		mem->state = MEM_GOING_OFFLINE;
>
>   	ret = memory_block_action(mem->start_section_nr, to_state, online_type);
> -
>   	if (ret) {
>   		mem->state = from_state_req;
> -		goto out;
> +	} else {
> +		mem->state = to_state;
> +		if (to_state == MEM_ONLINE)
> +			mem->last_online = online_type;

Why do we need to remember the last online type?

As far as I know, we can find out which zone a page was in the last time
it was onlined by checking page->flags, just like online_pages() does.  If
we use online_kernel or online_movable, the zone boundary will be
recalculated.  So we don't need to remember the last online type.

Judging from your patch, I guess memory_subsys_online() can only handle
online and offline.  So mem->last_online is used to remember what the user
did through the original way of triggering memory hot-remove, right?  And
when the user does it in this new way, it just repeats what the user did
last time.

But I still think we don't need to remember it, because if you eventually
call online_pages(), it does the same thing as last time by default.

online_pages()
{
	......
	if (online_type == ONLINE_KERNEL ......

	if (online_type == ONLINE_MOVABLE......

	zone = page_zone(pfn_to_page(pfn));

	/* Here, the page will be put into the zone it belonged to last time. */

	......
}

I just thought of it. Maybe I missed something in your design. Please tell
me if I'm wrong.

Reviewed-by: Tang Chen <tangchen@cn.fujitsu.com>

Thanks. :)
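
The behavior being asked about can be written down as a small user-space
model.  It is only a sketch of what the quoted hunk does (record the online
type on success, start from ONLINE_KEEP, replay the recorded type in the bus
callback) and does not settle whether ONLINE_KEEP alone would be enough.  The
names model_block, model_init_block(), model_online(), model_offline(),
model_subsys_online() and the applied_type field are hypothetical, and the
enum values are local stand-ins rather than the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>

/* Local stand-ins for the online types and block states. */
enum online_type { ONLINE_KEEP, ONLINE_KERNEL, ONLINE_MOVABLE };
enum mem_state { MEM_ONLINE, MEM_OFFLINE };

struct model_block {
	enum mem_state state;
	int last_online;	/* mirrors mem->last_online */
	int applied_type;	/* hypothetical: type seen by the last online action */
};

/* Models init_memory_block(): the default is ONLINE_KEEP. */
static void model_init_block(struct model_block *mem, enum mem_state state)
{
	mem->state = state;
	mem->last_online = ONLINE_KEEP;
	mem->applied_type = -1;
}

/* Models an explicit online request carrying an online type. */
static int model_online(struct model_block *mem, int online_type)
{
	assert(mem);
	if (mem->state != MEM_OFFLINE)
		return -EINVAL;
	mem->applied_type = online_type;
	mem->state = MEM_ONLINE;
	mem->last_online = online_type;	/* remembered only on success */
	return 0;
}

static int model_offline(struct model_block *mem)
{
	if (mem->state != MEM_ONLINE)
		return -EINVAL;
	mem->state = MEM_OFFLINE;
	return 0;
}

/* Models memory_subsys_online(): no type argument, so replay the last one. */
static int model_subsys_online(struct model_block *mem)
{
	return mem->state == MEM_ONLINE ? 0 :
		model_online(mem, mem->last_online);
}
```

In this model a block that was last onlined with ONLINE_MOVABLE is onlined
with ONLINE_MOVABLE again by model_subsys_online(), while a block that was
never onlined explicitly gets ONLINE_KEEP.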



^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 1/2 v2, RFC] ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes
  2013-05-04 11:12         ` Rafael J. Wysocki
@ 2013-05-21  6:50           ` Tang Chen
  -1 siblings, 0 replies; 105+ messages in thread
From: Tang Chen @ 2013-05-21  6:50 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg Kroah-Hartman, Toshi Kani, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis, Len Brown, linux-mm

Hi Rafael,

Seems OK to me.

Reviewed-by: Tang Chen <tangchen@cn.fujitsu.com>

Thanks. :)

On 05/04/2013 07:12 PM, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> During ACPI memory hotplug configuration bind memory blocks residing
> in modules removable through the standard ACPI mechanism to struct
> acpi_device objects associated with ACPI namespace objects
> representing those modules.  Accordingly, unbind those memory blocks
> from the struct acpi_device objects when the memory modules in
> question are being removed.
>
> When "offline" operation for devices representing memory blocks is
> introduced, this will allow the ACPI core's device hot-remove code to
> use it to carry out remove_memory() for those memory blocks and check
> the results of that before it actually removes the modules holding
> them from the system.
>
> Since walk_memory_range() is used for accessing all memory blocks
> corresponding to a given ACPI namespace object, it is exported from
> memory_hotplug.c so that the code in acpi_memhotplug.c can use it.
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>   drivers/acpi/acpi_memhotplug.c |   53 ++++++++++++++++++++++++++++++++++++++---
>   include/linux/memory_hotplug.h |    2 +
>   mm/memory_hotplug.c            |    4 ++-
>   3 files changed, 55 insertions(+), 4 deletions(-)
>
> Index: linux-pm/mm/memory_hotplug.c
> ===================================================================
> --- linux-pm.orig/mm/memory_hotplug.c
> +++ linux-pm/mm/memory_hotplug.c
> @@ -1618,6 +1618,7 @@ int offline_pages(unsigned long start_pf
>   {
>   	return __offline_pages(start_pfn, start_pfn + nr_pages, 120 * HZ);
>   }
> +#endif /* CONFIG_MEMORY_HOTREMOVE */
>
>   /**
>    * walk_memory_range - walks through all mem sections in [start_pfn, end_pfn)
> @@ -1631,7 +1632,7 @@ int offline_pages(unsigned long start_pf
>    *
>    * Returns the return value of func.
>    */
> -static int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
> +int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
>   		void *arg, int (*func)(struct memory_block *, void *))
>   {
>   	struct memory_block *mem = NULL;
> @@ -1668,6 +1669,7 @@ static int walk_memory_range(unsigned lo
>   	return 0;
>   }
>
> +#ifdef CONFIG_MEMORY_HOTREMOVE
>   /**
>    * offline_memory_block_cb - callback function for offlining memory block
>    * @mem: the memory block to be offlined
> Index: linux-pm/include/linux/memory_hotplug.h
> ===================================================================
> --- linux-pm.orig/include/linux/memory_hotplug.h
> +++ linux-pm/include/linux/memory_hotplug.h
> @@ -245,6 +245,8 @@ static inline int is_mem_section_removab
>   static inline void try_offline_node(int nid) {}
>   #endif /* CONFIG_MEMORY_HOTREMOVE */
>
> +extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
> +		void *arg, int (*func)(struct memory_block *, void *));
>   extern int mem_online_node(int nid);
>   extern int add_memory(int nid, u64 start, u64 size);
>   extern int arch_add_memory(int nid, u64 start, u64 size);
> Index: linux-pm/drivers/acpi/acpi_memhotplug.c
> ===================================================================
> --- linux-pm.orig/drivers/acpi/acpi_memhotplug.c
> +++ linux-pm/drivers/acpi/acpi_memhotplug.c
> @@ -28,6 +28,7 @@
>    */
>
>   #include <linux/acpi.h>
> +#include <linux/memory.h>
>   #include <linux/memory_hotplug.h>
>
>   #include "internal.h"
> @@ -166,13 +167,50 @@ static int acpi_memory_check_device(stru
>   	return 0;
>   }
>
> +static unsigned long acpi_meminfo_start_pfn(struct acpi_memory_info *info)
> +{
> +	return PFN_DOWN(info->start_addr);
> +}
> +
> +static unsigned long acpi_meminfo_end_pfn(struct acpi_memory_info *info)
> +{
> +	return PFN_UP(info->start_addr + info->length-1);
> +}
> +
> +static int acpi_bind_memblk(struct memory_block *mem, void *arg)
> +{
> +	return acpi_bind_one(&mem->dev, (acpi_handle)arg);
> +}
> +
> +static int acpi_bind_memory_blocks(struct acpi_memory_info *info,
> +				   acpi_handle handle)
> +{
> +	return walk_memory_range(acpi_meminfo_start_pfn(info),
> +				 acpi_meminfo_end_pfn(info), (void *)handle,
> +				 acpi_bind_memblk);
> +}
> +
> +static int acpi_unbind_memblk(struct memory_block *mem, void *arg)
> +{
> +	acpi_unbind_one(&mem->dev);
> +	return 0;
> +}
> +
> +static void acpi_unbind_memory_blocks(struct acpi_memory_info *info,
> +				      acpi_handle handle)
> +{
> +	walk_memory_range(acpi_meminfo_start_pfn(info),
> +			  acpi_meminfo_end_pfn(info), NULL, acpi_unbind_memblk);
> +}
> +
>   static int acpi_memory_enable_device(struct acpi_memory_device *mem_device)
>   {
> +	acpi_handle handle = mem_device->device->handle;
>   	int result, num_enabled = 0;
>   	struct acpi_memory_info *info;
>   	int node;
>
> -	node = acpi_get_node(mem_device->device->handle);
> +	node = acpi_get_node(handle);
>   	/*
>   	 * Tell the VM there is more memory here...
>   	 * Note: Assume that this function returns zero on success
> @@ -203,6 +241,12 @@ static int acpi_memory_enable_device(str
>   		if (result && result != -EEXIST)
>   			continue;
>
> +		result = acpi_bind_memory_blocks(info, handle);
> +		if (result) {
> +			acpi_unbind_memory_blocks(info, handle);
> +			return -ENODEV;
> +		}
> +
>   		info->enabled = 1;
>
>   		/*
> @@ -229,10 +273,11 @@ static int acpi_memory_enable_device(str
>
>   static int acpi_memory_remove_memory(struct acpi_memory_device *mem_device)
>   {
> +	acpi_handle handle = mem_device->device->handle;
>   	int result = 0, nid;
>   	struct acpi_memory_info *info, *n;
>
> -	nid = acpi_get_node(mem_device->device->handle);
> +	nid = acpi_get_node(handle);
>
>   	list_for_each_entry_safe(info, n, &mem_device->res_list, list) {
>   		if (!info->enabled)
> @@ -240,6 +285,8 @@ static int acpi_memory_remove_memory(str
>
>   		if (nid < 0)
>   			nid = memory_add_physaddr_to_nid(info->start_addr);
> +
> +		acpi_unbind_memory_blocks(info, handle);
>   		result = remove_memory(nid, info->start_addr, info->length);
>   		if (result)
>   			return result;
> @@ -300,7 +347,7 @@ static int acpi_memory_device_add(struct
>   	if (result) {
>   		dev_err(&device->dev, "acpi_memory_enable_device() error\n");
>   		acpi_memory_device_free(mem_device);
> -		return -ENODEV;
> +		return result;
>   	}
>
>   	dev_dbg(&device->dev, "Memory device configured by ACPI\n");
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>
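
The bind/unbind walk in the quoted patch can be pictured with a small
userspace model. This is only a sketch under stated assumptions: the toy_*
names, the 4 KiB page size, the four-pages-per-block geometry and the fixed
block array are made up for illustration and are not kernel interfaces. One
deliberate difference from the patch: the rollback here clears only blocks
bound to the same handle, whereas the patch unbinds every block in the range.

```c
#include <stddef.h>

/* Toy page geometry; 4 KiB pages are an assumption for this sketch. */
#define TOY_PAGE_SHIFT	12
#define TOY_PAGE_SIZE	(1UL << TOY_PAGE_SHIFT)
#define TOY_PFN_DOWN(x)	((x) >> TOY_PAGE_SHIFT)
#define TOY_PFN_UP(x)	(((x) + TOY_PAGE_SIZE - 1) >> TOY_PAGE_SHIFT)

#define TOY_PAGES_PER_BLOCK	4UL	/* hypothetical block size */
#define TOY_NR_BLOCKS		8UL	/* hypothetical number of blocks */

struct toy_block {
	const void *companion;	/* stands in for the bound ACPI handle */
};

static struct toy_block toy_blocks[TOY_NR_BLOCKS];

/* Call func on each block overlapping [start_pfn, end_pfn); stop on error. */
static int toy_walk_range(unsigned long start_pfn, unsigned long end_pfn,
			  void *arg, int (*func)(struct toy_block *, void *))
{
	unsigned long pfn = start_pfn - (start_pfn % TOY_PAGES_PER_BLOCK);
	int ret;

	for (; pfn < end_pfn; pfn += TOY_PAGES_PER_BLOCK) {
		unsigned long idx = pfn / TOY_PAGES_PER_BLOCK;

		if (idx >= TOY_NR_BLOCKS)
			break;
		ret = func(&toy_blocks[idx], arg);
		if (ret)
			return ret;
	}
	return 0;
}

/* Bind one block; refuse if it already belongs to a different handle. */
static int toy_bind_one(struct toy_block *blk, void *arg)
{
	if (blk->companion && blk->companion != arg)
		return -1;
	blk->companion = arg;
	return 0;
}

/* Unbind one block, but only if it is bound to the handle in arg. */
static int toy_unbind_one(struct toy_block *blk, void *arg)
{
	if (blk->companion == arg)
		blk->companion = NULL;
	return 0;
}

/* Bind every block backing [addr, addr + len); undo our bindings on failure. */
static int toy_bind_region(unsigned long addr, unsigned long len, void *handle)
{
	unsigned long start = TOY_PFN_DOWN(addr);
	unsigned long end = TOY_PFN_UP(addr + len - 1);
	int ret;

	ret = toy_walk_range(start, end, handle, toy_bind_one);
	if (ret)
		toy_walk_range(start, end, handle, toy_unbind_one);
	return ret;
}
```

Calling toy_bind_region(0x0, 0x8000, &h) covers PFNs 0 through 7 and binds
toy blocks 0 and 1. If a later block in a range is already bound to another
handle, the walk stops there and the blocks bound so far are released again.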

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
  2013-05-21  6:37           ` Tang Chen
@ 2013-05-21 11:15             ` Rafael J. Wysocki
  -1 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-21 11:15 UTC (permalink / raw)
  To: Tang Chen
  Cc: Greg Kroah-Hartman, Toshi Kani, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis, Len Brown, linux-mm

On Tuesday, May 21, 2013 02:37:53 PM Tang Chen wrote:
> Hi Rafael,
> 
> Please see below.
> 
> On 05/04/2013 07:21 PM, Rafael J. Wysocki wrote:
> ......
> >   static BLOCKING_NOTIFIER_HEAD(memory_chain);
> > @@ -278,33 +283,64 @@ static int __memory_block_change_state(s
> >   {
> >   	int ret = 0;
> >
> > -	if (mem->state != from_state_req) {
> > -		ret = -EINVAL;
> > -		goto out;
> > -	}
> > +	if (mem->state != from_state_req)
> > +		return -EINVAL;
> >
> >   	if (to_state == MEM_OFFLINE)
> >   		mem->state = MEM_GOING_OFFLINE;
> >
> >   	ret = memory_block_action(mem->start_section_nr, to_state, online_type);
> > -
> >   	if (ret) {
> >   		mem->state = from_state_req;
> > -		goto out;
> > +	} else {
> > +		mem->state = to_state;
> > +		if (to_state == MEM_ONLINE)
> > +			mem->last_online = online_type;
> 
> Why do we need to remember last online type ?
> 
> And as far as I know, we can obtain which zone a page was in last time it
> was onlined by check page->flags, just like online_pages() does. If we
> use online_kernel or online_movable, the zone boundary will be recalculated.
> So we don't need to remember the last online type.
> 
> Seeing from your patch, I guess memory_subsys_online() can only handle
> online and offline. So mem->last_online is used to remember what user has
> done through the original way to trigger memory hot-remove, right ? And
> when user does it in this new way, it just does the same thing as user does
> last time.
> 
> But I still think we don't need to remember it because if finally you call
> online_pages(), it just does the same thing as last time by default.
> 
> online_pages()
> {
> 	......
> 	if (online_type == ONLINE_KERNEL ......
> 
> 	if (online_type == ONLINE_MOVABLE......
> 
> 	zone = page_zone(pfn_to_page(pfn));
> 
> 	/* Here, the page will be put into the zone which it belong to last time. */

To be honest, it wasn't entirely clear to me that online_pages() would do the
same thing as last time by default.  Suppose, for example, that the previous
online_type was ONLINE_MOVABLE.  How is online_pages() supposed to know that
it should do the move_pfn_zone_right() if we don't tell it to do that?  Or
is that unnecessary, because it's already been done previously?

> 	......
> }
> 
> I just thought of it. Maybe I missed something in your design. Please tell
> me if I'm wrong.

Well, so what should be passed to __memory_block_change_state() in
memory_subsys_online()?  -1?

> Reviewed-by: Tang Chen <tangchen@cn.fujitsu.com>
> 
> Thanks. :)

Thanks for your comments,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
  2013-05-21 11:15             ` Rafael J. Wysocki
@ 2013-05-22  4:45               ` Tang Chen
  -1 siblings, 0 replies; 105+ messages in thread
From: Tang Chen @ 2013-05-22  4:45 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg Kroah-Hartman, Toshi Kani, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis, Len Brown, linux-mm

Hi Rafael,

On 05/21/2013 07:15 PM, Rafael J. Wysocki wrote:
......
>>> +		mem->state = to_state;
>>> +		if (to_state == MEM_ONLINE)
>>> +			mem->last_online = online_type;
>>
>> Why do we need to remember last online type ?
>>
>> And as far as I know, we can obtain which zone a page was in last time it
>> was onlined by check page->flags, just like online_pages() does. If we
>> use online_kernel or online_movable, the zone boundary will be recalculated.
>> So we don't need to remember the last online type.
>>
>> Seeing from your patch, I guess memory_subsys_online() can only handle
>> online and offline. So mem->last_online is used to remember what user has
>> done through the original way to trigger memory hot-remove, right ? And
>> when user does it in this new way, it just does the same thing as user does
>> last time.
>>
>> But I still think we don't need to remember it because if finally you call
>> online_pages(), it just does the same thing as last time by default.
>>
>> online_pages()
>> {
>> 	......
>> 	if (online_type == ONLINE_KERNEL ......
>>
>> 	if (online_type == ONLINE_MOVABLE......
>>
>> 	zone = page_zone(pfn_to_page(pfn));
>>
>> 	/* Here, the page will be put into the zone which it belong to last time. */
>
> To be honest, it wasn't entirely clear to me that online_pages() would do the
> same thing as last time by default.  Suppose, for example, that the previous
> online_type was ONLINE_MOVABLE.  How online_pages() is supposed to know that
> it should do the move_pfn_zone_right() if we don't tell it to do that?  Or
> is that unnecessary, because it's already been done previously?

Yes, it is unnecessary. move_pfn_zone_right/left() will modify the
zone-related bits in page->flags. But when the page is offlined, the
zone-related bits in page->flags do not change. So when it is onlined
again, by default, it will be in the zone which it was in last time.
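
Why ONLINE_KEEP is enough can be shown with a toy model of the zone bits.
This is a minimal sketch, assuming a made-up flags layout: struct demo_page,
the DEMO_* constants and the "online" bit are hypothetical and do not mirror
the real struct page encoding. The only point is that offlining leaves the
zone bits alone, so a later "keep" online finds the page in its old zone.

```c
enum demo_zone { DEMO_ZONE_NORMAL = 0, DEMO_ZONE_MOVABLE = 1 };

enum demo_online_type {
	DEMO_ONLINE_KEEP,
	DEMO_ONLINE_KERNEL,
	DEMO_ONLINE_MOVABLE,
};

#define DEMO_PG_ONLINE	(1UL << 0)	/* hypothetical "page is online" bit */
#define DEMO_ZONE_SHIFT	8		/* hypothetical position of zone bits */
#define DEMO_ZONE_MASK	(0x3UL << DEMO_ZONE_SHIFT)

struct demo_page {
	unsigned long flags;
};

/* Read the zone back out of the flags word. */
static enum demo_zone demo_page_zone(const struct demo_page *page)
{
	return (enum demo_zone)((page->flags & DEMO_ZONE_MASK) >> DEMO_ZONE_SHIFT);
}

/* Rewrite only the zone bits, leaving every other flag untouched. */
static void demo_set_zone(struct demo_page *page, enum demo_zone zone)
{
	page->flags &= ~DEMO_ZONE_MASK;
	page->flags |= ((unsigned long)zone << DEMO_ZONE_SHIFT) & DEMO_ZONE_MASK;
}

/* Online a page; only the explicit types touch the zone bits. */
static enum demo_zone demo_online_page(struct demo_page *page,
					enum demo_online_type type)
{
	if (type == DEMO_ONLINE_KERNEL)
		demo_set_zone(page, DEMO_ZONE_NORMAL);
	else if (type == DEMO_ONLINE_MOVABLE)
		demo_set_zone(page, DEMO_ZONE_MOVABLE);
	/* DEMO_ONLINE_KEEP: fall through and reuse the stored zone bits. */

	page->flags |= DEMO_PG_ONLINE;
	return demo_page_zone(page);
}

/* Offline a page; the zone bits are deliberately not cleared. */
static void demo_offline_page(struct demo_page *page)
{
	page->flags &= ~DEMO_PG_ONLINE;
}
```

Onlining with DEMO_ONLINE_MOVABLE, offlining, and onlining again with
DEMO_ONLINE_KEEP leaves the page in the movable zone in this model, which is
the behavior described above.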

......

>>
>> I just thought of it. Maybe I missed something in your design. Please tell
>> me if I'm wrong.
>
> Well, so what should be passed to __memory_block_change_state() in
> memory_subsys_online()?  -1?

If you want to keep the status from last time, you can pass ONLINE_KEEP.
Or -1 is all right.
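
With that, the online callback would not need a remembered online type at
all. A rough userspace sketch of the idea follows; every sketch_* name, the
test hook and the stubbed action are assumptions made for illustration, not
the code in drivers/base/memory.c. It keeps the shape of the state change
from the quoted hunk: check the expected source state, mark the transition,
run the action, and roll the state back if the action fails.

```c
#include <errno.h>

enum sketch_state {
	SKETCH_MEM_ONLINE,
	SKETCH_MEM_GOING_OFFLINE,
	SKETCH_MEM_OFFLINE,
};

enum { SKETCH_ONLINE_KEEP, SKETCH_ONLINE_KERNEL, SKETCH_ONLINE_MOVABLE };

struct sketch_block {
	enum sketch_state state;
	int fail_next;		/* test hook: make the next action fail */
	int last_type_seen;	/* records what the action was asked to do */
};

/* Stand-in for the real online/offline work; always succeeds unless told not to. */
static int sketch_block_action(struct sketch_block *mem, enum sketch_state to,
			       int online_type)
{
	(void)to;
	if (mem->fail_next) {
		mem->fail_next = 0;
		return -EBUSY;
	}
	mem->last_type_seen = online_type;
	return 0;
}

/* Move mem from from_req to to; restore from_req if the action fails. */
static int sketch_change_state(struct sketch_block *mem, enum sketch_state to,
			       enum sketch_state from_req, int online_type)
{
	int ret;

	if (mem->state != from_req)
		return -EINVAL;

	if (to == SKETCH_MEM_OFFLINE)
		mem->state = SKETCH_MEM_GOING_OFFLINE;

	ret = sketch_block_action(mem, to, online_type);
	mem->state = ret ? from_req : to;
	return ret;
}

/* Online callback: no stored type, just ask to keep the previous zone. */
static int sketch_online(struct sketch_block *mem)
{
	if (mem->state == SKETCH_MEM_ONLINE)
		return 0;
	return sketch_change_state(mem, SKETCH_MEM_ONLINE, SKETCH_MEM_OFFLINE,
				   SKETCH_ONLINE_KEEP);
}

/* Offline callback: the online type is irrelevant here, so pass -1. */
static int sketch_offline(struct sketch_block *mem)
{
	if (mem->state == SKETCH_MEM_OFFLINE)
		return 0;
	return sketch_change_state(mem, SKETCH_MEM_OFFLINE, SKETCH_MEM_ONLINE,
				   -1);
}
```

In this model a failed offline leaves the block in SKETCH_MEM_ONLINE, and a
request whose expected source state does not match returns -EINVAL without
touching the block.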

Thanks. :)

>
>> Reviewed-by: Tang Chen <tangchen@cn.fujitsu.com>
>>
>> Thanks. :)
>
> Thanks for your comments,
> Rafael
>
>

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
  2013-05-22  4:45               ` Tang Chen
@ 2013-05-22 10:42                 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-22 10:42 UTC (permalink / raw)
  To: Tang Chen
  Cc: Greg Kroah-Hartman, Toshi Kani, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis, Len Brown, linux-mm

On Wednesday, May 22, 2013 12:45:34 PM Tang Chen wrote:
> Hi Rafael,
> 
> On 05/21/2013 07:15 PM, Rafael J. Wysocki wrote:
> ......
> >>> +		mem->state = to_state;
> >>> +		if (to_state == MEM_ONLINE)
> >>> +			mem->last_online = online_type;
> >>
> >> Why do we need to remember last online type ?
> >>
> >> And as far as I know, we can obtain which zone a page was in last time it
> >> was onlined by check page->flags, just like online_pages() does. If we
> >> use online_kernel or online_movable, the zone boundary will be recalculated.
> >> So we don't need to remember the last online type.
> >>
> >> Seeing from your patch, I guess memory_subsys_online() can only handle
> >> online and offline. So mem->last_online is used to remember what user has
> >> done through the original way to trigger memory hot-remove, right ? And
> >> when user does it in this new way, it just does the same thing as user does
> >> last time.
> >>
> >> But I still think we don't need to remember it because if finally you call
> >> online_pages(), it just does the same thing as last time by default.
> >>
> >> online_pages()
> >> {
> >> 	......
> >> 	if (online_type == ONLINE_KERNEL ......
> >>
> >> 	if (online_type == ONLINE_MOVABLE......
> >>
> >> 	zone = page_zone(pfn_to_page(pfn));
> >>
> >> 	/* Here, the page will be put into the zone which it belong to last time. */
> >
> > To be honest, it wasn't entirely clear to me that online_pages() would do the
> > same thing as last time by default.  Suppose, for example, that the previous
> > online_type was ONLINE_MOVABLE.  How online_pages() is supposed to know that
> > it should do the move_pfn_zone_right() if we don't tell it to do that?  Or
> > is that unnecessary, because it's already been done previously?
> 
> Yes, it is unnecessary. move_pfn_zone_right/left() will modify the zone
> related bits in page->flags. But when the page is offline, the zone related
> bits in page->flags will not change. So when it is online again, by default,
> it will be in the zone which it was in last time.
> 
> ......
> 
> >>
> >> I just thought of it. Maybe I missed something in your design. Please tell
> >> me if I'm wrong.
> >
> > Well, so what should be passed to __memory_block_change_state() in
> > memory_subsys_online()?  -1?
> 
> If you want to keep the last time status, you can pass ONLINE_KEEP.
> Or -1 is all right.
> 
> Thanks. :)

OK, thanks for the info.

Since the $subject patch is on my acpi-hotplug branch which has gone public
already (and cannot be rebased), I'll prepare a patch with the change you're
recommending on top of it.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
@ 2013-05-22 10:42                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-22 10:42 UTC (permalink / raw)
  To: Tang Chen
  Cc: Greg Kroah-Hartman, Toshi Kani, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis, Len Brown, linux-mm

On Wednesday, May 22, 2013 12:45:34 PM Tang Chen wrote:
> Hi Rafael,
> 
> On 05/21/2013 07:15 PM, Rafael J. Wysocki wrote:
> ......
> >>> +		mem->state = to_state;
> >>> +		if (to_state == MEM_ONLINE)
> >>> +			mem->last_online = online_type;
> >>
> >> Why do we need to remember last online type ?
> >>
> >> And as far as I know, we can obtain which zone a page was in last time it
> >> was onlined by check page->flags, just like online_pages() does. If we
> >> use online_kernel or online_movable, the zone boundary will be
> >> recalculated.
> >> So we don't need to remember the last online type.
> >>
> >> Seeing from your patch, I guess memory_subsys_online() can only handle
> >> online and offline. So mem->last_online is used to remember what user has
> >> done through the original way to trigger memory hot-remove, right ? And
> >> when
> >> user does it in this new way, it just does the same thing as user does last
> >> time.
> >>
> >> But I still think we don't need to remember it because if finally you call
> >> online_pages(), it just does the same thing as last time by default.
> >>
> >> online_pages()
> >> {
> >> 	......
> >> 	if (online_type == ONLINE_KERNEL ......
> >>
> >> 	if (online_type == ONLINE_MOVABLE......
> >>
> >> 	zone = page_zone(pfn_to_page(pfn));
> >>
> >> 	/* Here, the page will be put into the zone which it belong to last
> >> time. */
> >
> > To be honest, it wasn't entirely clear to me that online_pages() would do the
> > same thing as last time by default.  Suppose, for example, that the previous
> > online_type was ONLINE_MOVABLE.  How online_pages() is supposed to know that
> > it should do the move_pfn_zone_right() if we don't tell it to do that?  Or
> > is that unnecessary, because it's already been done previously?
> 
> Yes, it is unnecessary. move_pfn_zone_right/left() will modify the zone 
> related
> bits in page->flags. But when the page is offline, the zone related bits in
> page->flags will not change. So when it is online again, by dafault, it 
> will
> be in the zone which it was in last time.
> 
> ......
> 
> >>
> >> I just thought of it. Maybe I missed something in your design. Please tell
> >> me if I'm wrong.
> >
> > Well, so what should be passed to __memory_block_change_state() in
> > memory_subsys_online()?  -1?
> 
> If you want to keep the last time status, you can pass ONLINE_KEEP.
> Or -1 is all right.
> 
> Thanks. :)

OK, thanks for the info.

Since the $subject patch is on my acpi-hotplug branch which has gone public
already (and cannot be rebased), I'll prepare a patch with the change you're
recommending on top of it.
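
To make the point above concrete, here is a minimal user-space sketch of why
the previous online type need not be remembered.  It is a toy model, not
kernel code: every toy_ name, the single-bit zone encoding and the two-zone
layout are assumptions made for illustration.  Only the idea is taken from
the discussion: offlining leaves the zone bits in the page flags alone, so a
later "keep" request finds the page in the zone it was in before.

```c
#include <assert.h>

/* Toy stand-ins for ONLINE_KEEP / ONLINE_KERNEL / ONLINE_MOVABLE. */
enum toy_online_type { TOY_ONLINE_KEEP, TOY_ONLINE_KERNEL, TOY_ONLINE_MOVABLE };

/* Two zones are enough to show the effect. */
enum toy_zone { TOY_ZONE_NORMAL = 0, TOY_ZONE_MOVABLE = 1 };

#define TOY_ZONE_MASK   0x1UL	/* assumed: bit 0 of flags encodes the zone */
#define TOY_PAGE_ONLINE 0x2UL	/* assumed: bit 1 marks the page online */

struct toy_page {
	unsigned long flags;
};

/* Read the zone back from the flags, as page_zone() does conceptually. */
static enum toy_zone toy_page_zone(const struct toy_page *page)
{
	return (enum toy_zone)(page->flags & TOY_ZONE_MASK);
}

/* Rewrite only the zone bits, leaving every other flag alone. */
static void toy_move_zone(struct toy_page *page, enum toy_zone zone)
{
	page->flags = (page->flags & ~TOY_ZONE_MASK) | (unsigned long)zone;
}

/* Offlining clears the online bit but does not touch the zone bits. */
static void toy_offline_page(struct toy_page *page)
{
	page->flags &= ~TOY_PAGE_ONLINE;
}

/* Only an explicit KERNEL or MOVABLE request changes the zone. */
static void toy_online_page(struct toy_page *page, enum toy_online_type type)
{
	if (type == TOY_ONLINE_KERNEL)
		toy_move_zone(page, TOY_ZONE_NORMAL);
	if (type == TOY_ONLINE_MOVABLE)
		toy_move_zone(page, TOY_ZONE_MOVABLE);

	/* TOY_ONLINE_KEEP falls through: the zone recorded earlier is reused. */
	page->flags |= TOY_PAGE_ONLINE;
}
```

In this model, onlining a page with TOY_ONLINE_MOVABLE, offlining it and then
onlining it again with TOY_ONLINE_KEEP leaves it in the movable zone, which is
the behaviour described for online_pages() above.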

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 105+ messages in thread

* [PATCH] Driver core / memory: Simplify __memory_block_change_state()
  2013-05-22  4:45               ` Tang Chen
@ 2013-05-22 22:06                 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-22 22:06 UTC (permalink / raw)
  To: Tang Chen, Greg Kroah-Hartman
  Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
	vasilis.liaskovitis, Len Brown, linux-mm

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

As noted by Tang Chen, the last_online field in struct memory_block
introduced by commit 4960e05 (Driver core: Introduce offline/online
callbacks for memory blocks) is not really necessary, because
online_pages() restores the previous state if passed ONLINE_KEEP as
the last argument.  Therefore, remove that field along with the code
referring to it.

References: http://marc.info/?l=linux-kernel&m=136919777305599&w=2
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

Hi,

The patch is on top of (and the commit mentioned in the changelog is present
in) the acpi-hotplug branch of the linux-pm.git tree.
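
For illustration only, the state handling that remains after this change can
be sketched in user space roughly as follows.  The toy_ names, the callback
parameter and the -EINVAL check on the starting state are assumptions of the
sketch; only the MEM_GOING_OFFLINE step and the final assignment mirror the
first hunk of the diff.

```c
#include <assert.h>
#include <errno.h>

/* Toy stand-ins for MEM_ONLINE / MEM_GOING_OFFLINE / MEM_OFFLINE. */
enum toy_mem_state { TOY_MEM_ONLINE, TOY_MEM_GOING_OFFLINE, TOY_MEM_OFFLINE };

struct toy_memory_block {
	enum toy_mem_state state;
};

/* Hypothetical hook standing in for memory_block_action(). */
typedef int (*toy_action_fn)(enum toy_mem_state to_state);

static int toy_action_ok(enum toy_mem_state to_state)
{
	(void)to_state;
	return 0;
}

static int toy_action_fail(enum toy_mem_state to_state)
{
	(void)to_state;
	return -EBUSY;
}

static int toy_change_state(struct toy_memory_block *mem,
			    enum toy_mem_state to_state,
			    enum toy_mem_state from_state_req,
			    toy_action_fn action)
{
	int ret;

	/* Assumed precondition: the block must be in the expected state. */
	if (mem->state != from_state_req)
		return -EINVAL;

	if (to_state == TOY_MEM_OFFLINE)
		mem->state = TOY_MEM_GOING_OFFLINE;

	ret = action(to_state);

	/* Success commits the new state, failure rolls back; nothing else
	 * has to be recorded once the last online type is not tracked. */
	mem->state = ret ? from_state_req : to_state;
	return ret;
}
```

With toy_action_fail() as the callback, a failed offline attempt leaves the
block in TOY_MEM_ONLINE rather than stuck in TOY_MEM_GOING_OFFLINE.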

Thanks,
Rafael

---
 drivers/base/memory.c  |   11 ++---------
 include/linux/memory.h |    1 -
 2 files changed, 2 insertions(+), 10 deletions(-)

Index: linux-pm/drivers/base/memory.c
===================================================================
--- linux-pm.orig/drivers/base/memory.c
+++ linux-pm/drivers/base/memory.c
@@ -291,13 +291,7 @@ static int __memory_block_change_state(s
 		mem->state = MEM_GOING_OFFLINE;
 
 	ret = memory_block_action(mem->start_section_nr, to_state, online_type);
-	if (ret) {
-		mem->state = from_state_req;
-	} else {
-		mem->state = to_state;
-		if (to_state == MEM_ONLINE)
-			mem->last_online = online_type;
-	}
+	mem->state = ret ? from_state_req : to_state;
 	return ret;
 }
 
@@ -310,7 +304,7 @@ static int memory_subsys_online(struct d
 
 	ret = mem->state == MEM_ONLINE ? 0 :
 		__memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE,
-					    mem->last_online);
+					    ONLINE_KEEP);
 
 	mutex_unlock(&mem->state_mutex);
 	return ret;
@@ -618,7 +612,6 @@ static int init_memory_block(struct memo
 			base_memory_block_id(scn_nr) * sections_per_block;
 	mem->end_section_nr = mem->start_section_nr + sections_per_block - 1;
 	mem->state = state;
-	mem->last_online = ONLINE_KEEP;
 	mem->section_count++;
 	mutex_init(&mem->state_mutex);
 	start_pfn = section_nr_to_pfn(mem->start_section_nr);
Index: linux-pm/include/linux/memory.h
===================================================================
--- linux-pm.orig/include/linux/memory.h
+++ linux-pm/include/linux/memory.h
@@ -26,7 +26,6 @@ struct memory_block {
 	unsigned long start_section_nr;
 	unsigned long end_section_nr;
 	unsigned long state;
-	int last_online;
 	int section_count;
 
 	/*

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH] Driver core / memory: Simplify __memory_block_change_state()
  2013-05-22 22:06                 ` Rafael J. Wysocki
@ 2013-05-22 22:14                   ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 105+ messages in thread
From: Greg Kroah-Hartman @ 2013-05-22 22:14 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tang Chen, Toshi Kani, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis, Len Brown, linux-mm

On Thu, May 23, 2013 at 12:06:50AM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> As noted by Tang Chen, the last_online field in struct memory_block
> introduced by commit 4960e05 (Driver core: Introduce offline/online
> callbacks for memory blocks) is not really necessary, because
> online_pages() restores the previous state if passed ONLINE_KEEP as
> the last argument.  Therefore, remove that field along with the code
> referring to it.
> 
> References: http://marc.info/?l=linux-kernel&m=136919777305599&w=2
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>


Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH] Driver core / memory: Simplify __memory_block_change_state()
  2013-05-22 22:14                   ` Greg Kroah-Hartman
@ 2013-05-22 23:29                     ` Rafael J. Wysocki
  -1 siblings, 0 replies; 105+ messages in thread
From: Rafael J. Wysocki @ 2013-05-22 23:29 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Tang Chen, Toshi Kani, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis, Len Brown, linux-mm

On Wednesday, May 22, 2013 03:14:43 PM Greg Kroah-Hartman wrote:
> On Thu, May 23, 2013 at 12:06:50AM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > As noted by Tang Chen, the last_online field in struct memory_block
> > introduced by commit 4960e05 (Driver core: Introduce offline/online
> > callbacks for memory blocks) is not really necessary, because
> > online_pages() restores the previous state if passed ONLINE_KEEP as
> > the last argument.  Therefore, remove that field along with the code
> > referring to it.
> > 
> > References: http://marc.info/?l=linux-kernel&m=136919777305599&w=2
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> 
> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Thanks!


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH] Driver core / memory: Simplify __memory_block_change_state()
  2013-05-22 22:06                 ` Rafael J. Wysocki
@ 2013-05-23  4:37                   ` Tang Chen
  -1 siblings, 0 replies; 105+ messages in thread
From: Tang Chen @ 2013-05-23  4:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg Kroah-Hartman, Toshi Kani, ACPI Devel Maling List, LKML,
	isimatu.yasuaki, vasilis.liaskovitis, Len Brown, linux-mm

Reviewed-by: Tang Chen <tangchen@cn.fujitsu.com>

Thanks. :)

On 05/23/2013 06:06 AM, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> As noted by Tang Chen, the last_online field in struct memory_block
> introduced by commit 4960e05 (Driver core: Introduce offline/online
> callbacks for memory blocks) is not really necessary, because
> online_pages() restores the previous state if passed ONLINE_KEEP as
> the last argument.  Therefore, remove that field along with the code
> referring to it.
>
> References: http://marc.info/?l=linux-kernel&m=136919777305599&w=2
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>
> Hi,
>
> The patch is on top (and the commit mentioned in the changelog is present in)
> the acpi-hotplug branch of the linux-pm.git tree.
>
> Thanks,
> Rafael
>
> ---
>   drivers/base/memory.c  |   11 ++---------
>   include/linux/memory.h |    1 -
>   2 files changed, 2 insertions(+), 10 deletions(-)
>
> Index: linux-pm/drivers/base/memory.c
> ===================================================================
> --- linux-pm.orig/drivers/base/memory.c
> +++ linux-pm/drivers/base/memory.c
> @@ -291,13 +291,7 @@ static int __memory_block_change_state(s
>   		mem->state = MEM_GOING_OFFLINE;
>
>   	ret = memory_block_action(mem->start_section_nr, to_state, online_type);
> -	if (ret) {
> -		mem->state = from_state_req;
> -	} else {
> -		mem->state = to_state;
> -		if (to_state == MEM_ONLINE)
> -			mem->last_online = online_type;
> -	}
> +	mem->state = ret ? from_state_req : to_state;
>   	return ret;
>   }
>
> @@ -310,7 +304,7 @@ static int memory_subsys_online(struct d
>
>   	ret = mem->state == MEM_ONLINE ? 0 :
>   		__memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE,
> -					    mem->last_online);
> +					    ONLINE_KEEP);
>
>   	mutex_unlock(&mem->state_mutex);
>   	return ret;
> @@ -618,7 +612,6 @@ static int init_memory_block(struct memo
>   			base_memory_block_id(scn_nr) * sections_per_block;
>   	mem->end_section_nr = mem->start_section_nr + sections_per_block - 1;
>   	mem->state = state;
> -	mem->last_online = ONLINE_KEEP;
>   	mem->section_count++;
>   	mutex_init(&mem->state_mutex);
>   	start_pfn = section_nr_to_pfn(mem->start_section_nr);
> Index: linux-pm/include/linux/memory.h
> ===================================================================
> --- linux-pm.orig/include/linux/memory.h
> +++ linux-pm/include/linux/memory.h
> @@ -26,7 +26,6 @@ struct memory_block {
>   	unsigned long start_section_nr;
>   	unsigned long end_section_nr;
>   	unsigned long state;
> -	int last_online;
>   	int section_count;
>
>   	/*
>
>

^ permalink raw reply	[flat|nested] 105+ messages in thread

Thread overview: 105+ messages
2013-04-29 12:23 [PATCH 0/3 RFC] Driver core / ACPI: Add offline/online for graceful hot-removal of devices Rafael J. Wysocki
2013-04-29 12:26 ` [PATCH 1/3 RFC] Driver core: Add offline/online device operations Rafael J. Wysocki
2013-04-29 23:10   ` Greg Kroah-Hartman
2013-04-30 11:59     ` Rafael J. Wysocki
2013-04-30 15:32       ` Greg Kroah-Hartman
2013-04-30 20:05         ` Rafael J. Wysocki
2013-04-30 23:38   ` Toshi Kani
2013-05-02  0:58     ` Rafael J. Wysocki
2013-05-02 23:29       ` Toshi Kani
2013-05-03 11:48         ` Rafael J. Wysocki
2013-04-29 12:28 ` [PATCH 2/3 RFC] Driver core: Use generic offline/online for CPU offline/online Rafael J. Wysocki
2013-04-29 23:11   ` Greg Kroah-Hartman
2013-04-30 12:01     ` Rafael J. Wysocki
2013-04-30 15:27       ` Greg Kroah-Hartman
2013-04-30 20:06         ` Rafael J. Wysocki
2013-04-30 23:42   ` Toshi Kani
2013-05-01 14:49     ` Rafael J. Wysocki
2013-05-01 20:07       ` Toshi Kani
2013-05-02  0:26         ` Rafael J. Wysocki
2013-04-29 12:29 ` [PATCH 3/3 RFC] ACPI / hotplug: Use device offline/online for graceful hot-removal Rafael J. Wysocki
2013-04-30 23:49   ` Toshi Kani
2013-05-01 15:05     ` Rafael J. Wysocki
2013-05-01 20:20       ` Toshi Kani
2013-05-02  0:53         ` Rafael J. Wysocki
2013-05-02 12:26 ` [PATCH 0/4] Driver core / ACPI: Add offline/online for graceful hot-removal of devices Rafael J. Wysocki
2013-05-02 12:27   ` [PATCH 1/4] Driver core: Add offline/online device operations Rafael J. Wysocki
2013-05-02 13:57     ` Greg Kroah-Hartman
2013-05-02 23:11     ` Toshi Kani
2013-05-02 23:36       ` Rafael J. Wysocki
2013-05-02 23:23         ` Toshi Kani
2013-05-02 12:28   ` [PATCH 2/4] Driver core: Use generic offline/online for CPU offline/online Rafael J. Wysocki
2013-05-02 13:57     ` Greg Kroah-Hartman
2013-05-02 12:29   ` [PATCH 3/4] ACPI / hotplug: Use device offline/online for graceful hot-removal Rafael J. Wysocki
2013-05-02 12:31   ` [PATCH 4/4] ACPI / processor: Use common hotplug infrastructure Rafael J. Wysocki
2013-05-02 13:59     ` Greg Kroah-Hartman
2013-05-02 23:20     ` Toshi Kani
2013-05-03 12:05       ` Rafael J. Wysocki
2013-05-03 12:21         ` Rafael J. Wysocki
2013-05-03 18:27         ` Toshi Kani
2013-05-03 19:31           ` Rafael J. Wysocki
2013-05-03 19:34             ` Toshi Kani
2013-05-04  1:01   ` [PATCH 0/3 RFC] Driver core: Add offline/online callbacks for memory_subsys Rafael J. Wysocki
2013-05-04  1:01     ` Rafael J. Wysocki
2013-05-04  1:03     ` [PATCH 1/3 RFC] ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes Rafael J. Wysocki
2013-05-04  1:03       ` Rafael J. Wysocki
2013-05-04  1:04     ` [PATCH 2/3 RFC] Driver core: Introduce types of device "online" Rafael J. Wysocki
2013-05-04  1:04       ` Rafael J. Wysocki
2013-05-04  1:06     ` [PATCH 3/3 RFC] Driver core: Introduce offline/online callbacks for memory blocks Rafael J. Wysocki
2013-05-04  1:06       ` Rafael J. Wysocki
2013-05-04 11:11     ` [PATCH 0/2 v2, RFC] Driver core: Add offline/online callbacks for memory_subsys Rafael J. Wysocki
2013-05-04 11:11       ` Rafael J. Wysocki
2013-05-04 11:12       ` [PATCH 1/2 v2, RFC] ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes Rafael J. Wysocki
2013-05-04 11:12         ` Rafael J. Wysocki
2013-05-21  6:50         ` Tang Chen
2013-05-21  6:50           ` Tang Chen
2013-05-04 11:21       ` [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks Rafael J. Wysocki
2013-05-04 11:21         ` Rafael J. Wysocki
2013-05-06 16:28         ` Vasilis Liaskovitis
2013-05-06 16:28           ` Vasilis Liaskovitis
2013-05-07  0:59           ` Rafael J. Wysocki
2013-05-07  0:59             ` Rafael J. Wysocki
2013-05-07 10:59             ` Vasilis Liaskovitis
2013-05-07 10:59               ` Vasilis Liaskovitis
2013-05-07 12:11               ` Rafael J. Wysocki
2013-05-07 12:11                 ` Rafael J. Wysocki
2013-05-07 21:03                 ` Toshi Kani
2013-05-07 21:03                   ` Toshi Kani
2013-05-07 22:10                   ` Rafael J. Wysocki
2013-05-07 22:10                     ` Rafael J. Wysocki
2013-05-07 22:45                     ` Toshi Kani
2013-05-07 22:45                       ` Toshi Kani
2013-05-07 23:17                       ` Rafael J. Wysocki
2013-05-07 23:17                         ` Rafael J. Wysocki
2013-05-07 23:59                         ` Toshi Kani
2013-05-07 23:59                           ` Toshi Kani
2013-05-08  0:24                           ` Rafael J. Wysocki
2013-05-08  0:24                             ` Rafael J. Wysocki
2013-05-08  0:37                             ` Toshi Kani
2013-05-08  0:37                               ` Toshi Kani
2013-05-08 11:53                               ` Rafael J. Wysocki
2013-05-08 11:53                                 ` Rafael J. Wysocki
2013-05-08 14:38                                 ` Toshi Kani
2013-05-08 14:38                                   ` Toshi Kani
2013-05-06 17:20         ` Greg Kroah-Hartman
2013-05-06 17:20           ` Greg Kroah-Hartman
2013-05-06 19:46           ` Rafael J. Wysocki
2013-05-06 19:46             ` Rafael J. Wysocki
2013-05-21  6:37         ` Tang Chen
2013-05-21  6:37           ` Tang Chen
2013-05-21 11:15           ` Rafael J. Wysocki
2013-05-21 11:15             ` Rafael J. Wysocki
2013-05-22  4:45             ` Tang Chen
2013-05-22  4:45               ` Tang Chen
2013-05-22 10:42               ` Rafael J. Wysocki
2013-05-22 10:42                 ` Rafael J. Wysocki
2013-05-22 22:06               ` [PATCH] Driver core / memory: Simplify __memory_block_change_state() Rafael J. Wysocki
2013-05-22 22:06                 ` Rafael J. Wysocki
2013-05-22 22:14                 ` Greg Kroah-Hartman
2013-05-22 22:14                   ` Greg Kroah-Hartman
2013-05-22 23:29                   ` Rafael J. Wysocki
2013-05-22 23:29                     ` Rafael J. Wysocki
2013-05-23  4:37                 ` Tang Chen
2013-05-23  4:37                   ` Tang Chen
2013-05-06 10:48       ` [PATCH 0/2 v2, RFC] Driver core: Add offline/online callbacks for memory_subsys Rafael J. Wysocki
2013-05-06 10:48         ` Rafael J. Wysocki
