All of lore.kernel.org
 help / color / mirror / Atom feed
* [Resend][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 11)
@ 2009-08-03 21:36 Rafael J. Wysocki
  2009-08-04 20:33 ` Alan Stern
  2009-08-04 20:33 ` Alan Stern
  0 siblings, 2 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2009-08-03 21:36 UTC (permalink / raw)
  To: Linux-pm mailing list
  Cc: Alan Stern, Magnus Damm, Greg KH, Pavel Machek, Len Brown, LKML

Hi,

OK, if this is to go into 2.6.32, the last moment for putting it into
linux-next is now.  If you have any objections, remarks, etc. please let me
know or I'm going to put this one into the linux-next branch of the suspend-2.6
tree in the next couple of days.

Thanks,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM: Introduce core framework for run-time PM of I/O devices (rev. 11)

Introduce a core framework for run-time power management of I/O
devices.  Add device run-time PM fields to 'struct dev_pm_info'
and device run-time PM callbacks to 'struct dev_pm_ops'.  Introduce
a run-time PM workqueue and define some device run-time PM helper
functions at the core level.  Document all these things.

Special thanks to Alan Stern for his help with the design and to
Magnus Damm for testing feedback.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 Documentation/power/runtime_pm.txt |  340 +++++++++++++
 drivers/base/dd.c                  |    7 
 drivers/base/power/Makefile        |    1 
 drivers/base/power/main.c          |   20 
 drivers/base/power/power.h         |   11 
 drivers/base/power/runtime.c       |  946 +++++++++++++++++++++++++++++++++++++
 include/linux/pm.h                 |  102 +++
 include/linux/pm_runtime.h         |  111 ++++
 kernel/power/Kconfig               |   14 
 kernel/power/main.c                |   17 
 10 files changed, 1558 insertions(+), 11 deletions(-)

Index: linux-2.6/kernel/power/Kconfig
===================================================================
--- linux-2.6.orig/kernel/power/Kconfig
+++ linux-2.6/kernel/power/Kconfig
@@ -208,3 +208,17 @@ config APM_EMULATION
 	  random kernel OOPSes or reboots that don't seem to be related to
 	  anything, try disabling/enabling this option (or disabling/enabling
 	  APM in your BIOS).
+
+config PM_RUNTIME
+	bool "Run-time PM core functionality"
+	depends on PM
+	---help---
+	  Enable functionality allowing I/O devices to be put into energy-saving
+	  (low power) states at run time (or autosuspended) after a specified
+	  period of inactivity and woken up in response to a hardware-generated
+	  wake-up event or a driver's request.
+
+	  Hardware support is generally required for this functionality to work
+	  and the bus type drivers of the buses the devices are on are
+	  responsible for the actual handling of the autosuspend requests and
+	  wake-up events.
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -11,6 +11,7 @@
 #include <linux/kobject.h>
 #include <linux/string.h>
 #include <linux/resume-trace.h>
+#include <linux/workqueue.h>
 
 #include "power.h"
 
@@ -217,8 +218,24 @@ static struct attribute_group attr_group
 	.attrs = g,
 };
 
+#ifdef CONFIG_PM_RUNTIME
+struct workqueue_struct *pm_wq;
+
+static int __init pm_start_workqueue(void)
+{
+	pm_wq = create_freezeable_workqueue("pm");
+
+	return pm_wq ? 0 : -ENOMEM;
+}
+#else
+static inline int pm_start_workqueue(void) { return 0; }
+#endif
+
 static int __init pm_init(void)
 {
+	int error = pm_start_workqueue();
+	if (error)
+		return error;
 	power_kobj = kobject_create_and_add("power", NULL);
 	if (!power_kobj)
 		return -ENOMEM;
Index: linux-2.6/include/linux/pm.h
===================================================================
--- linux-2.6.orig/include/linux/pm.h
+++ linux-2.6/include/linux/pm.h
@@ -22,6 +22,10 @@
 #define _LINUX_PM_H
 
 #include <linux/list.h>
+#include <linux/workqueue.h>
+#include <linux/spinlock.h>
+#include <linux/wait.h>
+#include <linux/timer.h>
 
 /*
  * Callbacks for platform drivers to implement.
@@ -165,6 +169,28 @@ typedef struct pm_message {
  * It is allowed to unregister devices while the above callbacks are being
  * executed.  However, it is not allowed to unregister a device from within any
  * of its own callbacks.
+ *
+ * There also are the following callbacks related to run-time power management
+ * of devices:
+ *
+ * @runtime_suspend: Prepare the device for a condition in which it won't be
+ *	able to communicate with the CPU(s) and RAM due to power management.
+ *	This need not mean that the device should be put into a low power state.
+ *	For example, if the device is behind a link which is about to be turned
+ *	off, the device may remain at full power.  If the device does go to low
+ *	power and if device_may_wakeup(dev) is true, remote wake-up (i.e., a
+ *	hardware mechanism allowing the device to request a change of its power
+ *	state, such as PCI PME) should be enabled for it.
+ *
+ * @runtime_resume: Put the device into the fully active state in response to a
+ *	wake-up event generated by hardware or at the request of software.  If
+ *	necessary, put the device into the full power state and restore its
+ *	registers, so that it is fully operational.
+ *
+ * @runtime_idle: Device appears to be inactive and it might be put into a low
+ *	power state if all of the necessary conditions are satisfied.  Check
+ *	these conditions and handle the device as appropriate, possibly queueing
+ *	a suspend request for it.
  */
 
 struct dev_pm_ops {
@@ -182,6 +208,9 @@ struct dev_pm_ops {
 	int (*thaw_noirq)(struct device *dev);
 	int (*poweroff_noirq)(struct device *dev);
 	int (*restore_noirq)(struct device *dev);
+	int (*runtime_suspend)(struct device *dev);
+	int (*runtime_resume)(struct device *dev);
+	void (*runtime_idle)(struct device *dev);
 };
 
 /**
@@ -315,14 +344,81 @@ enum dpm_state {
 	DPM_OFF_IRQ,
 };
 
+/**
+ * Device run-time power management status.
+ *
+ * These status labels are used internally by the PM core to indicate the
+ * current status of a device with respect to the PM core operations.  They do
+ * not reflect the actual power state of the device or its status as seen by the
+ * driver.
+ *
+ * RPM_ACTIVE		Device is fully operational.  Indicates that the device
+ *			bus type's ->runtime_resume() callback has completed
+ *			successfully.
+ *
+ * RPM_SUSPENDED	Device bus type's ->runtime_suspend() callback has
+ *			completed successfully.  The device is regarded as
+ *			suspended.
+ *
+ * RPM_RESUMING		Device bus type's ->runtime_resume() callback is being
+ *			executed.
+ *
+ * RPM_SUSPENDING	Device bus type's ->runtime_suspend() callback is being
+ *			executed.
+ */
+
+enum rpm_status {
+	RPM_ACTIVE = 0,
+	RPM_RESUMING,
+	RPM_SUSPENDED,
+	RPM_SUSPENDING,
+};
+
+/**
+ * Device run-time power management request types.
+ *
+ * RPM_REQ_NONE		Do nothing.
+ *
+ * RPM_REQ_IDLE		Run the device bus type's ->runtime_idle() callback
+ *
+ * RPM_REQ_SUSPEND	Run the device bus type's ->runtime_suspend() callback
+ *
+ * RPM_REQ_RESUME	Run the device bus type's ->runtime_resume() callback
+ */
+
+enum rpm_request {
+	RPM_REQ_NONE = 0,
+	RPM_REQ_IDLE,
+	RPM_REQ_SUSPEND,
+	RPM_REQ_RESUME,
+};
+
 struct dev_pm_info {
 	pm_message_t		power_state;
-	unsigned		can_wakeup:1;
-	unsigned		should_wakeup:1;
+	unsigned int		can_wakeup:1;
+	unsigned int		should_wakeup:1;
 	enum dpm_state		status;		/* Owned by the PM core */
-#ifdef	CONFIG_PM_SLEEP
+#ifdef CONFIG_PM_SLEEP
 	struct list_head	entry;
 #endif
+#ifdef CONFIG_PM_RUNTIME
+	struct timer_list	suspend_timer;
+	unsigned long		timer_expires;
+	struct work_struct	work;
+	wait_queue_head_t	wait_queue;
+	spinlock_t		lock;
+	atomic_t		usage_count;
+	atomic_t		child_count;
+	unsigned int		disable_depth:3;
+	unsigned int		ignore_children:1;
+	unsigned int		runtime_failure:1;
+	unsigned int		idle_notification:1;
+	unsigned int		request_pending:1;
+	unsigned int		deferred_resume:1;
+	enum rpm_request	request;
+	enum rpm_status		runtime_status;
+	int			last_error;
+#endif
 };
 
 /*
Index: linux-2.6/drivers/base/power/Makefile
===================================================================
--- linux-2.6.orig/drivers/base/power/Makefile
+++ linux-2.6/drivers/base/power/Makefile
@@ -1,5 +1,6 @@
 obj-$(CONFIG_PM)	+= sysfs.o
 obj-$(CONFIG_PM_SLEEP)	+= main.o
+obj-$(CONFIG_PM_RUNTIME)	+= runtime.o
 obj-$(CONFIG_PM_TRACE_RTC)	+= trace.o
 
 ccflags-$(CONFIG_DEBUG_DRIVER) := -DDEBUG
Index: linux-2.6/drivers/base/power/runtime.c
===================================================================
--- /dev/null
+++ linux-2.6/drivers/base/power/runtime.c
@@ -0,0 +1,946 @@
+/*
+ * drivers/base/power/runtime.c - Helper functions for device run-time PM
+ *
+ * Copyright (c) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+ *
+ * This file is released under the GPLv2.
+ */
+
+#include <linux/sched.h>
+#include <linux/pm_runtime.h>
+#include <linux/jiffies.h>
+
+static int __pm_request_resume(struct device *dev);
+
+/**
+ * pm_runtime_deactivate_timer - Deactivate given device's suspend timer.
+ * @dev: Device to handle.
+ */
+static void pm_runtime_deactivate_timer(struct device *dev)
+{
+	if (dev->power.timer_expires > 0) {
+		del_timer(&dev->power.suspend_timer);
+		dev->power.timer_expires = 0;
+	}
+}
+
+/**
+ * pm_runtime_cancel_pending - Deactivate suspend timer and cancel requests.
+ * @dev: Device to handle.
+ */
+static void pm_runtime_cancel_pending(struct device *dev)
+{
+	pm_runtime_deactivate_timer(dev);
+	/*
+	 * If there's a request pending, make sure its work function will return
+	 * without doing anything.
+	 */
+	if (dev->power.request_pending)
+		dev->power.request = RPM_REQ_NONE;
+}
+
+/**
+ * __pm_runtime_idle - Notify device bus type if the device can be suspended.
+ * @dev: Device to notify the bus type about.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_runtime_idle(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_failure)
+		retval = -EINVAL;
+	else if (dev->power.idle_notification)
+		retval = -EINPROGRESS;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0
+	    || dev->power.timer_expires > 0
+	    || dev->power.runtime_status == RPM_SUSPENDED
+	    || dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		return retval;
+
+	if (dev->power.request_pending) {
+		/*
+		 * If an idle notification request is pending, cancel it.  Any
+		 * other pending request takes precedence over us.
+		 */
+		if (dev->power.request == RPM_REQ_IDLE)
+			dev->power.request = RPM_REQ_NONE;
+		else if (dev->power.request != RPM_REQ_NONE)
+			return -EAGAIN;
+	}
+
+	dev->power.idle_notification = true;
+
+	spin_unlock_irq(&dev->power.lock);
+
+	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle)
+		dev->bus->pm->runtime_idle(dev);
+
+	spin_lock_irq(&dev->power.lock);
+
+	dev->power.idle_notification = false;
+	wake_up_all(&dev->power.wait_queue);
+
+	return 0;
+}
+
+/**
+ * pm_runtime_idle - Notify device bus type if the device can be suspended.
+ * @dev: Device to notify the bus type about.
+ */
+int pm_runtime_idle(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_idle(dev);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_idle);
+
+/**
+ * __pm_runtime_suspend - Carry out run-time suspend of given device.
+ * @dev: Device to suspend.
+ * @from_wq: If set, the funtion has been called via pm_wq.
+ *
+ * Check if the device can be suspended and run the ->runtime_suspend() callback
+ * provided by its bus type.  If another suspend has been started earlier, wait
+ * for it to finish.  If there's an idle notification pending, cancel it.  If
+ * there's a suspend request scheduled while this function is running and @sync
+ * is 'true', cancel that request.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+int __pm_runtime_suspend(struct device *dev, bool from_wq)
+{
+	struct device *parent = NULL;
+	bool notify = false;
+	int retval = 0;
+
+ repeat:
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/* Pending resume requests take precedence over us. */
+		if (dev->power.request == RPM_REQ_RESUME)
+			return -EAGAIN;
+		/* Other pending requests need to be canceled. */
+		dev->power.request = RPM_REQ_NONE;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_RESUMING
+	    || dev->power.disable_depth > 0
+	    || atomic_read(&dev->power.usage_count) > 0)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		return retval;
+
+	if (dev->power.runtime_status == RPM_SUSPENDING) {
+		DEFINE_WAIT(wait);
+
+		if (from_wq)
+			return -EINPROGRESS;
+
+		/* Wait for the other suspend running in parallel with us. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_SUSPENDING)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+		goto repeat;
+	}
+
+	dev->power.runtime_status = RPM_SUSPENDING;
+
+	spin_unlock_irq(&dev->power.lock);
+
+	retval = dev->bus && dev->bus->pm && dev->bus->pm->runtime_suspend ?
+		dev->bus->pm->runtime_suspend(dev) : -ENOSYS;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (retval) {
+		dev->power.runtime_status = RPM_ACTIVE;
+		pm_runtime_cancel_pending(dev);
+		dev->power.deferred_resume = false;
+
+		if (retval == -EAGAIN || retval == -EBUSY) {
+			notify = true;
+		} else {
+			dev->power.runtime_failure = true;
+			dev->power.last_error = retval;
+		}
+	} else {
+		dev->power.runtime_status = RPM_SUSPENDED;
+
+		if (dev->parent) {
+			parent = dev->parent;
+			atomic_add_unless(&parent->power.child_count, -1, 0);
+		}
+
+	}
+	wake_up_all(&dev->power.wait_queue);
+
+	if (dev->power.deferred_resume) {
+		__pm_request_resume(dev);
+		dev->power.deferred_resume = false;
+	}
+
+	spin_unlock_irq(&dev->power.lock);
+
+	if (parent && !parent->power.ignore_children)
+		pm_request_idle(parent);
+
+	if (notify)
+		pm_runtime_idle(dev);
+
+	spin_lock_irq(&dev->power.lock);
+
+	return retval;
+}
+
+/**
+ * pm_runtime_suspend - Carry out run-time suspend of given device.
+ * @dev: Device to suspend.
+ */
+int pm_runtime_suspend(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_suspend(dev, false);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_suspend);
+
+/**
+ * __pm_runtime_resume - Carry out run-time resume of given device.
+ * @dev: Device to resume.
+ * @from_wq: If set, the funtion has been called via pm_wq.
+ *
+ * Check if the device can be woken up and run the ->runtime_resume() callback
+ * provided by its bus type.  If another resume has been started earlier, wait
+ * for it to finish.  If there's a suspend running in parallel with this
+ * function, wait for it to finish and resume the device.  If there's a suspend
+ * request or idle notification pending, cancel it.  If there's a resume request
+ * scheduled while this function is running, cancel that request.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+int __pm_runtime_resume(struct device *dev, bool from_wq)
+{
+	struct device *parent = NULL;
+	int retval = 0;
+
+ repeat:
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	pm_runtime_cancel_pending(dev);
+
+	if (dev->power.runtime_status == RPM_ACTIVE)
+		retval = 1;
+	else if (dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	if (retval)
+		return retval;
+
+	if (dev->power.runtime_status == RPM_RESUMING
+	    || dev->power.runtime_status == RPM_SUSPENDING) {
+		DEFINE_WAIT(wait);
+
+		if (from_wq) {
+			if (dev->power.runtime_status == RPM_SUSPENDING)
+				dev->power.deferred_resume = true;
+			return -EINPROGRESS;
+		}
+
+		/* Wait for the operation carried out in parallel with us. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_RESUMING
+			    && dev->power.runtime_status != RPM_SUSPENDING)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+		goto repeat;
+	}
+
+	if (!parent && dev->parent) {
+		/*
+		 * Increment the parent's resume counter and resume it if
+		 * necessary.
+		 */
+		spin_unlock_irq(&dev->power.lock);
+
+		parent = dev->parent;
+		retval = pm_runtime_get_sync(parent);
+		if (retval < 0)
+			goto out_parent;
+
+		spin_lock_irq(&dev->power.lock);
+		retval = 0;
+		goto repeat;
+	}
+
+	dev->power.runtime_status = RPM_RESUMING;
+
+	spin_unlock_irq(&dev->power.lock);
+
+	retval = dev->bus && dev->bus->pm && dev->bus->pm->runtime_resume ?
+		dev->bus->pm->runtime_resume(dev) : -ENOSYS;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (retval) {
+		dev->power.runtime_status = RPM_SUSPENDED;
+
+		dev->power.runtime_failure = true;
+		dev->power.last_error = retval;
+
+		pm_runtime_cancel_pending(dev);
+	} else {
+		dev->power.runtime_status = RPM_ACTIVE;
+
+		if (parent)
+			atomic_inc(&parent->power.child_count);
+	}
+	wake_up_all(&dev->power.wait_queue);
+
+	spin_unlock_irq(&dev->power.lock);
+
+ out_parent:
+	if (parent)
+		pm_runtime_put(parent);
+
+	if (!retval)
+		pm_request_idle(dev);
+
+	spin_lock_irq(&dev->power.lock);
+
+	return retval;
+}
+
+/**
+ * pm_runtime_resume - Carry out run-time resume of given device.
+ * @dev: Device to suspend.
+ */
+int pm_runtime_resume(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_resume(dev, false);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_resume);
+
+/**
+ * pm_runtime_work - Universal run-time PM work function.
+ * @work: Work structure used for scheduling the execution of this function.
+ *
+ * Use @work to get the device object the work is to be done for, determine what
+ * is to be done and execute the appropriate run-time PM function.
+ */
+static void pm_runtime_work(struct work_struct *work)
+{
+	struct device *dev = container_of(work, struct device, power.work);
+	enum rpm_request req;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (!dev->power.request_pending)
+		goto out;
+
+	req = dev->power.request;
+	dev->power.request = RPM_REQ_NONE;
+	dev->power.request_pending = false;
+
+	switch (req) {
+	case RPM_REQ_NONE:
+		break;
+	case RPM_REQ_IDLE:
+		__pm_runtime_idle(dev);
+		break;
+	case RPM_REQ_SUSPEND:
+		__pm_runtime_suspend(dev, true);
+		break;
+	case RPM_REQ_RESUME:
+		__pm_runtime_resume(dev, true);
+		break;
+	}
+
+ out:
+	spin_unlock_irq(&dev->power.lock);
+}
+
+/**
+ * pm_request_idle - Submit an idle notification request for given device.
+ * @dev: Device to handle.
+ *
+ * Check if the device's run-time PM status is correct for suspending the device
+ * and queue up a request to run __pm_runtime_idle() for it.
+ */
+int pm_request_idle(struct device *dev)
+{
+	unsigned long flags;
+	int retval = 0;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (dev->power.runtime_failure)
+		retval = -EINVAL;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0
+	    || dev->power.timer_expires > 0
+	    || dev->power.runtime_status == RPM_SUSPENDED
+	    || dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		goto out;
+
+	if (dev->power.request_pending && dev->power.request != RPM_REQ_NONE) {
+		/* Any requests other then RPM_REQ_IDLE take precedence. */
+		if (dev->power.request != RPM_REQ_IDLE)
+			retval = -EAGAIN;
+		goto out;
+	}
+
+	dev->power.request = RPM_REQ_IDLE;
+	if (dev->power.request_pending)
+		goto out;
+
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_request_idle);
+
+/**
+ * __pm_request_suspend - Submit a suspend request for given device.
+ * @dev: Device to suspend.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_request_suspend(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	else if (dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EINPROGRESS;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/*
+		 * Pending resume requests take precedence over us, but we can
+		 * overtake any other pending request.
+		 */
+		if (dev->power.request == RPM_REQ_RESUME)
+			retval = -EAGAIN;
+		else if (dev->power.request != RPM_REQ_SUSPEND)
+			dev->power.request = retval ?
+						RPM_REQ_NONE : RPM_REQ_SUSPEND;
+
+		if (dev->power.request == RPM_REQ_SUSPEND)
+			return 0;
+	}
+
+	if (retval)
+		return retval;
+
+	dev->power.request = RPM_REQ_SUSPEND;
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+	return 0;
+}
+
+/**
+ * pm_suspend_timer_fn - Timer function for pm_schedule_suspend().
+ * @data: Device pointer passed by pm_schedule_suspend().
+ *
+ * Check if the time is right and execute __pm_request_suspend() in that case.
+ */
+static void pm_suspend_timer_fn(unsigned long data)
+{
+	struct device *dev = (struct device *)data;
+	unsigned long flags;
+	unsigned long expires;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	expires = dev->power.timer_expires;
+	/* If 'expire' is after 'jiffies' we've been called too early. */
+	if (expires > 0 && !time_after(expires, jiffies)) {
+		dev->power.timer_expires = 0;
+		__pm_request_suspend(dev);
+	}
+
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+}
+
+/**
+ * pm_schedule_suspend - Set up a timer to submit a suspend request in future.
+ * @dev: Device to suspend.
+ * @delay: Time to wait before submitting a suspend request, in milliseconds.
+ */
+int pm_schedule_suspend(struct device *dev, unsigned int delay)
+{
+	unsigned long flags;
+	int retval = 0;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (dev->power.runtime_failure) {
+		retval = -EINVAL;
+		goto out;
+	}
+
+	if (!delay) {
+		retval = __pm_request_suspend(dev);
+		goto out;
+	}
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/*
+		 * Pending resume requests take precedence over us, but any
+		 * other pending requests have to be canceled.
+		 */
+		if (dev->power.request == RPM_REQ_RESUME) {
+			retval = -EAGAIN;
+			goto out;
+		}
+		dev->power.request = RPM_REQ_NONE;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EINPROGRESS;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		goto out;
+
+	dev->power.timer_expires = jiffies + msecs_to_jiffies(delay);
+	mod_timer(&dev->power.suspend_timer, dev->power.timer_expires);
+
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_schedule_suspend);
+
+/**
+ * pm_request_resume - Submit a resume request for given device.
+ * @dev: Device to resume.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+int __pm_request_resume(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	if (dev->power.runtime_status == RPM_ACTIVE)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_RESUMING)
+		retval = -EINPROGRESS;
+	else if (dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/* If non-resume request is pending, we can overtake it. */
+		dev->power.request = retval ? RPM_REQ_NONE : RPM_REQ_RESUME;
+		/* There's nothing to do if resume request is pending. */
+		if (dev->power.request == RPM_REQ_RESUME)
+			return 0;
+	}
+
+	if (retval)
+		return retval;
+
+	dev->power.request = RPM_REQ_RESUME;
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+	return retval;
+}
+
+/**
+ * pm_request_resume - Submit a resume request for given device.
+ * @dev: Device to resume.
+ */
+int pm_request_resume(struct device *dev)
+{
+	unsigned long flags;
+	int retval;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+	retval = __pm_request_resume(dev);
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_request_resume);
+
+/**
+ * __pm_runtime_get - Reference count a device and wake it up, if necessary.
+ * @dev: Device to handle.
+ * @sync: If set and the device is suspended, resume it synchronously.
+ *
+ * Increment the usage count of the device and if it was zero previously,
+ * resume it or submit a resume request for it, depending on the value of @sync.
+ */
+int __pm_runtime_get(struct device *dev, bool sync)
+{
+	int retval = 1;
+
+	if (atomic_add_return(1, &dev->power.usage_count) == 1)
+		retval = sync ? pm_runtime_resume(dev) : pm_request_resume(dev);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(__pm_runtime_get);
+
+/**
+ * __pm_runtime_put - Decrement the device's usage counter and notify its bus.
+ * @dev: Device to handle.
+ * @sync: If the device's bus type is to be notified, do that synchronously.
+ *
+ * Decrement the usage count of the device and if it reaches zero, carry out a
+ * synchronous idle notification or submit an idle notification request for it,
+ * depending on the value of @sync.
+ */
+int __pm_runtime_put(struct device *dev, bool sync)
+{
+	int retval = 0;
+
+	if (atomic_dec_and_test(&dev->power.usage_count))
+		retval = sync ? pm_runtime_idle(dev) : pm_request_idle(dev);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(__pm_runtime_put);
+
+/**
+ * __pm_runtime_set_status - Set run-time PM status of a device.
+ * @dev: Device to handle.
+ * @status: New run-time PM status of the device.
+ *
+ * If run-time PM of the device is disabled or its power.runtime_failure flag is
+ * set, the status may be changed either to RPM_ACTIVE, or to RPM_SUSPENDED, as
+ * long as that reflects the actual state of the device.  However, if the device
+ * has a parent and the parent is not active, and the parent's
+ * power.ignore_children flag is unset, the device's status cannot be set to
+ * RPM_ACTIVE, so -EBUSY is returned in that case.
+ *
+ * If successful, __pm_runtime_set_status() clears the power.runtime_failure
+ * flag and the device parent's counter of unsuspended children is modified to
+ * reflect the new status.
+ */
+int __pm_runtime_set_status(struct device *dev, unsigned int status)
+{
+	struct device *parent = dev->parent;
+	unsigned long flags;
+	int error = 0;
+
+	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
+		return -EINVAL;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (!dev->power.runtime_failure && !dev->power.disable_depth)
+		goto out;
+
+	if (dev->power.runtime_status == status)
+		goto out_clear;
+
+	if (status == RPM_SUSPENDED) {
+		/* It always is possible to set the status to 'suspended'. */
+		if (parent)
+			atomic_add_unless(&parent->power.child_count, -1, 0);
+		dev->power.runtime_status = status;
+		goto out_clear;
+	}
+
+	if (parent) {
+		spin_lock_irq(&parent->power.lock);
+
+		/*
+		 * It is invalid to put an active child under a parent that is
+		 * not active, has run-time PM enabled and the
+		 * 'power.ignore_children' flag unset.
+		 */
+		if (!parent->power.disable_depth
+		    && !parent->power.ignore_children
+		    && parent->power.runtime_status != RPM_ACTIVE) {
+			error = -EBUSY;
+		} else {
+			if (dev->power.runtime_status == RPM_SUSPENDED)
+				atomic_inc(&parent->power.child_count);
+			dev->power.runtime_status = status;
+		}
+
+		spin_unlock_irq(&parent->power.lock);
+
+		if (error)
+			goto out;
+	} else {
+		dev->power.runtime_status = status;
+	}
+
+ out_clear:
+	dev->power.runtime_failure = false;
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return error;
+}
+EXPORT_SYMBOL_GPL(__pm_runtime_set_status);
+
+/**
+ * pm_runtime_enable - Enable run-time PM of a device.
+ * @dev: Device to handle.
+ */
+void pm_runtime_enable(struct device *dev)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (dev->power.disable_depth > 0)
+		dev->power.disable_depth--;
+	else
+		dev_warn(dev, "Unbalanced %s!", __func__);
+
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+}
+EXPORT_SYMBOL_GPL(pm_runtime_enable);
+
+/**
+ * pm_runtime_disable - Disable run-time PM of a device.
+ * @dev: Device to handle.
+ *
+ * Increment power.disable_depth for the device and if was zero previously,
+ * cancel all pending run-time PM requests for the device and wait for all
+ * operations in progress to complete.  The device can be either active or
+ * suspended after its run-time PM has been disabled.
+ *
+ * If there's a resume request pending when pm_runtime_disable() is called and
+ * power.disable_depth is zero, the function will wake up the device before
+ * disabling its run-time PM and will return 1.  Otherwise, 0 is returned.
+ */
+int pm_runtime_disable(struct device *dev)
+{
+	int retval = 0;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (dev->power.disable_depth > 0) {
+		dev->power.disable_depth++;
+		goto out;
+	}
+
+	/*
+	 * Wake up the device if there's a resume request pending, because that
+	 * means there probably is some I/O to process and disabling run-time PM
+	 * shouldn't prevent the device from processing the I/O.
+	 */
+	if (dev->power.request_pending
+	    && dev->power.request == RPM_REQ_RESUME) {
+		/*
+		 * Prevent suspends and idle notifications from being carried
+		 * out after we have woken up the device.
+		 */
+		pm_runtime_get_noresume(dev);
+
+		__pm_runtime_resume(dev, false);
+
+		pm_runtime_put_noidle(dev);
+		retval = 1;
+	}
+
+	if (dev->power.disable_depth++ > 0)
+		goto out;
+
+	if (dev->power.runtime_failure)
+		goto out;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		dev->power.request = RPM_REQ_NONE;
+
+		spin_unlock_irq(&dev->power.lock);
+
+		cancel_work_sync(&dev->power.work);
+
+		spin_lock_irq(&dev->power.lock);
+
+		dev->power.request_pending = false;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDING
+	    || dev->power.runtime_status == RPM_RESUMING) {
+		DEFINE_WAIT(wait);
+
+		/* Suspend or wake-up in progress. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_SUSPENDING
+			    && dev->power.runtime_status != RPM_RESUMING)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+	}
+
+	if (dev->power.runtime_failure)
+		goto out;
+
+	if (dev->power.idle_notification) {
+		DEFINE_WAIT(wait);
+
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (!dev->power.idle_notification)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+	}
+
+ out:
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_disable);
+
+/**
+ * pm_runtime_init - Initialize run-time PM fields in given device object.
+ * @dev: Device object to initialize.
+ */
+void pm_runtime_init(struct device *dev)
+{
+	spin_lock_init(&dev->power.lock);
+
+	dev->power.runtime_status = RPM_SUSPENDED;
+	dev->power.idle_notification = false;
+
+	dev->power.disable_depth = 1;
+	atomic_set(&dev->power.usage_count, 0);
+
+	dev->power.runtime_failure = false;
+	dev->power.last_error = 0;
+
+	atomic_set(&dev->power.child_count, 0);
+	pm_suspend_ignore_children(dev, false);
+
+	dev->power.request_pending = false;
+	dev->power.request = RPM_REQ_NONE;
+	dev->power.deferred_resume = false;
+	INIT_WORK(&dev->power.work, pm_runtime_work);
+
+	dev->power.timer_expires = 0;
+	dev->power.suspend_timer.expires = jiffies;
+	dev->power.suspend_timer.data = (unsigned long)dev;
+	dev->power.suspend_timer.function = pm_suspend_timer_fn;
+
+	init_waitqueue_head(&dev->power.wait_queue);
+}
+
+/**
+ * pm_runtime_remove - Prepare for removing a device from device hierarchy.
+ * @dev: Device object being removed from device hierarchy.
+ */
+void pm_runtime_remove(struct device *dev)
+{
+	pm_runtime_disable(dev);
+
+	if (dev->power.runtime_status == RPM_ACTIVE) {
+		struct device *parent = dev->parent;
+
+		/*
+		 * Change the status back to 'suspended' to match the initial
+		 * status.
+		 */
+		pm_runtime_set_suspended(dev);
+		if (parent && !parent->power.ignore_children)
+			pm_request_idle(parent);
+	}
+}
Index: linux-2.6/include/linux/pm_runtime.h
===================================================================
--- /dev/null
+++ linux-2.6/include/linux/pm_runtime.h
@@ -0,0 +1,111 @@
+/*
+ * pm_runtime.h - Device run-time power management helper functions.
+ *
+ * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * This file is released under the GPLv2.
+ */
+
+#ifndef _LINUX_PM_RUNTIME_H
+#define _LINUX_PM_RUNTIME_H
+
+#include <linux/device.h>
+#include <linux/pm.h>
+
+#ifdef CONFIG_PM_RUNTIME
+
+extern struct workqueue_struct *pm_wq;
+
+extern void pm_runtime_init(struct device *dev);
+extern void pm_runtime_remove(struct device *dev);
+extern int pm_runtime_idle(struct device *dev);
+extern int pm_runtime_suspend(struct device *dev);
+extern int pm_runtime_resume(struct device *dev);
+extern int pm_request_idle(struct device *dev);
+extern int pm_schedule_suspend(struct device *dev, unsigned int delay);
+extern int pm_request_resume(struct device *dev);
+extern int __pm_runtime_get(struct device *dev, bool sync);
+extern int __pm_runtime_put(struct device *dev, bool sync);
+extern int __pm_runtime_set_status(struct device *dev, unsigned int status);
+extern void pm_runtime_enable(struct device *dev);
+extern int pm_runtime_disable(struct device *dev);
+
+static inline bool pm_children_suspended(struct device *dev)
+{
+	return dev->power.ignore_children
+		|| !atomic_read(&dev->power.child_count);
+}
+
+static inline void pm_suspend_ignore_children(struct device *dev, bool enable)
+{
+	dev->power.ignore_children = enable;
+}
+
+static inline void pm_runtime_get_noresume(struct device *dev)
+{
+	atomic_inc(&dev->power.usage_count);
+}
+
+static inline void pm_runtime_put_noidle(struct device *dev)
+{
+	atomic_add_unless(&dev->power.usage_count, -1, 0);
+}
+
+#else /* !CONFIG_PM_RUNTIME */
+
+static inline void pm_runtime_init(struct device *dev) {}
+static inline void pm_runtime_remove(struct device *dev) {}
+static inline int pm_runtime_idle(struct device *dev) { return -ENOSYS; }
+static inline int pm_runtime_suspend(struct device *dev) { return -ENOSYS; }
+static inline int pm_runtime_resume(struct device *dev) { return 0; }
+static inline int pm_request_idle(struct device *dev) { return -ENOSYS; }
+static inline int pm_schedule_suspend(struct device *dev, unsigned int delay)
+{
+	return -ENOSYS;
+}
+static inline int pm_request_resume(struct device *dev) { return 0; }
+static inline int __pm_runtime_get(struct device *dev, bool sync) { return 1; }
+static inline int __pm_runtime_put(struct device *dev, bool sync) { return 0; }
+static inline int __pm_runtime_set_status(struct device *dev,
+					    unsigned int status) { return 0; }
+static inline void pm_runtime_enable(struct device *dev) {}
+static inline int pm_runtime_disable(struct device *dev) { return 0; }
+
+static inline bool pm_children_suspended(struct device *dev) { return false; }
+static inline void pm_suspend_ignore_children(struct device *dev, bool en) {}
+static inline void pm_runtime_get_noresume(struct device *dev) {}
+static inline void pm_runtime_put_noidle(struct device *dev) {}
+
+#endif /* !CONFIG_PM_RUNTIME */
+
+static inline int pm_runtime_get(struct device *dev)
+{
+	return __pm_runtime_get(dev, false);
+}
+
+static inline int pm_runtime_get_sync(struct device *dev)
+{
+	return __pm_runtime_get(dev, true);
+}
+
+static inline int pm_runtime_put(struct device *dev)
+{
+	return __pm_runtime_put(dev, false);
+}
+
+static inline int pm_runtime_put_sync(struct device *dev)
+{
+	return __pm_runtime_put(dev, true);
+}
+
+static inline int pm_runtime_set_active(struct device *dev)
+{
+	return __pm_runtime_set_status(dev, RPM_ACTIVE);
+}
+
+static inline void pm_runtime_set_suspended(struct device *dev)
+{
+	__pm_runtime_set_status(dev, RPM_SUSPENDED);
+}
+
+#endif
Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -21,6 +21,7 @@
 #include <linux/kallsyms.h>
 #include <linux/mutex.h>
 #include <linux/pm.h>
+#include <linux/pm_runtime.h>
 #include <linux/resume-trace.h>
 #include <linux/rwsem.h>
 #include <linux/interrupt.h>
@@ -49,6 +50,16 @@ static DEFINE_MUTEX(dpm_list_mtx);
 static bool transition_started;
 
 /**
+ * device_pm_init - Initialize the PM-related part of a device object
+ * @dev: Device object to initialize.
+ */
+void device_pm_init(struct device *dev)
+{
+	dev->power.status = DPM_ON;
+	pm_runtime_init(dev);
+}
+
+/**
  *	device_pm_lock - lock the list of active devices used by the PM core
  */
 void device_pm_lock(void)
@@ -105,6 +116,8 @@ void device_pm_remove(struct device *dev
 	mutex_lock(&dpm_list_mtx);
 	list_del_init(&dev->power.entry);
 	mutex_unlock(&dpm_list_mtx);
+
+	pm_runtime_remove(dev);
 }
 
 /**
@@ -512,6 +525,7 @@ static void dpm_complete(pm_message_t st
 			mutex_unlock(&dpm_list_mtx);
 
 			device_complete(dev, state);
+			pm_runtime_enable(dev);
 
 			mutex_lock(&dpm_list_mtx);
 		}
@@ -757,11 +771,15 @@ static int dpm_prepare(pm_message_t stat
 		dev->power.status = DPM_PREPARING;
 		mutex_unlock(&dpm_list_mtx);
 
-		error = device_prepare(dev, state);
+		if (pm_runtime_disable(dev) && device_may_wakeup(dev))
+			error = -EBUSY;
+		else
+			error = device_prepare(dev, state);
 
 		mutex_lock(&dpm_list_mtx);
 		if (error) {
 			dev->power.status = DPM_ON;
+			pm_runtime_enable(dev);
 			if (error == -EAGAIN) {
 				put_device(dev);
 				error = 0;
Index: linux-2.6/drivers/base/dd.c
===================================================================
--- linux-2.6.orig/drivers/base/dd.c
+++ linux-2.6/drivers/base/dd.c
@@ -23,6 +23,7 @@
 #include <linux/kthread.h>
 #include <linux/wait.h>
 #include <linux/async.h>
+#include <linux/pm_runtime.h>
 
 #include "base.h"
 #include "power/power.h"
@@ -202,7 +203,9 @@ int driver_probe_device(struct device_dr
 	pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
 		 drv->bus->name, __func__, dev_name(dev), drv->name);
 
+	pm_runtime_get_noresume(dev);
 	ret = really_probe(dev, drv);
+	pm_runtime_put_noidle(dev);
 
 	return ret;
 }
@@ -306,6 +309,8 @@ static void __device_release_driver(stru
 
 	drv = dev->driver;
 	if (drv) {
+		pm_runtime_disable(dev);
+
 		driver_sysfs_remove(dev);
 
 		if (dev->bus)
@@ -324,6 +329,8 @@ static void __device_release_driver(stru
 			blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
 						     BUS_NOTIFY_UNBOUND_DRIVER,
 						     dev);
+
+		pm_runtime_enable(dev);
 	}
 }
 
Index: linux-2.6/drivers/base/power/power.h
===================================================================
--- linux-2.6.orig/drivers/base/power/power.h
+++ linux-2.6/drivers/base/power/power.h
@@ -1,8 +1,3 @@
-static inline void device_pm_init(struct device *dev)
-{
-	dev->power.status = DPM_ON;
-}
-
 #ifdef CONFIG_PM_SLEEP
 
 /*
@@ -16,14 +11,16 @@ static inline struct device *to_device(s
 	return container_of(entry, struct device, power.entry);
 }
 
+extern void device_pm_init(struct device *dev);
 extern void device_pm_add(struct device *);
 extern void device_pm_remove(struct device *);
 extern void device_pm_move_before(struct device *, struct device *);
 extern void device_pm_move_after(struct device *, struct device *);
 extern void device_pm_move_last(struct device *);
 
-#else /* CONFIG_PM_SLEEP */
+#else /* !CONFIG_PM_SLEEP */
 
+static inline void device_pm_init(struct device *dev) {}
 static inline void device_pm_add(struct device *dev) {}
 static inline void device_pm_remove(struct device *dev) {}
 static inline void device_pm_move_before(struct device *deva,
@@ -32,7 +29,7 @@ static inline void device_pm_move_after(
 					struct device *devb) {}
 static inline void device_pm_move_last(struct device *dev) {}
 
-#endif
+#endif /* !CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM
 
Index: linux-2.6/Documentation/power/runtime_pm.txt
===================================================================
--- /dev/null
+++ linux-2.6/Documentation/power/runtime_pm.txt
@@ -0,0 +1,340 @@
+Run-time Power Management Framework for I/O Devices
+
+(C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+
+1. Introduction
+
+Support for run-time power management (run-time PM) of I/O devices is provided
+at the power management core (PM core) level by means of:
+
+* The power management workqueue pm_wq in which bus types and device drivers can
+  put their PM-related work items.  It is strongly recommended that pm_wq be
+  used for queuing all work items related to run-time PM, because this allows
+  them to be synchronized with system-wide power transitions (suspend to RAM,
+  hibernation and resume from system sleep states).  pm_wq is declared in
+  include/linux/pm_runtime.h and defined in kernel/power/main.c.
+
+* A number of run-time PM fields in the 'power' member of 'struct device' (which
+  is of the type 'struct dev_pm_info', defined in include/linux/pm.h) that can
+  be used for synchronizing run-time PM operations with one another.
+
+* Three device run-time PM callbacks in 'struct dev_pm_ops' (defined in
+  include/linux/pm.h).
+
+* A set of helper functions defined in drivers/base/power/runtime.c that can be
+  used for carrying out run-time PM operations in such a way that the
+  synchronization between them is taken care of by the PM core.  Bus types and
+  device drivers are encouraged to use these functions.
+
+The run-time PM callbacks present in 'struct dev_pm_ops', the device run-time PM
+fields of 'struct dev_pm_info' and the core helper functions provided for
+run-time PM are described below.
+
+2. Device Run-time PM Callbacks
+
+There are three device run-time PM callbacks defined in 'struct dev_pm_ops':
+
+struct dev_pm_ops {
+	...
+	int (*runtime_suspend)(struct device *dev);
+	int (*runtime_resume)(struct device *dev);
+	void (*runtime_idle)(struct device *dev);
+	...
+};
+
+The ->runtime_suspend() callback is executed by the PM core for the bus type of
+the device being suspended.  The bus type's callback is then _entirely_
+_responsible_ for handling the device as appropriate, which may, but need not
+include executing the device driver's own ->runtime_suspend() callback (from the
+PM core's point of view it is not necessary to implement a ->runtime_suspend()
+callback in a device driver as long as the bus type's ->runtime_suspend() knows
+what to do to handle the device).
+
+  * Once the bus type's ->runtime_suspend() callback has completed successfully
+    for given device, the PM core regards the device as suspended, which need
+    not mean that the device has been put into a low power state.  It is
+    supposed to mean, however, that the device will not process data and will
+    not communicate with the CPU(s) and RAM until its bus type's
+    ->runtime_resume() callback is executed for it.  The run-time PM status of
+    a device after successful execution of its bus type's ->runtime_suspend()
+    callback is 'suspended'.
+
+  * If the bus type's ->runtime_suspend() callback returns -EBUSY or -EAGAIN,
+    the device's run-time PM status is supposed to be 'active', which means that
+    the device _must_ be fully operational afterwards.
+
+  * If the bus type's ->runtime_suspend() callback returns an error code
+    different from -EBUSY or -EAGAIN, the PM core regards this as a fatal
+    error and will refuse to run the helper functions described in Section 4
+    for the device, until the status of it is directly set either to 'active'
+    or to 'suspended' (the PM core provides special helper functions for this
+    purpose).
+
+In particular, it is recommended that ->runtime_suspend() return -EBUSY if
+device_may_wakeup() returns 'false' for the device.  On the other hand, if
+device_may_wakeup() returns 'true' for the device and the device is put
+into a low power state during the execution of its bus type's
+->runtime_suspend(), it is expected that remote wake-up (i.e. hardware mechanism
+allowing the device to request a change of its power state, such as PCI PME)
+will be enabled for the device.  Generally, remote wake-up should be enabled
+for all input devices put into a low power state at run time.
+
+The ->runtime_resume() callback is executed by the PM core for the bus type of
+the device being woken up.  The bus type's callback is then _entirely_
+_responsible_ for handling the device as appropriate, which may, but need not
+include executing the device driver's own ->runtime_resume() callback (from the
+PM core's point of view it is not necessary to implement a ->runtime_resume()
+callback in a device driver as long as the bus type's ->runtime_resume() knows
+what to do to handle the device).
+
+  * Once the bus type's ->runtime_resume() callback has completed successfully,
+    the PM core regards the device as fully operational, which means that the
+    device _must_ be able to complete I/O operations as needed.  The run-time
+    PM status of the device is then 'active'.
+
+  * If the bus type's ->runtime_resume() callback returns an error code, the PM
+    core regards this as a fatal error and will refuse to run the helper
+    functions described in Section 4 for the device, until its status is
+    directly set either to 'active' or to 'suspended' (the PM core provides
+    special helper functions for this purpose).
+
+The ->runtime_idle() callback is executed by the PM core for the bus type of
+given device whenever the device appears to be idle, which is indicated to the
+PM core by two counters, the device's usage counter and the counter of 'active'
+children of the device.
+
+  * If any of these counters is decreased using a helper function provided by
+    the PM core and it turns out to be equal to zero, the other counter is
+    checked.  If that counter also is equal to zero, the PM core executes the
+    device bus type's ->runtime_idle() callback (with the device as an
+    argument).
+
+The action performed by a bus type's ->runtime_idle() callback is totally
+dependent on the bus type in question, but the expected and recommended action
+is to check if the device can be suspended (i.e. if all of the conditions
+necessary for suspending the device are satisfied) and to queue up a suspend
+request for the device in that case.
+
+The helper functions provided by the PM core, described in Section 4, guarantee
+that the following constraints are met with respect to the bus type's run-time
+PM callbacks:
+
+(1) The callbacks are mutually exclusive (e.g. it is forbidden to execute
+    ->runtime_suspend() in parallel with ->runtime_resume() or with another
+    instance of ->runtime_suspend() for the same device) with the exception that
+    ->runtime_suspend() or ->runtime_resume() can be executed in parallel with
+    ->runtime_idle() (although ->runtime_idle() will not be started while any
+    of the other callbacks is being executed for the same device).
+
+(2) ->runtime_idle() and ->runtime_suspend() can only be executed for 'active'
+    devices (i.e. the PM core will only execute ->runtime_idle() or
+    ->runtime_suspend() for the devices the run-time PM status of which is
+    'active').
+
+(3) ->runtime_idle() and ->runtime_suspend() can only be executed for a device
+    the usage counter of which is equal to zero _and_ either the counter of
+    'active' children of which is equal to zero, or the 'power.ignore_children'
+    flag of which is set.
+
+(4) ->runtime_resume() can only be executed for 'suspended' devices  (i.e. the
+    PM core will only execute ->runtime_resume() for the devices the run-time
+    PM status of which is 'suspended').
+
+Additionally, the helper functions provided by the PM core obey the following
+rules:
+
+  * If ->runtime_suspend() is about to be executed or the execution of it is
+    scheduled or there's a pending request to execute it, ->runtime_idle() will
+    not be executed for the same device.
+
+  * A request to execute or to schedule the execution of ->runtime_suspend()
+    will cancel any pending requests to execute ->runtime_idle() for the same
+    device.
+
+  * If ->runtime_resume() is about to be executed or there's a pending request
+    to execute it, the other callbacks will not be executed for the same device.
+
+  * A request to execute ->runtime_resume() will cancel any pending or
+    scheduled requests to execute the other callbacks for the same device.
+
+3. Run-time PM Device Fields
+
+The following device run-time PM fields are present in 'struct dev_pm_info', as
+defined in include/linux/pm.h:
+
+  struct timer_list suspend_timer;
+    - timer used for scheduling (delayed) suspend request
+
+  unsigned long timer_expires;
+    - timer expiration time, in jiffies (if this is different from zero, the
+      timer is running and will expire at that time, otherwise the timer is not
+      running)
+
+  struct work_struct work;
+    - work structure used for queuing up requests (i.e. work items in pm_wq)
+
+  wait_queue_head_t wait_queue;
+    - wait queue used if any of the helper functions needs to wait for another
+      one to complete
+
+  spinlock_t lock;
+    - lock used for synchronisation
+
+  atomic_t usage_count;
+    - the usage counter of the device
+
+  atomic_t child_count;
+    - the count of 'active' children of the device
+
+  unsigned int ignore_children;
+    - if set, the value of child_count is ignored (but still updated)
+
+  unsigned int disable_depth;
+    - used for disabling the helper funcions (they work normally if this is
+      equal to zero); the initial value of it is 1 (i.e. run-time PM is
+      initially disabled for all devices)
+
+  unsigned int runtime_failure;
+    - if set, there was a fatal error (one of the callbacks returned error code
+      as described in Section 2), so the helper funtions will not work until
+      this flag is cleared
+
+  int last_error;
+    - if runtime_failure is set, this is the error code returned by the
+      failing callback
+
+  unsigned int idle_notification;
+    - if set, ->runtime_idle() is being executed
+
+  unsigned int request_pending;
+    - if set, there's a pending request (i.e. a work item queued up into pm_wq)
+
+  enum rpm_request request;
+    - type of request that's pending (valid if request_pending is set)
+
+  unsigned int deferred_resume;
+    - set if ->runtime_resume() is about to be run while ->runtime_suspend() is
+      being executed for that device and it is not practical to wait for the
+      suspend to complete; means "queue up a resume request as soon as you've
+      suspended"
+
+  enum rpm_status runtime_status;
+    - the run-time PM status of the device; this field's initial value is
+      RPM_SUSPENDED, which means that each device is initially regarded by the
+      PM core as 'suspended', regardless of its real hardware status
+
+All of the above fields are members of the 'power' member of 'struct device'.
+
+4. Run-time PM Device Helper Functions
+
+The following run-time PM helper functions are defined in
+drivers/base/power/runtime.c and include/linux/pm_runtime.h:
+
+  void pm_runtime_init(struct device *dev);
+    - initialize the device run-time PM fields in 'struct dev_pm_info'
+
+  void pm_runtime_remove(struct device *dev);
+    - make sure that the run-time PM of the device will be disabled after
+      removing the device from device hierarchy
+
+  int pm_runtime_idle(struct device *dev);
+    - execute ->runtime_idle() for the device's bus type; returns 0 on success
+      or error code on failure, where -EINPROGRESS means that ->runtime_idle()
+      is already being executed
+
+  int pm_runtime_suspend(struct device *dev);
+    - execute ->runtime_suspend() for the device's bus type; returns 0 on
+      success, 1 if the device's run-time PM status was already 'suspended', or
+      error code on failure, where -EAGAIN or -EBUSY means it is safe to attempt
+      to suspend the device again in future
+
+  int pm_runtime_resume(struct device *dev);
+    - execute ->runtime_resume() for the device's bus type; returns 0 on
+      success, 1 if the device's run-time PM status was already 'active' or
+      error code on failure, where -EAGAIN means it may be safe to attempt to
+      resume the device again in future, but 'power.runtime_failure' should be
+      checked additionally
+
+  int pm_request_idle(struct device *dev);
+    - submit a request to execute ->runtime_idle() for the device's bus type
+      (the request is represented by a work item in pm_wq); returns 0 on success
+      or error code if the request has not been queued up
+
+  int pm_schedule_suspend(struct device *dev, unsigned int delay);
+    - schedule the execution of ->runtime_suspend() for the device's bus type
+      in future, where 'delay' is the time to wait before queuing up a suspend
+      work item in pm_wq, in miliseconds (if 'delay' is zero, the work item is
+      queued up immediately); returns 0 on success, 1 if the device's PM
+      run-time status was already 'suspended', or error code if the request
+      hasn't been scheduled (or queued up if 'delay' is 0)
+
+  int pm_request_resume(struct device *dev);
+    - submit a request to execute ->runtime_resume() for the device's bus type
+      (the request is represented by a work item in pm_wq); returns 0 on
+      success, 1 if the device's run-time PM status was already 'active', or
+      error code if the request hasn't been queued up
+
+  void pm_runtime_get_noresume(struct device *dev);
+    - increment the device's usage counter
+
+  int pm_runtime_get(struct device *dev);
+    - increment the device's usage counter, run pm_request_resume(dev) and
+      return its result
+
+  int pm_runtime_get_sync(struct device *dev);
+    - increment the device's usage counter, run pm_rutime_resume(dev) and return
+      its result
+
+  void pm_runtime_put_noidle(struct device *dev);
+    - decrement the device's usage counter
+
+  int pm_runtime_put(struct device *dev);
+    - decrement the device's usage counter, run pm_request_idle(dev) and return
+      its result
+
+  int pm_runtime_put_sync(struct device *dev);
+    - decrement the device's usage counter, run pm_rutime_idle(dev) and return
+      its result
+
+  void pm_runtime_enable(struct device *dev);
+    - enable the run-time PM helper functions to run the device bus type's
+      run-time PM callbacks described in Section 2
+
+  int pm_runtime_disable(struct device *dev);
+    - prevent the run-time PM helper functions from running the device bus
+      type's run-time PM callbacks, make sure that all of the pending run-time
+      PM operations on the device are either completed or canceled; returns
+      1 if there was a resume request pending and it was necessary to execute
+      ->runtime_resume() for the device's bus type to satisfy that request,
+      otherwise 0 is returned
+
+  void pm_suspend_ignore_children(struct device *dev, bool enable);
+    - set/unset the power.ignore_children flag of the device
+
+  int pm_runtime_set_active(struct device *dev);
+    - clear the device's 'power.runtime_error' flag, set the device's run-time
+      PM status to 'active' and update its parent's counter of 'active'
+      children as appropriate (it is only valid to use this function if
+      'power.runtime_failure' is set or 'power.disable_depth' is greater than
+      zero); it will fail and return error code if the device has a parent
+      which is not active and the 'power.ignore_children' flag of which is unset
+
+  void pm_runtime_set_suspended(struct device *dev);
+    - clear the device's 'power.runtime_error' flag, set the device's run-time
+      PM status to 'suspended' and update its parent's counter of 'active'
+      children as appropriate (it is only valid to use this function if
+      'power.runtime_failure' is set or 'power.disable_depth' is greater than
+      zero)
+
+It is safe to execute the following helper functions from interrupt context:
+
+pm_request_idle()
+pm_schedule_suspend()
+pm_request_resume()
+pm_runtime_get_noresume()
+pm_runtime_get()
+pm_runtime_put_noidle()
+pm_runtime_put()
+pm_suspend_ignore_children()
+pm_runtime_set_active()
+pm_runtime_set_suspended()

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Resend][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 11)
  2009-08-03 21:36 [Resend][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 11) Rafael J. Wysocki
  2009-08-04 20:33 ` Alan Stern
@ 2009-08-04 20:33 ` Alan Stern
  2009-08-05  0:19   ` Rafael J. Wysocki
  2009-08-05  0:19   ` Rafael J. Wysocki
  1 sibling, 2 replies; 39+ messages in thread
From: Alan Stern @ 2009-08-04 20:33 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux-pm mailing list, Magnus Damm, Greg KH, Pavel Machek,
	Len Brown, LKML

On Mon, 3 Aug 2009, Rafael J. Wysocki wrote:

> Hi,
> 
> OK, if this is to go into 2.6.32, the last moment for putting it into
> linux-next is now.  If you have any objections, remarks, etc. please let me
> know or I'm going to put this one into the linux-next branch of the suspend-2.6
> tree in the next couple of days.

I'm sorry I haven't been keeping on top of all your work on this.
Lots of other stuff has been going on in the meantime...

One the whole this all looks very good.  It's basically ready to be
merged.  There are a couple of minor issues remaining plus a bunch of
unimportant implementation details.

pm_runtime_disable() gets used for several different purposes.  For
the usage in pm_runtime_remove(), it's silly to carry out a pending
resume request.  Should we add an argument saying whether or not to do
so?

In the documentation, it would be nice to have a section listing the
default runtime PM settings and explaining what a driver should do to
activate runtime PM on a newly-registered device.


> +static void pm_runtime_cancel_pending(struct device *dev)
> +{
> +	pm_runtime_deactivate_timer(dev);
> +	/*
> +	 * If there's a request pending, make sure its work function will return
> +	 * without doing anything.
> +	 */
> +	if (dev->power.request_pending)
> +		dev->power.request = RPM_REQ_NONE;

No need for the "if"; you can always do the assignment.


> +static int __pm_runtime_idle(struct device *dev)
> +{
> +	int retval = 0;
> +
> +	if (dev->power.runtime_failure)
> +		retval = -EINVAL;

Instead of assigning to retval, you could simply return these values.

> +	else if (dev->power.idle_notification)
> +		retval = -EINPROGRESS;
> +	else if (atomic_read(&dev->power.usage_count) > 0
> +	    || dev->power.disable_depth > 0
> +	    || dev->power.timer_expires > 0
> +	    || dev->power.runtime_status == RPM_SUSPENDED
> +	    || dev->power.runtime_status == RPM_SUSPENDING)
> +		retval = -EAGAIN;

Do we also want to rule out RPM_RESUMING?  That is,

	    || dev->power.runtime_status != RPM_ACTIVE

> +	spin_unlock_irq(&dev->power.lock);
> +
> +	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle)
> +		dev->bus->pm->runtime_idle(dev);
> +
> +	spin_lock_irq(&dev->power.lock);

Small optimization: Put the spin_{un}lock_irq stuff inside the "if"
statement, so it doesn't happen if the test fails.  The same thing can
be done in other places.


> + * __pm_runtime_suspend - Carry out run-time suspend of given device.
> + * @dev: Device to suspend.
> + * @from_wq: If set, the funtion has been called via pm_wq.

Fix spelling of "function".  Likewise in __pm_runtime_resume.

> +int __pm_runtime_suspend(struct device *dev, bool from_wq)
> +{
...
> +	spin_unlock_irq(&dev->power.lock);
> +
> +	if (parent && !parent->power.ignore_children)
> +		pm_request_idle(parent);
> +
> +	if (notify)
> +		pm_runtime_idle(dev);

Move this up before the spin_unlock_irq and call __pm_runtime_idle instead.
The same sort of thing can be done in __pm_runtime_resume.


> +static int __pm_request_suspend(struct device *dev)
> +{
> +	int retval = 0;
> +
> +	if (dev->power.runtime_failure)
> +		return -EINVAL;
> +
> +	if (dev->power.runtime_status == RPM_SUSPENDED)
> +		retval = 1;
> +	else if (atomic_read(&dev->power.usage_count) > 0
> +	    || dev->power.disable_depth > 0)
> +		retval = -EAGAIN;
> +	else if (dev->power.runtime_status == RPM_SUSPENDING)
> +		retval = -EINPROGRESS;
> +	else if (!pm_children_suspended(dev))
> +		retval = -EBUSY;

Insert:
	if (retval)
		return retval;

Or else change the assignments to "return" statements.  Yes, we agreed
that a suspend request should override an existing idle-notify
request.  But if the new request fails then it shouldn't override
anything.  (Of course, if it fails for any of the reasons here then
there can't be a pending idle-notify request anyway.)

> +	pm_runtime_deactivate_timer(dev);
> +
> +	if (dev->power.request_pending) {
> +		/*
> +		 * Pending resume requests take precedence over us, but we can
> +		 * overtake any other pending request.
> +		 */
> +		if (dev->power.request == RPM_REQ_RESUME)
> +			retval = -EAGAIN;
> +		else if (dev->power.request != RPM_REQ_SUSPEND)
> +			dev->power.request = retval ?
> +						RPM_REQ_NONE : RPM_REQ_SUSPEND;

Now there's no need to check retval.

> +
> +		if (dev->power.request == RPM_REQ_SUSPEND)
> +			return 0;

Just simply:
		return retval;

Some of these cases can't happen.  For instance, if we reach here then
the status can't be SUSPENDED or SUSPENDING, so there can't be a
pending resume request.

> +	}
> +
> +	if (retval)
> +		return retval;

Now this isn't needed.  Similar code rearrangements can be made in
__pm_request_resume.


> +int __pm_request_resume(struct device *dev)

Should be static.


> +int __pm_runtime_set_status(struct device *dev, unsigned int status)
> +{
> +	struct device *parent = dev->parent;
> +	unsigned long flags;
> +	int error = 0;
> +
> +	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
> +		return -EINVAL;

This should go inside the spinlocked area.

> +	spin_lock_irqsave(&dev->power.lock, flags);
> +
> +	if (!dev->power.runtime_failure && !dev->power.disable_depth)
> +		goto out;
> +
> +	if (dev->power.runtime_status == status)
> +		goto out_clear;
> +
> +	if (status == RPM_SUSPENDED) {
> +		/* It always is possible to set the status to 'suspended'. */
> +		if (parent)
> +			atomic_add_unless(&parent->power.child_count, -1, 0);
> +		dev->power.runtime_status = status;
> +		goto out_clear;
> +	}
> +
> +	if (parent) {
> +		spin_lock_irq(&parent->power.lock);
> +
> +		/*
> +		 * It is invalid to put an active child under a parent that is
> +		 * not active, has run-time PM enabled and the
> +		 * 'power.ignore_children' flag unset.
> +		 */
> +		if (!parent->power.disable_depth
> +		    && !parent->power.ignore_children
> +		    && parent->power.runtime_status != RPM_ACTIVE) {
> +			error = -EBUSY;
> +		} else {
> +			if (dev->power.runtime_status == RPM_SUSPENDED)
> +				atomic_inc(&parent->power.child_count);
> +			dev->power.runtime_status = status;
> +		}
> +
> +		spin_unlock_irq(&parent->power.lock);
> +
> +		if (error)
> +			goto out;
> +	} else {
> +		dev->power.runtime_status = status;
> +	}
> +
> + out_clear:
> +	dev->power.runtime_failure = false;

Move all those assignments to dev->power.runtime_status down here.


> +int pm_runtime_disable(struct device *dev)
> +{
...
> +	if (dev->power.disable_depth++ > 0)
> +		goto out;
> +
> +	if (dev->power.runtime_failure)
> +		goto out;

I don't see why this is needed.

> +
> +	pm_runtime_deactivate_timer(dev);
> +
> +	if (dev->power.request_pending) {
> +		dev->power.request = RPM_REQ_NONE;
> +
> +		spin_unlock_irq(&dev->power.lock);
> +
> +		cancel_work_sync(&dev->power.work);
> +
> +		spin_lock_irq(&dev->power.lock);
> +
> +		dev->power.request_pending = false;

Remove excessive whitespace.

> +	}
> +
> +	if (dev->power.runtime_status == RPM_SUSPENDING
> +	    || dev->power.runtime_status == RPM_RESUMING) {
> +		DEFINE_WAIT(wait);
> +
> +		/* Suspend or wake-up in progress. */
> +		for (;;) {
> +			prepare_to_wait(&dev->power.wait_queue, &wait,
> +					TASK_UNINTERRUPTIBLE);
> +			if (dev->power.runtime_status != RPM_SUSPENDING
> +			    && dev->power.runtime_status != RPM_RESUMING)
> +				break;
> +
> +			spin_unlock_irq(&dev->power.lock);
> +
> +			schedule();
> +
> +			spin_lock_irq(&dev->power.lock);
> +		}
> +		finish_wait(&dev->power.wait_queue, &wait);
> +	}
> +
> +	if (dev->power.runtime_failure)
> +		goto out;
> +
> +	if (dev->power.idle_notification) {
> +		DEFINE_WAIT(wait);
> +
> +		for (;;) {
> +			prepare_to_wait(&dev->power.wait_queue, &wait,
> +					TASK_UNINTERRUPTIBLE);
> +			if (!dev->power.idle_notification)
> +				break;
> +
> +			spin_unlock_irq(&dev->power.lock);
> +
> +			schedule();
> +
> +			spin_lock_irq(&dev->power.lock);
> +		}
> +		finish_wait(&dev->power.wait_queue, &wait);
> +	}

This wait loop should be merged with the previous loop.


> +void pm_runtime_init(struct device *dev)
> +{
> +	spin_lock_init(&dev->power.lock);
> +
> +	dev->power.runtime_status = RPM_SUSPENDED;
> +	dev->power.idle_notification = false;
> +
> +	dev->power.disable_depth = 1;
> +	atomic_set(&dev->power.usage_count, 0);
> +
> +	dev->power.runtime_failure = false;
> +	dev->power.last_error = 0;

You don't have to set values to 0; they are initialized by kzalloc.

> +	dev->power.suspend_timer.expires = jiffies;
> +	dev->power.suspend_timer.data = (unsigned long)dev;
> +	dev->power.suspend_timer.function = pm_suspend_timer_fn;

Use setup_timer() instead of assigning these fields directly.


> +void pm_runtime_remove(struct device *dev)
> +{
> +	pm_runtime_disable(dev);
> +
> +	if (dev->power.runtime_status == RPM_ACTIVE) {
> +		struct device *parent = dev->parent;
> +
> +		/*
> +		 * Change the status back to 'suspended' to match the initial
> +		 * status.
> +		 */
> +		pm_runtime_set_suspended(dev);
> +		if (parent && !parent->power.ignore_children)
> +			pm_request_idle(parent);

Shouldn't these last two lines be part of __pm_runtime_set_status()?


> --- /dev/null
> +++ linux-2.6/include/linux/pm_runtime.h

> +extern void pm_runtime_init(struct device *dev);
> +extern void pm_runtime_remove(struct device *dev);

I don't like seeing these two functions included in the public header
file.  It's enough to put them in drivers/base/power/power.h.


> --- linux-2.6.orig/drivers/base/power/main.c
> +++ linux-2.6/drivers/base/power/main.c
> @@ -49,6 +50,16 @@ static DEFINE_MUTEX(dpm_list_mtx);
>  static bool transition_started;
>  
>  /**
> + * device_pm_init - Initialize the PM-related part of a device object
> + * @dev: Device object to initialize.
> + */
> +void device_pm_init(struct device *dev)
> +{
> +	dev->power.status = DPM_ON;
> +	pm_runtime_init(dev);
> +}
> +
> +/**
>   *	device_pm_lock - lock the list of active devices used by the PM core
>   */
>  void device_pm_lock(void)
> @@ -105,6 +116,8 @@ void device_pm_remove(struct device *dev
>  	mutex_lock(&dpm_list_mtx);
>  	list_del_init(&dev->power.entry);
>  	mutex_unlock(&dpm_list_mtx);
> +
> +	pm_runtime_remove(dev);
>  }

Calling pm_runtime_init() from device_pm_init() and
pm_runtime_remove() from device_pm_remove() isn't good.  If
CONFIG_PM_SLEEP isn't enabled then the calls won't be compiled, even
if CONFIG_PM_RUNTIME is set.


> @@ -757,11 +771,15 @@ static int dpm_prepare(pm_message_t stat
>  		dev->power.status = DPM_PREPARING;
>  		mutex_unlock(&dpm_list_mtx);
>  
> -		error = device_prepare(dev, state);
> +		if (pm_runtime_disable(dev) && device_may_wakeup(dev))
> +			error = -EBUSY;

What's the reason for the -EBUSY error?


> --- linux-2.6.orig/drivers/base/dd.c
> +++ linux-2.6/drivers/base/dd.c
> @@ -202,7 +203,9 @@ int driver_probe_device(struct device_dr
>  	pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
>  		 drv->bus->name, __func__, dev_name(dev), drv->name);
>  
> +	pm_runtime_get_noresume(dev);
>  	ret = really_probe(dev, drv);
> +	pm_runtime_put_noidle(dev);

This is bad because it won't wait if there's a runtime-PM call in
progress.  Also, we shouldn't use put_noidle because it might subvert
the driver's attempt to autosuspend.  Instead we should do something
like this:

	/* Wait for runtime PM calls to finish and prevent new calls
	 * until the probe is done.
	 */
	pm_runtime_disable(dev);
	pm_runtime_get_noresume(dev);
	pm_runtime_enable(dev):
	ret = really_probe(dev, drv);
	pm_runtime_put_sync(dev);


> --- /dev/null
> +++ linux-2.6/Documentation/power/runtime_pm.txt

> +2. Device Run-time PM Callbacks

> +In particular, it is recommended that ->runtime_suspend() return -EBUSY if
> +device_may_wakeup() returns 'false' for the device.

What's the point of this?  I don't understand -- we don't want to
discourage people from suspending devices with wakeup enabled.


> +Additionally, the helper functions provided by the PM core obey the following
> +rules:
> +
> +  * If ->runtime_suspend() is about to be executed or the execution of it is
> +    scheduled or there's a pending request to execute it, ->runtime_idle() will
> +    not be executed for the same device.

Shouldn't we allow runtime_idle when a suspend is scheduled?  The idle
handler might decide to suspend right away instead of waiting for the
timer to expire.


> +4. Run-time PM Device Helper Functions
> +
> +The following run-time PM helper functions are defined in
> +drivers/base/power/runtime.c and include/linux/pm_runtime.h:

> +  int pm_schedule_suspend(struct device *dev, unsigned int delay);
> +    - schedule the execution of ->runtime_suspend() for the device's bus type
> +      in future, where 'delay' is the time to wait before queuing up a suspend
> +      work item in pm_wq, in miliseconds (if 'delay' is zero, the work item is

Fix spelling of "milli".  Explain that the new delay will override the
old one if a suspend was already scheduled and not yet expired.

> +  int pm_runtime_get_sync(struct device *dev);
> +    - increment the device's usage counter, run pm_rutime_resume(dev) and return

Fix spelling of "runtime".  Same under pm_runtime_put_sync.

Alan Stern


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Resend][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 11)
  2009-08-03 21:36 [Resend][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 11) Rafael J. Wysocki
@ 2009-08-04 20:33 ` Alan Stern
  2009-08-04 20:33 ` Alan Stern
  1 sibling, 0 replies; 39+ messages in thread
From: Alan Stern @ 2009-08-04 20:33 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Greg KH, LKML, Linux-pm mailing list

On Mon, 3 Aug 2009, Rafael J. Wysocki wrote:

> Hi,
> 
> OK, if this is to go into 2.6.32, the last moment for putting it into
> linux-next is now.  If you have any objections, remarks, etc. please let me
> know or I'm going to put this one into the linux-next branch of the suspend-2.6
> tree in the next couple of days.

I'm sorry I haven't been keeping on top of all your work on this.
Lots of other stuff has been going on in the meantime...

One the whole this all looks very good.  It's basically ready to be
merged.  There are a couple of minor issues remaining plus a bunch of
unimportant implementation details.

pm_runtime_disable() gets used for several different purposes.  For
the usage in pm_runtime_remove(), it's silly to carry out a pending
resume request.  Should we add an argument saying whether or not to do
so?

In the documentation, it would be nice to have a section listing the
default runtime PM settings and explaining what a driver should do to
activate runtime PM on a newly-registered device.


> +static void pm_runtime_cancel_pending(struct device *dev)
> +{
> +	pm_runtime_deactivate_timer(dev);
> +	/*
> +	 * If there's a request pending, make sure its work function will return
> +	 * without doing anything.
> +	 */
> +	if (dev->power.request_pending)
> +		dev->power.request = RPM_REQ_NONE;

No need for the "if"; you can always do the assignment.


> +static int __pm_runtime_idle(struct device *dev)
> +{
> +	int retval = 0;
> +
> +	if (dev->power.runtime_failure)
> +		retval = -EINVAL;

Instead of assigning to retval, you could simply return these values.

> +	else if (dev->power.idle_notification)
> +		retval = -EINPROGRESS;
> +	else if (atomic_read(&dev->power.usage_count) > 0
> +	    || dev->power.disable_depth > 0
> +	    || dev->power.timer_expires > 0
> +	    || dev->power.runtime_status == RPM_SUSPENDED
> +	    || dev->power.runtime_status == RPM_SUSPENDING)
> +		retval = -EAGAIN;

Do we also want to rule out RPM_RESUMING?  That is,

	    || dev->power.runtime_status != RPM_ACTIVE

> +	spin_unlock_irq(&dev->power.lock);
> +
> +	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle)
> +		dev->bus->pm->runtime_idle(dev);
> +
> +	spin_lock_irq(&dev->power.lock);

Small optimization: Put the spin_{un}lock_irq stuff inside the "if"
statement, so it doesn't happen if the test fails.  The same thing can
be done in other places.


> + * __pm_runtime_suspend - Carry out run-time suspend of given device.
> + * @dev: Device to suspend.
> + * @from_wq: If set, the funtion has been called via pm_wq.

Fix spelling of "function".  Likewise in __pm_runtime_resume.

> +int __pm_runtime_suspend(struct device *dev, bool from_wq)
> +{
...
> +	spin_unlock_irq(&dev->power.lock);
> +
> +	if (parent && !parent->power.ignore_children)
> +		pm_request_idle(parent);
> +
> +	if (notify)
> +		pm_runtime_idle(dev);

Move this up before the spin_unlock_irq and call __pm_runtime_idle instead.
The same sort of thing can be done in __pm_runtime_resume.


> +static int __pm_request_suspend(struct device *dev)
> +{
> +	int retval = 0;
> +
> +	if (dev->power.runtime_failure)
> +		return -EINVAL;
> +
> +	if (dev->power.runtime_status == RPM_SUSPENDED)
> +		retval = 1;
> +	else if (atomic_read(&dev->power.usage_count) > 0
> +	    || dev->power.disable_depth > 0)
> +		retval = -EAGAIN;
> +	else if (dev->power.runtime_status == RPM_SUSPENDING)
> +		retval = -EINPROGRESS;
> +	else if (!pm_children_suspended(dev))
> +		retval = -EBUSY;

Insert:
	if (retval)
		return retval;

Or else change the assignments to "return" statements.  Yes, we agreed
that a suspend request should override an existing idle-notify
request.  But if the new request fails then it shouldn't override
anything.  (Of course, if it fails for any of the reasons here then
there can't be a pending idle-notify request anyway.)

> +	pm_runtime_deactivate_timer(dev);
> +
> +	if (dev->power.request_pending) {
> +		/*
> +		 * Pending resume requests take precedence over us, but we can
> +		 * overtake any other pending request.
> +		 */
> +		if (dev->power.request == RPM_REQ_RESUME)
> +			retval = -EAGAIN;
> +		else if (dev->power.request != RPM_REQ_SUSPEND)
> +			dev->power.request = retval ?
> +						RPM_REQ_NONE : RPM_REQ_SUSPEND;

Now there's no need to check retval.

> +
> +		if (dev->power.request == RPM_REQ_SUSPEND)
> +			return 0;

Just simply:
		return retval;

Some of these cases can't happen.  For instance, if we reach here then
the status can't be SUSPENDED or SUSPENDING, so there can't be a
pending resume request.

> +	}
> +
> +	if (retval)
> +		return retval;

Now this isn't needed.  Similar code rearrangements can be made in
__pm_request_resume.


> +int __pm_request_resume(struct device *dev)

Should be static.


> +int __pm_runtime_set_status(struct device *dev, unsigned int status)
> +{
> +	struct device *parent = dev->parent;
> +	unsigned long flags;
> +	int error = 0;
> +
> +	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
> +		return -EINVAL;

This should go inside the spinlocked area.

> +	spin_lock_irqsave(&dev->power.lock, flags);
> +
> +	if (!dev->power.runtime_failure && !dev->power.disable_depth)
> +		goto out;
> +
> +	if (dev->power.runtime_status == status)
> +		goto out_clear;
> +
> +	if (status == RPM_SUSPENDED) {
> +		/* It always is possible to set the status to 'suspended'. */
> +		if (parent)
> +			atomic_add_unless(&parent->power.child_count, -1, 0);
> +		dev->power.runtime_status = status;
> +		goto out_clear;
> +	}
> +
> +	if (parent) {
> +		spin_lock_irq(&parent->power.lock);
> +
> +		/*
> +		 * It is invalid to put an active child under a parent that is
> +		 * not active, has run-time PM enabled and the
> +		 * 'power.ignore_children' flag unset.
> +		 */
> +		if (!parent->power.disable_depth
> +		    && !parent->power.ignore_children
> +		    && parent->power.runtime_status != RPM_ACTIVE) {
> +			error = -EBUSY;
> +		} else {
> +			if (dev->power.runtime_status == RPM_SUSPENDED)
> +				atomic_inc(&parent->power.child_count);
> +			dev->power.runtime_status = status;
> +		}
> +
> +		spin_unlock_irq(&parent->power.lock);
> +
> +		if (error)
> +			goto out;
> +	} else {
> +		dev->power.runtime_status = status;
> +	}
> +
> + out_clear:
> +	dev->power.runtime_failure = false;

Move all those assignments to dev->power.runtime_status down here.


> +int pm_runtime_disable(struct device *dev)
> +{
...
> +	if (dev->power.disable_depth++ > 0)
> +		goto out;
> +
> +	if (dev->power.runtime_failure)
> +		goto out;

I don't see why this is needed.

> +
> +	pm_runtime_deactivate_timer(dev);
> +
> +	if (dev->power.request_pending) {
> +		dev->power.request = RPM_REQ_NONE;
> +
> +		spin_unlock_irq(&dev->power.lock);
> +
> +		cancel_work_sync(&dev->power.work);
> +
> +		spin_lock_irq(&dev->power.lock);
> +
> +		dev->power.request_pending = false;

Remove excessive whitespace.

> +	}
> +
> +	if (dev->power.runtime_status == RPM_SUSPENDING
> +	    || dev->power.runtime_status == RPM_RESUMING) {
> +		DEFINE_WAIT(wait);
> +
> +		/* Suspend or wake-up in progress. */
> +		for (;;) {
> +			prepare_to_wait(&dev->power.wait_queue, &wait,
> +					TASK_UNINTERRUPTIBLE);
> +			if (dev->power.runtime_status != RPM_SUSPENDING
> +			    && dev->power.runtime_status != RPM_RESUMING)
> +				break;
> +
> +			spin_unlock_irq(&dev->power.lock);
> +
> +			schedule();
> +
> +			spin_lock_irq(&dev->power.lock);
> +		}
> +		finish_wait(&dev->power.wait_queue, &wait);
> +	}
> +
> +	if (dev->power.runtime_failure)
> +		goto out;
> +
> +	if (dev->power.idle_notification) {
> +		DEFINE_WAIT(wait);
> +
> +		for (;;) {
> +			prepare_to_wait(&dev->power.wait_queue, &wait,
> +					TASK_UNINTERRUPTIBLE);
> +			if (!dev->power.idle_notification)
> +				break;
> +
> +			spin_unlock_irq(&dev->power.lock);
> +
> +			schedule();
> +
> +			spin_lock_irq(&dev->power.lock);
> +		}
> +		finish_wait(&dev->power.wait_queue, &wait);
> +	}

This wait loop should be merged with the previous loop.


> +void pm_runtime_init(struct device *dev)
> +{
> +	spin_lock_init(&dev->power.lock);
> +
> +	dev->power.runtime_status = RPM_SUSPENDED;
> +	dev->power.idle_notification = false;
> +
> +	dev->power.disable_depth = 1;
> +	atomic_set(&dev->power.usage_count, 0);
> +
> +	dev->power.runtime_failure = false;
> +	dev->power.last_error = 0;

You don't have to set values to 0; they are initialized by kzalloc.

> +	dev->power.suspend_timer.expires = jiffies;
> +	dev->power.suspend_timer.data = (unsigned long)dev;
> +	dev->power.suspend_timer.function = pm_suspend_timer_fn;

Use setup_timer() instead of assigning these fields directly.


> +void pm_runtime_remove(struct device *dev)
> +{
> +	pm_runtime_disable(dev);
> +
> +	if (dev->power.runtime_status == RPM_ACTIVE) {
> +		struct device *parent = dev->parent;
> +
> +		/*
> +		 * Change the status back to 'suspended' to match the initial
> +		 * status.
> +		 */
> +		pm_runtime_set_suspended(dev);
> +		if (parent && !parent->power.ignore_children)
> +			pm_request_idle(parent);

Shouldn't these last two lines be part of __pm_runtime_set_status()?


> --- /dev/null
> +++ linux-2.6/include/linux/pm_runtime.h

> +extern void pm_runtime_init(struct device *dev);
> +extern void pm_runtime_remove(struct device *dev);

I don't like seeing these two functions included in the public header
file.  It's enough to put them in drivers/base/power/power.h.


> --- linux-2.6.orig/drivers/base/power/main.c
> +++ linux-2.6/drivers/base/power/main.c
> @@ -49,6 +50,16 @@ static DEFINE_MUTEX(dpm_list_mtx);
>  static bool transition_started;
>  
>  /**
> + * device_pm_init - Initialize the PM-related part of a device object
> + * @dev: Device object to initialize.
> + */
> +void device_pm_init(struct device *dev)
> +{
> +	dev->power.status = DPM_ON;
> +	pm_runtime_init(dev);
> +}
> +
> +/**
>   *	device_pm_lock - lock the list of active devices used by the PM core
>   */
>  void device_pm_lock(void)
> @@ -105,6 +116,8 @@ void device_pm_remove(struct device *dev
>  	mutex_lock(&dpm_list_mtx);
>  	list_del_init(&dev->power.entry);
>  	mutex_unlock(&dpm_list_mtx);
> +
> +	pm_runtime_remove(dev);
>  }

Calling pm_runtime_init() from device_pm_init() and
pm_runtime_remove() from device_pm_remove() isn't good.  If
CONFIG_PM_SLEEP isn't enabled then the calls won't be compiled, even
if CONFIG_PM_RUNTIME is set.


> @@ -757,11 +771,15 @@ static int dpm_prepare(pm_message_t stat
>  		dev->power.status = DPM_PREPARING;
>  		mutex_unlock(&dpm_list_mtx);
>  
> -		error = device_prepare(dev, state);
> +		if (pm_runtime_disable(dev) && device_may_wakeup(dev))
> +			error = -EBUSY;

What's the reason for the -EBUSY error?


> --- linux-2.6.orig/drivers/base/dd.c
> +++ linux-2.6/drivers/base/dd.c
> @@ -202,7 +203,9 @@ int driver_probe_device(struct device_dr
>  	pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
>  		 drv->bus->name, __func__, dev_name(dev), drv->name);
>  
> +	pm_runtime_get_noresume(dev);
>  	ret = really_probe(dev, drv);
> +	pm_runtime_put_noidle(dev);

This is bad because it won't wait if there's a runtime-PM call in
progress.  Also, we shouldn't use put_noidle because it might subvert
the driver's attempt to autosuspend.  Instead we should do something
like this:

	/* Wait for runtime PM calls to finish and prevent new calls
	 * until the probe is done.
	 */
	pm_runtime_disable(dev);
	pm_runtime_get_noresume(dev);
	pm_runtime_enable(dev):
	ret = really_probe(dev, drv);
	pm_runtime_put_sync(dev);


> --- /dev/null
> +++ linux-2.6/Documentation/power/runtime_pm.txt

> +2. Device Run-time PM Callbacks

> +In particular, it is recommended that ->runtime_suspend() return -EBUSY if
> +device_may_wakeup() returns 'false' for the device.

What's the point of this?  I don't understand -- we don't want to
discourage people from suspending devices with wakeup enabled.


> +Additionally, the helper functions provided by the PM core obey the following
> +rules:
> +
> +  * If ->runtime_suspend() is about to be executed or the execution of it is
> +    scheduled or there's a pending request to execute it, ->runtime_idle() will
> +    not be executed for the same device.

Shouldn't we allow runtime_idle when a suspend is scheduled?  The idle
handler might decide to suspend right away instead of waiting for the
timer to expire.


> +4. Run-time PM Device Helper Functions
> +
> +The following run-time PM helper functions are defined in
> +drivers/base/power/runtime.c and include/linux/pm_runtime.h:

> +  int pm_schedule_suspend(struct device *dev, unsigned int delay);
> +    - schedule the execution of ->runtime_suspend() for the device's bus type
> +      in future, where 'delay' is the time to wait before queuing up a suspend
> +      work item in pm_wq, in miliseconds (if 'delay' is zero, the work item is

Fix spelling of "milli".  Explain that the new delay will override the
old one if a suspend was already scheduled and not yet expired.

> +  int pm_runtime_get_sync(struct device *dev);
> +    - increment the device's usage counter, run pm_rutime_resume(dev) and return

Fix spelling of "runtime".  Same under pm_runtime_put_sync.

Alan Stern

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Resend][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 11)
  2009-08-04 20:33 ` Alan Stern
  2009-08-05  0:19   ` Rafael J. Wysocki
@ 2009-08-05  0:19   ` Rafael J. Wysocki
  2009-08-05  2:44     ` Alan Stern
  2009-08-05  2:44     ` Alan Stern
  1 sibling, 2 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2009-08-05  0:19 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux-pm mailing list, Magnus Damm, Greg KH, Pavel Machek,
	Len Brown, LKML

On Tuesday 04 August 2009, Alan Stern wrote:
> On Mon, 3 Aug 2009, Rafael J. Wysocki wrote:
> 
> > Hi,
> > 
> > OK, if this is to go into 2.6.32, the last moment for putting it into
> > linux-next is now.  If you have any objections, remarks, etc. please let me
> > know or I'm going to put this one into the linux-next branch of the suspend-2.6
> > tree in the next couple of days.
> 
> I'm sorry I haven't been keeping on top of all your work on this.
> Lots of other stuff has been going on in the meantime...

No problem.

> One the whole this all looks very good.  It's basically ready to be
> merged.  There are a couple of minor issues remaining plus a bunch of
> unimportant implementation details.
> 
> pm_runtime_disable() gets used for several different purposes.  For
> the usage in pm_runtime_remove(), it's silly to carry out a pending
> resume request.  Should we add an argument saying whether or not to do
> so?

Yes, we can do that.

> In the documentation, it would be nice to have a section listing the
> default runtime PM settings and explaining what a driver should do to
> activate runtime PM on a newly-registered device.

OK, I'll try to put something like this in there.

> > +static void pm_runtime_cancel_pending(struct device *dev)
> > +{
> > +	pm_runtime_deactivate_timer(dev);
> > +	/*
> > +	 * If there's a request pending, make sure its work function will return
> > +	 * without doing anything.
> > +	 */
> > +	if (dev->power.request_pending)
> > +		dev->power.request = RPM_REQ_NONE;
> 
> No need for the "if"; you can always do the assignment.

OK

> > +static int __pm_runtime_idle(struct device *dev)
> > +{
> > +	int retval = 0;
> > +
> > +	if (dev->power.runtime_failure)
> > +		retval = -EINVAL;
> 
> Instead of assigning to retval, you could simply return these values.

I could, but I chose not to.

> > +	else if (dev->power.idle_notification)
> > +		retval = -EINPROGRESS;
> > +	else if (atomic_read(&dev->power.usage_count) > 0
> > +	    || dev->power.disable_depth > 0
> > +	    || dev->power.timer_expires > 0
> > +	    || dev->power.runtime_status == RPM_SUSPENDED
> > +	    || dev->power.runtime_status == RPM_SUSPENDING)
> > +		retval = -EAGAIN;
> 
> Do we also want to rule out RPM_RESUMING?  That is,
> 
> 	    || dev->power.runtime_status != RPM_ACTIVE

Yes, we do, thanks.

> > +	spin_unlock_irq(&dev->power.lock);
> > +
> > +	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle)
> > +		dev->bus->pm->runtime_idle(dev);
> > +
> > +	spin_lock_irq(&dev->power.lock);
> 
> Small optimization: Put the spin_{un}lock_irq stuff inside the "if"
> statement, so it doesn't happen if the test fails.

Well, I don't think so.  We need to take the lock here unconditionally,
because the caller is going to unlock it.

> The same thing can be done in other places.

I'm not really sure it can.

> > + * __pm_runtime_suspend - Carry out run-time suspend of given device.
> > + * @dev: Device to suspend.
> > + * @from_wq: If set, the funtion has been called via pm_wq.
> 
> Fix spelling of "function".  Likewise in __pm_runtime_resume.

Will do, thanks.
 
> > +int __pm_runtime_suspend(struct device *dev, bool from_wq)
> > +{
> ...
> > +	spin_unlock_irq(&dev->power.lock);
> > +
> > +	if (parent && !parent->power.ignore_children)
> > +		pm_request_idle(parent);
> > +
> > +	if (notify)
> > +		pm_runtime_idle(dev);
> 
> Move this up before the spin_unlock_irq and call __pm_runtime_idle instead.
> The same sort of thing can be done in __pm_runtime_resume.

OK

> > +static int __pm_request_suspend(struct device *dev)
> > +{
> > +	int retval = 0;
> > +
> > +	if (dev->power.runtime_failure)
> > +		return -EINVAL;
> > +
> > +	if (dev->power.runtime_status == RPM_SUSPENDED)
> > +		retval = 1;
> > +	else if (atomic_read(&dev->power.usage_count) > 0
> > +	    || dev->power.disable_depth > 0)
> > +		retval = -EAGAIN;
> > +	else if (dev->power.runtime_status == RPM_SUSPENDING)
> > +		retval = -EINPROGRESS;
> > +	else if (!pm_children_suspended(dev))
> > +		retval = -EBUSY;
> 
> Insert:
> 	if (retval)
> 		return retval;
> 
> Or else change the assignments to "return" statements.  Yes, we agreed
> that a suspend request should override an existing idle-notify
> request.  But if the new request fails then it shouldn't override
> anything.  (Of course, if it fails for any of the reasons here then
> there can't be a pending idle-notify request anyway.)

However, if it's going to return 1, it should override existing idle-notify
and suspend requests.  I'll add

 	if (retval < 0)
 		return retval;

> > +	pm_runtime_deactivate_timer(dev);
> > +
> > +	if (dev->power.request_pending) {
> > +		/*
> > +		 * Pending resume requests take precedence over us, but we can
> > +		 * overtake any other pending request.
> > +		 */
> > +		if (dev->power.request == RPM_REQ_RESUME)
> > +			retval = -EAGAIN;
> > +		else if (dev->power.request != RPM_REQ_SUSPEND)
> > +			dev->power.request = retval ?
> > +						RPM_REQ_NONE : RPM_REQ_SUSPEND;
> 
> Now there's no need to check retval.

It is.  If retval is 1, we cancel pending idle-notify and suspend requests.

> > +
> > +		if (dev->power.request == RPM_REQ_SUSPEND)
> > +			return 0;
> 
> Just simply:
> 		return retval;

OK, I'll change that.

> Some of these cases can't happen.  For instance, if we reach here then
> the status can't be SUSPENDED or SUSPENDING, so there can't be a
> pending resume request.
> 
> > +	}
> > +
> > +	if (retval)
> > +		return retval;
> 
> Now this isn't needed.  Similar code rearrangements can be made in
> __pm_request_resume.

OK

> > +int __pm_request_resume(struct device *dev)
> 
> Should be static.

OK

> > +int __pm_runtime_set_status(struct device *dev, unsigned int status)
> > +{
> > +	struct device *parent = dev->parent;
> > +	unsigned long flags;
> > +	int error = 0;
> > +
> > +	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
> > +		return -EINVAL;
> 
> This should go inside the spinlocked area.

Why?  'status' is a function argument, it doesn't need to be protected from
concurrent modification.

> > +	spin_lock_irqsave(&dev->power.lock, flags);
> > +
> > +	if (!dev->power.runtime_failure && !dev->power.disable_depth)
> > +		goto out;
> > +
> > +	if (dev->power.runtime_status == status)
> > +		goto out_clear;
> > +
> > +	if (status == RPM_SUSPENDED) {
> > +		/* It always is possible to set the status to 'suspended'. */
> > +		if (parent)
> > +			atomic_add_unless(&parent->power.child_count, -1, 0);
> > +		dev->power.runtime_status = status;
> > +		goto out_clear;
> > +	}
> > +
> > +	if (parent) {
> > +		spin_lock_irq(&parent->power.lock);
> > +
> > +		/*
> > +		 * It is invalid to put an active child under a parent that is
> > +		 * not active, has run-time PM enabled and the
> > +		 * 'power.ignore_children' flag unset.
> > +		 */
> > +		if (!parent->power.disable_depth
> > +		    && !parent->power.ignore_children
> > +		    && parent->power.runtime_status != RPM_ACTIVE) {
> > +			error = -EBUSY;
> > +		} else {
> > +			if (dev->power.runtime_status == RPM_SUSPENDED)
> > +				atomic_inc(&parent->power.child_count);
> > +			dev->power.runtime_status = status;
> > +		}
> > +
> > +		spin_unlock_irq(&parent->power.lock);
> > +
> > +		if (error)
> > +			goto out;
> > +	} else {
> > +		dev->power.runtime_status = status;
> > +	}
> > +
> > + out_clear:
> > +	dev->power.runtime_failure = false;
> 
> Move all those assignments to dev->power.runtime_status down here.

OK

> > +int pm_runtime_disable(struct device *dev)
> > +{
> ...
> > +	if (dev->power.disable_depth++ > 0)
> > +		goto out;
> > +
> > +	if (dev->power.runtime_failure)
> > +		goto out;
> 
> I don't see why this is needed.

If dev->power.runtime_failure, there's no need to do anything more.

> > +
> > +	pm_runtime_deactivate_timer(dev);
> > +
> > +	if (dev->power.request_pending) {
> > +		dev->power.request = RPM_REQ_NONE;
> > +
> > +		spin_unlock_irq(&dev->power.lock);
> > +
> > +		cancel_work_sync(&dev->power.work);
> > +
> > +		spin_lock_irq(&dev->power.lock);
> > +
> > +		dev->power.request_pending = false;
> 
> Remove excessive whitespace.

OK

> > +	}
> > +
> > +	if (dev->power.runtime_status == RPM_SUSPENDING
> > +	    || dev->power.runtime_status == RPM_RESUMING) {
> > +		DEFINE_WAIT(wait);
> > +
> > +		/* Suspend or wake-up in progress. */
> > +		for (;;) {
> > +			prepare_to_wait(&dev->power.wait_queue, &wait,
> > +					TASK_UNINTERRUPTIBLE);
> > +			if (dev->power.runtime_status != RPM_SUSPENDING
> > +			    && dev->power.runtime_status != RPM_RESUMING)
> > +				break;
> > +
> > +			spin_unlock_irq(&dev->power.lock);
> > +
> > +			schedule();
> > +
> > +			spin_lock_irq(&dev->power.lock);
> > +		}
> > +		finish_wait(&dev->power.wait_queue, &wait);
> > +	}
> > +
> > +	if (dev->power.runtime_failure)
> > +		goto out;
> > +
> > +	if (dev->power.idle_notification) {
> > +		DEFINE_WAIT(wait);
> > +
> > +		for (;;) {
> > +			prepare_to_wait(&dev->power.wait_queue, &wait,
> > +					TASK_UNINTERRUPTIBLE);
> > +			if (!dev->power.idle_notification)
> > +				break;
> > +
> > +			spin_unlock_irq(&dev->power.lock);
> > +
> > +			schedule();
> > +
> > +			spin_lock_irq(&dev->power.lock);
> > +		}
> > +		finish_wait(&dev->power.wait_queue, &wait);
> > +	}
> 
> This wait loop should be merged with the previous loop.

OK

> > +void pm_runtime_init(struct device *dev)
> > +{
> > +	spin_lock_init(&dev->power.lock);
> > +
> > +	dev->power.runtime_status = RPM_SUSPENDED;
> > +	dev->power.idle_notification = false;
> > +
> > +	dev->power.disable_depth = 1;
> > +	atomic_set(&dev->power.usage_count, 0);
> > +
> > +	dev->power.runtime_failure = false;
> > +	dev->power.last_error = 0;
> 
> You don't have to set values to 0; they are initialized by kzalloc.

No, I don't, but does it really hurt?

> > +	dev->power.suspend_timer.expires = jiffies;
> > +	dev->power.suspend_timer.data = (unsigned long)dev;
> > +	dev->power.suspend_timer.function = pm_suspend_timer_fn;
> 
> Use setup_timer() instead of assigning these fields directly.

OK

> > +void pm_runtime_remove(struct device *dev)
> > +{
> > +	pm_runtime_disable(dev);
> > +
> > +	if (dev->power.runtime_status == RPM_ACTIVE) {
> > +		struct device *parent = dev->parent;
> > +
> > +		/*
> > +		 * Change the status back to 'suspended' to match the initial
> > +		 * status.
> > +		 */
> > +		pm_runtime_set_suspended(dev);
> > +		if (parent && !parent->power.ignore_children)
> > +			pm_request_idle(parent);
> 
> Shouldn't these last two lines be part of __pm_runtime_set_status()?

No.  It is valid to call __pm_runtime_set_status() when runtime PM is disabled
for the device and I don't think we should kick the parent in such cases.

> > --- /dev/null
> > +++ linux-2.6/include/linux/pm_runtime.h
> 
> > +extern void pm_runtime_init(struct device *dev);
> > +extern void pm_runtime_remove(struct device *dev);
> 
> I don't like seeing these two functions included in the public header
> file.  It's enough to put them in drivers/base/power/power.h.

OK

> > --- linux-2.6.orig/drivers/base/power/main.c
> > +++ linux-2.6/drivers/base/power/main.c
> > @@ -49,6 +50,16 @@ static DEFINE_MUTEX(dpm_list_mtx);
> >  static bool transition_started;
> >  
> >  /**
> > + * device_pm_init - Initialize the PM-related part of a device object
> > + * @dev: Device object to initialize.
> > + */
> > +void device_pm_init(struct device *dev)
> > +{
> > +	dev->power.status = DPM_ON;
> > +	pm_runtime_init(dev);
> > +}
> > +
> > +/**
> >   *	device_pm_lock - lock the list of active devices used by the PM core
> >   */
> >  void device_pm_lock(void)
> > @@ -105,6 +116,8 @@ void device_pm_remove(struct device *dev
> >  	mutex_lock(&dpm_list_mtx);
> >  	list_del_init(&dev->power.entry);
> >  	mutex_unlock(&dpm_list_mtx);
> > +
> > +	pm_runtime_remove(dev);
> >  }
> 
> Calling pm_runtime_init() from device_pm_init() and
> pm_runtime_remove() from device_pm_remove() isn't good.  If
> CONFIG_PM_SLEEP isn't enabled then the calls won't be compiled, even
> if CONFIG_PM_RUNTIME is set.

Right, I shouldn't have moved device_pm_init() to main.c at all.

> > @@ -757,11 +771,15 @@ static int dpm_prepare(pm_message_t stat
> >  		dev->power.status = DPM_PREPARING;
> >  		mutex_unlock(&dpm_list_mtx);
> >  
> > -		error = device_prepare(dev, state);
> > +		if (pm_runtime_disable(dev) && device_may_wakeup(dev))
> > +			error = -EBUSY;
> 
> What's the reason for the -EBUSY error?

If this is a wake-up device and pm_runtime_disable(dev) returned 1 (it can only
return 1 or 0), which means there was a resume request pending for the device,
suspend fails with -EBUSY (wake-up event during suspend).

> > --- linux-2.6.orig/drivers/base/dd.c
> > +++ linux-2.6/drivers/base/dd.c
> > @@ -202,7 +203,9 @@ int driver_probe_device(struct device_dr
> >  	pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
> >  		 drv->bus->name, __func__, dev_name(dev), drv->name);
> >  
> > +	pm_runtime_get_noresume(dev);
> >  	ret = really_probe(dev, drv);
> > +	pm_runtime_put_noidle(dev);
> 
> This is bad because it won't wait if there's a runtime-PM call in
> progress.  Also, we shouldn't use put_noidle because it might subvert
> the driver's attempt to autosuspend.

I'm not sure how that's possible, but whatever.

> Instead we should do something like this:
> 
> 	/* Wait for runtime PM calls to finish and prevent new calls
> 	 * until the probe is done.
> 	 */
> 	pm_runtime_disable(dev);
> 	pm_runtime_get_noresume(dev);
> 	pm_runtime_enable(dev):
> 	ret = really_probe(dev, drv);
> 	pm_runtime_put_sync(dev);

Fine by me.

> > --- /dev/null
> > +++ linux-2.6/Documentation/power/runtime_pm.txt
> 
> > +2. Device Run-time PM Callbacks
> 
> > +In particular, it is recommended that ->runtime_suspend() return -EBUSY if
> > +device_may_wakeup() returns 'false' for the device.
> 
> What's the point of this?  I don't understand -- we don't want to
> discourage people from suspending devices with wakeup enabled.

device_may_wakeup(dev) == false means wake-up is disabled for dev, so
suspending it might not be a good idea.

> > +Additionally, the helper functions provided by the PM core obey the following
> > +rules:
> > +
> > +  * If ->runtime_suspend() is about to be executed or the execution of it is
> > +    scheduled or there's a pending request to execute it, ->runtime_idle() will
> > +    not be executed for the same device.
> 
> Shouldn't we allow runtime_idle when a suspend is scheduled?  The idle
> handler might decide to suspend right away instead of waiting for the
> timer to expire.

Hmm.  We can.

> > +4. Run-time PM Device Helper Functions
> > +
> > +The following run-time PM helper functions are defined in
> > +drivers/base/power/runtime.c and include/linux/pm_runtime.h:
> 
> > +  int pm_schedule_suspend(struct device *dev, unsigned int delay);
> > +    - schedule the execution of ->runtime_suspend() for the device's bus type
> > +      in future, where 'delay' is the time to wait before queuing up a suspend
> > +      work item in pm_wq, in miliseconds (if 'delay' is zero, the work item is
> 
> Fix spelling of "milli".

OK

> Explain that the new delay will override the
> old one if a suspend was already scheduled and not yet expired.

OK

> > +  int pm_runtime_get_sync(struct device *dev);
> > +    - increment the device's usage counter, run pm_rutime_resume(dev) and return
> 
> Fix spelling of "runtime".  Same under pm_runtime_put_sync.

OK

Thanks a lot for the comments, I'll post an updated patch addressing them in
the next few days.

Best,
Rafael

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Resend][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 11)
  2009-08-04 20:33 ` Alan Stern
@ 2009-08-05  0:19   ` Rafael J. Wysocki
  2009-08-05  0:19   ` Rafael J. Wysocki
  1 sibling, 0 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2009-08-05  0:19 UTC (permalink / raw)
  To: Alan Stern; +Cc: Greg KH, LKML, Linux-pm mailing list

On Tuesday 04 August 2009, Alan Stern wrote:
> On Mon, 3 Aug 2009, Rafael J. Wysocki wrote:
> 
> > Hi,
> > 
> > OK, if this is to go into 2.6.32, the last moment for putting it into
> > linux-next is now.  If you have any objections, remarks, etc. please let me
> > know or I'm going to put this one into the linux-next branch of the suspend-2.6
> > tree in the next couple of days.
> 
> I'm sorry I haven't been keeping on top of all your work on this.
> Lots of other stuff has been going on in the meantime...

No problem.

> One the whole this all looks very good.  It's basically ready to be
> merged.  There are a couple of minor issues remaining plus a bunch of
> unimportant implementation details.
> 
> pm_runtime_disable() gets used for several different purposes.  For
> the usage in pm_runtime_remove(), it's silly to carry out a pending
> resume request.  Should we add an argument saying whether or not to do
> so?

Yes, we can do that.

> In the documentation, it would be nice to have a section listing the
> default runtime PM settings and explaining what a driver should do to
> activate runtime PM on a newly-registered device.

OK, I'll try to put something like this in there.

> > +static void pm_runtime_cancel_pending(struct device *dev)
> > +{
> > +	pm_runtime_deactivate_timer(dev);
> > +	/*
> > +	 * If there's a request pending, make sure its work function will return
> > +	 * without doing anything.
> > +	 */
> > +	if (dev->power.request_pending)
> > +		dev->power.request = RPM_REQ_NONE;
> 
> No need for the "if"; you can always do the assignment.

OK

> > +static int __pm_runtime_idle(struct device *dev)
> > +{
> > +	int retval = 0;
> > +
> > +	if (dev->power.runtime_failure)
> > +		retval = -EINVAL;
> 
> Instead of assigning to retval, you could simply return these values.

I could, but I chose not to.

> > +	else if (dev->power.idle_notification)
> > +		retval = -EINPROGRESS;
> > +	else if (atomic_read(&dev->power.usage_count) > 0
> > +	    || dev->power.disable_depth > 0
> > +	    || dev->power.timer_expires > 0
> > +	    || dev->power.runtime_status == RPM_SUSPENDED
> > +	    || dev->power.runtime_status == RPM_SUSPENDING)
> > +		retval = -EAGAIN;
> 
> Do we also want to rule out RPM_RESUMING?  That is,
> 
> 	    || dev->power.runtime_status != RPM_ACTIVE

Yes, we do, thanks.

> > +	spin_unlock_irq(&dev->power.lock);
> > +
> > +	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle)
> > +		dev->bus->pm->runtime_idle(dev);
> > +
> > +	spin_lock_irq(&dev->power.lock);
> 
> Small optimization: Put the spin_{un}lock_irq stuff inside the "if"
> statement, so it doesn't happen if the test fails.

Well, I don't think so.  We need to take the lock here unconditionally,
because the caller is going to unlock it.

> The same thing can be done in other places.

I'm not really sure it can.

> > + * __pm_runtime_suspend - Carry out run-time suspend of given device.
> > + * @dev: Device to suspend.
> > + * @from_wq: If set, the funtion has been called via pm_wq.
> 
> Fix spelling of "function".  Likewise in __pm_runtime_resume.

Will do, thanks.
 
> > +int __pm_runtime_suspend(struct device *dev, bool from_wq)
> > +{
> ...
> > +	spin_unlock_irq(&dev->power.lock);
> > +
> > +	if (parent && !parent->power.ignore_children)
> > +		pm_request_idle(parent);
> > +
> > +	if (notify)
> > +		pm_runtime_idle(dev);
> 
> Move this up before the spin_unlock_irq and call __pm_runtime_idle instead.
> The same sort of thing can be done in __pm_runtime_resume.

OK

> > +static int __pm_request_suspend(struct device *dev)
> > +{
> > +	int retval = 0;
> > +
> > +	if (dev->power.runtime_failure)
> > +		return -EINVAL;
> > +
> > +	if (dev->power.runtime_status == RPM_SUSPENDED)
> > +		retval = 1;
> > +	else if (atomic_read(&dev->power.usage_count) > 0
> > +	    || dev->power.disable_depth > 0)
> > +		retval = -EAGAIN;
> > +	else if (dev->power.runtime_status == RPM_SUSPENDING)
> > +		retval = -EINPROGRESS;
> > +	else if (!pm_children_suspended(dev))
> > +		retval = -EBUSY;
> 
> Insert:
> 	if (retval)
> 		return retval;
> 
> Or else change the assignments to "return" statements.  Yes, we agreed
> that a suspend request should override an existing idle-notify
> request.  But if the new request fails then it shouldn't override
> anything.  (Of course, if it fails for any of the reasons here then
> there can't be a pending idle-notify request anyway.)

However, if it's going to return 1, it should override existing idle-notify
and suspend requests.  I'll add

 	if (retval < 0)
 		return retval;

> > +	pm_runtime_deactivate_timer(dev);
> > +
> > +	if (dev->power.request_pending) {
> > +		/*
> > +		 * Pending resume requests take precedence over us, but we can
> > +		 * overtake any other pending request.
> > +		 */
> > +		if (dev->power.request == RPM_REQ_RESUME)
> > +			retval = -EAGAIN;
> > +		else if (dev->power.request != RPM_REQ_SUSPEND)
> > +			dev->power.request = retval ?
> > +						RPM_REQ_NONE : RPM_REQ_SUSPEND;
> 
> Now there's no need to check retval.

It is.  If retval is 1, we cancel pending idle-notify and suspend requests.

> > +
> > +		if (dev->power.request == RPM_REQ_SUSPEND)
> > +			return 0;
> 
> Just simply:
> 		return retval;

OK, I'll change that.

> Some of these cases can't happen.  For instance, if we reach here then
> the status can't be SUSPENDED or SUSPENDING, so there can't be a
> pending resume request.
> 
> > +	}
> > +
> > +	if (retval)
> > +		return retval;
> 
> Now this isn't needed.  Similar code rearrangements can be made in
> __pm_request_resume.

OK

> > +int __pm_request_resume(struct device *dev)
> 
> Should be static.

OK

> > +int __pm_runtime_set_status(struct device *dev, unsigned int status)
> > +{
> > +	struct device *parent = dev->parent;
> > +	unsigned long flags;
> > +	int error = 0;
> > +
> > +	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
> > +		return -EINVAL;
> 
> This should go inside the spinlocked area.

Why?  'status' is a function argument, it doesn't need to be protected from
concurrent modification.

> > +	spin_lock_irqsave(&dev->power.lock, flags);
> > +
> > +	if (!dev->power.runtime_failure && !dev->power.disable_depth)
> > +		goto out;
> > +
> > +	if (dev->power.runtime_status == status)
> > +		goto out_clear;
> > +
> > +	if (status == RPM_SUSPENDED) {
> > +		/* It always is possible to set the status to 'suspended'. */
> > +		if (parent)
> > +			atomic_add_unless(&parent->power.child_count, -1, 0);
> > +		dev->power.runtime_status = status;
> > +		goto out_clear;
> > +	}
> > +
> > +	if (parent) {
> > +		spin_lock_irq(&parent->power.lock);
> > +
> > +		/*
> > +		 * It is invalid to put an active child under a parent that is
> > +		 * not active, has run-time PM enabled and the
> > +		 * 'power.ignore_children' flag unset.
> > +		 */
> > +		if (!parent->power.disable_depth
> > +		    && !parent->power.ignore_children
> > +		    && parent->power.runtime_status != RPM_ACTIVE) {
> > +			error = -EBUSY;
> > +		} else {
> > +			if (dev->power.runtime_status == RPM_SUSPENDED)
> > +				atomic_inc(&parent->power.child_count);
> > +			dev->power.runtime_status = status;
> > +		}
> > +
> > +		spin_unlock_irq(&parent->power.lock);
> > +
> > +		if (error)
> > +			goto out;
> > +	} else {
> > +		dev->power.runtime_status = status;
> > +	}
> > +
> > + out_clear:
> > +	dev->power.runtime_failure = false;
> 
> Move all those assignments to dev->power.runtime_status down here.

OK

> > +int pm_runtime_disable(struct device *dev)
> > +{
> ...
> > +	if (dev->power.disable_depth++ > 0)
> > +		goto out;
> > +
> > +	if (dev->power.runtime_failure)
> > +		goto out;
> 
> I don't see why this is needed.

If dev->power.runtime_failure, there's no need to do anything more.

> > +
> > +	pm_runtime_deactivate_timer(dev);
> > +
> > +	if (dev->power.request_pending) {
> > +		dev->power.request = RPM_REQ_NONE;
> > +
> > +		spin_unlock_irq(&dev->power.lock);
> > +
> > +		cancel_work_sync(&dev->power.work);
> > +
> > +		spin_lock_irq(&dev->power.lock);
> > +
> > +		dev->power.request_pending = false;
> 
> Remove excessive whitespace.

OK

> > +	}
> > +
> > +	if (dev->power.runtime_status == RPM_SUSPENDING
> > +	    || dev->power.runtime_status == RPM_RESUMING) {
> > +		DEFINE_WAIT(wait);
> > +
> > +		/* Suspend or wake-up in progress. */
> > +		for (;;) {
> > +			prepare_to_wait(&dev->power.wait_queue, &wait,
> > +					TASK_UNINTERRUPTIBLE);
> > +			if (dev->power.runtime_status != RPM_SUSPENDING
> > +			    && dev->power.runtime_status != RPM_RESUMING)
> > +				break;
> > +
> > +			spin_unlock_irq(&dev->power.lock);
> > +
> > +			schedule();
> > +
> > +			spin_lock_irq(&dev->power.lock);
> > +		}
> > +		finish_wait(&dev->power.wait_queue, &wait);
> > +	}
> > +
> > +	if (dev->power.runtime_failure)
> > +		goto out;
> > +
> > +	if (dev->power.idle_notification) {
> > +		DEFINE_WAIT(wait);
> > +
> > +		for (;;) {
> > +			prepare_to_wait(&dev->power.wait_queue, &wait,
> > +					TASK_UNINTERRUPTIBLE);
> > +			if (!dev->power.idle_notification)
> > +				break;
> > +
> > +			spin_unlock_irq(&dev->power.lock);
> > +
> > +			schedule();
> > +
> > +			spin_lock_irq(&dev->power.lock);
> > +		}
> > +		finish_wait(&dev->power.wait_queue, &wait);
> > +	}
> 
> This wait loop should be merged with the previous loop.

OK

> > +void pm_runtime_init(struct device *dev)
> > +{
> > +	spin_lock_init(&dev->power.lock);
> > +
> > +	dev->power.runtime_status = RPM_SUSPENDED;
> > +	dev->power.idle_notification = false;
> > +
> > +	dev->power.disable_depth = 1;
> > +	atomic_set(&dev->power.usage_count, 0);
> > +
> > +	dev->power.runtime_failure = false;
> > +	dev->power.last_error = 0;
> 
> You don't have to set values to 0; they are initialized by kzalloc.

No, I don't, but does it really hurt?

> > +	dev->power.suspend_timer.expires = jiffies;
> > +	dev->power.suspend_timer.data = (unsigned long)dev;
> > +	dev->power.suspend_timer.function = pm_suspend_timer_fn;
> 
> Use setup_timer() instead of assigning these fields directly.

OK

> > +void pm_runtime_remove(struct device *dev)
> > +{
> > +	pm_runtime_disable(dev);
> > +
> > +	if (dev->power.runtime_status == RPM_ACTIVE) {
> > +		struct device *parent = dev->parent;
> > +
> > +		/*
> > +		 * Change the status back to 'suspended' to match the initial
> > +		 * status.
> > +		 */
> > +		pm_runtime_set_suspended(dev);
> > +		if (parent && !parent->power.ignore_children)
> > +			pm_request_idle(parent);
> 
> Shouldn't these last two lines be part of __pm_runtime_set_status()?

No.  It is valid to call __pm_runtime_set_status() when runtime PM is disabled
for the device and I don't think we should kick the parent in such cases.

> > --- /dev/null
> > +++ linux-2.6/include/linux/pm_runtime.h
> 
> > +extern void pm_runtime_init(struct device *dev);
> > +extern void pm_runtime_remove(struct device *dev);
> 
> I don't like seeing these two functions included in the public header
> file.  It's enough to put them in drivers/base/power/power.h.

OK

> > --- linux-2.6.orig/drivers/base/power/main.c
> > +++ linux-2.6/drivers/base/power/main.c
> > @@ -49,6 +50,16 @@ static DEFINE_MUTEX(dpm_list_mtx);
> >  static bool transition_started;
> >  
> >  /**
> > + * device_pm_init - Initialize the PM-related part of a device object
> > + * @dev: Device object to initialize.
> > + */
> > +void device_pm_init(struct device *dev)
> > +{
> > +	dev->power.status = DPM_ON;
> > +	pm_runtime_init(dev);
> > +}
> > +
> > +/**
> >   *	device_pm_lock - lock the list of active devices used by the PM core
> >   */
> >  void device_pm_lock(void)
> > @@ -105,6 +116,8 @@ void device_pm_remove(struct device *dev
> >  	mutex_lock(&dpm_list_mtx);
> >  	list_del_init(&dev->power.entry);
> >  	mutex_unlock(&dpm_list_mtx);
> > +
> > +	pm_runtime_remove(dev);
> >  }
> 
> Calling pm_runtime_init() from device_pm_init() and
> pm_runtime_remove() from device_pm_remove() isn't good.  If
> CONFIG_PM_SLEEP isn't enabled then the calls won't be compiled, even
> if CONFIG_PM_RUNTIME is set.

Right, I shouldn't have moved device_pm_init() to main.c at all.

> > @@ -757,11 +771,15 @@ static int dpm_prepare(pm_message_t stat
> >  		dev->power.status = DPM_PREPARING;
> >  		mutex_unlock(&dpm_list_mtx);
> >  
> > -		error = device_prepare(dev, state);
> > +		if (pm_runtime_disable(dev) && device_may_wakeup(dev))
> > +			error = -EBUSY;
> 
> What's the reason for the -EBUSY error?

If this is a wake-up device and pm_runtime_disable(dev) returned 1 (it can only
return 1 or 0), which means there was a resume request pending for the device,
suspend fails with -EBUSY (wake-up event during suspend).

> > --- linux-2.6.orig/drivers/base/dd.c
> > +++ linux-2.6/drivers/base/dd.c
> > @@ -202,7 +203,9 @@ int driver_probe_device(struct device_dr
> >  	pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
> >  		 drv->bus->name, __func__, dev_name(dev), drv->name);
> >  
> > +	pm_runtime_get_noresume(dev);
> >  	ret = really_probe(dev, drv);
> > +	pm_runtime_put_noidle(dev);
> 
> This is bad because it won't wait if there's a runtime-PM call in
> progress.  Also, we shouldn't use put_noidle because it might subvert
> the driver's attempt to autosuspend.

I'm not sure how that's possible, but whatever.

> Instead we should do something like this:
> 
> 	/* Wait for runtime PM calls to finish and prevent new calls
> 	 * until the probe is done.
> 	 */
> 	pm_runtime_disable(dev);
> 	pm_runtime_get_noresume(dev);
> 	pm_runtime_enable(dev):
> 	ret = really_probe(dev, drv);
> 	pm_runtime_put_sync(dev);

Fine by me.

> > --- /dev/null
> > +++ linux-2.6/Documentation/power/runtime_pm.txt
> 
> > +2. Device Run-time PM Callbacks
> 
> > +In particular, it is recommended that ->runtime_suspend() return -EBUSY if
> > +device_may_wakeup() returns 'false' for the device.
> 
> What's the point of this?  I don't understand -- we don't want to
> discourage people from suspending devices with wakeup enabled.

device_may_wakeup(dev) == false means wake-up is disabled for dev, so
suspending it might not be a good idea.

> > +Additionally, the helper functions provided by the PM core obey the following
> > +rules:
> > +
> > +  * If ->runtime_suspend() is about to be executed or the execution of it is
> > +    scheduled or there's a pending request to execute it, ->runtime_idle() will
> > +    not be executed for the same device.
> 
> Shouldn't we allow runtime_idle when a suspend is scheduled?  The idle
> handler might decide to suspend right away instead of waiting for the
> timer to expire.

Hmm.  We can.

> > +4. Run-time PM Device Helper Functions
> > +
> > +The following run-time PM helper functions are defined in
> > +drivers/base/power/runtime.c and include/linux/pm_runtime.h:
> 
> > +  int pm_schedule_suspend(struct device *dev, unsigned int delay);
> > +    - schedule the execution of ->runtime_suspend() for the device's bus type
> > +      in future, where 'delay' is the time to wait before queuing up a suspend
> > +      work item in pm_wq, in miliseconds (if 'delay' is zero, the work item is
> 
> Fix spelling of "milli".

OK

> Explain that the new delay will override the
> old one if a suspend was already scheduled and not yet expired.

OK

> > +  int pm_runtime_get_sync(struct device *dev);
> > +    - increment the device's usage counter, run pm_rutime_resume(dev) and return
> 
> Fix spelling of "runtime".  Same under pm_runtime_put_sync.

OK

Thanks a lot for the comments, I'll post an updated patch addressing them in
the next few days.

Best,
Rafael

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Resend][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 11)
  2009-08-05  0:19   ` Rafael J. Wysocki
  2009-08-05  2:44     ` Alan Stern
@ 2009-08-05  2:44     ` Alan Stern
  2009-08-05 13:25       ` Rafael J. Wysocki
  2009-08-05 13:25       ` [Resend][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 11) Rafael J. Wysocki
  1 sibling, 2 replies; 39+ messages in thread
From: Alan Stern @ 2009-08-05  2:44 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux-pm mailing list, Magnus Damm, Greg KH, Pavel Machek,
	Len Brown, LKML

On Wed, 5 Aug 2009, Rafael J. Wysocki wrote:

> > > +	spin_unlock_irq(&dev->power.lock);
> > > +
> > > +	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle)
> > > +		dev->bus->pm->runtime_idle(dev);
> > > +
> > > +	spin_lock_irq(&dev->power.lock);
> > 
> > Small optimization: Put the spin_{un}lock_irq stuff inside the "if"
> > statement, so it doesn't happen if the test fails.
> 
> Well, I don't think so.  We need to take the lock here unconditionally,
> because the caller is going to unlock it.

No, I meant do this:

	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle) {
		spin_unlock_irq(&dev->power.lock);
		dev->bus->pm->runtime_idle(dev);
		spin_lock_irq(&dev->power.lock);
	}

By the way, I don't know if anyone still pays attention to sparse-type
annotations.  If you do, you should add "__releases(&dev->power.lock)"
and "__acquires(&dev->power.lock)" annotations to functions like this,
which release the lock without first acquiring it, and then acquire
the lock without releasing before returning.

> > The same thing can be done in other places.
> 
> I'm not really sure it can.

The other places aren't quite the same as this.  I'll leave it to your
discretion.  :-)


> > > +int __pm_runtime_set_status(struct device *dev, unsigned int status)
> > > +{
> > > +	struct device *parent = dev->parent;
> > > +	unsigned long flags;
> > > +	int error = 0;
> > > +
> > > +	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
> > > +		return -EINVAL;
> > 
> > This should go inside the spinlocked area.
> 
> Why?  'status' is a function argument, it doesn't need to be protected from
> concurrent modification.

Oops, my mistake.  Never mind...


> > > +int pm_runtime_disable(struct device *dev)
> > > +{
> > ...
> > > +	if (dev->power.disable_depth++ > 0)
> > > +		goto out;
> > > +
> > > +	if (dev->power.runtime_failure)
> > > +		goto out;
> > 
> > I don't see why this is needed.
> 
> If dev->power.runtime_failure, there's no need to do anything more.

Don't you still want to deactivate the timer and kill any pending
requests?  True, they won't be able to do anything until the failure 
state is cleared, but even so...


> > > +void pm_runtime_remove(struct device *dev)
> > > +{
> > > +	pm_runtime_disable(dev);
> > > +
> > > +	if (dev->power.runtime_status == RPM_ACTIVE) {
> > > +		struct device *parent = dev->parent;
> > > +
> > > +		/*
> > > +		 * Change the status back to 'suspended' to match the initial
> > > +		 * status.
> > > +		 */
> > > +		pm_runtime_set_suspended(dev);
> > > +		if (parent && !parent->power.ignore_children)
> > > +			pm_request_idle(parent);
> > 
> > Shouldn't these last two lines be part of __pm_runtime_set_status()?
> 
> No.  It is valid to call __pm_runtime_set_status() when runtime PM is disabled
> for the device and I don't think we should kick the parent in such cases.

Ah, an interesting point.  Suppose the device is in RPM_ACTIVE, and
then someone calls pm_runtime_disable followed by
pm_runtime_set_suspended.  Then the device's status would change to
RPM_SUSPENDED and the parent's count of active children would be
decremented, perhaps to 0.  If the count does go to 0, why wouldn't you
want to send out an idle notification for the parent?


> > > @@ -757,11 +771,15 @@ static int dpm_prepare(pm_message_t stat
> > >  		dev->power.status = DPM_PREPARING;
> > >  		mutex_unlock(&dpm_list_mtx);
> > >  
> > > -		error = device_prepare(dev, state);
> > > +		if (pm_runtime_disable(dev) && device_may_wakeup(dev))
> > > +			error = -EBUSY;
> > 
> > What's the reason for the -EBUSY error?
> 
> If this is a wake-up device and pm_runtime_disable(dev) returned 1 (it can only
> return 1 or 0), which means there was a resume request pending for the device,
> suspend fails with -EBUSY (wake-up event during suspend).

I see.  That's a little obscure; a comment would help.  Even something
as simple as:

			error = -EBUSY;		/* wake-up during suspend */


> > > @@ -202,7 +203,9 @@ int driver_probe_device(struct device_dr
> > >  	pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
> > >  		 drv->bus->name, __func__, dev_name(dev), drv->name);
> > >  
> > > +	pm_runtime_get_noresume(dev);
> > >  	ret = really_probe(dev, drv);
> > > +	pm_runtime_put_noidle(dev);
> > 
> > This is bad because it won't wait if there's a runtime-PM call in
> > progress.  Also, we shouldn't use put_noidle because it might subvert
> > the driver's attempt to autosuspend.
> 
> I'm not sure how that's possible, but whatever.

Suppose the probe routine does:

	pm_runtime_get_sync(dev);
	/* do some work */
	pm_runtime_put_sync(dev);

There is a clear expectation that an idle notification will eventually
be sent.  But if the probe is surrounded by

	pm_runtime_get_noresume(dev);
	... probe ...
	pm_runtime_put_noidle(dev);

then there won't be any idle notifications.


> > > +2. Device Run-time PM Callbacks
> > 
> > > +In particular, it is recommended that ->runtime_suspend() return -EBUSY if
> > > +device_may_wakeup() returns 'false' for the device.
> > 
> > What's the point of this?  I don't understand -- we don't want to
> > discourage people from suspending devices with wakeup enabled.
> 
> device_may_wakeup(dev) == false means wake-up is disabled for dev, so
> suspending it might not be a good idea.

Okay.  This needs to be rephrased.  For example,

	In particular, if the driver requires remote wakeup capability
	for proper functioning and device_may_wakeup() returns 'false'
	for the device, then ->runtime_suspend() should return -EBUSY.

The point is that plenty of drivers can work perfectly well without
remote wakeup, so they have no reason to check device_may_wakeup().

Alan Stern


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Resend][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 11)
  2009-08-05  0:19   ` Rafael J. Wysocki
@ 2009-08-05  2:44     ` Alan Stern
  2009-08-05  2:44     ` Alan Stern
  1 sibling, 0 replies; 39+ messages in thread
From: Alan Stern @ 2009-08-05  2:44 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Greg KH, LKML, Linux-pm mailing list

On Wed, 5 Aug 2009, Rafael J. Wysocki wrote:

> > > +	spin_unlock_irq(&dev->power.lock);
> > > +
> > > +	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle)
> > > +		dev->bus->pm->runtime_idle(dev);
> > > +
> > > +	spin_lock_irq(&dev->power.lock);
> > 
> > Small optimization: Put the spin_{un}lock_irq stuff inside the "if"
> > statement, so it doesn't happen if the test fails.
> 
> Well, I don't think so.  We need to take the lock here unconditionally,
> because the caller is going to unlock it.

No, I meant do this:

	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle) {
		spin_unlock_irq(&dev->power.lock);
		dev->bus->pm->runtime_idle(dev);
		spin_lock_irq(&dev->power.lock);
	}

By the way, I don't know if anyone still pays attention to sparse-type
annotations.  If you do, you should add "__releases(&dev->power.lock)"
and "__acquires(&dev->power.lock)" annotations to functions like this,
which release the lock without first acquiring it, and then acquire
the lock without releasing before returning.

> > The same thing can be done in other places.
> 
> I'm not really sure it can.

The other places aren't quite the same as this.  I'll leave it to your
discretion.  :-)


> > > +int __pm_runtime_set_status(struct device *dev, unsigned int status)
> > > +{
> > > +	struct device *parent = dev->parent;
> > > +	unsigned long flags;
> > > +	int error = 0;
> > > +
> > > +	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
> > > +		return -EINVAL;
> > 
> > This should go inside the spinlocked area.
> 
> Why?  'status' is a function argument, it doesn't need to be protected from
> concurrent modification.

Oops, my mistake.  Never mind...


> > > +int pm_runtime_disable(struct device *dev)
> > > +{
> > ...
> > > +	if (dev->power.disable_depth++ > 0)
> > > +		goto out;
> > > +
> > > +	if (dev->power.runtime_failure)
> > > +		goto out;
> > 
> > I don't see why this is needed.
> 
> If dev->power.runtime_failure, there's no need to do anything more.

Don't you still want to deactivate the timer and kill any pending
requests?  True, they won't be able to do anything until the failure 
state is cleared, but even so...


> > > +void pm_runtime_remove(struct device *dev)
> > > +{
> > > +	pm_runtime_disable(dev);
> > > +
> > > +	if (dev->power.runtime_status == RPM_ACTIVE) {
> > > +		struct device *parent = dev->parent;
> > > +
> > > +		/*
> > > +		 * Change the status back to 'suspended' to match the initial
> > > +		 * status.
> > > +		 */
> > > +		pm_runtime_set_suspended(dev);
> > > +		if (parent && !parent->power.ignore_children)
> > > +			pm_request_idle(parent);
> > 
> > Shouldn't these last two lines be part of __pm_runtime_set_status()?
> 
> No.  It is valid to call __pm_runtime_set_status() when runtime PM is disabled
> for the device and I don't think we should kick the parent in such cases.

Ah, an interesting point.  Suppose the device is in RPM_ACTIVE, and
then someone calls pm_runtime_disable followed by
pm_runtime_set_suspended.  Then the device's status would change to
RPM_SUSPENDED and the parent's count of active children would be
decremented, perhaps to 0.  If the count does go to 0, why wouldn't you
want to send out an idle notification for the parent?


> > > @@ -757,11 +771,15 @@ static int dpm_prepare(pm_message_t stat
> > >  		dev->power.status = DPM_PREPARING;
> > >  		mutex_unlock(&dpm_list_mtx);
> > >  
> > > -		error = device_prepare(dev, state);
> > > +		if (pm_runtime_disable(dev) && device_may_wakeup(dev))
> > > +			error = -EBUSY;
> > 
> > What's the reason for the -EBUSY error?
> 
> If this is a wake-up device and pm_runtime_disable(dev) returned 1 (it can only
> return 1 or 0), which means there was a resume request pending for the device,
> suspend fails with -EBUSY (wake-up event during suspend).

I see.  That's a little obscure; a comment would help.  Even something
as simple as:

			error = -EBUSY;		/* wake-up during suspend */


> > > @@ -202,7 +203,9 @@ int driver_probe_device(struct device_dr
> > >  	pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
> > >  		 drv->bus->name, __func__, dev_name(dev), drv->name);
> > >  
> > > +	pm_runtime_get_noresume(dev);
> > >  	ret = really_probe(dev, drv);
> > > +	pm_runtime_put_noidle(dev);
> > 
> > This is bad because it won't wait if there's a runtime-PM call in
> > progress.  Also, we shouldn't use put_noidle because it might subvert
> > the driver's attempt to autosuspend.
> 
> I'm not sure how that's possible, but whatever.

Suppose the probe routine does:

	pm_runtime_get_sync(dev);
	/* do some work */
	pm_runtime_put_sync(dev);

There is a clear expectation that an idle notification will eventually
be sent.  But if the probe is surrounded by

	pm_runtime_get_noresume(dev);
	... probe ...
	pm_runtime_put_noidle(dev);

then there won't be any idle notifications.


> > > +2. Device Run-time PM Callbacks
> > 
> > > +In particular, it is recommended that ->runtime_suspend() return -EBUSY if
> > > +device_may_wakeup() returns 'false' for the device.
> > 
> > What's the point of this?  I don't understand -- we don't want to
> > discourage people from suspending devices with wakeup enabled.
> 
> device_may_wakeup(dev) == false means wake-up is disabled for dev, so
> suspending it might not be a good idea.

Okay.  This needs to be rephrased.  For example,

	In particular, if the driver requires remote wakeup capability
	for proper functioning and device_may_wakeup() returns 'false'
	for the device, then ->runtime_suspend() should return -EBUSY.

The point is that plenty of drivers can work perfectly well without
remote wakeup, so they have no reason to check device_may_wakeup().

Alan Stern

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Resend][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 11)
  2009-08-05  2:44     ` Alan Stern
@ 2009-08-05 13:25       ` Rafael J. Wysocki
  2009-08-05 21:47         ` [PATCH update] PM: Introduce core framework for run-time PM of I/O devices (rev. 12) Rafael J. Wysocki
  2009-08-05 21:47         ` Rafael J. Wysocki
  2009-08-05 13:25       ` [Resend][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 11) Rafael J. Wysocki
  1 sibling, 2 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2009-08-05 13:25 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux-pm mailing list, Magnus Damm, Greg KH, Pavel Machek,
	Len Brown, LKML

On Wednesday 05 August 2009, Alan Stern wrote:
> On Wed, 5 Aug 2009, Rafael J. Wysocki wrote:
> 
> > > > +	spin_unlock_irq(&dev->power.lock);
> > > > +
> > > > +	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle)
> > > > +		dev->bus->pm->runtime_idle(dev);
> > > > +
> > > > +	spin_lock_irq(&dev->power.lock);
> > > 
> > > Small optimization: Put the spin_{un}lock_irq stuff inside the "if"
> > > statement, so it doesn't happen if the test fails.
> > 
> > Well, I don't think so.  We need to take the lock here unconditionally,
> > because the caller is going to unlock it.
> 
> No, I meant do this:
> 
> 	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle) {
> 		spin_unlock_irq(&dev->power.lock);
> 		dev->bus->pm->runtime_idle(dev);
> 		spin_lock_irq(&dev->power.lock);
> 	}
 
Ah, OK.  That makes sense.

> By the way, I don't know if anyone still pays attention to sparse-type
> annotations.  If you do,

Well, I guess I should. :-)

> you should add "__releases(&dev->power.lock)"
> and "__acquires(&dev->power.lock)" annotations to functions like this,
> which release the lock without first acquiring it, and then acquire
> the lock without releasing before returning.
> 
> > > The same thing can be done in other places.
> > 
> > I'm not really sure it can.
> 
> The other places aren't quite the same as this.  I'll leave it to your
> discretion.  :-)
> 
> 
> > > > +int __pm_runtime_set_status(struct device *dev, unsigned int status)
> > > > +{
> > > > +	struct device *parent = dev->parent;
> > > > +	unsigned long flags;
> > > > +	int error = 0;
> > > > +
> > > > +	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
> > > > +		return -EINVAL;
> > > 
> > > This should go inside the spinlocked area.
> > 
> > Why?  'status' is a function argument, it doesn't need to be protected from
> > concurrent modification.
> 
> Oops, my mistake.  Never mind...
> 
> 
> > > > +int pm_runtime_disable(struct device *dev)
> > > > +{
> > > ...
> > > > +	if (dev->power.disable_depth++ > 0)
> > > > +		goto out;
> > > > +
> > > > +	if (dev->power.runtime_failure)
> > > > +		goto out;
> > > 
> > > I don't see why this is needed.
> > 
> > If dev->power.runtime_failure, there's no need to do anything more.
> 
> Don't you still want to deactivate the timer and kill any pending
> requests?  True, they won't be able to do anything until the failure 
> state is cleared, but even so...

OK, I'll drop that if ().

> > > > +void pm_runtime_remove(struct device *dev)
> > > > +{
> > > > +	pm_runtime_disable(dev);
> > > > +
> > > > +	if (dev->power.runtime_status == RPM_ACTIVE) {
> > > > +		struct device *parent = dev->parent;
> > > > +
> > > > +		/*
> > > > +		 * Change the status back to 'suspended' to match the initial
> > > > +		 * status.
> > > > +		 */
> > > > +		pm_runtime_set_suspended(dev);
> > > > +		if (parent && !parent->power.ignore_children)
> > > > +			pm_request_idle(parent);
> > > 
> > > Shouldn't these last two lines be part of __pm_runtime_set_status()?
> > 
> > No.  It is valid to call __pm_runtime_set_status() when runtime PM is disabled
> > for the device and I don't think we should kick the parent in such cases.
> 
> Ah, an interesting point.  Suppose the device is in RPM_ACTIVE, and
> then someone calls pm_runtime_disable followed by
> pm_runtime_set_suspended.  Then the device's status would change to
> RPM_SUSPENDED and the parent's count of active children would be
> decremented, perhaps to 0.  If the count does go to 0, why wouldn't you
> want to send out an idle notification for the parent?

OK, having reconsidered that, I think you're right, it should go into
__pm_runtime_set_status().

> > > > @@ -757,11 +771,15 @@ static int dpm_prepare(pm_message_t stat
> > > >  		dev->power.status = DPM_PREPARING;
> > > >  		mutex_unlock(&dpm_list_mtx);
> > > >  
> > > > -		error = device_prepare(dev, state);
> > > > +		if (pm_runtime_disable(dev) && device_may_wakeup(dev))
> > > > +			error = -EBUSY;
> > > 
> > > What's the reason for the -EBUSY error?
> > 
> > If this is a wake-up device and pm_runtime_disable(dev) returned 1 (it can only
> > return 1 or 0), which means there was a resume request pending for the device,
> > suspend fails with -EBUSY (wake-up event during suspend).
> 
> I see.  That's a little obscure; a comment would help.  Even something
> as simple as:
> 
> 			error = -EBUSY;		/* wake-up during suspend */

OK

> > > > @@ -202,7 +203,9 @@ int driver_probe_device(struct device_dr
> > > >  	pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
> > > >  		 drv->bus->name, __func__, dev_name(dev), drv->name);
> > > >  
> > > > +	pm_runtime_get_noresume(dev);
> > > >  	ret = really_probe(dev, drv);
> > > > +	pm_runtime_put_noidle(dev);
> > > 
> > > This is bad because it won't wait if there's a runtime-PM call in
> > > progress.  Also, we shouldn't use put_noidle because it might subvert
> > > the driver's attempt to autosuspend.
> > 
> > I'm not sure how that's possible, but whatever.
> 
> Suppose the probe routine does:
> 
> 	pm_runtime_get_sync(dev);
> 	/* do some work */
> 	pm_runtime_put_sync(dev);
> 
> There is a clear expectation that an idle notification will eventually
> be sent.  But if the probe is surrounded by
> 
> 	pm_runtime_get_noresume(dev);
> 	... probe ...
> 	pm_runtime_put_noidle(dev);
> 
> then there won't be any idle notifications.

Ah, OK.  So the point is that we should always idle-notify after .probe(),
because that's what .probe() would most probably want to do.  I guess that
makes sense.

> > > > +2. Device Run-time PM Callbacks
> > > 
> > > > +In particular, it is recommended that ->runtime_suspend() return -EBUSY if
> > > > +device_may_wakeup() returns 'false' for the device.
> > > 
> > > What's the point of this?  I don't understand -- we don't want to
> > > discourage people from suspending devices with wakeup enabled.
> > 
> > device_may_wakeup(dev) == false means wake-up is disabled for dev, so
> > suspending it might not be a good idea.
> 
> Okay.  This needs to be rephrased.  For example,
> 
> 	In particular, if the driver requires remote wakeup capability
> 	for proper functioning and device_may_wakeup() returns 'false'
> 	for the device, then ->runtime_suspend() should return -EBUSY.
> 
> The point is that plenty of drivers can work perfectly well without
> remote wakeup, so they have no reason to check device_may_wakeup().

OK

Thanks a lot again, I'll do my best to address the comments in the next version
of the patch.

Best,
Rafael

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Resend][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 11)
  2009-08-05  2:44     ` Alan Stern
  2009-08-05 13:25       ` Rafael J. Wysocki
@ 2009-08-05 13:25       ` Rafael J. Wysocki
  1 sibling, 0 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2009-08-05 13:25 UTC (permalink / raw)
  To: Alan Stern; +Cc: Greg KH, LKML, Linux-pm mailing list

On Wednesday 05 August 2009, Alan Stern wrote:
> On Wed, 5 Aug 2009, Rafael J. Wysocki wrote:
> 
> > > > +	spin_unlock_irq(&dev->power.lock);
> > > > +
> > > > +	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle)
> > > > +		dev->bus->pm->runtime_idle(dev);
> > > > +
> > > > +	spin_lock_irq(&dev->power.lock);
> > > 
> > > Small optimization: Put the spin_{un}lock_irq stuff inside the "if"
> > > statement, so it doesn't happen if the test fails.
> > 
> > Well, I don't think so.  We need to take the lock here unconditionally,
> > because the caller is going to unlock it.
> 
> No, I meant do this:
> 
> 	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle) {
> 		spin_unlock_irq(&dev->power.lock);
> 		dev->bus->pm->runtime_idle(dev);
> 		spin_lock_irq(&dev->power.lock);
> 	}
 
Ah, OK.  That makes sense.

> By the way, I don't know if anyone still pays attention to sparse-type
> annotations.  If you do,

Well, I guess I should. :-)

> you should add "__releases(&dev->power.lock)"
> and "__acquires(&dev->power.lock)" annotations to functions like this,
> which release the lock without first acquiring it, and then acquire
> the lock without releasing before returning.
> 
> > > The same thing can be done in other places.
> > 
> > I'm not really sure it can.
> 
> The other places aren't quite the same as this.  I'll leave it to your
> discretion.  :-)
> 
> 
> > > > +int __pm_runtime_set_status(struct device *dev, unsigned int status)
> > > > +{
> > > > +	struct device *parent = dev->parent;
> > > > +	unsigned long flags;
> > > > +	int error = 0;
> > > > +
> > > > +	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
> > > > +		return -EINVAL;
> > > 
> > > This should go inside the spinlocked area.
> > 
> > Why?  'status' is a function argument, it doesn't need to be protected from
> > concurrent modification.
> 
> Oops, my mistake.  Never mind...
> 
> 
> > > > +int pm_runtime_disable(struct device *dev)
> > > > +{
> > > ...
> > > > +	if (dev->power.disable_depth++ > 0)
> > > > +		goto out;
> > > > +
> > > > +	if (dev->power.runtime_failure)
> > > > +		goto out;
> > > 
> > > I don't see why this is needed.
> > 
> > If dev->power.runtime_failure, there's no need to do anything more.
> 
> Don't you still want to deactivate the timer and kill any pending
> requests?  True, they won't be able to do anything until the failure 
> state is cleared, but even so...

OK, I'll drop that if ().

> > > > +void pm_runtime_remove(struct device *dev)
> > > > +{
> > > > +	pm_runtime_disable(dev);
> > > > +
> > > > +	if (dev->power.runtime_status == RPM_ACTIVE) {
> > > > +		struct device *parent = dev->parent;
> > > > +
> > > > +		/*
> > > > +		 * Change the status back to 'suspended' to match the initial
> > > > +		 * status.
> > > > +		 */
> > > > +		pm_runtime_set_suspended(dev);
> > > > +		if (parent && !parent->power.ignore_children)
> > > > +			pm_request_idle(parent);
> > > 
> > > Shouldn't these last two lines be part of __pm_runtime_set_status()?
> > 
> > No.  It is valid to call __pm_runtime_set_status() when runtime PM is disabled
> > for the device and I don't think we should kick the parent in such cases.
> 
> Ah, an interesting point.  Suppose the device is in RPM_ACTIVE, and
> then someone calls pm_runtime_disable followed by
> pm_runtime_set_suspended.  Then the device's status would change to
> RPM_SUSPENDED and the parent's count of active children would be
> decremented, perhaps to 0.  If the count does go to 0, why wouldn't you
> want to send out an idle notification for the parent?

OK, having reconsidered that, I think you're right, it should go into
__pm_runtime_set_status().

> > > > @@ -757,11 +771,15 @@ static int dpm_prepare(pm_message_t stat
> > > >  		dev->power.status = DPM_PREPARING;
> > > >  		mutex_unlock(&dpm_list_mtx);
> > > >  
> > > > -		error = device_prepare(dev, state);
> > > > +		if (pm_runtime_disable(dev) && device_may_wakeup(dev))
> > > > +			error = -EBUSY;
> > > 
> > > What's the reason for the -EBUSY error?
> > 
> > If this is a wake-up device and pm_runtime_disable(dev) returned 1 (it can only
> > return 1 or 0), which means there was a resume request pending for the device,
> > suspend fails with -EBUSY (wake-up event during suspend).
> 
> I see.  That's a little obscure; a comment would help.  Even something
> as simple as:
> 
> 			error = -EBUSY;		/* wake-up during suspend */

OK

> > > > @@ -202,7 +203,9 @@ int driver_probe_device(struct device_dr
> > > >  	pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
> > > >  		 drv->bus->name, __func__, dev_name(dev), drv->name);
> > > >  
> > > > +	pm_runtime_get_noresume(dev);
> > > >  	ret = really_probe(dev, drv);
> > > > +	pm_runtime_put_noidle(dev);
> > > 
> > > This is bad because it won't wait if there's a runtime-PM call in
> > > progress.  Also, we shouldn't use put_noidle because it might subvert
> > > the driver's attempt to autosuspend.
> > 
> > I'm not sure how that's possible, but whatever.
> 
> Suppose the probe routine does:
> 
> 	pm_runtime_get_sync(dev);
> 	/* do some work */
> 	pm_runtime_put_sync(dev);
> 
> There is a clear expectation that an idle notification will eventually
> be sent.  But if the probe is surrounded by
> 
> 	pm_runtime_get_noresume(dev);
> 	... probe ...
> 	pm_runtime_put_noidle(dev);
> 
> then there won't be any idle notifications.

Ah, OK.  So the point is that we should always idle-notify after .probe(),
because that's what .probe() would most probably want to do.  I guess that
makes sense.

> > > > +2. Device Run-time PM Callbacks
> > > 
> > > > +In particular, it is recommended that ->runtime_suspend() return -EBUSY if
> > > > +device_may_wakeup() returns 'false' for the device.
> > > 
> > > What's the point of this?  I don't understand -- we don't want to
> > > discourage people from suspending devices with wakeup enabled.
> > 
> > device_may_wakeup(dev) == false means wake-up is disabled for dev, so
> > suspending it might not be a good idea.
> 
> Okay.  This needs to be rephrased.  For example,
> 
> 	In particular, if the driver requires remote wakeup capability
> 	for proper functioning and device_may_wakeup() returns 'false'
> 	for the device, then ->runtime_suspend() should return -EBUSY.
> 
> The point is that plenty of drivers can work perfectly well without
> remote wakeup, so they have no reason to check device_may_wakeup().

OK

Thanks a lot again, I'll do my best to address the comments in the next version
of the patch.

Best,
Rafael

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH update] PM: Introduce core framework for run-time PM of I/O devices (rev. 12)
  2009-08-05 13:25       ` Rafael J. Wysocki
  2009-08-05 21:47         ` [PATCH update] PM: Introduce core framework for run-time PM of I/O devices (rev. 12) Rafael J. Wysocki
@ 2009-08-05 21:47         ` Rafael J. Wysocki
  2009-08-06 17:01           ` Alan Stern
  2009-08-06 17:01           ` Alan Stern
  1 sibling, 2 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2009-08-05 21:47 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux-pm mailing list, Magnus Damm, Greg KH, Pavel Machek,
	Len Brown, LKML

Hi,

The patch below should address all of your recent comments.

Additionally I changed a few bits that I thought could turn out to be
problematic at one point.

Thanks,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM: Introduce core framework for run-time PM of I/O devices (rev. 12)

Introduce a core framework for run-time power management of I/O
devices.  Add device run-time PM fields to 'struct dev_pm_info'
and device run-time PM callbacks to 'struct dev_pm_ops'.  Introduce
a run-time PM workqueue and define some device run-time PM helper
functions at the core level.  Document all these things.

Special thanks to Alan Stern for his help with the design and
multiple detailed reviews of the pereceding versions of this patch
and to Magnus Damm for testing feedback.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 Documentation/power/runtime_pm.txt |  382 +++++++++++++++
 drivers/base/dd.c                  |   15 
 drivers/base/power/Makefile        |    1 
 drivers/base/power/main.c          |   20 
 drivers/base/power/power.h         |   31 -
 drivers/base/power/runtime.c       |  939 +++++++++++++++++++++++++++++++++++++
 include/linux/pm.h                 |  102 +++-
 include/linux/pm_runtime.h         |  115 ++++
 kernel/power/Kconfig               |   14 
 kernel/power/main.c                |   17 
 10 files changed, 1625 insertions(+), 11 deletions(-)

Index: linux-2.6/kernel/power/Kconfig
===================================================================
--- linux-2.6.orig/kernel/power/Kconfig
+++ linux-2.6/kernel/power/Kconfig
@@ -208,3 +208,17 @@ config APM_EMULATION
 	  random kernel OOPSes or reboots that don't seem to be related to
 	  anything, try disabling/enabling this option (or disabling/enabling
 	  APM in your BIOS).
+
+config PM_RUNTIME
+	bool "Run-time PM core functionality"
+	depends on PM
+	---help---
+	  Enable functionality allowing I/O devices to be put into energy-saving
+	  (low power) states at run time (or autosuspended) after a specified
+	  period of inactivity and woken up in response to a hardware-generated
+	  wake-up event or a driver's request.
+
+	  Hardware support is generally required for this functionality to work
+	  and the bus type drivers of the buses the devices are on are
+	  responsible for the actual handling of the autosuspend requests and
+	  wake-up events.
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -11,6 +11,7 @@
 #include <linux/kobject.h>
 #include <linux/string.h>
 #include <linux/resume-trace.h>
+#include <linux/workqueue.h>
 
 #include "power.h"
 
@@ -217,8 +218,24 @@ static struct attribute_group attr_group
 	.attrs = g,
 };
 
+#ifdef CONFIG_PM_RUNTIME
+struct workqueue_struct *pm_wq;
+
+static int __init pm_start_workqueue(void)
+{
+	pm_wq = create_freezeable_workqueue("pm");
+
+	return pm_wq ? 0 : -ENOMEM;
+}
+#else
+static inline int pm_start_workqueue(void) { return 0; }
+#endif
+
 static int __init pm_init(void)
 {
+	int error = pm_start_workqueue();
+	if (error)
+		return error;
 	power_kobj = kobject_create_and_add("power", NULL);
 	if (!power_kobj)
 		return -ENOMEM;
Index: linux-2.6/include/linux/pm.h
===================================================================
--- linux-2.6.orig/include/linux/pm.h
+++ linux-2.6/include/linux/pm.h
@@ -22,6 +22,10 @@
 #define _LINUX_PM_H
 
 #include <linux/list.h>
+#include <linux/workqueue.h>
+#include <linux/spinlock.h>
+#include <linux/wait.h>
+#include <linux/timer.h>
 
 /*
  * Callbacks for platform drivers to implement.
@@ -165,6 +169,28 @@ typedef struct pm_message {
  * It is allowed to unregister devices while the above callbacks are being
  * executed.  However, it is not allowed to unregister a device from within any
  * of its own callbacks.
+ *
+ * There also are the following callbacks related to run-time power management
+ * of devices:
+ *
+ * @runtime_suspend: Prepare the device for a condition in which it won't be
+ *	able to communicate with the CPU(s) and RAM due to power management.
+ *	This need not mean that the device should be put into a low power state.
+ *	For example, if the device is behind a link which is about to be turned
+ *	off, the device may remain at full power.  If the device does go to low
+ *	power and if device_may_wakeup(dev) is true, remote wake-up (i.e., a
+ *	hardware mechanism allowing the device to request a change of its power
+ *	state, such as PCI PME) should be enabled for it.
+ *
+ * @runtime_resume: Put the device into the fully active state in response to a
+ *	wake-up event generated by hardware or at the request of software.  If
+ *	necessary, put the device into the full power state and restore its
+ *	registers, so that it is fully operational.
+ *
+ * @runtime_idle: Device appears to be inactive and it might be put into a low
+ *	power state if all of the necessary conditions are satisfied.  Check
+ *	these conditions and handle the device as appropriate, possibly queueing
+ *	a suspend request for it.
  */
 
 struct dev_pm_ops {
@@ -182,6 +208,9 @@ struct dev_pm_ops {
 	int (*thaw_noirq)(struct device *dev);
 	int (*poweroff_noirq)(struct device *dev);
 	int (*restore_noirq)(struct device *dev);
+	int (*runtime_suspend)(struct device *dev);
+	int (*runtime_resume)(struct device *dev);
+	void (*runtime_idle)(struct device *dev);
 };
 
 /*
@@ -329,14 +358,81 @@ enum dpm_state {
 	DPM_OFF_IRQ,
 };
 
+/**
+ * Device run-time power management status.
+ *
+ * These status labels are used internally by the PM core to indicate the
+ * current status of a device with respect to the PM core operations.  They do
+ * not reflect the actual power state of the device or its status as seen by the
+ * driver.
+ *
+ * RPM_ACTIVE		Device is fully operational.  Indicates that the device
+ *			bus type's ->runtime_resume() callback has completed
+ *			successfully.
+ *
+ * RPM_SUSPENDED	Device bus type's ->runtime_suspend() callback has
+ *			completed successfully.  The device is regarded as
+ *			suspended.
+ *
+ * RPM_RESUMING		Device bus type's ->runtime_resume() callback is being
+ *			executed.
+ *
+ * RPM_SUSPENDING	Device bus type's ->runtime_suspend() callback is being
+ *			executed.
+ */
+
+enum rpm_status {
+	RPM_ACTIVE = 0,
+	RPM_RESUMING,
+	RPM_SUSPENDED,
+	RPM_SUSPENDING,
+};
+
+/**
+ * Device run-time power management request types.
+ *
+ * RPM_REQ_NONE		Do nothing.
+ *
+ * RPM_REQ_IDLE		Run the device bus type's ->runtime_idle() callback
+ *
+ * RPM_REQ_SUSPEND	Run the device bus type's ->runtime_suspend() callback
+ *
+ * RPM_REQ_RESUME	Run the device bus type's ->runtime_resume() callback
+ */
+
+enum rpm_request {
+	RPM_REQ_NONE = 0,
+	RPM_REQ_IDLE,
+	RPM_REQ_SUSPEND,
+	RPM_REQ_RESUME,
+};
+
 struct dev_pm_info {
 	pm_message_t		power_state;
-	unsigned		can_wakeup:1;
-	unsigned		should_wakeup:1;
+	unsigned int		can_wakeup:1;
+	unsigned int		should_wakeup:1;
 	enum dpm_state		status;		/* Owned by the PM core */
-#ifdef	CONFIG_PM_SLEEP
+#ifdef CONFIG_PM_SLEEP
 	struct list_head	entry;
 #endif
+#ifdef CONFIG_PM_RUNTIME
+	struct timer_list	suspend_timer;
+	unsigned long		timer_expires;
+	struct work_struct	work;
+	wait_queue_head_t	wait_queue;
+	spinlock_t		lock;
+	atomic_t		usage_count;
+	atomic_t		child_count;
+	unsigned int		disable_depth:3;
+	unsigned int		ignore_children:1;
+	unsigned int		runtime_failure:1;
+	unsigned int		idle_notification:1;
+	unsigned int		request_pending:1;
+	unsigned int		deferred_resume:1;
+	enum rpm_request	request;
+	enum rpm_status		runtime_status;
+	int			last_error;
+#endif
 };
 
 /*
Index: linux-2.6/drivers/base/power/Makefile
===================================================================
--- linux-2.6.orig/drivers/base/power/Makefile
+++ linux-2.6/drivers/base/power/Makefile
@@ -1,5 +1,6 @@
 obj-$(CONFIG_PM)	+= sysfs.o
 obj-$(CONFIG_PM_SLEEP)	+= main.o
+obj-$(CONFIG_PM_RUNTIME)	+= runtime.o
 obj-$(CONFIG_PM_TRACE_RTC)	+= trace.o
 
 ccflags-$(CONFIG_DEBUG_DRIVER) := -DDEBUG
Index: linux-2.6/drivers/base/power/runtime.c
===================================================================
--- /dev/null
+++ linux-2.6/drivers/base/power/runtime.c
@@ -0,0 +1,939 @@
+/*
+ * drivers/base/power/runtime.c - Helper functions for device run-time PM
+ *
+ * Copyright (c) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+ *
+ * This file is released under the GPLv2.
+ */
+
+#include <linux/sched.h>
+#include <linux/pm_runtime.h>
+#include <linux/jiffies.h>
+
+static int __pm_request_idle(struct device *dev);
+static int __pm_request_resume(struct device *dev);
+
+/**
+ * pm_runtime_deactivate_timer - Deactivate given device's suspend timer.
+ * @dev: Device to handle.
+ */
+static void pm_runtime_deactivate_timer(struct device *dev)
+{
+	if (dev->power.timer_expires > 0) {
+		del_timer(&dev->power.suspend_timer);
+		dev->power.timer_expires = 0;
+	}
+}
+
+/**
+ * pm_runtime_cancel_pending - Deactivate suspend timer and cancel requests.
+ * @dev: Device to handle.
+ */
+static void pm_runtime_cancel_pending(struct device *dev)
+{
+	pm_runtime_deactivate_timer(dev);
+	/*
+	 * In case there's a request pending, make sure its work function will
+	 * return without doing anything.
+	 */
+	dev->power.request = RPM_REQ_NONE;
+}
+
+/**
+ * __pm_runtime_idle - Notify device bus type if the device can be suspended.
+ * @dev: Device to notify the bus type about.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_runtime_idle(struct device *dev)
+	__releases(&dev->power.lock) __acquires(&dev->power.lock)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_failure)
+		retval = -EINVAL;
+	else if (dev->power.idle_notification)
+		retval = -EINPROGRESS;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0
+	    || dev->power.runtime_status != RPM_ACTIVE)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		return retval;
+
+	if (dev->power.request_pending) {
+		/*
+		 * If an idle notification request is pending, cancel it.  Any
+		 * other pending request takes precedence over us.
+		 */
+		if (dev->power.request == RPM_REQ_IDLE)
+			dev->power.request = RPM_REQ_NONE;
+		else if (dev->power.request != RPM_REQ_NONE)
+			return -EAGAIN;
+	}
+
+	dev->power.idle_notification = true;
+
+	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle) {
+		spin_unlock_irq(&dev->power.lock);
+
+		dev->bus->pm->runtime_idle(dev);
+
+		spin_lock_irq(&dev->power.lock);
+	}
+
+	dev->power.idle_notification = false;
+	wake_up_all(&dev->power.wait_queue);
+
+	return 0;
+}
+
+/**
+ * pm_runtime_idle - Notify device bus type if the device can be suspended.
+ * @dev: Device to notify the bus type about.
+ */
+int pm_runtime_idle(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_idle(dev);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_idle);
+
+/**
+ * __pm_runtime_suspend - Carry out run-time suspend of given device.
+ * @dev: Device to suspend.
+ * @from_wq: If set, the function has been called via pm_wq.
+ *
+ * Check if the device can be suspended and run the ->runtime_suspend() callback
+ * provided by its bus type.  If another suspend has been started earlier, wait
+ * for it to finish.  If there's an idle notification pending, cancel it.  If
+ * there's a suspend request scheduled while this function is running and @sync
+ * is 'true', cancel that request.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+int __pm_runtime_suspend(struct device *dev, bool from_wq)
+	__releases(&dev->power.lock) __acquires(&dev->power.lock)
+{
+	struct device *parent = NULL;
+	bool notify = false;
+	int retval = 0;
+
+ repeat:
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/* Pending resume requests take precedence over us. */
+		if (dev->power.request == RPM_REQ_RESUME)
+			return -EAGAIN;
+		/* Other pending requests need to be canceled. */
+		dev->power.request = RPM_REQ_NONE;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_RESUMING
+	    || dev->power.disable_depth > 0
+	    || atomic_read(&dev->power.usage_count) > 0)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		return retval;
+
+	if (dev->power.runtime_status == RPM_SUSPENDING) {
+		DEFINE_WAIT(wait);
+
+		if (from_wq)
+			return -EINPROGRESS;
+
+		/* Wait for the other suspend running in parallel with us. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_SUSPENDING)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+		goto repeat;
+	}
+
+	dev->power.runtime_status = RPM_SUSPENDING;
+
+	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_suspend) {
+		spin_unlock_irq(&dev->power.lock);
+
+		retval = dev->bus->pm->runtime_suspend(dev);
+
+		spin_lock_irq(&dev->power.lock);
+	} else {
+		retval = -ENOSYS;
+	}
+
+	if (retval) {
+		dev->power.runtime_status = RPM_ACTIVE;
+		pm_runtime_cancel_pending(dev);
+		dev->power.deferred_resume = false;
+
+		if (retval == -EAGAIN || retval == -EBUSY) {
+			notify = true;
+		} else {
+			dev->power.runtime_failure = true;
+			dev->power.last_error = retval;
+		}
+	} else {
+		dev->power.runtime_status = RPM_SUSPENDED;
+
+		if (dev->parent) {
+			parent = dev->parent;
+			atomic_add_unless(&parent->power.child_count, -1, 0);
+		}
+
+	}
+	wake_up_all(&dev->power.wait_queue);
+
+	if (dev->power.deferred_resume) {
+		__pm_request_resume(dev);
+		dev->power.deferred_resume = false;
+	}
+
+	if (notify)
+		__pm_runtime_idle(dev);
+
+	if (parent && !parent->power.ignore_children) {
+		spin_unlock_irq(&dev->power.lock);
+
+		pm_request_idle(parent);
+
+		spin_lock_irq(&dev->power.lock);
+	}
+
+	return retval;
+}
+
+/**
+ * pm_runtime_suspend - Carry out run-time suspend of given device.
+ * @dev: Device to suspend.
+ */
+int pm_runtime_suspend(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_suspend(dev, false);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_suspend);
+
+/**
+ * __pm_runtime_resume - Carry out run-time resume of given device.
+ * @dev: Device to resume.
+ * @from_wq: If set, the function has been called via pm_wq.
+ *
+ * Check if the device can be woken up and run the ->runtime_resume() callback
+ * provided by its bus type.  If another resume has been started earlier, wait
+ * for it to finish.  If there's a suspend running in parallel with this
+ * function, wait for it to finish and resume the device.  If there's a suspend
+ * request or idle notification pending, cancel it.  If there's a resume request
+ * scheduled while this function is running, cancel that request.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+int __pm_runtime_resume(struct device *dev, bool from_wq)
+	__releases(&dev->power.lock) __acquires(&dev->power.lock)
+{
+	struct device *parent = NULL;
+	int retval = 0;
+
+ repeat:
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	pm_runtime_cancel_pending(dev);
+
+	if (dev->power.runtime_status == RPM_ACTIVE)
+		retval = 1;
+	else if (dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	if (retval)
+		return retval;
+
+	if (dev->power.runtime_status == RPM_RESUMING
+	    || dev->power.runtime_status == RPM_SUSPENDING) {
+		DEFINE_WAIT(wait);
+
+		if (from_wq) {
+			if (dev->power.runtime_status == RPM_SUSPENDING)
+				dev->power.deferred_resume = true;
+			return -EINPROGRESS;
+		}
+
+		/* Wait for the operation carried out in parallel with us. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_RESUMING
+			    && dev->power.runtime_status != RPM_SUSPENDING)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+		goto repeat;
+	}
+
+	if (!parent && dev->parent) {
+		/*
+		 * Increment the parent's resume counter and resume it if
+		 * necessary.
+		 */
+		parent = dev->parent;
+		spin_unlock_irq(&dev->power.lock);
+
+		retval = pm_runtime_get_sync(parent);
+
+		spin_lock_irq(&dev->power.lock);
+		/* We can resume if the parent's run-time PM is disabled. */
+		if (retval < 0 && retval != -EAGAIN)
+			goto out_parent;
+		retval = 0;
+		goto repeat;
+	}
+
+	dev->power.runtime_status = RPM_RESUMING;
+
+	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_resume) {
+		spin_unlock_irq(&dev->power.lock);
+
+		retval = dev->bus->pm->runtime_resume(dev);
+
+		spin_lock_irq(&dev->power.lock);
+	} else {
+		retval = -ENOSYS;
+	}
+
+	if (retval) {
+		dev->power.runtime_status = RPM_SUSPENDED;
+
+		dev->power.runtime_failure = true;
+		dev->power.last_error = retval;
+
+		pm_runtime_cancel_pending(dev);
+	} else {
+		dev->power.runtime_status = RPM_ACTIVE;
+
+		if (parent)
+			atomic_inc(&parent->power.child_count);
+	}
+	wake_up_all(&dev->power.wait_queue);
+
+	if (!retval)
+		__pm_request_idle(dev);
+
+ out_parent:
+	if (parent) {
+		spin_unlock_irq(&dev->power.lock);
+
+		pm_runtime_put(parent);
+
+		spin_lock_irq(&dev->power.lock);
+	}
+
+	return retval;
+}
+
+/**
+ * pm_runtime_resume - Carry out run-time resume of given device.
+ * @dev: Device to suspend.
+ */
+int pm_runtime_resume(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_resume(dev, false);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_resume);
+
+/**
+ * pm_runtime_work - Universal run-time PM work function.
+ * @work: Work structure used for scheduling the execution of this function.
+ *
+ * Use @work to get the device object the work is to be done for, determine what
+ * is to be done and execute the appropriate run-time PM function.
+ */
+static void pm_runtime_work(struct work_struct *work)
+{
+	struct device *dev = container_of(work, struct device, power.work);
+	enum rpm_request req;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (!dev->power.request_pending)
+		goto out;
+
+	req = dev->power.request;
+	dev->power.request = RPM_REQ_NONE;
+	dev->power.request_pending = false;
+
+	switch (req) {
+	case RPM_REQ_NONE:
+		break;
+	case RPM_REQ_IDLE:
+		__pm_runtime_idle(dev);
+		break;
+	case RPM_REQ_SUSPEND:
+		__pm_runtime_suspend(dev, true);
+		break;
+	case RPM_REQ_RESUME:
+		__pm_runtime_resume(dev, true);
+		break;
+	}
+
+ out:
+	spin_unlock_irq(&dev->power.lock);
+}
+
+/**
+ * __pm_request_idle - Submit an idle notification request for given device.
+ * @dev: Device to handle.
+ *
+ * Check if the device's run-time PM status is correct for suspending the device
+ * and queue up a request to run __pm_runtime_idle() for it.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_request_idle(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_failure)
+		retval = -EINVAL;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0
+	    || dev->power.timer_expires > 0
+	    || dev->power.runtime_status == RPM_SUSPENDED
+	    || dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		return retval;
+
+	if (dev->power.request_pending && dev->power.request != RPM_REQ_NONE) {
+		/* Any requests other then RPM_REQ_IDLE take precedence. */
+		if (dev->power.request != RPM_REQ_IDLE)
+			retval = -EAGAIN;
+		return retval;
+	}
+
+	dev->power.request = RPM_REQ_IDLE;
+	if (dev->power.request_pending)
+		return retval;
+
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+	return retval;
+}
+
+/**
+ * pm_request_idle - Submit an idle notification request for given device.
+ * @dev: Device to handle.
+ */
+int pm_request_idle(struct device *dev)
+{
+	unsigned long flags;
+	int retval;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+	retval = __pm_request_idle(dev);
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_request_idle);
+
+/**
+ * __pm_request_suspend - Submit a suspend request for given device.
+ * @dev: Device to suspend.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_request_suspend(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	else if (dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EINPROGRESS;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval < 0)
+		return retval;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/*
+		 * Pending resume requests take precedence over us, but we can
+		 * overtake any other pending request.
+		 */
+		if (dev->power.request == RPM_REQ_RESUME)
+			retval = -EAGAIN;
+		else if (dev->power.request != RPM_REQ_SUSPEND)
+			dev->power.request = retval ?
+						RPM_REQ_NONE : RPM_REQ_SUSPEND;
+		return retval;
+	} else if (retval) {
+		return retval;
+	}
+
+	dev->power.request = RPM_REQ_SUSPEND;
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+	return 0;
+}
+
+/**
+ * pm_suspend_timer_fn - Timer function for pm_schedule_suspend().
+ * @data: Device pointer passed by pm_schedule_suspend().
+ *
+ * Check if the time is right and execute __pm_request_suspend() in that case.
+ */
+static void pm_suspend_timer_fn(unsigned long data)
+{
+	struct device *dev = (struct device *)data;
+	unsigned long flags;
+	unsigned long expires;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	expires = dev->power.timer_expires;
+	/* If 'expire' is after 'jiffies' we've been called too early. */
+	if (expires > 0 && !time_after(expires, jiffies)) {
+		dev->power.timer_expires = 0;
+		__pm_request_suspend(dev);
+	}
+
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+}
+
+/**
+ * pm_schedule_suspend - Set up a timer to submit a suspend request in future.
+ * @dev: Device to suspend.
+ * @delay: Time to wait before submitting a suspend request, in milliseconds.
+ */
+int pm_schedule_suspend(struct device *dev, unsigned int delay)
+{
+	unsigned long flags;
+	int retval = 0;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (dev->power.runtime_failure) {
+		retval = -EINVAL;
+		goto out;
+	}
+
+	if (!delay) {
+		retval = __pm_request_suspend(dev);
+		goto out;
+	}
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/*
+		 * Pending resume requests take precedence over us, but any
+		 * other pending requests have to be canceled.
+		 */
+		if (dev->power.request == RPM_REQ_RESUME) {
+			retval = -EAGAIN;
+			goto out;
+		}
+		dev->power.request = RPM_REQ_NONE;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EINPROGRESS;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		goto out;
+
+	dev->power.timer_expires = jiffies + msecs_to_jiffies(delay);
+	mod_timer(&dev->power.suspend_timer, dev->power.timer_expires);
+
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_schedule_suspend);
+
+/**
+ * pm_request_resume - Submit a resume request for given device.
+ * @dev: Device to resume.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_request_resume(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	if (dev->power.runtime_status == RPM_ACTIVE)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_RESUMING)
+		retval = -EINPROGRESS;
+	else if (dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	if (retval < 0)
+		return retval;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/* If non-resume request is pending, we can overtake it. */
+		dev->power.request = retval ? RPM_REQ_NONE : RPM_REQ_RESUME;
+		return retval;
+	} else if (retval) {
+		return retval;
+	}
+
+	dev->power.request = RPM_REQ_RESUME;
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+	return retval;
+}
+
+/**
+ * pm_request_resume - Submit a resume request for given device.
+ * @dev: Device to resume.
+ */
+int pm_request_resume(struct device *dev)
+{
+	unsigned long flags;
+	int retval;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+	retval = __pm_request_resume(dev);
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_request_resume);
+
+/**
+ * __pm_runtime_get - Reference count a device and wake it up, if necessary.
+ * @dev: Device to handle.
+ * @sync: If set and the device is suspended, resume it synchronously.
+ *
+ * Increment the usage count of the device and if it was zero previously,
+ * resume it or submit a resume request for it, depending on the value of @sync.
+ */
+int __pm_runtime_get(struct device *dev, bool sync)
+{
+	int retval = 1;
+
+	if (atomic_add_return(1, &dev->power.usage_count) == 1)
+		retval = sync ? pm_runtime_resume(dev) : pm_request_resume(dev);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(__pm_runtime_get);
+
+/**
+ * __pm_runtime_put - Decrement the device's usage counter and notify its bus.
+ * @dev: Device to handle.
+ * @sync: If the device's bus type is to be notified, do that synchronously.
+ *
+ * Decrement the usage count of the device and if it reaches zero, carry out a
+ * synchronous idle notification or submit an idle notification request for it,
+ * depending on the value of @sync.
+ */
+int __pm_runtime_put(struct device *dev, bool sync)
+{
+	int retval = 0;
+
+	if (atomic_dec_and_test(&dev->power.usage_count))
+		retval = sync ? pm_runtime_idle(dev) : pm_request_idle(dev);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(__pm_runtime_put);
+
+/**
+ * __pm_runtime_set_status - Set run-time PM status of a device.
+ * @dev: Device to handle.
+ * @status: New run-time PM status of the device.
+ *
+ * If run-time PM of the device is disabled or its power.runtime_failure flag is
+ * set, the status may be changed either to RPM_ACTIVE, or to RPM_SUSPENDED, as
+ * long as that reflects the actual state of the device.  However, if the device
+ * has a parent and the parent is not active, and the parent's
+ * power.ignore_children flag is unset, the device's status cannot be set to
+ * RPM_ACTIVE, so -EBUSY is returned in that case.
+ *
+ * If successful, __pm_runtime_set_status() clears the power.runtime_failure
+ * flag and the device parent's counter of unsuspended children is modified to
+ * reflect the new status.  If the new status is RPM_SUSPENDED, an idle
+ * notification request for the parent is submitted.
+ */
+int __pm_runtime_set_status(struct device *dev, unsigned int status)
+{
+	struct device *parent = dev->parent;
+	unsigned long flags;
+	bool notify_parent = false;
+	int error = 0;
+
+	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
+		return -EINVAL;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (!dev->power.runtime_failure && !dev->power.disable_depth)
+		goto out;
+
+	if (dev->power.runtime_status == status)
+		goto out_set;
+
+	if (status == RPM_SUSPENDED) {
+		/* It always is possible to set the status to 'suspended'. */
+		if (parent) {
+			atomic_add_unless(&parent->power.child_count, -1, 0);
+			notify_parent = !parent->power.ignore_children;
+		}
+		goto out_set;
+	}
+
+	if (parent) {
+		spin_lock_irq(&parent->power.lock);
+
+		/*
+		 * It is invalid to put an active child under a parent that is
+		 * not active, has run-time PM enabled and the
+		 * 'power.ignore_children' flag unset.
+		 */
+		if (!parent->power.disable_depth
+		    && !parent->power.ignore_children
+		    && parent->power.runtime_status != RPM_ACTIVE) {
+			error = -EBUSY;
+		} else {
+			if (dev->power.runtime_status == RPM_SUSPENDED)
+				atomic_inc(&parent->power.child_count);
+		}
+
+		spin_unlock_irq(&parent->power.lock);
+
+		if (error)
+			goto out;
+	}
+
+ out_set:
+	dev->power.runtime_status = status;
+	dev->power.runtime_failure = false;
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	if (notify_parent)
+		pm_request_idle(parent);
+
+	return error;
+}
+EXPORT_SYMBOL_GPL(__pm_runtime_set_status);
+
+/**
+ * pm_runtime_enable - Enable run-time PM of a device.
+ * @dev: Device to handle.
+ */
+void pm_runtime_enable(struct device *dev)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (dev->power.disable_depth > 0)
+		dev->power.disable_depth--;
+	else
+		dev_warn(dev, "Unbalanced %s!", __func__);
+
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+}
+EXPORT_SYMBOL_GPL(pm_runtime_enable);
+
+/**
+ * __pm_runtime_disable - Disable run-time PM of a device.
+ * @dev: Device to handle.
+ * @check_resume: If set, check if there's a resume request for the device.
+ *
+ * Increment power.disable_depth for the device and if was zero previously,
+ * cancel all pending run-time PM requests for the device and wait for all
+ * operations in progress to complete.  The device can be either active or
+ * suspended after its run-time PM has been disabled.
+ *
+ * If @check_resume is set and there's a resume request pending when
+ * __pm_runtime_disable() is called and power.disable_depth is zero, the
+ * function will wake up the device before disabling its run-time PM and will
+ * return 1.  Otherwise, 0 is returned.
+ */
+int __pm_runtime_disable(struct device *dev, bool check_resume)
+{
+	int retval = 0;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (dev->power.disable_depth > 0) {
+		dev->power.disable_depth++;
+		goto out;
+	}
+
+	/*
+	 * Wake up the device if there's a resume request pending, because that
+	 * means there probably is some I/O to process and disabling run-time PM
+	 * shouldn't prevent the device from processing the I/O.
+	 */
+	if (check_resume && dev->power.request_pending
+	    && dev->power.request == RPM_REQ_RESUME) {
+		/*
+		 * Prevent suspends and idle notifications from being carried
+		 * out after we have woken up the device.
+		 */
+		pm_runtime_get_noresume(dev);
+
+		__pm_runtime_resume(dev, false);
+
+		pm_runtime_put_noidle(dev);
+		retval = 1;
+	}
+
+	if (dev->power.disable_depth++ > 0)
+		goto out;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		dev->power.request = RPM_REQ_NONE;
+		spin_unlock_irq(&dev->power.lock);
+
+		cancel_work_sync(&dev->power.work);
+
+		spin_lock_irq(&dev->power.lock);
+		dev->power.request_pending = false;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDING
+	    || dev->power.runtime_status == RPM_RESUMING
+	    || dev->power.idle_notification) {
+		DEFINE_WAIT(wait);
+
+		/* Suspend or wake-up in progress. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_SUSPENDING
+			    && dev->power.runtime_status != RPM_RESUMING
+			    && !dev->power.idle_notification)
+				break;
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+	}
+
+ out:
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_disable);
+
+/**
+ * pm_runtime_init - Initialize run-time PM fields in given device object.
+ * @dev: Device object to initialize.
+ */
+void pm_runtime_init(struct device *dev)
+{
+	spin_lock_init(&dev->power.lock);
+
+	dev->power.runtime_status = RPM_SUSPENDED;
+	dev->power.idle_notification = false;
+
+	dev->power.disable_depth = 1;
+	atomic_set(&dev->power.usage_count, 0);
+
+	dev->power.runtime_failure = false;
+	dev->power.last_error = 0;
+
+	atomic_set(&dev->power.child_count, 0);
+	pm_suspend_ignore_children(dev, false);
+
+	dev->power.request_pending = false;
+	dev->power.request = RPM_REQ_NONE;
+	dev->power.deferred_resume = false;
+	INIT_WORK(&dev->power.work, pm_runtime_work);
+
+	dev->power.timer_expires = 0;
+	setup_timer(&dev->power.suspend_timer, pm_suspend_timer_fn,
+			(unsigned long)dev);
+
+	init_waitqueue_head(&dev->power.wait_queue);
+}
+
+/**
+ * pm_runtime_remove - Prepare for removing a device from device hierarchy.
+ * @dev: Device object being removed from device hierarchy.
+ */
+void pm_runtime_remove(struct device *dev)
+{
+	__pm_runtime_disable(dev, false);
+
+	/* Change the status back to 'suspended' to match the initial status. */
+	if (dev->power.runtime_status == RPM_ACTIVE)
+		pm_runtime_set_suspended(dev);
+}
Index: linux-2.6/include/linux/pm_runtime.h
===================================================================
--- /dev/null
+++ linux-2.6/include/linux/pm_runtime.h
@@ -0,0 +1,115 @@
+/*
+ * pm_runtime.h - Device run-time power management helper functions.
+ *
+ * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * This file is released under the GPLv2.
+ */
+
+#ifndef _LINUX_PM_RUNTIME_H
+#define _LINUX_PM_RUNTIME_H
+
+#include <linux/device.h>
+#include <linux/pm.h>
+
+#ifdef CONFIG_PM_RUNTIME
+
+extern struct workqueue_struct *pm_wq;
+
+extern int pm_runtime_idle(struct device *dev);
+extern int pm_runtime_suspend(struct device *dev);
+extern int pm_runtime_resume(struct device *dev);
+extern int pm_request_idle(struct device *dev);
+extern int pm_schedule_suspend(struct device *dev, unsigned int delay);
+extern int pm_request_resume(struct device *dev);
+extern int __pm_runtime_get(struct device *dev, bool sync);
+extern int __pm_runtime_put(struct device *dev, bool sync);
+extern int __pm_runtime_set_status(struct device *dev, unsigned int status);
+extern void pm_runtime_enable(struct device *dev);
+extern int __pm_runtime_disable(struct device *dev, bool check_resume);
+
+static inline bool pm_children_suspended(struct device *dev)
+{
+	return dev->power.ignore_children
+		|| !atomic_read(&dev->power.child_count);
+}
+
+static inline void pm_suspend_ignore_children(struct device *dev, bool enable)
+{
+	dev->power.ignore_children = enable;
+}
+
+static inline void pm_runtime_get_noresume(struct device *dev)
+{
+	atomic_inc(&dev->power.usage_count);
+}
+
+static inline void pm_runtime_put_noidle(struct device *dev)
+{
+	atomic_add_unless(&dev->power.usage_count, -1, 0);
+}
+
+#else /* !CONFIG_PM_RUNTIME */
+
+static inline int pm_runtime_idle(struct device *dev) { return -ENOSYS; }
+static inline int pm_runtime_suspend(struct device *dev) { return -ENOSYS; }
+static inline int pm_runtime_resume(struct device *dev) { return 0; }
+static inline int pm_request_idle(struct device *dev) { return -ENOSYS; }
+static inline int pm_schedule_suspend(struct device *dev, unsigned int delay)
+{
+	return -ENOSYS;
+}
+static inline int pm_request_resume(struct device *dev) { return 0; }
+static inline int __pm_runtime_get(struct device *dev, bool sync) { return 1; }
+static inline int __pm_runtime_put(struct device *dev, bool sync) { return 0; }
+static inline int __pm_runtime_set_status(struct device *dev,
+					    unsigned int status) { return 0; }
+static inline void pm_runtime_enable(struct device *dev) {}
+static inline int __pm_runtime_disable(struct device *dev, bool check_resume)
+{
+	return 0;
+}
+
+static inline bool pm_children_suspended(struct device *dev) { return false; }
+static inline void pm_suspend_ignore_children(struct device *dev, bool en) {}
+static inline void pm_runtime_get_noresume(struct device *dev) {}
+static inline void pm_runtime_put_noidle(struct device *dev) {}
+
+#endif /* !CONFIG_PM_RUNTIME */
+
+static inline int pm_runtime_get(struct device *dev)
+{
+	return __pm_runtime_get(dev, false);
+}
+
+static inline int pm_runtime_get_sync(struct device *dev)
+{
+	return __pm_runtime_get(dev, true);
+}
+
+static inline int pm_runtime_put(struct device *dev)
+{
+	return __pm_runtime_put(dev, false);
+}
+
+static inline int pm_runtime_put_sync(struct device *dev)
+{
+	return __pm_runtime_put(dev, true);
+}
+
+static inline int pm_runtime_set_active(struct device *dev)
+{
+	return __pm_runtime_set_status(dev, RPM_ACTIVE);
+}
+
+static inline void pm_runtime_set_suspended(struct device *dev)
+{
+	__pm_runtime_set_status(dev, RPM_SUSPENDED);
+}
+
+static inline int pm_runtime_disable(struct device *dev)
+{
+	return __pm_runtime_disable(dev, true);
+}
+
+#endif
Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -21,6 +21,7 @@
 #include <linux/kallsyms.h>
 #include <linux/mutex.h>
 #include <linux/pm.h>
+#include <linux/pm_runtime.h>
 #include <linux/resume-trace.h>
 #include <linux/rwsem.h>
 #include <linux/interrupt.h>
@@ -49,6 +50,16 @@ static DEFINE_MUTEX(dpm_list_mtx);
 static bool transition_started;
 
 /**
+ * device_pm_init - Initialize the PM-related part of a device object
+ * @dev: Device object being initialized.
+ */
+void device_pm_init(struct device *dev)
+{
+	dev->power.status = DPM_ON;
+	pm_runtime_init(dev);
+}
+
+/**
  *	device_pm_lock - lock the list of active devices used by the PM core
  */
 void device_pm_lock(void)
@@ -105,6 +116,7 @@ void device_pm_remove(struct device *dev
 	mutex_lock(&dpm_list_mtx);
 	list_del_init(&dev->power.entry);
 	mutex_unlock(&dpm_list_mtx);
+	pm_runtime_remove(dev);
 }
 
 /**
@@ -512,6 +524,7 @@ static void dpm_complete(pm_message_t st
 			mutex_unlock(&dpm_list_mtx);
 
 			device_complete(dev, state);
+			pm_runtime_enable(dev);
 
 			mutex_lock(&dpm_list_mtx);
 		}
@@ -757,11 +770,16 @@ static int dpm_prepare(pm_message_t stat
 		dev->power.status = DPM_PREPARING;
 		mutex_unlock(&dpm_list_mtx);
 
-		error = device_prepare(dev, state);
+		if (pm_runtime_disable(dev) && device_may_wakeup(dev))
+			/* Wake-up during suspend. */
+			error = -EBUSY;
+		else
+			error = device_prepare(dev, state);
 
 		mutex_lock(&dpm_list_mtx);
 		if (error) {
 			dev->power.status = DPM_ON;
+			pm_runtime_enable(dev);
 			if (error == -EAGAIN) {
 				put_device(dev);
 				error = 0;
Index: linux-2.6/drivers/base/dd.c
===================================================================
--- linux-2.6.orig/drivers/base/dd.c
+++ linux-2.6/drivers/base/dd.c
@@ -23,6 +23,7 @@
 #include <linux/kthread.h>
 #include <linux/wait.h>
 #include <linux/async.h>
+#include <linux/pm_runtime.h>
 
 #include "base.h"
 #include "power/power.h"
@@ -202,7 +203,17 @@ int driver_probe_device(struct device_dr
 	pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
 		 drv->bus->name, __func__, dev_name(dev), drv->name);
 
+	/*
+	 * Wait for run-time PM calls to complete and prevent new suspend calls
+	 * until the probe is done.
+	 */
+	pm_runtime_disable(dev);
+	pm_runtime_get_noresume(dev);
+	pm_runtime_enable(dev);
 	ret = really_probe(dev, drv);
+	pm_runtime_put_noidle(dev);
+	if (!ret)
+		pm_runtime_idle(dev);
 
 	return ret;
 }
@@ -306,6 +317,8 @@ static void __device_release_driver(stru
 
 	drv = dev->driver;
 	if (drv) {
+		pm_runtime_disable(dev);
+
 		driver_sysfs_remove(dev);
 
 		if (dev->bus)
@@ -324,6 +337,8 @@ static void __device_release_driver(stru
 			blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
 						     BUS_NOTIFY_UNBOUND_DRIVER,
 						     dev);
+
+		pm_runtime_enable(dev);
 	}
 }
 
Index: linux-2.6/drivers/base/power/power.h
===================================================================
--- linux-2.6.orig/drivers/base/power/power.h
+++ linux-2.6/drivers/base/power/power.h
@@ -1,7 +1,14 @@
-static inline void device_pm_init(struct device *dev)
-{
-	dev->power.status = DPM_ON;
-}
+#ifdef CONFIG_PM_RUNTIME
+
+extern void pm_runtime_init(struct device *dev);
+extern void pm_runtime_remove(struct device *dev);
+
+#else /* !CONFIG_PM_RUNTIME */
+
+static inline void pm_runtime_init(struct device *dev) {}
+static inline void pm_runtime_remove(struct device *dev) {}
+
+#endif /* !CONFIG_PM_RUNTIME */
 
 #ifdef CONFIG_PM_SLEEP
 
@@ -16,23 +23,33 @@ static inline struct device *to_device(s
 	return container_of(entry, struct device, power.entry);
 }
 
+extern void device_pm_init(struct device *dev);
 extern void device_pm_add(struct device *);
 extern void device_pm_remove(struct device *);
 extern void device_pm_move_before(struct device *, struct device *);
 extern void device_pm_move_after(struct device *, struct device *);
 extern void device_pm_move_last(struct device *);
 
-#else /* CONFIG_PM_SLEEP */
+#else /* !CONFIG_PM_SLEEP */
+
+static inline void device_pm_init(struct device *dev)
+{
+	pm_runtime_init(dev);
+}
+
+static inline void device_pm_remove(struct device *dev)
+{
+	pm_runtime_remove(dev);
+}
 
 static inline void device_pm_add(struct device *dev) {}
-static inline void device_pm_remove(struct device *dev) {}
 static inline void device_pm_move_before(struct device *deva,
 					 struct device *devb) {}
 static inline void device_pm_move_after(struct device *deva,
 					struct device *devb) {}
 static inline void device_pm_move_last(struct device *dev) {}
 
-#endif
+#endif /* !CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM
 
Index: linux-2.6/Documentation/power/runtime_pm.txt
===================================================================
--- /dev/null
+++ linux-2.6/Documentation/power/runtime_pm.txt
@@ -0,0 +1,382 @@
+Run-time Power Management Framework for I/O Devices
+
+(C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+
+1. Introduction
+
+Support for run-time power management (run-time PM) of I/O devices is provided
+at the power management core (PM core) level by means of:
+
+* The power management workqueue pm_wq in which bus types and device drivers can
+  put their PM-related work items.  It is strongly recommended that pm_wq be
+  used for queuing all work items related to run-time PM, because this allows
+  them to be synchronized with system-wide power transitions (suspend to RAM,
+  hibernation and resume from system sleep states).  pm_wq is declared in
+  include/linux/pm_runtime.h and defined in kernel/power/main.c.
+
+* A number of run-time PM fields in the 'power' member of 'struct device' (which
+  is of the type 'struct dev_pm_info', defined in include/linux/pm.h) that can
+  be used for synchronizing run-time PM operations with one another.
+
+* Three device run-time PM callbacks in 'struct dev_pm_ops' (defined in
+  include/linux/pm.h).
+
+* A set of helper functions defined in drivers/base/power/runtime.c that can be
+  used for carrying out run-time PM operations in such a way that the
+  synchronization between them is taken care of by the PM core.  Bus types and
+  device drivers are encouraged to use these functions.
+
+The run-time PM callbacks present in 'struct dev_pm_ops', the device run-time PM
+fields of 'struct dev_pm_info' and the core helper functions provided for
+run-time PM are described below.
+
+2. Device Run-time PM Callbacks
+
+There are three device run-time PM callbacks defined in 'struct dev_pm_ops':
+
+struct dev_pm_ops {
+	...
+	int (*runtime_suspend)(struct device *dev);
+	int (*runtime_resume)(struct device *dev);
+	void (*runtime_idle)(struct device *dev);
+	...
+};
+
+The ->runtime_suspend() callback is executed by the PM core for the bus type of
+the device being suspended.  The bus type's callback is then _entirely_
+_responsible_ for handling the device as appropriate, which may, but need not
+include executing the device driver's own ->runtime_suspend() callback (from the
+PM core's point of view it is not necessary to implement a ->runtime_suspend()
+callback in a device driver as long as the bus type's ->runtime_suspend() knows
+what to do to handle the device).
+
+  * Once the bus type's ->runtime_suspend() callback has completed successfully
+    for given device, the PM core regards the device as suspended, which need
+    not mean that the device has been put into a low power state.  It is
+    supposed to mean, however, that the device will not process data and will
+    not communicate with the CPU(s) and RAM until its bus type's
+    ->runtime_resume() callback is executed for it.  The run-time PM status of
+    a device after successful execution of its bus type's ->runtime_suspend()
+    callback is 'suspended'.
+
+  * If the bus type's ->runtime_suspend() callback returns -EBUSY or -EAGAIN,
+    the device's run-time PM status is supposed to be 'active', which means that
+    the device _must_ be fully operational afterwards.
+
+  * If the bus type's ->runtime_suspend() callback returns an error code
+    different from -EBUSY or -EAGAIN, the PM core regards this as a fatal
+    error and will refuse to run the helper functions described in Section 4
+    for the device, until the status of it is directly set either to 'active'
+    or to 'suspended' (the PM core provides special helper functions for this
+    purpose).
+
+In particular, if the driver requires remote wakeup capability for proper
+functioning and device_may_wakeup() returns 'false' for the device, then
+->runtime_suspend() should return -EBUSY.  On the other hand, if
+device_may_wakeup() returns 'true' for the device and the device is put
+into a low power state during the execution of its bus type's
+->runtime_suspend(), it is expected that remote wake-up (i.e. hardware mechanism
+allowing the device to request a change of its power state, such as PCI PME)
+will be enabled for the device.  Generally, remote wake-up should be enabled
+for all input devices put into a low power state at run time.
+
+The ->runtime_resume() callback is executed by the PM core for the bus type of
+the device being woken up.  The bus type's callback is then _entirely_
+_responsible_ for handling the device as appropriate, which may, but need not
+include executing the device driver's own ->runtime_resume() callback (from the
+PM core's point of view it is not necessary to implement a ->runtime_resume()
+callback in a device driver as long as the bus type's ->runtime_resume() knows
+what to do to handle the device).
+
+  * Once the bus type's ->runtime_resume() callback has completed successfully,
+    the PM core regards the device as fully operational, which means that the
+    device _must_ be able to complete I/O operations as needed.  The run-time
+    PM status of the device is then 'active'.
+
+  * If the bus type's ->runtime_resume() callback returns an error code, the PM
+    core regards this as a fatal error and will refuse to run the helper
+    functions described in Section 4 for the device, until its status is
+    directly set either to 'active' or to 'suspended' (the PM core provides
+    special helper functions for this purpose).
+
+The ->runtime_idle() callback is executed by the PM core for the bus type of
+given device whenever the device appears to be idle, which is indicated to the
+PM core by two counters, the device's usage counter and the counter of 'active'
+children of the device.
+
+  * If any of these counters is decreased using a helper function provided by
+    the PM core and it turns out to be equal to zero, the other counter is
+    checked.  If that counter also is equal to zero, the PM core executes the
+    device bus type's ->runtime_idle() callback (with the device as an
+    argument).
+
+The action performed by a bus type's ->runtime_idle() callback is totally
+dependent on the bus type in question, but the expected and recommended action
+is to check if the device can be suspended (i.e. if all of the conditions
+necessary for suspending the device are satisfied) and to queue up a suspend
+request for the device in that case.
+
+The helper functions provided by the PM core, described in Section 4, guarantee
+that the following constraints are met with respect to the bus type's run-time
+PM callbacks:
+
+(1) The callbacks are mutually exclusive (e.g. it is forbidden to execute
+    ->runtime_suspend() in parallel with ->runtime_resume() or with another
+    instance of ->runtime_suspend() for the same device) with the exception that
+    ->runtime_suspend() or ->runtime_resume() can be executed in parallel with
+    ->runtime_idle() (although ->runtime_idle() will not be started while any
+    of the other callbacks is being executed for the same device).
+
+(2) ->runtime_idle() and ->runtime_suspend() can only be executed for 'active'
+    devices (i.e. the PM core will only execute ->runtime_idle() or
+    ->runtime_suspend() for the devices the run-time PM status of which is
+    'active').
+
+(3) ->runtime_idle() and ->runtime_suspend() can only be executed for a device
+    the usage counter of which is equal to zero _and_ either the counter of
+    'active' children of which is equal to zero, or the 'power.ignore_children'
+    flag of which is set.
+
+(4) ->runtime_resume() can only be executed for 'suspended' devices  (i.e. the
+    PM core will only execute ->runtime_resume() for the devices the run-time
+    PM status of which is 'suspended').
+
+Additionally, the helper functions provided by the PM core obey the following
+rules:
+
+  * If ->runtime_suspend() is about to be executed or there's a pending request
+    to execute it, ->runtime_idle() will not be executed for the same device.
+
+  * A request to execute or to schedule the execution of ->runtime_suspend()
+    will cancel any pending requests to execute ->runtime_idle() for the same
+    device.
+
+  * If ->runtime_resume() is about to be executed or there's a pending request
+    to execute it, the other callbacks will not be executed for the same device.
+
+  * A request to execute ->runtime_resume() will cancel any pending or
+    scheduled requests to execute the other callbacks for the same device.
+
+3. Run-time PM Device Fields
+
+The following device run-time PM fields are present in 'struct dev_pm_info', as
+defined in include/linux/pm.h:
+
+  struct timer_list suspend_timer;
+    - timer used for scheduling (delayed) suspend request
+
+  unsigned long timer_expires;
+    - timer expiration time, in jiffies (if this is different from zero, the
+      timer is running and will expire at that time, otherwise the timer is not
+      running)
+
+  struct work_struct work;
+    - work structure used for queuing up requests (i.e. work items in pm_wq)
+
+  wait_queue_head_t wait_queue;
+    - wait queue used if any of the helper functions needs to wait for another
+      one to complete
+
+  spinlock_t lock;
+    - lock used for synchronisation
+
+  atomic_t usage_count;
+    - the usage counter of the device
+
+  atomic_t child_count;
+    - the count of 'active' children of the device
+
+  unsigned int ignore_children;
+    - if set, the value of child_count is ignored (but still updated)
+
+  unsigned int disable_depth;
+    - used for disabling the helper funcions (they work normally if this is
+      equal to zero); the initial value of it is 1 (i.e. run-time PM is
+      initially disabled for all devices)
+
+  unsigned int runtime_failure;
+    - if set, there was a fatal error (one of the callbacks returned error code
+      as described in Section 2), so the helper funtions will not work until
+      this flag is cleared
+
+  int last_error;
+    - if runtime_failure is set, this is the error code returned by the
+      failing callback
+
+  unsigned int idle_notification;
+    - if set, ->runtime_idle() is being executed
+
+  unsigned int request_pending;
+    - if set, there's a pending request (i.e. a work item queued up into pm_wq)
+
+  enum rpm_request request;
+    - type of request that's pending (valid if request_pending is set)
+
+  unsigned int deferred_resume;
+    - set if ->runtime_resume() is about to be run while ->runtime_suspend() is
+      being executed for that device and it is not practical to wait for the
+      suspend to complete; means "queue up a resume request as soon as you've
+      suspended"
+
+  enum rpm_status runtime_status;
+    - the run-time PM status of the device; this field's initial value is
+      RPM_SUSPENDED, which means that each device is initially regarded by the
+      PM core as 'suspended', regardless of its real hardware status
+
+All of the above fields are members of the 'power' member of 'struct device'.
+
+4. Run-time PM Device Helper Functions
+
+The following run-time PM helper functions are defined in
+drivers/base/power/runtime.c and include/linux/pm_runtime.h:
+
+  void pm_runtime_init(struct device *dev);
+    - initialize the device run-time PM fields in 'struct dev_pm_info'
+
+  void pm_runtime_remove(struct device *dev);
+    - make sure that the run-time PM of the device will be disabled after
+      removing the device from device hierarchy
+
+  int pm_runtime_idle(struct device *dev);
+    - execute ->runtime_idle() for the device's bus type; returns 0 on success
+      or error code on failure, where -EINPROGRESS means that ->runtime_idle()
+      is already being executed
+
+  int pm_runtime_suspend(struct device *dev);
+    - execute ->runtime_suspend() for the device's bus type; returns 0 on
+      success, 1 if the device's run-time PM status was already 'suspended', or
+      error code on failure, where -EAGAIN or -EBUSY means it is safe to attempt
+      to suspend the device again in future
+
+  int pm_runtime_resume(struct device *dev);
+    - execute ->runtime_resume() for the device's bus type; returns 0 on
+      success, 1 if the device's run-time PM status was already 'active' or
+      error code on failure, where -EAGAIN means it may be safe to attempt to
+      resume the device again in future, but 'power.runtime_failure' should be
+      checked additionally
+
+  int pm_request_idle(struct device *dev);
+    - submit a request to execute ->runtime_idle() for the device's bus type
+      (the request is represented by a work item in pm_wq); returns 0 on success
+      or error code if the request has not been queued up
+
+  int pm_schedule_suspend(struct device *dev, unsigned int delay);
+    - schedule the execution of ->runtime_suspend() for the device's bus type
+      in future, where 'delay' is the time to wait before queuing up a suspend
+      work item in pm_wq, in milliseconds (if 'delay' is zero, the work item is
+      queued up immediately); returns 0 on success, 1 if the device's PM
+      run-time status was already 'suspended', or error code if the request
+      hasn't been scheduled (or queued up if 'delay' is 0); if the execution of
+      ->runtime_suspend() is already scheduled and not yet expired, the new
+      value of 'delay' will be used as the time to wait
+
+  int pm_request_resume(struct device *dev);
+    - submit a request to execute ->runtime_resume() for the device's bus type
+      (the request is represented by a work item in pm_wq); returns 0 on
+      success, 1 if the device's run-time PM status was already 'active', or
+      error code if the request hasn't been queued up
+
+  void pm_runtime_get_noresume(struct device *dev);
+    - increment the device's usage counter
+
+  int pm_runtime_get(struct device *dev);
+    - increment the device's usage counter, run pm_request_resume(dev) and
+      return its result
+
+  int pm_runtime_get_sync(struct device *dev);
+    - increment the device's usage counter, run pm_runtime_resume(dev) and
+      return its result
+
+  void pm_runtime_put_noidle(struct device *dev);
+    - decrement the device's usage counter
+
+  int pm_runtime_put(struct device *dev);
+    - decrement the device's usage counter, run pm_request_idle(dev) and return
+      its result
+
+  int pm_runtime_put_sync(struct device *dev);
+    - decrement the device's usage counter, run pm_runtime_idle(dev) and return
+      its result
+
+  void pm_runtime_enable(struct device *dev);
+    - enable the run-time PM helper functions to run the device bus type's
+      run-time PM callbacks described in Section 2
+
+  int pm_runtime_disable(struct device *dev);
+    - prevent the run-time PM helper functions from running the device bus
+      type's run-time PM callbacks, make sure that all of the pending run-time
+      PM operations on the device are either completed or canceled; returns
+      1 if there was a resume request pending and it was necessary to execute
+      ->runtime_resume() for the device's bus type to satisfy that request,
+      otherwise 0 is returned
+
+  void pm_suspend_ignore_children(struct device *dev, bool enable);
+    - set/unset the power.ignore_children flag of the device
+
+  int pm_runtime_set_active(struct device *dev);
+    - clear the device's 'power.runtime_error' flag, set the device's run-time
+      PM status to 'active' and update its parent's counter of 'active'
+      children as appropriate (it is only valid to use this function if
+      'power.runtime_failure' is set or 'power.disable_depth' is greater than
+      zero); it will fail and return error code if the device has a parent
+      which is not active and the 'power.ignore_children' flag of which is unset
+
+  void pm_runtime_set_suspended(struct device *dev);
+    - clear the device's 'power.runtime_error' flag, set the device's run-time
+      PM status to 'suspended' and update its parent's counter of 'active'
+      children as appropriate (it is only valid to use this function if
+      'power.runtime_failure' is set or 'power.disable_depth' is greater than
+      zero)
+
+It is safe to execute the following helper functions from interrupt context:
+
+pm_request_idle()
+pm_schedule_suspend()
+pm_request_resume()
+pm_runtime_get_noresume()
+pm_runtime_get()
+pm_runtime_put_noidle()
+pm_runtime_put()
+pm_suspend_ignore_children()
+pm_runtime_set_active()
+pm_runtime_set_suspended()
+pm_runtime_enable()
+
+5. Run-time PM Initialization
+
+Initially, the run-time PM is disabled for all devices, which means that the
+majority of the run-time PM helper funtions described in Section 4 will return
+-EAGAIN until pm_runtime_enable() is called for the device.
+
+In addition to that, the initial run-time PM status of all devices is
+'suspended', but it need not reflect the actual physical state of the device.
+Thus, if the device is initially active (i.e. it is able to process I/O), its
+run-time PM status must be changed to 'active', with the help of
+pm_runtime_set_active(), before pm_runtime_enable() is called for the device.
+
+However, if the device has a parent and the parent's run-time PM is enabled,
+calling pm_runtime_set_active() for the device will affect the parent, unless
+the parent's 'power.ignore_children' flag is set.  Namely, in that case the
+parent won't be able to suspend at run time, using the PM core's helper
+functions, as long as the child's status is 'active', even if the child's
+run-time PM is still disabled (i.e. pm_runtime_enable() hasn't been called for
+the child yet or pm_runtime_disable() has been called for it).  For this reason,
+once pm_runtime_set_active() has been called for the device, pm_runtime_enable()
+should be called for it too as soon as reasonably possible or its run-time PM
+status should be changed back to 'suspended' with the help of
+pm_runtime_set_suspended().
+
+If the defaul initial run-time PM status of the device (i.e. 'suspended')
+reflects the actual state of the device, its bus type's or its driver's
+->probe() callback will likely need to wake it up using one of the PM core's
+helper functions described in Section 4.  In that case, pm_runtime_resume()
+should be used.  Of course, for this purpose the device's run-time PM has to be
+enabled earlier by calling pm_runtime_enable().
+
+If ->probe() calls pm_runtime_suspend() or pm_runtime_idle(), or their
+asynchronous counterparts, they will fail returning -EAGAIN, because the
+device's usage counter is incremented by the core before executing ->probe().
+Still, it may be desirable to suspend the device as soon as ->probe() has
+finished, so the core uses pm_runtime_idle() to invoke the device bus type's
+->runtime_idle() callback at that time, which only happens even if ->probe()
+is successful.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH update] PM: Introduce core framework for run-time PM of I/O devices (rev. 12)
  2009-08-05 13:25       ` Rafael J. Wysocki
@ 2009-08-05 21:47         ` Rafael J. Wysocki
  2009-08-05 21:47         ` Rafael J. Wysocki
  1 sibling, 0 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2009-08-05 21:47 UTC (permalink / raw)
  To: Alan Stern; +Cc: Greg KH, LKML, Linux-pm mailing list

Hi,

The patch below should address all of your recent comments.

Additionally I changed a few bits that I thought could turn out to be
problematic at one point.

Thanks,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM: Introduce core framework for run-time PM of I/O devices (rev. 12)

Introduce a core framework for run-time power management of I/O
devices.  Add device run-time PM fields to 'struct dev_pm_info'
and device run-time PM callbacks to 'struct dev_pm_ops'.  Introduce
a run-time PM workqueue and define some device run-time PM helper
functions at the core level.  Document all these things.

Special thanks to Alan Stern for his help with the design and
multiple detailed reviews of the pereceding versions of this patch
and to Magnus Damm for testing feedback.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 Documentation/power/runtime_pm.txt |  382 +++++++++++++++
 drivers/base/dd.c                  |   15 
 drivers/base/power/Makefile        |    1 
 drivers/base/power/main.c          |   20 
 drivers/base/power/power.h         |   31 -
 drivers/base/power/runtime.c       |  939 +++++++++++++++++++++++++++++++++++++
 include/linux/pm.h                 |  102 +++-
 include/linux/pm_runtime.h         |  115 ++++
 kernel/power/Kconfig               |   14 
 kernel/power/main.c                |   17 
 10 files changed, 1625 insertions(+), 11 deletions(-)

Index: linux-2.6/kernel/power/Kconfig
===================================================================
--- linux-2.6.orig/kernel/power/Kconfig
+++ linux-2.6/kernel/power/Kconfig
@@ -208,3 +208,17 @@ config APM_EMULATION
 	  random kernel OOPSes or reboots that don't seem to be related to
 	  anything, try disabling/enabling this option (or disabling/enabling
 	  APM in your BIOS).
+
+config PM_RUNTIME
+	bool "Run-time PM core functionality"
+	depends on PM
+	---help---
+	  Enable functionality allowing I/O devices to be put into energy-saving
+	  (low power) states at run time (or autosuspended) after a specified
+	  period of inactivity and woken up in response to a hardware-generated
+	  wake-up event or a driver's request.
+
+	  Hardware support is generally required for this functionality to work
+	  and the bus type drivers of the buses the devices are on are
+	  responsible for the actual handling of the autosuspend requests and
+	  wake-up events.
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -11,6 +11,7 @@
 #include <linux/kobject.h>
 #include <linux/string.h>
 #include <linux/resume-trace.h>
+#include <linux/workqueue.h>
 
 #include "power.h"
 
@@ -217,8 +218,24 @@ static struct attribute_group attr_group
 	.attrs = g,
 };
 
+#ifdef CONFIG_PM_RUNTIME
+struct workqueue_struct *pm_wq;
+
+static int __init pm_start_workqueue(void)
+{
+	pm_wq = create_freezeable_workqueue("pm");
+
+	return pm_wq ? 0 : -ENOMEM;
+}
+#else
+static inline int pm_start_workqueue(void) { return 0; }
+#endif
+
 static int __init pm_init(void)
 {
+	int error = pm_start_workqueue();
+	if (error)
+		return error;
 	power_kobj = kobject_create_and_add("power", NULL);
 	if (!power_kobj)
 		return -ENOMEM;
Index: linux-2.6/include/linux/pm.h
===================================================================
--- linux-2.6.orig/include/linux/pm.h
+++ linux-2.6/include/linux/pm.h
@@ -22,6 +22,10 @@
 #define _LINUX_PM_H
 
 #include <linux/list.h>
+#include <linux/workqueue.h>
+#include <linux/spinlock.h>
+#include <linux/wait.h>
+#include <linux/timer.h>
 
 /*
  * Callbacks for platform drivers to implement.
@@ -165,6 +169,28 @@ typedef struct pm_message {
  * It is allowed to unregister devices while the above callbacks are being
  * executed.  However, it is not allowed to unregister a device from within any
  * of its own callbacks.
+ *
+ * There also are the following callbacks related to run-time power management
+ * of devices:
+ *
+ * @runtime_suspend: Prepare the device for a condition in which it won't be
+ *	able to communicate with the CPU(s) and RAM due to power management.
+ *	This need not mean that the device should be put into a low power state.
+ *	For example, if the device is behind a link which is about to be turned
+ *	off, the device may remain at full power.  If the device does go to low
+ *	power and if device_may_wakeup(dev) is true, remote wake-up (i.e., a
+ *	hardware mechanism allowing the device to request a change of its power
+ *	state, such as PCI PME) should be enabled for it.
+ *
+ * @runtime_resume: Put the device into the fully active state in response to a
+ *	wake-up event generated by hardware or at the request of software.  If
+ *	necessary, put the device into the full power state and restore its
+ *	registers, so that it is fully operational.
+ *
+ * @runtime_idle: Device appears to be inactive and it might be put into a low
+ *	power state if all of the necessary conditions are satisfied.  Check
+ *	these conditions and handle the device as appropriate, possibly queueing
+ *	a suspend request for it.
  */
 
 struct dev_pm_ops {
@@ -182,6 +208,9 @@ struct dev_pm_ops {
 	int (*thaw_noirq)(struct device *dev);
 	int (*poweroff_noirq)(struct device *dev);
 	int (*restore_noirq)(struct device *dev);
+	int (*runtime_suspend)(struct device *dev);
+	int (*runtime_resume)(struct device *dev);
+	void (*runtime_idle)(struct device *dev);
 };
 
 /*
@@ -329,14 +358,81 @@ enum dpm_state {
 	DPM_OFF_IRQ,
 };
 
+/**
+ * Device run-time power management status.
+ *
+ * These status labels are used internally by the PM core to indicate the
+ * current status of a device with respect to the PM core operations.  They do
+ * not reflect the actual power state of the device or its status as seen by the
+ * driver.
+ *
+ * RPM_ACTIVE		Device is fully operational.  Indicates that the device
+ *			bus type's ->runtime_resume() callback has completed
+ *			successfully.
+ *
+ * RPM_SUSPENDED	Device bus type's ->runtime_suspend() callback has
+ *			completed successfully.  The device is regarded as
+ *			suspended.
+ *
+ * RPM_RESUMING		Device bus type's ->runtime_resume() callback is being
+ *			executed.
+ *
+ * RPM_SUSPENDING	Device bus type's ->runtime_suspend() callback is being
+ *			executed.
+ */
+
+enum rpm_status {
+	RPM_ACTIVE = 0,
+	RPM_RESUMING,
+	RPM_SUSPENDED,
+	RPM_SUSPENDING,
+};
+
+/**
+ * Device run-time power management request types.
+ *
+ * RPM_REQ_NONE		Do nothing.
+ *
+ * RPM_REQ_IDLE		Run the device bus type's ->runtime_idle() callback
+ *
+ * RPM_REQ_SUSPEND	Run the device bus type's ->runtime_suspend() callback
+ *
+ * RPM_REQ_RESUME	Run the device bus type's ->runtime_resume() callback
+ */
+
+enum rpm_request {
+	RPM_REQ_NONE = 0,
+	RPM_REQ_IDLE,
+	RPM_REQ_SUSPEND,
+	RPM_REQ_RESUME,
+};
+
 struct dev_pm_info {
 	pm_message_t		power_state;
-	unsigned		can_wakeup:1;
-	unsigned		should_wakeup:1;
+	unsigned int		can_wakeup:1;
+	unsigned int		should_wakeup:1;
 	enum dpm_state		status;		/* Owned by the PM core */
-#ifdef	CONFIG_PM_SLEEP
+#ifdef CONFIG_PM_SLEEP
 	struct list_head	entry;
 #endif
+#ifdef CONFIG_PM_RUNTIME
+	struct timer_list	suspend_timer;
+	unsigned long		timer_expires;
+	struct work_struct	work;
+	wait_queue_head_t	wait_queue;
+	spinlock_t		lock;
+	atomic_t		usage_count;
+	atomic_t		child_count;
+	unsigned int		disable_depth:3;
+	unsigned int		ignore_children:1;
+	unsigned int		runtime_failure:1;
+	unsigned int		idle_notification:1;
+	unsigned int		request_pending:1;
+	unsigned int		deferred_resume:1;
+	enum rpm_request	request;
+	enum rpm_status		runtime_status;
+	int			last_error;
+#endif
 };
 
 /*
Index: linux-2.6/drivers/base/power/Makefile
===================================================================
--- linux-2.6.orig/drivers/base/power/Makefile
+++ linux-2.6/drivers/base/power/Makefile
@@ -1,5 +1,6 @@
 obj-$(CONFIG_PM)	+= sysfs.o
 obj-$(CONFIG_PM_SLEEP)	+= main.o
+obj-$(CONFIG_PM_RUNTIME)	+= runtime.o
 obj-$(CONFIG_PM_TRACE_RTC)	+= trace.o
 
 ccflags-$(CONFIG_DEBUG_DRIVER) := -DDEBUG
Index: linux-2.6/drivers/base/power/runtime.c
===================================================================
--- /dev/null
+++ linux-2.6/drivers/base/power/runtime.c
@@ -0,0 +1,939 @@
+/*
+ * drivers/base/power/runtime.c - Helper functions for device run-time PM
+ *
+ * Copyright (c) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+ *
+ * This file is released under the GPLv2.
+ */
+
+#include <linux/sched.h>
+#include <linux/pm_runtime.h>
+#include <linux/jiffies.h>
+
+static int __pm_request_idle(struct device *dev);
+static int __pm_request_resume(struct device *dev);
+
+/**
+ * pm_runtime_deactivate_timer - Deactivate given device's suspend timer.
+ * @dev: Device to handle.
+ */
+static void pm_runtime_deactivate_timer(struct device *dev)
+{
+	if (dev->power.timer_expires > 0) {
+		del_timer(&dev->power.suspend_timer);
+		dev->power.timer_expires = 0;
+	}
+}
+
+/**
+ * pm_runtime_cancel_pending - Deactivate suspend timer and cancel requests.
+ * @dev: Device to handle.
+ */
+static void pm_runtime_cancel_pending(struct device *dev)
+{
+	pm_runtime_deactivate_timer(dev);
+	/*
+	 * In case there's a request pending, make sure its work function will
+	 * return without doing anything.
+	 */
+	dev->power.request = RPM_REQ_NONE;
+}
+
+/**
+ * __pm_runtime_idle - Notify device bus type if the device can be suspended.
+ * @dev: Device to notify the bus type about.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_runtime_idle(struct device *dev)
+	__releases(&dev->power.lock) __acquires(&dev->power.lock)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_failure)
+		retval = -EINVAL;
+	else if (dev->power.idle_notification)
+		retval = -EINPROGRESS;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0
+	    || dev->power.runtime_status != RPM_ACTIVE)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		return retval;
+
+	if (dev->power.request_pending) {
+		/*
+		 * If an idle notification request is pending, cancel it.  Any
+		 * other pending request takes precedence over us.
+		 */
+		if (dev->power.request == RPM_REQ_IDLE)
+			dev->power.request = RPM_REQ_NONE;
+		else if (dev->power.request != RPM_REQ_NONE)
+			return -EAGAIN;
+	}
+
+	dev->power.idle_notification = true;
+
+	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle) {
+		spin_unlock_irq(&dev->power.lock);
+
+		dev->bus->pm->runtime_idle(dev);
+
+		spin_lock_irq(&dev->power.lock);
+	}
+
+	dev->power.idle_notification = false;
+	wake_up_all(&dev->power.wait_queue);
+
+	return 0;
+}
+
+/**
+ * pm_runtime_idle - Notify device bus type if the device can be suspended.
+ * @dev: Device to notify the bus type about.
+ */
+int pm_runtime_idle(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_idle(dev);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_idle);
+
+/**
+ * __pm_runtime_suspend - Carry out run-time suspend of given device.
+ * @dev: Device to suspend.
+ * @from_wq: If set, the function has been called via pm_wq.
+ *
+ * Check if the device can be suspended and run the ->runtime_suspend() callback
+ * provided by its bus type.  If another suspend has been started earlier, wait
+ * for it to finish.  If there's an idle notification pending, cancel it.  If
+ * there's a suspend request scheduled while this function is running and @sync
+ * is 'true', cancel that request.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+int __pm_runtime_suspend(struct device *dev, bool from_wq)
+	__releases(&dev->power.lock) __acquires(&dev->power.lock)
+{
+	struct device *parent = NULL;
+	bool notify = false;
+	int retval = 0;
+
+ repeat:
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/* Pending resume requests take precedence over us. */
+		if (dev->power.request == RPM_REQ_RESUME)
+			return -EAGAIN;
+		/* Other pending requests need to be canceled. */
+		dev->power.request = RPM_REQ_NONE;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_RESUMING
+	    || dev->power.disable_depth > 0
+	    || atomic_read(&dev->power.usage_count) > 0)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		return retval;
+
+	if (dev->power.runtime_status == RPM_SUSPENDING) {
+		DEFINE_WAIT(wait);
+
+		if (from_wq)
+			return -EINPROGRESS;
+
+		/* Wait for the other suspend running in parallel with us. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_SUSPENDING)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+		goto repeat;
+	}
+
+	dev->power.runtime_status = RPM_SUSPENDING;
+
+	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_suspend) {
+		spin_unlock_irq(&dev->power.lock);
+
+		retval = dev->bus->pm->runtime_suspend(dev);
+
+		spin_lock_irq(&dev->power.lock);
+	} else {
+		retval = -ENOSYS;
+	}
+
+	if (retval) {
+		dev->power.runtime_status = RPM_ACTIVE;
+		pm_runtime_cancel_pending(dev);
+		dev->power.deferred_resume = false;
+
+		if (retval == -EAGAIN || retval == -EBUSY) {
+			notify = true;
+		} else {
+			dev->power.runtime_failure = true;
+			dev->power.last_error = retval;
+		}
+	} else {
+		dev->power.runtime_status = RPM_SUSPENDED;
+
+		if (dev->parent) {
+			parent = dev->parent;
+			atomic_add_unless(&parent->power.child_count, -1, 0);
+		}
+
+	}
+	wake_up_all(&dev->power.wait_queue);
+
+	if (dev->power.deferred_resume) {
+		__pm_request_resume(dev);
+		dev->power.deferred_resume = false;
+	}
+
+	if (notify)
+		__pm_runtime_idle(dev);
+
+	if (parent && !parent->power.ignore_children) {
+		spin_unlock_irq(&dev->power.lock);
+
+		pm_request_idle(parent);
+
+		spin_lock_irq(&dev->power.lock);
+	}
+
+	return retval;
+}
+
+/**
+ * pm_runtime_suspend - Carry out run-time suspend of given device.
+ * @dev: Device to suspend.
+ */
+int pm_runtime_suspend(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_suspend(dev, false);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_suspend);
+
+/**
+ * __pm_runtime_resume - Carry out run-time resume of given device.
+ * @dev: Device to resume.
+ * @from_wq: If set, the function has been called via pm_wq.
+ *
+ * Check if the device can be woken up and run the ->runtime_resume() callback
+ * provided by its bus type.  If another resume has been started earlier, wait
+ * for it to finish.  If there's a suspend running in parallel with this
+ * function, wait for it to finish and resume the device.  If there's a suspend
+ * request or idle notification pending, cancel it.  If there's a resume request
+ * scheduled while this function is running, cancel that request.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+int __pm_runtime_resume(struct device *dev, bool from_wq)
+	__releases(&dev->power.lock) __acquires(&dev->power.lock)
+{
+	struct device *parent = NULL;
+	int retval = 0;
+
+ repeat:
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	pm_runtime_cancel_pending(dev);
+
+	if (dev->power.runtime_status == RPM_ACTIVE)
+		retval = 1;
+	else if (dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	if (retval)
+		return retval;
+
+	if (dev->power.runtime_status == RPM_RESUMING
+	    || dev->power.runtime_status == RPM_SUSPENDING) {
+		DEFINE_WAIT(wait);
+
+		if (from_wq) {
+			if (dev->power.runtime_status == RPM_SUSPENDING)
+				dev->power.deferred_resume = true;
+			return -EINPROGRESS;
+		}
+
+		/* Wait for the operation carried out in parallel with us. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_RESUMING
+			    && dev->power.runtime_status != RPM_SUSPENDING)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+		goto repeat;
+	}
+
+	if (!parent && dev->parent) {
+		/*
+		 * Increment the parent's resume counter and resume it if
+		 * necessary.
+		 */
+		parent = dev->parent;
+		spin_unlock_irq(&dev->power.lock);
+
+		retval = pm_runtime_get_sync(parent);
+
+		spin_lock_irq(&dev->power.lock);
+		/* We can resume if the parent's run-time PM is disabled. */
+		if (retval < 0 && retval != -EAGAIN)
+			goto out_parent;
+		retval = 0;
+		goto repeat;
+	}
+
+	dev->power.runtime_status = RPM_RESUMING;
+
+	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_resume) {
+		spin_unlock_irq(&dev->power.lock);
+
+		retval = dev->bus->pm->runtime_resume(dev);
+
+		spin_lock_irq(&dev->power.lock);
+	} else {
+		retval = -ENOSYS;
+	}
+
+	if (retval) {
+		dev->power.runtime_status = RPM_SUSPENDED;
+
+		dev->power.runtime_failure = true;
+		dev->power.last_error = retval;
+
+		pm_runtime_cancel_pending(dev);
+	} else {
+		dev->power.runtime_status = RPM_ACTIVE;
+
+		if (parent)
+			atomic_inc(&parent->power.child_count);
+	}
+	wake_up_all(&dev->power.wait_queue);
+
+	if (!retval)
+		__pm_request_idle(dev);
+
+ out_parent:
+	if (parent) {
+		spin_unlock_irq(&dev->power.lock);
+
+		pm_runtime_put(parent);
+
+		spin_lock_irq(&dev->power.lock);
+	}
+
+	return retval;
+}
+
+/**
+ * pm_runtime_resume - Carry out run-time resume of given device.
+ * @dev: Device to suspend.
+ */
+int pm_runtime_resume(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_resume(dev, false);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_resume);
+
+/**
+ * pm_runtime_work - Universal run-time PM work function.
+ * @work: Work structure used for scheduling the execution of this function.
+ *
+ * Use @work to get the device object the work is to be done for, determine what
+ * is to be done and execute the appropriate run-time PM function.
+ */
+static void pm_runtime_work(struct work_struct *work)
+{
+	struct device *dev = container_of(work, struct device, power.work);
+	enum rpm_request req;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (!dev->power.request_pending)
+		goto out;
+
+	req = dev->power.request;
+	dev->power.request = RPM_REQ_NONE;
+	dev->power.request_pending = false;
+
+	switch (req) {
+	case RPM_REQ_NONE:
+		break;
+	case RPM_REQ_IDLE:
+		__pm_runtime_idle(dev);
+		break;
+	case RPM_REQ_SUSPEND:
+		__pm_runtime_suspend(dev, true);
+		break;
+	case RPM_REQ_RESUME:
+		__pm_runtime_resume(dev, true);
+		break;
+	}
+
+ out:
+	spin_unlock_irq(&dev->power.lock);
+}
+
+/**
+ * __pm_request_idle - Submit an idle notification request for given device.
+ * @dev: Device to handle.
+ *
+ * Check if the device's run-time PM status is correct for suspending the device
+ * and queue up a request to run __pm_runtime_idle() for it.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_request_idle(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_failure)
+		retval = -EINVAL;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0
+	    || dev->power.timer_expires > 0
+	    || dev->power.runtime_status == RPM_SUSPENDED
+	    || dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		return retval;
+
+	if (dev->power.request_pending && dev->power.request != RPM_REQ_NONE) {
+		/* Any requests other then RPM_REQ_IDLE take precedence. */
+		if (dev->power.request != RPM_REQ_IDLE)
+			retval = -EAGAIN;
+		return retval;
+	}
+
+	dev->power.request = RPM_REQ_IDLE;
+	if (dev->power.request_pending)
+		return retval;
+
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+	return retval;
+}
+
+/**
+ * pm_request_idle - Submit an idle notification request for given device.
+ * @dev: Device to handle.
+ */
+int pm_request_idle(struct device *dev)
+{
+	unsigned long flags;
+	int retval;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+	retval = __pm_request_idle(dev);
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_request_idle);
+
+/**
+ * __pm_request_suspend - Submit a suspend request for given device.
+ * @dev: Device to suspend.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_request_suspend(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	else if (dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EINPROGRESS;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval < 0)
+		return retval;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/*
+		 * Pending resume requests take precedence over us, but we can
+		 * overtake any other pending request.
+		 */
+		if (dev->power.request == RPM_REQ_RESUME)
+			retval = -EAGAIN;
+		else if (dev->power.request != RPM_REQ_SUSPEND)
+			dev->power.request = retval ?
+						RPM_REQ_NONE : RPM_REQ_SUSPEND;
+		return retval;
+	} else if (retval) {
+		return retval;
+	}
+
+	dev->power.request = RPM_REQ_SUSPEND;
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+	return 0;
+}
+
+/**
+ * pm_suspend_timer_fn - Timer function for pm_schedule_suspend().
+ * @data: Device pointer passed by pm_schedule_suspend().
+ *
+ * Check if the time is right and execute __pm_request_suspend() in that case.
+ */
+static void pm_suspend_timer_fn(unsigned long data)
+{
+	struct device *dev = (struct device *)data;
+	unsigned long flags;
+	unsigned long expires;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	expires = dev->power.timer_expires;
+	/* If 'expire' is after 'jiffies' we've been called too early. */
+	if (expires > 0 && !time_after(expires, jiffies)) {
+		dev->power.timer_expires = 0;
+		__pm_request_suspend(dev);
+	}
+
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+}
+
+/**
+ * pm_schedule_suspend - Set up a timer to submit a suspend request in future.
+ * @dev: Device to suspend.
+ * @delay: Time to wait before submitting a suspend request, in milliseconds.
+ */
+int pm_schedule_suspend(struct device *dev, unsigned int delay)
+{
+	unsigned long flags;
+	int retval = 0;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (dev->power.runtime_failure) {
+		retval = -EINVAL;
+		goto out;
+	}
+
+	if (!delay) {
+		retval = __pm_request_suspend(dev);
+		goto out;
+	}
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/*
+		 * Pending resume requests take precedence over us, but any
+		 * other pending requests have to be canceled.
+		 */
+		if (dev->power.request == RPM_REQ_RESUME) {
+			retval = -EAGAIN;
+			goto out;
+		}
+		dev->power.request = RPM_REQ_NONE;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EINPROGRESS;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		goto out;
+
+	dev->power.timer_expires = jiffies + msecs_to_jiffies(delay);
+	mod_timer(&dev->power.suspend_timer, dev->power.timer_expires);
+
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_schedule_suspend);
+
+/**
+ * pm_request_resume - Submit a resume request for given device.
+ * @dev: Device to resume.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_request_resume(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	if (dev->power.runtime_status == RPM_ACTIVE)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_RESUMING)
+		retval = -EINPROGRESS;
+	else if (dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	if (retval < 0)
+		return retval;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/* If non-resume request is pending, we can overtake it. */
+		dev->power.request = retval ? RPM_REQ_NONE : RPM_REQ_RESUME;
+		return retval;
+	} else if (retval) {
+		return retval;
+	}
+
+	dev->power.request = RPM_REQ_RESUME;
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+	return retval;
+}
+
+/**
+ * pm_request_resume - Submit a resume request for given device.
+ * @dev: Device to resume.
+ */
+int pm_request_resume(struct device *dev)
+{
+	unsigned long flags;
+	int retval;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+	retval = __pm_request_resume(dev);
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_request_resume);
+
+/**
+ * __pm_runtime_get - Reference count a device and wake it up, if necessary.
+ * @dev: Device to handle.
+ * @sync: If set and the device is suspended, resume it synchronously.
+ *
+ * Increment the usage count of the device and if it was zero previously,
+ * resume it or submit a resume request for it, depending on the value of @sync.
+ */
+int __pm_runtime_get(struct device *dev, bool sync)
+{
+	int retval = 1;
+
+	if (atomic_add_return(1, &dev->power.usage_count) == 1)
+		retval = sync ? pm_runtime_resume(dev) : pm_request_resume(dev);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(__pm_runtime_get);
+
+/**
+ * __pm_runtime_put - Decrement the device's usage counter and notify its bus.
+ * @dev: Device to handle.
+ * @sync: If the device's bus type is to be notified, do that synchronously.
+ *
+ * Decrement the usage count of the device and if it reaches zero, carry out a
+ * synchronous idle notification or submit an idle notification request for it,
+ * depending on the value of @sync.
+ */
+int __pm_runtime_put(struct device *dev, bool sync)
+{
+	int retval = 0;
+
+	if (atomic_dec_and_test(&dev->power.usage_count))
+		retval = sync ? pm_runtime_idle(dev) : pm_request_idle(dev);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(__pm_runtime_put);
+
+/**
+ * __pm_runtime_set_status - Set run-time PM status of a device.
+ * @dev: Device to handle.
+ * @status: New run-time PM status of the device.
+ *
+ * If run-time PM of the device is disabled or its power.runtime_failure flag is
+ * set, the status may be changed either to RPM_ACTIVE, or to RPM_SUSPENDED, as
+ * long as that reflects the actual state of the device.  However, if the device
+ * has a parent and the parent is not active, and the parent's
+ * power.ignore_children flag is unset, the device's status cannot be set to
+ * RPM_ACTIVE, so -EBUSY is returned in that case.
+ *
+ * If successful, __pm_runtime_set_status() clears the power.runtime_failure
+ * flag and the device parent's counter of unsuspended children is modified to
+ * reflect the new status.  If the new status is RPM_SUSPENDED, an idle
+ * notification request for the parent is submitted.
+ */
+int __pm_runtime_set_status(struct device *dev, unsigned int status)
+{
+	struct device *parent = dev->parent;
+	unsigned long flags;
+	bool notify_parent = false;
+	int error = 0;
+
+	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
+		return -EINVAL;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (!dev->power.runtime_failure && !dev->power.disable_depth)
+		goto out;
+
+	if (dev->power.runtime_status == status)
+		goto out_set;
+
+	if (status == RPM_SUSPENDED) {
+		/* It always is possible to set the status to 'suspended'. */
+		if (parent) {
+			atomic_add_unless(&parent->power.child_count, -1, 0);
+			notify_parent = !parent->power.ignore_children;
+		}
+		goto out_set;
+	}
+
+	if (parent) {
+		spin_lock_irq(&parent->power.lock);
+
+		/*
+		 * It is invalid to put an active child under a parent that is
+		 * not active, has run-time PM enabled and the
+		 * 'power.ignore_children' flag unset.
+		 */
+		if (!parent->power.disable_depth
+		    && !parent->power.ignore_children
+		    && parent->power.runtime_status != RPM_ACTIVE) {
+			error = -EBUSY;
+		} else {
+			if (dev->power.runtime_status == RPM_SUSPENDED)
+				atomic_inc(&parent->power.child_count);
+		}
+
+		spin_unlock_irq(&parent->power.lock);
+
+		if (error)
+			goto out;
+	}
+
+ out_set:
+	dev->power.runtime_status = status;
+	dev->power.runtime_failure = false;
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	if (notify_parent)
+		pm_request_idle(parent);
+
+	return error;
+}
+EXPORT_SYMBOL_GPL(__pm_runtime_set_status);
+
+/**
+ * pm_runtime_enable - Enable run-time PM of a device.
+ * @dev: Device to handle.
+ */
+void pm_runtime_enable(struct device *dev)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (dev->power.disable_depth > 0)
+		dev->power.disable_depth--;
+	else
+		dev_warn(dev, "Unbalanced %s!", __func__);
+
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+}
+EXPORT_SYMBOL_GPL(pm_runtime_enable);
+
+/**
+ * __pm_runtime_disable - Disable run-time PM of a device.
+ * @dev: Device to handle.
+ * @check_resume: If set, check if there's a resume request for the device.
+ *
+ * Increment power.disable_depth for the device and if was zero previously,
+ * cancel all pending run-time PM requests for the device and wait for all
+ * operations in progress to complete.  The device can be either active or
+ * suspended after its run-time PM has been disabled.
+ *
+ * If @check_resume is set and there's a resume request pending when
+ * __pm_runtime_disable() is called and power.disable_depth is zero, the
+ * function will wake up the device before disabling its run-time PM and will
+ * return 1.  Otherwise, 0 is returned.
+ */
+int __pm_runtime_disable(struct device *dev, bool check_resume)
+{
+	int retval = 0;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (dev->power.disable_depth > 0) {
+		dev->power.disable_depth++;
+		goto out;
+	}
+
+	/*
+	 * Wake up the device if there's a resume request pending, because that
+	 * means there probably is some I/O to process and disabling run-time PM
+	 * shouldn't prevent the device from processing the I/O.
+	 */
+	if (check_resume && dev->power.request_pending
+	    && dev->power.request == RPM_REQ_RESUME) {
+		/*
+		 * Prevent suspends and idle notifications from being carried
+		 * out after we have woken up the device.
+		 */
+		pm_runtime_get_noresume(dev);
+
+		__pm_runtime_resume(dev, false);
+
+		pm_runtime_put_noidle(dev);
+		retval = 1;
+	}
+
+	if (dev->power.disable_depth++ > 0)
+		goto out;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		dev->power.request = RPM_REQ_NONE;
+		spin_unlock_irq(&dev->power.lock);
+
+		cancel_work_sync(&dev->power.work);
+
+		spin_lock_irq(&dev->power.lock);
+		dev->power.request_pending = false;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDING
+	    || dev->power.runtime_status == RPM_RESUMING
+	    || dev->power.idle_notification) {
+		DEFINE_WAIT(wait);
+
+		/* Suspend or wake-up in progress. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_SUSPENDING
+			    && dev->power.runtime_status != RPM_RESUMING
+			    && !dev->power.idle_notification)
+				break;
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+	}
+
+ out:
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_disable);
+
+/**
+ * pm_runtime_init - Initialize run-time PM fields in given device object.
+ * @dev: Device object to initialize.
+ */
+void pm_runtime_init(struct device *dev)
+{
+	spin_lock_init(&dev->power.lock);
+
+	dev->power.runtime_status = RPM_SUSPENDED;
+	dev->power.idle_notification = false;
+
+	dev->power.disable_depth = 1;
+	atomic_set(&dev->power.usage_count, 0);
+
+	dev->power.runtime_failure = false;
+	dev->power.last_error = 0;
+
+	atomic_set(&dev->power.child_count, 0);
+	pm_suspend_ignore_children(dev, false);
+
+	dev->power.request_pending = false;
+	dev->power.request = RPM_REQ_NONE;
+	dev->power.deferred_resume = false;
+	INIT_WORK(&dev->power.work, pm_runtime_work);
+
+	dev->power.timer_expires = 0;
+	setup_timer(&dev->power.suspend_timer, pm_suspend_timer_fn,
+			(unsigned long)dev);
+
+	init_waitqueue_head(&dev->power.wait_queue);
+}
+
+/**
+ * pm_runtime_remove - Prepare for removing a device from device hierarchy.
+ * @dev: Device object being removed from device hierarchy.
+ */
+void pm_runtime_remove(struct device *dev)
+{
+	__pm_runtime_disable(dev, false);
+
+	/* Change the status back to 'suspended' to match the initial status. */
+	if (dev->power.runtime_status == RPM_ACTIVE)
+		pm_runtime_set_suspended(dev);
+}
Index: linux-2.6/include/linux/pm_runtime.h
===================================================================
--- /dev/null
+++ linux-2.6/include/linux/pm_runtime.h
@@ -0,0 +1,115 @@
+/*
+ * pm_runtime.h - Device run-time power management helper functions.
+ *
+ * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * This file is released under the GPLv2.
+ */
+
+#ifndef _LINUX_PM_RUNTIME_H
+#define _LINUX_PM_RUNTIME_H
+
+#include <linux/device.h>
+#include <linux/pm.h>
+
+#ifdef CONFIG_PM_RUNTIME
+
+extern struct workqueue_struct *pm_wq;
+
+extern int pm_runtime_idle(struct device *dev);
+extern int pm_runtime_suspend(struct device *dev);
+extern int pm_runtime_resume(struct device *dev);
+extern int pm_request_idle(struct device *dev);
+extern int pm_schedule_suspend(struct device *dev, unsigned int delay);
+extern int pm_request_resume(struct device *dev);
+extern int __pm_runtime_get(struct device *dev, bool sync);
+extern int __pm_runtime_put(struct device *dev, bool sync);
+extern int __pm_runtime_set_status(struct device *dev, unsigned int status);
+extern void pm_runtime_enable(struct device *dev);
+extern int __pm_runtime_disable(struct device *dev, bool check_resume);
+
+static inline bool pm_children_suspended(struct device *dev)
+{
+	return dev->power.ignore_children
+		|| !atomic_read(&dev->power.child_count);
+}
+
+static inline void pm_suspend_ignore_children(struct device *dev, bool enable)
+{
+	dev->power.ignore_children = enable;
+}
+
+static inline void pm_runtime_get_noresume(struct device *dev)
+{
+	atomic_inc(&dev->power.usage_count);
+}
+
+static inline void pm_runtime_put_noidle(struct device *dev)
+{
+	atomic_add_unless(&dev->power.usage_count, -1, 0);
+}
+
+#else /* !CONFIG_PM_RUNTIME */
+
+static inline int pm_runtime_idle(struct device *dev) { return -ENOSYS; }
+static inline int pm_runtime_suspend(struct device *dev) { return -ENOSYS; }
+static inline int pm_runtime_resume(struct device *dev) { return 0; }
+static inline int pm_request_idle(struct device *dev) { return -ENOSYS; }
+static inline int pm_schedule_suspend(struct device *dev, unsigned int delay)
+{
+	return -ENOSYS;
+}
+static inline int pm_request_resume(struct device *dev) { return 0; }
+static inline int __pm_runtime_get(struct device *dev, bool sync) { return 1; }
+static inline int __pm_runtime_put(struct device *dev, bool sync) { return 0; }
+static inline int __pm_runtime_set_status(struct device *dev,
+					    unsigned int status) { return 0; }
+static inline void pm_runtime_enable(struct device *dev) {}
+static inline int __pm_runtime_disable(struct device *dev, bool check_resume)
+{
+	return 0;
+}
+
+static inline bool pm_children_suspended(struct device *dev) { return false; }
+static inline void pm_suspend_ignore_children(struct device *dev, bool en) {}
+static inline void pm_runtime_get_noresume(struct device *dev) {}
+static inline void pm_runtime_put_noidle(struct device *dev) {}
+
+#endif /* !CONFIG_PM_RUNTIME */
+
+static inline int pm_runtime_get(struct device *dev)
+{
+	return __pm_runtime_get(dev, false);
+}
+
+static inline int pm_runtime_get_sync(struct device *dev)
+{
+	return __pm_runtime_get(dev, true);
+}
+
+static inline int pm_runtime_put(struct device *dev)
+{
+	return __pm_runtime_put(dev, false);
+}
+
+static inline int pm_runtime_put_sync(struct device *dev)
+{
+	return __pm_runtime_put(dev, true);
+}
+
+static inline int pm_runtime_set_active(struct device *dev)
+{
+	return __pm_runtime_set_status(dev, RPM_ACTIVE);
+}
+
+static inline void pm_runtime_set_suspended(struct device *dev)
+{
+	__pm_runtime_set_status(dev, RPM_SUSPENDED);
+}
+
+static inline int pm_runtime_disable(struct device *dev)
+{
+	return __pm_runtime_disable(dev, true);
+}
+
+#endif
Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -21,6 +21,7 @@
 #include <linux/kallsyms.h>
 #include <linux/mutex.h>
 #include <linux/pm.h>
+#include <linux/pm_runtime.h>
 #include <linux/resume-trace.h>
 #include <linux/rwsem.h>
 #include <linux/interrupt.h>
@@ -49,6 +50,16 @@ static DEFINE_MUTEX(dpm_list_mtx);
 static bool transition_started;
 
 /**
+ * device_pm_init - Initialize the PM-related part of a device object
+ * @dev: Device object being initialized.
+ */
+void device_pm_init(struct device *dev)
+{
+	dev->power.status = DPM_ON;
+	pm_runtime_init(dev);
+}
+
+/**
  *	device_pm_lock - lock the list of active devices used by the PM core
  */
 void device_pm_lock(void)
@@ -105,6 +116,7 @@ void device_pm_remove(struct device *dev
 	mutex_lock(&dpm_list_mtx);
 	list_del_init(&dev->power.entry);
 	mutex_unlock(&dpm_list_mtx);
+	pm_runtime_remove(dev);
 }
 
 /**
@@ -512,6 +524,7 @@ static void dpm_complete(pm_message_t st
 			mutex_unlock(&dpm_list_mtx);
 
 			device_complete(dev, state);
+			pm_runtime_enable(dev);
 
 			mutex_lock(&dpm_list_mtx);
 		}
@@ -757,11 +770,16 @@ static int dpm_prepare(pm_message_t stat
 		dev->power.status = DPM_PREPARING;
 		mutex_unlock(&dpm_list_mtx);
 
-		error = device_prepare(dev, state);
+		if (pm_runtime_disable(dev) && device_may_wakeup(dev))
+			/* Wake-up during suspend. */
+			error = -EBUSY;
+		else
+			error = device_prepare(dev, state);
 
 		mutex_lock(&dpm_list_mtx);
 		if (error) {
 			dev->power.status = DPM_ON;
+			pm_runtime_enable(dev);
 			if (error == -EAGAIN) {
 				put_device(dev);
 				error = 0;
Index: linux-2.6/drivers/base/dd.c
===================================================================
--- linux-2.6.orig/drivers/base/dd.c
+++ linux-2.6/drivers/base/dd.c
@@ -23,6 +23,7 @@
 #include <linux/kthread.h>
 #include <linux/wait.h>
 #include <linux/async.h>
+#include <linux/pm_runtime.h>
 
 #include "base.h"
 #include "power/power.h"
@@ -202,7 +203,17 @@ int driver_probe_device(struct device_dr
 	pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
 		 drv->bus->name, __func__, dev_name(dev), drv->name);
 
+	/*
+	 * Wait for run-time PM calls to complete and prevent new suspend calls
+	 * until the probe is done.
+	 */
+	pm_runtime_disable(dev);
+	pm_runtime_get_noresume(dev);
+	pm_runtime_enable(dev);
 	ret = really_probe(dev, drv);
+	pm_runtime_put_noidle(dev);
+	if (!ret)
+		pm_runtime_idle(dev);
 
 	return ret;
 }
@@ -306,6 +317,8 @@ static void __device_release_driver(stru
 
 	drv = dev->driver;
 	if (drv) {
+		pm_runtime_disable(dev);
+
 		driver_sysfs_remove(dev);
 
 		if (dev->bus)
@@ -324,6 +337,8 @@ static void __device_release_driver(stru
 			blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
 						     BUS_NOTIFY_UNBOUND_DRIVER,
 						     dev);
+
+		pm_runtime_enable(dev);
 	}
 }
 
Index: linux-2.6/drivers/base/power/power.h
===================================================================
--- linux-2.6.orig/drivers/base/power/power.h
+++ linux-2.6/drivers/base/power/power.h
@@ -1,7 +1,14 @@
-static inline void device_pm_init(struct device *dev)
-{
-	dev->power.status = DPM_ON;
-}
+#ifdef CONFIG_PM_RUNTIME
+
+extern void pm_runtime_init(struct device *dev);
+extern void pm_runtime_remove(struct device *dev);
+
+#else /* !CONFIG_PM_RUNTIME */
+
+static inline void pm_runtime_init(struct device *dev) {}
+static inline void pm_runtime_remove(struct device *dev) {}
+
+#endif /* !CONFIG_PM_RUNTIME */
 
 #ifdef CONFIG_PM_SLEEP
 
@@ -16,23 +23,33 @@ static inline struct device *to_device(s
 	return container_of(entry, struct device, power.entry);
 }
 
+extern void device_pm_init(struct device *dev);
 extern void device_pm_add(struct device *);
 extern void device_pm_remove(struct device *);
 extern void device_pm_move_before(struct device *, struct device *);
 extern void device_pm_move_after(struct device *, struct device *);
 extern void device_pm_move_last(struct device *);
 
-#else /* CONFIG_PM_SLEEP */
+#else /* !CONFIG_PM_SLEEP */
+
+static inline void device_pm_init(struct device *dev)
+{
+	pm_runtime_init(dev);
+}
+
+static inline void device_pm_remove(struct device *dev)
+{
+	pm_runtime_remove(dev);
+}
 
 static inline void device_pm_add(struct device *dev) {}
-static inline void device_pm_remove(struct device *dev) {}
 static inline void device_pm_move_before(struct device *deva,
 					 struct device *devb) {}
 static inline void device_pm_move_after(struct device *deva,
 					struct device *devb) {}
 static inline void device_pm_move_last(struct device *dev) {}
 
-#endif
+#endif /* !CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM
 
Index: linux-2.6/Documentation/power/runtime_pm.txt
===================================================================
--- /dev/null
+++ linux-2.6/Documentation/power/runtime_pm.txt
@@ -0,0 +1,382 @@
+Run-time Power Management Framework for I/O Devices
+
+(C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+
+1. Introduction
+
+Support for run-time power management (run-time PM) of I/O devices is provided
+at the power management core (PM core) level by means of:
+
+* The power management workqueue pm_wq in which bus types and device drivers can
+  put their PM-related work items.  It is strongly recommended that pm_wq be
+  used for queuing all work items related to run-time PM, because this allows
+  them to be synchronized with system-wide power transitions (suspend to RAM,
+  hibernation and resume from system sleep states).  pm_wq is declared in
+  include/linux/pm_runtime.h and defined in kernel/power/main.c.
+
+* A number of run-time PM fields in the 'power' member of 'struct device' (which
+  is of the type 'struct dev_pm_info', defined in include/linux/pm.h) that can
+  be used for synchronizing run-time PM operations with one another.
+
+* Three device run-time PM callbacks in 'struct dev_pm_ops' (defined in
+  include/linux/pm.h).
+
+* A set of helper functions defined in drivers/base/power/runtime.c that can be
+  used for carrying out run-time PM operations in such a way that the
+  synchronization between them is taken care of by the PM core.  Bus types and
+  device drivers are encouraged to use these functions.
+
+The run-time PM callbacks present in 'struct dev_pm_ops', the device run-time PM
+fields of 'struct dev_pm_info' and the core helper functions provided for
+run-time PM are described below.
+
+2. Device Run-time PM Callbacks
+
+There are three device run-time PM callbacks defined in 'struct dev_pm_ops':
+
+struct dev_pm_ops {
+	...
+	int (*runtime_suspend)(struct device *dev);
+	int (*runtime_resume)(struct device *dev);
+	void (*runtime_idle)(struct device *dev);
+	...
+};
+
+The ->runtime_suspend() callback is executed by the PM core for the bus type of
+the device being suspended.  The bus type's callback is then _entirely_
+_responsible_ for handling the device as appropriate, which may, but need not
+include executing the device driver's own ->runtime_suspend() callback (from the
+PM core's point of view it is not necessary to implement a ->runtime_suspend()
+callback in a device driver as long as the bus type's ->runtime_suspend() knows
+what to do to handle the device).
+
+  * Once the bus type's ->runtime_suspend() callback has completed successfully
+    for given device, the PM core regards the device as suspended, which need
+    not mean that the device has been put into a low power state.  It is
+    supposed to mean, however, that the device will not process data and will
+    not communicate with the CPU(s) and RAM until its bus type's
+    ->runtime_resume() callback is executed for it.  The run-time PM status of
+    a device after successful execution of its bus type's ->runtime_suspend()
+    callback is 'suspended'.
+
+  * If the bus type's ->runtime_suspend() callback returns -EBUSY or -EAGAIN,
+    the device's run-time PM status is supposed to be 'active', which means that
+    the device _must_ be fully operational afterwards.
+
+  * If the bus type's ->runtime_suspend() callback returns an error code
+    different from -EBUSY or -EAGAIN, the PM core regards this as a fatal
+    error and will refuse to run the helper functions described in Section 4
+    for the device, until the status of it is directly set either to 'active'
+    or to 'suspended' (the PM core provides special helper functions for this
+    purpose).
+
+In particular, if the driver requires remote wakeup capability for proper
+functioning and device_may_wakeup() returns 'false' for the device, then
+->runtime_suspend() should return -EBUSY.  On the other hand, if
+device_may_wakeup() returns 'true' for the device and the device is put
+into a low power state during the execution of its bus type's
+->runtime_suspend(), it is expected that remote wake-up (i.e. hardware mechanism
+allowing the device to request a change of its power state, such as PCI PME)
+will be enabled for the device.  Generally, remote wake-up should be enabled
+for all input devices put into a low power state at run time.
+
+The ->runtime_resume() callback is executed by the PM core for the bus type of
+the device being woken up.  The bus type's callback is then _entirely_
+_responsible_ for handling the device as appropriate, which may, but need not
+include executing the device driver's own ->runtime_resume() callback (from the
+PM core's point of view it is not necessary to implement a ->runtime_resume()
+callback in a device driver as long as the bus type's ->runtime_resume() knows
+what to do to handle the device).
+
+  * Once the bus type's ->runtime_resume() callback has completed successfully,
+    the PM core regards the device as fully operational, which means that the
+    device _must_ be able to complete I/O operations as needed.  The run-time
+    PM status of the device is then 'active'.
+
+  * If the bus type's ->runtime_resume() callback returns an error code, the PM
+    core regards this as a fatal error and will refuse to run the helper
+    functions described in Section 4 for the device, until its status is
+    directly set either to 'active' or to 'suspended' (the PM core provides
+    special helper functions for this purpose).
+
+The ->runtime_idle() callback is executed by the PM core for the bus type of
+given device whenever the device appears to be idle, which is indicated to the
+PM core by two counters, the device's usage counter and the counter of 'active'
+children of the device.
+
+  * If any of these counters is decreased using a helper function provided by
+    the PM core and it turns out to be equal to zero, the other counter is
+    checked.  If that counter also is equal to zero, the PM core executes the
+    device bus type's ->runtime_idle() callback (with the device as an
+    argument).
+
+The action performed by a bus type's ->runtime_idle() callback is totally
+dependent on the bus type in question, but the expected and recommended action
+is to check if the device can be suspended (i.e. if all of the conditions
+necessary for suspending the device are satisfied) and to queue up a suspend
+request for the device in that case.
+
+The helper functions provided by the PM core, described in Section 4, guarantee
+that the following constraints are met with respect to the bus type's run-time
+PM callbacks:
+
+(1) The callbacks are mutually exclusive (e.g. it is forbidden to execute
+    ->runtime_suspend() in parallel with ->runtime_resume() or with another
+    instance of ->runtime_suspend() for the same device) with the exception that
+    ->runtime_suspend() or ->runtime_resume() can be executed in parallel with
+    ->runtime_idle() (although ->runtime_idle() will not be started while any
+    of the other callbacks is being executed for the same device).
+
+(2) ->runtime_idle() and ->runtime_suspend() can only be executed for 'active'
+    devices (i.e. the PM core will only execute ->runtime_idle() or
+    ->runtime_suspend() for the devices the run-time PM status of which is
+    'active').
+
+(3) ->runtime_idle() and ->runtime_suspend() can only be executed for a device
+    the usage counter of which is equal to zero _and_ either the counter of
+    'active' children of which is equal to zero, or the 'power.ignore_children'
+    flag of which is set.
+
+(4) ->runtime_resume() can only be executed for 'suspended' devices  (i.e. the
+    PM core will only execute ->runtime_resume() for the devices the run-time
+    PM status of which is 'suspended').
+
+Additionally, the helper functions provided by the PM core obey the following
+rules:
+
+  * If ->runtime_suspend() is about to be executed or there's a pending request
+    to execute it, ->runtime_idle() will not be executed for the same device.
+
+  * A request to execute or to schedule the execution of ->runtime_suspend()
+    will cancel any pending requests to execute ->runtime_idle() for the same
+    device.
+
+  * If ->runtime_resume() is about to be executed or there's a pending request
+    to execute it, the other callbacks will not be executed for the same device.
+
+  * A request to execute ->runtime_resume() will cancel any pending or
+    scheduled requests to execute the other callbacks for the same device.
+
+3. Run-time PM Device Fields
+
+The following device run-time PM fields are present in 'struct dev_pm_info', as
+defined in include/linux/pm.h:
+
+  struct timer_list suspend_timer;
+    - timer used for scheduling (delayed) suspend request
+
+  unsigned long timer_expires;
+    - timer expiration time, in jiffies (if this is different from zero, the
+      timer is running and will expire at that time, otherwise the timer is not
+      running)
+
+  struct work_struct work;
+    - work structure used for queuing up requests (i.e. work items in pm_wq)
+
+  wait_queue_head_t wait_queue;
+    - wait queue used if any of the helper functions needs to wait for another
+      one to complete
+
+  spinlock_t lock;
+    - lock used for synchronisation
+
+  atomic_t usage_count;
+    - the usage counter of the device
+
+  atomic_t child_count;
+    - the count of 'active' children of the device
+
+  unsigned int ignore_children;
+    - if set, the value of child_count is ignored (but still updated)
+
+  unsigned int disable_depth;
+    - used for disabling the helper funcions (they work normally if this is
+      equal to zero); the initial value of it is 1 (i.e. run-time PM is
+      initially disabled for all devices)
+
+  unsigned int runtime_failure;
+    - if set, there was a fatal error (one of the callbacks returned error code
+      as described in Section 2), so the helper funtions will not work until
+      this flag is cleared
+
+  int last_error;
+    - if runtime_failure is set, this is the error code returned by the
+      failing callback
+
+  unsigned int idle_notification;
+    - if set, ->runtime_idle() is being executed
+
+  unsigned int request_pending;
+    - if set, there's a pending request (i.e. a work item queued up into pm_wq)
+
+  enum rpm_request request;
+    - type of request that's pending (valid if request_pending is set)
+
+  unsigned int deferred_resume;
+    - set if ->runtime_resume() is about to be run while ->runtime_suspend() is
+      being executed for that device and it is not practical to wait for the
+      suspend to complete; means "queue up a resume request as soon as you've
+      suspended"
+
+  enum rpm_status runtime_status;
+    - the run-time PM status of the device; this field's initial value is
+      RPM_SUSPENDED, which means that each device is initially regarded by the
+      PM core as 'suspended', regardless of its real hardware status
+
+All of the above fields are members of the 'power' member of 'struct device'.
+
+4. Run-time PM Device Helper Functions
+
+The following run-time PM helper functions are defined in
+drivers/base/power/runtime.c and include/linux/pm_runtime.h:
+
+  void pm_runtime_init(struct device *dev);
+    - initialize the device run-time PM fields in 'struct dev_pm_info'
+
+  void pm_runtime_remove(struct device *dev);
+    - make sure that the run-time PM of the device will be disabled after
+      removing the device from device hierarchy
+
+  int pm_runtime_idle(struct device *dev);
+    - execute ->runtime_idle() for the device's bus type; returns 0 on success
+      or error code on failure, where -EINPROGRESS means that ->runtime_idle()
+      is already being executed
+
+  int pm_runtime_suspend(struct device *dev);
+    - execute ->runtime_suspend() for the device's bus type; returns 0 on
+      success, 1 if the device's run-time PM status was already 'suspended', or
+      error code on failure, where -EAGAIN or -EBUSY means it is safe to attempt
+      to suspend the device again in future
+
+  int pm_runtime_resume(struct device *dev);
+    - execute ->runtime_resume() for the device's bus type; returns 0 on
+      success, 1 if the device's run-time PM status was already 'active' or
+      error code on failure, where -EAGAIN means it may be safe to attempt to
+      resume the device again in future, but 'power.runtime_failure' should be
+      checked additionally
+
+  int pm_request_idle(struct device *dev);
+    - submit a request to execute ->runtime_idle() for the device's bus type
+      (the request is represented by a work item in pm_wq); returns 0 on success
+      or error code if the request has not been queued up
+
+  int pm_schedule_suspend(struct device *dev, unsigned int delay);
+    - schedule the execution of ->runtime_suspend() for the device's bus type
+      in future, where 'delay' is the time to wait before queuing up a suspend
+      work item in pm_wq, in milliseconds (if 'delay' is zero, the work item is
+      queued up immediately); returns 0 on success, 1 if the device's PM
+      run-time status was already 'suspended', or error code if the request
+      hasn't been scheduled (or queued up if 'delay' is 0); if the execution of
+      ->runtime_suspend() is already scheduled and not yet expired, the new
+      value of 'delay' will be used as the time to wait
+
+  int pm_request_resume(struct device *dev);
+    - submit a request to execute ->runtime_resume() for the device's bus type
+      (the request is represented by a work item in pm_wq); returns 0 on
+      success, 1 if the device's run-time PM status was already 'active', or
+      error code if the request hasn't been queued up
+
+  void pm_runtime_get_noresume(struct device *dev);
+    - increment the device's usage counter
+
+  int pm_runtime_get(struct device *dev);
+    - increment the device's usage counter, run pm_request_resume(dev) and
+      return its result
+
+  int pm_runtime_get_sync(struct device *dev);
+    - increment the device's usage counter, run pm_runtime_resume(dev) and
+      return its result
+
+  void pm_runtime_put_noidle(struct device *dev);
+    - decrement the device's usage counter
+
+  int pm_runtime_put(struct device *dev);
+    - decrement the device's usage counter, run pm_request_idle(dev) and return
+      its result
+
+  int pm_runtime_put_sync(struct device *dev);
+    - decrement the device's usage counter, run pm_runtime_idle(dev) and return
+      its result
+
+  void pm_runtime_enable(struct device *dev);
+    - enable the run-time PM helper functions to run the device bus type's
+      run-time PM callbacks described in Section 2
+
+  int pm_runtime_disable(struct device *dev);
+    - prevent the run-time PM helper functions from running the device bus
+      type's run-time PM callbacks, make sure that all of the pending run-time
+      PM operations on the device are either completed or canceled; returns
+      1 if there was a resume request pending and it was necessary to execute
+      ->runtime_resume() for the device's bus type to satisfy that request,
+      otherwise 0 is returned
+
+  void pm_suspend_ignore_children(struct device *dev, bool enable);
+    - set/unset the power.ignore_children flag of the device
+
+  int pm_runtime_set_active(struct device *dev);
+    - clear the device's 'power.runtime_error' flag, set the device's run-time
+      PM status to 'active' and update its parent's counter of 'active'
+      children as appropriate (it is only valid to use this function if
+      'power.runtime_failure' is set or 'power.disable_depth' is greater than
+      zero); it will fail and return error code if the device has a parent
+      which is not active and the 'power.ignore_children' flag of which is unset
+
+  void pm_runtime_set_suspended(struct device *dev);
+    - clear the device's 'power.runtime_error' flag, set the device's run-time
+      PM status to 'suspended' and update its parent's counter of 'active'
+      children as appropriate (it is only valid to use this function if
+      'power.runtime_failure' is set or 'power.disable_depth' is greater than
+      zero)
+
+It is safe to execute the following helper functions from interrupt context:
+
+pm_request_idle()
+pm_schedule_suspend()
+pm_request_resume()
+pm_runtime_get_noresume()
+pm_runtime_get()
+pm_runtime_put_noidle()
+pm_runtime_put()
+pm_suspend_ignore_children()
+pm_runtime_set_active()
+pm_runtime_set_suspended()
+pm_runtime_enable()
+
+5. Run-time PM Initialization
+
+Initially, the run-time PM is disabled for all devices, which means that the
+majority of the run-time PM helper funtions described in Section 4 will return
+-EAGAIN until pm_runtime_enable() is called for the device.
+
+In addition to that, the initial run-time PM status of all devices is
+'suspended', but it need not reflect the actual physical state of the device.
+Thus, if the device is initially active (i.e. it is able to process I/O), its
+run-time PM status must be changed to 'active', with the help of
+pm_runtime_set_active(), before pm_runtime_enable() is called for the device.
+
+However, if the device has a parent and the parent's run-time PM is enabled,
+calling pm_runtime_set_active() for the device will affect the parent, unless
+the parent's 'power.ignore_children' flag is set.  Namely, in that case the
+parent won't be able to suspend at run time, using the PM core's helper
+functions, as long as the child's status is 'active', even if the child's
+run-time PM is still disabled (i.e. pm_runtime_enable() hasn't been called for
+the child yet or pm_runtime_disable() has been called for it).  For this reason,
+once pm_runtime_set_active() has been called for the device, pm_runtime_enable()
+should be called for it too as soon as reasonably possible or its run-time PM
+status should be changed back to 'suspended' with the help of
+pm_runtime_set_suspended().
+
+If the defaul initial run-time PM status of the device (i.e. 'suspended')
+reflects the actual state of the device, its bus type's or its driver's
+->probe() callback will likely need to wake it up using one of the PM core's
+helper functions described in Section 4.  In that case, pm_runtime_resume()
+should be used.  Of course, for this purpose the device's run-time PM has to be
+enabled earlier by calling pm_runtime_enable().
+
+If ->probe() calls pm_runtime_suspend() or pm_runtime_idle(), or their
+asynchronous counterparts, they will fail returning -EAGAIN, because the
+device's usage counter is incremented by the core before executing ->probe().
+Still, it may be desirable to suspend the device as soon as ->probe() has
+finished, so the core uses pm_runtime_idle() to invoke the device bus type's
+->runtime_idle() callback at that time, which only happens even if ->probe()
+is successful.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update] PM: Introduce core framework for run-time PM of I/O devices (rev. 12)
  2009-08-05 21:47         ` Rafael J. Wysocki
  2009-08-06 17:01           ` Alan Stern
@ 2009-08-06 17:01           ` Alan Stern
  2009-08-06 21:50             ` Rafael J. Wysocki
                               ` (3 more replies)
  1 sibling, 4 replies; 39+ messages in thread
From: Alan Stern @ 2009-08-06 17:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux-pm mailing list, Magnus Damm, Greg KH, Pavel Machek,
	Len Brown, LKML

On Wed, 5 Aug 2009, Rafael J. Wysocki wrote:

> Hi,
> 
> The patch below should address all of your recent comments.
> 
> Additionally I changed a few bits that I thought could turn out to be
> problematic at one point.

Looking good.  I've got a few more suggestions.

It occurred to me that there's no need for a separate
"runtime_failure" flag.  A nonzero value of "last_error" will do just
as well.  If you make this change, note that it affects the
documentation as well as the code.

If we defer a resume request while a suspend is in progress, then when
the suspend finishes should the resume be carried out immediately
rather than queued?  I don't see any reason why not.


> +/**
> + * __pm_runtime_suspend - Carry out run-time suspend of given device.
> + * @dev: Device to suspend.
> + * @from_wq: If set, the function has been called via pm_wq.
> + *
> + * Check if the device can be suspended and run the ->runtime_suspend() callback
> + * provided by its bus type.  If another suspend has been started earlier, wait
> + * for it to finish.  If there's an idle notification pending, cancel it.  If
> + * there's a suspend request scheduled while this function is running and @sync
> + * is 'true', cancel that request.

Change the last two sentences as follows: If an idle notification or suspend
request is pending or scheduled, cancel it.

> + *
> + * This function must be called under dev->power.lock with interrupts disabled.
> + */
> +int __pm_runtime_suspend(struct device *dev, bool from_wq)
> +	__releases(&dev->power.lock) __acquires(&dev->power.lock)
> +{
...
> +	pm_runtime_deactivate_timer(dev);
> +
> +	if (dev->power.request_pending) {
> +		/* Pending resume requests take precedence over us. */
> +		if (dev->power.request == RPM_REQ_RESUME)
> +			return -EAGAIN;
> +		/* Other pending requests need to be canceled. */
> +		dev->power.request = RPM_REQ_NONE;
> +	}

Might as well use pm_runtime_cancel_pending since we have it:

	/* Pending resume requests take precedence over us. */
	if (dev->power.request_pending && dev->power.request == RPM_REQ_RESUME)
		return -EAGAIN;

	/* Other pending requests need to be canceled. */
	pm_runtime_cancel_pending(dev);

...
> +	if (dev->power.deferred_resume) {
> +		__pm_request_resume(dev);

__pm_runtime_resume instead?


> +/**
> + * __pm_runtime_resume - Carry out run-time resume of given device.
> + * @dev: Device to resume.
> + * @from_wq: If set, the function has been called via pm_wq.
> + *
> + * Check if the device can be woken up and run the ->runtime_resume() callback
> + * provided by its bus type.  If another resume has been started earlier, wait
> + * for it to finish.  If there's a suspend running in parallel with this
> + * function, wait for it to finish and resume the device.  If there's a suspend
> + * request or idle notification pending, cancel it.  If there's a resume request
> + * scheduled while this function is running, cancel that request.

Change the last two sentences as follows: Cancel any pending requests.

> + *
> + * This function must be called under dev->power.lock with interrupts disabled.
> + */
> +int __pm_runtime_resume(struct device *dev, bool from_wq)
> +	__releases(&dev->power.lock) __acquires(&dev->power.lock)
> +{
> +	struct device *parent = NULL;
> +	int retval = 0;
> +
> + repeat:
> +	if (dev->power.runtime_failure)
> +		return -EINVAL;

Here and in two places below, goto out_parent instead of returning
directly.

...
> +	if (!parent && dev->parent) {
> +		/*
> +		 * Increment the parent's resume counter and resume it if
> +		 * necessary.
> +		 */
> +		parent = dev->parent;
> +		spin_unlock_irq(&dev->power.lock);
> +
> +		retval = pm_runtime_get_sync(parent);
> +
> +		spin_lock_irq(&dev->power.lock);
> +		/* We can resume if the parent's run-time PM is disabled. */
> +		if (retval < 0 && retval != -EAGAIN)
> +			goto out_parent;

Instead of checking retval, how about checking the parent's PM status?
Also, this isn't needed if the parent is set to ignore children.


> +static int __pm_request_idle(struct device *dev)
> +{
> +	int retval = 0;
> +
> +	if (dev->power.runtime_failure)
> +		retval = -EINVAL;
> +	else if (atomic_read(&dev->power.usage_count) > 0
> +	    || dev->power.disable_depth > 0
> +	    || dev->power.timer_expires > 0

This line should be removed.

...
> +	if (dev->power.request_pending && dev->power.request != RPM_REQ_NONE) {
> +		/* Any requests other then RPM_REQ_IDLE take precedence. */
> +		if (dev->power.request != RPM_REQ_IDLE)
> +			retval = -EAGAIN;
> +		return retval;
> +	}
> +
> +	dev->power.request = RPM_REQ_IDLE;
> +	if (dev->power.request_pending)
> +		return retval;
> +
> +	dev->power.request_pending = true;
> +	queue_work(pm_wq, &dev->power.work);

This should be done consistently with the other routines.  Thus:

	if (dev->power.request_pending) {
		/* All other requests take precedence. */
		if (dev->power.request == RPM_REQ_NONE)
			dev->power.request = RPM_REQ_IDLE;
		else if (dev->power.request != RPM_REQ_IDLE)
			retval = -EAGAIN;
		return retval;
	}

	dev->power.request = RPM_REQ_IDLE;
	dev->power.request_pending = true;
	queue_work(pm_wq, &dev->power.work);


> +int __pm_runtime_set_status(struct device *dev, unsigned int status)
> +{
> +	struct device *parent = dev->parent;
> +	unsigned long flags;
> +	bool notify_parent = false;
> +	int error = 0;
> +
> +	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
> +		return -EINVAL;
> +
> +	spin_lock_irqsave(&dev->power.lock, flags);
> +
> +	if (!dev->power.runtime_failure && !dev->power.disable_depth)
> +		goto out;

Set "error" to a negative code?


> @@ -757,11 +770,16 @@ static int dpm_prepare(pm_message_t stat
>  		dev->power.status = DPM_PREPARING;
>  		mutex_unlock(&dpm_list_mtx);
>  
> -		error = device_prepare(dev, state);
> +		if (pm_runtime_disable(dev) && device_may_wakeup(dev))
> +			/* Wake-up during suspend. */
> +			error = -EBUSY;

Or maybe "Wakeup was requested during sleep transition."


> +  unsigned int deferred_resume;
> +    - set if ->runtime_resume() is about to be run while ->runtime_suspend() is
> +      being executed for that device and it is not practical to wait for the
> +      suspend to complete; means "queue up a resume request as soon as you've
> +      suspended"

"start a resume" instead of "queue up a resume request"?


> +5. Run-time PM Initialization
...
> +If the defaul initial run-time PM status of the device (i.e. 'suspended')

Fix spelling of "default".

> +reflects the actual state of the device, its bus type's or its driver's
> +->probe() callback will likely need to wake it up using one of the PM core's
> +helper functions described in Section 4.  In that case, pm_runtime_resume()
> +should be used.  Of course, for this purpose the device's run-time PM has to be
> +enabled earlier by calling pm_runtime_enable().
> +
> +If ->probe() calls pm_runtime_suspend() or pm_runtime_idle(), or their
> +asynchronous counterparts, they will fail returning -EAGAIN, because the
> +device's usage counter is incremented by the core before executing ->probe().
> +Still, it may be desirable to suspend the device as soon as ->probe() has
> +finished, so the core uses pm_runtime_idle() to invoke the device bus type's
> +->runtime_idle() callback at that time, which only happens even if ->probe()

s/which only happens even/but only/

> +is successful.

Alan Stern


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update] PM: Introduce core framework for run-time PM of I/O devices (rev. 12)
  2009-08-05 21:47         ` Rafael J. Wysocki
@ 2009-08-06 17:01           ` Alan Stern
  2009-08-06 17:01           ` Alan Stern
  1 sibling, 0 replies; 39+ messages in thread
From: Alan Stern @ 2009-08-06 17:01 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Greg KH, LKML, Linux-pm mailing list

On Wed, 5 Aug 2009, Rafael J. Wysocki wrote:

> Hi,
> 
> The patch below should address all of your recent comments.
> 
> Additionally I changed a few bits that I thought could turn out to be
> problematic at one point.

Looking good.  I've got a few more suggestions.

It occurred to me that there's no need for a separate
"runtime_failure" flag.  A nonzero value of "last_error" will do just
as well.  If you make this change, note that it affects the
documentation as well as the code.

If we defer a resume request while a suspend is in progress, then when
the suspend finishes should the resume be carried out immediately
rather than queued?  I don't see any reason why not.


> +/**
> + * __pm_runtime_suspend - Carry out run-time suspend of given device.
> + * @dev: Device to suspend.
> + * @from_wq: If set, the function has been called via pm_wq.
> + *
> + * Check if the device can be suspended and run the ->runtime_suspend() callback
> + * provided by its bus type.  If another suspend has been started earlier, wait
> + * for it to finish.  If there's an idle notification pending, cancel it.  If
> + * there's a suspend request scheduled while this function is running and @sync
> + * is 'true', cancel that request.

Change the last two sentences as follows: If an idle notification or suspend
request is pending or scheduled, cancel it.

> + *
> + * This function must be called under dev->power.lock with interrupts disabled.
> + */
> +int __pm_runtime_suspend(struct device *dev, bool from_wq)
> +	__releases(&dev->power.lock) __acquires(&dev->power.lock)
> +{
...
> +	pm_runtime_deactivate_timer(dev);
> +
> +	if (dev->power.request_pending) {
> +		/* Pending resume requests take precedence over us. */
> +		if (dev->power.request == RPM_REQ_RESUME)
> +			return -EAGAIN;
> +		/* Other pending requests need to be canceled. */
> +		dev->power.request = RPM_REQ_NONE;
> +	}

Might as well use pm_runtime_cancel_pending since we have it:

	/* Pending resume requests take precedence over us. */
	if (dev->power.request_pending && dev->power.request == RPM_REQ_RESUME)
		return -EAGAIN;

	/* Other pending requests need to be canceled. */
	pm_runtime_cancel_pending(dev);

...
> +	if (dev->power.deferred_resume) {
> +		__pm_request_resume(dev);

__pm_runtime_resume instead?


> +/**
> + * __pm_runtime_resume - Carry out run-time resume of given device.
> + * @dev: Device to resume.
> + * @from_wq: If set, the function has been called via pm_wq.
> + *
> + * Check if the device can be woken up and run the ->runtime_resume() callback
> + * provided by its bus type.  If another resume has been started earlier, wait
> + * for it to finish.  If there's a suspend running in parallel with this
> + * function, wait for it to finish and resume the device.  If there's a suspend
> + * request or idle notification pending, cancel it.  If there's a resume request
> + * scheduled while this function is running, cancel that request.

Change the last two sentences as follows: Cancel any pending requests.

> + *
> + * This function must be called under dev->power.lock with interrupts disabled.
> + */
> +int __pm_runtime_resume(struct device *dev, bool from_wq)
> +	__releases(&dev->power.lock) __acquires(&dev->power.lock)
> +{
> +	struct device *parent = NULL;
> +	int retval = 0;
> +
> + repeat:
> +	if (dev->power.runtime_failure)
> +		return -EINVAL;

Here and in two places below, goto out_parent instead of returning
directly.

...
> +	if (!parent && dev->parent) {
> +		/*
> +		 * Increment the parent's resume counter and resume it if
> +		 * necessary.
> +		 */
> +		parent = dev->parent;
> +		spin_unlock_irq(&dev->power.lock);
> +
> +		retval = pm_runtime_get_sync(parent);
> +
> +		spin_lock_irq(&dev->power.lock);
> +		/* We can resume if the parent's run-time PM is disabled. */
> +		if (retval < 0 && retval != -EAGAIN)
> +			goto out_parent;

Instead of checking retval, how about checking the parent's PM status?
Also, this isn't needed if the parent is set to ignore children.


> +static int __pm_request_idle(struct device *dev)
> +{
> +	int retval = 0;
> +
> +	if (dev->power.runtime_failure)
> +		retval = -EINVAL;
> +	else if (atomic_read(&dev->power.usage_count) > 0
> +	    || dev->power.disable_depth > 0
> +	    || dev->power.timer_expires > 0

This line should be removed.

...
> +	if (dev->power.request_pending && dev->power.request != RPM_REQ_NONE) {
> +		/* Any requests other then RPM_REQ_IDLE take precedence. */
> +		if (dev->power.request != RPM_REQ_IDLE)
> +			retval = -EAGAIN;
> +		return retval;
> +	}
> +
> +	dev->power.request = RPM_REQ_IDLE;
> +	if (dev->power.request_pending)
> +		return retval;
> +
> +	dev->power.request_pending = true;
> +	queue_work(pm_wq, &dev->power.work);

This should be done consistently with the other routines.  Thus:

	if (dev->power.request_pending) {
		/* All other requests take precedence. */
		if (dev->power.request == RPM_REQ_NONE)
			dev->power.request = RPM_REQ_IDLE;
		else if (dev->power.request != RPM_REQ_IDLE)
			retval = -EAGAIN;
		return retval;
	}

	dev->power.request = RPM_REQ_IDLE;
	dev->power.request_pending = true;
	queue_work(pm_wq, &dev->power.work);


> +int __pm_runtime_set_status(struct device *dev, unsigned int status)
> +{
> +	struct device *parent = dev->parent;
> +	unsigned long flags;
> +	bool notify_parent = false;
> +	int error = 0;
> +
> +	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
> +		return -EINVAL;
> +
> +	spin_lock_irqsave(&dev->power.lock, flags);
> +
> +	if (!dev->power.runtime_failure && !dev->power.disable_depth)
> +		goto out;

Set "error" to a negative code?


> @@ -757,11 +770,16 @@ static int dpm_prepare(pm_message_t stat
>  		dev->power.status = DPM_PREPARING;
>  		mutex_unlock(&dpm_list_mtx);
>  
> -		error = device_prepare(dev, state);
> +		if (pm_runtime_disable(dev) && device_may_wakeup(dev))
> +			/* Wake-up during suspend. */
> +			error = -EBUSY;

Or maybe "Wakeup was requested during sleep transition."


> +  unsigned int deferred_resume;
> +    - set if ->runtime_resume() is about to be run while ->runtime_suspend() is
> +      being executed for that device and it is not practical to wait for the
> +      suspend to complete; means "queue up a resume request as soon as you've
> +      suspended"

"start a resume" instead of "queue up a resume request"?


> +5. Run-time PM Initialization
...
> +If the defaul initial run-time PM status of the device (i.e. 'suspended')

Fix spelling of "default".

> +reflects the actual state of the device, its bus type's or its driver's
> +->probe() callback will likely need to wake it up using one of the PM core's
> +helper functions described in Section 4.  In that case, pm_runtime_resume()
> +should be used.  Of course, for this purpose the device's run-time PM has to be
> +enabled earlier by calling pm_runtime_enable().
> +
> +If ->probe() calls pm_runtime_suspend() or pm_runtime_idle(), or their
> +asynchronous counterparts, they will fail returning -EAGAIN, because the
> +device's usage counter is incremented by the core before executing ->probe().
> +Still, it may be desirable to suspend the device as soon as ->probe() has
> +finished, so the core uses pm_runtime_idle() to invoke the device bus type's
> +->runtime_idle() callback at that time, which only happens even if ->probe()

s/which only happens even/but only/

> +is successful.

Alan Stern

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update] PM: Introduce core framework for run-time PM of I/O devices (rev. 12)
  2009-08-06 17:01           ` Alan Stern
  2009-08-06 21:50             ` Rafael J. Wysocki
@ 2009-08-06 21:50             ` Rafael J. Wysocki
  2009-08-07 13:59               ` Alan Stern
  2009-08-07 13:59               ` Alan Stern
  2009-08-06 21:53             ` [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13) Rafael J. Wysocki
  2009-08-06 21:53             ` Rafael J. Wysocki
  3 siblings, 2 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2009-08-06 21:50 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux-pm mailing list, Magnus Damm, Greg KH, Pavel Machek,
	Len Brown, LKML

On Thursday 06 August 2009, Alan Stern wrote:
> On Wed, 5 Aug 2009, Rafael J. Wysocki wrote:
> 
> > Hi,
> > 
> > The patch below should address all of your recent comments.
> > 
> > Additionally I changed a few bits that I thought could turn out to be
> > problematic at one point.
> 
> Looking good.  I've got a few more suggestions.
> 
> It occurred to me that there's no need for a separate
> "runtime_failure" flag.  A nonzero value of "last_error" will do just
> as well.

Yes, good catch.  I don't quite remember why I wanted the flag and a separate
error field.

> If you make this change, note that it affects the documentation as well as
> the code.

Sure.

> If we defer a resume request while a suspend is in progress, then when
> the suspend finishes should the resume be carried out immediately
> rather than queued?  I don't see any reason why not.

Well, it's not very clear what to return to the caller in such a case.  I guess
we can return -EAGAIN.

> > +/**
> > + * __pm_runtime_suspend - Carry out run-time suspend of given device.
> > + * @dev: Device to suspend.
> > + * @from_wq: If set, the function has been called via pm_wq.
> > + *
> > + * Check if the device can be suspended and run the ->runtime_suspend() callback
> > + * provided by its bus type.  If another suspend has been started earlier, wait
> > + * for it to finish.  If there's an idle notification pending, cancel it.  If
> > + * there's a suspend request scheduled while this function is running and @sync
> > + * is 'true', cancel that request.
> 
> Change the last two sentences as follows: If an idle notification or suspend
> request is pending or scheduled, cancel it.

OK

> > + *
> > + * This function must be called under dev->power.lock with interrupts disabled.
> > + */
> > +int __pm_runtime_suspend(struct device *dev, bool from_wq)
> > +	__releases(&dev->power.lock) __acquires(&dev->power.lock)
> > +{
> ...
> > +	pm_runtime_deactivate_timer(dev);
> > +
> > +	if (dev->power.request_pending) {
> > +		/* Pending resume requests take precedence over us. */
> > +		if (dev->power.request == RPM_REQ_RESUME)
> > +			return -EAGAIN;
> > +		/* Other pending requests need to be canceled. */
> > +		dev->power.request = RPM_REQ_NONE;
> > +	}
> 
> Might as well use pm_runtime_cancel_pending since we have it:
> 
> 	/* Pending resume requests take precedence over us. */
> 	if (dev->power.request_pending && dev->power.request == RPM_REQ_RESUME)
> 		return -EAGAIN;
> 
> 	/* Other pending requests need to be canceled. */
> 	pm_runtime_cancel_pending(dev);

OK

> ...
> > +	if (dev->power.deferred_resume) {
> > +		__pm_request_resume(dev);
> 
> __pm_runtime_resume instead?

In which case we shouldn't execute the code below, IMO, but return immediately
instead.

> > +/**
> > + * __pm_runtime_resume - Carry out run-time resume of given device.
> > + * @dev: Device to resume.
> > + * @from_wq: If set, the function has been called via pm_wq.
> > + *
> > + * Check if the device can be woken up and run the ->runtime_resume() callback
> > + * provided by its bus type.  If another resume has been started earlier, wait
> > + * for it to finish.  If there's a suspend running in parallel with this
> > + * function, wait for it to finish and resume the device.  If there's a suspend
> > + * request or idle notification pending, cancel it.  If there's a resume request
> > + * scheduled while this function is running, cancel that request.
> 
> Change the last two sentences as follows: Cancel any pending requests.

OK

> > + *
> > + * This function must be called under dev->power.lock with interrupts disabled.
> > + */
> > +int __pm_runtime_resume(struct device *dev, bool from_wq)
> > +	__releases(&dev->power.lock) __acquires(&dev->power.lock)
> > +{
> > +	struct device *parent = NULL;
> > +	int retval = 0;
> > +
> > + repeat:
> > +	if (dev->power.runtime_failure)
> > +		return -EINVAL;
> 
> Here and in two places below, goto out_parent instead of returning
> directly.

Ah, that was a real bug.  Thanks for catching it!

> ...
> > +	if (!parent && dev->parent) {
> > +		/*
> > +		 * Increment the parent's resume counter and resume it if
> > +		 * necessary.
> > +		 */
> > +		parent = dev->parent;
> > +		spin_unlock_irq(&dev->power.lock);
> > +
> > +		retval = pm_runtime_get_sync(parent);
> > +
> > +		spin_lock_irq(&dev->power.lock);
> > +		/* We can resume if the parent's run-time PM is disabled. */
> > +		if (retval < 0 && retval != -EAGAIN)
> > +			goto out_parent;
> 
> Instead of checking retval, how about checking the parent's PM status?

Should work.

> Also, this isn't needed if the parent is set to ignore children.

OK, I'll change that.

> > +static int __pm_request_idle(struct device *dev)
> > +{
> > +	int retval = 0;
> > +
> > +	if (dev->power.runtime_failure)
> > +		retval = -EINVAL;
> > +	else if (atomic_read(&dev->power.usage_count) > 0
> > +	    || dev->power.disable_depth > 0
> > +	    || dev->power.timer_expires > 0
> 
> This line should be removed.

Yeah, thanks!

> ...
> > +	if (dev->power.request_pending && dev->power.request != RPM_REQ_NONE) {
> > +		/* Any requests other then RPM_REQ_IDLE take precedence. */
> > +		if (dev->power.request != RPM_REQ_IDLE)
> > +			retval = -EAGAIN;
> > +		return retval;
> > +	}
> > +
> > +	dev->power.request = RPM_REQ_IDLE;
> > +	if (dev->power.request_pending)
> > +		return retval;
> > +
> > +	dev->power.request_pending = true;
> > +	queue_work(pm_wq, &dev->power.work);
> 
> This should be done consistently with the other routines.  Thus:
> 
> 	if (dev->power.request_pending) {
> 		/* All other requests take precedence. */
> 		if (dev->power.request == RPM_REQ_NONE)
> 			dev->power.request = RPM_REQ_IDLE;
> 		else if (dev->power.request != RPM_REQ_IDLE)
> 			retval = -EAGAIN;
> 		return retval;
> 	}
> 
> 	dev->power.request = RPM_REQ_IDLE;
> 	dev->power.request_pending = true;
> 	queue_work(pm_wq, &dev->power.work);

OK

> > +int __pm_runtime_set_status(struct device *dev, unsigned int status)
> > +{
> > +	struct device *parent = dev->parent;
> > +	unsigned long flags;
> > +	bool notify_parent = false;
> > +	int error = 0;
> > +
> > +	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
> > +		return -EINVAL;
> > +
> > +	spin_lock_irqsave(&dev->power.lock, flags);
> > +
> > +	if (!dev->power.runtime_failure && !dev->power.disable_depth)
> > +		goto out;
> 
> Set "error" to a negative code?

OK

> > @@ -757,11 +770,16 @@ static int dpm_prepare(pm_message_t stat
> >  		dev->power.status = DPM_PREPARING;
> >  		mutex_unlock(&dpm_list_mtx);
> >  
> > -		error = device_prepare(dev, state);
> > +		if (pm_runtime_disable(dev) && device_may_wakeup(dev))
> > +			/* Wake-up during suspend. */
> > +			error = -EBUSY;
> 
> Or maybe "Wakeup was requested during sleep transition."

Sounds better.

> > +  unsigned int deferred_resume;
> > +    - set if ->runtime_resume() is about to be run while ->runtime_suspend() is
> > +      being executed for that device and it is not practical to wait for the
> > +      suspend to complete; means "queue up a resume request as soon as you've
> > +      suspended"
> 
> "start a resume" instead of "queue up a resume request"?

OK

> > +5. Run-time PM Initialization
> ...
> > +If the defaul initial run-time PM status of the device (i.e. 'suspended')
> 
> Fix spelling of "default".

OK

> > +reflects the actual state of the device, its bus type's or its driver's
> > +->probe() callback will likely need to wake it up using one of the PM core's
> > +helper functions described in Section 4.  In that case, pm_runtime_resume()
> > +should be used.  Of course, for this purpose the device's run-time PM has to be
> > +enabled earlier by calling pm_runtime_enable().
> > +
> > +If ->probe() calls pm_runtime_suspend() or pm_runtime_idle(), or their
> > +asynchronous counterparts, they will fail returning -EAGAIN, because the
> > +device's usage counter is incremented by the core before executing ->probe().
> > +Still, it may be desirable to suspend the device as soon as ->probe() has
> > +finished, so the core uses pm_runtime_idle() to invoke the device bus type's
> > +->runtime_idle() callback at that time, which only happens even if ->probe()
> 
> s/which only happens even/but only/
> 
> > +is successful.

OK

Thanks for the comments!  In fact I've already updated the patch to address
them, so I'll send it in a little while.

Best,
Rafael

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update] PM: Introduce core framework for run-time PM of I/O devices (rev. 12)
  2009-08-06 17:01           ` Alan Stern
@ 2009-08-06 21:50             ` Rafael J. Wysocki
  2009-08-06 21:50             ` Rafael J. Wysocki
                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2009-08-06 21:50 UTC (permalink / raw)
  To: Alan Stern; +Cc: Greg KH, LKML, Linux-pm mailing list

On Thursday 06 August 2009, Alan Stern wrote:
> On Wed, 5 Aug 2009, Rafael J. Wysocki wrote:
> 
> > Hi,
> > 
> > The patch below should address all of your recent comments.
> > 
> > Additionally I changed a few bits that I thought could turn out to be
> > problematic at one point.
> 
> Looking good.  I've got a few more suggestions.
> 
> It occurred to me that there's no need for a separate
> "runtime_failure" flag.  A nonzero value of "last_error" will do just
> as well.

Yes, good catch.  I don't quite remember why I wanted the flag and a separate
error field.

> If you make this change, note that it affects the documentation as well as
> the code.

Sure.

> If we defer a resume request while a suspend is in progress, then when
> the suspend finishes should the resume be carried out immediately
> rather than queued?  I don't see any reason why not.

Well, it's not very clear what to return to the caller in such a case.  I guess
we can return -EAGAIN.

> > +/**
> > + * __pm_runtime_suspend - Carry out run-time suspend of given device.
> > + * @dev: Device to suspend.
> > + * @from_wq: If set, the function has been called via pm_wq.
> > + *
> > + * Check if the device can be suspended and run the ->runtime_suspend() callback
> > + * provided by its bus type.  If another suspend has been started earlier, wait
> > + * for it to finish.  If there's an idle notification pending, cancel it.  If
> > + * there's a suspend request scheduled while this function is running and @sync
> > + * is 'true', cancel that request.
> 
> Change the last two sentences as follows: If an idle notification or suspend
> request is pending or scheduled, cancel it.

OK

> > + *
> > + * This function must be called under dev->power.lock with interrupts disabled.
> > + */
> > +int __pm_runtime_suspend(struct device *dev, bool from_wq)
> > +	__releases(&dev->power.lock) __acquires(&dev->power.lock)
> > +{
> ...
> > +	pm_runtime_deactivate_timer(dev);
> > +
> > +	if (dev->power.request_pending) {
> > +		/* Pending resume requests take precedence over us. */
> > +		if (dev->power.request == RPM_REQ_RESUME)
> > +			return -EAGAIN;
> > +		/* Other pending requests need to be canceled. */
> > +		dev->power.request = RPM_REQ_NONE;
> > +	}
> 
> Might as well use pm_runtime_cancel_pending since we have it:
> 
> 	/* Pending resume requests take precedence over us. */
> 	if (dev->power.request_pending && dev->power.request == RPM_REQ_RESUME)
> 		return -EAGAIN;
> 
> 	/* Other pending requests need to be canceled. */
> 	pm_runtime_cancel_pending(dev);

OK

> ...
> > +	if (dev->power.deferred_resume) {
> > +		__pm_request_resume(dev);
> 
> __pm_runtime_resume instead?

In which case we shouldn't execute the code below, IMO, but return immediately
instead.

> > +/**
> > + * __pm_runtime_resume - Carry out run-time resume of given device.
> > + * @dev: Device to resume.
> > + * @from_wq: If set, the function has been called via pm_wq.
> > + *
> > + * Check if the device can be woken up and run the ->runtime_resume() callback
> > + * provided by its bus type.  If another resume has been started earlier, wait
> > + * for it to finish.  If there's a suspend running in parallel with this
> > + * function, wait for it to finish and resume the device.  If there's a suspend
> > + * request or idle notification pending, cancel it.  If there's a resume request
> > + * scheduled while this function is running, cancel that request.
> 
> Change the last two sentences as follows: Cancel any pending requests.

OK

> > + *
> > + * This function must be called under dev->power.lock with interrupts disabled.
> > + */
> > +int __pm_runtime_resume(struct device *dev, bool from_wq)
> > +	__releases(&dev->power.lock) __acquires(&dev->power.lock)
> > +{
> > +	struct device *parent = NULL;
> > +	int retval = 0;
> > +
> > + repeat:
> > +	if (dev->power.runtime_failure)
> > +		return -EINVAL;
> 
> Here and in two places below, goto out_parent instead of returning
> directly.

Ah, that was a real bug.  Thanks for catching it!

> ...
> > +	if (!parent && dev->parent) {
> > +		/*
> > +		 * Increment the parent's resume counter and resume it if
> > +		 * necessary.
> > +		 */
> > +		parent = dev->parent;
> > +		spin_unlock_irq(&dev->power.lock);
> > +
> > +		retval = pm_runtime_get_sync(parent);
> > +
> > +		spin_lock_irq(&dev->power.lock);
> > +		/* We can resume if the parent's run-time PM is disabled. */
> > +		if (retval < 0 && retval != -EAGAIN)
> > +			goto out_parent;
> 
> Instead of checking retval, how about checking the parent's PM status?

Should work.

> Also, this isn't needed if the parent is set to ignore children.

OK, I'll change that.

> > +static int __pm_request_idle(struct device *dev)
> > +{
> > +	int retval = 0;
> > +
> > +	if (dev->power.runtime_failure)
> > +		retval = -EINVAL;
> > +	else if (atomic_read(&dev->power.usage_count) > 0
> > +	    || dev->power.disable_depth > 0
> > +	    || dev->power.timer_expires > 0
> 
> This line should be removed.

Yeah, thanks!

> ...
> > +	if (dev->power.request_pending && dev->power.request != RPM_REQ_NONE) {
> > +		/* Any requests other then RPM_REQ_IDLE take precedence. */
> > +		if (dev->power.request != RPM_REQ_IDLE)
> > +			retval = -EAGAIN;
> > +		return retval;
> > +	}
> > +
> > +	dev->power.request = RPM_REQ_IDLE;
> > +	if (dev->power.request_pending)
> > +		return retval;
> > +
> > +	dev->power.request_pending = true;
> > +	queue_work(pm_wq, &dev->power.work);
> 
> This should be done consistently with the other routines.  Thus:
> 
> 	if (dev->power.request_pending) {
> 		/* All other requests take precedence. */
> 		if (dev->power.request == RPM_REQ_NONE)
> 			dev->power.request = RPM_REQ_IDLE;
> 		else if (dev->power.request != RPM_REQ_IDLE)
> 			retval = -EAGAIN;
> 		return retval;
> 	}
> 
> 	dev->power.request = RPM_REQ_IDLE;
> 	dev->power.request_pending = true;
> 	queue_work(pm_wq, &dev->power.work);

OK

> > +int __pm_runtime_set_status(struct device *dev, unsigned int status)
> > +{
> > +	struct device *parent = dev->parent;
> > +	unsigned long flags;
> > +	bool notify_parent = false;
> > +	int error = 0;
> > +
> > +	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
> > +		return -EINVAL;
> > +
> > +	spin_lock_irqsave(&dev->power.lock, flags);
> > +
> > +	if (!dev->power.runtime_failure && !dev->power.disable_depth)
> > +		goto out;
> 
> Set "error" to a negative code?

OK

> > @@ -757,11 +770,16 @@ static int dpm_prepare(pm_message_t stat
> >  		dev->power.status = DPM_PREPARING;
> >  		mutex_unlock(&dpm_list_mtx);
> >  
> > -		error = device_prepare(dev, state);
> > +		if (pm_runtime_disable(dev) && device_may_wakeup(dev))
> > +			/* Wake-up during suspend. */
> > +			error = -EBUSY;
> 
> Or maybe "Wakeup was requested during sleep transition."

Sounds better.

> > +  unsigned int deferred_resume;
> > +    - set if ->runtime_resume() is about to be run while ->runtime_suspend() is
> > +      being executed for that device and it is not practical to wait for the
> > +      suspend to complete; means "queue up a resume request as soon as you've
> > +      suspended"
> 
> "start a resume" instead of "queue up a resume request"?

OK

> > +5. Run-time PM Initialization
> ...
> > +If the defaul initial run-time PM status of the device (i.e. 'suspended')
> 
> Fix spelling of "default".

OK

> > +reflects the actual state of the device, its bus type's or its driver's
> > +->probe() callback will likely need to wake it up using one of the PM core's
> > +helper functions described in Section 4.  In that case, pm_runtime_resume()
> > +should be used.  Of course, for this purpose the device's run-time PM has to be
> > +enabled earlier by calling pm_runtime_enable().
> > +
> > +If ->probe() calls pm_runtime_suspend() or pm_runtime_idle(), or their
> > +asynchronous counterparts, they will fail returning -EAGAIN, because the
> > +device's usage counter is incremented by the core before executing ->probe().
> > +Still, it may be desirable to suspend the device as soon as ->probe() has
> > +finished, so the core uses pm_runtime_idle() to invoke the device bus type's
> > +->runtime_idle() callback at that time, which only happens even if ->probe()
> 
> s/which only happens even/but only/
> 
> > +is successful.

OK

Thanks for the comments!  In fact I've already updated the patch to address
them, so I'll send it in a little while.

Best,
Rafael

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
  2009-08-06 17:01           ` Alan Stern
  2009-08-06 21:50             ` Rafael J. Wysocki
  2009-08-06 21:50             ` Rafael J. Wysocki
@ 2009-08-06 21:53             ` Rafael J. Wysocki
  2009-08-07  7:45               ` Magnus Damm
                                 ` (3 more replies)
  2009-08-06 21:53             ` Rafael J. Wysocki
  3 siblings, 4 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2009-08-06 21:53 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux-pm mailing list, Magnus Damm, Greg KH, Pavel Machek,
	Len Brown, LKML

Hi,

The patch below should address all of your most recent comments.

Best,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM: Introduce core framework for run-time PM of I/O devices (rev. 13)

Introduce a core framework for run-time power management of I/O
devices.  Add device run-time PM fields to 'struct dev_pm_info'
and device run-time PM callbacks to 'struct dev_pm_ops'.  Introduce
a run-time PM workqueue and define some device run-time PM helper
functions at the core level.  Document all these things.

Special thanks to Alan Stern for his help with the design and
multiple detailed reviews of the pereceding versions of this patch
and to Magnus Damm for testing feedback.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 Documentation/power/runtime_pm.txt |  377 ++++++++++++++
 drivers/base/dd.c                  |   15 
 drivers/base/power/Makefile        |    1 
 drivers/base/power/main.c          |   20 
 drivers/base/power/power.h         |   31 -
 drivers/base/power/runtime.c       |  944 +++++++++++++++++++++++++++++++++++++
 include/linux/pm.h                 |  101 +++
 include/linux/pm_runtime.h         |  115 ++++
 kernel/power/Kconfig               |   14 
 kernel/power/main.c                |   17 
 10 files changed, 1624 insertions(+), 11 deletions(-)

Index: linux-2.6/kernel/power/Kconfig
===================================================================
--- linux-2.6.orig/kernel/power/Kconfig
+++ linux-2.6/kernel/power/Kconfig
@@ -208,3 +208,17 @@ config APM_EMULATION
 	  random kernel OOPSes or reboots that don't seem to be related to
 	  anything, try disabling/enabling this option (or disabling/enabling
 	  APM in your BIOS).
+
+config PM_RUNTIME
+	bool "Run-time PM core functionality"
+	depends on PM
+	---help---
+	  Enable functionality allowing I/O devices to be put into energy-saving
+	  (low power) states at run time (or autosuspended) after a specified
+	  period of inactivity and woken up in response to a hardware-generated
+	  wake-up event or a driver's request.
+
+	  Hardware support is generally required for this functionality to work
+	  and the bus type drivers of the buses the devices are on are
+	  responsible for the actual handling of the autosuspend requests and
+	  wake-up events.
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -11,6 +11,7 @@
 #include <linux/kobject.h>
 #include <linux/string.h>
 #include <linux/resume-trace.h>
+#include <linux/workqueue.h>
 
 #include "power.h"
 
@@ -217,8 +218,24 @@ static struct attribute_group attr_group
 	.attrs = g,
 };
 
+#ifdef CONFIG_PM_RUNTIME
+struct workqueue_struct *pm_wq;
+
+static int __init pm_start_workqueue(void)
+{
+	pm_wq = create_freezeable_workqueue("pm");
+
+	return pm_wq ? 0 : -ENOMEM;
+}
+#else
+static inline int pm_start_workqueue(void) { return 0; }
+#endif
+
 static int __init pm_init(void)
 {
+	int error = pm_start_workqueue();
+	if (error)
+		return error;
 	power_kobj = kobject_create_and_add("power", NULL);
 	if (!power_kobj)
 		return -ENOMEM;
Index: linux-2.6/include/linux/pm.h
===================================================================
--- linux-2.6.orig/include/linux/pm.h
+++ linux-2.6/include/linux/pm.h
@@ -22,6 +22,10 @@
 #define _LINUX_PM_H
 
 #include <linux/list.h>
+#include <linux/workqueue.h>
+#include <linux/spinlock.h>
+#include <linux/wait.h>
+#include <linux/timer.h>
 
 /*
  * Callbacks for platform drivers to implement.
@@ -165,6 +169,28 @@ typedef struct pm_message {
  * It is allowed to unregister devices while the above callbacks are being
  * executed.  However, it is not allowed to unregister a device from within any
  * of its own callbacks.
+ *
+ * There also are the following callbacks related to run-time power management
+ * of devices:
+ *
+ * @runtime_suspend: Prepare the device for a condition in which it won't be
+ *	able to communicate with the CPU(s) and RAM due to power management.
+ *	This need not mean that the device should be put into a low power state.
+ *	For example, if the device is behind a link which is about to be turned
+ *	off, the device may remain at full power.  If the device does go to low
+ *	power and if device_may_wakeup(dev) is true, remote wake-up (i.e., a
+ *	hardware mechanism allowing the device to request a change of its power
+ *	state, such as PCI PME) should be enabled for it.
+ *
+ * @runtime_resume: Put the device into the fully active state in response to a
+ *	wake-up event generated by hardware or at the request of software.  If
+ *	necessary, put the device into the full power state and restore its
+ *	registers, so that it is fully operational.
+ *
+ * @runtime_idle: Device appears to be inactive and it might be put into a low
+ *	power state if all of the necessary conditions are satisfied.  Check
+ *	these conditions and handle the device as appropriate, possibly queueing
+ *	a suspend request for it.
  */
 
 struct dev_pm_ops {
@@ -182,6 +208,9 @@ struct dev_pm_ops {
 	int (*thaw_noirq)(struct device *dev);
 	int (*poweroff_noirq)(struct device *dev);
 	int (*restore_noirq)(struct device *dev);
+	int (*runtime_suspend)(struct device *dev);
+	int (*runtime_resume)(struct device *dev);
+	void (*runtime_idle)(struct device *dev);
 };
 
 /*
@@ -329,14 +358,80 @@ enum dpm_state {
 	DPM_OFF_IRQ,
 };
 
+/**
+ * Device run-time power management status.
+ *
+ * These status labels are used internally by the PM core to indicate the
+ * current status of a device with respect to the PM core operations.  They do
+ * not reflect the actual power state of the device or its status as seen by the
+ * driver.
+ *
+ * RPM_ACTIVE		Device is fully operational.  Indicates that the device
+ *			bus type's ->runtime_resume() callback has completed
+ *			successfully.
+ *
+ * RPM_SUSPENDED	Device bus type's ->runtime_suspend() callback has
+ *			completed successfully.  The device is regarded as
+ *			suspended.
+ *
+ * RPM_RESUMING		Device bus type's ->runtime_resume() callback is being
+ *			executed.
+ *
+ * RPM_SUSPENDING	Device bus type's ->runtime_suspend() callback is being
+ *			executed.
+ */
+
+enum rpm_status {
+	RPM_ACTIVE = 0,
+	RPM_RESUMING,
+	RPM_SUSPENDED,
+	RPM_SUSPENDING,
+};
+
+/**
+ * Device run-time power management request types.
+ *
+ * RPM_REQ_NONE		Do nothing.
+ *
+ * RPM_REQ_IDLE		Run the device bus type's ->runtime_idle() callback
+ *
+ * RPM_REQ_SUSPEND	Run the device bus type's ->runtime_suspend() callback
+ *
+ * RPM_REQ_RESUME	Run the device bus type's ->runtime_resume() callback
+ */
+
+enum rpm_request {
+	RPM_REQ_NONE = 0,
+	RPM_REQ_IDLE,
+	RPM_REQ_SUSPEND,
+	RPM_REQ_RESUME,
+};
+
 struct dev_pm_info {
 	pm_message_t		power_state;
-	unsigned		can_wakeup:1;
-	unsigned		should_wakeup:1;
+	unsigned int		can_wakeup:1;
+	unsigned int		should_wakeup:1;
 	enum dpm_state		status;		/* Owned by the PM core */
-#ifdef	CONFIG_PM_SLEEP
+#ifdef CONFIG_PM_SLEEP
 	struct list_head	entry;
 #endif
+#ifdef CONFIG_PM_RUNTIME
+	struct timer_list	suspend_timer;
+	unsigned long		timer_expires;
+	struct work_struct	work;
+	wait_queue_head_t	wait_queue;
+	spinlock_t		lock;
+	atomic_t		usage_count;
+	atomic_t		child_count;
+	unsigned int		disable_depth:3;
+	unsigned int		ignore_children:1;
+	unsigned int		idle_notification:1;
+	unsigned int		request_pending:1;
+	unsigned int		deferred_resume:1;
+	enum rpm_request	request;
+	enum rpm_status		runtime_status;
+	int			runtime_error;
+#endif
 };
 
 /*
Index: linux-2.6/drivers/base/power/Makefile
===================================================================
--- linux-2.6.orig/drivers/base/power/Makefile
+++ linux-2.6/drivers/base/power/Makefile
@@ -1,5 +1,6 @@
 obj-$(CONFIG_PM)	+= sysfs.o
 obj-$(CONFIG_PM_SLEEP)	+= main.o
+obj-$(CONFIG_PM_RUNTIME)	+= runtime.o
 obj-$(CONFIG_PM_TRACE_RTC)	+= trace.o
 
 ccflags-$(CONFIG_DEBUG_DRIVER) := -DDEBUG
Index: linux-2.6/drivers/base/power/runtime.c
===================================================================
--- /dev/null
+++ linux-2.6/drivers/base/power/runtime.c
@@ -0,0 +1,944 @@
+/*
+ * drivers/base/power/runtime.c - Helper functions for device run-time PM
+ *
+ * Copyright (c) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+ *
+ * This file is released under the GPLv2.
+ */
+
+#include <linux/sched.h>
+#include <linux/pm_runtime.h>
+#include <linux/jiffies.h>
+
+static int __pm_runtime_resume(struct device *dev, bool from_wq);
+static int __pm_request_idle(struct device *dev);
+static int __pm_request_resume(struct device *dev);
+
+/**
+ * pm_runtime_deactivate_timer - Deactivate given device's suspend timer.
+ * @dev: Device to handle.
+ */
+static void pm_runtime_deactivate_timer(struct device *dev)
+{
+	if (dev->power.timer_expires > 0) {
+		del_timer(&dev->power.suspend_timer);
+		dev->power.timer_expires = 0;
+	}
+}
+
+/**
+ * pm_runtime_cancel_pending - Deactivate suspend timer and cancel requests.
+ * @dev: Device to handle.
+ */
+static void pm_runtime_cancel_pending(struct device *dev)
+{
+	pm_runtime_deactivate_timer(dev);
+	/*
+	 * In case there's a request pending, make sure its work function will
+	 * return without doing anything.
+	 */
+	dev->power.request = RPM_REQ_NONE;
+}
+
+/**
+ * __pm_runtime_idle - Notify device bus type if the device can be suspended.
+ * @dev: Device to notify the bus type about.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_runtime_idle(struct device *dev)
+	__releases(&dev->power.lock) __acquires(&dev->power.lock)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_error)
+		retval = -EINVAL;
+	else if (dev->power.idle_notification)
+		retval = -EINPROGRESS;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0
+	    || dev->power.runtime_status != RPM_ACTIVE)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		return retval;
+
+	if (dev->power.request_pending) {
+		/*
+		 * If an idle notification request is pending, cancel it.  Any
+		 * other pending request takes precedence over us.
+		 */
+		if (dev->power.request == RPM_REQ_IDLE)
+			dev->power.request = RPM_REQ_NONE;
+		else if (dev->power.request != RPM_REQ_NONE)
+			return -EAGAIN;
+	}
+
+	dev->power.idle_notification = true;
+
+	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle) {
+		spin_unlock_irq(&dev->power.lock);
+
+		dev->bus->pm->runtime_idle(dev);
+
+		spin_lock_irq(&dev->power.lock);
+	}
+
+	dev->power.idle_notification = false;
+	wake_up_all(&dev->power.wait_queue);
+
+	return 0;
+}
+
+/**
+ * pm_runtime_idle - Notify device bus type if the device can be suspended.
+ * @dev: Device to notify the bus type about.
+ */
+int pm_runtime_idle(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_idle(dev);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_idle);
+
+/**
+ * __pm_runtime_suspend - Carry out run-time suspend of given device.
+ * @dev: Device to suspend.
+ * @from_wq: If set, the function has been called via pm_wq.
+ *
+ * Check if the device can be suspended and run the ->runtime_suspend() callback
+ * provided by its bus type.  If another suspend has been started earlier, wait
+ * for it to finish.  If an idle notification or suspend request is pending or
+ * scheduled, cancel it.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+int __pm_runtime_suspend(struct device *dev, bool from_wq)
+	__releases(&dev->power.lock) __acquires(&dev->power.lock)
+{
+	struct device *parent = NULL;
+	bool notify = false;
+	int retval = 0;
+
+ repeat:
+	if (dev->power.runtime_error)
+		return -EINVAL;
+
+	/* Pending resume requests take precedence over us. */
+	if (dev->power.request_pending && dev->power.request == RPM_REQ_RESUME)
+			return -EAGAIN;
+
+	/* Other scheduled or pending requests need to be canceled. */
+	pm_runtime_cancel_pending(dev);
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_RESUMING
+	    || dev->power.disable_depth > 0
+	    || atomic_read(&dev->power.usage_count) > 0)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		return retval;
+
+	if (dev->power.runtime_status == RPM_SUSPENDING) {
+		DEFINE_WAIT(wait);
+
+		if (from_wq)
+			return -EINPROGRESS;
+
+		/* Wait for the other suspend running in parallel with us. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_SUSPENDING)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+		goto repeat;
+	}
+
+	dev->power.runtime_status = RPM_SUSPENDING;
+
+	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_suspend) {
+		spin_unlock_irq(&dev->power.lock);
+
+		retval = dev->bus->pm->runtime_suspend(dev);
+
+		spin_lock_irq(&dev->power.lock);
+	} else {
+		retval = -ENOSYS;
+	}
+
+	if (retval) {
+		dev->power.runtime_status = RPM_ACTIVE;
+		pm_runtime_cancel_pending(dev);
+		dev->power.deferred_resume = false;
+
+		if (retval == -EAGAIN || retval == -EBUSY)
+			notify = true;
+		else
+			dev->power.runtime_error = retval;
+	} else {
+		dev->power.runtime_status = RPM_SUSPENDED;
+
+		if (dev->parent) {
+			parent = dev->parent;
+			atomic_add_unless(&parent->power.child_count, -1, 0);
+		}
+	}
+	wake_up_all(&dev->power.wait_queue);
+
+	if (dev->power.deferred_resume) {
+		dev->power.deferred_resume = false;
+		__pm_runtime_resume(dev, false);
+		return -EAGAIN;
+	}
+
+	if (notify)
+		__pm_runtime_idle(dev);
+
+	if (parent && !parent->power.ignore_children) {
+		spin_unlock_irq(&dev->power.lock);
+
+		pm_request_idle(parent);
+
+		spin_lock_irq(&dev->power.lock);
+	}
+
+	return retval;
+}
+
+/**
+ * pm_runtime_suspend - Carry out run-time suspend of given device.
+ * @dev: Device to suspend.
+ */
+int pm_runtime_suspend(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_suspend(dev, false);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_suspend);
+
+/**
+ * __pm_runtime_resume - Carry out run-time resume of given device.
+ * @dev: Device to resume.
+ * @from_wq: If set, the function has been called via pm_wq.
+ *
+ * Check if the device can be woken up and run the ->runtime_resume() callback
+ * provided by its bus type.  If another resume has been started earlier, wait
+ * for it to finish.  If there's a suspend running in parallel with this
+ * function, wait for it to finish and resume the device.  Cancel any scheduled
+ * or pending requests.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+int __pm_runtime_resume(struct device *dev, bool from_wq)
+	__releases(&dev->power.lock) __acquires(&dev->power.lock)
+{
+	struct device *parent = NULL;
+	int retval = 0;
+
+ repeat:
+	if (dev->power.runtime_error) {
+		retval = -EINVAL;
+		goto out;
+	}
+
+	pm_runtime_cancel_pending(dev);
+
+	if (dev->power.runtime_status == RPM_ACTIVE)
+		retval = 1;
+	else if (dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	if (retval)
+		goto out;
+
+	if (dev->power.runtime_status == RPM_RESUMING
+	    || dev->power.runtime_status == RPM_SUSPENDING) {
+		DEFINE_WAIT(wait);
+
+		if (from_wq) {
+			if (dev->power.runtime_status == RPM_SUSPENDING)
+				dev->power.deferred_resume = true;
+			retval = -EINPROGRESS;
+			goto out;
+		}
+
+		/* Wait for the operation carried out in parallel with us. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_RESUMING
+			    && dev->power.runtime_status != RPM_SUSPENDING)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+		goto repeat;
+	}
+
+	if (!parent && dev->parent) {
+		/*
+		 * Increment the parent's resume counter and resume it if
+		 * necessary.
+		 */
+		parent = dev->parent;
+		spin_unlock_irq(&dev->power.lock);
+
+		pm_runtime_get_noresume(parent);
+
+		spin_lock_irq(&parent->power.lock);
+		/*
+		 * We can resume if the parent's run-time PM is disabled or it
+		 * is set to ignore children.
+		 */
+		if (!parent->power.disable_depth
+		    && !parent->power.ignore_children) {
+			__pm_runtime_resume(parent, false);
+			if (parent->power.runtime_status != RPM_ACTIVE)
+				retval = -EBUSY;
+		}
+		spin_unlock_irq(&parent->power.lock);
+
+		spin_lock_irq(&dev->power.lock);
+		if (retval)
+			goto out;
+		goto repeat;
+	}
+
+	dev->power.runtime_status = RPM_RESUMING;
+
+	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_resume) {
+		spin_unlock_irq(&dev->power.lock);
+
+		retval = dev->bus->pm->runtime_resume(dev);
+
+		spin_lock_irq(&dev->power.lock);
+	} else {
+		retval = -ENOSYS;
+	}
+
+	if (retval) {
+		dev->power.runtime_status = RPM_SUSPENDED;
+		dev->power.runtime_error = retval;
+
+		pm_runtime_cancel_pending(dev);
+	} else {
+		dev->power.runtime_status = RPM_ACTIVE;
+
+		if (parent)
+			atomic_inc(&parent->power.child_count);
+	}
+	wake_up_all(&dev->power.wait_queue);
+
+	if (!retval)
+		__pm_request_idle(dev);
+
+ out:
+	if (parent) {
+		spin_unlock_irq(&dev->power.lock);
+
+		pm_runtime_put(parent);
+
+		spin_lock_irq(&dev->power.lock);
+	}
+
+	return retval;
+}
+
+/**
+ * pm_runtime_resume - Carry out run-time resume of given device.
+ * @dev: Device to suspend.
+ */
+int pm_runtime_resume(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_resume(dev, false);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_resume);
+
+/**
+ * pm_runtime_work - Universal run-time PM work function.
+ * @work: Work structure used for scheduling the execution of this function.
+ *
+ * Use @work to get the device object the work is to be done for, determine what
+ * is to be done and execute the appropriate run-time PM function.
+ */
+static void pm_runtime_work(struct work_struct *work)
+{
+	struct device *dev = container_of(work, struct device, power.work);
+	enum rpm_request req;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (!dev->power.request_pending)
+		goto out;
+
+	req = dev->power.request;
+	dev->power.request = RPM_REQ_NONE;
+	dev->power.request_pending = false;
+
+	switch (req) {
+	case RPM_REQ_NONE:
+		break;
+	case RPM_REQ_IDLE:
+		__pm_runtime_idle(dev);
+		break;
+	case RPM_REQ_SUSPEND:
+		__pm_runtime_suspend(dev, true);
+		break;
+	case RPM_REQ_RESUME:
+		__pm_runtime_resume(dev, true);
+		break;
+	}
+
+ out:
+	spin_unlock_irq(&dev->power.lock);
+}
+
+/**
+ * __pm_request_idle - Submit an idle notification request for given device.
+ * @dev: Device to handle.
+ *
+ * Check if the device's run-time PM status is correct for suspending the device
+ * and queue up a request to run __pm_runtime_idle() for it.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_request_idle(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_error)
+		retval = -EINVAL;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0
+	    || dev->power.runtime_status == RPM_SUSPENDED
+	    || dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		return retval;
+
+	if (dev->power.request_pending) {
+		/* Any requests other then RPM_REQ_IDLE take precedence. */
+		if (dev->power.request != RPM_REQ_NONE)
+			dev->power.request = RPM_REQ_IDLE;
+		else if (dev->power.request != RPM_REQ_IDLE)
+			retval = -EAGAIN;
+		return retval;
+	}
+
+	dev->power.request = RPM_REQ_IDLE;
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+	return retval;
+}
+
+/**
+ * pm_request_idle - Submit an idle notification request for given device.
+ * @dev: Device to handle.
+ */
+int pm_request_idle(struct device *dev)
+{
+	unsigned long flags;
+	int retval;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+	retval = __pm_request_idle(dev);
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_request_idle);
+
+/**
+ * __pm_request_suspend - Submit a suspend request for given device.
+ * @dev: Device to suspend.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_request_suspend(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_error)
+		return -EINVAL;
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	else if (dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EINPROGRESS;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval < 0)
+		return retval;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/*
+		 * Pending resume requests take precedence over us, but we can
+		 * overtake any other pending request.
+		 */
+		if (dev->power.request == RPM_REQ_RESUME)
+			retval = -EAGAIN;
+		else if (dev->power.request != RPM_REQ_SUSPEND)
+			dev->power.request = retval ?
+						RPM_REQ_NONE : RPM_REQ_SUSPEND;
+		return retval;
+	} else if (retval) {
+		return retval;
+	}
+
+	dev->power.request = RPM_REQ_SUSPEND;
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+	return 0;
+}
+
+/**
+ * pm_suspend_timer_fn - Timer function for pm_schedule_suspend().
+ * @data: Device pointer passed by pm_schedule_suspend().
+ *
+ * Check if the time is right and execute __pm_request_suspend() in that case.
+ */
+static void pm_suspend_timer_fn(unsigned long data)
+{
+	struct device *dev = (struct device *)data;
+	unsigned long flags;
+	unsigned long expires;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	expires = dev->power.timer_expires;
+	/* If 'expire' is after 'jiffies' we've been called too early. */
+	if (expires > 0 && !time_after(expires, jiffies)) {
+		dev->power.timer_expires = 0;
+		__pm_request_suspend(dev);
+	}
+
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+}
+
+/**
+ * pm_schedule_suspend - Set up a timer to submit a suspend request in future.
+ * @dev: Device to suspend.
+ * @delay: Time to wait before submitting a suspend request, in milliseconds.
+ */
+int pm_schedule_suspend(struct device *dev, unsigned int delay)
+{
+	unsigned long flags;
+	int retval = 0;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (dev->power.runtime_error) {
+		retval = -EINVAL;
+		goto out;
+	}
+
+	if (!delay) {
+		retval = __pm_request_suspend(dev);
+		goto out;
+	}
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/*
+		 * Pending resume requests take precedence over us, but any
+		 * other pending requests have to be canceled.
+		 */
+		if (dev->power.request == RPM_REQ_RESUME) {
+			retval = -EAGAIN;
+			goto out;
+		}
+		dev->power.request = RPM_REQ_NONE;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EINPROGRESS;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		goto out;
+
+	dev->power.timer_expires = jiffies + msecs_to_jiffies(delay);
+	mod_timer(&dev->power.suspend_timer, dev->power.timer_expires);
+
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_schedule_suspend);
+
+/**
+ * pm_request_resume - Submit a resume request for given device.
+ * @dev: Device to resume.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_request_resume(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_error)
+		return -EINVAL;
+
+	if (dev->power.runtime_status == RPM_ACTIVE)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_RESUMING)
+		retval = -EINPROGRESS;
+	else if (dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	if (retval < 0)
+		return retval;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/* If non-resume request is pending, we can overtake it. */
+		dev->power.request = retval ? RPM_REQ_NONE : RPM_REQ_RESUME;
+		return retval;
+	} else if (retval) {
+		return retval;
+	}
+
+	dev->power.request = RPM_REQ_RESUME;
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+	return retval;
+}
+
+/**
+ * pm_request_resume - Submit a resume request for given device.
+ * @dev: Device to resume.
+ */
+int pm_request_resume(struct device *dev)
+{
+	unsigned long flags;
+	int retval;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+	retval = __pm_request_resume(dev);
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_request_resume);
+
+/**
+ * __pm_runtime_get - Reference count a device and wake it up, if necessary.
+ * @dev: Device to handle.
+ * @sync: If set and the device is suspended, resume it synchronously.
+ *
+ * Increment the usage count of the device and if it was zero previously,
+ * resume it or submit a resume request for it, depending on the value of @sync.
+ */
+int __pm_runtime_get(struct device *dev, bool sync)
+{
+	int retval = 1;
+
+	if (atomic_add_return(1, &dev->power.usage_count) == 1)
+		retval = sync ? pm_runtime_resume(dev) : pm_request_resume(dev);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(__pm_runtime_get);
+
+/**
+ * __pm_runtime_put - Decrement the device's usage counter and notify its bus.
+ * @dev: Device to handle.
+ * @sync: If the device's bus type is to be notified, do that synchronously.
+ *
+ * Decrement the usage count of the device and if it reaches zero, carry out a
+ * synchronous idle notification or submit an idle notification request for it,
+ * depending on the value of @sync.
+ */
+int __pm_runtime_put(struct device *dev, bool sync)
+{
+	int retval = 0;
+
+	if (atomic_dec_and_test(&dev->power.usage_count))
+		retval = sync ? pm_runtime_idle(dev) : pm_request_idle(dev);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(__pm_runtime_put);
+
+/**
+ * __pm_runtime_set_status - Set run-time PM status of a device.
+ * @dev: Device to handle.
+ * @status: New run-time PM status of the device.
+ *
+ * If run-time PM of the device is disabled or its power.runtime_error field is
+ * different from zero, the status may be changed either to RPM_ACTIVE, or to
+ * RPM_SUSPENDED, as long as that reflects the actual state of the device.
+ * However, if the device has a parent and the parent is not active, and the
+ * parent's power.ignore_children flag is unset, the device's status cannot be
+ * set to RPM_ACTIVE, so -EBUSY is returned in that case.
+ *
+ * If successful, __pm_runtime_set_status() clears the power.runtime_error field
+ * and the device parent's counter of unsuspended children is modified to
+ * reflect the new status.  If the new status is RPM_SUSPENDED, an idle
+ * notification request for the parent is submitted.
+ */
+int __pm_runtime_set_status(struct device *dev, unsigned int status)
+{
+	struct device *parent = dev->parent;
+	unsigned long flags;
+	bool notify_parent = false;
+	int error = 0;
+
+	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
+		return -EINVAL;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (!dev->power.runtime_error && !dev->power.disable_depth) {
+		error = -EAGAIN;
+		goto out;
+	}
+
+	if (dev->power.runtime_status == status)
+		goto out_set;
+
+	if (status == RPM_SUSPENDED) {
+		/* It always is possible to set the status to 'suspended'. */
+		if (parent) {
+			atomic_add_unless(&parent->power.child_count, -1, 0);
+			notify_parent = !parent->power.ignore_children;
+		}
+		goto out_set;
+	}
+
+	if (parent) {
+		spin_lock_irq(&parent->power.lock);
+
+		/*
+		 * It is invalid to put an active child under a parent that is
+		 * not active, has run-time PM enabled and the
+		 * 'power.ignore_children' flag unset.
+		 */
+		if (!parent->power.disable_depth
+		    && !parent->power.ignore_children
+		    && parent->power.runtime_status != RPM_ACTIVE) {
+			error = -EBUSY;
+		} else {
+			if (dev->power.runtime_status == RPM_SUSPENDED)
+				atomic_inc(&parent->power.child_count);
+		}
+
+		spin_unlock_irq(&parent->power.lock);
+
+		if (error)
+			goto out;
+	}
+
+ out_set:
+	dev->power.runtime_status = status;
+	dev->power.runtime_error = 0;
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	if (notify_parent)
+		pm_request_idle(parent);
+
+	return error;
+}
+EXPORT_SYMBOL_GPL(__pm_runtime_set_status);
+
+/**
+ * pm_runtime_enable - Enable run-time PM of a device.
+ * @dev: Device to handle.
+ */
+void pm_runtime_enable(struct device *dev)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (dev->power.disable_depth > 0)
+		dev->power.disable_depth--;
+	else
+		dev_warn(dev, "Unbalanced %s!", __func__);
+
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+}
+EXPORT_SYMBOL_GPL(pm_runtime_enable);
+
+/**
+ * __pm_runtime_disable - Disable run-time PM of a device.
+ * @dev: Device to handle.
+ * @check_resume: If set, check if there's a resume request for the device.
+ *
+ * Increment power.disable_depth for the device and if was zero previously,
+ * cancel all pending run-time PM requests for the device and wait for all
+ * operations in progress to complete.  The device can be either active or
+ * suspended after its run-time PM has been disabled.
+ *
+ * If @check_resume is set and there's a resume request pending when
+ * __pm_runtime_disable() is called and power.disable_depth is zero, the
+ * function will wake up the device before disabling its run-time PM and will
+ * return 1.  Otherwise, 0 is returned.
+ */
+int __pm_runtime_disable(struct device *dev, bool check_resume)
+{
+	int retval = 0;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (dev->power.disable_depth > 0) {
+		dev->power.disable_depth++;
+		goto out;
+	}
+
+	/*
+	 * Wake up the device if there's a resume request pending, because that
+	 * means there probably is some I/O to process and disabling run-time PM
+	 * shouldn't prevent the device from processing the I/O.
+	 */
+	if (check_resume && dev->power.request_pending
+	    && dev->power.request == RPM_REQ_RESUME) {
+		/*
+		 * Prevent suspends and idle notifications from being carried
+		 * out after we have woken up the device.
+		 */
+		pm_runtime_get_noresume(dev);
+
+		__pm_runtime_resume(dev, false);
+
+		pm_runtime_put_noidle(dev);
+		retval = 1;
+	}
+
+	if (dev->power.disable_depth++ > 0)
+		goto out;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		dev->power.request = RPM_REQ_NONE;
+		spin_unlock_irq(&dev->power.lock);
+
+		cancel_work_sync(&dev->power.work);
+
+		spin_lock_irq(&dev->power.lock);
+		dev->power.request_pending = false;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDING
+	    || dev->power.runtime_status == RPM_RESUMING
+	    || dev->power.idle_notification) {
+		DEFINE_WAIT(wait);
+
+		/* Suspend or wake-up in progress. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_SUSPENDING
+			    && dev->power.runtime_status != RPM_RESUMING
+			    && !dev->power.idle_notification)
+				break;
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+	}
+
+ out:
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_disable);
+
+/**
+ * pm_runtime_init - Initialize run-time PM fields in given device object.
+ * @dev: Device object to initialize.
+ */
+void pm_runtime_init(struct device *dev)
+{
+	spin_lock_init(&dev->power.lock);
+
+	dev->power.runtime_status = RPM_SUSPENDED;
+	dev->power.idle_notification = false;
+
+	dev->power.disable_depth = 1;
+	atomic_set(&dev->power.usage_count, 0);
+
+	dev->power.runtime_error = 0;
+
+	atomic_set(&dev->power.child_count, 0);
+	pm_suspend_ignore_children(dev, false);
+
+	dev->power.request_pending = false;
+	dev->power.request = RPM_REQ_NONE;
+	dev->power.deferred_resume = false;
+	INIT_WORK(&dev->power.work, pm_runtime_work);
+
+	dev->power.timer_expires = 0;
+	setup_timer(&dev->power.suspend_timer, pm_suspend_timer_fn,
+			(unsigned long)dev);
+
+	init_waitqueue_head(&dev->power.wait_queue);
+}
+
+/**
+ * pm_runtime_remove - Prepare for removing a device from device hierarchy.
+ * @dev: Device object being removed from device hierarchy.
+ */
+void pm_runtime_remove(struct device *dev)
+{
+	__pm_runtime_disable(dev, false);
+
+	/* Change the status back to 'suspended' to match the initial status. */
+	if (dev->power.runtime_status == RPM_ACTIVE)
+		pm_runtime_set_suspended(dev);
+}
Index: linux-2.6/include/linux/pm_runtime.h
===================================================================
--- /dev/null
+++ linux-2.6/include/linux/pm_runtime.h
@@ -0,0 +1,115 @@
+/*
+ * pm_runtime.h - Device run-time power management helper functions.
+ *
+ * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * This file is released under the GPLv2.
+ */
+
+#ifndef _LINUX_PM_RUNTIME_H
+#define _LINUX_PM_RUNTIME_H
+
+#include <linux/device.h>
+#include <linux/pm.h>
+
+#ifdef CONFIG_PM_RUNTIME
+
+extern struct workqueue_struct *pm_wq;
+
+extern int pm_runtime_idle(struct device *dev);
+extern int pm_runtime_suspend(struct device *dev);
+extern int pm_runtime_resume(struct device *dev);
+extern int pm_request_idle(struct device *dev);
+extern int pm_schedule_suspend(struct device *dev, unsigned int delay);
+extern int pm_request_resume(struct device *dev);
+extern int __pm_runtime_get(struct device *dev, bool sync);
+extern int __pm_runtime_put(struct device *dev, bool sync);
+extern int __pm_runtime_set_status(struct device *dev, unsigned int status);
+extern void pm_runtime_enable(struct device *dev);
+extern int __pm_runtime_disable(struct device *dev, bool check_resume);
+
+static inline bool pm_children_suspended(struct device *dev)
+{
+	return dev->power.ignore_children
+		|| !atomic_read(&dev->power.child_count);
+}
+
+static inline void pm_suspend_ignore_children(struct device *dev, bool enable)
+{
+	dev->power.ignore_children = enable;
+}
+
+static inline void pm_runtime_get_noresume(struct device *dev)
+{
+	atomic_inc(&dev->power.usage_count);
+}
+
+static inline void pm_runtime_put_noidle(struct device *dev)
+{
+	atomic_add_unless(&dev->power.usage_count, -1, 0);
+}
+
+#else /* !CONFIG_PM_RUNTIME */
+
+static inline int pm_runtime_idle(struct device *dev) { return -ENOSYS; }
+static inline int pm_runtime_suspend(struct device *dev) { return -ENOSYS; }
+static inline int pm_runtime_resume(struct device *dev) { return 0; }
+static inline int pm_request_idle(struct device *dev) { return -ENOSYS; }
+static inline int pm_schedule_suspend(struct device *dev, unsigned int delay)
+{
+	return -ENOSYS;
+}
+static inline int pm_request_resume(struct device *dev) { return 0; }
+static inline int __pm_runtime_get(struct device *dev, bool sync) { return 1; }
+static inline int __pm_runtime_put(struct device *dev, bool sync) { return 0; }
+static inline int __pm_runtime_set_status(struct device *dev,
+					    unsigned int status) { return 0; }
+static inline void pm_runtime_enable(struct device *dev) {}
+static inline int __pm_runtime_disable(struct device *dev, bool check_resume)
+{
+	return 0;
+}
+
+static inline bool pm_children_suspended(struct device *dev) { return false; }
+static inline void pm_suspend_ignore_children(struct device *dev, bool en) {}
+static inline void pm_runtime_get_noresume(struct device *dev) {}
+static inline void pm_runtime_put_noidle(struct device *dev) {}
+
+#endif /* !CONFIG_PM_RUNTIME */
+
+static inline int pm_runtime_get(struct device *dev)
+{
+	return __pm_runtime_get(dev, false);
+}
+
+static inline int pm_runtime_get_sync(struct device *dev)
+{
+	return __pm_runtime_get(dev, true);
+}
+
+static inline int pm_runtime_put(struct device *dev)
+{
+	return __pm_runtime_put(dev, false);
+}
+
+static inline int pm_runtime_put_sync(struct device *dev)
+{
+	return __pm_runtime_put(dev, true);
+}
+
+static inline int pm_runtime_set_active(struct device *dev)
+{
+	return __pm_runtime_set_status(dev, RPM_ACTIVE);
+}
+
+static inline void pm_runtime_set_suspended(struct device *dev)
+{
+	__pm_runtime_set_status(dev, RPM_SUSPENDED);
+}
+
+static inline int pm_runtime_disable(struct device *dev)
+{
+	return __pm_runtime_disable(dev, true);
+}
+
+#endif
Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -21,6 +21,7 @@
 #include <linux/kallsyms.h>
 #include <linux/mutex.h>
 #include <linux/pm.h>
+#include <linux/pm_runtime.h>
 #include <linux/resume-trace.h>
 #include <linux/rwsem.h>
 #include <linux/interrupt.h>
@@ -49,6 +50,16 @@ static DEFINE_MUTEX(dpm_list_mtx);
 static bool transition_started;
 
 /**
+ * device_pm_init - Initialize the PM-related part of a device object
+ * @dev: Device object being initialized.
+ */
+void device_pm_init(struct device *dev)
+{
+	dev->power.status = DPM_ON;
+	pm_runtime_init(dev);
+}
+
+/**
  *	device_pm_lock - lock the list of active devices used by the PM core
  */
 void device_pm_lock(void)
@@ -105,6 +116,7 @@ void device_pm_remove(struct device *dev
 	mutex_lock(&dpm_list_mtx);
 	list_del_init(&dev->power.entry);
 	mutex_unlock(&dpm_list_mtx);
+	pm_runtime_remove(dev);
 }
 
 /**
@@ -512,6 +524,7 @@ static void dpm_complete(pm_message_t st
 			mutex_unlock(&dpm_list_mtx);
 
 			device_complete(dev, state);
+			pm_runtime_enable(dev);
 
 			mutex_lock(&dpm_list_mtx);
 		}
@@ -757,11 +770,16 @@ static int dpm_prepare(pm_message_t stat
 		dev->power.status = DPM_PREPARING;
 		mutex_unlock(&dpm_list_mtx);
 
-		error = device_prepare(dev, state);
+		if (pm_runtime_disable(dev) && device_may_wakeup(dev))
+			/* Wake-up requested during system sleep transition. */
+			error = -EBUSY;
+		else
+			error = device_prepare(dev, state);
 
 		mutex_lock(&dpm_list_mtx);
 		if (error) {
 			dev->power.status = DPM_ON;
+			pm_runtime_enable(dev);
 			if (error == -EAGAIN) {
 				put_device(dev);
 				error = 0;
Index: linux-2.6/drivers/base/dd.c
===================================================================
--- linux-2.6.orig/drivers/base/dd.c
+++ linux-2.6/drivers/base/dd.c
@@ -23,6 +23,7 @@
 #include <linux/kthread.h>
 #include <linux/wait.h>
 #include <linux/async.h>
+#include <linux/pm_runtime.h>
 
 #include "base.h"
 #include "power/power.h"
@@ -202,7 +203,17 @@ int driver_probe_device(struct device_dr
 	pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
 		 drv->bus->name, __func__, dev_name(dev), drv->name);
 
+	/*
+	 * Wait for run-time PM calls to complete and prevent new suspend calls
+	 * until the probe is done.
+	 */
+	pm_runtime_disable(dev);
+	pm_runtime_get_noresume(dev);
+	pm_runtime_enable(dev);
 	ret = really_probe(dev, drv);
+	pm_runtime_put_noidle(dev);
+	if (!ret)
+		pm_runtime_idle(dev);
 
 	return ret;
 }
@@ -306,6 +317,8 @@ static void __device_release_driver(stru
 
 	drv = dev->driver;
 	if (drv) {
+		pm_runtime_disable(dev);
+
 		driver_sysfs_remove(dev);
 
 		if (dev->bus)
@@ -324,6 +337,8 @@ static void __device_release_driver(stru
 			blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
 						     BUS_NOTIFY_UNBOUND_DRIVER,
 						     dev);
+
+		pm_runtime_enable(dev);
 	}
 }
 
Index: linux-2.6/drivers/base/power/power.h
===================================================================
--- linux-2.6.orig/drivers/base/power/power.h
+++ linux-2.6/drivers/base/power/power.h
@@ -1,7 +1,14 @@
-static inline void device_pm_init(struct device *dev)
-{
-	dev->power.status = DPM_ON;
-}
+#ifdef CONFIG_PM_RUNTIME
+
+extern void pm_runtime_init(struct device *dev);
+extern void pm_runtime_remove(struct device *dev);
+
+#else /* !CONFIG_PM_RUNTIME */
+
+static inline void pm_runtime_init(struct device *dev) {}
+static inline void pm_runtime_remove(struct device *dev) {}
+
+#endif /* !CONFIG_PM_RUNTIME */
 
 #ifdef CONFIG_PM_SLEEP
 
@@ -16,23 +23,33 @@ static inline struct device *to_device(s
 	return container_of(entry, struct device, power.entry);
 }
 
+extern void device_pm_init(struct device *dev);
 extern void device_pm_add(struct device *);
 extern void device_pm_remove(struct device *);
 extern void device_pm_move_before(struct device *, struct device *);
 extern void device_pm_move_after(struct device *, struct device *);
 extern void device_pm_move_last(struct device *);
 
-#else /* CONFIG_PM_SLEEP */
+#else /* !CONFIG_PM_SLEEP */
+
+static inline void device_pm_init(struct device *dev)
+{
+	pm_runtime_init(dev);
+}
+
+static inline void device_pm_remove(struct device *dev)
+{
+	pm_runtime_remove(dev);
+}
 
 static inline void device_pm_add(struct device *dev) {}
-static inline void device_pm_remove(struct device *dev) {}
 static inline void device_pm_move_before(struct device *deva,
 					 struct device *devb) {}
 static inline void device_pm_move_after(struct device *deva,
 					struct device *devb) {}
 static inline void device_pm_move_last(struct device *dev) {}
 
-#endif
+#endif /* !CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM
 
Index: linux-2.6/Documentation/power/runtime_pm.txt
===================================================================
--- /dev/null
+++ linux-2.6/Documentation/power/runtime_pm.txt
@@ -0,0 +1,377 @@
+Run-time Power Management Framework for I/O Devices
+
+(C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+
+1. Introduction
+
+Support for run-time power management (run-time PM) of I/O devices is provided
+at the power management core (PM core) level by means of:
+
+* The power management workqueue pm_wq in which bus types and device drivers can
+  put their PM-related work items.  It is strongly recommended that pm_wq be
+  used for queuing all work items related to run-time PM, because this allows
+  them to be synchronized with system-wide power transitions (suspend to RAM,
+  hibernation and resume from system sleep states).  pm_wq is declared in
+  include/linux/pm_runtime.h and defined in kernel/power/main.c.
+
+* A number of run-time PM fields in the 'power' member of 'struct device' (which
+  is of the type 'struct dev_pm_info', defined in include/linux/pm.h) that can
+  be used for synchronizing run-time PM operations with one another.
+
+* Three device run-time PM callbacks in 'struct dev_pm_ops' (defined in
+  include/linux/pm.h).
+
+* A set of helper functions defined in drivers/base/power/runtime.c that can be
+  used for carrying out run-time PM operations in such a way that the
+  synchronization between them is taken care of by the PM core.  Bus types and
+  device drivers are encouraged to use these functions.
+
+The run-time PM callbacks present in 'struct dev_pm_ops', the device run-time PM
+fields of 'struct dev_pm_info' and the core helper functions provided for
+run-time PM are described below.
+
+2. Device Run-time PM Callbacks
+
+There are three device run-time PM callbacks defined in 'struct dev_pm_ops':
+
+struct dev_pm_ops {
+	...
+	int (*runtime_suspend)(struct device *dev);
+	int (*runtime_resume)(struct device *dev);
+	void (*runtime_idle)(struct device *dev);
+	...
+};
+
+The ->runtime_suspend() callback is executed by the PM core for the bus type of
+the device being suspended.  The bus type's callback is then _entirely_
+_responsible_ for handling the device as appropriate, which may, but need not
+include executing the device driver's own ->runtime_suspend() callback (from the
+PM core's point of view it is not necessary to implement a ->runtime_suspend()
+callback in a device driver as long as the bus type's ->runtime_suspend() knows
+what to do to handle the device).
+
+  * Once the bus type's ->runtime_suspend() callback has completed successfully
+    for given device, the PM core regards the device as suspended, which need
+    not mean that the device has been put into a low power state.  It is
+    supposed to mean, however, that the device will not process data and will
+    not communicate with the CPU(s) and RAM until its bus type's
+    ->runtime_resume() callback is executed for it.  The run-time PM status of
+    a device after successful execution of its bus type's ->runtime_suspend()
+    callback is 'suspended'.
+
+  * If the bus type's ->runtime_suspend() callback returns -EBUSY or -EAGAIN,
+    the device's run-time PM status is supposed to be 'active', which means that
+    the device _must_ be fully operational afterwards.
+
+  * If the bus type's ->runtime_suspend() callback returns an error code
+    different from -EBUSY or -EAGAIN, the PM core regards this as a fatal
+    error and will refuse to run the helper functions described in Section 4
+    for the device, until the status of it is directly set either to 'active'
+    or to 'suspended' (the PM core provides special helper functions for this
+    purpose).
+
+In particular, if the driver requires remote wakeup capability for proper
+functioning and device_may_wakeup() returns 'false' for the device, then
+->runtime_suspend() should return -EBUSY.  On the other hand, if
+device_may_wakeup() returns 'true' for the device and the device is put
+into a low power state during the execution of its bus type's
+->runtime_suspend(), it is expected that remote wake-up (i.e. hardware mechanism
+allowing the device to request a change of its power state, such as PCI PME)
+will be enabled for the device.  Generally, remote wake-up should be enabled
+for all input devices put into a low power state at run time.
+
+The ->runtime_resume() callback is executed by the PM core for the bus type of
+the device being woken up.  The bus type's callback is then _entirely_
+_responsible_ for handling the device as appropriate, which may, but need not
+include executing the device driver's own ->runtime_resume() callback (from the
+PM core's point of view it is not necessary to implement a ->runtime_resume()
+callback in a device driver as long as the bus type's ->runtime_resume() knows
+what to do to handle the device).
+
+  * Once the bus type's ->runtime_resume() callback has completed successfully,
+    the PM core regards the device as fully operational, which means that the
+    device _must_ be able to complete I/O operations as needed.  The run-time
+    PM status of the device is then 'active'.
+
+  * If the bus type's ->runtime_resume() callback returns an error code, the PM
+    core regards this as a fatal error and will refuse to run the helper
+    functions described in Section 4 for the device, until its status is
+    directly set either to 'active' or to 'suspended' (the PM core provides
+    special helper functions for this purpose).
+
+The ->runtime_idle() callback is executed by the PM core for the bus type of
+given device whenever the device appears to be idle, which is indicated to the
+PM core by two counters, the device's usage counter and the counter of 'active'
+children of the device.
+
+  * If any of these counters is decreased using a helper function provided by
+    the PM core and it turns out to be equal to zero, the other counter is
+    checked.  If that counter also is equal to zero, the PM core executes the
+    device bus type's ->runtime_idle() callback (with the device as an
+    argument).
+
+The action performed by a bus type's ->runtime_idle() callback is totally
+dependent on the bus type in question, but the expected and recommended action
+is to check if the device can be suspended (i.e. if all of the conditions
+necessary for suspending the device are satisfied) and to queue up a suspend
+request for the device in that case.
+
+The helper functions provided by the PM core, described in Section 4, guarantee
+that the following constraints are met with respect to the bus type's run-time
+PM callbacks:
+
+(1) The callbacks are mutually exclusive (e.g. it is forbidden to execute
+    ->runtime_suspend() in parallel with ->runtime_resume() or with another
+    instance of ->runtime_suspend() for the same device) with the exception that
+    ->runtime_suspend() or ->runtime_resume() can be executed in parallel with
+    ->runtime_idle() (although ->runtime_idle() will not be started while any
+    of the other callbacks is being executed for the same device).
+
+(2) ->runtime_idle() and ->runtime_suspend() can only be executed for 'active'
+    devices (i.e. the PM core will only execute ->runtime_idle() or
+    ->runtime_suspend() for the devices the run-time PM status of which is
+    'active').
+
+(3) ->runtime_idle() and ->runtime_suspend() can only be executed for a device
+    the usage counter of which is equal to zero _and_ either the counter of
+    'active' children of which is equal to zero, or the 'power.ignore_children'
+    flag of which is set.
+
+(4) ->runtime_resume() can only be executed for 'suspended' devices  (i.e. the
+    PM core will only execute ->runtime_resume() for the devices the run-time
+    PM status of which is 'suspended').
+
+Additionally, the helper functions provided by the PM core obey the following
+rules:
+
+  * If ->runtime_suspend() is about to be executed or there's a pending request
+    to execute it, ->runtime_idle() will not be executed for the same device.
+
+  * A request to execute or to schedule the execution of ->runtime_suspend()
+    will cancel any pending requests to execute ->runtime_idle() for the same
+    device.
+
+  * If ->runtime_resume() is about to be executed or there's a pending request
+    to execute it, the other callbacks will not be executed for the same device.
+
+  * A request to execute ->runtime_resume() will cancel any pending or
+    scheduled requests to execute the other callbacks for the same device.
+
+3. Run-time PM Device Fields
+
+The following device run-time PM fields are present in 'struct dev_pm_info', as
+defined in include/linux/pm.h:
+
+  struct timer_list suspend_timer;
+    - timer used for scheduling (delayed) suspend request
+
+  unsigned long timer_expires;
+    - timer expiration time, in jiffies (if this is different from zero, the
+      timer is running and will expire at that time, otherwise the timer is not
+      running)
+
+  struct work_struct work;
+    - work structure used for queuing up requests (i.e. work items in pm_wq)
+
+  wait_queue_head_t wait_queue;
+    - wait queue used if any of the helper functions needs to wait for another
+      one to complete
+
+  spinlock_t lock;
+    - lock used for synchronisation
+
+  atomic_t usage_count;
+    - the usage counter of the device
+
+  atomic_t child_count;
+    - the count of 'active' children of the device
+
+  unsigned int ignore_children;
+    - if set, the value of child_count is ignored (but still updated)
+
+  unsigned int disable_depth;
+    - used for disabling the helper funcions (they work normally if this is
+      equal to zero); the initial value of it is 1 (i.e. run-time PM is
+      initially disabled for all devices)
+
+  unsigned int runtime_error;
+    - if set, there was a fatal error (one of the callbacks returned error code
+      as described in Section 2), so the helper funtions will not work until
+      this flag is cleared; this is the error code returned by the failing
+      callback
+
+  unsigned int idle_notification;
+    - if set, ->runtime_idle() is being executed
+
+  unsigned int request_pending;
+    - if set, there's a pending request (i.e. a work item queued up into pm_wq)
+
+  enum rpm_request request;
+    - type of request that's pending (valid if request_pending is set)
+
+  unsigned int deferred_resume;
+    - set if ->runtime_resume() is about to be run while ->runtime_suspend() is
+      being executed for that device and it is not practical to wait for the
+      suspend to complete; means "start a resume as soon as you've suspended"
+
+  enum rpm_status runtime_status;
+    - the run-time PM status of the device; this field's initial value is
+      RPM_SUSPENDED, which means that each device is initially regarded by the
+      PM core as 'suspended', regardless of its real hardware status
+
+All of the above fields are members of the 'power' member of 'struct device'.
+
+4. Run-time PM Device Helper Functions
+
+The following run-time PM helper functions are defined in
+drivers/base/power/runtime.c and include/linux/pm_runtime.h:
+
+  void pm_runtime_init(struct device *dev);
+    - initialize the device run-time PM fields in 'struct dev_pm_info'
+
+  void pm_runtime_remove(struct device *dev);
+    - make sure that the run-time PM of the device will be disabled after
+      removing the device from device hierarchy
+
+  int pm_runtime_idle(struct device *dev);
+    - execute ->runtime_idle() for the device's bus type; returns 0 on success
+      or error code on failure, where -EINPROGRESS means that ->runtime_idle()
+      is already being executed
+
+  int pm_runtime_suspend(struct device *dev);
+    - execute ->runtime_suspend() for the device's bus type; returns 0 on
+      success, 1 if the device's run-time PM status was already 'suspended', or
+      error code on failure, where -EAGAIN or -EBUSY means it is safe to attempt
+      to suspend the device again in future
+
+  int pm_runtime_resume(struct device *dev);
+    - execute ->runtime_resume() for the device's bus type; returns 0 on
+      success, 1 if the device's run-time PM status was already 'active' or
+      error code on failure, where -EAGAIN means it may be safe to attempt to
+      resume the device again in future, but 'power.runtime_error' should be
+      checked additionally
+
+  int pm_request_idle(struct device *dev);
+    - submit a request to execute ->runtime_idle() for the device's bus type
+      (the request is represented by a work item in pm_wq); returns 0 on success
+      or error code if the request has not been queued up
+
+  int pm_schedule_suspend(struct device *dev, unsigned int delay);
+    - schedule the execution of ->runtime_suspend() for the device's bus type
+      in future, where 'delay' is the time to wait before queuing up a suspend
+      work item in pm_wq, in milliseconds (if 'delay' is zero, the work item is
+      queued up immediately); returns 0 on success, 1 if the device's PM
+      run-time status was already 'suspended', or error code if the request
+      hasn't been scheduled (or queued up if 'delay' is 0); if the execution of
+      ->runtime_suspend() is already scheduled and not yet expired, the new
+      value of 'delay' will be used as the time to wait
+
+  int pm_request_resume(struct device *dev);
+    - submit a request to execute ->runtime_resume() for the device's bus type
+      (the request is represented by a work item in pm_wq); returns 0 on
+      success, 1 if the device's run-time PM status was already 'active', or
+      error code if the request hasn't been queued up
+
+  void pm_runtime_get_noresume(struct device *dev);
+    - increment the device's usage counter
+
+  int pm_runtime_get(struct device *dev);
+    - increment the device's usage counter, run pm_request_resume(dev) and
+      return its result
+
+  int pm_runtime_get_sync(struct device *dev);
+    - increment the device's usage counter, run pm_runtime_resume(dev) and
+      return its result
+
+  void pm_runtime_put_noidle(struct device *dev);
+    - decrement the device's usage counter
+
+  int pm_runtime_put(struct device *dev);
+    - decrement the device's usage counter, run pm_request_idle(dev) and return
+      its result
+
+  int pm_runtime_put_sync(struct device *dev);
+    - decrement the device's usage counter, run pm_runtime_idle(dev) and return
+      its result
+
+  void pm_runtime_enable(struct device *dev);
+    - enable the run-time PM helper functions to run the device bus type's
+      run-time PM callbacks described in Section 2
+
+  int pm_runtime_disable(struct device *dev);
+    - prevent the run-time PM helper functions from running the device bus
+      type's run-time PM callbacks, make sure that all of the pending run-time
+      PM operations on the device are either completed or canceled; returns
+      1 if there was a resume request pending and it was necessary to execute
+      ->runtime_resume() for the device's bus type to satisfy that request,
+      otherwise 0 is returned
+
+  void pm_suspend_ignore_children(struct device *dev, bool enable);
+    - set/unset the power.ignore_children flag of the device
+
+  int pm_runtime_set_active(struct device *dev);
+    - clear the device's 'power.runtime_error' flag, set the device's run-time
+      PM status to 'active' and update its parent's counter of 'active'
+      children as appropriate (it is only valid to use this function if
+      'power.runtime_error' is set or 'power.disable_depth' is greater than
+      zero); it will fail and return error code if the device has a parent
+      which is not active and the 'power.ignore_children' flag of which is unset
+
+  void pm_runtime_set_suspended(struct device *dev);
+    - clear the device's 'power.runtime_error' flag, set the device's run-time
+      PM status to 'suspended' and update its parent's counter of 'active'
+      children as appropriate (it is only valid to use this function if
+      'power.runtime_error' is set or 'power.disable_depth' is greater than
+      zero)
+
+It is safe to execute the following helper functions from interrupt context:
+
+pm_request_idle()
+pm_schedule_suspend()
+pm_request_resume()
+pm_runtime_get_noresume()
+pm_runtime_get()
+pm_runtime_put_noidle()
+pm_runtime_put()
+pm_suspend_ignore_children()
+pm_runtime_set_active()
+pm_runtime_set_suspended()
+pm_runtime_enable()
+
+5. Run-time PM Initialization
+
+Initially, the run-time PM is disabled for all devices, which means that the
+majority of the run-time PM helper funtions described in Section 4 will return
+-EAGAIN until pm_runtime_enable() is called for the device.
+
+In addition to that, the initial run-time PM status of all devices is
+'suspended', but it need not reflect the actual physical state of the device.
+Thus, if the device is initially active (i.e. it is able to process I/O), its
+run-time PM status must be changed to 'active', with the help of
+pm_runtime_set_active(), before pm_runtime_enable() is called for the device.
+
+However, if the device has a parent and the parent's run-time PM is enabled,
+calling pm_runtime_set_active() for the device will affect the parent, unless
+the parent's 'power.ignore_children' flag is set.  Namely, in that case the
+parent won't be able to suspend at run time, using the PM core's helper
+functions, as long as the child's status is 'active', even if the child's
+run-time PM is still disabled (i.e. pm_runtime_enable() hasn't been called for
+the child yet or pm_runtime_disable() has been called for it).  For this reason,
+once pm_runtime_set_active() has been called for the device, pm_runtime_enable()
+should be called for it too as soon as reasonably possible or its run-time PM
+status should be changed back to 'suspended' with the help of
+pm_runtime_set_suspended().
+
+If the default initial run-time PM status of the device (i.e. 'suspended')
+reflects the actual state of the device, its bus type's or its driver's
+->probe() callback will likely need to wake it up using one of the PM core's
+helper functions described in Section 4.  In that case, pm_runtime_resume()
+should be used.  Of course, for this purpose the device's run-time PM has to be
+enabled earlier by calling pm_runtime_enable().
+
+If ->probe() calls pm_runtime_suspend() or pm_runtime_idle(), or their
+asynchronous counterparts, they will fail returning -EAGAIN, because the
+device's usage counter is incremented by the core before executing ->probe().
+Still, it may be desirable to suspend the device as soon as ->probe() has
+finished, so the core uses pm_runtime_idle() to invoke the device bus type's
+->runtime_idle() callback at that time, but only if ->probe() is successful.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
  2009-08-06 17:01           ` Alan Stern
                               ` (2 preceding siblings ...)
  2009-08-06 21:53             ` [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13) Rafael J. Wysocki
@ 2009-08-06 21:53             ` Rafael J. Wysocki
  3 siblings, 0 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2009-08-06 21:53 UTC (permalink / raw)
  To: Alan Stern; +Cc: Greg KH, LKML, Linux-pm mailing list

Hi,

The patch below should address all of your most recent comments.

Best,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM: Introduce core framework for run-time PM of I/O devices (rev. 13)

Introduce a core framework for run-time power management of I/O
devices.  Add device run-time PM fields to 'struct dev_pm_info'
and device run-time PM callbacks to 'struct dev_pm_ops'.  Introduce
a run-time PM workqueue and define some device run-time PM helper
functions at the core level.  Document all these things.

Special thanks to Alan Stern for his help with the design and
multiple detailed reviews of the pereceding versions of this patch
and to Magnus Damm for testing feedback.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 Documentation/power/runtime_pm.txt |  377 ++++++++++++++
 drivers/base/dd.c                  |   15 
 drivers/base/power/Makefile        |    1 
 drivers/base/power/main.c          |   20 
 drivers/base/power/power.h         |   31 -
 drivers/base/power/runtime.c       |  944 +++++++++++++++++++++++++++++++++++++
 include/linux/pm.h                 |  101 +++
 include/linux/pm_runtime.h         |  115 ++++
 kernel/power/Kconfig               |   14 
 kernel/power/main.c                |   17 
 10 files changed, 1624 insertions(+), 11 deletions(-)

Index: linux-2.6/kernel/power/Kconfig
===================================================================
--- linux-2.6.orig/kernel/power/Kconfig
+++ linux-2.6/kernel/power/Kconfig
@@ -208,3 +208,17 @@ config APM_EMULATION
 	  random kernel OOPSes or reboots that don't seem to be related to
 	  anything, try disabling/enabling this option (or disabling/enabling
 	  APM in your BIOS).
+
+config PM_RUNTIME
+	bool "Run-time PM core functionality"
+	depends on PM
+	---help---
+	  Enable functionality allowing I/O devices to be put into energy-saving
+	  (low power) states at run time (or autosuspended) after a specified
+	  period of inactivity and woken up in response to a hardware-generated
+	  wake-up event or a driver's request.
+
+	  Hardware support is generally required for this functionality to work
+	  and the bus type drivers of the buses the devices are on are
+	  responsible for the actual handling of the autosuspend requests and
+	  wake-up events.
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -11,6 +11,7 @@
 #include <linux/kobject.h>
 #include <linux/string.h>
 #include <linux/resume-trace.h>
+#include <linux/workqueue.h>
 
 #include "power.h"
 
@@ -217,8 +218,24 @@ static struct attribute_group attr_group
 	.attrs = g,
 };
 
+#ifdef CONFIG_PM_RUNTIME
+struct workqueue_struct *pm_wq;
+
+static int __init pm_start_workqueue(void)
+{
+	pm_wq = create_freezeable_workqueue("pm");
+
+	return pm_wq ? 0 : -ENOMEM;
+}
+#else
+static inline int pm_start_workqueue(void) { return 0; }
+#endif
+
 static int __init pm_init(void)
 {
+	int error = pm_start_workqueue();
+	if (error)
+		return error;
 	power_kobj = kobject_create_and_add("power", NULL);
 	if (!power_kobj)
 		return -ENOMEM;
Index: linux-2.6/include/linux/pm.h
===================================================================
--- linux-2.6.orig/include/linux/pm.h
+++ linux-2.6/include/linux/pm.h
@@ -22,6 +22,10 @@
 #define _LINUX_PM_H
 
 #include <linux/list.h>
+#include <linux/workqueue.h>
+#include <linux/spinlock.h>
+#include <linux/wait.h>
+#include <linux/timer.h>
 
 /*
  * Callbacks for platform drivers to implement.
@@ -165,6 +169,28 @@ typedef struct pm_message {
  * It is allowed to unregister devices while the above callbacks are being
  * executed.  However, it is not allowed to unregister a device from within any
  * of its own callbacks.
+ *
+ * There also are the following callbacks related to run-time power management
+ * of devices:
+ *
+ * @runtime_suspend: Prepare the device for a condition in which it won't be
+ *	able to communicate with the CPU(s) and RAM due to power management.
+ *	This need not mean that the device should be put into a low power state.
+ *	For example, if the device is behind a link which is about to be turned
+ *	off, the device may remain at full power.  If the device does go to low
+ *	power and if device_may_wakeup(dev) is true, remote wake-up (i.e., a
+ *	hardware mechanism allowing the device to request a change of its power
+ *	state, such as PCI PME) should be enabled for it.
+ *
+ * @runtime_resume: Put the device into the fully active state in response to a
+ *	wake-up event generated by hardware or at the request of software.  If
+ *	necessary, put the device into the full power state and restore its
+ *	registers, so that it is fully operational.
+ *
+ * @runtime_idle: Device appears to be inactive and it might be put into a low
+ *	power state if all of the necessary conditions are satisfied.  Check
+ *	these conditions and handle the device as appropriate, possibly queueing
+ *	a suspend request for it.
  */
 
 struct dev_pm_ops {
@@ -182,6 +208,9 @@ struct dev_pm_ops {
 	int (*thaw_noirq)(struct device *dev);
 	int (*poweroff_noirq)(struct device *dev);
 	int (*restore_noirq)(struct device *dev);
+	int (*runtime_suspend)(struct device *dev);
+	int (*runtime_resume)(struct device *dev);
+	void (*runtime_idle)(struct device *dev);
 };
 
 /*
@@ -329,14 +358,80 @@ enum dpm_state {
 	DPM_OFF_IRQ,
 };
 
+/**
+ * Device run-time power management status.
+ *
+ * These status labels are used internally by the PM core to indicate the
+ * current status of a device with respect to the PM core operations.  They do
+ * not reflect the actual power state of the device or its status as seen by the
+ * driver.
+ *
+ * RPM_ACTIVE		Device is fully operational.  Indicates that the device
+ *			bus type's ->runtime_resume() callback has completed
+ *			successfully.
+ *
+ * RPM_SUSPENDED	Device bus type's ->runtime_suspend() callback has
+ *			completed successfully.  The device is regarded as
+ *			suspended.
+ *
+ * RPM_RESUMING		Device bus type's ->runtime_resume() callback is being
+ *			executed.
+ *
+ * RPM_SUSPENDING	Device bus type's ->runtime_suspend() callback is being
+ *			executed.
+ */
+
+enum rpm_status {
+	RPM_ACTIVE = 0,
+	RPM_RESUMING,
+	RPM_SUSPENDED,
+	RPM_SUSPENDING,
+};
+
+/**
+ * Device run-time power management request types.
+ *
+ * RPM_REQ_NONE		Do nothing.
+ *
+ * RPM_REQ_IDLE		Run the device bus type's ->runtime_idle() callback
+ *
+ * RPM_REQ_SUSPEND	Run the device bus type's ->runtime_suspend() callback
+ *
+ * RPM_REQ_RESUME	Run the device bus type's ->runtime_resume() callback
+ */
+
+enum rpm_request {
+	RPM_REQ_NONE = 0,
+	RPM_REQ_IDLE,
+	RPM_REQ_SUSPEND,
+	RPM_REQ_RESUME,
+};
+
 struct dev_pm_info {
 	pm_message_t		power_state;
-	unsigned		can_wakeup:1;
-	unsigned		should_wakeup:1;
+	unsigned int		can_wakeup:1;
+	unsigned int		should_wakeup:1;
 	enum dpm_state		status;		/* Owned by the PM core */
-#ifdef	CONFIG_PM_SLEEP
+#ifdef CONFIG_PM_SLEEP
 	struct list_head	entry;
 #endif
+#ifdef CONFIG_PM_RUNTIME
+	struct timer_list	suspend_timer;
+	unsigned long		timer_expires;
+	struct work_struct	work;
+	wait_queue_head_t	wait_queue;
+	spinlock_t		lock;
+	atomic_t		usage_count;
+	atomic_t		child_count;
+	unsigned int		disable_depth:3;
+	unsigned int		ignore_children:1;
+	unsigned int		idle_notification:1;
+	unsigned int		request_pending:1;
+	unsigned int		deferred_resume:1;
+	enum rpm_request	request;
+	enum rpm_status		runtime_status;
+	int			runtime_error;
+#endif
 };
 
 /*
Index: linux-2.6/drivers/base/power/Makefile
===================================================================
--- linux-2.6.orig/drivers/base/power/Makefile
+++ linux-2.6/drivers/base/power/Makefile
@@ -1,5 +1,6 @@
 obj-$(CONFIG_PM)	+= sysfs.o
 obj-$(CONFIG_PM_SLEEP)	+= main.o
+obj-$(CONFIG_PM_RUNTIME)	+= runtime.o
 obj-$(CONFIG_PM_TRACE_RTC)	+= trace.o
 
 ccflags-$(CONFIG_DEBUG_DRIVER) := -DDEBUG
Index: linux-2.6/drivers/base/power/runtime.c
===================================================================
--- /dev/null
+++ linux-2.6/drivers/base/power/runtime.c
@@ -0,0 +1,944 @@
+/*
+ * drivers/base/power/runtime.c - Helper functions for device run-time PM
+ *
+ * Copyright (c) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+ *
+ * This file is released under the GPLv2.
+ */
+
+#include <linux/sched.h>
+#include <linux/pm_runtime.h>
+#include <linux/jiffies.h>
+
+static int __pm_runtime_resume(struct device *dev, bool from_wq);
+static int __pm_request_idle(struct device *dev);
+static int __pm_request_resume(struct device *dev);
+
+/**
+ * pm_runtime_deactivate_timer - Deactivate given device's suspend timer.
+ * @dev: Device to handle.
+ */
+static void pm_runtime_deactivate_timer(struct device *dev)
+{
+	if (dev->power.timer_expires > 0) {
+		del_timer(&dev->power.suspend_timer);
+		dev->power.timer_expires = 0;
+	}
+}
+
+/**
+ * pm_runtime_cancel_pending - Deactivate suspend timer and cancel requests.
+ * @dev: Device to handle.
+ */
+static void pm_runtime_cancel_pending(struct device *dev)
+{
+	pm_runtime_deactivate_timer(dev);
+	/*
+	 * In case there's a request pending, make sure its work function will
+	 * return without doing anything.
+	 */
+	dev->power.request = RPM_REQ_NONE;
+}
+
+/**
+ * __pm_runtime_idle - Notify device bus type if the device can be suspended.
+ * @dev: Device to notify the bus type about.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_runtime_idle(struct device *dev)
+	__releases(&dev->power.lock) __acquires(&dev->power.lock)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_error)
+		retval = -EINVAL;
+	else if (dev->power.idle_notification)
+		retval = -EINPROGRESS;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0
+	    || dev->power.runtime_status != RPM_ACTIVE)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		return retval;
+
+	if (dev->power.request_pending) {
+		/*
+		 * If an idle notification request is pending, cancel it.  Any
+		 * other pending request takes precedence over us.
+		 */
+		if (dev->power.request == RPM_REQ_IDLE)
+			dev->power.request = RPM_REQ_NONE;
+		else if (dev->power.request != RPM_REQ_NONE)
+			return -EAGAIN;
+	}
+
+	dev->power.idle_notification = true;
+
+	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle) {
+		spin_unlock_irq(&dev->power.lock);
+
+		dev->bus->pm->runtime_idle(dev);
+
+		spin_lock_irq(&dev->power.lock);
+	}
+
+	dev->power.idle_notification = false;
+	wake_up_all(&dev->power.wait_queue);
+
+	return 0;
+}
+
+/**
+ * pm_runtime_idle - Notify device bus type if the device can be suspended.
+ * @dev: Device to notify the bus type about.
+ */
+int pm_runtime_idle(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_idle(dev);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_idle);
+
+/**
+ * __pm_runtime_suspend - Carry out run-time suspend of given device.
+ * @dev: Device to suspend.
+ * @from_wq: If set, the function has been called via pm_wq.
+ *
+ * Check if the device can be suspended and run the ->runtime_suspend() callback
+ * provided by its bus type.  If another suspend has been started earlier, wait
+ * for it to finish.  If an idle notification or suspend request is pending or
+ * scheduled, cancel it.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+int __pm_runtime_suspend(struct device *dev, bool from_wq)
+	__releases(&dev->power.lock) __acquires(&dev->power.lock)
+{
+	struct device *parent = NULL;
+	bool notify = false;
+	int retval = 0;
+
+ repeat:
+	if (dev->power.runtime_error)
+		return -EINVAL;
+
+	/* Pending resume requests take precedence over us. */
+	if (dev->power.request_pending && dev->power.request == RPM_REQ_RESUME)
+			return -EAGAIN;
+
+	/* Other scheduled or pending requests need to be canceled. */
+	pm_runtime_cancel_pending(dev);
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_RESUMING
+	    || dev->power.disable_depth > 0
+	    || atomic_read(&dev->power.usage_count) > 0)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		return retval;
+
+	if (dev->power.runtime_status == RPM_SUSPENDING) {
+		DEFINE_WAIT(wait);
+
+		if (from_wq)
+			return -EINPROGRESS;
+
+		/* Wait for the other suspend running in parallel with us. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_SUSPENDING)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+		goto repeat;
+	}
+
+	dev->power.runtime_status = RPM_SUSPENDING;
+
+	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_suspend) {
+		spin_unlock_irq(&dev->power.lock);
+
+		retval = dev->bus->pm->runtime_suspend(dev);
+
+		spin_lock_irq(&dev->power.lock);
+	} else {
+		retval = -ENOSYS;
+	}
+
+	if (retval) {
+		dev->power.runtime_status = RPM_ACTIVE;
+		pm_runtime_cancel_pending(dev);
+		dev->power.deferred_resume = false;
+
+		if (retval == -EAGAIN || retval == -EBUSY)
+			notify = true;
+		else
+			dev->power.runtime_error = retval;
+	} else {
+		dev->power.runtime_status = RPM_SUSPENDED;
+
+		if (dev->parent) {
+			parent = dev->parent;
+			atomic_add_unless(&parent->power.child_count, -1, 0);
+		}
+	}
+	wake_up_all(&dev->power.wait_queue);
+
+	if (dev->power.deferred_resume) {
+		dev->power.deferred_resume = false;
+		__pm_runtime_resume(dev, false);
+		return -EAGAIN;
+	}
+
+	if (notify)
+		__pm_runtime_idle(dev);
+
+	if (parent && !parent->power.ignore_children) {
+		spin_unlock_irq(&dev->power.lock);
+
+		pm_request_idle(parent);
+
+		spin_lock_irq(&dev->power.lock);
+	}
+
+	return retval;
+}
+
+/**
+ * pm_runtime_suspend - Carry out run-time suspend of given device.
+ * @dev: Device to suspend.
+ */
+int pm_runtime_suspend(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_suspend(dev, false);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_suspend);
+
+/**
+ * __pm_runtime_resume - Carry out run-time resume of given device.
+ * @dev: Device to resume.
+ * @from_wq: If set, the function has been called via pm_wq.
+ *
+ * Check if the device can be woken up and run the ->runtime_resume() callback
+ * provided by its bus type.  If another resume has been started earlier, wait
+ * for it to finish.  If there's a suspend running in parallel with this
+ * function, wait for it to finish and resume the device.  Cancel any scheduled
+ * or pending requests.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+int __pm_runtime_resume(struct device *dev, bool from_wq)
+	__releases(&dev->power.lock) __acquires(&dev->power.lock)
+{
+	struct device *parent = NULL;
+	int retval = 0;
+
+ repeat:
+	if (dev->power.runtime_error) {
+		retval = -EINVAL;
+		goto out;
+	}
+
+	pm_runtime_cancel_pending(dev);
+
+	if (dev->power.runtime_status == RPM_ACTIVE)
+		retval = 1;
+	else if (dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	if (retval)
+		goto out;
+
+	if (dev->power.runtime_status == RPM_RESUMING
+	    || dev->power.runtime_status == RPM_SUSPENDING) {
+		DEFINE_WAIT(wait);
+
+		if (from_wq) {
+			if (dev->power.runtime_status == RPM_SUSPENDING)
+				dev->power.deferred_resume = true;
+			retval = -EINPROGRESS;
+			goto out;
+		}
+
+		/* Wait for the operation carried out in parallel with us. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_RESUMING
+			    && dev->power.runtime_status != RPM_SUSPENDING)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+		goto repeat;
+	}
+
+	if (!parent && dev->parent) {
+		/*
+		 * Increment the parent's resume counter and resume it if
+		 * necessary.
+		 */
+		parent = dev->parent;
+		spin_unlock_irq(&dev->power.lock);
+
+		pm_runtime_get_noresume(parent);
+
+		spin_lock_irq(&parent->power.lock);
+		/*
+		 * We can resume if the parent's run-time PM is disabled or it
+		 * is set to ignore children.
+		 */
+		if (!parent->power.disable_depth
+		    && !parent->power.ignore_children) {
+			__pm_runtime_resume(parent, false);
+			if (parent->power.runtime_status != RPM_ACTIVE)
+				retval = -EBUSY;
+		}
+		spin_unlock_irq(&parent->power.lock);
+
+		spin_lock_irq(&dev->power.lock);
+		if (retval)
+			goto out;
+		goto repeat;
+	}
+
+	dev->power.runtime_status = RPM_RESUMING;
+
+	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_resume) {
+		spin_unlock_irq(&dev->power.lock);
+
+		retval = dev->bus->pm->runtime_resume(dev);
+
+		spin_lock_irq(&dev->power.lock);
+	} else {
+		retval = -ENOSYS;
+	}
+
+	if (retval) {
+		dev->power.runtime_status = RPM_SUSPENDED;
+		dev->power.runtime_error = retval;
+
+		pm_runtime_cancel_pending(dev);
+	} else {
+		dev->power.runtime_status = RPM_ACTIVE;
+
+		if (parent)
+			atomic_inc(&parent->power.child_count);
+	}
+	wake_up_all(&dev->power.wait_queue);
+
+	if (!retval)
+		__pm_request_idle(dev);
+
+ out:
+	if (parent) {
+		spin_unlock_irq(&dev->power.lock);
+
+		pm_runtime_put(parent);
+
+		spin_lock_irq(&dev->power.lock);
+	}
+
+	return retval;
+}
+
+/**
+ * pm_runtime_resume - Carry out run-time resume of given device.
+ * @dev: Device to suspend.
+ */
+int pm_runtime_resume(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_resume(dev, false);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_resume);
+
+/**
+ * pm_runtime_work - Universal run-time PM work function.
+ * @work: Work structure used for scheduling the execution of this function.
+ *
+ * Use @work to get the device object the work is to be done for, determine what
+ * is to be done and execute the appropriate run-time PM function.
+ */
+static void pm_runtime_work(struct work_struct *work)
+{
+	struct device *dev = container_of(work, struct device, power.work);
+	enum rpm_request req;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (!dev->power.request_pending)
+		goto out;
+
+	req = dev->power.request;
+	dev->power.request = RPM_REQ_NONE;
+	dev->power.request_pending = false;
+
+	switch (req) {
+	case RPM_REQ_NONE:
+		break;
+	case RPM_REQ_IDLE:
+		__pm_runtime_idle(dev);
+		break;
+	case RPM_REQ_SUSPEND:
+		__pm_runtime_suspend(dev, true);
+		break;
+	case RPM_REQ_RESUME:
+		__pm_runtime_resume(dev, true);
+		break;
+	}
+
+ out:
+	spin_unlock_irq(&dev->power.lock);
+}
+
+/**
+ * __pm_request_idle - Submit an idle notification request for given device.
+ * @dev: Device to handle.
+ *
+ * Check if the device's run-time PM status is correct for suspending the device
+ * and queue up a request to run __pm_runtime_idle() for it.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_request_idle(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_error)
+		retval = -EINVAL;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0
+	    || dev->power.runtime_status == RPM_SUSPENDED
+	    || dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		return retval;
+
+	if (dev->power.request_pending) {
+		/* Any requests other then RPM_REQ_IDLE take precedence. */
+		if (dev->power.request != RPM_REQ_NONE)
+			dev->power.request = RPM_REQ_IDLE;
+		else if (dev->power.request != RPM_REQ_IDLE)
+			retval = -EAGAIN;
+		return retval;
+	}
+
+	dev->power.request = RPM_REQ_IDLE;
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+	return retval;
+}
+
+/**
+ * pm_request_idle - Submit an idle notification request for given device.
+ * @dev: Device to handle.
+ */
+int pm_request_idle(struct device *dev)
+{
+	unsigned long flags;
+	int retval;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+	retval = __pm_request_idle(dev);
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_request_idle);
+
+/**
+ * __pm_request_suspend - Submit a suspend request for given device.
+ * @dev: Device to suspend.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_request_suspend(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_error)
+		return -EINVAL;
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	else if (dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EINPROGRESS;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval < 0)
+		return retval;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/*
+		 * Pending resume requests take precedence over us, but we can
+		 * overtake any other pending request.
+		 */
+		if (dev->power.request == RPM_REQ_RESUME)
+			retval = -EAGAIN;
+		else if (dev->power.request != RPM_REQ_SUSPEND)
+			dev->power.request = retval ?
+						RPM_REQ_NONE : RPM_REQ_SUSPEND;
+		return retval;
+	} else if (retval) {
+		return retval;
+	}
+
+	dev->power.request = RPM_REQ_SUSPEND;
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+	return 0;
+}
+
+/**
+ * pm_suspend_timer_fn - Timer function for pm_schedule_suspend().
+ * @data: Device pointer passed by pm_schedule_suspend().
+ *
+ * Check if the time is right and execute __pm_request_suspend() in that case.
+ */
+static void pm_suspend_timer_fn(unsigned long data)
+{
+	struct device *dev = (struct device *)data;
+	unsigned long flags;
+	unsigned long expires;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	expires = dev->power.timer_expires;
+	/* If 'expire' is after 'jiffies' we've been called too early. */
+	if (expires > 0 && !time_after(expires, jiffies)) {
+		dev->power.timer_expires = 0;
+		__pm_request_suspend(dev);
+	}
+
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+}
+
+/**
+ * pm_schedule_suspend - Set up a timer to submit a suspend request in future.
+ * @dev: Device to suspend.
+ * @delay: Time to wait before submitting a suspend request, in milliseconds.
+ */
+int pm_schedule_suspend(struct device *dev, unsigned int delay)
+{
+	unsigned long flags;
+	int retval = 0;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (dev->power.runtime_error) {
+		retval = -EINVAL;
+		goto out;
+	}
+
+	if (!delay) {
+		retval = __pm_request_suspend(dev);
+		goto out;
+	}
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/*
+		 * Pending resume requests take precedence over us, but any
+		 * other pending requests have to be canceled.
+		 */
+		if (dev->power.request == RPM_REQ_RESUME) {
+			retval = -EAGAIN;
+			goto out;
+		}
+		dev->power.request = RPM_REQ_NONE;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EINPROGRESS;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		goto out;
+
+	dev->power.timer_expires = jiffies + msecs_to_jiffies(delay);
+	mod_timer(&dev->power.suspend_timer, dev->power.timer_expires);
+
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_schedule_suspend);
+
+/**
+ * pm_request_resume - Submit a resume request for given device.
+ * @dev: Device to resume.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_request_resume(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_error)
+		return -EINVAL;
+
+	if (dev->power.runtime_status == RPM_ACTIVE)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_RESUMING)
+		retval = -EINPROGRESS;
+	else if (dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	if (retval < 0)
+		return retval;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/* If non-resume request is pending, we can overtake it. */
+		dev->power.request = retval ? RPM_REQ_NONE : RPM_REQ_RESUME;
+		return retval;
+	} else if (retval) {
+		return retval;
+	}
+
+	dev->power.request = RPM_REQ_RESUME;
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+	return retval;
+}
+
+/**
+ * pm_request_resume - Submit a resume request for given device.
+ * @dev: Device to resume.
+ */
+int pm_request_resume(struct device *dev)
+{
+	unsigned long flags;
+	int retval;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+	retval = __pm_request_resume(dev);
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_request_resume);
+
+/**
+ * __pm_runtime_get - Reference count a device and wake it up, if necessary.
+ * @dev: Device to handle.
+ * @sync: If set and the device is suspended, resume it synchronously.
+ *
+ * Increment the usage count of the device and if it was zero previously,
+ * resume it or submit a resume request for it, depending on the value of @sync.
+ */
+int __pm_runtime_get(struct device *dev, bool sync)
+{
+	int retval = 1;
+
+	if (atomic_add_return(1, &dev->power.usage_count) == 1)
+		retval = sync ? pm_runtime_resume(dev) : pm_request_resume(dev);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(__pm_runtime_get);
+
+/**
+ * __pm_runtime_put - Decrement the device's usage counter and notify its bus.
+ * @dev: Device to handle.
+ * @sync: If the device's bus type is to be notified, do that synchronously.
+ *
+ * Decrement the usage count of the device and if it reaches zero, carry out a
+ * synchronous idle notification or submit an idle notification request for it,
+ * depending on the value of @sync.
+ */
+int __pm_runtime_put(struct device *dev, bool sync)
+{
+	int retval = 0;
+
+	if (atomic_dec_and_test(&dev->power.usage_count))
+		retval = sync ? pm_runtime_idle(dev) : pm_request_idle(dev);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(__pm_runtime_put);
+
+/**
+ * __pm_runtime_set_status - Set run-time PM status of a device.
+ * @dev: Device to handle.
+ * @status: New run-time PM status of the device.
+ *
+ * If run-time PM of the device is disabled or its power.runtime_error field is
+ * different from zero, the status may be changed either to RPM_ACTIVE, or to
+ * RPM_SUSPENDED, as long as that reflects the actual state of the device.
+ * However, if the device has a parent and the parent is not active, and the
+ * parent's power.ignore_children flag is unset, the device's status cannot be
+ * set to RPM_ACTIVE, so -EBUSY is returned in that case.
+ *
+ * If successful, __pm_runtime_set_status() clears the power.runtime_error field
+ * and the device parent's counter of unsuspended children is modified to
+ * reflect the new status.  If the new status is RPM_SUSPENDED, an idle
+ * notification request for the parent is submitted.
+ */
+int __pm_runtime_set_status(struct device *dev, unsigned int status)
+{
+	struct device *parent = dev->parent;
+	unsigned long flags;
+	bool notify_parent = false;
+	int error = 0;
+
+	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
+		return -EINVAL;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (!dev->power.runtime_error && !dev->power.disable_depth) {
+		error = -EAGAIN;
+		goto out;
+	}
+
+	if (dev->power.runtime_status == status)
+		goto out_set;
+
+	if (status == RPM_SUSPENDED) {
+		/* It always is possible to set the status to 'suspended'. */
+		if (parent) {
+			atomic_add_unless(&parent->power.child_count, -1, 0);
+			notify_parent = !parent->power.ignore_children;
+		}
+		goto out_set;
+	}
+
+	if (parent) {
+		spin_lock_irq(&parent->power.lock);
+
+		/*
+		 * It is invalid to put an active child under a parent that is
+		 * not active, has run-time PM enabled and the
+		 * 'power.ignore_children' flag unset.
+		 */
+		if (!parent->power.disable_depth
+		    && !parent->power.ignore_children
+		    && parent->power.runtime_status != RPM_ACTIVE) {
+			error = -EBUSY;
+		} else {
+			if (dev->power.runtime_status == RPM_SUSPENDED)
+				atomic_inc(&parent->power.child_count);
+		}
+
+		spin_unlock_irq(&parent->power.lock);
+
+		if (error)
+			goto out;
+	}
+
+ out_set:
+	dev->power.runtime_status = status;
+	dev->power.runtime_error = 0;
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	if (notify_parent)
+		pm_request_idle(parent);
+
+	return error;
+}
+EXPORT_SYMBOL_GPL(__pm_runtime_set_status);
+
+/**
+ * pm_runtime_enable - Enable run-time PM of a device.
+ * @dev: Device to handle.
+ */
+void pm_runtime_enable(struct device *dev)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (dev->power.disable_depth > 0)
+		dev->power.disable_depth--;
+	else
+		dev_warn(dev, "Unbalanced %s!", __func__);
+
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+}
+EXPORT_SYMBOL_GPL(pm_runtime_enable);
+
+/**
+ * __pm_runtime_disable - Disable run-time PM of a device.
+ * @dev: Device to handle.
+ * @check_resume: If set, check if there's a resume request for the device.
+ *
+ * Increment power.disable_depth for the device and if was zero previously,
+ * cancel all pending run-time PM requests for the device and wait for all
+ * operations in progress to complete.  The device can be either active or
+ * suspended after its run-time PM has been disabled.
+ *
+ * If @check_resume is set and there's a resume request pending when
+ * __pm_runtime_disable() is called and power.disable_depth is zero, the
+ * function will wake up the device before disabling its run-time PM and will
+ * return 1.  Otherwise, 0 is returned.
+ */
+int __pm_runtime_disable(struct device *dev, bool check_resume)
+{
+	int retval = 0;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (dev->power.disable_depth > 0) {
+		dev->power.disable_depth++;
+		goto out;
+	}
+
+	/*
+	 * Wake up the device if there's a resume request pending, because that
+	 * means there probably is some I/O to process and disabling run-time PM
+	 * shouldn't prevent the device from processing the I/O.
+	 */
+	if (check_resume && dev->power.request_pending
+	    && dev->power.request == RPM_REQ_RESUME) {
+		/*
+		 * Prevent suspends and idle notifications from being carried
+		 * out after we have woken up the device.
+		 */
+		pm_runtime_get_noresume(dev);
+
+		__pm_runtime_resume(dev, false);
+
+		pm_runtime_put_noidle(dev);
+		retval = 1;
+	}
+
+	if (dev->power.disable_depth++ > 0)
+		goto out;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		dev->power.request = RPM_REQ_NONE;
+		spin_unlock_irq(&dev->power.lock);
+
+		cancel_work_sync(&dev->power.work);
+
+		spin_lock_irq(&dev->power.lock);
+		dev->power.request_pending = false;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDING
+	    || dev->power.runtime_status == RPM_RESUMING
+	    || dev->power.idle_notification) {
+		DEFINE_WAIT(wait);
+
+		/* Suspend or wake-up in progress. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_SUSPENDING
+			    && dev->power.runtime_status != RPM_RESUMING
+			    && !dev->power.idle_notification)
+				break;
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+	}
+
+ out:
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_disable);
+
+/**
+ * pm_runtime_init - Initialize run-time PM fields in given device object.
+ * @dev: Device object to initialize.
+ */
+void pm_runtime_init(struct device *dev)
+{
+	spin_lock_init(&dev->power.lock);
+
+	dev->power.runtime_status = RPM_SUSPENDED;
+	dev->power.idle_notification = false;
+
+	dev->power.disable_depth = 1;
+	atomic_set(&dev->power.usage_count, 0);
+
+	dev->power.runtime_error = 0;
+
+	atomic_set(&dev->power.child_count, 0);
+	pm_suspend_ignore_children(dev, false);
+
+	dev->power.request_pending = false;
+	dev->power.request = RPM_REQ_NONE;
+	dev->power.deferred_resume = false;
+	INIT_WORK(&dev->power.work, pm_runtime_work);
+
+	dev->power.timer_expires = 0;
+	setup_timer(&dev->power.suspend_timer, pm_suspend_timer_fn,
+			(unsigned long)dev);
+
+	init_waitqueue_head(&dev->power.wait_queue);
+}
+
+/**
+ * pm_runtime_remove - Prepare for removing a device from device hierarchy.
+ * @dev: Device object being removed from device hierarchy.
+ */
+void pm_runtime_remove(struct device *dev)
+{
+	__pm_runtime_disable(dev, false);
+
+	/* Change the status back to 'suspended' to match the initial status. */
+	if (dev->power.runtime_status == RPM_ACTIVE)
+		pm_runtime_set_suspended(dev);
+}
Index: linux-2.6/include/linux/pm_runtime.h
===================================================================
--- /dev/null
+++ linux-2.6/include/linux/pm_runtime.h
@@ -0,0 +1,115 @@
+/*
+ * pm_runtime.h - Device run-time power management helper functions.
+ *
+ * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * This file is released under the GPLv2.
+ */
+
+#ifndef _LINUX_PM_RUNTIME_H
+#define _LINUX_PM_RUNTIME_H
+
+#include <linux/device.h>
+#include <linux/pm.h>
+
+#ifdef CONFIG_PM_RUNTIME
+
+extern struct workqueue_struct *pm_wq;
+
+extern int pm_runtime_idle(struct device *dev);
+extern int pm_runtime_suspend(struct device *dev);
+extern int pm_runtime_resume(struct device *dev);
+extern int pm_request_idle(struct device *dev);
+extern int pm_schedule_suspend(struct device *dev, unsigned int delay);
+extern int pm_request_resume(struct device *dev);
+extern int __pm_runtime_get(struct device *dev, bool sync);
+extern int __pm_runtime_put(struct device *dev, bool sync);
+extern int __pm_runtime_set_status(struct device *dev, unsigned int status);
+extern void pm_runtime_enable(struct device *dev);
+extern int __pm_runtime_disable(struct device *dev, bool check_resume);
+
+static inline bool pm_children_suspended(struct device *dev)
+{
+	return dev->power.ignore_children
+		|| !atomic_read(&dev->power.child_count);
+}
+
+static inline void pm_suspend_ignore_children(struct device *dev, bool enable)
+{
+	dev->power.ignore_children = enable;
+}
+
+static inline void pm_runtime_get_noresume(struct device *dev)
+{
+	atomic_inc(&dev->power.usage_count);
+}
+
+static inline void pm_runtime_put_noidle(struct device *dev)
+{
+	atomic_add_unless(&dev->power.usage_count, -1, 0);
+}
+
+#else /* !CONFIG_PM_RUNTIME */
+
+static inline int pm_runtime_idle(struct device *dev) { return -ENOSYS; }
+static inline int pm_runtime_suspend(struct device *dev) { return -ENOSYS; }
+static inline int pm_runtime_resume(struct device *dev) { return 0; }
+static inline int pm_request_idle(struct device *dev) { return -ENOSYS; }
+static inline int pm_schedule_suspend(struct device *dev, unsigned int delay)
+{
+	return -ENOSYS;
+}
+static inline int pm_request_resume(struct device *dev) { return 0; }
+static inline int __pm_runtime_get(struct device *dev, bool sync) { return 1; }
+static inline int __pm_runtime_put(struct device *dev, bool sync) { return 0; }
+static inline int __pm_runtime_set_status(struct device *dev,
+					    unsigned int status) { return 0; }
+static inline void pm_runtime_enable(struct device *dev) {}
+static inline int __pm_runtime_disable(struct device *dev, bool check_resume)
+{
+	return 0;
+}
+
+static inline bool pm_children_suspended(struct device *dev) { return false; }
+static inline void pm_suspend_ignore_children(struct device *dev, bool en) {}
+static inline void pm_runtime_get_noresume(struct device *dev) {}
+static inline void pm_runtime_put_noidle(struct device *dev) {}
+
+#endif /* !CONFIG_PM_RUNTIME */
+
+static inline int pm_runtime_get(struct device *dev)
+{
+	return __pm_runtime_get(dev, false);
+}
+
+static inline int pm_runtime_get_sync(struct device *dev)
+{
+	return __pm_runtime_get(dev, true);
+}
+
+static inline int pm_runtime_put(struct device *dev)
+{
+	return __pm_runtime_put(dev, false);
+}
+
+static inline int pm_runtime_put_sync(struct device *dev)
+{
+	return __pm_runtime_put(dev, true);
+}
+
+static inline int pm_runtime_set_active(struct device *dev)
+{
+	return __pm_runtime_set_status(dev, RPM_ACTIVE);
+}
+
+static inline void pm_runtime_set_suspended(struct device *dev)
+{
+	__pm_runtime_set_status(dev, RPM_SUSPENDED);
+}
+
+static inline int pm_runtime_disable(struct device *dev)
+{
+	return __pm_runtime_disable(dev, true);
+}
+
+#endif
Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -21,6 +21,7 @@
 #include <linux/kallsyms.h>
 #include <linux/mutex.h>
 #include <linux/pm.h>
+#include <linux/pm_runtime.h>
 #include <linux/resume-trace.h>
 #include <linux/rwsem.h>
 #include <linux/interrupt.h>
@@ -49,6 +50,16 @@ static DEFINE_MUTEX(dpm_list_mtx);
 static bool transition_started;
 
 /**
+ * device_pm_init - Initialize the PM-related part of a device object
+ * @dev: Device object being initialized.
+ */
+void device_pm_init(struct device *dev)
+{
+	dev->power.status = DPM_ON;
+	pm_runtime_init(dev);
+}
+
+/**
  *	device_pm_lock - lock the list of active devices used by the PM core
  */
 void device_pm_lock(void)
@@ -105,6 +116,7 @@ void device_pm_remove(struct device *dev
 	mutex_lock(&dpm_list_mtx);
 	list_del_init(&dev->power.entry);
 	mutex_unlock(&dpm_list_mtx);
+	pm_runtime_remove(dev);
 }
 
 /**
@@ -512,6 +524,7 @@ static void dpm_complete(pm_message_t st
 			mutex_unlock(&dpm_list_mtx);
 
 			device_complete(dev, state);
+			pm_runtime_enable(dev);
 
 			mutex_lock(&dpm_list_mtx);
 		}
@@ -757,11 +770,16 @@ static int dpm_prepare(pm_message_t stat
 		dev->power.status = DPM_PREPARING;
 		mutex_unlock(&dpm_list_mtx);
 
-		error = device_prepare(dev, state);
+		if (pm_runtime_disable(dev) && device_may_wakeup(dev))
+			/* Wake-up requested during system sleep transition. */
+			error = -EBUSY;
+		else
+			error = device_prepare(dev, state);
 
 		mutex_lock(&dpm_list_mtx);
 		if (error) {
 			dev->power.status = DPM_ON;
+			pm_runtime_enable(dev);
 			if (error == -EAGAIN) {
 				put_device(dev);
 				error = 0;
Index: linux-2.6/drivers/base/dd.c
===================================================================
--- linux-2.6.orig/drivers/base/dd.c
+++ linux-2.6/drivers/base/dd.c
@@ -23,6 +23,7 @@
 #include <linux/kthread.h>
 #include <linux/wait.h>
 #include <linux/async.h>
+#include <linux/pm_runtime.h>
 
 #include "base.h"
 #include "power/power.h"
@@ -202,7 +203,17 @@ int driver_probe_device(struct device_dr
 	pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
 		 drv->bus->name, __func__, dev_name(dev), drv->name);
 
+	/*
+	 * Wait for run-time PM calls to complete and prevent new suspend calls
+	 * until the probe is done.
+	 */
+	pm_runtime_disable(dev);
+	pm_runtime_get_noresume(dev);
+	pm_runtime_enable(dev);
 	ret = really_probe(dev, drv);
+	pm_runtime_put_noidle(dev);
+	if (!ret)
+		pm_runtime_idle(dev);
 
 	return ret;
 }
@@ -306,6 +317,8 @@ static void __device_release_driver(stru
 
 	drv = dev->driver;
 	if (drv) {
+		pm_runtime_disable(dev);
+
 		driver_sysfs_remove(dev);
 
 		if (dev->bus)
@@ -324,6 +337,8 @@ static void __device_release_driver(stru
 			blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
 						     BUS_NOTIFY_UNBOUND_DRIVER,
 						     dev);
+
+		pm_runtime_enable(dev);
 	}
 }
 
Index: linux-2.6/drivers/base/power/power.h
===================================================================
--- linux-2.6.orig/drivers/base/power/power.h
+++ linux-2.6/drivers/base/power/power.h
@@ -1,7 +1,14 @@
-static inline void device_pm_init(struct device *dev)
-{
-	dev->power.status = DPM_ON;
-}
+#ifdef CONFIG_PM_RUNTIME
+
+extern void pm_runtime_init(struct device *dev);
+extern void pm_runtime_remove(struct device *dev);
+
+#else /* !CONFIG_PM_RUNTIME */
+
+static inline void pm_runtime_init(struct device *dev) {}
+static inline void pm_runtime_remove(struct device *dev) {}
+
+#endif /* !CONFIG_PM_RUNTIME */
 
 #ifdef CONFIG_PM_SLEEP
 
@@ -16,23 +23,33 @@ static inline struct device *to_device(s
 	return container_of(entry, struct device, power.entry);
 }
 
+extern void device_pm_init(struct device *dev);
 extern void device_pm_add(struct device *);
 extern void device_pm_remove(struct device *);
 extern void device_pm_move_before(struct device *, struct device *);
 extern void device_pm_move_after(struct device *, struct device *);
 extern void device_pm_move_last(struct device *);
 
-#else /* CONFIG_PM_SLEEP */
+#else /* !CONFIG_PM_SLEEP */
+
+static inline void device_pm_init(struct device *dev)
+{
+	pm_runtime_init(dev);
+}
+
+static inline void device_pm_remove(struct device *dev)
+{
+	pm_runtime_remove(dev);
+}
 
 static inline void device_pm_add(struct device *dev) {}
-static inline void device_pm_remove(struct device *dev) {}
 static inline void device_pm_move_before(struct device *deva,
 					 struct device *devb) {}
 static inline void device_pm_move_after(struct device *deva,
 					struct device *devb) {}
 static inline void device_pm_move_last(struct device *dev) {}
 
-#endif
+#endif /* !CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM
 
Index: linux-2.6/Documentation/power/runtime_pm.txt
===================================================================
--- /dev/null
+++ linux-2.6/Documentation/power/runtime_pm.txt
@@ -0,0 +1,377 @@
+Run-time Power Management Framework for I/O Devices
+
+(C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+
+1. Introduction
+
+Support for run-time power management (run-time PM) of I/O devices is provided
+at the power management core (PM core) level by means of:
+
+* The power management workqueue pm_wq in which bus types and device drivers can
+  put their PM-related work items.  It is strongly recommended that pm_wq be
+  used for queuing all work items related to run-time PM, because this allows
+  them to be synchronized with system-wide power transitions (suspend to RAM,
+  hibernation and resume from system sleep states).  pm_wq is declared in
+  include/linux/pm_runtime.h and defined in kernel/power/main.c.
+
+* A number of run-time PM fields in the 'power' member of 'struct device' (which
+  is of the type 'struct dev_pm_info', defined in include/linux/pm.h) that can
+  be used for synchronizing run-time PM operations with one another.
+
+* Three device run-time PM callbacks in 'struct dev_pm_ops' (defined in
+  include/linux/pm.h).
+
+* A set of helper functions defined in drivers/base/power/runtime.c that can be
+  used for carrying out run-time PM operations in such a way that the
+  synchronization between them is taken care of by the PM core.  Bus types and
+  device drivers are encouraged to use these functions.
+
+The run-time PM callbacks present in 'struct dev_pm_ops', the device run-time PM
+fields of 'struct dev_pm_info' and the core helper functions provided for
+run-time PM are described below.
+
+2. Device Run-time PM Callbacks
+
+There are three device run-time PM callbacks defined in 'struct dev_pm_ops':
+
+struct dev_pm_ops {
+	...
+	int (*runtime_suspend)(struct device *dev);
+	int (*runtime_resume)(struct device *dev);
+	void (*runtime_idle)(struct device *dev);
+	...
+};
+
+The ->runtime_suspend() callback is executed by the PM core for the bus type of
+the device being suspended.  The bus type's callback is then _entirely_
+_responsible_ for handling the device as appropriate, which may, but need not
+include executing the device driver's own ->runtime_suspend() callback (from the
+PM core's point of view it is not necessary to implement a ->runtime_suspend()
+callback in a device driver as long as the bus type's ->runtime_suspend() knows
+what to do to handle the device).
+
+  * Once the bus type's ->runtime_suspend() callback has completed successfully
+    for given device, the PM core regards the device as suspended, which need
+    not mean that the device has been put into a low power state.  It is
+    supposed to mean, however, that the device will not process data and will
+    not communicate with the CPU(s) and RAM until its bus type's
+    ->runtime_resume() callback is executed for it.  The run-time PM status of
+    a device after successful execution of its bus type's ->runtime_suspend()
+    callback is 'suspended'.
+
+  * If the bus type's ->runtime_suspend() callback returns -EBUSY or -EAGAIN,
+    the device's run-time PM status is supposed to be 'active', which means that
+    the device _must_ be fully operational afterwards.
+
+  * If the bus type's ->runtime_suspend() callback returns an error code
+    different from -EBUSY or -EAGAIN, the PM core regards this as a fatal
+    error and will refuse to run the helper functions described in Section 4
+    for the device, until the status of it is directly set either to 'active'
+    or to 'suspended' (the PM core provides special helper functions for this
+    purpose).
+
+In particular, if the driver requires remote wakeup capability for proper
+functioning and device_may_wakeup() returns 'false' for the device, then
+->runtime_suspend() should return -EBUSY.  On the other hand, if
+device_may_wakeup() returns 'true' for the device and the device is put
+into a low power state during the execution of its bus type's
+->runtime_suspend(), it is expected that remote wake-up (i.e. hardware mechanism
+allowing the device to request a change of its power state, such as PCI PME)
+will be enabled for the device.  Generally, remote wake-up should be enabled
+for all input devices put into a low power state at run time.
+
+The ->runtime_resume() callback is executed by the PM core for the bus type of
+the device being woken up.  The bus type's callback is then _entirely_
+_responsible_ for handling the device as appropriate, which may, but need not
+include executing the device driver's own ->runtime_resume() callback (from the
+PM core's point of view it is not necessary to implement a ->runtime_resume()
+callback in a device driver as long as the bus type's ->runtime_resume() knows
+what to do to handle the device).
+
+  * Once the bus type's ->runtime_resume() callback has completed successfully,
+    the PM core regards the device as fully operational, which means that the
+    device _must_ be able to complete I/O operations as needed.  The run-time
+    PM status of the device is then 'active'.
+
+  * If the bus type's ->runtime_resume() callback returns an error code, the PM
+    core regards this as a fatal error and will refuse to run the helper
+    functions described in Section 4 for the device, until its status is
+    directly set either to 'active' or to 'suspended' (the PM core provides
+    special helper functions for this purpose).
+
+The ->runtime_idle() callback is executed by the PM core for the bus type of
+given device whenever the device appears to be idle, which is indicated to the
+PM core by two counters, the device's usage counter and the counter of 'active'
+children of the device.
+
+  * If any of these counters is decreased using a helper function provided by
+    the PM core and it turns out to be equal to zero, the other counter is
+    checked.  If that counter also is equal to zero, the PM core executes the
+    device bus type's ->runtime_idle() callback (with the device as an
+    argument).
+
+The action performed by a bus type's ->runtime_idle() callback is totally
+dependent on the bus type in question, but the expected and recommended action
+is to check if the device can be suspended (i.e. if all of the conditions
+necessary for suspending the device are satisfied) and to queue up a suspend
+request for the device in that case.
+
+The helper functions provided by the PM core, described in Section 4, guarantee
+that the following constraints are met with respect to the bus type's run-time
+PM callbacks:
+
+(1) The callbacks are mutually exclusive (e.g. it is forbidden to execute
+    ->runtime_suspend() in parallel with ->runtime_resume() or with another
+    instance of ->runtime_suspend() for the same device) with the exception that
+    ->runtime_suspend() or ->runtime_resume() can be executed in parallel with
+    ->runtime_idle() (although ->runtime_idle() will not be started while any
+    of the other callbacks is being executed for the same device).
+
+(2) ->runtime_idle() and ->runtime_suspend() can only be executed for 'active'
+    devices (i.e. the PM core will only execute ->runtime_idle() or
+    ->runtime_suspend() for the devices the run-time PM status of which is
+    'active').
+
+(3) ->runtime_idle() and ->runtime_suspend() can only be executed for a device
+    the usage counter of which is equal to zero _and_ either the counter of
+    'active' children of which is equal to zero, or the 'power.ignore_children'
+    flag of which is set.
+
+(4) ->runtime_resume() can only be executed for 'suspended' devices  (i.e. the
+    PM core will only execute ->runtime_resume() for the devices the run-time
+    PM status of which is 'suspended').
+
+Additionally, the helper functions provided by the PM core obey the following
+rules:
+
+  * If ->runtime_suspend() is about to be executed or there's a pending request
+    to execute it, ->runtime_idle() will not be executed for the same device.
+
+  * A request to execute or to schedule the execution of ->runtime_suspend()
+    will cancel any pending requests to execute ->runtime_idle() for the same
+    device.
+
+  * If ->runtime_resume() is about to be executed or there's a pending request
+    to execute it, the other callbacks will not be executed for the same device.
+
+  * A request to execute ->runtime_resume() will cancel any pending or
+    scheduled requests to execute the other callbacks for the same device.
+
+3. Run-time PM Device Fields
+
+The following device run-time PM fields are present in 'struct dev_pm_info', as
+defined in include/linux/pm.h:
+
+  struct timer_list suspend_timer;
+    - timer used for scheduling (delayed) suspend request
+
+  unsigned long timer_expires;
+    - timer expiration time, in jiffies (if this is different from zero, the
+      timer is running and will expire at that time, otherwise the timer is not
+      running)
+
+  struct work_struct work;
+    - work structure used for queuing up requests (i.e. work items in pm_wq)
+
+  wait_queue_head_t wait_queue;
+    - wait queue used if any of the helper functions needs to wait for another
+      one to complete
+
+  spinlock_t lock;
+    - lock used for synchronisation
+
+  atomic_t usage_count;
+    - the usage counter of the device
+
+  atomic_t child_count;
+    - the count of 'active' children of the device
+
+  unsigned int ignore_children;
+    - if set, the value of child_count is ignored (but still updated)
+
+  unsigned int disable_depth;
+    - used for disabling the helper funcions (they work normally if this is
+      equal to zero); the initial value of it is 1 (i.e. run-time PM is
+      initially disabled for all devices)
+
+  unsigned int runtime_error;
+    - if set, there was a fatal error (one of the callbacks returned error code
+      as described in Section 2), so the helper funtions will not work until
+      this flag is cleared; this is the error code returned by the failing
+      callback
+
+  unsigned int idle_notification;
+    - if set, ->runtime_idle() is being executed
+
+  unsigned int request_pending;
+    - if set, there's a pending request (i.e. a work item queued up into pm_wq)
+
+  enum rpm_request request;
+    - type of request that's pending (valid if request_pending is set)
+
+  unsigned int deferred_resume;
+    - set if ->runtime_resume() is about to be run while ->runtime_suspend() is
+      being executed for that device and it is not practical to wait for the
+      suspend to complete; means "start a resume as soon as you've suspended"
+
+  enum rpm_status runtime_status;
+    - the run-time PM status of the device; this field's initial value is
+      RPM_SUSPENDED, which means that each device is initially regarded by the
+      PM core as 'suspended', regardless of its real hardware status
+
+All of the above fields are members of the 'power' member of 'struct device'.
+
+4. Run-time PM Device Helper Functions
+
+The following run-time PM helper functions are defined in
+drivers/base/power/runtime.c and include/linux/pm_runtime.h:
+
+  void pm_runtime_init(struct device *dev);
+    - initialize the device run-time PM fields in 'struct dev_pm_info'
+
+  void pm_runtime_remove(struct device *dev);
+    - make sure that the run-time PM of the device will be disabled after
+      removing the device from device hierarchy
+
+  int pm_runtime_idle(struct device *dev);
+    - execute ->runtime_idle() for the device's bus type; returns 0 on success
+      or error code on failure, where -EINPROGRESS means that ->runtime_idle()
+      is already being executed
+
+  int pm_runtime_suspend(struct device *dev);
+    - execute ->runtime_suspend() for the device's bus type; returns 0 on
+      success, 1 if the device's run-time PM status was already 'suspended', or
+      error code on failure, where -EAGAIN or -EBUSY means it is safe to attempt
+      to suspend the device again in future
+
+  int pm_runtime_resume(struct device *dev);
+    - execute ->runtime_resume() for the device's bus type; returns 0 on
+      success, 1 if the device's run-time PM status was already 'active' or
+      error code on failure, where -EAGAIN means it may be safe to attempt to
+      resume the device again in future, but 'power.runtime_error' should be
+      checked additionally
+
+  int pm_request_idle(struct device *dev);
+    - submit a request to execute ->runtime_idle() for the device's bus type
+      (the request is represented by a work item in pm_wq); returns 0 on success
+      or error code if the request has not been queued up
+
+  int pm_schedule_suspend(struct device *dev, unsigned int delay);
+    - schedule the execution of ->runtime_suspend() for the device's bus type
+      in future, where 'delay' is the time to wait before queuing up a suspend
+      work item in pm_wq, in milliseconds (if 'delay' is zero, the work item is
+      queued up immediately); returns 0 on success, 1 if the device's PM
+      run-time status was already 'suspended', or error code if the request
+      hasn't been scheduled (or queued up if 'delay' is 0); if the execution of
+      ->runtime_suspend() is already scheduled and not yet expired, the new
+      value of 'delay' will be used as the time to wait
+
+  int pm_request_resume(struct device *dev);
+    - submit a request to execute ->runtime_resume() for the device's bus type
+      (the request is represented by a work item in pm_wq); returns 0 on
+      success, 1 if the device's run-time PM status was already 'active', or
+      error code if the request hasn't been queued up
+
+  void pm_runtime_get_noresume(struct device *dev);
+    - increment the device's usage counter
+
+  int pm_runtime_get(struct device *dev);
+    - increment the device's usage counter, run pm_request_resume(dev) and
+      return its result
+
+  int pm_runtime_get_sync(struct device *dev);
+    - increment the device's usage counter, run pm_runtime_resume(dev) and
+      return its result
+
+  void pm_runtime_put_noidle(struct device *dev);
+    - decrement the device's usage counter
+
+  int pm_runtime_put(struct device *dev);
+    - decrement the device's usage counter, run pm_request_idle(dev) and return
+      its result
+
+  int pm_runtime_put_sync(struct device *dev);
+    - decrement the device's usage counter, run pm_runtime_idle(dev) and return
+      its result
+
+  void pm_runtime_enable(struct device *dev);
+    - enable the run-time PM helper functions to run the device bus type's
+      run-time PM callbacks described in Section 2
+
+  int pm_runtime_disable(struct device *dev);
+    - prevent the run-time PM helper functions from running the device bus
+      type's run-time PM callbacks, make sure that all of the pending run-time
+      PM operations on the device are either completed or canceled; returns
+      1 if there was a resume request pending and it was necessary to execute
+      ->runtime_resume() for the device's bus type to satisfy that request,
+      otherwise 0 is returned
+
+  void pm_suspend_ignore_children(struct device *dev, bool enable);
+    - set/unset the power.ignore_children flag of the device
+
+  int pm_runtime_set_active(struct device *dev);
+    - clear the device's 'power.runtime_error' flag, set the device's run-time
+      PM status to 'active' and update its parent's counter of 'active'
+      children as appropriate (it is only valid to use this function if
+      'power.runtime_error' is set or 'power.disable_depth' is greater than
+      zero); it will fail and return error code if the device has a parent
+      which is not active and the 'power.ignore_children' flag of which is unset
+
+  void pm_runtime_set_suspended(struct device *dev);
+    - clear the device's 'power.runtime_error' flag, set the device's run-time
+      PM status to 'suspended' and update its parent's counter of 'active'
+      children as appropriate (it is only valid to use this function if
+      'power.runtime_error' is set or 'power.disable_depth' is greater than
+      zero)
+
+It is safe to execute the following helper functions from interrupt context:
+
+pm_request_idle()
+pm_schedule_suspend()
+pm_request_resume()
+pm_runtime_get_noresume()
+pm_runtime_get()
+pm_runtime_put_noidle()
+pm_runtime_put()
+pm_suspend_ignore_children()
+pm_runtime_set_active()
+pm_runtime_set_suspended()
+pm_runtime_enable()
+
+5. Run-time PM Initialization
+
+Initially, the run-time PM is disabled for all devices, which means that the
+majority of the run-time PM helper funtions described in Section 4 will return
+-EAGAIN until pm_runtime_enable() is called for the device.
+
+In addition to that, the initial run-time PM status of all devices is
+'suspended', but it need not reflect the actual physical state of the device.
+Thus, if the device is initially active (i.e. it is able to process I/O), its
+run-time PM status must be changed to 'active', with the help of
+pm_runtime_set_active(), before pm_runtime_enable() is called for the device.
+
+However, if the device has a parent and the parent's run-time PM is enabled,
+calling pm_runtime_set_active() for the device will affect the parent, unless
+the parent's 'power.ignore_children' flag is set.  Namely, in that case the
+parent won't be able to suspend at run time, using the PM core's helper
+functions, as long as the child's status is 'active', even if the child's
+run-time PM is still disabled (i.e. pm_runtime_enable() hasn't been called for
+the child yet or pm_runtime_disable() has been called for it).  For this reason,
+once pm_runtime_set_active() has been called for the device, pm_runtime_enable()
+should be called for it too as soon as reasonably possible or its run-time PM
+status should be changed back to 'suspended' with the help of
+pm_runtime_set_suspended().
+
+If the default initial run-time PM status of the device (i.e. 'suspended')
+reflects the actual state of the device, its bus type's or its driver's
+->probe() callback will likely need to wake it up using one of the PM core's
+helper functions described in Section 4.  In that case, pm_runtime_resume()
+should be used.  Of course, for this purpose the device's run-time PM has to be
+enabled earlier by calling pm_runtime_enable().
+
+If ->probe() calls pm_runtime_suspend() or pm_runtime_idle(), or their
+asynchronous counterparts, they will fail returning -EAGAIN, because the
+device's usage counter is incremented by the core before executing ->probe().
+Still, it may be desirable to suspend the device as soon as ->probe() has
+finished, so the core uses pm_runtime_idle() to invoke the device bus type's
+->runtime_idle() callback at that time, but only if ->probe() is successful.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update x2] PM: Introduce core framework for run-time PM of  I/O devices (rev. 13)
  2009-08-06 21:53             ` [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13) Rafael J. Wysocki
  2009-08-07  7:45               ` Magnus Damm
@ 2009-08-07  7:45               ` Magnus Damm
  2009-08-07 13:54                 ` Rafael J. Wysocki
  2009-08-07 13:54                 ` Rafael J. Wysocki
  2009-08-07 15:41               ` Alan Stern
  2009-08-07 15:41               ` Alan Stern
  3 siblings, 2 replies; 39+ messages in thread
From: Magnus Damm @ 2009-08-07  7:45 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Alan Stern, Linux-pm mailing list, Greg KH, Pavel Machek,
	Len Brown, LKML

On Fri, Aug 7, 2009 at 6:53 AM, Rafael J. Wysocki<rjw@sisk.pl> wrote:
> From: Rafael J. Wysocki <rjw@sisk.pl>
> Subject: PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
>
> Introduce a core framework for run-time power management of I/O
> devices.  Add device run-time PM fields to 'struct dev_pm_info'
> and device run-time PM callbacks to 'struct dev_pm_ops'.  Introduce
> a run-time PM workqueue and define some device run-time PM helper
> functions at the core level.  Document all these things.
>
> Special thanks to Alan Stern for his help with the design and
> multiple detailed reviews of the pereceding versions of this patch
> and to Magnus Damm for testing feedback.
>
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>

I've rebased my series of patches to v13. My platform device Runtime
PM code for SuperH works well with drivers for i2c, uio and fbdev with
v13. Thank you. I'll post a SuperH specific patch set some time next
week.

Anyway, I'm not far from an "Acked-by", but please look at the following series:

[PATCH 00/05] PM: Runtime PM v13 for Platform Devices 20090807

Maybe there is some code in there that you can include in v14? Let me know!

Cheers,

/ magnus

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
  2009-08-06 21:53             ` [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13) Rafael J. Wysocki
@ 2009-08-07  7:45               ` Magnus Damm
  2009-08-07  7:45               ` Magnus Damm
                                 ` (2 subsequent siblings)
  3 siblings, 0 replies; 39+ messages in thread
From: Magnus Damm @ 2009-08-07  7:45 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Greg KH, LKML, Linux-pm mailing list

On Fri, Aug 7, 2009 at 6:53 AM, Rafael J. Wysocki<rjw@sisk.pl> wrote:
> From: Rafael J. Wysocki <rjw@sisk.pl>
> Subject: PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
>
> Introduce a core framework for run-time power management of I/O
> devices.  Add device run-time PM fields to 'struct dev_pm_info'
> and device run-time PM callbacks to 'struct dev_pm_ops'.  Introduce
> a run-time PM workqueue and define some device run-time PM helper
> functions at the core level.  Document all these things.
>
> Special thanks to Alan Stern for his help with the design and
> multiple detailed reviews of the pereceding versions of this patch
> and to Magnus Damm for testing feedback.
>
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>

I've rebased my series of patches to v13. My platform device Runtime
PM code for SuperH works well with drivers for i2c, uio and fbdev with
v13. Thank you. I'll post a SuperH specific patch set some time next
week.

Anyway, I'm not far from an "Acked-by", but please look at the following series:

[PATCH 00/05] PM: Runtime PM v13 for Platform Devices 20090807

Maybe there is some code in there that you can include in v14? Let me know!

Cheers,

/ magnus

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
  2009-08-07  7:45               ` Magnus Damm
  2009-08-07 13:54                 ` Rafael J. Wysocki
@ 2009-08-07 13:54                 ` Rafael J. Wysocki
  1 sibling, 0 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2009-08-07 13:54 UTC (permalink / raw)
  To: Magnus Damm
  Cc: Alan Stern, Linux-pm mailing list, Greg KH, Pavel Machek,
	Len Brown, LKML

On Friday 07 August 2009, Magnus Damm wrote:
> On Fri, Aug 7, 2009 at 6:53 AM, Rafael J. Wysocki<rjw@sisk.pl> wrote:
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > Subject: PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
> >
> > Introduce a core framework for run-time power management of I/O
> > devices.  Add device run-time PM fields to 'struct dev_pm_info'
> > and device run-time PM callbacks to 'struct dev_pm_ops'.  Introduce
> > a run-time PM workqueue and define some device run-time PM helper
> > functions at the core level.  Document all these things.
> >
> > Special thanks to Alan Stern for his help with the design and
> > multiple detailed reviews of the pereceding versions of this patch
> > and to Magnus Damm for testing feedback.
> >
> > Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> 
> I've rebased my series of patches to v13. My platform device Runtime
> PM code for SuperH works well with drivers for i2c, uio and fbdev with
> v13. Thank you. I'll post a SuperH specific patch set some time next
> week.
> 
> Anyway, I'm not far from an "Acked-by", but please look at the following series:
> 
> [PATCH 00/05] PM: Runtime PM v13 for Platform Devices 20090807
> 
> Maybe there is some code in there that you can include in v14? Let me know!

I will.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
  2009-08-07  7:45               ` Magnus Damm
@ 2009-08-07 13:54                 ` Rafael J. Wysocki
  2009-08-07 13:54                 ` Rafael J. Wysocki
  1 sibling, 0 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2009-08-07 13:54 UTC (permalink / raw)
  To: Magnus Damm; +Cc: Greg KH, LKML, Linux-pm mailing list

On Friday 07 August 2009, Magnus Damm wrote:
> On Fri, Aug 7, 2009 at 6:53 AM, Rafael J. Wysocki<rjw@sisk.pl> wrote:
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > Subject: PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
> >
> > Introduce a core framework for run-time power management of I/O
> > devices.  Add device run-time PM fields to 'struct dev_pm_info'
> > and device run-time PM callbacks to 'struct dev_pm_ops'.  Introduce
> > a run-time PM workqueue and define some device run-time PM helper
> > functions at the core level.  Document all these things.
> >
> > Special thanks to Alan Stern for his help with the design and
> > multiple detailed reviews of the pereceding versions of this patch
> > and to Magnus Damm for testing feedback.
> >
> > Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> 
> I've rebased my series of patches to v13. My platform device Runtime
> PM code for SuperH works well with drivers for i2c, uio and fbdev with
> v13. Thank you. I'll post a SuperH specific patch set some time next
> week.
> 
> Anyway, I'm not far from an "Acked-by", but please look at the following series:
> 
> [PATCH 00/05] PM: Runtime PM v13 for Platform Devices 20090807
> 
> Maybe there is some code in there that you can include in v14? Let me know!

I will.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update] PM: Introduce core framework for run-time PM of I/O devices (rev. 12)
  2009-08-06 21:50             ` Rafael J. Wysocki
  2009-08-07 13:59               ` Alan Stern
@ 2009-08-07 13:59               ` Alan Stern
  1 sibling, 0 replies; 39+ messages in thread
From: Alan Stern @ 2009-08-07 13:59 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux-pm mailing list, Magnus Damm, Greg KH, Pavel Machek,
	Len Brown, LKML

On Thu, 6 Aug 2009, Rafael J. Wysocki wrote:

> > If we defer a resume request while a suspend is in progress, then when
> > the suspend finishes should the resume be carried out immediately
> > rather than queued?  I don't see any reason why not.
> 
> Well, it's not very clear what to return to the caller in such a case.  I guess
> we can return -EAGAIN.

I would have said to return 0, just as if the resume had been queued.  
But it doesn't matter very much.

Alan Stern


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update] PM: Introduce core framework for run-time PM of I/O devices (rev. 12)
  2009-08-06 21:50             ` Rafael J. Wysocki
@ 2009-08-07 13:59               ` Alan Stern
  2009-08-07 13:59               ` Alan Stern
  1 sibling, 0 replies; 39+ messages in thread
From: Alan Stern @ 2009-08-07 13:59 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Greg KH, LKML, Linux-pm mailing list

On Thu, 6 Aug 2009, Rafael J. Wysocki wrote:

> > If we defer a resume request while a suspend is in progress, then when
> > the suspend finishes should the resume be carried out immediately
> > rather than queued?  I don't see any reason why not.
> 
> Well, it's not very clear what to return to the caller in such a case.  I guess
> we can return -EAGAIN.

I would have said to return 0, just as if the resume had been queued.  
But it doesn't matter very much.

Alan Stern

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
  2009-08-06 21:53             ` [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13) Rafael J. Wysocki
  2009-08-07  7:45               ` Magnus Damm
  2009-08-07  7:45               ` Magnus Damm
@ 2009-08-07 15:41               ` Alan Stern
  2009-08-08 14:03                 ` Rafael J. Wysocki
  2009-08-08 14:03                 ` Rafael J. Wysocki
  2009-08-07 15:41               ` Alan Stern
  3 siblings, 2 replies; 39+ messages in thread
From: Alan Stern @ 2009-08-07 15:41 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux-pm mailing list, Magnus Damm, Greg KH, Pavel Machek,
	Len Brown, LKML

On Thu, 6 Aug 2009, Rafael J. Wysocki wrote:

> Hi,
> 
> The patch below should address all of your most recent comments.

Only two comments.


> +static int __pm_request_idle(struct device *dev)
> +{
...
> +	if (dev->power.request_pending) {
> +		/* Any requests other then RPM_REQ_IDLE take precedence. */
> +		if (dev->power.request != RPM_REQ_NONE)

Replace != with ==.

> +			dev->power.request = RPM_REQ_IDLE;
> +		else if (dev->power.request != RPM_REQ_IDLE)
> +			retval = -EAGAIN;
> +		return retval;


> --- linux-2.6.orig/drivers/base/dd.c
> +++ linux-2.6/drivers/base/dd.c
...
> @@ -306,6 +317,8 @@ static void __device_release_driver(stru
>  
>  	drv = dev->driver;
>  	if (drv) {
> +		pm_runtime_disable(dev);
> +
>  		driver_sysfs_remove(dev);
>  
>  		if (dev->bus)
> @@ -324,6 +337,8 @@ static void __device_release_driver(stru
>  			blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
>  						     BUS_NOTIFY_UNBOUND_DRIVER,
>  						     dev);
> +
> +		pm_runtime_enable(dev);
>  	}
>  }

We may need to be more careful here.  The driver's remove method may
want to do some runtime PM stuff to the device before giving up
control.  On the other hand I'm not sure what _should_ be done here, so
I can't suggest anything better.

Alan Stern


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
  2009-08-06 21:53             ` [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13) Rafael J. Wysocki
                                 ` (2 preceding siblings ...)
  2009-08-07 15:41               ` Alan Stern
@ 2009-08-07 15:41               ` Alan Stern
  3 siblings, 0 replies; 39+ messages in thread
From: Alan Stern @ 2009-08-07 15:41 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Greg KH, LKML, Linux-pm mailing list

On Thu, 6 Aug 2009, Rafael J. Wysocki wrote:

> Hi,
> 
> The patch below should address all of your most recent comments.

Only two comments.


> +static int __pm_request_idle(struct device *dev)
> +{
...
> +	if (dev->power.request_pending) {
> +		/* Any requests other then RPM_REQ_IDLE take precedence. */
> +		if (dev->power.request != RPM_REQ_NONE)

Replace != with ==.

> +			dev->power.request = RPM_REQ_IDLE;
> +		else if (dev->power.request != RPM_REQ_IDLE)
> +			retval = -EAGAIN;
> +		return retval;


> --- linux-2.6.orig/drivers/base/dd.c
> +++ linux-2.6/drivers/base/dd.c
...
> @@ -306,6 +317,8 @@ static void __device_release_driver(stru
>  
>  	drv = dev->driver;
>  	if (drv) {
> +		pm_runtime_disable(dev);
> +
>  		driver_sysfs_remove(dev);
>  
>  		if (dev->bus)
> @@ -324,6 +337,8 @@ static void __device_release_driver(stru
>  			blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
>  						     BUS_NOTIFY_UNBOUND_DRIVER,
>  						     dev);
> +
> +		pm_runtime_enable(dev);
>  	}
>  }

We may need to be more careful here.  The driver's remove method may
want to do some runtime PM stuff to the device before giving up
control.  On the other hand I'm not sure what _should_ be done here, so
I can't suggest anything better.

Alan Stern

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
  2009-08-07 15:41               ` Alan Stern
  2009-08-08 14:03                 ` Rafael J. Wysocki
@ 2009-08-08 14:03                 ` Rafael J. Wysocki
  2009-08-08 15:50                   ` Alan Stern
  2009-08-08 15:50                   ` Alan Stern
  1 sibling, 2 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2009-08-08 14:03 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux-pm mailing list, Magnus Damm, Greg KH, Pavel Machek,
	Len Brown, LKML

On Friday 07 August 2009, Alan Stern wrote:
> On Thu, 6 Aug 2009, Rafael J. Wysocki wrote:
> 
> > Hi,
> > 
> > The patch below should address all of your most recent comments.
> 
> Only two comments.
> 
> 
> > +static int __pm_request_idle(struct device *dev)
> > +{
> ...
> > +	if (dev->power.request_pending) {
> > +		/* Any requests other then RPM_REQ_IDLE take precedence. */
> > +		if (dev->power.request != RPM_REQ_NONE)
> 
> Replace != with ==.

Good catch, thanks!

> > +			dev->power.request = RPM_REQ_IDLE;
> > +		else if (dev->power.request != RPM_REQ_IDLE)
> > +			retval = -EAGAIN;
> > +		return retval;
> 
> 
> > --- linux-2.6.orig/drivers/base/dd.c
> > +++ linux-2.6/drivers/base/dd.c
> ...
> > @@ -306,6 +317,8 @@ static void __device_release_driver(stru
> >  
> >  	drv = dev->driver;
> >  	if (drv) {
> > +		pm_runtime_disable(dev);
> > +
> >  		driver_sysfs_remove(dev);
> >  
> >  		if (dev->bus)
> > @@ -324,6 +337,8 @@ static void __device_release_driver(stru
> >  			blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
> >  						     BUS_NOTIFY_UNBOUND_DRIVER,
> >  						     dev);
> > +
> > +		pm_runtime_enable(dev);
> >  	}
> >  }
> 
> We may need to be more careful here.  The driver's remove method may
> want to do some runtime PM stuff to the device before giving up
> control.  On the other hand I'm not sure what _should_ be done here, so
> I can't suggest anything better.

Hmm.  Perhaps we can do something along the lines of our .probe() handling.
Namely, call

pm_runtime_disable(dev);
pm_runtime_get_noresume(dev);
pm_runtime_enable(dev);

before and

pm_runtime_put_noidle()

after?  Then, if the driver's or bus type's .remove() needs to resume, it will
be able to do that right away and if it wants to suspend, it can always call
pm_runtime_put*(), because our pm_runtime_put_noidle() won't decrease the
usage counter below zero.

At the same time we can avoid "leftover" suspends that could interfere with
.remove() in case it needs to access the hardware.

Best,
Rafael

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
  2009-08-07 15:41               ` Alan Stern
@ 2009-08-08 14:03                 ` Rafael J. Wysocki
  2009-08-08 14:03                 ` Rafael J. Wysocki
  1 sibling, 0 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2009-08-08 14:03 UTC (permalink / raw)
  To: Alan Stern; +Cc: Greg KH, LKML, Linux-pm mailing list

On Friday 07 August 2009, Alan Stern wrote:
> On Thu, 6 Aug 2009, Rafael J. Wysocki wrote:
> 
> > Hi,
> > 
> > The patch below should address all of your most recent comments.
> 
> Only two comments.
> 
> 
> > +static int __pm_request_idle(struct device *dev)
> > +{
> ...
> > +	if (dev->power.request_pending) {
> > +		/* Any requests other then RPM_REQ_IDLE take precedence. */
> > +		if (dev->power.request != RPM_REQ_NONE)
> 
> Replace != with ==.

Good catch, thanks!

> > +			dev->power.request = RPM_REQ_IDLE;
> > +		else if (dev->power.request != RPM_REQ_IDLE)
> > +			retval = -EAGAIN;
> > +		return retval;
> 
> 
> > --- linux-2.6.orig/drivers/base/dd.c
> > +++ linux-2.6/drivers/base/dd.c
> ...
> > @@ -306,6 +317,8 @@ static void __device_release_driver(stru
> >  
> >  	drv = dev->driver;
> >  	if (drv) {
> > +		pm_runtime_disable(dev);
> > +
> >  		driver_sysfs_remove(dev);
> >  
> >  		if (dev->bus)
> > @@ -324,6 +337,8 @@ static void __device_release_driver(stru
> >  			blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
> >  						     BUS_NOTIFY_UNBOUND_DRIVER,
> >  						     dev);
> > +
> > +		pm_runtime_enable(dev);
> >  	}
> >  }
> 
> We may need to be more careful here.  The driver's remove method may
> want to do some runtime PM stuff to the device before giving up
> control.  On the other hand I'm not sure what _should_ be done here, so
> I can't suggest anything better.

Hmm.  Perhaps we can do something along the lines of our .probe() handling.
Namely, call

pm_runtime_disable(dev);
pm_runtime_get_noresume(dev);
pm_runtime_enable(dev);

before and

pm_runtime_put_noidle()

after?  Then, if the driver's or bus type's .remove() needs to resume, it will
be able to do that right away and if it wants to suspend, it can always call
pm_runtime_put*(), because our pm_runtime_put_noidle() won't decrease the
usage counter below zero.

At the same time we can avoid "leftover" suspends that could interfere with
.remove() in case it needs to access the hardware.

Best,
Rafael

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
  2009-08-08 14:03                 ` Rafael J. Wysocki
  2009-08-08 15:50                   ` Alan Stern
@ 2009-08-08 15:50                   ` Alan Stern
  2009-08-08 21:55                     ` Rafael J. Wysocki
  2009-08-08 21:55                     ` Rafael J. Wysocki
  1 sibling, 2 replies; 39+ messages in thread
From: Alan Stern @ 2009-08-08 15:50 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux-pm mailing list, Magnus Damm, Greg KH, Pavel Machek,
	Len Brown, LKML

On Sat, 8 Aug 2009, Rafael J. Wysocki wrote:

> > We may need to be more careful here.  The driver's remove method may
> > want to do some runtime PM stuff to the device before giving up
> > control.  On the other hand I'm not sure what _should_ be done here, so
> > I can't suggest anything better.
> 
> Hmm.  Perhaps we can do something along the lines of our .probe() handling.
> Namely, call
> 
> pm_runtime_disable(dev);
> pm_runtime_get_noresume(dev);
> pm_runtime_enable(dev);
> 
> before and
> 
> pm_runtime_put_noidle()
> 
> after?  Then, if the driver's or bus type's .remove() needs to resume, it will
> be able to do that right away and if it wants to suspend, it can always call
> pm_runtime_put*(), because our pm_runtime_put_noidle() won't decrease the
> usage counter below zero.
> 
> At the same time we can avoid "leftover" suspends that could interfere with
> .remove() in case it needs to access the hardware.

The problem with this is that it calls pm_runtime_disable() at a time 
when the driver is still supposed to be in control of the device.  
Interfering with the driver's legitimate activity in this way is a bad 
thing to do.

The difficulty here is that our requirements are a little
contradictory.  We want to prevent all runtime PM callbacks while the
remove method is running, but we also want the remove method to be able
to carry out its own runtime PM activities.

So maybe what we really need is more like a barrier.  That is,
something that will do a "get", wait for outstanding callbacks to
finish, carry out a resume if one is pending, and cancel other pending
requests.  This could easily share code with pm_runtime_disable.  We 
should be able to use this for both probe and remove.

We will also need to be a lot more careful about handling runtime PM 
during system sleep transitions.  The current code runs a risk of 
losing remote wakeup requests.  One scenario goes like this:

     1. The device sends a wakeup request, probably in the form of
	an IRQ.

     2.	The driver fields the interrupt and tells the device to turn
	off its interrupt request signal.

     3. The driver calls pm_request_resume.

     4. The runtime PM core carries out the resume callback.

If the core disables runtime PM before step 1 (and before we begin the
"late" stage of a system sleep, so interrupts still get delivered) then 
steps 1 and 2 will succeed but step 3 will fail.  The wakeup event will 
be lost.

Perhaps this means we don't want to disable runtime PM during system 
sleep callbacks, but instead use the "barrier" scheme.

Alan Stern


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
  2009-08-08 14:03                 ` Rafael J. Wysocki
@ 2009-08-08 15:50                   ` Alan Stern
  2009-08-08 15:50                   ` Alan Stern
  1 sibling, 0 replies; 39+ messages in thread
From: Alan Stern @ 2009-08-08 15:50 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Greg KH, LKML, Linux-pm mailing list

On Sat, 8 Aug 2009, Rafael J. Wysocki wrote:

> > We may need to be more careful here.  The driver's remove method may
> > want to do some runtime PM stuff to the device before giving up
> > control.  On the other hand I'm not sure what _should_ be done here, so
> > I can't suggest anything better.
> 
> Hmm.  Perhaps we can do something along the lines of our .probe() handling.
> Namely, call
> 
> pm_runtime_disable(dev);
> pm_runtime_get_noresume(dev);
> pm_runtime_enable(dev);
> 
> before and
> 
> pm_runtime_put_noidle()
> 
> after?  Then, if the driver's or bus type's .remove() needs to resume, it will
> be able to do that right away and if it wants to suspend, it can always call
> pm_runtime_put*(), because our pm_runtime_put_noidle() won't decrease the
> usage counter below zero.
> 
> At the same time we can avoid "leftover" suspends that could interfere with
> .remove() in case it needs to access the hardware.

The problem with this is that it calls pm_runtime_disable() at a time 
when the driver is still supposed to be in control of the device.  
Interfering with the driver's legitimate activity in this way is a bad 
thing to do.

The difficulty here is that our requirements are a little
contradictory.  We want to prevent all runtime PM callbacks while the
remove method is running, but we also want the remove method to be able
to carry out its own runtime PM activities.

So maybe what we really need is more like a barrier.  That is,
something that will do a "get", wait for outstanding callbacks to
finish, carry out a resume if one is pending, and cancel other pending
requests.  This could easily share code with pm_runtime_disable.  We 
should be able to use this for both probe and remove.

We will also need to be a lot more careful about handling runtime PM 
during system sleep transitions.  The current code runs a risk of 
losing remote wakeup requests.  One scenario goes like this:

     1. The device sends a wakeup request, probably in the form of
	an IRQ.

     2.	The driver fields the interrupt and tells the device to turn
	off its interrupt request signal.

     3. The driver calls pm_request_resume.

     4. The runtime PM core carries out the resume callback.

If the core disables runtime PM before step 1 (and before we begin the
"late" stage of a system sleep, so interrupts still get delivered) then 
steps 1 and 2 will succeed but step 3 will fail.  The wakeup event will 
be lost.

Perhaps this means we don't want to disable runtime PM during system 
sleep callbacks, but instead use the "barrier" scheme.

Alan Stern

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
  2009-08-08 15:50                   ` Alan Stern
  2009-08-08 21:55                     ` Rafael J. Wysocki
@ 2009-08-08 21:55                     ` Rafael J. Wysocki
  2009-08-09  2:28                       ` Alan Stern
  2009-08-09  2:28                       ` Alan Stern
  1 sibling, 2 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2009-08-08 21:55 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux-pm mailing list, Magnus Damm, Greg KH, Pavel Machek,
	Len Brown, LKML

On Saturday 08 August 2009, Alan Stern wrote:
> On Sat, 8 Aug 2009, Rafael J. Wysocki wrote:
> 
> > > We may need to be more careful here.  The driver's remove method may
> > > want to do some runtime PM stuff to the device before giving up
> > > control.  On the other hand I'm not sure what _should_ be done here, so
> > > I can't suggest anything better.
> > 
> > Hmm.  Perhaps we can do something along the lines of our .probe() handling.
> > Namely, call
> > 
> > pm_runtime_disable(dev);
> > pm_runtime_get_noresume(dev);
> > pm_runtime_enable(dev);
> > 
> > before and
> > 
> > pm_runtime_put_noidle()
> > 
> > after?  Then, if the driver's or bus type's .remove() needs to resume, it will
> > be able to do that right away and if it wants to suspend, it can always call
> > pm_runtime_put*(), because our pm_runtime_put_noidle() won't decrease the
> > usage counter below zero.
> > 
> > At the same time we can avoid "leftover" suspends that could interfere with
> > .remove() in case it needs to access the hardware.
> 
> The problem with this is that it calls pm_runtime_disable() at a time 
> when the driver is still supposed to be in control of the device.  
> Interfering with the driver's legitimate activity in this way is a bad 
> thing to do.
> 
> The difficulty here is that our requirements are a little
> contradictory.  We want to prevent all runtime PM callbacks while the
> remove method is running, but we also want the remove method to be able
> to carry out its own runtime PM activities.
> 
> So maybe what we really need is more like a barrier.  That is,
> something that will do a "get", wait for outstanding callbacks to
> finish, carry out a resume if one is pending, and cancel other pending
> requests.  This could easily share code with pm_runtime_disable.  We 
> should be able to use this for both probe and remove.

Isn't it what's done in rev. 14?

pm_runtime_disable(dev);
pm_runtime_get_noresume(dev);
pm_runtime_enable(dev);

is exactly a barrier like this.  How exactly would you like to implement it
instead?

> We will also need to be a lot more careful about handling runtime PM 
> during system sleep transitions.  The current code runs a risk of 
> losing remote wakeup requests.  One scenario goes like this:
> 
>      1. The device sends a wakeup request, probably in the form of
> 	an IRQ.
> 
>      2.	The driver fields the interrupt and tells the device to turn
> 	off its interrupt request signal.
> 
>      3. The driver calls pm_request_resume.
> 
>      4. The runtime PM core carries out the resume callback.
> 
> If the core disables runtime PM before step 1 (and before we begin the
> "late" stage of a system sleep, so interrupts still get delivered) then 
> steps 1 and 2 will succeed but step 3 will fail.  The wakeup event will 
> be lost.

The idea is that once the system sleep transition has started, the non-runtime
callbacks are in charge of handling the device.

> Perhaps this means we don't want to disable runtime PM during system 
> sleep callbacks, but instead use the "barrier" scheme.

I'm not really sure about that.  I'd rather do what's right now in the patch
(well, that's why it's in there) until drivers and bus types start using the
runtime PM framework.  If it turns out to be problematic, we'll change it
later.

Best,
Rafael

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
  2009-08-08 15:50                   ` Alan Stern
@ 2009-08-08 21:55                     ` Rafael J. Wysocki
  2009-08-08 21:55                     ` Rafael J. Wysocki
  1 sibling, 0 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2009-08-08 21:55 UTC (permalink / raw)
  To: Alan Stern; +Cc: Greg KH, LKML, Linux-pm mailing list

On Saturday 08 August 2009, Alan Stern wrote:
> On Sat, 8 Aug 2009, Rafael J. Wysocki wrote:
> 
> > > We may need to be more careful here.  The driver's remove method may
> > > want to do some runtime PM stuff to the device before giving up
> > > control.  On the other hand I'm not sure what _should_ be done here, so
> > > I can't suggest anything better.
> > 
> > Hmm.  Perhaps we can do something along the lines of our .probe() handling.
> > Namely, call
> > 
> > pm_runtime_disable(dev);
> > pm_runtime_get_noresume(dev);
> > pm_runtime_enable(dev);
> > 
> > before and
> > 
> > pm_runtime_put_noidle()
> > 
> > after?  Then, if the driver's or bus type's .remove() needs to resume, it will
> > be able to do that right away and if it wants to suspend, it can always call
> > pm_runtime_put*(), because our pm_runtime_put_noidle() won't decrease the
> > usage counter below zero.
> > 
> > At the same time we can avoid "leftover" suspends that could interfere with
> > .remove() in case it needs to access the hardware.
> 
> The problem with this is that it calls pm_runtime_disable() at a time 
> when the driver is still supposed to be in control of the device.  
> Interfering with the driver's legitimate activity in this way is a bad 
> thing to do.
> 
> The difficulty here is that our requirements are a little
> contradictory.  We want to prevent all runtime PM callbacks while the
> remove method is running, but we also want the remove method to be able
> to carry out its own runtime PM activities.
> 
> So maybe what we really need is more like a barrier.  That is,
> something that will do a "get", wait for outstanding callbacks to
> finish, carry out a resume if one is pending, and cancel other pending
> requests.  This could easily share code with pm_runtime_disable.  We 
> should be able to use this for both probe and remove.

Isn't it what's done in rev. 14?

pm_runtime_disable(dev);
pm_runtime_get_noresume(dev);
pm_runtime_enable(dev);

is exactly a barrier like this.  How exactly would you like to implement it
instead?

> We will also need to be a lot more careful about handling runtime PM 
> during system sleep transitions.  The current code runs a risk of 
> losing remote wakeup requests.  One scenario goes like this:
> 
>      1. The device sends a wakeup request, probably in the form of
> 	an IRQ.
> 
>      2.	The driver fields the interrupt and tells the device to turn
> 	off its interrupt request signal.
> 
>      3. The driver calls pm_request_resume.
> 
>      4. The runtime PM core carries out the resume callback.
> 
> If the core disables runtime PM before step 1 (and before we begin the
> "late" stage of a system sleep, so interrupts still get delivered) then 
> steps 1 and 2 will succeed but step 3 will fail.  The wakeup event will 
> be lost.

The idea is that once the system sleep transition has started, the non-runtime
callbacks are in charge of handling the device.

> Perhaps this means we don't want to disable runtime PM during system 
> sleep callbacks, but instead use the "barrier" scheme.

I'm not really sure about that.  I'd rather do what's right now in the patch
(well, that's why it's in there) until drivers and bus types start using the
runtime PM framework.  If it turns out to be problematic, we'll change it
later.

Best,
Rafael

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
  2009-08-08 21:55                     ` Rafael J. Wysocki
  2009-08-09  2:28                       ` Alan Stern
@ 2009-08-09  2:28                       ` Alan Stern
  2009-08-09 13:10                         ` Rafael J. Wysocki
  2009-08-09 13:10                         ` Rafael J. Wysocki
  1 sibling, 2 replies; 39+ messages in thread
From: Alan Stern @ 2009-08-09  2:28 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux-pm mailing list, Magnus Damm, Greg KH, Pavel Machek,
	Len Brown, LKML

On Sat, 8 Aug 2009, Rafael J. Wysocki wrote:

> > The problem with this is that it calls pm_runtime_disable() at a time 
> > when the driver is still supposed to be in control of the device.  
> > Interfering with the driver's legitimate activity in this way is a bad 
> > thing to do.
> > 
> > The difficulty here is that our requirements are a little
> > contradictory.  We want to prevent all runtime PM callbacks while the
> > remove method is running, but we also want the remove method to be able
> > to carry out its own runtime PM activities.
> > 
> > So maybe what we really need is more like a barrier.  That is,
> > something that will do a "get", wait for outstanding callbacks to
> > finish, carry out a resume if one is pending, and cancel other pending
> > requests.  This could easily share code with pm_runtime_disable.  We 
> > should be able to use this for both probe and remove.
> 
> Isn't it what's done in rev. 14?
> 
> pm_runtime_disable(dev);
> pm_runtime_get_noresume(dev);
> pm_runtime_enable(dev);
> 
> is exactly a barrier like this.

It's not exactly the same because it disables runtime PM for a short 
time.  A barrier never disables runtime PM.

>  How exactly would you like to implement it
> instead?

As described above.  The barrier would be equivalent to
pm_runtime_get_noresume followed by pm_runtime_disable except that it
wouldn't actually disable anything.

> > Perhaps this means we don't want to disable runtime PM during system
> > sleep callbacks, but instead use the "barrier" scheme.
> 
> I'm not really sure about that.  I'd rather do what's right now in the patch
> (well, that's why it's in there) until drivers and bus types start using the
> runtime PM framework.  If it turns out to be problematic, we'll change it
> later.

All right.  Since it involves a race, the problem may not show up for a
while.

Alan Stern


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
  2009-08-08 21:55                     ` Rafael J. Wysocki
@ 2009-08-09  2:28                       ` Alan Stern
  2009-08-09  2:28                       ` Alan Stern
  1 sibling, 0 replies; 39+ messages in thread
From: Alan Stern @ 2009-08-09  2:28 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Greg KH, LKML, Linux-pm mailing list

On Sat, 8 Aug 2009, Rafael J. Wysocki wrote:

> > The problem with this is that it calls pm_runtime_disable() at a time 
> > when the driver is still supposed to be in control of the device.  
> > Interfering with the driver's legitimate activity in this way is a bad 
> > thing to do.
> > 
> > The difficulty here is that our requirements are a little
> > contradictory.  We want to prevent all runtime PM callbacks while the
> > remove method is running, but we also want the remove method to be able
> > to carry out its own runtime PM activities.
> > 
> > So maybe what we really need is more like a barrier.  That is,
> > something that will do a "get", wait for outstanding callbacks to
> > finish, carry out a resume if one is pending, and cancel other pending
> > requests.  This could easily share code with pm_runtime_disable.  We 
> > should be able to use this for both probe and remove.
> 
> Isn't it what's done in rev. 14?
> 
> pm_runtime_disable(dev);
> pm_runtime_get_noresume(dev);
> pm_runtime_enable(dev);
> 
> is exactly a barrier like this.

It's not exactly the same because it disables runtime PM for a short 
time.  A barrier never disables runtime PM.

>  How exactly would you like to implement it
> instead?

As described above.  The barrier would be equivalent to
pm_runtime_get_noresume followed by pm_runtime_disable except that it
wouldn't actually disable anything.

> > Perhaps this means we don't want to disable runtime PM during system
> > sleep callbacks, but instead use the "barrier" scheme.
> 
> I'm not really sure about that.  I'd rather do what's right now in the patch
> (well, that's why it's in there) until drivers and bus types start using the
> runtime PM framework.  If it turns out to be problematic, we'll change it
> later.

All right.  Since it involves a race, the problem may not show up for a
while.

Alan Stern

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
  2009-08-09  2:28                       ` Alan Stern
  2009-08-09 13:10                         ` Rafael J. Wysocki
@ 2009-08-09 13:10                         ` Rafael J. Wysocki
  2009-08-09 15:19                           ` Alan Stern
  2009-08-09 15:19                           ` Alan Stern
  1 sibling, 2 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2009-08-09 13:10 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux-pm mailing list, Magnus Damm, Greg KH, Pavel Machek,
	Len Brown, LKML

On Sunday 09 August 2009, Alan Stern wrote:
> On Sat, 8 Aug 2009, Rafael J. Wysocki wrote:
> 
> > > The problem with this is that it calls pm_runtime_disable() at a time 
> > > when the driver is still supposed to be in control of the device.  
> > > Interfering with the driver's legitimate activity in this way is a bad 
> > > thing to do.
> > > 
> > > The difficulty here is that our requirements are a little
> > > contradictory.  We want to prevent all runtime PM callbacks while the
> > > remove method is running, but we also want the remove method to be able
> > > to carry out its own runtime PM activities.
> > > 
> > > So maybe what we really need is more like a barrier.  That is,
> > > something that will do a "get", wait for outstanding callbacks to
> > > finish, carry out a resume if one is pending, and cancel other pending
> > > requests.  This could easily share code with pm_runtime_disable.  We 
> > > should be able to use this for both probe and remove.
> > 
> > Isn't it what's done in rev. 14?
> > 
> > pm_runtime_disable(dev);
> > pm_runtime_get_noresume(dev);
> > pm_runtime_enable(dev);
> > 
> > is exactly a barrier like this.
> 
> It's not exactly the same because it disables runtime PM for a short 
> time.  A barrier never disables runtime PM.
> 
> >  How exactly would you like to implement it
> > instead?
> 
> As described above.  The barrier would be equivalent to
> pm_runtime_get_noresume followed by pm_runtime_disable except that it
> wouldn't actually disable anything.

OK, I can do that, but the only difference between that and the above sequence
of three calls will be the possibility to call resume helpers while the
"barrier" is in progress.

> > > Perhaps this means we don't want to disable runtime PM during system
> > > sleep callbacks, but instead use the "barrier" scheme.
> > 
> > I'm not really sure about that.  I'd rather do what's right now in the patch
> > (well, that's why it's in there) until drivers and bus types start using the
> > runtime PM framework.  If it turns out to be problematic, we'll change it
> > later.
> 
> All right.  Since it involves a race, the problem may not show up for a
> while.

Allowing runtime PM helpers to be run during system sleep transitions would be
problematic IMHO, because the run-time PM 'states' are not well defined at that
time.  Consequently, the rules that the PM helpers follow do not really hold
during system sleep transitions.

Also, in principle the device driver's ->suspend() routine  (the non-runtime
one), or even the ->prepare() callback, may notice that the remote wake-up has
happened and put the device back into the full power state and return -EBUSY.

Still, we can allow runtime PM requests to be put into the workqueue during
system sleep transitions, to be executed after the resume (or in case the
suspend fails, that will make the action described in the previous paragraph
somewhat easier).  It seems we'd need a separate flag for it, though.

Best,
Rafael

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
  2009-08-09  2:28                       ` Alan Stern
@ 2009-08-09 13:10                         ` Rafael J. Wysocki
  2009-08-09 13:10                         ` Rafael J. Wysocki
  1 sibling, 0 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2009-08-09 13:10 UTC (permalink / raw)
  To: Alan Stern; +Cc: Greg KH, LKML, Linux-pm mailing list

On Sunday 09 August 2009, Alan Stern wrote:
> On Sat, 8 Aug 2009, Rafael J. Wysocki wrote:
> 
> > > The problem with this is that it calls pm_runtime_disable() at a time 
> > > when the driver is still supposed to be in control of the device.  
> > > Interfering with the driver's legitimate activity in this way is a bad 
> > > thing to do.
> > > 
> > > The difficulty here is that our requirements are a little
> > > contradictory.  We want to prevent all runtime PM callbacks while the
> > > remove method is running, but we also want the remove method to be able
> > > to carry out its own runtime PM activities.
> > > 
> > > So maybe what we really need is more like a barrier.  That is,
> > > something that will do a "get", wait for outstanding callbacks to
> > > finish, carry out a resume if one is pending, and cancel other pending
> > > requests.  This could easily share code with pm_runtime_disable.  We 
> > > should be able to use this for both probe and remove.
> > 
> > Isn't it what's done in rev. 14?
> > 
> > pm_runtime_disable(dev);
> > pm_runtime_get_noresume(dev);
> > pm_runtime_enable(dev);
> > 
> > is exactly a barrier like this.
> 
> It's not exactly the same because it disables runtime PM for a short 
> time.  A barrier never disables runtime PM.
> 
> >  How exactly would you like to implement it
> > instead?
> 
> As described above.  The barrier would be equivalent to
> pm_runtime_get_noresume followed by pm_runtime_disable except that it
> wouldn't actually disable anything.

OK, I can do that, but the only difference between that and the above sequence
of three calls will be the possibility to call resume helpers while the
"barrier" is in progress.

> > > Perhaps this means we don't want to disable runtime PM during system
> > > sleep callbacks, but instead use the "barrier" scheme.
> > 
> > I'm not really sure about that.  I'd rather do what's right now in the patch
> > (well, that's why it's in there) until drivers and bus types start using the
> > runtime PM framework.  If it turns out to be problematic, we'll change it
> > later.
> 
> All right.  Since it involves a race, the problem may not show up for a
> while.

Allowing runtime PM helpers to be run during system sleep transitions would be
problematic IMHO, because the run-time PM 'states' are not well defined at that
time.  Consequently, the rules that the PM helpers follow do not really hold
during system sleep transitions.

Also, in principle the device driver's ->suspend() routine  (the non-runtime
one), or even the ->prepare() callback, may notice that the remote wake-up has
happened and put the device back into the full power state and return -EBUSY.

Still, we can allow runtime PM requests to be put into the workqueue during
system sleep transitions, to be executed after the resume (or in case the
suspend fails, that will make the action described in the previous paragraph
somewhat easier).  It seems we'd need a separate flag for it, though.

Best,
Rafael

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
  2009-08-09 13:10                         ` Rafael J. Wysocki
  2009-08-09 15:19                           ` Alan Stern
@ 2009-08-09 15:19                           ` Alan Stern
  2009-08-09 20:49                             ` Rafael J. Wysocki
  2009-08-09 20:49                             ` Rafael J. Wysocki
  1 sibling, 2 replies; 39+ messages in thread
From: Alan Stern @ 2009-08-09 15:19 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux-pm mailing list, Magnus Damm, Greg KH, Pavel Machek,
	Len Brown, LKML

On Sun, 9 Aug 2009, Rafael J. Wysocki wrote:

> > >  How exactly would you like to implement it
> > > instead?
> > 
> > As described above.  The barrier would be equivalent to
> > pm_runtime_get_noresume followed by pm_runtime_disable except that it
> > wouldn't actually disable anything.
> 
> OK, I can do that, but the only difference between that and the above sequence
> of three calls will be the possibility to call resume helpers while the
> "barrier" is in progress.

Exactly.  In other words, if the driver tries to carry out a resume
while the barrier is running, the resume won't get lost.  Whereas with 
the temporarily-disable approach, it _would_ get lost.

> Allowing runtime PM helpers to be run during system sleep transitions would be
> problematic IMHO, because the run-time PM 'states' are not well defined at that
> time.  Consequently, the rules that the PM helpers follow do not really hold
> during system sleep transitions.

The workqueue will be frozen, so runtime PM helpers will run only if
they are invoked more or less directly by the driver (i.e., through
pm_runtime_resume, ...).  I think we should allow drivers to do what
they want, especially between the "prepare" and "suspend" stages.

> Also, in principle the device driver's ->suspend() routine  (the non-runtime
> one), or even the ->prepare() callback, may notice that the remote wake-up has
> happened and put the device back into the full power state and return -EBUSY.

It may.  But then again, it may not -- it may depend on the runtime PM  
core to make sure that resume requests get forwarded appropriately.

Furthermore, if you disable runtime PM _before_ calling the prepare 
method, that leaves a window during which the driver has no reason to 
realize that anything unusual is going on.

> Still, we can allow runtime PM requests to be put into the workqueue during
> system sleep transitions, to be executed after the resume (or in case the
> suspend fails, that will make the action described in the previous paragraph
> somewhat easier).  It seems we'd need a separate flag for it, though.

If every device gets resumed at the end of a system sleep, even the
ones that were runtime-suspended before the sleep began, then there's
no reason to preserve requests in the workqueue.  But if
previously-suspended devices don't get resumed at the end of a system
sleep, then we should allow requests to remain in the workqueue.

In the end, it's probably safer and easier just to leave the workqueue 
alone -- freeze and unfreeze it, but don't meddle with its contents.

The whole question of remote wakeup vs. runtime suspend vs. system 
sleep is complicated, and people haven't dealt with all the issues yet.  
For instance, it seems quite likely that with some devices you would 
want to enable remote wakeup during runtime suspend but not during 
system sleep.  We don't have any good way to do this.

Alan Stern


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
  2009-08-09 13:10                         ` Rafael J. Wysocki
@ 2009-08-09 15:19                           ` Alan Stern
  2009-08-09 15:19                           ` Alan Stern
  1 sibling, 0 replies; 39+ messages in thread
From: Alan Stern @ 2009-08-09 15:19 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Greg KH, LKML, Linux-pm mailing list

On Sun, 9 Aug 2009, Rafael J. Wysocki wrote:

> > >  How exactly would you like to implement it
> > > instead?
> > 
> > As described above.  The barrier would be equivalent to
> > pm_runtime_get_noresume followed by pm_runtime_disable except that it
> > wouldn't actually disable anything.
> 
> OK, I can do that, but the only difference between that and the above sequence
> of three calls will be the possibility to call resume helpers while the
> "barrier" is in progress.

Exactly.  In other words, if the driver tries to carry out a resume
while the barrier is running, the resume won't get lost.  Whereas with 
the temporarily-disable approach, it _would_ get lost.

> Allowing runtime PM helpers to be run during system sleep transitions would be
> problematic IMHO, because the run-time PM 'states' are not well defined at that
> time.  Consequently, the rules that the PM helpers follow do not really hold
> during system sleep transitions.

The workqueue will be frozen, so runtime PM helpers will run only if
they are invoked more or less directly by the driver (i.e., through
pm_runtime_resume, ...).  I think we should allow drivers to do what
they want, especially between the "prepare" and "suspend" stages.

> Also, in principle the device driver's ->suspend() routine  (the non-runtime
> one), or even the ->prepare() callback, may notice that the remote wake-up has
> happened and put the device back into the full power state and return -EBUSY.

It may.  But then again, it may not -- it may depend on the runtime PM  
core to make sure that resume requests get forwarded appropriately.

Furthermore, if you disable runtime PM _before_ calling the prepare 
method, that leaves a window during which the driver has no reason to 
realize that anything unusual is going on.

> Still, we can allow runtime PM requests to be put into the workqueue during
> system sleep transitions, to be executed after the resume (or in case the
> suspend fails, that will make the action described in the previous paragraph
> somewhat easier).  It seems we'd need a separate flag for it, though.

If every device gets resumed at the end of a system sleep, even the
ones that were runtime-suspended before the sleep began, then there's
no reason to preserve requests in the workqueue.  But if
previously-suspended devices don't get resumed at the end of a system
sleep, then we should allow requests to remain in the workqueue.

In the end, it's probably safer and easier just to leave the workqueue 
alone -- freeze and unfreeze it, but don't meddle with its contents.

The whole question of remote wakeup vs. runtime suspend vs. system 
sleep is complicated, and people haven't dealt with all the issues yet.  
For instance, it seems quite likely that with some devices you would 
want to enable remote wakeup during runtime suspend but not during 
system sleep.  We don't have any good way to do this.

Alan Stern

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
  2009-08-09 15:19                           ` Alan Stern
  2009-08-09 20:49                             ` Rafael J. Wysocki
@ 2009-08-09 20:49                             ` Rafael J. Wysocki
  1 sibling, 0 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2009-08-09 20:49 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux-pm mailing list, Magnus Damm, Greg KH, Pavel Machek,
	Len Brown, LKML

On Sunday 09 August 2009, Alan Stern wrote:
> On Sun, 9 Aug 2009, Rafael J. Wysocki wrote:
> 
> > > >  How exactly would you like to implement it
> > > > instead?
> > > 
> > > As described above.  The barrier would be equivalent to
> > > pm_runtime_get_noresume followed by pm_runtime_disable except that it
> > > wouldn't actually disable anything.
> > 
> > OK, I can do that, but the only difference between that and the above sequence
> > of three calls will be the possibility to call resume helpers while the
> > "barrier" is in progress.
> 
> Exactly.  In other words, if the driver tries to carry out a resume
> while the barrier is running, the resume won't get lost.  Whereas with 
> the temporarily-disable approach, it _would_ get lost.
> 
> > Allowing runtime PM helpers to be run during system sleep transitions would be
> > problematic IMHO, because the run-time PM 'states' are not well defined at that
> > time.  Consequently, the rules that the PM helpers follow do not really hold
> > during system sleep transitions.
> 
> The workqueue will be frozen, so runtime PM helpers will run only if
> they are invoked more or less directly by the driver (i.e., through
> pm_runtime_resume, ...).  I think we should allow drivers to do what
> they want, especially between the "prepare" and "suspend" stages.

Well, I'm not sure if that's a good idea, but also I have no good techincal
arguments against it at the moment.  And I'm too tired to argue. ;-)

> > Also, in principle the device driver's ->suspend() routine  (the non-runtime
> > one), or even the ->prepare() callback, may notice that the remote wake-up has
> > happened and put the device back into the full power state and return -EBUSY.
> 
> It may.  But then again, it may not -- it may depend on the runtime PM  
> core to make sure that resume requests get forwarded appropriately.
> 
> Furthermore, if you disable runtime PM _before_ calling the prepare 
> method, that leaves a window during which the driver has no reason to 
> realize that anything unusual is going on.
> 
> > Still, we can allow runtime PM requests to be put into the workqueue during
> > system sleep transitions, to be executed after the resume (or in case the
> > suspend fails, that will make the action described in the previous paragraph
> > somewhat easier).  It seems we'd need a separate flag for it, though.
> 
> If every device gets resumed at the end of a system sleep, even the
> ones that were runtime-suspended before the sleep began, then there's
> no reason to preserve requests in the workqueue.  But if
> previously-suspended devices don't get resumed at the end of a system
> sleep, then we should allow requests to remain in the workqueue.

We also should preserve the requests in case the system sleep transition
fails.

> In the end, it's probably safer and easier just to leave the workqueue 
> alone -- freeze and unfreeze it, but don't meddle with its contents.
> 
> The whole question of remote wakeup vs. runtime suspend vs. system 
> sleep is complicated, and people haven't dealt with all the issues yet.

Agreed.
 
> For instance, it seems quite likely that with some devices you would 
> want to enable remote wakeup during runtime suspend but not during 
> system sleep.  We don't have any good way to do this.

Yes, for now we have to assume that any device with wakeup enabled is a
wakeup device.

OK, I'll post the new version of the patch shortly.  Please check if the
barrier mechanism is implemeted and used correctly.

Best,
Rafael

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13)
  2009-08-09 15:19                           ` Alan Stern
@ 2009-08-09 20:49                             ` Rafael J. Wysocki
  2009-08-09 20:49                             ` Rafael J. Wysocki
  1 sibling, 0 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2009-08-09 20:49 UTC (permalink / raw)
  To: Alan Stern; +Cc: Greg KH, LKML, Linux-pm mailing list

On Sunday 09 August 2009, Alan Stern wrote:
> On Sun, 9 Aug 2009, Rafael J. Wysocki wrote:
> 
> > > >  How exactly would you like to implement it
> > > > instead?
> > > 
> > > As described above.  The barrier would be equivalent to
> > > pm_runtime_get_noresume followed by pm_runtime_disable except that it
> > > wouldn't actually disable anything.
> > 
> > OK, I can do that, but the only difference between that and the above sequence
> > of three calls will be the possibility to call resume helpers while the
> > "barrier" is in progress.
> 
> Exactly.  In other words, if the driver tries to carry out a resume
> while the barrier is running, the resume won't get lost.  Whereas with 
> the temporarily-disable approach, it _would_ get lost.
> 
> > Allowing runtime PM helpers to be run during system sleep transitions would be
> > problematic IMHO, because the run-time PM 'states' are not well defined at that
> > time.  Consequently, the rules that the PM helpers follow do not really hold
> > during system sleep transitions.
> 
> The workqueue will be frozen, so runtime PM helpers will run only if
> they are invoked more or less directly by the driver (i.e., through
> pm_runtime_resume, ...).  I think we should allow drivers to do what
> they want, especially between the "prepare" and "suspend" stages.

Well, I'm not sure if that's a good idea, but also I have no good techincal
arguments against it at the moment.  And I'm too tired to argue. ;-)

> > Also, in principle the device driver's ->suspend() routine  (the non-runtime
> > one), or even the ->prepare() callback, may notice that the remote wake-up has
> > happened and put the device back into the full power state and return -EBUSY.
> 
> It may.  But then again, it may not -- it may depend on the runtime PM  
> core to make sure that resume requests get forwarded appropriately.
> 
> Furthermore, if you disable runtime PM _before_ calling the prepare 
> method, that leaves a window during which the driver has no reason to 
> realize that anything unusual is going on.
> 
> > Still, we can allow runtime PM requests to be put into the workqueue during
> > system sleep transitions, to be executed after the resume (or in case the
> > suspend fails, that will make the action described in the previous paragraph
> > somewhat easier).  It seems we'd need a separate flag for it, though.
> 
> If every device gets resumed at the end of a system sleep, even the
> ones that were runtime-suspended before the sleep began, then there's
> no reason to preserve requests in the workqueue.  But if
> previously-suspended devices don't get resumed at the end of a system
> sleep, then we should allow requests to remain in the workqueue.

We also should preserve the requests in case the system sleep transition
fails.

> In the end, it's probably safer and easier just to leave the workqueue 
> alone -- freeze and unfreeze it, but don't meddle with its contents.
> 
> The whole question of remote wakeup vs. runtime suspend vs. system 
> sleep is complicated, and people haven't dealt with all the issues yet.

Agreed.
 
> For instance, it seems quite likely that with some devices you would 
> want to enable remote wakeup during runtime suspend but not during 
> system sleep.  We don't have any good way to do this.

Yes, for now we have to assume that any device with wakeup enabled is a
wakeup device.

OK, I'll post the new version of the patch shortly.  Please check if the
barrier mechanism is implemeted and used correctly.

Best,
Rafael

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2009-08-09 20:49 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-08-03 21:36 [Resend][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 11) Rafael J. Wysocki
2009-08-04 20:33 ` Alan Stern
2009-08-04 20:33 ` Alan Stern
2009-08-05  0:19   ` Rafael J. Wysocki
2009-08-05  0:19   ` Rafael J. Wysocki
2009-08-05  2:44     ` Alan Stern
2009-08-05  2:44     ` Alan Stern
2009-08-05 13:25       ` Rafael J. Wysocki
2009-08-05 21:47         ` [PATCH update] PM: Introduce core framework for run-time PM of I/O devices (rev. 12) Rafael J. Wysocki
2009-08-05 21:47         ` Rafael J. Wysocki
2009-08-06 17:01           ` Alan Stern
2009-08-06 17:01           ` Alan Stern
2009-08-06 21:50             ` Rafael J. Wysocki
2009-08-06 21:50             ` Rafael J. Wysocki
2009-08-07 13:59               ` Alan Stern
2009-08-07 13:59               ` Alan Stern
2009-08-06 21:53             ` [PATCH update x2] PM: Introduce core framework for run-time PM of I/O devices (rev. 13) Rafael J. Wysocki
2009-08-07  7:45               ` Magnus Damm
2009-08-07  7:45               ` Magnus Damm
2009-08-07 13:54                 ` Rafael J. Wysocki
2009-08-07 13:54                 ` Rafael J. Wysocki
2009-08-07 15:41               ` Alan Stern
2009-08-08 14:03                 ` Rafael J. Wysocki
2009-08-08 14:03                 ` Rafael J. Wysocki
2009-08-08 15:50                   ` Alan Stern
2009-08-08 15:50                   ` Alan Stern
2009-08-08 21:55                     ` Rafael J. Wysocki
2009-08-08 21:55                     ` Rafael J. Wysocki
2009-08-09  2:28                       ` Alan Stern
2009-08-09  2:28                       ` Alan Stern
2009-08-09 13:10                         ` Rafael J. Wysocki
2009-08-09 13:10                         ` Rafael J. Wysocki
2009-08-09 15:19                           ` Alan Stern
2009-08-09 15:19                           ` Alan Stern
2009-08-09 20:49                             ` Rafael J. Wysocki
2009-08-09 20:49                             ` Rafael J. Wysocki
2009-08-07 15:41               ` Alan Stern
2009-08-06 21:53             ` Rafael J. Wysocki
2009-08-05 13:25       ` [Resend][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 11) Rafael J. Wysocki

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.