* [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8)
@ 2009-07-06  0:52 Rafael J. Wysocki
From: Rafael J. Wysocki @ 2009-07-06  0:52 UTC
  To: Alan Stern, Linux-pm mailing list
  Cc: Greg KH, LKML, ACPI Devel Maling List, Ingo Molnar, Arjan van de Ven

Hi,

Here is rev. 8 of the run-time PM framework patch.

Highlights:
* I did my best to follow the design we've recently discussed.
* pm_runtime_[get|put]() and the sync versions call
  pm_[request|runtime]_[resume|idle](), because I don't see much point in
  manipulating the usage counter alone.
* pm_runtime_disable() carries out a (synchronous) wake-up if there's a
  resume request pending.
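
To illustrate the get/put pairing described above, here is a rough userspace
sketch (not part of the patch).  All toy_* names are made up for the example,
locking and the workqueue are left out, and it assumes that the bus type's
->runtime_idle() callback suspends the device immediately, which a real bus
type need not do.

```c
#include <assert.h>

/* Toy userspace model; none of the toy_* names exist in the patch. */
enum toy_status { TOY_ACTIVE, TOY_SUSPENDED };

struct toy_dev {
	int usage_count;		/* models dev->power.usage_count */
	enum toy_status status;		/* models dev->power.runtime_status */
};

/* Models pm_runtime_resume(): 1 if already active, 0 if resumed now. */
static int toy_resume(struct toy_dev *dev)
{
	if (dev->status == TOY_ACTIVE)
		return 1;
	dev->status = TOY_ACTIVE;
	return 0;
}

/*
 * Models pm_runtime_idle(), ASSUMING the bus type's ->runtime_idle()
 * callback suspends the device right away.  -1 stands in for -EAGAIN.
 */
static int toy_idle(struct toy_dev *dev)
{
	if (dev->usage_count > 0 || dev->status == TOY_SUSPENDED)
		return -1;
	dev->status = TOY_SUSPENDED;
	return 0;
}

/* Models pm_runtime_get_sync(): bump the counter, then resume. */
static int toy_get_sync(struct toy_dev *dev)
{
	dev->usage_count++;
	return toy_resume(dev);
}

/* Models pm_runtime_put_sync(): drop the counter (not below 0), then idle. */
static int toy_put_sync(struct toy_dev *dev)
{
	if (dev->usage_count > 0)
		dev->usage_count--;
	return toy_idle(dev);
}

/* Two users take and release the device; returns 0 when the walk-through holds. */
int toy_demo(void)
{
	struct toy_dev d = { 0, TOY_SUSPENDED };

	assert(toy_get_sync(&d) == 0);	/* first user wakes the device up */
	assert(toy_get_sync(&d) == 1);	/* second user finds it active */
	assert(toy_put_sync(&d) == -1);	/* still in use, idle is refused */
	assert(toy_put_sync(&d) == 0);	/* last user gone, device suspends */
	return d.status == TOY_SUSPENDED ? 0 : 1;
}
```

The point of the sketch is only that the counter is never touched without
also kicking a resume or an idle notification, so callers cannot leave the
counter and the device status out of step by forgetting the second call.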

Comments welcome.

Best,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>

Introduce a core framework for run-time power management of I/O
devices.  Add device run-time PM fields to 'struct dev_pm_info'
and device run-time PM callbacks to 'struct dev_pm_ops'.  Introduce
a run-time PM workqueue and define some device run-time PM helper
functions at the core level.

Not-yet-signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/base/dd.c            |   10 
 drivers/base/power/Makefile  |    1 
 drivers/base/power/main.c    |   21 -
 drivers/base/power/power.h   |   11 
 drivers/base/power/runtime.c |  901 +++++++++++++++++++++++++++++++++++++++++++
 include/linux/pm.h           |  102 ++++
 include/linux/pm_runtime.h   |  105 +++++
 kernel/power/Kconfig         |   14 
 kernel/power/main.c          |   17 
 9 files changed, 1170 insertions(+), 12 deletions(-)

Index: linux-2.6/kernel/power/Kconfig
===================================================================
--- linux-2.6.orig/kernel/power/Kconfig
+++ linux-2.6/kernel/power/Kconfig
@@ -208,3 +208,17 @@ config APM_EMULATION
 	  random kernel OOPSes or reboots that don't seem to be related to
 	  anything, try disabling/enabling this option (or disabling/enabling
 	  APM in your BIOS).
+
+config PM_RUNTIME
+	bool "Run-time PM core functionality"
+	depends on PM
+	---help---
+	  Enable functionality allowing I/O devices to be put into energy-saving
+	  (low power) states at run time (or autosuspended) after a specified
+	  period of inactivity and woken up in response to a hardware-generated
+	  wake-up event or a driver's request.
+
+	  Hardware support is generally required for this functionality to work
+	  and the bus type drivers of the buses the devices are on are
+	  responsible for the actual handling of the autosuspend requests and
+	  wake-up events.
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -11,6 +11,7 @@
 #include <linux/kobject.h>
 #include <linux/string.h>
 #include <linux/resume-trace.h>
+#include <linux/workqueue.h>
 
 #include "power.h"
 
@@ -217,8 +218,24 @@ static struct attribute_group attr_group
 	.attrs = g,
 };
 
+#ifdef CONFIG_PM_RUNTIME
+struct workqueue_struct *pm_wq;
+
+static int __init pm_start_workqueue(void)
+{
+	pm_wq = create_freezeable_workqueue("pm");
+
+	return pm_wq ? 0 : -ENOMEM;
+}
+#else
+static inline int pm_start_workqueue(void) { return 0; }
+#endif
+
 static int __init pm_init(void)
 {
+	int error = pm_start_workqueue();
+	if (error)
+		return error;
 	power_kobj = kobject_create_and_add("power", NULL);
 	if (!power_kobj)
 		return -ENOMEM;
Index: linux-2.6/include/linux/pm.h
===================================================================
--- linux-2.6.orig/include/linux/pm.h
+++ linux-2.6/include/linux/pm.h
@@ -22,6 +22,10 @@
 #define _LINUX_PM_H
 
 #include <linux/list.h>
+#include <linux/workqueue.h>
+#include <linux/spinlock.h>
+#include <linux/wait.h>
+#include <linux/timer.h>
 
 /*
  * Callbacks for platform drivers to implement.
@@ -165,6 +169,28 @@ typedef struct pm_message {
  * It is allowed to unregister devices while the above callbacks are being
  * executed.  However, it is not allowed to unregister a device from within any
  * of its own callbacks.
+ *
+ * There also are the following callbacks related to run-time power management
+ * of devices:
+ *
+ * @runtime_suspend: Prepare the device for a condition in which it won't be
+ *	able to communicate with the CPU(s) and RAM due to power management.
+ *	This need not mean that the device should be put into a low power state.
+ *	For example, if the device is behind a link which is about to be turned
+ *	off, the device may remain at full power.  If the device does go to low
+ *	power and if device_may_wakeup(dev) is true, remote wake-up (i.e., a
+ *	hardware mechanism allowing the device to request a change of its power
+ *	state, such as PCI PME) should be enabled for it.
+ *
+ * @runtime_resume: Put the device into the fully active state in response to a
+ *	wake-up event generated by hardware or at the request of software.  If
+ *	necessary, put the device into the full power state and restore its
+ *	registers, so that it is fully operational.
+ *
+ * @runtime_idle: Device appears to be inactive and it might be put into a low
+ *	power state if all of the necessary conditions are satisfied.  Check
+ *	these conditions and handle the device as appropriate, possibly queueing
+ *	a suspend request for it.
  */
 
 struct dev_pm_ops {
@@ -182,6 +208,9 @@ struct dev_pm_ops {
 	int (*thaw_noirq)(struct device *dev);
 	int (*poweroff_noirq)(struct device *dev);
 	int (*restore_noirq)(struct device *dev);
+	int (*runtime_suspend)(struct device *dev);
+	int (*runtime_resume)(struct device *dev);
+	void (*runtime_idle)(struct device *dev);
 };
 
 /**
@@ -315,14 +344,81 @@ enum dpm_state {
 	DPM_OFF_IRQ,
 };
 
+/**
+ * Device run-time power management status.
+ *
+ * These status labels are used internally by the PM core to indicate the
+ * current status of a device with respect to the PM core operations.  They do
+ * not reflect the actual power state of the device or its status as seen by the
+ * driver.
+ *
+ * RPM_ACTIVE		Device is fully operational.  Indicates that the device
+ *			bus type's ->runtime_resume() callback has completed
+ *			successfully.
+ *
+ * RPM_SUSPENDED	Device bus type's ->runtime_suspend() callback has
+ *			completed successfully.  The device is regarded as
+ *			suspended.
+ *
+ * RPM_RESUMING		Device bus type's ->runtime_resume() callback is being
+ *			executed.
+ *
+ * RPM_SUSPENDING	Device bus type's ->runtime_suspend() callback is being
+ *			executed.
+ */
+
+enum rpm_status {
+	RPM_ACTIVE = 0,
+	RPM_RESUMING,
+	RPM_SUSPENDED,
+	RPM_SUSPENDING,
+};
+
+/**
+ * Device run-time power management request types.
+ *
+ * RPM_REQ_NONE		Do nothing.
+ *
+ * RPM_REQ_IDLE		Run the device bus type's ->runtime_idle() callback
+ *
+ * RPM_REQ_SUSPEND	Run the device bus type's ->runtime_suspend() callback
+ *
+ * RPM_REQ_RESUME	Run the device bus type's ->runtime_resume() callback
+ */
+
+enum rpm_request {
+	RPM_REQ_NONE = 0,
+	RPM_REQ_IDLE,
+	RPM_REQ_SUSPEND,
+	RPM_REQ_RESUME,
+};
+
 struct dev_pm_info {
 	pm_message_t		power_state;
-	unsigned		can_wakeup:1;
-	unsigned		should_wakeup:1;
+	unsigned int		can_wakeup:1;
+	unsigned int		should_wakeup:1;
 	enum dpm_state		status;		/* Owned by the PM core */
-#ifdef	CONFIG_PM_SLEEP
+#ifdef CONFIG_PM_SLEEP
 	struct list_head	entry;
 #endif
+#ifdef CONFIG_PM_RUNTIME
+	struct timer_list	suspend_timer;
+	unsigned long		timer_expires;
+	struct work_struct	work;
+	wait_queue_head_t	wait_queue;
+	spinlock_t		lock;
+	atomic_t		usage_count;
+	atomic_t		child_count;
+	unsigned int		ignore_children:1;
+	unsigned int		runtime_disabled:1;
+	unsigned int		runtime_failure:1;
+	unsigned int		idle_notification:1;
+	unsigned int		request_pending:1;
+	unsigned int		deferred_resume:1;
+	enum rpm_request	request;
+	enum rpm_status		runtime_status;
+	int			last_error;
+#endif
 };
 
 /*
Index: linux-2.6/drivers/base/power/Makefile
===================================================================
--- linux-2.6.orig/drivers/base/power/Makefile
+++ linux-2.6/drivers/base/power/Makefile
@@ -1,5 +1,6 @@
 obj-$(CONFIG_PM)	+= sysfs.o
 obj-$(CONFIG_PM_SLEEP)	+= main.o
+obj-$(CONFIG_PM_RUNTIME)	+= runtime.o
 obj-$(CONFIG_PM_TRACE_RTC)	+= trace.o
 
 ccflags-$(CONFIG_DEBUG_DRIVER) := -DDEBUG
Index: linux-2.6/drivers/base/power/runtime.c
===================================================================
--- /dev/null
+++ linux-2.6/drivers/base/power/runtime.c
@@ -0,0 +1,901 @@
+/*
+ * drivers/base/power/runtime.c - Helper functions for device run-time PM
+ *
+ * Copyright (c) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+ *
+ * This file is released under the GPLv2.
+ */
+
+#include <linux/sched.h>
+#include <linux/pm_runtime.h>
+#include <linux/jiffies.h>
+
+static int __pm_request_resume(struct device *dev);
+
+/**
+ * pm_runtime_deactivate_timer - Deactivate given device's suspend timer.
+ * @dev: Device to handle.
+ */
+static void pm_runtime_deactivate_timer(struct device *dev)
+{
+	if (dev->power.timer_expires > 0) {
+		del_timer(&dev->power.suspend_timer);
+		dev->power.timer_expires = 0;
+	}
+}
+
+/**
+ * pm_runtime_cancel_pending - Deactivate suspend timer and cancel requests.
+ * @dev: Device to handle.
+ */
+static void pm_runtime_cancel_pending(struct device *dev)
+{
+	pm_runtime_deactivate_timer(dev);
+	/*
+	 * If there's a request pending, make sure its work function will return
+	 * without doing anything.
+	 */
+	if (dev->power.request_pending)
+		dev->power.request = RPM_REQ_NONE;
+}
+
+/**
+ * __pm_runtime_idle - Notify device bus type if the device can be suspended.
+ * @dev: Device to notify the bus type about.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_runtime_idle(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_failure)
+		retval = -EINVAL;
+	else if (dev->power.idle_notification)
+		retval = -EINPROGRESS;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.runtime_disabled
+	    || dev->power.timer_expires > 0
+	    || dev->power.runtime_status == RPM_SUSPENDED
+	    || dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		return retval;
+
+	if (dev->power.request_pending) {
+		/*
+		 * If an idle notification request is pending, cancel it.  Any
+		 * other pending request takes precedence over us.
+		 */
+		if (dev->power.request == RPM_REQ_IDLE)
+			dev->power.request = RPM_REQ_NONE;
+		else if (dev->power.request != RPM_REQ_NONE)
+			return -EAGAIN;
+	}
+
+	dev->power.idle_notification = true;
+
+	spin_unlock_irq(&dev->power.lock);
+
+	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle)
+		dev->bus->pm->runtime_idle(dev);
+
+	spin_lock_irq(&dev->power.lock);
+
+	dev->power.idle_notification = false;
+	wake_up_all(&dev->power.wait_queue);
+
+	return 0;
+}
+
+/**
+ * pm_runtime_idle - Notify device bus type if the device can be suspended.
+ * @dev: Device to notify the bus type about.
+ */
+int pm_runtime_idle(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_idle(dev);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_idle);
+
+/**
+ * __pm_runtime_suspend - Carry out run-time suspend of given device.
+ * @dev: Device to suspend.
+ * @from_wq: If set, the function has been called via pm_wq.
+ *
+ * Check if the device can be suspended and run the ->runtime_suspend() callback
+ * provided by its bus type.  If another suspend has been started earlier, wait
+ * for it to finish.  If there's an idle notification pending, cancel it.  If
+ * there's a suspend request scheduled while this function is running, cancel
+ * that request.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+int __pm_runtime_suspend(struct device *dev, bool from_wq)
+{
+	struct device *parent = NULL;
+	bool notify = false;
+	int retval = 0;
+
+ repeat:
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/* Pending resume requests take precedence over us. */
+		if (dev->power.request == RPM_REQ_RESUME)
+			return -EAGAIN;
+		/* Other pending requests need to be canceled. */
+		dev->power.request = RPM_REQ_NONE;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_RESUMING
+	    || dev->power.runtime_disabled
+	    || atomic_read(&dev->power.usage_count) > 0)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		return retval;
+
+	if (dev->power.runtime_status == RPM_SUSPENDING) {
+		DEFINE_WAIT(wait);
+
+		if (from_wq)
+			return -EINPROGRESS;
+
+		/* Wait for the other suspend running in parallel with us. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_SUSPENDING)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+		goto repeat;
+	}
+
+	dev->power.runtime_status = RPM_SUSPENDING;
+
+	spin_unlock_irq(&dev->power.lock);
+
+	retval = dev->bus && dev->bus->pm && dev->bus->pm->runtime_suspend ?
+		dev->bus->pm->runtime_suspend(dev) : -ENOSYS;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (retval) {
+		dev->power.runtime_status = RPM_ACTIVE;
+		pm_runtime_cancel_pending(dev);
+		dev->power.deferred_resume = false;
+
+		if (retval == -EAGAIN || retval == -EBUSY) {
+			notify = true;
+		} else {
+			dev->power.runtime_failure = true;
+			dev->power.last_error = retval;
+		}
+	} else {
+		dev->power.runtime_status = RPM_SUSPENDED;
+
+		if (dev->parent) {
+			parent = dev->parent;
+			atomic_add_unless(&parent->power.child_count, -1, 0);
+		}
+
+	}
+	wake_up_all(&dev->power.wait_queue);
+
+	if (dev->power.deferred_resume) {
+		__pm_request_resume(dev);
+		dev->power.deferred_resume = false;
+	}
+
+	spin_unlock_irq(&dev->power.lock);
+
+	if (parent && !parent->power.ignore_children)
+		pm_request_idle(parent);
+
+	if (notify)
+		pm_runtime_idle(dev);
+
+	spin_lock_irq(&dev->power.lock);
+
+	return retval;
+}
+
+/**
+ * pm_runtime_suspend - Carry out run-time suspend of given device.
+ * @dev: Device to suspend.
+ */
+int pm_runtime_suspend(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_suspend(dev, false);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_suspend);
+
+/**
+ * __pm_runtime_resume - Carry out run-time resume of given device.
+ * @dev: Device to resume.
+ * @from_wq: If set, the function has been called via pm_wq.
+ *
+ * Check if the device can be woken up and run the ->runtime_resume() callback
+ * provided by its bus type.  If another resume has been started earlier, wait
+ * for it to finish.  If there's a suspend running in parallel with this
+ * function, wait for it to finish and resume the device.  If there's a suspend
+ * request or idle notification pending, cancel it.  If there's a resume request
+ * scheduled while this function is running, cancel that request.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+int __pm_runtime_resume(struct device *dev, bool from_wq)
+{
+	struct device *parent = NULL;
+	int retval = 0;
+
+ repeat:
+	if (dev->power.runtime_failure)
+		return -ENODEV;
+
+	pm_runtime_cancel_pending(dev);
+
+	if (dev->power.runtime_status == RPM_ACTIVE)
+		retval = 1;
+	else if (dev->power.runtime_disabled)
+		retval = -EAGAIN;
+	if (retval)
+		return retval;
+
+	if (dev->power.runtime_status == RPM_RESUMING
+	    || dev->power.runtime_status == RPM_SUSPENDING) {
+		DEFINE_WAIT(wait);
+
+		if (from_wq) {
+			if (dev->power.runtime_status == RPM_SUSPENDING)
+				dev->power.deferred_resume = true;
+			return -EINPROGRESS;
+		}
+
+		/* Wait for the operation carried out in parallel with us. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_RESUMING
+			    && dev->power.runtime_status != RPM_SUSPENDING)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+		goto repeat;
+	}
+
+	if (!parent && dev->parent) {
+		/*
+		 * Increment the parent's usage counter and resume it if
+		 * necessary.
+		 */
+		spin_unlock_irq(&dev->power.lock);
+
+		parent = dev->parent;
+		retval = pm_runtime_get_sync(parent);
+		if (retval < 0)
+			goto out_parent;
+
+		spin_lock_irq(&dev->power.lock);
+		retval = 0;
+		goto repeat;
+	}
+
+	dev->power.runtime_status = RPM_RESUMING;
+
+	spin_unlock_irq(&dev->power.lock);
+
+	retval = dev->bus && dev->bus->pm && dev->bus->pm->runtime_resume ?
+		dev->bus->pm->runtime_resume(dev) : -ENOSYS;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (retval) {
+		dev->power.runtime_status = RPM_SUSPENDED;
+
+		dev->power.runtime_failure = true;
+		dev->power.last_error = retval;
+
+		pm_runtime_cancel_pending(dev);
+	} else {
+		dev->power.runtime_status = RPM_ACTIVE;
+
+		if (parent)
+			atomic_inc(&parent->power.child_count);
+	}
+	wake_up_all(&dev->power.wait_queue);
+
+	spin_unlock_irq(&dev->power.lock);
+
+ out_parent:
+	if (parent)
+		pm_runtime_put(parent);
+
+	if (!retval)
+		pm_request_idle(dev);
+
+	spin_lock_irq(&dev->power.lock);
+
+	return retval;
+}
+
+/**
+ * pm_runtime_resume - Carry out run-time resume of given device.
+ * @dev: Device to resume.
+ */
+int pm_runtime_resume(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_resume(dev, false);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_resume);
+
+/**
+ * pm_runtime_work - Universal run-time PM work function.
+ * @work: Work structure used for scheduling the execution of this function.
+ *
+ * Use @work to get the device object the work is to be done for, determine what
+ * is to be done and execute the appropriate run-time PM function.
+ */
+static void pm_runtime_work(struct work_struct *work)
+{
+	struct device *dev = container_of(work, struct device, power.work);
+	enum rpm_request req;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (!dev->power.request_pending)
+		goto out;
+
+	req = dev->power.request;
+	dev->power.request = RPM_REQ_NONE;
+	dev->power.request_pending = false;
+
+	switch (req) {
+	case RPM_REQ_NONE:
+		break;
+	case RPM_REQ_IDLE:
+		__pm_runtime_idle(dev);
+		break;
+	case RPM_REQ_SUSPEND:
+		__pm_runtime_suspend(dev, true);
+		break;
+	case RPM_REQ_RESUME:
+		__pm_runtime_resume(dev, true);
+		break;
+	}
+
+ out:
+	spin_unlock_irq(&dev->power.lock);
+}
+
+/**
+ * pm_request_idle - Submit an idle notification request for given device.
+ * @dev: Device to handle.
+ *
+ * Check if the device's run-time PM status is correct for suspending the device
+ * and queue up a request to run __pm_runtime_idle() for it.
+ */
+int pm_request_idle(struct device *dev)
+{
+	unsigned long flags;
+	int retval = 0;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (dev->power.runtime_failure)
+		retval = -EINVAL;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.runtime_disabled
+	    || dev->power.timer_expires > 0
+	    || dev->power.runtime_status == RPM_SUSPENDED
+	    || dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		goto out;
+
+	if (dev->power.request_pending && dev->power.request != RPM_REQ_NONE) {
+		/* Any requests other than RPM_REQ_IDLE take precedence. */
+		if (dev->power.request != RPM_REQ_IDLE)
+			retval = -EAGAIN;
+		goto out;
+	}
+
+	dev->power.request = RPM_REQ_IDLE;
+	if (dev->power.request_pending)
+		goto out;
+
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_request_idle);
+
+/**
+ * __pm_request_suspend - Submit a suspend request for given device.
+ * @dev: Device to suspend.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_request_suspend(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.runtime_disabled)
+		retval = -EAGAIN;
+	else if (dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EINPROGRESS;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/*
+		 * Pending resume requests take precedence over us, but we can
+		 * overtake any other pending request.
+		 */
+		if (dev->power.request == RPM_REQ_RESUME)
+			retval = -EAGAIN;
+		else if (dev->power.request != RPM_REQ_SUSPEND)
+			dev->power.request = retval ?
+						RPM_REQ_NONE : RPM_REQ_SUSPEND;
+
+		if (dev->power.request == RPM_REQ_SUSPEND)
+			return 0;
+	}
+
+	if (retval)
+		return retval;
+
+	dev->power.request = RPM_REQ_SUSPEND;
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+	return 0;
+}
+
+/**
+ * pm_suspend_timer_fn - Timer function for pm_schedule_suspend().
+ * @data: Device pointer passed by pm_schedule_suspend().
+ *
+ * Check if the time is right and execute __pm_request_suspend() in that case.
+ */
+static void pm_suspend_timer_fn(unsigned long data)
+{
+	struct device *dev = (struct device *)data;
+	unsigned long flags;
+	unsigned long expires;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	expires = dev->power.timer_expires;
+	/* If 'expires' is after 'jiffies' we've been called too early. */
+	if (expires > 0 && !time_after(expires, jiffies)) {
+		dev->power.timer_expires = 0;
+		__pm_request_suspend(dev);
+	}
+
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+}
+
+/**
+ * pm_schedule_suspend - Set up a timer to submit a suspend request in future.
+ * @dev: Device to suspend.
+ * @delay: Time to wait before submitting a suspend request, in milliseconds.
+ */
+int pm_schedule_suspend(struct device *dev, unsigned int delay)
+{
+	unsigned long flags;
+	int retval = 0;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (dev->power.runtime_failure) {
+		retval = -EINVAL;
+		goto out;
+	}
+
+	if (!delay) {
+		retval = __pm_request_suspend(dev);
+		goto out;
+	}
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/*
+		 * Pending resume requests take precedence over us, but any
+		 * other pending requests have to be canceled.
+		 */
+		if (dev->power.request == RPM_REQ_RESUME) {
+			retval = -EAGAIN;
+			goto out;
+		}
+		dev->power.request = RPM_REQ_NONE;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EINPROGRESS;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.runtime_disabled)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		goto out;
+
+	dev->power.timer_expires = jiffies + msecs_to_jiffies(delay);
+	mod_timer(&dev->power.suspend_timer, dev->power.timer_expires);
+
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_schedule_suspend);
+
+/**
+ * __pm_request_resume - Submit a resume request for given device.
+ * @dev: Device to resume.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_request_resume(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	if (dev->power.runtime_status == RPM_ACTIVE)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_RESUMING)
+		retval = -EINPROGRESS;
+	else if (dev->power.runtime_disabled)
+		retval = -EAGAIN;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/* If non-resume request is pending, we can overtake it. */
+		dev->power.request = retval ? RPM_REQ_NONE : RPM_REQ_RESUME;
+		/* There's nothing to do if resume request is pending. */
+		if (dev->power.request == RPM_REQ_RESUME)
+			return 0;
+	}
+
+	if (retval)
+		return retval;
+
+	dev->power.request = RPM_REQ_RESUME;
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+	return retval;
+}
+
+/**
+ * pm_request_resume - Submit a resume request for given device.
+ * @dev: Device to resume.
+ */
+int pm_request_resume(struct device *dev)
+{
+	unsigned long flags;
+	int retval;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+	retval = __pm_request_resume(dev);
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_request_resume);
+
+/**
+ * __pm_runtime_set_status - Set run-time PM status of a device.
+ * @dev: Device to handle.
+ * @status: New run-time PM status of the device.
+ *
+ * If run-time PM of the device is disabled or its power.runtime_failure flag is
+ * set, the status may be changed either to RPM_ACTIVE, or to RPM_SUSPENDED, as
+ * long as that reflects the actual state of the device.  However, if the device
+ * has a parent and the parent is not active, and the parent's
+ * power.ignore_children flag is unset, the device's status cannot be set to
+ * RPM_ACTIVE, so -EBUSY is returned in that case.
+ *
+ * If successful, __pm_runtime_set_status() clears the power.runtime_failure
+ * flag and the device parent's counter of unsuspended children is modified to
+ * reflect the new status.
+ */
+int __pm_runtime_set_status(struct device *dev, unsigned int status)
+{
+	struct device *parent = dev->parent;
+	unsigned long flags;
+	int error = 0;
+
+	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
+		return -EINVAL;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (!dev->power.runtime_failure && !dev->power.runtime_disabled)
+		goto out;
+
+	if (dev->power.runtime_status == status)
+		goto out_clear;
+
+	if (status == RPM_SUSPENDED) {
+		/* It always is possible to set the status to 'suspended'. */
+		if (parent)
+			atomic_add_unless(&parent->power.child_count, -1, 0);
+		dev->power.runtime_status = status;
+		goto out_clear;
+	}
+
+	if (parent) {
+		spin_lock_irq(&parent->power.lock);
+
+		/*
+		 * It may be invalid to put an active child under a suspended
+		 * parent.
+		 */
+		if (parent->power.runtime_status == RPM_ACTIVE
+		    || parent->power.ignore_children) {
+			if (dev->power.runtime_status == RPM_SUSPENDED)
+				atomic_inc(&parent->power.child_count);
+			dev->power.runtime_status = status;
+		} else {
+			error = -EBUSY;
+		}
+
+		spin_unlock_irq(&parent->power.lock);
+
+		if (error)
+			goto out;
+	} else {
+		dev->power.runtime_status = status;
+	}
+
+ out_clear:
+	dev->power.runtime_failure = false;
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return error;
+}
+EXPORT_SYMBOL_GPL(__pm_runtime_set_status);
+
+/**
+ * pm_runtime_enable - Enable run-time PM of a device.
+ * @dev: Device to handle.
+ */
+void pm_runtime_enable(struct device *dev)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (!dev->power.runtime_disabled)
+		goto out;
+
+	if (atomic_dec_and_test(&dev->power.usage_count))
+		dev->power.runtime_disabled = false;
+
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+}
+EXPORT_SYMBOL_GPL(pm_runtime_enable);
+
+/**
+ * pm_runtime_disable - Disable run-time PM of a device.
+ * @dev: Device to handle.
+ *
+ * Set the power.runtime_disabled flag for the device, cancel all pending
+ * run-time PM requests for it and wait for operations in progress to complete.
+ * The device can be either active or suspended after its run-time PM has been
+ * disabled.
+ *
+ * If there's a resume request pending when pm_runtime_disable() is called, it
+ * resumes the device before disabling its run-time PM and returns -EBUSY.
+ * Otherwise, 0 is returned.
+ */
+int pm_runtime_disable(struct device *dev)
+{
+	int retval = 0;
+
+	spin_lock_irq(&dev->power.lock);
+
+	atomic_inc(&dev->power.usage_count);
+
+	if (dev->power.runtime_disabled)
+		goto out;
+
+	/*
+	 * Wake up the device if there's a resume request pending, because that
+	 * means there probably is some I/O to process and we shouldn't prevent
+	 * the device from processing the I/O.
+	 */
+	if (dev->power.request_pending
+	    && dev->power.request == RPM_REQ_RESUME) {
+		__pm_runtime_resume(dev, false);
+		retval = -EBUSY;
+	}
+
+	dev->power.runtime_disabled = true;
+
+	if (dev->power.runtime_failure)
+		goto out;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		dev->power.request = RPM_REQ_NONE;
+
+		spin_unlock_irq(&dev->power.lock);
+
+		cancel_work_sync(&dev->power.work);
+
+		spin_lock_irq(&dev->power.lock);
+
+		dev->power.request_pending = false;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDING
+	    || dev->power.runtime_status == RPM_RESUMING) {
+		DEFINE_WAIT(wait);
+
+		/* Suspend or wake-up in progress. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_SUSPENDING
+			    && dev->power.runtime_status != RPM_RESUMING)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+	}
+
+	if (dev->power.idle_notification) {
+		DEFINE_WAIT(wait);
+
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (!dev->power.idle_notification)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+	}
+
+ out:
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_disable);
+
+/**
+ * pm_runtime_init - Initialize run-time PM fields in given device object.
+ * @dev: Device object to initialize.
+ */
+void pm_runtime_init(struct device *dev)
+{
+	spin_lock_init(&dev->power.lock);
+
+	dev->power.runtime_status = RPM_ACTIVE;
+	dev->power.idle_notification = false;
+
+	dev->power.runtime_disabled = true;
+	atomic_set(&dev->power.usage_count, 1);
+
+	dev->power.runtime_failure = false;
+	dev->power.last_error = 0;
+
+	atomic_set(&dev->power.child_count, 0);
+	pm_suspend_ignore_children(dev, false);
+
+	dev->power.request_pending = false;
+	dev->power.request = RPM_REQ_NONE;
+	dev->power.deferred_resume = false;
+	INIT_WORK(&dev->power.work, pm_runtime_work);
+
+	dev->power.timer_expires = 0;
+	init_timer(&dev->power.suspend_timer);
+	dev->power.suspend_timer.data = (unsigned long)dev;
+	dev->power.suspend_timer.function = pm_suspend_timer_fn;
+
+	init_waitqueue_head(&dev->power.wait_queue);
+}
+
+/**
+ * pm_runtime_add - Update run-time PM fields of a device while adding it.
+ * @dev: Device object being added to device hierarchy.
+ */
+void pm_runtime_add(struct device *dev)
+{
+	if (dev->parent)
+		atomic_inc(&dev->parent->power.child_count);
+}
+
+/**
+ * pm_runtime_remove - Prepare for removing a device from device hierarchy.
+ * @dev: Device object being removed from device hierarchy.
+ */
+void pm_runtime_remove(struct device *dev)
+{
+	struct device *parent = dev->parent;
+
+	pm_runtime_disable(dev);
+
+	if (dev->power.runtime_status != RPM_SUSPENDED && parent) {
+		atomic_add_unless(&parent->power.child_count, -1, 0);
+		if (!parent->power.ignore_children)
+			pm_request_idle(parent);
+	}
+}
Index: linux-2.6/include/linux/pm_runtime.h
===================================================================
--- /dev/null
+++ linux-2.6/include/linux/pm_runtime.h
@@ -0,0 +1,105 @@
+/*
+ * pm_runtime.h - Device run-time power management helper functions.
+ *
+ * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * This file is released under the GPLv2.
+ */
+
+#ifndef _LINUX_PM_RUNTIME_H
+#define _LINUX_PM_RUNTIME_H
+
+#include <linux/device.h>
+#include <linux/pm.h>
+
+#ifdef CONFIG_PM_RUNTIME
+
+extern struct workqueue_struct *pm_wq;
+
+extern void pm_runtime_init(struct device *dev);
+extern void pm_runtime_add(struct device *dev);
+extern void pm_runtime_remove(struct device *dev);
+extern int pm_runtime_idle(struct device *dev);
+extern int pm_runtime_suspend(struct device *dev);
+extern int pm_runtime_resume(struct device *dev);
+extern int pm_request_idle(struct device *dev);
+extern int pm_schedule_suspend(struct device *dev, unsigned int delay);
+extern int pm_request_resume(struct device *dev);
+extern int __pm_runtime_set_status(struct device *dev, unsigned int status);
+extern void pm_runtime_enable(struct device *dev);
+extern int pm_runtime_disable(struct device *dev);
+
+static inline bool pm_children_suspended(struct device *dev)
+{
+	return dev->power.ignore_children
+		|| !atomic_read(&dev->power.child_count);
+}
+
+static inline void pm_suspend_ignore_children(struct device *dev, bool enable)
+{
+	dev->power.ignore_children = enable;
+}
+
+static inline int pm_runtime_get(struct device *dev)
+{
+	atomic_inc(&dev->power.usage_count);
+	return pm_request_resume(dev);
+}
+
+static inline int pm_runtime_get_sync(struct device *dev)
+{
+	atomic_inc(&dev->power.usage_count);
+	return pm_runtime_resume(dev);
+}
+
+static inline int pm_runtime_put(struct device *dev)
+{
+	atomic_add_unless(&dev->power.usage_count, -1, 0);
+	return pm_request_idle(dev);
+}
+
+static inline int pm_runtime_put_sync(struct device *dev)
+{
+	atomic_add_unless(&dev->power.usage_count, -1, 0);
+	return pm_runtime_idle(dev);
+}
+
+#else /* !CONFIG_PM_RUNTIME */
+
+static inline void pm_runtime_init(struct device *dev) {}
+static inline void pm_runtime_add(struct device *dev) {}
+static inline void pm_runtime_remove(struct device *dev) {}
+static inline int pm_runtime_idle(struct device *dev) { return -ENOSYS; }
+static inline int pm_runtime_suspend(struct device *dev) { return -ENOSYS; }
+static inline int pm_runtime_resume(struct device *dev) { return 0; }
+static inline int pm_request_idle(struct device *dev) { return -ENOSYS; }
+static inline int pm_schedule_suspend(struct device *dev, unsigned int delay)
+{
+	return -ENOSYS;
+}
+static inline int pm_request_resume(struct device *dev) { return 0; }
+static inline int __pm_runtime_set_status(struct device *dev,
+					    unsigned int status) { return 0; }
+static inline void pm_runtime_enable(struct device *dev) {}
+static inline int pm_runtime_disable(struct device *dev) { return 0; }
+
+static inline bool pm_children_suspended(struct device *dev) { return false; }
+static inline void pm_suspend_ignore_children(struct device *dev, bool en) {}
+static inline int pm_runtime_get(struct device *dev) { return 0; }
+static inline int pm_runtime_get_sync(struct device *dev) { return 0; }
+static inline int pm_runtime_put(struct device *dev) { return -ENOSYS; }
+static inline int pm_runtime_put_sync(struct device *dev) { return -ENOSYS; }
+
+#endif /* !CONFIG_PM_RUNTIME */
+
+static inline int pm_runtime_set_active(struct device *dev)
+{
+	return __pm_runtime_set_status(dev, RPM_ACTIVE);
+}
+
+static inline void pm_runtime_set_suspended(struct device *dev)
+{
+	__pm_runtime_set_status(dev, RPM_SUSPENDED);
+}
+
+#endif
Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -21,6 +21,7 @@
 #include <linux/kallsyms.h>
 #include <linux/mutex.h>
 #include <linux/pm.h>
+#include <linux/pm_runtime.h>
 #include <linux/resume-trace.h>
 #include <linux/rwsem.h>
 #include <linux/interrupt.h>
@@ -49,6 +50,16 @@ static DEFINE_MUTEX(dpm_list_mtx);
 static bool transition_started;
 
 /**
+ * device_pm_init - Initialize the PM-related part of a device object
+ * @dev: Device object to initialize.
+ */
+void device_pm_init(struct device *dev)
+{
+	dev->power.status = DPM_ON;
+	pm_runtime_init(dev);
+}
+
+/**
  *	device_pm_lock - lock the list of active devices used by the PM core
  */
 void device_pm_lock(void)
@@ -89,6 +100,8 @@ void device_pm_add(struct device *dev)
 
 	list_add_tail(&dev->power.entry, &dpm_list);
 	mutex_unlock(&dpm_list_mtx);
+
+	pm_runtime_add(dev);
 }
 
 /**
@@ -105,6 +118,8 @@ void device_pm_remove(struct device *dev
 	mutex_lock(&dpm_list_mtx);
 	list_del_init(&dev->power.entry);
 	mutex_unlock(&dpm_list_mtx);
+
+	pm_runtime_remove(dev);
 }
 
 /**
@@ -510,6 +525,7 @@ static void dpm_complete(pm_message_t st
 			mutex_unlock(&dpm_list_mtx);
 
 			device_complete(dev, state);
+			pm_runtime_enable(dev);
 
 			mutex_lock(&dpm_list_mtx);
 		}
@@ -755,11 +771,14 @@ static int dpm_prepare(pm_message_t stat
 		dev->power.status = DPM_PREPARING;
 		mutex_unlock(&dpm_list_mtx);
 
-		error = device_prepare(dev, state);
+		error = pm_runtime_disable(dev);
+		if (!error || !device_may_wakeup(dev))
+			error = device_prepare(dev, state);
 
 		mutex_lock(&dpm_list_mtx);
 		if (error) {
 			dev->power.status = DPM_ON;
+			pm_runtime_enable(dev);
 			if (error == -EAGAIN) {
 				put_device(dev);
 				error = 0;
Index: linux-2.6/drivers/base/dd.c
===================================================================
--- linux-2.6.orig/drivers/base/dd.c
+++ linux-2.6/drivers/base/dd.c
@@ -23,6 +23,7 @@
 #include <linux/kthread.h>
 #include <linux/wait.h>
 #include <linux/async.h>
+#include <linux/pm_runtime.h>
 
 #include "base.h"
 #include "power/power.h"
@@ -202,7 +203,10 @@ int driver_probe_device(struct device_dr
 	pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
 		 drv->bus->name, __func__, dev_name(dev), drv->name);
 
-	ret = really_probe(dev, drv);
+	ret = pm_runtime_get_sync(dev);
+	if (ret >= 0)
+		ret = really_probe(dev, drv);
+	pm_runtime_put(dev);
 
 	return ret;
 }
@@ -306,6 +310,8 @@ static void __device_release_driver(stru
 
 	drv = dev->driver;
 	if (drv) {
+		pm_runtime_disable(dev);
+
 		driver_sysfs_remove(dev);
 
 		if (dev->bus)
@@ -324,6 +330,8 @@ static void __device_release_driver(stru
 			blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
 						     BUS_NOTIFY_UNBOUND_DRIVER,
 						     dev);
+
+		pm_runtime_enable(dev);
 	}
 }
 
Index: linux-2.6/drivers/base/power/power.h
===================================================================
--- linux-2.6.orig/drivers/base/power/power.h
+++ linux-2.6/drivers/base/power/power.h
@@ -1,8 +1,3 @@
-static inline void device_pm_init(struct device *dev)
-{
-	dev->power.status = DPM_ON;
-}
-
 #ifdef CONFIG_PM_SLEEP
 
 /*
@@ -16,14 +11,16 @@ static inline struct device *to_device(s
 	return container_of(entry, struct device, power.entry);
 }
 
+extern void device_pm_init(struct device *dev);
 extern void device_pm_add(struct device *);
 extern void device_pm_remove(struct device *);
 extern void device_pm_move_before(struct device *, struct device *);
 extern void device_pm_move_after(struct device *, struct device *);
 extern void device_pm_move_last(struct device *);
 
-#else /* CONFIG_PM_SLEEP */
+#else /* !CONFIG_PM_SLEEP */
 
+static inline void device_pm_init(struct device *dev) {}
 static inline void device_pm_add(struct device *dev) {}
 static inline void device_pm_remove(struct device *dev) {}
 static inline void device_pm_move_before(struct device *deva,
@@ -32,7 +29,7 @@ static inline void device_pm_move_after(
 					struct device *devb) {}
 static inline void device_pm_move_last(struct device *dev) {}
 
-#endif
+#endif /* !CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM
 

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8)
  2009-07-06  0:52 [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8) Rafael J. Wysocki
@ 2009-07-07 15:12   ` Magnus Damm
  2009-07-07 15:12   ` Magnus Damm
  2009-07-09 23:22   ` Pavel Machek
  2 siblings, 0 replies; 51+ messages in thread
From: Magnus Damm @ 2009-07-07 15:12 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Alan Stern, Linux-pm mailing list, Greg KH, LKML,
	ACPI Devel Maling List, Ingo Molnar, Arjan van de Ven

Hi Rafael,

On Mon, Jul 6, 2009 at 9:52 AM, Rafael J. Wysocki<rjw@sisk.pl> wrote:
> Hi,
>
> There's a rev. 8 of the run-time PM framework patch.
>
> Highlights:
> * I did my best to follow the design we've recently discussed.
> * pm_runtime_[get|put]() and the sync versions call
>  pm_[request|runtime]_[resume|idle](), because I don't see much point
>  manipulating the usage counter alone.
> * pm_runtime_disable() carries out a (synchronous) wake-up if there's a
>  resume request pending.
>
> Comments welcome.

I've now jumped from v5 to v8 and I feel that the code is getting
cleaner and cleaner. Very nice.

My intention was to post a SuperH prototype last week, but I got
sidetracked with other stuff. And today I ran into some problems related
to probe() that I'd like to ask about right away. At this point I've
got a few device drivers converted and some simple bus
runtime_suspend()/runtime_resume() code that stops and starts clocks.

Issue 1:
------------
Device drivers which do not perform any hardware access in probe()
work fine. During software setup in probe() the runtime pm code is
initialized with the following:

+	pm_suspend_ignore_children(&dev->dev, true);
+	pm_runtime_set_suspended(&dev->dev);
+	pm_runtime_enable(&dev->dev);

Before accessing hardware I perform:
+	pm_runtime_resume(pd->dev);

When done with the hardware I do:
+	pm_runtime_suspend(pd->dev);

Not so complicated. Am I supposed to initialize something else as well?

All good with the code above, but there seems to be an issue with how
usage_count is counted up and down and when runtime_disabled is set:

1. pm_runtime_init(): usage_count = 1, runtime_disabled = true
2. driver_probe_device(): pm_runtime_get_sync()
3. pm_runtime_get_sync(): usage_count = 2
4. device driver probe(): pm_runtime_enable()
5. pm_runtime_enable(): usage_count = 1
6. driver_probe_device(): pm_runtime_put()
7. pm_runtime_put(): usage_count = 0

I expect runtime_disabled = false in 7. Modifying the get/put calls to
do enable/disable may work around the issue, but that's probably not
what you guys want.
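
To make the trace above easier to follow, here is a small userspace
model of it. This is only a sketch: the field names come from the patch,
but the body of pm_runtime_enable() is not quoted in this mail, so the
model_enable() logic below (drop one reference, clear the flag only if
the counter reaches zero) is an assumption chosen to reproduce the
observed numbers. All model_* names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; not kernel code. */
struct rpm_model {
	int usage_count;
	bool runtime_disabled;
};

/* Mirrors pm_runtime_init() in the patch: disabled, one reference held. */
static void model_init(struct rpm_model *m)
{
	m->usage_count = 1;
	m->runtime_disabled = true;
}

/* Counter side of pm_runtime_get_sync(): just take a reference. */
static void model_get(struct rpm_model *m)
{
	m->usage_count++;
}

/* Counter side of pm_runtime_put(): decrement, but never below zero. */
static void model_put(struct rpm_model *m)
{
	if (m->usage_count > 0)
		m->usage_count--;
}

/*
 * ASSUMPTION: enable drops the reference taken at init time and clears
 * runtime_disabled only if that made the shared counter reach zero.
 */
static void model_enable(struct rpm_model *m)
{
	assert(m->usage_count > 0);
	if (--m->usage_count == 0)
		m->runtime_disabled = false;
}
```

Replaying steps 1-7 with this model (init, get, enable, put) ends with
usage_count at 0 while runtime_disabled is still true, because the
reference held by driver_probe_device() hides the enable.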

Issue 2:
------------
I cannot get any bus ->runtime_resume() callbacks from probe(). This
also seems related to usage_count and pm_runtime_get_sync() in
driver_probe_device(). Basically, from probe(), calling
pm_runtime_resume() after pm_runtime_set_suspended() results in an error
and not in a ->runtime_resume() callback. Some device drivers access
hardware in probe(), so the ->runtime_resume() callback is needed at
that point to turn on clocks before the hardware can be accessed.

Random thought:
-------------------------
The pm_runtime_get() and pm_runtime_put() look very nice. I assume
that interface is supposed to be used by bus code. I wonder if it would
be cleaner to use a similar counter-based interface from the driver
instead of the pm_runtime_idle()/suspend()/resume()...

Let me know what you think!

Cheers,

/ magnus

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8)
  2009-07-07 15:12   ` Magnus Damm
  (?)
@ 2009-07-07 22:07   ` Rafael J. Wysocki
  2009-07-08  2:54     ` Alan Stern
                       ` (3 more replies)
  -1 siblings, 4 replies; 51+ messages in thread
From: Rafael J. Wysocki @ 2009-07-07 22:07 UTC (permalink / raw)
  To: Magnus Damm
  Cc: Alan Stern, Linux-pm mailing list, Greg KH, LKML,
	ACPI Devel Maling List, Ingo Molnar, Arjan van de Ven

On Tuesday 07 July 2009, Magnus Damm wrote:
> Hi Rafael,
> 
> On Mon, Jul 6, 2009 at 9:52 AM, Rafael J. Wysocki<rjw@sisk.pl> wrote:
> > Hi,
> >
> > There's a rev. 8 of the run-time PM framework patch.
> >
> > Highlights:
> > * I did my best to follow the design we've recently discussed.
> > * pm_runtime_[get|put]() and the sync versions call
> >  pm_[request|runtime]_[resume|idle](), because I don't see much point
> >  manipulating the usage counter alone.
> > * pm_runtime_disable() carries out a (synchronous) wake-up if there's a
> >  resume request pending.
> >
> > Comments welcome.
> 
> I've now jumped from v5 to v8 and I feel that the code is getting
> cleaner and cleaner. Very nice.

That's mostly thanks to Alan.

> My intention was to post a SuperH prototype last week, but I got
> sidetracked with other stuff. And today I ran into some problems related
> to probe() that I'd like to ask about right away. At this point I've
> got a few device drivers converted and some simple bus
> runtime_suspend()/runtime_resume() code that stops and starts clocks.
> 
> Issue 1:
> ------------
> Device drivers which do not perform any hardware access in probe()
> work fine. During software setup in probe() the runtime pm code is
> initialized with the following:
> 
> +	pm_suspend_ignore_children(&dev->dev, true);
> +	pm_runtime_set_suspended(&dev->dev);
> +	pm_runtime_enable(&dev->dev);
> 
> Before accessing hardware I perform:
> +	pm_runtime_resume(pd->dev);
> 
> When done with the hardware I do:
> +	pm_runtime_suspend(pd->dev);
> 
> Not so complicated. Am I supposed to initialize something else as well?
> 
> All good with the code above, but there seems to be an issue with how
> usage_count is counted up and down and when runtime_disabled is set:
> 
> 1. pm_runtime_init(): usage_count = 1, runtime_disabled = true
> 2. driver_probe_device(): pm_runtime_get_sync()
> 3. pm_runtime_get_sync(): usage_count = 2
> 4. device driver probe(): pm_runtime_enable()
> 5. pm_runtime_enable(): usage_count = 1
> 6. driver_probe_device(): pm_runtime_put()
> 7. pm_runtime_put(): usage_count = 0
> 
> I expect runtime_disabled = false in 7. Modifying the get/put calls to
> do enable/disable may work around the issue, but that's probably not
> what you guys want.

Sure, that's my mistake.  I should have used a separate counter for
disable/enable, but I thought usage_count would be sufficient.  Will fix.

> Issue 2:
> ------------
> I cannot get any bus ->runtime_resume() callbacks from probe(). This
> also seems related to usage_count and pm_runtime_get_sync() in
> driver_probe_device(). Basically, from probe(), calling
> pm_runtime_resume() after pm_runtime_set_suspended() results in an error
> and not in a ->runtime_resume() callback. Some device drivers access
> hardware in probe(), so the ->runtime_resume() callback is needed at
> that point to turn on clocks before the hardware can be accessed.

I think the problem is that pm_runtime_get_sync() in driver_probe_device()
calls ->runtime_resume(), so the device is active from the core's point of
view when you call pm_runtime_resume() from probe().

Hmm.  OK, perhaps we should just increment usage_count in
driver_probe_device() to prevent suspends from happening at that time, without
calling ->runtime_resume() so that the driver can do it by itself.  I'll do
that in the next version.

> Random thought:
> -------------------------
> The pm_runtime_get() and pm_runtime_put() look very nice. I assume
> that interface is supposed to be used by bus code. I wonder if it would
> be cleaner to use a similar counter-based interface from the driver
> instead of the pm_runtime_idle()/suspend()/resume()...
> 
> Let me know what you think!

In fact I thought drivers could also use pm_runtime_[get|put]() and the 'sync'
versions.  At least, I don't see why not at the moment (well, I'm a bit tired
right now ...).

However, I'm now thinking it should work like this:

* pm_runtime_get() increments usage_count and if it was zero before the
  incrementation, it calls pm_request_resume() (pm_runtime_resume() is called
  by the 'sync' version).

* pm_runtime_put() decrements usage_count and if it's zero after the
  decrementation, it calls pm_request_idle() (pm_runtime_idle() is called by
  the 'sync' version).

* The 'suspend' callbacks won't succeed for usage_count > 0.

This way we would avoid calling the 'suspend' and 'idle' functions each time
unnecessarily, but then usage_count would have to be modified under the
spinlock only.
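
The three rules above can be sketched as a userspace model. This is not
the patch code, only an illustration under stated assumptions: the
sketch_* names are made up, the two request counters stand in for calls
to pm_request_resume() and pm_request_idle(), and -EAGAIN is an assumed
error code for a refused suspend. In the kernel each function body would
run under dev->power.lock.

```c
#include <assert.h>
#include <errno.h>

/* Userspace model only; not kernel code. */
struct rpm_sketch {
	unsigned int usage_count;
	unsigned int resume_requests;	/* stands in for pm_request_resume() */
	unsigned int idle_requests;	/* stands in for pm_request_idle() */
};

/* Take a reference; only the 0 -> 1 transition asks for a resume. */
static void sketch_get(struct rpm_sketch *d)
{
	if (d->usage_count++ == 0)
		d->resume_requests++;
}

/* Drop a reference; only the 1 -> 0 transition asks for an idle check. */
static void sketch_put(struct rpm_sketch *d)
{
	assert(d->usage_count > 0);
	if (--d->usage_count == 0)
		d->idle_requests++;
}

/* A suspend attempt is refused while anybody holds a reference. */
static int sketch_suspend(const struct rpm_sketch *d)
{
	if (d->usage_count > 0)
		return -EAGAIN;	/* ASSUMPTION: error code picked for the sketch */
	return 0;
}
```

With two nested get/put pairs, the model issues one resume request and
one idle request in total, instead of one per call.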

Best,
Rafael

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8)
  2009-07-07 22:07   ` Rafael J. Wysocki
@ 2009-07-08  2:54     ` Alan Stern
  2009-07-08  4:40       ` Magnus Damm
  2009-07-08  4:40         ` Magnus Damm
  2009-07-08  2:54     ` Alan Stern
                       ` (2 subsequent siblings)
  3 siblings, 2 replies; 51+ messages in thread
From: Alan Stern @ 2009-07-08  2:54 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Magnus Damm, Linux-pm mailing list, Greg KH, LKML,
	ACPI Devel Maling List, Ingo Molnar, Arjan van de Ven

On Wed, 8 Jul 2009, Rafael J. Wysocki wrote:

> > I've now jumped from v5 to v8 and I feel that the code is getting
> > cleaner and cleaner. Very nice.
> 
> That's mostly thanks to Alan.

I haven't had time yet to look through the new code.  Things have been 
very busy.

> > Issue 1:
> > ------------
> > Device drivers which do not perform any hardware access in probe()
> > work fine. During software setup in probe() the runtime pm code is
> > initialized with the following:
> > 
> > +	pm_suspend_ignore_children(&dev->dev, true);
> > +	pm_runtime_set_suspended(&dev->dev);
> > +	pm_runtime_enable(&dev->dev);
> > 
> > Before accessing hardware I perform:
> > +	pm_runtime_resume(pd->dev);
> > 
> > When done with the hardware I do:
> > +	pm_runtime_suspend(pd->dev);
> > 
> > Not so complicated. Am I supposed to initialize something else as well?

No, that's all you need.

> > All good with the code above, but there seems to be an issue with how
> > usage_count is counted up and down and when runtime_disabled is set:
> > 
> > 1. pm_runtime_init(): usage_count = 1, runtime_disabled = true
> > 2. driver_probe_device(): pm_runtime_get_sync()
> > 3. pm_runtime_get_sync(): usage_count = 2
> > 4. device driver probe(): pm_runtime_enable()
> > 5. pm_runtime_enable(): usage_count = 1
> > 6. driver_probe_device(): pm_runtime_put()
> > 7. pm_runtime_put(): usage_count = 0
> > 
> > I expect runtime_disabled = false in 7.

Wasn't it?  It should have been set to false in step 4 and remained 
that way.

> >  Modifying the get/put calls to
> > do enable/disable may work around the issue, but that's probably not
> > what you guys want.
> 
> Sure, that's my mistake.  I should have used a separate counter for
> disable/enable, but I thought usage_count would be sufficient.  Will fix.

Presumably there won't be much nesting of disable/enable.  The counter 
will need only a few bits.
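
A minimal sketch of what such a separate counter might look like, as a
userspace model. The field name disable_depth, its 3-bit width and the
depth_* helpers are assumptions made for illustration; nothing here is
taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; not kernel code. */
struct rpm_depth {
	unsigned int disable_depth:3;	/* hypothetical: nesting of disable calls */
	unsigned int usage_count;	/* left alone by enable/disable */
};

/* Start out disabled once, with no usage references held. */
static void depth_init(struct rpm_depth *s)
{
	s->disable_depth = 1;
	s->usage_count = 0;
}

static void depth_disable(struct rpm_depth *s)
{
	assert(s->disable_depth < 7);	/* 3 bits: refuse to overflow */
	s->disable_depth++;
}

static void depth_enable(struct rpm_depth *s)
{
	assert(s->disable_depth > 0);	/* unbalanced enable is a bug */
	s->disable_depth--;
}

/* Run-time PM is usable only when every disable has been undone. */
static bool depth_enabled(const struct rpm_depth *s)
{
	return s->disable_depth == 0;
}
```

Because enable and disable no longer touch usage_count, a get/put pair
held around probe() cannot mask an enable done inside probe().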

> > Issue 2:
> > ------------
> > I cannot get any bus ->runtime_resume() callbacks from probe(). This
> > also seems related to usage_count and pm_runtime_get_sync() in
> > driver_probe_device(). Basically, from probe(), calling
> > pm_runtime_resume() after pm_runtime_set_suspended() results in an error
> > and not in a ->runtime_resume() callback. Some device drivers access
> > hardware in probe(), so the ->runtime_resume() callback is needed at
> > that point to turn on clocks before the hardware can be accessed.
> 
> I think the problem is that pm_runtime_get_sync() in driver_probe_device()
> calls ->runtime_resume(), so the device is active from the core's point of
> view when you call pm_runtime_resume() from probe().

Yes.  Maybe devices should be initialized with runtime PM enabled.  Or 
perhaps it should be enabled when device_add runs (before the probe).

> Hmm.  OK, perhaps we should just increment usage_count in
> driver_probe_device() to prevent suspends from happening at that time, without
> calling ->runtime_resume() so that the driver can do it by itself.  I'll do
> that in the next version.

Not necessary if you enable runtime PM first.  Or maybe 
pm_runtime_enable should compare the state and the counters, calling 
pm_runtime_resume or pm_runtime_idle as needed.

> > Random thought:
> > -------------------------
> > The pm_runtime_get() and pm_runtime_put() look very nice. I assume
> > that interface is supposed to be used by bus code. I wonder if it would
> > be cleaner to use a similar counter based interface from the driver
> > instead of the pm_runtime_idle()/suspend()/resume()...
> > 
> > Let me know what you think!
> 
> In fact I thought drivers could also use pm_runtime_[get|put]() and the 'sync'
> versions.  At least, I don't see why not at the moment (well, I'm a bit tired
> right now ...).
> 
> However, I'm now thinking it should work like this:
> 
> * pm_runtime_get() increments usage_count and if it was zero before the
>   incrementation, it calls pm_request_resume() (pm_runtime_resume() is called
>   by the 'sync' version).
> 
> * pm_runtime_put() decrements usage_count and if it's zero after the
>   decrementation, it calls pm_request_idle() (pm_runtime_idle() is called by
>   the 'sync' version).
> 
> * The 'suspend' callbacks won't succeed for usage_count > 0.

I agree.

I wonder if there should be a way for drivers to increment usage_count
without forcing a resume.  For example, if a driver stores up pending
I/O for its own work routine to handle then it would want to prevent
suspends before the work routine runs, but it would also want the work
routine to call pm_runtime_resume directly.  Hence it would be
pointless to have _get queue an extra resume request.
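
A minimal user-space sketch of that idea, assuming a hypothetical helper that only bumps the counter. None of these names come from the patch; queued_resumes merely stands in for asynchronous resume requests, and the suspend check mirrors the rule that suspends fail while usage_count > 0.

```c
#include <assert.h>
#include <stdbool.h>

enum model_status { MODEL_ACTIVE, MODEL_SUSPENDED };

/* Hypothetical single-threaded model; no locking. */
struct dev_model {
	int usage_count;
	enum model_status status;
	int queued_resumes;   /* stands in for queued resume requests */
};

/* Proposed get: count a reference and queue a resume on the 0 -> 1 step. */
void model_get(struct dev_model *d)
{
	if (d->usage_count++ == 0)
		d->queued_resumes++;
}

/* Hypothetical variant: count a reference, queue nothing. */
void model_get_noresume(struct dev_model *d)
{
	d->usage_count++;
}

void model_put(struct dev_model *d)
{
	assert(d->usage_count > 0);
	d->usage_count--;
}

/* Synchronous resume, as the driver's own work routine would do it. */
void model_resume(struct dev_model *d)
{
	d->status = MODEL_ACTIVE;
}

/* Suspend is refused while anyone holds a reference. */
bool model_suspend(struct dev_model *d)
{
	if (d->usage_count > 0)
		return false;
	d->status = MODEL_SUSPENDED;
	return true;
}
```

A driver storing up I/O would take the no-resume reference when the I/O is queued, resume from its work routine, and drop the reference once the I/O has completed.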

> This way we would avoid calling the 'suspend' and 'idle' functions each time
> unnecessarily, but then usage_count would have to be modified under the
> spinlock only.

Why?  That is, why would it have to be any different from the way it is 
now?  All you really need is that the test for zero should be part of 
the atomic operation.
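
To illustrate the point, here is a user-space sketch using C11 atomics rather than the kernel's atomic_t. The fetch-and-modify calls return the previous value, so the counter update and the zero test happen as one atomic step without a spinlock. The struct and function names are assumptions for this example; the two request counters only stand in for calls to pm_request_resume() and pm_request_idle().

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical model; not the layout used by the patch. */
struct rpm_counter {
	atomic_int usage_count;
	atomic_int resume_requests; /* stands in for pm_request_resume() */
	atomic_int idle_requests;   /* stands in for pm_request_idle() */
};

void counter_init(struct rpm_counter *c)
{
	atomic_init(&c->usage_count, 0);
	atomic_init(&c->resume_requests, 0);
	atomic_init(&c->idle_requests, 0);
}

void counter_get(struct rpm_counter *c)
{
	/* Old value 0 means this caller made the 0 -> 1 transition. */
	if (atomic_fetch_add(&c->usage_count, 1) == 0)
		atomic_fetch_add(&c->resume_requests, 1);
}

void counter_put(struct rpm_counter *c)
{
	int old = atomic_fetch_sub(&c->usage_count, 1);

	assert(old > 0);            /* catches an unbalanced put */
	/* Old value 1 means this caller made the 1 -> 0 transition. */
	if (old == 1)
		atomic_fetch_add(&c->idle_requests, 1);
}
```

Only the caller that observes the transition acts on it, so nested get/put pairs do not trigger extra requests.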

Alan Stern


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8)
  2009-07-08  2:54     ` Alan Stern
@ 2009-07-08  4:40         ` Magnus Damm
  2009-07-08  4:40         ` Magnus Damm
  1 sibling, 0 replies; 51+ messages in thread
From: Magnus Damm @ 2009-07-08  4:40 UTC (permalink / raw)
  To: Alan Stern
  Cc: Rafael J. Wysocki, Linux-pm mailing list, Greg KH, LKML,
	ACPI Devel Maling List, Ingo Molnar, Arjan van de Ven

On Wed, Jul 8, 2009 at 11:54 AM, Alan Stern<stern@rowland.harvard.edu> wrote:
> On Wed, 8 Jul 2009, Rafael J. Wysocki wrote:
>
>> > I've now jumped from v5 to v8 and I feel that the code is getting
>> > cleaner and cleaner. Very nice.
>>
>> That's mostly thanks to Alan.
>
> I haven't had time yet to look through the new code.  Things have been
> very busy.
>
>> > Issue 1:
>> > ------------
>> > Device drivers which do not perform any hardware access in probe()
>> > work fine. During software setup in probe() the runtime pm code is
>> > initialized with the following:
>> >
>> > +   pm_suspend_ignore_children(&dev->dev, true);
>> > +   pm_runtime_set_suspended(&dev->dev);
>> > +   pm_runtime_enable(&dev->dev);
>> >
>> > Before accessing hardware I perform:
>> > +   pm_runtime_resume(pd->dev);
>> >
>> > When done with the hardware I do:
>> > +   pm_runtime_suspend(pd->dev);
>> >
>> > Not so complicated. Am I supposed to initialize something else as well?
>
> No, that's all you need.

Ok, thank you!

>> > All good with the code above, but there seem to be some issue with how
>> > usage_count is counted up and down and when runtime_disabled is set:
>> >
>> > 1. pm_runtime_init(): usage_count = 1, runtime_disabled = true
>> > 2. driver_probe_device(): pm_runtime_get_sync()
>> > 3. pm_runtime_get_sync(): usage_count = 2
>> > 4. device driver probe(): pm_runtime_enable()
>> > 5. pm_runtime_enable(): usage_count = 1
>> > 6. driver_probe_device(): pm_runtime_put()
>> > 7. pm_runtime_put(): usage_count = 0
>> >
>> > I expect runtime_disabled = false in 7.
>
> Wasn't it?  It should have been set to false in step 4 and remained
> that way.

I may misunderstand, but in v8 won't the pm_runtime_enable() function
do an atomic_dec_and_test() where the counter value will go from 2 to 1 in
the case above? This would mean that atomic_dec_and_test() returns false
so runtime_disabled is never modified.
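
The sequence can be replayed in a small user-space model. This is only a sketch of the behaviour as described in this message, not the v8 code itself: the struct and the function names are made up, and dec_and_test() is a stand-in for the kernel's atomic_dec_and_test() built on C11 atomics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model; field names follow the thread, not the patch. */
struct rpm_model {
	atomic_int usage_count;
	bool runtime_disabled;
};

/* Stand-in for atomic_dec_and_test(): true only if the result is 0. */
bool dec_and_test(atomic_int *v)
{
	return atomic_fetch_sub(v, 1) == 1;
}

void model_init(struct rpm_model *m)
{
	atomic_init(&m->usage_count, 1);
	m->runtime_disabled = true;
}

void model_get(struct rpm_model *m)
{
	atomic_fetch_add(&m->usage_count, 1);
}

void model_put(struct rpm_model *m)
{
	assert(atomic_load(&m->usage_count) > 0);
	atomic_fetch_sub(&m->usage_count, 1);
}

/* Enable as described: the flag is cleared only if the count hits 0. */
void model_enable(struct rpm_model *m)
{
	if (dec_and_test(&m->usage_count))
		m->runtime_disabled = false;
}

/* Replays steps 1-7 and returns the final runtime_disabled value. */
bool replay_probe_sequence(void)
{
	struct rpm_model m;

	model_init(&m);   /* step 1: count = 1, disabled */
	model_get(&m);    /* steps 2-3: count = 2 */
	model_enable(&m); /* steps 4-5: count = 1, flag untouched */
	model_put(&m);    /* steps 6-7: count = 0 */
	return m.runtime_disabled;
}
```

In this model the flag is still set after step 7, because the only decrement that reaches zero is the one in the put path, which never looks at the flag.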

Thanks for your feedback,

/ magnus

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8)
  2009-07-07 22:07   ` Rafael J. Wysocki
@ 2009-07-08  5:45       ` Magnus Damm
  2009-07-08  2:54     ` Alan Stern
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 51+ messages in thread
From: Magnus Damm @ 2009-07-08  5:45 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Alan Stern, Linux-pm mailing list, Greg KH, LKML,
	ACPI Devel Maling List, Ingo Molnar, Arjan van de Ven

On Wed, Jul 8, 2009 at 7:07 AM, Rafael J. Wysocki<rjw@sisk.pl> wrote:
> On Tuesday 07 July 2009, Magnus Damm wrote:
>> On Mon, Jul 6, 2009 at 9:52 AM, Rafael J. Wysocki<rjw@sisk.pl> wrote:
>> > Hi,
>> >
>> > There's a rev. 8 of the run-time PM framework patch.

>> All good with the code above, but there seem to be some issue with how
>> usage_count is counted up and down and when runtime_disabled is set:
>>
>> 1. pm_runtime_init(): usage_count = 1, runtime_disabled = true
>> 2. driver_probe_device(): pm_runtime_get_sync()
>> 3. pm_runtime_get_sync(): usage_count = 2
>> 4. device driver probe(): pm_runtime_enable()
>> 5. pm_runtime_enable(): usage_count = 1
>> 6. driver_probe_device(): pm_runtime_put()
>> 7. pm_runtime_put(): usage_count = 0
>>
>> I expect runtime_disabled = false in 7. Modifying the get/put calls to
>> do enable/disable may work around the issue, but that's probably not
>> what you guys want.
>
> Sure, that's my mistake.  I should have used a separate counter for
> disable/enable, but I thought usage_counter would be sufficient.  Will fix.

Thank you. No problem.

>> Issue 2:
>> ------------
>> I cannot get any bus ->runtime_resume() callbacks from probe(). This
>> also seems related to usage_count and pm_runtime_get_sync() in
>> driver_probe_device(). Basically, from probe(), calling
>> pm_runtime_resume() after pm_runtime_set_suspended() results in error
>> and not in a ->runtime_resume() callback. Some device drivers access
>> hardware in probe(), so the ->runtime_resume() callback is needed at
>> that point to turn on clocks before the hardware can be accessed.
>
> I think the problem is that pm_runtime_get_sync() in driver_probe_device()
> calls ->runtime_resume(), so the device is active from the core's point of
> view when you call pm_runtime_resume() from probe().
>
> Hmm.  OK, perhaps we should just increment usage_count in
> driver_probe_device() to prevent suspends from happening at that time, without
> calling ->runtime_resume() so that the driver can do it by itself.  I'll do
> that in the next version.

Sounds good.

>> Random thought:
>> -------------------------
>> The pm_runtime_get() and pm_runtime_put() look very nice. I assume
>> that interface is supposed to be used by bus code. I wonder if it would
>> be cleaner to use a similar counter based interface from the driver
>> instead of the pm_runtime_idle()/suspend()/resume()...
>>
>> Let me know what you think!
>
> In fact I thought drivers could also use pm_runtime_[get|put]() and the 'sync'
> versions.  At least, I don't see why not at the moment (well, I'm a bit tired
> right now ...).

I think that's a nicer interface, but I must figure out how to use
->runtime_idle before I can switch to that...

> However, I'm now thinking it should work like this:
>
> * pm_runtime_get() increments usage_count and if it was zero before the
>  incrementation, it calls pm_request_resume() (pm_runtime_resume() is called
>  by the 'sync' version).
>
> * pm_runtime_put() decrements usage_count and if it's zero after the
>  decrementation, it calls pm_request_idle() (pm_runtime_idle() is called by
>  the 'sync' version).
>
> * The 'suspend' callbacks won't succeed for usage_count > 0.
>
> This way we would avoid calling the 'suspend' and 'idle' functions each time
> unnecessarily, but then usage_count would have to be modified under the
> spinlock only.

If all usage_count users are moved under the spinlock then there would
be no need for atomic operations, right?

This get()/put() interface is interesting.

So I'd like to tie in two levels of power management in our runtime PM
implementation. The most simple level is clock stopping, and I can do
that using the bus callbacks ->runtime_suspend() and
->runtime_resume() with v8. The driver runtime callbacks are never
invoked for clock stopping.

On top of the clock stopping I'd like to turn off power to the domain.
So if all clocks are stopped to the devices within a domain, then I'd
like to call the per-device ->runtime_suspend() callbacks provided by
the drivers.

I wonder how to fit these two levels of power management into the
runtime PM in a nice way. My first attempts simply made use of
pm_runtime_resume() and pm_runtime_suspend(), but I'd like to move to
get()/put() if possible. But for that to work I need to implement
->runtime_idle() in my bus code, and I wonder if the current runtime
PM idle behaviour is a good fit.

Below is how I'd like to make use of the runtime PM code. I'm not sure
if it's compatible with your view. =)

Drivers call pm_runtime_get_sync() and pm_runtime_put() before and
after using the hardware. The runtime PM code invokes the bus
->runtime_idle() callback ASAP (of course depending on put() or
put_sync(), but no timer). The bus->runtime_idle() callback stops the
clock and decreases the power domain usage count. If the power domain
is unused, then pm_schedule_suspend() is called for each of the
devices in the power domain. This in turn will invoke the
->runtime_suspend() callback which starts the clock, calls the driver
->runtime_suspend() and stops the clock again. When all devices are
runtime suspended the power domain is turned off.

I can't get the above to work with v8 though. This is because after
the clock is stopped with ->runtime_idle() the runtime_status of the
device is still RPM_ACTIVE, so when pm_runtime_get_sync() gets called
the ->runtime_resume() never gets invoked and the clock is never
started...
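
The problem can be shown with a small user-space model. It is a simplified sketch of the behaviour described here, not the v8 code: the struct and function names are hypothetical, only the RPM_ACTIVE status name comes from the patch, and the model resumes the device only when its status is not active.

```c
#include <assert.h>
#include <stdbool.h>

enum rpm_status { RPM_ACTIVE, RPM_SUSPENDED };

/* Hypothetical single-threaded model of one clocked device. */
struct clk_dev {
	enum rpm_status runtime_status;
	int usage_count;
	bool clock_on;
};

/* Bus ->runtime_resume(): start the clock. */
void bus_runtime_resume(struct clk_dev *d)
{
	d->clock_on = true;
}

/* Bus ->runtime_idle() as described: stop the clock, status unchanged. */
void bus_runtime_idle(struct clk_dev *d)
{
	d->clock_on = false;
}

/* Resume is skipped when the core already considers the device active. */
void model_get_sync(struct clk_dev *d)
{
	d->usage_count++;
	if (d->runtime_status != RPM_ACTIVE) {
		bus_runtime_resume(d);
		d->runtime_status = RPM_ACTIVE;
	}
}

void model_put_sync(struct clk_dev *d)
{
	assert(d->usage_count > 0);
	if (--d->usage_count == 0)
		bus_runtime_idle(d);
}
```

After one get/put cycle the model is left with the status active and the clock stopped, so the next get never restarts the clock.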

So I don't know if you think the ->runtime_idle usage above is a good
plan. I guess not; it's probably quite different from the USB case. I
can of course always skip using ->runtime_idle() and just use
suspend()/resume().

Any thoughts?

Thanks,

/ magnus

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O  devices (rev. 8)
@ 2009-07-08  5:45       ` Magnus Damm
  0 siblings, 0 replies; 51+ messages in thread
From: Magnus Damm @ 2009-07-08  5:45 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Alan Stern, Linux-pm mailing list, Greg KH, LKML,
	ACPI Devel Maling List, Ingo Molnar, Arjan van de Ven

On Wed, Jul 8, 2009 at 7:07 AM, Rafael J. Wysocki<rjw@sisk.pl> wrote:
> On Tuesday 07 July 2009, Magnus Damm wrote:
>> On Mon, Jul 6, 2009 at 9:52 AM, Rafael J. Wysocki<rjw@sisk.pl> wrote:
>> > Hi,
>> >
>> > There's a rev. 8 of the run-time PM framework patch.

>> All good with the code above, but there seem to be some issue with how
>> usage_count is counted up and down and when runtime_disabled is set:
>>
>> 1. pm_runtime_init(): usage_count = 1, runtime_disabled = true
>> 2. driver_probe_device(): pm_runtime_get_sync()
>> 3. pm_runtime_get_sync(): usage_count = 2
>> 4. device driver probe(): pm_runtime_enable()
>> 5. pm_runtime_enable(): usage_count = 1
>> 6. driver_probe_device(): pm_runtime_put()
>> 7. pm_runtime_put(): usage_count = 0
>>
>> I expect runtime_disabled = false in 7. Modifying the get/put calls to
>> do enable/disable may work around the issue, but that's probably not
>> what you guys want.
>
> Sure, that's my mistake.  I should have used a separate counter for
> disable/enable, but I thought usage_counter would be sufficient.  Will fix.

Thank you. No problem.

>> Issue 2:
>> ------------
>> I cannot get any bus ->runtime_resume() callbacks from probe(). This
>> also seems related to usage_count and pm_runtime_get_sync() in
>> driver_probe_device(). Basically, from probe(), calling
>> pm_runtime_resume() after pm_runtime_set_suspended() results in error
>> and not in a ->runtime_resume() callback. Some device drives access
>> hardware in probe(), so the ->runtime_resume() callback is needed at
>> that point to turn on clocks before the hardware can be accessed.
>
> I think the problem is that pm_runtime_get_sync() in driver_probe_device()
> calls ->runtime_resume(), so the device is active from the core's point of
> view when you call pm_runtime_resume() from probe().
>
> Hmm.  OK, perhaps we should just increment usage_count in
> driver_device_probe() to prevent suspends from happening at that time, without
> calling ->runtime_resume() so that the driver can do it by itself.  I'll do
> that in the next version.

Sounds good.

>> Random thought:
>> -------------------------
>> The runtime_pm_get() and runtime_pm_put() look very nice. I assume
>> that inteface is supposed to be used by bus code. I wonder if it would
>> be cleaner to use a similar counter based interface from the driver
>> instead of the pm_runtime_idle()/suspend()/resume()...
>>
>> Let me know what you think!
>
> In fact I thought drivers could also use pm_runtime_[get|put]() and the 'sync'
> versions.  At least, I don't see why not at the moment (well, I'm a bit tired
> right now ...).

I think that's a nicer interface, but I must figure out how to use
->runtime_idle before I can switch to that...

> However, I'm now thinking it should work like this:
>
> * pm_runtime_get() increments usage_count and if it was zero before the
>  incrementation, it calls pm_request_resume() (pm_runtime_resume() is called
>  by the 'sync' version).
>
> * pm_runtime_put() decrements usage_count and if it's zero after the
>  decrementation, it calls pm_request_idle() (pm_runtime_idle() is called by
>  the 'sync' version).
>
> * The 'suspend' callbacks won't succeed for usage_count > 0.
>
> This way we would avoid calling the 'suspend' and 'idle' functions each time
> unnecessarily, but then usage_count would have to be modified under the
> spinlock only.

If all usage_count users are moved under the spinlock then there would
be no need for atomic operations, right?

This get()/put() interface is interesting.

So I'd like to tie in two levels of power management in our runtime PM
implementation. The most simple level is clock stopping, and I can do
that using the bus callbacks ->runtime_suspend() and
->runtime_resume() with v8. The driver runtime callbacks are never
invoked for clock stopping.

On top of the clock stopping I'd like to turn off power to the domain.
So if all clocks are stopped to the devices within a domain, then I'd
like to call the per-device ->runtime_suspend() callbacks provided by
the drivers.

I wonder how to fit these two levels of power management into the
runtime PM in a nice way. My first attempts simply made use of
pm_runtime_resume() and pm_runtime_suspend(), but I'd like to move to
get()/put() if possible. But for that to work I need to implement
->runtime_idle() in my bus code, and I wonder if the current runtime
PM idle behaviour is a good fit.

Below is how I'd like to make use of the runtime PM code. I'm not sure
if it's compatible with your view. =)

Drivers call pm_runtime_get_sync() and pm_runtime_put() before and
after using the hardware. The runtime PM code invokes the bus
->runtime_idle() callback ASAP (of course depending on put() or
put_sync(), but no timer). The bus->runtime_idle() callback stops the
clock and decreases the power domain usage count. If the power domain
is unused, then the pm_schedule_suspend() is called for each of the
devices in the power domain. This in turn will invoke the
->runtime_suspend() callback which starts the clock, calls the driver
->runtime_suspend() and stops the clock again. When all devices are
runtime suspended the power domain is turned off.

I can't get the above to work with v8 though. This is because after
the clock is stopped with ->runtime_idle() the runtime_status of the
device is still RPM_ACTIVE, so when pm_runtime_get_sync() gets called
the ->runtime_resume() never gets invoked and the clock is never
started...

So I don't know if you think the ->runtime_idle usage above is a good
plan. I guess no, it's probably quite different from the USB case. I
can of course always skip using ->runtime_idle() and just use
suspend()/resume().

Any thoughts?

Thanks,

/ magnus

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8)
  2009-07-07 22:07   ` Rafael J. Wysocki
  2009-07-08  2:54     ` Alan Stern
  2009-07-08  2:54     ` Alan Stern
@ 2009-07-08  5:45     ` Magnus Damm
  2009-07-08  5:45       ` Magnus Damm
  3 siblings, 0 replies; 51+ messages in thread
From: Magnus Damm @ 2009-07-08  5:45 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg KH, LKML, ACPI Devel Maling List, Linux-pm mailing list,
	Ingo Molnar, Arjan van de Ven

On Wed, Jul 8, 2009 at 7:07 AM, Rafael J. Wysocki<rjw@sisk.pl> wrote:
> On Tuesday 07 July 2009, Magnus Damm wrote:
>> On Mon, Jul 6, 2009 at 9:52 AM, Rafael J. Wysocki<rjw@sisk.pl> wrote:
>> > Hi,
>> >
>> > There's a rev. 8 of the run-time PM framework patch.

>> All good with the code above, but there seem to be some issue with how
>> usage_count is counted up and down and when runtime_disabled is set:
>>
>> 1. pm_runtime_init(): usage_count = 1, runtime_disabled = true
>> 2. driver_probe_device(): pm_runtime_get_sync()
>> 3. pm_runtime_get_sync(): usage_count = 2
>> 4. device driver probe(): pm_runtime_enable()
>> 5. pm_runtime_enable(): usage_count = 1
>> 6. driver_probe_device(): pm_runtime_put()
>> 7. pm_runtime_put(): usage_count = 0
>>
>> I expect runtime_disabled = false in 7. Modifying the get/put calls to
>> do enable/disable may work around the issue, but that's probably not
>> what you guys want.
>
> Sure, that's my mistake.  I should have used a separate counter for
> disable/enable, but I thought usage_counter would be sufficient.  Will fix.

Thank you. No problem.

>> Issue 2:
>> ------------
>> I cannot get any bus ->runtime_resume() callbacks from probe(). This
>> also seems related to usage_count and pm_runtime_get_sync() in
>> driver_probe_device(). Basically, from probe(), calling
>> pm_runtime_resume() after pm_runtime_set_suspended() results in error
>> and not in a ->runtime_resume() callback. Some device drives access
>> hardware in probe(), so the ->runtime_resume() callback is needed at
>> that point to turn on clocks before the hardware can be accessed.
>
> I think the problem is that pm_runtime_get_sync() in driver_probe_device()
> calls ->runtime_resume(), so the device is active from the core's point of
> view when you call pm_runtime_resume() from probe().
>
> Hmm.  OK, perhaps we should just increment usage_count in
> driver_device_probe() to prevent suspends from happening at that time, without
> calling ->runtime_resume() so that the driver can do it by itself.  I'll do
> that in the next version.

Sounds good.

>> Random thought:
>> -------------------------
>> The runtime_pm_get() and runtime_pm_put() look very nice. I assume
>> that inteface is supposed to be used by bus code. I wonder if it would
>> be cleaner to use a similar counter based interface from the driver
>> instead of the pm_runtime_idle()/suspend()/resume()...
>>
>> Let me know what you think!
>
> In fact I thought drivers could also use pm_runtime_[get|put]() and the 'sync'
> versions.  At least, I don't see why not at the moment (well, I'm a bit tired
> right now ...).

I think that's a nicer interface, but I must figure out how to use
->runtime_idle before I can switch to that...

> However, I'm now thinking it should work like this:
>
> * pm_runtime_get() increments usage_count and if it was zero before the
>  incrementation, it calls pm_request_resume() (pm_runtime_resume() is called
>  by the 'sync' version).
>
> * pm_runtime_put() decrements usage_count and if it's zero after the
>  decrementation, it calls pm_request_idle() (pm_runtime_idle() is called by
>  the 'sync' version).
>
> * The 'suspend' callbacks won't succeed for usage_count > 0.
>
> This way we would avoid calling the 'suspend' and 'idle' functions each time
> unnecessarily, but then usage_count would have to be modified under the
> spinlock only.

If all usage_count users are moved under the spinlock then there would
be no need for atomic operations, right?

This get()/put() interface is interesting.

So I'd like to tie in two levels of power management in our runtime PM
implementation. The most simple level is clock stopping, and I can do
that using the bus callbacks ->runtime_suspend() and
->runtime_resume() with v8. The driver runtime callbacks are never
invoked for clock stopping.

On top of the clock stopping I'd like to turn off power to the domain.
So if all clocks are stopped to the devices within a domain, then I'd
like to call the per-device ->runtime_suspend() callbacks provided by
the drivers.

I wonder how to fit these two levels of power management into the
runtime PM in a nice way. My first attempts simply made use of
pm_runtime_resume() and pm_runtime_suspend(), but I'd like to move to
get()/put() if possible. But for that to work I need to implement
->runtime_idle() in my bus code, and I wonder if the current runtime
PM idle behaviour is a good fit.

Below is how I'd like to make use of the runtime PM code. I'm not sure
if it's compatible with your view. =)

Drivers call pm_runtime_get_sync() and pm_runtime_put() before and
after using the hardware. The runtime PM code invokes the bus
->runtime_idle() callback ASAP (of course depending on put() or
put_sync(), but no timer). The bus->runtime_idle() callback stops the
clock and decreases the power domain usage count. If the power domain
is unused, then pm_schedule_suspend() is called for each of the
devices in the power domain. This in turn will invoke the bus
->runtime_suspend() callback, which starts the clock, calls the driver
->runtime_suspend() and stops the clock again. When all devices are
runtime suspended the power domain is turned off.

I can't get the above to work with v8 though. This is because after
the clock is stopped with ->runtime_idle() the runtime_status of the
device is still RPM_ACTIVE, so when pm_runtime_get_sync() gets called
the ->runtime_resume() never gets invoked and the clock is never
started...
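
The failure described above can be reproduced with a small userspace
model. It is only a sketch under stated assumptions: struct toy_dev and
the toy_*() functions are hypothetical, the bus callbacks are reduced to
toggling a clock flag, and the core is reduced to the single rule that
->runtime_resume() is skipped for a device whose status is still active.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_status { TOY_ACTIVE, TOY_SUSPENDED };

/* Hypothetical model of one device; not the kernel's struct device. */
struct toy_dev {
	enum toy_status status;	/* models runtime_status */
	int usage_count;
	bool clock_on;
};

/* Bus ->runtime_idle(): stops the clock but leaves the status untouched. */
static void toy_bus_idle(struct toy_dev *d)
{
	d->clock_on = false;
}

/* Bus ->runtime_resume(): restarts the clock. */
static void toy_bus_resume(struct toy_dev *d)
{
	d->clock_on = true;
	d->status = TOY_ACTIVE;
}

static void toy_get_sync(struct toy_dev *d)
{
	d->usage_count++;
	/* The core skips ->runtime_resume() for a device it regards as active. */
	if (d->status != TOY_ACTIVE)
		toy_bus_resume(d);
}

static void toy_put_sync(struct toy_dev *d)
{
	if (--d->usage_count == 0)
		toy_bus_idle(d);
}
```

After one toy_get_sync()/toy_put_sync() pair the clock is off while the
status is still TOY_ACTIVE, so the next toy_get_sync() returns with the
clock stopped. In this model the clock can only come back through a path
that also moves the status away from active, which is what a full
suspend/resume cycle does.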

So I don't know if you think the ->runtime_idle usage above is a good
plan. I guess no, it's probably quite different from the USB case. I
can of course always skip using ->runtime_idle() and just use
suspend()/resume().

Any thoughts?

Thanks,

/ magnus

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O  devices (rev. 8)
  2009-07-08  4:40         ` Magnus Damm
@ 2009-07-08 14:26           ` Alan Stern
  -1 siblings, 0 replies; 51+ messages in thread
From: Alan Stern @ 2009-07-08 14:26 UTC (permalink / raw)
  To: Magnus Damm
  Cc: Rafael J. Wysocki, Linux-pm mailing list, Greg KH, LKML,
	ACPI Devel Maling List, Ingo Molnar, Arjan van de Ven

On Wed, 8 Jul 2009, Magnus Damm wrote:

> >> > All good with the code above, but there seems to be some issue with how
> >> > usage_count is counted up and down and when runtime_disabled is set:
> >> >
> >> > 1. pm_runtime_init(): usage_count = 1, runtime_disabled = true
> >> > 2. driver_probe_device(): pm_runtime_get_sync()
> >> > 3. pm_runtime_get_sync(): usage_count = 2
> >> > 4. device driver probe(): pm_runtime_enable()
> >> > 5. pm_runtime_enable(): usage_count = 1
> >> > 6. driver_probe_device(): pm_runtime_put()
> >> > 7. pm_runtime_put(): usage_count = 0
> >> >
> >> > I expect runtime_disabled = false in 7.
> >
> > Wasn't it?  It should have been set to false in step 4 and remained
> > that way.
> 
> I may misunderstand, but in v8 won't the pm_runtime_enable() function
> do an atomic_dec_and_test() where the counter value will go from 2 to 1 in
> the case above? This would mean that atomic_dec_and_test() returns false
> so runtime_disabled is never modified.

There still hasn't been any time for me to look through the code.  It 
sounds like Rafael was trying to use one counter for two separate 
purposes.
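
The seven-step trace quoted above can be replayed in userspace to show
why a single counter cannot serve both purposes. This is a sketch based
only on Magnus's description of rev. 8, not on the rev. 8 code itself:
struct toy_pm and the toy_*() helpers are hypothetical, and C11 atomics
stand in for the kernel's atomic_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model: one counter doing double duty, plus the flag. */
struct toy_pm {
	atomic_int usage_count;
	bool runtime_disabled;
};

/* Decrement and report whether the result is zero. */
static bool toy_dec_and_test(atomic_int *v)
{
	return atomic_fetch_sub(v, 1) == 1;
}

/* Step 1: the counter starts at 1 and run-time PM is disabled. */
static void toy_init(struct toy_pm *pm)
{
	atomic_init(&pm->usage_count, 1);
	pm->runtime_disabled = true;
}

/* Steps 2-3 and 6-7: the probe path takes and drops a reference. */
static void toy_get(struct toy_pm *pm)
{
	atomic_fetch_add(&pm->usage_count, 1);
}

static void toy_put(struct toy_pm *pm)
{
	atomic_fetch_sub(&pm->usage_count, 1);
}

/* Steps 4-5: the flag is cleared only if the decrement reaches zero. */
static void toy_enable(struct toy_pm *pm)
{
	if (toy_dec_and_test(&pm->usage_count))
		pm->runtime_disabled = false;
}
```

Running init, get, enable, put in that order leaves the counter at 0 with
runtime_disabled still set, because the decrement in toy_enable() goes
from 2 to 1 rather than to 0. Tracking "disabled" separately from "in
use" avoids the interference; the rev. 9 patch later in this thread adds
a disable_depth field to struct dev_pm_info next to usage_count.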

Alan Stern


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [update][RFC][PATCH] PM: Introduce core framework for run-time PM of I/O  devices (rev. 9)
  2009-07-08 14:26           ` Alan Stern
  (?)
  (?)
@ 2009-07-08 17:50           ` Rafael J. Wysocki
  -1 siblings, 0 replies; 51+ messages in thread
From: Rafael J. Wysocki @ 2009-07-08 17:50 UTC (permalink / raw)
  To: Alan Stern
  Cc: Magnus Damm, Linux-pm mailing list, Greg KH, LKML,
	ACPI Devel Maling List, Ingo Molnar, Arjan van de Ven

On Wednesday 08 July 2009, Alan Stern wrote:
> On Wed, 8 Jul 2009, Magnus Damm wrote:
> 
> > >> > All good with the code above, but there seems to be some issue with how
> > >> > usage_count is counted up and down and when runtime_disabled is set:
> > >> >
> > >> > 1. pm_runtime_init(): usage_count = 1, runtime_disabled = true
> > >> > 2. driver_probe_device(): pm_runtime_get_sync()
> > >> > 3. pm_runtime_get_sync(): usage_count = 2
> > >> > 4. device driver probe(): pm_runtime_enable()
> > >> > 5. pm_runtime_enable(): usage_count = 1
> > >> > 6. driver_probe_device(): pm_runtime_put()
> > >> > 7. pm_runtime_put(): usage_count = 0
> > >> >
> > >> > I expect runtime_disabled = false in 7.
> > >
> > > Wasn't it?  It should have been set to false in step 4 and remained
> > > that way.
> > 
> > I may misunderstand, but in v8 won't the pm_runtime_enable() function
> > do an atomic_dec_and_test() where the counter value will go from 2 to 1 in
> > the case above? This would mean that atomic_dec_and_test() returns false
> > so runtime_disabled is never modified.
> 
> There still hasn't been any time for me to look through the code.  It 
> sounds like Rafael was trying to use one counter for two separate 
> purposes.

That's correct.  It's (hopefully) fixed in the appended update of the patch.

In addition, I reworked pm_runtime_[put|get|put_sync|get_sync]() to work as
described in my previous message (i.e. 'resume' is only called if usage_count
was 0 when the function was called and 'idle' is only called if the function
has decreased the usage counter down to 0).

There also are pm_runtime_get_noresume() and pm_runtime_put_noidle() that only
increment and decrement the usage counter (respectively).  They are used in
driver_probe_device() to prevent a suspend of the device from being started
while ->probe() is running while allowing ->runtime_resume() to be called from
->probe() (as requested by Magnus).
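
The probe-time bracketing described above can be sketched in userspace as
follows. The names are hypothetical (struct toy_dev, toy_get_noresume(),
toy_put_noidle(), toy_probe_device() and so on) and the model keeps only
two rules taken from this thread: the counter-only helpers change nothing
but the usage count, and a suspend attempt is refused with -EAGAIN while
that count is above zero.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a device; not the kernel's struct device. */
struct toy_dev {
	int usage_count;
	bool suspended;
};

/* Counter-only helpers: no resume and no idle notification is triggered. */
static void toy_get_noresume(struct toy_dev *d)
{
	d->usage_count++;
}

static void toy_put_noidle(struct toy_dev *d)
{
	assert(d->usage_count > 0);
	d->usage_count--;
}

/* Suspend is refused while anyone holds a usage reference. */
static int toy_suspend(struct toy_dev *d)
{
	if (d->usage_count > 0)
		return -EAGAIN;
	d->suspended = true;
	return 0;
}

static int toy_resume(struct toy_dev *d)
{
	d->suspended = false;
	return 0;
}

/* Example ->probe(): the driver resumes the device by itself. */
static int toy_example_probe(struct toy_dev *d)
{
	return toy_resume(d);
}

/* Probe path: hold a reference across ->probe() without forcing a resume. */
static int toy_probe_device(struct toy_dev *d, int (*probe)(struct toy_dev *))
{
	int ret;

	toy_get_noresume(d);
	ret = probe(d);
	toy_put_noidle(d);
	return ret;
}
```

While toy_probe_device() holds its reference, toy_suspend() returns
-EAGAIN, yet the probe callback remains free to call toy_resume() itself.
Once the reference is dropped the device can be suspended again.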

Best,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM: Introduce core framework for run-time PM of I/O devices (rev. 9)

Introduce a core framework for run-time power management of I/O
devices.  Add device run-time PM fields to 'struct dev_pm_info'
and device run-time PM callbacks to 'struct dev_pm_ops'.  Introduce
a run-time PM workqueue and define some device run-time PM helper
functions at the core level.

Not-yet-signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/base/dd.c            |    7 
 drivers/base/power/Makefile  |    1 
 drivers/base/power/main.c    |   21 
 drivers/base/power/power.h   |   11 
 drivers/base/power/runtime.c |  950 +++++++++++++++++++++++++++++++++++++++++++
 include/linux/pm.h           |  102 ++++
 include/linux/pm_runtime.h   |  113 +++++
 kernel/power/Kconfig         |   14 
 kernel/power/main.c          |   17 
 9 files changed, 1225 insertions(+), 11 deletions(-)

Index: linux-2.6/kernel/power/Kconfig
===================================================================
--- linux-2.6.orig/kernel/power/Kconfig
+++ linux-2.6/kernel/power/Kconfig
@@ -208,3 +208,17 @@ config APM_EMULATION
 	  random kernel OOPSes or reboots that don't seem to be related to
 	  anything, try disabling/enabling this option (or disabling/enabling
 	  APM in your BIOS).
+
+config PM_RUNTIME
+	bool "Run-time PM core functionality"
+	depends on PM
+	---help---
+	  Enable functionality allowing I/O devices to be put into energy-saving
+	  (low power) states at run time (or autosuspended) after a specified
+	  period of inactivity and woken up in response to a hardware-generated
+	  wake-up event or a driver's request.
+
+	  Hardware support is generally required for this functionality to work
+	  and the bus type drivers of the buses the devices are on are
+	  responsible for the actual handling of the autosuspend requests and
+	  wake-up events.
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -11,6 +11,7 @@
 #include <linux/kobject.h>
 #include <linux/string.h>
 #include <linux/resume-trace.h>
+#include <linux/workqueue.h>
 
 #include "power.h"
 
@@ -217,8 +218,24 @@ static struct attribute_group attr_group
 	.attrs = g,
 };
 
+#ifdef CONFIG_PM_RUNTIME
+struct workqueue_struct *pm_wq;
+
+static int __init pm_start_workqueue(void)
+{
+	pm_wq = create_freezeable_workqueue("pm");
+
+	return pm_wq ? 0 : -ENOMEM;
+}
+#else
+static inline int pm_start_workqueue(void) { return 0; }
+#endif
+
 static int __init pm_init(void)
 {
+	int error = pm_start_workqueue();
+	if (error)
+		return error;
 	power_kobj = kobject_create_and_add("power", NULL);
 	if (!power_kobj)
 		return -ENOMEM;
Index: linux-2.6/include/linux/pm.h
===================================================================
--- linux-2.6.orig/include/linux/pm.h
+++ linux-2.6/include/linux/pm.h
@@ -22,6 +22,10 @@
 #define _LINUX_PM_H
 
 #include <linux/list.h>
+#include <linux/workqueue.h>
+#include <linux/spinlock.h>
+#include <linux/wait.h>
+#include <linux/timer.h>
 
 /*
  * Callbacks for platform drivers to implement.
@@ -165,6 +169,28 @@ typedef struct pm_message {
  * It is allowed to unregister devices while the above callbacks are being
  * executed.  However, it is not allowed to unregister a device from within any
  * of its own callbacks.
+ *
+ * There also are the following callbacks related to run-time power management
+ * of devices:
+ *
+ * @runtime_suspend: Prepare the device for a condition in which it won't be
+ *	able to communicate with the CPU(s) and RAM due to power management.
+ *	This need not mean that the device should be put into a low power state.
+ *	For example, if the device is behind a link which is about to be turned
+ *	off, the device may remain at full power.  If the device does go to low
+ *	power and if device_may_wakeup(dev) is true, remote wake-up (i.e., a
+ *	hardware mechanism allowing the device to request a change of its power
+ *	state, such as PCI PME) should be enabled for it.
+ *
+ * @runtime_resume: Put the device into the fully active state in response to a
+ *	wake-up event generated by hardware or at the request of software.  If
+ *	necessary, put the device into the full power state and restore its
+ *	registers, so that it is fully operational.
+ *
+ * @runtime_idle: Device appears to be inactive and it might be put into a low
+ *	power state if all of the necessary conditions are satisfied.  Check
+ *	these conditions and handle the device as appropriate, possibly queueing
+ *	a suspend request for it.
  */
 
 struct dev_pm_ops {
@@ -182,6 +208,9 @@ struct dev_pm_ops {
 	int (*thaw_noirq)(struct device *dev);
 	int (*poweroff_noirq)(struct device *dev);
 	int (*restore_noirq)(struct device *dev);
+	int (*runtime_suspend)(struct device *dev);
+	int (*runtime_resume)(struct device *dev);
+	void (*runtime_idle)(struct device *dev);
 };
 
 /**
@@ -315,14 +344,81 @@ enum dpm_state {
 	DPM_OFF_IRQ,
 };
 
+/**
+ * Device run-time power management status.
+ *
+ * These status labels are used internally by the PM core to indicate the
+ * current status of a device with respect to the PM core operations.  They do
+ * not reflect the actual power state of the device or its status as seen by the
+ * driver.
+ *
+ * RPM_ACTIVE		Device is fully operational.  Indicates that the device
+ *			bus type's ->runtime_resume() callback has completed
+ *			successfully.
+ *
+ * RPM_SUSPENDED	Device bus type's ->runtime_suspend() callback has
+ *			completed successfully.  The device is regarded as
+ *			suspended.
+ *
+ * RPM_RESUMING		Device bus type's ->runtime_resume() callback is being
+ *			executed.
+ *
+ * RPM_SUSPENDING	Device bus type's ->runtime_suspend() callback is being
+ *			executed.
+ */
+
+enum rpm_status {
+	RPM_ACTIVE = 0,
+	RPM_RESUMING,
+	RPM_SUSPENDED,
+	RPM_SUSPENDING,
+};
+
+/**
+ * Device run-time power management request types.
+ *
+ * RPM_REQ_NONE		Do nothing.
+ *
+ * RPM_REQ_IDLE		Run the device bus type's ->runtime_idle() callback
+ *
+ * RPM_REQ_SUSPEND	Run the device bus type's ->runtime_suspend() callback
+ *
+ * RPM_REQ_RESUME	Run the device bus type's ->runtime_resume() callback
+ */
+
+enum rpm_request {
+	RPM_REQ_NONE = 0,
+	RPM_REQ_IDLE,
+	RPM_REQ_SUSPEND,
+	RPM_REQ_RESUME,
+};
+
 struct dev_pm_info {
 	pm_message_t		power_state;
-	unsigned		can_wakeup:1;
-	unsigned		should_wakeup:1;
+	unsigned int		can_wakeup:1;
+	unsigned int		should_wakeup:1;
 	enum dpm_state		status;		/* Owned by the PM core */
-#ifdef	CONFIG_PM_SLEEP
+#ifdef CONFIG_PM_SLEEP
 	struct list_head	entry;
 #endif
+#ifdef CONFIG_PM_RUNTIME
+	struct timer_list	suspend_timer;
+	unsigned long		timer_expires;
+	struct work_struct	work;
+	wait_queue_head_t	wait_queue;
+	spinlock_t		lock;
+	atomic_t		usage_count;
+	atomic_t		child_count;
+	unsigned int		disable_depth:3;
+	unsigned int		ignore_children:1;
+	unsigned int		runtime_failure:1;
+	unsigned int		idle_notification:1;
+	unsigned int		request_pending:1;
+	unsigned int		deferred_resume:1;
+	enum rpm_request	request;
+	enum rpm_status		runtime_status;
+	int			last_error;
+#endif
 };
 
 /*
Index: linux-2.6/drivers/base/power/Makefile
===================================================================
--- linux-2.6.orig/drivers/base/power/Makefile
+++ linux-2.6/drivers/base/power/Makefile
@@ -1,5 +1,6 @@
 obj-$(CONFIG_PM)	+= sysfs.o
 obj-$(CONFIG_PM_SLEEP)	+= main.o
+obj-$(CONFIG_PM_RUNTIME)	+= runtime.o
 obj-$(CONFIG_PM_TRACE_RTC)	+= trace.o
 
 ccflags-$(CONFIG_DEBUG_DRIVER) := -DDEBUG
Index: linux-2.6/drivers/base/power/runtime.c
===================================================================
--- /dev/null
+++ linux-2.6/drivers/base/power/runtime.c
@@ -0,0 +1,950 @@
+/*
+ * drivers/base/power/runtime.c - Helper functions for device run-time PM
+ *
+ * Copyright (c) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+ *
+ * This file is released under the GPLv2.
+ */
+
+#include <linux/sched.h>
+#include <linux/pm_runtime.h>
+#include <linux/jiffies.h>
+
+static int __pm_request_resume(struct device *dev);
+
+/**
+ * pm_runtime_deactivate_timer - Deactivate given device's suspend timer.
+ * @dev: Device to handle.
+ */
+static void pm_runtime_deactivate_timer(struct device *dev)
+{
+	if (dev->power.timer_expires > 0) {
+		del_timer(&dev->power.suspend_timer);
+		dev->power.timer_expires = 0;
+	}
+}
+
+/**
+ * pm_runtime_cancel_pending - Deactivate suspend timer and cancel requests.
+ * @dev: Device to handle.
+ */
+static void pm_runtime_cancel_pending(struct device *dev)
+{
+	pm_runtime_deactivate_timer(dev);
+	/*
+	 * If there's a request pending, make sure its work function will return
+	 * without doing anything.
+	 */
+	if (dev->power.request_pending)
+		dev->power.request = RPM_REQ_NONE;
+}
+
+/**
+ * __pm_runtime_idle - Notify device bus type if the device can be suspended.
+ * @dev: Device to notify the bus type about.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_runtime_idle(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_failure)
+		retval = -EINVAL;
+	else if (dev->power.idle_notification)
+		retval = -EINPROGRESS;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0
+	    || dev->power.timer_expires > 0
+	    || dev->power.runtime_status == RPM_SUSPENDED
+	    || dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		return retval;
+
+	if (dev->power.request_pending) {
+		/*
+		 * If an idle notification request is pending, cancel it.  Any
+		 * other pending request takes precedence over us.
+		 */
+		if (dev->power.request == RPM_REQ_IDLE)
+			dev->power.request = RPM_REQ_NONE;
+		else if (dev->power.request != RPM_REQ_NONE)
+			return -EAGAIN;
+	}
+
+	dev->power.idle_notification = true;
+
+	spin_unlock_irq(&dev->power.lock);
+
+	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle)
+		dev->bus->pm->runtime_idle(dev);
+
+	spin_lock_irq(&dev->power.lock);
+
+	dev->power.idle_notification = false;
+	wake_up_all(&dev->power.wait_queue);
+
+	return 0;
+}
+
+/**
+ * pm_runtime_idle - Notify device bus type if the device can be suspended.
+ * @dev: Device to notify the bus type about.
+ */
+int pm_runtime_idle(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_idle(dev);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_idle);
+
+/**
+ * __pm_runtime_suspend - Carry out run-time suspend of given device.
+ * @dev: Device to suspend.
+ * @from_wq: If set, the function has been called via pm_wq.
+ *
+ * Check if the device can be suspended and run the ->runtime_suspend() callback
+ * provided by its bus type.  If another suspend has been started earlier, wait
+ * for it to finish.  If there's an idle notification pending, cancel it.  If
+ * there's a suspend request scheduled while this function is running, cancel
+ * that request.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+int __pm_runtime_suspend(struct device *dev, bool from_wq)
+{
+	struct device *parent = NULL;
+	bool notify = false;
+	int retval = 0;
+
+ repeat:
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/* Pending resume requests take precedence over us. */
+		if (dev->power.request == RPM_REQ_RESUME)
+			return -EAGAIN;
+		/* Other pending requests need to be canceled. */
+		dev->power.request = RPM_REQ_NONE;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_RESUMING
+	    || dev->power.disable_depth > 0
+	    || atomic_read(&dev->power.usage_count) > 0)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		return retval;
+
+	if (dev->power.runtime_status == RPM_SUSPENDING) {
+		DEFINE_WAIT(wait);
+
+		if (from_wq)
+			return -EINPROGRESS;
+
+		/* Wait for the other suspend running in parallel with us. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_SUSPENDING)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+		goto repeat;
+	}
+
+	dev->power.runtime_status = RPM_SUSPENDING;
+
+	spin_unlock_irq(&dev->power.lock);
+
+	retval = dev->bus && dev->bus->pm && dev->bus->pm->runtime_suspend ?
+		dev->bus->pm->runtime_suspend(dev) : -ENOSYS;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (retval) {
+		dev->power.runtime_status = RPM_ACTIVE;
+		pm_runtime_cancel_pending(dev);
+		dev->power.deferred_resume = false;
+
+		if (retval == -EAGAIN || retval == -EBUSY) {
+			notify = true;
+		} else {
+			dev->power.runtime_failure = true;
+			dev->power.last_error = retval;
+		}
+	} else {
+		dev->power.runtime_status = RPM_SUSPENDED;
+
+		if (dev->parent) {
+			parent = dev->parent;
+			atomic_add_unless(&parent->power.child_count, -1, 0);
+		}
+
+	}
+	wake_up_all(&dev->power.wait_queue);
+
+	if (dev->power.deferred_resume) {
+		__pm_request_resume(dev);
+		dev->power.deferred_resume = false;
+	}
+
+	spin_unlock_irq(&dev->power.lock);
+
+	if (parent && !parent->power.ignore_children)
+		pm_request_idle(parent);
+
+	if (notify)
+		pm_runtime_idle(dev);
+
+	spin_lock_irq(&dev->power.lock);
+
+	return retval;
+}
+
+/**
+ * pm_runtime_suspend - Carry out run-time suspend of given device.
+ * @dev: Device to suspend.
+ */
+int pm_runtime_suspend(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_suspend(dev, false);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_suspend);
+
+/**
+ * __pm_runtime_resume - Carry out run-time resume of given device.
+ * @dev: Device to resume.
+ * @from_wq: If set, the function has been called via pm_wq.
+ *
+ * Check if the device can be woken up and run the ->runtime_resume() callback
+ * provided by its bus type.  If another resume has been started earlier, wait
+ * for it to finish.  If there's a suspend running in parallel with this
+ * function, wait for it to finish and resume the device.  If there's a suspend
+ * request or idle notification pending, cancel it.  If there's a resume request
+ * scheduled while this function is running, cancel that request.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+int __pm_runtime_resume(struct device *dev, bool from_wq)
+{
+	struct device *parent = NULL;
+	int retval = 0;
+
+ repeat:
+	if (dev->power.runtime_failure)
+		return -ENODEV;
+
+	pm_runtime_cancel_pending(dev);
+
+	if (dev->power.runtime_status == RPM_ACTIVE)
+		retval = 1;
+	else if (dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	if (retval)
+		return retval;
+
+	if (dev->power.runtime_status == RPM_RESUMING
+	    || dev->power.runtime_status == RPM_SUSPENDING) {
+		DEFINE_WAIT(wait);
+
+		if (from_wq) {
+			if (dev->power.runtime_status == RPM_SUSPENDING)
+				dev->power.deferred_resume = true;
+			return -EINPROGRESS;
+		}
+
+		/* Wait for the operation carried out in parallel with us. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_RESUMING
+			    && dev->power.runtime_status != RPM_SUSPENDING)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+		goto repeat;
+	}
+
+	if (!parent && dev->parent) {
+		/*
+		 * Increment the parent's resume counter and resume it if
+		 * necessary.
+		 */
+		spin_unlock_irq(&dev->power.lock);
+
+		parent = dev->parent;
+		retval = pm_runtime_get_sync(parent);
+		if (retval < 0)
+			goto out_parent;
+
+		spin_lock_irq(&dev->power.lock);
+		retval = 0;
+		goto repeat;
+	}
+
+	dev->power.runtime_status = RPM_RESUMING;
+
+	spin_unlock_irq(&dev->power.lock);
+
+	retval = dev->bus && dev->bus->pm && dev->bus->pm->runtime_resume ?
+		dev->bus->pm->runtime_resume(dev) : -ENOSYS;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (retval) {
+		dev->power.runtime_status = RPM_SUSPENDED;
+
+		dev->power.runtime_failure = true;
+		dev->power.last_error = retval;
+
+		pm_runtime_cancel_pending(dev);
+	} else {
+		dev->power.runtime_status = RPM_ACTIVE;
+
+		if (parent)
+			atomic_inc(&parent->power.child_count);
+	}
+	wake_up_all(&dev->power.wait_queue);
+
+	spin_unlock_irq(&dev->power.lock);
+
+ out_parent:
+	if (parent)
+		pm_runtime_put(parent);
+
+	if (!retval)
+		pm_request_idle(dev);
+
+	spin_lock_irq(&dev->power.lock);
+
+	return retval;
+}
+
+/**
+ * pm_runtime_resume - Carry out run-time resume of given device.
+ * @dev: Device to resume.
+ */
+int pm_runtime_resume(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_resume(dev, false);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_resume);
+
+/**
+ * pm_runtime_work - Universal run-time PM work function.
+ * @work: Work structure used for scheduling the execution of this function.
+ *
+ * Use @work to get the device object the work is to be done for, determine what
+ * is to be done and execute the appropriate run-time PM function.
+ */
+static void pm_runtime_work(struct work_struct *work)
+{
+	struct device *dev = container_of(work, struct device, power.work);
+	enum rpm_request req;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (!dev->power.request_pending)
+		goto out;
+
+	req = dev->power.request;
+	dev->power.request = RPM_REQ_NONE;
+	dev->power.request_pending = false;
+
+	switch (req) {
+	case RPM_REQ_NONE:
+		break;
+	case RPM_REQ_IDLE:
+		__pm_runtime_idle(dev);
+		break;
+	case RPM_REQ_SUSPEND:
+		__pm_runtime_suspend(dev, true);
+		break;
+	case RPM_REQ_RESUME:
+		__pm_runtime_resume(dev, true);
+		break;
+	}
+
+ out:
+	spin_unlock_irq(&dev->power.lock);
+}
+
+/**
+ * pm_request_idle - Submit an idle notification request for given device.
+ * @dev: Device to handle.
+ *
+ * Check if the device's run-time PM status is correct for suspending the device
+ * and queue up a request to run __pm_runtime_idle() for it.
+ */
+int pm_request_idle(struct device *dev)
+{
+	unsigned long flags;
+	int retval = 0;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (dev->power.runtime_failure)
+		retval = -EINVAL;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0
+	    || dev->power.timer_expires > 0
+	    || dev->power.runtime_status == RPM_SUSPENDED
+	    || dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		goto out;
+
+	if (dev->power.request_pending && dev->power.request != RPM_REQ_NONE) {
+		/* Any requests other than RPM_REQ_IDLE take precedence. */
+		if (dev->power.request != RPM_REQ_IDLE)
+			retval = -EAGAIN;
+		goto out;
+	}
+
+	dev->power.request = RPM_REQ_IDLE;
+	if (dev->power.request_pending)
+		goto out;
+
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_request_idle);
+
+/**
+ * __pm_request_suspend - Submit a suspend request for given device.
+ * @dev: Device to suspend.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_request_suspend(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	else if (dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EINPROGRESS;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/*
+		 * Pending resume requests take precedence over us, but we can
+		 * overtake any other pending request.
+		 */
+		if (dev->power.request == RPM_REQ_RESUME)
+			retval = -EAGAIN;
+		else if (dev->power.request != RPM_REQ_SUSPEND)
+			dev->power.request = retval ?
+						RPM_REQ_NONE : RPM_REQ_SUSPEND;
+
+		if (dev->power.request == RPM_REQ_SUSPEND)
+			return 0;
+	}
+
+	if (retval)
+		return retval;
+
+	dev->power.request = RPM_REQ_SUSPEND;
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+	return 0;
+}
+
+/**
+ * pm_suspend_timer_fn - Timer function for pm_schedule_suspend().
+ * @data: Device pointer passed by pm_schedule_suspend().
+ *
+ * Check if the time is right and execute __pm_request_suspend() in that case.
+ */
+static void pm_suspend_timer_fn(unsigned long data)
+{
+	struct device *dev = (struct device *)data;
+	unsigned long flags;
+	unsigned long expires;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	expires = dev->power.timer_expires;
+	/* If 'expires' is after 'jiffies' we've been called too early. */
+	if (expires > 0 && !time_after(expires, jiffies)) {
+		dev->power.timer_expires = 0;
+		__pm_request_suspend(dev);
+	}
+
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+}
+
+/**
+ * pm_schedule_suspend - Set up a timer to submit a suspend request in future.
+ * @dev: Device to suspend.
+ * @delay: Time to wait before submitting a suspend request, in milliseconds.
+ */
+int pm_schedule_suspend(struct device *dev, unsigned int delay)
+{
+	unsigned long flags;
+	int retval = 0;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (dev->power.runtime_failure) {
+		retval = -EINVAL;
+		goto out;
+	}
+
+	if (!delay) {
+		retval = __pm_request_suspend(dev);
+		goto out;
+	}
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/*
+		 * Pending resume requests take precedence over us, but any
+		 * other pending requests have to be canceled.
+		 */
+		if (dev->power.request == RPM_REQ_RESUME) {
+			retval = -EAGAIN;
+			goto out;
+		}
+		dev->power.request = RPM_REQ_NONE;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EINPROGRESS;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		goto out;
+
+	dev->power.timer_expires = jiffies + msecs_to_jiffies(delay);
+	mod_timer(&dev->power.suspend_timer, dev->power.timer_expires);
+
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_schedule_suspend);
+
+/**
+ * __pm_request_resume - Submit a resume request for given device.
+ * @dev: Device to resume.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+int __pm_request_resume(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	if (dev->power.runtime_status == RPM_ACTIVE)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_RESUMING)
+		retval = -EINPROGRESS;
+	else if (dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/* If non-resume request is pending, we can overtake it. */
+		dev->power.request = retval ? RPM_REQ_NONE : RPM_REQ_RESUME;
+		/* There's nothing to do if resume request is pending. */
+		if (dev->power.request == RPM_REQ_RESUME)
+			return 0;
+	}
+
+	if (retval)
+		return retval;
+
+	dev->power.request = RPM_REQ_RESUME;
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+	return retval;
+}
+
+/**
+ * pm_request_resume - Submit a resume request for given device.
+ * @dev: Device to resume.
+ */
+int pm_request_resume(struct device *dev)
+{
+	unsigned long flags;
+	int retval;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+	retval = __pm_request_resume(dev);
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_request_resume);
+
+/**
+ * __pm_runtime_get - Reference count a device and wake it up, if necessary.
+ * @dev: Device to handle.
+ * @sync: If set and the device is suspended, resume it synchronously.
+ *
+ * Increment the usage count of the device and if it was zero previously,
+ * resume it or submit a resume request for it, depending on the value of @sync.
+ */
+int __pm_runtime_get(struct device *dev, bool sync)
+{
+	int retval = 1;
+
+	if (atomic_add_return(1, &dev->power.usage_count) == 1)
+		retval = sync ? pm_runtime_resume(dev) : pm_request_resume(dev);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(__pm_runtime_get);
+
+/**
+ * __pm_runtime_put - Decrement the device's usage counter and notify its bus.
+ * @dev: Device to handle.
+ * @sync: If the device's bus type is to be notified, do that synchronously.
+ *
+ * Decrement the usage count of the device and if it reaches zero, carry out a
+ * synchronous idle notification or submit an idle notification request for it,
+ * depending on the value of @sync.
+ */
+int __pm_runtime_put(struct device *dev, bool sync)
+{
+	int retval = 0;
+
+	if (atomic_dec_and_test(&dev->power.usage_count))
+		retval = sync ? pm_runtime_idle(dev) : pm_request_idle(dev);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(__pm_runtime_put);
+
+/**
+ * __pm_runtime_set_status - Set run-time PM status of a device.
+ * @dev: Device to handle.
+ * @status: New run-time PM status of the device.
+ *
+ * If run-time PM of the device is disabled or its power.runtime_failure flag is
+ * set, the status may be changed either to RPM_ACTIVE, or to RPM_SUSPENDED, as
+ * long as that reflects the actual state of the device.  However, if the device
+ * has a parent and the parent is not active, and the parent's
+ * power.ignore_children flag is unset, the device's status cannot be set to
+ * RPM_ACTIVE, so -EBUSY is returned in that case.
+ *
+ * If successful, __pm_runtime_set_status() clears the power.runtime_failure
+ * flag and the device parent's counter of unsuspended children is modified to
+ * reflect the new status.
+ */
+int __pm_runtime_set_status(struct device *dev, unsigned int status)
+{
+	struct device *parent = dev->parent;
+	unsigned long flags;
+	int error = 0;
+
+	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
+		return -EINVAL;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (!dev->power.runtime_failure && !dev->power.disable_depth)
+		goto out;
+
+	if (dev->power.runtime_status == status)
+		goto out_clear;
+
+	if (status == RPM_SUSPENDED) {
+		/* It always is possible to set the status to 'suspended'. */
+		if (parent)
+			atomic_add_unless(&parent->power.child_count, -1, 0);
+		dev->power.runtime_status = status;
+		goto out_clear;
+	}
+
+	if (parent) {
+		spin_lock_irq(&parent->power.lock);
+
+		/*
+		 * It may be invalid to put an active child under a suspended
+		 * parent.
+		 */
+		if (parent->power.runtime_status == RPM_ACTIVE
+		    || parent->power.ignore_children) {
+			if (dev->power.runtime_status == RPM_SUSPENDED)
+				atomic_inc(&parent->power.child_count);
+			dev->power.runtime_status = status;
+		} else {
+			error = -EBUSY;
+		}
+
+		spin_unlock_irq(&parent->power.lock);
+
+		if (error)
+			goto out;
+	} else {
+		dev->power.runtime_status = status;
+	}
+
+ out_clear:
+	dev->power.runtime_failure = false;
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return error;
+}
+EXPORT_SYMBOL_GPL(__pm_runtime_set_status);
+
+/**
+ * pm_runtime_enable - Enable run-time PM of a device.
+ * @dev: Device to handle.
+ */
+void pm_runtime_enable(struct device *dev)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (dev->power.disable_depth > 0)
+		dev->power.disable_depth--;
+	else
+		dev_warn(dev, "Unbalanced %s!\n", __func__);
+
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+}
+EXPORT_SYMBOL_GPL(pm_runtime_enable);
+
+/**
+ * pm_runtime_disable - Disable run-time PM of a device.
+ * @dev: Device to handle.
+ *
+ * Increment power.disable_depth for the device and if it was zero previously,
+ * cancel all pending run-time PM requests for the device and wait for all
+ * operations in progress to complete.  The device can be either active or
+ * suspended after its run-time PM has been disabled.
+ *
+ * If there's a resume request pending when pm_runtime_disable() is called and
+ * power.disable_depth is zero, the function will resume the device before
+ * disabling its run-time PM and will return -EBUSY.  Otherwise, 0 is returned.
+ */
+int pm_runtime_disable(struct device *dev)
+{
+	int retval = 0;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (dev->power.disable_depth > 0) {
+		dev->power.disable_depth++;
+		goto out;
+	}
+
+	/*
+	 * Wake up the device if there's a resume request pending, because that
+	 * means there probably is some I/O to process and disabling run-time PM
+	 * shouldn't prevent the device from processing the I/O.
+	 */
+	if (dev->power.request_pending
+	    && dev->power.request == RPM_REQ_RESUME) {
+		/*
+		 * Prevent suspends and idle notifications from being carried
+		 * out after we have woken up the device.
+		 */
+		pm_runtime_get_noresume(dev);
+
+		__pm_runtime_resume(dev, false);
+
+		pm_runtime_put_noidle(dev);
+		retval = -EBUSY;
+	}
+
+	if (dev->power.disable_depth++ > 0)
+		goto out;
+
+	if (dev->power.runtime_failure)
+		goto out;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		dev->power.request = RPM_REQ_NONE;
+
+		spin_unlock_irq(&dev->power.lock);
+
+		cancel_work_sync(&dev->power.work);
+
+		spin_lock_irq(&dev->power.lock);
+
+		dev->power.request_pending = false;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDING
+	    || dev->power.runtime_status == RPM_RESUMING) {
+		DEFINE_WAIT(wait);
+
+		/* Suspend or wake-up in progress. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_SUSPENDING
+			    && dev->power.runtime_status != RPM_RESUMING)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+	}
+
+	if (dev->power.runtime_failure)
+		goto out;
+
+	if (dev->power.idle_notification) {
+		DEFINE_WAIT(wait);
+
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (!dev->power.idle_notification)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+	}
+
+ out:
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_disable);
+
+/**
+ * pm_runtime_init - Initialize run-time PM fields in given device object.
+ * @dev: Device object to initialize.
+ */
+void pm_runtime_init(struct device *dev)
+{
+	spin_lock_init(&dev->power.lock);
+
+	dev->power.runtime_status = RPM_ACTIVE;
+	dev->power.idle_notification = false;
+
+	dev->power.disable_depth = 1;
+	atomic_set(&dev->power.usage_count, 0);
+
+	dev->power.runtime_failure = false;
+	dev->power.last_error = 0;
+
+	atomic_set(&dev->power.child_count, 0);
+	pm_suspend_ignore_children(dev, false);
+
+	dev->power.request_pending = false;
+	dev->power.request = RPM_REQ_NONE;
+	dev->power.deferred_resume = false;
+	INIT_WORK(&dev->power.work, pm_runtime_work);
+
+	dev->power.timer_expires = 0;
+	init_timer(&dev->power.suspend_timer);
+	dev->power.suspend_timer.data = (unsigned long)dev;
+	dev->power.suspend_timer.function = pm_suspend_timer_fn;
+
+	init_waitqueue_head(&dev->power.wait_queue);
+}
+
+/**
+ * pm_runtime_add - Update run-time PM fields of a device while adding it.
+ * @dev: Device object being added to device hierarchy.
+ */
+void pm_runtime_add(struct device *dev)
+{
+	if (dev->parent)
+		atomic_inc(&dev->parent->power.child_count);
+}
+
+/**
+ * pm_runtime_remove - Prepare for removing a device from device hierarchy.
+ * @dev: Device object being removed from device hierarchy.
+ */
+void pm_runtime_remove(struct device *dev)
+{
+	struct device *parent = dev->parent;
+
+	pm_runtime_disable(dev);
+
+	if (dev->power.runtime_status != RPM_SUSPENDED && parent) {
+		atomic_add_unless(&parent->power.child_count, -1, 0);
+		if (!parent->power.ignore_children)
+			pm_request_idle(parent);
+	}
+}
Index: linux-2.6/include/linux/pm_runtime.h
===================================================================
--- /dev/null
+++ linux-2.6/include/linux/pm_runtime.h
@@ -0,0 +1,113 @@
+/*
+ * pm_runtime.h - Device run-time power management helper functions.
+ *
+ * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * This file is released under the GPLv2.
+ */
+
+#ifndef _LINUX_PM_RUNTIME_H
+#define _LINUX_PM_RUNTIME_H
+
+#include <linux/device.h>
+#include <linux/pm.h>
+
+#ifdef CONFIG_PM_RUNTIME
+
+extern struct workqueue_struct *pm_wq;
+
+extern void pm_runtime_init(struct device *dev);
+extern void pm_runtime_add(struct device *dev);
+extern void pm_runtime_remove(struct device *dev);
+extern int pm_runtime_idle(struct device *dev);
+extern int pm_runtime_suspend(struct device *dev);
+extern int pm_runtime_resume(struct device *dev);
+extern int pm_request_idle(struct device *dev);
+extern int pm_schedule_suspend(struct device *dev, unsigned int delay);
+extern int pm_request_resume(struct device *dev);
+extern int __pm_runtime_get(struct device *dev, bool sync);
+extern int __pm_runtime_put(struct device *dev, bool sync);
+extern int __pm_runtime_set_status(struct device *dev, unsigned int status);
+extern void pm_runtime_enable(struct device *dev);
+extern int pm_runtime_disable(struct device *dev);
+
+static inline bool pm_children_suspended(struct device *dev)
+{
+	return dev->power.ignore_children
+		|| !atomic_read(&dev->power.child_count);
+}
+
+static inline void pm_suspend_ignore_children(struct device *dev, bool enable)
+{
+	dev->power.ignore_children = enable;
+}
+
+static inline void pm_runtime_get_noresume(struct device *dev)
+{
+	atomic_inc(&dev->power.usage_count);
+}
+
+static inline void pm_runtime_put_noidle(struct device *dev)
+{
+	atomic_add_unless(&dev->power.usage_count, -1, 0);
+}
+
+#else /* !CONFIG_PM_RUNTIME */
+
+static inline void pm_runtime_init(struct device *dev) {}
+static inline void pm_runtime_add(struct device *dev) {}
+static inline void pm_runtime_remove(struct device *dev) {}
+static inline int pm_runtime_idle(struct device *dev) { return -ENOSYS; }
+static inline int pm_runtime_suspend(struct device *dev) { return -ENOSYS; }
+static inline int pm_runtime_resume(struct device *dev) { return 0; }
+static inline int pm_request_idle(struct device *dev) { return -ENOSYS; }
+static inline int pm_schedule_suspend(struct device *dev, unsigned int delay)
+{
+	return -ENOSYS;
+}
+static inline int pm_request_resume(struct device *dev) { return 0; }
+static inline int __pm_runtime_get(struct device *dev, bool sync) { return 1; }
+static inline int __pm_runtime_put(struct device *dev, bool sync) { return 0; }
+static inline int __pm_runtime_set_status(struct device *dev,
+					    unsigned int status) { return 0; }
+static inline void pm_runtime_enable(struct device *dev) {}
+static inline int pm_runtime_disable(struct device *dev) { return 0; }
+
+static inline bool pm_children_suspended(struct device *dev) { return false; }
+static inline void pm_suspend_ignore_children(struct device *dev, bool en) {}
+static inline void pm_runtime_get_noresume(struct device *dev) {}
+static inline void pm_runtime_put_noidle(struct device *dev) {}
+
+#endif /* !CONFIG_PM_RUNTIME */
+
+static inline int pm_runtime_get(struct device *dev)
+{
+	return __pm_runtime_get(dev, false);
+}
+
+static inline int pm_runtime_get_sync(struct device *dev)
+{
+	return __pm_runtime_get(dev, true);
+}
+
+static inline int pm_runtime_put(struct device *dev)
+{
+	return __pm_runtime_put(dev, false);
+}
+
+static inline int pm_runtime_put_sync(struct device *dev)
+{
+	return __pm_runtime_put(dev, true);
+}
+
+static inline int pm_runtime_set_active(struct device *dev)
+{
+	return __pm_runtime_set_status(dev, RPM_ACTIVE);
+}
+
+static inline void pm_runtime_set_suspended(struct device *dev)
+{
+	__pm_runtime_set_status(dev, RPM_SUSPENDED);
+}
+
+#endif
Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -21,6 +21,7 @@
 #include <linux/kallsyms.h>
 #include <linux/mutex.h>
 #include <linux/pm.h>
+#include <linux/pm_runtime.h>
 #include <linux/resume-trace.h>
 #include <linux/rwsem.h>
 #include <linux/interrupt.h>
@@ -49,6 +50,16 @@ static DEFINE_MUTEX(dpm_list_mtx);
 static bool transition_started;
 
 /**
+ * device_pm_init - Initialize the PM-related part of a device object
+ * @dev: Device object to initialize.
+ */
+void device_pm_init(struct device *dev)
+{
+	dev->power.status = DPM_ON;
+	pm_runtime_init(dev);
+}
+
+/**
  *	device_pm_lock - lock the list of active devices used by the PM core
  */
 void device_pm_lock(void)
@@ -89,6 +100,8 @@ void device_pm_add(struct device *dev)
 
 	list_add_tail(&dev->power.entry, &dpm_list);
 	mutex_unlock(&dpm_list_mtx);
+
+	pm_runtime_add(dev);
 }
 
 /**
@@ -105,6 +118,8 @@ void device_pm_remove(struct device *dev
 	mutex_lock(&dpm_list_mtx);
 	list_del_init(&dev->power.entry);
 	mutex_unlock(&dpm_list_mtx);
+
+	pm_runtime_remove(dev);
 }
 
 /**
@@ -510,6 +525,7 @@ static void dpm_complete(pm_message_t st
 			mutex_unlock(&dpm_list_mtx);
 
 			device_complete(dev, state);
+			pm_runtime_enable(dev);
 
 			mutex_lock(&dpm_list_mtx);
 		}
@@ -755,11 +771,14 @@ static int dpm_prepare(pm_message_t stat
 		dev->power.status = DPM_PREPARING;
 		mutex_unlock(&dpm_list_mtx);
 
-		error = device_prepare(dev, state);
+		error = pm_runtime_disable(dev);
+		if (!error || !device_may_wakeup(dev))
+			error = device_prepare(dev, state);
 
 		mutex_lock(&dpm_list_mtx);
 		if (error) {
 			dev->power.status = DPM_ON;
+			pm_runtime_enable(dev);
 			if (error == -EAGAIN) {
 				put_device(dev);
 				error = 0;
Index: linux-2.6/drivers/base/dd.c
===================================================================
--- linux-2.6.orig/drivers/base/dd.c
+++ linux-2.6/drivers/base/dd.c
@@ -23,6 +23,7 @@
 #include <linux/kthread.h>
 #include <linux/wait.h>
 #include <linux/async.h>
+#include <linux/pm_runtime.h>
 
 #include "base.h"
 #include "power/power.h"
@@ -202,7 +203,9 @@ int driver_probe_device(struct device_dr
 	pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
 		 drv->bus->name, __func__, dev_name(dev), drv->name);
 
+	pm_runtime_get_noresume(dev);
 	ret = really_probe(dev, drv);
+	pm_runtime_put_noidle(dev);
 
 	return ret;
 }
@@ -306,6 +309,8 @@ static void __device_release_driver(stru
 
 	drv = dev->driver;
 	if (drv) {
+		pm_runtime_disable(dev);
+
 		driver_sysfs_remove(dev);
 
 		if (dev->bus)
@@ -324,6 +329,8 @@ static void __device_release_driver(stru
 			blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
 						     BUS_NOTIFY_UNBOUND_DRIVER,
 						     dev);
+
+		pm_runtime_enable(dev);
 	}
 }
 
Index: linux-2.6/drivers/base/power/power.h
===================================================================
--- linux-2.6.orig/drivers/base/power/power.h
+++ linux-2.6/drivers/base/power/power.h
@@ -1,8 +1,3 @@
-static inline void device_pm_init(struct device *dev)
-{
-	dev->power.status = DPM_ON;
-}
-
 #ifdef CONFIG_PM_SLEEP
 
 /*
@@ -16,14 +11,16 @@ static inline struct device *to_device(s
 	return container_of(entry, struct device, power.entry);
 }
 
+extern void device_pm_init(struct device *dev);
 extern void device_pm_add(struct device *);
 extern void device_pm_remove(struct device *);
 extern void device_pm_move_before(struct device *, struct device *);
 extern void device_pm_move_after(struct device *, struct device *);
 extern void device_pm_move_last(struct device *);
 
-#else /* CONFIG_PM_SLEEP */
+#else /* !CONFIG_PM_SLEEP */
 
+static inline void device_pm_init(struct device *dev) {}
 static inline void device_pm_add(struct device *dev) {}
 static inline void device_pm_remove(struct device *dev) {}
 static inline void device_pm_move_before(struct device *deva,
@@ -32,7 +29,7 @@ static inline void device_pm_move_after(
 					struct device *devb) {}
 static inline void device_pm_move_last(struct device *dev) {}
 
-#endif
+#endif /* !CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM
 


* [update][RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 9)
  2009-07-08 14:26           ` Alan Stern
  (?)
@ 2009-07-08 17:50           ` Rafael J. Wysocki
  -1 siblings, 0 replies; 51+ messages in thread
From: Rafael J. Wysocki @ 2009-07-08 17:50 UTC (permalink / raw)
  To: Alan Stern
  Cc: Greg KH, LKML, ACPI Devel Maling List, Linux-pm mailing list,
	Ingo Molnar, Arjan van de Ven

On Wednesday 08 July 2009, Alan Stern wrote:
> On Wed, 8 Jul 2009, Magnus Damm wrote:
> 
> > >> > All good with the code above, but there seem to be some issue with how
> > >> > usage_count is counted up and down and when runtime_disabled is set:
> > >> >
> > >> > 1. pm_runtime_init(): usage_count = 1, runtime_disabled = true
> > >> > 2. driver_probe_device(): pm_runtime_get_sync()
> > >> > 3. pm_runtime_get_sync(): usage_count = 2
> > >> > 4. device driver probe(): pm_runtime_enable()
> > >> > 5. pm_runtime_enable(): usage_count = 1
> > >> > 6. driver_probe_device(): pm_runtime_put()
> > >> > 7. pm_runtime_put(): usage_count = 0
> > >> >
> > >> > I expect runtime_disabled = false in 7.
> > >
> > > Wasn't it?  It should have been set to false in step 4 and remained
> > > that way.
> > 
> > I may misunderstand, but in v8 won't the pm_runtime_enable() function
> > do a atomic_dec_test() where the counter value will go from 2 to 1 in
> > the case above? This would mean that atomic_dec_test() returns false
> > so runtime_disabled is never modified.
> 
> There still hasn't been any time for me to look through the code.  It 
> sounds like Rafael was trying to use one counter for two separate 
> purposes.

That's correct.  It's (hopefully) fixed in the appended update of the patch.

In addition, I reworked pm_runtime_[put|get|put_sync|get_sync]() to work as
described in my previous message (i.e. 'resume' is only called if usage_count
was 0 when the function was called and 'idle' is only called if the function
has decreased the usage counter down to 0).

There also are pm_runtime_get_noresume() and pm_runtime_put_noidle() that only
increment and decrement the usage counter (respectively).  They are used in
driver_probe_device() to prevent a suspend of the device from being started
while ->probe() is running, while still allowing ->runtime_resume() to be
called from ->probe() (as requested by Magnus).
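
To make the counting rules above concrete, here is a minimal user-space
sketch.  It is not the patch code: 'struct model_dev' and the model_*()
helpers are hypothetical stand-ins, the atomics and locking are left out, and
'resume'/'idle' are reduced to counters.  It only assumes what is stated
above, namely that 'resume' is triggered when the counter goes from 0 to 1,
'idle' when it drops back to 0, and that the _noresume/_noidle variants touch
the counter alone.

```c
#include <assert.h>

/* Hypothetical stand-in for the run-time PM fields of a device. */
struct model_dev {
	int usage_count;	/* models power.usage_count */
	int resume_calls;	/* number of times 'resume' was triggered */
	int idle_calls;		/* number of times 'idle' was triggered */
};

/* Models pm_runtime_get(): 'resume' only if the counter was 0 before. */
static void model_get(struct model_dev *d)
{
	if (++d->usage_count == 1)
		d->resume_calls++;
}

/* Models pm_runtime_put(): 'idle' only if the counter has dropped to 0. */
static void model_put(struct model_dev *d)
{
	if (--d->usage_count == 0)
		d->idle_calls++;
}

/* Models pm_runtime_get_noresume(): counter only, no 'resume'. */
static void model_get_noresume(struct model_dev *d)
{
	d->usage_count++;
}

/* Models pm_runtime_put_noidle(): counter only, never below 0, no 'idle'. */
static void model_put_noidle(struct model_dev *d)
{
	if (d->usage_count > 0)
		d->usage_count--;
}

/*
 * Models the bracket around ->probe() in driver_probe_device().  The get/put
 * pair in the middle plays the role of a (hypothetical) driver that takes and
 * drops a reference from within its ->probe().
 */
static void model_probe_sequence(struct model_dev *d)
{
	model_get_noresume(d);	/* usage_count: 0 -> 1, no 'resume' */
	model_get(d);		/* 1 -> 2, counter was not 0, no 'resume' */
	model_put(d);		/* 2 -> 1, counter is not 0, no 'idle' */
	model_put_noidle(d);	/* 1 -> 0, no 'idle' by definition */
}
```

In this model the probe bracket leaves the counter balanced without ever
triggering 'resume' or 'idle', so no suspend can be started as a side effect
of ->probe().  A plain model_get()/model_put() pair issued afterwards triggers
each of them exactly once.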

Best,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM: Introduce core framework for run-time PM of I/O devices (rev. 9)

Introduce a core framework for run-time power management of I/O
devices.  Add device run-time PM fields to 'struct dev_pm_info'
and device run-time PM callbacks to 'struct dev_pm_ops'.  Introduce
a run-time PM workqueue and define some device run-time PM helper
functions at the core level.

Not-yet-signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/base/dd.c            |    7 
 drivers/base/power/Makefile  |    1 
 drivers/base/power/main.c    |   21 
 drivers/base/power/power.h   |   11 
 drivers/base/power/runtime.c |  950 +++++++++++++++++++++++++++++++++++++++++++
 include/linux/pm.h           |  102 ++++
 include/linux/pm_runtime.h   |  113 +++++
 kernel/power/Kconfig         |   14 
 kernel/power/main.c          |   17 
 9 files changed, 1225 insertions(+), 11 deletions(-)

Index: linux-2.6/kernel/power/Kconfig
===================================================================
--- linux-2.6.orig/kernel/power/Kconfig
+++ linux-2.6/kernel/power/Kconfig
@@ -208,3 +208,17 @@ config APM_EMULATION
 	  random kernel OOPSes or reboots that don't seem to be related to
 	  anything, try disabling/enabling this option (or disabling/enabling
 	  APM in your BIOS).
+
+config PM_RUNTIME
+	bool "Run-time PM core functionality"
+	depends on PM
+	---help---
+	  Enable functionality allowing I/O devices to be put into energy-saving
+	  (low power) states at run time (or autosuspended) after a specified
+	  period of inactivity and woken up in response to a hardware-generated
+	  wake-up event or a driver's request.
+
+	  Hardware support is generally required for this functionality to work
+	  and the bus type drivers of the buses the devices are on are
+	  responsible for the actual handling of the autosuspend requests and
+	  wake-up events.
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -11,6 +11,7 @@
 #include <linux/kobject.h>
 #include <linux/string.h>
 #include <linux/resume-trace.h>
+#include <linux/workqueue.h>
 
 #include "power.h"
 
@@ -217,8 +218,24 @@ static struct attribute_group attr_group
 	.attrs = g,
 };
 
+#ifdef CONFIG_PM_RUNTIME
+struct workqueue_struct *pm_wq;
+
+static int __init pm_start_workqueue(void)
+{
+	pm_wq = create_freezeable_workqueue("pm");
+
+	return pm_wq ? 0 : -ENOMEM;
+}
+#else
+static inline int pm_start_workqueue(void) { return 0; }
+#endif
+
 static int __init pm_init(void)
 {
+	int error = pm_start_workqueue();
+	if (error)
+		return error;
 	power_kobj = kobject_create_and_add("power", NULL);
 	if (!power_kobj)
 		return -ENOMEM;
Index: linux-2.6/include/linux/pm.h
===================================================================
--- linux-2.6.orig/include/linux/pm.h
+++ linux-2.6/include/linux/pm.h
@@ -22,6 +22,10 @@
 #define _LINUX_PM_H
 
 #include <linux/list.h>
+#include <linux/workqueue.h>
+#include <linux/spinlock.h>
+#include <linux/wait.h>
+#include <linux/timer.h>
 
 /*
  * Callbacks for platform drivers to implement.
@@ -165,6 +169,28 @@ typedef struct pm_message {
  * It is allowed to unregister devices while the above callbacks are being
  * executed.  However, it is not allowed to unregister a device from within any
  * of its own callbacks.
+ *
+ * There also are the following callbacks related to run-time power management
+ * of devices:
+ *
+ * @runtime_suspend: Prepare the device for a condition in which it won't be
+ *	able to communicate with the CPU(s) and RAM due to power management.
+ *	This need not mean that the device should be put into a low power state.
+ *	For example, if the device is behind a link which is about to be turned
+ *	off, the device may remain at full power.  If the device does go to low
+ *	power and if device_may_wakeup(dev) is true, remote wake-up (i.e., a
+ *	hardware mechanism allowing the device to request a change of its power
+ *	state, such as PCI PME) should be enabled for it.
+ *
+ * @runtime_resume: Put the device into the fully active state in response to a
+ *	wake-up event generated by hardware or at the request of software.  If
+ *	necessary, put the device into the full power state and restore its
+ *	registers, so that it is fully operational.
+ *
+ * @runtime_idle: Device appears to be inactive and it might be put into a low
+ *	power state if all of the necessary conditions are satisfied.  Check
+ *	these conditions and handle the device as appropriate, possibly queueing
+ *	a suspend request for it.
  */
 
 struct dev_pm_ops {
@@ -182,6 +208,9 @@ struct dev_pm_ops {
 	int (*thaw_noirq)(struct device *dev);
 	int (*poweroff_noirq)(struct device *dev);
 	int (*restore_noirq)(struct device *dev);
+	int (*runtime_suspend)(struct device *dev);
+	int (*runtime_resume)(struct device *dev);
+	void (*runtime_idle)(struct device *dev);
 };
 
 /**
@@ -315,14 +344,81 @@ enum dpm_state {
 	DPM_OFF_IRQ,
 };
 
+/**
+ * Device run-time power management status.
+ *
+ * These status labels are used internally by the PM core to indicate the
+ * current status of a device with respect to the PM core operations.  They do
+ * not reflect the actual power state of the device or its status as seen by the
+ * driver.
+ *
+ * RPM_ACTIVE		Device is fully operational.  Indicates that the device
+ *			bus type's ->runtime_resume() callback has completed
+ *			successfully.
+ *
+ * RPM_SUSPENDED	Device bus type's ->runtime_suspend() callback has
+ *			completed successfully.  The device is regarded as
+ *			suspended.
+ *
+ * RPM_RESUMING		Device bus type's ->runtime_resume() callback is being
+ *			executed.
+ *
+ * RPM_SUSPENDING	Device bus type's ->runtime_suspend() callback is being
+ *			executed.
+ */
+
+enum rpm_status {
+	RPM_ACTIVE = 0,
+	RPM_RESUMING,
+	RPM_SUSPENDED,
+	RPM_SUSPENDING,
+};
+
+/**
+ * Device run-time power management request types.
+ *
+ * RPM_REQ_NONE		Do nothing.
+ *
+ * RPM_REQ_IDLE		Run the device bus type's ->runtime_idle() callback
+ *
+ * RPM_REQ_SUSPEND	Run the device bus type's ->runtime_suspend() callback
+ *
+ * RPM_REQ_RESUME	Run the device bus type's ->runtime_resume() callback
+ */
+
+enum rpm_request {
+	RPM_REQ_NONE = 0,
+	RPM_REQ_IDLE,
+	RPM_REQ_SUSPEND,
+	RPM_REQ_RESUME,
+};
+
 struct dev_pm_info {
 	pm_message_t		power_state;
-	unsigned		can_wakeup:1;
-	unsigned		should_wakeup:1;
+	unsigned int		can_wakeup:1;
+	unsigned int		should_wakeup:1;
 	enum dpm_state		status;		/* Owned by the PM core */
-#ifdef	CONFIG_PM_SLEEP
+#ifdef CONFIG_PM_SLEEP
 	struct list_head	entry;
 #endif
+#ifdef CONFIG_PM_RUNTIME
+	struct timer_list	suspend_timer;
+	unsigned long		timer_expires;
+	struct work_struct	work;
+	wait_queue_head_t	wait_queue;
+	spinlock_t		lock;
+	atomic_t		usage_count;
+	atomic_t		child_count;
+	unsigned int		disable_depth:3;
+	unsigned int		ignore_children:1;
+	unsigned int		runtime_failure:1;
+	unsigned int		idle_notification:1;
+	unsigned int		request_pending:1;
+	unsigned int		deferred_resume:1;
+	enum rpm_request	request;
+	enum rpm_status		runtime_status;
+	int			last_error;
+#endif
 };
 
 /*
Index: linux-2.6/drivers/base/power/Makefile
===================================================================
--- linux-2.6.orig/drivers/base/power/Makefile
+++ linux-2.6/drivers/base/power/Makefile
@@ -1,5 +1,6 @@
 obj-$(CONFIG_PM)	+= sysfs.o
 obj-$(CONFIG_PM_SLEEP)	+= main.o
+obj-$(CONFIG_PM_RUNTIME)	+= runtime.o
 obj-$(CONFIG_PM_TRACE_RTC)	+= trace.o
 
 ccflags-$(CONFIG_DEBUG_DRIVER) := -DDEBUG
Index: linux-2.6/drivers/base/power/runtime.c
===================================================================
--- /dev/null
+++ linux-2.6/drivers/base/power/runtime.c
@@ -0,0 +1,950 @@
+/*
+ * drivers/base/power/runtime.c - Helper functions for device run-time PM
+ *
+ * Copyright (c) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+ *
+ * This file is released under the GPLv2.
+ */
+
+#include <linux/sched.h>
+#include <linux/pm_runtime.h>
+#include <linux/jiffies.h>
+
+static int __pm_request_resume(struct device *dev);
+
+/**
+ * pm_runtime_deactivate_timer - Deactivate given device's suspend timer.
+ * @dev: Device to handle.
+ */
+static void pm_runtime_deactivate_timer(struct device *dev)
+{
+	if (dev->power.timer_expires > 0) {
+		del_timer(&dev->power.suspend_timer);
+		dev->power.timer_expires = 0;
+	}
+}
+
+/**
+ * pm_runtime_cancel_pending - Deactivate suspend timer and cancel requests.
+ * @dev: Device to handle.
+ */
+static void pm_runtime_cancel_pending(struct device *dev)
+{
+	pm_runtime_deactivate_timer(dev);
+	/*
+	 * If there's a request pending, make sure its work function will return
+	 * without doing anything.
+	 */
+	if (dev->power.request_pending)
+		dev->power.request = RPM_REQ_NONE;
+}
+
+/**
+ * __pm_runtime_idle - Notify device bus type if the device can be suspended.
+ * @dev: Device to notify the bus type about.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_runtime_idle(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_failure)
+		retval = -EINVAL;
+	else if (dev->power.idle_notification)
+		retval = -EINPROGRESS;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0
+	    || dev->power.timer_expires > 0
+	    || dev->power.runtime_status == RPM_SUSPENDED
+	    || dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		return retval;
+
+	if (dev->power.request_pending) {
+		/*
+		 * If an idle notification request is pending, cancel it.  Any
+		 * other pending request takes precedence over us.
+		 */
+		if (dev->power.request == RPM_REQ_IDLE)
+			dev->power.request = RPM_REQ_NONE;
+		else if (dev->power.request != RPM_REQ_NONE)
+			return -EAGAIN;
+	}
+
+	dev->power.idle_notification = true;
+
+	spin_unlock_irq(&dev->power.lock);
+
+	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle)
+		dev->bus->pm->runtime_idle(dev);
+
+	spin_lock_irq(&dev->power.lock);
+
+	dev->power.idle_notification = false;
+	wake_up_all(&dev->power.wait_queue);
+
+	return 0;
+}
+
+/**
+ * pm_runtime_idle - Notify device bus type if the device can be suspended.
+ * @dev: Device to notify the bus type about.
+ */
+int pm_runtime_idle(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_idle(dev);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_idle);
+
+/**
+ * __pm_runtime_suspend - Carry out run-time suspend of given device.
+ * @dev: Device to suspend.
+ * @from_wq: If set, the function has been called via pm_wq.
+ *
+ * Check if the device can be suspended and run the ->runtime_suspend() callback
+ * provided by its bus type.  If another suspend has been started earlier, wait
+ * for it to finish, unless @from_wq is set (-EINPROGRESS is returned then).
+ * Pending idle notification and suspend requests are canceled, but a pending
+ * resume request takes precedence and -EAGAIN is returned in that case.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+int __pm_runtime_suspend(struct device *dev, bool from_wq)
+{
+	struct device *parent = NULL;
+	bool notify = false;
+	int retval = 0;
+
+ repeat:
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/* Pending resume requests take precedence over us. */
+		if (dev->power.request == RPM_REQ_RESUME)
+			return -EAGAIN;
+		/* Other pending requests need to be canceled. */
+		dev->power.request = RPM_REQ_NONE;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_RESUMING
+	    || dev->power.disable_depth > 0
+	    || atomic_read(&dev->power.usage_count) > 0)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		return retval;
+
+	if (dev->power.runtime_status == RPM_SUSPENDING) {
+		DEFINE_WAIT(wait);
+
+		if (from_wq)
+			return -EINPROGRESS;
+
+		/* Wait for the other suspend running in parallel with us. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_SUSPENDING)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+		goto repeat;
+	}
+
+	dev->power.runtime_status = RPM_SUSPENDING;
+
+	spin_unlock_irq(&dev->power.lock);
+
+	retval = dev->bus && dev->bus->pm && dev->bus->pm->runtime_suspend ?
+		dev->bus->pm->runtime_suspend(dev) : -ENOSYS;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (retval) {
+		dev->power.runtime_status = RPM_ACTIVE;
+		pm_runtime_cancel_pending(dev);
+		dev->power.deferred_resume = false;
+
+		if (retval == -EAGAIN || retval == -EBUSY) {
+			notify = true;
+		} else {
+			dev->power.runtime_failure = true;
+			dev->power.last_error = retval;
+		}
+	} else {
+		dev->power.runtime_status = RPM_SUSPENDED;
+
+		if (dev->parent) {
+			parent = dev->parent;
+			atomic_add_unless(&parent->power.child_count, -1, 0);
+		}
+
+	}
+	wake_up_all(&dev->power.wait_queue);
+
+	if (dev->power.deferred_resume) {
+		__pm_request_resume(dev);
+		dev->power.deferred_resume = false;
+	}
+
+	spin_unlock_irq(&dev->power.lock);
+
+	if (parent && !parent->power.ignore_children)
+		pm_request_idle(parent);
+
+	if (notify)
+		pm_runtime_idle(dev);
+
+	spin_lock_irq(&dev->power.lock);
+
+	return retval;
+}
+
+/**
+ * pm_runtime_suspend - Carry out run-time suspend of given device.
+ * @dev: Device to suspend.
+ */
+int pm_runtime_suspend(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_suspend(dev, false);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_suspend);
+
+/**
+ * __pm_runtime_resume - Carry out run-time resume of given device.
+ * @dev: Device to resume.
+ * @from_wq: If set, the function has been called via pm_wq.
+ *
+ * Check if the device can be woken up and run the ->runtime_resume() callback
+ * provided by its bus type.  If another resume has been started earlier, wait
+ * for it to finish.  If there's a suspend running in parallel with this
+ * function, wait for it to finish and resume the device.  If there's a suspend
+ * request or idle notification pending, cancel it.  If there's a resume request
+ * scheduled while this function is running, cancel that request.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+int __pm_runtime_resume(struct device *dev, bool from_wq)
+{
+	struct device *parent = NULL;
+	int retval = 0;
+
+ repeat:
+	if (dev->power.runtime_failure)
+		return -ENODEV;
+
+	pm_runtime_cancel_pending(dev);
+
+	if (dev->power.runtime_status == RPM_ACTIVE)
+		retval = 1;
+	else if (dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	if (retval)
+		return retval;
+
+	if (dev->power.runtime_status == RPM_RESUMING
+	    || dev->power.runtime_status == RPM_SUSPENDING) {
+		DEFINE_WAIT(wait);
+
+		if (from_wq) {
+			if (dev->power.runtime_status == RPM_SUSPENDING)
+				dev->power.deferred_resume = true;
+			return -EINPROGRESS;
+		}
+
+		/* Wait for the operation carried out in parallel with us. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_RESUMING
+			    && dev->power.runtime_status != RPM_SUSPENDING)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+		goto repeat;
+	}
+
+	if (!parent && dev->parent) {
+		/*
+		 * Increment the parent's resume counter and resume it if
+		 * necessary.
+		 */
+		spin_unlock_irq(&dev->power.lock);
+
+		parent = dev->parent;
+		retval = pm_runtime_get_sync(parent);
+		if (retval < 0)
+			goto out_parent;
+
+		spin_lock_irq(&dev->power.lock);
+		retval = 0;
+		goto repeat;
+	}
+
+	dev->power.runtime_status = RPM_RESUMING;
+
+	spin_unlock_irq(&dev->power.lock);
+
+	retval = dev->bus && dev->bus->pm && dev->bus->pm->runtime_resume ?
+		dev->bus->pm->runtime_resume(dev) : -ENOSYS;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (retval) {
+		dev->power.runtime_status = RPM_SUSPENDED;
+
+		dev->power.runtime_failure = true;
+		dev->power.last_error = retval;
+
+		pm_runtime_cancel_pending(dev);
+	} else {
+		dev->power.runtime_status = RPM_ACTIVE;
+
+		if (parent)
+			atomic_inc(&parent->power.child_count);
+	}
+	wake_up_all(&dev->power.wait_queue);
+
+	spin_unlock_irq(&dev->power.lock);
+
+ out_parent:
+	if (parent)
+		pm_runtime_put(parent);
+
+	if (!retval)
+		pm_request_idle(dev);
+
+	spin_lock_irq(&dev->power.lock);
+
+	return retval;
+}
+
+/**
+ * pm_runtime_resume - Carry out run-time resume of given device.
+ * @dev: Device to resume.
+ */
+int pm_runtime_resume(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_resume(dev, false);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_resume);
+
+/**
+ * pm_runtime_work - Universal run-time PM work function.
+ * @work: Work structure used for scheduling the execution of this function.
+ *
+ * Use @work to get the device object the work is to be done for, determine what
+ * is to be done and execute the appropriate run-time PM function.
+ */
+static void pm_runtime_work(struct work_struct *work)
+{
+	struct device *dev = container_of(work, struct device, power.work);
+	enum rpm_request req;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (!dev->power.request_pending)
+		goto out;
+
+	req = dev->power.request;
+	dev->power.request = RPM_REQ_NONE;
+	dev->power.request_pending = false;
+
+	switch (req) {
+	case RPM_REQ_NONE:
+		break;
+	case RPM_REQ_IDLE:
+		__pm_runtime_idle(dev);
+		break;
+	case RPM_REQ_SUSPEND:
+		__pm_runtime_suspend(dev, true);
+		break;
+	case RPM_REQ_RESUME:
+		__pm_runtime_resume(dev, true);
+		break;
+	}
+
+ out:
+	spin_unlock_irq(&dev->power.lock);
+}
+
+/**
+ * pm_request_idle - Submit an idle notification request for given device.
+ * @dev: Device to handle.
+ *
+ * Check if the device's run-time PM status is correct for suspending the device
+ * and queue up a request to run __pm_runtime_idle() for it.
+ */
+int pm_request_idle(struct device *dev)
+{
+	unsigned long flags;
+	int retval = 0;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (dev->power.runtime_failure)
+		retval = -EINVAL;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0
+	    || dev->power.timer_expires > 0
+	    || dev->power.runtime_status == RPM_SUSPENDED
+	    || dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		goto out;
+
+	if (dev->power.request_pending && dev->power.request != RPM_REQ_NONE) {
+		/* Any requests other than RPM_REQ_IDLE take precedence. */
+		if (dev->power.request != RPM_REQ_IDLE)
+			retval = -EAGAIN;
+		goto out;
+	}
+
+	dev->power.request = RPM_REQ_IDLE;
+	if (dev->power.request_pending)
+		goto out;
+
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_request_idle);
+
+/**
+ * __pm_request_suspend - Submit a suspend request for given device.
+ * @dev: Device to suspend.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_request_suspend(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	else if (dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EINPROGRESS;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/*
+		 * Pending resume requests take precedence over us, but we can
+		 * overtake any other pending request.
+		 */
+		if (dev->power.request == RPM_REQ_RESUME)
+			retval = -EAGAIN;
+		else if (dev->power.request != RPM_REQ_SUSPEND)
+			dev->power.request = retval ?
+						RPM_REQ_NONE : RPM_REQ_SUSPEND;
+
+		if (dev->power.request == RPM_REQ_SUSPEND)
+			return 0;
+	}
+
+	if (retval)
+		return retval;
+
+	dev->power.request = RPM_REQ_SUSPEND;
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+	return 0;
+}
+
+/**
+ * pm_suspend_timer_fn - Timer function for pm_schedule_suspend().
+ * @data: Device pointer passed by pm_schedule_suspend().
+ *
+ * Check if the time is right and execute __pm_request_suspend() in that case.
+ */
+static void pm_suspend_timer_fn(unsigned long data)
+{
+	struct device *dev = (struct device *)data;
+	unsigned long flags;
+	unsigned long expires;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	expires = dev->power.timer_expires;
+	/* If 'expires' is after 'jiffies' we've been called too early. */
+	if (expires > 0 && !time_after(expires, jiffies)) {
+		dev->power.timer_expires = 0;
+		__pm_request_suspend(dev);
+	}
+
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+}
+
+/**
+ * pm_schedule_suspend - Set up a timer to submit a suspend request in future.
+ * @dev: Device to suspend.
+ * @delay: Time to wait before submitting a suspend request, in milliseconds.
+ */
+int pm_schedule_suspend(struct device *dev, unsigned int delay)
+{
+	unsigned long flags;
+	int retval = 0;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (dev->power.runtime_failure) {
+		retval = -EINVAL;
+		goto out;
+	}
+
+	if (!delay) {
+		retval = __pm_request_suspend(dev);
+		goto out;
+	}
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/*
+		 * Pending resume requests take precedence over us, but any
+		 * other pending requests have to be canceled.
+		 */
+		if (dev->power.request == RPM_REQ_RESUME) {
+			retval = -EAGAIN;
+			goto out;
+		}
+		dev->power.request = RPM_REQ_NONE;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EINPROGRESS;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		goto out;
+
+	dev->power.timer_expires = jiffies + msecs_to_jiffies(delay);
+	mod_timer(&dev->power.suspend_timer, dev->power.timer_expires);
+
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_schedule_suspend);
+
+/**
+ * __pm_request_resume - Submit a resume request for given device.
+ * @dev: Device to resume.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+int __pm_request_resume(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	if (dev->power.runtime_status == RPM_ACTIVE)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_RESUMING)
+		retval = -EINPROGRESS;
+	else if (dev->power.disable_depth > 0)
+		retval = -EAGAIN;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/* If non-resume request is pending, we can overtake it. */
+		dev->power.request = retval ? RPM_REQ_NONE : RPM_REQ_RESUME;
+		/* There's nothing to do if resume request is pending. */
+		if (dev->power.request == RPM_REQ_RESUME)
+			return 0;
+	}
+
+	if (retval)
+		return retval;
+
+	dev->power.request = RPM_REQ_RESUME;
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+	return retval;
+}
+
+/**
+ * pm_request_resume - Submit a resume request for given device.
+ * @dev: Device to resume.
+ */
+int pm_request_resume(struct device *dev)
+{
+	unsigned long flags;
+	int retval;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+	retval = __pm_request_resume(dev);
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_request_resume);
+
+/**
+ * __pm_runtime_get - Reference count a device and wake it up, if necessary.
+ * @dev: Device to handle.
+ * @sync: If set and the device is suspended, resume it synchronously.
+ *
+ * Increment the usage count of the device and if it was zero previously,
+ * resume it or submit a resume request for it, depending on the value of @sync.
+ */
+int __pm_runtime_get(struct device *dev, bool sync)
+{
+	int retval = 1;
+
+	if (atomic_add_return(1, &dev->power.usage_count) == 1)
+		retval = sync ? pm_runtime_resume(dev) : pm_request_resume(dev);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(__pm_runtime_get);
+
+/**
+ * __pm_runtime_put - Decrement the device's usage counter and notify its bus.
+ * @dev: Device to handle.
+ * @sync: If the device's bus type is to be notified, do that synchronously.
+ *
+ * Decrement the usage count of the device and if it reaches zero, carry out a
+ * synchronous idle notification or submit an idle notification request for it,
+ * depending on the value of @sync.
+ */
+int __pm_runtime_put(struct device *dev, bool sync)
+{
+	int retval = 0;
+
+	if (atomic_dec_and_test(&dev->power.usage_count))
+		retval = sync ? pm_runtime_idle(dev) : pm_request_idle(dev);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(__pm_runtime_put);
+
+/**
+ * __pm_runtime_set_status - Set run-time PM status of a device.
+ * @dev: Device to handle.
+ * @status: New run-time PM status of the device.
+ *
+ * If run-time PM of the device is disabled or its power.runtime_failure flag is
+ * set, the status may be changed either to RPM_ACTIVE, or to RPM_SUSPENDED, as
+ * long as that reflects the actual state of the device.  However, if the device
+ * has a parent and the parent is not active, and the parent's
+ * power.ignore_children flag is unset, the device's status cannot be set to
+ * RPM_ACTIVE, so -EBUSY is returned in that case.
+ *
+ * If successful, __pm_runtime_set_status() clears the power.runtime_failure
+ * flag and the device parent's counter of unsuspended children is modified to
+ * reflect the new status.
+ */
+int __pm_runtime_set_status(struct device *dev, unsigned int status)
+{
+	struct device *parent = dev->parent;
+	unsigned long flags;
+	int error = 0;
+
+	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
+		return -EINVAL;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (!dev->power.runtime_failure && !dev->power.disable_depth)
+		goto out;
+
+	if (dev->power.runtime_status == status)
+		goto out_clear;
+
+	if (status == RPM_SUSPENDED) {
+		/* It is always possible to set the status to 'suspended'. */
+		if (parent)
+			atomic_add_unless(&parent->power.child_count, -1, 0);
+		dev->power.runtime_status = status;
+		goto out_clear;
+	}
+
+	if (parent) {
+		spin_lock_irq(&parent->power.lock);
+
+		/*
+		 * It may be invalid to put an active child under a suspended
+		 * parent.
+		 */
+		if (parent->power.runtime_status == RPM_ACTIVE
+		    || parent->power.ignore_children) {
+			if (dev->power.runtime_status == RPM_SUSPENDED)
+				atomic_inc(&parent->power.child_count);
+			dev->power.runtime_status = status;
+		} else {
+			error = -EBUSY;
+		}
+
+		spin_unlock_irq(&parent->power.lock);
+
+		if (error)
+			goto out;
+	} else {
+		dev->power.runtime_status = status;
+	}
+
+ out_clear:
+	dev->power.runtime_failure = false;
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return error;
+}
+EXPORT_SYMBOL_GPL(__pm_runtime_set_status);
+
+/**
+ * pm_runtime_enable - Enable run-time PM of a device.
+ * @dev: Device to handle.
+ */
+void pm_runtime_enable(struct device *dev)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (dev->power.disable_depth > 0)
+		dev->power.disable_depth--;
+	else
+		dev_warn(dev, "Unbalanced %s!\n", __func__);
+
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+}
+EXPORT_SYMBOL_GPL(pm_runtime_enable);
+
+/**
+ * pm_runtime_disable - Disable run-time PM of a device.
+ * @dev: Device to handle.
+ *
+ * Increment power.disable_depth for the device and if it was zero previously,
+ * cancel all pending run-time PM requests for the device and wait for all
+ * operations in progress to complete.  The device can be either active or
+ * suspended after its run-time PM has been disabled.
+ *
+ * If there's a resume request pending when pm_runtime_disable() is called and
+ * power.disable_depth is zero, the function will resume the device before
+ * disabling its run-time PM and will return -EBUSY.  Otherwise, 0 is returned.
+ */
+int pm_runtime_disable(struct device *dev)
+{
+	int retval = 0;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (dev->power.disable_depth > 0) {
+		dev->power.disable_depth++;
+		goto out;
+	}
+
+	/*
+	 * Wake up the device if there's a resume request pending, because that
+	 * means there probably is some I/O to process and disabling run-time PM
+	 * shouldn't prevent the device from processing the I/O.
+	 */
+	if (dev->power.request_pending
+	    && dev->power.request == RPM_REQ_RESUME) {
+		/*
+		 * Prevent suspends and idle notifications from being carried
+		 * out after we have woken up the device.
+		 */
+		pm_runtime_get_noresume(dev);
+
+		__pm_runtime_resume(dev, false);
+
+		pm_runtime_put_noidle(dev);
+		retval = -EBUSY;
+	}
+
+	if (dev->power.disable_depth++ > 0)
+		goto out;
+
+	if (dev->power.runtime_failure)
+		goto out;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		dev->power.request = RPM_REQ_NONE;
+
+		spin_unlock_irq(&dev->power.lock);
+
+		cancel_work_sync(&dev->power.work);
+
+		spin_lock_irq(&dev->power.lock);
+
+		dev->power.request_pending = false;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDING
+	    || dev->power.runtime_status == RPM_RESUMING) {
+		DEFINE_WAIT(wait);
+
+		/* Suspend or wake-up in progress. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_SUSPENDING
+			    && dev->power.runtime_status != RPM_RESUMING)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+	}
+
+	if (dev->power.runtime_failure)
+		goto out;
+
+	if (dev->power.idle_notification) {
+		DEFINE_WAIT(wait);
+
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (!dev->power.idle_notification)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+	}
+
+ out:
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_disable);
+
+/**
+ * pm_runtime_init - Initialize run-time PM fields in given device object.
+ * @dev: Device object to initialize.
+ */
+void pm_runtime_init(struct device *dev)
+{
+	spin_lock_init(&dev->power.lock);
+
+	dev->power.runtime_status = RPM_ACTIVE;
+	dev->power.idle_notification = false;
+
+	dev->power.disable_depth = 1;
+	atomic_set(&dev->power.usage_count, 0);
+
+	dev->power.runtime_failure = false;
+	dev->power.last_error = 0;
+
+	atomic_set(&dev->power.child_count, 0);
+	pm_suspend_ignore_children(dev, false);
+
+	dev->power.request_pending = false;
+	dev->power.request = RPM_REQ_NONE;
+	dev->power.deferred_resume = false;
+	INIT_WORK(&dev->power.work, pm_runtime_work);
+
+	dev->power.timer_expires = 0;
+	dev->power.suspend_timer.expires = jiffies;
+	dev->power.suspend_timer.data = (unsigned long)dev;
+	dev->power.suspend_timer.function = pm_suspend_timer_fn;
+
+	init_waitqueue_head(&dev->power.wait_queue);
+}
+
+/**
+ * pm_runtime_add - Update run-time PM fields of a device while adding it.
+ * @dev: Device object being added to device hierarchy.
+ */
+void pm_runtime_add(struct device *dev)
+{
+	if (dev->parent)
+		atomic_inc(&dev->parent->power.child_count);
+}
+
+/**
+ * pm_runtime_remove - Prepare for removing a device from device hierarchy.
+ * @dev: Device object being removed from device hierarchy.
+ */
+void pm_runtime_remove(struct device *dev)
+{
+	struct device *parent = dev->parent;
+
+	pm_runtime_disable(dev);
+
+	if (dev->power.runtime_status != RPM_SUSPENDED && parent) {
+		atomic_add_unless(&parent->power.child_count, -1, 0);
+		if (!parent->power.ignore_children)
+			pm_request_idle(parent);
+	}
+}
Index: linux-2.6/include/linux/pm_runtime.h
===================================================================
--- /dev/null
+++ linux-2.6/include/linux/pm_runtime.h
@@ -0,0 +1,113 @@
+/*
+ * pm_runtime.h - Device run-time power management helper functions.
+ *
+ * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * This file is released under the GPLv2.
+ */
+
+#ifndef _LINUX_PM_RUNTIME_H
+#define _LINUX_PM_RUNTIME_H
+
+#include <linux/device.h>
+#include <linux/pm.h>
+
+#ifdef CONFIG_PM_RUNTIME
+
+extern struct workqueue_struct *pm_wq;
+
+extern void pm_runtime_init(struct device *dev);
+extern void pm_runtime_add(struct device *dev);
+extern void pm_runtime_remove(struct device *dev);
+extern int pm_runtime_idle(struct device *dev);
+extern int pm_runtime_suspend(struct device *dev);
+extern int pm_runtime_resume(struct device *dev);
+extern int pm_request_idle(struct device *dev);
+extern int pm_schedule_suspend(struct device *dev, unsigned int delay);
+extern int pm_request_resume(struct device *dev);
+extern int __pm_runtime_get(struct device *dev, bool sync);
+extern int __pm_runtime_put(struct device *dev, bool sync);
+extern int __pm_runtime_set_status(struct device *dev, unsigned int status);
+extern void pm_runtime_enable(struct device *dev);
+extern int pm_runtime_disable(struct device *dev);
+
+static inline bool pm_children_suspended(struct device *dev)
+{
+	return dev->power.ignore_children
+		|| !atomic_read(&dev->power.child_count);
+}
+
+static inline void pm_suspend_ignore_children(struct device *dev, bool enable)
+{
+	dev->power.ignore_children = enable;
+}
+
+static inline void pm_runtime_get_noresume(struct device *dev)
+{
+	atomic_inc(&dev->power.usage_count);
+}
+
+static inline void pm_runtime_put_noidle(struct device *dev)
+{
+	atomic_add_unless(&dev->power.usage_count, -1, 0);
+}
+
+#else /* !CONFIG_PM_RUNTIME */
+
+static inline void pm_runtime_init(struct device *dev) {}
+static inline void pm_runtime_add(struct device *dev) {}
+static inline void pm_runtime_remove(struct device *dev) {}
+static inline int pm_runtime_idle(struct device *dev) { return -ENOSYS; }
+static inline int pm_runtime_suspend(struct device *dev) { return -ENOSYS; }
+static inline int pm_runtime_resume(struct device *dev) { return 0; }
+static inline int pm_request_idle(struct device *dev) { return -ENOSYS; }
+static inline int pm_schedule_suspend(struct device *dev, unsigned int delay)
+{
+	return -ENOSYS;
+}
+static inline int pm_request_resume(struct device *dev) { return 0; }
+static inline int __pm_runtime_get(struct device *dev, bool sync) { return 1; }
+static inline int __pm_runtime_put(struct device *dev, bool sync) { return 0; }
+static inline int __pm_runtime_set_status(struct device *dev,
+					    unsigned int status) { return 0; }
+static inline void pm_runtime_enable(struct device *dev) {}
+static inline int pm_runtime_disable(struct device *dev) { return 0; }
+
+static inline bool pm_children_suspended(struct device *dev) { return false; }
+static inline void pm_suspend_ignore_children(struct device *dev, bool en) {}
+static inline void pm_runtime_get_noresume(struct device *dev) {}
+static inline void pm_runtime_put_noidle(struct device *dev) {}
+
+#endif /* !CONFIG_PM_RUNTIME */
+
+static inline int pm_runtime_get(struct device *dev)
+{
+	return __pm_runtime_get(dev, false);
+}
+
+static inline int pm_runtime_get_sync(struct device *dev)
+{
+	return __pm_runtime_get(dev, true);
+}
+
+static inline int pm_runtime_put(struct device *dev)
+{
+	return __pm_runtime_put(dev, false);
+}
+
+static inline int pm_runtime_put_sync(struct device *dev)
+{
+	return __pm_runtime_put(dev, true);
+}
+
+static inline int pm_runtime_set_active(struct device *dev)
+{
+	return __pm_runtime_set_status(dev, RPM_ACTIVE);
+}
+
+static inline void pm_runtime_set_suspended(struct device *dev)
+{
+	__pm_runtime_set_status(dev, RPM_SUSPENDED);
+}
+
+#endif
Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -21,6 +21,7 @@
 #include <linux/kallsyms.h>
 #include <linux/mutex.h>
 #include <linux/pm.h>
+#include <linux/pm_runtime.h>
 #include <linux/resume-trace.h>
 #include <linux/rwsem.h>
 #include <linux/interrupt.h>
@@ -49,6 +50,16 @@ static DEFINE_MUTEX(dpm_list_mtx);
 static bool transition_started;
 
 /**
+ * device_pm_init - Initialize the PM-related part of a device object
+ * @dev: Device object to initialize.
+ */
+void device_pm_init(struct device *dev)
+{
+	dev->power.status = DPM_ON;
+	pm_runtime_init(dev);
+}
+
+/**
  *	device_pm_lock - lock the list of active devices used by the PM core
  */
 void device_pm_lock(void)
@@ -89,6 +100,8 @@ void device_pm_add(struct device *dev)
 
 	list_add_tail(&dev->power.entry, &dpm_list);
 	mutex_unlock(&dpm_list_mtx);
+
+	pm_runtime_add(dev);
 }
 
 /**
@@ -105,6 +118,8 @@ void device_pm_remove(struct device *dev
 	mutex_lock(&dpm_list_mtx);
 	list_del_init(&dev->power.entry);
 	mutex_unlock(&dpm_list_mtx);
+
+	pm_runtime_remove(dev);
 }
 
 /**
@@ -510,6 +525,7 @@ static void dpm_complete(pm_message_t st
 			mutex_unlock(&dpm_list_mtx);
 
 			device_complete(dev, state);
+			pm_runtime_enable(dev);
 
 			mutex_lock(&dpm_list_mtx);
 		}
@@ -755,11 +771,14 @@ static int dpm_prepare(pm_message_t stat
 		dev->power.status = DPM_PREPARING;
 		mutex_unlock(&dpm_list_mtx);
 
-		error = device_prepare(dev, state);
+		error = pm_runtime_disable(dev);
+		if (!error || !device_may_wakeup(dev))
+			error = device_prepare(dev, state);
 
 		mutex_lock(&dpm_list_mtx);
 		if (error) {
 			dev->power.status = DPM_ON;
+			pm_runtime_enable(dev);
 			if (error == -EAGAIN) {
 				put_device(dev);
 				error = 0;
Index: linux-2.6/drivers/base/dd.c
===================================================================
--- linux-2.6.orig/drivers/base/dd.c
+++ linux-2.6/drivers/base/dd.c
@@ -23,6 +23,7 @@
 #include <linux/kthread.h>
 #include <linux/wait.h>
 #include <linux/async.h>
+#include <linux/pm_runtime.h>
 
 #include "base.h"
 #include "power/power.h"
@@ -202,7 +203,9 @@ int driver_probe_device(struct device_dr
 	pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
 		 drv->bus->name, __func__, dev_name(dev), drv->name);
 
+	pm_runtime_get_noresume(dev);
 	ret = really_probe(dev, drv);
+	pm_runtime_put_noidle(dev);
 
 	return ret;
 }
@@ -306,6 +309,8 @@ static void __device_release_driver(stru
 
 	drv = dev->driver;
 	if (drv) {
+		pm_runtime_disable(dev);
+
 		driver_sysfs_remove(dev);
 
 		if (dev->bus)
@@ -324,6 +329,8 @@ static void __device_release_driver(stru
 			blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
 						     BUS_NOTIFY_UNBOUND_DRIVER,
 						     dev);
+
+		pm_runtime_enable(dev);
 	}
 }
 
Index: linux-2.6/drivers/base/power/power.h
===================================================================
--- linux-2.6.orig/drivers/base/power/power.h
+++ linux-2.6/drivers/base/power/power.h
@@ -1,8 +1,3 @@
-static inline void device_pm_init(struct device *dev)
-{
-	dev->power.status = DPM_ON;
-}
-
 #ifdef CONFIG_PM_SLEEP
 
 /*
@@ -16,14 +11,16 @@ static inline struct device *to_device(s
 	return container_of(entry, struct device, power.entry);
 }
 
+extern void device_pm_init(struct device *dev);
 extern void device_pm_add(struct device *);
 extern void device_pm_remove(struct device *);
 extern void device_pm_move_before(struct device *, struct device *);
 extern void device_pm_move_after(struct device *, struct device *);
 extern void device_pm_move_last(struct device *);
 
-#else /* CONFIG_PM_SLEEP */
+#else /* !CONFIG_PM_SLEEP */
 
+static inline void device_pm_init(struct device *dev) {}
 static inline void device_pm_add(struct device *dev) {}
 static inline void device_pm_remove(struct device *dev) {}
 static inline void device_pm_move_before(struct device *deva,
@@ -32,7 +29,7 @@ static inline void device_pm_move_after(
 					struct device *devb) {}
 static inline void device_pm_move_last(struct device *dev) {}
 
-#endif
+#endif /* !CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM
 

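For readers following the thread, the usage-counter rule implemented by
__pm_runtime_get() and __pm_runtime_put() in the patch above can be modelled
in plain user-space C.  This is only a rough sketch under stated assumptions,
not kernel code: struct model_dev, enum model_action and the model_*()
helpers are hypothetical names, the counter is a plain int rather than an
atomic_t, and the helpers return the new count instead of the result of the
resume or idle call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical record of what the PM core would have been asked to do. */
enum model_action {
	MODEL_NONE,		/* counter changed, nothing else happened */
	MODEL_RESUME,		/* stands in for pm_runtime_resume() */
	MODEL_REQUEST_RESUME,	/* stands in for pm_request_resume() */
	MODEL_IDLE,		/* stands in for pm_runtime_idle() */
	MODEL_REQUEST_IDLE	/* stands in for pm_request_idle() */
};

/* Hypothetical stand-in for the run-time PM fields of struct device. */
struct model_dev {
	int usage_count;
	enum model_action last_action;
};

/* Increment the counter; only the 0 -> 1 transition triggers a resume. */
static int model_get(struct model_dev *dev, bool sync)
{
	dev->last_action = MODEL_NONE;
	if (++dev->usage_count == 1)
		dev->last_action = sync ? MODEL_RESUME : MODEL_REQUEST_RESUME;
	return dev->usage_count;
}

/* Decrement the counter; only the 1 -> 0 transition triggers an idle call. */
static int model_put(struct model_dev *dev, bool sync)
{
	dev->last_action = MODEL_NONE;
	if (--dev->usage_count == 0)
		dev->last_action = sync ? MODEL_IDLE : MODEL_REQUEST_IDLE;
	return dev->usage_count;
}

/* Suspend is refused while anybody still holds a reference. */
static bool model_can_suspend(const struct model_dev *dev)
{
	return dev->usage_count == 0;
}
```

In this model, a nested model_get() leaves last_action at MODEL_NONE, which
mirrors the patch: the resume and idle paths run only on the outermost
get/put pair, and a suspend attempt is refused while the count is nonzero.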
* Re: [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8)
  2009-07-08  5:45       ` Magnus Damm
  (?)
@ 2009-07-08 19:01       ` Rafael J. Wysocki
  2009-07-08 19:42         ` Alan Stern
  2009-07-08 19:42         ` Alan Stern
  -1 siblings, 2 replies; 51+ messages in thread
From: Rafael J. Wysocki @ 2009-07-08 19:01 UTC (permalink / raw)
  To: Magnus Damm
  Cc: Alan Stern, Linux-pm mailing list, Greg KH, LKML,
	ACPI Devel Maling List, Ingo Molnar, Arjan van de Ven

On Wednesday 08 July 2009, Magnus Damm wrote:
> On Wed, Jul 8, 2009 at 7:07 AM, Rafael J. Wysocki<rjw@sisk.pl> wrote:
> > On Tuesday 07 July 2009, Magnus Damm wrote:
> >> On Mon, Jul 6, 2009 at 9:52 AM, Rafael J. Wysocki<rjw@sisk.pl> wrote:
> >> > Hi,
> >> >
> >> > There's a rev. 8 of the run-time PM framework patch.
> 
> >> All good with the code above, but there seems to be some issue with how
> >> usage_count is counted up and down and when runtime_disabled is set:
> >>
> >> 1. pm_runtime_init(): usage_count = 1, runtime_disabled = true
> >> 2. driver_probe_device(): pm_runtime_get_sync()
> >> 3. pm_runtime_get_sync(): usage_count = 2
> >> 4. device driver probe(): pm_runtime_enable()
> >> 5. pm_runtime_enable(): usage_count = 1
> >> 6. driver_probe_device(): pm_runtime_put()
> >> 7. pm_runtime_put(): usage_count = 0
> >>
> >> I expect runtime_disabled = false in 7. Modifying the get/put calls to
> >> do enable/disable may work around the issue, but that's probably not
> >> what you guys want.
> >
> > Sure, that's my mistake.  I should have used a separate counter for
> > disable/enable, but I thought usage_counter would be sufficient.  Will fix.
> 
> Thank you. No problem.
> 
> >> Issue 2:
> >> ------------
> >> I cannot get any bus ->runtime_resume() callbacks from probe(). This
> >> also seems related to usage_count and pm_runtime_get_sync() in
> >> driver_probe_device(). Basically, from probe(), calling
> >> pm_runtime_resume() after pm_runtime_set_suspended() results in error
> >> and not in a ->runtime_resume() callback. Some device drivers access
> >> hardware in probe(), so the ->runtime_resume() callback is needed at
> >> that point to turn on clocks before the hardware can be accessed.
> >
> > I think the problem is that pm_runtime_get_sync() in driver_probe_device()
> > calls ->runtime_resume(), so the device is active from the core's point of
> > view when you call pm_runtime_resume() from probe().
> >
> > Hmm.  OK, perhaps we should just increment usage_count in
> > driver_probe_device() to prevent suspends from happening at that time, without
> > calling ->runtime_resume() so that the driver can do it by itself.  I'll do
> > that in the next version.
> 
> Sounds good.
> 
> >> Random thought:
> >> -------------------------
> >> The pm_runtime_get() and pm_runtime_put() look very nice. I assume
> >> that interface is supposed to be used by bus code. I wonder if it would
> >> be cleaner to use a similar counter based interface from the driver
> >> instead of the pm_runtime_idle()/suspend()/resume()...
> >>
> >> Let me know what you think!
> >
> > In fact I thought drivers could also use pm_runtime_[get|put]() and the 'sync'
> > versions.  At least, I don't see why not at the moment (well, I'm a bit tired
> > right now ...).
> 
> I think that's a nicer interface, but I must figure out how to use
> ->runtime_idle before I can switch to that...
> 
> > However, I'm now thinking it should work like this:
> >
> > * pm_runtime_get() increments usage_count and if it was zero before the
> >  incrementation, it calls pm_request_resume() (pm_runtime_resume() is called
> >  by the 'sync' version).
> >
> > * pm_runtime_put() decrements usage_count and if it's zero after the
> >  decrementation, it calls pm_request_idle() (pm_runtime_idle() is called by
> >  the 'sync' version).
> >
> > * The 'suspend' callbacks won't succeed for usage_count > 0.
> >
> > This way we would avoid calling the 'suspend' and 'idle' functions each time
> > unnecessarily, but then usage_count would have to be modified under the
> > spinlock only.
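
The three rules above can be sketched as a small user-space model.  This is
only an illustration under stated assumptions: the names are hypothetical,
'last_request' merely records which request would have been issued, and the
locking is left out (in the kernel these updates would be made under the
device's spinlock).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Which asynchronous request the model would have queued, if any. */
enum rpm_request { RPM_REQ_NONE, RPM_REQ_RESUME, RPM_REQ_IDLE };

struct rpm_dev {
	int usage_count;
	enum rpm_request last_request;
};

/* get: a resume request is issued only on the 0 -> 1 transition. */
static void model_get(struct rpm_dev *d)
{
	assert(d != NULL);
	d->last_request = RPM_REQ_NONE;
	if (d->usage_count++ == 0)
		d->last_request = RPM_REQ_RESUME;
}

/* put: an idle request is issued only on the 1 -> 0 transition. */
static void model_put(struct rpm_dev *d)
{
	assert(d != NULL && d->usage_count > 0);
	d->last_request = RPM_REQ_NONE;
	if (--d->usage_count == 0)
		d->last_request = RPM_REQ_IDLE;
}

/* suspend: refused while anybody still holds a reference. */
static int model_suspend(const struct rpm_dev *d)
{
	return d->usage_count > 0 ? -EBUSY : 0;
}
```

Nested get/put pairs then cost only a counter update; the resume and idle
paths are entered once per busy period.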
> 
> If all usage_count users are moved under the spinlock then there would
> be no need for atomic operations, right?
> 
> This get()/put() interface is interesting.
> 
> So I'd like to tie in two levels of power management in our runtime PM
> implementation. The simplest level is clock stopping, and I can do
> that using the bus callbacks ->runtime_suspend() and
> ->runtime_resume() with v8. The driver runtime callbacks are never
> invoked for clock stopping.
> 
> On top of the clock stopping I'd like to turn off power to the domain.
> So if all clocks are stopped to the devices within a domain, then I'd
> like to call the per-device ->runtime_suspend() callbacks provided by
> the drivers.
>
> I wonder how to fit these two levels of power management into the
> runtime PM in a nice way. My first attempts simply made use of
> pm_runtime_resume() and pm_runtime_suspend(), but I'd like to move to
> get()/put() if possible. But for that to work I need to implement
> ->runtime_idle() in my bus code, and I wonder if the current runtime
> PM idle behaviour is a good fit.
> 
> Below is how I'd like to make use of the runtime PM code. I'm not sure
> if it's compatible with your view. =)
> 
> Drivers call pm_runtime_get_sync() and pm_runtime_put() before and
> after using the hardware. The runtime PM code invokes the bus
> ->runtime_idle() callback ASAP (of course depending on put() or
> put_sync(), but no timer). The bus->runtime_idle() callback stops the
> clock and decreases the power domain usage count. If the power domain
> is unused, then pm_schedule_suspend() is called for each of the
> devices in the power domain. This in turn will invoke the
> ->runtime_suspend() callback which starts the clock, calls the driver
> ->runtime_suspend() and stops the clock again. When all devices are
> runtime suspended the power domain is turned off.
> 
> I can't get the above to work with v8 though. This is because after
> the clock is stopped with ->runtime_idle() the runtime_status of the
> device is still RPM_ACTIVE, so when pm_runtime_get_sync() gets called
> the ->runtime_resume() never gets invoked and the clock is never
> started...
> 
> So I don't know if you think the ->runtime_idle usage above is a good
> plan. I guess no, it's probably quite different from the USB case. I
> can of course always skip using ->runtime_idle() and just use
> suspend()/resume().
> 
> Any thoughts?

I think you'd need a separate bus type callback for that, call it
->runtime_deepen() for now, which could be executed for a _suspended_
(from the core's point of view) device and the role of which would be to put
the (already suspended) device into a deeper low power state.

Something like this might also be used for PCI and it's worth discussing IMO.

So, if we had such a callback, your scenario would be the following.

Drivers call pm_runtime_get_sync() and pm_runtime_put() before and
after using the hardware. The runtime PM code invokes the bus
->runtime_idle() callback that in turn calls pm_runtime_suspend() or
pm_schedule_suspend() and the ->runtime_suspend() executed as a result
stops the clock and decreases the power domain usage count.  If the
domain usage count happens to be zero, pm_runtime_deepen() or
pm_schedule_deepen() is called for each device in the power domain.
Consequently, the bus type's ->runtime_deepen() is invoked and that can
call the device's ->runtime_suspend(), for example.  If
pm_runtime_get_sync() is called at any time while this is happening, it will
cancel the pending requests and run ->runtime_resume().
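
A rough user-space model of that flow, just to make the state transitions
concrete.  Everything here is hypothetical: ->runtime_deepen() does not exist
in the patch, and the 'model_*' names and the three-state enum are invented
for the sketch.

```c
#include <assert.h>
#include <stddef.h>

enum model_state { MODEL_ACTIVE, MODEL_SUSPENDED, MODEL_DEEP };

struct model_domain;

struct model_dev {
	enum model_state state;
	struct model_domain *domain;
};

struct model_domain {
	struct model_dev *devs;
	size_t ndevs;
	int active;		/* devices whose clock is still running */
};

static void model_domain_init(struct model_domain *dom,
			      struct model_dev *devs, size_t n)
{
	size_t i;

	assert(dom != NULL && devs != NULL);
	dom->devs = devs;
	dom->ndevs = n;
	dom->active = (int)n;
	for (i = 0; i < n; i++) {
		devs[i].state = MODEL_ACTIVE;
		devs[i].domain = dom;
	}
}

/* Stand-in for ->runtime_deepen(): only acts on a suspended device. */
static void model_deepen(struct model_dev *dev)
{
	if (dev->state == MODEL_SUSPENDED)
		dev->state = MODEL_DEEP;
}

/*
 * Stand-in for ->runtime_idle() calling the suspend path: stop the clock,
 * drop the domain count and deepen every device once the count hits zero.
 */
static void model_idle(struct model_dev *dev)
{
	struct model_domain *dom = dev->domain;
	size_t i;

	if (dev->state != MODEL_ACTIVE)
		return;
	dev->state = MODEL_SUSPENDED;
	if (--dom->active == 0)
		for (i = 0; i < dom->ndevs; i++)
			model_deepen(&dom->devs[i]);
}

/* Stand-in for pm_runtime_get_sync(): resume from either low-power state. */
static void model_get_sync(struct model_dev *dev)
{
	if (dev->state == MODEL_ACTIVE)
		return;
	dev->state = MODEL_ACTIVE;
	dev->domain->active++;
}
```

The model skips the asynchronous part (queued requests and their
cancellation); it only shows which state each device ends up in.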

Does it make sense?

Rafael

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8)
  2009-07-08 19:01       ` Rafael J. Wysocki
  2009-07-08 19:42         ` Alan Stern
@ 2009-07-08 19:42         ` Alan Stern
  2009-07-08 19:55           ` Rafael J. Wysocki
                             ` (3 more replies)
  1 sibling, 4 replies; 51+ messages in thread
From: Alan Stern @ 2009-07-08 19:42 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Magnus Damm, Linux-pm mailing list, Greg KH, LKML,
	ACPI Devel Maling List, Ingo Molnar, Arjan van de Ven

On Wed, 8 Jul 2009, Rafael J. Wysocki wrote:

> > So I'd like to tie in two levels of power management in our runtime PM
> > implementation. The simplest level is clock stopping, and I can do
> > that using the bus callbacks ->runtime_suspend() and
> > ->runtime_resume() with v8. The driver runtime callbacks are never
> > invoked for clock stopping.
> > 
> > On top of the clock stopping I'd like to turn off power to the domain.

I take it the devices in a single power domain don't all share a common 
parent.

> > So if all clocks are stopped to the devices within a domain, then I'd
> > like to call the per-device ->runtime_suspend() callbacks provided by
> > the drivers.

Why?  That is, why not tell the driver as soon as the device's own 
clock is stopped?  What point is there in waiting for all the other 
clocks to be stopped as well?

> > I wonder how to fit these two levels of power management into the
> > runtime PM in a nice way. My first attempts simply made use of
> > pm_runtime_resume() and pm_runtime_suspend(), but I'd like to move to
> > get()/put() if possible. But for that to work I need to implement
> > ->runtime_idle() in my bus code, and I wonder if the current runtime
> > PM idle behaviour is a good fit.
> > 
> > Below is how I'd like to make use of the runtime PM code. I'm not sure
> > if it's compatible with your view. =)
> > 
> > Drivers call pm_runtime_get_sync() and pm_runtime_put() before and
> > after using the hardware. The runtime PM code invokes the bus
> > ->runtime_idle() callback ASAP (of course depending on put() or
> > put_sync(), but no timer). The bus->runtime_idle() callback stops the
> > clock and decreases the power domain usage count. If the power domain
> > is unused, then pm_schedule_suspend() is called for each of the
> > devices in the power domain. This in turn will invoke the
> > ->runtime_suspend() callback which starts the clock, calls the driver
> > ->runtime_suspend() and stops the clock again. When all devices are
> > runtime suspended the power domain is turned off.

Instead, you should call pm_runtime_suspend from within the
runtime_idle method.  When the runtime_suspend method runs, have it
decrement the power domain's usage count.  Is the power domain
represented by a single struct device?  If it is then that device's
power.usage_count field would naturally be the thing to use; otherwise
you'd have to set up your own counter.

Then depending on how things are organized, when the power-domain
device's usage_count goes to 0 you'll get a runtime_idle callback.  
Call pm_runtime_suspend for the power-domain device, and have that
routine shut off the power.  Or if you set up your own private counter
for the power domain, shut off the power when the counter goes to 0.
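
As a minimal sketch of the private-counter variant, assuming the bus type
calls one helper from its ->runtime_suspend() and another from its
->runtime_resume() (the 'pd_model' names are made up, and this is user-space
code, not a kernel implementation):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy power domain with a private counter, independent of the PM core. */
struct pd_model {
	int users;		/* devices that are not run-time suspended */
	bool powered;
};

static void pd_model_init(struct pd_model *pd, int ndevs)
{
	assert(pd != NULL && ndevs > 0);
	pd->users = ndevs;	/* all devices start out active */
	pd->powered = true;
}

/* To be called from ->runtime_suspend() after the clock is stopped. */
static void pd_model_dev_suspended(struct pd_model *pd)
{
	assert(pd->users > 0);
	if (--pd->users == 0)
		pd->powered = false;	/* last user gone: cut the power */
}

/* To be called from ->runtime_resume() before the clock is started. */
static void pd_model_dev_resuming(struct pd_model *pd)
{
	if (pd->users++ == 0)
		pd->powered = true;	/* first user back: restore power */
}
```

In real code the counter and the power switch would have to be protected by
a lock of their own, which the sketch leaves out.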

> I think you'd need a separate bus type callback for that, call it
> ->runtime_deepen() for now, which could be executed for a _suspended_
> (from the core's point of view) device and the role of which would be to put
> the (already suspended) device into a deeper low power state.
> 
> Something like this might also be used for PCI and it's worth discussing IMO.

I thought you wanted to avoid this sort of complication.

Alan Stern


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8)
  2009-07-08 19:42         ` Alan Stern
@ 2009-07-08 19:55           ` Rafael J. Wysocki
  2009-07-08 21:09             ` Alan Stern
  2009-07-08 21:09             ` Alan Stern
  2009-07-08 19:55           ` Rafael J. Wysocki
                             ` (2 subsequent siblings)
  3 siblings, 2 replies; 51+ messages in thread
From: Rafael J. Wysocki @ 2009-07-08 19:55 UTC (permalink / raw)
  To: Alan Stern
  Cc: Magnus Damm, Linux-pm mailing list, Greg KH, LKML,
	ACPI Devel Maling List, Ingo Molnar, Arjan van de Ven

On Wednesday 08 July 2009, Alan Stern wrote:
> On Wed, 8 Jul 2009, Rafael J. Wysocki wrote:
> 
> > > So I'd like to tie in two levels of power management in our runtime PM
> > > implementation. The simplest level is clock stopping, and I can do
> > > that using the bus callbacks ->runtime_suspend() and
> > > ->runtime_resume() with v8. The driver runtime callbacks are never
> > > invoked for clock stopping.
> > > 
> > > On top of the clock stopping I'd like to turn off power to the domain.
> 
> I take it the devices in a single power domain don't all share a common 
> parent.
> 
> > > So if all clocks are stopped to the devices within a domain, then I'd
> > > like to call the per-device ->runtime_suspend() callbacks provided by
> > > the drivers.
> 
> Why?  That is, why not tell the driver as soon as the device's own 
> clock is stopped?  What point is there in waiting for all the other 
> clocks to be stopped as well?
> 
> > > I wonder how to fit these two levels of power management into the
> > > runtime PM in a nice way. My first attempts simply made use of
> > > pm_runtime_resume() and pm_runtime_suspend(), but I'd like to move to
> > > get()/put() if possible. But for that to work I need to implement
> > > ->runtime_idle() in my bus code, and I wonder if the current runtime
> > > PM idle behaviour is a good fit.
> > > 
> > > Below is how I'd like to make use of the runtime PM code. I'm not sure
> > > if it's compatible with your view. =)
> > > 
> > > Drivers call pm_runtime_get_sync() and pm_runtime_put() before and
> > > after using the hardware. The runtime PM code invokes the bus
> > > ->runtime_idle() callback ASAP (of course depending on put() or
> > > put_sync(), but no timer). The bus->runtime_idle() callback stops the
> > > clock and decreases the power domain usage count. If the power domain
> > > is unused, then pm_schedule_suspend() is called for each of the
> > > devices in the power domain. This in turn will invoke the
> > > ->runtime_suspend() callback which starts the clock, calls the driver
> > > ->runtime_suspend() and stops the clock again. When all devices are
> > > runtime suspended the power domain is turned off.
> 
> Instead, you should call pm_runtime_suspend from within the
> runtime_idle method.  When the runtime_suspend method runs, have it
> decrement the power domain's usage count.  Is the power domain
> represented by a single struct device?  If it is then that device's
> power.usage_count field would naturally be the thing to use; otherwise
> you'd have to set up your own counter.
> 
> Then depending on how things are organized, when the power-domain
> device's usage_count goes to 0 you'll get a runtime_idle callback.  
> Call pm_runtime_suspend for the power-domain device, and have that
> routine shut off the power.  Or if you set up your own private counter
> for the power domain, shut off the power when the counter goes to 0.

Yes, I think the approach with a private counter should work in Magnus'
case.

> > I think you'd need a separate bus type callback for that, call it
> > ->runtime_deepen() for now, which could be executed for a _suspended_
> > (from the core's point of view) device and the role of which would be to put
> > the (already suspended) device into a deeper low power state.
> > 
> > Something like this might also be used for PCI and it's worth discussing IMO.
> 
> I thought you wanted to avoid this sort of complication.

I did, but there might be some benefits.  For example, the timer and the work
structure provided by dev.power can be used for scheduling such operations
if they are defined at the core level.

Suppose your device has 3 low power states D1 - D3 (like PCI) and you want it
to go into D1 first, then, after a delay, to D2 and finally, again after a
delay, to D3.  Of course, if there's a resume in the meantime, it should cancel
whichever transition is in progress.

pm_runtime_suspend() can be used for the first transition, but the bus type or
driver will have to provide its own mechanics for going down to D2 and D3,
which must be synchronized with its ->runtime_resume().  That might be tricky
and the core already has what's necessary (well, almost).
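
To make the D1/D2/D3 example concrete, here is a toy user-space state
machine.  It is a sketch under assumptions: the 'deepen_pending' flag stands
in for an armed timer or queued work item, the function names are invented,
and the synchronization with ->runtime_resume() that makes the real thing
tricky is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum dstate { D0, D1, D2, D3 };

struct staged_dev {
	enum dstate state;
	bool deepen_pending;	/* stands in for an armed timer/work item */
};

static void staged_init(struct staged_dev *d)
{
	assert(d != NULL);
	d->state = D0;
	d->deepen_pending = false;
}

/* First transition, the part pm_runtime_suspend() could already cover. */
static void staged_suspend(struct staged_dev *d)
{
	if (d->state != D0)
		return;
	d->state = D1;
	d->deepen_pending = true;	/* schedule the step down to D2 */
}

/* Timer expiry: go one level deeper and re-arm unless D3 was reached. */
static void staged_timer_fired(struct staged_dev *d)
{
	if (!d->deepen_pending)
		return;			/* cancelled by a resume */
	d->state = (enum dstate)(d->state + 1);
	d->deepen_pending = (d->state != D3);
}

/* Resume cancels whatever transition was still pending. */
static void staged_resume(struct staged_dev *d)
{
	d->deepen_pending = false;
	d->state = D0;
}
```

A timer that fires after staged_resume() finds nothing pending and leaves the
device in D0, which is the cancellation behaviour described above.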

Best,
Rafael

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8)
  2009-07-08 19:55           ` Rafael J. Wysocki
@ 2009-07-08 21:09             ` Alan Stern
  2009-07-08 21:29               ` Rafael J. Wysocki
                                 ` (3 more replies)
  2009-07-08 21:09             ` Alan Stern
  1 sibling, 4 replies; 51+ messages in thread
From: Alan Stern @ 2009-07-08 21:09 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Magnus Damm, Linux-pm mailing list, Greg KH, LKML,
	ACPI Devel Maling List, Ingo Molnar, Arjan van de Ven

On Wed, 8 Jul 2009, Rafael J. Wysocki wrote:

> > I thought you wanted to avoid this sort of complication.
> 
> I did, but there might be some benefits.  For example, the timer and the work
> structure provided by dev.power can be used for scheduling such operations
> if they are defined at the core level.
> 
> Suppose your device has 3 low power states D1 - D3 (like PCI) and you want it
> to go into D1 first, then, after a delay, to D2 and finally, again after a
> delay, to D3.  Of course, if there's a resume in the meantime, it should cancel
> whichever transition is in progress.
> 
> pm_runtime_suspend() can be used for the first transition, but the bus type or
> driver will have to provide its own mechanics for going down to D2 and D3,
> which must be synchronized with its ->runtime_resume().  That might be tricky
> and the core already has what's necessary (well, almost).

Maybe we can provide a way for drivers to set up their own timer 
callback or work routine for use while the status is RPM_SUSPENDED.

Alan Stern


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8)
  2009-07-08 21:09             ` Alan Stern
@ 2009-07-08 21:29               ` Rafael J. Wysocki
  2009-07-08 21:29               ` Rafael J. Wysocki
                                 ` (2 subsequent siblings)
  3 siblings, 0 replies; 51+ messages in thread
From: Rafael J. Wysocki @ 2009-07-08 21:29 UTC (permalink / raw)
  To: Alan Stern
  Cc: Magnus Damm, Linux-pm mailing list, Greg KH, LKML,
	ACPI Devel Maling List, Ingo Molnar, Arjan van de Ven

On Wednesday 08 July 2009, Alan Stern wrote:
> On Wed, 8 Jul 2009, Rafael J. Wysocki wrote:
> 
> > > I thought you wanted to avoid this sort of complication.
> > 
> > I did, but there might be some benefits.  For example, the timer and the work
> > structure provided by dev.power can be used for scheduling such operations
> > if they are defined at the core level.
> > 
> > Suppose your device has 3 low power states D1 - D3 (like PCI) and you want it
> > to go into D1 first, then, after a delay, to D2 and finally, again after a
> > delay, to D3.  Of course, if there's a resume in the meantime, it should cancel
> > whichever transition is in progress.
> > 
> > pm_runtime_suspend() can be used for the first transition, but the bus type or
> > driver will have to provide its own mechanics for going down to D2 and D3,
> > which must be synchronized with its ->runtime_resume().  That might be tricky
> > and the core already has what's necessary (well, almost).
> 
> Maybe we can provide a way for drivers to set up their own timer 
> callback or work routine for use while the status is RPM_SUSPENDED.

Agreed.

Anyway, I don't think it's really necessary for Magnus' use case, as
you pointed out earlier in this thread, so I think we can consider it as
something to add in the future.

The current patch is already more than 1200 lines and there's some
documentation to add, so I wouldn't like to make it any bigger. :-)

Best,
Rafael

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8)
  2009-07-08 19:42         ` Alan Stern
@ 2009-07-09  2:52             ` Magnus Damm
  2009-07-08 19:55           ` Rafael J. Wysocki
                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 51+ messages in thread
From: Magnus Damm @ 2009-07-09  2:52 UTC (permalink / raw)
  To: Alan Stern
  Cc: Rafael J. Wysocki, Linux-pm mailing list, Greg KH, LKML,
	ACPI Devel Maling List, Ingo Molnar, Arjan van de Ven

Hi Alan,

On Thu, Jul 9, 2009 at 4:42 AM, Alan Stern<stern@rowland.harvard.edu> wrote:
> On Wed, 8 Jul 2009, Rafael J. Wysocki wrote:
>
>> > So I'd like to tie in two levels of power management in our runtime PM
>> > implementation. The most simple level is clock stopping, and I can do
>> > that using the bus callbacks ->runtime_suspend() and
>> > ->runtime_resume() with v8. The driver runtime callbacks are never
>> > invoked for clock stopping.
>> >
>> > On top of the clock stopping I'd like to turn off power to the domain.
>
> I take it the devices in a single power domain don't all share a common
> parent.

It all depends on how we implement the software bus topology in the
future. Right now from the software perspective the platform bus
topology on SuperH is flat and more or less unused. It sounds sane to
map the power domains into the bus topology. I'm not sure if it is
the best choice though: each device also has clock dependencies, and
it of course communicates through some internal hardware bus with
its own hardware topology.

>> > So if all clocks are stopped to the devices within a domain, then I'd
>> > like to call the per-device ->runtime_suspend() callbacks provided by
>> > the drivers.
>
> Why?  That is, why not tell the driver as soon as the device's own
> clock is stopped?  What point is there in waiting for all the other
> clocks to be stopped as well?

Clocks should be stopped as soon as possible without any delay. The
clock stopping is very cheap performance wise. Also, the clock
stopping is done on bus level without invoking any driver callbacks.
Delaying the clock stopping does not make any sense to me.

For my use case the driver callbacks manage context save and restore.
This is to allow turning off power domains.

The reason why I don't want to execute the driver ->runtime_suspend()
callbacks directly is performance. Basically, we only want to execute
the driver callbacks when we know that we will be able to power off
the domain. The driver callbacks need to save and restore registers,
and each uncached memory access is expensive. Executing the driver
callback does not give us any power savings at all, it just consumes
power.

I want to avoid the situation where the driver ->runtime_suspend() and
->runtime_resume() callbacks get invoked over and over for all devices
except one in a certain power domain even though we will never be able
to power off because a single device in the power domain is active.

The situation above can be illustrated with a practical example: an
open serial port. The receive side of the serial port hardware needs
the clock to be enabled, so we can't turn off the clock. This means
that we can't runtime suspend the device driver. In my opinion it's
pure overhead to call ->runtime_suspend() and ->runtime_resume() for
all other devices in the same power domain as the serial port,
because we already know that the open serial port is blocking the
entire power domain.
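
A rough user-space sketch of the behaviour described above, assuming the
power domain is a simple structure with a private counter of running
clocks.  The names (struct model_domain, model_dev_get(), model_dev_put(),
ctx_saves) are made up for this illustration and are not part of the
patch.  Stopping a clock is cheap and happens at once; the costly context
save runs only when the last clock in the domain stops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_DEVS 8

struct model_domain;

/* Hypothetical device: one clock plus a count of costly context saves. */
struct model_dev {
	struct model_domain *domain;
	bool clock_on;
	unsigned int ctx_saves;	/* how often the driver callback ran */
};

/* Hypothetical power domain with a private counter of running clocks. */
struct model_domain {
	struct model_dev *devs[MODEL_MAX_DEVS];
	size_t ndevs;
	unsigned int clocks_on;
	bool powered;
};

void model_domain_add(struct model_domain *pd, struct model_dev *d)
{
	d->domain = pd;
	d->clock_on = false;
	d->ctx_saves = 0;
	pd->devs[pd->ndevs++] = d;
}

/* Device becomes busy: power the domain if needed, then start its clock. */
void model_dev_get(struct model_dev *d)
{
	struct model_domain *pd = d->domain;

	if (d->clock_on)
		return;
	pd->powered = true;	/* context restore would happen here */
	d->clock_on = true;
	pd->clocks_on++;
}

/* Device goes idle: stop the clock now, save context only if all are idle. */
void model_dev_put(struct model_dev *d)
{
	struct model_domain *pd = d->domain;
	size_t i;

	if (!d->clock_on)
		return;
	d->clock_on = false;
	pd->clocks_on--;
	if (pd->clocks_on > 0)
		return;		/* another device still blocks power-off */
	for (i = 0; i < pd->ndevs; i++)
		pd->devs[i]->ctx_saves++;	/* stands in for ->runtime_suspend() */
	pd->powered = false;
}
```

With the serial port held open, other devices in the domain can be
started and stopped repeatedly without a single context save taking
place.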

>> > I wonder how to fit these two levels of power management into the
>> > runtime PM in a nice way. My first attempts simply made use of
>> > pm_runtime_resume() and pm_runtime_suspend(), but I'd like to move to
>> > get()/put() if possible. But for that to work I need to implement
>> > ->runtime_idle() in my bus code, and I wonder if the current runtime
>> > PM idle behaviour is a good fit.
>> >
>> > Below is how I'd like to make use of the runtime PM code. I'm not sure
>> > if it's compatible with your view. =)
>> >
>> > Drivers call pm_runtime_get_sync() and pm_runtime_put() before and
>> > after using the hardware. The runtime PM code invokes the bus
>> > ->runtime_idle() callback ASAP (of course depending on put() or
>> > put_sync(), but no timer). The bus->runtime_idle() callback stops the
>> > clock and decreases the power domain usage count. If the power domain
>> > is unused, then the pm_schedule_suspend() is called for each of the
>> > devices in the power domain. This in turn will invoke the
>> > ->runtime_suspend() callback which starts the clock, calls the driver
>> > ->runtime_suspend() and stops the clock again. When all devices are
>> > runtime suspended the power domain is turned off.
>
> Instead, you should call pm_runtime_suspend from within the
> runtime_idle method.  When the runtime_suspend method runs, have it
> decrement the power domain's usage count.  Is the power domain
> represented by a single struct device?  If it is then that device's
> power.usage_count field would naturally be the thing to use; otherwise
> you'd have to set up your own counter.

Ok, calling pm_runtime_suspend() from the bus ->runtime_idle callback
sounds like a good plan. The power domain is only represented by a
simple structure at this point. I agree it would be nice to make use
of the usage_count for this purpose.

> Then depending on how things are organized, when the power-domain
> device's usage_count goes to 0 you'll get a runtime_idle callback.
> Call pm_runtime_resume for the power-domain device, and have that
> routine shut off the power.  Or if you set up your own private counter
> for the power domain, shut off the power when the counter goes to 0.

Right, I'm with you on how it works. I have to think a bit more about how
to tie in both clock stopping and power domain control with
runtime_idle though.

Thanks for your suggestions,

/ magnus

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8)
  2009-07-09  2:52             ` Magnus Damm
  (?)
@ 2009-07-09 13:48             ` Alan Stern
  2009-07-09 15:31               ` Magnus Damm
  2009-07-09 15:31                 ` Magnus Damm
  -1 siblings, 2 replies; 51+ messages in thread
From: Alan Stern @ 2009-07-09 13:48 UTC (permalink / raw)
  To: Magnus Damm
  Cc: Rafael J. Wysocki, Linux-pm mailing list, Greg KH, LKML,
	ACPI Devel Maling List, Ingo Molnar, Arjan van de Ven

On Thu, 9 Jul 2009, Magnus Damm wrote:

> Clocks should be stopped as soon as possible without any delay. The
> clock stopping is very cheap performance wise. Also, the clock
> stopping is done on bus level without invoking any driver callbacks.
> Delaying the clock stopping does not make any sense to me.

In that case the device driver or bus subsystem should manage the
device's clock directly.  There's no need to tie it in with the runtime
PM framework.  Simply start the clock before each I/O operation and
stop it afterward.
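
As an illustration only, bracketing each I/O operation with a clock
start/stop could look like the user-space model below.  The names
(struct model_clk, model_clk_start(), model_clk_stop(), model_io_write())
are hypothetical, and the user count is an assumption of this sketch so
that overlapping users do not stop the clock under each other.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a gated device clock; not the kernel clk API. */
struct model_clk {
	unsigned int users;	/* operations currently needing the clock */
	bool running;
};

/* The first user starts the clock. */
void model_clk_start(struct model_clk *c)
{
	if (c->users++ == 0)
		c->running = true;
}

/* The last user stops the clock. */
void model_clk_stop(struct model_clk *c)
{
	if (c->users > 0 && --c->users == 0)
		c->running = false;
}

/*
 * One I/O operation bracketed by clock start/stop.
 * Returns 0 on success, -1 if the clock was not running.
 */
int model_io_write(struct model_clk *c, int *reg, int value)
{
	int ret = -1;

	model_clk_start(c);
	if (c->running) {
		*reg = value;	/* stands in for a register access */
		ret = 0;
	}
	model_clk_stop(c);
	return ret;
}
```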

> For my use case the driver callbacks manage context save and restore.
> This is to allow turning off power domains.
> 
> The reason why I don't want to execute the driver ->runtime_suspend()
> callbacks directly is performance. Basically, we only want to execute
> the driver callbacks when we know that we will be able to power off
> the domain. The driver callbacks need to save and restore registers,
> and each uncached memory access is expensive. Executing the driver
> callback does not give us any power savings at all, it just consumes
> power.
> 
> I want to avoid the situation where the driver ->runtime_suspend() and
> ->runtime_resume() callbacks get invoked over and over for all devices
> except one in a certain power domain even though we will never be able
> to power off because a single device in the power domain is active.

Okay, I get the picture.  So what you want is something like this: Each 
time a driver starts a clock, it does a pm_runtime_get on the 
power-domain device.  Each time it stops the clock, it does a 
pm_runtime_put on the power-domain device.

When the power-domain device's runtime_idle callback runs, it should
call pm_schedule_suspend.  The runtime_suspend callback should call
pm_runtime_suspend for each of the devices in the power domain before
turning off the power supply.  Conversely, the runtime_resume callback
for the power-domain device should turn on the power supply and then
call pm_runtime_resume for each of the devices in the domain.

This will have to get more complicated if the individual device drivers
want to maintain their own usage counters -- especially if they want to
turn off the clock while leaving the counter positive.
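
The ordering described above can be sketched as a small user-space model.
This is not kernel code: struct model_pd and the model_pd_*() helpers are
invented for illustration, and the "scheduled suspend" is reduced to a
flag that a later call acts on.  A get cancels a pending suspend and
powers the domain up before resuming its devices; the last put schedules
a suspend, which suspends every device before cutting the power.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NDEVS 3

struct model_member {
	bool suspended;
};

/* Hypothetical power-domain device with its own usage counter. */
struct model_pd {
	struct model_member devs[MODEL_NDEVS];
	unsigned int usage_count;
	bool suspend_scheduled;
	bool power_on;
};

/* Stand-in for runtime_suspend: suspend the devices, then cut power. */
void model_pd_suspend(struct model_pd *pd)
{
	size_t i;

	if (pd->usage_count > 0 || !pd->power_on)
		return;
	for (i = 0; i < MODEL_NDEVS; i++)
		pd->devs[i].suspended = true;
	pd->power_on = false;
	pd->suspend_scheduled = false;
}

/* Stand-in for runtime_resume: restore power, then resume the devices. */
void model_pd_resume(struct model_pd *pd)
{
	size_t i;

	if (pd->power_on)
		return;
	pd->power_on = true;
	for (i = 0; i < MODEL_NDEVS; i++)
		pd->devs[i].suspended = false;
}

/* A driver starts a clock: take a reference and make sure power is on. */
void model_pd_get(struct model_pd *pd)
{
	pd->usage_count++;
	pd->suspend_scheduled = false;	/* cancel a pending suspend */
	model_pd_resume(pd);
}

/* A driver stops a clock: on the last put, schedule a suspend (idle). */
void model_pd_put(struct model_pd *pd)
{
	if (pd->usage_count > 0 && --pd->usage_count == 0)
		pd->suspend_scheduled = true;
}

/* Deferred work: carry out the suspend if it is still scheduled. */
void model_pd_run_scheduled(struct model_pd *pd)
{
	if (pd->suspend_scheduled)
		model_pd_suspend(pd);
}
```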

Alan Stern


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8)
  2009-07-09  2:52             ` Magnus Damm
  (?)
  (?)
@ 2009-07-09 13:48             ` Alan Stern
  -1 siblings, 0 replies; 51+ messages in thread
From: Alan Stern @ 2009-07-09 13:48 UTC (permalink / raw)
  To: Magnus Damm
  Cc: Greg KH, LKML, ACPI Devel Maling List, Linux-pm mailing list,
	Ingo Molnar, Arjan van de Ven

On Thu, 9 Jul 2009, Magnus Damm wrote:

> Clocks should be stopped as soon as possible without any delay. The
> clock stopping is very cheap performance wise. Also, the clock
> stopping is done on bus level without invoking any driver callbacks.
> Delaying the clock stopping does not make any sense to me.

In that case the device driver or bus subsystem should manage the
device's clock directly.  There's no need to tie it in with the runtime
PM framework.  Simply start the clock before each I/O operation and
stop it afterward.

> For my use case the driver callbacks manage context save and restore.
> This to allow turning off power domains.
> 
> The reason why I don't want to execute the driver ->runtime_suspend()
> callbacks directly is performance. Basically, we only want to execute
> the driver callbacks when we know that we will be able to power off
> the domain. The driver callbacks need to save and restore registers,
> and each uncached memory access is expensive. Executing the driver
> callback does not give us any power savings at all, it's just consumes
> power.
> 
> I want to avoid the situation where the driver ->runtime_suspend() and
> ->runtime_resume() callbacks get invoked over and over for all devices
> except one in a certain power domain even though we will never be able
> to power off because a single device in the power domain is active.

Okay, I get the picture.  So what you want is something like this: Each 
time a driver starts a clock, it does a pm_runtime_get on the 
power-domain device.  Each time it stops the clock, it does a 
pm_runtime_put on the power-domain device.

When the power-domain device's runtime_idle callback runs, it should
call pm_schedule_suspend.  The runtime_suspend callback should call
pm_runtime_suspend for each of the devices in the power domain before
turning off the power supply.  Conversely, the runtime_resume callback
for the power-domain device should turn on the power supply and then
call pm_runtime_resume for each of the devices in the domain.

This will have to get more complicated if the individual device drivers
want to maintain their own usage counters -- especially if they want to
turn off the clock while leaving the counter positive.
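
The scheme described above can be sketched as a small userspace C model.
This is an illustration only, under stated assumptions: struct
model_domain, domain_get(), domain_put() and the other names are
hypothetical and are not part of the proposed framework, and the suspend
happens immediately where the real scheme would go through the
runtime_idle callback and pm_schedule_suspend.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PD_MAX_DEVS 4

struct model_dev {
	bool suspended;      /* context saved, device quiesced */
	int  suspend_calls;  /* how often the "driver callback" ran */
};

struct model_domain {
	int  usage_count;    /* one reference per running clock */
	bool powered;
	struct model_dev *devs[PD_MAX_DEVS];
	size_t ndevs;
};

/* Stand-in for the domain's runtime_suspend: devices first, then power. */
static void domain_suspend(struct model_domain *pd)
{
	for (size_t i = 0; i < pd->ndevs; i++) {
		pd->devs[i]->suspended = true;
		pd->devs[i]->suspend_calls++;
	}
	pd->powered = false;
}

/* Stand-in for the domain's runtime_resume: power first, then devices. */
static void domain_resume(struct model_domain *pd)
{
	pd->powered = true;
	for (size_t i = 0; i < pd->ndevs; i++)
		pd->devs[i]->suspended = false;
}

/* A driver starts a clock: take a reference on the power domain. */
static void domain_get(struct model_domain *pd)
{
	if (pd->usage_count++ == 0 && !pd->powered)
		domain_resume(pd);
}

/* A driver stops a clock: drop the reference; an idle domain suspends. */
static void domain_put(struct model_domain *pd)
{
	assert(pd->usage_count > 0);
	if (--pd->usage_count == 0)
		domain_suspend(pd);	/* the real scheme would schedule this */
}
```

In this model the per-device suspend work runs once, when the last clock
in the domain stops, rather than every time a single device goes idle.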

Alan Stern

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8)
  2009-07-09 13:48             ` Alan Stern
@ 2009-07-09 15:31                 ` Magnus Damm
  2009-07-09 15:31                 ` Magnus Damm
  1 sibling, 0 replies; 51+ messages in thread
From: Magnus Damm @ 2009-07-09 15:31 UTC (permalink / raw)
  To: Alan Stern
  Cc: Rafael J. Wysocki, Linux-pm mailing list, Greg KH, LKML,
	ACPI Devel Maling List, Ingo Molnar, Arjan van de Ven

On Thu, Jul 9, 2009 at 10:48 PM, Alan Stern<stern@rowland.harvard.edu> wrote:
> On Thu, 9 Jul 2009, Magnus Damm wrote:
>
>> Clocks should be stopped as soon as possible without any delay. The
>> clock stopping is very cheap performance wise. Also, the clock
>> stopping is done on bus level without invoking any driver callbacks.
>> Delaying the clock stopping does not make any sense to me.
>
> In that case the device driver or bus subsystem should manage the
> device's clock directly.  There's no need to tie it in with the runtime
> PM framework.  Simply start the clock before each I/O operation and
> stop it afterward.

It's not that easy. The clock needs to be enabled to let the hardware
device perform device specific stuff. For instance, the clock for the
LCD controller needs to be on to redraw the screen. When the driver
knows that it's done with the clock it can notify the bus using
Runtime PM.

>> For my use case the driver callbacks manage context save and restore.
>> This to allow turning off power domains.
>>
>> The reason why I don't want to execute the driver ->runtime_suspend()
>> callbacks directly is performance. Basically, we only want to execute
>> the driver callbacks when we know that we will be able to power off
>> the domain. The driver callbacks need to save and restore registers,
>> and each uncached memory access is expensive. Executing the driver
>> callback does not give us any power savings at all, it's just consumes
>> power.
>>
>> I want to avoid the situation where the driver ->runtime_suspend() and
>> ->runtime_resume() callbacks get invoked over and over for all devices
>> except one in a certain power domain even though we will never be able
>> to power off because a single device in the power domain is active.
>
> Okay, I get the picture.  So what you want is something like this: Each
> time a driver starts a clock, it does a pm_runtime_get on the
> power-domain device.  Each time it stops the clock, it does a
> pm_runtime_put on the power-domain device.

Yes, this is exactly what I'd like to do in the driver.

> When the power-domain device's runtime_idle callback runs, it should
> call pm_schedule_suspend.  The runtime_suspend callback should call
> pm_runtime_suspend for each of the devices in the power domain before
> turning off the power supply.  Conversely, the runtime_resume callback
> for the power-domain device should turn on the power supply and then
> call pm_runtime_resume for each of the devices in the domain.
>
> This will have to get more complicated if the individual device drivers
> want to maintain their own usage counters -- especially if they want to
> turn off the clock while leaving the counter positive.

I'd like to manage both clocks and power domains using Runtime PM. The
prototype I just posted does that, but the code is not very well
integrated with the shared Runtime PM code.

/ magnus

^ permalink raw reply	[flat|nested] 51+ messages in thread

* RE: [linux-pm] [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8)
  2009-07-09 15:31                 ` Magnus Damm
@ 2009-07-09 21:56                   ` Mahalingam, Nithish
  -1 siblings, 0 replies; 51+ messages in thread
From: Mahalingam, Nithish @ 2009-07-09 21:56 UTC (permalink / raw)
  To: Magnus Damm, Alan Stern
  Cc: Greg KH, LKML, ACPI Devel Maling List, Linux-pm mailing list,
	Ingo Molnar, Arjan van de Ven

Hi,

I am a newbie on this mailing list. Please excuse me if I am talking nonsense here.



On Thu, Jul 9, 2009 at 10:48 PM, Alan Stern<stern@rowland.harvard.edu> wrote:
>> On Thu, 9 Jul 2009, Magnus Damm wrote:
>>
>>> Clocks should be stopped as soon as possible without any delay. The
>>> clock stopping is very cheap performance wise. Also, the clock
>>> stopping is done on bus level without invoking any driver callbacks.
>>> Delaying the clock stopping does not make any sense to me.
>>
>> In that case the device driver or bus subsystem should manage the
>> device's clock directly.  There's no need to tie it in with the runtime
>> PM framework.  Simply start the clock before each I/O operation and
>> stop it afterward.

> It's not that easy. The clock needs to be enabled to let the hardware
> device perform device specific stuff. For instance, the clock for the
> LCD controller needs to be on to redraw the screen. When the driver
> knows that it's done with the clock it can notify the bus using
> Runtime PM.

Is there any plan to look into the "Clock Framework" that was developed as
part of OMAP, and to extend it so that it is generic for all platforms?

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8)
  2009-07-06  0:52 [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8) Rafael J. Wysocki
@ 2009-07-09 23:22   ` Pavel Machek
  2009-07-07 15:12   ` Magnus Damm
  2009-07-09 23:22   ` Pavel Machek
  2 siblings, 0 replies; 51+ messages in thread
From: Pavel Machek @ 2009-07-09 23:22 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg KH, LKML, ACPI Devel Maling List, Linux-pm mailing list,
	Ingo Molnar, Arjan van de Ven

hi!

> +/**
> + * Device run-time power management request types.
> + *
> + * RPM_REQ_NONE		Do nothing.
> + *
> + * RPM_REQ_IDLE		Run the device bus type's ->runtime_idle() callback
> + *
> + * RPM_REQ_SUSPEND	Run the device bus type's ->runtime_suspend() callback
> + *
> + * RPM_REQ_RESUME	Run the device bus type's ->runtime_resume() callback
> + */
> +
> +enum rpm_request {
> +	RPM_REQ_NONE = 0,
> +	RPM_REQ_IDLE,
> +	RPM_REQ_SUSPEND,
> +	RPM_REQ_RESUME,
> +};
> +
>  struct dev_pm_info {
>  	pm_message_t		power_state;
> -	unsigned		can_wakeup:1;
> -	unsigned		should_wakeup:1;
> +	unsigned int		can_wakeup:1;
> +	unsigned int		should_wakeup:1;
>  	enum dpm_state		status;		/* Owned by the PM core */
> -#ifdef	CONFIG_PM_SLEEP
> +#ifdef CONFIG_PM_SLEEP
>  	struct list_head	entry;
>  #endif
> +#ifdef CONFIG_PM_RUNTIME
> +	struct timer_list	suspend_timer;
> +	unsigned long		timer_expires;
> +	struct work_struct	work;
> +	wait_queue_head_t	wait_queue;
> +	spinlock_t		lock;
> +	atomic_t		usage_count;
> +	atomic_t		child_count;
> +	unsigned int		ignore_children:1;
> +	unsigned int		runtime_disabled:1;
> +	unsigned int		runtime_failure:1;
> +	unsigned int		idle_notification:1;
> +	unsigned int		request_pending:1;
> +	unsigned int		deferred_resume:1;
> +	enum rpm_request	request;
> +	enum rpm_status		runtime_status;

runtime_status seems to be accessed outside spinlocks. Should it be of
type atomic_t?
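
The two options the question contrasts can be sketched in userspace C11.
This is only an illustration, not what the patch does: struct model_pm,
the toy spinlock and both helper functions are hypothetical names, and
C11 atomics stand in for the kernel's spinlock_t and atomic_t.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stdatomic.h>

enum model_status { MODEL_ACTIVE, MODEL_SUSPENDING, MODEL_SUSPENDED };

struct model_pm {
	atomic_flag lock;		/* toy spinlock */
	enum model_status status;	/* option A: only touched under lock */
	_Atomic int status_atomic;	/* option B: atomic accesses */
};

static void model_lock(struct model_pm *pm)
{
	while (atomic_flag_test_and_set_explicit(&pm->lock,
						 memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void model_unlock(struct model_pm *pm)
{
	atomic_flag_clear_explicit(&pm->lock, memory_order_release);
}

/* Option A: the check and the update form one critical section. */
static int model_try_suspend_locked(struct model_pm *pm)
{
	int ok = 0;

	model_lock(pm);
	if (pm->status == MODEL_ACTIVE) {
		pm->status = MODEL_SUSPENDING;
		ok = 1;
	}
	model_unlock(pm);
	return ok;
}

/* Option B: the check and the update are a single compare-and-swap. */
static int model_try_suspend_atomic(struct model_pm *pm)
{
	int expected = MODEL_ACTIVE;

	return atomic_compare_exchange_strong(&pm->status_atomic,
					      &expected, MODEL_SUSPENDING);
}
```

In both variants a second caller that finds the status already changed
gets 0 back. A plain atomic load followed by a separate store would not
give that guarantee, which is one reason a status field is often kept
under the lock that protects the rest of the state.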


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [linux-pm] [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8)
  2009-07-09 21:56                   ` Mahalingam, Nithish
  (?)
  (?)
@ 2009-07-11 11:08                   ` Rafael J. Wysocki
  2009-07-12  2:05                     ` Mahalingam, Nithish
  2009-07-12  2:05                     ` [linux-pm] " Mahalingam, Nithish
  -1 siblings, 2 replies; 51+ messages in thread
From: Rafael J. Wysocki @ 2009-07-11 11:08 UTC (permalink / raw)
  To: Mahalingam, Nithish
  Cc: Magnus Damm, Alan Stern, Greg KH, LKML, ACPI Devel Maling List,
	Linux-pm mailing list, Ingo Molnar, Arjan van de Ven

On Thursday 09 July 2009, Mahalingam, Nithish wrote:
> Hi,

Hi,

> I am newbee to this mailing list. Please excuse me if I am talking nonsense here.
> 
> 
> 
> On Thu, Jul 9, 2009 at 10:48 PM, Alan Stern<stern@rowland.harvard.edu> wrote:
> >> On Thu, 9 Jul 2009, Magnus Damm wrote:
> >>
> >>> Clocks should be stopped as soon as possible without any delay. The
> >>> clock stopping is very cheap performance wise. Also, the clock
> >> stopping is done on bus level without invoking any driver callbacks.
> >>> Delaying the clock stopping does not make any sense to me.
> >>
> >> In that case the device driver or bus subsystem should manage the
> >> device's clock directly.  There's no need to tie it in with the runtime
> >> PM framework.  Simply start the clock before each I/O operation and
> >> stop it afterward.
> 
> > It's not that easy. The clock needs to be enabled to let the hardware
> > device perform device specific stuff. For instance, the clock for the
> > LCD controller needs to be on to redraw the screen. When the driver
> > knows that it's done with the clock it can notify the bus using
> > Runtime PM.
> 
> Is there any plan to look into the "Clock Framework" that was developed as
> part of OMAP and extending this to make it generic for all platforms?

I don't have any plans to do that and I haven't heard of anyone planning to do
it.

Best,
Rafael

^ permalink raw reply	[flat|nested] 51+ messages in thread

* RE: [linux-pm] [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8)
  2009-07-11 11:08                   ` [linux-pm] " Rafael J. Wysocki
  2009-07-12  2:05                     ` Mahalingam, Nithish
@ 2009-07-12  2:05                     ` Mahalingam, Nithish
  1 sibling, 0 replies; 51+ messages in thread
From: Mahalingam, Nithish @ 2009-07-12  2:05 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Magnus Damm, Alan Stern, Greg KH, LKML, ACPI Devel Maling List,
	Linux-pm mailing list, Ingo Molnar, Arjan van de Ven

>Hi,
>
>> I am newbee to this mailing list. Please excuse me if I am talking nonsense here.
>> 
>> 
>> 
>> On Thu, Jul 9, 2009 at 10:48 PM, Alan Stern<stern@rowland.harvard.edu> wrote:
>>>> On Thu, 9 Jul 2009, Magnus Damm wrote:
>>>>
>>>>> Clocks should be stopped as soon as possible without any delay. The
>>>>> clock stopping is very cheap performance wise. Also, the clock
>>>>> stopping is done on bus level without invoking any driver callbacks.
>>>>> Delaying the clock stopping does not make any sense to me.
>>>>
>>>> In that case the device driver or bus subsystem should manage the
>>>> device's clock directly.  There's no need to tie it in with the runtime
>>>> PM framework.  Simply start the clock before each I/O operation and
>>>> stop it afterward.
>> 
>>> It's not that easy. The clock needs to be enabled to let the hardware
>>> device perform device specific stuff. For instance, the clock for the
>>> LCD controller needs to be on to redraw the screen. When the driver
>>> knows that it's done with the clock it can notify the bus using
>>> Runtime PM.
>> 
>> Is there any plan to look into the "Clock Framework" that was developed as
>> part of OMAP and extending this to make it generic for all platforms?

>I don't have any plan to do that and I heaven't heard of anyone planning to do
>it.

Thanks for the reply, Rafael. My feeling is that if we are talking about
controlling a device clock from a runtime PM framework, we should look at
extending the clock framework rather than reinventing things.

Regards,
Nithish Mahalingam

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8)
  2009-07-09 21:56                   ` Mahalingam, Nithish
@ 2009-07-13  1:42                     ` Magnus Damm
  -1 siblings, 0 replies; 51+ messages in thread
From: Magnus Damm @ 2009-07-13  1:42 UTC (permalink / raw)
  To: Mahalingam, Nithish
  Cc: Greg KH, LKML, ACPI Devel Maling List, Linux-pm mailing list,
	Ingo Molnar, Arjan van de Ven

On Fri, Jul 10, 2009 at 6:56 AM, Mahalingam, Nithish<nithish.mahalingam@intel.com> wrote:
> On Thu, Jul 9, 2009 at 10:48 PM, Alan Stern<stern@rowland.harvard.edu> wrote:
>>> On Thu, 9 Jul 2009, Magnus Damm wrote:
>>>
>>>> Clocks should be stopped as soon as possible without any delay. The
>>>> clock stopping is very cheap performance wise. Also, the clock
>>>> stopping is done on bus level without invoking any driver callbacks.
>>>> Delaying the clock stopping does not make any sense to me.
>>>
>>> In that case the device driver or bus subsystem should manage the
>>> device's clock directly.  There's no need to tie it in with the runtime
>>> PM framework.  Simply start the clock before each I/O operation and
>>> stop it afterward.
>
>> It's not that easy. The clock needs to be enabled to let the hardware
>> device perform device specific stuff. For instance, the clock for the
>> LCD controller needs to be on to redraw the screen. When the driver
>> knows that it's done with the clock it can notify the bus using
>> Runtime PM.
>
> Is there any plan to look into the "Clock Framework" that was developed as part of OMAP and extending this to make it generic for all platforms?

Do you mean vendor specific extensions to the clock framework? I'm
quite sure a bunch of architectures already support the clock
framework. Some architectures probably have extensions that would be
nice if they could be made more generic. Do you have any special
extensions in mind? =)

As for Runtime PM, clock stopping is only one part of the problem. On
SuperH we manage clock stopping through the clock framework and I'm
pretty sure other embedded architectures do that as well. That problem
is more or less already solved in my opinion.

For efficient power management on embedded systems we need more than
just clock stopping. We also want to save and restore device context
so we can turn off power to power domains during runtime. This is
somewhat tied together with clock stopping because we want to disable
clocks as soon as possible when the device gets idle but in the case
of domain power off we also need to start the clock before we can call
the ->runtime_suspend() callback provided by the driver. All this can
of course be handled by the device driver, but that would just be
duplicating too much code - I'd rather let the Runtime PM bus code
manage that.
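
The ordering described above can be sketched as a small userspace C
model. It is an illustration under stated assumptions only: struct
model_lcd and the model_* functions are hypothetical names, not the
SuperH prototype or the proposed Runtime PM interface.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>

struct model_lcd {
	bool clock_on;
	bool context_saved;
	unsigned int regs;		/* pretend hardware register */
	unsigned int saved_regs;	/* copy kept across power-off */
};

/* Hypothetical driver callback: it needs the clock to read registers. */
static int model_driver_suspend(struct model_lcd *dev)
{
	if (!dev->clock_on)
		return -1;	/* no register access without a clock */
	dev->saved_regs = dev->regs;
	dev->context_saved = true;
	return 0;
}

/* Bus code: the device went idle, so stop its clock at once.
 * No driver callback is involved at this point. */
static void model_bus_idle(struct model_lcd *dev)
{
	dev->clock_on = false;
}

/* Bus code: the whole domain is idle and is about to lose power.
 * Restart the clock, let the driver save context, stop the clock. */
static int model_bus_prepare_power_off(struct model_lcd *dev)
{
	int ret;

	dev->clock_on = true;
	ret = model_driver_suspend(dev);
	dev->clock_on = false;
	return ret;
}
```

Keeping the clock handling in the bus-level function means the driver
callback contains only the context save, which is the duplication the
paragraph above wants to avoid.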

Many runtime power management aware device drivers use clk_enable()
and clk_disable() to start and stop clocks. But the clk_disable()
function is actually often used to signal that the device is idle.
Allowing device drivers to replace clk_enable()/clk_disable() with
Runtime PM callbacks to add support for context save and restore is a
logical next step from my point of view.

For SuperH we will most likely allow clock stopping through the clock
framework _or_ using Runtime PM in parallel, at least until all
drivers are converted to make use of Runtime PM.

Cheers,

/ magnus

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8)
@ 2009-07-06  0:52 Rafael J. Wysocki
  0 siblings, 0 replies; 51+ messages in thread
From: Rafael J. Wysocki @ 2009-07-06  0:52 UTC (permalink / raw)
  To: Alan Stern, Linux-pm mailing list
  Cc: ACPI Devel Maling List, Ingo Molnar, Greg KH, LKML, Arjan van de Ven

Hi,

There's a rev. 8 of the run-time PM framework patch.

Highlights:
* I did my best to follow the design we've recently discussed.
* pm_runtime_[get|put]() and the sync versions call
  pm_[request|runtime]_[resume|idle](), because I don't see much point
  manipulating the usage counter alone.
* pm_runtime_disable() carries out a (synchronous) wake-up if there's a
  resume request pending.
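
In case it helps review, here is a rough userspace model of the last
two highlights. It is only a sketch: struct toy_pm and the toy_*
functions are invented for this mail, locking and the workqueue are
left out, and the exact ordering inside the real helpers is an
assumption. The checks follow the ones in runtime.c below.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model of the per-device state; not the real struct dev_pm_info. */
enum toy_req { TOY_REQ_NONE, TOY_REQ_IDLE, TOY_REQ_RESUME };

struct toy_pm {
	int usage_count;
	bool suspended;
	bool disabled;
	enum toy_req pending;	/* request waiting for the workqueue */
};

/* Synchronous resume: the device is active when this returns 0. */
static int toy_runtime_resume(struct toy_pm *pm)
{
	pm->pending = TOY_REQ_NONE;
	if (!pm->suspended)
		return 1;		/* already active */
	if (pm->disabled)
		return -EAGAIN;
	pm->suspended = false;
	return 0;
}

/* Asynchronous resume: only queue the request. */
static int toy_request_resume(struct toy_pm *pm)
{
	if (!pm->suspended)
		return 1;
	if (pm->disabled)
		return -EAGAIN;
	pm->pending = TOY_REQ_RESUME;
	return 0;
}

/* Asynchronous idle notification, refused while the device is in use. */
static int toy_request_idle(struct toy_pm *pm)
{
	if (pm->usage_count > 0 || pm->disabled || pm->suspended)
		return -EAGAIN;
	pm->pending = TOY_REQ_IDLE;
	return 0;
}

/* get/put never touch the counter alone, they also kick the PM core. */
static int toy_get(struct toy_pm *pm)
{
	pm->usage_count++;
	return toy_request_resume(pm);
}

static int toy_get_sync(struct toy_pm *pm)
{
	pm->usage_count++;
	return toy_runtime_resume(pm);
}

static int toy_put(struct toy_pm *pm)
{
	pm->usage_count--;
	return toy_request_idle(pm);
}

/* Disable: a pending resume is carried out synchronously first. */
static int toy_disable(struct toy_pm *pm)
{
	int ret = 0;

	pm->usage_count++;
	if (pm->disabled)
		return 0;
	if (pm->pending == TOY_REQ_RESUME) {
		toy_runtime_resume(pm);
		ret = -EBUSY;
	}
	pm->disabled = true;
	pm->pending = TOY_REQ_NONE;
	return ret;
}

/* Run both scenarios; returns 0 on success. */
int toy_demo(void)
{
	struct toy_pm a = { .suspended = true };
	struct toy_pm b = { .suspended = true };

	/* Asynchronous get: the counter goes up, the resume is only queued. */
	assert(toy_get(&a) == 0);
	assert(a.usage_count == 1 && a.suspended);
	assert(a.pending == TOY_REQ_RESUME);

	/* Disabling now must not strand the queued resume. */
	assert(toy_disable(&a) == -EBUSY);
	assert(!a.suspended && a.disabled && a.pending == TOY_REQ_NONE);

	/* Synchronous get followed by put queues an idle notification. */
	assert(toy_get_sync(&b) == 0 && !b.suspended);
	assert(toy_put(&b) == 0 && b.pending == TOY_REQ_IDLE);
	return 0;
}
```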

Comments welcome.

Best,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>

Introduce a core framework for run-time power management of I/O
devices.  Add device run-time PM fields to 'struct dev_pm_info'
and device run-time PM callbacks to 'struct dev_pm_ops'.  Introduce
a run-time PM workqueue and define some device run-time PM helper
functions at the core level.

Not-yet-signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/base/dd.c            |   10 
 drivers/base/power/Makefile  |    1 
 drivers/base/power/main.c    |   21 -
 drivers/base/power/power.h   |   11 
 drivers/base/power/runtime.c |  901 +++++++++++++++++++++++++++++++++++++++++++
 include/linux/pm.h           |  102 ++++
 include/linux/pm_runtime.h   |  105 +++++
 kernel/power/Kconfig         |   14 
 kernel/power/main.c          |   17 
 9 files changed, 1170 insertions(+), 12 deletions(-)

Index: linux-2.6/kernel/power/Kconfig
===================================================================
--- linux-2.6.orig/kernel/power/Kconfig
+++ linux-2.6/kernel/power/Kconfig
@@ -208,3 +208,17 @@ config APM_EMULATION
 	  random kernel OOPSes or reboots that don't seem to be related to
 	  anything, try disabling/enabling this option (or disabling/enabling
 	  APM in your BIOS).
+
+config PM_RUNTIME
+	bool "Run-time PM core functionality"
+	depends on PM
+	---help---
+	  Enable functionality allowing I/O devices to be put into energy-saving
+	  (low power) states at run time (or autosuspended) after a specified
+	  period of inactivity and woken up in response to a hardware-generated
+	  wake-up event or a driver's request.
+
+	  Hardware support is generally required for this functionality to work
+	  and the bus type drivers of the buses the devices are on are
+	  responsible for the actual handling of the autosuspend requests and
+	  wake-up events.
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -11,6 +11,7 @@
 #include <linux/kobject.h>
 #include <linux/string.h>
 #include <linux/resume-trace.h>
+#include <linux/workqueue.h>
 
 #include "power.h"
 
@@ -217,8 +218,24 @@ static struct attribute_group attr_group
 	.attrs = g,
 };
 
+#ifdef CONFIG_PM_RUNTIME
+struct workqueue_struct *pm_wq;
+
+static int __init pm_start_workqueue(void)
+{
+	pm_wq = create_freezeable_workqueue("pm");
+
+	return pm_wq ? 0 : -ENOMEM;
+}
+#else
+static inline int pm_start_workqueue(void) { return 0; }
+#endif
+
 static int __init pm_init(void)
 {
+	int error = pm_start_workqueue();
+	if (error)
+		return error;
 	power_kobj = kobject_create_and_add("power", NULL);
 	if (!power_kobj)
 		return -ENOMEM;
Index: linux-2.6/include/linux/pm.h
===================================================================
--- linux-2.6.orig/include/linux/pm.h
+++ linux-2.6/include/linux/pm.h
@@ -22,6 +22,10 @@
 #define _LINUX_PM_H
 
 #include <linux/list.h>
+#include <linux/workqueue.h>
+#include <linux/spinlock.h>
+#include <linux/wait.h>
+#include <linux/timer.h>
 
 /*
  * Callbacks for platform drivers to implement.
@@ -165,6 +169,28 @@ typedef struct pm_message {
  * It is allowed to unregister devices while the above callbacks are being
  * executed.  However, it is not allowed to unregister a device from within any
  * of its own callbacks.
+ *
+ * There also are the following callbacks related to run-time power management
+ * of devices:
+ *
+ * @runtime_suspend: Prepare the device for a condition in which it won't be
+ *	able to communicate with the CPU(s) and RAM due to power management.
+ *	This need not mean that the device should be put into a low power state.
+ *	For example, if the device is behind a link which is about to be turned
+ *	off, the device may remain at full power.  If the device does go to low
+ *	power and if device_may_wakeup(dev) is true, remote wake-up (i.e., a
+ *	hardware mechanism allowing the device to request a change of its power
+ *	state, such as PCI PME) should be enabled for it.
+ *
+ * @runtime_resume: Put the device into the fully active state in response to a
+ *	wake-up event generated by hardware or at the request of software.  If
+ *	necessary, put the device into the full power state and restore its
+ *	registers, so that it is fully operational.
+ *
+ * @runtime_idle: Device appears to be inactive and it might be put into a low
+ *	power state if all of the necessary conditions are satisfied.  Check
+ *	these conditions and handle the device as appropriate, possibly queueing
+ *	a suspend request for it.
  */
 
 struct dev_pm_ops {
@@ -182,6 +208,9 @@ struct dev_pm_ops {
 	int (*thaw_noirq)(struct device *dev);
 	int (*poweroff_noirq)(struct device *dev);
 	int (*restore_noirq)(struct device *dev);
+	int (*runtime_suspend)(struct device *dev);
+	int (*runtime_resume)(struct device *dev);
+	void (*runtime_idle)(struct device *dev);
 };
 
 /**
@@ -315,14 +344,81 @@ enum dpm_state {
 	DPM_OFF_IRQ,
 };
 
+/**
+ * Device run-time power management status.
+ *
+ * These status labels are used internally by the PM core to indicate the
+ * current status of a device with respect to the PM core operations.  They do
+ * not reflect the actual power state of the device or its status as seen by the
+ * driver.
+ *
+ * RPM_ACTIVE		Device is fully operational.  Indicates that the device
+ *			bus type's ->runtime_resume() callback has completed
+ *			successfully.
+ *
+ * RPM_SUSPENDED	Device bus type's ->runtime_suspend() callback has
+ *			completed successfully.  The device is regarded as
+ *			suspended.
+ *
+ * RPM_RESUMING		Device bus type's ->runtime_resume() callback is being
+ *			executed.
+ *
+ * RPM_SUSPENDING	Device bus type's ->runtime_suspend() callback is being
+ *			executed.
+ */
+
+enum rpm_status {
+	RPM_ACTIVE = 0,
+	RPM_RESUMING,
+	RPM_SUSPENDED,
+	RPM_SUSPENDING,
+};
+
+/**
+ * Device run-time power management request types.
+ *
+ * RPM_REQ_NONE		Do nothing.
+ *
+ * RPM_REQ_IDLE		Run the device bus type's ->runtime_idle() callback
+ *
+ * RPM_REQ_SUSPEND	Run the device bus type's ->runtime_suspend() callback
+ *
+ * RPM_REQ_RESUME	Run the device bus type's ->runtime_resume() callback
+ */
+
+enum rpm_request {
+	RPM_REQ_NONE = 0,
+	RPM_REQ_IDLE,
+	RPM_REQ_SUSPEND,
+	RPM_REQ_RESUME,
+};
+
 struct dev_pm_info {
 	pm_message_t		power_state;
-	unsigned		can_wakeup:1;
-	unsigned		should_wakeup:1;
+	unsigned int		can_wakeup:1;
+	unsigned int		should_wakeup:1;
 	enum dpm_state		status;		/* Owned by the PM core */
-#ifdef	CONFIG_PM_SLEEP
+#ifdef CONFIG_PM_SLEEP
 	struct list_head	entry;
 #endif
+#ifdef CONFIG_PM_RUNTIME
+	struct timer_list	suspend_timer;
+	unsigned long		timer_expires;
+	struct work_struct	work;
+	wait_queue_head_t	wait_queue;
+	spinlock_t		lock;
+	atomic_t		usage_count;
+	atomic_t		child_count;
+	unsigned int		ignore_children:1;
+	unsigned int		runtime_disabled:1;
+	unsigned int		runtime_failure:1;
+	unsigned int		idle_notification:1;
+	unsigned int		request_pending:1;
+	unsigned int		deferred_resume:1;
+	enum rpm_request	request;
+	enum rpm_status		runtime_status;
+	int			last_error;
+#endif
 };
 
 /*
Index: linux-2.6/drivers/base/power/Makefile
===================================================================
--- linux-2.6.orig/drivers/base/power/Makefile
+++ linux-2.6/drivers/base/power/Makefile
@@ -1,5 +1,6 @@
 obj-$(CONFIG_PM)	+= sysfs.o
 obj-$(CONFIG_PM_SLEEP)	+= main.o
+obj-$(CONFIG_PM_RUNTIME)	+= runtime.o
 obj-$(CONFIG_PM_TRACE_RTC)	+= trace.o
 
 ccflags-$(CONFIG_DEBUG_DRIVER) := -DDEBUG
Index: linux-2.6/drivers/base/power/runtime.c
===================================================================
--- /dev/null
+++ linux-2.6/drivers/base/power/runtime.c
@@ -0,0 +1,901 @@
+/*
+ * drivers/base/power/runtime.c - Helper functions for device run-time PM
+ *
+ * Copyright (c) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+ *
+ * This file is released under the GPLv2.
+ */
+
+#include <linux/sched.h>
+#include <linux/pm_runtime.h>
+#include <linux/jiffies.h>
+
+static int __pm_request_resume(struct device *dev);
+
+/**
+ * pm_runtime_deactivate_timer - Deactivate given device's suspend timer.
+ * @dev: Device to handle.
+ */
+static void pm_runtime_deactivate_timer(struct device *dev)
+{
+	if (dev->power.timer_expires > 0) {
+		del_timer(&dev->power.suspend_timer);
+		dev->power.timer_expires = 0;
+	}
+}
+
+/**
+ * pm_runtime_cancel_pending - Deactivate suspend timer and cancel requests.
+ * @dev: Device to handle.
+ */
+static void pm_runtime_cancel_pending(struct device *dev)
+{
+	pm_runtime_deactivate_timer(dev);
+	/*
+	 * If there's a request pending, make sure its work function will return
+	 * without doing anything.
+	 */
+	if (dev->power.request_pending)
+		dev->power.request = RPM_REQ_NONE;
+}
+
+/**
+ * __pm_runtime_idle - Notify device bus type if the device can be suspended.
+ * @dev: Device to notify the bus type about.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_runtime_idle(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_failure)
+		retval = -EINVAL;
+	else if (dev->power.idle_notification)
+		retval = -EINPROGRESS;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.runtime_disabled
+	    || dev->power.timer_expires > 0
+	    || dev->power.runtime_status == RPM_SUSPENDED
+	    || dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		return retval;
+
+	if (dev->power.request_pending) {
+		/*
+		 * If an idle notification request is pending, cancel it.  Any
+		 * other pending request takes precedence over us.
+		 */
+		if (dev->power.request == RPM_REQ_IDLE)
+			dev->power.request = RPM_REQ_NONE;
+		else if (dev->power.request != RPM_REQ_NONE)
+			return -EAGAIN;
+	}
+
+	dev->power.idle_notification = true;
+
+	spin_unlock_irq(&dev->power.lock);
+
+	if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle)
+		dev->bus->pm->runtime_idle(dev);
+
+	spin_lock_irq(&dev->power.lock);
+
+	dev->power.idle_notification = false;
+	wake_up_all(&dev->power.wait_queue);
+
+	return 0;
+}
+
+/**
+ * pm_runtime_idle - Notify device bus type if the device can be suspended.
+ * @dev: Device to notify the bus type about.
+ */
+int pm_runtime_idle(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_idle(dev);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_idle);
+
+/**
+ * __pm_runtime_suspend - Carry out run-time suspend of given device.
+ * @dev: Device to suspend.
+ * @from_wq: If set, the function has been called via pm_wq.
+ *
+ * Check if the device can be suspended and run the ->runtime_suspend() callback
+ * provided by its bus type.  If another suspend has been started earlier, wait
+ * for it to finish.  If there's an idle notification pending, cancel it.  If
+ * there's a suspend request scheduled while this function is running, cancel
+ * that request.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+int __pm_runtime_suspend(struct device *dev, bool from_wq)
+{
+	struct device *parent = NULL;
+	bool notify = false;
+	int retval = 0;
+
+ repeat:
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/* Pending resume requests take precedence over us. */
+		if (dev->power.request == RPM_REQ_RESUME)
+			return -EAGAIN;
+		/* Other pending requests need to be canceled. */
+		dev->power.request = RPM_REQ_NONE;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_RESUMING
+	    || dev->power.runtime_disabled
+	    || atomic_read(&dev->power.usage_count) > 0)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		return retval;
+
+	if (dev->power.runtime_status == RPM_SUSPENDING) {
+		DEFINE_WAIT(wait);
+
+		if (from_wq)
+			return -EINPROGRESS;
+
+		/* Wait for the other suspend running in parallel with us. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_SUSPENDING)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+		goto repeat;
+	}
+
+	dev->power.runtime_status = RPM_SUSPENDING;
+
+	spin_unlock_irq(&dev->power.lock);
+
+	retval = dev->bus && dev->bus->pm && dev->bus->pm->runtime_suspend ?
+		dev->bus->pm->runtime_suspend(dev) : -ENOSYS;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (retval) {
+		dev->power.runtime_status = RPM_ACTIVE;
+		pm_runtime_cancel_pending(dev);
+		dev->power.deferred_resume = false;
+
+		if (retval == -EAGAIN || retval == -EBUSY) {
+			notify = true;
+		} else {
+			dev->power.runtime_failure = true;
+			dev->power.last_error = retval;
+		}
+	} else {
+		dev->power.runtime_status = RPM_SUSPENDED;
+
+		if (dev->parent) {
+			parent = dev->parent;
+			atomic_add_unless(&parent->power.child_count, -1, 0);
+		}
+
+	}
+	wake_up_all(&dev->power.wait_queue);
+
+	if (dev->power.deferred_resume) {
+		__pm_request_resume(dev);
+		dev->power.deferred_resume = false;
+	}
+
+	spin_unlock_irq(&dev->power.lock);
+
+	if (parent && !parent->power.ignore_children)
+		pm_request_idle(parent);
+
+	if (notify)
+		pm_runtime_idle(dev);
+
+	spin_lock_irq(&dev->power.lock);
+
+	return retval;
+}
+
+/**
+ * pm_runtime_suspend - Carry out run-time suspend of given device.
+ * @dev: Device to suspend.
+ */
+int pm_runtime_suspend(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_suspend(dev, false);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_suspend);
+
+/**
+ * __pm_runtime_resume - Carry out run-time resume of given device.
+ * @dev: Device to resume.
+ * @from_wq: If set, the function has been called via pm_wq.
+ *
+ * Check if the device can be woken up and run the ->runtime_resume() callback
+ * provided by its bus type.  If another resume has been started earlier, wait
+ * for it to finish.  If there's a suspend running in parallel with this
+ * function, wait for it to finish and resume the device.  If there's a suspend
+ * request or idle notification pending, cancel it.  If there's a resume request
+ * scheduled while this function is running, cancel that request.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+int __pm_runtime_resume(struct device *dev, bool from_wq)
+{
+	struct device *parent = NULL;
+	int retval = 0;
+
+ repeat:
+	if (dev->power.runtime_failure)
+		return -ENODEV;
+
+	pm_runtime_cancel_pending(dev);
+
+	if (dev->power.runtime_status == RPM_ACTIVE)
+		retval = 1;
+	else if (dev->power.runtime_disabled)
+		retval = -EAGAIN;
+	if (retval)
+		return retval;
+
+	if (dev->power.runtime_status == RPM_RESUMING
+	    || dev->power.runtime_status == RPM_SUSPENDING) {
+		DEFINE_WAIT(wait);
+
+		if (from_wq) {
+			if (dev->power.runtime_status == RPM_SUSPENDING)
+				dev->power.deferred_resume = true;
+			return -EINPROGRESS;
+		}
+
+		/* Wait for the operation carried out in parallel with us. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_RESUMING
+			    && dev->power.runtime_status != RPM_SUSPENDING)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+		goto repeat;
+	}
+
+	if (!parent && dev->parent) {
+		/*
+		 * Increment the parent's usage counter and resume it if
+		 * necessary.
+		 */
+		spin_unlock_irq(&dev->power.lock);
+
+		parent = dev->parent;
+		retval = pm_runtime_get_sync(parent);
+		if (retval < 0)
+			goto out_parent;
+
+		spin_lock_irq(&dev->power.lock);
+		retval = 0;
+		goto repeat;
+	}
+
+	dev->power.runtime_status = RPM_RESUMING;
+
+	spin_unlock_irq(&dev->power.lock);
+
+	retval = dev->bus && dev->bus->pm && dev->bus->pm->runtime_resume ?
+		dev->bus->pm->runtime_resume(dev) : -ENOSYS;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (retval) {
+		dev->power.runtime_status = RPM_SUSPENDED;
+
+		dev->power.runtime_failure = true;
+		dev->power.last_error = retval;
+
+		pm_runtime_cancel_pending(dev);
+	} else {
+		dev->power.runtime_status = RPM_ACTIVE;
+
+		if (parent)
+			atomic_inc(&parent->power.child_count);
+	}
+	wake_up_all(&dev->power.wait_queue);
+
+	spin_unlock_irq(&dev->power.lock);
+
+ out_parent:
+	if (parent)
+		pm_runtime_put(parent);
+
+	if (!retval)
+		pm_request_idle(dev);
+
+	spin_lock_irq(&dev->power.lock);
+
+	return retval;
+}
+
+/**
+ * pm_runtime_resume - Carry out run-time resume of given device.
+ * @dev: Device to resume.
+ */
+int pm_runtime_resume(struct device *dev)
+{
+	int retval;
+
+	spin_lock_irq(&dev->power.lock);
+	retval = __pm_runtime_resume(dev, false);
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_resume);
+
+/**
+ * pm_runtime_work - Universal run-time PM work function.
+ * @work: Work structure used for scheduling the execution of this function.
+ *
+ * Use @work to get the device object the work is to be done for, determine what
+ * is to be done and execute the appropriate run-time PM function.
+ */
+static void pm_runtime_work(struct work_struct *work)
+{
+	struct device *dev = container_of(work, struct device, power.work);
+	enum rpm_request req;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (!dev->power.request_pending)
+		goto out;
+
+	req = dev->power.request;
+	dev->power.request = RPM_REQ_NONE;
+	dev->power.request_pending = false;
+
+	switch (req) {
+	case RPM_REQ_NONE:
+		break;
+	case RPM_REQ_IDLE:
+		__pm_runtime_idle(dev);
+		break;
+	case RPM_REQ_SUSPEND:
+		__pm_runtime_suspend(dev, true);
+		break;
+	case RPM_REQ_RESUME:
+		__pm_runtime_resume(dev, true);
+		break;
+	}
+
+ out:
+	spin_unlock_irq(&dev->power.lock);
+}
+
+/**
+ * pm_request_idle - Submit an idle notification request for given device.
+ * @dev: Device to handle.
+ *
+ * Check if the device's run-time PM status is correct for suspending the device
+ * and queue up a request to run __pm_runtime_idle() for it.
+ */
+int pm_request_idle(struct device *dev)
+{
+	unsigned long flags;
+	int retval = 0;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (dev->power.runtime_failure)
+		retval = -EINVAL;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.runtime_disabled
+	    || dev->power.timer_expires > 0
+	    || dev->power.runtime_status == RPM_SUSPENDED
+	    || dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		goto out;
+
+	if (dev->power.request_pending && dev->power.request != RPM_REQ_NONE) {
+		/* Any requests other than RPM_REQ_IDLE take precedence. */
+		if (dev->power.request != RPM_REQ_IDLE)
+			retval = -EAGAIN;
+		goto out;
+	}
+
+	dev->power.request = RPM_REQ_IDLE;
+	if (dev->power.request_pending)
+		goto out;
+
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_request_idle);
+
+/**
+ * __pm_request_suspend - Submit a suspend request for given device.
+ * @dev: Device to suspend.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+static int __pm_request_suspend(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.runtime_disabled)
+		retval = -EAGAIN;
+	else if (dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EINPROGRESS;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/*
+		 * Pending resume requests take precedence over us, but we can
+		 * overtake any other pending request.
+		 */
+		if (dev->power.request == RPM_REQ_RESUME)
+			retval = -EAGAIN;
+		else if (dev->power.request != RPM_REQ_SUSPEND)
+			dev->power.request = retval ?
+						RPM_REQ_NONE : RPM_REQ_SUSPEND;
+
+		if (dev->power.request == RPM_REQ_SUSPEND)
+			return 0;
+	}
+
+	if (retval)
+		return retval;
+
+	dev->power.request = RPM_REQ_SUSPEND;
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+	return 0;
+}
+
+/**
+ * pm_suspend_timer_fn - Timer function for pm_schedule_suspend().
+ * @data: Device pointer passed by pm_schedule_suspend().
+ *
+ * Check if the time is right and execute __pm_request_suspend() in that case.
+ */
+static void pm_suspend_timer_fn(unsigned long data)
+{
+	struct device *dev = (struct device *)data;
+	unsigned long flags;
+	unsigned long expires;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	expires = dev->power.timer_expires;
+	/* If 'expires' is after 'jiffies' we've been called too early. */
+	if (expires > 0 && !time_after(expires, jiffies)) {
+		dev->power.timer_expires = 0;
+		__pm_request_suspend(dev);
+	}
+
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+}
+
+/**
+ * pm_schedule_suspend - Set up a timer to submit a suspend request in future.
+ * @dev: Device to suspend.
+ * @delay: Time to wait before submitting a suspend request, in milliseconds.
+ */
+int pm_schedule_suspend(struct device *dev, unsigned int delay)
+{
+	unsigned long flags;
+	int retval = 0;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (dev->power.runtime_failure) {
+		retval = -EINVAL;
+		goto out;
+	}
+
+	if (!delay) {
+		retval = __pm_request_suspend(dev);
+		goto out;
+	}
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/*
+		 * Pending resume requests take precedence over us, but any
+		 * other pending requests have to be canceled.
+		 */
+		if (dev->power.request == RPM_REQ_RESUME) {
+			retval = -EAGAIN;
+			goto out;
+		}
+		dev->power.request = RPM_REQ_NONE;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDED)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_SUSPENDING)
+		retval = -EINPROGRESS;
+	else if (atomic_read(&dev->power.usage_count) > 0
+	    || dev->power.runtime_disabled)
+		retval = -EAGAIN;
+	else if (!pm_children_suspended(dev))
+		retval = -EBUSY;
+	if (retval)
+		goto out;
+
+	dev->power.timer_expires = jiffies + msecs_to_jiffies(delay);
+	mod_timer(&dev->power.suspend_timer, dev->power.timer_expires);
+
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_schedule_suspend);
+
+/**
+ * __pm_request_resume - Submit a resume request for given device.
+ * @dev: Device to resume.
+ *
+ * This function must be called under dev->power.lock with interrupts disabled.
+ */
+int __pm_request_resume(struct device *dev)
+{
+	int retval = 0;
+
+	if (dev->power.runtime_failure)
+		return -EINVAL;
+
+	if (dev->power.runtime_status == RPM_ACTIVE)
+		retval = 1;
+	else if (dev->power.runtime_status == RPM_RESUMING)
+		retval = -EINPROGRESS;
+	else if (dev->power.runtime_disabled)
+		retval = -EAGAIN;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		/* If non-resume request is pending, we can overtake it. */
+		dev->power.request = retval ? RPM_REQ_NONE : RPM_REQ_RESUME;
+		/* There's nothing to do if resume request is pending. */
+		if (dev->power.request == RPM_REQ_RESUME)
+			return 0;
+	}
+
+	if (retval)
+		return retval;
+
+	dev->power.request = RPM_REQ_RESUME;
+	dev->power.request_pending = true;
+	queue_work(pm_wq, &dev->power.work);
+
+	return retval;
+}
+
+/**
+ * pm_request_resume - Submit a resume request for given device.
+ * @dev: Device to resume.
+ */
+int pm_request_resume(struct device *dev)
+{
+	unsigned long flags;
+	int retval;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+	retval = __pm_request_resume(dev);
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_request_resume);
+
+/**
+ * __pm_runtime_set_status - Set run-time PM status of a device.
+ * @dev: Device to handle.
+ * @status: New run-time PM status of the device.
+ *
+ * If run-time PM of the device is disabled or its power.runtime_failure flag is
+ * set, the status may be changed either to RPM_ACTIVE, or to RPM_SUSPENDED, as
+ * long as that reflects the actual state of the device.  However, if the device
+ * has a parent and the parent is not active, and the parent's
+ * power.ignore_children flag is unset, the device's status cannot be set to
+ * RPM_ACTIVE, so -EBUSY is returned in that case.
+ *
+ * If successful, __pm_runtime_set_status() clears the power.runtime_failure
+ * flag and the device parent's counter of unsuspended children is modified to
+ * reflect the new status.
+ */
+int __pm_runtime_set_status(struct device *dev, unsigned int status)
+{
+	struct device *parent = dev->parent;
+	unsigned long flags;
+	int error = 0;
+
+	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
+		return -EINVAL;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (!dev->power.runtime_failure && !dev->power.runtime_disabled)
+		goto out;
+
+	if (dev->power.runtime_status == status)
+		goto out_clear;
+
+	if (status == RPM_SUSPENDED) {
+		/* It always is possible to set the status to 'suspended'. */
+		if (parent)
+			atomic_add_unless(&parent->power.child_count, -1, 0);
+		dev->power.runtime_status = status;
+		goto out_clear;
+	}
+
+	if (parent) {
+		spin_lock_irq(&parent->power.lock);
+
+		/*
+		 * It may be invalid to put an active child under a suspended
+		 * parent.
+		 */
+		if (parent->power.runtime_status == RPM_ACTIVE
+		    || parent->power.ignore_children) {
+			if (dev->power.runtime_status == RPM_SUSPENDED)
+				atomic_inc(&parent->power.child_count);
+			dev->power.runtime_status = status;
+		} else {
+			error = -EBUSY;
+		}
+
+		spin_unlock_irq(&parent->power.lock);
+
+		if (error)
+			goto out;
+	} else {
+		dev->power.runtime_status = status;
+	}
+
+ out_clear:
+	dev->power.runtime_failure = false;
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return error;
+}
+EXPORT_SYMBOL_GPL(__pm_runtime_set_status);
+
+/**
+ * pm_runtime_enable - Enable run-time PM of a device.
+ * @dev: Device to handle.
+ */
+void pm_runtime_enable(struct device *dev)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	if (!dev->power.runtime_disabled)
+		goto out;
+
+	if (atomic_dec_and_test(&dev->power.usage_count))
+		dev->power.runtime_disabled = false;
+
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+}
+EXPORT_SYMBOL_GPL(pm_runtime_enable);
+
+/**
+ * pm_runtime_disable - Disable run-time PM of a device.
+ * @dev: Device to handle.
+ *
+ * Set the power.runtime_disabled flag for the device, cancel all pending
+ * run-time PM requests for it and wait for operations in progress to complete.
+ * The device can be either active or suspended after its run-time PM has been
+ * disabled.
+ *
+ * If there's a resume request pending when pm_runtime_disable() is called, it
+ * resumes the device before disabling its run-time PM and returns -EBUSY.
+ * Otherwise, 0 is returned.
+ */
+int pm_runtime_disable(struct device *dev)
+{
+	int retval = 0;
+
+	spin_lock_irq(&dev->power.lock);
+
+	atomic_inc(&dev->power.usage_count);
+
+	if (dev->power.runtime_disabled)
+		goto out;
+
+	/*
+	 * Wake up the device if there's a resume request pending, because that
+	 * means there probably is some I/O to process and we shouldn't prevent
+	 * the device from processing the I/O.
+	 */
+	if (dev->power.request_pending
+	    && dev->power.request == RPM_REQ_RESUME) {
+		__pm_runtime_resume(dev, false);
+		retval = -EBUSY;
+	}
+
+	dev->power.runtime_disabled = true;
+
+	if (dev->power.runtime_failure)
+		goto out;
+
+	pm_runtime_deactivate_timer(dev);
+
+	if (dev->power.request_pending) {
+		dev->power.request = RPM_REQ_NONE;
+
+		spin_unlock_irq(&dev->power.lock);
+
+		cancel_work_sync(&dev->power.work);
+
+		spin_lock_irq(&dev->power.lock);
+
+		dev->power.request_pending = false;
+	}
+
+	if (dev->power.runtime_status == RPM_SUSPENDING
+	    || dev->power.runtime_status == RPM_RESUMING) {
+		DEFINE_WAIT(wait);
+
+		/* Suspend or wake-up in progress. */
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (dev->power.runtime_status != RPM_SUSPENDING
+			    && dev->power.runtime_status != RPM_RESUMING)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+	}
+
+	if (dev->power.idle_notification) {
+		DEFINE_WAIT(wait);
+
+		for (;;) {
+			prepare_to_wait(&dev->power.wait_queue, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (!dev->power.idle_notification)
+				break;
+
+			spin_unlock_irq(&dev->power.lock);
+
+			schedule();
+
+			spin_lock_irq(&dev->power.lock);
+		}
+		finish_wait(&dev->power.wait_queue, &wait);
+	}
+
+ out:
+	spin_unlock_irq(&dev->power.lock);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_disable);
+
+/**
+ * pm_runtime_init - Initialize run-time PM fields in given device object.
+ * @dev: Device object to initialize.
+ */
+void pm_runtime_init(struct device *dev)
+{
+	spin_lock_init(&dev->power.lock);
+
+	dev->power.runtime_status = RPM_ACTIVE;
+	dev->power.idle_notification = false;
+
+	dev->power.runtime_disabled = true;
+	atomic_set(&dev->power.usage_count, 1);
+
+	dev->power.runtime_failure = false;
+	dev->power.last_error = 0;
+
+	atomic_set(&dev->power.child_count, 0);
+	pm_suspend_ignore_children(dev, false);
+
+	dev->power.request_pending = false;
+	dev->power.request = RPM_REQ_NONE;
+	dev->power.deferred_resume = false;
+	INIT_WORK(&dev->power.work, pm_runtime_work);
+
+	dev->power.timer_expires = 0;
+	dev->power.suspend_timer.expires = jiffies;
+	dev->power.suspend_timer.data = (unsigned long)dev;
+	dev->power.suspend_timer.function = pm_suspend_timer_fn;
+
+	init_waitqueue_head(&dev->power.wait_queue);
+}
+
+/**
+ * pm_runtime_add - Update run-time PM fields of a device while adding it.
+ * @dev: Device object being added to device hierarchy.
+ */
+void pm_runtime_add(struct device *dev)
+{
+	if (dev->parent)
+		atomic_inc(&dev->parent->power.child_count);
+}
+
+/**
+ * pm_runtime_remove - Prepare for removing a device from device hierarchy.
+ * @dev: Device object being removed from device hierarchy.
+ */
+void pm_runtime_remove(struct device *dev)
+{
+	struct device *parent = dev->parent;
+
+	pm_runtime_disable(dev);
+
+	if (dev->power.runtime_status != RPM_SUSPENDED && parent) {
+		atomic_add_unless(&parent->power.child_count, -1, 0);
+		if (!parent->power.ignore_children)
+			pm_request_idle(parent);
+	}
+}
Index: linux-2.6/include/linux/pm_runtime.h
===================================================================
--- /dev/null
+++ linux-2.6/include/linux/pm_runtime.h
@@ -0,0 +1,105 @@
+/*
+ * pm_runtime.h - Device run-time power management helper functions.
+ *
+ * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * This file is released under the GPLv2.
+ */
+
+#ifndef _LINUX_PM_RUNTIME_H
+#define _LINUX_PM_RUNTIME_H
+
+#include <linux/device.h>
+#include <linux/pm.h>
+
+#ifdef CONFIG_PM_RUNTIME
+
+extern struct workqueue_struct *pm_wq;
+
+extern void pm_runtime_init(struct device *dev);
+extern void pm_runtime_add(struct device *dev);
+extern void pm_runtime_remove(struct device *dev);
+extern int pm_runtime_idle(struct device *dev);
+extern int pm_runtime_suspend(struct device *dev);
+extern int pm_runtime_resume(struct device *dev);
+extern int pm_request_idle(struct device *dev);
+extern int pm_schedule_suspend(struct device *dev, unsigned int delay);
+extern int pm_request_resume(struct device *dev);
+extern int __pm_runtime_set_status(struct device *dev, unsigned int status);
+extern void pm_runtime_enable(struct device *dev);
+extern int pm_runtime_disable(struct device *dev);
+
+static inline bool pm_children_suspended(struct device *dev)
+{
+	return dev->power.ignore_children
+		|| !atomic_read(&dev->power.child_count);
+}
+
+static inline void pm_suspend_ignore_children(struct device *dev, bool enable)
+{
+	dev->power.ignore_children = enable;
+}
+
+static inline int pm_runtime_get(struct device *dev)
+{
+	atomic_inc(&dev->power.usage_count);
+	return pm_request_resume(dev);
+}
+
+static inline int pm_runtime_get_sync(struct device *dev)
+{
+	atomic_inc(&dev->power.usage_count);
+	return pm_runtime_resume(dev);
+}
+
+static inline int pm_runtime_put(struct device *dev)
+{
+	atomic_add_unless(&dev->power.usage_count, -1, 0);
+	return pm_request_idle(dev);
+}
+
+static inline int pm_runtime_put_sync(struct device *dev)
+{
+	atomic_add_unless(&dev->power.usage_count, -1, 0);
+	return pm_runtime_idle(dev);
+}
+
+#else /* !CONFIG_PM_RUNTIME */
+
+static inline void pm_runtime_init(struct device *dev) {}
+static inline void pm_runtime_add(struct device *dev) {}
+static inline void pm_runtime_remove(struct device *dev) {}
+static inline int pm_runtime_idle(struct device *dev) { return -ENOSYS; }
+static inline int pm_runtime_suspend(struct device *dev) { return -ENOSYS; }
+static inline int pm_runtime_resume(struct device *dev) { return 0; }
+static inline int pm_request_idle(struct device *dev) { return -ENOSYS; }
+static inline int pm_schedule_suspend(struct device *dev, unsigned int delay)
+{
+	return -ENOSYS;
+}
+static inline int pm_request_resume(struct device *dev) { return 0; }
+static inline int __pm_runtime_set_status(struct device *dev,
+					    unsigned int status) { return 0; }
+static inline void pm_runtime_enable(struct device *dev) {}
+static inline int pm_runtime_disable(struct device *dev) { return 0; }
+
+static inline bool pm_children_suspended(struct device *dev) { return false; }
+static inline void pm_suspend_ignore_children(struct device *dev, bool en) {}
+static inline int pm_runtime_get(struct device *dev) { return 0; }
+static inline int pm_runtime_get_sync(struct device *dev) { return 0; }
+static inline int pm_runtime_put(struct device *dev) { return -ENOSYS; }
+static inline int pm_runtime_put_sync(struct device *dev) { return -ENOSYS; }
+
+#endif /* !CONFIG_PM_RUNTIME */
+
+static inline int pm_runtime_set_active(struct device *dev)
+{
+	return __pm_runtime_set_status(dev, RPM_ACTIVE);
+}
+
+static inline void pm_runtime_set_suspended(struct device *dev)
+{
+	__pm_runtime_set_status(dev, RPM_SUSPENDED);
+}
+
+#endif
Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -21,6 +21,7 @@
 #include <linux/kallsyms.h>
 #include <linux/mutex.h>
 #include <linux/pm.h>
+#include <linux/pm_runtime.h>
 #include <linux/resume-trace.h>
 #include <linux/rwsem.h>
 #include <linux/interrupt.h>
@@ -49,6 +50,16 @@ static DEFINE_MUTEX(dpm_list_mtx);
 static bool transition_started;
 
 /**
+ * device_pm_init - Initialize the PM-related part of a device object
+ * @dev: Device object to initialize.
+ */
+void device_pm_init(struct device *dev)
+{
+	dev->power.status = DPM_ON;
+	pm_runtime_init(dev);
+}
+
+/**
  *	device_pm_lock - lock the list of active devices used by the PM core
  */
 void device_pm_lock(void)
@@ -89,6 +100,8 @@ void device_pm_add(struct device *dev)
 
 	list_add_tail(&dev->power.entry, &dpm_list);
 	mutex_unlock(&dpm_list_mtx);
+
+	pm_runtime_add(dev);
 }
 
 /**
@@ -105,6 +118,8 @@ void device_pm_remove(struct device *dev
 	mutex_lock(&dpm_list_mtx);
 	list_del_init(&dev->power.entry);
 	mutex_unlock(&dpm_list_mtx);
+
+	pm_runtime_remove(dev);
 }
 
 /**
@@ -510,6 +525,7 @@ static void dpm_complete(pm_message_t st
 			mutex_unlock(&dpm_list_mtx);
 
 			device_complete(dev, state);
+			pm_runtime_enable(dev);
 
 			mutex_lock(&dpm_list_mtx);
 		}
@@ -755,11 +771,14 @@ static int dpm_prepare(pm_message_t stat
 		dev->power.status = DPM_PREPARING;
 		mutex_unlock(&dpm_list_mtx);
 
-		error = device_prepare(dev, state);
+		error = pm_runtime_disable(dev);
+		if (!error || !device_may_wakeup(dev))
+			error = device_prepare(dev, state);
 
 		mutex_lock(&dpm_list_mtx);
 		if (error) {
 			dev->power.status = DPM_ON;
+			pm_runtime_enable(dev);
 			if (error == -EAGAIN) {
 				put_device(dev);
 				error = 0;
Index: linux-2.6/drivers/base/dd.c
===================================================================
--- linux-2.6.orig/drivers/base/dd.c
+++ linux-2.6/drivers/base/dd.c
@@ -23,6 +23,7 @@
 #include <linux/kthread.h>
 #include <linux/wait.h>
 #include <linux/async.h>
+#include <linux/pm_runtime.h>
 
 #include "base.h"
 #include "power/power.h"
@@ -202,7 +203,10 @@ int driver_probe_device(struct device_dr
 	pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
 		 drv->bus->name, __func__, dev_name(dev), drv->name);
 
-	ret = really_probe(dev, drv);
+	ret = pm_runtime_get_sync(dev);
+	if (ret >= 0)
+		ret = really_probe(dev, drv);
+	pm_runtime_put(dev);
 
 	return ret;
 }
@@ -306,6 +310,8 @@ static void __device_release_driver(stru
 
 	drv = dev->driver;
 	if (drv) {
+		pm_runtime_disable(dev);
+
 		driver_sysfs_remove(dev);
 
 		if (dev->bus)
@@ -324,6 +330,8 @@ static void __device_release_driver(stru
 			blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
 						     BUS_NOTIFY_UNBOUND_DRIVER,
 						     dev);
+
+		pm_runtime_enable(dev);
 	}
 }
 
Index: linux-2.6/drivers/base/power/power.h
===================================================================
--- linux-2.6.orig/drivers/base/power/power.h
+++ linux-2.6/drivers/base/power/power.h
@@ -1,8 +1,3 @@
-static inline void device_pm_init(struct device *dev)
-{
-	dev->power.status = DPM_ON;
-}
-
 #ifdef CONFIG_PM_SLEEP
 
 /*
@@ -16,14 +11,16 @@ static inline struct device *to_device(s
 	return container_of(entry, struct device, power.entry);
 }
 
+extern void device_pm_init(struct device *dev);
 extern void device_pm_add(struct device *);
 extern void device_pm_remove(struct device *);
 extern void device_pm_move_before(struct device *, struct device *);
 extern void device_pm_move_after(struct device *, struct device *);
 extern void device_pm_move_last(struct device *);
 
-#else /* CONFIG_PM_SLEEP */
+#else /* !CONFIG_PM_SLEEP */
 
+static inline void device_pm_init(struct device *dev) {}
 static inline void device_pm_add(struct device *dev) {}
 static inline void device_pm_remove(struct device *dev) {}
 static inline void device_pm_move_before(struct device *deva,
@@ -32,7 +29,7 @@ static inline void device_pm_move_after(
 					struct device *devb) {}
 static inline void device_pm_move_last(struct device *dev) {}
 
-#endif
+#endif /* !CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM
 

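The pairing of pm_runtime_disable() and pm_runtime_enable() in the patch above relies on power.usage_count: pm_runtime_init() starts with the flag set and the counter at 1, every disable call increments the counter, and an enable call clears power.runtime_disabled only when it drops the counter to zero. The following is a minimal user-space sketch of that counting only. The struct and the rpm_model_*() names are hypothetical and not part of the patch, and locking, pending requests and the wait loops are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the run-time PM fields of a device. */
struct rpm_model {
	int usage_count;       /* models power.usage_count (atomic_t in the patch) */
	bool runtime_disabled; /* models power.runtime_disabled */
};

/* Follows pm_runtime_init(): start disabled, holding one reference. */
static void rpm_model_init(struct rpm_model *m)
{
	m->runtime_disabled = true;
	m->usage_count = 1;
}

/* Follows the counting in pm_runtime_disable(): every call takes a reference. */
static void rpm_model_disable(struct rpm_model *m)
{
	m->usage_count++;
	m->runtime_disabled = true;
}

/*
 * Follows pm_runtime_enable(): a no-op if run-time PM is already enabled,
 * otherwise only the call that drops the last reference clears the flag.
 */
static void rpm_model_enable(struct rpm_model *m)
{
	if (!m->runtime_disabled)
		return;

	if (--m->usage_count == 0)
		m->runtime_disabled = false;
}
```

In this model two nested disable calls need two enable calls before the flag is cleared, which appears to be what lets dpm_prepare()/dpm_complete() and __device_release_driver() nest safely. Note that pm_runtime_get() and pm_runtime_get_sync() increment the same counter in the patch, an interaction this sketch does not model.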

end of thread, other threads:[~2009-07-13  1:51 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-07-06  0:52 [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8) Rafael J. Wysocki
2009-07-07 15:12 ` Magnus Damm
2009-07-07 15:12 ` Magnus Damm
2009-07-07 15:12   ` Magnus Damm
2009-07-07 22:07   ` Rafael J. Wysocki
2009-07-08  2:54     ` Alan Stern
2009-07-08  4:40       ` Magnus Damm
2009-07-08  4:40       ` Magnus Damm
2009-07-08  4:40         ` Magnus Damm
2009-07-08 14:26         ` Alan Stern
2009-07-08 14:26           ` Alan Stern
2009-07-08 17:50           ` [update][RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 9) Rafael J. Wysocki
2009-07-08 17:50           ` Rafael J. Wysocki
2009-07-08 14:26         ` [RFC][PATCH] PM: Introduce core framework for run-time PM of I/O devices (rev. 8) Alan Stern
2009-07-08  2:54     ` Alan Stern
2009-07-08  5:45     ` Magnus Damm
2009-07-08  5:45     ` Magnus Damm
2009-07-08  5:45       ` Magnus Damm
2009-07-08 19:01       ` Rafael J. Wysocki
2009-07-08 19:42         ` Alan Stern
2009-07-08 19:42         ` Alan Stern
2009-07-08 19:55           ` Rafael J. Wysocki
2009-07-08 21:09             ` Alan Stern
2009-07-08 21:29               ` Rafael J. Wysocki
2009-07-08 21:29               ` Rafael J. Wysocki
2009-07-08 21:29               ` Rafael J. Wysocki
2009-07-08 21:29               ` Rafael J. Wysocki
2009-07-08 21:09             ` Alan Stern
2009-07-08 19:55           ` Rafael J. Wysocki
2009-07-09  2:52           ` Magnus Damm
2009-07-09  2:52           ` Magnus Damm
2009-07-09  2:52             ` Magnus Damm
2009-07-09 13:48             ` Alan Stern
2009-07-09 15:31               ` Magnus Damm
2009-07-09 15:31               ` Magnus Damm
2009-07-09 15:31                 ` Magnus Damm
2009-07-09 21:56                 ` Mahalingam, Nithish
2009-07-09 21:56                 ` [linux-pm] " Mahalingam, Nithish
2009-07-09 21:56                   ` Mahalingam, Nithish
2009-07-11 11:08                   ` Rafael J. Wysocki
2009-07-11 11:08                   ` [linux-pm] " Rafael J. Wysocki
2009-07-12  2:05                     ` Mahalingam, Nithish
2009-07-12  2:05                     ` [linux-pm] " Mahalingam, Nithish
2009-07-13  1:42                   ` Magnus Damm
2009-07-13  1:42                     ` [linux-pm] " Magnus Damm
2009-07-09 13:48             ` Alan Stern
2009-07-08 19:01       ` Rafael J. Wysocki
2009-07-07 22:07   ` Rafael J. Wysocki
2009-07-09 23:22 ` Pavel Machek
2009-07-09 23:22   ` Pavel Machek
  -- strict thread matches above, loose matches on Subject: below --
2009-07-06  0:52 Rafael J. Wysocki