* [RFC][PATCH 0/3] PM: Mechanism to avoid resuming runtime-suspended devices during system suspend
@ 2014-01-14 23:12 Rafael J. Wysocki
  2014-01-14 23:13 ` [RFC][PATCH 1/3] PM / sleep: Flag to avoid executing suspend callbacks for devices Rafael J. Wysocki
                   ` (3 more replies)
  0 siblings, 4 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-01-14 23:12 UTC
  To: Linux PM list
  Cc: Alan Stern, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

Hi,

The following experimental series of 3 patches implements a mechanism allowing
subsystems to avoid resuming runtime-suspended devices during system suspend.

As far as the PM core goes, it introduces a new flag, power.no_suspend, that
will be set by the core for devices which can stay suspended.  The idea is that
subsystems should know which devices can stay suspended over system suspend
and, to allow them to tell the core about that, patch [1/3] changes the calling
convention of the device PM .prepare() callback so that it can return a positive
value on success, to be interpreted as "this device has been runtime-suspended
and doesn't need to be resumed".  If .prepare() returns a positive
number for a given device, the core will set power.no_suspend and will not run
suspend callbacks for devices with that flag set going forward (during this
particular system suspend transition).
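
To illustrate the new .prepare() convention, here is a minimal user-space
sketch, not kernel code.  struct toy_device, toy_device_prepare() and the
three example callbacks are made-up names; only the handling of the return
value mirrors what patch [1/3] does at the end of device_prepare().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device; not a kernel type. */
struct toy_device {
	bool no_suspend;                        /* models power.no_suspend */
	int (*prepare)(struct toy_device *dev); /* models the .prepare() callback */
};

/*
 * Models the return-value handling of device_prepare() in patch [1/3]:
 * a negative value is an error, a positive value sets the flag and is
 * reported to the caller as plain success (0).
 */
static int toy_device_prepare(struct toy_device *dev)
{
	int ret = 0;

	assert(dev != NULL);
	dev->no_suspend = false;	/* reset for every suspend transition */

	if (dev->prepare)
		ret = dev->prepare(dev);

	if (ret > 0) {
		dev->no_suspend = true;
		ret = 0;
	}
	return ret;
}

/* Example callbacks (hypothetical); -16 is just an arbitrary error code. */
static int toy_prepare_stay_suspended(struct toy_device *dev) { (void)dev; return 1; }
static int toy_prepare_needs_suspend(struct toy_device *dev) { (void)dev; return 0; }
static int toy_prepare_fails(struct toy_device *dev) { (void)dev; return -16; }
```

Callers of toy_device_prepare() never see the positive value; it only ends up
in the flag, which is how existing callers can stay unchanged.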

However, parents may generally need to be resumed so that the suspend of their
children can be carried out, so the PM core will clear power.no_suspend for
the parents of devices whose power.no_suspend is not set (unless those parents
have power.ignore_children set).
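
The parent rule can be sketched in user space as follows.  This is only a
model under simplifying assumptions: struct toy_node and toy_suspend_one()
are hypothetical names, and the sketch relies on the caller suspending
children before their parents, as the PM core does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device with only the relevant flags. */
struct toy_node {
	struct toy_node *parent;
	bool no_suspend;       /* models power.no_suspend */
	bool ignore_children;  /* models power.ignore_children */
	bool suspend_ran;      /* records whether a suspend callback "ran" */
};

/*
 * Models one __device_suspend() step: skip the callback if the flag is
 * set, then clear the parent's flag if this device had to be suspended
 * normally and the parent does not ignore its children.
 */
static void toy_suspend_one(struct toy_node *dev)
{
	assert(dev != NULL);

	if (!dev->no_suspend)
		dev->suspend_ran = true;

	if (dev->parent && !dev->parent->ignore_children && !dev->no_suspend)
		dev->parent->no_suspend = false;
}
```

Because the child is handled first, the parent's flag has already been
cleared by the time toy_suspend_one() is called for the parent itself.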

Patch [2/3] adds a new runtime PM helper function that subsystems can use to
check whether or not a given device is runtime-suspended when .prepare() is being
executed for it.
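
As a rough user-space model of what that helper checks (the real one is
pm_runtime_enabled_and_suspended() in patch [2/3], which also takes
power.lock), something like the sketch below.  The toy_* names and the enum
are hypothetical, and "waiting" for a pending transition is modeled simply by
completing it on the spot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the runtime PM status values. */
enum toy_rpm_status {
	TOY_RPM_ACTIVE,
	TOY_RPM_RESUMING,
	TOY_RPM_SUSPENDED,
	TOY_RPM_SUSPENDING,
};

struct toy_rpm_dev {
	unsigned int disable_depth;   /* nonzero means runtime PM is disabled */
	enum toy_rpm_status status;
};

/* Assumption: an in-flight transition simply runs to completion. */
static void toy_rpm_barrier(struct toy_rpm_dev *dev)
{
	if (dev->status == TOY_RPM_SUSPENDING)
		dev->status = TOY_RPM_SUSPENDED;
	else if (dev->status == TOY_RPM_RESUMING)
		dev->status = TOY_RPM_ACTIVE;
}

/* False if runtime PM is disabled; otherwise true only if suspended. */
static bool toy_enabled_and_suspended(struct toy_rpm_dev *dev)
{
	assert(dev != NULL);

	if (dev->disable_depth)
		return false;

	toy_rpm_barrier(dev);
	return dev->status == TOY_RPM_SUSPENDED;
}
```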

Patch [3/3] implements the subsystem part for the ACPI PM domain, because that
is relatively straightforward.  If the general approach makes sense, I'll think
about doing the same for PCI.

Comments welcome!

Thanks,
Rafael



* [RFC][PATCH 1/3] PM / sleep: Flag to avoid executing suspend callbacks for devices
  2014-01-14 23:12 [RFC][PATCH 0/3] PM: Mechanism to avoid resuming runtime-suspended devices during system suspend Rafael J. Wysocki
@ 2014-01-14 23:13 ` Rafael J. Wysocki
  2014-01-14 23:14 ` [RFC][PATCH 2/3] PM / runtime: Routine for checking device status during system suspend Rafael J. Wysocki
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-01-14 23:13 UTC
  To: Linux PM list
  Cc: Alan Stern, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
resume all runtime-suspended devices during system suspend, mostly
because those devices may need to be reprogrammed due to different
wakeup settings for system sleep and runtime PM.  However, at least
in some cases that isn't really necessary, because the wakeup
settings aren't really different.

The idea here is that subsystems should know whether or not it is
necessary to reprogram a given device during system suspend and they
should be able to tell the PM core about that.  For this reason,
modify the PM core so that if the .prepare() callback returns a
positive value for a given device, the core will set a new
power.no_suspend flag for it.  Then, if that flag is set, the core
will skip all of the subsequent suspend callbacks for that device.

However, since parents may need to be resumed so that their children
can be reprogrammed, make the PM core clear power.no_suspend for
devices that don't have power.ignore_children set and whose
children don't have power.no_suspend set.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/power/main.c |   38 ++++++++++++++++++++++++++------------
 include/linux/pm.h        |    1 +
 2 files changed, 27 insertions(+), 12 deletions(-)

Index: linux-pm/drivers/base/power/main.c
===================================================================
--- linux-pm.orig/drivers/base/power/main.c
+++ linux-pm/drivers/base/power/main.c
@@ -918,7 +918,7 @@ static int device_suspend_noirq(struct d
 	pm_callback_t callback = NULL;
 	char *info = NULL;
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || dev->power.no_suspend)
 		return 0;
 
 	if (dev->pm_domain) {
@@ -1006,7 +1006,7 @@ static int device_suspend_late(struct de
 
 	__pm_runtime_disable(dev, false);
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || dev->power.no_suspend)
 		return 0;
 
 	if (dev->pm_domain) {
@@ -1143,8 +1143,10 @@ static int __device_suspend(struct devic
 	 * for it, this is equivalent to the device signaling wakeup, so the
 	 * system suspend operation should be aborted.
 	 */
-	if (pm_runtime_barrier(dev) && device_may_wakeup(dev))
+	if (pm_runtime_barrier(dev) && device_may_wakeup(dev)) {
 		pm_wakeup_event(dev, 0);
+		dev->power.no_suspend = false;
+	}
 
 	if (pm_wakeup_pending()) {
 		async_error = -EBUSY;
@@ -1157,6 +1159,9 @@ static int __device_suspend(struct devic
 	dpm_watchdog_set(&wd, dev);
 	device_lock(dev);
 
+	if (dev->power.no_suspend)
+		goto End;
+
 	if (dev->pm_domain) {
 		info = "power domain ";
 		callback = pm_op(&dev->pm_domain->ops, state);
@@ -1205,9 +1210,13 @@ static int __device_suspend(struct devic
  End:
 	if (!error) {
 		dev->power.is_suspended = true;
-		if (dev->power.wakeup_path
-		    && dev->parent && !dev->parent->power.ignore_children)
-			dev->parent->power.wakeup_path = true;
+		if (dev->parent && !dev->parent->power.ignore_children) {
+			if (dev->power.wakeup_path)
+				dev->parent->power.wakeup_path = true;
+
+			if (!dev->power.no_suspend)
+				dev->parent->power.no_suspend = false;
+		}
 	}
 
 	device_unlock(dev);
@@ -1307,7 +1316,7 @@ static int device_prepare(struct device
 {
 	int (*callback)(struct device *) = NULL;
 	char *info = NULL;
-	int error = 0;
+	int ret = 0;
 
 	if (dev->power.syscore)
 		return 0;
@@ -1323,6 +1332,7 @@ static int device_prepare(struct device
 	device_lock(dev);
 
 	dev->power.wakeup_path = device_may_wakeup(dev);
+	dev->power.no_suspend = false;
 
 	if (dev->pm_domain) {
 		info = "preparing power domain ";
@@ -1344,16 +1354,20 @@ static int device_prepare(struct device
 	}
 
 	if (callback) {
-		error = callback(dev);
-		suspend_report_result(callback, error);
+		ret = callback(dev);
+		suspend_report_result(callback, ret);
 	}
 
 	device_unlock(dev);
 
-	if (error)
+	if (ret < 0) {
 		pm_runtime_put(dev);
+	} else if (ret > 0) {
+		dev->power.no_suspend = true;
+		ret = 0;
+	}
 
-	return error;
+	return ret;
 }
 
 /**
@@ -1422,7 +1436,7 @@ EXPORT_SYMBOL_GPL(dpm_suspend_start);
 
 void __suspend_report_result(const char *function, void *fn, int ret)
 {
-	if (ret)
+	if (ret < 0)
 		printk(KERN_ERR "%s(): %pF returns %d\n", function, fn, ret);
 }
 EXPORT_SYMBOL_GPL(__suspend_report_result);
Index: linux-pm/include/linux/pm.h
===================================================================
--- linux-pm.orig/include/linux/pm.h
+++ linux-pm/include/linux/pm.h
@@ -544,6 +544,7 @@ struct dev_pm_info {
 	bool			is_suspended:1;	/* Ditto */
 	bool			ignore_children:1;
 	bool			early_init:1;	/* Owned by the PM core */
+	bool			no_suspend:1;
 	spinlock_t		lock;
 #ifdef CONFIG_PM_SLEEP
 	struct list_head	entry;


* [RFC][PATCH 2/3] PM / runtime: Routine for checking device status during system suspend
  2014-01-14 23:12 [RFC][PATCH 0/3] PM: Mechanism to avoid resuming runtime-suspended devices during system suspend Rafael J. Wysocki
  2014-01-14 23:13 ` [RFC][PATCH 1/3] PM / sleep: Flag to avoid executing suspend callbacks for devices Rafael J. Wysocki
@ 2014-01-14 23:14 ` Rafael J. Wysocki
  2014-01-16 13:32   ` Mika Westerberg
  2014-01-14 23:16 ` [RFC][PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain " Rafael J. Wysocki
  2014-02-16 23:49 ` [RFC][PATCH 0/3] PM: Mechanism to avoid resuming runtime-suspended devices " Rafael J. Wysocki
  3 siblings, 1 reply; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-01-14 23:14 UTC
  To: Linux PM list
  Cc: Alan Stern, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Add a new helper routine, pm_runtime_enabled_and_suspended(), to
allow subsystems (or PM domains) to check the runtime PM status of
devices during system suspend (possibly to avoid resuming those
devices upfront at that time).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/power/runtime.c |   28 ++++++++++++++++++++++++++++
 include/linux/pm_runtime.h   |    2 ++
 2 files changed, 30 insertions(+)

Index: linux-pm/include/linux/pm_runtime.h
===================================================================
--- linux-pm.orig/include/linux/pm_runtime.h
+++ linux-pm/include/linux/pm_runtime.h
@@ -53,6 +53,7 @@ extern unsigned long pm_runtime_autosusp
 extern void pm_runtime_update_max_time_suspended(struct device *dev,
 						 s64 delta_ns);
 extern void pm_runtime_set_memalloc_noio(struct device *dev, bool enable);
+extern bool pm_runtime_enabled_and_suspended(struct device *dev);
 
 static inline bool pm_children_suspended(struct device *dev)
 {
@@ -161,6 +162,7 @@ static inline unsigned long pm_runtime_a
 				struct device *dev) { return 0; }
 static inline void pm_runtime_set_memalloc_noio(struct device *dev,
 						bool enable){}
+static inline bool pm_runtime_enabled_and_suspended(struct device *dev) { return false };
 
 #endif /* !CONFIG_PM_RUNTIME */
 
Index: linux-pm/drivers/base/power/runtime.c
===================================================================
--- linux-pm.orig/drivers/base/power/runtime.c
+++ linux-pm/drivers/base/power/runtime.c
@@ -1194,6 +1194,34 @@ void pm_runtime_enable(struct device *de
 EXPORT_SYMBOL_GPL(pm_runtime_enable);
 
 /**
+ * pm_runtime_enabled_and_suspended - Check runtime PM status of a device.
+ * @dev: Device to handle.
+ *
+ * This routine is to be executed during system suspend only, after
+ * device_prepare() has been executed for @dev.
+ *
+ * Return false if runtime PM is disabled for the device.  Otherwise, wait
+ * for pending transitions to complete and check the runtime PM status of the
+ * device after that.  Return true if it is RPM_SUSPENDED.
+ */
+bool pm_runtime_enabled_and_suspended(struct device *dev)
+{
+	unsigned long flags;
+	bool ret;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+	if (dev->power.disable_depth) {
+		ret = false;
+	} else {
+		__pm_runtime_barrier(dev);
+		ret = pm_runtime_status_suspended(dev);
+	}
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_enabled_and_suspended);
+
+/**
  * pm_runtime_forbid - Block runtime PM of a device.
  * @dev: Device to handle.
  *


* [RFC][PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain during system suspend
  2014-01-14 23:12 [RFC][PATCH 0/3] PM: Mechanism to avoid resuming runtime-suspended devices during system suspend Rafael J. Wysocki
  2014-01-14 23:13 ` [RFC][PATCH 1/3] PM / sleep: Flag to avoid executing suspend callbacks for devices Rafael J. Wysocki
  2014-01-14 23:14 ` [RFC][PATCH 2/3] PM / runtime: Routine for checking device status during system suspend Rafael J. Wysocki
@ 2014-01-14 23:16 ` Rafael J. Wysocki
  2014-01-15 13:57   ` [Update][RFC][PATCH " Rafael J. Wysocki
  2014-02-16 23:49 ` [RFC][PATCH 0/3] PM: Mechanism to avoid resuming runtime-suspended devices " Rafael J. Wysocki
  3 siblings, 1 reply; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-01-14 23:16 UTC
  To: Linux PM list
  Cc: Alan Stern, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Rework the ACPI PM domain's PM callbacks to avoid resuming devices
during system suspend in order to modify their wakeup settings if
that isn't necessary.
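
The decision made in acpi_subsys_prepare() below can be modeled in user
space roughly like this.  It is a sketch only: struct toy_acpi_dev and
toy_acpi_prepare() are hypothetical names, the kernel lookups are replaced
by plain fields, and pm_generic_prepare() is assumed to succeed (return 0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of everything the decision depends on. */
struct toy_acpi_dev {
	bool has_acpi_node;                /* models acpi_dev_pm_get_node() != NULL */
	bool rpm_suspended;                /* models pm_runtime_enabled_and_suspended() */
	int state_error;                   /* models the acpi_dev_pm_get_state() result */
	int target_state;                  /* device state wanted for system sleep */
	int current_state;                 /* models adev->power.state */
	bool may_wakeup;                   /* models device_may_wakeup() */
	unsigned int wakeup_prepare_count; /* models adev->wakeup.prepare_count */
	bool resumed;                      /* set when the device gets "runtime-resumed" */
};

/* Returns 1 if the device may stay runtime-suspended, 0 otherwise. */
static int toy_acpi_prepare(struct toy_acpi_dev *dev)
{
	assert(dev != NULL);

	/* Not ACPI-managed or not runtime-suspended: nothing to skip. */
	if (!dev->has_acpi_node || !dev->rpm_suspended)
		return 0;

	/* Different power state or wakeup setting: reprogramming is needed. */
	if (dev->state_error || dev->target_state != dev->current_state
	    || dev->may_wakeup != (dev->wakeup_prepare_count != 0)) {
		dev->resumed = true;
		return 0;
	}

	return 1;
}
```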

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/device_pm.c |   31 +++++++++++++++++++++++++++----
 1 file changed, 27 insertions(+), 4 deletions(-)

Index: linux-pm/drivers/acpi/device_pm.c
===================================================================
--- linux-pm.orig/drivers/acpi/device_pm.c
+++ linux-pm/drivers/acpi/device_pm.c
@@ -812,6 +812,13 @@ int acpi_dev_runtime_resume(struct devic
 	struct acpi_device *adev = acpi_dev_pm_get_node(dev);
 	int error;
 
+	/*
+	 * This only matters during system suspend, if acpi_subsys_prepare()
+	 * has returned 1.  In that case, we may be resumed through a child
+	 * runtime resume, in which case our system suspend callbacks will need
+	 * to be executed, so power.no_suspend has to be cleared.
+	 */
+	dev->power.no_suspend = false;
 	if (!adev)
 		return 0;
 
@@ -912,12 +919,28 @@ EXPORT_SYMBOL_GPL(acpi_dev_resume_early)
  */
 int acpi_subsys_prepare(struct device *dev)
 {
+	struct acpi_device *adev = acpi_dev_pm_get_node(dev);
+	u32 target_state;
+	int error, state;
+
+	if (!adev || !pm_runtime_enabled_and_suspended(dev))
+		return pm_generic_prepare(dev);
+
+	target_state = acpi_target_system_state();
+	error = acpi_dev_pm_get_state(dev, adev, target_state, NULL, &state);
+	if (error || state != adev->power.state
+	    || device_may_wakeup(dev) != !!adev->wakeup.prepare_count) {
+		pm_runtime_resume(dev);
+		return pm_generic_prepare(dev);
+	}
 	/*
-	 * Follow PCI and resume devices suspended at run time before running
-	 * their system suspend callbacks.
+	 * If this is a wakeup device, wakeup power has been enabled already for
+	 * it during the preceding runtime suspend.  Caveat: "sleep state" is
+	 * one of the _DSW arguments, but that shouldn't matter for the devices
+	 * using acpi_general_pm_domain.
 	 */
-	pm_runtime_resume(dev);
-	return pm_generic_prepare(dev);
+	error =  pm_generic_prepare(dev);
+	return error ? error : 1;
 }
 EXPORT_SYMBOL_GPL(acpi_subsys_prepare);
 



* [Update][RFC][PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain during system suspend
  2014-01-14 23:16 ` [RFC][PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain " Rafael J. Wysocki
@ 2014-01-15 13:57   ` Rafael J. Wysocki
  0 siblings, 0 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-01-15 13:57 UTC
  To: Linux PM list
  Cc: Alan Stern, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: ACPI / PM: Avoid resuming devices in ACPI PM domain during system suspend

Rework the ACPI PM domain's PM callbacks to avoid resuming devices
during system suspend in order to modify their wakeup settings if
that isn't necessary.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

Changes from the previous version:

 - I don't think it is really necessary to clear power.no_suspend from
   acpi_dev_runtime_resume(), because __device_suspend() will run for
   children before it runs for the parent, so it is sufficient to clear
   power.no_suspend for the parent from there.

   At the same time, since the parent can only be runtime-suspended if the
   children are runtime-suspended, the children with power.no_suspend clear
   have to be runtime-resumed before __device_suspend() is executed for
   them and that will trigger runtime resume of the parent.

 - To avoid problems related to possible differences between runtime resume
   and system resume driver callbacks, use pm_runtime_resume() to resume
   devices whose power.no_suspend was set during the suspend we're resuming
   from.

Thanks,
Rafael

---
 drivers/acpi/device_pm.c |   45 ++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 40 insertions(+), 5 deletions(-)

Index: linux-pm/drivers/acpi/device_pm.c
===================================================================
--- linux-pm.orig/drivers/acpi/device_pm.c
+++ linux-pm/drivers/acpi/device_pm.c
@@ -912,12 +912,28 @@ EXPORT_SYMBOL_GPL(acpi_dev_resume_early)
  */
 int acpi_subsys_prepare(struct device *dev)
 {
+	struct acpi_device *adev = acpi_dev_pm_get_node(dev);
+	u32 target_state;
+	int error, state;
+
+	if (!adev || !pm_runtime_enabled_and_suspended(dev))
+		return pm_generic_prepare(dev);
+
+	target_state = acpi_target_system_state();
+	error = acpi_dev_pm_get_state(dev, adev, target_state, NULL, &state);
+	if (error || state != adev->power.state
+	    || device_may_wakeup(dev) != !!adev->wakeup.prepare_count) {
+		pm_runtime_resume(dev);
+		return pm_generic_prepare(dev);
+	}
 	/*
-	 * Follow PCI and resume devices suspended at run time before running
-	 * their system suspend callbacks.
+	 * If this is a wakeup device, wakeup power has been enabled already for
+	 * it during the preceding runtime suspend.  Caveat: "sleep state" is
+	 * one of the _DSW arguments, but that shouldn't matter for the devices
+	 * using acpi_general_pm_domain.
 	 */
-	pm_runtime_resume(dev);
-	return pm_generic_prepare(dev);
+	error =  pm_generic_prepare(dev);
+	return error ? error : 1;
 }
 EXPORT_SYMBOL_GPL(acpi_subsys_prepare);
 
@@ -945,10 +961,27 @@ EXPORT_SYMBOL_GPL(acpi_subsys_suspend_la
  */
 int acpi_subsys_resume_early(struct device *dev)
 {
-	int ret = acpi_dev_resume_early(dev);
+	int ret;
+
+	if (dev->power.no_suspend)
+		return 0;
+
+	ret = acpi_dev_resume_early(dev);
 	return ret ? ret : pm_generic_resume_early(dev);
 }
 EXPORT_SYMBOL_GPL(acpi_subsys_resume_early);
+
+/**
+ * acpi_subsys_resume - Resume device using ACPI (if not resumed before).
+ * @dev: Device to resume.
+ *
+ * If power.no_suspend is set for @dev, run pm_runtime_resume() for it.
+ */
+int acpi_subsys_resume(struct device *dev)
+{
+	return dev->power.no_suspend ? pm_runtime_resume(dev) : 0;
+}
+EXPORT_SYMBOL_GPL(acpi_subsys_resume);
 #endif /* CONFIG_PM_SLEEP */
 
 static struct dev_pm_domain acpi_general_pm_domain = {
@@ -961,8 +994,10 @@ static struct dev_pm_domain acpi_general
 		.prepare = acpi_subsys_prepare,
 		.suspend_late = acpi_subsys_suspend_late,
 		.resume_early = acpi_subsys_resume_early,
+		.resume = acpi_subsys_resume,
 		.poweroff_late = acpi_subsys_suspend_late,
 		.restore_early = acpi_subsys_resume_early,
+		.restore = acpi_subsys_resume,
 #endif
 	},
 };



* Re: [RFC][PATCH 2/3] PM / runtime: Routine for checking device status during system suspend
  2014-01-14 23:14 ` [RFC][PATCH 2/3] PM / runtime: Routine for checking device status during system suspend Rafael J. Wysocki
@ 2014-01-16 13:32   ` Mika Westerberg
  2014-01-16 16:07     ` [Update][RFC][PATCH " Rafael J. Wysocki
  0 siblings, 1 reply; 78+ messages in thread
From: Mika Westerberg @ 2014-01-16 13:32 UTC
  To: Rafael J. Wysocki
  Cc: Linux PM list, Alan Stern, Aaron Lu, ACPI Devel Maling List, LKML

On Wed, Jan 15, 2014 at 12:14:46AM +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Add a new helper routine, pm_runtime_enabled_and_suspended(), to
> allow subsystems (or PM domains) to check the runtime PM status of
> devices during system suspend (possibly to avoid resuming those
> devices upfront at that time).
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/base/power/runtime.c |   28 ++++++++++++++++++++++++++++
>  include/linux/pm_runtime.h   |    2 ++
>  2 files changed, 30 insertions(+)
> 
> Index: linux-pm/include/linux/pm_runtime.h
> ===================================================================
> --- linux-pm.orig/include/linux/pm_runtime.h
> +++ linux-pm/include/linux/pm_runtime.h
> @@ -53,6 +53,7 @@ extern unsigned long pm_runtime_autosusp
>  extern void pm_runtime_update_max_time_suspended(struct device *dev,
>  						 s64 delta_ns);
>  extern void pm_runtime_set_memalloc_noio(struct device *dev, bool enable);
> +extern bool pm_runtime_enabled_and_suspended(struct device *dev);
>  
>  static inline bool pm_children_suspended(struct device *dev)
>  {
> @@ -161,6 +162,7 @@ static inline unsigned long pm_runtime_a
>  				struct device *dev) { return 0; }
>  static inline void pm_runtime_set_memalloc_noio(struct device *dev,
>  						bool enable){}
> +static inline bool pm_runtime_enabled_and_suspended(struct device *dev) { return false };

The above probably doesn't compile if !CONFIG_PM_RUNTIME because of the
misplaced semicolon.


* [Update][RFC][PATCH 2/3] PM / runtime: Routine for checking device status during system suspend
  2014-01-16 13:32   ` Mika Westerberg
@ 2014-01-16 16:07     ` Rafael J. Wysocki
  0 siblings, 0 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-01-16 16:07 UTC
  To: Mika Westerberg, Linux PM list
  Cc: Alan Stern, Aaron Lu, ACPI Devel Maling List, LKML

On Thursday, January 16, 2014 03:32:51 PM Mika Westerberg wrote:
> On Wed, Jan 15, 2014 at 12:14:46AM +0100, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Add a new helper routine, pm_runtime_enabled_and_suspended(), to
> > allow subsystems (or PM domains) to check the runtime PM status of
> > devices during system suspend (possibly to avoid resuming those
> > devices upfront at that time).
> > 
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  drivers/base/power/runtime.c |   28 ++++++++++++++++++++++++++++
> >  include/linux/pm_runtime.h   |    2 ++
> >  2 files changed, 30 insertions(+)
> > 
> > Index: linux-pm/include/linux/pm_runtime.h
> > ===================================================================
> > --- linux-pm.orig/include/linux/pm_runtime.h
> > +++ linux-pm/include/linux/pm_runtime.h
> > @@ -53,6 +53,7 @@ extern unsigned long pm_runtime_autosusp
> >  extern void pm_runtime_update_max_time_suspended(struct device *dev,
> >  						 s64 delta_ns);
> >  extern void pm_runtime_set_memalloc_noio(struct device *dev, bool enable);
> > +extern bool pm_runtime_enabled_and_suspended(struct device *dev);
> >  
> >  static inline bool pm_children_suspended(struct device *dev)
> >  {
> > @@ -161,6 +162,7 @@ static inline unsigned long pm_runtime_a
> >  				struct device *dev) { return 0; }
> >  static inline void pm_runtime_set_memalloc_noio(struct device *dev,
> >  						bool enable){}
> > +static inline bool pm_runtime_enabled_and_suspended(struct device *dev) { return false };
> 
> The above probably doesn't compile if !CONFIG_PM_RUNTIME because of the
> misplaced semicolon.

Oh well, indeed, thanks!

Update follows.

---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: PM / runtime: Routine for checking device status during system suspend

Add a new helper routine, pm_runtime_enabled_and_suspended(), to
allow subsystems (or PM domains) to check the runtime PM status of
devices during system suspend (possibly to avoid resuming those
devices upfront at that time).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/power/runtime.c |   28 ++++++++++++++++++++++++++++
 include/linux/pm_runtime.h   |    2 ++
 2 files changed, 30 insertions(+)

Index: linux-pm/include/linux/pm_runtime.h
===================================================================
--- linux-pm.orig/include/linux/pm_runtime.h
+++ linux-pm/include/linux/pm_runtime.h
@@ -53,6 +53,7 @@ extern unsigned long pm_runtime_autosusp
 extern void pm_runtime_update_max_time_suspended(struct device *dev,
 						 s64 delta_ns);
 extern void pm_runtime_set_memalloc_noio(struct device *dev, bool enable);
+extern bool pm_runtime_enabled_and_suspended(struct device *dev);
 
 static inline bool pm_children_suspended(struct device *dev)
 {
@@ -161,6 +162,7 @@ static inline unsigned long pm_runtime_a
 				struct device *dev) { return 0; }
 static inline void pm_runtime_set_memalloc_noio(struct device *dev,
 						bool enable){}
+static inline bool pm_runtime_enabled_and_suspended(struct device *dev) { return false; }
 
 #endif /* !CONFIG_PM_RUNTIME */
 
Index: linux-pm/drivers/base/power/runtime.c
===================================================================
--- linux-pm.orig/drivers/base/power/runtime.c
+++ linux-pm/drivers/base/power/runtime.c
@@ -1194,6 +1194,34 @@ void pm_runtime_enable(struct device *de
 EXPORT_SYMBOL_GPL(pm_runtime_enable);
 
 /**
+ * pm_runtime_enabled_and_suspended - Check runtime PM status of a device.
+ * @dev: Device to handle.
+ *
+ * This routine is to be executed during system suspend only, after
+ * device_prepare() has been executed for @dev.
+ *
+ * Return false if runtime PM is disabled for the device.  Otherwise, wait
+ * for pending transitions to complete and check the runtime PM status of the
+ * device after that.  Return true if it is RPM_SUSPENDED.
+ */
+bool pm_runtime_enabled_and_suspended(struct device *dev)
+{
+	unsigned long flags;
+	bool ret;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+	if (dev->power.disable_depth) {
+		ret = false;
+	} else {
+		__pm_runtime_barrier(dev);
+		ret = pm_runtime_status_suspended(dev);
+	}
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_enabled_and_suspended);
+
+/**
  * pm_runtime_forbid - Block runtime PM of a device.
  * @dev: Device to handle.
  *



* Re: [RFC][PATCH 0/3] PM: Mechanism to avoid resuming runtime-suspended devices during system suspend
  2014-01-14 23:12 [RFC][PATCH 0/3] PM: Mechanism to avoid resuming runtime-suspended devices during system suspend Rafael J. Wysocki
                   ` (2 preceding siblings ...)
  2014-01-14 23:16 ` [RFC][PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain " Rafael J. Wysocki
@ 2014-02-16 23:49 ` Rafael J. Wysocki
  2014-02-16 23:50   ` [PATCH 1/3] PM / sleep: New flag to speed up suspend-resume of suspended devices Rafael J. Wysocki
                     ` (2 more replies)
  3 siblings, 3 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-02-16 23:49 UTC
  To: Linux PM list
  Cc: Alan Stern, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Wednesday, January 15, 2014 12:12:29 AM Rafael J. Wysocki wrote:
> Hi,
> 
> The following experimental series of 3 patches implements a mechanism allowing
> subsystems to avoid resuming runtime-suspended devices during system suspend.
> 
> As far as the PM core goes, it introduces a new flag, power.no_suspend, that
> will be set by the core for devices which can stay suspended.  The idea is that
> subsystems should know which devices can stay suspended over system suspend
> and to allow them to tell the core about that patch [1/3] changes the calling
> convention of the device PM .prepare() callback so that it can return a positive
> value on success to be interpreted as "this device has been runtime-suspended
> and doesn't need to be resumed" information.  If .prepare() returns a positive
> number for certain device, the core will set power.no_suspend and will not run
> suspend callbacks for device with that flag set going forward (during this
> particular system suspend transition).
> 
> However, parents may generally need to be resumed so that the suspend of their
> children can be carried out, so the PM core will clear power.no_suspend for
> the parents of devices whose power.no_suspend is not set (unless those parents
> have power.ignore_children set).
> 
> Patch [2/3] adds a new runtime PM helper function that subsystems can use to
> check whether or not a given device is runtime-suspended when .prepare() is being
> executed for it.
> 
> Patch [3/3] implements the subsystem part for the ACPI PM domain, because that
> is relatively straightforward.  If the general approach makes sense, I'll think
> about doing the same for PCI.

I have a new version of this.

The new patch [1/3] goes farther than the previous one, because I realized that
all subsystems returning values greater than zero from their .prepare()
callbacks will want to skip .resume_noirq() and .resume_early() for the
"fast suspended" devices and all of them will likely want to run a
pm_request_resume() for those devices in their .resume().  So, if all of
them would do that anyway, it's better if the core does that for them.
Of course, that simplifies patch [3/3] quite a bit.  Patch [2/3] is the
same as before.
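
In user-space terms, the resume side of the new [1/3] behaves roughly like
the sketch below.  The toy_* names are hypothetical and the counters merely
stand in for "a callback ran" and "pm_request_resume() was called"; the real
changes are in device_resume_noirq(), device_resume_early() and
device_resume().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The three system resume phases the core walks through, in order. */
enum toy_resume_phase { TOY_RESUME_NOIRQ, TOY_RESUME_EARLY, TOY_RESUME };

struct toy_fast_dev {
	bool fast_suspend;            /* models power.fast_suspend */
	unsigned int callbacks_run;   /* resume callbacks actually invoked */
	unsigned int resume_requests; /* models calls to pm_request_resume() */
};

/*
 * For a "fast suspended" device every resume callback is skipped and a
 * single runtime resume request is queued in the final phase instead.
 */
static void toy_device_resume_phase(struct toy_fast_dev *dev,
				    enum toy_resume_phase phase)
{
	assert(dev != NULL);

	if (dev->fast_suspend) {
		if (phase == TOY_RESUME)
			dev->resume_requests++;
		return;
	}

	dev->callbacks_run++;
}
```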

Thanks!

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* [PATCH 1/3] PM / sleep: New flag to speed up suspend-resume of suspended devices
  2014-02-16 23:49 ` [RFC][PATCH 0/3] PM: Mechanism to avoid resuming runtime-suspended devices " Rafael J. Wysocki
@ 2014-02-16 23:50   ` Rafael J. Wysocki
  2014-02-18 12:59     ` Ulf Hansson
  2014-02-19 17:01     ` Alan Stern
  2014-02-16 23:51   ` [PATCH 2/3][Resend] PM / runtime: Routine for checking device status " Rafael J. Wysocki
  2014-02-16 23:52   ` [PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain " Rafael J. Wysocki
  2 siblings, 2 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-02-16 23:50 UTC
  To: Linux PM list
  Cc: Alan Stern, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
resume all runtime-suspended devices during system suspend, mostly
because those devices may need to be reprogrammed due to different
wakeup settings for system sleep and for runtime PM.  However, at
least in some cases, that isn't really necessary, because the wakeup
settings may not be really different.

The idea here is that subsystems should know whether or not it is
necessary to reprogram a given device during system suspend and they
should be able to tell the PM core about that.  For this reason,
modify the PM core so that if the .prepare() callback returns a
positive value for a given device, the core will set a new
power.fast_suspend flag for it.  Then, if that flag is set, the core
will skip all of the subsequent suspend callbacks for that device.
It also will skip all of the system resume callbacks for the device
during the subsequent system resume and pm_request_resume() will be
executed to trigger a runtime PM resume of the device after the
system device resume sequence has been finished.

However, since parents may need to be resumed so that their children
can be reprogrammed, make the PM core clear power.fast_suspend for
devices whose children don't have power.fast_suspend set (the
power.ignore_children flag doesn't matter here, because a parent
whose children are normally ignored for runtime PM may still need to
be accessible for their children to be prepared for system suspend
properly).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/power/main.c |   49 +++++++++++++++++++++++++++++++---------------
 include/linux/pm.h        |    1 
 2 files changed, 35 insertions(+), 15 deletions(-)

Index: linux-pm/drivers/base/power/main.c
===================================================================
--- linux-pm.orig/drivers/base/power/main.c
+++ linux-pm/drivers/base/power/main.c
@@ -478,7 +478,7 @@ static int device_resume_noirq(struct de
 	TRACE_DEVICE(dev);
 	TRACE_RESUME(0);
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || dev->power.fast_suspend)
 		goto Out;
 
 	if (!dev->power.is_noirq_suspended)
@@ -599,7 +599,7 @@ static int device_resume_early(struct de
 	TRACE_DEVICE(dev);
 	TRACE_RESUME(0);
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || dev->power.fast_suspend)
 		goto Out;
 
 	if (!dev->power.is_late_suspended)
@@ -724,6 +724,11 @@ static int device_resume(struct device *
 	if (dev->power.syscore)
 		goto Complete;
 
+	if (dev->power.fast_suspend) {
+		pm_request_resume(dev);
+		goto Complete;
+	}
+
 	dpm_wait(dev->parent, async);
 	dpm_watchdog_set(&wd, dev);
 	device_lock(dev);
@@ -994,7 +999,7 @@ static int __device_suspend_noirq(struct
 		return async_error;
 	}
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || dev->power.fast_suspend)
 		return 0;
 
 	if (dev->pm_domain) {
@@ -1127,7 +1132,7 @@ static int __device_suspend_late(struct
 		return async_error;
 	}
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || dev->power.fast_suspend)
 		return 0;
 
 	if (dev->pm_domain) {
@@ -1296,8 +1301,10 @@ static int __device_suspend(struct devic
 	 * for it, this is equivalent to the device signaling wakeup, so the
 	 * system suspend operation should be aborted.
 	 */
-	if (pm_runtime_barrier(dev) && device_may_wakeup(dev))
+	if (pm_runtime_barrier(dev) && device_may_wakeup(dev)) {
 		pm_wakeup_event(dev, 0);
+		dev->power.fast_suspend = false;
+	}
 
 	if (pm_wakeup_pending()) {
 		async_error = -EBUSY;
@@ -1310,6 +1317,9 @@ static int __device_suspend(struct devic
 	dpm_watchdog_set(&wd, dev);
 	device_lock(dev);
 
+	if (dev->power.fast_suspend)
+		goto End;
+
 	if (dev->pm_domain) {
 		info = "power domain ";
 		callback = pm_op(&dev->pm_domain->ops, state);
@@ -1358,9 +1368,14 @@ static int __device_suspend(struct devic
  End:
 	if (!error) {
 		dev->power.is_suspended = true;
-		if (dev->power.wakeup_path
-		    && dev->parent && !dev->parent->power.ignore_children)
-			dev->parent->power.wakeup_path = true;
+		if (dev->parent) {
+			if (!dev->parent->power.ignore_children
+			    && dev->power.wakeup_path)
+				dev->parent->power.wakeup_path = true;
+
+			if (!dev->power.fast_suspend)
+				dev->parent->power.fast_suspend = false;
+		}
 	}
 
 	device_unlock(dev);
@@ -1460,7 +1475,7 @@ static int device_prepare(struct device
 {
 	int (*callback)(struct device *) = NULL;
 	char *info = NULL;
-	int error = 0;
+	int ret = 0;
 
 	if (dev->power.syscore)
 		return 0;
@@ -1476,6 +1491,7 @@ static int device_prepare(struct device
 	device_lock(dev);
 
 	dev->power.wakeup_path = device_may_wakeup(dev);
+	dev->power.fast_suspend = false;
 
 	if (dev->pm_domain) {
 		info = "preparing power domain ";
@@ -1497,16 +1513,19 @@ static int device_prepare(struct device
 	}
 
 	if (callback) {
-		error = callback(dev);
-		suspend_report_result(callback, error);
+		ret = callback(dev);
+		suspend_report_result(callback, ret);
 	}
 
 	device_unlock(dev);
 
-	if (error)
+	if (ret < 0) {
 		pm_runtime_put(dev);
-
-	return error;
+		return ret;
+	} else if (ret > 0) {
+		dev->power.fast_suspend = true;
+	}
+	return 0;
 }
 
 /**
@@ -1575,7 +1594,7 @@ EXPORT_SYMBOL_GPL(dpm_suspend_start);
 
 void __suspend_report_result(const char *function, void *fn, int ret)
 {
-	if (ret)
+	if (ret < 0)
 		printk(KERN_ERR "%s(): %pF returns %d\n", function, fn, ret);
 }
 EXPORT_SYMBOL_GPL(__suspend_report_result);
Index: linux-pm/include/linux/pm.h
===================================================================
--- linux-pm.orig/include/linux/pm.h
+++ linux-pm/include/linux/pm.h
@@ -546,6 +546,7 @@ struct dev_pm_info {
 	bool			is_late_suspended:1;
 	bool			ignore_children:1;
 	bool			early_init:1;	/* Owned by the PM core */
+	bool			fast_suspend:1;	/* Owned by the PM core */
 	spinlock_t		lock;
 #ifdef CONFIG_PM_SLEEP
 	struct list_head	entry;

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [PATCH 2/3][Resend] PM / runtime: Routine for checking device status during system suspend
  2014-02-16 23:49 ` [RFC][PATCH 0/3] PM: Mechanism to avoid resuming runtime-suspended devices " Rafael J. Wysocki
  2014-02-16 23:50   ` [PATCH 1/3] PM / sleep: New flag to speed up suspend-resume of suspended devices Rafael J. Wysocki
@ 2014-02-16 23:51   ` Rafael J. Wysocki
  2014-02-16 23:52   ` [PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain " Rafael J. Wysocki
  2 siblings, 0 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-02-16 23:51 UTC (permalink / raw)
  To: Linux PM list
  Cc: Alan Stern, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Add a new helper routine, pm_runtime_enabled_and_suspended(), to
allow subsystems (or PM domains) to check the runtime PM status of
devices during system suspend (possibly to avoid resuming those
devices upfront at that time).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/power/runtime.c |   28 ++++++++++++++++++++++++++++
 include/linux/pm_runtime.h   |    2 ++
 2 files changed, 30 insertions(+)

Index: linux-pm/include/linux/pm_runtime.h
===================================================================
--- linux-pm.orig/include/linux/pm_runtime.h
+++ linux-pm/include/linux/pm_runtime.h
@@ -53,6 +53,7 @@ extern unsigned long pm_runtime_autosusp
 extern void pm_runtime_update_max_time_suspended(struct device *dev,
 						 s64 delta_ns);
 extern void pm_runtime_set_memalloc_noio(struct device *dev, bool enable);
+extern bool pm_runtime_enabled_and_suspended(struct device *dev);
 
 static inline bool pm_children_suspended(struct device *dev)
 {
@@ -161,6 +162,7 @@ static inline unsigned long pm_runtime_a
 				struct device *dev) { return 0; }
 static inline void pm_runtime_set_memalloc_noio(struct device *dev,
 						bool enable){}
+static inline bool pm_runtime_enabled_and_suspended(struct device *dev) { return false; }
 
 #endif /* !CONFIG_PM_RUNTIME */
 
Index: linux-pm/drivers/base/power/runtime.c
===================================================================
--- linux-pm.orig/drivers/base/power/runtime.c
+++ linux-pm/drivers/base/power/runtime.c
@@ -1194,6 +1194,34 @@ void pm_runtime_enable(struct device *de
 EXPORT_SYMBOL_GPL(pm_runtime_enable);
 
 /**
+ * pm_runtime_enabled_and_suspended - Check runtime PM status of a device.
+ * @dev: Device to handle.
+ *
+ * This routine is to be executed during system suspend only, after
+ * device_prepare() has been executed for @dev.
+ *
+ * Return false if runtime PM is disabled for the device.  Otherwise, wait
+ * for pending transitions to complete and check the runtime PM status of the
+ * device after that.  Return true if it is RPM_SUSPENDED.
+ */
+bool pm_runtime_enabled_and_suspended(struct device *dev)
+{
+	unsigned long flags;
+	bool ret;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+	if (dev->power.disable_depth) {
+		ret = false;
+	} else {
+		__pm_runtime_barrier(dev);
+		ret = pm_runtime_status_suspended(dev);
+	}
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_enabled_and_suspended);
+
+/**
  * pm_runtime_forbid - Block runtime PM of a device.
  * @dev: Device to handle.
  *


^ permalink raw reply	[flat|nested] 78+ messages in thread

* [PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain during system suspend
  2014-02-16 23:49 ` [RFC][PATCH 0/3] PM: Mechanism to avoid resuming runtime-suspended devices " Rafael J. Wysocki
  2014-02-16 23:50   ` [PATCH 1/3] PM / sleep: New flag to speed up suspend-resume of suspended devices Rafael J. Wysocki
  2014-02-16 23:51   ` [PATCH 2/3][Resend] PM / runtime: Routine for checking device status " Rafael J. Wysocki
@ 2014-02-16 23:52   ` Rafael J. Wysocki
  2 siblings, 0 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-02-16 23:52 UTC (permalink / raw)
  To: Linux PM list
  Cc: Alan Stern, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Rework the ACPI PM domain's PM callbacks to avoid resuming devices
during system suspend in order to modify their wakeup settings if
that isn't necessary.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/device_pm.c |   24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

Index: linux-pm/drivers/acpi/device_pm.c
===================================================================
--- linux-pm.orig/drivers/acpi/device_pm.c
+++ linux-pm/drivers/acpi/device_pm.c
@@ -900,12 +900,28 @@ EXPORT_SYMBOL_GPL(acpi_dev_resume_early)
  */
 int acpi_subsys_prepare(struct device *dev)
 {
+	struct acpi_device *adev = ACPI_COMPANION(dev);
+	u32 target_state;
+	int error, state;
+
+	if (!adev || !pm_runtime_enabled_and_suspended(dev))
+		return pm_generic_prepare(dev);
+
+	target_state = acpi_target_system_state();
+	error = acpi_dev_pm_get_state(dev, adev, target_state, NULL, &state);
+	if (error || state != adev->power.state
+	    || device_may_wakeup(dev) != !!adev->wakeup.prepare_count) {
+		pm_runtime_resume(dev);
+		return pm_generic_prepare(dev);
+	}
 	/*
-	 * Follow PCI and resume devices suspended at run time before running
-	 * their system suspend callbacks.
+	 * If this is a wakeup device, wakeup power has been enabled already for
+	 * it during the preceding runtime suspend.  Caveat: "sleep state" is
+	 * one of the _DSW arguments, but that shouldn't matter for the devices
+	 * using acpi_general_pm_domain.
 	 */
-	pm_runtime_resume(dev);
-	return pm_generic_prepare(dev);
+	error =  pm_generic_prepare(dev);
+	return error ? error : 1;
 }
 EXPORT_SYMBOL_GPL(acpi_subsys_prepare);
 


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 1/3] PM / sleep: New flag to speed up suspend-resume of suspended devices
  2014-02-16 23:50   ` [PATCH 1/3] PM / sleep: New flag to speed up suspend-resume of suspended devices Rafael J. Wysocki
@ 2014-02-18 12:59     ` Ulf Hansson
  2014-02-18 13:25       ` Rafael J. Wysocki
  2014-02-19 17:01     ` Alan Stern
  1 sibling, 1 reply; 78+ messages in thread
From: Ulf Hansson @ 2014-02-18 12:59 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, Alan Stern, Mika Westerberg, Aaron Lu,
	ACPI Devel Maling List, LKML

On 17 February 2014 00:50, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> resume all runtime-suspended devices during system suspend, mostly
> because those devices may need to be reprogrammed due to different
> wakeup settings for system sleep and for runtime PM.  However, at
> least in some cases, that isn't really necessary, because the wakeup
> settings may not be really different.
>
> The idea here is that subsystems should know whether or not it is
> necessary to reprogram a given device during system suspend and they
> should be able to tell the PM core about that.  For this reason,
> modify the PM core so that if the .prepare() callback returns a
> positive value for a given device, the core will set a new
> power.fast_suspend flag for it.  Then, if that flag is set, the core
> will skip all of the subsequent suspend callbacks for that device.
> It also will skip all of the system resume callbacks for the device
> during the subsequent system resume and pm_request_resume() will be
> executed to trigger a runtime PM resume of the device after the
> system device resume sequence has been finished.
>
> However, since parents may need to be resumed so that their children
> can be reprogrammed, make the PM core clear power.fast_suspend for
> devices whose children don't have power.fast_suspend set (the
> power.ignore_children flag doesn't matter here, because a parent
> whose children are normally ignored for runtime PM may still need to
> be accessible for its children to be prepared for system suspend
> properly).
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/base/power/main.c |   49 +++++++++++++++++++++++++++++++---------------
>  include/linux/pm.h        |    1
>  2 files changed, 35 insertions(+), 15 deletions(-)
>
> Index: linux-pm/drivers/base/power/main.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/main.c
> +++ linux-pm/drivers/base/power/main.c
> @@ -478,7 +478,7 @@ static int device_resume_noirq(struct de
>         TRACE_DEVICE(dev);
>         TRACE_RESUME(0);
>
> -       if (dev->power.syscore)
> +       if (dev->power.syscore || dev->power.fast_suspend)
>                 goto Out;
>
>         if (!dev->power.is_noirq_suspended)
> @@ -599,7 +599,7 @@ static int device_resume_early(struct de
>         TRACE_DEVICE(dev);
>         TRACE_RESUME(0);
>
> -       if (dev->power.syscore)
> +       if (dev->power.syscore || dev->power.fast_suspend)
>                 goto Out;
>
>         if (!dev->power.is_late_suspended)
> @@ -724,6 +724,11 @@ static int device_resume(struct device *
>         if (dev->power.syscore)
>                 goto Complete;
>
> +       if (dev->power.fast_suspend) {
> +               pm_request_resume(dev);
> +               goto Complete;

So, this will trigger an async request to runtime resume the device.

At device_complete(), we do pm_runtime_put() to return the reference
we fetched at device_prepare(), thus likely causing the device to be
runtime suspended again. Is that the expected sequence you need? Could
you elaborate why?

Kind regards
Ulf Hansson

> +       }
> +
>         dpm_wait(dev->parent, async);
>         dpm_watchdog_set(&wd, dev);
>         device_lock(dev);
> @@ -994,7 +999,7 @@ static int __device_suspend_noirq(struct
>                 return async_error;
>         }
>
> -       if (dev->power.syscore)
> +       if (dev->power.syscore || dev->power.fast_suspend)
>                 return 0;
>
>         if (dev->pm_domain) {
> @@ -1127,7 +1132,7 @@ static int __device_suspend_late(struct
>                 return async_error;
>         }
>
> -       if (dev->power.syscore)
> +       if (dev->power.syscore || dev->power.fast_suspend)
>                 return 0;
>
>         if (dev->pm_domain) {
> @@ -1296,8 +1301,10 @@ static int __device_suspend(struct devic
>          * for it, this is equivalent to the device signaling wakeup, so the
>          * system suspend operation should be aborted.
>          */
> -       if (pm_runtime_barrier(dev) && device_may_wakeup(dev))
> +       if (pm_runtime_barrier(dev) && device_may_wakeup(dev)) {
>                 pm_wakeup_event(dev, 0);
> +               dev->power.fast_suspend = false;
> +       }
>
>         if (pm_wakeup_pending()) {
>                 async_error = -EBUSY;
> @@ -1310,6 +1317,9 @@ static int __device_suspend(struct devic
>         dpm_watchdog_set(&wd, dev);
>         device_lock(dev);
>
> +       if (dev->power.fast_suspend)
> +               goto End;
> +
>         if (dev->pm_domain) {
>                 info = "power domain ";
>                 callback = pm_op(&dev->pm_domain->ops, state);
> @@ -1358,9 +1368,14 @@ static int __device_suspend(struct devic
>   End:
>         if (!error) {
>                 dev->power.is_suspended = true;
> -               if (dev->power.wakeup_path
> -                   && dev->parent && !dev->parent->power.ignore_children)
> -                       dev->parent->power.wakeup_path = true;
> +               if (dev->parent) {
> +                       if (!dev->parent->power.ignore_children
> +                           && dev->power.wakeup_path)
> +                               dev->parent->power.wakeup_path = true;
> +
> +                       if (!dev->power.fast_suspend)
> +                               dev->parent->power.fast_suspend = false;
> +               }
>         }
>
>         device_unlock(dev);
> @@ -1460,7 +1475,7 @@ static int device_prepare(struct device
>  {
>         int (*callback)(struct device *) = NULL;
>         char *info = NULL;
> -       int error = 0;
> +       int ret = 0;
>
>         if (dev->power.syscore)
>                 return 0;
> @@ -1476,6 +1491,7 @@ static int device_prepare(struct device
>         device_lock(dev);
>
>         dev->power.wakeup_path = device_may_wakeup(dev);
> +       dev->power.fast_suspend = false;
>
>         if (dev->pm_domain) {
>                 info = "preparing power domain ";
> @@ -1497,16 +1513,19 @@ static int device_prepare(struct device
>         }
>
>         if (callback) {
> -               error = callback(dev);
> -               suspend_report_result(callback, error);
> +               ret = callback(dev);
> +               suspend_report_result(callback, ret);
>         }
>
>         device_unlock(dev);
>
> -       if (error)
> +       if (ret < 0) {
>                 pm_runtime_put(dev);
> -
> -       return error;
> +               return ret;
> +       } else if (ret > 0) {
> +               dev->power.fast_suspend = true;
> +       }
> +       return 0;
>  }
>
>  /**
> @@ -1575,7 +1594,7 @@ EXPORT_SYMBOL_GPL(dpm_suspend_start);
>
>  void __suspend_report_result(const char *function, void *fn, int ret)
>  {
> -       if (ret)
> +       if (ret < 0)
>                 printk(KERN_ERR "%s(): %pF returns %d\n", function, fn, ret);
>  }
>  EXPORT_SYMBOL_GPL(__suspend_report_result);
> Index: linux-pm/include/linux/pm.h
> ===================================================================
> --- linux-pm.orig/include/linux/pm.h
> +++ linux-pm/include/linux/pm.h
> @@ -546,6 +546,7 @@ struct dev_pm_info {
>         bool                    is_late_suspended:1;
>         bool                    ignore_children:1;
>         bool                    early_init:1;   /* Owned by the PM core */
> +       bool                    fast_suspend:1; /* Owned by the PM core */
>         spinlock_t              lock;
>  #ifdef CONFIG_PM_SLEEP
>         struct list_head        entry;
>

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 1/3] PM / sleep: New flag to speed up suspend-resume of suspended devices
  2014-02-18 12:59     ` Ulf Hansson
@ 2014-02-18 13:25       ` Rafael J. Wysocki
  0 siblings, 0 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-02-18 13:25 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Linux PM list, Alan Stern, Mika Westerberg, Aaron Lu,
	ACPI Devel Maling List, LKML

On Tuesday, February 18, 2014 01:59:36 PM Ulf Hansson wrote:
> On 17 February 2014 00:50, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> > resume all runtime-suspended devices during system suspend, mostly
> > because those devices may need to be reprogrammed due to different
> > wakeup settings for system sleep and for runtime PM.  However, at
> > least in some cases, that isn't really necessary, because the wakeup
> > settings may not be really different.
> >
> > The idea here is that subsystems should know whether or not it is
> > necessary to reprogram a given device during system suspend and they
> > should be able to tell the PM core about that.  For this reason,
> > modify the PM core so that if the .prepare() callback returns a
> > positive value for a given device, the core will set a new
> > power.fast_suspend flag for it.  Then, if that flag is set, the core
> > will skip all of the subsequent suspend callbacks for that device.
> > It also will skip all of the system resume callbacks for the device
> > during the subsequent system resume and pm_request_resume() will be
> > executed to trigger a runtime PM resume of the device after the
> > system device resume sequence has been finished.
> >
> > However, since parents may need to be resumed so that their children
> > can be reprogrammed, make the PM core clear power.fast_suspend for
> > devices whose children don't have power.fast_suspend set (the
> > power.ignore_children flag doesn't matter here, because a parent
> > whose children are normally ignored for runtime PM may still need to
> > be accessible for its children to be prepared for system suspend
> > properly).
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  drivers/base/power/main.c |   49 +++++++++++++++++++++++++++++++---------------
> >  include/linux/pm.h        |    1
> >  2 files changed, 35 insertions(+), 15 deletions(-)
> >
> > Index: linux-pm/drivers/base/power/main.c
> > ===================================================================
> > --- linux-pm.orig/drivers/base/power/main.c
> > +++ linux-pm/drivers/base/power/main.c
> > @@ -478,7 +478,7 @@ static int device_resume_noirq(struct de
> >         TRACE_DEVICE(dev);
> >         TRACE_RESUME(0);
> >
> > -       if (dev->power.syscore)
> > +       if (dev->power.syscore || dev->power.fast_suspend)
> >                 goto Out;
> >
> >         if (!dev->power.is_noirq_suspended)
> > @@ -599,7 +599,7 @@ static int device_resume_early(struct de
> >         TRACE_DEVICE(dev);
> >         TRACE_RESUME(0);
> >
> > -       if (dev->power.syscore)
> > +       if (dev->power.syscore || dev->power.fast_suspend)
> >                 goto Out;
> >
> >         if (!dev->power.is_late_suspended)
> > @@ -724,6 +724,11 @@ static int device_resume(struct device *
> >         if (dev->power.syscore)
> >                 goto Complete;
> >
> > +       if (dev->power.fast_suspend) {
> > +               pm_request_resume(dev);
> > +               goto Complete;
> 
> So, this will trigger an async request to runtime resume the device.
> 
> At device_complete(), we do pm_runtime_put() to return the reference
> we fetched at device_prepare(), thus likely causing the device to be
> runtime suspended again. Is that the expected sequence you need? Could
> you elaborate why?

That pm_runtime_put() will not cause the device to be re-suspended,
because it will be executed before the resume scheduled by the
pm_request_resume() above.

Thanks!

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 1/3] PM / sleep: New flag to speed up suspend-resume of suspended devices
  2014-02-16 23:50   ` [PATCH 1/3] PM / sleep: New flag to speed up suspend-resume of suspended devices Rafael J. Wysocki
  2014-02-18 12:59     ` Ulf Hansson
@ 2014-02-19 17:01     ` Alan Stern
  2014-02-20  1:23       ` Rafael J. Wysocki
  1 sibling, 1 reply; 78+ messages in thread
From: Alan Stern @ 2014-02-19 17:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Mon, 17 Feb 2014, Rafael J. Wysocki wrote:

> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> resume all runtime-suspended devices during system suspend, mostly
> because those devices may need to be reprogrammed due to different
> wakeup settings for system sleep and for runtime PM.  However, at
> least in some cases, that isn't really necessary, because the wakeup
> settings may not be really different.
> 
> The idea here is that subsystems should know whether or not it is
> necessary to reprogram a given device during system suspend and they
> should be able to tell the PM core about that.  For this reason,
> modify the PM core so that if the .prepare() callback returns a
> positive value for a given device, the core will set a new
> power.fast_suspend flag for it.  Then, if that flag is set, the core
> will skip all of the subsequent suspend callbacks for that device.
> It also will skip all of the system resume callbacks for the device
> during the subsequent system resume and pm_request_resume() will be
> executed to trigger a runtime PM resume of the device after the
> system device resume sequence has been finished.

Does the PM core really need to get involved in this?  Can't the 
subsystem do the right thing on its own?

In the USB subsystem, the .suspend routine checks the required wakeup
settings.  If they are different for runtime suspend and system
suspend, and if the device is runtime suspended, then we call
pm_runtime_resume.  After that, if the device is still in runtime
suspend then we return immediately.

Of course, this addresses only the suspend side of the issue.  Skipping 
the resume callbacks is a separate matter, and the USB subsystem 
doesn't try to do it.  Still, I don't see any reason why we couldn't 
take care of that as well.
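
As a rough illustration of the decision described above, here is a
user-space model in C.  It is only a sketch, not the USB core code:
struct toy_dev, its fields and the toy_*() helpers are all made up for
this example, and toy_runtime_resume() merely stands in for
pm_runtime_resume().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device; not a kernel structure. */
struct toy_dev {
	bool runtime_suspended;	/* current runtime PM status */
	bool runtime_wakeup;	/* wakeup setting programmed for runtime PM */
	bool system_wakeup;	/* wakeup setting required for system sleep */
	int suspend_calls;	/* how often the full suspend work has run */
};

/* Stand-in for pm_runtime_resume(): put the device back to full power. */
static void toy_runtime_resume(struct toy_dev *dev)
{
	dev->runtime_suspended = false;
}

/* The work a system suspend has to do for a device that is active. */
static void toy_do_suspend(struct toy_dev *dev)
{
	dev->runtime_wakeup = dev->system_wakeup;
	dev->suspend_calls++;
}

/*
 * Model of the .suspend routine: resume the device only if its wakeup
 * settings have to change, and return right away if it is still in
 * runtime suspend after that.
 */
static int toy_suspend(struct toy_dev *dev)
{
	assert(dev != NULL);

	if (dev->runtime_suspended &&
	    dev->runtime_wakeup != dev->system_wakeup)
		toy_runtime_resume(dev);

	if (dev->runtime_suspended)
		return 0;	/* nothing to reprogram, leave it alone */

	toy_do_suspend(dev);
	return 0;
}
```

In this model a device whose two wakeup settings agree is never touched
during system suspend, while one whose settings differ is resumed and
then suspended in the usual way.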

> However, since parents may need to be resumed so that their children
> can be reprogrammed, make the PM core clear power.fast_suspend for
> devices whose children don't have power.fast_suspend set (the
> power.ignore_children flag doesn't matter here, because a parent
> whose children are normally ignored for runtime PM may still need to
> be accessible for its children to be prepared for system suspend
> properly).

I have run across a similar issue.  It's a general problem that a
device may try to remain in runtime suspend during a system resume, but
a descendant of the device may need to perform I/O as part of its own
resume routine.  A natural solution would be to use the regular runtime
PM facilities to wake up the device.  But since the PM work queue is
frozen, we can't rely on pm_runtime_get or the equivalent.  I'm not
sure what the best solution will be.


After a quick look, I noticed a couple of questionable things in this
patch.  This is after reading just the second half...

> @@ -1296,8 +1301,10 @@ static int __device_suspend(struct devic
>  	 * for it, this is equivalent to the device signaling wakeup, so the
>  	 * system suspend operation should be aborted.
>  	 */
> -	if (pm_runtime_barrier(dev) && device_may_wakeup(dev))
> +	if (pm_runtime_barrier(dev) && device_may_wakeup(dev)) {
>  		pm_wakeup_event(dev, 0);
> +		dev->power.fast_suspend = false;
> +	}

Is this change needed?  We're aborting the sleep transition anyway, 
right?

> @@ -1310,6 +1317,9 @@ static int __device_suspend(struct devic
>  	dpm_watchdog_set(&wd, dev);
>  	device_lock(dev);
>  
> +	if (dev->power.fast_suspend)
> +		goto End;
> +

What happens if dev->power.fast_suspend gets set following the .prepare
callback, but then before __device_suspend runs, the device gets
runtime resumed?

It looks like rpm_resume needs to clear the new flag.
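
To make the scenario concrete, here is a small user-space model in C of
what clearing the flag on runtime resume could look like.  It is a
sketch under assumptions, not the real rpm_resume(): struct toy_dev and
the toy_*() functions are invented for this example and only mirror the
names used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device; not struct device. */
struct toy_dev {
	bool runtime_suspended;
	bool fast_suspend;		/* set when .prepare() returned > 0 */
	int suspend_callbacks_run;	/* counts executed suspend callbacks */
};

/* Model of what device_prepare() does with the .prepare() result. */
static void toy_prepare(struct toy_dev *dev, int prepare_ret)
{
	assert(dev != NULL);
	dev->fast_suspend = prepare_ret > 0;
}

/* Model of a runtime resume that also drops the flag (the suggestion). */
static void toy_rpm_resume(struct toy_dev *dev)
{
	assert(dev != NULL);
	dev->runtime_suspended = false;
	dev->fast_suspend = false;	/* device is active, so no shortcut */
}

/* Model of __device_suspend(): skip callbacks only if the flag is set. */
static void toy_device_suspend(struct toy_dev *dev)
{
	assert(dev != NULL);
	if (dev->fast_suspend)
		return;
	dev->suspend_callbacks_run++;
}
```

With the flag cleared in toy_rpm_resume(), a device that is resumed
between the prepare and suspend phases gets its suspend callbacks run
again in this model; without that line they would be skipped for a
device that is no longer suspended.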

> @@ -1358,9 +1368,14 @@ static int __device_suspend(struct devic
>   End:
>  	if (!error) {
>  		dev->power.is_suspended = true;
> -		if (dev->power.wakeup_path
> -		    && dev->parent && !dev->parent->power.ignore_children)
> -			dev->parent->power.wakeup_path = true;
> +		if (dev->parent) {
> +			if (!dev->parent->power.ignore_children
> +			    && dev->power.wakeup_path)
> +				dev->parent->power.wakeup_path = true;
> +
> +			if (!dev->power.fast_suspend)
> +				dev->parent->power.fast_suspend = false;
> +		}

On SMP systems with async suspend, this isn't safe.  Two threads should 
not be allowed to write to bitfields in the same structure at the same 
time.
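
The reason is that adjacent bit-fields count as a single memory
location in C11, so a store to one of them is a read-modify-write that
also rewrites its neighbours.  The user-space sketch below does not
reproduce the race; it only checks that two one-bit flags end up in the
same byte.  struct toy_flags and the helpers are made up for the
example, and the exact bit layout is implementation-defined, so this
assumes a common compiler such as GCC or Clang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Two adjacent one-bit flags, laid out like those in struct dev_pm_info. */
struct toy_flags {
	bool wakeup_path:1;
	bool fast_suspend:1;
};

/* Return the first byte of the structure's storage. */
static unsigned char toy_first_byte(const struct toy_flags *f)
{
	unsigned char byte;

	assert(f != NULL);
	memcpy(&byte, f, 1);
	return byte;
}

/*
 * Return true if setting the two flags one after the other changes the
 * same byte both times, i.e. both flags live in that byte.
 */
static bool toy_flags_share_byte(void)
{
	struct toy_flags f;
	unsigned char b0, b1, b2;

	memset(&f, 0, sizeof(f));
	b0 = toy_first_byte(&f);
	f.wakeup_path = true;
	b1 = toy_first_byte(&f);
	f.fast_suspend = true;
	b2 = toy_first_byte(&f);

	return b0 != b1 && b1 != b2;
}
```

If one thread sets wakeup_path while another clears fast_suspend in the
same parent, each of them may write back a stale copy of the other's
bit.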

Alan Stern


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 1/3] PM / sleep: New flag to speed up suspend-resume of suspended devices
  2014-02-19 17:01     ` Alan Stern
@ 2014-02-20  1:23       ` Rafael J. Wysocki
  2014-02-20  1:42         ` Rafael J. Wysocki
  2014-02-20 17:03         ` Alan Stern
  0 siblings, 2 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-02-20  1:23 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Wednesday, February 19, 2014 12:01:20 PM Alan Stern wrote:
> On Mon, 17 Feb 2014, Rafael J. Wysocki wrote:
> 
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> > resume all runtime-suspended devices during system suspend, mostly
> > because those devices may need to be reprogrammed due to different
> > wakeup settings for system sleep and for runtime PM.  However, at
> > least in some cases, that isn't really necessary, because the wakeup
> > settings may not be really different.
> > 
> > The idea here is that subsystems should know whether or not it is
> > necessary to reprogram a given device during system suspend and they
> > should be able to tell the PM core about that.  For this reason,
> > modify the PM core so that if the .prepare() callback returns a
> > positive value for a given device, the core will set a new
> > power.fast_suspend flag for it.  Then, if that flag is set, the core
> > will skip all of the subsequent suspend callbacks for that device.
> > It also will skip all of the system resume callbacks for the device
> > during the subsequent system resume and pm_request_resume() will be
> > executed to trigger a runtime PM resume of the device after the
> > system device resume sequence has been finished.

I was worried that you wouldn't comment at all. ;-)

> Does the PM core really need to get involved in this?

Yes, it does, in my opinion, because it may be necessary to resume
parents in order to reprogram their children, and the parents and the
children may be on different bus types.

> Can't the subsystem do the right thing on its own?

No, I don't think so.

> In the USB subsystem, the .suspend routine checks the required wakeup
> settings.  If they are different for runtime suspend and system
> suspend, and if the device is runtime suspended, then we call
> pm_runtime_resume.  After that, if the device is still in runtime
> suspend then we return immediately.
> 
> Of course, this addresses only the suspend side of the issue.  Skipping 
> the resume callbacks is a separate matter, and the USB subsystem 
> doesn't try to do it.  Still, I don't see any reason why we couldn't 
> take care of that as well.

What about USB controllers that are PCI devices?

> > However, since parents may need to be resumed so that their children
> > can be reprogrammed, make the PM core clear power.fast_suspend for
> > devices whose children don't have power.fast_suspend set (the
> > power.ignore_children flag doesn't matter here, because a parent
> > whose children are normally ignored for runtime PM may still need to
> > be accessible for its children to be prepared for system suspend
> > properly).
> 
> I have run across a similar issue.  It's a general problem that a
> device may try to remain in runtime suspend during a system resume, but
> a descendant of the device may need to perform I/O as part of its own
> resume routine.  A natural solution would be to use the regular runtime
> PM facilities to wake up the device.  But since the PM work queue is
> frozen, we can't rely on pm_runtime_get or the equivalent.  I'm not
> sure what the best solution will be.

We can rely on pm_runtime_get_sync(), though, which would be the right thing to
use here.

However, given that the parent's .prepare() has run already, I'm not sure
if we want runtime PM to be involved at all.

> After a quick look, I noticed a couple of questionable things in this
> patch.  This is after reading just the second half...
> 
> > @@ -1296,8 +1301,10 @@ static int __device_suspend(struct devic
> >  	 * for it, this is equivalent to the device signaling wakeup, so the
> >  	 * system suspend operation should be aborted.
> >  	 */
> > -	if (pm_runtime_barrier(dev) && device_may_wakeup(dev))
> > +	if (pm_runtime_barrier(dev) && device_may_wakeup(dev)) {
> >  		pm_wakeup_event(dev, 0);
> > +		dev->power.fast_suspend = false;
> > +	}
> 
> Is this change needed?  We're aborting the sleep transition anyway, 
> right?

Yes, we are.  The goal was to ensure that power.fast_suspend would be cleared
in that case, but dpm_resume_end() should take care of this anyway.

> > @@ -1310,6 +1317,9 @@ static int __device_suspend(struct devic
> >  	dpm_watchdog_set(&wd, dev);
> >  	device_lock(dev);
> >  
> > +	if (dev->power.fast_suspend)
> > +		goto End;
> > +
> 
> What happens if dev->power.fast_suspend gets set following the .prepare
> callback, but then before __device_suspend runs, the device gets
> runtime resumed?
> 
> It looks like rpm_resume needs to clear the new flag.

I thought about that and came to the conclusion that that wasn't necessary.

There simply is no reason for devices with power.fast_suspend set to be
runtime-resumed after (or even during) dpm_prepare() other than while
resuming their children, in which case power.fast_suspend is going to be
cleared for the children and then for the parents too.

That *is* a little fragile, though.

> > @@ -1358,9 +1368,14 @@ static int __device_suspend(struct devic
> >   End:
> >  	if (!error) {
> >  		dev->power.is_suspended = true;
> > -		if (dev->power.wakeup_path
> > -		    && dev->parent && !dev->parent->power.ignore_children)
> > -			dev->parent->power.wakeup_path = true;
> > +		if (dev->parent) {
> > +			if (!dev->parent->power.ignore_children
> > +			    && dev->power.wakeup_path)
> > +				dev->parent->power.wakeup_path = true;
> > +
> > +			if (!dev->power.fast_suspend)
> > +				dev->parent->power.fast_suspend = false;
> > +		}
> 
> On SMP systems with async suspend, this isn't safe.  Two threads should 
> not be allowed to write to bitfields in the same structure at the same 
> time.

Do I understand correctly that your concern is about suspending two
children of the same parent in parallel and one of them modifying
power.wakeup_path and the other modifying power.fast_suspend at the
same time?

If so, that modification can be done under the parent's power.lock.
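
For illustration only, here is a userspace sketch of that locking (not the
kernel code).  Adjacent bitfields normally share one storage unit, so setting
one flag is a read-modify-write of the whole unit, and two unsynchronized
writers can lose each other's update.  The struct and function names below are
hypothetical stand-ins, and a C11 atomic_flag spinlock is assumed as a model of
the parent's power.lock:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant part of struct dev_pm_info. */
struct pm_info_model {
	atomic_flag lock;		/* models power.lock */
	unsigned int wakeup_path:1;	/* these bitfields share a storage unit */
	unsigned int fast_suspend:1;
	unsigned int ignore_children:1;
};

struct device_model {
	struct device_model *parent;
	struct pm_info_model power;
};

static void model_lock(atomic_flag *lock)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
		;
}

static void model_unlock(atomic_flag *lock)
{
	atomic_flag_clear_explicit(lock, memory_order_release);
}

static void device_model_init(struct device_model *dev,
			      struct device_model *parent)
{
	assert(dev != NULL);
	dev->parent = parent;
	atomic_flag_clear(&dev->power.lock);
	dev->power.wakeup_path = 0;
	dev->power.fast_suspend = 0;
	dev->power.ignore_children = 0;
}

/*
 * Models the tail of __device_suspend() for one child: both parent flags
 * are updated while holding the parent's lock, so two children suspended
 * in parallel cannot overwrite each other's bitfield update.
 */
static void propagate_to_parent(struct device_model *dev)
{
	struct device_model *parent = dev->parent;

	if (!parent)
		return;

	model_lock(&parent->power.lock);
	if (!parent->power.ignore_children && dev->power.wakeup_path)
		parent->power.wakeup_path = 1;
	if (!dev->power.fast_suspend)
		parent->power.fast_suspend = 0;
	model_unlock(&parent->power.lock);
}
```

Each async suspend thread would call propagate_to_parent() for its own
device; only the parent's flags are shared, which is why taking the parent's
lock is sufficient in this model.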

Rafael


* Re: [PATCH 1/3] PM / sleep: New flag to speed up suspend-resume of suspended devices
  2014-02-20  1:23       ` Rafael J. Wysocki
@ 2014-02-20  1:42         ` Rafael J. Wysocki
  2014-02-20 17:03         ` Alan Stern
  1 sibling, 0 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-02-20  1:42 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Thursday, February 20, 2014 02:23:30 AM Rafael J. Wysocki wrote:
> On Wednesday, February 19, 2014 12:01:20 PM Alan Stern wrote:
> > On Mon, 17 Feb 2014, Rafael J. Wysocki wrote:
> > 
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > 
> > > Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> > > resume all runtime-suspended devices during system suspend, mostly
> > > because those devices may need to be reprogrammed due to different
> > > wakeup settings for system sleep and for runtime PM.  However, at
> > > least in some cases, that isn't really necessary, because the wakeup
> > > settings may not be really different.
> > > 
> > > The idea here is that subsystems should know whether or not it is
> > > necessary to reprogram a given device during system suspend and they
> > > should be able to tell the PM core about that.  For this reason,
> > > modify the PM core so that if the .prepare() callback returns a
> > > positive value for a certain device, the core will set a new
> > > power.fast_suspend flag for it.  Then, if that flag is set, the core
> > > will skip all of the subsequent suspend callbacks for that device.
> > > It also will skip all of the system resume callbacks for the device
> > > during the subsequent system resume and pm_request_resume() will be
> > > executed to trigger a runtime PM resume of the device after the
> > > system device resume sequence has been finished.
> 
> I was worried that you wouldn't comment at all. ;-)
> 
> > Does the PM core really need to get involved in this?
> 
> Yes, it does, in my opinion, and the reason is because it may be necessary
> to resume parents in order to reprogram their children and the parents and
> the children may be on different bus types.
> 
> > Can't the subsystem do the right thing on its own?
> 
> No, I don't think so.
> 
> > In the USB subsystem, the .suspend routine checks the required wakeup
> > settings.  If they are different for runtime suspend and system
> > suspend, and if the device is runtime suspended, then we call
> > pm_runtime_resume.  After that, if the device is still in runtime
> > suspend then we return immediately.
> > 
> > Of course, this addresses only the suspend side of the issue.  Skipping 
> > the resume callbacks is a separate matter, and the USB subsystem 
> > doesn't try to do it.  Still, I don't see any reason why we couldn't 
> > take care of that as well.
> 
> What about USB controllers that are PCI devices?
> 
> > > However, since parents may need to be resumed so that their children
> > > can be reprogrammed, make the PM core clear power.fast_suspend for
> > > devices whose children don't have power.fast_suspend set (the
> > > power.ignore_children flag doesn't matter here, because a parent
> > > whose children are normally ignored for runtime PM may still need to
> > > be accessible for their children to be prepared for system suspend
> > > properly).
> > 
> > I have run across a similar issue.  It's a general problem that a
> > device may try to remain in runtime suspend during a system resume, but
> > a descendant of the device may need to perform I/O as part of its own
> > resume routine.  A natural solution would be to use the regular runtime
> > PM facilities to wake up the device.  But since the PM work queue is
> > frozen, we can't rely on pm_runtime_get or the equivalent.  I'm not
> > sure what the best solution will be.
> 
> We can rely on pm_runtime_get_sync(), though, which would be the right thing to
> use here.
> 
> However, given that the parent's .prepare() has run already, I'm not sure
> if we want runtime PM to be involved at all.
> 
> > After a quick look, I noticed a couple of questionable things in this
> > patch.  This is after reading just the second half...
> > 
> > > @@ -1296,8 +1301,10 @@ static int __device_suspend(struct devic
> > >  	 * for it, this is equivalent to the device signaling wakeup, so the
> > >  	 * system suspend operation should be aborted.
> > >  	 */
> > > -	if (pm_runtime_barrier(dev) && device_may_wakeup(dev))
> > > +	if (pm_runtime_barrier(dev) && device_may_wakeup(dev)) {
> > >  		pm_wakeup_event(dev, 0);
> > > +		dev->power.fast_suspend = false;
> > > +	}
> > 
> > Is this change needed?  We're aborting the sleep transition anyway, 
> > right?
> 
> Yes, we are.  The goal was to ensure that power.fast_suspend would be cleared
> in that case, but dpm_resume_end() should take care of this anyway.

Sorry, I wanted to say "the clearing of it in device_prepare()", not sure why
I mentioned dpm_resume_end().  Well, maybe I'm too tired ...

Rafael


* Re: [PATCH 1/3] PM / sleep: New flag to speed up suspend-resume of suspended devices
  2014-02-20  1:23       ` Rafael J. Wysocki
  2014-02-20  1:42         ` Rafael J. Wysocki
@ 2014-02-20 17:03         ` Alan Stern
  2014-02-24  0:00           ` Rafael J. Wysocki
  1 sibling, 1 reply; 78+ messages in thread
From: Alan Stern @ 2014-02-20 17:03 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Thu, 20 Feb 2014, Rafael J. Wysocki wrote:

> On Wednesday, February 19, 2014 12:01:20 PM Alan Stern wrote:
> > On Mon, 17 Feb 2014, Rafael J. Wysocki wrote:
> > 
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > 
> > > Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> > > resume all runtime-suspended devices during system suspend, mostly
> > > because those devices may need to be reprogrammed due to different
> > > wakeup settings for system sleep and for runtime PM.  However, at
> > > least in some cases, that isn't really necessary, because the wakeup
> > > settings may not be really different.
> > > 
> > > The idea here is that subsystems should know whether or not it is
> > > necessary to reprogram a given device during system suspend and they
> > > should be able to tell the PM core about that.  For this reason,
> > > modify the PM core so that if the .prepare() callback returns a
> > > positive value for a certain device, the core will set a new
> > > power.fast_suspend flag for it.  Then, if that flag is set, the core
> > > will skip all of the subsequent suspend callbacks for that device.
> > > It also will skip all of the system resume callbacks for the device
> > > during the subsequent system resume and pm_request_resume() will be
> > > executed to trigger a runtime PM resume of the device after the
> > > system device resume sequence has been finished.
> 
> I was worried that you wouldn't comment at all. ;-)

Things have been pretty busy here for the last few weeks...

> > Does the PM core really need to get involved in this?
> 
> Yes, it does, in my opinion, and the reason is because it may be necessary
> to resume parents in order to reprogram their children and the parents and
> the children may be on different bus types.

Hmmm.  Reprogramming the children could refer to two different things:

   (1)	Altering the child's wakeup settings before suspending it, or

   (2)	Doing I/O to the child after resuming it.

(1) is something we already have to deal with.  Generally, the child's
subsystem or driver would simply call pm_runtime_get_sync, or something
of the sort.  The parent would then be in the RPM_ACTIVE state when its
->resume callback was invoked, so the rest of this discussion would not 
apply.

(2) is more difficult.  It is the reason why, in the original runtime 
PM documentation, we suggested that system resume should wake up _all_ 
devices, including those that were in runtime suspend before the system 
sleep.

If the child's subsystem or driver knows to call pm_runtime_get_sync
before starting the I/O, then it should work out okay.  Trouble arises
when the subsystem/driver _assumes_ that the parent is active.  This
sounds like the sort of thing that could be fixed on a case-by-case
basis: Simply remove that assumption from the child's subsystem/driver.  
But I guess you want to set up a more structured solution.
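
To make the difference concrete, here is a small userspace model (a sketch
under stated assumptions, not kernel code).  The model_* functions are
hypothetical stand-ins for pm_runtime_get_sync() and pm_runtime_put(); they
only track a status and a usage counter, and the put side never suspends
anything:

```c
#include <assert.h>
#include <stddef.h>

enum rpm_status_model { RPM_MODEL_ACTIVE, RPM_MODEL_SUSPENDED };

/* Hypothetical, heavily simplified model of a device's runtime PM state. */
struct rpm_dev_model {
	struct rpm_dev_model *parent;
	enum rpm_status_model status;
	int usage_count;
	int resume_calls;	/* how many times the device was powered up */
};

/* Resume the device synchronously, powering up ancestors first. */
static void model_resume(struct rpm_dev_model *dev)
{
	if (dev->status == RPM_MODEL_ACTIVE)
		return;
	if (dev->parent)
		model_resume(dev->parent);
	dev->status = RPM_MODEL_ACTIVE;
	dev->resume_calls++;
}

/* Stand-in for pm_runtime_get_sync(): take a reference, then resume. */
static void model_get_sync(struct rpm_dev_model *dev)
{
	dev->usage_count++;
	model_resume(dev);
}

/* Stand-in for pm_runtime_put(): only drops the reference in this model. */
static void model_put(struct rpm_dev_model *dev)
{
	assert(dev->usage_count > 0);
	dev->usage_count--;
}

/* The problematic pattern: I/O that assumes the parent is already active. */
static int child_do_io_assuming_parent_active(const struct rpm_dev_model *child)
{
	const struct rpm_dev_model *parent = child->parent;

	return (!parent || parent->status == RPM_MODEL_ACTIVE) ? 0 : -1;
}

/* The robust pattern: wake the parent explicitly around the I/O. */
static int child_do_io(struct rpm_dev_model *child)
{
	struct rpm_dev_model *parent = child->parent;
	int ret;

	if (parent)
		model_get_sync(parent);
	ret = (!parent || parent->status == RPM_MODEL_ACTIVE) ? 0 : -1;
	if (parent)
		model_put(parent);
	return ret;
}
```

With a suspended parent, the first variant fails while the second one
succeeds and leaves the parent's usage counter balanced.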

Note that this patch set really addresses two distinct issues: Skipping 
the ->suspend callbacks if the device is already runtime suspended, and 
skipping the ->resume callbacks.  In principle these are independent 
decisions.

> > Can't the subsystem do the right thing on its own?
> 
> No, I don't think so.

Well, certainly the subsystem could avoid doing anything in the
->suspend callbacks, by returning immediately.  But skipping over the
->resume callbacks is more problematic, because of case (2) above.

> > In the USB subsystem, the .suspend routine checks the required wakeup
> > settings.  If they are different for runtime suspend and system
> > suspend, and if the device is runtime suspended, then we call
> > pm_runtime_resume.  After that, if the device is still in runtime
> > suspend then we return immediately.
> > 
> > Of course, this addresses only the suspend side of the issue.  Skipping 
> > the resume callbacks is a separate matter, and the USB subsystem 
> > doesn't try to do it.  Still, I don't see any reason why we couldn't 
> > take care of that as well.
> 
> What about USB controllers that are PCI devices?

The description above applies to USB devices, not USB controllers.  
For PCI-based USB controllers, we currently assume that the controller
is RPM_ACTIVE when the ->suspend callback is called.

Of course, that assumption could be removed easily enough.

> > I have run across a similar issue.  It's a general problem that a
> > device may try to remain in runtime suspend during a system resume, but
> > a descendant of the device may need to perform I/O as part of its own
> > resume routine.  A natural solution would be to use the regular runtime
> > PM facilities to wake up the device.  But since the PM work queue is
> > frozen, we can't rely on pm_runtime_get or the equivalent.  I'm not
> > sure what the best solution will be.
> 
> We can rely on pm_runtime_get_sync(), though, which would be the right thing to
> use here.
> 
> However, given that the parent's .prepare() has run already, I'm not sure
> if we want runtime PM to be involved at all.

This is a decision we have to make.  At the start of a system resume, 
we expect that the device is in a low-power state.  After the ->resume 
callback returns, there is a choice.  The device could be back to full 
power and marked RPM_ACTIVE.  Or it could remain in low power and be 
marked RPM_SUSPENDED.  In the second alternative, it seems that runtime 
PM is unavoidably involved.

Are you primarily concerned about performing runtime PM actions before 
calling the ->complete callback?  I don't think that will cause any 
difficulties, but we could warn about it in the documentation.

> > > @@ -1310,6 +1317,9 @@ static int __device_suspend(struct devic
> > >  	dpm_watchdog_set(&wd, dev);
> > >  	device_lock(dev);
> > >  
> > > +	if (dev->power.fast_suspend)
> > > +		goto End;
> > > +
> > 
> > What happens if dev->power.fast_suspend gets set following the .prepare
> > callback, but then before __device_suspend runs, the device gets
> > runtime resumed?
> > 
> > It looks like rpm_resume needs to clear the new flag.
> 
> I thought about that and came to the conclusion that that wasn't necessary.
> 
> There simply is no reason for devices with power.fast_suspend set to be
> runtime-resumed after (or even during) dpm_prepare() other than while
> resuming their children, in which case power.fast_suspend is going to be
> cleared for the children and then for the parents too.
> 
> That *is* a little fragile, though.

It's possible for the device to get a wakeup request.  If this happens, 
it should cause the entire system sleep to be aborted, but depending on 
that is also a little fragile.

There's also the possibility of doing I/O to the child while the parent 
is in runtime suspend (with ignore_children set).  These situations 
tend to be idiosyncratic; it's hard to say anything about them in 
general.

> > > @@ -1358,9 +1368,14 @@ static int __device_suspend(struct devic
> > >   End:
> > >  	if (!error) {
> > >  		dev->power.is_suspended = true;
> > > -		if (dev->power.wakeup_path
> > > -		    && dev->parent && !dev->parent->power.ignore_children)
> > > -			dev->parent->power.wakeup_path = true;
> > > +		if (dev->parent) {
> > > +			if (!dev->parent->power.ignore_children
> > > +			    && dev->power.wakeup_path)
> > > +				dev->parent->power.wakeup_path = true;
> > > +
> > > +			if (!dev->power.fast_suspend)
> > > +				dev->parent->power.fast_suspend = false;
> > > +		}
> > 
> > On SMP systems with async suspend, this isn't safe.  Two threads should 
> > not be allowed to write to bitfields in the same structure at the same 
> > time.
> 
> Do I understand correctly that your concern is about suspending two
> children of the same parent in parallel and one of them modifying
> power.wakeup_path and the other modifying power.fast_suspend at the
> same time?

Yes, that's what I meant.

> If so, that modification can be done under the parent's power.lock.

That would avoid the problem.

Overall, I'm not convinced about the need for the fast_suspend 
mechanism.  I'm quite certain that the PM core doesn't need to get 
involved in the decision about skipping ->suspend callbacks.

The question of skipping ->resume callbacks is more complex.  In the
end, it comes down to a question of leaving the device in runtime
suspend during system resume -- whether this happens because the PM
core skips the ->resume callback or because the callback returns
immediately is unimportant.

So I think what we really need to do is discuss this last question more
fully.  Case (2) can be tricky, and we may find that the device's
subsystem doesn't have enough information to predict what the
children's subsystems will do.

Alan Stern


* Re: [PATCH 1/3] PM / sleep: New flag to speed up suspend-resume of suspended devices
  2014-02-20 17:03         ` Alan Stern
@ 2014-02-24  0:00           ` Rafael J. Wysocki
  2014-02-24 19:36             ` Alan Stern
  0 siblings, 1 reply; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-02-24  0:00 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Thursday, February 20, 2014 12:03:37 PM Alan Stern wrote:
> On Thu, 20 Feb 2014, Rafael J. Wysocki wrote:
> 
> > On Wednesday, February 19, 2014 12:01:20 PM Alan Stern wrote:
> > > On Mon, 17 Feb 2014, Rafael J. Wysocki wrote:
> > > 
> > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > 
> > > > Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> > > > resume all runtime-suspended devices during system suspend, mostly
> > > > because those devices may need to be reprogrammed due to different
> > > > wakeup settings for system sleep and for runtime PM.  However, at
> > > > least in some cases, that isn't really necessary, because the wakeup
> > > > settings may not be really different.
> > > > 
> > > > The idea here is that subsystems should know whether or not it is
> > > > necessary to reprogram a given device during system suspend and they
> > > > should be able to tell the PM core about that.  For this reason,
> > > > modify the PM core so that if the .prepare() callback returns a
> > > > positive value for a certain device, the core will set a new
> > > > power.fast_suspend flag for it.  Then, if that flag is set, the core
> > > > will skip all of the subsequent suspend callbacks for that device.
> > > > It also will skip all of the system resume callbacks for the device
> > > > during the subsequent system resume and pm_request_resume() will be
> > > > executed to trigger a runtime PM resume of the device after the
> > > > system device resume sequence has been finished.
> > 
> > I was worried that you wouldn't comment at all. ;-)
> 
> Things have been pretty busy here for the last few weeks...
> 
> > > Does the PM core really need to get involved in this?
> > 
> > Yes, it does, in my opinion, and the reason is because it may be necessary
> > to resume parents in order to reprogram their children and the parents and
> > the children may be on different bus types.
> 
> Hmmm.  Reprogramming the children could refer to two different things:
> 
>    (1)	Altering the child's wakeup settings before suspending it, or
> 
>    (2)	Doing I/O to the child after resuming it.
> 
> (1) is something we already have to deal with.  Generally, the child's
> subsystem or driver would simply call pm_runtime_get_sync, or something
> of the sort.

Yes, it could.  Except that that won't work for parents with power.ignore_children
set.

Also, it may not do that today and I'd like to introduce a mechanism by which
that optimization may be enabled by subsystems/drivers when ready.

So for example today there is no guarantee that each device will be resumed
as appropriate by whoever handles its children when necessary, because that
code may expect the device to be operational when it is running (precisely
because we used to resume that device in .prepare()).

> The parent would then be in the RPM_ACTIVE state when its
> ->resume callback was invoked, so the rest of this discussion would not 
> apply.
> 
> (2) is more difficult.  It is the reason why, in the original runtime 
> PM documentation, we suggested that system resume should wake up _all_ 
> devices, including those that were in runtime suspend before the system 
> sleep.

In my view there really is no substantial difference between (1) and (2).
After all, changing the wakeup settings of a device's children may involve
doing I/O to them.

> If the child's subsystem or driver knows to call pm_runtime_get_sync
> before starting the I/O, then it should work out okay.  Trouble arises
> when the subsystem/driver _assumes_ that the parent is active.

Precisely.

> This sounds like the sort of thing that could be fixed on a case-by-case
> basis: Simply remove that assumption from the child's subsystem/driver.  
> But I guess you want to set up a more structured solution.

Yes.  In particular, I'd like the "child" subsystem to be able to let the
core know that it is fine to leave the parent suspended.

> Note that this patch set really addresses two distinct issues: Skipping 
> the ->suspend callbacks if the device is already runtime suspended, and 
> skipping the ->resume callbacks.  In principle these are independent 
> decisions.

Yes, they are.

That said the patchset doesn't really do both.  It only really is about
skipping the ->suspend callbacks if possible, but the *consequence* of that
is that *system* resume ->resume callbacks cannot be used to resume the
device any more in general.  That's because ->resume_early may try to
reverse the ->suspend_late's actions and so on, so if ->suspend_late hasn't
run, it would be a bug to run ->resume_early for that device.
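
The pairing constraint can be sketched in userspace C as follows.  This is an
illustration under stated assumptions, not the patch itself: the struct, the
model_* callbacks and the core_* helpers are hypothetical, and only one
suspend/resume callback pair is modeled:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one device during a system sleep transition. */
struct sleep_dev_model {
	bool runtime_suspended;	/* state when system suspend starts */
	bool fast_suspend;	/* models power.fast_suspend */
	bool state_saved;	/* set by the modeled ->suspend_late */
	int saved_state;
	int hw_state;
};

/* Modeled ->prepare(): a positive value means "may stay suspended". */
static int model_prepare(const struct sleep_dev_model *dev)
{
	return dev->runtime_suspended ? 1 : 0;
}

/* Modeled ->suspend_late(): save the hardware state and power down. */
static void model_suspend_late(struct sleep_dev_model *dev)
{
	dev->saved_state = dev->hw_state;
	dev->state_saved = true;
	dev->hw_state = 0;
}

/* Modeled ->resume_early(): only valid if ->suspend_late() has run. */
static void model_resume_early(struct sleep_dev_model *dev)
{
	assert(dev->state_saved);
	dev->hw_state = dev->saved_state;
	dev->state_saved = false;
}

/* Core side of suspend: skip the callback when the flag gets set. */
static void core_suspend(struct sleep_dev_model *dev)
{
	dev->fast_suspend = model_prepare(dev) > 0;
	if (!dev->fast_suspend)
		model_suspend_late(dev);
}

/*
 * Core side of resume: a skipped suspend callback implies a skipped resume
 * callback.  The posted changelog has the core request a runtime resume
 * afterwards; that step is not modeled here.
 */
static void core_resume(struct sleep_dev_model *dev)
{
	if (dev->fast_suspend) {
		dev->fast_suspend = false;
		return;
	}
	model_resume_early(dev);
}
```

Calling model_resume_early() for a device whose suspend was skipped would
trip the assertion, which is the bug described above.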

> > > Can't the subsystem do the right thing on its own?
> > 
> > No, I don't think so.
> 
> Well, certainly the subsystem could avoid doing anything in the
> ->suspend callbacks, by returning immediately.

As I said above, today there's no guarantee that things would work out correctly
in that case.

> But skipping over the ->resume callbacks is more problematic, because of case
> (2) above.
> 
> > > In the USB subsystem, the .suspend routine checks the required wakeup
> > > settings.  If they are different for runtime suspend and system
> > > suspend, and if the device is runtime suspended, then we call
> > > pm_runtime_resume.  After that, if the device is still in runtime
> > > suspend then we return immediately.
> > > 
> > > Of course, this addresses only the suspend side of the issue.  Skipping 
> > > the resume callbacks is a separate matter, and the USB subsystem 
> > > doesn't try to do it.  Still, I don't see any reason why we couldn't 
> > > take care of that as well.
> > 
> > What about USB controllers that are PCI devices?
> 
> The description above applies to USB devices, not USB controllers.  
> For PCI-based USB controllers, we currently assume that the controller
> is RPM_ACTIVE when the ->suspend callback is called.
> 
> Of course, that assumption could be removed easily enough.

Well, in my view, in general, any mechanism that crosses the subsystem
boundary pretty much requires the core to be involved, one way or another.

> > > I have run across a similar issue.  It's a general problem that a
> > > device may try to remain in runtime suspend during a system resume, but
> > > a descendant of the device may need to perform I/O as part of its own
> > > resume routine.  A natural solution would be to use the regular runtime
> > > PM facilities to wake up the device.  But since the PM work queue is
> > > frozen, we can't rely on pm_runtime_get or the equivalent.  I'm not
> > > sure what the best solution will be.
> > 
> > We can rely on pm_runtime_get_sync(), though, which would be the right thing to
> > use here.
> > 
> > However, given that the parent's .prepare() has run already, I'm not sure
> > if we want runtime PM to be involved at all.
> 
> This is a decision we have to make.  At the start of a system resume, 
> we expect that the device is in a low-power state.  After the ->resume 
> callback returns, there is a choice.  The device could be back to full 
> power and marked RPM_ACTIVE.  Or it could remain in low power and be 
> marked RPM_SUSPENDED.  In the second alternative, it seems that runtime 
> PM is unavoidably involved.

I agree.  However, as stated above, once we've decided not to run, for
example, the ->suspend callback, we also should not attempt to run
->resume for the same device, because the latter may expect that the former
has run.  That is an additional complication that needs to be taken into
account in my opinion.

Of course, if the decision is left to the subsystem, then it will know
that ->suspend has returned immediately and it will make ->resume return
immediately too, but that sounds like a thing that will be readily duplicated
by multiple subsystems just because we don't want the core to be involved.
Also there's no way to know anything about the children then.

> Are you primarily concerned about performing runtime PM actions before 
> calling the ->complete callback?

No, I'm not.

> I don't think that will cause any difficulties, but we could warn about it in
> the documentation.
> 
> > > > @@ -1310,6 +1317,9 @@ static int __device_suspend(struct devic
> > > >  	dpm_watchdog_set(&wd, dev);
> > > >  	device_lock(dev);
> > > >  
> > > > +	if (dev->power.fast_suspend)
> > > > +		goto End;
> > > > +
> > > 
> > > What happens if dev->power.fast_suspend gets set following the .prepare
> > > callback, but then before __device_suspend runs, the device gets
> > > runtime resumed?
> > > 
> > > It looks like rpm_resume needs to clear the new flag.
> > 
> > I thought about that and came to the conclusion that that wasn't necessary.
> > 
> > There simply is no reason for devices with power.fast_suspend set to be
> > runtime-resumed after (or even during) dpm_prepare() other than while
> > resuming their children, in which case power.fast_suspend is going to be
> > cleared for the children and then for the parents too.
> > 
> > That *is* a little fragile, though.
> 
> It's possible for the device to get a wakeup request.  If this happens, 
> it should cause the entire system sleep to be aborted, but depending on 
> that is also a little fragile.
> 
> There's also the possibility of doing I/O to the child while the parent 
> is in runtime suspend (with ignore_children set).  These situations 
> tend to be idiosyncratic; it's hard to say anything about them in 
> general.

I agree.  For this reason, I think that the core has no choice but to treat
power.ignore_children set as "well, that device may need to be operational
going forward".

> > > > @@ -1358,9 +1368,14 @@ static int __device_suspend(struct devic
> > > >   End:
> > > >  	if (!error) {
> > > >  		dev->power.is_suspended = true;
> > > > -		if (dev->power.wakeup_path
> > > > -		    && dev->parent && !dev->parent->power.ignore_children)
> > > > -			dev->parent->power.wakeup_path = true;
> > > > +		if (dev->parent) {
> > > > +			if (!dev->parent->power.ignore_children
> > > > +			    && dev->power.wakeup_path)
> > > > +				dev->parent->power.wakeup_path = true;
> > > > +
> > > > +			if (!dev->power.fast_suspend)
> > > > +				dev->parent->power.fast_suspend = false;
> > > > +		}
> > > 
> > > On SMP systems with async suspend, this isn't safe.  Two threads should 
> > > not be allowed to write to bitfields in the same structure at the same 
> > > time.
> > 
> > Do I understand correctly that your concern is about suspending two
> > children of the same parent in parallel and one of them modifying
> > power.wakeup_path and the other modifying power.fast_suspend at the
> > same time?
> 
> Yes, that's what I meant.
> 
> > If so, that modification can be done under the parent's power.lock.
> 
> That would avoid the problem.
> 
> Overall, I'm not convinced about the need for the fast_suspend 
> mechanism.  I'm quite certain that the PM core doesn't need to get 
> involved in the decision about skipping ->suspend callbacks.

I obviously disagree. :-)

Otherwise, I wouldn't have bothered with posting this patchset ...

> The question of skipping ->resume callbacks is more complex.  In the
> end, it comes down to a question of leaving the device in runtime
> suspend during system resume -- whether this happens because the PM
> core skips the ->resume callback or because the callback returns
> immediately is unimportant.

Well, not really in my opinion ...

> So I think what we really need to do is discuss this last question more
> fully.  Case (2) can be tricky, and we may find that the device's
> subsystem doesn't have enough information to predict what the
> children's subsystems will do.

Precisely. :-)  And that's why I'd like to give the children's subsystem
a way to give the parent's one a clue - with some help from the core.

Rafael


* Re: [PATCH 1/3] PM / sleep: New flag to speed up suspend-resume of suspended devices
  2014-02-24  0:00           ` Rafael J. Wysocki
@ 2014-02-24 19:36             ` Alan Stern
  2014-02-25  0:07               ` Rafael J. Wysocki
  0 siblings, 1 reply; 78+ messages in thread
From: Alan Stern @ 2014-02-24 19:36 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Mon, 24 Feb 2014, Rafael J. Wysocki wrote:

> Also, it may not do that today and I'd like to introduce a mechanism by which
> that optimization may be enabled by subsystems/drivers when ready.
> 
> So for example today there is no guarantee that each device will be resumed
> as appropriate by whoever handles its children when necessary, because that
> code may expect the device to be operational when it is running (precisely
> because we used to resume that device in .prepare()).

> Yes.  In particular, I'd like the "child" subsystem to be able to let the
> core know that it is fine to leave the parent suspended.

> That said the patchset doesn't really do both.  It only really is about
> skipping the ->suspend callbacks if possible, but the *consequence* of that
> is that *system* resume ->resume callbacks cannot be used to resume the
> device any more in general.  That's because ->resume_early may try to
> reverse the ->suspend_late's actions and so on, so if ->suspend_late hasn't
> run, it would be a bug to run ->resume_early for that device.

> I agree.  For this reason, I think that the core has no choice but to treat
> power.ignore_children set as "well, that device may need to be operational
> going forward".

This discussion is getting a little messy.  Let's try to clarify it.
Here is the major point:

	We would like to save time during system suspend/resume by
	skipping over devices that are already in runtime suspend,
	whenever it is safe to do so.

Of course, the "it is safe to do so" part is what makes this difficult.  
It boils down to three characteristics for each device.

    (a) The device uses the same power state for runtime suspend and
	system suspend.  Therefore, if the device is already in runtime
	suspend and the wakeup settings are correct, there is no need
	for the PM core to invoke the device's ->suspend callbacks.

This requires a few comments.  The matter of whether the same power
state is used for both types of suspend is generally known beforehand,
because it is decided by the subsystem or driver.  The matter of
whether the wakeup settings are correct often can't be known until the
system suspend starts, because userspace can select whether or not
wakeup should be enabled during a system sleep.

Also, if the PM core is going to skip the ->suspend callbacks then it 
is obliged to skip the ->resume callbacks as well (we mustn't call one 
without the other).  Therefore, in cases where (a) holds, the device 
will necessarily emerge from the system resume in a runtime-suspended 
state.  This may or may not cause problems for the device's children; 
see below.

    (b) It's okay for the device's parent to be in runtime suspend
	when the device's ->suspend callbacks are invoked.

I included this just to be thorough.  In fact, I expect (b) to be true 
for pretty much every device already.  Or if it isn't true for some 
devices, this is because of a special arrangement between the device's 
subsystem and the parent's subsystem.  For example, the parent might 
always be runtime-resumed by its subsystem at the start of a system 
suspend (which is what PCI does now; I don't know if it is necessary).

In the absence of any sort of special arrangement, if (b) wasn't true 
for some device then that device would already be experiencing problems 
going into system suspend.  So (b) should not cause much difficulty.  
And if a special arrangement is present, it is a private matter between 
the two subsystems, not involving the PM core.

    (c) It's okay for the device's parent to be in runtime suspend
	when the device's ->resume callbacks are invoked.

Unlike (b), I expect that (c) does _not_ hold for quite a few devices
currently.  The reason is historical: When runtime PM was first
implemented, we decided that all devices should emerge from system
resume in the RPM_ACTIVE state, even if they were in runtime suspend
when the system suspend started.  Therefore drivers could depend on the
parent not being in runtime suspend while the device's ->resume
callback was running.

I don't think it is a good idea to perpetuate an accident of history.  
Instead of adding a special mechanism to the PM core for accommodating 
devices where (c) doesn't hold, I think we should fix up the drivers 
so that (c) _does_ hold everywhere.  Maybe you disagree.


Anyway, let's assume first that things are all fixed up, and (b) and
(c) hold for every device.  This means we can go back and consider (a).

Since the "same power state for both types of suspend" answer is known 
beforehand, let's concentrate on devices where it is true (other 
devices will simply continue to operate as they do today).  For these 
devices, the driver or subsystem will have to compute a flag value --
let's call it "same_wakeup_setting".  The PM core can't do this because 
it doesn't understand the device-specific details of wakeup settings.

Then your proposal comes down to this:

	If ->prepare returns > 0, the PM core sets the
	same_wakeup_setting flag.

	If same_wakeup_setting is on and the device is in runtime
	suspend, the PM core skips the various ->suspend and ->resume 
	callbacks.

My proposal was never made explicit, but it would take a form something 
like this:

	During ->prepare, the subsystem sets the same_wakeup_setting
	flag appropriately.

	If same_wakeup_setting is on and the device is in runtime
	suspend, the subsystem's ->suspend and ->resume callbacks
	return immediately.

There isn't very much difference between the two proposals.  Mine is a 
little more flexible; for example, it allows the subsystem to return 
immediately from ->suspend but have ->resume put the device back in the 
RPM_ACTIVE state.
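
As an illustration only, the two variants can be modelled in plain
user-space C.  This is a sketch, not code from the patch series: struct
model_dev, the counters, and every function name below are made up for
the example, and the "reactivate" argument stands in for whatever policy
a subsystem would use to decide that ->resume should bring the device
back to RPM_ACTIVE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_dev {
	bool runtime_suspended;   /* device is in runtime suspend */
	bool same_wakeup_setting; /* the flag discussed above */
	int suspend_calls;        /* how often the real suspend work ran */
	int resume_calls;         /* how often the real resume work ran */
};

/* Stand-ins for the driver's actual suspend and resume work. */
static void do_suspend(struct model_dev *dev)
{
	dev->suspend_calls++;
}

static void do_resume(struct model_dev *dev)
{
	dev->resume_calls++;
	dev->runtime_suspended = false; /* device ends up active */
}

/*
 * Variant 1: the core sets the flag from the return value of ->prepare
 * and skips ->suspend and ->resume as a pair.
 */
static void core_variant_cycle(struct model_dev *dev, int prepare_ret)
{
	bool skip;

	assert(dev != NULL);
	dev->same_wakeup_setting = prepare_ret > 0;
	skip = dev->same_wakeup_setting && dev->runtime_suspended;
	if (!skip) {
		do_suspend(dev);
		do_resume(dev);
	}
}

/*
 * Variant 2: the subsystem sets the flag during ->prepare and its own
 * callbacks return early.
 */
static void subsys_prepare(struct model_dev *dev, bool wakeup_matches)
{
	assert(dev != NULL);
	dev->same_wakeup_setting = wakeup_matches;
}

static void subsys_suspend(struct model_dev *dev)
{
	if (dev->same_wakeup_setting && dev->runtime_suspended)
		return; /* already in the right state */
	do_suspend(dev);
}

/* "reactivate" models the extra flexibility of variant 2. */
static void subsys_resume(struct model_dev *dev, bool reactivate)
{
	if (dev->same_wakeup_setting && dev->runtime_suspended && !reactivate)
		return;
	do_resume(dev);
}
```

In this model variant 1 always treats the two callbacks as a pair,
whereas variant 2 can return early from the suspend side and still run
the resume work when the subsystem asks for it.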


Now let's change course and suppose that (c) _doesn't_ hold for a large
selection of devices.  As a simple consequence, if (c) doesn't hold for
some device then (a) can't be allowed to hold for any ancestor of that
device.  (I'm disregarding the power.ignore_children flag.)

Your proposal would take the same_wakeup_setting flag (you called it
"fast_suspend"), used for answering (a), and combine it with the answer
to (c).  That is, you would have ->prepare return > 0 only if (a) and
(c) both hold -- and in addition, you turn off the flag if it is off in
any child.

In practice, I suspect this means fast_suspend will end up affecting
only leaf devices: those with no children.  This is partly because many
devices don't have a .prepare method; there are plenty of entries in
the device tree that don't correspond to physical devices (e.g., class
devices, or devices present only for their sysfs attributes) and hence
have no PM support at all.  Even though (c) does hold for such devices,
the PM core won't realize it.

In addition, I don't like the way your proposal mixes together the
answers to (a) and (c).  If they could be kept separate, I think you
could do a better job.

For instance, suppose (a) is true for some device, but (c) is false for
one of its children.  Then the PM core could skip the ->suspend and
->resume callbacks for that device, and it could do a pm_runtime_resume
on the device before resuming the child.  The end result would be a
single ->runtime_resume call, instead of ->runtime_resume followed by
->suspend and then ->resume.
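
The callback counts in this example can be checked with a small
user-space model.  It is only a sketch under simplifying assumptions: a
single parent that starts out runtime-suspended, and one child for which
(c) is false.  struct parent_model and the function names are
hypothetical and do not correspond to anything in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a parent device; not kernel code. */
struct parent_model {
	bool runtime_suspended;
	int runtime_resume_calls;
	int suspend_calls;
	int resume_calls;
};

static int total_calls(const struct parent_model *p)
{
	return p->runtime_resume_calls + p->suspend_calls + p->resume_calls;
}

/* Runs the (modelled) ->runtime_resume only if the device is suspended. */
static void model_runtime_resume(struct parent_model *p)
{
	assert(p != NULL);
	if (p->runtime_suspended) {
		p->runtime_suspended = false;
		p->runtime_resume_calls++;
	}
}

/*
 * Answers to (a) and (c) mixed into one flag: the parent cannot use the
 * fast path, so it is woken up and goes through ->suspend and ->resume.
 */
static int combined_flag_cycle(struct parent_model *p)
{
	model_runtime_resume(p);
	p->suspend_calls++;
	p->resume_calls++;
	return total_calls(p);
}

/*
 * Answers kept separate: (a) lets the core skip the parent's ->suspend
 * and ->resume; a child for which (c) is false only forces a runtime
 * resume of the parent just before that child is resumed.
 */
static int separate_flags_cycle(struct parent_model *p, bool child_needs_parent)
{
	/* Suspend phase: nothing to do for the parent. */
	if (child_needs_parent)
		model_runtime_resume(p);
	return total_calls(p);
}
```

With a runtime-suspended parent, the first function counts three
callbacks and the second counts one, which is the difference described
above.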

Alan Stern

* Re: [PATCH 1/3] PM / sleep: New flag to speed up suspend-resume of suspended devices
  2014-02-24 19:36             ` Alan Stern
@ 2014-02-25  0:07               ` Rafael J. Wysocki
  2014-02-25 17:08                 ` Alan Stern
  0 siblings, 1 reply; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-02-25  0:07 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Monday, February 24, 2014 02:36:02 PM Alan Stern wrote:
> On Mon, 24 Feb 2014, Rafael J. Wysocki wrote:
> 
> > Also, it may not do that today and I'd like to introduce a mechanism by which
> > that optimization may be enabled by subsystems/drivers when ready.
> > 
> > So for example today there is no guarantee that each device will be resumed
> > as appropriate by whoever handles its children when necessary, because that
> > code may expect the device to be operational when it is running (precisely
> > because we used to resume that device in .prepare()).
> 
> > Yes.  In particular, I'd like the "child" subsystem to be able to let the
> > core know that it is fine to leave the parent suspended.
> 
> > That said the patchset doesn't really do both.  It only really is about
> > skipping the ->suspend callbacks if possible, but the *consequence* of that
> > is that *system* resume ->resume callbacks cannot be used to resume the
> > device any more in general.  That's because ->resume_early may try to
> > reverse the ->suspend_late's actions and so on, so if ->suspend_late hasn't
> > run, it would be a bug to run ->resume_early for that device.
> 
> > I agree.  For this reason, I think that the core has no choice but to treat
> > power.ignore_children set as "well, that device may need to be operational
> > going forward".
> 
> This discussion is getting a little messy.  Let's try to clarify it.
> Here is the major point:
> 
> 	We would like to save time during system suspend/resume by

Actually, that's not only about saving time, but also about saving energy.

> 	skipping over devices that are already in runtime suspend,
> 	whenever it is safe to do so.
> 
> Of course, the "it is safe to do so" part is what makes this difficult.  
> It boils down to three characteristics for each device.
> 
>     (a) The device uses the same power state for runtime suspend and
> 	system suspend.  Therefore, if the device is already in runtime
> 	suspend and the wakeup settings are correct, there is no need
> 	for the PM core to invoke the device's ->suspend callbacks.
> 
> This requires a few comments.  The matter of whether the same power
> state is used for both types of suspend is generally known beforehand,
> because it is decided by the subsystem or driver.  The matter of
> whether the wakeup settings are correct often can't be known until the
> system suspend starts, because userspace can select whether or not
> wakeup should be enabled during a system sleep.
> 
> Also, if the PM core is going to skip the ->suspend callbacks then it 
> is obliged to skip the ->resume callbacks as well (we mustn't call one 
> without the other).  Therefore, in cases where (a) holds, the device 
> will necessarily emerge from the system resume in a runtime-suspended 
> state.

The "emerge from system resume" requires a bit of clarification in my
opinion.  Do you refer to the status of the device when user space is
thawed or earlier?  If earlier, then when exactly?

> This may or may not cause problems for the device's children; 
> see below.
> 
>     (b) It's okay for the device's parent to be in runtime suspend
> 	when the device's ->suspend callbacks are invoked.
> 
> I included this just to be thorough.  In fact, I expect (b) to be true 
> for pretty much every device already.

I don't quite understand this.  What if the parent is a bridge and the
child's ->suspend tries to access the child's registers?  That surely won't
work if the parent is in a low-power state at that point.

> Or if it isn't true for some 
> devices, this is because of a special arrangement between the device's 
> subsystem and the parent's subsystem.  For example, the parent might 
> always be runtime-resumed by its subsystem at the start of a system 
> suspend (which is what PCI does now; I don't know if it is necessary).
> 
> In the absence of any sort of special arrangement, if (b) wasn't true 
> for some device then that device would already be experiencing problems 
> going into system suspend.

Unless, of course, its parent is a PCI device, because in that case it will
always be resumed by the PCI bus type.  And if the "always resumed" is going
to be changed to "only resumed if the parent's configuration needs to change",
there may be some regressions here and there.

> So (b) should not cause much difficulty.

I disagree.

> And if a special arrangement is present, it is a private matter between 
> the two subsystems, not involving the PM core.
> 
>     (c) It's okay for the device's parent to be in runtime suspend
> 	when the device's ->resume callbacks are invoked.
> 
> Unlike (b), I expect that (c) does _not_ hold for quite a few devices
> currently.  The reason is historical: When runtime PM was first
> implemented, we decided that all devices should emerge from system
> resume in the RPM_ACTIVE state, even if they were in runtime suspend
> when the system suspend started.  Therefore drivers could depend on the
> parent not being in runtime suspend while the device's ->resume
> callback was running.

That's correct.

> I don't think it is a good idea to perpetuate an accident of history.  
> Instead of adding a special mechanism to the PM core for accommodating 
> devices where (c) doesn't hold, I think we should fix up the drivers 
> so that (c) _does_ hold everywhere.  Maybe you disagree.

I agree with the idea, but I have a certain view on how to achieve it.
Which is by allowing the "good" ones to mark themselves as "good", then
go through the ones that aren't marked as "good" yet, fix them up and
mark them as "good".  Finally, when everyone is marked "good", we can
drop the marking.

> Anyway, let's assume first that things are all fixed up, and (b) and
> (c) hold for every device.  This means we can go back and consider (a).
> 
> Since the "same power state for both types of suspend" answer is known 
> beforehand, let's concentrate on devices where it is true (other 
> devices will simply continue to operate as they do today).  For these 
> devices, the driver or subsystem will have to compute a flag value --
> let's call it "same_wakeup_setting".  The PM core can't do this because 
> it doesn't understand the device-specific details of wakeup settings.
> 
> Then your proposal comes down to this:
> 
> 	If ->prepare returns > 0, the PM core sets the
> 	same_wakeup_setting flag.
> 
> 	If same_wakeup_setting is on and the device is in runtime
> 	suspend, the PM core skips the various ->suspend and ->resume 
> 	callbacks.
> 
> My proposal was never made explicit, but it would take a form something 
> like this:
> 
> 	During ->prepare, the subsystem sets the same_wakeup_setting
> 	flag appropriately.
> 
> 	If same_wakeup_setting is on and the device is in runtime
> 	suspend, the subsystem's ->suspend and ->resume callbacks
> 	return immediately.
> 
> There isn't very much difference between the two proposals.  Mine is a 
> little more flexible; for example, it allows the subsystem to return 
> immediately from ->suspend but have ->resume put the device back in the 
> RPM_ACTIVE state.

Your version is fine by me, I only made the PM core set the flag, because it
would clear that flag afterward in some cases.

> Now let's change course and suppose that (c) _doesn't_ hold for a large
> selection of devices.  As a simple consequence, if (c) doesn't hold for
> some device then (a) can't be allowed to hold for any ancestor of that
> device.  (I'm disregarding the power.ignore_children flag.)
> 
> Your proposal would take the same_wakeup_setting flag (you called it
> "fast_suspend"), used for answering (a), and combine it with the answer
> to (c).  That is, you would have ->prepare return > 0 only if (a) and
> (c) both hold -- and in addition, you turn off the flag if it is off in
> any child.
> 
> In practice, I suspect this means fast_suspend will end up affecting
> only leaf devices: those with no children.  This is partly because many
> devices don't have a .prepare method; there are plenty of entries in
> the device tree that don't correspond to physical devices (e.g., class
> devices, or devices present only for their sysfs attributes) and hence
> have no PM support at all.  Even though (c) does hold for such devices,
> the PM core won't realize it.
> 
> In addition, I don't like the way your proposal mixes together the
> answers to (a) and (c).

It really doesn't and I tried to explain that in my previous message, but
that apparently wasn't clear enough.

The reasoning was basically to set fast_suspend for a device if your condition
(a) held *and* if fast_suspend was set for all of the device's children.  Then
we would know that all those children would be RPM_SUSPENDED during system
resume as well and the resume part might be "streamlined" as well.  There was
nothing about (c) anywhere in that patchset. :-)
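
The rule stated in this paragraph (condition (a) holds for the device
*and* fast_suspend is set for all of its children) amounts to a
bottom-up walk of the device hierarchy.  The sketch below models it in
user-space C.  It is an illustration, not the code from the patchset:
struct node, MAX_CHILDREN and compute_fast_suspend() are hypothetical
names, and power.ignore_children is left out for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 4

/* Hypothetical model of a node in the device hierarchy. */
struct node {
	bool cond_a;       /* condition (a) holds for this device */
	bool fast_suspend; /* result computed below */
	struct node *children[MAX_CHILDREN];
	size_t nr_children;
};

/*
 * Post-order walk: a node gets fast_suspend only if (a) holds for it
 * and every child ended up with fast_suspend set.  All children are
 * visited (no early exit) so that each one has its flag computed.
 */
static bool compute_fast_suspend(struct node *n)
{
	bool all_children = true;
	size_t i;

	assert(n != NULL);
	for (i = 0; i < n->nr_children; i++) {
		if (!compute_fast_suspend(n->children[i]))
			all_children = false;
	}
	n->fast_suspend = n->cond_a && all_children;
	return n->fast_suspend;
}
```

One child without the flag is enough to clear it for every ancestor in
this model, which is why the behaviour of leaf devices matters so much
for the rest of the hierarchy.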

> If they could be kept separate, I think you could do a better job.
> 
> For instance, suppose (a) is true for some device, but (c) is false for
> one of its children.  Then the PM core could skip the ->suspend and
> ->resume callbacks for that device, and it could do a pm_runtime_resume
> on the device before resuming the child.  The end result would be a
> single ->runtime_resume call, instead of ->runtime_resume followed by
> ->suspend and then ->resume.

That would be a different optimization from the one I'm thinking about.

For now, I'm focusing on one problem, which is when resuming runtime-suspended
devices during system suspend may be avoided and how to make that generally
work for different parent-child arrangements.

The resume part changes in my patchset were consequences of that only.

Rafael


* Re: [PATCH 1/3] PM / sleep: New flag to speed up suspend-resume of suspended devices
  2014-02-25  0:07               ` Rafael J. Wysocki
@ 2014-02-25 17:08                 ` Alan Stern
  2014-02-25 23:56                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 78+ messages in thread
From: Alan Stern @ 2014-02-25 17:08 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Tue, 25 Feb 2014, Rafael J. Wysocki wrote:

> > This discussion is getting a little messy.  Let's try to clarify it.
> > Here is the major point:
> > 
> > 	We would like to save time during system suspend/resume by
> 
> Actually, that's not only about saving time, but also about saving energy.

Sure.

> > 	skipping over devices that are already in runtime suspend,
> > 	whenever it is safe to do so.
> > 
> > Of course, the "it is safe to do so" part is what makes this difficult.  
> > It boils down to three characteristics for each device.
> > 
> >     (a) The device uses the same power state for runtime suspend and
> > 	system suspend.  Therefore, if the device is already in runtime
> > 	suspend and the wakeup settings are correct, there is no need
> > 	for the PM core to invoke the device's ->suspend callbacks.
> > 
> > This requires a few comments.  The matter of whether the same power
> > state is used for both types of suspend is generally known beforehand,
> > because it is decided by the subsystem or driver.  The matter of
> > whether the wakeup settings are correct often can't be known until the
> > system suspend starts, because userspace can select whether or not
> > wakeup should be enabled during a system sleep.
> > 
> > Also, if the PM core is going to skip the ->suspend callbacks then it 
> > is obliged to skip the ->resume callbacks as well (we mustn't call one 
> > without the other).  Therefore, in cases where (a) holds, the device 
> > will necessarily emerge from the system resume in a runtime-suspended 
> > state.
> 
> The "emerge from system resume" requires a bit of clarification in my
> opinion.  Do you refer to the status of the device when user space is
> thawed or earlier?  If earlier, then when exactly?

The status of the device when its ->resume callback returns.  Or in 
the context of your patch, when the ->resume callback is skipped.

> > This may or may not cause problems for the device's children; 
> > see below.
> > 
> >     (b) It's okay for the device's parent to be in runtime suspend
> > 	when the device's ->suspend callbacks are invoked.
> > 
> > I included this just to be thorough.  In fact, I expect (b) to be true 
> > for pretty much every device already.
> 
> I don't quite understand this.  What if the parent is a bridge and the
> child's ->suspend tries to access the child's registers?  That surely won't
> work if the parent is in a low-power state at that point.

It _does_ work on all current systems -- but only because the question 
never arises if the device's parent is never in runtime suspend when 
the device's ->suspend callbacks are invoked.

I admit, there most likely _are_ devices that would get into trouble if
the question ever did arise.

> > In the absence of any sort of special arrangement, if (b) wasn't true 
> > for some device then that device would already be experiencing problems 
> > going into system suspend.
> 
> Unless, of course, its parent is a PCI device, because in that case it will
> always be resumed by the PCI bus type.  And if the "always resumed" is going
> to be changed to "only resumed if the parent's configuration needs to change",
> there may be some regressions here and there.

Okay, so you want to take a problem involving PCI and some other
subsystems, and solve it by getting the PM core involved.  And you want
to mix it in with the idea of "fast suspend".  Is that really the best
approach?

Here's another approach.  Add an "okay for my parent to be
runtime-suspended when my ->suspend and ->resume callbacks are invoked"  
flag to dev->power.  Make pci_pm_prepare(dev) check this flag in every
child of dev; if there are any children where the flag isn't set then
call pm_runtime_resume(dev) (and print a message in the kernel log so 
that you know which driver needs to be fixed).
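
A rough user-space model of that check is sketched below.  It is not a
patch against pci_pm_prepare(): struct mdev, the parent_may_sleep flag
and bus_prepare() are hypothetical names chosen for the example, the
runtime resume is reduced to clearing a boolean, and the log message
goes to stderr instead of the kernel log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_CHILDREN 4

/* Hypothetical model; parent_may_sleep is a made-up flag name. */
struct mdev {
	const char *name;
	bool runtime_suspended;
	bool parent_may_sleep; /* "okay for my parent to be suspended" */
	struct mdev *children[MAX_CHILDREN];
	size_t nr_children;
};

/*
 * Model of a bus-level ->prepare: if any child has not declared that
 * it tolerates a suspended parent, report it and wake the parent up.
 * Returns true if the parent had to be kept (or made) active.
 */
static bool bus_prepare(struct mdev *dev)
{
	bool must_resume = false;
	size_t i;

	assert(dev != NULL);
	for (i = 0; i < dev->nr_children; i++) {
		const struct mdev *child = dev->children[i];

		if (!child->parent_may_sleep) {
			fprintf(stderr, "%s: child %s needs an active parent\n",
				dev->name, child->name);
			must_resume = true;
		}
	}
	if (must_resume)
		dev->runtime_suspended = false; /* stands in for a runtime resume */
	return must_resume;
}
```

The message identifies the child whose driver has not been converted
yet, so the list of drivers left to fix can be read straight from the
log.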

This is appropriate because it adjusts the PCI core in a way that can
safely avoid regressions, without getting the PM core involved in
matters that should remain strictly between the PCI subsystem and the
subsystems of children of PCI devices.

It also allows the "fast suspend" change to be cleanly separated out 
into a second patch.  In that patch, all you do is set 
dev->power.fast_suspend if ->prepare returns > 0, and then you skip 
->suspend and ->resume if fast_suspend is set and the device is
runtime-suspended.

> I agree with the idea, but I have a certain view on how to achieve it.
> Which is by allowing the "good" ones to mark themselves as "good", then
> go through the ones that aren't marked as "good" yet, fix them up and
> mark them as "good".  Finally, when everyone is marked "good", we can
> drop the marking.

How will you know when nothing remains unmarked?

> The reasoning was basically to set fast_suspend for a device if your condition
> (a) held *and* if fast_suspend was set for all of the device's children.  Then
> we would know that all those children would be RPM_SUSPENDED during system
> resume as well and the resume part might be "streamlined" as well.  There was
> nothing about (c) anywhere in that patchset. :-)

This seems like getting the answer to the wrong question.  You want to
know whether the children are okay with the parent being
runtime-suspended during the child's suspend and resume callbacks, but
instead you are asking if the children can remain runtime-suspended
through the entire system suspend.  Those are two different questions.

They seem related, because if the parent is in runtime suspend then the
child must also be in runtime suspend (disregarding the possibility of
ignore_children).  But this is a red herring.  It's entirely possible
for the child to be RPM_ACTIVE during one of these callbacks even if the
parent is RPM_SUSPENDED when the callback starts.  The child's driver
merely needs to call pm_runtime_resume(dev) at the beginning of the
callback.

If the child's driver does this, it would be perfectly okay for the
parent to use fast_suspend without the child using it too.  You could
skip calling the parent's suspend and resume callbacks, and the child
would work just fine.
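
The point about the child resuming itself can be modelled as follows.
This is a simplified user-space sketch, assuming that a runtime resume
of a device first brings its parent up unless the parent has
ignore_children set; struct rdev, model_runtime_resume() and
child_callback() are hypothetical names, not the kernel's runtime PM
implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device with a parent pointer. */
struct rdev {
	struct rdev *parent;
	bool runtime_suspended;
	bool ignore_children;
};

/*
 * Simplified runtime resume: wake the parent first (unless it ignores
 * its children), then the device itself.
 */
static void model_runtime_resume(struct rdev *dev)
{
	assert(dev != NULL);
	if (!dev->runtime_suspended)
		return;
	if (dev->parent && !dev->parent->ignore_children)
		model_runtime_resume(dev->parent);
	dev->runtime_suspended = false;
}

/*
 * A child callback that needs working hardware resumes the device
 * before doing anything else.  Returns true if the device is active
 * when the body of the callback would run.
 */
static bool child_callback(struct rdev *dev)
{
	model_runtime_resume(dev);
	/* Register accesses would go here. */
	return !dev->runtime_suspended;
}
```

In this model the parent never needs to know whether the child uses
fast_suspend: the child's own resume request is what brings the parent
back when it is needed.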

> That would be a different optimization from the one I'm thinking about.
> 
> For now, I'm focusing on one problem, which is when resuming runtime-suspended
> devices during system suspend may be avoided and how to make that generally
> work for different parent-child arrangements.
> 
> The resume part changes in my patchset were consequences of that only.

See, I think that by considering this as a single problem, you aren't 
getting down to the fundamental issues.

Alan Stern


* Re: [PATCH 1/3] PM / sleep: New flag to speed up suspend-resume of suspended devices
  2014-02-25 17:08                 ` Alan Stern
@ 2014-02-25 23:56                   ` Rafael J. Wysocki
  2014-02-26 16:49                     ` Alan Stern
  0 siblings, 1 reply; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-02-25 23:56 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Tuesday, February 25, 2014 12:08:14 PM Alan Stern wrote:
> On Tue, 25 Feb 2014, Rafael J. Wysocki wrote:
> 
> > > This discussion is getting a little messy.  Let's try to clarify it.
> > > Here is the major point:
> > > 
> > > 	We would like to save time during system suspend/resume by
> > 
> > Actually, that's not only about saving time, but also about saving energy.
> 
> Sure.
> 
> > > 	skipping over devices that are already in runtime suspend,
> > > 	whenever it is safe to do so.
> > > 
> > > Of course, the "it is safe to do so" part is what makes this difficult.  
> > > It boils down to three characteristics for each device.
> > > 
> > >     (a) The device uses the same power state for runtime suspend and
> > > 	system suspend.  Therefore, if the device is already in runtime
> > > 	suspend and the wakeup settings are correct, there is no need
> > > 	for the PM core to invoke the device's ->suspend callbacks.
> > > 
> > > This requires a few comments.  The matter of whether the same power
> > > state is used for both types of suspend is generally known beforehand,
> > > because it is decided by the subsystem or driver.  The matter of
> > > whether the wakeup settings are correct often can't be known until the
> > > system suspend starts, because userspace can select whether or not
> > > wakeup should be enabled during a system sleep.
> > > 
> > > Also, if the PM core is going to skip the ->suspend callbacks then it 
> > > is obliged to skip the ->resume callbacks as well (we mustn't call one 
> > > without the other).  Therefore, in cases where (a) holds, the device 
> > > will necessarily emerge from the system resume in a runtime-suspended 
> > > state.
> > 
> > The "emerge from system resume" requires a bit of clarification in my
> > opinion.  Do you refer to the status of the device when user space is
> > thawed or earlier?  If earlier, then when exactly?
> 
> The status of the device when its ->resume callback returns.  Or in 
> the context of your patch, when the ->resume callback is skipped.
> 
> > > This may or may not cause problems for the device's children; 
> > > see below.
> > > 
> > >     (b) It's okay for the device's parent to be in runtime suspend
> > > 	when the device's ->suspend callbacks are invoked.
> > > 
> > > I included this just to be thorough.  In fact, I expect (b) to be true 
> > > for pretty much every device already.
> > 
> > I don't quite understand this.  What if the parent is a bridge and the
> > child's ->suspend tries to access the child's registers?  That surely won't
> > work if the parent is in a low-power state at that point.
> 
> It _does_ work on all current systems -- but only because the question 
> never arises if the device's parent is never in runtime suspend when 
> the device's ->suspend callbacks are invoked.
> 
> I admit, there most likely _are_ devices that would get into trouble if
> the question ever did arise.

Well, I kind of put that to a test by posting these two patches:

https://patchwork.kernel.org/patch/3705261/
https://patchwork.kernel.org/patch/3705271/

We'll see if they lead to any regressions, but I'm going to work on top of
them going forward anyway.

> > > In the absence of any sort of special arrangement, if (b) wasn't true 
> > > for some device then that device would already be experiencing problems 
> > > going into system suspend.
> > 
> > Unless, of course, its parent is a PCI device, because in that case it will
> > always be resumed by the PCI bus type.  And if the "always resumed" is going
> > to be changed to "only resumed if the parent's configuration needs to change",
> > there may be some regressions here and there.
> 
> Okay, so you want to take a problem involving PCI and some other
> subsystems, and solve it by getting the PM core involved.  And you want
> to mix it in with the idea of "fast suspend".  Is that really the best
> approach?
> 
> Here's another approach.  Add an "okay for my parent to be
> runtime-suspended when my ->suspend and ->resume callbacks are invoked"  
> flag to dev->power.  Make pci_pm_prepare(dev) check this flag in every
> child of dev; if there are any children where the flag isn't set then
> call pm_runtime_resume(dev) (and print a message in the kernel log so 
> that you know which driver needs to be fixed).
> 
> This is appropriate because it adjusts the PCI core in a way that can
> safely avoid regressions, without getting the PM core involved in
> matters that should remain strictly between the PCI subsystem and the
> subsystems of children of PCI devices.
> 
> It also allows the "fast suspend" change to be cleanly separated out 
> into a second patch.  In that patch, all you do is set 
> dev->power.fast_suspend if ->prepare returns > 0, and then you skip 
> ->suspend and ->resume if fast_suspend is set and the device is
> runtime-suspended.

Actually, on top of the two patches mentioned above (and for devices
without power.ignore_children set) the question reduces to whether or not
(i) the device itself is runtime-suspended when its .suspend() callback is
running and (ii) its power state is such that it can remain suspended.
If both (i) and (ii) are met, the device may be left suspended safely,
because if any of its children had depended on it, they would have resumed
it already.

Still, I think that something like power.fast_suspend is needed to indicate
that .suspend_late(), .suspend_noirq(), .resume_noirq() and .resume_early()
should be skipped for it (in my opinion the core may very well skip them then)
and so that .resume() knows how to handle the device.
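
One way to read the last two paragraphs is sketched below as a
user-space model.  It is only an interpretation, not the planned
series: .suspend() always runs and makes the decision from conditions
(i) and (ii), the four late/noirq/early callbacks are skipped when the
flag is set, and .resume() always runs and can consult the flag.  The
enum, struct fdev and run_cycle() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Callback phases of one suspend/resume cycle, in execution order. */
enum pm_phase {
	PH_SUSPEND,
	PH_SUSPEND_LATE,
	PH_SUSPEND_NOIRQ,
	PH_RESUME_NOIRQ,
	PH_RESUME_EARLY,
	PH_RESUME,
	PH_COUNT
};

/* Hypothetical model of a device; not struct device. */
struct fdev {
	bool runtime_suspended; /* (i): suspended when .suspend() runs */
	bool state_ok;          /* (ii): power state may be kept as is */
	bool fast_suspend;
	int calls[PH_COUNT];    /* how often each callback was invoked */
};

/* Runs one modelled cycle; returns true if the fast path was taken. */
static bool run_cycle(struct fdev *dev)
{
	int p;

	assert(dev != NULL);

	/* .suspend() always runs and decides whether to set the flag. */
	dev->calls[PH_SUSPEND]++;
	dev->fast_suspend = dev->runtime_suspended && dev->state_ok;

	/* The core skips the four middle callbacks if the flag is set. */
	for (p = PH_SUSPEND_LATE; p <= PH_RESUME_EARLY; p++) {
		if (!dev->fast_suspend)
			dev->calls[p]++;
	}

	/* .resume() always runs and can check fast_suspend. */
	dev->calls[PH_RESUME]++;
	return dev->fast_suspend;
}
```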

I'll prepare a new series working along these lines.

> > I agree with the idea, but I have a certain view on how to achieve it.
> > Which is by allowing the "good" ones to mark themselves as "good", then
> > go through the ones that aren't marked as "good" yet, fix them up and
> > mark them as "good".  Finally, when everyone is marked "good", we can
> > drop the marking.
> 
> How will you know when nothing remains unmarked?
> 
> > The reasoning was basically to set fast_suspend for a device if your condition
> > (a) held *and* if fast_suspend was set for all of the device's children.  Then
> > we would know that all those children would be RPM_SUSPENDED during system
> > resume as well and the resume part might be "streamlined" as well.  There was
> > nothing about (c) anywhere in that patchset. :-)
> 
> This seems like getting the answer to the wrong question.  You want to
> know whether the children are okay with the parent being
> runtime-suspended during the child's suspend and resume callbacks, but
> instead you are asking if the children can remain runtime-suspended
> through the entire system suspend.  Those are two different questions.
> 
> They seem related, because if the parent is in runtime suspend then the
> child must also be in runtime suspend (disregarding the possibility of
> ignore_children).  But this is a red herring.  It's entirely possible
> for the child to be RPM_ACTIVE during one of these callbacks even if the
> parent is RPM_SUSPENDED when the callback starts.  The child's driver
> merely needs to call pm_runtime_resume(dev) at the beginning of the
> callback.
> 
> If the child's driver does this, it would be perfectly okay for the
> parent to use fast_suspend without the child using it too.  You could
> skip calling the parent's suspend and resume callbacks, and the child
> would work just fine.
> 
> > That would be a different optimization from the one I'm thinking about.
> > 
> > For now, I'm focusing on one problem, which is when resuming runtime-suspended
> > devices during system suspend may be avoided and how to make that generally
> > work for different parent-child arrangements.
> > 
> > The resume part changes in my patchset were consequences of that only.
> 
> See, I think that by considering this as a single problem, you aren't 
> getting down to the fundamental issues.

Well, I'm not sure what exactly you mean.

I generally agree that whether or not a device may be left suspended during and
after system resume and whether or not a device may be left suspended during
system suspend are two different questions.  However, when it *is* left
suspended during system suspend, then that implies a certain way of handling it
during the subsequent system resume.  After which it still may not be left
suspended.

Rafael


* Re: [PATCH 1/3] PM / sleep: New flag to speed up suspend-resume of suspended devices
  2014-02-25 23:56                   ` Rafael J. Wysocki
@ 2014-02-26 16:49                     ` Alan Stern
  2014-02-26 21:44                       ` Rafael J. Wysocki
  0 siblings, 1 reply; 78+ messages in thread
From: Alan Stern @ 2014-02-26 16:49 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Wed, 26 Feb 2014, Rafael J. Wysocki wrote:

> > I admit, there most likely _are_ devices that would get into trouble if
> > the question ever did arise.
> 
> Well, I kind of put that to a test by posting these two patches:
> 
> https://patchwork.kernel.org/patch/3705261/
> https://patchwork.kernel.org/patch/3705271/
> 
> We'll see if they lead to any regressions, but I'm going to work on top of
> them going forward anyway.

And here I had the impression that you wanted to avoid any regressions
from those patches...

> Actually, on top of the two patches mentioned above (and for devices
> without power.ignore_children set) the question reduces to whether or not
> (i) the device itself is runtime-suspended when its .suspend() callback is
> running and (ii) its power state is such that it can remain suspended.
> If both (i) and (ii) are met, the device may be left suspended safely,
> because if any of its children had depended on it, they would have resumed
> it already.

Does this mean you changed your mind?  In an earlier email, you wrote:

> >     (b) It's okay for the device's parent to be in runtime suspend
> >       when the device's ->suspend callbacks are invoked.
> > 
> > I included this just to be thorough.  In fact, I expect (b) to be true 
> > for pretty much every device already.
> 
> I don't quite understand this.  What if the parent is a bridge and the
> child's ->suspend tries to access the child's registers?  That surely won't
> work if the parent is in a low-power state at that point.

So the answer is that if the bridge is suspended, then the child must
be suspended too and hence the child's ->suspend should _expect_
problems if it tries to access the child's registers.

(By the way, during this discussion I have had a tendency to mix up two 
related concepts:

	The device's ->suspend routine expects the _parent_ not to be
	suspended;

	The device's ->suspend routine expects the _device_ not to be
	suspended.

Obviously the second implies the first.  But once the second has been
fixed, the first should never cause any trouble.)

> Still, I think that something like power.fast_suspend is needed to indicate
> that .suspend_late(), .suspend_noirq(), .resume_noirq() and .resume_early()
> should be skipped for it (in my opinion the core may very well skip them then)
> and so that .resume() knows how to handle the device.

I don't follow.  Why would you skip these routines without also
skipping .suspend and .resume?

> I generally agree that whether or not a device may be left suspended during and
> after system resume and whether or not a device may be left suspended during
> system suspend are two different questions.  However, when it *is* left
> suspended during system suspend, then that implies certain way of handling it
> during the subsequent system resume.  After which it still may not be left
> suspended.

I would prefer to say: "However, when the system suspend callbacks
_are_ skipped, that implies the corresponding system resume callbacks
must also be skipped and hence the device must remain suspended".  Is
this consistent with what you meant?

As I see it, the fast_suspend implementation could lead to regressions
in two ways:

	The child's ->suspend doesn't expect the parent to be 
	suspended.

	The child's ->resume doesn't expect the parent to be
	suspended.

We agree now that the first won't be a problem, because it would imply
the child is suspended too.

However, the second may indeed be a problem.  I don't know how you
intend to handle it.  Apply the patch, like you did for ACPI and PCI
above, and then see what happens?

A simple solution is to use fast_suspend only for devices that have no
children.  But that would not be optimal.

Another possibility is always to call pm_runtime_resume(dev->parent)
before invoking dev's ->resume callback.  But that might not solve the
entire problem (it wouldn't help dev's ->resume_early callback, for
instance) and it also might be sub-optimal.

Alan Stern


* Re: [PATCH 1/3] PM / sleep: New flag to speed up suspend-resume of suspended devices
  2014-02-26 16:49                     ` Alan Stern
@ 2014-02-26 21:44                       ` Rafael J. Wysocki
  2014-02-26 22:17                         ` Alan Stern
  0 siblings, 1 reply; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-02-26 21:44 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Wednesday, February 26, 2014 11:49:05 AM Alan Stern wrote:
> On Wed, 26 Feb 2014, Rafael J. Wysocki wrote:
> 
> > > I admit, there most likely _are_ devices that would get into trouble if
> > > the question ever did arise.
> > 
> > Well, I kind of put that to a test by posting these two patches:
> > 
> > https://patchwork.kernel.org/patch/3705261/
> > https://patchwork.kernel.org/patch/3705271/
> > 
> > We'll see if they lead to any regressions, but I'm going to work on top of
> > them going forward anyway.
> 
> And here I had the impression that you wanted to avoid any regressions
> from those patches...

In the meantime I realized that regressions from them are quite unlikely and
if they happen, they will be limited to corner cases.

The reason is (as you kind of noted below) that (assuming that runtime PM is
supported for both the child and the parent) if the child is to be accessed
in ->suspend, for example, it should be runtime-resumed beforehand and that
will cause the parent to be runtime-resumed as well (the power.ignore_children
devices are special-cased explicitly).

> > Actually, on top of the two patches mentioned above (and for devices
> > without power.ignore_children set) the question reduces to whether or not
> > (i) the device itself is runtime-suspended when its .suspend() callback is
> > running and (ii) its power state is such that it can remain suspended.
> > If both (i) and (ii) are met, the device may be left suspended safely,
> > because if any of its children had depended on it, they would have resumed
> > it already.
> 
> Does this mean you changed your mind?  In an earlier email, you wrote:

I've just realized that if the pm_runtime_resume() is moved from ->prepare
to ->suspend, then it is done *after* the ->suspend callbacks of the
children have been executed, which means that if the parent is runtime-suspended
at this point, the children are runtime-suspended as well.  Moreover, they
can't be runtime-resumed in ->suspend_late, because that is executed with
runtime PM disabled, so it is reasonably safe to assume that they won't need
the parent going forward - until ->resume.

> > >     (b) It's okay for the device's parent to be in runtime suspend
> > >       when the device's ->suspend callbacks are invoked.
> > > 
> > > I included this just to be thorough.  In fact, I expect (b) to be true 
> > > for pretty much every device already.
> > 
> > I don't quite understand this.  What if the parent is a bridge and the
> > child's ->suspend tries to access the child's registers?  That surely won't
> > work if the parent is in a low-power state at that point.
> 
> So the answer is that if the bridge is suspended, then the child must
> be suspended too and hence the child's ->suspend should _expect_
> problems if it tries to access the child's registers.

Agreed.

> (By the way, during this discussion I have had a tendency to mix up two 
> related concepts:
> 
> 	The device's ->suspend routine expects the _parent_ not to be
> 	suspended;
> 
> 	The device's ->suspend routine expects the _device_ not to be
> 	suspended.
> 
> Obviously the second implies the first.  But once the second has been
> fixed, the first should never cause any trouble.)
> 
> > Still, I think that something like power.fast_suspend is needed to indicate
> > that .suspend_late(), .suspend_noirq(), .resume_noirq() and .resume_early()
> > should be skipped for it (in my opinion the core may very well skip them then)
> > and so that .resume() knows how to handle the device.
> 
> I don't follow.  Why would you skip these routines without also
> skipping .suspend and .resume?

Because .suspend will set the flag and then it would be reasonable to call .resume,
for symmetry and to let it decide what to do (e.g. call pm_runtime_resume(dev) or
do something else, depending on the subsystem).

> > I generally agree that whether or not a device may be left suspended during and
> > after system resume and whether or not a device may be left suspended during
> > system suspend are two different questions.  However, when it *is* left
> > suspended during system suspend, then that implies certain way of handling it
> > during the subsequent system resume.  After which it still may not be left
> > suspended.
> 
> I would prefer to say: "However, when the system suspend callbacks
> _are_ skipped, that implies the corresponding system resume callbacks
> must also be skipped and hence the device must remain suspended".  Is
> this consistent with what you meant?

Yes, it is.

> As I see it, the fast_suspend implementation could lead to regressions
> in two ways:
> 
> 	The child's ->suspend doesn't expect the parent to be 
> 	suspended.
> 
> 	The child's ->resume doesn't expect the parent to be
> 	suspended.
> 
> We agree now that the first won't be a problem, because it would imply
> the child is suspended too.

Yes.

> However, the second may indeed be a problem.  I don't know how you
> intend to handle it.  Apply the patch, like you did for ACPI and PCI
> above, and then see what happens?

For starters, I'd just make the parent's ->resume call pm_runtime_resume(dev).
That will make the parent be ready before the child's ->resume is called.
And then it may be optimized further going forward, possibly by replacing
the pm_runtime_resume() with pm_request_resume() for some devices and by
leaving some devices in RPM_SUSPENDED.

> A simple solution is to use fast_suspend only for devices that have no
> children.  But that would not be optimal.
> 
> Another possibility is always to call pm_runtime_resume(dev->parent)
> before invoking dev's ->resume callback.  But that might not solve the
> entire problem (it wouldn't help dev's ->resume_early callback, for
> instance) and it also might be sub-optimal.

The child's ->resume_early may be a problem indeed (or its ->resume_noirq
for that matter).

Well, if power.fast_suspend set guarantees that ->suspend_late, ->suspend_noirq,
->resume_noirq, and ->resume_early will be skipped for a device, then we may
restrict setting it for devices whose children have it set (or that have no
children).  Initially, that will be equivalent to setting it for leaf devices
only, but it might be extended over time in a natural way.
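
As a rough illustration of that restriction, here is a user-space sketch (not
kernel code).  struct toy_node and toy_may_set_fast_suspend() are made-up
names, and the fast_suspend member only mirrors the flag discussed above:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device with a small, fixed set of children. */
struct toy_node {
	struct toy_node *children[4];
	size_t nr_children;
	bool fast_suspend;
};

/*
 * Proposed initial rule: the flag may be set for @dev only if every child
 * already has it set.  A device without children passes trivially.
 */
bool toy_may_set_fast_suspend(const struct toy_node *dev)
{
	size_t i;

	assert(dev != NULL);

	for (i = 0; i < dev->nr_children; i++) {
		if (!dev->children[i]->fast_suspend)
			return false;
	}
	return true;
}
```

With this rule a parent only becomes eligible after all of its children have
opted in, which is why it starts out as "leaf devices only".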

Rafael



* Re: [PATCH 1/3] PM / sleep: New flag to speed up suspend-resume of suspended devices
  2014-02-26 21:44                       ` Rafael J. Wysocki
@ 2014-02-26 22:17                         ` Alan Stern
  2014-02-26 23:13                           ` Rafael J. Wysocki
  0 siblings, 1 reply; 78+ messages in thread
From: Alan Stern @ 2014-02-26 22:17 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Wed, 26 Feb 2014, Rafael J. Wysocki wrote:

> > > Still, I think that something like power.fast_suspend is needed to indicate
> > > that .suspend_late(), .suspend_noirq(), .resume_noirq() and .resume_early()
> > > should be skipped for it (in my opinion the core may very well skip them then)
> > > and so that .resume() knows how to handle the device.
> > 
> > I don't follow.  Why would you skip these routines without also
> > skipping .suspend and .resume?
> 
> Because .suspend will set the flag and then it would be reasonable to call .resume,
> for symmetry and to let it decide what to do (e.g. call pm_runtime_resume(dev) or
> do something else, depending on the subsystem).

In the original patch, ->prepare returned the flag.  When it was set,
you would skip ->suspend, ->suspend_late, and ->suspend_noirq (and the
corresponding resume callbacks).  Did you decide to change this?

> > However, the second may indeed be a problem.  I don't know how you
> > intend to handle it.  Apply the patch, like you did for ACPI and PCI
> > above, and then see what happens?
> 
> For starters, I'd just make the parent's ->resume call pm_runtime_resume(dev).
> That will make the parent be ready before the child's ->resume is called.
> And then it may be optimized further going forward, possibly by replacing
> the pm_runtime_resume() with pm_request_resume() for some devices and by
> leaving some devices in RPM_SUSPENDED.

Of course, this would not be possible with the original version of the 
patch, because it wouldn't invoke the parent's ->resume.

> > A simple solution is to use fast_suspend only for devices that have no
> > children.  But that would not be optimal.
> > 
> > Another possibility is always to call pm_runtime_resume(dev->parent)
> > before invoking dev's ->resume callback.  But that might not solve the
> > entire problem (it wouldn't help dev's ->resume_early callback, for
> > instance) and it also might be sub-optimal.
> 
> The child's ->resume_early may be a problem indeed (or its ->resume_noirq
> for that matter).

If the child knows about the problem beforehand, it can runtime-resume 
the parent during its ->suspend.

> Well, if power.fast_suspend set guarantees that ->suspend_late, ->suspend_noirq,
> ->resume_noirq, and ->resume_early will be skipped for a device, then we may
> restrict setting it for devices whose children have it set (or that have no
> children).  Initially, that will be equivalent to setting it for leaf devices
> only, but it might be extended over time in a natural way.

Initially, maybe.  But it's the wrong approach in general.  The right 
approach is to restrict setting fast_suspend for devices whose children 
don't mind their parent being suspended when their resume callbacks 
run -- not for devices whose children also have fast_suspend set.

That's the point I've been trying to express all along.

Alan Stern


* Re: [PATCH 1/3] PM / sleep: New flag to speed up suspend-resume of suspended devices
  2014-02-26 22:17                         ` Alan Stern
@ 2014-02-26 23:13                           ` Rafael J. Wysocki
  2014-02-27 15:02                             ` Alan Stern
  0 siblings, 1 reply; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-02-26 23:13 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Wednesday, February 26, 2014 05:17:03 PM Alan Stern wrote:
> On Wed, 26 Feb 2014, Rafael J. Wysocki wrote:
> 
> > > > Still, I think that something like power.fast_suspend is needed to indicate
> > > > that .suspend_late(), .suspend_noirq(), .resume_noirq() and .resume_early()
> > > > should be skipped for it (in my opinion the core may very well skip them then)
> > > > and so that .resume() knows how to handle the device.
> > > 
> > > I don't follow.  Why would you skip these routines without also
> > > skipping .suspend and .resume?
> > 
> > Because .suspend will set the flag and then it would be reasonable to call .resume,
> > for symmetry and to let it decide what to do (e.g. call pm_runtime_resume(dev) or
> > do something else, depending on the subsystem).
> 
> In the original patch, ->prepare returned the flag.  When it was set,
> you would skip ->suspend, ->suspend_late, and ->suspend_noirq (and the
> corresponding resume callbacks).  Did you decide to change this?

Yes, I did.

After these patches:

https://patchwork.kernel.org/patch/3705261/
https://patchwork.kernel.org/patch/3705271/

the decision doesn't have to be made until ->suspend (I'm ignoring the
power.ignore_children set special case), because that's when
pm_runtime_resume(dev) is now called (by ACPI and PCI).

> > > However, the second may indeed be a problem.  I don't know how you
> > > intend to handle it.  Apply the patch, like you did for ACPI and PCI
> > > above, and then see what happens?
> > 
> > For starters, I'd just make the parent's ->resume call pm_runtime_resume(dev).
> > That will make the parent be ready before the child's ->resume is called.
> > And then it may be optimized further going forward, possibly by replacing
> > the pm_runtime_resume() with pm_request_resume() for some devices and by
> > leaving some devices in RPM_SUSPENDED.
> 
> Of course, this would not be possible with the original version of the 
> patch, because it wouldn't invoke the parent's ->resume.

Right.

> > > A simple solution is to use fast_suspend only for devices that have no
> > > children.  But that would not be optimal.
> > > 
> > > Another possibility is always to call pm_runtime_resume(dev->parent)
> > > before invoking dev's ->resume callback.  But that might not solve the
> > > entire problem (it wouldn't help dev's ->resume_early callback, for
> > > instance) and it also might be sub-optimal.
> > 
> > The child's ->resume_early may be a problem indeed (or its ->resume_noirq
> > for that matter).
> 
> If the child knows about the problem beforehand, it can runtime-resume 
> the parent during its ->suspend.

Well, it even should do that in those cases.  We may need to deal with children
that don't do that, though.

> > Well, if power.fast_suspend set guarantees that ->suspend_late, ->suspend_noirq,
> > ->resume_noirq, and ->resume_early will be skipped for a device, then we may
> > restrict setting it for devices whose children have it set (or that have no
> > children).  Initially, that will be equivalent to setting it for leaf devices
> > only, but it might be extended over time in a natural way.
> 
> Initially, maybe.

Of course initially.

> But it's the wrong approach in general.

In the long run - I agree.

> The right approach is to restrict setting fast_suspend for devices whose
> children don't mind their parent being suspended when their resume callbacks 
> run -- not for devices whose children also have fast_suspend set.

I agree, but we need to know which children are OK with the parent being
suspended.  Having fast_suspend set is a good indication of that. :-)

Of course, we may introduce a separate flag for that just fine if you prefer.

> That's the point I've been trying to express all along.

I see.

Rafael



* Re: [PATCH 1/3] PM / sleep: New flag to speed up suspend-resume of suspended devices
  2014-02-26 23:13                           ` Rafael J. Wysocki
@ 2014-02-27 15:02                             ` Alan Stern
  2014-04-24 22:36                               ` [RFC][PATCH 0/3] PM: Mechanism to avoid resuming runtime-suspended devices during system suspend, v2 Rafael J. Wysocki
  0 siblings, 1 reply; 78+ messages in thread
From: Alan Stern @ 2014-02-27 15:02 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Thu, 27 Feb 2014, Rafael J. Wysocki wrote:

> > If the child knows about the problem beforehand, it can runtime-resume 
> > the parent during its ->suspend.
> 
> Well, it even should do that in those cases.  We may need to deal with children
> that don't do that, though.
> 
> > > Well, if power.fast_suspend set guarantees that ->suspend_late, ->suspend_noirq,
> > > ->resume_noirq, and ->resume_early will be skipped for a device, then we may
> > > restrict setting it for devices whose children have it set (or that have no
> > > children).  Initially, that will be equivalent to setting it for leaf devices
> > > only, but it might be extended over time in a natural way.
> > 
> > Initially, maybe.
> 
> Of course initially.
> 
> > But it's the wrong approach in general.
> 
> In the long run - I agree.
> 
> > The right approach is to restrict setting fast_suspend for devices whose
> > children don't mind their parent being suspended when their resume callbacks 
> > run -- not for devices whose children also have fast_suspend set.
> 
> I agree, but we need to know which children are OK with the parent being
> suspended.  Having fast_suspend set is a good indication of that. :-)
> 
> Of course, we may introduce a separate flag for that just fine if you prefer.
> 
> > That's the point I've been trying to express all along.
> 
> I see.

Okay.  I'll wait to see the next version.

Alan Stern



* [RFC][PATCH 0/3] PM: Mechanism to avoid resuming runtime-suspended devices during system suspend, v2
  2014-02-27 15:02                             ` Alan Stern
@ 2014-04-24 22:36                               ` Rafael J. Wysocki
  2014-04-24 22:37                                 ` [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices Rafael J. Wysocki
                                                   ` (2 more replies)
  0 siblings, 3 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-04-24 22:36 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Thursday, February 27, 2014 10:02:05 AM Alan Stern wrote:
> On Thu, 27 Feb 2014, Rafael J. Wysocki wrote:
> 
> > > If the child knows about the problem beforehand, it can runtime-resume 
> > > the parent during its ->suspend.
> > 
> > Well, it even should do that in those cases.  We may need to deal with children
> > that don't do that, though.
> > 
> > > > Well, if power.fast_suspend set guarantees that ->suspend_late, ->suspend_noirq,
> > > > ->resume_noirq, and ->resume_early will be skipped for a device, then we may
> > > > restrict setting it for devices whose children have it set (or that have no
> > > > children).  Initially, that will be equivalent to setting it for leaf devices
> > > > only, but it might be extended over time in a natural way.
> > > 
> > > Initially, maybe.
> > 
> > Of course initially.
> > 
> > > But it's the wrong approach in general.
> > 
> > In the long run - I agree.
> > 
> > > The right approach is to restrict setting fast_suspend for devices whose
> > > children don't mind their parent being suspended when their resume callbacks 
> > > run -- not for devices whose children also have fast_suspend set.
> > 
> > I agree, but we need to know which children are OK with the parent being
> > suspended.  Having fast_suspend set is a good indication of that. :-)
> > 
> > Of course, we may introduce a separate flag for that just fine if you prefer.
> > 
> > > That's the point I've been trying to express all along.
> > 
> > I see.
> 
> Okay.  I'll wait to see the next version.

Well, that took some time, but a new version follows.

It uses two flags (see the changelog of patch [1/3]) and is reworked on top of
the changes that went into 3.15-rc.  Patch [2/3] is just a resend.

Thanks,
Rafael


* [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices
  2014-04-24 22:36                               ` [RFC][PATCH 0/3] PM: Mechanism to avoid resuming runtime-suspended devices during system suspend, v2 Rafael J. Wysocki
@ 2014-04-24 22:37                                 ` Rafael J. Wysocki
  2014-05-01 21:39                                   ` Alan Stern
  2014-04-24 22:39                                 ` [RFC][PATCH 2/3][Resend] PM / runtime: Routine for checking device status during system suspend Rafael J. Wysocki
  2014-04-24 22:40                                 ` [RFC][PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain " Rafael J. Wysocki
  2 siblings, 1 reply; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-04-24 22:37 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
resume all runtime-suspended devices during system suspend, mostly
because those devices may need to be reprogrammed due to different
wakeup settings for system sleep and for runtime PM.  However, at
least in some cases, that isn't really necessary, because the wakeup
settings may not be really different.

The idea here is that subsystems should know whether or not it is
necessary to reprogram a given device during system suspend and they
should be able to tell the PM core about that.  For that reason, add
two new device PM flags, power.resume_not_needed and
power.use_runtime_resume, such that:

 (1) If power.resume_not_needed is set for the given device and for
     all of its children and the device is runtime-suspended during
     device_suspend(), the device's remaining system suspend/resume
     callbacks need not be executed.

 (2) If power.use_runtime_resume is set for the given device and the
     device is runtime-suspended in device_suspend_late(), its late/early
     and noirq system suspend/resume callbacks should be skipped and
     it should be resumed through pm_runtime_resume() in device_resume().

Those flags are cleared by the PM core in dpm_prepare() for all
devices.  Next, subsystems (or drivers) are supposed to set
power.resume_not_needed in their ->prepare() callbacks for devices
whose remaining system suspend/resume callbacks are generally safe to
be skipped if they are runtime-suspended already.  Finally, for each
runtime-suspended device with power.resume_not_needed set during
device_suspend(), its subsystem (or driver) may set the device's
power.use_runtime_resume in its ->suspend() callback without changing
the device's state, in which case the PM core will skip all of the
subsequent system suspend/resume callbacks for it and will resume it
in device_resume() using pm_runtime_resume().
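
The flow described above can be modeled in user space roughly as follows.
This is only a sketch under simplifying assumptions: struct toy_dev and the
toy_*() functions are hypothetical, locking and async handling are left out,
and the subsystem's decision in ->prepare() is reduced to a boolean argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the pieces of struct device used by this patch. */
struct toy_dev {
	struct toy_dev *parent;
	bool runtime_suspended;
	bool resume_not_needed;
	bool use_runtime_resume;
	int late_callbacks;	/* how many times ->suspend_late() really ran */
};

/* dpm_prepare(): the core clears both flags, then ->prepare() may opt in. */
void toy_prepare(struct toy_dev *dev, bool subsys_opts_in)
{
	assert(dev != NULL);
	dev->use_runtime_resume = false;
	dev->resume_not_needed = false;
	if (subsys_opts_in)
		dev->resume_not_needed = true;
}

/* __device_suspend(): runs for children before their parent. */
void toy_suspend(struct toy_dev *dev)
{
	/* Subsystem ->suspend(): leave the device alone and set the flag. */
	if (dev->runtime_suspended && dev->resume_not_needed)
		dev->use_runtime_resume = true;

	/* Core: a child that did not opt in vetoes its parent. */
	if (dev->parent && !dev->resume_not_needed)
		dev->parent->resume_not_needed = false;
}

/* __device_suspend_late(): skipped when use_runtime_resume is set. */
void toy_suspend_late(struct toy_dev *dev)
{
	if (dev->use_runtime_resume)
		return;
	dev->late_callbacks++;
}

/* device_resume(): flagged devices are resumed via runtime PM instead. */
void toy_resume(struct toy_dev *dev)
{
	if (dev->use_runtime_resume) {
		dev->runtime_suspended = false;	/* models pm_runtime_resume() */
		return;
	}
	/* The regular ->resume() callback would run here. */
}
```

The point of the model is the ordering: because children are suspended before
their parent, a single child without resume_not_needed is enough to make the
parent go through the regular callbacks.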

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/power/main.c |   41 ++++++++++++++++++++++++++++++++++-------
 include/linux/pm.h        |    2 ++
 2 files changed, 36 insertions(+), 7 deletions(-)

Index: linux-pm/include/linux/pm.h
===================================================================
--- linux-pm.orig/include/linux/pm.h
+++ linux-pm/include/linux/pm.h
@@ -546,6 +546,8 @@ struct dev_pm_info {
 	bool			is_late_suspended:1;
 	bool			ignore_children:1;
 	bool			early_init:1;	/* Owned by the PM core */
+	bool			resume_not_needed:1;
+	bool			use_runtime_resume:1;
 	spinlock_t		lock;
 #ifdef CONFIG_PM_SLEEP
 	struct list_head	entry;
Index: linux-pm/drivers/base/power/main.c
===================================================================
--- linux-pm.orig/drivers/base/power/main.c
+++ linux-pm/drivers/base/power/main.c
@@ -479,7 +479,7 @@ static int device_resume_noirq(struct de
 	TRACE_DEVICE(dev);
 	TRACE_RESUME(0);
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || dev->power.use_runtime_resume)
 		goto Out;
 
 	if (!dev->power.is_noirq_suspended)
@@ -605,7 +605,7 @@ static int device_resume_early(struct de
 	TRACE_DEVICE(dev);
 	TRACE_RESUME(0);
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || dev->power.use_runtime_resume)
 		goto Out;
 
 	if (!dev->power.is_late_suspended)
@@ -735,6 +735,11 @@ static int device_resume(struct device *
 	if (dev->power.syscore)
 		goto Complete;
 
+	if (dev->power.use_runtime_resume) {
+		pm_runtime_resume(dev);
+		goto Complete;
+	}
+
 	dpm_wait(dev->parent, async);
 	dpm_watchdog_set(&wd, dev);
 	device_lock(dev);
@@ -1007,7 +1012,7 @@ static int __device_suspend_noirq(struct
 		goto Complete;
 	}
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || dev->power.use_runtime_resume)
 		goto Complete;
 
 	dpm_wait_for_children(dev, async);
@@ -1146,7 +1151,7 @@ static int __device_suspend_late(struct
 		goto Complete;
 	}
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || dev->power.use_runtime_resume)
 		goto Complete;
 
 	dpm_wait_for_children(dev, async);
@@ -1383,9 +1388,29 @@ static int __device_suspend(struct devic
  End:
 	if (!error) {
 		dev->power.is_suspended = true;
-		if (dev->power.wakeup_path
-		    && dev->parent && !dev->parent->power.ignore_children)
-			dev->parent->power.wakeup_path = true;
+		if (dev->parent) {
+			spin_lock_irq(&dev->parent->power.lock);
+
+			if (dev->power.wakeup_path
+			    && !dev->parent->power.ignore_children)
+				dev->parent->power.wakeup_path = true;
+
+			/*
+			 * Subsystems are supposed to set resume_not_needed in
+			 * their ->prepare() callbacks for devices whose
+			 * remaining system suspend and resume callbacks are
+			 * generally safe to be skipped if the devices are
+			 * runtime-suspended.
+			 *
+			 * If the device's resume_not_needed is not set at this
+			 * point, it has to be cleared for its parent too (even
+			 * if the subsystem has set that flag for it).
+			 */
+			if (!dev->power.resume_not_needed)
+				dev->parent->power.resume_not_needed = false;
+
+			spin_unlock_irq(&dev->parent->power.lock);
+		}
 	}
 
 	device_unlock(dev);
@@ -1553,6 +1578,8 @@ int dpm_prepare(pm_message_t state)
 		struct device *dev = to_device(dpm_list.next);
 
 		get_device(dev);
+		dev->power.use_runtime_resume = false;
+		dev->power.resume_not_needed = false;
 		mutex_unlock(&dpm_list_mtx);
 
 		error = device_prepare(dev, state);


* [RFC][PATCH 2/3][Resend] PM / runtime: Routine for checking device status during system suspend
  2014-04-24 22:36                               ` [RFC][PATCH 0/3] PM: Mechanism to avoid resuming runtime-suspended devices during system suspend, v2 Rafael J. Wysocki
  2014-04-24 22:37                                 ` [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices Rafael J. Wysocki
@ 2014-04-24 22:39                                 ` Rafael J. Wysocki
  2014-04-25 11:28                                   ` Ulf Hansson
  2014-04-24 22:40                                 ` [RFC][PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain " Rafael J. Wysocki
  2 siblings, 1 reply; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-04-24 22:39 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Add a new helper routine, pm_runtime_enabled_and_suspended(), to
allow subsystems (or PM domains) to check the runtime PM status of
devices during system suspend (possibly to avoid resuming those
devices upfront at that time).
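
For illustration, the check performed by the helper can be modeled in user
space as below.  This is a sketch only: struct toy_rpm_dev, the TOY_RPM_*
values and toy_barrier() are hypothetical stand-ins, the power.lock handling
is omitted, and the barrier is assumed to simply let an in-flight suspend
finish.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_rpm_status {
	TOY_RPM_ACTIVE,
	TOY_RPM_SUSPENDING,
	TOY_RPM_SUSPENDED,
};

/* Hypothetical model of the runtime PM fields the helper looks at. */
struct toy_rpm_dev {
	unsigned int disable_depth;	/* nonzero: runtime PM is disabled */
	enum toy_rpm_status status;
};

/* Stand-in for __pm_runtime_barrier(): wait for a pending transition. */
void toy_barrier(struct toy_rpm_dev *dev)
{
	if (dev->status == TOY_RPM_SUSPENDING)
		dev->status = TOY_RPM_SUSPENDED;
}

/* Mirrors the decision made by pm_runtime_enabled_and_suspended(). */
bool toy_enabled_and_suspended(struct toy_rpm_dev *dev)
{
	assert(dev != NULL);

	if (dev->disable_depth)
		return false;

	toy_barrier(dev);
	return dev->status == TOY_RPM_SUSPENDED;
}
```

A device with runtime PM disabled is always reported as "not suspended", so
callers fall back to the regular system suspend path for it.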

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/power/runtime.c |   28 ++++++++++++++++++++++++++++
 include/linux/pm_runtime.h   |    2 ++
 2 files changed, 30 insertions(+)

Index: linux-pm/include/linux/pm_runtime.h
===================================================================
--- linux-pm.orig/include/linux/pm_runtime.h
+++ linux-pm/include/linux/pm_runtime.h
@@ -57,6 +57,7 @@ extern unsigned long pm_runtime_autosusp
 extern void pm_runtime_update_max_time_suspended(struct device *dev,
 						 s64 delta_ns);
 extern void pm_runtime_set_memalloc_noio(struct device *dev, bool enable);
+extern bool pm_runtime_enabled_and_suspended(struct device *dev);
 
 static inline bool pm_children_suspended(struct device *dev)
 {
@@ -165,6 +166,7 @@ static inline unsigned long pm_runtime_a
 				struct device *dev) { return 0; }
 static inline void pm_runtime_set_memalloc_noio(struct device *dev,
 						bool enable){}
+static inline bool pm_runtime_enabled_and_suspended(struct device *dev) { return false; }
 
 #endif /* !CONFIG_PM_RUNTIME */
 
Index: linux-pm/drivers/base/power/runtime.c
===================================================================
--- linux-pm.orig/drivers/base/power/runtime.c
+++ linux-pm/drivers/base/power/runtime.c
@@ -1195,6 +1195,34 @@ void pm_runtime_enable(struct device *de
 EXPORT_SYMBOL_GPL(pm_runtime_enable);
 
 /**
+ * pm_runtime_enabled_and_suspended - Check runtime PM status of a device.
+ * @dev: Device to handle.
+ *
+ * This routine is to be executed during system suspend only, after
+ * device_prepare() has been executed for @dev.
+ *
+ * Return false if runtime PM is disabled for the device.  Otherwise, wait
+ * for pending transitions to complete and check the runtime PM status of the
+ * device after that.  Return true if it is RPM_SUSPENDED.
+ */
+bool pm_runtime_enabled_and_suspended(struct device *dev)
+{
+	unsigned long flags;
+	bool ret;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+	if (dev->power.disable_depth) {
+		ret = false;
+	} else {
+		__pm_runtime_barrier(dev);
+		ret = pm_runtime_status_suspended(dev);
+	}
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_enabled_and_suspended);
+
+/**
  * pm_runtime_forbid - Block runtime PM of a device.
  * @dev: Device to handle.
  *


* [RFC][PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain during system suspend
  2014-04-24 22:36                               ` [RFC][PATCH 0/3] PM: Mechanism to avoid resuming runtime-suspended devices during system suspend, v2 Rafael J. Wysocki
  2014-04-24 22:37                                 ` [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices Rafael J. Wysocki
  2014-04-24 22:39                                 ` [RFC][PATCH 2/3][Resend] PM / runtime: Routine for checking device status during system suspend Rafael J. Wysocki
@ 2014-04-24 22:40                                 ` Rafael J. Wysocki
  2 siblings, 0 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-04-24 22:40 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Rework the ACPI PM domain's PM callbacks to avoid resuming devices
during system suspend (in order to modify their wakeup settings etc.)
if that isn't necessary.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/device_pm.c |   33 ++++++++++++++++++++++++++++++---
 drivers/acpi/scan.c      |    4 ++++
 include/acpi/acpi_bus.h  |    3 ++-
 3 files changed, 36 insertions(+), 4 deletions(-)

Index: linux-pm/drivers/acpi/device_pm.c
===================================================================
--- linux-pm.orig/drivers/acpi/device_pm.c
+++ linux-pm/drivers/acpi/device_pm.c
@@ -907,6 +907,7 @@ int acpi_subsys_prepare(struct device *d
 	if (dev->power.ignore_children)
 		pm_runtime_resume(dev);
 
+	dev->power.resume_not_needed = true;
 	return pm_generic_prepare(dev);
 }
 EXPORT_SYMBOL_GPL(acpi_subsys_prepare);
@@ -914,13 +915,39 @@ EXPORT_SYMBOL_GPL(acpi_subsys_prepare);
 /**
  * acpi_subsys_suspend - Run the device driver's suspend callback.
  * @dev: Device to handle.
- *
- * Follow PCI and resume devices suspended at run time before running their
- * system suspend callbacks.
  */
 int acpi_subsys_suspend(struct device *dev)
 {
+	struct acpi_device *adev = ACPI_COMPANION(dev);
+	u32 sys_target;
+
+	if (!adev || !pm_runtime_enabled_and_suspended(dev))
+		goto out;
+
+	if (!dev->power.resume_not_needed
+	    || device_may_wakeup(dev) != !!adev->wakeup.prepare_count)
+		goto resume;
+
+	sys_target = acpi_target_system_state();
+	if (sys_target != ACPI_STATE_S0) {
+		int ret, state;
+
+		if (adev->power.flags.dsw_present)
+			goto resume;
+
+		ret = acpi_dev_pm_get_state(dev, adev, sys_target, NULL, &state);
+		if (ret || state != adev->power.state)
+			goto resume;
+	}
+
+	dev->power.use_runtime_resume = true;
+	return 0;
+
+ resume:
 	pm_runtime_resume(dev);
+
+ out:
+	dev->power.resume_not_needed = false;
 	return pm_generic_suspend(dev);
 }
 
Index: linux-pm/include/acpi/acpi_bus.h
===================================================================
--- linux-pm.orig/include/acpi/acpi_bus.h
+++ linux-pm/include/acpi/acpi_bus.h
@@ -261,7 +261,8 @@ struct acpi_device_power_flags {
 	u32 inrush_current:1;	/* Serialize Dx->D0 */
 	u32 power_removed:1;	/* Optimize Dx->D0 */
 	u32 ignore_parent:1;	/* Power is independent of parent power state */
-	u32 reserved:27;
+	u32 dsw_present:1;	/* _DSW present? */
+	u32 reserved:26;
 };
 
 struct acpi_device_power_state {
Index: linux-pm/drivers/acpi/scan.c
===================================================================
--- linux-pm.orig/drivers/acpi/scan.c
+++ linux-pm/drivers/acpi/scan.c
@@ -1551,9 +1551,13 @@ static void acpi_bus_get_power_flags(str
 	 */
 	if (acpi_has_method(device->handle, "_PSC"))
 		device->power.flags.explicit_get = 1;
+
 	if (acpi_has_method(device->handle, "_IRC"))
 		device->power.flags.inrush_current = 1;
 
+	if (acpi_has_method(device->handle, "_DSW"))
+		device->power.flags.dsw_present = 1;
+
 	/*
 	 * Enumerate supported power management states
 	 */

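The decision made by the reworked acpi_subsys_suspend() in the diff above can
be restated as a pure function.  What follows is a simplified userspace model,
not kernel code: struct model_dev, enum model_action and every field name are
hypothetical stand-ins for the values the patch reads through
pm_runtime_enabled_and_suspended(), device_may_wakeup(),
adev->wakeup.prepare_count, acpi_target_system_state() and
acpi_dev_pm_get_state().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the inputs the patch consults; not a kernel type. */
struct model_dev {
	bool has_acpi_companion;    /* ACPI_COMPANION(dev) != NULL */
	bool rpm_enabled_suspended; /* pm_runtime_enabled_and_suspended(dev) */
	bool resume_not_needed;     /* dev->power.resume_not_needed */
	bool may_wakeup;            /* device_may_wakeup(dev) */
	int wakeup_prepare_count;   /* adev->wakeup.prepare_count */
	bool target_is_s0;          /* target system state is ACPI_STATE_S0 */
	bool dsw_present;           /* adev->power.flags.dsw_present */
	bool state_lookup_failed;   /* acpi_dev_pm_get_state() returned nonzero */
	int sleep_state;            /* "state" filled in by acpi_dev_pm_get_state() */
	int current_state;          /* adev->power.state */
};

enum model_action {
	MODEL_RUN_SUSPEND,         /* "out" label: run the generic suspend */
	MODEL_RESUME_THEN_SUSPEND, /* "resume" label: runtime-resume first */
	MODEL_LEAVE_SUSPENDED,     /* set use_runtime_resume and return 0 */
};

static enum model_action model_acpi_subsys_suspend(const struct model_dev *d)
{
	assert(d);

	if (!d->has_acpi_companion || !d->rpm_enabled_suspended)
		return MODEL_RUN_SUSPEND;

	/* The wakeup setup must already match what system sleep requires. */
	if (!d->resume_not_needed ||
	    d->may_wakeup != (d->wakeup_prepare_count != 0))
		return MODEL_RESUME_THEN_SUSPEND;

	if (!d->target_is_s0) {
		/* _DSW may need to be evaluated, so the device gets resumed. */
		if (d->dsw_present)
			return MODEL_RESUME_THEN_SUSPEND;

		/* The current power state must be the one chosen for sleep. */
		if (d->state_lookup_failed || d->sleep_state != d->current_state)
			return MODEL_RESUME_THEN_SUSPEND;
	}

	return MODEL_LEAVE_SUSPENDED;
}
```

The model leaves out the side effects of the real function (clearing
power.resume_not_needed on the "out" path and calling pm_runtime_resume());
it only shows which of the three exits is taken for a given set of inputs.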


* Re: [RFC][PATCH 2/3][Resend] PM / runtime: Routine for checking device status during system suspend
  2014-04-24 22:39                                 ` [RFC][PATCH 2/3][Resend] PM / runtime: Routine for checking device status during system suspend Rafael J. Wysocki
@ 2014-04-25 11:28                                   ` Ulf Hansson
  0 siblings, 0 replies; 78+ messages in thread
From: Ulf Hansson @ 2014-04-25 11:28 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Alan Stern, Linux PM list, Mika Westerberg, Aaron Lu,
	ACPI Devel Maling List, LKML

On 25 April 2014 00:39, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Add a new helper routine, pm_runtime_enabled_and_suspended(), to
> allow subsystems (or PM domains) to check the runtime PM status of
> devices during system suspend (possibly to avoid resuming those
> devices upfront at that time).
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/base/power/runtime.c |   28 ++++++++++++++++++++++++++++
>  include/linux/pm_runtime.h   |    2 ++
>  2 files changed, 30 insertions(+)
>
> Index: linux-pm/include/linux/pm_runtime.h
> ===================================================================
> --- linux-pm.orig/include/linux/pm_runtime.h
> +++ linux-pm/include/linux/pm_runtime.h
> @@ -57,6 +57,7 @@ extern unsigned long pm_runtime_autosusp
>  extern void pm_runtime_update_max_time_suspended(struct device *dev,
>                                                  s64 delta_ns);
>  extern void pm_runtime_set_memalloc_noio(struct device *dev, bool enable);
> +extern bool pm_runtime_enabled_and_suspended(struct device *dev);
>
>  static inline bool pm_children_suspended(struct device *dev)
>  {
> @@ -165,6 +166,7 @@ static inline unsigned long pm_runtime_a
>                                 struct device *dev) { return 0; }
>  static inline void pm_runtime_set_memalloc_noio(struct device *dev,
>                                                 bool enable){}
> +static inline bool pm_runtime_enabled_and_suspended(struct device *dev) { return false; }
>
>  #endif /* !CONFIG_PM_RUNTIME */
>
> Index: linux-pm/drivers/base/power/runtime.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/runtime.c
> +++ linux-pm/drivers/base/power/runtime.c
> @@ -1195,6 +1195,34 @@ void pm_runtime_enable(struct device *de
>  EXPORT_SYMBOL_GPL(pm_runtime_enable);
>
>  /**
> + * pm_runtime_enabled_and_suspended - Check runtime PM status of a device.
> + * @dev: Device to handle.
> + *
> + * This routine is to be executed during system suspend only, after
> + * device_prepare() has been executed for @dev.

Hi Rafael,

Do we really need to state the above constraints? Could we not leave
it to be used in other scenarios as well? I am not sure those would
exist though, but still. :-)

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

Kind regards
Ulf Hansson

> + *
> + * Return false if runtime PM is disabled for the device.  Otherwise, wait
> + * for pending transitions to complete and check the runtime PM status of the
> + * device after that.  Return true if it is RPM_SUSPENDED.
> + */
> +bool pm_runtime_enabled_and_suspended(struct device *dev)
> +{
> +       unsigned long flags;
> +       bool ret;
> +
> +       spin_lock_irqsave(&dev->power.lock, flags);
> +       if (dev->power.disable_depth) {
> +               ret = false;
> +       } else {
> +               __pm_runtime_barrier(dev);
> +               ret = pm_runtime_status_suspended(dev);
> +       }
> +       spin_unlock_irqrestore(&dev->power.lock, flags);
> +       return ret;
> +}
> +EXPORT_SYMBOL_GPL(pm_runtime_enabled_and_suspended);
> +
> +/**
>   * pm_runtime_forbid - Block runtime PM of a device.
>   * @dev: Device to handle.
>   *


* Re: [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices
  2014-04-24 22:37                                 ` [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices Rafael J. Wysocki
@ 2014-05-01 21:39                                   ` Alan Stern
  2014-05-01 23:15                                     ` Rafael J. Wysocki
  0 siblings, 1 reply; 78+ messages in thread
From: Alan Stern @ 2014-05-01 21:39 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Fri, 25 Apr 2014, Rafael J. Wysocki wrote:

> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> resume all runtime-suspended devices during system suspend, mostly
> because those devices may need to be reprogrammed due to different
> wakeup settings for system sleep and for runtime PM.  However, at
> least in some cases, that isn't really necessary, because the wakeup
> settings may not be really different.
> 
> The idea here is that subsystems should know whether or not it is
> necessary to reprogram a given device during system suspend and they
> should be able to tell the PM core about that.  For that reason, add
> two new device PM flags, power.resume_not_needed and
> power.use_runtime_resume, such that:
> 
>  (1) If power.resume_not_needed is set for the given device and for
>      all of its children and the device is runtime-suspended during
>      device_suspend(), the remaining device's system suspend/resume
>      callbacks need not be executed.

The patch doesn't do that last part.  That is, it still invokes the 
callbacks even when resume_not_needed is set.

I'm not sure that you should skip the resume callbacks.  We expect
devices to be resumed during system resume even if they were in runtime
suspend beforehand.  Yes, this means calling resume_noirq without
calling suspend_noirq (ditto for the other callbacks).  Think of it as
a form of optimization -- we could have called suspend_noirq, but we
know that the subsystem would have seen that the device was already in
runtime suspend and then returned immediately.  By not calling
suspend_noirq, we spare the subsystem from testing the runtime status.

(And I also think "resume_not_needed" is too general.  It should be 
something more like "runtime_resume_not_needed_during_system_suspend", 
only not quite so long.)

>  (2) If power.use_runtime_resume is set for the given device and the
>      device is runtime-suspended in device_suspend_late(), its late/early
>      and noirq system suspend/resume callbacks should be skipped and
>      it should be resumed through pm_runtime_resume() in device_resume().

IMO this should be a separate patch.  It has no direct connection with 
the main goal of providing subsystems with a mechanism to avoid waking 
up devices for reprogramming during system suspend.

The main goal of this other patch will be to allow devices which were
in runtime suspend throughout the system suspend phases to remain in
runtime suspend throughout the system resume.  When they do finally get
resumed, it will be by a ->runtime_resume() callback, not ->resume().

This will require coordination with the child devices.  If the 
child expects the parent always to be resumed during the resume_early 
phase, we mustn't skip the resume_early callback.  I'm not sure if the 
coordination provided by the resume_not_needed flag is sufficient.

("use_runtime_resume" is also too general.  
"remain_in_runtime_suspend_during_system_resume"?)

Alan Stern



* Re: [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices
  2014-05-01 21:39                                   ` Alan Stern
@ 2014-05-01 23:15                                     ` Rafael J. Wysocki
  2014-05-01 23:36                                       ` Rafael J. Wysocki
  0 siblings, 1 reply; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-01 23:15 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Thursday, May 01, 2014 05:39:31 PM Alan Stern wrote:
> On Fri, 25 Apr 2014, Rafael J. Wysocki wrote:
> 
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> > resume all runtime-suspended devices during system suspend, mostly
> > because those devices may need to be reprogrammed due to different
> > wakeup settings for system sleep and for runtime PM.  However, at
> > least in some cases, that isn't really necessary, because the wakeup
> > settings may not be really different.
> > 
> > The idea here is that subsystems should know whether or not it is
> > necessary to reprogram a given device during system suspend and they
> > should be able to tell the PM core about that.  For that reason, add
> > two new device PM flags, power.resume_not_needed and
> > power.use_runtime_resume, such that:
> > 
> >  (1) If power.resume_not_needed is set for the given device and for
> >      all of its children and the device is runtime-suspended during
> >      device_suspend(), the remaining device's system suspend/resume
> >      callbacks need not be executed.
> 
> The patch doesn't do that last part.  That is, it still invokes the 
> callbacks even when resume_not_needed is set.

Yes, it does and the above paragraph doesn't say that it won't.  It only
says that the flag indicates no need to do that.

> I'm not sure that you should skip the resume callbacks.

The reason why is that if the device were suspended using ->runtime_suspend(),
its driver (or bus type) may be confused if ->runtime_resume() is not used
to resume it.

> We expect devices to be resumed during system resume even if they were in
> runtime suspend beforehand.  Yes, this means calling resume_noirq without
> calling suspend_noirq (ditto for the other callbacks).  Think of it as
> a form of optimization -- we could have called suspend_noirq, but we
> know that the subsystem would have seen that the device was already in
> runtime suspend and then returned immediately.  By not calling
> suspend_noirq, we spare the subsystem from testing the runtime status.

The driver/subsystem has the right to expect that ->resume_noirq() will
always be called after ->suspend_noirq() (if both are implemented).  Thus
calling the former without the latter may be a bug (for the particular
driver).  Since we've always had symmetry there, the risk of breaking stuff
if we change that is relatively high.

On the other hand, ->runtime_resume() should not make any assumptions
about the hardware state when it is called anyway, so it should handle
the device during system resume properly, given that the last PM callback
executed for it was ->runtime_suspend(), regardless of what the firmware
did to it in the meantime.
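
The symmetry concern can be illustrated with a userspace toy.  It does not
describe any real driver: struct toy_dev, toy_suspend_noirq() and
toy_resume_noirq() are hypothetical names for a driver that saves one register
in its ->suspend_noirq() analogue and restores it in the ->resume_noirq()
analogue.  If the resume half runs without the suspend half, there is nothing
valid to restore.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device with one register and a driver-private save slot. */
struct toy_dev {
	unsigned int reg;       /* live "hardware" register */
	unsigned int saved_reg; /* filled in by toy_suspend_noirq() */
	bool saved_valid;       /* true only between suspend and resume */
};

/* Analogue of ->suspend_noirq(): remember the register contents. */
static void toy_suspend_noirq(struct toy_dev *d)
{
	assert(d);
	d->saved_reg = d->reg;
	d->saved_valid = true;
}

/*
 * Analogue of ->resume_noirq(): restore the register.  Returns false if
 * it was called without a matching toy_suspend_noirq().
 */
static bool toy_resume_noirq(struct toy_dev *d)
{
	assert(d);
	if (!d->saved_valid)
		return false; /* asymmetric call: nothing valid to restore */

	d->reg = d->saved_reg;
	d->saved_valid = false;
	return true;
}
```

A driver written without the saved_valid guard would silently write a stale
value to the register in the asymmetric case, which is the kind of breakage
the paragraph above is worried about.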

> (And I also think "resume_not_needed" is too general.  It should be 
> something more like "runtime_resume_not_needed_during_system_suspend", 
> only not quite so long.)

Well, I can agree with that, but I couldn't invent a better name. In particular,
one that wouldn't be too long. :-)

> >  (2) If power.use_runtime_resume is set for the given device and the
> >      device is runtime-suspended in device_suspend_late(), its late/early
> >      and noirq system suspend/resume callbacks should be skipped and
> >      it should be resumed through pm_runtime_resume() in device_resume().
> 
> IMO this should be a separate patch.  It has no direct connection with 
> the main goal of providing subsystems with a mechanism to avoid waking 
> up devices for reprogramming during system suspend.

I'm not sure what you mean, honestly.  The main goal here is to allow
devices that were runtime suspended to stay that way throughout system
suspend and resume up until device_resume() is called for them.

> The main goal of this other patch will be to allow devices which were
> in runtime suspend throughout the system suspend phases to remain in
> runtime suspend throughout the system resume.  When they do finally get
> resumed, it will be by a ->runtime_resume() callback, not ->resume().

Yes, and that's what *this* patch does, isn't it?

I have no idea how to split it so that both parts still make sense.

> This will require coordination with the child devices.  If the 
> child expects the parent always to be resumed during the resume_early 
> phase, we mustn't skip the resume_early callback.  I'm not sure if the 
> coordination provided by the resume_not_needed flag is sufficient.

Setting resume_not_needed means "if I'm runtime-suspended, I don't mind if
you leave me like that".  If the child is resumed during system suspend,
the parent will have to be resumed as well and they will both go through
system suspend-resume callbacks.  Otherwise, the child will be resumed by
its ->runtime_resume() in device_resume() - after the parent.  I don't
think that any more coordination is needed.
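
The coordination rule described above can be sketched in a few lines.  This is
a userspace model under stated assumptions, not the code of patch [1/3]:
struct toy_device and toy_propagate_to_parent() are hypothetical names, and
the model assumes the propagation step runs once per device in suspend order
(children before parents).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device with only the fields needed. */
struct toy_device {
	struct toy_device *parent;
	bool resume_not_needed; /* models dev->power.resume_not_needed */
};

/*
 * To be called for each device after its suspend step, children first.
 * A device that has to go through the full suspend/resume path forces
 * the same on its parent by clearing the parent's flag.
 */
static void toy_propagate_to_parent(struct toy_device *dev)
{
	assert(dev);
	if (dev->parent && !dev->resume_not_needed)
		dev->parent->resume_not_needed = false;
}
```

Because children are handled before parents, a cleared flag on a leaf device
reaches every ancestor by the time those ancestors are suspended.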

> ("use_runtime_resume" is also too general.  
> "remain_in_runtime_suspend_during_system_resume"?)

Well, again, I have problems with inventing names for those things.

Rafael



* Re: [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices
  2014-05-01 23:15                                     ` Rafael J. Wysocki
@ 2014-05-01 23:36                                       ` Rafael J. Wysocki
  2014-05-02  0:04                                         ` Rafael J. Wysocki
  2014-05-02 16:12                                         ` [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices Alan Stern
  0 siblings, 2 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-01 23:36 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Friday, May 02, 2014 01:15:14 AM Rafael J. Wysocki wrote:
> On Thursday, May 01, 2014 05:39:31 PM Alan Stern wrote:
> > On Fri, 25 Apr 2014, Rafael J. Wysocki wrote:
> > 
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > 
> > > Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> > > resume all runtime-suspended devices during system suspend, mostly
> > > because those devices may need to be reprogrammed due to different
> > > wakeup settings for system sleep and for runtime PM.  However, at
> > > least in some cases, that isn't really necessary, because the wakeup
> > > settings may not be really different.
> > > 
> > > The idea here is that subsystems should know whether or not it is
> > > necessary to reprogram a given device during system suspend and they
> > > should be able to tell the PM core about that.  For that reason, add
> > > two new device PM flags, power.resume_not_needed and
> > > power.use_runtime_resume, such that:
> > > 
> > >  (1) If power.resume_not_needed is set for the given device and for
> > >      all of its children and the device is runtime-suspended during
> > >      device_suspend(), the remaining device's system suspend/resume
> > >      callbacks need not be executed.
> > 
> > The patch doesn't do that last part.  That is, it still invokes the 
> > callbacks even when resume_not_needed is set.
> 
> Yes, it does and the above paragraph doesn't say that it won't.  It only
> says that the flag indicates no need to do that.
> 
> > I'm not sure that you should skip the resume callbacks.
> 
> The reason why is that if the device were suspended using ->runtime_suspend(),
> its driver (or bus type) may be confused if ->runtime_resume() is not used
> to resume it.
> 
> > We expect devices to be resumed during system resume even if they were in
> > runtime suspend beforehand.  Yes, this means calling resume_noirq without
> > calling suspend_noirq (ditto for the other callbacks).  Think of it as
> > a form of optimization -- we could have called suspend_noirq, but we
> > know that the subsystem would have seen that the device was already in
> > runtime suspend and then returned immediately.  By not calling
> > suspend_noirq, we spare the subsystem from testing the runtime status.
> 
> The driver/subsystem has the right to expect that ->resume_noirq() will
> always be called after ->suspend_noirq() (if both are implemented).  Thus
> calling the former without the latter may be a bug (for the particular
> driver).  Since we've always had symmetry there, the risk of breaking stuff
> if we change that is relatively high.
> 
> On the other hand, ->runtime_resume() should not make any assumptions
> about the hardware state when it is called anyway, so it should handle
> the device during system resume properly, given that the last PM callback
> executed for it was ->runtime_suspend(), regardless of what the firmware
> did to it in the meantime.
> 
> > (And I also think "resume_not_needed" is too general.  It should be 
> > something more like "runtime_resume_not_needed_during_system_suspend", 
> > only not quite so long.)
> 
> Well, I can agree with that, but I couldn't invent a better name. In particular,
> one that wouldn't be too long. :-)
> 
> > >  (2) If power.use_runtime_resume is set for the given device and the
> > >      device is runtime-suspended in device_suspend_late(), its late/early
> > >      and noirq system suspend/resume callbacks should be skipped and
> > >      it should be resumed through pm_runtime_resume() in device_resume().
> > 
> > IMO this should be a separate patch.  It has no direct connection with 
> > the main goal of providing subsystems with a mechanism to avoid waking 
> > up devices for reprogramming during system suspend.
> 
> I'm not sure what you mean, honestly.  The main goal here is to allow
> devices that were runtime suspended to stay that way throughout system
> suspend and resume up until device_resume() is called for them.
> 
> > The main goal of this other patch will be to allow devices which were
> > in runtime suspend throughout the system suspend phases to remain in
> > runtime suspend throughout the system resume.  When they do finally get
> > resumed, it will be by a ->runtime_resume() callback, not ->resume().
> 
> Yes, and that's what *this* patch does, isn't it?
> 
> I have no idea how to split it so that both parts still make sense.

OK, I guess you mean that the first flag (resume_not_needed) should be introduced
by one patch and the second one by another, but I'm not sure if that would really
make any difference.  Both of them are needed anyway and resume_not_needed without
the other flag wouldn't have any effect.

Well, perhaps I should make __device_suspend() check that use_runtime_resume is
only set when resume_not_needed is set too and return an error if that's not
the case.

Rafael


* Re: [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices
  2014-05-01 23:36                                       ` Rafael J. Wysocki
@ 2014-05-02  0:04                                         ` Rafael J. Wysocki
  2014-05-02 15:41                                           ` Rafael J. Wysocki
  2014-05-02 16:12                                         ` [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices Alan Stern
  1 sibling, 1 reply; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-02  0:04 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Friday, May 02, 2014 01:36:38 AM Rafael J. Wysocki wrote:
> On Friday, May 02, 2014 01:15:14 AM Rafael J. Wysocki wrote:

[cut]

> 
> OK, I guess you mean that the first flag (resume_not_needed) should be introduced
> by one patch and the second one by another, but I'm not sure if that would really
> make any difference.  Both of them are needed anyway and resume_not_needed without
> the other flag wouldn't have any effect.
> 
> Well, perhaps I should make __device_suspend() check that use_runtime_resume is
> only set when resume_not_needed is set too and return an error if that's not
> the case.

Below is an updated patch for further discussion.

I tried to improve the changelog and invent better names for the new flags.

Rafael


---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices

Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
resume all runtime-suspended devices during system suspend, mostly
because those devices may need to be reprogrammed due to different
wakeup settings for system sleep and for runtime PM.  However, at
least in some cases, that isn't really necessary, because the wakeup
settings may not be really different.

The idea here is that subsystems should know whether or not it is
necessary to reprogram a given device during system suspend and they
should be able to tell the PM core about that.  For that reason, add
two new device PM flags, power.may_stay_suspended and
power.leave_runtime_suspended, such that:

 (1) If power.may_stay_suspended is set for the given device and the
     device is runtime-suspended during device_suspend(), it is safe to
     skip the remaining system suspend/resume callbacks for the device
     ("If I'm runtime-suspended, I don't mind it if you leave me like
     that").  This flag is for coordination between child and parent
     devices (that is, parents can only have power.may_stay_suspended
     set if it is set for all of their children).

 (2) If power.leave_runtime_suspended is set for the given device (which
     is valid only if power.may_stay_suspended also is set for it) and the
     device is runtime-suspended in device_suspend_late(), its late/early
     and noirq system suspend/resume callbacks are supposed to be skipped
     and pm_runtime_resume() is supposed to be used for resuming the
     device in device_resume().

Those flags are cleared by the PM core in dpm_prepare() for all
devices.  Next, subsystems (or drivers) are supposed to set
power.may_stay_suspended in their ->prepare() callbacks for devices
whose remaining system suspend/resume callbacks are generally safe to
be skipped if they are runtime-suspended already.  Finally, for each
runtime-suspended device with power.may_stay_suspended set during
device_suspend(), its subsystem (or driver) may set the device's
power.leave_runtime_suspended in its ->suspend() callback without changing
the device's state, in which case the PM core will skip all of the
subsequent system suspend/resume callbacks for it and will resume it
in device_resume() using pm_runtime_resume().

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/power/main.c |   44 +++++++++++++++++++++++++++++++++++++-------
 include/linux/pm.h        |    2 ++
 2 files changed, 39 insertions(+), 7 deletions(-)

Index: linux-pm/include/linux/pm.h
===================================================================
--- linux-pm.orig/include/linux/pm.h
+++ linux-pm/include/linux/pm.h
@@ -546,6 +546,8 @@ struct dev_pm_info {
 	bool			is_late_suspended:1;
 	bool			ignore_children:1;
 	bool			early_init:1;	/* Owned by the PM core */
+	bool			may_stay_suspended:1;
+	bool			leave_runtime_suspended:1;
 	spinlock_t		lock;
 #ifdef CONFIG_PM_SLEEP
 	struct list_head	entry;
Index: linux-pm/drivers/base/power/main.c
===================================================================
--- linux-pm.orig/drivers/base/power/main.c
+++ linux-pm/drivers/base/power/main.c
@@ -479,7 +479,7 @@ static int device_resume_noirq(struct de
 	TRACE_DEVICE(dev);
 	TRACE_RESUME(0);
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || dev->power.leave_runtime_suspended)
 		goto Out;
 
 	if (!dev->power.is_noirq_suspended)
@@ -605,7 +605,7 @@ static int device_resume_early(struct de
 	TRACE_DEVICE(dev);
 	TRACE_RESUME(0);
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || dev->power.leave_runtime_suspended)
 		goto Out;
 
 	if (!dev->power.is_late_suspended)
@@ -735,6 +735,11 @@ static int device_resume(struct device *
 	if (dev->power.syscore)
 		goto Complete;
 
+	if (dev->power.leave_runtime_suspended) {
+		pm_runtime_resume(dev);
+		goto Complete;
+	}
+
 	dpm_wait(dev->parent, async);
 	dpm_watchdog_set(&wd, dev);
 	device_lock(dev);
@@ -1007,7 +1012,7 @@ static int __device_suspend_noirq(struct
 		goto Complete;
 	}
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || dev->power.leave_runtime_suspended)
 		goto Complete;
 
 	dpm_wait_for_children(dev, async);
@@ -1146,7 +1151,7 @@ static int __device_suspend_late(struct
 		goto Complete;
 	}
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || dev->power.leave_runtime_suspended)
 		goto Complete;
 
 	dpm_wait_for_children(dev, async);
@@ -1379,13 +1384,36 @@ static int __device_suspend(struct devic
 	}
 
 	error = dpm_run_callback(callback, dev, state, info);
+	if (!error && !dev->power.may_stay_suspended
+	  && dev->power.leave_runtime_suspended)
+		error = -EPROTO;
 
  End:
 	if (!error) {
 		dev->power.is_suspended = true;
-		if (dev->power.wakeup_path
-		    && dev->parent && !dev->parent->power.ignore_children)
-			dev->parent->power.wakeup_path = true;
+		if (dev->parent) {
+			spin_lock_irq(&dev->parent->power.lock);
+
+			if (dev->power.wakeup_path
+			    && !dev->parent->power.ignore_children)
+				dev->parent->power.wakeup_path = true;
+
+			/*
+			 * Subsystems are supposed to set may_stay_suspended in
+			 * their ->prepare() callbacks for devices whose
+			 * remaining system suspend and resume callbacks are
+			 * generally safe to be skipped if the devices are
+			 * runtime-suspended.
+			 *
+			 * If the device's may_stay_suspended is not set at this
+			 * point, it has to be cleared for its parent too (even
+			 * if the subsystem has set that flag for it).
+			 */
+			if (!dev->power.may_stay_suspended)
+				dev->parent->power.may_stay_suspended = false;
+
+			spin_unlock_irq(&dev->parent->power.lock);
+		}
 	}
 
 	device_unlock(dev);
@@ -1553,6 +1581,8 @@ int dpm_prepare(pm_message_t state)
 		struct device *dev = to_device(dpm_list.next);
 
 		get_device(dev);
+		dev->power.leave_runtime_suspended = false;
+		dev->power.may_stay_suspended = false;
 		mutex_unlock(&dpm_list_mtx);
 
 		error = device_prepare(dev, state);

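The life cycle of the two flags in the patch above can be walked through with
a small userspace model.  It is a sketch only: struct toy_pm and the toy_*()
functions are hypothetical, the counters replace real callbacks, and locking
and parent handling are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the relevant dev->power fields plus call counters. */
struct toy_pm {
	bool may_stay_suspended;
	bool leave_runtime_suspended;
	int late_noirq_calls;     /* late/early and noirq callbacks run */
	int resume_calls;         /* ->resume() analogue */
	int runtime_resume_calls; /* pm_runtime_resume() analogue */
};

/* Models dpm_prepare(): both flags are cleared before ->prepare() runs. */
static void toy_dpm_prepare(struct toy_pm *p)
{
	assert(p);
	p->may_stay_suspended = false;
	p->leave_runtime_suspended = false;
}

/* Models the check added to __device_suspend() after ->suspend(). */
static int toy_check_after_suspend(const struct toy_pm *p)
{
	assert(p);
	if (!p->may_stay_suspended && p->leave_runtime_suspended)
		return -EPROTO;
	return 0;
}

/* Models any of the late/early and noirq phases: skipped when flagged. */
static void toy_late_or_noirq_phase(struct toy_pm *p)
{
	assert(p);
	if (p->leave_runtime_suspended)
		return;
	p->late_noirq_calls++;
}

/* Models device_resume(): runtime resume replaces ->resume() when flagged. */
static void toy_device_resume(struct toy_pm *p)
{
	assert(p);
	if (p->leave_runtime_suspended)
		p->runtime_resume_calls++;
	else
		p->resume_calls++;
}
```

In this model a subsystem sets may_stay_suspended between toy_dpm_prepare()
and the suspend step, and leave_runtime_suspended during the suspend step;
setting only the latter is rejected with -EPROTO, as in the patch.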


* Re: [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices
  2014-05-02  0:04                                         ` Rafael J. Wysocki
@ 2014-05-02 15:41                                           ` Rafael J. Wysocki
  2014-05-02 18:44                                             ` Alan Stern
  0 siblings, 1 reply; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-02 15:41 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Friday, May 02, 2014 02:04:07 AM Rafael J. Wysocki wrote:
> On Friday, May 02, 2014 01:36:38 AM Rafael J. Wysocki wrote:
> > On Friday, May 02, 2014 01:15:14 AM Rafael J. Wysocki wrote:
> 
> [cut]
> 
> > 
> > OK, I guess you mean that the first flag (resume_not_needed) should be introduced
> > by one patch and the second one by another, but I'm not sure if that would really
> > make any difference.  Both of them are needed anyway and resume_not_needed without
> > the other flag wouldn't have any effect.
> > 
> > Well, perhaps I should make __device_suspend() check that use_runtime_resume is
> > only set when resume_not_needed is set too and return an error if that's not
> > the case.
> 
> Below is an updated patch for further discussion.
> 
> I tried to improve the changelog and invent better names for the new flags.

Well, I have a second update.

It has different flag names and changelog (that should explain things better
hopefully) and the purpose of both flags should be more clear now (patch [3/3]
would need to be reworked on top of this, but for now let's just discuss the
core changes).

Rafael


---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices

Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
resume all runtime-suspended devices during system suspend, mostly
because those devices may need to be reprogrammed due to different
wakeup settings for system sleep and for runtime PM.  However, at
least in some cases, that isn't really necessary, because the wakeup
settings may not be really different.

The idea here is that subsystems should know whether or not it is
necessary to resume a given device during system suspend as long as
they know that the device's children will not need it to be functional
during the late/early and noirq phases of their suspend and resume.
To help them with that, introduce two new device PM flags:
power.parent_needed and power.leave_runtime_suspended supposed to work
as follows.

The PM core will clear power.leave_runtime_suspended and will set
power.parent_needed for all devices in dpm_prepare().  Next, the
subsystem (or driver) of a device that in principle may not need
to be resumed during system suspend, if runtime-suspended already,
will set power.leave_runtime_suspended in its ->prepare() callback.
Also the subsystems (or drivers) of devices whose parents need not
be resumed during system suspend, if runtime-suspended already,
are supposed to clear power.parent_needed for them.  The PM core
will then clear power.leave_runtime_suspended for the parents of
all devices having power.parent_needed set in __device_suspend().

Now, if the ->suspend() callback is executed for a device whose
power.leave_runtime_suspended is set, it can simply return 0 after
checking the device's state if that state is appropriate for
system suspend.  The PM core will then skip the late/early and
noirq system suspend/resume callbacks for that device and will
use pm_runtime_resume() to resume it in device_resume().

If the state of a device with power.leave_runtime_suspended is not
appropriate for system suspend, the ->suspend() callback should
resume it using pm_runtime_resume() and clear
power.leave_runtime_suspended for it.

Note: Drivers (or bus types etc.) can reasonably expect that the
next PM callback executed after ->runtime_suspend() will be
->runtime_resume() rather than ->resume_noirq() or ->resume_early().
This change is designed with that expectation in mind.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/power/main.c |   33 ++++++++++++++++++++++++++-------
 include/linux/pm.h        |    2 ++
 2 files changed, 28 insertions(+), 7 deletions(-)

Index: linux-pm/include/linux/pm.h
===================================================================
--- linux-pm.orig/include/linux/pm.h
+++ linux-pm/include/linux/pm.h
@@ -546,6 +546,8 @@ struct dev_pm_info {
 	bool			is_late_suspended:1;
 	bool			ignore_children:1;
 	bool			early_init:1;	/* Owned by the PM core */
+	bool			parent_needed:1;
+	bool			leave_runtime_suspended:1;
 	spinlock_t		lock;
 #ifdef CONFIG_PM_SLEEP
 	struct list_head	entry;
Index: linux-pm/drivers/base/power/main.c
===================================================================
--- linux-pm.orig/drivers/base/power/main.c
+++ linux-pm/drivers/base/power/main.c
@@ -479,7 +479,7 @@ static int device_resume_noirq(struct de
 	TRACE_DEVICE(dev);
 	TRACE_RESUME(0);
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || dev->power.leave_runtime_suspended)
 		goto Out;
 
 	if (!dev->power.is_noirq_suspended)
@@ -605,7 +605,7 @@ static int device_resume_early(struct de
 	TRACE_DEVICE(dev);
 	TRACE_RESUME(0);
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || dev->power.leave_runtime_suspended)
 		goto Out;
 
 	if (!dev->power.is_late_suspended)
@@ -735,6 +735,11 @@ static int device_resume(struct device *
 	if (dev->power.syscore)
 		goto Complete;
 
+	if (dev->power.leave_runtime_suspended) {
+		pm_runtime_resume(dev);
+		goto Complete;
+	}
+
 	dpm_wait(dev->parent, async);
 	dpm_watchdog_set(&wd, dev);
 	device_lock(dev);
@@ -1007,7 +1012,7 @@ static int __device_suspend_noirq(struct
 		goto Complete;
 	}
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || dev->power.leave_runtime_suspended)
 		goto Complete;
 
 	dpm_wait_for_children(dev, async);
@@ -1146,7 +1151,7 @@ static int __device_suspend_late(struct
 		goto Complete;
 	}
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || dev->power.leave_runtime_suspended)
 		goto Complete;
 
 	dpm_wait_for_children(dev, async);
@@ -1383,9 +1388,21 @@ static int __device_suspend(struct devic
  End:
 	if (!error) {
 		dev->power.is_suspended = true;
-		if (dev->power.wakeup_path
-		    && dev->parent && !dev->parent->power.ignore_children)
-			dev->parent->power.wakeup_path = true;
+		if (dev->power.leave_runtime_suspended)
+			dev->power.parent_needed = false;
+
+		if (dev->parent) {
+			spin_lock_irq(&dev->parent->power.lock);
+
+			if (dev->power.wakeup_path
+			    && !dev->parent->power.ignore_children)
+				dev->parent->power.wakeup_path = true;
+
+			if (dev->power.parent_needed)
+				dev->parent->power.leave_runtime_suspended = false;
+
+			spin_unlock_irq(&dev->parent->power.lock);
+		}
 	}
 
 	device_unlock(dev);
@@ -1553,6 +1570,8 @@ int dpm_prepare(pm_message_t state)
 		struct device *dev = to_device(dpm_list.next);
 
 		get_device(dev);
+		dev->power.leave_runtime_suspended = false;
+		dev->power.parent_needed = true;
 		mutex_unlock(&dpm_list_mtx);
 
 		error = device_prepare(dev, state);



* Re: [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices
  2014-05-01 23:36                                       ` Rafael J. Wysocki
  2014-05-02  0:04                                         ` Rafael J. Wysocki
@ 2014-05-02 16:12                                         ` Alan Stern
  1 sibling, 0 replies; 78+ messages in thread
From: Alan Stern @ 2014-05-02 16:12 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Fri, 2 May 2014, Rafael J. Wysocki wrote:

> On Friday, May 02, 2014 01:15:14 AM Rafael J. Wysocki wrote:
> > On Thursday, May 01, 2014 05:39:31 PM Alan Stern wrote:
> > > On Fri, 25 Apr 2014, Rafael J. Wysocki wrote:
> > > 
> > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > 
> > > > Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> > > > resume all runtime-suspended devices during system suspend, mostly
> > > > because those devices may need to be reprogrammed due to different
> > > > wakeup settings for system sleep and for runtime PM.  However, at
> > > > least in some cases, that isn't really necessary, because the wakeup
> > > > settings may not be really different.
> > > > 
> > > > The idea here is that subsystems should know whether or not it is
> > > > necessary to reprogram a given device during system suspend and they
> > > > should be able to tell the PM core about that.  For that reason, add
> > > > two new device PM flags, power.resume_not_needed and
> > > > power.use_runtime_resume, such that:
> > > > 
> > > >  (1) If power.resume_not_needed is set for the given device and for
> > > >      all of its children and the device is runtime-suspended during
> > > >      device_suspend(), the remaining device's system suspend/resume
> > > >      callbacks need not be executed.
> > > 
> > > The patch doesn't do that last part.  That is, it still invokes the 
> > > callbacks even when resume_not_needed is set.
> > 
> > Yes, it does and the above paragraph doesn't say that it won't.  It only
> > says that the flag indicates no need to do that.

But then the reader is left wondering: What is the reason for having
the flag, if it doesn't cause the PM core to do anything different?
This is partly a weakness of the patch description.  Here's a 
suggestion for an improved description:

--------------------------------------------------------------------

Many devices have different wakeup settings for runtime suspend and
system suspend.  If such a device is already runtime-suspended when a
system suspend starts, it will be necessary to perform a runtime resume
in order to reprogram the device's wakeup settings.  Thus, the device
will always enter system suspend in a runtime-active state.

Other devices may need to be runtime-active when a system suspend
starts for other reasons.  Still, there are quite a few devices which
have no such requirements.  If they are already in runtime suspend,
there's no reason why they shouldn't remain that way during the system
suspend.  And avoiding an extra runtime-resume/device-suspend pair
could help speed up the procedure.

Some subsystems, such as PCI and the ACPI PM domain, automatically 
resume all runtime-suspended devices when a system suspend starts, on 
the theory that the device may require reprogramming.  As described 
above, we would like to avoid doing this in cases where it isn't 
needed.

This patch introduces a standardized way for drivers to tell their
subsystems that no reprogramming is needed and a device may remain in
runtime suspend during a system suspend.  A new flag,
dev->power.resume_not_needed, is added.  Drivers set this flag during
their ->prepare() callbacks to tell the subsystems that no runtime
resume is necessary.  If the flag is set and the device is
runtime-suspended, the subsystem can return immediately from its
->suspend(), ->suspend_late(), and ->suspend_noirq() callbacks.  
(However, the various resume callbacks should continue to function as 
before, because we want the device to end up at full power when the 
system resume is over.)

Note: If dev->power.ignore_children is set then
dev->power.resume_not_needed probably should not be set.  This is
because the driver for the child device may expect dev always to be at
full power when the child's suspend routines run.

--------------------------------------------------------------------

This description doesn't say anything about the PM core skipping the 
various suspend callbacks.  That's because the patch doesn't actually 
skip them.

It also doesn't say anything about requiring resume_not_needed to be 
set in all the descendant devices.  That's because this isn't 
necessary, if all you want to accomplish is to avoid the unnecessary 
runtime resumes.
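
A rough userspace sketch of what the description above would mean for a
subsystem-level ->suspend() wrapper.  This is not kernel code: struct
toy_dev and toy_subsys_suspend() are made-up names, and clearing
runtime_suspended merely stands in for a real pm_runtime_resume() call.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the few struct device fields that matter here. */
struct toy_dev {
	bool resume_not_needed;		/* set by the driver in ->prepare() */
	bool runtime_suspended;		/* current runtime PM status */
	int driver_suspend_calls;	/* how often the driver callback ran */
};

/*
 * Hypothetical subsystem ->suspend() wrapper: leave the device alone if
 * the driver said no reprogramming is needed and it is already
 * runtime-suspended; otherwise resume it and run the driver callback.
 */
static int toy_subsys_suspend(struct toy_dev *dev)
{
	assert(dev);

	if (dev->resume_not_needed && dev->runtime_suspended)
		return 0;	/* stay in runtime suspend */

	dev->runtime_suspended = false;	/* stand-in for pm_runtime_resume() */
	dev->driver_suspend_calls++;	/* stand-in for the driver's ->suspend() */
	return 0;
}
```

Note that the check lives in the subsystem callback, not in the PM core,
which matches the point that the core itself does not skip anything here.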

> > > >  (2) If power.use_runtime_resume is set for the given device and the
> > > >      device is runtime-suspended in device_suspend_late(), its late/early
> > > >      and noirq system suspend/resume callbacks should be skipped and
> > > >      it should be resumed through pm_runtime_resume() in device_resume().
> > > 
> > > IMO this should be a separate patch.  It has no direct connection with 
> > > the main goal of providing subsystems with a mechanism to avoid waking 
> > > up devices for reprogramming during system suspend.
> > 
> > I'm not sure what you mean, honestly.  The main goal here is to allow
> > devices that were runtime suspended to stay that way throughout system
> > suspend and resume up until device_resume() is called for them.

No, not exactly.  The main goal is to allow devices that were runtime
suspended to stay that way throughout system suspend _only_.  We expect
system resume to function as it has in the past.  Changing the
expectations for system resume should be the subject of a second patch.

> > I have no idea how to split it so that both parts still make sense.
> 
> OK, I guess you mean that the first flag (resume_not_needed) should be introduced
> by one patch and the second one by another, but I'm not sure if that would really
> make any difference.  Both of them are needed anyway and resume_not_needed without
> the other flag wouldn't have any effect.
> 
> Well, perhaps I should make __device_suspend() check that use_runtime_resume is
> only set when resume_not_needed is set too and return an error if that's not
> the case.

Here's my suggestion for the second patch's description.  It describes 
what I have in mind:

--------------------------------------------------------------------

Now that we have a mechanism for allowing devices to remain in a
runtime-suspended state during system suspend, it makes sense to add a
mechanism allowing them to remain in runtime suspend during system
resume as well -- in other words, throughout an entire system suspend
cycle.  In theory, all the PM core has to do is avoid invoking the
suspend and resume callbacks for devices that are runtime suspended.

However, some drivers may not be prepared to handle this new behavior.
They may assume that their various resume callbacks always get called,
regardless of the device's runtime status, because that's how it works
now.  Or they may assume that the device's parent is always at full
power when the device's resume callbacks run, which wouldn't have to
hold if the parent's power.ignore_children flag was set.

Therefore this patch adds a new flag: dev->power.use_runtime_resume.  
Drivers can set this flag in their ->prepare() routines.  When
__device_suspend() runs, if the device and all its descendants are
runtime suspended, and if power.use_runtime_resume is set in the device
and all its descendants, then the PM core will skip the ->suspend(),
->suspend_late(), ->suspend_noirq(), ->resume_noirq(),
->resume_early(), and ->resume() callbacks.  The device will not return 
to full power until a runtime resume occurs.

--------------------------------------------------------------------

This isn't exactly the same as what you implemented, but I think the 
description explains the situation well enough that the reasons for the 
differences are clear.
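
For illustration only, the subtree condition in that second description
could be modelled like this in plain userspace C.  struct toy_node, its
child/sibling links and toy_can_skip_callbacks() are assumptions made for
the sketch; the real device hierarchy is walked differently.

```c
#include <stdbool.h>

/* Toy device tree node: first-child / next-sibling representation. */
struct toy_node {
	bool runtime_suspended;
	bool use_runtime_resume;	/* set by the driver in ->prepare() */
	struct toy_node *child;		/* first child, or a null pointer */
	struct toy_node *sibling;	/* next sibling, or a null pointer */
};

/*
 * Returns true only if this device and every descendant is
 * runtime-suspended and has use_runtime_resume set, i.e. the case in
 * which all system suspend/resume callbacks could be skipped.
 */
static bool toy_can_skip_callbacks(const struct toy_node *dev)
{
	const struct toy_node *c;

	if (!dev->runtime_suspended || !dev->use_runtime_resume)
		return false;

	for (c = dev->child; c; c = c->sibling)
		if (!toy_can_skip_callbacks(c))
			return false;

	return true;
}
```

A single descendant that is runtime-active, or whose driver did not opt
in, is enough to force the traditional callbacks for every ancestor.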

Alan Stern


* Re: [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices
  2014-05-02 15:41                                           ` Rafael J. Wysocki
@ 2014-05-02 18:44                                             ` Alan Stern
  2014-05-05  0:09                                               ` Rafael J. Wysocki
  0 siblings, 1 reply; 78+ messages in thread
From: Alan Stern @ 2014-05-02 18:44 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Fri, 2 May 2014, Rafael J. Wysocki wrote:

> Well, I have a second update.
> 
> It has different flag names and changelog (that should explain things better
> hopefully) and the purpose of both flags should be more clear now (patch [3/3]
> would need to be reworked on top of this, but for now let's just discuss the
> core changes).

We've got patch descriptions passing in the night!  :-)

This doesn't contain any changes to the patch itself, apart from the 
flag names, right?  The description below is much better than the 
earlier one, but I still feel this deserves to be split in two: one 
patch for each new flag.

> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices
> 
> Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> resume all runtime-suspended devices during system suspend, mostly
> because those devices may need to be reprogrammed due to different
> wakeup settings for system sleep and for runtime PM.  However, at
> least in some cases, that isn't really necessary, because the wakeup
> settings may not be really different.
> 
> The idea here is that subsystems should know whether or not it is
> necessary to resume a given device during system suspend as long as
> they know that the device's children will not need it to be functional
> during the late/early and noirq phases of their suspend and resume.

Perhaps the matter of the children's requirements should be discussed 
more fully.  I skimmed over it in my suggested description too.

Under what conditions will a child need the parent device to be 
functional?  Let's start by assuming the parent's ignore_children 
bit isn't set.

By this assumption, if the child was at full power during the suspend
stages then the parent would have to be at full power too.  So let's
assume that the child is in runtime suspend when its ->suspend()
routine runs.  I can't think of any scenario where the child's driver
would require the parent to be at full power without also needing the
child to be at full power.  If the child really does need to be at full
power then the driver will have to do a runtime resume, which would
also bring the parent to full power.  Either way, we don't have to do
anything special -- during the suspend stages, if the child needs the 
parent to be at full power then it will be.

(As a variant of this case, maybe the child belongs to one of the
subsystems like PCI, and its driver expects the subsystem to
runtime-resume the child before invoking its ->suspend() callback.  
When the subsystem does this, the parent will automatically be resumed
as well.  Again there are no special requirements; the point is moot
because the parent will never be runtime-suspended when its ->suspend()
routine is ready to run.)

During the resume stages, if the child is going to be restored to full
power then certainly the parent has to be at full power first.  
Drivers expect this, so if we're going to leave the parent in runtime
suspend during system resume, we have to get the child driver's
permission first.  _That's_ what the parent_needed flag should mean.

What about the case where ignore_children _is_ set?  Then the child's 
driver might indeed need the parent to be at full power during system 
suspend, since we could start off with the parent suspended and the 
child active.

Putting these arguments together, the result is that during system
suspend we don't care about the children's needs unless the parent's
ignore_children bit is set.  But during system resume, we must resume
the parent unless the child's driver says we don't have to.

As a corollary, if we don't have the child's permission to leave the
parent suspended during system resume then we have to invoke all of the
parent's resume callbacks, which means we also have to invoke all the
suspend callbacks.  However, we still might be able to leave the parent
in runtime suspend during the suspend stages.  The decision whether or
not to do so should be up to the subsystem or driver, not the PM core; 
the subsystem's callback routines can check the device's runtime status 
and then do what they want.

> To help them with that, introduce two new device PM flags:
> power.parent_needed and power.leave_runtime_suspended supposed to work
> as follows.
> 
> The PM core will clear power.leave_runtime_suspended and will set
> power.parent_needed for all devices in dpm_prepare().  Next, the
> subsystem (or driver) of a device that in principle may not need
> to be resumed during system suspend, if runtime-suspended already,
> will set power.leave_runtime_suspended in its ->prepare() callback.
> Also the subsystems (or drivers) of devices whose parents need not
> be resumed during system suspend, if runtime-suspended already,
> are supposed to clear power.parent_needed for them.  The PM core
> will then clear power.leave_runtime_suspended for the parents of
> all devices having power.parent_needed set in __device_suspend().

You are using leave_runtime_suspended to mean two different things:  
remain runtime-suspended during the system suspend stages (i.e., no
reprogramming is needed so don't go to full power), and remain
runtime-suspended during both the system suspend and system resume
stages.  Only the first meaning matters if all you want to accomplish
is to avoid unnecessary runtime resumes during system suspend.

For the first meaning -- and I claim that this is the appropriate
meaning for this patch -- the leave_runtime_suspended flag doesn't depend
on the children's needs, except in the case where the parent's
ignore_children bit is set.  In that case, we could simply force the
parent's leave_runtime_suspended flag to be always off.  Or we could 
leave it set if it is set in all of the parent's children.

The parent_needed flag is the one that really has to propagate up the
device tree.  If this flag is set in a child then the PM core has to
invoke all the suspend and resume callbacks, not just in the child's
parent but in all its ancestors.  (Perhaps you could stop if you reach
an ancestor with ignore_children set, but it's safer not to.)

> Now, if the ->suspend() callback is executed for a device whose
> power.leave_runtime_suspended is set, it can simply return 0 after
> checking the device's state if that state is appropriate for
> system suspend.  The PM core will then skip the late/early and
> noirq system suspend/resume callbacks for that device and will
> use pm_runtime_resume() to resume it in device_resume().

By the discussion above, the PM core shouldn't skip anything
unless the parent_needed flag is clear.

> If the state of a device with power.leave_runtime_suspended is not
> appropriate for system suspend, the ->suspend() callback should
> resume it using pm_runtime_resume() and clear
> power.leave_runtime_suspended for it.

Oh yes, I forgot to discuss this earlier.  We have two choices for 
handling this:

	As you wrote above, require drivers not to set 
	leave_runtime_suspended if the device isn't in an appropriate
	state, and propagate the flag up the device tree.  (But as
	I mentioned, in most cases the flag shouldn't need to be
	propagated.)

	Make the PM core automatically clear leave_runtime_suspended
	whenever the device is (or becomes) runtime-active.  Then 
	callbacks don't have to check whether the device actually is in
	runtime suspend; they just have to check the flag.

I prefer the second choice, because it is easier for drivers.
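
A minimal userspace sketch of that second choice, assuming a toy device
structure.  toy_runtime_resume() and toy_should_skip_late_noirq() are
hypothetical helpers, and the locking a real implementation would need
around the flag is left out.

```c
#include <stdbool.h>

struct toy_dev {
	bool runtime_suspended;
	bool leave_runtime_suspended;
};

/*
 * Stand-in for the runtime resume path: whenever the device becomes
 * runtime-active, the core drops the flag on the driver's behalf.
 * (Locking is omitted in this sketch.)
 */
static void toy_runtime_resume(struct toy_dev *dev)
{
	dev->runtime_suspended = false;
	dev->leave_runtime_suspended = false;
}

/*
 * With the invariant above, a callback only has to look at the flag;
 * it never needs to query the runtime PM status separately.
 */
static bool toy_should_skip_late_noirq(const struct toy_dev *dev)
{
	return dev->leave_runtime_suspended;
}
```

The flag can then only be set while the device is in runtime suspend, so
"flag set" implies "still runtime-suspended".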

> Note: Drivers (or bus types etc.) can reasonably expect that the
> next PM callback executed after ->runtime_suspend() will be
> ->runtime_resume() rather than ->resume_noirq() or ->resume_early().
> This change is designed with that expectation in mind.

Except, of course, that in the current kernel this isn't true.  And 
there probably are a few cases where it can't ever be true.

Alan Stern



* Re: [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices
  2014-05-02 18:44                                             ` Alan Stern
@ 2014-05-05  0:09                                               ` Rafael J. Wysocki
  2014-05-05 15:46                                                 ` Alan Stern
  0 siblings, 1 reply; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-05  0:09 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Friday, May 02, 2014 02:44:43 PM Alan Stern wrote:
> On Fri, 2 May 2014, Rafael J. Wysocki wrote:
> 
> > Well, I have a second update.
> > 
> > It has different flag names and changelog (that should explain things better
> > hopefully) and the purpose of both flags should be more clear now (patch [3/3]
> > would need to be reworked on top of this, but for now let's just discuss the
> > core changes).
> 
> We've got patch descriptions passing in the night!  :-)
> 
> This doesn't contain any changes to the patch itself, apart from the 
> flag names, right?

There is this change in the patch itself:

+               if (dev->power.leave_runtime_suspended)
+                       dev->power.parent_needed = false;

in __device_suspend() and power.parent_needed is set for all devices in
dpm_prepare().
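
To make the interaction of the two flags easier to follow, here is a
userspace model of just that logic.  It is a sketch, not the patch:
struct toy_dev and the toy_*() helpers are made-up names, locking is
omitted, and it relies on children being suspended before their parents.

```c
#include <stdbool.h>

struct toy_dev {
	struct toy_dev *parent;
	bool leave_runtime_suspended;
	bool parent_needed;
};

/* Models the defaults assigned to every device in dpm_prepare(). */
static void toy_prepare_defaults(struct toy_dev *dev)
{
	dev->leave_runtime_suspended = false;
	dev->parent_needed = true;
}

/*
 * Models the flag handling at the end of a successful __device_suspend():
 * a device that stays runtime-suspended no longer needs its parent, and
 * any device that still needs its parent vetoes the parent's flag.
 */
static void toy_suspend_flags(struct toy_dev *dev)
{
	if (dev->leave_runtime_suspended)
		dev->parent_needed = false;

	if (dev->parent && dev->parent_needed)
		dev->parent->leave_runtime_suspended = false;
}
```

So a parent keeps leave_runtime_suspended only if every child either set
the flag itself or had parent_needed cleared by its driver or subsystem.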

> The description below is much better than the earlier one, but I still feel
> this deserves to be split in two: one patch for each new flag.

Well, I guess I can introduce power.leave_runtime_suspended for leaf devices
first, but that would be somewhat artificial, because in that case some code
added by the first patch would be removed by the second one. :-)

> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > Subject: PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices
> > 
> > Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> > resume all runtime-suspended devices during system suspend, mostly
> > because those devices may need to be reprogrammed due to different
> > wakeup settings for system sleep and for runtime PM.  However, at
> > least in some cases, that isn't really necessary, because the wakeup
> > settings may not be really different.
> > 
> > The idea here is that subsystems should know whether or not it is
> > necessary to resume a given device during system suspend as long as
> > they know that the device's children will not need it to be functional
> > during the late/early and noirq phases of their suspend and resume.
> 
> Perhaps the matter of the children's requirements should be discussed 
> more fully.  I skimmed over it in my suggested description too.
> 
> Under what conditions will a child need the parent device to be 
> functional?  Let's start by assuming the parent's ignore_children 
> bit isn't set.
> 
> By this assumption, if the child was at full power during the suspend
> stages then the parent would have to be at full power too.  So let's
> assume that the child is in runtime suspend when its ->suspend()
> routine runs.  I can't think of any scenario where the child's driver
> would require the parent to be at full power without also needing the
> child to be at full power.  If the child really does need to be at full
> power then the driver will have to do a runtime resume, which would
> also bring the parent to full power.  Either way, we don't have to do
> anything special -- during the suspend stages, if the child needs the 
> parent to be at full power then it will be.
> 
> (As a variant of this case, maybe the child belongs to one of the
> subsystems like PCI, and its driver expects the subsystem to
> runtime-resume the child before invoking its ->suspend() callback.  
> When the subsystem does this, the parent will automatically be resumed
> as well.  Again there are no special requirements; the point is moot
> because the parent will never be runtime-suspended when its ->suspend()
> routine is ready to run.)

Yes.

> During the resume stages, if the child is going to be restored to full
> power then certainly the parent has to be at full power first.  
> Drivers expect this, so if we're going to leave the parent in runtime
> suspend during system resume, we have to get the child driver's
> permission first.  _That's_ what the parent_needed flag should mean.

There is a theoretical case where the child is runtime-suspended, but
not actually zero-power, and it doesn't have leave_runtime_suspended set,
but its driver doesn't implement ->suspend() at all and instead it waits
until ->suspend_late() or even ->suspend_noirq() and then attempts to do
something extra to the device.  Then, if the parent is a bridge and is
required to be functional for accessing the child, we can't leave it
runtime-suspended too.

I'm not sure how realistic that is, to be honest, but it does look like
a valid thing to do to my eyes, so in my opinion we may need to get the
child driver's permission to leave the parent in runtime suspend for
that reason too.

I guess it is fair to simply say that "we need to get the child driver's
permission to leave the parent in runtime suspend".

> What about the case where ignore_children _is_ set?  Then the child's 
> driver might indeed need the parent to be at full power during system 
> suspend, since we could start off with the parent suspended and the 
> child active.
> 
> Putting these arguments together, the result is that during system
> suspend we don't care about the children's needs unless the parent's
> ignore_children bit is set.  But during system resume, we must resume
> the parent unless the child's driver says we don't have to.
> 
> As a corollary, if we don't have the child's permission to leave the
> parent suspended during system resume then we have to invoke all of the
> parent's resume callbacks, which means we also have to invoke all the
> suspend callbacks.  However, we still might be able to leave the parent
> in runtime suspend during the suspend stages.  The decision whether or
> not to do so should be up to the subsystem or driver, not the PM core; 
> the subsystem's callback routines can check the device's runtime status 
> and then do what they want.

Yes, but they can do that anyway, can't they? :-)

> > To help them with that, introduce two new device PM flags:
> > power.parent_needed and power.leave_runtime_suspended supposed to work
> > as follows.
> > 
> > The PM core will clear power.leave_runtime_suspended and will set
> > power.parent_needed for all devices in dpm_prepare().  Next, the
> > subsystem (or driver) of a device that in principle may not need
> > to be resumed during system suspend, if runtime-suspended already,
> > will set power.leave_runtime_suspended in its ->prepare() callback.
> > Also the subsystems (or drivers) of devices whose parents need not
> > be resumed during system suspend, if runtime-suspended already,
> > are supposed to clear power.parent_needed for them.  The PM core
> > will then clear power.leave_runtime_suspended for the parents of
> > all devices having power.parent_needed set in __device_suspend().
> 
> You are using leave_runtime_suspended to mean two different things:  
> remain runtime-suspended during the system suspend stages (i.e., no
> reprogramming is needed so don't go to full power), and remain
> runtime-suspended during both the system suspend and system resume
> stages.  Only the first meaning matters if all you want to accomplish
> is to avoid unnecessary runtime resumes during system suspend.

Well, this is not the case, because you can't call ->resume_noirq() *after*
->runtime_suspend() for a number of drivers, as they simply may not expect
that to happen (that covers all of the PCI drivers and the ACPI PM domain at
least).

So you can't say "well, I'll skip your ->suspend_late and ->suspend_noirq,
but then I'll resume you traditionally" for those drivers, but this isn't
about remaining runtime-suspended during system resume too, but about
preserving the expected ordering of callbacks for them.

So yes, the goal is to "remain runtime-suspended during the system suspend stages",
but that *leads* *to* "do not execute system resume callbacks up to and including
->resume()" as well, at least for an important subset of drivers.
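
The ordering constraint can be written down as a small checker, again as
a userspace sketch only.  The enum values and toy_order_ok() are invented
for illustration; the rule encoded is simply "no ->resume_noirq() or
->resume_early() while the device is still runtime-suspended".

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_cb {
	CB_RUNTIME_SUSPEND,
	CB_RUNTIME_RESUME,
	CB_SUSPEND,
	CB_SUSPEND_LATE,
	CB_SUSPEND_NOIRQ,
	CB_RESUME_NOIRQ,
	CB_RESUME_EARLY,
	CB_RESUME,
};

/*
 * Returns false if the sequence runs ->resume_noirq() or ->resume_early()
 * on a device that was runtime-suspended and not runtime-resumed since.
 */
static bool toy_order_ok(const enum toy_cb *seq, size_t n)
{
	bool rt_suspended = false;
	size_t i;

	for (i = 0; i < n; i++) {
		switch (seq[i]) {
		case CB_RUNTIME_SUSPEND:
			rt_suspended = true;
			break;
		case CB_RUNTIME_RESUME:
			rt_suspended = false;
			break;
		case CB_RESUME_NOIRQ:
		case CB_RESUME_EARLY:
			if (rt_suspended)
				return false;
			break;
		default:
			break;
		}
	}
	return true;
}
```

Skipping only the late/noirq suspend callbacks and then resuming
"traditionally" fails this check, whereas skipping both halves and
finishing with a runtime resume passes it.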

> For the first meaning -- and I claim that this is the appropriate
> meaning for this patch -- the leave_runtime_suspended flag doesn't depend
> on the children's needs, except in the case where the parent's
> ignore_children bit is set.  In that case, we could simply force the
> parent's leave_runtime_suspended flag to be always off.  Or we could 
> leave it set if it is set in all of the parent's children.
> 
> The parent_needed flag is the one that really has to propagate up the
> device tree.

It doesn't need to be propagated if it is set for everybody to start with
which is the case in my last patch.

> If this flag is set in a child then the PM core has to
> invoke all the suspend and resume callbacks, not just in the child's
> parent but in all its ancestors.  (Perhaps you could stop if you reach
> an ancestor with ignore_children set, but it's safer not to.)
> 
> > Now, if the ->suspend() callback is executed for a device whose
> > power.leave_runtime_suspended is set, it can simply return 0 after
> > checking the device's state if that state is appropriate for
> > system suspend.  The PM core will then skip the late/early and
> > noirq system suspend/resume callbacks for that device and will
> > use pm_runtime_resume() to resume it in device_resume().
> 
> By the discussion above, the PM core shouldn't skip anything
> unless the parent_needed flag is clear.

In all of the device's children.

> > If the state of a device with power.leave_runtime_suspended is not
> > appropriate for system suspend, the ->suspend() callback should
> > resume it using pm_runtime_resume() and clear
> > power.leave_runtime_suspended for it.
> 
> Oh yes, I forgot to discuss this earlier.  We have two choices for 
> handling this:
> 
> 	As you wrote above, require drivers not to set 
> 	leave_runtime_suspended if the device isn't in an appropriate
> 	state, and propagate the flag up the device tree.  (But as
> 	I mentioned, in most cases the flag shouldn't need to be
> 	propagated.)

It isn't propagated in my last patch.

> 	Make the PM core automatically clear leave_runtime_suspended
> 	whenever the device is (or becomes) runtime-active.  Then 
> 	callbacks don't have to check whether the device actually is in
> 	runtime suspend; they just have to check the flag.
> 
> I prefer the second choice, because it is easier for drivers.

Well, I'm not sure how much easier that is, because the flag will have to be
operated under power.lock then, but that will cause patch [2/3] to be unnecessary,
so it's fine by me. :-)
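
For illustration only, the second choice could look roughly like the
sketch below.  It is a user-space model, not kernel code: struct
flag_dev and the model_*() helpers are made-up names, and the
serialization that power.lock would provide is only noted in comments.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant bits of dev->power. */
struct flag_dev {
	bool runtime_suspended;
	bool leave_runtime_suspended;
};

/*
 * Subsystem or driver asks for the device to be left alone.  The request
 * is only honored while the device is runtime-suspended.  In the kernel
 * this would have to run under power.lock.
 */
bool model_request_leave_suspended(struct flag_dev *dev)
{
	if (!dev->runtime_suspended)
		return false;
	dev->leave_runtime_suspended = true;
	return true;
}

/*
 * Model of a runtime resume: the core clears the flag itself, so
 * callbacks never see "flag set" together with "device active".
 */
void model_rpm_resume(struct flag_dev *dev)
{
	dev->runtime_suspended = false;
	dev->leave_runtime_suspended = false;
}

/* What a ->suspend() callback would check: just the flag. */
bool model_suspend_can_skip(const struct flag_dev *dev)
{
	/* Invariant maintained by the two helpers above. */
	assert(!dev->leave_runtime_suspended || dev->runtime_suspended);
	return dev->leave_runtime_suspended;
}
```

With this arrangement the callback does not need a separate "is it
still runtime-suspended?" helper, which is why patch [2/3] would
become unnecessary.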

> > Note: Drivers (or bus types etc.) can reasonably expect that the
> > next PM callback executed after ->runtime_suspend() will be
> > ->runtime_resume() rather than ->resume_noirq() or ->resume_early().
> > This change is designed with that expectation in mind.
> 
> Except, of course, that in the current kernel this isn't true.

Well, what about PCI devices?  Their drivers surely can have such an expectation,
because all of the PCI devices *are* resumed today in either pci_pm_prepare() or
pci_pm_suspend(), before executing the driver's ->suspend() callback.

> And there probably are a few cases where it can't ever be true.

It is easy to say things like that without giving any examples.

Rafael

* Re: [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices
  2014-05-05  0:09                                               ` Rafael J. Wysocki
@ 2014-05-05 15:46                                                 ` Alan Stern
  2014-05-06  1:31                                                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 78+ messages in thread
From: Alan Stern @ 2014-05-05 15:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

I'll trim a lot of material and respond to the points that are 
important comments or criticisms.

On Mon, 5 May 2014, Rafael J. Wysocki wrote:

> > The description below is much better than the earlier one, but I still feel
> > this deserves to be split in two: one patch for each new flag.
> 
> Well, I guess I can introduce power.leave_runtime_suspended for leaf devices
> first, but that would be somewhat artificial, because in that case some code
> added by the first patch would be removed by the second one. :-)

The "leaf devices" thing is a key point; see below.

> There is a theoretical case where the child is runtime-suspended, but
> not actually zero-power, and it doesn't have leave_runtime_suspended set,
> but its driver doesn't implement ->suspend() at all and instead it waits
> until ->suspend_late() or even ->suspend_noirq() and then attempts to do
> something extra to the device.  Then, if the parent is a bridge and is
> required to be functional for accessing the child, we can't leave it
> runtime-suspended too.
> 
> I'm not sure how realistic that is, to be honest, but it does look like
> a valid thing to do to my eyes, so in my opinion we may need to get the
> child driver's permission to leave the parent in runtime suspend for
> that reason too.

The only time this would be a problem is if the driver changes the
device's settings from the runtime-suspend values to the system-suspend
values, without doing a runtime resume first.  For example, the driver
might disable wakeup while the device remains at low power.

I'm not sure how realistic this is, either -- although it's not hard to 
imagine a PCI driver doing this sort of thing.  Still, if you think we 
should worry about it then I agree, the parent_needed flag ought to be 
present from the start.

> I guess it is fair to simply say that "we need to get the child driver's
> permission to leave the parent in runtime suspend".

Okay.  But does it follow that we need permission from the child's
descendants as well?  I don't see any reason why.  After all, if a
grandchild needs the child to be at full power, then the parent will
automatically end up at full power too.  Which means neither
leave_runtime_suspended nor parent_needed has to be propagated up the
tree.

Hmmm, I just thought of something else.  What about non-parent-child 
relationships?  Device B might depend on device A, even though A isn't 
an ancestor of B.  I guess in this case, A's leave_runtime_suspended 
flag should not be set.

> > As a corollary, if we don't have the child's permission to leave the
> > parent suspended during system resume then we have to invoke all of the
> > parent's resume callbacks, which means we also have to invoke all the
> > suspend callbacks.  However, we still might be able to leave the parent
> > in runtime suspend during the suspend stages.  The decision whether or
> > not to do so should be up to the subsystem or driver, not the PM core; 
> > the subsystem's callback routines can check the device's runtime status 
> > and then do what they want.
> 
> Yes, but they can do that anyway, can't they? :-)

Yes, they can.  The point of this part of the patch is that adding the
leave_runtime_suspended flag (1) makes the subsystem's decision a
little easier and (2) informs the subsystem when it can safely avoid
performing a runtime resume in its ->suspend() callback.

> > You are using leave_runtime_suspended to mean two different things:  
> > remain runtime-suspended during the system suspend stages (i.e., no
> > reprogramming is needed so don't go to full power), and remain
> > runtime-suspended during both the system suspend and system resume
> > stages.  Only the first meaning matters if all you want to accomplish
> > is to avoid unnecessary runtime resumes during system suspend.
> 
> Well, this is not the case, because you can't call ->resume_noirq() *after*
> ->runtime_suspend() for a number of drivers, as they simply may not expect
> that to happen (that covers all of the PCI drivers and the ACPI PM domain at
> least).

For some non-PCI, non-ACPI PM domain drivers, it _is_ okay to call
->resume_noirq() after ->runtime_suspend().

But forget about that; let's concentrate on PCI.  When a PCI driver
sets leave_runtime_suspended, it is telling the PCI core that it
doesn't mind having its ->resume_noirq() callback invoked after
->runtime_suspend().  If a PCI driver doesn't set
leave_runtime_suspended, the PCI core will continue to handle it
exactly the same as now: do a runtime resume before invoking the
driver's ->suspend() callback.

> So you can't say "well, I'll skip your ->suspend_late and ->suspend_noirq,
> but then I'll resume you traditionally" for those drivers, but this isn't
> about remaining runtime-suspended during system resume too, but about
> preserving the expected ordering of callbacks for them.

For drivers that don't set leave_runtime_suspended, the ordering of 
callbacks will be unchanged.  That's why you introduced this flag 
originally, right?  So that the subsystem would know which drivers 
don't mind doing things differently.

> So yes, the goal is to "remain runtime-suspended during the system suspend stages",
> but that *leads* *to* "do not execute system resume callbacks up to and including
> ->resume()" as well, at least for an important subset of drivers.

I disagree, for the reasons given above.

Now, this raises a second issue: How should we handle devices that can
remain runtime-suspended through both the system suspend and system
resume stages?  Maybe we better discuss that in a separate email
thread.

> > > Note: Drivers (or bus types etc.) can reasonably expect that the
> > > next PM callback executed after ->runtime_suspend() will be
> > > ->runtime_resume() rather than ->resume_noirq() or ->resume_early().
> > > This change is designed with that expectation in mind.
> > 
> > Except, of course, that in the current kernel this isn't true.
> 
> Well, what about PCI devices?  Their drivers surely can have such an expectation,
> because all of the PCI devices *are* resumed today in either pci_pm_prepare() or
> pci_pm_suspend(), before executing the driver's ->suspend() callback.

Yes, of course.  But the USB subsystem, for example, doesn't expect it.

> > And there probably are a few cases where it can't ever be true.
> 
> It is easy to say things like that without giving any examples.

Sorry, what I meant was that _if_ a device is going to remain in 
runtime suspend throughout the system suspend and system resume stages, 
then there probably are cases where the device will have to come back 
to full power during resume_early -- which means ->runtime_resume() 
can't be used, because runtime PM is disabled during resume_early.  
Isn't this true of PCI bridges?

Alan Stern


* Re: [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices
  2014-05-05 15:46                                                 ` Alan Stern
@ 2014-05-06  1:31                                                   ` Rafael J. Wysocki
  2014-05-06 19:31                                                     ` Alan Stern
  0 siblings, 1 reply; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-06  1:31 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Monday, May 05, 2014 11:46:47 AM Alan Stern wrote:
> I'll trim a lot of material and respond to the points that are 
> important comments or criticisms.
> 
> On Mon, 5 May 2014, Rafael J. Wysocki wrote:
> 
> > > The description below is much better than the earlier one, but I still feel
> > > this deserves to be split in two: one patch for each new flag.
> > 
> > Well, I guess I can introduce power.leave_runtime_suspended for leaf devices
> > first, but that would be somewhat artificial, because in that case some code
> > added by the first patch would be removed by the second one. :-)
> 
> The "leaf devices" thing is a key point; see below.
> 
> > There is a theoretical case where the child is runtime-suspended, but
> > not actually zero-power, and it doesn't have leave_runtime_suspended set,
> > but its driver doesn't implement ->suspend() at all and instead it waits
> > until ->suspend_late() or even ->suspend_noirq() and then attempts to do
> > something extra to the device.  Then, if the parent is a bridge and is
> > required to be functional for accessing the child, we can't leave it
> > runtime-suspended too.
> > 
> > I'm not sure how realistic that is, to be honest, but it does look like
> > a valid thing to do to my eyes, so in my opinion we may need to get the
> > child driver's permission to leave the parent in runtime suspend for
> > that reason too.
> 
> The only time this would be a problem is if the driver changes the
> device's settings from the runtime-suspend values to the system-suspend
> values, without doing a runtime resume first.  For example, the driver
> might disable wakeup while the device remains at low power.
> 
> I'm not sure how realistic this is, either -- although it's not hard to 
> imagine a PCI driver doing this sort of thing.  Still, if you think we 
> should worry about it then I agree, the parent_needed flag ought to be 
> present from the start.
> 
> > I guess it is fair to simply say that "we need to get the child driver's
> > permission to leave the parent in runtime suspend".
> 
> Okay.  But does it follow that we need permission from the child's
> descendants as well?  I don't see any reason why.  After all, if a
> grandchild needs the child to be at full power, then the parent will
> automatically end up at full power too.  Which means neither
> leave_runtime_suspended nor parent_needed has to be propagated up the
> tree.
> 
> Hmmm, I just thought of something else.  What about non-parent-child 
> relationships?  Device B might depend on device A, even though A isn't 
> an ancestor of B.  I guess in this case, A's leave_runtime_suspended 
> flag should not be set.

That's true in general, but then I wouldn't expect A to be runtime-suspended
any more after B's ->suspend() has run, which needs to happen before A's
->suspend() runs, and that's where the flag is checked.

[This assumes that rpm_resume() will clear leave_runtime_suspended automatically
and I'll make this assumption going forward.]

> > > As a corollary, if we don't have the child's permission to leave the
> > > parent suspended during system resume then we have to invoke all of the
> > > parent's resume callbacks, which means we also have to invoke all the
> > > suspend callbacks.  However, we still might be able to leave the parent
> > > in runtime suspend during the suspend stages.  The decision whether or
> > > not to do so should be up to the subsystem or driver, not the PM core; 
> > > the subsystem's callback routines can check the device's runtime status 
> > > and then do what they want.
> > 
> > Yes, but they can do that anyway, can't they? :-)
> 
> Yes, they can.  The point of this part of the patch is that adding the
> leave_runtime_suspended flag (1) makes the subsystem's decision a
> little easier and (2) informs the subsystem when it can safely avoid
> performing a runtime resume in its ->suspend() callback.

Yes, it is.

> > > You are using leave_runtime_suspended to mean two different things:  
> > > remain runtime-suspended during the system suspend stages (i.e., no
> > > reprogramming is needed so don't go to full power), and remain
> > > runtime-suspended during both the system suspend and system resume
> > > stages.  Only the first meaning matters if all you want to accomplish
> > > is to avoid unnecessary runtime resumes during system suspend.
> > 
> > Well, this is not the case, because you can't call ->resume_noirq() *after*
> > ->runtime_suspend() for a number of drivers, as they simply may not expect
> > that to happen (that covers all of the PCI drivers and the ACPI PM domain at
> > least).
> 
> For some non-PCI, non-ACPI PM domain drivers, it _is_ okay to call
> ->resume_noirq() after ->runtime_suspend().

Yes, it may be OK to do that for some drivers, but not for all of them and
that's the point.

> But forget about that; let's concentrate on PCI.  When a PCI driver
> sets leave_runtime_suspended, it is telling the PCI core that it
> doesn't mind having its ->resume_noirq() callback invoked after
> ->runtime_suspend().

No, it doesn't.

First of all, you're assuming that drivers will set that flag, but PCI
drivers have no idea about the wakeup settings which are taken care of by
the PCI bus type.  This means that the bus type will also set
leave_runtime_suspended.

Second, even if a driver sets leave_runtime_suspended for a device, this
doesn't have to mean that its ->resume_noirq() may be called directly
after its ->runtime_suspend().  What it means is that (a) the state of
the device is appropriate for system suspend and (b) there are no reasons
known to it why the device should be resumed during system suspend.

And yes, the subsystem can very well do it all by itself, but then the
same approach will probably be duplicated in multiple subsystems and
they won't be able to cross the bus type boundary, for example.

> If a PCI driver doesn't set
> leave_runtime_suspended, the PCI core will continue to handle it
> exactly the same as now: do a runtime resume before invoking the
> driver's ->suspend() callback.
> 
> > So you can't say "well, I'll skip your ->suspend_late and ->suspend_noirq,
> > but then I'll resume you traditionally" for those drivers, but this isn't
> > about remaining runtime-suspended during system resume too, but about
> > preserving the expected ordering of callbacks for them.
> 
> For drivers that don't set leave_runtime_suspended, the ordering of 
> callbacks will be unchanged.  That's why you introduced this flag 
> originally, right?

No.

> So that the subsystem would know which drivers don't mind doing things differently.

Again, no.

In fact, my original idea was to do that thing in the subsystems without
involving the PM core, but then I would only be able to cover leaf devices.
So I decided to do something more general, but the flag is exactly for what
it does in the patch - to tell the PM core to skip a number of callbacks for
a device, all of the high-level considerations notwithstanding.

So you may not like the idea that skipping suspend callbacks implies skipping
the corresponding resume callbacks, but that's the simplest way to do it and
quite frankly I don't see why this is a problem.

So if that is a problem, for example, for USB, please tell me why it is a
problem.  I mean, practically.

> > So yes, the goal is to "remain runtime-suspended during the system suspend stages",
> > but that *leads* *to* "do not execute system resume callbacks up to and including
> > ->resume()" as well, at least for an important subset of drivers.
> 
> I disagree, for the reasons given above.

Well, that means we don't agree here. :-)

> Now, this raises a second issue: How should we handle devices that can
> remain runtime-suspended through both the system suspend and system
> resume stages?  Maybe we better discuss that in a separate email
> thread.

I'm not sure what you mean?  Devices that can stay suspended throughout the
entire system suspend and resume?

> > > > Note: Drivers (or bus types etc.) can reasonably expect that the
> > > > next PM callback executed after ->runtime_suspend() will be
> > > > ->runtime_resume() rather than ->resume_noirq() or ->resume_early().
> > > > This change is designed with that expectation in mind.
> > > 
> > > Except, of course, that in the current kernel this isn't true.
> > 
> > Well, what about PCI devices?  Their drivers surely can have such an expectation,
> > because all of the PCI devices *are* resumed today in either pci_pm_prepare() or
> > pci_pm_suspend(), before executing the driver's ->suspend() callback.
> 
> Yes, of course.  But the USB subsystem, for example, doesn't expect it.

Well, it may not expect it and that's fine.  Still, as long as there are any
drivers that can expect it, the PM core has to take that into account.

> > > And there probably are a few cases where it can't ever be true.
> > 
> > It is easy to say things like that without giving any examples.
> 
> Sorry, what I meant was that _if_ a device is going to remain in 
> runtime suspend throughout the system suspend and system resume stages, 
> then there probably are cases where the device will have to come back 
> to full power during resume_early -- which means ->runtime_resume() 
> can't be used, because runtime PM is disabled during resume_early.  
> Isn't this true of PCI bridges?

Yes, we want PCI bridges to go to full power early and they are kind of special
in a few ways, so I don't think we'll set leave_runtime_suspended for them any
time soon (and it wouldn't make sense anyway, because they don't go to low
power during runtime suspend).

That said, in principle, if everything below a bridge is suspended, the bridge
may stay suspended too, as long as there's no need to access any devices below it.

This actually seems to be a key observation not only for bridges. :-)

Rafael

* Re: [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices
  2014-05-06  1:31                                                   ` Rafael J. Wysocki
@ 2014-05-06 19:31                                                     ` Alan Stern
  2014-05-07  0:36                                                       ` Rafael J. Wysocki
  0 siblings, 1 reply; 78+ messages in thread
From: Alan Stern @ 2014-05-06 19:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Tue, 6 May 2014, Rafael J. Wysocki wrote:

> > > > You are using leave_runtime_suspended to mean two different things:  
> > > > remain runtime-suspended during the system suspend stages (i.e., no
> > > > reprogramming is needed so don't go to full power), and remain
> > > > runtime-suspended during both the system suspend and system resume
> > > > stages.  Only the first meaning matters if all you want to accomplish
> > > > is to avoid unnecessary runtime resumes during system suspend.
> > > 
> > > Well, this is not the case, because you can't call ->resume_noirq() *after*
> > > ->runtime_suspend() for a number of drivers, as they simply may not expect
> > > that to happen (that covers all of the PCI drivers and the ACPI PM domain at
> > > least).

If you can't call ->resume_noirq() after ->runtime_suspend(), is it
okay to call ->suspend() after ->runtime_suspend()?

If it is okay, then there's no problem.  The subsystem invokes all of 
the driver's callbacks, even if leave_runtime_suspended is set.  The 
only difference is that the subsystem doesn't do a runtime-resume 
before invoking the ->suspend() callback.

If it's not okay...  Well, then the only option (aside from the runtime
resume we currently do) is to leave the device in runtime suspend all
the way up to the resume stage of system resume, and then do a
runtime-resume instead of calling ->resume().
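
To make the two alternatives concrete, here is a small user-space
sketch of which callbacks a device would see over one complete
suspend/resume cycle.  The enum and model_cycle_callbacks() are
hypothetical names used only for this illustration; nothing here is
taken from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the callbacks discussed in this thread. */
enum model_cb {
	CB_SUSPEND,
	CB_SUSPEND_LATE,
	CB_SUSPEND_NOIRQ,
	CB_RESUME_NOIRQ,
	CB_RESUME_EARLY,
	CB_RESUME,
	CB_RUNTIME_RESUME,
};

#define MODEL_MAX_CBS 6

/*
 * Fill 'out' with the callbacks invoked for one device during a full
 * system suspend/resume cycle and return how many there are.
 *
 * left_suspended == true models the "leave it in runtime suspend up to
 * the resume stage" option: every system-wide callback is skipped and a
 * single runtime resume takes the place of ->resume().
 */
size_t model_cycle_callbacks(bool left_suspended, enum model_cb out[], size_t cap)
{
	static const enum model_cb full[] = {
		CB_SUSPEND, CB_SUSPEND_LATE, CB_SUSPEND_NOIRQ,
		CB_RESUME_NOIRQ, CB_RESUME_EARLY, CB_RESUME,
	};
	const size_t nr_full = sizeof(full) / sizeof(full[0]);
	size_t n = 0, i;

	if (left_suspended) {
		assert(cap >= 1);
		out[n++] = CB_RUNTIME_RESUME;
		return n;
	}

	assert(cap >= nr_full);
	for (i = 0; i < nr_full; i++)
		out[n++] = full[i];
	return n;
}
```

In the second option the driver never sees a system resume callback
right after ->runtime_suspend(), which is the ordering concern raised
for PCI and the ACPI PM domain.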

> > For some non-PCI, non-ACPI PM domain drivers, it _is_ okay to call
> > ->resume_noirq() after ->runtime_suspend().
> 
> Yes, it may be OK to do that for some drivers, but not for all of them and
> that's the point.
> 
> > But forget about that; let's concentrate on PCI.  When a PCI driver
> > sets leave_runtime_suspended, it is telling the PCI core that it
> > doesn't mind having its ->resume_noirq() callback invoked after
> > ->runtime_suspend().
> 
> No, it doesn't.
> 
> First of all, you're assuming that drivers will set that flag, but PCI
> drivers have no idea about the wakeup settings which are taken care of by
> the PCI bus type.  This means that the bus type will also set
> leave_runtime_suspended.

How can the subsystem know whether the device is in a suitable state 
for system suspend?  Only the driver knows -- assuming there is a 
driver.  Agreed, if there is no driver then the subsystem might set
leave_runtime_suspended, but in that case it wouldn't cause any 
problem.

> Second, even if a driver sets leave_runtime_suspended for a device, this
> doesn't have to mean that its ->resume_noirq() may be called directly
> after its ->runtime_suspend().  What it means is that (a) the state of
> the device is appropriate for system suspend and (b) there are no reasons
> known to it why the device should be resumed during system suspend.

It's not too late to change the meaning.  :-)

> And yes, the subsystem can very well do it all by itself, but then the
> same approach will probably be duplicated in multiple subsystems and
> they won't be able to cross the bus type boundary, for example.

True.

> In fact, my original idea was to do that thing in the subsystems without
> involving the PM core, but then I would only be able to cover leaf devices.
> So I decided to do something more general, but the flag is exactly for what
> it does in the patch - to tell the PM core to skip a number of callbacks for
> a device, all of the high-level considerations notwithstanding.
> 
> So you may not like the idea that skipping suspend callbacks implies skipping
> the corresponding resume callbacks, but that's the simplest way to do it and
> quite frankly I don't see why this is a problem.

All right.  Then this seems to be what you want:

	For some devices, it's okay to remain in runtime suspend 
	throughout a complete system suspend/resume cycle (if the
	device was in runtime suspend at the start of the cycle).
	We would like to do this whenever possible, to avoid the
	overhead of extra power-up and power-down events.

	However, problems may arise because the device's descendants 
	may require it to be at full power at various points during 
	the cycle.  Therefore the only way to do this safely is if the 
	device _and_ all its descendants can remain runtime suspended 
	until the resume stage of system resume.

	To this end, introduce dev->power.leave_runtime_suspended.
	If a subsystem or driver sets this flag during the ->prepare()
	callback, and if the flag is set in all of the device's
	descendants, and if the device is still in runtime suspend when
	the ->suspend() callback would normally be invoked, then the PM
	core will not invoke the device's ->suspend(), 
	->suspend_late(), ->suspend_noirq(), ->resume_noirq(),
	->resume_early(), or ->resume() callbacks.  Instead, it will 
	invoke ->runtime_resume() during the resume stage of system
	resume.

	By setting this flag, a driver or subsystem tells the PM core
	that the device is runtime suspended, it is in a suitable state
	for system suspend (for example, the wakeup setting does not
	need to be changed), and it does not need to return to full
	power until the resume stage.

Does that correctly describe what you want to do, the potential
problems, and the proposed solution?

If so, then it appears the parent_needed flag is unnecessary.
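
As a rough illustration of the rule described above, the check could
be modeled in user space as follows.  This is only a sketch: struct
tree_dev, MODEL_MAX_CHILDREN and both helpers are invented for the
example, and devices with ignore_children set are not handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_CHILDREN 4

/* Hypothetical miniature of a device node with the proposed flag. */
struct tree_dev {
	bool runtime_suspended;
	bool leave_runtime_suspended;
	size_t nr_children;
	struct tree_dev *children[MODEL_MAX_CHILDREN];
};

/* True if the flag is set in 'dev' and in every one of its descendants. */
bool model_flag_set_in_subtree(const struct tree_dev *dev)
{
	size_t i;

	assert(dev->nr_children <= MODEL_MAX_CHILDREN);
	if (!dev->leave_runtime_suspended)
		return false;
	for (i = 0; i < dev->nr_children; i++)
		if (!model_flag_set_in_subtree(dev->children[i]))
			return false;
	return true;
}

/*
 * Decision taken at the point where ->suspend() would normally run:
 * skip the system-wide callbacks only if the device is still
 * runtime-suspended and the whole subtree has agreed.
 */
bool model_can_skip_callbacks(const struct tree_dev *dev)
{
	return dev->runtime_suspended && model_flag_set_in_subtree(dev);
}
```

A single descendant with the flag clear is enough to make every
ancestor go through the normal callbacks, while its siblings can still
be left alone.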

Alan Stern

* Re: [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices
  2014-05-06 19:31                                                     ` Alan Stern
@ 2014-05-07  0:36                                                       ` Rafael J. Wysocki
  2014-05-07 15:43                                                         ` Alan Stern
  0 siblings, 1 reply; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-07  0:36 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Tuesday, May 06, 2014 03:31:02 PM Alan Stern wrote:
> On Tue, 6 May 2014, Rafael J. Wysocki wrote:
> 
> > > > > You are using leave_runtime_suspended to mean two different things:  
> > > > > remain runtime-suspended during the system suspend stages (i.e., no
> > > > > reprogramming is needed so don't go to full power), and remain
> > > > > runtime-suspended during both the system suspend and system resume
> > > > > stages.  Only the first meaning matters if all you want to accomplish
> > > > > is to avoid unnecessary runtime resumes during system suspend.
> > > > 
> > > > Well, this is not the case, because you can't call ->resume_noirq() *after*
> > > > ->runtime_suspend() for a number of drivers, as they simply may not expect
> > > > that to happen (that covers all of the PCI drivers and the ACPI PM domain at
> > > > least).
> 
> If you can't call ->resume_noirq() after ->runtime_suspend(), is it
> okay to call ->suspend() after ->runtime_suspend()?

For PCI devices, I don't know, because we've never done that for them.

We've always resumed them during system suspend, so system suspend/resume
callbacks have never been mixed with runtime PM callbacks for them.  And
that is the whole point. :-)

> If it is okay, then there's no problem.  The subsystem invokes all of 
> the driver's callbacks, even if leave_runtime_suspended is set.  The 
> only difference is that the subsystem doesn't do a runtime-resume 
> before invoking the ->suspend() callback.
> 
> If it's not okay...  Well, then the only option (aside from the runtime
> resume we currently do) is to leave the device in runtime suspend all
> the way up to the resume stage of system resume, and then do a
> runtime-resume instead of calling ->resume().

Precisely. :-)

> > > For some non-PCI, non-ACPI PM domain drivers, it _is_ okay to call
> > > ->resume_noirq() after ->runtime_suspend().
> > 
> > Yes, it may be OK to do that for some drivers, but not for all of them and
> > that's the point.
> > 
> > > But forget about that; let's concentrate on PCI.  When a PCI driver
> > > sets leave_runtime_suspended, it is telling the PCI core that it
> > > doesn't mind having its ->resume_noirq() callback invoked after
> > > ->runtime_suspend().
> > 
> > No, it doesn't.
> > 
> > First of all, you're assuming that drivers will set that flag, but PCI
> > drivers have no idea about the wakeup settings which are taken care of by
> > the PCI bus type.  This means that the bus type will also set
> > leave_runtime_suspended.
> 
> How can the subsystem know whether the device is in a suitable state 
> for system suspend?  Only the driver knows -- assuming there is a 
> driver.  Agreed, if there is no driver then the subsystem might set
> leave_runtime_suspended, but in that case it wouldn't cause any 
> problem.

Not really.  In ->runtime_suspend() a PCI driver (in particular, but I'd say
that in general) is supposed to save the registers of the device it cares about
and then it is not supposed to touch it until ->runtime_resume().  Now, for PCI
drivers there's one additional twist: They don't put devices into low-power
states and don't prepare them for wakeup.  The PCI bus type does that.  Hence,
PCI drivers don't know whether or not the devices are in the right state for
system suspend, in particular with respect to wakeup, but the PCI bus type
knows that.  The same applies to devices in the ACPI PM domain.
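
A hedged sketch of that split of responsibilities, as a user-space
model rather than real PCI code: the bus type, not the driver, compares
the wakeup setting programmed at runtime suspend with the one system
suspend needs.  struct bus_dev, its fields and model_bus_prepare() are
assumptions made up for this example; the positive return value follows
the .prepare() convention described in the cover letter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of what the bus type knows about a device. */
struct bus_dev {
	bool runtime_suspended;	/* device is currently runtime-suspended */
	bool wakeup_armed;	/* wakeup as programmed at runtime suspend */
	bool may_wakeup;	/* wakeup as required for system suspend */
};

/*
 * Bus-type ->prepare() model.  Returns 1 ("can stay suspended") only if
 * the device is runtime-suspended and its wakeup setting already matches
 * what system suspend requires; returns 0 otherwise, meaning the usual
 * resume-then-suspend path is taken.
 */
int model_bus_prepare(const struct bus_dev *dev)
{
	assert(dev != NULL);
	if (!dev->runtime_suspended)
		return 0;
	return dev->wakeup_armed == dev->may_wakeup ? 1 : 0;
}
```

The driver does not appear in this decision at all, which is the point
being made about PCI and the ACPI PM domain.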

> > Second, even if a driver sets leave_runtime_suspended for a device, this
> > doesn't have to mean that its ->resume_noirq() may be called directly
> > after its ->runtime_suspend().  What it means is that (a) the state of
> > the device is appropriate for system suspend and (b) there are no reasons
> > known to it why the device should be resumed during system suspend.
> 
> It's not too late to change the meaning.  :-)

Well, OK. :-)

> > And yes, the subsystem can very well do it all by itself, but then the
> > same approach will probably be duplicated in multiple subsystems and
> > they won't be able to cross the bus type boundary, for example.
> 
> True.
> 
> > In fact, my original idea was to do that thing in the subsystems without
> > involving the PM core, but then I would only be able to cover leaf devices.
> > So I decided to do something more general, but the flag is exactly for what
> > it does in the patch - to tell the PM core to skip a number of callbacks for
> > a device, all of the high-level considerations notwithstanding.
> > 
> > So you may not like the idea that skipping suspend callbacks implies skipping
> > the corresponding resume callbacks, but that's the simplest way to do it and
> > quite frankly I don't see why this is a problem.
> 
> All right.  Then this seems to be what you want:
> 
> 	For some devices, it's okay to remain in runtime suspend 
> 	throughout a complete system suspend/resume cycle (if the
> 	device was in runtime suspend at the start of the cycle).
> 	We would like to do this whenever possible, to avoid the
> 	overhead of extra power-up and power-down events.

Yes.

> 	However, problems may arise because the device's descendants 
> 	may require it to be at full power at various points during 
> 	the cycle.  Therefore the only way to do this safely is if the 
> 	device _and_ all its descendants can remain runtime suspended 
> 	until the resume stage of system resume.

It may not be the only way, but it is *a* way to do this safely.

> 	To this end, introduce dev->power.leave_runtime_suspended.
> 	If a subsystem or driver sets this flag during the ->prepare()
> 	callback, and if the flag is set in all of the device's
> 	descendants, and if the device is still in runtime suspend when
> 	the ->suspend() callback would normally be invoked, then the PM
> 	core will not invoke the device's ->suspend(), 
> 	->suspend_late(), ->suspend_noirq(), ->resume_noirq(),
> 	->resume_early(), or ->resume() callbacks.  Instead, it will 
> 	invoke ->runtime_resume() during the resume stage of system
> 	resume.

Yes.

> 	By setting this flag, a driver or subsystem tells the PM core
> 	that the device is runtime suspended, it is in a suitable state
> 	for system suspend (for example, the wakeup setting does not
> 	need to be changed), and it does not need to return to full
> 	power until the resume stage.

Yes.

> Does that correctly describe what you want to do, the potential
> problems, and the proposed solution?

Almost.  Devices with power.ignore_children set are not covered by this.

> If so, then it appears the parent_needed flag is unnecessary.

Well, I can agree with that.  It wasn't there in my first patchset and I added
it kind of in the hope to be able to deal with the ignore_children devices
with the help of it.

OK, I guess I need to prepare a new version without the parent_needed flag for
further discussion. :-)

Rafael



* Re: [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices
  2014-05-07  0:36                                                       ` Rafael J. Wysocki
@ 2014-05-07 15:43                                                         ` Alan Stern
  2014-05-07 23:27                                                           ` [RFC][PATCH 0/3] (was: Re: [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices) Rafael J. Wysocki
  0 siblings, 1 reply; 78+ messages in thread
From: Alan Stern @ 2014-05-07 15:43 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Wed, 7 May 2014, Rafael J. Wysocki wrote:

We seem to be in agreement that this is the way you want to go...

> > All right.  Then this seems to be what you want:
> > 
> > 	For some devices, it's okay to remain in runtime suspend 
> > 	throughout a complete system suspend/resume cycle (if the
> > 	device was in runtime suspend at the start of the cycle).
> > 	We would like to do this whenever possible, to avoid the
> > 	overhead of extra power-up and power-down events.
> 
> Yes.
> 
> > 	However, problems may arise because the device's descendants 
> > 	may require it to be at full power at various points during 
> > 	the cycle.  Therefore the only way to do this safely is if the 
> > 	device _and_ all its descendants can remain runtime suspended 
> > 	until the resume stage of system resume.
> 
> It may not be the only way, but it is *a* way to do this safely.
> 
> > 	To this end, introduce dev->power.leave_runtime_suspended.
> > 	If a subsystem or driver sets this flag during the ->prepare()
> > 	callback, and if the flag is set in all of the device's
> > 	descendants, and if the device is still in runtime suspend when
> > 	the ->suspend() callback would normally be invoked, then the PM
> > 	core will not invoke the device's ->suspend(), 
> > 	->suspend_late(), ->suspend_noirq(), ->resume_noirq(),
> > 	->resume_early(), or ->resume() callbacks.  Instead, it will 
> > 	invoke ->runtime_resume() during the resume stage of system
> > 	resume.
> 
> Yes.
> 
> > 	By setting this flag, a driver or subsystem tells the PM core
> > 	that the device is runtime suspended, it is in a suitable state
> > 	for system suspend (for example, the wakeup setting does not
> > 	need to be changed), and it does not need to return to full
> > 	power until the resume stage.
> 
> Yes.
> 
> > Does that correctly describe what you want to do, the potential
> > problems, and the proposed solution?
> 
> Almost.  Devices with power.ignore_children set are not covered by this.

I thought they were.  In what respect aren't they?  You mean because
they can be runtime suspended while their children remain active?

I don't think that matters here.  Suppose a parent device's
leave_runtime_suspended flag is set but one of its children isn't
runtime suspended.  Then that child's leave_runtime_suspended flag
won't be set, so the parent device won't meet the criterion for
skipping the normal PM callbacks.

Or do you mean that a child might expect the parent to be at full power
when the child is resumed (plus the fact that doing a runtime resume on
the child will not automatically resume the parent)?  That doesn't
matter either, because the PM core will do a runtime-resume of the
parent before the child's ->runtime_resume() is called.
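
The criterion above can be sketched as a small user-space model.  This is
only an illustration under simplifying assumptions, not kernel code:
struct model_dev, model_prepare() and model_suspend() are hypothetical names
that mirror how the flag is set in ->prepare() and then cleared in a parent
whenever one of its children has it clear.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device; not the kernel structure. */
struct model_dev {
	struct model_dev *parent;
	bool runtime_suspended;		/* models the RPM_SUSPENDED status */
	bool leave_runtime_suspended;	/* models power.leave_runtime_suspended */
};

/* ->prepare() stage: the subsystem opts the device in. */
void model_prepare(struct model_dev *dev)
{
	dev->leave_runtime_suspended = true;
}

/*
 * ->suspend() stage, to be called for children before their parents.
 * A device that is not runtime suspended drops its own flag, and any
 * device whose flag is clear also clears the flag of its parent.
 */
void model_suspend(struct model_dev *dev)
{
	if (!dev->runtime_suspended)
		dev->leave_runtime_suspended = false;

	if (dev->parent && !dev->leave_runtime_suspended)
		dev->parent->leave_runtime_suspended = false;
}
```

With one active child, the parent ends up with its flag clear even though it
is runtime suspended itself, so it goes through the normal callbacks.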

> > If so, then it appears the parent_needed flag is unnecessary.
> 
> Well, I can agree with that.  It wasn't there in my first patchset and I added
> it kind of in the hope to be able to deal with the ignore_children devices
> with the help of it.

Yeah.  I contributed to that, by not understanding exactly what you 
were trying to accomplish.

> OK, I guess I need to prepare a new version without the parent_needed flag for
> further discussion. :-)

Consider using the description above (or some variant of it) for the
new Changelog.  IMNSHO it does a much better job of explaining the 
patch than your original version.  :-)

Alan Stern



* [RFC][PATCH 0/3] (was: Re: [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices)
  2014-05-07 15:43                                                         ` Alan Stern
@ 2014-05-07 23:27                                                           ` Rafael J. Wysocki
  2014-05-07 23:29                                                             ` [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices Rafael J. Wysocki
                                                                               ` (2 more replies)
  0 siblings, 3 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-07 23:27 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Wednesday, May 07, 2014 11:43:39 AM Alan Stern wrote:
> On Wed, 7 May 2014, Rafael J. Wysocki wrote:
> 
> We seem to be in agreement that this is the way you want to go...
> 
> > > All right.  Then this seems to be what you want:
> > > 
> > > 	For some devices, it's okay to remain in runtime suspend 
> > > 	throughout a complete system suspend/resume cycle (if the
> > > 	device was in runtime suspend at the start of the cycle).
> > > 	We would like to do this whenever possible, to avoid the
> > > 	overhead of extra power-up and power-down events.
> > 
> > Yes.
> > 
> > > 	However, problems may arise because the device's descendants 
> > > 	may require it to be at full power at various points during 
> > > 	the cycle.  Therefore the only way to do this safely is if the 
> > > 	device _and_ all its descendants can remain runtime suspended 
> > > 	until the resume stage of system resume.
> > 
> > It may not be the only way, but it is *a* way to do this safely.
> > 
> > > 	To this end, introduce dev->power.leave_runtime_suspended.
> > > 	If a subsystem or driver sets this flag during the ->prepare()
> > > 	callback, and if the flag is set in all of the device's
> > > 	descendants, and if the device is still in runtime suspend when
> > > 	the ->suspend() callback would normally be invoked, then the PM
> > > 	core will not invoke the device's ->suspend(), 
> > > 	->suspend_late(), ->suspend_noirq(), ->resume_noirq(),
> > > 	->resume_early(), or ->resume() callbacks.  Instead, it will 
> > > 	invoke ->runtime_resume() during the resume stage of system
> > > 	resume.
> > 
> > Yes.
> > 
> > > 	By setting this flag, a driver or subsystem tells the PM core
> > > 	that the device is runtime suspended, it is in a suitable state
> > > 	for system suspend (for example, the wakeup setting does not
> > > 	need to be changed), and it does not need to return to full
> > > 	power until the resume stage.
> > 
> > Yes.
> > 
> > > Does that correctly describe what you want to do, the potential
> > > problems, and the proposed solution?
> > 
> > Almost.  Devices with power.ignore_children set are not covered by this.
> 
> I thought they were.  In what respect aren't they?  You mean because
> they can be runtime suspended while their children remain active?
> 
> I don't think that matters here.  Suppose a parent device's
> leave_runtime_suspended flag is set but one of its children isn't
> runtime suspended.  Then that child's leave_runtime_suspended flag
> won't be set, so the parent device won't meet the criterion for
> skipping the normal PM callbacks.
> 
> Or do you mean that a child might expect the parent to be at full power
> when the child is resumed (plus the fact that doing a runtime resume on
> the child will not automatically resume the parent)?  That doesn't
> matter either, because the PM core will do a runtime-resume of the
> parent before the child's ->runtime_resume() is called.

OK

> > > If so, then it appears the parent_needed flag is unnecessary.
> > 
> > Well, I can agree with that.  It wasn't there in my first patchset and I added
> > it kind of in the hope to be able to deal with the ignore_children devices
> > with the help of it.
> 
> Yeah.  I contributed to that, by not understanding exactly what you 
> were trying to accomplish.
> 
> > OK, I guess I need to prepare a new version without the parent_needed flag for
> > further discussion. :-)
> 
> Consider using the description above (or some variant of it) for the
> new Changelog.  IMNSHO it does a much better job of explaining the 
> patch than your original version.  :-)

Yes, it does and I actually used it with minor modifications. :-)

A refreshed series follows.  I still want pm_runtime_enabled_and_suspended()
because a device's runtime suspend may (theoretically) complete after its
->prepare() callback has been executed, and I think it's better to avoid
resuming it in that case too if that's not necessary.
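
To illustrate the case I have in mind, here is a rough user-space sketch of
the check.  It is a model only: struct model_rpm, the MODEL_RPM_* values and
model_enabled_and_suspended() are made-up names, and the "barrier" is reduced
to letting an in-flight suspend finish, which is an assumption about what
matters here rather than a description of everything the real barrier does.

```c
#include <stdbool.h>

/* Hypothetical, reduced set of runtime PM states. */
enum model_rpm_status {
	MODEL_RPM_ACTIVE,
	MODEL_RPM_SUSPENDING,
	MODEL_RPM_SUSPENDED,
};

/* Hypothetical stand-in for the runtime PM part of dev->power. */
struct model_rpm {
	unsigned int disable_depth;	/* nonzero: runtime PM is disabled */
	enum model_rpm_status status;
};

bool model_enabled_and_suspended(struct model_rpm *rpm)
{
	/* Runtime PM disabled: the status cannot be relied on. */
	if (rpm->disable_depth)
		return false;

	/* Stand-in for the barrier: wait for an in-flight suspend. */
	if (rpm->status == MODEL_RPM_SUSPENDING)
		rpm->status = MODEL_RPM_SUSPENDED;

	return rpm->status == MODEL_RPM_SUSPENDED;
}
```

A device that was still suspending when ->prepare() ran is reported as
suspended by the time ->suspend() checks it, so it need not be resumed.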

Rafael


* [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-07 23:27                                                           ` [RFC][PATCH 0/3] (was: Re: [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices) Rafael J. Wysocki
@ 2014-05-07 23:29                                                             ` Rafael J. Wysocki
  2014-05-08  7:49                                                               ` Ulf Hansson
                                                                                 ` (2 more replies)
  2014-05-07 23:31                                                             ` [Resend][PATCH 2/3] PM / runtime: Routine for checking device status during system suspend Rafael J. Wysocki
  2014-05-07 23:33                                                             ` [RFC][PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain " Rafael J. Wysocki
  2 siblings, 3 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-07 23:29 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
resume all runtime-suspended devices during system suspend, mostly
because those devices may need to be reprogrammed due to different
wakeup settings for system sleep and for runtime PM.

For some devices, though, it's OK to remain in runtime suspend 
throughout a complete system suspend/resume cycle (if the device was in
runtime suspend at the start of the cycle).  We would like to do this
whenever possible, to avoid the overhead of extra power-up and power-down
events.

However, problems may arise because the device's descendants may require
it to be at full power at various points during the cycle.  Therefore the
most straightforward way to do this safely is to require that the device
and all of its descendants can remain runtime suspended until the resume
stage of system resume.

To this end, introduce dev->power.leave_runtime_suspended.
If a subsystem or driver sets this flag during the ->prepare() callback,
and if the flag is set in all of the device's descendants, and if the
device is still in runtime suspend at the beginning of the ->suspend()
callback, that callback is allowed to return 0 without clearing
power.leave_runtime_suspended and without changing the state of the
device, unless the current state of the device is not appropriate for
the upcoming system sleep state (for example, the device is supposed to
wake up the system from that state and its current wakeup settings are
not suitable for that).  Then, the PM core will not invoke the device's
->suspend_late(), ->suspend_noirq(), ->resume_noirq(), ->resume_early(), or
->resume() callbacks.  Instead, it will invoke ->runtime_resume() during
the device resume stage of system resume.

By leaving this flag set after ->suspend(), a driver or subsystem tells
the PM core that the device is runtime suspended, it is in a suitable
state for system suspend (for example, the wakeup setting does not
need to be changed), and it does not need to return to full
power until the resume stage.

Changelog based on Alan Stern's description of the idea
(http://marc.info/?l=linux-pm&m=139940466625569&w=2).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/power/main.c    |   31 ++++++++++++++++++++++++-------
 drivers/base/power/runtime.c |   10 ++++++++++
 include/linux/pm.h           |    3 +++
 include/linux/pm_runtime.h   |   16 ++++++++++++++++
 kernel/power/Kconfig         |    4 ++++
 5 files changed, 57 insertions(+), 7 deletions(-)

Index: linux-pm/kernel/power/Kconfig
===================================================================
--- linux-pm.orig/kernel/power/Kconfig
+++ linux-pm/kernel/power/Kconfig
@@ -147,6 +147,10 @@ config PM
 	def_bool y
 	depends on PM_SLEEP || PM_RUNTIME
 
+config PM_BOTH
+	def_bool y
+	depends on PM_SLEEP && PM_RUNTIME
+
 config PM_DEBUG
 	bool "Power Management Debug Support"
 	depends on PM
Index: linux-pm/include/linux/pm.h
===================================================================
--- linux-pm.orig/include/linux/pm.h
+++ linux-pm/include/linux/pm.h
@@ -583,6 +583,9 @@ struct dev_pm_info {
 	unsigned long		suspended_jiffies;
 	unsigned long		accounting_timestamp;
 #endif
+#ifdef CONFIG_PM_BOTH
+	bool			leave_runtime_suspended:1;
+#endif
 	struct pm_subsys_data	*subsys_data;  /* Owned by the subsystem. */
 	void (*set_latency_tolerance)(struct device *, s32);
 	struct dev_pm_qos	*qos;
Index: linux-pm/include/linux/pm_runtime.h
===================================================================
--- linux-pm.orig/include/linux/pm_runtime.h
+++ linux-pm/include/linux/pm_runtime.h
@@ -264,4 +264,20 @@ static inline void pm_runtime_dont_use_a
 	__pm_runtime_use_autosuspend(dev, false);
 }
 
+#ifdef CONFIG_PM_BOTH
+static inline void __set_leave_runtime_suspended(struct device *dev, bool val)
+{
+	dev->power.leave_runtime_suspended = val;
+}
+extern void pm_set_leave_runtime_suspended(struct device *dev, bool val);
+static inline bool pm_leave_runtime_suspended(struct device *dev)
+{
+	return dev->power.leave_runtime_suspended;
+}
+#else
+static inline void __set_leave_runtime_suspended(struct device *dev, bool val) {}
+static inline void pm_set_leave_runtime_suspended(struct device *dev, bool val) {}
+static inline bool pm_leave_runtime_suspended(struct device *dev) { return false; }
+#endif
+
 #endif
Index: linux-pm/drivers/base/power/runtime.c
===================================================================
--- linux-pm.orig/drivers/base/power/runtime.c
+++ linux-pm/drivers/base/power/runtime.c
@@ -732,6 +732,7 @@ static int rpm_resume(struct device *dev
 	}
  skip_parent:
 
+	__set_leave_runtime_suspended(dev, false);
 	if (dev->power.no_callbacks)
 		goto no_callback;	/* Assume success. */
 
@@ -1485,3 +1486,12 @@ out:
 	return ret;
 }
 EXPORT_SYMBOL_GPL(pm_runtime_force_resume);
+
+#ifdef CONFIG_PM_BOTH
+void pm_set_leave_runtime_suspended(struct device *dev, bool val)
+{
+	spin_lock_irq(&dev->power.lock);
+	__set_leave_runtime_suspended(dev, val);
+	spin_unlock_irq(&dev->power.lock);
+}
+#endif
Index: linux-pm/drivers/base/power/main.c
===================================================================
--- linux-pm.orig/drivers/base/power/main.c
+++ linux-pm/drivers/base/power/main.c
@@ -479,7 +479,7 @@ static int device_resume_noirq(struct de
 	TRACE_DEVICE(dev);
 	TRACE_RESUME(0);
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || pm_leave_runtime_suspended(dev))
 		goto Out;
 
 	if (!dev->power.is_noirq_suspended)
@@ -605,7 +605,7 @@ static int device_resume_early(struct de
 	TRACE_DEVICE(dev);
 	TRACE_RESUME(0);
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || pm_leave_runtime_suspended(dev))
 		goto Out;
 
 	if (!dev->power.is_late_suspended)
@@ -735,6 +735,11 @@ static int device_resume(struct device *
 	if (dev->power.syscore)
 		goto Complete;
 
+	if (pm_leave_runtime_suspended(dev)) {
+		pm_runtime_resume(dev);
+		goto Complete;
+	}
+
 	dpm_wait(dev->parent, async);
 	dpm_watchdog_set(&wd, dev);
 	device_lock(dev);
@@ -1007,7 +1012,7 @@ static int __device_suspend_noirq(struct
 		goto Complete;
 	}
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || pm_leave_runtime_suspended(dev))
 		goto Complete;
 
 	dpm_wait_for_children(dev, async);
@@ -1146,7 +1151,7 @@ static int __device_suspend_late(struct
 		goto Complete;
 	}
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || pm_leave_runtime_suspended(dev))
 		goto Complete;
 
 	dpm_wait_for_children(dev, async);
@@ -1382,10 +1387,21 @@ static int __device_suspend(struct devic
 
  End:
 	if (!error) {
+		struct device *parent = dev->parent;
+
 		dev->power.is_suspended = true;
-		if (dev->power.wakeup_path
-		    && dev->parent && !dev->parent->power.ignore_children)
-			dev->parent->power.wakeup_path = true;
+		if (parent) {
+			spin_lock_irq(&parent->power.lock);
+
+			if (dev->power.wakeup_path
+			    && !parent->power.ignore_children)
+				parent->power.wakeup_path = true;
+
+			if (!pm_leave_runtime_suspended(dev))
+				__set_leave_runtime_suspended(parent, false);
+
+			spin_unlock_irq(&parent->power.lock);
+		}
 	}
 
 	device_unlock(dev);
@@ -1553,6 +1569,7 @@ int dpm_prepare(pm_message_t state)
 		struct device *dev = to_device(dpm_list.next);
 
 		get_device(dev);
+		pm_set_leave_runtime_suspended(dev, false);
 		mutex_unlock(&dpm_list_mtx);
 
 		error = device_prepare(dev, state);


* [Resend][PATCH 2/3] PM / runtime: Routine for checking device status during system suspend
  2014-05-07 23:27                                                           ` [RFC][PATCH 0/3] (was: Re: [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices) Rafael J. Wysocki
  2014-05-07 23:29                                                             ` [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices Rafael J. Wysocki
@ 2014-05-07 23:31                                                             ` Rafael J. Wysocki
  2014-05-07 23:33                                                             ` [RFC][PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain " Rafael J. Wysocki
  2 siblings, 0 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-07 23:31 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List,
	LKML, Ulf Hansson

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Add a new helper routine, pm_runtime_enabled_and_suspended(), to
allow subsystems (or PM domains) to check the runtime PM status of
devices during system suspend (possibly to avoid resuming those
devices upfront at that time).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
---
 drivers/base/power/runtime.c |   27 +++++++++++++++++++++++++++
 include/linux/pm_runtime.h   |    2 ++
 2 files changed, 29 insertions(+)

Index: linux-pm/include/linux/pm_runtime.h
===================================================================
--- linux-pm.orig/include/linux/pm_runtime.h
+++ linux-pm/include/linux/pm_runtime.h
@@ -57,6 +57,7 @@ extern unsigned long pm_runtime_autosusp
 extern void pm_runtime_update_max_time_suspended(struct device *dev,
 						 s64 delta_ns);
 extern void pm_runtime_set_memalloc_noio(struct device *dev, bool enable);
+extern bool pm_runtime_enabled_and_suspended(struct device *dev);
 
 static inline bool pm_children_suspended(struct device *dev)
 {
@@ -165,6 +166,7 @@ static inline unsigned long pm_runtime_a
 				struct device *dev) { return 0; }
 static inline void pm_runtime_set_memalloc_noio(struct device *dev,
 						bool enable){}
+static inline bool pm_runtime_enabled_and_suspended(struct device *dev) { return false; }
 
 #endif /* !CONFIG_PM_RUNTIME */
 
Index: linux-pm/drivers/base/power/runtime.c
===================================================================
--- linux-pm.orig/drivers/base/power/runtime.c
+++ linux-pm/drivers/base/power/runtime.c
@@ -1196,6 +1196,33 @@ void pm_runtime_enable(struct device *de
 EXPORT_SYMBOL_GPL(pm_runtime_enable);
 
 /**
+ * pm_runtime_enabled_and_suspended - Check runtime PM status of a device.
+ * @dev: Device to handle.
+ *
+ * This routine is to be executed during system suspend only, after
+ * device_prepare() has been executed for @dev.
+ *
+ * Return false if runtime PM is disabled for the device.  Otherwise, wait
+ * for pending transitions to complete and check the runtime PM status of the
+ * device after that.  Return true if it is RPM_SUSPENDED.
+ */
+bool pm_runtime_enabled_and_suspended(struct device *dev)
+{
+	unsigned long flags;
+	bool ret;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+	if (dev->power.disable_depth) {
+		ret = false;
+	} else {
+		__pm_runtime_barrier(dev);
+		ret = pm_runtime_status_suspended(dev);
+	}
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+	return ret;
+}
+
+/**
  * pm_runtime_forbid - Block runtime PM of a device.
  * @dev: Device to handle.
  *


* [RFC][PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain during system suspend
  2014-05-07 23:27                                                           ` [RFC][PATCH 0/3] (was: Re: [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices) Rafael J. Wysocki
  2014-05-07 23:29                                                             ` [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices Rafael J. Wysocki
  2014-05-07 23:31                                                             ` [Resend][PATCH 2/3] PM / runtime: Routine for checking device status during system suspend Rafael J. Wysocki
@ 2014-05-07 23:33                                                             ` Rafael J. Wysocki
  2014-05-08 14:59                                                               ` Alan Stern
  2 siblings, 1 reply; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-07 23:33 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Rework the ACPI PM domain's PM callbacks to avoid resuming devices
during system suspend (in order to modify their wakeup settings etc.)
if that isn't necessary.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/device_pm.c |   31 ++++++++++++++++++++++++++++---
 drivers/acpi/scan.c      |    4 ++++
 include/acpi/acpi_bus.h  |    3 ++-
 3 files changed, 34 insertions(+), 4 deletions(-)

Index: linux-pm/drivers/acpi/device_pm.c
===================================================================
--- linux-pm.orig/drivers/acpi/device_pm.c
+++ linux-pm/drivers/acpi/device_pm.c
@@ -907,6 +907,7 @@ int acpi_subsys_prepare(struct device *d
 	if (dev->power.ignore_children)
 		pm_runtime_resume(dev);
 
+	pm_set_leave_runtime_suspended(dev, true);
 	return pm_generic_prepare(dev);
 }
 EXPORT_SYMBOL_GPL(acpi_subsys_prepare);
@@ -914,13 +915,37 @@ EXPORT_SYMBOL_GPL(acpi_subsys_prepare);
 /**
  * acpi_subsys_suspend - Run the device driver's suspend callback.
  * @dev: Device to handle.
- *
- * Follow PCI and resume devices suspended at run time before running their
- * system suspend callbacks.
  */
 int acpi_subsys_suspend(struct device *dev)
 {
+	struct acpi_device *adev = ACPI_COMPANION(dev);
+	u32 sys_target;
+
+	if (!adev || !pm_runtime_enabled_and_suspended(dev)) {
+		pm_set_leave_runtime_suspended(dev, false);
+		goto out;
+	}
+	if (!pm_leave_runtime_suspended(dev)
+	    || device_may_wakeup(dev) != !!adev->wakeup.prepare_count)
+		goto resume;
+
+	sys_target = acpi_target_system_state();
+	if (sys_target != ACPI_STATE_S0) {
+		int ret, state;
+
+		if (adev->power.flags.dsw_present)
+			goto resume;
+
+		ret = acpi_dev_pm_get_state(dev, adev, sys_target, NULL, &state);
+		if (ret || state != adev->power.state)
+			goto resume;
+	}
+	return 0;
+
+ resume:
 	pm_runtime_resume(dev);
+
+ out:
 	return pm_generic_suspend(dev);
 }
 
Index: linux-pm/include/acpi/acpi_bus.h
===================================================================
--- linux-pm.orig/include/acpi/acpi_bus.h
+++ linux-pm/include/acpi/acpi_bus.h
@@ -261,7 +261,8 @@ struct acpi_device_power_flags {
 	u32 inrush_current:1;	/* Serialize Dx->D0 */
 	u32 power_removed:1;	/* Optimize Dx->D0 */
 	u32 ignore_parent:1;	/* Power is independent of parent power state */
-	u32 reserved:27;
+	u32 dsw_present:1;	/* _DSW present? */
+	u32 reserved:26;
 };
 
 struct acpi_device_power_state {
Index: linux-pm/drivers/acpi/scan.c
===================================================================
--- linux-pm.orig/drivers/acpi/scan.c
+++ linux-pm/drivers/acpi/scan.c
@@ -1551,9 +1551,13 @@ static void acpi_bus_get_power_flags(str
 	 */
 	if (acpi_has_method(device->handle, "_PSC"))
 		device->power.flags.explicit_get = 1;
+
 	if (acpi_has_method(device->handle, "_IRC"))
 		device->power.flags.inrush_current = 1;
 
+	if (acpi_has_method(device->handle, "_DSW"))
+		device->power.flags.dsw_present = 1;
+
 	/*
 	 * Enumerate supported power management states
 	 */
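
The decision made by the reworked acpi_subsys_suspend() above can be read as
a pure function of a few inputs.  The sketch below is a user-space model of
that reading, not the ACPI code itself: struct model_acpi_state, its fields
and model_acpi_suspend_action() are hypothetical names, and each field simply
stands for the result of the corresponding check in the patch.

```c
#include <stdbool.h>

enum model_action {
	MODEL_SUSPEND_ONLY,	/* run the generic suspend callback as-is */
	MODEL_RESUME_FIRST,	/* runtime-resume, then generic suspend */
	MODEL_STAY_SUSPENDED,	/* return 0 and leave the device alone */
};

/* Hypothetical snapshot of the inputs the callback looks at. */
struct model_acpi_state {
	bool has_companion;		/* ACPI companion present */
	bool enabled_and_suspended;	/* runtime PM enabled and suspended */
	bool leave_flag;		/* leave_runtime_suspended still set */
	bool may_wakeup;		/* device_may_wakeup() result */
	bool wakeup_prepared;		/* wakeup.prepare_count is nonzero */
	bool target_is_s0;		/* target system state is S0 */
	bool dsw_present;		/* _DSW method present */
	bool state_lookup_failed;	/* target power state lookup failed */
	int target_dev_state;		/* device state wanted for the target */
	int current_dev_state;		/* device state right now */
};

enum model_action model_acpi_suspend_action(const struct model_acpi_state *s)
{
	if (!s->has_companion || !s->enabled_and_suspended)
		return MODEL_SUSPEND_ONLY;

	/* Flag cleared by a descendant, or wakeup settings differ. */
	if (!s->leave_flag || s->may_wakeup != s->wakeup_prepared)
		return MODEL_RESUME_FIRST;

	if (!s->target_is_s0) {
		if (s->dsw_present)
			return MODEL_RESUME_FIRST;

		if (s->state_lookup_failed ||
		    s->target_dev_state != s->current_dev_state)
			return MODEL_RESUME_FIRST;
	}

	return MODEL_STAY_SUSPENDED;
}
```

Only the last outcome avoids touching the device; every mismatch between the
runtime-suspend configuration and what system sleep needs falls back to
resuming the device first.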



* Re: [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-07 23:29                                                             ` [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices Rafael J. Wysocki
@ 2014-05-08  7:49                                                               ` Ulf Hansson
  2014-05-08 10:53                                                                 ` Rafael J. Wysocki
  2014-05-08 14:57                                                               ` Alan Stern
  2014-05-09 22:48                                                               ` Kevin Hilman
  2 siblings, 1 reply; 78+ messages in thread
From: Ulf Hansson @ 2014-05-08  7:49 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Alan Stern, Linux PM list, Mika Westerberg, Aaron Lu,
	ACPI Devel Maling List, LKML

On 8 May 2014 01:29, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> resume all runtime-suspended devices during system suspend, mostly
> because those devices may need to be reprogrammed due to different
> wakeup settings for system sleep and for runtime PM.
>
> For some devices, though, it's OK to remain in runtime suspend
> throughout a complete system suspend/resume cycle (if the device was in
> runtime suspend at the start of the cycle).  We would like to do this
> whenever possible, to avoid the overhead of extra power-up and power-down
> events.
>
> However, problems may arise because the device's descendants may require
> it to be at full power at various points during the cycle.  Therefore the
> most straightforward way to do this safely is if the device and all its
> descendants can remain runtime suspended until the resume stage of system
> resume.
>
> To this end, introduce dev->power.leave_runtime_suspended.
> If a subsystem or driver sets this flag during the ->prepare() callback,
> and if the flag is set in all of the device's descendants, and if the
> device is still in runtime suspend at the beginning of the ->suspend()
> callback, that callback is allowed to return 0 without clearing
> power.leave_runtime_suspended and without changing the state of the
> device, unless the current state of the device is not appropriate for
> the upcoming system sleep state (for example, the device is supposed to
> wake up the system from that state and its current wakeup settings are
> not suitable for that).  Then, the PM core will not invoke the device's
> ->suspend_late(), ->suspend_noirq(), ->resume_noirq(), ->resume_early(), or
> ->resume() callbacks.  Instead, it will invoke ->runtime_resume() during
> the device resume stage of system resume.
>
> By leaving this flag set after ->suspend(), a driver or subsystem tells
> the PM core that the device is runtime suspended, it is in a suitable
> state for system suspend (for example, the wakeup setting does not
> need to be changed), and it does not need to return to full
> power until the resume stage.
>
> Changelog based on Alan Stern's description of the idea
> (http://marc.info/?l=linux-pm&m=139940466625569&w=2).
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/base/power/main.c    |   31 ++++++++++++++++++++++++-------
>  drivers/base/power/runtime.c |   10 ++++++++++
>  include/linux/pm.h           |    3 +++
>  include/linux/pm_runtime.h   |   16 ++++++++++++++++
>  kernel/power/Kconfig         |    4 ++++
>  5 files changed, 57 insertions(+), 7 deletions(-)
>
> Index: linux-pm/kernel/power/Kconfig
> ===================================================================
> --- linux-pm.orig/kernel/power/Kconfig
> +++ linux-pm/kernel/power/Kconfig
> @@ -147,6 +147,10 @@ config PM
>         def_bool y
>         depends on PM_SLEEP || PM_RUNTIME
>
> +config PM_BOTH
> +       def_bool y
> +       depends on PM_SLEEP && PM_RUNTIME
> +

Should we not depend on PM_RUNTIME only? Thus we don't need the new
Kconfig, and then we could rename the new APIs to pm_runtime_*
instead.

>  config PM_DEBUG
>         bool "Power Management Debug Support"
>         depends on PM
> Index: linux-pm/include/linux/pm.h
> ===================================================================
> --- linux-pm.orig/include/linux/pm.h
> +++ linux-pm/include/linux/pm.h
> @@ -583,6 +583,9 @@ struct dev_pm_info {
>         unsigned long           suspended_jiffies;
>         unsigned long           accounting_timestamp;
>  #endif
> +#ifdef CONFIG_PM_BOTH
> +       bool                    leave_runtime_suspended:1;
> +#endif
>         struct pm_subsys_data   *subsys_data;  /* Owned by the subsystem. */
>         void (*set_latency_tolerance)(struct device *, s32);
>         struct dev_pm_qos       *qos;
> Index: linux-pm/include/linux/pm_runtime.h
> ===================================================================
> --- linux-pm.orig/include/linux/pm_runtime.h
> +++ linux-pm/include/linux/pm_runtime.h
> @@ -264,4 +264,20 @@ static inline void pm_runtime_dont_use_a
>         __pm_runtime_use_autosuspend(dev, false);
>  }
>
> +#ifdef CONFIG_PM_BOTH
> +static inline void __set_leave_runtime_suspended(struct device *dev, bool val)
> +{
> +       dev->power.leave_runtime_suspended = val;
> +}
> +extern void pm_set_leave_runtime_suspended(struct device *dev, bool val);
> +static inline bool pm_leave_runtime_suspended(struct device *dev)
> +{
> +       return dev->power.leave_runtime_suspended;
> +}
> +#else
> +static inline void __set_leave_runtime_suspended(struct device *dev, bool val) {}
> +static inline void pm_set_leave_runtime_suspended(struct device *dev, bool val) {}
> +static inline bool pm_leave_runtime_suspended(struct device *dev) { return false; }
> +#endif
> +
>  #endif
> Index: linux-pm/drivers/base/power/runtime.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/runtime.c
> +++ linux-pm/drivers/base/power/runtime.c
> @@ -732,6 +732,7 @@ static int rpm_resume(struct device *dev
>         }
>   skip_parent:
>
> +       __set_leave_runtime_suspended(dev, false);
>         if (dev->power.no_callbacks)
>                 goto no_callback;       /* Assume success. */
>
> @@ -1485,3 +1486,12 @@ out:
>         return ret;
>  }
>  EXPORT_SYMBOL_GPL(pm_runtime_force_resume);
> +
> +#ifdef CONFIG_PM_BOTH
> +void pm_set_leave_runtime_suspended(struct device *dev, bool val)
> +{
> +       spin_lock_irq(&dev->power.lock);
> +       __set_leave_runtime_suspended(dev, val);
> +       spin_unlock_irq(&dev->power.lock);
> +}
> +#endif
> Index: linux-pm/drivers/base/power/main.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/main.c
> +++ linux-pm/drivers/base/power/main.c
> @@ -479,7 +479,7 @@ static int device_resume_noirq(struct de
>         TRACE_DEVICE(dev);
>         TRACE_RESUME(0);
>
> -       if (dev->power.syscore)
> +       if (dev->power.syscore || pm_leave_runtime_suspended(dev))
>                 goto Out;
>
>         if (!dev->power.is_noirq_suspended)
> @@ -605,7 +605,7 @@ static int device_resume_early(struct de
>         TRACE_DEVICE(dev);
>         TRACE_RESUME(0);
>
> -       if (dev->power.syscore)
> +       if (dev->power.syscore || pm_leave_runtime_suspended(dev))
>                 goto Out;
>
>         if (!dev->power.is_late_suspended)
> @@ -735,6 +735,11 @@ static int device_resume(struct device *
>         if (dev->power.syscore)
>                 goto Complete;
>
> +       if (pm_leave_runtime_suspended(dev)) {
> +               pm_runtime_resume(dev);
> +               goto Complete;
> +       }
> +
>         dpm_wait(dev->parent, async);
>         dpm_watchdog_set(&wd, dev);
>         device_lock(dev);
> @@ -1007,7 +1012,7 @@ static int __device_suspend_noirq(struct
>                 goto Complete;
>         }
>
> -       if (dev->power.syscore)
> +       if (dev->power.syscore || pm_leave_runtime_suspended(dev))
>                 goto Complete;
>
>         dpm_wait_for_children(dev, async);
> @@ -1146,7 +1151,7 @@ static int __device_suspend_late(struct
>                 goto Complete;
>         }
>
> -       if (dev->power.syscore)
> +       if (dev->power.syscore || pm_leave_runtime_suspended(dev))
>                 goto Complete;
>
>         dpm_wait_for_children(dev, async);
> @@ -1382,10 +1387,21 @@ static int __device_suspend(struct devic
>
>   End:
>         if (!error) {
> +               struct device *parent = dev->parent;
> +
>                 dev->power.is_suspended = true;
> -               if (dev->power.wakeup_path
> -                   && dev->parent && !dev->parent->power.ignore_children)
> -                       dev->parent->power.wakeup_path = true;
> +               if (parent) {
> +                       spin_lock_irq(&parent->power.lock);
> +
> +                       if (dev->power.wakeup_path
> +                           && !parent->power.ignore_children)
> +                               parent->power.wakeup_path = true;
> +
> +                       if (!pm_leave_runtime_suspended(dev))

I suppose this is the reason why you think you need CONFIG_PM_BOTH?

But wouldn't this work nicely even if we just had CONFIG_PM_RUNTIME?

> +                               __set_leave_runtime_suspended(parent, false);
> +
> +                       spin_unlock_irq(&parent->power.lock);
> +               }
>         }
>
>         device_unlock(dev);
> @@ -1553,6 +1569,7 @@ int dpm_prepare(pm_message_t state)
>                 struct device *dev = to_device(dpm_list.next);
>
>                 get_device(dev);
> +               pm_set_leave_runtime_suspended(dev, false);

Is this needed?

>                 mutex_unlock(&dpm_list_mtx);
>
>                 error = device_prepare(dev, state);
>

Kind regards
Ulf Hansson

* Re: [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-08  7:49                                                               ` Ulf Hansson
@ 2014-05-08 10:53                                                                 ` Rafael J. Wysocki
  2014-05-08 10:59                                                                   ` Ulf Hansson
  0 siblings, 1 reply; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-08 10:53 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Alan Stern, Linux PM list, Mika Westerberg, Aaron Lu,
	ACPI Devel Maling List, LKML

On Thursday, May 08, 2014 09:49:36 AM Ulf Hansson wrote:
> On 8 May 2014 01:29, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> > resume all runtime-suspended devices during system suspend, mostly
> > because those devices may need to be reprogrammed due to different
> > wakeup settings for system sleep and for runtime PM.
> >
> > For some devices, though, it's OK to remain in runtime suspend
> > throughout a complete system suspend/resume cycle (if the device was in
> > runtime suspend at the start of the cycle).  We would like to do this
> > whenever possible, to avoid the overhead of extra power-up and power-down
> > events.
> >
> > However, problems may arise because the device's descendants may require
> > it to be at full power at various points during the cycle.  Therefore the
> > most straightforward way to do this safely is if the device and all its
> > descendants can remain runtime suspended until the resume stage of system
> > resume.
> >
> > To this end, introduce dev->power.leave_runtime_suspended.
> > If a subsystem or driver sets this flag during the ->prepare() callback,
> > and if the flag is set in all of the device's descendants, and if the
> > device is still in runtime suspend at the beginning of the ->suspend()
> > callback, that callback is allowed to return 0 without clearing
> > power.leave_runtime_suspended and without changing the state of the
> > device, unless the current state of the device is not appropriate for
> > the upcoming system sleep state (for example, the device is supposed to
> > wake up the system from that state and its current wakeup settings are
> > not suitable for that).  Then, the PM core will not invoke the device's
> > ->suspend_late(), ->suspend_noirq(), ->resume_noirq(), ->resume_early(), or
> > ->resume() callbacks.  Instead, it will invoke ->runtime_resume() during
> > the device resume stage of system resume.
> >
> > By leaving this flag set after ->suspend(), a driver or subsystem tells
> > the PM core that the device is runtime suspended, it is in a suitable
> > state for system suspend (for example, the wakeup setting does not
> > need to be changed), and it does not need to return to full
> > power until the resume stage.
> >
> > Changelog based on Alan Stern's description of the idea
> > (http://marc.info/?l=linux-pm&m=139940466625569&w=2).
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  drivers/base/power/main.c    |   31 ++++++++++++++++++++++++-------
> >  drivers/base/power/runtime.c |   10 ++++++++++
> >  include/linux/pm.h           |    3 +++
> >  include/linux/pm_runtime.h   |   16 ++++++++++++++++
> >  kernel/power/Kconfig         |    4 ++++
> >  5 files changed, 57 insertions(+), 7 deletions(-)
> >
> > Index: linux-pm/kernel/power/Kconfig
> > ===================================================================
> > --- linux-pm.orig/kernel/power/Kconfig
> > +++ linux-pm/kernel/power/Kconfig
> > @@ -147,6 +147,10 @@ config PM
> >         def_bool y
> >         depends on PM_SLEEP || PM_RUNTIME
> >
> > +config PM_BOTH
> > +       def_bool y
> > +       depends on PM_SLEEP && PM_RUNTIME
> > +
> 
> Should we not depend on PM_RUNTIME only? Thus we don't need the new
> Kconfig,

Well, OK.  I guess we can tolerate one useless statement in rpm_resume()
in case CONFIG_PM_SLEEP is unset.
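
For illustration only (this is not part of the posted patch): a minimal
userspace sketch of what the pm_runtime.h helpers might look like if they
were guarded by CONFIG_PM_RUNTIME alone instead of a new CONFIG_PM_BOTH.
The struct definitions are reduced stand-ins, and the #define merely
simulates a kernel config with runtime PM enabled.

```c
#include <stdbool.h>

/* Assumption: simulate a kernel built with CONFIG_PM_RUNTIME=y. */
#define CONFIG_PM_RUNTIME 1

/* Heavily reduced stand-ins for the kernel structures (illustrative only). */
struct dev_pm_info {
	bool leave_runtime_suspended:1;
};

struct device {
	struct dev_pm_info power;
};

#ifdef CONFIG_PM_RUNTIME
/* Lockless setter, as used from rpm_resume() in the patch. */
static inline void __set_leave_runtime_suspended(struct device *dev, bool val)
{
	dev->power.leave_runtime_suspended = val;
}

static inline bool pm_leave_runtime_suspended(struct device *dev)
{
	return dev->power.leave_runtime_suspended;
}
#else
/* Without runtime PM the helpers collapse to no-ops. */
static inline void __set_leave_runtime_suspended(struct device *dev, bool val) {}
static inline bool pm_leave_runtime_suspended(struct device *dev) { return false; }
#endif
```

With such a guard, the statement added to rpm_resume() would still be
compiled when CONFIG_PM_SLEEP is unset; it would just clear a flag that
the system sleep code, being compiled out, never reads.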

> and then we could rename the new APIs to pm_runtime_* instead.

That would just make the name longer - for what value?

> >  config PM_DEBUG
> >         bool "Power Management Debug Support"
> >         depends on PM
> > Index: linux-pm/include/linux/pm.h
> > ===================================================================
> > --- linux-pm.orig/include/linux/pm.h
> > +++ linux-pm/include/linux/pm.h
> > @@ -583,6 +583,9 @@ struct dev_pm_info {
> >         unsigned long           suspended_jiffies;
> >         unsigned long           accounting_timestamp;
> >  #endif
> > +#ifdef CONFIG_PM_BOTH
> > +       bool                    leave_runtime_suspended:1;
> > +#endif
> >         struct pm_subsys_data   *subsys_data;  /* Owned by the subsystem. */
> >         void (*set_latency_tolerance)(struct device *, s32);
> >         struct dev_pm_qos       *qos;
> > Index: linux-pm/include/linux/pm_runtime.h
> > ===================================================================
> > --- linux-pm.orig/include/linux/pm_runtime.h
> > +++ linux-pm/include/linux/pm_runtime.h
> > @@ -264,4 +264,20 @@ static inline void pm_runtime_dont_use_a
> >         __pm_runtime_use_autosuspend(dev, false);
> >  }
> >
> > +#ifdef CONFIG_PM_BOTH
> > +static inline void __set_leave_runtime_suspended(struct device *dev, bool val)
> > +{
> > +       dev->power.leave_runtime_suspended = val;
> > +}
> > +extern void pm_set_leave_runtime_suspended(struct device *dev, bool val);
> > +static inline bool pm_leave_runtime_suspended(struct device *dev)
> > +{
> > +       return dev->power.leave_runtime_suspended;
> > +}
> > +#else
> > +static inline void __set_leave_runtime_suspended(struct device *dev, bool val) {}
> > +static inline void pm_set_leave_runtime_suspended(struct device *dev, bool val) {}
> > +static inline bool pm_leave_runtime_suspended(struct device *dev) { return false; }
> > +#endif
> > +
> >  #endif
> > Index: linux-pm/drivers/base/power/runtime.c
> > ===================================================================
> > --- linux-pm.orig/drivers/base/power/runtime.c
> > +++ linux-pm/drivers/base/power/runtime.c
> > @@ -732,6 +732,7 @@ static int rpm_resume(struct device *dev
> >         }
> >   skip_parent:
> >
> > +       __set_leave_runtime_suspended(dev, false);

(*)

> >         if (dev->power.no_callbacks)
> >                 goto no_callback;       /* Assume success. */
> >
> > @@ -1485,3 +1486,12 @@ out:
> >         return ret;
> >  }
> >  EXPORT_SYMBOL_GPL(pm_runtime_force_resume);
> > +
> > +#ifdef CONFIG_PM_BOTH
> > +void pm_set_leave_runtime_suspended(struct device *dev, bool val)
> > +{
> > +       spin_lock_irq(&dev->power.lock);
> > +       __set_leave_runtime_suspended(dev, val);
> > +       spin_unlock_irq(&dev->power.lock);
> > +}
> > +#endif
> > Index: linux-pm/drivers/base/power/main.c
> > ===================================================================
> > --- linux-pm.orig/drivers/base/power/main.c
> > +++ linux-pm/drivers/base/power/main.c
> > @@ -479,7 +479,7 @@ static int device_resume_noirq(struct de
> >         TRACE_DEVICE(dev);
> >         TRACE_RESUME(0);
> >
> > -       if (dev->power.syscore)
> > +       if (dev->power.syscore || pm_leave_runtime_suspended(dev))
> >                 goto Out;
> >
> >         if (!dev->power.is_noirq_suspended)
> > @@ -605,7 +605,7 @@ static int device_resume_early(struct de
> >         TRACE_DEVICE(dev);
> >         TRACE_RESUME(0);
> >
> > -       if (dev->power.syscore)
> > +       if (dev->power.syscore || pm_leave_runtime_suspended(dev))
> >                 goto Out;
> >
> >         if (!dev->power.is_late_suspended)
> > @@ -735,6 +735,11 @@ static int device_resume(struct device *
> >         if (dev->power.syscore)
> >                 goto Complete;
> >
> > +       if (pm_leave_runtime_suspended(dev)) {
> > +               pm_runtime_resume(dev);
> > +               goto Complete;
> > +       }
> > +
> >         dpm_wait(dev->parent, async);
> >         dpm_watchdog_set(&wd, dev);
> >         device_lock(dev);
> > @@ -1007,7 +1012,7 @@ static int __device_suspend_noirq(struct
> >                 goto Complete;
> >         }
> >
> > -       if (dev->power.syscore)
> > +       if (dev->power.syscore || pm_leave_runtime_suspended(dev))
> >                 goto Complete;
> >
> >         dpm_wait_for_children(dev, async);
> > @@ -1146,7 +1151,7 @@ static int __device_suspend_late(struct
> >                 goto Complete;
> >         }
> >
> > -       if (dev->power.syscore)
> > +       if (dev->power.syscore || pm_leave_runtime_suspended(dev))
> >                 goto Complete;
> >
> >         dpm_wait_for_children(dev, async);
> > @@ -1382,10 +1387,21 @@ static int __device_suspend(struct devic
> >
> >   End:
> >         if (!error) {
> > +               struct device *parent = dev->parent;
> > +
> >                 dev->power.is_suspended = true;
> > -               if (dev->power.wakeup_path
> > -                   && dev->parent && !dev->parent->power.ignore_children)
> > -                       dev->parent->power.wakeup_path = true;
> > +               if (parent) {
> > +                       spin_lock_irq(&parent->power.lock);
> > +
> > +                       if (dev->power.wakeup_path
> > +                           && !parent->power.ignore_children)
> > +                               parent->power.wakeup_path = true;
> > +
> > +                       if (!pm_leave_runtime_suspended(dev))
> 
> I suppose this is the reason why you think you need CONFIG_PM_BOTH?

Actually, no.  The reason is the (*) change in rpm_resume().

> But wouldn't this work nicely even if we just had CONFIG_PM_RUNTIME?

Yes, it will.

> > +                               __set_leave_runtime_suspended(parent, false);
> > +
> > +                       spin_unlock_irq(&parent->power.lock);
> > +               }
> >         }
> >
> >         device_unlock(dev);
> > @@ -1553,6 +1569,7 @@ int dpm_prepare(pm_message_t state)
> >                 struct device *dev = to_device(dpm_list.next);
> >
> >                 get_device(dev);
> > +               pm_set_leave_runtime_suspended(dev, false);
> 
> Is this needed?

Yes, it is.  We don't want any leftovers after this point.
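
To make the "leftovers" concern concrete, here is a small userspace model,
not kernel code.  It assumes a flag value that has somehow survived from
an earlier cycle and a device whose ->prepare() does not touch the flag.
The names model_prepare() and model_skips_late_callbacks() are
hypothetical; only the reset itself mirrors the line added to
dpm_prepare().

```c
#include <stdbool.h>
#include <stddef.h>

/* Reduced stand-in for struct device (illustrative only). */
struct device {
	bool leave_runtime_suspended;
	/* Stand-in for ->prepare(); it may or may not set the flag. */
	void (*prepare)(struct device *dev);
};

/* Models dpm_prepare() for one device; reset_first mirrors the added line. */
static void model_prepare(struct device *dev, bool reset_first)
{
	if (reset_first)
		dev->leave_runtime_suspended = false;
	if (dev->prepare)
		dev->prepare(dev);
}

/* Models the check that makes the core skip the late/noirq callbacks. */
static bool model_skips_late_callbacks(const struct device *dev)
{
	return dev->leave_runtime_suspended;
}
```

Without the reset, a stale "true" would make the core skip callbacks for
a device that never asked for that in the current transition.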

> >                 mutex_unlock(&dpm_list_mtx);
> >
> >                 error = device_prepare(dev, state);
> >
> > --

Thanks!


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

* Re: [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-08 10:53                                                                 ` Rafael J. Wysocki
@ 2014-05-08 10:59                                                                   ` Ulf Hansson
  2014-05-08 11:44                                                                     ` Rafael J. Wysocki
  2014-05-08 14:36                                                                     ` Alan Stern
  0 siblings, 2 replies; 78+ messages in thread
From: Ulf Hansson @ 2014-05-08 10:59 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Alan Stern, Linux PM list, Mika Westerberg, Aaron Lu,
	ACPI Devel Maling List, LKML

On 8 May 2014 12:53, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Thursday, May 08, 2014 09:49:36 AM Ulf Hansson wrote:
>> On 8 May 2014 01:29, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> >
>> > Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
>> > resume all runtime-suspended devices during system suspend, mostly
>> > because those devices may need to be reprogrammed due to different
>> > wakeup settings for system sleep and for runtime PM.
>> >
>> > For some devices, though, it's OK to remain in runtime suspend
>> > throughout a complete system suspend/resume cycle (if the device was in
>> > runtime suspend at the start of the cycle).  We would like to do this
>> > whenever possible, to avoid the overhead of extra power-up and power-down
>> > events.
>> >
>> > However, problems may arise because the device's descendants may require
>> > it to be at full power at various points during the cycle.  Therefore the
>> > most straightforward way to do this safely is if the device and all its
>> > descendants can remain runtime suspended until the resume stage of system
>> > resume.
>> >
>> > To this end, introduce dev->power.leave_runtime_suspended.
>> > If a subsystem or driver sets this flag during the ->prepare() callback,
>> > and if the flag is set in all of the device's descendants, and if the
>> > device is still in runtime suspend at the beginning of the ->suspend()
>> > callback, that callback is allowed to return 0 without clearing
>> > power.leave_runtime_suspended and without changing the state of the
>> > device, unless the current state of the device is not appropriate for
>> > the upcoming system sleep state (for example, the device is supposed to
>> > wake up the system from that state and its current wakeup settings are
>> > not suitable for that).  Then, the PM core will not invoke the device's
>> > ->suspend_late(), ->suspend_noirq(), ->resume_noirq(), ->resume_early(), or
>> > ->resume() callbacks.  Instead, it will invoke ->runtime_resume() during
>> > the device resume stage of system resume.
>> >
>> > By leaving this flag set after ->suspend(), a driver or subsystem tells
>> > the PM core that the device is runtime suspended, it is in a suitable
>> > state for system suspend (for example, the wakeup setting does not
>> > need to be changed), and it does not need to return to full
>> > power until the resume stage.
>> >
>> > Changelog based on Alan Stern's description of the idea
>> > (http://marc.info/?l=linux-pm&m=139940466625569&w=2).
>> >
>> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> > ---
>> >  drivers/base/power/main.c    |   31 ++++++++++++++++++++++++-------
>> >  drivers/base/power/runtime.c |   10 ++++++++++
>> >  include/linux/pm.h           |    3 +++
>> >  include/linux/pm_runtime.h   |   16 ++++++++++++++++
>> >  kernel/power/Kconfig         |    4 ++++
>> >  5 files changed, 57 insertions(+), 7 deletions(-)
>> >
>> > Index: linux-pm/kernel/power/Kconfig
>> > ===================================================================
>> > --- linux-pm.orig/kernel/power/Kconfig
>> > +++ linux-pm/kernel/power/Kconfig
>> > @@ -147,6 +147,10 @@ config PM
>> >         def_bool y
>> >         depends on PM_SLEEP || PM_RUNTIME
>> >
>> > +config PM_BOTH
>> > +       def_bool y
>> > +       depends on PM_SLEEP && PM_RUNTIME
>> > +
>>
>> Should we not depend on PM_RUNTIME only? Thus we don't need the new
>> Kconfig,
>
> Well, OK.  I guess we can tolerate one useless statement in rpm_resume()
> in case CONFIG_PM_SLEEP is unset.
>
>> and then we could rename the new APIs to pm_runtime_* instead.
>
> That would just make the name longer - for what value?

Only "__set_leave_runtime_suspended" will be a bit longer.

The idea I had was to clearly indicate that these functions are part of the
PM_RUNTIME API.

Compare what you have:
__set_leave_runtime_suspended
pm_set_leave_runtime_suspended
pm_leave_runtime_suspended

To what I suggest:
__pm_runtime_set_leave_suspended
pm_runtime_set_leave_suspended
pm_runtime_leave_suspended

Kind regards
Ulf Hansson

* Re: [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-08 10:59                                                                   ` Ulf Hansson
@ 2014-05-08 11:44                                                                     ` Rafael J. Wysocki
  2014-05-08 12:25                                                                       ` Ulf Hansson
  2014-05-08 14:36                                                                     ` Alan Stern
  1 sibling, 1 reply; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-08 11:44 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Alan Stern, Linux PM list, Mika Westerberg, Aaron Lu,
	ACPI Devel Maling List, LKML

On Thursday, May 08, 2014 12:59:20 PM Ulf Hansson wrote:
> On 8 May 2014 12:53, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Thursday, May 08, 2014 09:49:36 AM Ulf Hansson wrote:
> >> On 8 May 2014 01:29, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >> >
> >> > Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> >> > resume all runtime-suspended devices during system suspend, mostly
> >> > because those devices may need to be reprogrammed due to different
> >> > wakeup settings for system sleep and for runtime PM.
> >> >
> >> > For some devices, though, it's OK to remain in runtime suspend
> >> > throughout a complete system suspend/resume cycle (if the device was in
> >> > runtime suspend at the start of the cycle).  We would like to do this
> >> > whenever possible, to avoid the overhead of extra power-up and power-down
> >> > events.
> >> >
> >> > However, problems may arise because the device's descendants may require
> >> > it to be at full power at various points during the cycle.  Therefore the
> >> > most straightforward way to do this safely is if the device and all its
> >> > descendants can remain runtime suspended until the resume stage of system
> >> > resume.
> >> >
> >> > To this end, introduce dev->power.leave_runtime_suspended.
> >> > If a subsystem or driver sets this flag during the ->prepare() callback,
> >> > and if the flag is set in all of the device's descendants, and if the
> >> > device is still in runtime suspend at the beginning of the ->suspend()
> >> > callback, that callback is allowed to return 0 without clearing
> >> > power.leave_runtime_suspended and without changing the state of the
> >> > device, unless the current state of the device is not appropriate for
> >> > the upcoming system sleep state (for example, the device is supposed to
> >> > wake up the system from that state and its current wakeup settings are
> >> > not suitable for that).  Then, the PM core will not invoke the device's
> >> > ->suspend_late(), ->suspend_noirq(), ->resume_noirq(), ->resume_early(), or
> >> > ->resume() callbacks.  Instead, it will invoke ->runtime_resume() during
> >> > the device resume stage of system resume.
> >> >
> >> > By leaving this flag set after ->suspend(), a driver or subsystem tells
> >> > the PM core that the device is runtime suspended, it is in a suitable
> >> > state for system suspend (for example, the wakeup setting does not
> >> > need to be changed), and it does not need to return to full
> >> > power until the resume stage.
> >> >
> >> > Changelog based on Alan Stern's description of the idea
> >> > (http://marc.info/?l=linux-pm&m=139940466625569&w=2).
> >> >
> >> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >> > ---
> >> >  drivers/base/power/main.c    |   31 ++++++++++++++++++++++++-------
> >> >  drivers/base/power/runtime.c |   10 ++++++++++
> >> >  include/linux/pm.h           |    3 +++
> >> >  include/linux/pm_runtime.h   |   16 ++++++++++++++++
> >> >  kernel/power/Kconfig         |    4 ++++
> >> >  5 files changed, 57 insertions(+), 7 deletions(-)
> >> >
> >> > Index: linux-pm/kernel/power/Kconfig
> >> > ===================================================================
> >> > --- linux-pm.orig/kernel/power/Kconfig
> >> > +++ linux-pm/kernel/power/Kconfig
> >> > @@ -147,6 +147,10 @@ config PM
> >> >         def_bool y
> >> >         depends on PM_SLEEP || PM_RUNTIME
> >> >
> >> > +config PM_BOTH
> >> > +       def_bool y
> >> > +       depends on PM_SLEEP && PM_RUNTIME
> >> > +
> >>
> >> Should we not depend on PM_RUNTIME only? Thus we don't need the new
> >> Kconfig,
> >
> > Well, OK.  I guess we can tolerate one useless statement in rpm_resume()
> > in case CONFIG_PM_SLEEP is unset.
> >
> >> and then we could rename the new APIs to pm_runtime_* instead.
> >
> > That would just make the name longer - for what value?
> 
> Only "__set_leave_runtime_suspended" will be a bit longer.
> 
> The idea I had was to clearly indicate that these functions are part of the
> PM_RUNTIME API.
> 
> Compare what you have:
> __set_leave_runtime_suspended
> pm_set_leave_runtime_suspended
> pm_leave_runtime_suspended
> 
> To what I suggest:
> __pm_runtime_set_leave_suspended
> pm_runtime_set_leave_suspended
> pm_runtime_leave_suspended

And why exactly do you think these are any better?

The flag is not called leave_suspended surely?


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

* Re: [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-08 11:44                                                                     ` Rafael J. Wysocki
@ 2014-05-08 12:25                                                                       ` Ulf Hansson
  2014-05-08 20:02                                                                         ` Rafael J. Wysocki
  0 siblings, 1 reply; 78+ messages in thread
From: Ulf Hansson @ 2014-05-08 12:25 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Alan Stern, Linux PM list, Mika Westerberg, Aaron Lu,
	ACPI Devel Maling List, LKML

On 8 May 2014 13:44, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Thursday, May 08, 2014 12:59:20 PM Ulf Hansson wrote:
>> On 8 May 2014 12:53, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>> > On Thursday, May 08, 2014 09:49:36 AM Ulf Hansson wrote:
>> >> On 8 May 2014 01:29, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>> >> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> >> >
>> >> > Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
>> >> > resume all runtime-suspended devices during system suspend, mostly
>> >> > because those devices may need to be reprogrammed due to different
>> >> > wakeup settings for system sleep and for runtime PM.
>> >> >
>> >> > For some devices, though, it's OK to remain in runtime suspend
>> >> > throughout a complete system suspend/resume cycle (if the device was in
>> >> > runtime suspend at the start of the cycle).  We would like to do this
>> >> > whenever possible, to avoid the overhead of extra power-up and power-down
>> >> > events.
>> >> >
>> >> > However, problems may arise because the device's descendants may require
>> >> > it to be at full power at various points during the cycle.  Therefore the
>> >> > most straightforward way to do this safely is if the device and all its
>> >> > descendants can remain runtime suspended until the resume stage of system
>> >> > resume.
>> >> >
>> >> > To this end, introduce dev->power.leave_runtime_suspended.
>> >> > If a subsystem or driver sets this flag during the ->prepare() callback,
>> >> > and if the flag is set in all of the device's descendants, and if the
>> >> > device is still in runtime suspend at the beginning of the ->suspend()
>> >> > callback, that callback is allowed to return 0 without clearing
>> >> > power.leave_runtime_suspended and without changing the state of the
>> >> > device, unless the current state of the device is not appropriate for
>> >> > the upcoming system sleep state (for example, the device is supposed to
>> >> > wake up the system from that state and its current wakeup settings are
>> >> > not suitable for that).  Then, the PM core will not invoke the device's
>> >> > ->suspend_late(), ->suspend_noirq(), ->resume_noirq(), ->resume_early(), or
>> >> > ->resume() callbacks.  Instead, it will invoke ->runtime_resume() during
>> >> > the device resume stage of system resume.
>> >> >
>> >> > By leaving this flag set after ->suspend(), a driver or subsystem tells
>> >> > the PM core that the device is runtime suspended, it is in a suitable
>> >> > state for system suspend (for example, the wakeup setting does not
>> >> > need to be changed), and it does not need to return to full
>> >> > power until the resume stage.
>> >> >
>> >> > Changelog based on Alan Stern's description of the idea
>> >> > (http://marc.info/?l=linux-pm&m=139940466625569&w=2).
>> >> >
>> >> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> >> > ---
>> >> >  drivers/base/power/main.c    |   31 ++++++++++++++++++++++++-------
>> >> >  drivers/base/power/runtime.c |   10 ++++++++++
>> >> >  include/linux/pm.h           |    3 +++
>> >> >  include/linux/pm_runtime.h   |   16 ++++++++++++++++
>> >> >  kernel/power/Kconfig         |    4 ++++
>> >> >  5 files changed, 57 insertions(+), 7 deletions(-)
>> >> >
>> >> > Index: linux-pm/kernel/power/Kconfig
>> >> > ===================================================================
>> >> > --- linux-pm.orig/kernel/power/Kconfig
>> >> > +++ linux-pm/kernel/power/Kconfig
>> >> > @@ -147,6 +147,10 @@ config PM
>> >> >         def_bool y
>> >> >         depends on PM_SLEEP || PM_RUNTIME
>> >> >
>> >> > +config PM_BOTH
>> >> > +       def_bool y
>> >> > +       depends on PM_SLEEP && PM_RUNTIME
>> >> > +
>> >>
>> >> Should we not depend on PM_RUNTIME only? Thus we don't need the new
>> >> Kconfig,
>> >
>> > Well, OK.  I guess we can tolerate one useless statement in rpm_resume()
>> > in case CONFIG_PM_SLEEP is unset.
>> >
>> >> and then we could rename the new APIs to pm_runtime_* instead.
>> >
>> > That would just make the name longer - for what value?
>>
>> Only "__set_leave_runtime_suspended" will be a bit longer.
>>
>> The idea I had was to clearly indicate that these functions are part of the
>> PM_RUNTIME API.
>>
>> Compare what you have:
>> __set_leave_runtime_suspended
>> pm_set_leave_runtime_suspended
>> pm_leave_runtime_suspended
>>
>> To what I suggest:
>> __pm_runtime_set_leave_suspended
>> pm_runtime_set_leave_suspended
>> pm_runtime_leave_suspended
>
> And why exactly do you think these are any better?

Because that's how all (almost all) other functions in the runtime PM
API are specified - I believe it makes sense to keep them aligned.

Anyway, if you insist on keeping your function names, it's not that
big of a deal for me.

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

>
> The flag is not called leave_suspended surely?

To me that doesn't matter, the flag has nothing to do with the
function names in an API.

>
>
> --
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.

* Re: [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-08 10:59                                                                   ` Ulf Hansson
  2014-05-08 11:44                                                                     ` Rafael J. Wysocki
@ 2014-05-08 14:36                                                                     ` Alan Stern
  1 sibling, 0 replies; 78+ messages in thread
From: Alan Stern @ 2014-05-08 14:36 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Rafael J. Wysocki, Linux PM list, Mika Westerberg, Aaron Lu,
	ACPI Devel Maling List, LKML

On Thu, 8 May 2014, Ulf Hansson wrote:

> >> Should we not depend on PM_RUNTIME only? Thus we don't need the new
> >> Kconfig,
> >
> > Well, OK.  I guess we can tolerate one useless statement in rpm_resume()
> > in case CONFIG_PM_SLEEP is unset.

It isn't a big deal.  However, Ulf, you need to understand that this 
API belongs to _both_ PM_SLEEP _and_ PM_RUNTIME.  It doesn't mean 
anything unless both are present.

But as Rafael said, the extra overhead if !CONFIG_PM_SLEEP is minimal.  
(Although I would move the new flag to be next to the other bitflags, 
so that it doesn't add an entire extra word to the dev_pm_info 
structure.)
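
To illustrate the layout point, here is a userspace sketch only.  The two
structures below are made up for the comparison and are not the real
struct dev_pm_info; the member names other than leave_runtime_suspended
are placeholders, and the exact sizes depend on the ABI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical layout: the new flag is a separate bool placed after
 * other data, so it needs storage (and padding) of its own. */
struct flags_then_bool {
	unsigned int	can_wakeup:1;
	unsigned int	is_suspended:1;
	void		*data;
	bool		leave_runtime_suspended;
};

/* Hypothetical layout: the new flag is one more bit next to the
 * existing bitflags, so it shares their storage unit. */
struct flags_packed {
	unsigned int	can_wakeup:1;
	unsigned int	is_suspended:1;
	unsigned int	leave_runtime_suspended:1;
	void		*data;
};

size_t size_with_separate_bool(void)
{
	return sizeof(struct flags_then_bool);
}

size_t size_with_packed_flag(void)
{
	return sizeof(struct flags_packed);
}

/* The packed variant should never be the larger of the two. */
void check_layout(void)
{
	assert(size_with_packed_flag() <= size_with_separate_bool());
}
```

On a typical LP64 build I would expect the first structure to be one
pointer-sized word larger than the second, but that is an expectation
about common compilers, not a guarantee.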

> >> and then we could rename the new APIs to pm_runtime_* instead.
> >
> > That would just make the name longer - for what value?
> 
> Only "__set_leave_runtime_suspended" will be a bit longer.
> 
> The idea I had was to clearly indicate that these functions are part of
> the PM_RUNTIME API.

Not so.  They are part of both PM_SLEEP and PM_RUNTIME, which means
they are really just part of PM.  You can tell by the fact that they 
are used in both main.c and runtime.c.

Alan Stern


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-07 23:29                                                             ` [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices Rafael J. Wysocki
  2014-05-08  7:49                                                               ` Ulf Hansson
@ 2014-05-08 14:57                                                               ` Alan Stern
  2014-05-08 20:17                                                                 ` Rafael J. Wysocki
  2014-05-09 22:48                                                               ` Kevin Hilman
  2 siblings, 1 reply; 78+ messages in thread
From: Alan Stern @ 2014-05-08 14:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Thu, 8 May 2014, Rafael J. Wysocki wrote:

> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> resume all runtime-suspended devices during system suspend, mostly
> because those devices may need to be reprogrammed due to different
> wakeup settings for system sleep and for runtime PM.
> 
> For some devices, though, it's OK to remain in runtime suspend 
> throughout a complete system suspend/resume cycle (if the device was in
> runtime suspend at the start of the cycle).  We would like to do this
> whenever possible, to avoid the overhead of extra power-up and power-down
> events.
> 
> However, problems may arise because the device's descendants may require
> it to be at full power at various points during the cycle.  Therefore the
> most straightforward way to do this safely is if the device and all its
> descendants can remain runtime suspended until the resume stage of system
> resume.
> 
> To this end, introduce dev->power.leave_runtime_suspended.
> If a subsystem or driver sets this flag during the ->prepare() callback,
> and if the flag is set in all of the device's descendants, and if the
> device is still in runtime suspend at the beginning of the ->suspend()
> callback, that callback is allowed to return 0 without clearing
> power.leave_runtime_suspended and without changing the state of the
> device, unless the current state of the device is not appropriate for
> the upcoming system sleep state (for example, the device is supposed to
> wake up the system from that state and its current wakeup settings are
> not suitable for that).  Then, the PM core will not invoke the device's
> ->suspend_late(), ->suspend_noirq(), ->resume_noirq(), ->resume_early(), or
> ->resume() callbacks.  Instead, it will invoke ->runtime_resume() during
> the device resume stage of system resume.

Wait a minute.  Following ->runtime_suspend(), you are going to call 
->suspend() and then ->runtime_resume()?  That doesn't seem like what 
you really want; a ->suspend() call should always have a matching 
->resume().

I guess you did it this way to allow for runtime-resumes and -suspends 
between ->prepare() and ->suspend(), but it still seems wrong.

How about asking drivers to set leave_runtime_suspended in their
->runtime_suspend() callbacks, as well as during ->prepare()?  Then
intervening runtime resume/suspend cycles wouldn't matter and you
wouldn't need to call ->suspend(); you could skip it along with the
other PM callbacks.

> By leaving this flag set after ->suspend(), a driver or subsystem tells
> the PM core that the device is runtime suspended, it is in a suitable
> state for system suspend (for example, the wakeup setting does not
> need to be changed), and it does not need to return to full
> power until the resume stage.

So: By setting this flag during ->runtime_suspend() and ->prepare(), a
driver or subsystem tells the PM core that the device is in a suitable
state for system suspend (for example, the wakeup setting would not
need to be changed), if one should occur before the next runtime
resume, and the device would not need to return to full power until the
resume stage.

> --- linux-pm.orig/include/linux/pm_runtime.h
> +++ linux-pm/include/linux/pm_runtime.h
> @@ -264,4 +264,20 @@ static inline void pm_runtime_dont_use_a
>  	__pm_runtime_use_autosuspend(dev, false);
>  }
>  
> +#ifdef CONFIG_PM_BOTH
> +static inline void __set_leave_runtime_suspended(struct device *dev, bool val)
> +{
> +	dev->power.leave_runtime_suspended = val;
> +}
> +extern void pm_set_leave_runtime_suspended(struct device *dev, bool val);
> +static inline bool pm_leave_runtime_suspended(struct device *dev)
> +{
> +	return dev->power.leave_runtime_suspended;
> +}

Is it generally your custom to use "set_" and "" rather than "set_" and 
"get_"?

>   End:
>  	if (!error) {
> +		struct device *parent = dev->parent;
> +
>  		dev->power.is_suspended = true;
> -		if (dev->power.wakeup_path
> -		    && dev->parent && !dev->parent->power.ignore_children)
> -			dev->parent->power.wakeup_path = true;
> +		if (parent) {
> +			spin_lock_irq(&parent->power.lock);
> +
> +			if (dev->power.wakeup_path
> +			    && !parent->power.ignore_children)
> +				parent->power.wakeup_path = true;
> +
> +			if (!pm_leave_runtime_suspended(dev))
> +				__set_leave_runtime_suspended(parent, false);
> +
> +			spin_unlock_irq(&parent->power.lock);
> +		}

Then of course, this code would move up, before the callback, and the 
callback would be skipped if leave_runtime_suspended was set.

Alan Stern

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [RFC][PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain during system suspend
  2014-05-07 23:33                                                             ` [RFC][PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain " Rafael J. Wysocki
@ 2014-05-08 14:59                                                               ` Alan Stern
  2014-05-08 19:40                                                                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 78+ messages in thread
From: Alan Stern @ 2014-05-08 14:59 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Thu, 8 May 2014, Rafael J. Wysocki wrote:

> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Rework the ACPI PM domain's PM callbacks to avoid resuming devices
> during system suspend (in order to modify their wakeup settings etc.)
> if that isn't necessary.

Wasn't there going to be a patch in this series doing the same thing 
for PCI?  Or has it not yet been written?

Alan Stern

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [RFC][PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain during system suspend
  2014-05-08 14:59                                                               ` Alan Stern
@ 2014-05-08 19:40                                                                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-08 19:40 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Thursday, May 08, 2014 10:59:38 AM Alan Stern wrote:
> On Thu, 8 May 2014, Rafael J. Wysocki wrote:
> 
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Rework the ACPI PM domain's PM callbacks to avoid resuming devices
> > during system suspend (in order to modify their wakeup settings etc.)
> > if that isn't necessary.
> 
> Wasn't there going to be a patch in this series doing the same thing 
> for PCI?  Or has it not yet been written?

Not yet written. :-)

Rafael


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-08 12:25                                                                       ` Ulf Hansson
@ 2014-05-08 20:02                                                                         ` Rafael J. Wysocki
  0 siblings, 0 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-08 20:02 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Alan Stern, Linux PM list, Mika Westerberg, Aaron Lu,
	ACPI Devel Maling List, LKML

On Thursday, May 08, 2014 02:25:06 PM Ulf Hansson wrote:
> On 8 May 2014 13:44, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Thursday, May 08, 2014 12:59:20 PM Ulf Hansson wrote:
> >> On 8 May 2014 12:53, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >> > On Thursday, May 08, 2014 09:49:36 AM Ulf Hansson wrote:
> >> >> On 8 May 2014 01:29, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >> >> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >> >> >
> >> >> > Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> >> >> > resume all runtime-suspended devices during system suspend, mostly
> >> >> > because those devices may need to be reprogrammed due to different
> >> >> > wakeup settings for system sleep and for runtime PM.
> >> >> >
> >> >> > For some devices, though, it's OK to remain in runtime suspend
> >> >> > throughout a complete system suspend/resume cycle (if the device was in
> >> >> > runtime suspend at the start of the cycle).  We would like to do this
> >> >> > whenever possible, to avoid the overhead of extra power-up and power-down
> >> >> > events.
> >> >> >
> >> >> > However, problems may arise because the device's descendants may require
> >> >> > it to be at full power at various points during the cycle.  Therefore the
> >> >> > most straightforward way to do this safely is if the device and all its
> >> >> > descendants can remain runtime suspended until the resume stage of system
> >> >> > resume.
> >> >> >
> >> >> > To this end, introduce dev->power.leave_runtime_suspended.
> >> >> > If a subsystem or driver sets this flag during the ->prepare() callback,
> >> >> > and if the flag is set in all of the device's descendants, and if the
> >> >> > device is still in runtime suspend at the beginning of the ->suspend()
> >> >> > callback, that callback is allowed to return 0 without clearing
> >> >> > power.leave_runtime_suspended and without changing the state of the
> >> >> > device, unless the current state of the device is not appropriate for
> >> >> > the upcoming system sleep state (for example, the device is supposed to
> >> >> > wake up the system from that state and its current wakeup settings are
> >> >> > not suitable for that).  Then, the PM core will not invoke the device's
> >> >> > ->suspend_late(), ->suspend_noirq(), ->resume_noirq(), ->resume_early(), or
> >> >> > ->resume() callbacks.  Instead, it will invoke ->runtime_resume() during
> >> >> > the device resume stage of system resume.
> >> >> >
> >> >> > By leaving this flag set after ->suspend(), a driver or subsystem tells
> >> >> > the PM core that the device is runtime suspended, it is in a suitable
> >> >> > state for system suspend (for example, the wakeup setting does not
> >> >> > need to be changed), and it does not need to return to full
> >> >> > power until the resume stage.
> >> >> >
> >> >> > Changelog based on Alan Stern's description of the idea
> >> >> > (http://marc.info/?l=linux-pm&m=139940466625569&w=2).
> >> >> >
> >> >> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >> >> > ---
> >> >> >  drivers/base/power/main.c    |   31 ++++++++++++++++++++++++-------
> >> >> >  drivers/base/power/runtime.c |   10 ++++++++++
> >> >> >  include/linux/pm.h           |    3 +++
> >> >> >  include/linux/pm_runtime.h   |   16 ++++++++++++++++
> >> >> >  kernel/power/Kconfig         |    4 ++++
> >> >> >  5 files changed, 57 insertions(+), 7 deletions(-)
> >> >> >
> >> >> > Index: linux-pm/kernel/power/Kconfig
> >> >> > ===================================================================
> >> >> > --- linux-pm.orig/kernel/power/Kconfig
> >> >> > +++ linux-pm/kernel/power/Kconfig
> >> >> > @@ -147,6 +147,10 @@ config PM
> >> >> >         def_bool y
> >> >> >         depends on PM_SLEEP || PM_RUNTIME
> >> >> >
> >> >> > +config PM_BOTH
> >> >> > +       def_bool y
> >> >> > +       depends on PM_SLEEP && PM_RUNTIME
> >> >> > +
> >> >>
> >> >> Should we not depend on PM_RUNTIME only? Thus we don't need the new
> >> >> Kconfig,
> >> >
> >> > Well, OK.  I guess we can tolerate one useless statement in rpm_resume()
> >> > in case CONFIG_PM_SLEEP is unset.
> >> >
> >> >> and then we could rename the new APIs to pm_runtime_* instead.
> >> >
> >> > That would just make the name longer - for what value?
> >>
> >> Only "__set_leave_runtime_suspended" will be a bit longer.
> >>
> >> The idea I had was to clearly indicate that these functions are part of
> >> the PM_RUNTIME API.
> >>
> >> Compare what you have:
> >> __set_leave_runtime_suspended
> >> pm_set_leave_runtime_suspended
> >> pm_leave_runtime_suspended
> >>
> >> To what I suggest:
> >> __pm_runtime_set_leave_suspended
> >> pm_runtime_set_leave_suspended
> >> pm_runtime_leave_suspended
> >
> > And why exactly do you think these are any better?
> 
> Because that's how all (almost all) other functions in the runtime PM
> API are specified - I believe it makes sense to keep them aligned.
> 
> Anyway, if you insist on keeping your function names, it's not that
> big a deal for me.
> 
> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
> 
> >
> > The flag is not called leave_suspended surely?
> 
> To me that doesn't matter, the flag has nothing to do with the
> function names in an API.

Well, the point is that pm_runtime_leave_suspended suggests that the runtime
PM framework is supposed to leave the device suspended, while this isn't the
case.  This essentially is a system suspend flag that depends on runtime PM
being available.

Thanks!


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-08 14:57                                                               ` Alan Stern
@ 2014-05-08 20:17                                                                 ` Rafael J. Wysocki
  2014-05-08 21:03                                                                   ` Rafael J. Wysocki
  2014-05-08 21:08                                                                   ` Alan Stern
  0 siblings, 2 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-08 20:17 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Thursday, May 08, 2014 10:57:36 AM Alan Stern wrote:
> On Thu, 8 May 2014, Rafael J. Wysocki wrote:
> 
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> > resume all runtime-suspended devices during system suspend, mostly
> > because those devices may need to be reprogrammed due to different
> > wakeup settings for system sleep and for runtime PM.
> > 
> > For some devices, though, it's OK to remain in runtime suspend 
> > throughout a complete system suspend/resume cycle (if the device was in
> > runtime suspend at the start of the cycle).  We would like to do this
> > whenever possible, to avoid the overhead of extra power-up and power-down
> > events.
> > 
> > However, problems may arise because the device's descendants may require
> > it to be at full power at various points during the cycle.  Therefore the
> > most straightforward way to do this safely is if the device and all its
> > descendants can remain runtime suspended until the resume stage of system
> > resume.
> > 
> > To this end, introduce dev->power.leave_runtime_suspended.
> > If a subsystem or driver sets this flag during the ->prepare() callback,
> > and if the flag is set in all of the device's descendants, and if the
> > device is still in runtime suspend at the beginning of the ->suspend()
> > callback, that callback is allowed to return 0 without clearing
> > power.leave_runtime_suspended and without changing the state of the
> > device, unless the current state of the device is not appropriate for
> > the upcoming system sleep state (for example, the device is supposed to
> > wake up the system from that state and its current wakeup settings are
> > not suitable for that).  Then, the PM core will not invoke the device's
> > ->suspend_late(), ->suspend_noirq(), ->resume_noirq(), ->resume_early(), or
> > ->resume() callbacks.  Instead, it will invoke ->runtime_resume() during
> > the device resume stage of system resume.
> 
> Wait a minute.  Following ->runtime_suspend(), you are going to call 
> ->suspend() and then ->runtime_resume()?  That doesn't seem like what 
> you really want; a ->suspend() call should always have a matching 
> ->resume().

Yes, it should, but I didn't see any other way to do that.

> I guess you did it this way to allow for runtime-resumes and -suspends 
> between ->prepare() and ->suspend(), but it still seems wrong.

No.  I did that to allow ->suspend() to check whether or not the device is
in the right state.  ->prepare() could do that, arguably, but then there's
the case when ->runtime_suspend() may still be running in parallel with it.
And the device may be runtime-suspended immediately before its ->suspend()
in theory if its children do pm_runtime_put_sync(parent).

Also, this is a bus type ->suspend(), so the *driver* ->suspend()
won't be called at this point in the ACPI PM domain case for example.

> How about asking drivers to set leave_runtime_suspended in their
> ->runtime_suspend() callbacks, as well as during ->prepare()?  Then
> intervening runtime resume/suspend cycles wouldn't matter and you
> wouldn't need to call ->suspend(); you could skip it along with the
> other PM callbacks.

That wouldn't work, because they cannot know the target sleep state of the
system in advance.  That is only known during the given suspend sequence.

> > By leaving this flag set after ->suspend(), a driver or subsystem tells
> > the PM core that the device is runtime suspended, it is in a suitable
> > state for system suspend (for example, the wakeup setting does not
> > need to be changed), and it does not need to return to full
> > power until the resume stage.
> 
> So: By setting this flag during ->runtime_suspend() and ->prepare(), a
> driver or subsystem tells the PM core that the device is in a suitable
> state for system suspend (for example, the wakeup setting would not
> need to be changed), if one should occur before the next runtime
> resume, and the device would not need to return to full power until the
> resume stage.
>
> > --- linux-pm.orig/include/linux/pm_runtime.h
> > +++ linux-pm/include/linux/pm_runtime.h
> > @@ -264,4 +264,20 @@ static inline void pm_runtime_dont_use_a
> >  	__pm_runtime_use_autosuspend(dev, false);
> >  }
> >  
> > +#ifdef CONFIG_PM_BOTH
> > +static inline void __set_leave_runtime_suspended(struct device *dev, bool val)
> > +{
> > +	dev->power.leave_runtime_suspended = val;
> > +}
> > +extern void pm_set_leave_runtime_suspended(struct device *dev, bool val);
> > +static inline bool pm_leave_runtime_suspended(struct device *dev)
> > +{
> > +	return dev->power.leave_runtime_suspended;
> > +}
> 
> Is it generally your custom to use "set_" and "" rather than "set_" and 
> "get_"?

But (dev->power.syscore || pm_get_leave_runtime_suspended(dev)) looks awkward. :-)
 
> >   End:
> >  	if (!error) {
> > +		struct device *parent = dev->parent;
> > +
> >  		dev->power.is_suspended = true;
> > -		if (dev->power.wakeup_path
> > -		    && dev->parent && !dev->parent->power.ignore_children)
> > -			dev->parent->power.wakeup_path = true;
> > +		if (parent) {
> > +			spin_lock_irq(&parent->power.lock);
> > +
> > +			if (dev->power.wakeup_path
> > +			    && !parent->power.ignore_children)
> > +				parent->power.wakeup_path = true;
> > +
> > +			if (!pm_leave_runtime_suspended(dev))
> > +				__set_leave_runtime_suspended(parent, false);
> > +
> > +			spin_unlock_irq(&parent->power.lock);
> > +		}
> 
> Then of course, this code would move up, before the callback, and the 
> callback would be skipped if leave_runtime_suspended was set.

Well, not really. :-)

Rafael


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-08 20:17                                                                 ` Rafael J. Wysocki
@ 2014-05-08 21:03                                                                   ` Rafael J. Wysocki
  2014-05-08 21:20                                                                     ` Alan Stern
  2014-05-08 21:08                                                                   ` Alan Stern
  1 sibling, 1 reply; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-08 21:03 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Thursday, May 08, 2014 10:17:50 PM Rafael J. Wysocki wrote:
> On Thursday, May 08, 2014 10:57:36 AM Alan Stern wrote:
> > On Thu, 8 May 2014, Rafael J. Wysocki wrote:
> > 
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > 
> > > Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> > > resume all runtime-suspended devices during system suspend, mostly
> > > because those devices may need to be reprogrammed due to different
> > > wakeup settings for system sleep and for runtime PM.
> > > 
> > > For some devices, though, it's OK to remain in runtime suspend 
> > > throughout a complete system suspend/resume cycle (if the device was in
> > > runtime suspend at the start of the cycle).  We would like to do this
> > > whenever possible, to avoid the overhead of extra power-up and power-down
> > > events.
> > > 
> > > However, problems may arise because the device's descendants may require
> > > it to be at full power at various points during the cycle.  Therefore the
> > > most straightforward way to do this safely is if the device and all its
> > > descendants can remain runtime suspended until the resume stage of system
> > > resume.
> > > 
> > > To this end, introduce dev->power.leave_runtime_suspended.
> > > If a subsystem or driver sets this flag during the ->prepare() callback,
> > > and if the flag is set in all of the device's descendants, and if the
> > > device is still in runtime suspend at the beginning of the ->suspend()
> > > callback, that callback is allowed to return 0 without clearing
> > > power.leave_runtime_suspended and without changing the state of the
> > > device, unless the current state of the device is not appropriate for
> > > the upcoming system sleep state (for example, the device is supposed to
> > > wake up the system from that state and its current wakeup settings are
> > > not suitable for that).  Then, the PM core will not invoke the device's
> > > ->suspend_late(), ->suspend_noirq(), ->resume_noirq(), ->resume_early(), or
> > > ->resume() callbacks.  Instead, it will invoke ->runtime_resume() during
> > > the device resume stage of system resume.
> > 
> > Wait a minute.  Following ->runtime_suspend(), you are going to call 
> > ->suspend() and then ->runtime_resume()?  That doesn't seem like what 
> > you really want; a ->suspend() call should always have a matching 
> > ->resume().
> 
> Yes, it should, but I didn't see any other way to do that.

Actually, that's kind of easy to resolve. :-)

When ->suspend() leaves power.leave_runtime_suspended set, the PM core can
simply skip the early/late and noirq callbacks and then call ->resume(),
which will be responsible for doing whatever is necessary to resume the
device.

And perhaps the flag should be called something different then, like
direct_resume (meaning go directly for ->resume() without executing
the intermediate callbacks)?

Rafael


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-08 20:17                                                                 ` Rafael J. Wysocki
  2014-05-08 21:03                                                                   ` Rafael J. Wysocki
@ 2014-05-08 21:08                                                                   ` Alan Stern
  1 sibling, 0 replies; 78+ messages in thread
From: Alan Stern @ 2014-05-08 21:08 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Thu, 8 May 2014, Rafael J. Wysocki wrote:

> > Wait a minute.  Following ->runtime_suspend(), you are going to call 
> > ->suspend() and then ->runtime_resume()?  That doesn't seem like what 
> > you really want; a ->suspend() call should always have a matching 
> > ->resume().
> 
> Yes, it should, but I didn't see any other way to do that.
> 
> > I guess you did it this way to allow for runtime-resumes and -suspends 
> > between ->prepare() and ->suspend(), but it still seems wrong.
> 
> No.  I did that to allow ->suspend() to check whether or not the device is
> in the right state.  ->prepare() could do that, arguably, but then there's
> the case when ->runtime_suspend() may still be running in parallel with it.
> And the device may be runtime-suspended immediately before its ->suspend()
> in theory if its children do pm_runtime_put_sync(parent).
> 
> Also, this is a bus type ->suspend(), so the *driver* ->suspend()
> won't be called at this point in the ACPI PM domain case for example.
> 
> > How about asking drivers to set leave_runtime_suspended in their
> > ->runtime_suspend() callbacks, as well as during ->prepare()?  Then
> > intervening runtime resume/suspend cycles wouldn't matter and you
> > wouldn't need to call ->suspend(); you could skip it along with the
> > other PM callbacks.
> 
> That wouldn't work, because they cannot know the target sleep state of the
> system in advance.  That is only known during the given suspend sequence.

Argh!  We're both being foolish.  Runtime suspends can't occur between 
->prepare() and ->suspend(), because device_prepare() does a 
pm_runtime_get_noresume.

You might still have to worry about a runtime suspend concurrent with
->prepare(), though.  An appropriate barrier could fix that.
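
A toy userspace model of that argument, in case it helps.  Everything
named toy_* is made up for illustration; only the idea that a nonzero
usage count blocks runtime suspend, and that device_prepare() takes such
a reference via pm_runtime_get_noresume(), comes from the discussion
above.  It does not model the concurrent case.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the runtime PM part of a device. */
struct toy_rpm_dev {
	int	usage_count;		/* references forbidding runtime suspend */
	bool	runtime_suspended;
};

/* Models pm_runtime_get_noresume(): take a reference, leave the
 * current runtime PM status untouched. */
void toy_get_noresume(struct toy_rpm_dev *dev)
{
	dev->usage_count++;
}

/* Models dropping that reference again without triggering anything. */
void toy_put_noidle(struct toy_rpm_dev *dev)
{
	assert(dev->usage_count > 0);	/* unbalanced put is a bug */
	dev->usage_count--;
}

/* Models a runtime-suspend attempt: refused while references are held. */
bool toy_try_runtime_suspend(struct toy_rpm_dev *dev)
{
	if (dev->usage_count > 0)
		return false;

	dev->runtime_suspended = true;
	return true;
}

/* Models the start of device_prepare(): pin the runtime PM status. */
void toy_device_prepare(struct toy_rpm_dev *dev)
{
	toy_get_noresume(dev);
}
```

In this model a device that is active when toy_device_prepare() runs
cannot become runtime-suspended until the reference is dropped, and a
device that was already suspended simply stays that way.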

Alan Stern

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-08 21:03                                                                   ` Rafael J. Wysocki
@ 2014-05-08 21:20                                                                     ` Alan Stern
  2014-05-08 21:42                                                                       ` Rafael J. Wysocki
  0 siblings, 1 reply; 78+ messages in thread
From: Alan Stern @ 2014-05-08 21:20 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Thu, 8 May 2014, Rafael J. Wysocki wrote:

> > > Wait a minute.  Following ->runtime_suspend(), you are going to call 
> > > ->suspend() and then ->runtime_resume()?  That doesn't seem like what 
> > > you really want; a ->suspend() call should always have a matching 
> > > ->resume().
> > 
> > Yes, it should, but I didn't see any other way to do that.
> 
> Actually, that's kind of easy to resolve. :-)
> 
> When ->suspend() leaves power.leave_runtime_suspended set, the PM core can
> simply skip the early/late and noirq callbacks and then call ->resume()
> that will be responsible for using whatever is necessary to resume the
> device.
> 
> And perhaps the flag should be called something different then, like
> direct_resume (meaning go directly for ->resume() without executing
> the intermediate callbacks)?

In light of what I wrote earlier, it should be okay for the ->prepare() 
callback to be responsible for setting leave_runtime_suspended.  Then 
there will be no need to call either ->suspend() or ->resume().

Alan Stern


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-08 21:20                                                                     ` Alan Stern
@ 2014-05-08 21:42                                                                       ` Rafael J. Wysocki
  2014-05-08 21:50                                                                         ` Rafael J. Wysocki
  0 siblings, 1 reply; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-08 21:42 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Thursday, May 08, 2014 05:20:43 PM Alan Stern wrote:
> On Thu, 8 May 2014, Rafael J. Wysocki wrote:
> 
> > > > Wait a minute.  Following ->runtime_suspend(), you are going to call 
> > > > ->suspend() and then ->runtime_resume()?  That doesn't seem like what 
> > > > you really want; a ->suspend() call should always have a matching 
> > > > ->resume().
> > > 
> > > Yes, it should, but I didn't see any other way to do that.
> > 
> > Actually, that's kind of easy to resolve. :-)
> > 
> > When ->suspend() leaves power.leave_runtime_suspended set, the PM core can
> > simply skip the early/late and noirq callbacks and then call ->resume()
> > that will be responsible for using whatever is necessary to resume the
> > device.
> > 
> > And perhaps the flag should be called something different then, like
> > direct_resume (meaning go directly for ->resume() without executing
> > the intermediate callbacks)?
> 
> In light of what I wrote earlier, it should be okay for the ->prepare() 
> callback to be responsible for setting leave_runtime_suspended.  Then 
> there will be no need to call either ->suspend() or ->resume().

Hmm.  OK, let's try that.

Rafael


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-08 21:42                                                                       ` Rafael J. Wysocki
@ 2014-05-08 21:50                                                                         ` Rafael J. Wysocki
  2014-05-08 22:28                                                                           ` [RFC][PATCH 0/3] (was: Re: PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices) Rafael J. Wysocki
  2014-05-09  1:52                                                                           ` [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices Alan Stern
  0 siblings, 2 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-08 21:50 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Thursday, May 08, 2014 11:42:01 PM Rafael J. Wysocki wrote:
> On Thursday, May 08, 2014 05:20:43 PM Alan Stern wrote:
> > On Thu, 8 May 2014, Rafael J. Wysocki wrote:
> > 
> > > > > Wait a minute.  Following ->runtime_suspend(), you are going to call 
> > > > > ->suspend() and then ->runtime_resume()?  That doesn't seem like what 
> > > > > you really want; a ->suspend() call should always have a matching 
> > > > > ->resume().
> > > > 
> > > > Yes, it should, but I didn't see any other way to do that.
> > > 
> > > Actually, that's kind of easy to resolve. :-)
> > > 
> > > When ->suspend() leaves power.leave_runtime_suspended set, the PM core can
> > > simply skip the early/late and noirq callbacks and then call ->resume()
> > > that will be responsible for using whatever is necessary to resume the
> > > device.
> > > 
> > > And perhaps the flag should be called something different then, like
> > > direct_resume (meaning go directly for ->resume() without executing
> > > the intermediate callbacks)?
> > 
> > In light of what I wrote earlier, it should be okay for the ->prepare() 
> > callback to be responsible for setting leave_runtime_suspended.  Then 
> > there will be no need to call either ->suspend() or ->resume().
> 
> Hmm.  OK, let's try that.

Well, no.

The reason why that doesn't work is because ->prepare() callbacks are
executed in the reverse order, so the parent's ones will be run before
the ->prepare() of the children.  Thus if ->prepare() sets the flag
with the expectation that ->suspend() (and the subsequent callbacks)
won't be executed, that expectation may not be met actually.

So I'm going to do what I said above.  I prefer it anyway. :-)

Rafael


^ permalink raw reply	[flat|nested] 78+ messages in thread

* [RFC][PATCH 0/3] (was: Re: PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices)
  2014-05-08 21:50                                                                         ` Rafael J. Wysocki
@ 2014-05-08 22:28                                                                           ` Rafael J. Wysocki
  2014-05-08 22:41                                                                             ` [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices Rafael J. Wysocki
                                                                                               ` (2 more replies)
  2014-05-09  1:52                                                                           ` [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices Alan Stern
  1 sibling, 3 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-08 22:28 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List,
	LKML, Ulf Hansson

On Thursday, May 08, 2014 11:50:23 PM Rafael J. Wysocki wrote:
> On Thursday, May 08, 2014 11:42:01 PM Rafael J. Wysocki wrote:
> > On Thursday, May 08, 2014 05:20:43 PM Alan Stern wrote:
> > > On Thu, 8 May 2014, Rafael J. Wysocki wrote:
> > > 
> > > > > > Wait a minute.  Following ->runtime_suspend(), you are going to call 
> > > > > > ->suspend() and then ->runtime_resume()?  That doesn't seem like what 
> > > > > > you really want; a ->suspend() call should always have a matching 
> > > > > > ->resume().
> > > > > 
> > > > > Yes, it should, but I didn't see any other way to do that.
> > > > 
> > > > Actually, that's kind of easy to resolve. :-)
> > > > 
> > > > When ->suspend() leaves power.leave_runtime_suspended set, the PM core can
> > > > simply skip the early/late and noirq callbacks and then call ->resume()
> > > > that will be responsible for using whatever is necessary to resume the
> > > > device.
> > > > 
> > > > And perhaps the flag should be called something different then, like
> > > > direct_resume (meaning go directly for ->resume() without executing
> > > > the intermediate callbacks)?
> > > 
> > > In light of what I wrote earlier, it should be okay for the ->prepare() 
> > > callback to be responsible for setting leave_runtime_suspended.  Then 
> > > there will be no need to call either ->suspend() or ->resume().
> > 
> > Hmm.  OK, let's try that.
> 
> Well, no.
> 
> The reason why that doesn't work is because ->prepare() callbacks are
> executed in the reverse order, so the parent's ones will be run before
> the ->prepare() of the children.  Thus if ->prepare() sets the flag
> with the expectation that ->suspend() (and the subsequent callbacks)
> won't be executed, that expectation may not be met actually.
> 
> So I'm going to do what I said above.  I prefer it anyway. :-)

Reworked patches follow (again).

I followed Ulf's suggestion to make the helpers depend on PM_RUNTIME
instead of adding a new CONFIG_ symbol.

Rafael


^ permalink raw reply	[flat|nested] 78+ messages in thread

* [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-08 22:28                                                                           ` [RFC][PATCH 0/3] (was: Re: PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices) Rafael J. Wysocki
@ 2014-05-08 22:41                                                                             ` Rafael J. Wysocki
  2014-05-09  7:23                                                                               ` Ulf Hansson
  2014-05-08 22:41                                                                             ` [RFC][PATCH 2/3] PM / runtime: Routine for checking device status during system suspend Rafael J. Wysocki
  2014-05-08 22:42                                                                             ` [RFC][PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain " Rafael J. Wysocki
  2 siblings, 1 reply; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-08 22:41 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List,
	LKML, Ulf Hansson

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
resume all runtime-suspended devices during system suspend, mostly
because those devices may need to be reprogrammed due to different
wakeup settings for system sleep and for runtime PM.

For some devices, though, it's OK to remain in runtime suspend 
throughout a complete system suspend/resume cycle (if the device was in
runtime suspend at the start of the cycle).  We would like to do this
whenever possible, to avoid the overhead of extra power-up and power-down
events.

However, problems may arise because the device's descendants may require
it to be at full power at various points during the cycle.  Therefore the
most straightforward way to do this safely is if the device and all its
descendants can remain runtime suspended until the resume stage of system
resume.

To this end, introduce a new device PM flag, power.direct_resume.
If a subsystem or driver sets this flag during the ->prepare()
callback, and if the flag is set in all of the device's descendants,
and if the device is still in runtime suspend at the beginning of the
->suspend() callback, that callback is allowed to return 0 without
clearing power.direct_resume and without changing the state of the
device, unless the current state of the device is not appropriate for
the upcoming system sleep state (for example, the device is supposed
to wake up the system from that state and its current wakeup settings
are not suitable for that).  Then, the PM core will not invoke the
device's ->suspend_late(), ->suspend_noirq(), ->resume_noirq(), or
->resume_early() callbacks.  Instead, it will only invoke ->resume()
during the device resume stage of system resume and that callback
will be entirely responsible for resuming the device as appropriate.

By leaving this flag set after ->suspend(), a driver or subsystem tells
the PM core that the device is runtime suspended, it is in a suitable
state for system suspend (for example, the wakeup setting does not
need to be changed), and it does not need to return to full power
until the resume stage.
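
The rule described above can be illustrated with a small user-space model.
This is only a sketch under stated assumptions, not the kernel code: struct
toy_dev and every toy_*() name are hypothetical, and the subsystem is assumed
to opt in for every device that is runtime-suspended when ->prepare() runs.
It shows how one descendant that is not runtime-suspended takes the flag away
from its parent, so the parent's late and noirq callbacks are not skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device; only what the sketch needs. */
struct toy_dev {
	struct toy_dev *parent;
	bool runtime_suspended;	/* runtime PM status when system suspend starts */
	bool direct_resume;	/* models power.direct_resume */
};

/*
 * Models dpm_prepare() plus ->prepare(): the core resets the flag, then
 * the (assumed) subsystem opts in for runtime-suspended devices.
 */
static void toy_prepare(struct toy_dev *dev)
{
	assert(dev != NULL);
	dev->direct_resume = false;
	if (dev->runtime_suspended)
		dev->direct_resume = true;
}

/*
 * Models ->suspend() plus the tail of __device_suspend(): a device that
 * is not runtime-suspended drops its own flag, and a device whose flag
 * is clear also clears the flag of its parent.
 */
static void toy_suspend(struct toy_dev *dev)
{
	assert(dev != NULL);
	if (!dev->runtime_suspended)
		dev->direct_resume = false;
	if (dev->parent && !dev->direct_resume)
		dev->parent->direct_resume = false;
}

/* devs[] must list parents before their children. */
static void toy_suspend_cycle(struct toy_dev **devs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)		/* prepare: parents first */
		toy_prepare(devs[i]);
	for (i = n; i-- > 0; )		/* suspend: children first */
		toy_suspend(devs[i]);
}

/* True if the core would skip the late/early and noirq callbacks. */
static bool toy_skips_late_and_noirq(const struct toy_dev *dev)
{
	return dev->direct_resume;
}
```

Calling toy_suspend_cycle() on a parent with one runtime-suspended child and
one active child leaves the flag set only in the suspended child.  The model
leaves out locking, wakeup settings and the runtime resume that the real
->suspend() would carry out for the parent in that case.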

Changelog based on Alan Stern's description of the idea
(http://marc.info/?l=linux-pm&m=139940466625569&w=2).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/power/main.c    |   26 +++++++++++++++++++-------
 drivers/base/power/runtime.c |    8 ++++++++
 include/linux/pm.h           |    1 +
 include/linux/pm_runtime.h   |   13 +++++++++++++
 4 files changed, 41 insertions(+), 7 deletions(-)

Index: linux-pm/include/linux/pm.h
===================================================================
--- linux-pm.orig/include/linux/pm.h
+++ linux-pm/include/linux/pm.h
@@ -574,6 +574,7 @@ struct dev_pm_info {
 	unsigned int		use_autosuspend:1;
 	unsigned int		timer_autosuspends:1;
 	unsigned int		memalloc_noio:1;
+	bool			direct_resume:1;	/* For system suspend */
 	enum rpm_request	request;
 	enum rpm_status		runtime_status;
 	int			runtime_error;
Index: linux-pm/include/linux/pm_runtime.h
===================================================================
--- linux-pm.orig/include/linux/pm_runtime.h
+++ linux-pm/include/linux/pm_runtime.h
@@ -57,6 +57,7 @@ extern unsigned long pm_runtime_autosusp
 extern void pm_runtime_update_max_time_suspended(struct device *dev,
 						 s64 delta_ns);
 extern void pm_runtime_set_memalloc_noio(struct device *dev, bool enable);
+extern void pm_set_direct_resume(struct device *dev, bool val);
 
 static inline bool pm_children_suspended(struct device *dev)
 {
@@ -116,6 +117,15 @@ static inline void pm_runtime_mark_last_
 	ACCESS_ONCE(dev->power.last_busy) = jiffies;
 }
 
+static inline void __set_direct_resume(struct device *dev, bool val)
+{
+	dev->power.direct_resume = val;
+}
+
+static inline bool pm_direct_resume_is_set(struct device *dev)
+{
+	return dev->power.direct_resume;
+}
 #else /* !CONFIG_PM_RUNTIME */
 
 static inline int __pm_runtime_idle(struct device *dev, int rpmflags)
@@ -165,6 +175,9 @@ static inline unsigned long pm_runtime_a
 				struct device *dev) { return 0; }
 static inline void pm_runtime_set_memalloc_noio(struct device *dev,
 						bool enable){}
+static inline void __set_direct_resume(struct device *dev, bool val) {}
+static inline void pm_set_direct_resume(struct device *dev, bool val) {}
+static inline bool pm_direct_resume_is_set(struct device *dev) { return false; }
 
 #endif /* !CONFIG_PM_RUNTIME */
 
Index: linux-pm/drivers/base/power/runtime.c
===================================================================
--- linux-pm.orig/drivers/base/power/runtime.c
+++ linux-pm/drivers/base/power/runtime.c
@@ -732,6 +732,7 @@ static int rpm_resume(struct device *dev
 	}
  skip_parent:
 
+	__set_direct_resume(dev, false);
 	if (dev->power.no_callbacks)
 		goto no_callback;	/* Assume success. */
 
@@ -1485,3 +1486,10 @@ out:
 	return ret;
 }
 EXPORT_SYMBOL_GPL(pm_runtime_force_resume);
+
+void pm_set_direct_resume(struct device *dev, bool val)
+{
+	spin_lock_irq(&dev->power.lock);
+	__set_direct_resume(dev, val);
+	spin_unlock_irq(&dev->power.lock);
+}
Index: linux-pm/drivers/base/power/main.c
===================================================================
--- linux-pm.orig/drivers/base/power/main.c
+++ linux-pm/drivers/base/power/main.c
@@ -479,7 +479,7 @@ static int device_resume_noirq(struct de
 	TRACE_DEVICE(dev);
 	TRACE_RESUME(0);
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || pm_direct_resume_is_set(dev))
 		goto Out;
 
 	if (!dev->power.is_noirq_suspended)
@@ -605,7 +605,7 @@ static int device_resume_early(struct de
 	TRACE_DEVICE(dev);
 	TRACE_RESUME(0);
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || pm_direct_resume_is_set(dev))
 		goto Out;
 
 	if (!dev->power.is_late_suspended)
@@ -1007,7 +1007,7 @@ static int __device_suspend_noirq(struct
 		goto Complete;
 	}
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || pm_direct_resume_is_set(dev))
 		goto Complete;
 
 	dpm_wait_for_children(dev, async);
@@ -1146,7 +1146,7 @@ static int __device_suspend_late(struct
 		goto Complete;
 	}
 
-	if (dev->power.syscore)
+	if (dev->power.syscore || pm_direct_resume_is_set(dev))
 		goto Complete;
 
 	dpm_wait_for_children(dev, async);
@@ -1382,10 +1382,21 @@ static int __device_suspend(struct devic
 
  End:
 	if (!error) {
+		struct device *parent = dev->parent;
+
 		dev->power.is_suspended = true;
-		if (dev->power.wakeup_path
-		    && dev->parent && !dev->parent->power.ignore_children)
-			dev->parent->power.wakeup_path = true;
+		if (parent) {
+			spin_lock_irq(&parent->power.lock);
+
+			if (dev->power.wakeup_path
+			    && !parent->power.ignore_children)
+				parent->power.wakeup_path = true;
+
+			if (!pm_direct_resume_is_set(dev))
+				__set_direct_resume(parent, false);
+
+			spin_unlock_irq(&parent->power.lock);
+		}
 	}
 
 	device_unlock(dev);
@@ -1553,6 +1564,7 @@ int dpm_prepare(pm_message_t state)
 		struct device *dev = to_device(dpm_list.next);
 
 		get_device(dev);
+		pm_set_direct_resume(dev, false);
 		mutex_unlock(&dpm_list_mtx);
 
 		error = device_prepare(dev, state);


^ permalink raw reply	[flat|nested] 78+ messages in thread

* [RFC][PATCH 2/3] PM / runtime: Routine for checking device status during system suspend
  2014-05-08 22:28                                                                           ` [RFC][PATCH 0/3] (was: Re: PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices) Rafael J. Wysocki
  2014-05-08 22:41                                                                             ` [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices Rafael J. Wysocki
@ 2014-05-08 22:41                                                                             ` Rafael J. Wysocki
  2014-05-08 22:42                                                                             ` [RFC][PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain " Rafael J. Wysocki
  2 siblings, 0 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-08 22:41 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List,
	LKML, Ulf Hansson

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Add a new helper routine, pm_runtime_enabled_and_suspended(), to
allow subsystems (or PM domains) to check the runtime PM status of
devices during system suspend (possibly to avoid resuming those
devices upfront at that time).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/power/runtime.c |   27 +++++++++++++++++++++++++++
 include/linux/pm_runtime.h   |    2 ++
 2 files changed, 29 insertions(+)

Index: linux-pm/include/linux/pm_runtime.h
===================================================================
--- linux-pm.orig/include/linux/pm_runtime.h
+++ linux-pm/include/linux/pm_runtime.h
@@ -57,6 +57,7 @@ extern unsigned long pm_runtime_autosusp
 extern void pm_runtime_update_max_time_suspended(struct device *dev,
 						 s64 delta_ns);
 extern void pm_runtime_set_memalloc_noio(struct device *dev, bool enable);
+extern bool pm_runtime_enabled_and_suspended(struct device *dev);
 extern void pm_set_direct_resume(struct device *dev, bool val);
 
 static inline bool pm_children_suspended(struct device *dev)
@@ -175,6 +176,7 @@ static inline unsigned long pm_runtime_a
 				struct device *dev) { return 0; }
 static inline void pm_runtime_set_memalloc_noio(struct device *dev,
 						bool enable){}
+static inline bool pm_runtime_enabled_and_suspended(struct device *dev) { return false; }
 static inline void __set_direct_resume(struct device *dev, bool val) {}
 static inline void pm_set_direct_resume(struct device *dev, bool val) {}
 static inline bool pm_direct_resume_is_set(struct device *dev) { return false; }
Index: linux-pm/drivers/base/power/runtime.c
===================================================================
--- linux-pm.orig/drivers/base/power/runtime.c
+++ linux-pm/drivers/base/power/runtime.c
@@ -1196,6 +1196,33 @@ void pm_runtime_enable(struct device *de
 EXPORT_SYMBOL_GPL(pm_runtime_enable);
 
 /**
+ * pm_runtime_enabled_and_suspended - Check runtime PM status of a device.
+ * @dev: Device to handle.
+ *
+ * This routine is to be executed during system suspend only, after
+ * device_prepare() has been executed for @dev.
+ *
+ * Return false if runtime PM is disabled for the device.  Otherwise, wait
+ * for pending transitions to complete and check the runtime PM status of the
+ * device after that.  Return true if it is RPM_SUSPENDED.
+ */
+bool pm_runtime_enabled_and_suspended(struct device *dev)
+{
+	unsigned long flags;
+	bool ret;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+	if (dev->power.disable_depth) {
+		ret = false;
+	} else {
+		__pm_runtime_barrier(dev);
+		ret = pm_runtime_status_suspended(dev);
+	}
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+	return ret;
+}
+
+/**
  * pm_runtime_forbid - Block runtime PM of a device.
  * @dev: Device to handle.
  *


^ permalink raw reply	[flat|nested] 78+ messages in thread

* [RFC][PATCH 3/3]  ACPI / PM: Avoid resuming devices in ACPI PM domain during system suspend
  2014-05-08 22:28                                                                           ` [RFC][PATCH 0/3] (was: Re: PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices) Rafael J. Wysocki
  2014-05-08 22:41                                                                             ` [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices Rafael J. Wysocki
  2014-05-08 22:41                                                                             ` [RFC][PATCH 2/3] PM / runtime: Routine for checking device status during system suspend Rafael J. Wysocki
@ 2014-05-08 22:42                                                                             ` Rafael J. Wysocki
  2 siblings, 0 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-08 22:42 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List,
	LKML, Ulf Hansson

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Rework the ACPI PM domain's PM callbacks to avoid resuming devices
during system suspend (in order to modify their wakeup settings etc.)
if that isn't necessary.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/device_pm.c |   46 ++++++++++++++++++++++++++++++++++++++++++----
 drivers/acpi/scan.c      |    4 ++++
 include/acpi/acpi_bus.h  |    3 ++-
 3 files changed, 48 insertions(+), 5 deletions(-)

Index: linux-pm/drivers/acpi/device_pm.c
===================================================================
--- linux-pm.orig/drivers/acpi/device_pm.c
+++ linux-pm/drivers/acpi/device_pm.c
@@ -907,20 +907,44 @@ int acpi_subsys_prepare(struct device *d
 	if (dev->power.ignore_children)
 		pm_runtime_resume(dev);
 
+	pm_set_direct_resume(dev, true);
 	return pm_generic_prepare(dev);
 }
 EXPORT_SYMBOL_GPL(acpi_subsys_prepare);
 
 /**
- * acpi_subsys_suspend - Run the device driver's suspend callback.
+ * acpi_subsys_suspend - Handle device suspend stage of system suspend.
  * @dev: Device to handle.
- *
- * Follow PCI and resume devices suspended at run time before running their
- * system suspend callbacks.
  */
 int acpi_subsys_suspend(struct device *dev)
 {
+	struct acpi_device *adev = ACPI_COMPANION(dev);
+	u32 sys_target;
+
+	if (!adev || !pm_runtime_enabled_and_suspended(dev)) {
+		pm_set_direct_resume(dev, false);
+		goto out;
+	}
+	if (!pm_direct_resume_is_set(dev)
+	    || device_may_wakeup(dev) != !!adev->wakeup.prepare_count)
+		goto resume;
+
+	sys_target = acpi_target_system_state();
+	if (sys_target != ACPI_STATE_S0) {
+		int ret, state;
+
+		if (adev->power.flags.dsw_present)
+			goto resume;
+
+		ret = acpi_dev_pm_get_state(dev, adev, sys_target, NULL, &state);
+		if (!ret && state == adev->power.state)
+			return 0;
+	}
+
+ resume:
 	pm_runtime_resume(dev);
+
+ out:
 	return pm_generic_suspend(dev);
 }
 
@@ -954,6 +978,19 @@ int acpi_subsys_resume_early(struct devi
 EXPORT_SYMBOL_GPL(acpi_subsys_resume_early);
 
 /**
+ * acpi_subsys_resume - Handle device resume stage of system resume.
+ * @dev: Device to handle.
+ */
+int acpi_subsys_resume(struct device *dev)
+{
+	if (pm_direct_resume_is_set(dev)) {
+		pm_runtime_resume(dev);
+		return 0;
+	}
+	return pm_generic_resume(dev);
+}
+
+/**
  * acpi_subsys_freeze - Run the device driver's freeze callback.
  * @dev: Device to handle.
  */
@@ -982,6 +1019,7 @@ static struct dev_pm_domain acpi_general
 		.suspend = acpi_subsys_suspend,
 		.suspend_late = acpi_subsys_suspend_late,
 		.resume_early = acpi_subsys_resume_early,
+		.resume = acpi_subsys_resume,
 		.freeze = acpi_subsys_freeze,
 		.poweroff = acpi_subsys_suspend,
 		.poweroff_late = acpi_subsys_suspend_late,
Index: linux-pm/include/acpi/acpi_bus.h
===================================================================
--- linux-pm.orig/include/acpi/acpi_bus.h
+++ linux-pm/include/acpi/acpi_bus.h
@@ -261,7 +261,8 @@ struct acpi_device_power_flags {
 	u32 inrush_current:1;	/* Serialize Dx->D0 */
 	u32 power_removed:1;	/* Optimize Dx->D0 */
 	u32 ignore_parent:1;	/* Power is independent of parent power state */
-	u32 reserved:27;
+	u32 dsw_present:1;	/* _DSW present? */
+	u32 reserved:26;
 };
 
 struct acpi_device_power_state {
Index: linux-pm/drivers/acpi/scan.c
===================================================================
--- linux-pm.orig/drivers/acpi/scan.c
+++ linux-pm/drivers/acpi/scan.c
@@ -1551,9 +1551,13 @@ static void acpi_bus_get_power_flags(str
 	 */
 	if (acpi_has_method(device->handle, "_PSC"))
 		device->power.flags.explicit_get = 1;
+
 	if (acpi_has_method(device->handle, "_IRC"))
 		device->power.flags.inrush_current = 1;
 
+	if (acpi_has_method(device->handle, "_DSW"))
+		device->power.flags.dsw_present = 1;
+
 	/*
 	 * Enumerate supported power management states
 	 */


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-08 21:50                                                                         ` Rafael J. Wysocki
  2014-05-08 22:28                                                                           ` [RFC][PATCH 0/3] (was: Re: PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices) Rafael J. Wysocki
@ 2014-05-09  1:52                                                                           ` Alan Stern
  2014-05-09 22:49                                                                             ` Rafael J. Wysocki
  1 sibling, 1 reply; 78+ messages in thread
From: Alan Stern @ 2014-05-09  1:52 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Thu, 8 May 2014, Rafael J. Wysocki wrote:

> Well, no.
> 
> The reason why that doesn't work is because ->prepare() callbacks are
> executed in the reverse order, so the parent's ones will be run before
> the ->prepare() of the children.  Thus if ->prepare() sets the flag
> with the expectation that ->suspend() (and the subsequent callbacks)
> won't be executed, that expectation may not be met actually.

That's true also if the flag gets set in ->suspend(), isn't it?  A
driver may set direct_resume in its ->suspend() callback, expecting
that the subsequent callbacks won't be executed.  But if a descendant
hasn't also set its flag then the callbacks _will_ be executed.

> So I'm going to do what I said above.  I prefer it anyway. :-)

In your most recent patch (and in the earlier ones too), after you call
dev's ->suspend() routine, if dev->power.direct_resume isn't set then
you clear dev->parent->direct_resume.  But what good will that do if
dev->parent's ->suspend() routine turns the flag back on when it gets
called later?

I can think of two ways to make this work.

	Expect subsystems and drivers to set the flag during 
	->suspend().  Turn on the flag in every device during 
	device_prepare().  Then in __device_suspend(), remember the
	flag's value and turn it off before invoking the callback.  If 
	the flag is on again when the callback returns, set the flag 
	back to the remembered value.  If the flag ends up being off
	then turn off the parent's flag.

	Expect subsystems and drivers to set the flag during 
	->prepare().  Whenever a callback returns with the flag not
	set, clear the flag in all of the device's ancestors.

Both are somewhat awkward, and both involve turning the flag off after 
the callback has turned it on.
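
To make the two alternatives concrete, here is a rough user-space sketch of
both.  It is not kernel code: struct alt_dev and all alt_*() names are
hypothetical, and the driver's ->suspend() is reduced to "set the flag if the
driver wants direct resume".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device. */
struct alt_dev {
	struct alt_dev *parent;
	bool direct_resume;		/* models power.direct_resume */
	bool wants_direct_resume;	/* what the driver's ->suspend() decides */
};

/* First approach: the core turns the flag on in every device in prepare. */
static void alt_prepare(struct alt_dev *dev)
{
	assert(dev != NULL);
	dev->direct_resume = true;
}

/* Stand-in for the driver's ->suspend() callback. */
static void alt_driver_suspend(struct alt_dev *dev)
{
	if (dev->wants_direct_resume)
		dev->direct_resume = true;
}

/*
 * First approach, suspend side: remember the flag (a child may already
 * have cleared it), turn it off, run the callback, and accept the
 * callback's "on" only if the remembered value was also on.
 */
static void alt_device_suspend(struct alt_dev *dev)
{
	bool remembered;

	assert(dev != NULL);
	remembered = dev->direct_resume;
	dev->direct_resume = false;
	alt_driver_suspend(dev);
	if (dev->direct_resume)
		dev->direct_resume = remembered;
	if (!dev->direct_resume && dev->parent)
		dev->parent->direct_resume = false;
}

/*
 * Second approach: when a callback returns with the flag clear, clear
 * the flag in all of the device's ancestors.
 */
static void alt_clear_ancestors(struct alt_dev *dev)
{
	struct alt_dev *p;

	assert(dev != NULL);
	for (p = dev->parent; p; p = p->parent)
		p->direct_resume = false;
}
```

With the first approach a parent whose driver asks for direct resume still
ends up with the flag clear if a child was suspended earlier without it,
because the remembered value wins over what the callback set.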

Also, how do you expect direct_resume to work with the PCI subsystem?  
Will the PCI core set the flag appropriately on behalf of the driver?  
If the core does set the flag, will it invoke the driver's ->suspend()
callback or skip the callback?  If it invokes the driver's callback but
leaves the device in runtime suspend, what happens if the driver
expects the device always to be at full power when its ->suspend()  
routine runs?  If the core skips the driver's ->suspend() callback,
what happens if one of the device's children did not set direct_resume 
and so the later PM callbacks do get invoked?

Several of these questions are a lot easier to answer if the flag gets 
set during ->prepare() rather than ->suspend().

Alan Stern


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-08 22:41                                                                             ` [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices Rafael J. Wysocki
@ 2014-05-09  7:23                                                                               ` Ulf Hansson
  2014-05-09 11:33                                                                                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 78+ messages in thread
From: Ulf Hansson @ 2014-05-09  7:23 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Alan Stern, Linux PM list, Mika Westerberg, Aaron Lu,
	ACPI Devel Maling List, LKML

Hi Rafael,

> @@ -1485,3 +1486,10 @@ out:
>         return ret;
>  }
>  EXPORT_SYMBOL_GPL(pm_runtime_force_resume);
> +
> +void pm_set_direct_resume(struct device *dev, bool val)
> +{
> +       spin_lock_irq(&dev->power.lock);
> +       __set_direct_resume(dev, val);
> +       spin_unlock_irq(&dev->power.lock);
> +}

I believe you have to move the implementation of this function inside
#ifdef CONFIG_PM_RUNTIME.

Kind regards
Ulf Hansson

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-09  7:23                                                                               ` Ulf Hansson
@ 2014-05-09 11:33                                                                                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-09 11:33 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Alan Stern, Linux PM list, Mika Westerberg, Aaron Lu,
	ACPI Devel Maling List, LKML

On Friday, May 09, 2014 09:23:50 AM Ulf Hansson wrote:
> Hi Rafael,
> 
> > @@ -1485,3 +1486,10 @@ out:
> >         return ret;
> >  }
> >  EXPORT_SYMBOL_GPL(pm_runtime_force_resume);
> > +
> > +void pm_set_direct_resume(struct device *dev, bool val)
> > +{
> > +       spin_lock_irq(&dev->power.lock);
> > +       __set_direct_resume(dev, val);
> > +       spin_unlock_irq(&dev->power.lock);
> > +}
> 
> I believe you have to move the implementation of this function inside
> #ifdef CONFIG_PM_RUNTIME.

You're right, I forgot about the recent change in there.

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-07 23:29                                                             ` [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices Rafael J. Wysocki
  2014-05-08  7:49                                                               ` Ulf Hansson
  2014-05-08 14:57                                                               ` Alan Stern
@ 2014-05-09 22:48                                                               ` Kevin Hilman
  2014-05-10  1:38                                                                 ` Rafael J. Wysocki
  2 siblings, 1 reply; 78+ messages in thread
From: Kevin Hilman @ 2014-05-09 22:48 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Alan Stern, Linux PM list, Mika Westerberg, Aaron Lu,
	ACPI Devel Maling List, LKML

"Rafael J. Wysocki" <rjw@rjwysocki.net> writes:

> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> resume all runtime-suspended devices during system suspend, mostly
> because those devices may need to be reprogrammed due to different
> wakeup settings for system sleep and for runtime PM.
>
> For some devices, though, it's OK to remain in runtime suspend 
> throughout a complete system suspend/resume cycle (if the device was in
> runtime suspend at the start of the cycle).  We would like to do this
> whenever possible, to avoid the overhead of extra power-up and power-down
> events.
>
> However, problems may arise because the device's descendants may require
> it to be at full power at various points during the cycle.  Therefore the
> most straightforward way to do this safely is if the device and all its
> descendants can remain runtime suspended until the resume stage of system
> resume.
>
> To this end, introduce dev->power.leave_runtime_suspended.
> If a subsystem or driver sets this flag during the ->prepare() callback,
> and if the flag is set in all of the device's descendants, and if the
> device is still in runtime suspend at the beginning of the ->suspend()
> callback, that callback is allowed to return 0 without clearing
> power.leave_runtime_suspended and without changing the state of the
> device, unless the current state of the device is not appropriate for
> the upcoming system sleep state (for example, the device is supposed to
> wake up the system from that state and its current wakeup settings are
> not suitable for that).  Then, the PM core will not invoke the device's
> ->suspend_late(), ->suspend_irq(), ->resume_irq(), ->resume_early(), or
> ->resume() callbacks.  

Up to here, this sounds great.

> Instead, it will invoke ->runtime_resume() during the device resume
> stage of system resume.

But this part I'm not fully following...

> By leaving this flag set after ->suspend(), a driver or subsystem tells
> the PM core that the device is runtime suspended, it is in a suitable
> state for system suspend (for example, the wakeup setting does not
> need to be changed), and it does not need to return to full
> power until the resume stage.

But taking this "leave runtime suspended" idea to the next logical step,
why would/should a device need to return to full power at the ->resume()
stage, especially when it wasn't at full power when ->suspend()
happened?

IOW, why doesn't "leave runtime suspended" mean "leave runtime suspended
until runtime resumed on demand"?

Forcing ->runtime_resume() during device resume means that in most
cases, a device will be forcibly runtime resumed, only to have nothing
to do but go idle and runtime suspend again, resulting in a(nother)
unnecessary power-up, power-down cycle this patch is trying to avoid
during ->suspend().

Hmm, but wait a minute...

[...]

> @@ -735,6 +735,11 @@ static int device_resume(struct device *
>  	if (dev->power.syscore)
>  		goto Complete;
>  
> +	if (pm_leave_runtime_suspended(dev)) {
> +		pm_runtime_resume(dev);
> +		goto Complete;
> +	}

... maybe I'm forgetting how this works (since it's Friday and my brain
is already shutting down for the week) but after pm_runtime_resume() is
called, won't the device remain runtime active until
pm_runtime_suspend() is called, or until a
pm_runtime_get()/pm_runtime_put() cycle happens?

That means that on device resume, the device is forced into full power
state (even though it was runtime suspended when ->suspend() happened)
and will stay there until it's used again.

That seems like a rather unpleasant (and non-intuitive) side-effect of
"leave runtime suspended".

Kevin


* Re: [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-09  1:52                                                                           ` [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices Alan Stern
@ 2014-05-09 22:49                                                                             ` Rafael J. Wysocki
  2014-05-11 16:46                                                                               ` Alan Stern
  0 siblings, 1 reply; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-09 22:49 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Thursday, May 08, 2014 09:52:18 PM Alan Stern wrote:
> On Thu, 8 May 2014, Rafael J. Wysocki wrote:
> 
> > Well, no.
> > 
> > The reason why that doesn't work is because ->prepare() callbacks are
> > executed in the reverse order, so the parent's ones will be run before
> > the ->prepare() of the children.  Thus if ->prepare() sets the flag
> > with the expectation that ->suspend() (and the subsequent callbacks)
> > won't be executed, that expectation may not be met actually.
> 
> That's true also if the flag gets set in ->suspend(), isn't it?  A
> driver may set direct_resume in its ->suspend() callback, expecting
> that the subsequent callbacks won't be executed.  But if a descendant
> hasn't also set its flag then the callbacks _will_ be executed.

No, that's not possible with the current patch, because __device_suspend() is
executed for descendants first and then for ancestors and it clears
direct_resume for the parents of devices that don't have it set.  This means
that the ancestor's ->suspend() will see the flag clear if it is unset for
any of its descendants.

IOW, the only case in which the ancestor's ->suspend() sees the flag set is
when it has been set for all of its descendants.  Thus, if it leaves the
flag set, the late/early and noirq callbacks won't be executed for it.
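
The ordering argument above can be modelled in userspace C.  This is only
a sketch under stated assumptions: struct prop_dev, prop_propagate() and
the array standing in for the suspend order are hypothetical, not the PM
core's data structures, and power.ignore_children is not modelled.

```c
/* Userspace model only; every name here is hypothetical, not kernel API. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct prop_dev {
	struct prop_dev *parent;  /* NULL for a root device */
	bool direct_resume;       /* models dev->power.direct_resume */
};

/*
 * Walk the devices in suspend order (every child appears before its
 * parent) and clear the parent's flag whenever the child's flag is clear.
 * Because a parent is visited after all of its children, a cleared flag
 * travels all the way up the chain in a single pass.
 */
void prop_propagate(struct prop_dev **order, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct prop_dev *dev = order[i];

		assert(dev != NULL);
		if (!dev->direct_resume && dev->parent)
			dev->parent->direct_resume = false;
	}
}
```

With a root <- mid <- leaf chain visited as leaf, mid, root, a clear flag
on the leaf ends up clearing both ancestors, so an ancestor only ever sees
its flag set when every descendant kept it set.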

Now, there is a reason for concern in that, because ->suspend() may set the
flag as a result of an error and that may lead to unexpected consequences.

> > So I'm going to do what I said above.  I prefer it anyway. :-)
> 
> In your most recent patch (and in the earlier ones too), after you call
> dev's ->suspend() routine, if dev->power.direct_resume isn't set then
> you clear dev->parent->power.direct_resume.  But what good will that do if
> dev->parent's ->suspend() routine turns the flag back on when it gets
> called later?

In fact, ->suspend() is not supposed to set the flag when it is clear.
It can only clear it when it is set, in which case we fall back to a
"normal" suspend.

> I can think of two ways to make this work.
> 
> 	Expect subsystems and drivers to set the flag during 
> 	->suspend().  Turn on the flag in every device during 
> 	device_prepare().  Then in __device_suspend(), remember the
> 	flag's value and turn it off before invoking the callback.

That doesn't work, because ->suspend() has to decide whether or not
to resume the device and do things it would do normally, so it needs to know
the value of the flag.

> 	If the flag is on again when the callback returns, set the flag
> 	back to the remembered value.  If the flag ends up being off
> 	then turn off the parent's flag.

That'd be too late.  The only thing we can do to kind of protect the PM
core from errors in drivers in that case would be to remember the value of
the flag before calling ->suspend() and return an error if the flag is set
after ->suspend() but wasn't set before.
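
A minimal userspace sketch of that check, for illustration only.  The
names struct chk_dev and chk_call_suspend() are hypothetical, and the
choice of -EINVAL as the error code is an assumption; nothing above says
which error the core would return.

```c
/* Userspace model only; every name here is hypothetical, not kernel API. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct chk_dev {
	bool direct_resume;                   /* models the new flag */
	int (*suspend)(struct chk_dev *dev);  /* models ->suspend() */
};

/*
 * Remember the flag, run the callback, and reject a callback that
 * turned the flag on when it had been clear on entry.
 */
int chk_call_suspend(struct chk_dev *dev)
{
	bool was_set;
	int ret;

	assert(dev != NULL && dev->suspend != NULL);
	was_set = dev->direct_resume;
	ret = dev->suspend(dev);
	if (ret)
		return ret;            /* the callback's own error wins */
	if (dev->direct_resume && !was_set)
		return -EINVAL;        /* assumed error code for "cheating" */
	return 0;
}

/* A well-behaved callback: falls back to a normal suspend. */
int chk_suspend_clear(struct chk_dev *dev)
{
	dev->direct_resume = false;
	return 0;
}

/* A misbehaving callback: sets the flag regardless of its old value. */
int chk_suspend_set(struct chk_dev *dev)
{
	dev->direct_resume = true;
	return 0;
}
```

Note that the check only detects the problem after the callback has run,
so it limits the damage rather than preventing it.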

> 	Expect subsystems and drivers to set the flag during 
> 	->prepare().  Whenever a callback returns with the flag not
> 	set, clear the flag in all of the device's ancestors.
> 
> Both are somewhat awkward, and both involve turning the flag off after 
> the callback has turned it on.

If the callback turned the flag on when it shouldn't have, it might have
done something wrong already.

> Also, how do you expect direct_resume to work with the PCI subsystem?  
> Will the PCI core set the flag appropriately on behalf of the driver?

Yes.

> If the core does set the flag, will it invoke the driver's ->suspend()
> callback or skip the callback?

It will skip the driver's callback.  [It would actually help if you looked
at patch [3/3] which is there to illustrate my idea of how to do those
things in a subsystem.]

> If it invokes the driver's callback but
> leaves the device in runtime suspend, what happens if the driver
> expects the device always to be at full power when its ->suspend()  
> routine runs?  If the core skips the driver's ->suspend() callback,
> what happens if one of the device's children did not set direct_resume 
> and so the later PM callbacks do get invoked?

Then the parent will have direct_resume unset.  That is not a concern.
The only concern to me is possible errors in ->suspend() setting the
flag when it shouldn't.

> Several of these questions are a lot easier to answer if the flag gets 
> set during ->prepare() rather than ->suspend().

I agree with that, but I have one concern about this approach.  Namely,
in that case the PM core has to use pm_runtime_resume() or equivalent to
resume devices with the flag set during the device resume stage.  Now,
in the next step we may want to leave certain devices suspended at that
point and the PM core has no way to tell which ones.  Also subsystems
don't really have a chance to tell it about that (they would need to
know in advance during ->prepare(), which is kind of unrealistic, or
perhaps it isn't).

However, if ->resume() is called for devices with the flag set, like in
my most recent patch, the subsystem may decide not to resume the device
if it knows enough about it.

This pretty much is my only concern here, so I'm open to ideas how to deal
with leaving devices suspended (if possible) during the device resume stage. :-)

For one, postponing the resume to ->complete() is an option, but it will have
to be done with care, because the ->complete() callbacks are executed
sequentially, so calling pm_runtime_resume() from there is rather out of the
question.

Rafael



* Re: [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-09 22:48                                                               ` Kevin Hilman
@ 2014-05-10  1:38                                                                 ` Rafael J. Wysocki
  2014-05-12 16:33                                                                   ` Kevin Hilman
  0 siblings, 1 reply; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-10  1:38 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: Alan Stern, Linux PM list, Mika Westerberg, Aaron Lu,
	ACPI Devel Maling List, LKML

On Friday, May 09, 2014 03:48:21 PM Kevin Hilman wrote:
> "Rafael J. Wysocki" <rjw@rjwysocki.net> writes:
> 
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
> > resume all runtime-suspended devices during system suspend, mostly
> > because those devices may need to be reprogrammed due to different
> > wakeup settings for system sleep and for runtime PM.
> >
> > For some devices, though, it's OK to remain in runtime suspend 
> > throughout a complete system suspend/resume cycle (if the device was in
> > runtime suspend at the start of the cycle).  We would like to do this
> > whenever possible, to avoid the overhead of extra power-up and power-down
> > events.
> >
> > However, problems may arise because the device's descendants may require
> > it to be at full power at various points during the cycle.  Therefore the
> > most straightforward way to do this safely is if the device and all its
> > descendants can remain runtime suspended until the resume stage of system
> > resume.
> >
> > To this end, introduce dev->power.leave_runtime_suspended.
> > If a subsystem or driver sets this flag during the ->prepare() callback,
> > and if the flag is set in all of the device's descendants, and if the
> > device is still in runtime suspend at the beginning of the ->suspend()
> > callback, that callback is allowed to return 0 without clearing
> > power.leave_runtime_suspended and without changing the state of the
> > device, unless the current state of the device is not appropriate for
> > the upcoming system sleep state (for example, the device is supposed to
> > wake up the system from that state and its current wakeup settings are
> > not suitable for that).  Then, the PM core will not invoke the device's
> > ->suspend_late(), ->suspend_noirq(), ->resume_noirq(), ->resume_early(), or
> > ->resume() callbacks.  
> 
> Up to here, this sounds great.
> 
> > Instead, it will invoke ->runtime_resume() during the device resume
> > stage of system resume.
> 
> But this part I'm not fully following...

You're not looking at the most recent one. :-)

Please look here: https://patchwork.kernel.org/patch/4139181/

> > By leaving this flag set after ->suspend(), a driver or subsystem tells
> > the PM core that the device is runtime suspended, it is in a suitable
> > state for system suspend (for example, the wakeup setting does not
> > need to be changed), and it does not need to return to full
> > power until the resume stage.
> 
> But taking this "leave runtime suspended" idea to the next logical step,
> why would/should a device need to return to full power at the ->resume()
> stage, especially when it wasn't at full power when ->suspend()
> happened?

Good question and I've been thinking about that for a while.

Generally, the main reason for resuming is that on some platforms devices are
automatically powered up by firmware and in those cases it's better to
resume them (to make the runtime PM status reflect the physical state) and
suspend again later.

Subsystems that need to do that generally know which devices are affected,
and that's what I was talking about in the most recent reply to Alan:

http://marc.info/?l=linux-pm&m=139967477806094&w=4

Currently, I think, there are two options on the table really.

 1. Do more or less what https://patchwork.kernel.org/patch/4139181/ does
    with a modification to check that ->suspend() doesn't "cheat" (by setting
    the flag that had been unset before it was called).  The subsystem's
    ->resume() would then decide what to do with the device (resume it or
    leave it suspended).

 2. Do what Alan was suggesting, that is set the flag in ->prepare() and
    make the PM core skip *all* of the system suspend/resume callbacks
    for devices with that flag set and let the ->complete() callback
    decide what to do with the device.

I'm leaning a bit towards 2, but still considering 1 too.

Thanks!


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* Re: [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-09 22:49                                                                             ` Rafael J. Wysocki
@ 2014-05-11 16:46                                                                               ` Alan Stern
  2014-05-13  0:51                                                                                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 78+ messages in thread
From: Alan Stern @ 2014-05-11 16:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Sat, 10 May 2014, Rafael J. Wysocki wrote:

> On Thursday, May 08, 2014 09:52:18 PM Alan Stern wrote:
> > On Thu, 8 May 2014, Rafael J. Wysocki wrote:
> > 
> > > Well, no.
> > > 
> > > The reason why that doesn't work is because ->prepare() callbacks are
> > > executed in the reverse order, so the parent's ones will be run before
> > > the ->prepare() of the children.  Thus if ->prepare() sets the flag
> > > with the expectation that ->suspend() (and the subsequent callbacks)
> > > won't be executed, that expectation may not be met actually.
> > 
> > That's true also if the flag gets set in ->suspend(), isn't it?  A
> > driver may set direct_resume in its ->suspend() callback, expecting
> > that the subsequent callbacks won't be executed.  But if a descendant
> > hasn't also set its flag then the callbacks _will_ be executed.
> 
> No, that's not possible with the current patch, because __device_suspend() is
> executed for descendants first and then for ancestors and it clears
> direct_resume for the parents of devices that don't have it set.  This means
> that the ancestor's ->suspend() will see the flag clear if it is unset for
> any of its descendants.
> 
> IOW, the only case in which the ancestor's ->suspend() sees the flag set is
> when it has been set for all of its descendants.  Thus, if it leaves the
> flag set, the late/early and noirq callbacks won't be executed for it.
> 
> Now, there is a reason for concern in that, because ->suspend() may set the
> flag as a result of an error and that may lead to unexpected consequences.

Ah, my mistake; I should have read the patch more carefully.  I didn't
realize that your plan was for subsystems/drivers to set the flag
during ->prepare() and then clear it (or leave it set) during
->suspend().

> Then the parent will have direct_resume unset.  That is not a concern.
> The only concern to me is possible errors in ->suspend() setting the
> flag when it shouldn't.

So now one question is: Why would a subsystem or driver want to clear a
flag that it had set earlier?  I can't think of any good reasons.  The
only obvious possibility would be if the wakeup requirements got
changed between ->prepare() and ->suspend(), but that should never
happen because wakeup settings are changed by userspace and userspace
will be frozen.

Another question is: Does a subsystem or driver need to know if the
original flag setting couldn't be honored?  Again, I don't think it is 
necessary to call ->suspend() just for this reason.  More precisely, I 
think it will be good enough to call ->suspend() when the flag is clear 
(either because a descendant device didn't set its flag or because the 
device is no longer in runtime suspend); if the flag is set then there 
is no reason to call ->suspend().  The subsystem can assume that 
->suspend() won't be called; then if it does get called, the subsystem 
will realize something has changed.

Thus, a suitable algorithm now appears to be:

	Have subsystems/drivers set the flag during ->prepare().  They
	don't even have to check if the device is runtime-suspended;
	if it isn't then the PM core will turn off the flag later.

	In __device_suspend(), before invoking the ->suspend() callback,
	check the flag.  If it is still set and if the device is
	runtime-suspended (a barrier may be necessary here), skip 
	->suspend() and the following callbacks.  Otherwise clear the
	parent's flag and proceed as usual.
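
As a rough userspace illustration of the second step (a sketch, not a
patch): struct skip_dev and skip_device_suspend() are hypothetical names,
the suspend_calls counter stands in for actually invoking ->suspend(),
and the barrier mentioned above is left out because the model is
single-threaded.

```c
/* Userspace model only; every name here is hypothetical, not kernel API. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct skip_dev {
	struct skip_dev *parent;  /* NULL for a root device */
	bool flag;                /* set by the subsystem/driver in ->prepare() */
	bool runtime_suspended;   /* models the runtime PM status */
	int suspend_calls;        /* counts "invocations" of ->suspend() */
};

/*
 * Returns true when ->suspend() and the following callbacks would be
 * skipped for this device.  Otherwise the device's own flag and its
 * parent's flag are cleared and the callback is counted as invoked.
 */
bool skip_device_suspend(struct skip_dev *dev)
{
	assert(dev != NULL);
	if (dev->flag && dev->runtime_suspended)
		return true;

	dev->flag = false;        /* the core turns a stale flag off */
	if (dev->parent)
		dev->parent->flag = false;
	dev->suspend_calls++;
	return false;
}
```

Calling this for a child that is no longer runtime-suspended and then for
its parent shows the intended effect: neither device is skipped, even
though both had set the flag during ->prepare().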

> > Several of these questions are a lot easier to answer if the flag gets 
> > set during ->prepare() rather than ->suspend().
> 
> I agree with that, but I have one concern about this approach.  Namely,
> in that case the PM core has to use pm_runtime_resume() or equivalent to
> resume devices with the flag set during the device resume stage.  Now,
> in the next step we may want to leave certain devices suspended at that
> point and the PM core has no way to tell which ones.  Also subsystems
> don't really have a chance to tell it about that (they would need to
> know in advance during ->prepare(), which is kind of unrealistic, or
> perhaps it isn't).
> 
> However, if ->resume() is called for devices with the flag set, like in
> my most recent patch, the subsystem may decide not to resume the device
> if it knows enough about it.
> 
> This pretty much is my only concern here, so I'm open to ideas how to deal
> with leaving devices suspended (if possible) during the device resume stage. :-)

This is a good question.  I'm not sure of the best answer at the 
moment.

> For one, postponing the resume to ->complete() is an option, but it will have
> to be done with care, because the ->complete() callbacks are executed
> sequentially, so calling pm_runtime_resume() from there is rather out of the
> question.

Calling pm_request_resume() would be okay, though.

There's another aspect to this we need to consider: hibernation.  I'm
quite sure we don't want to come out of hibernation thinking that
devices are still in their runtime-suspended states.  Skipping the
callbacks for freeze and thaw would be all right in principle, and
maybe even for poweroff, but not for restore.  And of course, during
the restore stages, the last thing subsystems and drivers will remember
happening is freeze -- which might mean you shouldn't skip the freeze
callbacks either.

Alan Stern


* Re: [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-10  1:38                                                                 ` Rafael J. Wysocki
@ 2014-05-12 16:33                                                                   ` Kevin Hilman
  0 siblings, 0 replies; 78+ messages in thread
From: Kevin Hilman @ 2014-05-12 16:33 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Alan Stern, Linux PM list, Mika Westerberg, Aaron Lu,
	ACPI Devel Maling List, LKML

"Rafael J. Wysocki" <rjw@rjwysocki.net> writes:

> On Friday, May 09, 2014 03:48:21 PM Kevin Hilman wrote:
>> "Rafael J. Wysocki" <rjw@rjwysocki.net> writes:
>> 
>> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> >
>> > Currently, some subsystems (e.g. PCI and the ACPI PM domain) have to
>> > resume all runtime-suspended devices during system suspend, mostly
>> > because those devices may need to be reprogrammed due to different
>> > wakeup settings for system sleep and for runtime PM.
>> >
>> > For some devices, though, it's OK to remain in runtime suspend 
>> > throughout a complete system suspend/resume cycle (if the device was in
>> > runtime suspend at the start of the cycle).  We would like to do this
>> > whenever possible, to avoid the overhead of extra power-up and power-down
>> > events.
>> >
>> > However, problems may arise because the device's descendants may require
>> > it to be at full power at various points during the cycle.  Therefore the
>> > most straightforward way to do this safely is if the device and all its
>> > descendants can remain runtime suspended until the resume stage of system
>> > resume.
>> >
>> > To this end, introduce dev->power.leave_runtime_suspended.
>> > If a subsystem or driver sets this flag during the ->prepare() callback,
>> > and if the flag is set in all of the device's descendants, and if the
>> > device is still in runtime suspend at the beginning of the ->suspend()
>> > callback, that callback is allowed to return 0 without clearing
>> > power.leave_runtime_suspended and without changing the state of the
>> > device, unless the current state of the device is not appropriate for
>> > the upcoming system sleep state (for example, the device is supposed to
>> > wake up the system from that state and its current wakeup settings are
>> > not suitable for that).  Then, the PM core will not invoke the device's
>> > ->suspend_late(), ->suspend_noirq(), ->resume_noirq(), ->resume_early(), or
>> > ->resume() callbacks.  
>> 
>> Up to here, this sounds great.
>> 
>> > Instead, it will invoke ->runtime_resume() during the device resume
>> > stage of system resume.
>> 
>> But this part I'm not fully following...
>
> You're not looking at the most recent one. :-)

Sorry about that, I haven't been able to keep up with the versions.

> Please look here: https://patchwork.kernel.org/patch/4139181/

OK.  

>> > By leaving this flag set after ->suspend(), a driver or subsystem tells
>> > the PM core that the device is runtime suspended, it is in a suitable
>> > state for system suspend (for example, the wakeup setting does not
>> > need to be changed), and it does not need to return to full
>> > power until the resume stage.
>> 
>> But taking this "leave runtime suspended" idea to the next logical step,
>> why would/should a device need to return to full power at the ->resume()
>> stage, especially when it wasn't at full power when ->suspend()
>> happened?
>
> Good question and I've been thinking about that for a while.
>
> Generally, the main reason for resuming is that on some platforms devices are
> automatically powered up by firmware and in those cases it's better to
> resume them (to make the runtime PM status reflect the physical state) and
> suspend again later.
>
> Subsystems that need to do that generally know which devices are affected,
> and that's what I was talking about in the most recent reply to Alan:
>
> http://marc.info/?l=linux-pm&m=139967477806094&w=4
>
> Currently, I think, there are two options on the table really.
>
>  1. Do more or less what https://patchwork.kernel.org/patch/4139181/ does
>     with a modification to check that ->suspend() doesn't "cheat" (by setting
>     the flag that had been unset before it was called).  The subsystem's
>     ->resume() would then decide what to do with the device (resume it or
>     leave it suspended).
>
>  2. Do what Alan was suggesting, that is set the flag in ->prepare() and
>     make the PM core skip *all* of the system suspend/resume callbacks
>     for devices with that flag set and let the ->complete() callback
>     decide what to do with the device.
>
> I'm leaning a bit towards 2, but still considering 1 too.

If it matters, I have a slight preference for 2 also, though as long as
the subsystem/device gets to decide whether to resume, I think I'm OK
with either approach.

Kevin



* Re: [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices
  2014-05-11 16:46                                                                               ` Alan Stern
@ 2014-05-13  0:51                                                                                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 78+ messages in thread
From: Rafael J. Wysocki @ 2014-05-13  0:51 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM list, Mika Westerberg, Aaron Lu, ACPI Devel Maling List, LKML

On Sunday, May 11, 2014 12:46:10 PM Alan Stern wrote:
> On Sat, 10 May 2014, Rafael J. Wysocki wrote:
> 
> > On Thursday, May 08, 2014 09:52:18 PM Alan Stern wrote:
> > > On Thu, 8 May 2014, Rafael J. Wysocki wrote:
> > > 
> > > > Well, no.
> > > > 
> > > > The reason why that doesn't work is because ->prepare() callbacks are
> > > > executed in the reverse order, so the parent's ones will be run before
> > > > the ->prepare() of the children.  Thus if ->prepare() sets the flag
> > > > with the expectation that ->suspend() (and the subsequent callbacks)
> > > > won't be executed, that expectation may not be met actually.
> > > 
> > > That's true also if the flag gets set in ->suspend(), isn't it?  A
> > > driver may set direct_resume in its ->suspend() callback, expecting
> > > that the subsequent callbacks won't be executed.  But if a descendant
> > > hasn't also set its flag then the callbacks _will_ be executed.
> > 
> > No, that's not possible with the current patch, because __device_suspend() is
> > executed for descendants first and then for ancestors and it clears
> > direct_resume for the parents of devices that don't have it set.  This means
> > that the ancestor's ->suspend() will see the flag clear if it is unset for
> > any of its descendants.
> > 
> > IOW, the only case in which the ancestor's ->suspend() sees the flag set is
> > when it has been set for all of its descendants.  Thus, if it leaves the
> > flag set, the late/early and noirq callbacks won't be executed for it.
> > 
> > Now, there is a reason for concern in that, because ->suspend() may set the
> > flag as a result of an error and that may lead to unexpected consequences.
> 
> Ah, my mistake; I should have read the patch more carefully.  I didn't
> realize that your plan was for subsystems/drivers to set the flag
> during ->prepare() and then clear it (or leave it set) during
> ->suspend().
> 
> > Then the parent will have direct_resume unset.  That is not a concern.
> > The only concern to me is possible errors in ->suspend() setting the
> > flag when it shouldn't.
> 
> So now one question is: Why would a subsystem or driver want to clear a
> flag that it had set earlier?  I can't think of any good reasons.  The
> only obvious possibility would be if the wakeup requirements got
> changed between ->prepare() and ->suspend(), but that should never
> happen because wakeup settings are changed by userspace and userspace
> will be frozen.
> 
> Another question is: Does a subsystem or driver need to know if the
> original flag setting couldn't be honored?  Again, I don't think it is 
> necessary to call ->suspend() just for this reason.  More precisely, I 
> think it will be good enough to call ->suspend() when the flag is clear 
> (either because a descendant device didn't set its flag or because the 
> device is no longer in runtime suspend); if the flag is set then there 
> is no reason to call ->suspend().  The subsystem can assume that 
> ->suspend() won't be called; then if it does get called, the subsystem 
> will realize something has changed.
> 
> Thus, a suitable algorithm now appears to be:
> 
> 	Have subsystems/drivers set the flag during ->prepare().  They
> 	don't even have to check if the device is runtime-suspended;
> 	if it isn't then the PM core will turn off the flag later.
> 
> 	In __device_suspend(), before invoking the ->suspend() callback,
> 	check the flag.  If it is still set and if the device is
> 	runtime-suspended (a barrier may be necessary here), skip 
> 	->suspend() and the following callbacks.  Otherwise clear the
> 	parent's flag and proceed as usual.
> 
> > > Several of these questions are a lot easier to answer if the flag gets 
> > > set during ->prepare() rather than ->suspend().

I actually decided to go that way with one difference.  I think it's better
to make the PM core own the new flag, so that bus types/drivers don't have
to set/clear it, so I went back to my very first idea about possibly
returning positive values from ->prepare().

The idea is this:

 - If ->prepare() returns a positive number, that means "this device is
   runtime-suspended and you can leave it like that if you do the same
   thing for all of its descendants".

 - If that happens, the PM core sets the new flag for the device in
   question *if* the device is indeed runtime-suspended *and* *if*
   the transition is a suspend (and not hibernation, for example).
   Otherwise, it clears the flag for the device.  All of that happens in
   device_prepare().

 - In __device_suspend() the PM core clears the flag for the device's
   parent if it is clear for the device to ensure that the flag will only
   be set for a device if it is also set for all of its descendants.

 - The PM core skips ->suspend/late/noirq and ->resume/early/noirq for all
   devices having the flag set, so the flag can be called "direct_complete"
   as it causes the PM core to go directly to the ->complete() callback
   when set.

 - The ->complete() callback has to check direct_complete if ->prepare()
   returned a positive number previously and is responsible for further
   handling of the device.
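
To make the ->prepare() part of the outline concrete, here is a small
userspace sketch.  It is not the patch itself: struct dc_dev, enum
dc_event and dc_device_prepare() are hypothetical names, and clearing the
flag when ->prepare() fails is an assumption that the outline does not
spell out.

```c
/* Userspace model only; every name here is hypothetical, not kernel API. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum dc_event { DC_EVENT_SUSPEND, DC_EVENT_HIBERNATE };

struct dc_dev {
	bool runtime_suspended;              /* models the runtime PM status */
	bool direct_complete;                /* flag owned by the "core" */
	int (*prepare)(struct dc_dev *dev);  /* models ->prepare() */
};

/*
 * The core, not the callback, decides the flag: it is set only if
 * ->prepare() returned a positive value, the device really is
 * runtime-suspended and the transition is a suspend.
 */
int dc_device_prepare(struct dc_dev *dev, enum dc_event event)
{
	int ret;

	assert(dev != NULL && dev->prepare != NULL);
	ret = dev->prepare(dev);
	if (ret < 0) {
		dev->direct_complete = false;  /* assumed behaviour on error */
		return ret;
	}
	dev->direct_complete = ret > 0 && dev->runtime_suspended &&
			       event == DC_EVENT_SUSPEND;
	return 0;
}

/* "This device may stay runtime-suspended." */
int dc_prepare_positive(struct dc_dev *dev)
{
	(void)dev;
	return 1;
}

/* Ordinary success: the device wants the normal callbacks. */
int dc_prepare_zero(struct dc_dev *dev)
{
	(void)dev;
	return 0;
}
```

Since only the core writes the flag in this model, a callback cannot turn
it on behind the core's back, which is the point of having the core own it.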

> > I agree with that, but I have one concern about this approach.  Namely,
> > in that case the PM core has to use pm_runtime_resume() or equivalent to
> > resume devices with the flag set during the device resume stage.  Now,
> > in the next step we may want to leave certain devices suspended at that
> > point and the PM core has no way to tell which ones.  Also subsystems
> > don't really have a chance to tell it about that (they would need to
> > know in advance during ->prepare(), which is kind of unrealistic, or
> > perhaps it isn't).
> > 
> > However, if ->resume() is called for devices with the flag set, like in
> > my most recent patch, the subsystem may decide not to resume the device
> > if it knows enough about it.
> > 
> > This pretty much is my only concern here, so I'm open to ideas how to deal
> > with leaving devices suspended (if possible) during the device resume stage. :-)
> 
> This is a good question.  I'm not sure of the best answer at the 
> moment.
> 
> > For one, postponing the resume to ->complete() is an option, but it will have
> > to be done with care, because the ->complete() callbacks are executed
> > sequentially, so calling pm_runtime_resume() from there is rather out of the
> > question.
> 
> Calling pm_request_resume() would be okay, though.
> 
> There's another aspect to this we need to consider: hibernation.  I'm
> quite sure we don't want to come out of hibernation thinking that
> devices are still in their runtime-suspended states.  Skipping the
> callbacks for freeze and thaw would be all right in principle, and
> maybe even for poweroff, but not for restore.  And of course, during
> the restore stages, the last thing subsystems and drivers will remember
> happening is freeze -- which might mean you shouldn't skip the freeze
> callbacks either.

I agree.  I think the outline above should address this too.

I'm going to send a new version of the patches implementing the idea above.

Rafael



end of thread, other threads:[~2014-05-13  0:51 UTC | newest]

Thread overview: 78+ messages
2014-01-14 23:12 [RFC][PATCH 0/3] PM: Mechanism to avoid resuming runtime-suspended devices during system suspend Rafael J. Wysocki
2014-01-14 23:13 ` [RFC][PATCH 1/3] PM / sleep: Flag to avoid executing suspend callbacks for devices Rafael J. Wysocki
2014-01-14 23:14 ` [RFC][PATCH 2/3] PM / runtime: Routine for checking device status during system suspend Rafael J. Wysocki
2014-01-16 13:32   ` Mika Westerberg
2014-01-16 16:07     ` [Update][RFC][PATCH " Rafael J. Wysocki
2014-01-14 23:16 ` [RFC][PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain " Rafael J. Wysocki
2014-01-15 13:57   ` [Update][RFC][PATCH " Rafael J. Wysocki
2014-02-16 23:49 ` [RFC][PATCH 0/3] PM: Mechanism to avoid resuming runtime-suspended devices " Rafael J. Wysocki
2014-02-16 23:50   ` [PATCH 1/3] PM / sleep: New flag to speed up suspend-resume of suspended devices Rafael J. Wysocki
2014-02-18 12:59     ` Ulf Hansson
2014-02-18 13:25       ` Rafael J. Wysocki
2014-02-19 17:01     ` Alan Stern
2014-02-20  1:23       ` Rafael J. Wysocki
2014-02-20  1:42         ` Rafael J. Wysocki
2014-02-20 17:03         ` Alan Stern
2014-02-24  0:00           ` Rafael J. Wysocki
2014-02-24 19:36             ` Alan Stern
2014-02-25  0:07               ` Rafael J. Wysocki
2014-02-25 17:08                 ` Alan Stern
2014-02-25 23:56                   ` Rafael J. Wysocki
2014-02-26 16:49                     ` Alan Stern
2014-02-26 21:44                       ` Rafael J. Wysocki
2014-02-26 22:17                         ` Alan Stern
2014-02-26 23:13                           ` Rafael J. Wysocki
2014-02-27 15:02                             ` Alan Stern
2014-04-24 22:36                               ` [RFC][PATCH 0/3] PM: Mechanism to avoid resuming runtime-suspended devices during system suspend, v2 Rafael J. Wysocki
2014-04-24 22:37                                 ` [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices Rafael J. Wysocki
2014-05-01 21:39                                   ` Alan Stern
2014-05-01 23:15                                     ` Rafael J. Wysocki
2014-05-01 23:36                                       ` Rafael J. Wysocki
2014-05-02  0:04                                         ` Rafael J. Wysocki
2014-05-02 15:41                                           ` Rafael J. Wysocki
2014-05-02 18:44                                             ` Alan Stern
2014-05-05  0:09                                               ` Rafael J. Wysocki
2014-05-05 15:46                                                 ` Alan Stern
2014-05-06  1:31                                                   ` Rafael J. Wysocki
2014-05-06 19:31                                                     ` Alan Stern
2014-05-07  0:36                                                       ` Rafael J. Wysocki
2014-05-07 15:43                                                         ` Alan Stern
2014-05-07 23:27                                                           ` [RFC][PATCH 0/3] (was: Re: [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices) Rafael J. Wysocki
2014-05-07 23:29                                                             ` [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices Rafael J. Wysocki
2014-05-08  7:49                                                               ` Ulf Hansson
2014-05-08 10:53                                                                 ` Rafael J. Wysocki
2014-05-08 10:59                                                                   ` Ulf Hansson
2014-05-08 11:44                                                                     ` Rafael J. Wysocki
2014-05-08 12:25                                                                       ` Ulf Hansson
2014-05-08 20:02                                                                         ` Rafael J. Wysocki
2014-05-08 14:36                                                                     ` Alan Stern
2014-05-08 14:57                                                               ` Alan Stern
2014-05-08 20:17                                                                 ` Rafael J. Wysocki
2014-05-08 21:03                                                                   ` Rafael J. Wysocki
2014-05-08 21:20                                                                     ` Alan Stern
2014-05-08 21:42                                                                       ` Rafael J. Wysocki
2014-05-08 21:50                                                                         ` Rafael J. Wysocki
2014-05-08 22:28                                                                           ` [RFC][PATCH 0/3] (was: Re: PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices) Rafael J. Wysocki
2014-05-08 22:41                                                                             ` [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices Rafael J. Wysocki
2014-05-09  7:23                                                                               ` Ulf Hansson
2014-05-09 11:33                                                                                 ` Rafael J. Wysocki
2014-05-08 22:41                                                                             ` [RFC][PATCH 2/3] PM / runtime: Routine for checking device status during system suspend Rafael J. Wysocki
2014-05-08 22:42                                                                             ` [RFC][PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain " Rafael J. Wysocki
2014-05-09  1:52                                                                           ` [RFC][PATCH 1/3] PM / sleep: Flag to speed up suspend-resume of runtime-suspended devices Alan Stern
2014-05-09 22:49                                                                             ` Rafael J. Wysocki
2014-05-11 16:46                                                                               ` Alan Stern
2014-05-13  0:51                                                                                 ` Rafael J. Wysocki
2014-05-08 21:08                                                                   ` Alan Stern
2014-05-09 22:48                                                               ` Kevin Hilman
2014-05-10  1:38                                                                 ` Rafael J. Wysocki
2014-05-12 16:33                                                                   ` Kevin Hilman
2014-05-07 23:31                                                             ` [Resend][PATCH 2/3] PM / runtime: Routine for checking device status during system suspend Rafael J. Wysocki
2014-05-07 23:33                                                             ` [RFC][PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain " Rafael J. Wysocki
2014-05-08 14:59                                                               ` Alan Stern
2014-05-08 19:40                                                                 ` Rafael J. Wysocki
2014-05-02 16:12                                         ` [RFC][PATCH 1/3] PM / sleep: Flags to speed up suspend-resume of runtime-suspended devices Alan Stern
2014-04-24 22:39                                 ` [RFC][PATCH 2/3][Resend] PM / runtime: Routine for checking device status during system suspend Rafael J. Wysocki
2014-04-25 11:28                                   ` Ulf Hansson
2014-04-24 22:40                                 ` [RFC][PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain " Rafael J. Wysocki
2014-02-16 23:51   ` [PATCH 2/3][Resend] PM / runtime: Routine for checking device status " Rafael J. Wysocki
2014-02-16 23:52   ` [PATCH 3/3] ACPI / PM: Avoid resuming devices in ACPI PM domain " Rafael J. Wysocki
