All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 0/2] PM: runtime: Update device status before letting suppliers suspend (another take)
@ 2021-03-18 18:06 Rafael J. Wysocki
  2021-03-18 18:09 ` [PATCH v1 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend" Rafael J. Wysocki
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2021-03-18 18:06 UTC (permalink / raw)
  To: Linux PM, elaine.zhang; +Cc: LKML, Ulf Hansson

Hi All,

The previous attempt to address the issue tackled by this series, commit
44cc89f76464 ("PM: runtime: Update device status before letting suppliers
suspend") was incorrect, because it introduced a rather nasty race condition
into __rpm_callback(), so let's revert it (patch [1/2]).

Instead, let's avoid suspending the suppliers immediately after dropping the
PM-runtime references to them and do that later, when the status of the
consumer has changed to RPM_SUSPENDED.

Please see the patch changelogs for details.

Elaine, please test this series on the system where you saw the original
problem.

Thanks!




^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v1 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend"
  2021-03-18 18:06 [PATCH v1 0/2] PM: runtime: Update device status before letting suppliers suspend (another take) Rafael J. Wysocki
@ 2021-03-18 18:09 ` Rafael J. Wysocki
  2021-03-19 13:29   ` Ulf Hansson
  2021-03-18 18:15 ` [PATCH v1 2/2] PM: runtime: Defer suspending suppliers Rafael J. Wysocki
  2021-03-19 14:42 ` [PATCH v2 0/2] PM: runtime: Update device status before letting suppliers suspend (another take) Rafael J. Wysocki
  2 siblings, 1 reply; 10+ messages in thread
From: Rafael J. Wysocki @ 2021-03-18 18:09 UTC (permalink / raw)
  To: Linux PM, elaine.zhang; +Cc: LKML, Ulf Hansson

From: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>

Revert commit 44cc89f76464 ("PM: runtime: Update device status
before letting suppliers suspend") that introduced a race condition
into __rpm_callback() which allowed a concurrent rpm_resume() to
run and resume the device prematurely after its status had been
changed to RPM_SUSPENDED by __rpm_callback().

Fixes: 44cc89f76464 ("PM: runtime: Update device status before letting suppliers suspend")
Link: https://lore.kernel.org/linux-pm/24dfb6fc-5d54-6ee2-9195-26428b7ecf8a@intel.com/
Reported by: Adrian Hunter <adrian.hunter@intel.com>
Cc: 4.10+ <stable@vger.kernel.org> # 4.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/power/runtime.c | 62 +++++++++++++++---------------------
 1 file changed, 25 insertions(+), 37 deletions(-)

diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 18b82427d0cb..a46a7e30881b 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -325,22 +325,22 @@ static void rpm_put_suppliers(struct device *dev)
 static int __rpm_callback(int (*cb)(struct device *), struct device *dev)
 	__releases(&dev->power.lock) __acquires(&dev->power.lock)
 {
-	bool use_links = dev->power.links_count > 0;
-	bool get = false;
 	int retval, idx;
-	bool put;
+	bool use_links = dev->power.links_count > 0;
 
 	if (dev->power.irq_safe) {
 		spin_unlock(&dev->power.lock);
-	} else if (!use_links) {
-		spin_unlock_irq(&dev->power.lock);
 	} else {
-		get = dev->power.runtime_status == RPM_RESUMING;
-
 		spin_unlock_irq(&dev->power.lock);
 
-		/* Resume suppliers if necessary. */
-		if (get) {
+		/*
+		 * Resume suppliers if necessary.
+		 *
+		 * The device's runtime PM status cannot change until this
+		 * routine returns, so it is safe to read the status outside of
+		 * the lock.
+		 */
+		if (use_links && dev->power.runtime_status == RPM_RESUMING) {
 			idx = device_links_read_lock();
 
 			retval = rpm_get_suppliers(dev);
@@ -355,36 +355,24 @@ static int __rpm_callback(int (*cb)(struct device *), struct device *dev)
 
 	if (dev->power.irq_safe) {
 		spin_lock(&dev->power.lock);
-		return retval;
-	}
-
-	spin_lock_irq(&dev->power.lock);
-
-	if (!use_links)
-		return retval;
-
-	/*
-	 * If the device is suspending and the callback has returned success,
-	 * drop the usage counters of the suppliers that have been reference
-	 * counted on its resume.
-	 *
-	 * Do that if the resume fails too.
-	 */
-	put = dev->power.runtime_status == RPM_SUSPENDING && !retval;
-	if (put)
-		__update_runtime_status(dev, RPM_SUSPENDED);
-	else
-		put = get && retval;
-
-	if (put) {
-		spin_unlock_irq(&dev->power.lock);
-
-		idx = device_links_read_lock();
+	} else {
+		/*
+		 * If the device is suspending and the callback has returned
+		 * success, drop the usage counters of the suppliers that have
+		 * been reference counted on its resume.
+		 *
+		 * Do that if resume fails too.
+		 */
+		if (use_links
+		    && ((dev->power.runtime_status == RPM_SUSPENDING && !retval)
+		    || (dev->power.runtime_status == RPM_RESUMING && retval))) {
+			idx = device_links_read_lock();
 
-fail:
-		rpm_put_suppliers(dev);
+ fail:
+			rpm_put_suppliers(dev);
 
-		device_links_read_unlock(idx);
+			device_links_read_unlock(idx);
+		}
 
 		spin_lock_irq(&dev->power.lock);
 	}
-- 
2.26.2





^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v1 2/2] PM: runtime: Defer suspending suppliers
  2021-03-18 18:06 [PATCH v1 0/2] PM: runtime: Update device status before letting suppliers suspend (another take) Rafael J. Wysocki
  2021-03-18 18:09 ` [PATCH v1 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend" Rafael J. Wysocki
@ 2021-03-18 18:15 ` Rafael J. Wysocki
  2021-03-19 13:29   ` Ulf Hansson
  2021-03-19 14:42 ` [PATCH v2 0/2] PM: runtime: Update device status before letting suppliers suspend (another take) Rafael J. Wysocki
  2 siblings, 1 reply; 10+ messages in thread
From: Rafael J. Wysocki @ 2021-03-18 18:15 UTC (permalink / raw)
  To: Linux PM, elaine.zhang; +Cc: LKML, Ulf Hansson

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Because the PM-runtime status of the device is not updated in
__rpm_callback(), attempts to suspend the suppliers of the given
device triggered by the rpm_put_suppliers() call in there may fail.

To fix this (1) modify __rpm_callback() to avoid attempting to
actually suspend the suppliers, but only decrease their PM-runtime
usage counters and (2) make rpm_suspend() try to suspend the suppliers
after changing the device's PM-runtime status, in analogy with the
handling of the device's parent.

Link: https://lore.kernel.org/linux-pm/CAPDyKFqm06KDw_p8WXsM4dijDbho4bb6T4k50UqqvR1_COsp8g@mail.gmail.com/
Fixes: 21d5c57b3726 ("PM / runtime: Use device links")
Reported-by: elaine.zhang <zhangqing@rock-chips.com>
Diagnosed-by: Ulf Hansson <ulf.hansson@linaro.org> 
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/power/runtime.c |   45 +++++++++++++++++++++++++++++++++++++------
 1 file changed, 39 insertions(+), 6 deletions(-)

Index: linux-pm/drivers/base/power/runtime.c
===================================================================
--- linux-pm.orig/drivers/base/power/runtime.c
+++ linux-pm/drivers/base/power/runtime.c
@@ -305,7 +305,7 @@ static int rpm_get_suppliers(struct devi
 	return 0;
 }
 
-static void rpm_put_suppliers(struct device *dev)
+static void __rpm_put_suppliers(struct device *dev, bool try_to_suspend)
 {
 	struct device_link *link;
 
@@ -313,10 +313,30 @@ static void rpm_put_suppliers(struct dev
 				device_links_read_lock_held()) {
 
 		while (refcount_dec_not_one(&link->rpm_active))
-			pm_runtime_put(link->supplier);
+			pm_runtime_put_noidle(link->supplier);
+
+		if (try_to_suspend)
+			pm_request_idle(link->supplier);
 	}
 }
 
+static void rpm_put_suppliers(struct device *dev)
+{
+	__rpm_put_suppliers(dev, true);
+}
+
+static void rpm_try_to_suspend_suppliers(struct device *dev)
+{
+	struct device_link *link;
+	int idx = device_links_read_lock();
+
+	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
+				device_links_read_lock_held())
+		pm_request_idle(link->supplier);
+
+	device_links_read_unlock(idx);
+}
+
 /**
  * __rpm_callback - Run a given runtime PM callback for a given device.
  * @cb: Runtime PM callback to run.
@@ -344,8 +364,10 @@ static int __rpm_callback(int (*cb)(stru
 			idx = device_links_read_lock();
 
 			retval = rpm_get_suppliers(dev);
-			if (retval)
+			if (retval) {
+				rpm_put_suppliers(dev);
 				goto fail;
+			}
 
 			device_links_read_unlock(idx);
 		}
@@ -368,9 +390,9 @@ static int __rpm_callback(int (*cb)(stru
 		    || (dev->power.runtime_status == RPM_RESUMING && retval))) {
 			idx = device_links_read_lock();
 
- fail:
-			rpm_put_suppliers(dev);
+			__rpm_put_suppliers(dev, false);
 
+fail:
 			device_links_read_unlock(idx);
 		}
 
@@ -642,8 +664,11 @@ static int rpm_suspend(struct device *de
 		goto out;
 	}
 
+	if (dev->power.irq_safe)
+		goto out;
+
 	/* Maybe the parent is now able to suspend. */
-	if (parent && !parent->power.ignore_children && !dev->power.irq_safe) {
+	if (parent && !parent->power.ignore_children) {
 		spin_unlock(&dev->power.lock);
 
 		spin_lock(&parent->power.lock);
@@ -652,6 +677,14 @@ static int rpm_suspend(struct device *de
 
 		spin_lock(&dev->power.lock);
 	}
+	/* Maybe the suppliers are now able to suspend. */
+	if (dev->power.links_count > 0) {
+		spin_unlock(&dev->power.lock);
+
+		rpm_try_to_suspend_suppliers(dev);
+
+		spin_lock(&dev->power.lock);
+	}
 
  out:
 	trace_rpm_return_int_rcuidle(dev, _THIS_IP_, retval);




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 2/2] PM: runtime: Defer suspending suppliers
  2021-03-18 18:15 ` [PATCH v1 2/2] PM: runtime: Defer suspending suppliers Rafael J. Wysocki
@ 2021-03-19 13:29   ` Ulf Hansson
  2021-03-19 13:48     ` Rafael J. Wysocki
  0 siblings, 1 reply; 10+ messages in thread
From: Ulf Hansson @ 2021-03-19 13:29 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM, elaine.zhang, LKML

On Thu, 18 Mar 2021 at 19:15, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Because the PM-runtime status of the device is not updated in
> __rpm_callback(), attempts to suspend the suppliers of the given
> device triggered by the rpm_put_suppliers() call in there may fail.
>
> To fix this (1) modify __rpm_callback() to avoid attempting to
> actually suspend the suppliers, but only decrease their PM-runtime
> usage counters and (2) make rpm_suspend() try to suspend the suppliers
> after changing the device's PM-runtime status, in analogy with the
> handling of the device's parent.
>
> Link: https://lore.kernel.org/linux-pm/CAPDyKFqm06KDw_p8WXsM4dijDbho4bb6T4k50UqqvR1_COsp8g@mail.gmail.com/
> Fixes: 21d5c57b3726 ("PM / runtime: Use device links")
> Reported-by: elaine.zhang <zhangqing@rock-chips.com>
> Diagnosed-by: Ulf Hansson <ulf.hansson@linaro.org>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Just a minor nitpick, see below. In any case:

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

> ---
>  drivers/base/power/runtime.c |   45 +++++++++++++++++++++++++++++++++++++------
>  1 file changed, 39 insertions(+), 6 deletions(-)
>
> Index: linux-pm/drivers/base/power/runtime.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/runtime.c
> +++ linux-pm/drivers/base/power/runtime.c
> @@ -305,7 +305,7 @@ static int rpm_get_suppliers(struct devi
>         return 0;
>  }
>
> -static void rpm_put_suppliers(struct device *dev)
> +static void __rpm_put_suppliers(struct device *dev, bool try_to_suspend)
>  {
>         struct device_link *link;
>
> @@ -313,10 +313,30 @@ static void rpm_put_suppliers(struct dev
>                                 device_links_read_lock_held()) {
>
>                 while (refcount_dec_not_one(&link->rpm_active))
> -                       pm_runtime_put(link->supplier);
> +                       pm_runtime_put_noidle(link->supplier);
> +
> +               if (try_to_suspend)
> +                       pm_request_idle(link->supplier);
>         }
>  }
>
> +static void rpm_put_suppliers(struct device *dev)
> +{
> +       __rpm_put_suppliers(dev, true);
> +}
> +
> +static void rpm_try_to_suspend_suppliers(struct device *dev)

Maybe "rpm_suspend_suppliers" is sufficient for the name of the
function, but I have no strong opinion.

> +{
> +       struct device_link *link;
> +       int idx = device_links_read_lock();
> +
> +       list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
> +                               device_links_read_lock_held())
> +               pm_request_idle(link->supplier);
> +
> +       device_links_read_unlock(idx);
> +}
> +

[...]

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend"
  2021-03-18 18:09 ` [PATCH v1 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend" Rafael J. Wysocki
@ 2021-03-19 13:29   ` Ulf Hansson
  2021-03-19 13:43     ` Rafael J. Wysocki
  0 siblings, 1 reply; 10+ messages in thread
From: Ulf Hansson @ 2021-03-19 13:29 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM, elaine.zhang, LKML

On Thu, 18 Mar 2021 at 19:15, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>
> From: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
>
> Revert commit 44cc89f76464 ("PM: runtime: Update device status
> before letting suppliers suspend") that introduced a race condition
> into __rpm_callback() which allowed a concurrent rpm_resume() to
> run and resume the device prematurely after its status had been
> changed to RPM_SUSPENDED by __rpm_callback().

Huh, the code path is not entirely easy to follow. :-)

Did you find this by code inspection or did you get an error report?

>
> Fixes: 44cc89f76464 ("PM: runtime: Update device status before letting suppliers suspend")
> Link: https://lore.kernel.org/linux-pm/24dfb6fc-5d54-6ee2-9195-26428b7ecf8a@intel.com/
> Reported by: Adrian Hunter <adrian.hunter@intel.com>
> Cc: 4.10+ <stable@vger.kernel.org> # 4.10+
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

Kind regards
Uffe


> ---
>  drivers/base/power/runtime.c | 62 +++++++++++++++---------------------
>  1 file changed, 25 insertions(+), 37 deletions(-)
>
> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> index 18b82427d0cb..a46a7e30881b 100644
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -325,22 +325,22 @@ static void rpm_put_suppliers(struct device *dev)
>  static int __rpm_callback(int (*cb)(struct device *), struct device *dev)
>         __releases(&dev->power.lock) __acquires(&dev->power.lock)
>  {
> -       bool use_links = dev->power.links_count > 0;
> -       bool get = false;
>         int retval, idx;
> -       bool put;
> +       bool use_links = dev->power.links_count > 0;
>
>         if (dev->power.irq_safe) {
>                 spin_unlock(&dev->power.lock);
> -       } else if (!use_links) {
> -               spin_unlock_irq(&dev->power.lock);
>         } else {
> -               get = dev->power.runtime_status == RPM_RESUMING;
> -
>                 spin_unlock_irq(&dev->power.lock);
>
> -               /* Resume suppliers if necessary. */
> -               if (get) {
> +               /*
> +                * Resume suppliers if necessary.
> +                *
> +                * The device's runtime PM status cannot change until this
> +                * routine returns, so it is safe to read the status outside of
> +                * the lock.
> +                */
> +               if (use_links && dev->power.runtime_status == RPM_RESUMING) {
>                         idx = device_links_read_lock();
>
>                         retval = rpm_get_suppliers(dev);
> @@ -355,36 +355,24 @@ static int __rpm_callback(int (*cb)(struct device *), struct device *dev)
>
>         if (dev->power.irq_safe) {
>                 spin_lock(&dev->power.lock);
> -               return retval;
> -       }
> -
> -       spin_lock_irq(&dev->power.lock);
> -
> -       if (!use_links)
> -               return retval;
> -
> -       /*
> -        * If the device is suspending and the callback has returned success,
> -        * drop the usage counters of the suppliers that have been reference
> -        * counted on its resume.
> -        *
> -        * Do that if the resume fails too.
> -        */
> -       put = dev->power.runtime_status == RPM_SUSPENDING && !retval;
> -       if (put)
> -               __update_runtime_status(dev, RPM_SUSPENDED);
> -       else
> -               put = get && retval;
> -
> -       if (put) {
> -               spin_unlock_irq(&dev->power.lock);
> -
> -               idx = device_links_read_lock();
> +       } else {
> +               /*
> +                * If the device is suspending and the callback has returned
> +                * success, drop the usage counters of the suppliers that have
> +                * been reference counted on its resume.
> +                *
> +                * Do that if resume fails too.
> +                */
> +               if (use_links
> +                   && ((dev->power.runtime_status == RPM_SUSPENDING && !retval)
> +                   || (dev->power.runtime_status == RPM_RESUMING && retval))) {
> +                       idx = device_links_read_lock();
>
> -fail:
> -               rpm_put_suppliers(dev);
> + fail:
> +                       rpm_put_suppliers(dev);
>
> -               device_links_read_unlock(idx);
> +                       device_links_read_unlock(idx);
> +               }
>
>                 spin_lock_irq(&dev->power.lock);
>         }
> --
> 2.26.2
>
>
>
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend"
  2021-03-19 13:29   ` Ulf Hansson
@ 2021-03-19 13:43     ` Rafael J. Wysocki
  0 siblings, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2021-03-19 13:43 UTC (permalink / raw)
  To: Ulf Hansson; +Cc: Rafael J. Wysocki, Linux PM, elaine.zhang, LKML

On Fri, Mar 19, 2021 at 2:31 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Thu, 18 Mar 2021 at 19:15, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >
> > From: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
> >
> > Revert commit 44cc89f76464 ("PM: runtime: Update device status
> > before letting suppliers suspend") that introduced a race condition
> > into __rpm_callback() which allowed a concurrent rpm_resume() to
> > run and resume the device prematurely after its status had been
> > changed to RPM_SUSPENDED by __rpm_callback().
>
> Huh, the code path is not entirely easy to follow. :-)
>
> Did you find this by code inspection or did you get an error report?

There was a bug report that caused me to look at the code once again.

> > Fixes: 44cc89f76464 ("PM: runtime: Update device status before letting suppliers suspend")
> > Link: https://lore.kernel.org/linux-pm/24dfb6fc-5d54-6ee2-9195-26428b7ecf8a@intel.com/
> > Reported by: Adrian Hunter <adrian.hunter@intel.com>
> > Cc: 4.10+ <stable@vger.kernel.org> # 4.10+
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

Thanks!

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 2/2] PM: runtime: Defer suspending suppliers
  2021-03-19 13:29   ` Ulf Hansson
@ 2021-03-19 13:48     ` Rafael J. Wysocki
  0 siblings, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2021-03-19 13:48 UTC (permalink / raw)
  To: Ulf Hansson; +Cc: Rafael J. Wysocki, Linux PM, elaine.zhang, LKML

On Fri, Mar 19, 2021 at 2:30 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Thu, 18 Mar 2021 at 19:15, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Because the PM-runtime status of the device is not updated in
> > __rpm_callback(), attempts to suspend the suppliers of the given
> > device triggered by the rpm_put_suppliers() call in there may fail.
> >
> > To fix this (1) modify __rpm_callback() to avoid attempting to
> > actually suspend the suppliers, but only decrease their PM-runtime
> > usage counters and (2) make rpm_suspend() try to suspend the suppliers
> > after changing the device's PM-runtime status, in analogy with the
> > handling of the device's parent.
> >
> > Link: https://lore.kernel.org/linux-pm/CAPDyKFqm06KDw_p8WXsM4dijDbho4bb6T4k50UqqvR1_COsp8g@mail.gmail.com/
> > Fixes: 21d5c57b3726 ("PM / runtime: Use device links")
> > Reported-by: elaine.zhang <zhangqing@rock-chips.com>
> > Diagnosed-by: Ulf Hansson <ulf.hansson@linaro.org>
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Just a minor nitpick, see below. In any case:
>
> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

Thanks!

>
> > ---
> >  drivers/base/power/runtime.c |   45 +++++++++++++++++++++++++++++++++++++------
> >  1 file changed, 39 insertions(+), 6 deletions(-)
> >
> > Index: linux-pm/drivers/base/power/runtime.c
> > ===================================================================
> > --- linux-pm.orig/drivers/base/power/runtime.c
> > +++ linux-pm/drivers/base/power/runtime.c
> > @@ -305,7 +305,7 @@ static int rpm_get_suppliers(struct devi
> >         return 0;
> >  }
> >
> > -static void rpm_put_suppliers(struct device *dev)
> > +static void __rpm_put_suppliers(struct device *dev, bool try_to_suspend)
> >  {
> >         struct device_link *link;
> >
> > @@ -313,10 +313,30 @@ static void rpm_put_suppliers(struct dev
> >                                 device_links_read_lock_held()) {
> >
> >                 while (refcount_dec_not_one(&link->rpm_active))
> > -                       pm_runtime_put(link->supplier);
> > +                       pm_runtime_put_noidle(link->supplier);
> > +
> > +               if (try_to_suspend)
> > +                       pm_request_idle(link->supplier);
> >         }
> >  }
> >
> > +static void rpm_put_suppliers(struct device *dev)
> > +{
> > +       __rpm_put_suppliers(dev, true);
> > +}
> > +
> > +static void rpm_try_to_suspend_suppliers(struct device *dev)
>
> Maybe "rpm_suspend_suppliers" is sufficient for the name of the
> function, but I have no strong opinion.

OK

In addition to this, spin_unlock_irq()/spin_lock_irq() need to be used
around the call to it in rpm_suspend(), so I'll send a v2.  I guess
that the R-by still applies, though. :-)

> > +{
> > +       struct device_link *link;
> > +       int idx = device_links_read_lock();
> > +
> > +       list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
> > +                               device_links_read_lock_held())
> > +               pm_request_idle(link->supplier);
> > +
> > +       device_links_read_unlock(idx);
> > +}
> > +

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v2 0/2] PM: runtime: Update device status before letting suppliers suspend (another take)
  2021-03-18 18:06 [PATCH v1 0/2] PM: runtime: Update device status before letting suppliers suspend (another take) Rafael J. Wysocki
  2021-03-18 18:09 ` [PATCH v1 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend" Rafael J. Wysocki
  2021-03-18 18:15 ` [PATCH v1 2/2] PM: runtime: Defer suspending suppliers Rafael J. Wysocki
@ 2021-03-19 14:42 ` Rafael J. Wysocki
  2021-03-19 14:47   ` [PATCH v2 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend" Rafael J. Wysocki
  2021-03-19 14:47   ` [PATCH v2 2/2] PM: runtime: Defer suspending suppliers Rafael J. Wysocki
  2 siblings, 2 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2021-03-19 14:42 UTC (permalink / raw)
  To: Linux PM, elaine.zhang; +Cc: LKML, Ulf Hansson

On Thursday, March 18, 2021 7:06:54 PM CET Rafael J. Wysocki wrote:
> Hi All,
> 
> The previous attempt to address the issue tackled by this series, commit
> 44cc89f76464 ("PM: runtime: Update device status before letting suppliers
> suspend") was incorrect, because it introduced a rather nasty race condition
> into __rpm_callback(), so let's revert it (patch [1/2]).
> 
> Instead, let's avoid suspending the suppliers immediately after dropping the
> PM-runtime references to them and do that later, when the status of the
> consumer has changed to RPM_SUSPENDED.
> 
> Please see the patch changelogs for details.
> 
> Elaine, please test this series on the system where you saw the original
> problem.

The above still applies.

The v2 is sent because of some updates in the second patch, plus the tags from Ulf.

Thanks!




^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v2 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend"
  2021-03-19 14:42 ` [PATCH v2 0/2] PM: runtime: Update device status before letting suppliers suspend (another take) Rafael J. Wysocki
@ 2021-03-19 14:47   ` Rafael J. Wysocki
  2021-03-19 14:47   ` [PATCH v2 2/2] PM: runtime: Defer suspending suppliers Rafael J. Wysocki
  1 sibling, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2021-03-19 14:47 UTC (permalink / raw)
  To: Linux PM; +Cc: elaine.zhang, LKML, Ulf Hansson

From: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>

Revert commit 44cc89f76464 ("PM: runtime: Update device status
before letting suppliers suspend") that introduced a race condition
into __rpm_callback() which allowed a concurrent rpm_resume() to
run and resume the device prematurely after its status had been
changed to RPM_SUSPENDED by __rpm_callback().

Fixes: 44cc89f76464 ("PM: runtime: Update device status before letting suppliers suspend")
Link: https://lore.kernel.org/linux-pm/24dfb6fc-5d54-6ee2-9195-26428b7ecf8a@intel.com/
Reported by: Adrian Hunter <adrian.hunter@intel.com>
Cc: 4.10+ <stable@vger.kernel.org> # 4.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
---

v1 -> v2: Added the R-by tag from Ulf.

---
 drivers/base/power/runtime.c | 62 +++++++++++++++---------------------
 1 file changed, 25 insertions(+), 37 deletions(-)

diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 18b82427d0cb..a46a7e30881b 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -325,22 +325,22 @@ static void rpm_put_suppliers(struct device *dev)
 static int __rpm_callback(int (*cb)(struct device *), struct device *dev)
 	__releases(&dev->power.lock) __acquires(&dev->power.lock)
 {
-	bool use_links = dev->power.links_count > 0;
-	bool get = false;
 	int retval, idx;
-	bool put;
+	bool use_links = dev->power.links_count > 0;
 
 	if (dev->power.irq_safe) {
 		spin_unlock(&dev->power.lock);
-	} else if (!use_links) {
-		spin_unlock_irq(&dev->power.lock);
 	} else {
-		get = dev->power.runtime_status == RPM_RESUMING;
-
 		spin_unlock_irq(&dev->power.lock);
 
-		/* Resume suppliers if necessary. */
-		if (get) {
+		/*
+		 * Resume suppliers if necessary.
+		 *
+		 * The device's runtime PM status cannot change until this
+		 * routine returns, so it is safe to read the status outside of
+		 * the lock.
+		 */
+		if (use_links && dev->power.runtime_status == RPM_RESUMING) {
 			idx = device_links_read_lock();
 
 			retval = rpm_get_suppliers(dev);
@@ -355,36 +355,24 @@ static int __rpm_callback(int (*cb)(struct device *), struct device *dev)
 
 	if (dev->power.irq_safe) {
 		spin_lock(&dev->power.lock);
-		return retval;
-	}
-
-	spin_lock_irq(&dev->power.lock);
-
-	if (!use_links)
-		return retval;
-
-	/*
-	 * If the device is suspending and the callback has returned success,
-	 * drop the usage counters of the suppliers that have been reference
-	 * counted on its resume.
-	 *
-	 * Do that if the resume fails too.
-	 */
-	put = dev->power.runtime_status == RPM_SUSPENDING && !retval;
-	if (put)
-		__update_runtime_status(dev, RPM_SUSPENDED);
-	else
-		put = get && retval;
-
-	if (put) {
-		spin_unlock_irq(&dev->power.lock);
-
-		idx = device_links_read_lock();
+	} else {
+		/*
+		 * If the device is suspending and the callback has returned
+		 * success, drop the usage counters of the suppliers that have
+		 * been reference counted on its resume.
+		 *
+		 * Do that if resume fails too.
+		 */
+		if (use_links
+		    && ((dev->power.runtime_status == RPM_SUSPENDING && !retval)
+		    || (dev->power.runtime_status == RPM_RESUMING && retval))) {
+			idx = device_links_read_lock();
 
-fail:
-		rpm_put_suppliers(dev);
+ fail:
+			rpm_put_suppliers(dev);
 
-		device_links_read_unlock(idx);
+			device_links_read_unlock(idx);
+		}
 
 		spin_lock_irq(&dev->power.lock);
 	}
-- 
2.26.2





^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v2 2/2] PM: runtime: Defer suspending suppliers
  2021-03-19 14:42 ` [PATCH v2 0/2] PM: runtime: Update device status before letting suppliers suspend (another take) Rafael J. Wysocki
  2021-03-19 14:47   ` [PATCH v2 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend" Rafael J. Wysocki
@ 2021-03-19 14:47   ` Rafael J. Wysocki
  1 sibling, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2021-03-19 14:47 UTC (permalink / raw)
  To: Linux PM; +Cc: elaine.zhang, LKML, Ulf Hansson

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Because the PM-runtime status of the device is not updated in
__rpm_callback(), attempts to suspend the suppliers of the given
device triggered by the rpm_put_suppliers() call in there may
cause a supplier to be suspended completely before the status of
the consumer is updated to RPM_SUSPENDED, which is confusing.

To avoid that (1) modify __rpm_callback() to only decrease the
PM-runtime usage counter of each supplier and (2) make rpm_suspend()
try to suspend the suppliers after changing the consumer's status to
RPM_SUSPENDED, in analogy with the device's parent.

Link: https://lore.kernel.org/linux-pm/CAPDyKFqm06KDw_p8WXsM4dijDbho4bb6T4k50UqqvR1_COsp8g@mail.gmail.com/
Fixes: 21d5c57b3726 ("PM / runtime: Use device links")
Reported-by: elaine.zhang <zhangqing@rock-chips.com>
Diagnosed-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

v1 -> v2:
   * Change a function name as requested by Ulf.
   * Add the R-by tag from Ulf.
   * Use spin_unlock_irq()/spin_lock_irq() around the rpm_suspend_suppliers()
     call in rpm_suspend().

---
 drivers/base/power/runtime.c |   45 +++++++++++++++++++++++++++++++++++++------
 1 file changed, 39 insertions(+), 6 deletions(-)

Index: linux-pm/drivers/base/power/runtime.c
===================================================================
--- linux-pm.orig/drivers/base/power/runtime.c
+++ linux-pm/drivers/base/power/runtime.c
@@ -305,7 +305,7 @@ static int rpm_get_suppliers(struct devi
 	return 0;
 }
 
-static void rpm_put_suppliers(struct device *dev)
+static void __rpm_put_suppliers(struct device *dev, bool try_to_suspend)
 {
 	struct device_link *link;
 
@@ -313,10 +313,30 @@ static void rpm_put_suppliers(struct dev
 				device_links_read_lock_held()) {
 
 		while (refcount_dec_not_one(&link->rpm_active))
-			pm_runtime_put(link->supplier);
+			pm_runtime_put_noidle(link->supplier);
+
+		if (try_to_suspend)
+			pm_request_idle(link->supplier);
 	}
 }
 
+static void rpm_put_suppliers(struct device *dev)
+{
+	__rpm_put_suppliers(dev, true);
+}
+
+static void rpm_suspend_suppliers(struct device *dev)
+{
+	struct device_link *link;
+	int idx = device_links_read_lock();
+
+	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
+				device_links_read_lock_held())
+		pm_request_idle(link->supplier);
+
+	device_links_read_unlock(idx);
+}
+
 /**
  * __rpm_callback - Run a given runtime PM callback for a given device.
  * @cb: Runtime PM callback to run.
@@ -344,8 +364,10 @@ static int __rpm_callback(int (*cb)(stru
 			idx = device_links_read_lock();
 
 			retval = rpm_get_suppliers(dev);
-			if (retval)
+			if (retval) {
+				rpm_put_suppliers(dev);
 				goto fail;
+			}
 
 			device_links_read_unlock(idx);
 		}
@@ -368,9 +390,9 @@ static int __rpm_callback(int (*cb)(stru
 		    || (dev->power.runtime_status == RPM_RESUMING && retval))) {
 			idx = device_links_read_lock();
 
- fail:
-			rpm_put_suppliers(dev);
+			__rpm_put_suppliers(dev, false);
 
+fail:
 			device_links_read_unlock(idx);
 		}
 
@@ -642,8 +664,11 @@ static int rpm_suspend(struct device *de
 		goto out;
 	}
 
+	if (dev->power.irq_safe)
+		goto out;
+
 	/* Maybe the parent is now able to suspend. */
-	if (parent && !parent->power.ignore_children && !dev->power.irq_safe) {
+	if (parent && !parent->power.ignore_children) {
 		spin_unlock(&dev->power.lock);
 
 		spin_lock(&parent->power.lock);
@@ -652,6 +677,14 @@ static int rpm_suspend(struct device *de
 
 		spin_lock(&dev->power.lock);
 	}
+	/* Maybe the suppliers are now able to suspend. */
+	if (dev->power.links_count > 0) {
+		spin_unlock_irq(&dev->power.lock);
+
+		rpm_suspend_suppliers(dev);
+
+		spin_lock_irq(&dev->power.lock);
+	}
 
  out:
 	trace_rpm_return_int_rcuidle(dev, _THIS_IP_, retval);




^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-03-19 14:49 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-18 18:06 [PATCH v1 0/2] PM: runtime: Update device status before letting suppliers suspend (another take) Rafael J. Wysocki
2021-03-18 18:09 ` [PATCH v1 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend" Rafael J. Wysocki
2021-03-19 13:29   ` Ulf Hansson
2021-03-19 13:43     ` Rafael J. Wysocki
2021-03-18 18:15 ` [PATCH v1 2/2] PM: runtime: Defer suspending suppliers Rafael J. Wysocki
2021-03-19 13:29   ` Ulf Hansson
2021-03-19 13:48     ` Rafael J. Wysocki
2021-03-19 14:42 ` [PATCH v2 0/2] PM: runtime: Update device status before letting suppliers suspend (another take) Rafael J. Wysocki
2021-03-19 14:47   ` [PATCH v2 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend" Rafael J. Wysocki
2021-03-19 14:47   ` [PATCH v2 2/2] PM: runtime: Defer suspending suppliers Rafael J. Wysocki

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.