linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1 0/2] PM: runtime: Update device status before letting suppliers suspend (another take)
@ 2021-03-18 18:06 Rafael J. Wysocki
  2021-03-18 18:09 ` [PATCH v1 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend" Rafael J. Wysocki
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2021-03-18 18:06 UTC (permalink / raw)
  To: Linux PM, elaine.zhang; +Cc: LKML, Ulf Hansson

Hi All,

The previous attempt to address the issue tackled by this series, commit
44cc89f76464 ("PM: runtime: Update device status before letting suppliers
suspend") was incorrect, because it introduced a rather nasty race condition
into __rpm_callback(), so let's revert it (patch [1/2]).

Instead, let's avoid suspending the suppliers immediately after dropping the
PM-runtime references to them and do that later, when the status of the
consumer has changed to RPM_SUSPENDED.

Please see the patch changelogs for details.

Elaine, please test this series on the system where you saw the original
problem.

Thanks!




^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v1 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend"
  2021-03-18 18:06 [PATCH v1 0/2] PM: runtime: Update device status before letting suppliers suspend (another take) Rafael J. Wysocki
@ 2021-03-18 18:09 ` Rafael J. Wysocki
  2021-03-19 13:29   ` Ulf Hansson
  2021-03-18 18:15 ` [PATCH v1 2/2] PM: runtime: Defer suspending suppliers Rafael J. Wysocki
  2021-03-19 14:42 ` [PATCH v2 0/2] PM: runtime: Update device status before letting suppliers suspend (another take) Rafael J. Wysocki
  2 siblings, 1 reply; 10+ messages in thread
From: Rafael J. Wysocki @ 2021-03-18 18:09 UTC (permalink / raw)
  To: Linux PM, elaine.zhang; +Cc: LKML, Ulf Hansson

From: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>

Revert commit 44cc89f76464 ("PM: runtime: Update device status
before letting suppliers suspend") that introduced a race condition
into __rpm_callback() which allowed a concurrent rpm_resume() to
run and resume the device prematurely after its status had been
changed to RPM_SUSPENDED by __rpm_callback().

Fixes: 44cc89f76464 ("PM: runtime: Update device status before letting suppliers suspend")
Link: https://lore.kernel.org/linux-pm/24dfb6fc-5d54-6ee2-9195-26428b7ecf8a@intel.com/
Reported by: Adrian Hunter <adrian.hunter@intel.com>
Cc: 4.10+ <stable@vger.kernel.org> # 4.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/power/runtime.c | 62 +++++++++++++++---------------------
 1 file changed, 25 insertions(+), 37 deletions(-)

diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 18b82427d0cb..a46a7e30881b 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -325,22 +325,22 @@ static void rpm_put_suppliers(struct device *dev)
 static int __rpm_callback(int (*cb)(struct device *), struct device *dev)
 	__releases(&dev->power.lock) __acquires(&dev->power.lock)
 {
-	bool use_links = dev->power.links_count > 0;
-	bool get = false;
 	int retval, idx;
-	bool put;
+	bool use_links = dev->power.links_count > 0;
 
 	if (dev->power.irq_safe) {
 		spin_unlock(&dev->power.lock);
-	} else if (!use_links) {
-		spin_unlock_irq(&dev->power.lock);
 	} else {
-		get = dev->power.runtime_status == RPM_RESUMING;
-
 		spin_unlock_irq(&dev->power.lock);
 
-		/* Resume suppliers if necessary. */
-		if (get) {
+		/*
+		 * Resume suppliers if necessary.
+		 *
+		 * The device's runtime PM status cannot change until this
+		 * routine returns, so it is safe to read the status outside of
+		 * the lock.
+		 */
+		if (use_links && dev->power.runtime_status == RPM_RESUMING) {
 			idx = device_links_read_lock();
 
 			retval = rpm_get_suppliers(dev);
@@ -355,36 +355,24 @@ static int __rpm_callback(int (*cb)(struct device *), struct device *dev)
 
 	if (dev->power.irq_safe) {
 		spin_lock(&dev->power.lock);
-		return retval;
-	}
-
-	spin_lock_irq(&dev->power.lock);
-
-	if (!use_links)
-		return retval;
-
-	/*
-	 * If the device is suspending and the callback has returned success,
-	 * drop the usage counters of the suppliers that have been reference
-	 * counted on its resume.
-	 *
-	 * Do that if the resume fails too.
-	 */
-	put = dev->power.runtime_status == RPM_SUSPENDING && !retval;
-	if (put)
-		__update_runtime_status(dev, RPM_SUSPENDED);
-	else
-		put = get && retval;
-
-	if (put) {
-		spin_unlock_irq(&dev->power.lock);
-
-		idx = device_links_read_lock();
+	} else {
+		/*
+		 * If the device is suspending and the callback has returned
+		 * success, drop the usage counters of the suppliers that have
+		 * been reference counted on its resume.
+		 *
+		 * Do that if resume fails too.
+		 */
+		if (use_links
+		    && ((dev->power.runtime_status == RPM_SUSPENDING && !retval)
+		    || (dev->power.runtime_status == RPM_RESUMING && retval))) {
+			idx = device_links_read_lock();
 
-fail:
-		rpm_put_suppliers(dev);
+ fail:
+			rpm_put_suppliers(dev);
 
-		device_links_read_unlock(idx);
+			device_links_read_unlock(idx);
+		}
 
 		spin_lock_irq(&dev->power.lock);
 	}
-- 
2.26.2





^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v1 2/2] PM: runtime: Defer suspending suppliers
  2021-03-18 18:06 [PATCH v1 0/2] PM: runtime: Update device status before letting suppliers suspend (another take) Rafael J. Wysocki
  2021-03-18 18:09 ` [PATCH v1 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend" Rafael J. Wysocki
@ 2021-03-18 18:15 ` Rafael J. Wysocki
  2021-03-19 13:29   ` Ulf Hansson
  2021-03-19 14:42 ` [PATCH v2 0/2] PM: runtime: Update device status before letting suppliers suspend (another take) Rafael J. Wysocki
  2 siblings, 1 reply; 10+ messages in thread
From: Rafael J. Wysocki @ 2021-03-18 18:15 UTC (permalink / raw)
  To: Linux PM, elaine.zhang; +Cc: LKML, Ulf Hansson

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Because the PM-runtime status of the device is not updated in
__rpm_callback(), attempts to suspend the suppliers of the given
device triggered by the rpm_put_suppliers() call in there may fail.

To fix this (1) modify __rpm_callback() to avoid attempting to
actually suspend the suppliers, but only decrease their PM-runtime
usage counters and (2) make rpm_suspend() try to suspend the suppliers
after changing the device's PM-runtime status, in analogy with the
handling of the device's parent.

Link: https://lore.kernel.org/linux-pm/CAPDyKFqm06KDw_p8WXsM4dijDbho4bb6T4k50UqqvR1_COsp8g@mail.gmail.com/
Fixes: 21d5c57b3726 ("PM / runtime: Use device links")
Reported-by: elaine.zhang <zhangqing@rock-chips.com>
Diagnosed-by: Ulf Hansson <ulf.hansson@linaro.org> 
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/power/runtime.c |   45 +++++++++++++++++++++++++++++++++++++------
 1 file changed, 39 insertions(+), 6 deletions(-)

Index: linux-pm/drivers/base/power/runtime.c
===================================================================
--- linux-pm.orig/drivers/base/power/runtime.c
+++ linux-pm/drivers/base/power/runtime.c
@@ -305,7 +305,7 @@ static int rpm_get_suppliers(struct devi
 	return 0;
 }
 
-static void rpm_put_suppliers(struct device *dev)
+static void __rpm_put_suppliers(struct device *dev, bool try_to_suspend)
 {
 	struct device_link *link;
 
@@ -313,10 +313,30 @@ static void rpm_put_suppliers(struct dev
 				device_links_read_lock_held()) {
 
 		while (refcount_dec_not_one(&link->rpm_active))
-			pm_runtime_put(link->supplier);
+			pm_runtime_put_noidle(link->supplier);
+
+		if (try_to_suspend)
+			pm_request_idle(link->supplier);
 	}
 }
 
+static void rpm_put_suppliers(struct device *dev)
+{
+	__rpm_put_suppliers(dev, true);
+}
+
+static void rpm_try_to_suspend_suppliers(struct device *dev)
+{
+	struct device_link *link;
+	int idx = device_links_read_lock();
+
+	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
+				device_links_read_lock_held())
+		pm_request_idle(link->supplier);
+
+	device_links_read_unlock(idx);
+}
+
 /**
  * __rpm_callback - Run a given runtime PM callback for a given device.
  * @cb: Runtime PM callback to run.
@@ -344,8 +364,10 @@ static int __rpm_callback(int (*cb)(stru
 			idx = device_links_read_lock();
 
 			retval = rpm_get_suppliers(dev);
-			if (retval)
+			if (retval) {
+				rpm_put_suppliers(dev);
 				goto fail;
+			}
 
 			device_links_read_unlock(idx);
 		}
@@ -368,9 +390,9 @@ static int __rpm_callback(int (*cb)(stru
 		    || (dev->power.runtime_status == RPM_RESUMING && retval))) {
 			idx = device_links_read_lock();
 
- fail:
-			rpm_put_suppliers(dev);
+			__rpm_put_suppliers(dev, false);
 
+fail:
 			device_links_read_unlock(idx);
 		}
 
@@ -642,8 +664,11 @@ static int rpm_suspend(struct device *de
 		goto out;
 	}
 
+	if (dev->power.irq_safe)
+		goto out;
+
 	/* Maybe the parent is now able to suspend. */
-	if (parent && !parent->power.ignore_children && !dev->power.irq_safe) {
+	if (parent && !parent->power.ignore_children) {
 		spin_unlock(&dev->power.lock);
 
 		spin_lock(&parent->power.lock);
@@ -652,6 +677,14 @@ static int rpm_suspend(struct device *de
 
 		spin_lock(&dev->power.lock);
 	}
+	/* Maybe the suppliers are now able to suspend. */
+	if (dev->power.links_count > 0) {
+		spin_unlock(&dev->power.lock);
+
+		rpm_try_to_suspend_suppliers(dev);
+
+		spin_lock(&dev->power.lock);
+	}
 
  out:
 	trace_rpm_return_int_rcuidle(dev, _THIS_IP_, retval);




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 2/2] PM: runtime: Defer suspending suppliers
  2021-03-18 18:15 ` [PATCH v1 2/2] PM: runtime: Defer suspending suppliers Rafael J. Wysocki
@ 2021-03-19 13:29   ` Ulf Hansson
  2021-03-19 13:48     ` Rafael J. Wysocki
  0 siblings, 1 reply; 10+ messages in thread
From: Ulf Hansson @ 2021-03-19 13:29 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM, elaine.zhang, LKML

On Thu, 18 Mar 2021 at 19:15, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Because the PM-runtime status of the device is not updated in
> __rpm_callback(), attempts to suspend the suppliers of the given
> device triggered by the rpm_put_suppliers() call in there may fail.
>
> To fix this (1) modify __rpm_callback() to avoid attempting to
> actually suspend the suppliers, but only decrease their PM-runtime
> usage counters and (2) make rpm_suspend() try to suspend the suppliers
> after changing the device's PM-runtime status, in analogy with the
> handling of the device's parent.
>
> Link: https://lore.kernel.org/linux-pm/CAPDyKFqm06KDw_p8WXsM4dijDbho4bb6T4k50UqqvR1_COsp8g@mail.gmail.com/
> Fixes: 21d5c57b3726 ("PM / runtime: Use device links")
> Reported-by: elaine.zhang <zhangqing@rock-chips.com>
> Diagnosed-by: Ulf Hansson <ulf.hansson@linaro.org>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Just a minor nitpick, see below. In any case:

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

> ---
>  drivers/base/power/runtime.c |   45 +++++++++++++++++++++++++++++++++++++------
>  1 file changed, 39 insertions(+), 6 deletions(-)
>
> Index: linux-pm/drivers/base/power/runtime.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/runtime.c
> +++ linux-pm/drivers/base/power/runtime.c
> @@ -305,7 +305,7 @@ static int rpm_get_suppliers(struct devi
>         return 0;
>  }
>
> -static void rpm_put_suppliers(struct device *dev)
> +static void __rpm_put_suppliers(struct device *dev, bool try_to_suspend)
>  {
>         struct device_link *link;
>
> @@ -313,10 +313,30 @@ static void rpm_put_suppliers(struct dev
>                                 device_links_read_lock_held()) {
>
>                 while (refcount_dec_not_one(&link->rpm_active))
> -                       pm_runtime_put(link->supplier);
> +                       pm_runtime_put_noidle(link->supplier);
> +
> +               if (try_to_suspend)
> +                       pm_request_idle(link->supplier);
>         }
>  }
>
> +static void rpm_put_suppliers(struct device *dev)
> +{
> +       __rpm_put_suppliers(dev, true);
> +}
> +
> +static void rpm_try_to_suspend_suppliers(struct device *dev)

Maybe "rpm_suspend_suppliers" is sufficient for the name of the
function, but I have no strong opinion.

> +{
> +       struct device_link *link;
> +       int idx = device_links_read_lock();
> +
> +       list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
> +                               device_links_read_lock_held())
> +               pm_request_idle(link->supplier);
> +
> +       device_links_read_unlock(idx);
> +}
> +

[...]

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend"
  2021-03-18 18:09 ` [PATCH v1 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend" Rafael J. Wysocki
@ 2021-03-19 13:29   ` Ulf Hansson
  2021-03-19 13:43     ` Rafael J. Wysocki
  0 siblings, 1 reply; 10+ messages in thread
From: Ulf Hansson @ 2021-03-19 13:29 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM, elaine.zhang, LKML

On Thu, 18 Mar 2021 at 19:15, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>
> From: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
>
> Revert commit 44cc89f76464 ("PM: runtime: Update device status
> before letting suppliers suspend") that introduced a race condition
> into __rpm_callback() which allowed a concurrent rpm_resume() to
> run and resume the device prematurely after its status had been
> changed to RPM_SUSPENDED by __rpm_callback().

Huh, the code path is not entirely easy to follow. :-)

Did you find this by code inspection or did you get an error report?

>
> Fixes: 44cc89f76464 ("PM: runtime: Update device status before letting suppliers suspend")
> Link: https://lore.kernel.org/linux-pm/24dfb6fc-5d54-6ee2-9195-26428b7ecf8a@intel.com/
> Reported by: Adrian Hunter <adrian.hunter@intel.com>
> Cc: 4.10+ <stable@vger.kernel.org> # 4.10+
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

Kind regards
Uffe


> ---
>  drivers/base/power/runtime.c | 62 +++++++++++++++---------------------
>  1 file changed, 25 insertions(+), 37 deletions(-)
>
> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> index 18b82427d0cb..a46a7e30881b 100644
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -325,22 +325,22 @@ static void rpm_put_suppliers(struct device *dev)
>  static int __rpm_callback(int (*cb)(struct device *), struct device *dev)
>         __releases(&dev->power.lock) __acquires(&dev->power.lock)
>  {
> -       bool use_links = dev->power.links_count > 0;
> -       bool get = false;
>         int retval, idx;
> -       bool put;
> +       bool use_links = dev->power.links_count > 0;
>
>         if (dev->power.irq_safe) {
>                 spin_unlock(&dev->power.lock);
> -       } else if (!use_links) {
> -               spin_unlock_irq(&dev->power.lock);
>         } else {
> -               get = dev->power.runtime_status == RPM_RESUMING;
> -
>                 spin_unlock_irq(&dev->power.lock);
>
> -               /* Resume suppliers if necessary. */
> -               if (get) {
> +               /*
> +                * Resume suppliers if necessary.
> +                *
> +                * The device's runtime PM status cannot change until this
> +                * routine returns, so it is safe to read the status outside of
> +                * the lock.
> +                */
> +               if (use_links && dev->power.runtime_status == RPM_RESUMING) {
>                         idx = device_links_read_lock();
>
>                         retval = rpm_get_suppliers(dev);
> @@ -355,36 +355,24 @@ static int __rpm_callback(int (*cb)(struct device *), struct device *dev)
>
>         if (dev->power.irq_safe) {
>                 spin_lock(&dev->power.lock);
> -               return retval;
> -       }
> -
> -       spin_lock_irq(&dev->power.lock);
> -
> -       if (!use_links)
> -               return retval;
> -
> -       /*
> -        * If the device is suspending and the callback has returned success,
> -        * drop the usage counters of the suppliers that have been reference
> -        * counted on its resume.
> -        *
> -        * Do that if the resume fails too.
> -        */
> -       put = dev->power.runtime_status == RPM_SUSPENDING && !retval;
> -       if (put)
> -               __update_runtime_status(dev, RPM_SUSPENDED);
> -       else
> -               put = get && retval;
> -
> -       if (put) {
> -               spin_unlock_irq(&dev->power.lock);
> -
> -               idx = device_links_read_lock();
> +       } else {
> +               /*
> +                * If the device is suspending and the callback has returned
> +                * success, drop the usage counters of the suppliers that have
> +                * been reference counted on its resume.
> +                *
> +                * Do that if resume fails too.
> +                */
> +               if (use_links
> +                   && ((dev->power.runtime_status == RPM_SUSPENDING && !retval)
> +                   || (dev->power.runtime_status == RPM_RESUMING && retval))) {
> +                       idx = device_links_read_lock();
>
> -fail:
> -               rpm_put_suppliers(dev);
> + fail:
> +                       rpm_put_suppliers(dev);
>
> -               device_links_read_unlock(idx);
> +                       device_links_read_unlock(idx);
> +               }
>
>                 spin_lock_irq(&dev->power.lock);
>         }
> --
> 2.26.2
>
>
>
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend"
  2021-03-19 13:29   ` Ulf Hansson
@ 2021-03-19 13:43     ` Rafael J. Wysocki
  0 siblings, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2021-03-19 13:43 UTC (permalink / raw)
  To: Ulf Hansson; +Cc: Rafael J. Wysocki, Linux PM, elaine.zhang, LKML

On Fri, Mar 19, 2021 at 2:31 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Thu, 18 Mar 2021 at 19:15, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >
> > From: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
> >
> > Revert commit 44cc89f76464 ("PM: runtime: Update device status
> > before letting suppliers suspend") that introduced a race condition
> > into __rpm_callback() which allowed a concurrent rpm_resume() to
> > run and resume the device prematurely after its status had been
> > changed to RPM_SUSPENDED by __rpm_callback().
>
> Huh, the code path is not entirely easy to follow. :-)
>
> Did you find this by code inspection or did you get an error report?

There was a bug report that caused me to look at the code once again.

> > Fixes: 44cc89f76464 ("PM: runtime: Update device status before letting suppliers suspend")
> > Link: https://lore.kernel.org/linux-pm/24dfb6fc-5d54-6ee2-9195-26428b7ecf8a@intel.com/
> > Reported by: Adrian Hunter <adrian.hunter@intel.com>
> > Cc: 4.10+ <stable@vger.kernel.org> # 4.10+
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

Thanks!

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 2/2] PM: runtime: Defer suspending suppliers
  2021-03-19 13:29   ` Ulf Hansson
@ 2021-03-19 13:48     ` Rafael J. Wysocki
  0 siblings, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2021-03-19 13:48 UTC (permalink / raw)
  To: Ulf Hansson; +Cc: Rafael J. Wysocki, Linux PM, elaine.zhang, LKML

On Fri, Mar 19, 2021 at 2:30 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Thu, 18 Mar 2021 at 19:15, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Because the PM-runtime status of the device is not updated in
> > __rpm_callback(), attempts to suspend the suppliers of the given
> > device triggered by the rpm_put_suppliers() call in there may fail.
> >
> > To fix this (1) modify __rpm_callback() to avoid attempting to
> > actually suspend the suppliers, but only decrease their PM-runtime
> > usage counters and (2) make rpm_suspend() try to suspend the suppliers
> > after changing the device's PM-runtime status, in analogy with the
> > handling of the device's parent.
> >
> > Link: https://lore.kernel.org/linux-pm/CAPDyKFqm06KDw_p8WXsM4dijDbho4bb6T4k50UqqvR1_COsp8g@mail.gmail.com/
> > Fixes: 21d5c57b3726 ("PM / runtime: Use device links")
> > Reported-by: elaine.zhang <zhangqing@rock-chips.com>
> > Diagnosed-by: Ulf Hansson <ulf.hansson@linaro.org>
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Just a minor nitpick, see below. In any case:
>
> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

Thanks!

>
> > ---
> >  drivers/base/power/runtime.c |   45 +++++++++++++++++++++++++++++++++++++------
> >  1 file changed, 39 insertions(+), 6 deletions(-)
> >
> > Index: linux-pm/drivers/base/power/runtime.c
> > ===================================================================
> > --- linux-pm.orig/drivers/base/power/runtime.c
> > +++ linux-pm/drivers/base/power/runtime.c
> > @@ -305,7 +305,7 @@ static int rpm_get_suppliers(struct devi
> >         return 0;
> >  }
> >
> > -static void rpm_put_suppliers(struct device *dev)
> > +static void __rpm_put_suppliers(struct device *dev, bool try_to_suspend)
> >  {
> >         struct device_link *link;
> >
> > @@ -313,10 +313,30 @@ static void rpm_put_suppliers(struct dev
> >                                 device_links_read_lock_held()) {
> >
> >                 while (refcount_dec_not_one(&link->rpm_active))
> > -                       pm_runtime_put(link->supplier);
> > +                       pm_runtime_put_noidle(link->supplier);
> > +
> > +               if (try_to_suspend)
> > +                       pm_request_idle(link->supplier);
> >         }
> >  }
> >
> > +static void rpm_put_suppliers(struct device *dev)
> > +{
> > +       __rpm_put_suppliers(dev, true);
> > +}
> > +
> > +static void rpm_try_to_suspend_suppliers(struct device *dev)
>
> Maybe "rpm_suspend_suppliers" is sufficient for the name of the
> function, but I have no strong opinion.

OK

In addition to this, spin_unlock_irq()/spin_lock_irq() need to be used
around the call to it in rpm_suspend(), so I'll send a v2.  I guess
that the R-by still applies, though. :-)

> > +{
> > +       struct device_link *link;
> > +       int idx = device_links_read_lock();
> > +
> > +       list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
> > +                               device_links_read_lock_held())
> > +               pm_request_idle(link->supplier);
> > +
> > +       device_links_read_unlock(idx);
> > +}
> > +

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v2 0/2] PM: runtime: Update device status before letting suppliers suspend (another take)
  2021-03-18 18:06 [PATCH v1 0/2] PM: runtime: Update device status before letting suppliers suspend (another take) Rafael J. Wysocki
  2021-03-18 18:09 ` [PATCH v1 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend" Rafael J. Wysocki
  2021-03-18 18:15 ` [PATCH v1 2/2] PM: runtime: Defer suspending suppliers Rafael J. Wysocki
@ 2021-03-19 14:42 ` Rafael J. Wysocki
  2021-03-19 14:47   ` [PATCH v2 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend" Rafael J. Wysocki
  2021-03-19 14:47   ` [PATCH v2 2/2] PM: runtime: Defer suspending suppliers Rafael J. Wysocki
  2 siblings, 2 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2021-03-19 14:42 UTC (permalink / raw)
  To: Linux PM, elaine.zhang; +Cc: LKML, Ulf Hansson

On Thursday, March 18, 2021 7:06:54 PM CET Rafael J. Wysocki wrote:
> Hi All,
> 
> The previous attempt to address the issue tackled by this series, commit
> 44cc89f76464 ("PM: runtime: Update device status before letting suppliers
> suspend") was incorrect, because it introduced a rather nasty race condition
> into __rpm_callback(), so let's revert it (patch [1/2]).
> 
> Instead, let's avoid suspending the suppliers immediately after dropping the
> PM-runtime references to them and do that later, when the status of the
> consumer has changed to RPM_SUSPENDED.
> 
> Please see the patch changelogs for details.
> 
> Elaine, please test this series on the system where you saw the original
> problem.

The above still applies.

The v2 is sent because of some updates in the second patch, plus the tags from Ulf.

Thanks!




^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v2 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend"
  2021-03-19 14:42 ` [PATCH v2 0/2] PM: runtime: Update device status before letting suppliers suspend (another take) Rafael J. Wysocki
@ 2021-03-19 14:47   ` Rafael J. Wysocki
  2021-03-19 14:47   ` [PATCH v2 2/2] PM: runtime: Defer suspending suppliers Rafael J. Wysocki
  1 sibling, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2021-03-19 14:47 UTC (permalink / raw)
  To: Linux PM; +Cc: elaine.zhang, LKML, Ulf Hansson

From: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>

Revert commit 44cc89f76464 ("PM: runtime: Update device status
before letting suppliers suspend") that introduced a race condition
into __rpm_callback() which allowed a concurrent rpm_resume() to
run and resume the device prematurely after its status had been
changed to RPM_SUSPENDED by __rpm_callback().

Fixes: 44cc89f76464 ("PM: runtime: Update device status before letting suppliers suspend")
Link: https://lore.kernel.org/linux-pm/24dfb6fc-5d54-6ee2-9195-26428b7ecf8a@intel.com/
Reported by: Adrian Hunter <adrian.hunter@intel.com>
Cc: 4.10+ <stable@vger.kernel.org> # 4.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
---

v1 -> v2: Added the R-by tag from Ulf.

---
 drivers/base/power/runtime.c | 62 +++++++++++++++---------------------
 1 file changed, 25 insertions(+), 37 deletions(-)

diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 18b82427d0cb..a46a7e30881b 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -325,22 +325,22 @@ static void rpm_put_suppliers(struct device *dev)
 static int __rpm_callback(int (*cb)(struct device *), struct device *dev)
 	__releases(&dev->power.lock) __acquires(&dev->power.lock)
 {
-	bool use_links = dev->power.links_count > 0;
-	bool get = false;
 	int retval, idx;
-	bool put;
+	bool use_links = dev->power.links_count > 0;
 
 	if (dev->power.irq_safe) {
 		spin_unlock(&dev->power.lock);
-	} else if (!use_links) {
-		spin_unlock_irq(&dev->power.lock);
 	} else {
-		get = dev->power.runtime_status == RPM_RESUMING;
-
 		spin_unlock_irq(&dev->power.lock);
 
-		/* Resume suppliers if necessary. */
-		if (get) {
+		/*
+		 * Resume suppliers if necessary.
+		 *
+		 * The device's runtime PM status cannot change until this
+		 * routine returns, so it is safe to read the status outside of
+		 * the lock.
+		 */
+		if (use_links && dev->power.runtime_status == RPM_RESUMING) {
 			idx = device_links_read_lock();
 
 			retval = rpm_get_suppliers(dev);
@@ -355,36 +355,24 @@ static int __rpm_callback(int (*cb)(struct device *), struct device *dev)
 
 	if (dev->power.irq_safe) {
 		spin_lock(&dev->power.lock);
-		return retval;
-	}
-
-	spin_lock_irq(&dev->power.lock);
-
-	if (!use_links)
-		return retval;
-
-	/*
-	 * If the device is suspending and the callback has returned success,
-	 * drop the usage counters of the suppliers that have been reference
-	 * counted on its resume.
-	 *
-	 * Do that if the resume fails too.
-	 */
-	put = dev->power.runtime_status == RPM_SUSPENDING && !retval;
-	if (put)
-		__update_runtime_status(dev, RPM_SUSPENDED);
-	else
-		put = get && retval;
-
-	if (put) {
-		spin_unlock_irq(&dev->power.lock);
-
-		idx = device_links_read_lock();
+	} else {
+		/*
+		 * If the device is suspending and the callback has returned
+		 * success, drop the usage counters of the suppliers that have
+		 * been reference counted on its resume.
+		 *
+		 * Do that if resume fails too.
+		 */
+		if (use_links
+		    && ((dev->power.runtime_status == RPM_SUSPENDING && !retval)
+		    || (dev->power.runtime_status == RPM_RESUMING && retval))) {
+			idx = device_links_read_lock();
 
-fail:
-		rpm_put_suppliers(dev);
+ fail:
+			rpm_put_suppliers(dev);
 
-		device_links_read_unlock(idx);
+			device_links_read_unlock(idx);
+		}
 
 		spin_lock_irq(&dev->power.lock);
 	}
-- 
2.26.2





^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v2 2/2] PM: runtime: Defer suspending suppliers
  2021-03-19 14:42 ` [PATCH v2 0/2] PM: runtime: Update device status before letting suppliers suspend (another take) Rafael J. Wysocki
  2021-03-19 14:47   ` [PATCH v2 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend" Rafael J. Wysocki
@ 2021-03-19 14:47   ` Rafael J. Wysocki
  1 sibling, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2021-03-19 14:47 UTC (permalink / raw)
  To: Linux PM; +Cc: elaine.zhang, LKML, Ulf Hansson

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Because the PM-runtime status of the device is not updated in
__rpm_callback(), attempts to suspend the suppliers of the given
device triggered by the rpm_put_suppliers() call in there may
cause a supplier to be suspended completely before the status of
the consumer is updated to RPM_SUSPENDED, which is confusing.

To avoid that (1) modify __rpm_callback() to only decrease the
PM-runtime usage counter of each supplier and (2) make rpm_suspend()
try to suspend the suppliers after changing the consumer's status to
RPM_SUSPENDED, in analogy with the device's parent.

Link: https://lore.kernel.org/linux-pm/CAPDyKFqm06KDw_p8WXsM4dijDbho4bb6T4k50UqqvR1_COsp8g@mail.gmail.com/
Fixes: 21d5c57b3726 ("PM / runtime: Use device links")
Reported-by: elaine.zhang <zhangqing@rock-chips.com>
Diagnosed-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

v1 -> v2:
   * Change a function name as requested by Ulf.
   * Add the R-by tag from Ulf.
   * Use spin_unlock_irq()/spin_lock_irq() around the rpm_suspend_suppliers()
     call in rpm_suspend().

---
 drivers/base/power/runtime.c |   45 +++++++++++++++++++++++++++++++++++++------
 1 file changed, 39 insertions(+), 6 deletions(-)

Index: linux-pm/drivers/base/power/runtime.c
===================================================================
--- linux-pm.orig/drivers/base/power/runtime.c
+++ linux-pm/drivers/base/power/runtime.c
@@ -305,7 +305,7 @@ static int rpm_get_suppliers(struct devi
 	return 0;
 }
 
-static void rpm_put_suppliers(struct device *dev)
+static void __rpm_put_suppliers(struct device *dev, bool try_to_suspend)
 {
 	struct device_link *link;
 
@@ -313,10 +313,30 @@ static void rpm_put_suppliers(struct dev
 				device_links_read_lock_held()) {
 
 		while (refcount_dec_not_one(&link->rpm_active))
-			pm_runtime_put(link->supplier);
+			pm_runtime_put_noidle(link->supplier);
+
+		if (try_to_suspend)
+			pm_request_idle(link->supplier);
 	}
 }
 
+static void rpm_put_suppliers(struct device *dev)
+{
+	__rpm_put_suppliers(dev, true);
+}
+
+static void rpm_suspend_suppliers(struct device *dev)
+{
+	struct device_link *link;
+	int idx = device_links_read_lock();
+
+	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
+				device_links_read_lock_held())
+		pm_request_idle(link->supplier);
+
+	device_links_read_unlock(idx);
+}
+
 /**
  * __rpm_callback - Run a given runtime PM callback for a given device.
  * @cb: Runtime PM callback to run.
@@ -344,8 +364,10 @@ static int __rpm_callback(int (*cb)(stru
 			idx = device_links_read_lock();
 
 			retval = rpm_get_suppliers(dev);
-			if (retval)
+			if (retval) {
+				rpm_put_suppliers(dev);
 				goto fail;
+			}
 
 			device_links_read_unlock(idx);
 		}
@@ -368,9 +390,9 @@ static int __rpm_callback(int (*cb)(stru
 		    || (dev->power.runtime_status == RPM_RESUMING && retval))) {
 			idx = device_links_read_lock();
 
- fail:
-			rpm_put_suppliers(dev);
+			__rpm_put_suppliers(dev, false);
 
+fail:
 			device_links_read_unlock(idx);
 		}
 
@@ -642,8 +664,11 @@ static int rpm_suspend(struct device *de
 		goto out;
 	}
 
+	if (dev->power.irq_safe)
+		goto out;
+
 	/* Maybe the parent is now able to suspend. */
-	if (parent && !parent->power.ignore_children && !dev->power.irq_safe) {
+	if (parent && !parent->power.ignore_children) {
 		spin_unlock(&dev->power.lock);
 
 		spin_lock(&parent->power.lock);
@@ -652,6 +677,14 @@ static int rpm_suspend(struct device *de
 
 		spin_lock(&dev->power.lock);
 	}
+	/* Maybe the suppliers are now able to suspend. */
+	if (dev->power.links_count > 0) {
+		spin_unlock_irq(&dev->power.lock);
+
+		rpm_suspend_suppliers(dev);
+
+		spin_lock_irq(&dev->power.lock);
+	}
 
  out:
 	trace_rpm_return_int_rcuidle(dev, _THIS_IP_, retval);




^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-03-19 14:49 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-18 18:06 [PATCH v1 0/2] PM: runtime: Update device status before letting suppliers suspend (another take) Rafael J. Wysocki
2021-03-18 18:09 ` [PATCH v1 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend" Rafael J. Wysocki
2021-03-19 13:29   ` Ulf Hansson
2021-03-19 13:43     ` Rafael J. Wysocki
2021-03-18 18:15 ` [PATCH v1 2/2] PM: runtime: Defer suspending suppliers Rafael J. Wysocki
2021-03-19 13:29   ` Ulf Hansson
2021-03-19 13:48     ` Rafael J. Wysocki
2021-03-19 14:42 ` [PATCH v2 0/2] PM: runtime: Update device status before letting suppliers suspend (another take) Rafael J. Wysocki
2021-03-19 14:47   ` [PATCH v2 1/2] Revert "PM: runtime: Update device status before letting suppliers suspend" Rafael J. Wysocki
2021-03-19 14:47   ` [PATCH v2 2/2] PM: runtime: Defer suspending suppliers Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).