All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] PM / core: Fix supplier device runtime PM usage counter imbalance
@ 2018-06-12 11:00 ` Rafael J. Wysocki
  2018-06-12 12:44   ` Marek Szyprowski
  0 siblings, 1 reply; 12+ messages in thread
From: Rafael J. Wysocki @ 2018-06-12 11:00 UTC (permalink / raw)
  To: Linux PM
  Cc: LKML, Greg Kroah-Hartman, Ulf Hansson, Lukas Wunner,
	Marek Szyprowski, Bartlomiej Zolnierkiewicz, Jon Hunter

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

If a device link is added via device_link_add() by the driver of the
link's consumer device, the supplier's runtime PM usage counter is
going to be dropped by the pm_runtime_put_suppliers() call in
driver_probe_device().  However, in that case it is not incremented
unless the supplier driver is already present and the link is not
stateless.  That leads to a runtime PM usage counter imbalance for
the supplier device in a few cases.

To prevent that from happening, bump up the supplier runtime
PM usage counter in device_link_add() for all links with the
DL_FLAG_PM_RUNTIME flag set that are added at the consumer probe
time.  Use pm_runtime_get_noresume() for that as the callers of
device_link_add() who want the supplier to be resumed by it should
pass DL_FLAG_RPM_ACTIVE in flags to it anyway.

Fixes: 21d5c57b3726 (PM / runtime: Use device links)
Reported-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

This is a replacement for commit 1e8378619841 (PM / runtime: Fixup
reference counting of device link suppliers at probe) that is going
to be reverted.

---
 drivers/base/core.c |   15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -216,6 +216,13 @@ struct device_link *device_link_add(stru
 			link->rpm_active = true;
 		}
 		pm_runtime_new_link(consumer);
+		/*
+		 * If the link is being added by the consumer driver at probe
+		 * time, balance the decrementation of the supplier's runtime PM
+		 * usage counter after consumer probe in driver_probe_device().
+		 */
+		if (consumer->links.status == DL_DEV_PROBING)
+			pm_runtime_get_noresume(supplier);
 	}
 	get_device(supplier);
 	link->supplier = supplier;
@@ -234,14 +241,6 @@ struct device_link *device_link_add(stru
 		case DL_DEV_DRIVER_BOUND:
 			switch (consumer->links.status) {
 			case DL_DEV_PROBING:
-				/*
-				 * Balance the decrementation of the supplier's
-				 * runtime PM usage counter after consumer probe
-				 * in driver_probe_device().
-				 */
-				if (flags & DL_FLAG_PM_RUNTIME)
-					pm_runtime_get_sync(supplier);
-
 				link->status = DL_STATE_CONSUMER_PROBE;
 				break;
 			case DL_DEV_DRIVER_BOUND:


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] PM / core: Fix supplier device runtime PM usage counter imbalance
  2018-06-12 11:00 ` [PATCH] PM / core: Fix supplier device runtime PM usage counter imbalance Rafael J. Wysocki
@ 2018-06-12 12:44   ` Marek Szyprowski
  2018-06-12 14:24     ` Rafael J. Wysocki
  2018-06-12 19:43     ` Ulf Hansson
  0 siblings, 2 replies; 12+ messages in thread
From: Marek Szyprowski @ 2018-06-12 12:44 UTC (permalink / raw)
  To: Rafael J. Wysocki, Linux PM
  Cc: LKML, Greg Kroah-Hartman, Ulf Hansson, Lukas Wunner,
	Bartlomiej Zolnierkiewicz, Jon Hunter

Hi Rafael,

On 2018-06-12 13:00, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> If a device link is added via device_link_add() by the driver of the
> link's consumer device, the supplier's runtime PM usage counter is
> going to be dropped by the pm_runtime_put_suppliers() call in
> driver_probe_device().  However, in that case it is not incremented
> unless the supplier driver is already present and the link is not
> stateless.  That leads to a runtime PM usage counter imbalance for
> the supplier device in a few cases.
>
> To prevent that from happening, bump up the supplier runtime
> PM usage counter in device_link_add() for all links with the
> DL_FLAG_PM_RUNTIME flag set that are added at the consumer probe
> time.  Use pm_runtime_get_noresume() for that as the callers of
> device_link_add() who want the supplier to be resumed by it should
> pass DL_FLAG_RPM_ACTIVE in flags to it anyway.
>
> Fixes: 21d5c57b3726 (PM / runtime: Use device links)
> Reported-by: Ulf Hansson <ulf.hansson@linaro.org>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>
> This is a replacement for commit 1e8378619841 (PM / runtime: Fixup
> reference counting of device link suppliers at probe) that is going
> to be reverted.

Thanks Rafael for the patch. I've applied it and now I'm a bit puzzled.
Let's get back to my IOMMU and codec case, mentioned here:
https://marc.info/?l=linux-pm&m=152878741527962&w=2

Now, after applying your patch, when IOMMU creates a link with
DL_FLAG_PM_RUNTIME to the jpeg device (this happens when jpeg device is
being probed), it is not IOMMU is not runtime resumed anymore (that's
because the patch changes pm_runtime_get_sync to pm_runtime_get_noresume).
This means that until jpeg driver enables runtime pm for its device and
finally performs runtime pm suspends/resume cycle, the supplier won't be
resumed. On the other hand, when I add DL_FLAG_RPM_ACTIVE flag to link
creation, the supplier is properly resumed, but then, once the jpeg
device probe finishes, the supplier is still in runtime active state
until a next runtime suspend/resume cycle of jpeg device. If I understand
right, the DL_FLAG_RPM_ACTIVE flag should be there from the beginning,
but that time it runtime pm part of links worked in a bit different way
than now.

Is there any way to keep old behavior?

> ---
>   drivers/base/core.c |   15 +++++++--------
>   1 file changed, 7 insertions(+), 8 deletions(-)
>
> Index: linux-pm/drivers/base/core.c
> ===================================================================
> --- linux-pm.orig/drivers/base/core.c
> +++ linux-pm/drivers/base/core.c
> @@ -216,6 +216,13 @@ struct device_link *device_link_add(stru
>   			link->rpm_active = true;
>   		}
>   		pm_runtime_new_link(consumer);
> +		/*
> +		 * If the link is being added by the consumer driver at probe
> +		 * time, balance the decrementation of the supplier's runtime PM
> +		 * usage counter after consumer probe in driver_probe_device().
> +		 */
> +		if (consumer->links.status == DL_DEV_PROBING)
> +			pm_runtime_get_noresume(supplier);
>   	}
>   	get_device(supplier);
>   	link->supplier = supplier;
> @@ -234,14 +241,6 @@ struct device_link *device_link_add(stru
>   		case DL_DEV_DRIVER_BOUND:
>   			switch (consumer->links.status) {
>   			case DL_DEV_PROBING:
> -				/*
> -				 * Balance the decrementation of the supplier's
> -				 * runtime PM usage counter after consumer probe
> -				 * in driver_probe_device().
> -				 */
> -				if (flags & DL_FLAG_PM_RUNTIME)
> -					pm_runtime_get_sync(supplier);
> -
>   				link->status = DL_STATE_CONSUMER_PROBE;
>   				break;
>   			case DL_DEV_DRIVER_BOUND:
>
>
>
>

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] PM / core: Fix supplier device runtime PM usage counter imbalance
  2018-06-12 12:44   ` Marek Szyprowski
@ 2018-06-12 14:24     ` Rafael J. Wysocki
  2018-06-13  6:23       ` Marek Szyprowski
  2018-06-13  8:28       ` Ulf Hansson
  2018-06-12 19:43     ` Ulf Hansson
  1 sibling, 2 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2018-06-12 14:24 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Linux PM, LKML, Greg Kroah-Hartman, Ulf Hansson, Lukas Wunner,
	Bartlomiej Zolnierkiewicz, Jon Hunter

On Tuesday, June 12, 2018 2:44:23 PM CEST Marek Szyprowski wrote:
> Hi Rafael,

Hi,

> On 2018-06-12 13:00, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > If a device link is added via device_link_add() by the driver of the
> > link's consumer device, the supplier's runtime PM usage counter is
> > going to be dropped by the pm_runtime_put_suppliers() call in
> > driver_probe_device().  However, in that case it is not incremented
> > unless the supplier driver is already present and the link is not
> > stateless.  That leads to a runtime PM usage counter imbalance for
> > the supplier device in a few cases.
> >
> > To prevent that from happening, bump up the supplier runtime
> > PM usage counter in device_link_add() for all links with the
> > DL_FLAG_PM_RUNTIME flag set that are added at the consumer probe
> > time.  Use pm_runtime_get_noresume() for that as the callers of
> > device_link_add() who want the supplier to be resumed by it should
> > pass DL_FLAG_RPM_ACTIVE in flags to it anyway.
> >
> > Fixes: 21d5c57b3726 (PM / runtime: Use device links)
> > Reported-by: Ulf Hansson <ulf.hansson@linaro.org>
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >
> > This is a replacement for commit 1e8378619841 (PM / runtime: Fixup
> > reference counting of device link suppliers at probe) that is going
> > to be reverted.
> 
> Thanks Rafael for the patch. I've applied it and now I'm a bit puzzled.

Thanks for testing! :-)

> Let's get back to my IOMMU and codec case, mentioned here:
> https://marc.info/?l=linux-pm&m=152878741527962&w=2
> 
> Now, after applying your patch, when IOMMU creates a link with
> DL_FLAG_PM_RUNTIME to the jpeg device (this happens when jpeg device is
> being probed), it is not IOMMU is not runtime resumed anymore (that's
> because the patch changes pm_runtime_get_sync to pm_runtime_get_noresume).
> This means that until jpeg driver enables runtime pm for its device and
> finally performs runtime pm suspends/resume cycle, the supplier won't be
> resumed. On the other hand, when I add DL_FLAG_RPM_ACTIVE flag to link
> creation, the supplier is properly resumed, but then, once the jpeg
> device probe finishes, the supplier is still in runtime active state
> until a next runtime suspend/resume cycle of jpeg device.

That additional suspend-resume cycle should not be necessary in theory
unless I'm missing something.

The pm_request_idle() call in driver_probe_device() should trigger a
suspend of the jpeg device after probe (unless blocked by something)
and that should drop the RPM usage counter of the IOMMU.  Next, the
pm_runtime_put_suppliers() in there should actually suspend it.

It looks like the pm_request_idle() doesn't work as expected.

> If I understand right, the DL_FLAG_RPM_ACTIVE flag should be there from the
> beginning, but that time it runtime pm part of links worked in a bit
> different way than now.

Right, and evidently there are callers depending on the existing behavior.
 
> Is there any way to keep old behavior?

Yes, there is, please test the appended v2 of the $subject patch.

That said, I'd like to remove the extra supplier resume from there going
forward as there may be callers who actually don't want that to happen and
DL_FLAG_RPM_ACTIVE is there for a purpose.

---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: [PATCH v2] PM / core: Fix supplier device runtime PM usage counter imbalance

If a device link is added via device_link_add() by the driver of the
link's consumer device, the supplier's runtime PM usage counter is
going to be dropped by the pm_runtime_put_suppliers() call in
driver_probe_device().  However, in that case it is not incremented
unless the supplier driver is already present and the link is not
stateless.  That leads to a runtime PM usage counter imbalance for
the supplier device in a few cases.

To prevent that from happening, bump up the supplier runtime
PM usage counter in device_link_add() for all links with the
DL_FLAG_PM_RUNTIME flag set that are added at the consumer probe
time.  Use pm_runtime_get_noresume() for that as the callers of
device_link_add() who want the supplier to be resumed by it are
expected to pass DL_FLAG_RPM_ACTIVE in flags to it anyway, but
additionally resume the supplier if the link is added during
consumer driver probe to retain the existing behavior for the
callers depending on it.

Fixes: 21d5c57b3726 (PM / runtime: Use device links)
Reported-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/core.c |   15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -216,6 +216,13 @@ struct device_link *device_link_add(stru
 			link->rpm_active = true;
 		}
 		pm_runtime_new_link(consumer);
+		/*
+		 * If the link is being added by the consumer driver at probe
+		 * time, balance the decrementation of the supplier's runtime PM
+		 * usage counter after consumer probe in driver_probe_device().
+		 */
+		if (consumer->links.status == DL_DEV_PROBING)
+			pm_runtime_get_noresume(supplier);
 	}
 	get_device(supplier);
 	link->supplier = supplier;
@@ -235,12 +242,12 @@ struct device_link *device_link_add(stru
 			switch (consumer->links.status) {
 			case DL_DEV_PROBING:
 				/*
-				 * Balance the decrementation of the supplier's
-				 * runtime PM usage counter after consumer probe
-				 * in driver_probe_device().
+				 * Some callers expect the link creation during
+				 * consumer driver probe to resume the supplier
+				 * even without DL_FLAG_RPM_ACTIVE.
 				 */
 				if (flags & DL_FLAG_PM_RUNTIME)
-					pm_runtime_get_sync(supplier);
+					pm_runtime_resume(supplier);
 
 				link->status = DL_STATE_CONSUMER_PROBE;
 				break;


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] PM / core: Fix supplier device runtime PM usage counter imbalance
  2018-06-12 12:44   ` Marek Szyprowski
  2018-06-12 14:24     ` Rafael J. Wysocki
@ 2018-06-12 19:43     ` Ulf Hansson
  2018-06-13  6:42       ` Marek Szyprowski
  2018-06-13  7:58       ` Rafael J. Wysocki
  1 sibling, 2 replies; 12+ messages in thread
From: Ulf Hansson @ 2018-06-12 19:43 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Rafael J. Wysocki, Linux PM, LKML, Greg Kroah-Hartman,
	Lukas Wunner, Bartlomiej Zolnierkiewicz, Jon Hunter

On 12 June 2018 at 14:44, Marek Szyprowski <m.szyprowski@samsung.com> wrote:
> Hi Rafael,
>
> On 2018-06-12 13:00, Rafael J. Wysocki wrote:
>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>
>> If a device link is added via device_link_add() by the driver of the
>> link's consumer device, the supplier's runtime PM usage counter is
>> going to be dropped by the pm_runtime_put_suppliers() call in
>> driver_probe_device().  However, in that case it is not incremented
>> unless the supplier driver is already present and the link is not
>> stateless.  That leads to a runtime PM usage counter imbalance for
>> the supplier device in a few cases.
>>
>> To prevent that from happening, bump up the supplier runtime
>> PM usage counter in device_link_add() for all links with the
>> DL_FLAG_PM_RUNTIME flag set that are added at the consumer probe
>> time.  Use pm_runtime_get_noresume() for that as the callers of
>> device_link_add() who want the supplier to be resumed by it should
>> pass DL_FLAG_RPM_ACTIVE in flags to it anyway.
>>
>> Fixes: 21d5c57b3726 (PM / runtime: Use device links)
>> Reported-by: Ulf Hansson <ulf.hansson@linaro.org>
>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> ---
>>
>> This is a replacement for commit 1e8378619841 (PM / runtime: Fixup
>> reference counting of device link suppliers at probe) that is going
>> to be reverted.
>
> Thanks Rafael for the patch. I've applied it and now I'm a bit puzzled.
> Let's get back to my IOMMU and codec case, mentioned here:
> https://marc.info/?l=linux-pm&m=152878741527962&w=2
>
> Now, after applying your patch, when IOMMU creates a link with
> DL_FLAG_PM_RUNTIME to the jpeg device (this happens when jpeg device is
> being probed), it is not IOMMU is not runtime resumed anymore (that's
> because the patch changes pm_runtime_get_sync to pm_runtime_get_noresume).

According to your earlier replies, I thought this is what you wanted. No?

> This means that until jpeg driver enables runtime pm for its device and
> finally performs runtime pm suspends/resume cycle, the supplier won't be
> resumed.

What's the problem with that? When does the IOMMU device needs to be
runtime resumed?

> On the other hand, when I add DL_FLAG_RPM_ACTIVE flag to link
> creation, the supplier is properly resumed, but then, once the jpeg
> device probe finishes, the supplier is still in runtime active state
> until a next runtime suspend/resume cycle of jpeg device.

Could you elaborate on the what happens in jpeg driver during probe,
in regards to runtime PM. Perhaps you can also point me to what driver
it is?

> If I understand
> right, the DL_FLAG_RPM_ACTIVE flag should be there from the beginning,
> but that time it runtime pm part of links worked in a bit different way
> than now.

Just to make sure, you also reverted 1e8378619841, before trying
Rafael's change on top?

>
> Is there any way to keep old behavior?

I think the old behavior is sub-optimal. I am sure there are users
that really don't want the driver core to runtime resume the supplier
unconditionally.

I would rather go and fix the few users of device_link_add(), to use
DL_FLAG_RPM_ACTIVE, in cases when they need it. Of course I am fine if
we do these changes in a step by step basis as well.

[...]

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] PM / core: Fix supplier device runtime PM usage counter imbalance
  2018-06-12 14:24     ` Rafael J. Wysocki
@ 2018-06-13  6:23       ` Marek Szyprowski
  2018-06-13  8:16         ` Rafael J. Wysocki
  2018-06-13  8:28       ` Ulf Hansson
  1 sibling, 1 reply; 12+ messages in thread
From: Marek Szyprowski @ 2018-06-13  6:23 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, LKML, Greg Kroah-Hartman, Ulf Hansson, Lukas Wunner,
	Bartlomiej Zolnierkiewicz, Jon Hunter

Hi Rafael,

On 2018-06-12 16:24, Rafael J. Wysocki wrote:
> On Tuesday, June 12, 2018 2:44:23 PM CEST Marek Szyprowski wrote:
>> On 2018-06-12 13:00, Rafael J. Wysocki wrote:
>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>
>>> If a device link is added via device_link_add() by the driver of the
>>> link's consumer device, the supplier's runtime PM usage counter is
>>> going to be dropped by the pm_runtime_put_suppliers() call in
>>> driver_probe_device().  However, in that case it is not incremented
>>> unless the supplier driver is already present and the link is not
>>> stateless.  That leads to a runtime PM usage counter imbalance for
>>> the supplier device in a few cases.
>>>
>>> To prevent that from happening, bump up the supplier runtime
>>> PM usage counter in device_link_add() for all links with the
>>> DL_FLAG_PM_RUNTIME flag set that are added at the consumer probe
>>> time.  Use pm_runtime_get_noresume() for that as the callers of
>>> device_link_add() who want the supplier to be resumed by it should
>>> pass DL_FLAG_RPM_ACTIVE in flags to it anyway.
>>>
>>> Fixes: 21d5c57b3726 (PM / runtime: Use device links)
>>> Reported-by: Ulf Hansson <ulf.hansson@linaro.org>
>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>> ---
>>>
>>> This is a replacement for commit 1e8378619841 (PM / runtime: Fixup
>>> reference counting of device link suppliers at probe) that is going
>>> to be reverted.
>> Thanks Rafael for the patch. I've applied it and now I'm a bit puzzled.
> Thanks for testing! :-)
>
>> Let's get back to my IOMMU and codec case, mentioned here:
>> https://marc.info/?l=linux-pm&m=152878741527962&w=2
>>
>> Now, after applying your patch, when IOMMU creates a link with
>> DL_FLAG_PM_RUNTIME to the jpeg device (this happens when jpeg device is
>> being probed), it is not IOMMU is not runtime resumed anymore (that's
>> because the patch changes pm_runtime_get_sync to pm_runtime_get_noresume).
>> This means that until jpeg driver enables runtime pm for its device and
>> finally performs runtime pm suspends/resume cycle, the supplier won't be
>> resumed. On the other hand, when I add DL_FLAG_RPM_ACTIVE flag to link
>> creation, the supplier is properly resumed, but then, once the jpeg
>> device probe finishes, the supplier is still in runtime active state
>> until a next runtime suspend/resume cycle of jpeg device.
> That additional suspend-resume cycle should not be necessary in theory
> unless I'm missing something.
>
> The pm_request_idle() call in driver_probe_device() should trigger a
> suspend of the jpeg device after probe (unless blocked by something)
> and that should drop the RPM usage counter of the IOMMU.  Next, the
> pm_runtime_put_suppliers() in there should actually suspend it.

I've also would expect such behavior of PM core, but checks on real
hardware gives other results.

> It looks like the pm_request_idle() doesn't work as expected.

pm_request_idle() calls rpm_idle(), which returns early with -EAGAIN due to
(dev->power.runtime_status != RPM_ACTIVE) check. The device runtime_status
is RPM_SUSPENDED as initially set by pm_runtime_init() on device creation.
Notice that JPEG driver only calls pm_runtime_enable() and nothing more.

>> If I understand right, the DL_FLAG_RPM_ACTIVE flag should be there from the
>> beginning, but that time it runtime pm part of links worked in a bit
>> different way than now.
> Right, and evidently there are callers depending on the existing behavior.
>   
>> Is there any way to keep old behavior?
> Yes, there is, please test the appended v2 of the $subject patch.
>
> That said, I'd like to remove the extra supplier resume from there going
> forward as there may be callers who actually don't want that to happen and
> DL_FLAG_RPM_ACTIVE is there for a purpose.

Frankly, if the current behavior matches the designed behavior of
DL_FLAG_RPM_ACTIVE flag, then maybe instead of adding workarounds now, we
should simply fix all existing callers of device_link_add()? 'git grep' 
shows
only 6 places where links are created with DL_FLAG_PM_RUNTIME flag, I see no
problem to add DL_FLAG_RPM_ACTIVE there to keep current behavior after a fix
in runtime PM core. The description of DL_FLAG_RPM_ACTIVE should also be 
a bit
updated, because I initially thought that it means that the runtime pm 
counter
on supplier is increased for the whole lifetime of the device link (it 
is not
clear when core will call a corresponding pm_runtime_put()).

The other question is what need to be fixed to get proper behavior 
without the
additional suspend/resume cycle mentioned a few paragraphs above.

> ---
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: [PATCH v2] PM / core: Fix supplier device runtime PM usage counter imbalance
>
> If a device link is added via device_link_add() by the driver of the
> link's consumer device, the supplier's runtime PM usage counter is
> going to be dropped by the pm_runtime_put_suppliers() call in
> driver_probe_device().  However, in that case it is not incremented
> unless the supplier driver is already present and the link is not
> stateless.  That leads to a runtime PM usage counter imbalance for
> the supplier device in a few cases.
>
> To prevent that from happening, bump up the supplier runtime
> PM usage counter in device_link_add() for all links with the
> DL_FLAG_PM_RUNTIME flag set that are added at the consumer probe
> time.  Use pm_runtime_get_noresume() for that as the callers of
> device_link_add() who want the supplier to be resumed by it are
> expected to pass DL_FLAG_RPM_ACTIVE in flags to it anyway, but
> additionally resume the supplier if the link is added during
> consumer driver probe to retain the existing behavior for the
> callers depending on it.
>
> Fixes: 21d5c57b3726 (PM / runtime: Use device links)
> Reported-by: Ulf Hansson <ulf.hansson@linaro.org>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

I've tested this version of the patch and it keeps the current behavior for
links created with DL_FLAG_PM_RUNTIME flag. The questions is if we really
want it?

> ---
>   drivers/base/core.c |   15 +++++++++++----
>   1 file changed, 11 insertions(+), 4 deletions(-)
>
> Index: linux-pm/drivers/base/core.c
> ===================================================================
> --- linux-pm.orig/drivers/base/core.c
> +++ linux-pm/drivers/base/core.c
> @@ -216,6 +216,13 @@ struct device_link *device_link_add(stru
>   			link->rpm_active = true;
>   		}
>   		pm_runtime_new_link(consumer);
> +		/*
> +		 * If the link is being added by the consumer driver at probe
> +		 * time, balance the decrementation of the supplier's runtime PM
> +		 * usage counter after consumer probe in driver_probe_device().
> +		 */
> +		if (consumer->links.status == DL_DEV_PROBING)
> +			pm_runtime_get_noresume(supplier);
>   	}
>   	get_device(supplier);
>   	link->supplier = supplier;
> @@ -235,12 +242,12 @@ struct device_link *device_link_add(stru
>   			switch (consumer->links.status) {
>   			case DL_DEV_PROBING:
>   				/*
> -				 * Balance the decrementation of the supplier's
> -				 * runtime PM usage counter after consumer probe
> -				 * in driver_probe_device().
> +				 * Some callers expect the link creation during
> +				 * consumer driver probe to resume the supplier
> +				 * even without DL_FLAG_RPM_ACTIVE.
>   				 */
>   				if (flags & DL_FLAG_PM_RUNTIME)
> -					pm_runtime_get_sync(supplier);
> +					pm_runtime_resume(supplier);
>   
>   				link->status = DL_STATE_CONSUMER_PROBE;
>   				break;
>
>
>
>

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] PM / core: Fix supplier device runtime PM usage counter imbalance
  2018-06-12 19:43     ` Ulf Hansson
@ 2018-06-13  6:42       ` Marek Szyprowski
  2018-06-13  8:25         ` Ulf Hansson
  2018-06-13  7:58       ` Rafael J. Wysocki
  1 sibling, 1 reply; 12+ messages in thread
From: Marek Szyprowski @ 2018-06-13  6:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Rafael J. Wysocki, Linux PM, LKML, Greg Kroah-Hartman,
	Lukas Wunner, Bartlomiej Zolnierkiewicz, Jon Hunter

Hi Ulf,

On 2018-06-12 21:43, Ulf Hansson wrote:
> On 12 June 2018 at 14:44, Marek Szyprowski <m.szyprowski@samsung.com> wrote:
>> On 2018-06-12 13:00, Rafael J. Wysocki wrote:
>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>
>>> If a device link is added via device_link_add() by the driver of the
>>> link's consumer device, the supplier's runtime PM usage counter is
>>> going to be dropped by the pm_runtime_put_suppliers() call in
>>> driver_probe_device().  However, in that case it is not incremented
>>> unless the supplier driver is already present and the link is not
>>> stateless.  That leads to a runtime PM usage counter imbalance for
>>> the supplier device in a few cases.
>>>
>>> To prevent that from happening, bump up the supplier runtime
>>> PM usage counter in device_link_add() for all links with the
>>> DL_FLAG_PM_RUNTIME flag set that are added at the consumer probe
>>> time.  Use pm_runtime_get_noresume() for that as the callers of
>>> device_link_add() who want the supplier to be resumed by it should
>>> pass DL_FLAG_RPM_ACTIVE in flags to it anyway.
>>>
>>> Fixes: 21d5c57b3726 (PM / runtime: Use device links)
>>> Reported-by: Ulf Hansson <ulf.hansson@linaro.org>
>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>> ---
>>>
>>> This is a replacement for commit 1e8378619841 (PM / runtime: Fixup
>>> reference counting of device link suppliers at probe) that is going
>>> to be reverted.
>> Thanks Rafael for the patch. I've applied it and now I'm a bit puzzled.
>> Let's get back to my IOMMU and codec case, mentioned here:
>> https://marc.info/?l=linux-pm&m=152878741527962&w=2
>>
>> Now, after applying your patch, when IOMMU creates a link with
>> DL_FLAG_PM_RUNTIME to the jpeg device (this happens when jpeg device is
>> being probed), it is not IOMMU is not runtime resumed anymore (that's
>> because the patch changes pm_runtime_get_sync to pm_runtime_get_noresume).
> According to your earlier replies, I thought this is what you wanted. No?
>
>> This means that until jpeg driver enables runtime pm for its device and
>> finally performs runtime pm suspends/resume cycle, the supplier won't be
>> resumed.
> What's the problem with that?

Wasted power. IOMMU device is left enabled until first use of JPEG device,
what means that respective power domain is also turned on unnecessarily.

> When does the IOMMU device needs to be
> runtime resumed?

I would like to have IOMMU (supplier) runtime active all the time the JPEG
(consumer) device is runtime active AND all the time between probe() and
remove() if JPEG (consumer) doesn't support runtime PM (if it doesn't
enable runtime PM at all).

>> On the other hand, when I add DL_FLAG_RPM_ACTIVE flag to link
>> creation, the supplier is properly resumed, but then, once the jpeg
>> device probe finishes, the supplier is still in runtime active state
>> until a next runtime suspend/resume cycle of jpeg device.
> Could you elaborate on the what happens in jpeg driver during probe,
> in regards to runtime PM. Perhaps you can also point me to what driver
> it is?

s5p_jpeg_probe() doesn't touch hardware at all and only calls 
pm_runtime_enable(),
see drivers/media/platform/s5p-jpeg/jpeg-core.c

>> If I understand
>> right, the DL_FLAG_RPM_ACTIVE flag should be there from the beginning,
>> but that time it runtime pm part of links worked in a bit different way
>> than now.
> Just to make sure, you also reverted 1e8378619841, before trying
> Rafael's change on top?

Yes

>> Is there any way to keep old behavior?
> I think the old behavior is sub-optimal. I am sure there are users
> that really don't want the driver core to runtime resume the supplier
> unconditionally.
>
> I would rather go and fix the few users of device_link_add(), to use
> DL_FLAG_RPM_ACTIVE, in cases when they need it. Of course I am fine if
> we do these changes in a step by step basis as well.

I also agree that the old behavior is sub-optimal and there are use
cases for the new (corrected) approach. I see no problems to add
DL_FLAG_RPM_ACTIVE to exynos-iommu driver (and probably the 5 other
drivers, which create links with DL_FLAG_PM_RUNTIME flag set to avoid
regressions). The question is what else should be changed/fixed to get
old behavior now.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] PM / core: Fix supplier device runtime PM usage counter imbalance
  2018-06-12 19:43     ` Ulf Hansson
  2018-06-13  6:42       ` Marek Szyprowski
@ 2018-06-13  7:58       ` Rafael J. Wysocki
  1 sibling, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2018-06-13  7:58 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Marek Szyprowski, Rafael J. Wysocki, Linux PM, LKML,
	Greg Kroah-Hartman, Lukas Wunner, Bartlomiej Zolnierkiewicz,
	Jon Hunter

On Tue, Jun 12, 2018 at 9:43 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On 12 June 2018 at 14:44, Marek Szyprowski <m.szyprowski@samsung.com> wrote:

[cut]

>>
>> Is there any way to keep old behavior?
>
> I think the old behavior is sub-optimal. I am sure there are users
> that really don't want the driver core to runtime resume the supplier
> unconditionally.

I agree.

That said, the existing behavior has been there for quite a while and
the callers of device_link_add() that have grown a dependency on it
should be given a chance to change before it goes away.

> I would rather go and fix the few users of device_link_add(), to use
> DL_FLAG_RPM_ACTIVE, in cases when they need it.

Which BTW was the original idea. :-)

> Of course I am fine if we do these changes in a step by step basis as well.

OK, because that's what I'd prefer to do.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] PM / core: Fix supplier device runtime PM usage counter imbalance
  2018-06-13  6:23       ` Marek Szyprowski
@ 2018-06-13  8:16         ` Rafael J. Wysocki
  2018-06-13 10:23           ` Marek Szyprowski
  0 siblings, 1 reply; 12+ messages in thread
From: Rafael J. Wysocki @ 2018-06-13  8:16 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Rafael J. Wysocki, Linux PM, LKML, Greg Kroah-Hartman,
	Ulf Hansson, Lukas Wunner, Bartlomiej Zolnierkiewicz, Jon Hunter

On Wed, Jun 13, 2018 at 8:23 AM, Marek Szyprowski
<m.szyprowski@samsung.com> wrote:
> Hi Rafael,

Hi Marek,

> On 2018-06-12 16:24, Rafael J. Wysocki wrote:
>> On Tuesday, June 12, 2018 2:44:23 PM CEST Marek Szyprowski wrote:
>>> On 2018-06-12 13:00, Rafael J. Wysocki wrote:
>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>
>>>> If a device link is added via device_link_add() by the driver of the
>>>> link's consumer device, the supplier's runtime PM usage counter is
>>>> going to be dropped by the pm_runtime_put_suppliers() call in
>>>> driver_probe_device().  However, in that case it is not incremented
>>>> unless the supplier driver is already present and the link is not
>>>> stateless.  That leads to a runtime PM usage counter imbalance for
>>>> the supplier device in a few cases.
>>>>
>>>> To prevent that from happening, bump up the supplier runtime
>>>> PM usage counter in device_link_add() for all links with the
>>>> DL_FLAG_PM_RUNTIME flag set that are added at the consumer probe
>>>> time.  Use pm_runtime_get_noresume() for that as the callers of
>>>> device_link_add() who want the supplier to be resumed by it should
>>>> pass DL_FLAG_RPM_ACTIVE in flags to it anyway.
>>>>
>>>> Fixes: 21d5c57b3726 (PM / runtime: Use device links)
>>>> Reported-by: Ulf Hansson <ulf.hansson@linaro.org>
>>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>> ---
>>>>
>>>> This is a replacement for commit 1e8378619841 (PM / runtime: Fixup
>>>> reference counting of device link suppliers at probe) that is going
>>>> to be reverted.
>>> Thanks Rafael for the patch. I've applied it and now I'm a bit puzzled.
>> Thanks for testing! :-)
>>
>>> Let's get back to my IOMMU and codec case, mentioned here:
>>> https://marc.info/?l=linux-pm&m=152878741527962&w=2
>>>
>>> Now, after applying your patch, when IOMMU creates a link with
>>> DL_FLAG_PM_RUNTIME to the jpeg device (this happens when jpeg device is
>>> being probed), it is not IOMMU is not runtime resumed anymore (that's
>>> because the patch changes pm_runtime_get_sync to pm_runtime_get_noresume).
>>> This means that until jpeg driver enables runtime pm for its device and
>>> finally performs runtime pm suspends/resume cycle, the supplier won't be
>>> resumed. On the other hand, when I add DL_FLAG_RPM_ACTIVE flag to link
>>> creation, the supplier is properly resumed, but then, once the jpeg
>>> device probe finishes, the supplier is still in runtime active state
>>> until a next runtime suspend/resume cycle of jpeg device.
>> That additional suspend-resume cycle should not be necessary in theory
>> unless I'm missing something.
>>
>> The pm_request_idle() call in driver_probe_device() should trigger a
>> suspend of the jpeg device after probe (unless blocked by something)
>> and that should drop the RPM usage counter of the IOMMU.  Next, the
>> pm_runtime_put_suppliers() in there should actually suspend it.
>
> I've also would expect such behavior of PM core, but checks on real
> hardware gives other results.
>
>> It looks like the pm_request_idle() doesn't work as expected.
>
> pm_request_idle() calls rpm_idle(), which returns early with -EAGAIN due to
> (dev->power.runtime_status != RPM_ACTIVE) check. The device runtime_status
> is RPM_SUSPENDED as initially set by pm_runtime_init() on device creation.
> Notice that JPEG driver only calls pm_runtime_enable() and nothing more.

But is the device really suspended during probe?

Note that "suspend" means basically "not accessible to software", but
I guess software needs to access it to set it up, right?  If that is
the case, maybe the driver should set the initial RPM status of the
device to "active" before enabling runtime PM for it?  That at least
would be consistent with passing DL_FLAG_RPM_ACTIVE to
device_link_add().

There are drivers that don't actually touch the hardware in ->probe(),
but it follows from your description that this is not one of them.

>>> If I understand right, the DL_FLAG_RPM_ACTIVE flag should be there from the
>>> beginning, but that time it runtime pm part of links worked in a bit
>>> different way than now.
>> Right, and evidently there are callers depending on the existing behavior.
>>
>>> Is there any way to keep old behavior?
>> Yes, there is, please test the appended v2 of the $subject patch.
>>
>> That said, I'd like to remove the extra supplier resume from there going
>> forward as there may be callers who actually don't want that to happen and
>> DL_FLAG_RPM_ACTIVE is there for a purpose.
>
> Frankly, if the current behavior matches the designed behavior of
> DL_FLAG_RPM_ACTIVE flag,

It doesn't match the DL_FLAG_RPM_ACTIVE exactly as you've already noticed.

DL_FLAG_RPM_ACTIVE assumes that the initial RPM status of the device
will be RPM_ACTIVE and therefore the suppliers need to be resumed at
link creation time.  Therefore device_link_add() causes the suppliers
to remain in the RPM_ACTIVE state with the rpm_active status bit of
the link set, whereas currently they are simply suspended again by the
pm_runtime_put_suppliers() in driver_probe_device() and the link is
not marked as "rpm_active".

> then maybe instead of adding workarounds now, we
> should simply fix all existing callers of device_link_add()? 'git grep'
> shows
> only 6 places where links are created with DL_FLAG_PM_RUNTIME flag, I see no
> problem to add DL_FLAG_RPM_ACTIVE there to keep current behavior after a fix
> in runtime PM core. The description of DL_FLAG_RPM_ACTIVE should also be
> a bit
> updated, because I initially thought that it means that the runtime pm
> counter
> on supplier is increased for the whole lifetime of the device link (it
> is not
> clear when core will call a corresponding pm_runtime_put()).
>
> The other question is what need to be fixed to get proper behavior
> without the
> additional suspend/resume cycle mentioned a few paragraphs above.

As stated already, if the driver passes DL_FLAG_RPM_ACTIVE to
device_link_add() at probe time, then the initial RPM status of the
device being probed is expected to be RPM_ACTIVE.

>> ---
>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> Subject: [PATCH v2] PM / core: Fix supplier device runtime PM usage counter imbalance
>>
>> If a device link is added via device_link_add() by the driver of the
>> link's consumer device, the supplier's runtime PM usage counter is
>> going to be dropped by the pm_runtime_put_suppliers() call in
>> driver_probe_device().  However, in that case it is not incremented
>> unless the supplier driver is already present and the link is not
>> stateless.  That leads to a runtime PM usage counter imbalance for
>> the supplier device in a few cases.
>>
>> To prevent that from happening, bump up the supplier runtime
>> PM usage counter in device_link_add() for all links with the
>> DL_FLAG_PM_RUNTIME flag set that are added at the consumer probe
>> time.  Use pm_runtime_get_noresume() for that as the callers of
>> device_link_add() who want the supplier to be resumed by it are
>> expected to pass DL_FLAG_RPM_ACTIVE in flags to it anyway, but
>> additionally resume the supplier if the link is added during
>> consumer driver probe to retain the existing behavior for the
>> callers depending on it.
>>
>> Fixes: 21d5c57b3726 (PM / runtime: Use device links)
>> Reported-by: Ulf Hansson <ulf.hansson@linaro.org>
>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> I've tested this version of the patch and it keeps the current behavior for
> links created with DL_FLAG_PM_RUNTIME flag. The questions is if we really
> want it?

I think so.

Basically, there are two changes at hand: fixing the behavior for
stateless links (and for stateful ones if the supplier driver is not
present, but that arguably is a corner case) and the behavior change
for stateful links (with supplier drivers present).

Arguably, the former is more important than the latter and I'd like to
be able to push that fix into -stable without dependencies.  The
latter can be done when all of the current callers depending on the
existing behavior have been adjusted.

So, I'm going to add a Tested-by from you to this patch, if you don't
mind, and queue it up.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] PM / core: Fix supplier device runtime PM usage counter imbalance
  2018-06-13  6:42       ` Marek Szyprowski
@ 2018-06-13  8:25         ` Ulf Hansson
  0 siblings, 0 replies; 12+ messages in thread
From: Ulf Hansson @ 2018-06-13  8:25 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Rafael J. Wysocki, Linux PM, LKML, Greg Kroah-Hartman,
	Lukas Wunner, Bartlomiej Zolnierkiewicz, Jon Hunter

On 13 June 2018 at 08:42, Marek Szyprowski <m.szyprowski@samsung.com> wrote:
> Hi Ulf,
>
> On 2018-06-12 21:43, Ulf Hansson wrote:
>> On 12 June 2018 at 14:44, Marek Szyprowski <m.szyprowski@samsung.com> wrote:
>>> On 2018-06-12 13:00, Rafael J. Wysocki wrote:
>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>
>>>> If a device link is added via device_link_add() by the driver of the
>>>> link's consumer device, the supplier's runtime PM usage counter is
>>>> going to be dropped by the pm_runtime_put_suppliers() call in
>>>> driver_probe_device().  However, in that case it is not incremented
>>>> unless the supplier driver is already present and the link is not
>>>> stateless.  That leads to a runtime PM usage counter imbalance for
>>>> the supplier device in a few cases.
>>>>
>>>> To prevent that from happening, bump up the supplier runtime
>>>> PM usage counter in device_link_add() for all links with the
>>>> DL_FLAG_PM_RUNTIME flag set that are added at the consumer probe
>>>> time.  Use pm_runtime_get_noresume() for that as the callers of
>>>> device_link_add() who want the supplier to be resumed by it should
>>>> pass DL_FLAG_RPM_ACTIVE in flags to it anyway.
>>>>
>>>> Fixes: 21d5c57b3726 (PM / runtime: Use device links)
>>>> Reported-by: Ulf Hansson <ulf.hansson@linaro.org>
>>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>> ---
>>>>
>>>> This is a replacement for commit 1e8378619841 (PM / runtime: Fixup
>>>> reference counting of device link suppliers at probe) that is going
>>>> to be reverted.
>>> Thanks Rafael for the patch. I've applied it and now I'm a bit puzzled.
>>> Let's get back to my IOMMU and codec case, mentioned here:
>>> https://marc.info/?l=linux-pm&m=152878741527962&w=2
>>>
>>> Now, after applying your patch, when IOMMU creates a link with
>>> DL_FLAG_PM_RUNTIME to the jpeg device (this happens when jpeg device is
>>> being probed), it is not IOMMU is not runtime resumed anymore (that's
>>> because the patch changes pm_runtime_get_sync to pm_runtime_get_noresume).
>> According to your earlier replies, I thought this is what you wanted. No?
>>
>>> This means that until jpeg driver enables runtime pm for its device and
>>> finally performs runtime pm suspends/resume cycle, the supplier won't be
>>> resumed.
>> What's the problem with that?
>
> Wasted power. IOMMU device is left enabled until first use of JPEG device,
> what means that respective power domain is also turned on unnecessarily.

I understand the PM domain (genpd) gets powered on during probe, for
the respective device.

Now, that shouldn't be a problem, because, if the driver enables
runtime PM and leaves its device in runtime suspend state when probe
is complete, then when driver core invokes the dev->pm_domain->sync()
callback (set to genpd_dev_pm_sync()), genpd should be able to power
off the PM domain.

>
>> When does the IOMMU device needs to be
>> runtime resumed?
>
> I would like to have IOMMU (supplier) runtime active all the time the JPEG
> (consumer) device is runtime active AND all the time between probe() and
> remove() if JPEG (consumer) doesn't support runtime PM (if it doesn't
> enable runtime PM at all).

Right, that explains why you need the existing behavior. Thanks for clarifying!

I do see an opportunity for optimizations here, as clearly there are
cases when you don't need the supplier, the IOMMU device, to be
runtime resumed during link creation.

However, currently this seems not feasible, as
exynos_iommu_add_device() is called from a bus notifier, when the jpeg
device is added. The point is, exynos_iommu_add_device() has no clue
whether runtime PM is or will become enabled for the supplier device.
Thus using DL_FLAG_RPM_ACTIVE is needed. To address this, I think the
device link creation needs to be closer managed by the jpeg driver,
somehow.

>
>>> On the other hand, when I add DL_FLAG_RPM_ACTIVE flag to link
>>> creation, the supplier is properly resumed, but then, once the jpeg
>>> device probe finishes, the supplier is still in runtime active state
>>> until a next runtime suspend/resume cycle of jpeg device.
>> Could you elaborate on the what happens in jpeg driver during probe,
>> in regards to runtime PM. Perhaps you can also point me to what driver
>> it is?
>
> s5p_jpeg_probe() doesn't touch hardware at all and only calls
> pm_runtime_enable(),
> see drivers/media/platform/s5p-jpeg/jpeg-core.c

Thanks!

>
>>> If I understand
>>> right, the DL_FLAG_RPM_ACTIVE flag should be there from the beginning,
>>> but that time it runtime pm part of links worked in a bit different way
>>> than now.
>> Just to make sure, you also reverted 1e8378619841, before trying
>> Rafael's change on top?
>
> Yes

Great!

>
>>> Is there any way to keep old behavior?
>> I think the old behavior is sub-optimal. I am sure there are users
>> that really don't want the driver core to runtime resume the supplier
>> unconditionally.
>>
>> I would rather go and fix the few users of device_link_add(), to use
>> DL_FLAG_RPM_ACTIVE, in cases when they need it. Of course I am fine if
>> we do these changes in a step by step basis as well.
>
> I also agree that the old behavior is sub-optimal and there are use
> cases for the new (corrected) approach. I see no problems to add
> DL_FLAG_RPM_ACTIVE to exynos-iommu driver (and probably the 5 other
> drivers, which create links with DL_FLAG_PM_RUNTIME flag set to avoid
> regressions). The question is what else should be changed/fixed to get
> old behavior now.

I think Rafael's v2, may take care of the problem. No?

If not, then we need to investigate further.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] PM / core: Fix supplier device runtime PM usage counter imbalance
  2018-06-12 14:24     ` Rafael J. Wysocki
  2018-06-13  6:23       ` Marek Szyprowski
@ 2018-06-13  8:28       ` Ulf Hansson
  1 sibling, 0 replies; 12+ messages in thread
From: Ulf Hansson @ 2018-06-13  8:28 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Marek Szyprowski, Linux PM, LKML, Greg Kroah-Hartman,
	Lukas Wunner, Bartlomiej Zolnierkiewicz, Jon Hunter

[...]

> ---
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: [PATCH v2] PM / core: Fix supplier device runtime PM usage counter imbalance
>
> If a device link is added via device_link_add() by the driver of the
> link's consumer device, the supplier's runtime PM usage counter is
> going to be dropped by the pm_runtime_put_suppliers() call in
> driver_probe_device().  However, in that case it is not incremented
> unless the supplier driver is already present and the link is not
> stateless.  That leads to a runtime PM usage counter imbalance for
> the supplier device in a few cases.
>
> To prevent that from happening, bump up the supplier runtime
> PM usage counter in device_link_add() for all links with the
> DL_FLAG_PM_RUNTIME flag set that are added at the consumer probe
> time.  Use pm_runtime_get_noresume() for that as the callers of
> device_link_add() who want the supplier to be resumed by it are
> expected to pass DL_FLAG_RPM_ACTIVE in flags to it anyway, but
> additionally resume the supplier if the link is added during
> consumer driver probe to retain the existing behavior for the
> callers depending on it.
>
> Fixes: 21d5c57b3726 (PM / runtime: Use device links)
> Reported-by: Ulf Hansson <ulf.hansson@linaro.org>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

Kind regards
Uffe

> ---
>  drivers/base/core.c |   15 +++++++++++----
>  1 file changed, 11 insertions(+), 4 deletions(-)
>
> Index: linux-pm/drivers/base/core.c
> ===================================================================
> --- linux-pm.orig/drivers/base/core.c
> +++ linux-pm/drivers/base/core.c
> @@ -216,6 +216,13 @@ struct device_link *device_link_add(stru
>                         link->rpm_active = true;
>                 }
>                 pm_runtime_new_link(consumer);
> +               /*
> +                * If the link is being added by the consumer driver at probe
> +                * time, balance the decrementation of the supplier's runtime PM
> +                * usage counter after consumer probe in driver_probe_device().
> +                */
> +               if (consumer->links.status == DL_DEV_PROBING)
> +                       pm_runtime_get_noresume(supplier);
>         }
>         get_device(supplier);
>         link->supplier = supplier;
> @@ -235,12 +242,12 @@ struct device_link *device_link_add(stru
>                         switch (consumer->links.status) {
>                         case DL_DEV_PROBING:
>                                 /*
> -                                * Balance the decrementation of the supplier's
> -                                * runtime PM usage counter after consumer probe
> -                                * in driver_probe_device().
> +                                * Some callers expect the link creation during
> +                                * consumer driver probe to resume the supplier
> +                                * even without DL_FLAG_RPM_ACTIVE.
>                                  */
>                                 if (flags & DL_FLAG_PM_RUNTIME)
> -                                       pm_runtime_get_sync(supplier);
> +                                       pm_runtime_resume(supplier);
>
>                                 link->status = DL_STATE_CONSUMER_PROBE;
>                                 break;
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] PM / core: Fix supplier device runtime PM usage counter imbalance
  2018-06-13  8:16         ` Rafael J. Wysocki
@ 2018-06-13 10:23           ` Marek Szyprowski
  2018-06-13 10:58             ` Rafael J. Wysocki
  0 siblings, 1 reply; 12+ messages in thread
From: Marek Szyprowski @ 2018-06-13 10:23 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Linux PM, LKML, Greg Kroah-Hartman,
	Ulf Hansson, Lukas Wunner, Bartlomiej Zolnierkiewicz, Jon Hunter

Hi Rafael,

On 2018-06-13 10:16, Rafael J. Wysocki wrote:
> On Wed, Jun 13, 2018 at 8:23 AM, Marek Szyprowski
> <m.szyprowski@samsung.com> wrote:
> On 2018-06-12 16:24, Rafael J. Wysocki wrote:
>>> On Tuesday, June 12, 2018 2:44:23 PM CEST Marek Szyprowski wrote:
>>>> On 2018-06-12 13:00, Rafael J. Wysocki wrote:
>>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>>
>>>>> If a device link is added via device_link_add() by the driver of the
>>>>> link's consumer device, the supplier's runtime PM usage counter is
>>>>> going to be dropped by the pm_runtime_put_suppliers() call in
>>>>> driver_probe_device().  However, in that case it is not incremented
>>>>> unless the supplier driver is already present and the link is not
>>>>> stateless.  That leads to a runtime PM usage counter imbalance for
>>>>> the supplier device in a few cases.
>>>>>
>>>>> To prevent that from happening, bump up the supplier runtime
>>>>> PM usage counter in device_link_add() for all links with the
>>>>> DL_FLAG_PM_RUNTIME flag set that are added at the consumer probe
>>>>> time.  Use pm_runtime_get_noresume() for that as the callers of
>>>>> device_link_add() who want the supplier to be resumed by it should
>>>>> pass DL_FLAG_RPM_ACTIVE in flags to it anyway.
>>>>>
>>>>> Fixes: 21d5c57b3726 (PM / runtime: Use device links)
>>>>> Reported-by: Ulf Hansson <ulf.hansson@linaro.org>
>>>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>> ---
>>>>>
>>>>> This is a replacement for commit 1e8378619841 (PM / runtime: Fixup
>>>>> reference counting of device link suppliers at probe) that is going
>>>>> to be reverted.
>>>> Thanks Rafael for the patch. I've applied it and now I'm a bit puzzled.
>>> Thanks for testing! :-)
>>>
>>>> Let's get back to my IOMMU and codec case, mentioned here:
>>>> https://marc.info/?l=linux-pm&m=152878741527962&w=2
>>>>
>>>> Now, after applying your patch, when IOMMU creates a link with
>>>> DL_FLAG_PM_RUNTIME to the jpeg device (this happens when jpeg device is
>>>> being probed), it is not IOMMU is not runtime resumed anymore (that's
>>>> because the patch changes pm_runtime_get_sync to pm_runtime_get_noresume).
>>>> This means that until jpeg driver enables runtime pm for its device and
>>>> finally performs runtime pm suspends/resume cycle, the supplier won't be
>>>> resumed. On the other hand, when I add DL_FLAG_RPM_ACTIVE flag to link
>>>> creation, the supplier is properly resumed, but then, once the jpeg
>>>> device probe finishes, the supplier is still in runtime active state
>>>> until a next runtime suspend/resume cycle of jpeg device.
>>> That additional suspend-resume cycle should not be necessary in theory
>>> unless I'm missing something.
>>>
>>> The pm_request_idle() call in driver_probe_device() should trigger a
>>> suspend of the jpeg device after probe (unless blocked by something)
>>> and that should drop the RPM usage counter of the IOMMU.  Next, the
>>> pm_runtime_put_suppliers() in there should actually suspend it.
>> I've also would expect such behavior of PM core, but checks on real
>> hardware gives other results.
>>
>>> It looks like the pm_request_idle() doesn't work as expected.
>> pm_request_idle() calls rpm_idle(), which returns early with -EAGAIN due to
>> (dev->power.runtime_status != RPM_ACTIVE) check. The device runtime_status
>> is RPM_SUSPENDED as initially set by pm_runtime_init() on device creation.
>> Notice that JPEG driver only calls pm_runtime_enable() and nothing more.
> But is the device really suspended during probe?

This is a runtime pm state of the newly created platform device when driver
core calls ->probe() from its driver. At that time it is not yet known if
the driver supports runtime pm or not and typically drivers do some hardware
access there. Platform device is created from device tree.

> Note that "suspend" means basically "not accessible to software", but
> I guess software needs to access it to set it up, right?  If that is
> the case, maybe the driver should set the initial RPM status of the
> device to "active" before enabling runtime PM for it?  That at least
> would be consistent with passing DL_FLAG_RPM_ACTIVE to
> device_link_add().
>
> There are drivers that don't actually touch the hardware in ->probe(),
> but it follows from your description that this is not one of them.

The JPEG driver was just an example, and it actually doesn't touch hw in
probe. However I would like to have the typical cases working:

1. runtime pm enabled, no hw access
2. runtime pm enabled, some hw access (guarded by either
    pm_runtime_get_sync or pm_runtime_get_noresume+pm_runtime_set_active)
3. runtime pm disabled (no runtime pm calls at all), some hw access.

For the last type it is important to enable IOMMU during the probe().

>>>> If I understand right, the DL_FLAG_RPM_ACTIVE flag should be there from the
>>>> beginning, but that time it runtime pm part of links worked in a bit
>>>> different way than now.
>>> Right, and evidently there are callers depending on the existing behavior.
>>>
>>>> Is there any way to keep old behavior?
>>> Yes, there is, please test the appended v2 of the $subject patch.
>>>
>>> That said, I'd like to remove the extra supplier resume from there going
>>> forward as there may be callers who actually don't want that to happen and
>>> DL_FLAG_RPM_ACTIVE is there for a purpose.
>> Frankly, if the current behavior matches the designed behavior of
>> DL_FLAG_RPM_ACTIVE flag,
> It doesn't match the DL_FLAG_RPM_ACTIVE exactly as you've already noticed.
>
> DL_FLAG_RPM_ACTIVE assumes that the initial RPM status of the device
> will be RPM_ACTIVE and therefore the suppliers need to be resumed at
> link creation time.  Therefore device_link_add() causes the suppliers
> to remain in the RPM_ACTIVE state with the rpm_active status bit of
> the link set, whereas currently they are simply suspended again by the
> pm_runtime_put_suppliers() in driver_probe_device() and the link is
> not marked as "rpm_active".
>
>> then maybe instead of adding workarounds now, we
>> should simply fix all existing callers of device_link_add()? 'git grep'
>> shows
>> only 6 places where links are created with DL_FLAG_PM_RUNTIME flag, I see no
>> problem to add DL_FLAG_RPM_ACTIVE there to keep current behavior after a fix
>> in runtime PM core. The description of DL_FLAG_RPM_ACTIVE should also be
>> a bit
>> updated, because I initially thought that it means that the runtime pm
>> counter
>> on supplier is increased for the whole lifetime of the device link (it
>> is not
>> clear when core will call a corresponding pm_runtime_put()).
>>
>> The other question is what need to be fixed to get proper behavior
>> without the
>> additional suspend/resume cycle mentioned a few paragraphs above.
> As stated already, if the driver passes DL_FLAG_RPM_ACTIVE to
> device_link_add() at probe time, then the initial RPM status of the
> device being probed is expected to be RPM_ACTIVE.

Okay, then this doesn't match the case of Exynos IOMMU and JPEG (and other
Exynos multimedia drivers), because the links are created from the 
add_device
platform bus notifier, which is executed just before ->probe() callback of
the newly created jpeg/multimedia device. That time of course 
jpeg/multimedia
driver is not able to enable runtime PM of the handled device yet...

>>> ---
>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>> Subject: [PATCH v2] PM / core: Fix supplier device runtime PM usage counter imbalance
>>>
>>> If a device link is added via device_link_add() by the driver of the
>>> link's consumer device, the supplier's runtime PM usage counter is
>>> going to be dropped by the pm_runtime_put_suppliers() call in
>>> driver_probe_device().  However, in that case it is not incremented
>>> unless the supplier driver is already present and the link is not
>>> stateless.  That leads to a runtime PM usage counter imbalance for
>>> the supplier device in a few cases.
>>>
>>> To prevent that from happening, bump up the supplier runtime
>>> PM usage counter in device_link_add() for all links with the
>>> DL_FLAG_PM_RUNTIME flag set that are added at the consumer probe
>>> time.  Use pm_runtime_get_noresume() for that as the callers of
>>> device_link_add() who want the supplier to be resumed by it are
>>> expected to pass DL_FLAG_RPM_ACTIVE in flags to it anyway, but
>>> additionally resume the supplier if the link is added during
>>> consumer driver probe to retain the existing behavior for the
>>> callers depending on it.
>>>
>>> Fixes: 21d5c57b3726 (PM / runtime: Use device links)
>>> Reported-by: Ulf Hansson <ulf.hansson@linaro.org>
>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> I've tested this version of the patch and it keeps the current behavior for
>> links created with DL_FLAG_PM_RUNTIME flag. The questions is if we really
>> want it?
> I think so.
>
> Basically, there are two changes at hand: fixing the behavior for
> stateless links (and for stateful ones if the supplier driver is not
> present, but that arguably is a corner case) and the behavior change
> for stateful links (with supplier drivers present).
>
> Arguably, the former is more important than the latter and I'd like to
> be able to push that fix into -stable without dependencies.  The
> latter can be done when all of the current callers depending on the
> existing behavior have been adjusted.
>
> So, I'm going to add a Tested-by from you to this patch, if you don't
> mind, and queue it up.

Okay, fine for me.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] PM / core: Fix supplier device runtime PM usage counter imbalance
  2018-06-13 10:23           ` Marek Szyprowski
@ 2018-06-13 10:58             ` Rafael J. Wysocki
  0 siblings, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2018-06-13 10:58 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Linux PM, LKML,
	Greg Kroah-Hartman, Ulf Hansson, Lukas Wunner,
	Bartlomiej Zolnierkiewicz, Jon Hunter

Hi Marek,

On Wed, Jun 13, 2018 at 12:23 PM, Marek Szyprowski
<m.szyprowski@samsung.com> wrote:
> Hi Rafael,
>

[cut]

>>>>> Let's get back to my IOMMU and codec case, mentioned here:
>>>>> https://marc.info/?l=linux-pm&m=152878741527962&w=2
>>>>>
>>>>> Now, after applying your patch, when IOMMU creates a link with
>>>>> DL_FLAG_PM_RUNTIME to the jpeg device (this happens when jpeg device is
>>>>> being probed), it is not IOMMU is not runtime resumed anymore (that's
>>>>> because the patch changes pm_runtime_get_sync to pm_runtime_get_noresume).
>>>>> This means that until jpeg driver enables runtime pm for its device and
>>>>> finally performs runtime pm suspends/resume cycle, the supplier won't be
>>>>> resumed. On the other hand, when I add DL_FLAG_RPM_ACTIVE flag to link
>>>>> creation, the supplier is properly resumed, but then, once the jpeg
>>>>> device probe finishes, the supplier is still in runtime active state
>>>>> until a next runtime suspend/resume cycle of jpeg device.
>>>> That additional suspend-resume cycle should not be necessary in theory
>>>> unless I'm missing something.
>>>>
>>>> The pm_request_idle() call in driver_probe_device() should trigger a
>>>> suspend of the jpeg device after probe (unless blocked by something)
>>>> and that should drop the RPM usage counter of the IOMMU.  Next, the
>>>> pm_runtime_put_suppliers() in there should actually suspend it.
>>> I've also would expect such behavior of PM core, but checks on real
>>> hardware gives other results.
>>>
>>>> It looks like the pm_request_idle() doesn't work as expected.
>>> pm_request_idle() calls rpm_idle(), which returns early with -EAGAIN due to
>>> (dev->power.runtime_status != RPM_ACTIVE) check. The device runtime_status
>>> is RPM_SUSPENDED as initially set by pm_runtime_init() on device creation.
>>> Notice that JPEG driver only calls pm_runtime_enable() and nothing more.
>>
>> But is the device really suspended during probe?
>
> This is a runtime pm state of the newly created platform device when driver
> core calls ->probe() from its driver. At that time it is not yet known if
> the driver supports runtime pm or not and typically drivers do some hardware
> access there. Platform device is created from device tree.

By the time the core calls pm_request_idle() in driver_probe_device(),
really_probe() has run already and the driver's ->probe() should have
run and that should damn well know if runtime PM can be supported.

>> Note that "suspend" means basically "not accessible to software", but
>> I guess software needs to access it to set it up, right?  If that is
>> the case, maybe the driver should set the initial RPM status of the
>> device to "active" before enabling runtime PM for it?  That at least
>> would be consistent with passing DL_FLAG_RPM_ACTIVE to
>> device_link_add().
>>
>> There are drivers that don't actually touch the hardware in ->probe(),
>> but it follows from your description that this is not one of them.
>
> The JPEG driver was just an example, and it actually doesn't touch hw in
> probe. However I would like to have the typical cases working:
>
> 1. runtime pm enabled, no hw access
> 2. runtime pm enabled, some hw access (guarded by either
>     pm_runtime_get_sync or pm_runtime_get_noresume+pm_runtime_set_active)
> 3. runtime pm disabled (no runtime pm calls at all), some hw access.
>
> For the last type it is important to enable IOMMU during the probe().

I see.

In that case whoever adds the link needs to do an extra
pm_runtime_resume() on the supplier after the link has been added.

Doing that in device_link_add() itself would adversely affect the case
in which the creator of the link does not want the supplier to be
resumed.

>>>>> If I understand right, the DL_FLAG_RPM_ACTIVE flag should be there from the
>>>>> beginning, but that time it runtime pm part of links worked in a bit
>>>>> different way than now.
>>>> Right, and evidently there are callers depending on the existing behavior.
>>>>
>>>>> Is there any way to keep old behavior?
>>>> Yes, there is, please test the appended v2 of the $subject patch.
>>>>
>>>> That said, I'd like to remove the extra supplier resume from there going
>>>> forward as there may be callers who actually don't want that to happen and
>>>> DL_FLAG_RPM_ACTIVE is there for a purpose.
>>> Frankly, if the current behavior matches the designed behavior of
>>> DL_FLAG_RPM_ACTIVE flag,
>> It doesn't match the DL_FLAG_RPM_ACTIVE exactly as you've already noticed.
>>
>> DL_FLAG_RPM_ACTIVE assumes that the initial RPM status of the device
>> will be RPM_ACTIVE and therefore the suppliers need to be resumed at
>> link creation time.  Therefore device_link_add() causes the suppliers
>> to remain in the RPM_ACTIVE state with the rpm_active status bit of
>> the link set, whereas currently they are simply suspended again by the
>> pm_runtime_put_suppliers() in driver_probe_device() and the link is
>> not marked as "rpm_active".
>>
>>> then maybe instead of adding workarounds now, we
>>> should simply fix all existing callers of device_link_add()? 'git grep'
>>> shows
>>> only 6 places where links are created with DL_FLAG_PM_RUNTIME flag, I see no
>>> problem to add DL_FLAG_RPM_ACTIVE there to keep current behavior after a fix
>>> in runtime PM core. The description of DL_FLAG_RPM_ACTIVE should also be
>>> a bit
>>> updated, because I initially thought that it means that the runtime pm
>>> counter
>>> on supplier is increased for the whole lifetime of the device link (it
>>> is not
>>> clear when core will call a corresponding pm_runtime_put()).
>>>
>>> The other question is what need to be fixed to get proper behavior
>>> without the
>>> additional suspend/resume cycle mentioned a few paragraphs above.
>> As stated already, if the driver passes DL_FLAG_RPM_ACTIVE to
>> device_link_add() at probe time, then the initial RPM status of the
>> device being probed is expected to be RPM_ACTIVE.
>
> Okay, then this doesn't match the case of Exynos IOMMU and JPEG (and other
> Exynos multimedia drivers), because the links are created from the
> add_device
> platform bus notifier, which is executed just before ->probe() callback of
> the newly created jpeg/multimedia device. That time of course
> jpeg/multimedia
> driver is not able to enable runtime PM of the handled device yet...

But you can run pm_runtime_resume(supplier) directly from there, right?

>>>> ---
>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>> Subject: [PATCH v2] PM / core: Fix supplier device runtime PM usage counter imbalance
>>>>
>>>> If a device link is added via device_link_add() by the driver of the
>>>> link's consumer device, the supplier's runtime PM usage counter is
>>>> going to be dropped by the pm_runtime_put_suppliers() call in
>>>> driver_probe_device().  However, in that case it is not incremented
>>>> unless the supplier driver is already present and the link is not
>>>> stateless.  That leads to a runtime PM usage counter imbalance for
>>>> the supplier device in a few cases.
>>>>
>>>> To prevent that from happening, bump up the supplier runtime
>>>> PM usage counter in device_link_add() for all links with the
>>>> DL_FLAG_PM_RUNTIME flag set that are added at the consumer probe
>>>> time.  Use pm_runtime_get_noresume() for that as the callers of
>>>> device_link_add() who want the supplier to be resumed by it are
>>>> expected to pass DL_FLAG_RPM_ACTIVE in flags to it anyway, but
>>>> additionally resume the supplier if the link is added during
>>>> consumer driver probe to retain the existing behavior for the
>>>> callers depending on it.
>>>>
>>>> Fixes: 21d5c57b3726 (PM / runtime: Use device links)
>>>> Reported-by: Ulf Hansson <ulf.hansson@linaro.org>
>>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>> I've tested this version of the patch and it keeps the current behavior for
>>> links created with DL_FLAG_PM_RUNTIME flag. The questions is if we really
>>> want it?
>> I think so.
>>
>> Basically, there are two changes at hand: fixing the behavior for
>> stateless links (and for stateful ones if the supplier driver is not
>> present, but that arguably is a corner case) and the behavior change
>> for stateful links (with supplier drivers present).
>>
>> Arguably, the former is more important than the latter and I'd like to
>> be able to push that fix into -stable without dependencies.  The
>> latter can be done when all of the current callers depending on the
>> existing behavior have been adjusted.
>>
>> So, I'm going to add a Tested-by from you to this patch, if you don't
>> mind, and queue it up.
>
> Okay, fine for me.

Thanks!

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2018-06-13 10:58 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CGME20180612110807epcas5p10e076681eac6f50b01840c0052748d12@epcas5p1.samsung.com>
2018-06-12 11:00 ` [PATCH] PM / core: Fix supplier device runtime PM usage counter imbalance Rafael J. Wysocki
2018-06-12 12:44   ` Marek Szyprowski
2018-06-12 14:24     ` Rafael J. Wysocki
2018-06-13  6:23       ` Marek Szyprowski
2018-06-13  8:16         ` Rafael J. Wysocki
2018-06-13 10:23           ` Marek Szyprowski
2018-06-13 10:58             ` Rafael J. Wysocki
2018-06-13  8:28       ` Ulf Hansson
2018-06-12 19:43     ` Ulf Hansson
2018-06-13  6:42       ` Marek Szyprowski
2018-06-13  8:25         ` Ulf Hansson
2018-06-13  7:58       ` Rafael J. Wysocki

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.