linux-kernel.vger.kernel.org archive mirror
* [PATCH] PM / QoS: Fix default runtime_pm device resume latency
@ 2017-10-30  7:10 Tero Kristo
  2017-10-30 10:19 ` Rafael J. Wysocki
  0 siblings, 1 reply; 20+ messages in thread
From: Tero Kristo @ 2017-10-30  7:10 UTC
  To: linux-pm, linux-kernel, rafael.j.wysocki

The recent change to the PM QoS framework to introduce a proper
no constraint value overlooked the handling of devices which don't
implement PM QoS OPS. Runtime PM is one of the more severely
impacted subsystems, failing every attempt to runtime suspend
a device. This leads to some nasty second-order issues like
probe failures and increased power consumption, among other things.

Fix this by adding a proper return value for devices that don't
implement PM QoS implicitly.

Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS")
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 include/linux/pm_qos.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/pm_qos.h b/include/linux/pm_qos.h
index 6737a8c..d68b056 100644
--- a/include/linux/pm_qos.h
+++ b/include/linux/pm_qos.h
@@ -175,7 +175,8 @@ static inline s32 dev_pm_qos_requested_flags(struct device *dev)
 static inline s32 dev_pm_qos_raw_read_value(struct device *dev)
 {
 	return IS_ERR_OR_NULL(dev->power.qos) ?
-		0 : pm_qos_read_value(&dev->power.qos->resume_latency);
+		PM_QOS_RESUME_LATENCY_NO_CONSTRAINT :
+		pm_qos_read_value(&dev->power.qos->resume_latency);
 }
 #else
 static inline enum pm_qos_flags_status __dev_pm_qos_flags(struct device *dev,
-- 
1.9.1

--
Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
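
The effect of the one-line change above can be modelled in plain userspace C.
This is only a sketch under stated assumptions, not kernel code: the model_*
names, the use of INT32_MAX as the "no constraint" sentinel, and the rule that
a latency of 0 forbids suspend are assumptions made for illustration, and the
kernel's IS_ERR_OR_NULL() test is reduced to a NULL check.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-ins for the kernel's values; the names are hypothetical. */
#define MODEL_NO_SUSPEND     0          /* assumed: latency 0 means "never suspend" */
#define MODEL_NO_CONSTRAINT  INT32_MAX  /* assumed sentinel for "no limit" */

struct model_qos {
	int32_t resume_latency;
};

struct model_dev {
	struct model_qos *qos;	/* NULL if no QoS data was ever allocated */
};

/* Shape of dev_pm_qos_raw_read_value() before the patch: missing QoS reads as 0. */
static int32_t model_raw_read_before(const struct model_dev *dev)
{
	return dev->qos ? dev->qos->resume_latency : 0;
}

/* After the patch: missing QoS reads as "no constraint". */
static int32_t model_raw_read_after(const struct model_dev *dev)
{
	return dev->qos ? dev->qos->resume_latency : MODEL_NO_CONSTRAINT;
}

/* Simplified suspend check: only the "never suspend" value blocks suspend. */
static int model_suspend_allowed(int32_t latency)
{
	return latency == MODEL_NO_SUSPEND ? -EPERM : 0;
}

/* Usage: a device without QoS data, then one with an explicit 0 request. */
static void model_qos_selfcheck(void)
{
	struct model_qos never = { .resume_latency = MODEL_NO_SUSPEND };
	struct model_dev dev = { .qos = NULL };

	assert(model_suspend_allowed(model_raw_read_before(&dev)) == -EPERM);
	assert(model_suspend_allowed(model_raw_read_after(&dev)) == 0);

	dev.qos = &never;	/* an explicit request is still honoured */
	assert(model_suspend_allowed(model_raw_read_after(&dev)) == -EPERM);
}
```

In this model a device that never had QoS data allocated is refused suspend
with the old default and allowed with the new one, while an explicit request
of 0 keeps blocking suspend in both cases.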

* Re: [PATCH] PM / QoS: Fix default runtime_pm device resume latency
  2017-10-30  7:10 [PATCH] PM / QoS: Fix default runtime_pm device resume latency Tero Kristo
@ 2017-10-30 10:19 ` Rafael J. Wysocki
  2017-10-30 23:27   ` Rafael J. Wysocki
  0 siblings, 1 reply; 20+ messages in thread
From: Rafael J. Wysocki @ 2017-10-30 10:19 UTC
  To: Tero Kristo; +Cc: Linux PM, Linux Kernel Mailing List, Rafael Wysocki

On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo <t-kristo@ti.com> wrote:
> The recent change to the PM QoS framework to introduce a proper
> no constraint value overlooked to handle the devices which don't
> implement PM QoS OPS. Runtime PM is one of the more severely
> impacted subsystems, failing every attempt to runtime suspend
> a device. This leads into some nasty second level issues like
> probe failures and increased power consumption among other things.

Oh, that's bad.

Sorry about breaking it and thanks for the fix!

> Fix this by adding a proper return value for devices that don't
> implement PM QoS implicitly.
>
> Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS")
> Signed-off-by: Tero Kristo <t-kristo@ti.com>
> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Applied.

> ---
>  include/linux/pm_qos.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/pm_qos.h b/include/linux/pm_qos.h
> index 6737a8c..d68b056 100644
> --- a/include/linux/pm_qos.h
> +++ b/include/linux/pm_qos.h
> @@ -175,7 +175,8 @@ static inline s32 dev_pm_qos_requested_flags(struct device *dev)
>  static inline s32 dev_pm_qos_raw_read_value(struct device *dev)
>  {
>         return IS_ERR_OR_NULL(dev->power.qos) ?
> -               0 : pm_qos_read_value(&dev->power.qos->resume_latency);
> +               PM_QOS_RESUME_LATENCY_NO_CONSTRAINT :
> +               pm_qos_read_value(&dev->power.qos->resume_latency);
>  }
>  #else
>  static inline enum pm_qos_flags_status __dev_pm_qos_flags(struct device *dev,
> --
> 1.9.1

* Re: [PATCH] PM / QoS: Fix default runtime_pm device resume latency
  2017-10-30 10:19 ` Rafael J. Wysocki
@ 2017-10-30 23:27   ` Rafael J. Wysocki
  2017-10-31  7:13     ` Tero Kristo
  2017-10-31 13:09     ` Geert Uytterhoeven
  0 siblings, 2 replies; 20+ messages in thread
From: Rafael J. Wysocki @ 2017-10-30 23:27 UTC
  To: Tero Kristo; +Cc: Linux PM, Linux Kernel Mailing List

On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote:
> On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo <t-kristo@ti.com> wrote:
> > The recent change to the PM QoS framework to introduce a proper
> > no constraint value overlooked to handle the devices which don't
> > implement PM QoS OPS. Runtime PM is one of the more severely
> > impacted subsystems, failing every attempt to runtime suspend
> > a device. This leads into some nasty second level issues like
> > probe failures and increased power consumption among other things.
> 
> Oh, that's bad.
> 
> Sorry about breaking it and thanks for the fix!
> 
> > Fix this by adding a proper return value for devices that don't
> > implement PM QoS implicitly.
> >
> > Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS")
> > Signed-off-by: Tero Kristo <t-kristo@ti.com>
> > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Applied.

And pushed to Linus.

That said, probe shouldn't ever fail if PM QoS is set to the
"never suspend" value.

User space can set it that way, after all, so the drivers that fail to probe
in that case aren't correct I'm afraid.

Thanks,
Rafael

* Re: [PATCH] PM / QoS: Fix default runtime_pm device resume latency
  2017-10-30 23:27   ` Rafael J. Wysocki
@ 2017-10-31  7:13     ` Tero Kristo
  2017-10-31  8:40       ` Rafael J. Wysocki
  2017-10-31 13:09     ` Geert Uytterhoeven
  1 sibling, 1 reply; 20+ messages in thread
From: Tero Kristo @ 2017-10-31  7:13 UTC
  To: Rafael J. Wysocki; +Cc: Linux PM, Linux Kernel Mailing List

On 31/10/17 01:27, Rafael J. Wysocki wrote:
> On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote:
>> On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo <t-kristo@ti.com> wrote:
>>> The recent change to the PM QoS framework to introduce a proper
>>> no constraint value overlooked to handle the devices which don't
>>> implement PM QoS OPS. Runtime PM is one of the more severely
>>> impacted subsystems, failing every attempt to runtime suspend
>>> a device. This leads into some nasty second level issues like
>>> probe failures and increased power consumption among other things.
>>
>> Oh, that's bad.
>>
>> Sorry about breaking it and thanks for the fix!
>>
>>> Fix this by adding a proper return value for devices that don't
>>> implement PM QoS implicitly.
>>>
>>> Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS")
>>> Signed-off-by: Tero Kristo <t-kristo@ti.com>
>>> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>
>> Applied.
> 
> And pushed to Linus.
> 
> That said, probe shouldn't ever fail if PM QoS is set to the
> "never suspend" value.
> 
> User space can set it that way, after all, so the drivers that fail to probe
> in that case aren't correct I'm afraid.

OK, interesting. The probe failure we had was a second-order issue. A
driver (omap_mailbox) was calling pm_runtime_get_sync() ...
pm_runtime_put_sync() during probe, checked the return value of
pm_runtime_put_sync(), which was -EPERM, and bailed out. Most of the time,
drivers don't check this return value and will just succeed. I
grepped the kernel and there are a few other drivers that also check the
return value. I didn't check whether they do this during probe, but
it can potentially cause various issues elsewhere as well.

So, are you saying we should never check the return value of
pm_runtime_put_x(), or that we should check whether it is -EPERM and just
pass in that case? Is there any point in returning -EPERM from the runtime
core at all then? This should probably be filtered out within the runtime
core as a valid situation, returning 0 instead.

-Tero

* Re: [PATCH] PM / QoS: Fix default runtime_pm device resume latency
  2017-10-31  7:13     ` Tero Kristo
@ 2017-10-31  8:40       ` Rafael J. Wysocki
  2017-10-31 10:18         ` Tero Kristo
  0 siblings, 1 reply; 20+ messages in thread
From: Rafael J. Wysocki @ 2017-10-31  8:40 UTC
  To: Tero Kristo; +Cc: Rafael J. Wysocki, Linux PM, Linux Kernel Mailing List

On Tue, Oct 31, 2017 at 8:13 AM, Tero Kristo <t-kristo@ti.com> wrote:
> On 31/10/17 01:27, Rafael J. Wysocki wrote:
>>
>> On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote:
>>>
>>> On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo <t-kristo@ti.com> wrote:
>>>>
>>>> The recent change to the PM QoS framework to introduce a proper
>>>> no constraint value overlooked to handle the devices which don't
>>>> implement PM QoS OPS. Runtime PM is one of the more severely
>>>> impacted subsystems, failing every attempt to runtime suspend
>>>> a device. This leads into some nasty second level issues like
>>>> probe failures and increased power consumption among other things.
>>>
>>>
>>> Oh, that's bad.
>>>
>>> Sorry about breaking it and thanks for the fix!
>>>
>>>> Fix this by adding a proper return value for devices that don't
>>>> implement PM QoS implicitly.
>>>>
>>>> Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS")
>>>> Signed-off-by: Tero Kristo <t-kristo@ti.com>
>>>> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>
>>>
>>> Applied.
>>
>>
>> And pushed to Linus.
>>
>> That said, probe shouldn't ever fail if PM QoS is set to the
>> "never suspend" value.
>>
>> User space can set it that way, after all, so the drivers that fail to
>> probe
>> in that case aren't correct I'm afraid.
>
>
> Ok interesting. The probe failure we had was a second order issue. A driver
> (omap_mailbox) was attempting to pm_runtime_get_sync() ...put_sync() during
> probe, and checked the return value of pm_runtime_put_sync() which was
> -EPERM and bailed out. Most of the time, drivers don't check the return
> value of this and will just succeed. I did a grep on kernel and there are
> few other drivers that check the return value also, didn't check if they do
> this during probe though but it can potentially cause various issues
> elsewhere also.
>
> So, you are saying we should not check the return value of
> pm_runtime_put_x() ever, or should check if it is -EPERM and just pass in
> that case?

The latter.

> Is there any point returning -EPERM from the runtime core at all
> then? This should probably be filtered out within runtime core as a valid
> situation and just return 0.

Fair point.

However, there are other situations in which pm_runtime_put_sync() can
return an error code which needs to be checked, like -EBUSY or -EAGAIN
returned if one of the reference counters is nonzero.

In fact, the "no suspend" PM QoS constraint is somewhat similar to
this situation, so what about changing the error code returned to
-EAGAIN, for example?

* Re: [PATCH] PM / QoS: Fix default runtime_pm device resume latency
  2017-10-31  8:40       ` Rafael J. Wysocki
@ 2017-10-31 10:18         ` Tero Kristo
  0 siblings, 0 replies; 20+ messages in thread
From: Tero Kristo @ 2017-10-31 10:18 UTC
  To: Rafael J. Wysocki; +Cc: Rafael J. Wysocki, Linux PM, Linux Kernel Mailing List

On 31/10/17 10:40, Rafael J. Wysocki wrote:
> On Tue, Oct 31, 2017 at 8:13 AM, Tero Kristo <t-kristo@ti.com> wrote:
>> On 31/10/17 01:27, Rafael J. Wysocki wrote:
>>>
>>> On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote:
>>>>
>>>> On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo <t-kristo@ti.com> wrote:
>>>>>
>>>>> The recent change to the PM QoS framework to introduce a proper
>>>>> no constraint value overlooked to handle the devices which don't
>>>>> implement PM QoS OPS. Runtime PM is one of the more severely
>>>>> impacted subsystems, failing every attempt to runtime suspend
>>>>> a device. This leads into some nasty second level issues like
>>>>> probe failures and increased power consumption among other things.
>>>>
>>>>
>>>> Oh, that's bad.
>>>>
>>>> Sorry about breaking it and thanks for the fix!
>>>>
>>>>> Fix this by adding a proper return value for devices that don't
>>>>> implement PM QoS implicitly.
>>>>>
>>>>> Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS")
>>>>> Signed-off-by: Tero Kristo <t-kristo@ti.com>
>>>>> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>
>>>>
>>>> Applied.
>>>
>>>
>>> And pushed to Linus.
>>>
>>> That said, probe shouldn't ever fail if PM QoS is set to the
>>> "never suspend" value.
>>>
>>> User space can set it that way, after all, so the drivers that fail to
>>> probe
>>> in that case aren't correct I'm afraid.
>>
>>
>> Ok interesting. The probe failure we had was a second order issue. A driver
>> (omap_mailbox) was attempting to pm_runtime_get_sync() ...put_sync() during
>> probe, and checked the return value of pm_runtime_put_sync() which was
>> -EPERM and bailed out. Most of the time, drivers don't check the return
>> value of this and will just succeed. I did a grep on kernel and there are
>> few other drivers that check the return value also, didn't check if they do
>> this during probe though but it can potentially cause various issues
>> elsewhere also.
>>
>> So, you are saying we should not check the return value of
>> pm_runtime_put_x() ever, or should check if it is -EPERM and just pass in
>> that case?
> 
> The latter.
> 
>> Is there any point returning -EPERM from the runtime core at all
>> then? This should probably be filtered out within runtime core as a valid
>> situation and just return 0.
> 
> Fair point.
> 
> However, there are other situations in which pm_runtime_put_sync() can
> return an error code which needs to be checked, like -EBUSY or -EAGAIN
> returned if one of the reference counters is nonzero.
>
> In fact, the "no suspend" PM QoS constraint is somewhat similar to
> this situation, so what about changing the error code returned to
> -EAGAIN, for example?

OK, yeah, that's true. It's better to leave the decision to the drivers, as
the core most likely can't know what the driver will actually want to do
with a prevented pm_runtime_put. That way the policy can be implemented
on a case-by-case basis.

Keeping the -EPERM in place is probably also just fine, in case someone
needs detailed info about what prevented the pm_runtime_put from finishing
properly, or is already using this value.

-Tero
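
The driver-side policy discussed in this message (tolerate -EPERM from
pm_runtime_put_sync(), keep reporting other errors) could look roughly like
the sketch below. It is userspace C with hypothetical names
(model_put_error_is_fatal, model_probe_finish) and does not call the real
runtime PM API; which errors a given driver treats as fatal remains that
driver's decision.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not a kernel API: decide whether a return value
 * from a runtime PM "put" call should abort the caller.
 */
static bool model_put_error_is_fatal(int ret)
{
	if (ret >= 0)
		return false;	/* success */
	if (ret == -EPERM)
		return false;	/* suspend forbidden by QoS: device just stays active */
	return true;		/* everything else is left for the driver to report */
}

/*
 * Hypothetical tail of a probe routine; put_ret stands for the value
 * that pm_runtime_put_sync() returned.
 */
static int model_probe_finish(int put_ret)
{
	return model_put_error_is_fatal(put_ret) ? put_ret : 0;
}

/* Usage: the cases discussed in the thread. */
static void model_put_selfcheck(void)
{
	assert(model_probe_finish(0) == 0);		/* normal suspend */
	assert(model_probe_finish(-EPERM) == 0);	/* "never suspend" QoS value */
	assert(model_probe_finish(-EAGAIN) == -EAGAIN);	/* still visible to the caller */
	assert(model_probe_finish(-EBUSY) == -EBUSY);
}
```

Keeping the decision in a small per-driver helper matches the point above:
the core reports what happened, and each driver chooses which outcomes are
acceptable for it.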

* Re: [PATCH] PM / QoS: Fix default runtime_pm device resume latency
  2017-10-30 23:27   ` Rafael J. Wysocki
  2017-10-31  7:13     ` Tero Kristo
@ 2017-10-31 13:09     ` Geert Uytterhoeven
  2017-10-31 13:10       ` Geert Uytterhoeven
  1 sibling, 1 reply; 20+ messages in thread
From: Geert Uytterhoeven @ 2017-10-31 13:09 UTC
  To: Rafael J. Wysocki; +Cc: Tero Kristo, Linux PM, Linux Kernel Mailing List

Hi Rafael, Tero,

On Tue, Oct 31, 2017 at 12:27 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote:
>> On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo <t-kristo@ti.com> wrote:
>> > The recent change to the PM QoS framework to introduce a proper
>> > no constraint value overlooked to handle the devices which don't
>> > implement PM QoS OPS. Runtime PM is one of the more severely
>> > impacted subsystems, failing every attempt to runtime suspend
>> > a device. This leads into some nasty second level issues like
>> > probe failures and increased power consumption among other things.
>>
>> Oh, that's bad.
>>
>> Sorry about breaking it and thanks for the fix!
>>
>> > Fix this by adding a proper return value for devices that don't
>> > implement PM QoS implicitly.
>> >
>> > Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS")
>> > Signed-off-by: Tero Kristo <t-kristo@ti.com>
>> > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>
>> Applied.
>
> And pushed to Linus.

I'm afraid it is not sufficient.

Commit 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS")
introduced two issues on Renesas platforms:
 1. After boot up, many devices have changed their state from "suspended"
    to "active", according to /sys/kernel/debug/pm_genpd/pm_genpd_summary
    (comparing that file across boots is one of my standard tests).
    Interestingly, doing a system suspend/resume cycle restores their state
    to "suspended".

 2. During system suspend, the following warning is printed on
    r8a7791/koelsch:

        i2c-rcar e6530000.i2c: runtime PM trying to suspend device but active child

Commit 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume
latency") fixes the second issue, but not the first.

Reverting commits 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm
device resume latency") and 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume
latency PM QoS") fixes both.

Do you have a clue?
Thanks!

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

* Re: [PATCH] PM / QoS: Fix default runtime_pm device resume latency
  2017-10-31 13:09     ` Geert Uytterhoeven
@ 2017-10-31 13:10       ` Geert Uytterhoeven
  2017-10-31 13:55         ` Geert Uytterhoeven
  0 siblings, 1 reply; 20+ messages in thread
From: Geert Uytterhoeven @ 2017-10-31 13:10 UTC
  To: Rafael J. Wysocki
  Cc: Tero Kristo, Linux PM, Linux Kernel Mailing List, Linux-Renesas

CC linux-renesas-soc

On Tue, Oct 31, 2017 at 2:09 PM, Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
> Hi Rafael, Tero,
>
> On Tue, Oct 31, 2017 at 12:27 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>> On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote:
>>> On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo <t-kristo@ti.com> wrote:
>>> > The recent change to the PM QoS framework to introduce a proper
>>> > no constraint value overlooked to handle the devices which don't
>>> > implement PM QoS OPS. Runtime PM is one of the more severely
>>> > impacted subsystems, failing every attempt to runtime suspend
>>> > a device. This leads into some nasty second level issues like
>>> > probe failures and increased power consumption among other things.
>>>
>>> Oh, that's bad.
>>>
>>> Sorry about breaking it and thanks for the fix!
>>>
>>> > Fix this by adding a proper return value for devices that don't
>>> > implement PM QoS implicitly.
>>> >
>>> > Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS")
>>> > Signed-off-by: Tero Kristo <t-kristo@ti.com>
>>> > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>
>>> Applied.
>>
>> And pushed to Linus.
>
> I'm afraid it is not sufficient.
>
> Commit 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS")
> introduced two issues on Renesas platforms:
>  1. After boot up, many devices have changed their state from "suspended"
>     to "active", according to /sys/kernel/debug/pm_genpd/pm_genpd_summary
>     (comparing that file across boots is one of my standard tests).
>     Interestingly, doing a system suspend/resume cycle restores their state
>     to "suspended".
>
>  2. During system suspend, the following warning is printed on
>     r8a7791/koelsch:
>
>         i2c-rcar e6530000.i2c: runtime PM trying to suspend device but active child
>
> Commit 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume
> latency") fixes the second issue, but not the first.
>
> Reverting commits 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm
> device resume latency") and 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume
> latency PM QoS") fixes both.
>
> Do you have a clue?
> Thanks!

* Re: [PATCH] PM / QoS: Fix default runtime_pm device resume latency
  2017-10-31 13:10       ` Geert Uytterhoeven
@ 2017-10-31 13:55         ` Geert Uytterhoeven
  2017-10-31 14:04           ` Ulf Hansson
                             ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Geert Uytterhoeven @ 2017-10-31 13:55 UTC
  To: Rafael J. Wysocki
  Cc: Tero Kristo, Linux PM, Linux Kernel Mailing List, Linux-Renesas,
	Laurent Pinchart, DRI Development

Hi Rafael, Tero,

CC pinchartl, dri-devel

On Tue, Oct 31, 2017 at 2:10 PM, Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
> CC linux-renesas-soc
>
> On Tue, Oct 31, 2017 at 2:09 PM, Geert Uytterhoeven
> <geert@linux-m68k.org> wrote:
>> On Tue, Oct 31, 2017 at 12:27 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>>> On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote:
>>>> On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo <t-kristo@ti.com> wrote:
>>>> > The recent change to the PM QoS framework to introduce a proper
>>>> > no constraint value overlooked to handle the devices which don't
>>>> > implement PM QoS OPS. Runtime PM is one of the more severely
>>>> > impacted subsystems, failing every attempt to runtime suspend
>>>> > a device. This leads into some nasty second level issues like
>>>> > probe failures and increased power consumption among other things.
>>>>
>>>> Oh, that's bad.
>>>>
>>>> Sorry about breaking it and thanks for the fix!
>>>>
>>>> > Fix this by adding a proper return value for devices that don't
>>>> > implement PM QoS implicitly.
>>>> >
>>>> > Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS")
>>>> > Signed-off-by: Tero Kristo <t-kristo@ti.com>
>>>> > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>
>>>> Applied.
>>>
>>> And pushed to Linus.
>>
>> I'm afraid it is not sufficient.
>>
>> Commit 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS")
>> introduced two issues on Renesas platforms:
>>  1. After boot up, many devices have changed their state from "suspended"
>>     to "active", according to /sys/kernel/debug/pm_genpd/pm_genpd_summary
>>     (comparing that file across boots is one of my standard tests).
>>     Interestingly, doing a system suspend/resume cycle restores their state
>>     to "suspended".
>>
>>  2. During system suspend, the following warning is printed on
>>     r8a7791/koelsch:
>>
>>         i2c-rcar e6530000.i2c: runtime PM trying to suspend device but active child

 3. I've just bisected a seemingly unrelated issue to the same commit.
    On Salvator-XS with R-Car H3, initialization of the rcar-du driver now
    takes more than 1 minute due to flip_done timeouts, while it took 0.12s
    before:

    [    3.015035] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
    [    3.021721] [drm] No driver support for vblank timestamp query.
    [   13.280738] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out
    [   23.520707] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out
    [   33.760708] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out
    [   44.000755] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out
    [   44.003597] Console: switching to colour frame buffer device 128x48
    [   54.240707] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out
    [   64.480706] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out
    [   64.544876] rcar-du feb00000.display: fb0:  frame buffer device
    [   64.552013] [drm] Initialized rcar-du 1.0.0 20130110 for feb00000.display on minor 0
    [   64.559873] [drm] Device feb00000.display probed

>> Commit 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume
>> latency") fixes the second issue, but not the first.

... nor the third.

>> Reverting commits 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm
>> device resume latency") and 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume
>> latency PM QoS") fixes both.

... all three.

>> Do you have a clue?
>> Thanks!

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

* Re: [PATCH] PM / QoS: Fix default runtime_pm device resume latency
  2017-10-31 13:55         ` Geert Uytterhoeven
@ 2017-10-31 14:04           ` Ulf Hansson
  2017-10-31 16:35             ` Rafael J. Wysocki
  2017-10-31 15:37           ` Jani Nikula
                             ` (2 subsequent siblings)
  3 siblings, 1 reply; 20+ messages in thread
From: Ulf Hansson @ 2017-10-31 14:04 UTC
  To: Geert Uytterhoeven
  Cc: Rafael J. Wysocki, Tero Kristo, Linux PM,
	Linux Kernel Mailing List, Linux-Renesas, Laurent Pinchart,
	DRI Development

On 31 October 2017 at 14:55, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> Hi Rafael, Tero,
>
> CC pinchartl, dri-devel
>
> On Tue, Oct 31, 2017 at 2:10 PM, Geert Uytterhoeven
> <geert@linux-m68k.org> wrote:
>> CC linux-renesas-soc
>>
>> On Tue, Oct 31, 2017 at 2:09 PM, Geert Uytterhoeven
>> <geert@linux-m68k.org> wrote:
>>> On Tue, Oct 31, 2017 at 12:27 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>>>> On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote:
>>>>> On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo <t-kristo@ti.com> wrote:
>>>>> > The recent change to the PM QoS framework to introduce a proper
>>>>> > no constraint value overlooked to handle the devices which don't
>>>>> > implement PM QoS OPS. Runtime PM is one of the more severely
>>>>> > impacted subsystems, failing every attempt to runtime suspend
>>>>> > a device. This leads into some nasty second level issues like
>>>>> > probe failures and increased power consumption among other things.
>>>>>
>>>>> Oh, that's bad.
>>>>>
>>>>> Sorry about breaking it and thanks for the fix!
>>>>>
>>>>> > Fix this by adding a proper return value for devices that don't
>>>>> > implement PM QoS implicitly.
>>>>> >
>>>>> > Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS")
>>>>> > Signed-off-by: Tero Kristo <t-kristo@ti.com>
>>>>> > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>>
>>>>> Applied.
>>>>
>>>> And pushed to Linus.
>>>
>>> I'm afraid it is not sufficient.
>>>
>>> Commit 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS")
>>> introduced two issues on Renesas platforms:
>>>  1. After boot up, many devices have changed their state from "suspended"
>>>     to "active", according to /sys/kernel/debug/pm_genpd/pm_genpd_summary
>>>     (comparing that file across boots is one of my standard tests).
>>>     Interestingly, doing a system suspend/resume cycle restores their state
>>>     to "suspended".
>>>
>>>  2. During system suspend, the following warning is printed on
>>>     r8a7791/koelsch:
>>>
>>>         i2c-rcar e6530000.i2c: runtime PM trying to suspend device but active child
>
>  3. I've just bisected a seemingly unrelated issue to the same commit.
>     On Salvator-XS with R-Car H3, initialization of the rcar-du driver now
>     takes more than 1 minute due to flip_done time outs, while it took 0.12s
>     before:
>
>     [    3.015035] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
>     [    3.021721] [drm] No driver support for vblank timestamp query.
>     [   13.280738] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out
>     [   23.520707] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out
>     [   33.760708] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out
>     [   44.000755] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out
>     [   44.003597] Console: switching to colour frame buffer device 128x48
>     [   54.240707] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out
>     [   64.480706] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out
>     [   64.544876] rcar-du feb00000.display: fb0:  frame buffer device
>     [   64.552013] [drm] Initialized rcar-du 1.0.0 20130110 for feb00000.display on minor 0
>     [   64.559873] [drm] Device feb00000.display probed
>
>>> Commit 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume
>>> latency") fixes the second issue, but not the first.
>
> ... nor the third.
>
>>> Reverting commits 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm
>>> device resume latency") and 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume
>>> latency PM QoS") fixes both.
>
> ... all three.
>
>>> Do you have a clue?
>>> Thanks!

As I didn't have time to review the original commit before it got
pushed as a fix, I am planning to review it now instead.

A vague guess is that the genpd governor prevents the device from
being suspended. That was also the trickiest part of the changes
in the original commit, which is causing problems.

I'll get back to this when I have reviewed it more thoroughly.

Kind regards
Uffe

* Re: [PATCH] PM / QoS: Fix default runtime_pm device resume latency
  2017-10-31 13:55         ` Geert Uytterhoeven
  2017-10-31 14:04           ` Ulf Hansson
@ 2017-10-31 15:37           ` Jani Nikula
  2017-10-31 16:40             ` Daniel Vetter
  2017-10-31 17:12           ` Laurent Pinchart
  2017-10-31 17:22           ` Rafael J. Wysocki
  3 siblings, 1 reply; 20+ messages in thread
From: Jani Nikula @ 2017-10-31 15:37 UTC
  To: Geert Uytterhoeven, Rafael J. Wysocki
  Cc: Linux-Renesas, Linux PM, Linux Kernel Mailing List,
	DRI Development, Tero Kristo, Laurent Pinchart, Lofstedt, Marta,
	Peres, Martin

On Tue, 31 Oct 2017, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> Hi Rafael, Tero,
>
> CC pinchartl, dri-devel

Cc: Marta, Martin

Our CI is hitting this too.

BR,
Jani.

>
> On Tue, Oct 31, 2017 at 2:10 PM, Geert Uytterhoeven
> <geert@linux-m68k.org> wrote:
>> CC linux-renesas-soc
>>
>> On Tue, Oct 31, 2017 at 2:09 PM, Geert Uytterhoeven
>> <geert@linux-m68k.org> wrote:
>>> On Tue, Oct 31, 2017 at 12:27 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>>>> On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote:
>>>>> On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo <t-kristo@ti.com> wrote:
>>>>> > The recent change to the PM QoS framework to introduce a proper
>>>>> > no constraint value overlooked to handle the devices which don't
>>>>> > implement PM QoS OPS. Runtime PM is one of the more severely
>>>>> > impacted subsystems, failing every attempt to runtime suspend
>>>>> > a device. This leads into some nasty second level issues like
>>>>> > probe failures and increased power consumption among other things.
>>>>>
>>>>> Oh, that's bad.
>>>>>
>>>>> Sorry about breaking it and thanks for the fix!
>>>>>
>>>>> > Fix this by adding a proper return value for devices that don't
>>>>> > implement PM QoS implicitly.
>>>>> >
>>>>> > Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS")
>>>>> > Signed-off-by: Tero Kristo <t-kristo@ti.com>
>>>>> > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>>
>>>>> Applied.
>>>>
>>>> And pushed to Linus.
>>>
>>> I'm afraid it is not sufficient.
>>>
>>> Commit 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS")
>>> introduced two issues on Renesas platforms:
>>>  1. After boot up, many devices have changed their state from "suspended"
>>>     to "active", according to /sys/kernel/debug/pm_genpd/pm_genpd_summary
>>>     (comparing that file across boots is one of my standard tests).
>>>     Interestingly, doing a system suspend/resume cycle restores their state
>>>     to "suspended".
>>>
>>>  2. During system suspend, the following warning is printed on
>>>     r8a7791/koelsch:
>>>
>>>         i2c-rcar e6530000.i2c: runtime PM trying to suspend device but
>>> active child
>
>  3. I've just bisected a seemingly unrelated issue to the same commit.
>     On Salvator-XS with R-Car H3, initialization of the rcar-du driver now
>     takes more than 1 minute due to flip_done time outs, while it took 0.12s
>     before:
>
>     [    3.015035] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
>     [    3.021721] [drm] No driver support for vblank timestamp query.
>     [   13.280738] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
> [CRTC:58:crtc-3] flip_done timed out
>     [   23.520707] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR*
> [CRTC:58:crtc-3] flip_done timed out
>     [   33.760708] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
> [CRTC:58:crtc-3] flip_done timed out
>     [   44.000755] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR*
> [CRTC:58:crtc-3] flip_done timed out
>     [   44.003597] Console: switching to colour frame buffer device 128x48
>     [   54.240707] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
> [CRTC:58:crtc-3] flip_done timed out
>     [   64.480706] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR*
> [CRTC:58:crtc-3] flip_done timed out
>     [   64.544876] rcar-du feb00000.display: fb0:  frame buffer device
>     [   64.552013] [drm] Initialized rcar-du 1.0.0 20130110 for
> feb00000.display on minor 0
>     [   64.559873] [drm] Device feb00000.display probed
>
>>> Commit 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume
>>> latency") fixes the second issue, but not the first.
>
> ... nor the third.
>
>>> Reverting commits 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm
>>> device resume latency") and 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume
>>> latency PM QoS") fixes both.
>
> ... all three.
>
>>> Do you have a clue?
>>> Thanks!


-- 
Jani Nikula, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] PM / QoS: Fix default runtime_pm device resume latency
  2017-10-31 14:04           ` Ulf Hansson
@ 2017-10-31 16:35             ` Rafael J. Wysocki
  0 siblings, 0 replies; 20+ messages in thread
From: Rafael J. Wysocki @ 2017-10-31 16:35 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Geert Uytterhoeven, Rafael J. Wysocki, Tero Kristo, Linux PM,
	Linux Kernel Mailing List, Linux-Renesas, Laurent Pinchart,
	DRI Development

On Tue, Oct 31, 2017 at 3:04 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On 31 October 2017 at 14:55, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>> Hi Rafael, Tero,
>>
>> CC pinchartl, dri-devel
>>
>> On Tue, Oct 31, 2017 at 2:10 PM, Geert Uytterhoeven
>> <geert@linux-m68k.org> wrote:
>>> CC linux-renesas-soc
>>>
>>> On Tue, Oct 31, 2017 at 2:09 PM, Geert Uytterhoeven
>>> <geert@linux-m68k.org> wrote:
>>>> On Tue, Oct 31, 2017 at 12:27 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>>>>> On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote:
>>>>>> On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo <t-kristo@ti.com> wrote:
>>>>>> > The recent change to the PM QoS framework to introduce a proper
>>>>>> > no constraint value overlooked to handle the devices which don't
>>>>>> > implement PM QoS OPS. Runtime PM is one of the more severely
>>>>>> > impacted subsystems, failing every attempt to runtime suspend
>>>>>> > a device. This leads into some nasty second level issues like
>>>>>> > probe failures and increased power consumption among other things.
>>>>>>
>>>>>> Oh, that's bad.
>>>>>>
>>>>>> Sorry about breaking it and thanks for the fix!
>>>>>>
>>>>>> > Fix this by adding a proper return value for devices that don't
>>>>>> > implement PM QoS implicitly.
>>>>>> >
>>>>>> > Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS")
>>>>>> > Signed-off-by: Tero Kristo <t-kristo@ti.com>
>>>>>> > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>>>
>>>>>> Applied.
>>>>>
>>>>> And pushed to Linus.
>>>>
>>>> I'm afraid it is not sufficient.
>>>>
>>>> Commit 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS")
>>>> introduced two issues on Renesas platforms:
>>>>  1. After boot up, many devices have changed their state from "suspended"
>>>>     to "active", according to /sys/kernel/debug/pm_genpd/pm_genpd_summary
>>>>     (comparing that file across boots is one of my standard tests).
>>>>     Interestingly, doing a system suspend/resume cycle restores their state
>>>>     to "suspended".
>>>>
>>>>  2. During system suspend, the following warning is printed on
>>>>     r8a7791/koelsch:
>>>>
>>>>         i2c-rcar e6530000.i2c: runtime PM trying to suspend device but
>>>> active child
>>
>>  3. I've just bisected a seemingly unrelated issue to the same commit.
>>     On Salvator-XS with R-Car H3, initialization of the rcar-du driver now
>>     takes more than 1 minute due to flip_done time outs, while it took 0.12s
>>     before:
>>
>>     [    3.015035] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
>>     [    3.021721] [drm] No driver support for vblank timestamp query.
>>     [   13.280738] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
>> [CRTC:58:crtc-3] flip_done timed out
>>     [   23.520707] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR*
>> [CRTC:58:crtc-3] flip_done timed out
>>     [   33.760708] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
>> [CRTC:58:crtc-3] flip_done timed out
>>     [   44.000755] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR*
>> [CRTC:58:crtc-3] flip_done timed out
>>     [   44.003597] Console: switching to colour frame buffer device 128x48
>>     [   54.240707] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
>> [CRTC:58:crtc-3] flip_done timed out
>>     [   64.480706] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR*
>> [CRTC:58:crtc-3] flip_done timed out
>>     [   64.544876] rcar-du feb00000.display: fb0:  frame buffer device
>>     [   64.552013] [drm] Initialized rcar-du 1.0.0 20130110 for
>> feb00000.display on minor 0
>>     [   64.559873] [drm] Device feb00000.display probed
>>
>>>> Commit 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume
>>>> latency") fixes the second issue, but not the first.
>>
>> ... nor the third.
>>
>>>> Reverting commits 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm
>>>> device resume latency") and 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume
>>>> latency PM QoS") fixes both.
>>
>> ... all three.
>>
>>>> Do you have a clue?
>>>> Thanks!
>
> As I didn't have the time to review the original commit, before it got
> pushed as a fix, I am planning to review it now instead.
>
> A vague guess is that the genpd governor prevents the device from
> being suspended. That was also the most tricky part of the changes
> from the original commit, which is causing problems.

I think you are right.

> I get back to this when I have reviewed it more thoroughly.

Thanks, and sorry for breaking stuff.

In retrospect I should just have not pushed it so late in the cycle.

Rafael

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] PM / QoS: Fix default runtime_pm device resume latency
  2017-10-31 15:37           ` Jani Nikula
@ 2017-10-31 16:40             ` Daniel Vetter
  0 siblings, 0 replies; 20+ messages in thread
From: Daniel Vetter @ 2017-10-31 16:40 UTC (permalink / raw)
  To: Jani Nikula
  Cc: Geert Uytterhoeven, Rafael J. Wysocki, Tero Kristo, Linux PM,
	Linux Kernel Mailing List, DRI Development, Peres, Martin,
	Linux-Renesas, Laurent Pinchart

On Tue, Oct 31, 2017 at 05:37:50PM +0200, Jani Nikula wrote:
> On Tue, 31 Oct 2017, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > Hi Rafael, Tero,
> >
> > CC pinchartl, dri-devel
> 
> Cc: Marta, Martin
> 
> Our CI is hitting this too.

Should be ok again, we've locally reverted this patch. But a big chunk of
the pw runs overnight did hit it unfortunately :-(
-Daniel

> 
> BR,
> Jani.
> 
> >
> > On Tue, Oct 31, 2017 at 2:10 PM, Geert Uytterhoeven
> > <geert@linux-m68k.org> wrote:
> >> CC linux-renesas-soc
> >>
> >> On Tue, Oct 31, 2017 at 2:09 PM, Geert Uytterhoeven
> >> <geert@linux-m68k.org> wrote:
> >>> On Tue, Oct 31, 2017 at 12:27 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >>>> On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote:
> >>>>> On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo <t-kristo@ti.com> wrote:
> >>>>> > The recent change to the PM QoS framework to introduce a proper
> >>>>> > no constraint value overlooked to handle the devices which don't
> >>>>> > implement PM QoS OPS. Runtime PM is one of the more severely
> >>>>> > impacted subsystems, failing every attempt to runtime suspend
> >>>>> > a device. This leads into some nasty second level issues like
> >>>>> > probe failures and increased power consumption among other things.
> >>>>>
> >>>>> Oh, that's bad.
> >>>>>
> >>>>> Sorry about breaking it and thanks for the fix!
> >>>>>
> >>>>> > Fix this by adding a proper return value for devices that don't
> >>>>> > implement PM QoS implicitly.
> >>>>> >
> >>>>> > Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS")
> >>>>> > Signed-off-by: Tero Kristo <t-kristo@ti.com>
> >>>>> > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>>>>
> >>>>> Applied.
> >>>>
> >>>> And pushed to Linus.
> >>>
> >>> I'm afraid it is not sufficient.
> >>>
> >>> Commit 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS")
> >>> introduced two issues on Renesas platforms:
> >>>  1. After boot up, many devices have changed their state from "suspended"
> >>>     to "active", according to /sys/kernel/debug/pm_genpd/pm_genpd_summary
> >>>     (comparing that file across boots is one of my standard tests).
> >>>     Interestingly, doing a system suspend/resume cycle restores their state
> >>>     to "suspended".
> >>>
> >>>  2. During system suspend, the following warning is printed on
> >>>     r8a7791/koelsch:
> >>>
> >>>         i2c-rcar e6530000.i2c: runtime PM trying to suspend device but
> >>> active child
> >
> >  3. I've just bisected a seemingly unrelated issue to the same commit.
> >     On Salvator-XS with R-Car H3, initialization of the rcar-du driver now
> >     takes more than 1 minute due to flip_done time outs, while it took 0.12s
> >     before:
> >
> >     [    3.015035] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
> >     [    3.021721] [drm] No driver support for vblank timestamp query.
> >     [   13.280738] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
> > [CRTC:58:crtc-3] flip_done timed out
> >     [   23.520707] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR*
> > [CRTC:58:crtc-3] flip_done timed out
> >     [   33.760708] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
> > [CRTC:58:crtc-3] flip_done timed out
> >     [   44.000755] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR*
> > [CRTC:58:crtc-3] flip_done timed out
> >     [   44.003597] Console: switching to colour frame buffer device 128x48
> >     [   54.240707] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
> > [CRTC:58:crtc-3] flip_done timed out
> >     [   64.480706] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR*
> > [CRTC:58:crtc-3] flip_done timed out
> >     [   64.544876] rcar-du feb00000.display: fb0:  frame buffer device
> >     [   64.552013] [drm] Initialized rcar-du 1.0.0 20130110 for
> > feb00000.display on minor 0
> >     [   64.559873] [drm] Device feb00000.display probed
> >
> >>> Commit 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume
> >>> latency") fixes the second issue, but not the first.
> >
> > ... nor the third.
> >
> >>> Reverting commits 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm
> >>> device resume latency") and 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume
> >>> latency PM QoS") fixes both.
> >
> > ... all three.
> >
> >>> Do you have a clue?
> >>> Thanks!
> 
> 
> -- 
> Jani Nikula, Intel Open Source Technology Center

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] PM / QoS: Fix default runtime_pm device resume latency
  2017-10-31 13:55         ` Geert Uytterhoeven
  2017-10-31 14:04           ` Ulf Hansson
  2017-10-31 15:37           ` Jani Nikula
@ 2017-10-31 17:12           ` Laurent Pinchart
  2017-10-31 17:22           ` Rafael J. Wysocki
  3 siblings, 0 replies; 20+ messages in thread
From: Laurent Pinchart @ 2017-10-31 17:12 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Rafael J. Wysocki, Tero Kristo, Linux PM,
	Linux Kernel Mailing List, Linux-Renesas, DRI Development

Hi Geert,

On Tuesday, 31 October 2017 15:55:02 EET Geert Uytterhoeven wrote:
> On Tue, Oct 31, 2017 at 2:10 PM, Geert Uytterhoeven wrote:
> > On Tue, Oct 31, 2017 at 2:09 PM, Geert Uytterhoeven wrote:
> >> On Tue, Oct 31, 2017 at 12:27 AM, Rafael J. Wysocki wrote:
> >>> On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote:
> >>>> On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo <t-kristo@ti.com> wrote:
> >>>>> The recent change to the PM QoS framework to introduce a proper
> >>>>> no constraint value overlooked to handle the devices which don't
> >>>>> implement PM QoS OPS. Runtime PM is one of the more severely
> >>>>> impacted subsystems, failing every attempt to runtime suspend
> >>>>> a device. This leads into some nasty second level issues like
> >>>>> probe failures and increased power consumption among other things.
> >>>> 
> >>>> Oh, that's bad.
> >>>> 
> >>>> Sorry about breaking it and thanks for the fix!
> >>>> 
> >>>>> Fix this by adding a proper return value for devices that don't
> >>>>> implement PM QoS implicitly.
> >>>>> 
> >>>>> Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS")
> >>>>> Signed-off-by: Tero Kristo <t-kristo@ti.com>
> >>>>> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>>> 
> >>>> Applied.
> >>> 
> >>> And pushed to Linus.
> >> 
> >> I'm afraid it is not sufficient.
> >> 
> >> Commit 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS")
> >> introduced two issues on Renesas platforms:
> >>  1. After boot up, many devices have changed their state from "suspended"
> >>     to "active", according to /sys/kernel/debug/pm_genpd/pm_genpd_summary
> >>     (comparing that file across boots is one of my standard tests).
> >>     Interestingly, doing a system suspend/resume cycle restores their
> >>     state to "suspended".
> >>
> >>  2. During system suspend, the following warning is printed on
> >>     r8a7791/koelsch:
> >>
> >>         i2c-rcar e6530000.i2c: runtime PM trying to suspend device but
> >>         active child
> 
>  3. I've just bisected a seemingly unrelated issue to the same commit.
>     On Salvator-XS with R-Car H3, initialization of the rcar-du driver now
>     takes more than 1 minute due to flip_done time outs, while it took 0.12s
>     before:
> 
>     [    3.015035] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
>     [    3.021721] [drm] No driver support for vblank timestamp query.
>     [   13.280738] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
> [CRTC:58:crtc-3] flip_done timed out
>     [   23.520707] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR*
> [CRTC:58:crtc-3] flip_done timed out
>     [   33.760708] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
> [CRTC:58:crtc-3] flip_done timed out
>     [   44.000755] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR*
> [CRTC:58:crtc-3] flip_done timed out
>     [   44.003597] Console: switching to colour frame buffer device 128x48
>     [   54.240707] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
> [CRTC:58:crtc-3] flip_done timed out
>     [   64.480706] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR*
> [CRTC:58:crtc-3] flip_done timed out
>     [   64.544876] rcar-du feb00000.display: fb0:  frame buffer device
>     [   64.552013] [drm] Initialized rcar-du 1.0.0 20130110 for
> feb00000.display on minor 0
>     [   64.559873] [drm] Device feb00000.display probed
> 
> >> Commit 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume
> >> latency") fixes the second issue, but not the first.
> 
> ... nor the third.
> 
> >> Reverting commits 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm
> >> device resume latency") and 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume
> >> latency PM QoS") fixes both.
> 
> ... all three.

Thank you for tracking this and notifying me. I like it even better now that 
the problem seems to be fixed without requiring any action from my side :-)

> >> Do you have a clue?
> >> Thanks!

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] PM / QoS: Fix default runtime_pm device resume latency
  2017-10-31 13:55         ` Geert Uytterhoeven
                             ` (2 preceding siblings ...)
  2017-10-31 17:12           ` Laurent Pinchart
@ 2017-10-31 17:22           ` Rafael J. Wysocki
  2017-10-31 18:07             ` Geert Uytterhoeven
  3 siblings, 1 reply; 20+ messages in thread
From: Rafael J. Wysocki @ 2017-10-31 17:22 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Rafael J. Wysocki, Tero Kristo, Linux PM,
	Linux Kernel Mailing List, Linux-Renesas, Laurent Pinchart,
	DRI Development

[-- Attachment #1: Type: text/plain, Size: 4204 bytes --]

On Tue, Oct 31, 2017 at 2:55 PM, Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
> Hi Rafael, Tero,
>
> CC pinchartl, dri-devel
>
> On Tue, Oct 31, 2017 at 2:10 PM, Geert Uytterhoeven
> <geert@linux-m68k.org> wrote:
>> CC linux-renesas-soc
>>
>> On Tue, Oct 31, 2017 at 2:09 PM, Geert Uytterhoeven
>> <geert@linux-m68k.org> wrote:
>>> On Tue, Oct 31, 2017 at 12:27 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>>>> On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote:
>>>>> On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo <t-kristo@ti.com> wrote:
>>>>> > The recent change to the PM QoS framework to introduce a proper
>>>>> > no constraint value overlooked to handle the devices which don't
>>>>> > implement PM QoS OPS. Runtime PM is one of the more severely
>>>>> > impacted subsystems, failing every attempt to runtime suspend
>>>>> > a device. This leads into some nasty second level issues like
>>>>> > probe failures and increased power consumption among other things.
>>>>>
>>>>> Oh, that's bad.
>>>>>
>>>>> Sorry about breaking it and thanks for the fix!
>>>>>
>>>>> > Fix this by adding a proper return value for devices that don't
>>>>> > implement PM QoS implicitly.
>>>>> >
>>>>> > Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS")
>>>>> > Signed-off-by: Tero Kristo <t-kristo@ti.com>
>>>>> > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>>
>>>>> Applied.
>>>>
>>>> And pushed to Linus.
>>>
>>> I'm afraid it is not sufficient.
>>>
>>> Commit 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS")
>>> introduced two issues on Renesas platforms:
>>>  1. After boot up, many devices have changed their state from "suspended"
>>>     to "active", according to /sys/kernel/debug/pm_genpd/pm_genpd_summary
>>>     (comparing that file across boots is one of my standard tests).
>>>     Interestingly, doing a system suspend/resume cycle restores their state
>>>     to "suspended".
>>>
>>>  2. During system suspend, the following warning is printed on
>>>     r8a7791/koelsch:
>>>
>>>         i2c-rcar e6530000.i2c: runtime PM trying to suspend device but
>>> active child
>
>  3. I've just bisected a seemingly unrelated issue to the same commit.
>     On Salvator-XS with R-Car H3, initialization of the rcar-du driver now
>     takes more than 1 minute due to flip_done time outs, while it took 0.12s
>     before:
>
>     [    3.015035] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
>     [    3.021721] [drm] No driver support for vblank timestamp query.
>     [   13.280738] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
> [CRTC:58:crtc-3] flip_done timed out
>     [   23.520707] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR*
> [CRTC:58:crtc-3] flip_done timed out
>     [   33.760708] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
> [CRTC:58:crtc-3] flip_done timed out
>     [   44.000755] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR*
> [CRTC:58:crtc-3] flip_done timed out
>     [   44.003597] Console: switching to colour frame buffer device 128x48
>     [   54.240707] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
> [CRTC:58:crtc-3] flip_done timed out
>     [   64.480706] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR*
> [CRTC:58:crtc-3] flip_done timed out
>     [   64.544876] rcar-du feb00000.display: fb0:  frame buffer device
>     [   64.552013] [drm] Initialized rcar-du 1.0.0 20130110 for
> feb00000.display on minor 0
>     [   64.559873] [drm] Device feb00000.display probed
>
>>> Commit 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume
>>> latency") fixes the second issue, but not the first.
>
> ... nor the third.
>
>>> Reverting commits 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm
>>> device resume latency") and 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume
>>> latency PM QoS") fixes both.
>
> ... all three.

Sorry for the breakage.

OK, I'll just push the reverts to Linus later today.

>>> Do you have a clue?

Well, kind of.

There is a change in behavior in domain_governor.c that should not
have made any difference to my eyes, but maybe that's it.

Can you please check if the attached patch makes any difference?

Thanks,
Rafael

[-- Attachment #2: pm-domain-qos-fixup.patch --]
[-- Type: text/x-patch, Size: 828 bytes --]

---
 drivers/base/power/domain_governor.c |   11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

Index: linux-pm/drivers/base/power/domain_governor.c
===================================================================
--- linux-pm.orig/drivers/base/power/domain_governor.c
+++ linux-pm/drivers/base/power/domain_governor.c
@@ -83,12 +83,11 @@ static bool default_suspend_ok(struct de
 		td->cached_suspend_ok = true;
 	} else {
 		constraint_ns -= td->suspend_latency_ns + td->resume_latency_ns;
-		if (constraint_ns > 0) {
-			td->effective_constraint_ns = constraint_ns;
-			td->cached_suspend_ok = true;
-		} else {
-			td->effective_constraint_ns = 0;
-		}
+		if (constraint_ns == 0)
+			return false;
+
+		td->effective_constraint_ns = constraint_ns;
+		td->cached_suspend_ok = constraint_ns >= 0;
 	}
 
 	/*
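
The difference between the two sides of the diff above can be sketched as a
small standalone model. This is only an illustration, not kernel code: the
struct name td_model and the function names branch_before() and
branch_after() are made up here, only the four fields visible in the diff are
modelled, and returning cached_suspend_ok at the end is an assumption about
how the surrounding function uses the flag. The caller is assumed to have
cleared cached_suspend_ok beforehand.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in holding only the fields that appear in the diff. */
struct td_model {
	int64_t suspend_latency_ns;
	int64_t resume_latency_ns;
	int64_t effective_constraint_ns;
	bool cached_suspend_ok;
};

/* The branch as it reads on the '-' side of the diff. */
bool branch_before(struct td_model *td, int64_t constraint_ns)
{
	constraint_ns -= td->suspend_latency_ns + td->resume_latency_ns;
	if (constraint_ns > 0) {
		td->effective_constraint_ns = constraint_ns;
		td->cached_suspend_ok = true;
	} else {
		/* Zero and negative remainders are both clamped to 0. */
		td->effective_constraint_ns = 0;
	}
	/* Assumption: the flag is what the caller ultimately acts on. */
	return td->cached_suspend_ok;
}

/* The branch as it reads on the '+' side of the diff. */
bool branch_after(struct td_model *td, int64_t constraint_ns)
{
	constraint_ns -= td->suspend_latency_ns + td->resume_latency_ns;
	if (constraint_ns == 0)
		return false;	/* effective_constraint_ns is left untouched */

	/* A negative remainder is now stored as-is instead of being clamped. */
	td->effective_constraint_ns = constraint_ns;
	td->cached_suspend_ok = constraint_ns >= 0;
	return td->cached_suspend_ok;
}
```

In this model both variants agree whenever the remaining budget is positive.
They differ only in what ends up in effective_constraint_ns when the budget is
exhausted: the '-' side stores 0 in both the zero and the negative case, while
the '+' side leaves the previous value alone for zero and stores the negative
value otherwise. Whether that difference matters for the reported regressions
is not something this sketch can show.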

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] PM / QoS: Fix default runtime_pm device resume latency
  2017-10-31 17:22           ` Rafael J. Wysocki
@ 2017-10-31 18:07             ` Geert Uytterhoeven
  2017-10-31 22:32               ` Rafael J. Wysocki
  0 siblings, 1 reply; 20+ messages in thread
From: Geert Uytterhoeven @ 2017-10-31 18:07 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Tero Kristo, Linux PM,
	Linux Kernel Mailing List, Linux-Renesas, Laurent Pinchart,
	DRI Development

Hi Rafael,

On Tue, Oct 31, 2017 at 6:22 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Tue, Oct 31, 2017 at 2:55 PM, Geert Uytterhoeven
> <geert@linux-m68k.org> wrote:
>> Hi Rafael, Tero,
>>
>> CC pinchartl, dri-devel
>>
>> On Tue, Oct 31, 2017 at 2:10 PM, Geert Uytterhoeven
>> <geert@linux-m68k.org> wrote:
>>> CC linux-renesas-soc
>>>
>>> On Tue, Oct 31, 2017 at 2:09 PM, Geert Uytterhoeven
>>> <geert@linux-m68k.org> wrote:
>>>> On Tue, Oct 31, 2017 at 12:27 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>>>>> On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote:
>>>>>> On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo <t-kristo@ti.com> wrote:
>>>>>> > The recent change to the PM QoS framework to introduce a proper
>>>>>> > no constraint value overlooked to handle the devices which don't
>>>>>> > implement PM QoS OPS. Runtime PM is one of the more severely
>>>>>> > impacted subsystems, failing every attempt to runtime suspend
>>>>>> > a device. This leads into some nasty second level issues like
>>>>>> > probe failures and increased power consumption among other things.
>>>>>>
>>>>>> Oh, that's bad.
>>>>>>
>>>>>> Sorry about breaking it and thanks for the fix!
>>>>>>
>>>>>> > Fix this by adding a proper return value for devices that don't
>>>>>> > implement PM QoS implicitly.
>>>>>> >
>>>>>> > Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS")
>>>>>> > Signed-off-by: Tero Kristo <t-kristo@ti.com>
>>>>>> > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>>>
>>>>>> Applied.
>>>>>
>>>>> And pushed to Linus.
>>>>
>>>> I'm afraid it is not sufficient.
>>>>
>>>> Commit 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS")
>>>> introduced two issues on Renesas platforms:
>>>>  1. After boot up, many devices have changed their state from "suspended"
>>>>     to "active", according to /sys/kernel/debug/pm_genpd/pm_genpd_summary
>>>>     (comparing that file across boots is one of my standard tests).
>>>>     Interestingly, doing a system suspend/resume cycle restores their state
>>>>     to "suspended".
>>>>
>>>>  2. During system suspend, the following warning is printed on
>>>>     r8a7791/koelsch:
>>>>
>>>>         i2c-rcar e6530000.i2c: runtime PM trying to suspend device but
>>>> active child
>>
>>  3. I've just bisected a seemingly unrelated issue to the same commit.
>>     On Salvator-XS with R-Car H3, initialization of the rcar-du driver now
>>     takes more than 1 minute due to flip_done time outs, while it took 0.12s
>>     before:
>>
>>     [    3.015035] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
>>     [    3.021721] [drm] No driver support for vblank timestamp query.
>>     [   13.280738] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
>> [CRTC:58:crtc-3] flip_done timed out
>>     [   23.520707] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR*
>> [CRTC:58:crtc-3] flip_done timed out
>>     [   33.760708] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
>> [CRTC:58:crtc-3] flip_done timed out
>>     [   44.000755] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR*
>> [CRTC:58:crtc-3] flip_done timed out
>>     [   44.003597] Console: switching to colour frame buffer device 128x48
>>     [   54.240707] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
>> [CRTC:58:crtc-3] flip_done timed out
>>     [   64.480706] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR*
>> [CRTC:58:crtc-3] flip_done timed out
>>     [   64.544876] rcar-du feb00000.display: fb0:  frame buffer device
>>     [   64.552013] [drm] Initialized rcar-du 1.0.0 20130110 for
>> feb00000.display on minor 0
>>     [   64.559873] [drm] Device feb00000.display probed
>>
>>>> Commit 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume
>>>> latency") fixes the second issue, but not the first.
>>
>> ... nor the third.
>>
>>>> Reverting commits 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm
>>>> device resume latency") and 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume
>>>> latency PM QoS") fixes both.
>>
>> ... all three.
>
> Sorry for the breakage.
>
> OK, I'll just push the reverts to Linus later today.
>
>>>> Do you have a clue?
>
> Well, kind of.
>
> There is a change in behavior in domain_governor.c that should not
> have made any difference to my eyes, but maybe that's it.
>
> Can you please check if the attached patch makes any difference?

Thanks, but it doesn't seem to fix the issues.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] PM / QoS: Fix default runtime_pm device resume latency
  2017-10-31 18:07             ` Geert Uytterhoeven
@ 2017-10-31 22:32               ` Rafael J. Wysocki
  2017-11-01 10:28                 ` Tero Kristo
  0 siblings, 1 reply; 20+ messages in thread
From: Rafael J. Wysocki @ 2017-10-31 22:32 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Tero Kristo, Linux PM,
	Linux Kernel Mailing List, Linux-Renesas, Laurent Pinchart,
	DRI Development

On Tue, Oct 31, 2017 at 7:07 PM, Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
> Hi Rafael,
>
> On Tue, Oct 31, 2017 at 6:22 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
>> On Tue, Oct 31, 2017 at 2:55 PM, Geert Uytterhoeven
>> <geert@linux-m68k.org> wrote:
>>> Hi Rafael, Tero,
>>>
>>> CC pinchartl, dri-devel
>>>
>>> On Tue, Oct 31, 2017 at 2:10 PM, Geert Uytterhoeven
>>> <geert@linux-m68k.org> wrote:
>>>> CC linux-renesas-soc
>>>>
>>>> On Tue, Oct 31, 2017 at 2:09 PM, Geert Uytterhoeven
>>>> <geert@linux-m68k.org> wrote:
>>>>> On Tue, Oct 31, 2017 at 12:27 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>>>>>> On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote:
>>>>>>> On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo <t-kristo@ti.com> wrote:
>>>>>>> > The recent change to the PM QoS framework to introduce a proper
>>>>>>> > no constraint value overlooked to handle the devices which don't
>>>>>>> > implement PM QoS OPS. Runtime PM is one of the more severely
>>>>>>> > impacted subsystems, failing every attempt to runtime suspend
>>>>>>> > a device. This leads into some nasty second level issues like
>>>>>>> > probe failures and increased power consumption among other things.
>>>>>>>
>>>>>>> Oh, that's bad.
>>>>>>>
>>>>>>> Sorry about breaking it and thanks for the fix!
>>>>>>>
>>>>>>> > Fix this by adding a proper return value for devices that don't
>>>>>>> > implement PM QoS implicitly.
>>>>>>> >
>>>>>>> > Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS")
>>>>>>> > Signed-off-by: Tero Kristo <t-kristo@ti.com>
>>>>>>> > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>>>>
>>>>>>> Applied.
>>>>>>
>>>>>> And pushed to Linus.
>>>>>
>>>>> I'm afraid it is not sufficient.
>>>>>
>>>>> Commit 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS")
>>>>> introduced two issues on Renesas platforms:
>>>>>  1. After boot up, many devices have changed their state from "suspended"
>>>>>     to "active", according to /sys/kernel/debug/pm_genpd/pm_genpd_summary
>>>>>     (comparing that file across boots is one of my standard tests).
>>>>>     Interestingly, doing a system suspend/resume cycle restores their state
>>>>>     to "suspended".
>>>>>
>>>>>  2. During system suspend, the following warning is printed on
>>>>>     r8a7791/koelsch:
>>>>>
>>>>>         i2c-rcar e6530000.i2c: runtime PM trying to suspend device but
>>>>> active child
>>>
>>>  3. I've just bisected a seemingly unrelated issue to the same commit.
>>>     On Salvator-XS with R-Car H3, initialization of the rcar-du driver now
>>>     takes more than 1 minute due to flip_done time outs, while it took 0.12s
>>>     before:
>>>
>>>     [    3.015035] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
>>>     [    3.021721] [drm] No driver support for vblank timestamp query.
>>>     [   13.280738] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out
>>>     [   23.520707] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out
>>>     [   33.760708] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out
>>>     [   44.000755] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out
>>>     [   44.003597] Console: switching to colour frame buffer device 128x48
>>>     [   54.240707] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out
>>>     [   64.480706] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out
>>>     [   64.544876] rcar-du feb00000.display: fb0:  frame buffer device
>>>     [   64.552013] [drm] Initialized rcar-du 1.0.0 20130110 for feb00000.display on minor 0
>>>     [   64.559873] [drm] Device feb00000.display probed
>>>
>>>>> Commit 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume
>>>>> latency") fixes the second issue, but not the first.
>>>
>>> ... nor the third.
>>>
>>>>> Reverting commits 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm
>>>>> device resume latency") and 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume
>>>>> latency PM QoS") fixes both.
>>>
>>> ... all three.
>>
>> Sorry for the breakage.
>>
>> OK, I'll just push the reverts to Linus later today.
>>
>>>>> Do you have a clue?
>>
>> Well, kind of.
>>
>> There is a change in behavior in domain_governor.c that should not
>> have made any difference to my eyes, but maybe that's it.
>>
>> Can you please check if the attached patch makes any difference?
>
> Thanks, but it doesn't seem to fix the issues.

Thanks for testing!

I've just pushed the reverts, but the PM QoS still needs to be fixed,
so we have to get to the bottom of this.

The current theory goes that the changes in domain_governor.c are to
blame.  Is genpd involved in all of the issues with the PM QoS fix you
have seen?

Thanks,
Rafael

* Re: [PATCH] PM / QoS: Fix default runtime_pm device resume latency
  2017-10-31 22:32               ` Rafael J. Wysocki
@ 2017-11-01 10:28                 ` Tero Kristo
  2017-11-01 20:50                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 20+ messages in thread
From: Tero Kristo @ 2017-11-01 10:28 UTC (permalink / raw)
  To: Rafael J. Wysocki, Geert Uytterhoeven
  Cc: Rafael J. Wysocki, Linux PM, Linux Kernel Mailing List,
	Linux-Renesas, Laurent Pinchart, DRI Development

On 01/11/17 00:32, Rafael J. Wysocki wrote:
> [cut]
> 
> Thanks for testing!
> 
> I've just pushed the reverts, but the PM QoS still needs to be fixed,
> so we have to get to the bottom of this.
> 
> The current theory goes that the changes in domain_governor.c are to
> blame.  Is genpd involved in all of the issues with the PM QoS fix you
> have seen?
> 
> Thanks,
> Rafael
> 

It seems the default values for pm_qos have changed with the patch, and
that breaks genpd at least. I only fixed runtime PM initially, but you
could try this diff to fix the genpd part as well:

diff --git a/include/linux/pm_qos.h b/include/linux/pm_qos.h
index d68b056..7c8f643 100644
--- a/include/linux/pm_qos.h
+++ b/include/linux/pm_qos.h
@@ -34,9 +34,9 @@ enum pm_qos_flags_status {
 #define PM_QOS_NETWORK_LAT_DEFAULT_VALUE       (2000 * USEC_PER_SEC)
 #define PM_QOS_NETWORK_THROUGHPUT_DEFAULT_VALUE        0
 #define PM_QOS_MEMORY_BANDWIDTH_DEFAULT_VALUE  0
-#define PM_QOS_RESUME_LATENCY_DEFAULT_VALUE    0
+#define PM_QOS_RESUME_LATENCY_DEFAULT_VALUE    PM_QOS_LATENCY_ANY
 #define PM_QOS_RESUME_LATENCY_NO_CONSTRAINT    PM_QOS_LATENCY_ANY
-#define PM_QOS_LATENCY_TOLERANCE_DEFAULT_VALUE 0
+#define PM_QOS_LATENCY_TOLERANCE_DEFAULT_VALUE (-1)
 #define PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT (-1)

 #define PM_QOS_FLAG_NO_POWER_OFF       (1 << 0)

* Re: [PATCH] PM / QoS: Fix default runtime_pm device resume latency
  2017-11-01 10:28                 ` Tero Kristo
@ 2017-11-01 20:50                   ` Rafael J. Wysocki
  2017-11-01 22:36                     ` Rafael J. Wysocki
  0 siblings, 1 reply; 20+ messages in thread
From: Rafael J. Wysocki @ 2017-11-01 20:50 UTC (permalink / raw)
  To: Tero Kristo
  Cc: Rafael J. Wysocki, Geert Uytterhoeven, Rafael J. Wysocki,
	Linux PM, Linux Kernel Mailing List, Linux-Renesas,
	Laurent Pinchart, DRI Development

On Wed, Nov 1, 2017 at 11:28 AM, Tero Kristo <t-kristo@ti.com> wrote:
> [cut]
>
> It seems the default values for pm_qos have changed with the patch, and that
> breaks genpd at least. I only fixed PM runtime initially, but you could try
> this diff to fix the genpd part also:
>
> diff --git a/include/linux/pm_qos.h b/include/linux/pm_qos.h
> index d68b056..7c8f643 100644
> --- a/include/linux/pm_qos.h
> +++ b/include/linux/pm_qos.h
> @@ -34,9 +34,9 @@ enum pm_qos_flags_status {
>  #define PM_QOS_NETWORK_LAT_DEFAULT_VALUE       (2000 * USEC_PER_SEC)
>  #define PM_QOS_NETWORK_THROUGHPUT_DEFAULT_VALUE        0
>  #define PM_QOS_MEMORY_BANDWIDTH_DEFAULT_VALUE  0
> -#define PM_QOS_RESUME_LATENCY_DEFAULT_VALUE    0
> +#define PM_QOS_RESUME_LATENCY_DEFAULT_VALUE    PM_QOS_LATENCY_ANY
>  #define PM_QOS_RESUME_LATENCY_NO_CONSTRAINT    PM_QOS_LATENCY_ANY
> -#define PM_QOS_LATENCY_TOLERANCE_DEFAULT_VALUE 0
> +#define PM_QOS_LATENCY_TOLERANCE_DEFAULT_VALUE (-1)
>  #define PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT (-1)
>
>  #define PM_QOS_FLAG_NO_POWER_OFF       (1 << 0)

This is the original change in pm_qos.h (up to the GMail-induced
whitespace breakage):

-#define PM_QOS_DEFAULT_VALUE -1
+#define PM_QOS_DEFAULT_VALUE (-1)
+#define PM_QOS_LATENCY_ANY S32_MAX

#define PM_QOS_CPU_DMA_LAT_DEFAULT_VALUE (2000 * USEC_PER_SEC)
#define PM_QOS_NETWORK_LAT_DEFAULT_VALUE (2000 * USEC_PER_SEC)
#define PM_QOS_NETWORK_THROUGHPUT_DEFAULT_VALUE 0
#define PM_QOS_MEMORY_BANDWIDTH_DEFAULT_VALUE 0
#define PM_QOS_RESUME_LATENCY_DEFAULT_VALUE 0
+#define PM_QOS_RESUME_LATENCY_NO_CONSTRAINT PM_QOS_LATENCY_ANY
#define PM_QOS_LATENCY_TOLERANCE_DEFAULT_VALUE 0
#define PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT (-1)

-#define PM_QOS_LATENCY_ANY ((s32)(~(__u32)0 >> 1))

#define PM_QOS_FLAG_NO_POWER_OFF (1 << 0)
#define PM_QOS_FLAG_REMOTE_WAKEUP (1 << 1)

so the only thing that really has changed is the addition of
PM_QOS_RESUME_LATENCY_NO_CONSTRAINT, so I'm not really sure what you
mean.  Care to elaborate?
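
The claim that the rewrite of PM_QOS_LATENCY_ANY is not a functional change can be checked outside the kernel. The following is only a userspace sketch: `s32` and `u32` are stand-in typedefs for the kernel types, `INT32_MAX` stands in for `S32_MAX`, and the `OLD_`/`NEW_` macro names and the helper function are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's s32/__u32 typedefs (assumption). */
typedef int32_t s32;
typedef uint32_t u32;

/* The definition removed by the original change. */
#define OLD_PM_QOS_LATENCY_ANY ((s32)(~(u32)0 >> 1))
/* The definition added by it; S32_MAX in the kernel. */
#define NEW_PM_QOS_LATENCY_ANY INT32_MAX

/* Both expressions spell 0x7fffffff, so the build fails if they differ. */
static_assert(OLD_PM_QOS_LATENCY_ANY == NEW_PM_QOS_LATENCY_ANY,
              "PM_QOS_LATENCY_ANY value must be unchanged");

/* Hypothetical helper: returns 1 when the two definitions agree. */
static int latency_any_unchanged(void)
{
    return OLD_PM_QOS_LATENCY_ANY == NEW_PM_QOS_LATENCY_ANY;
}
```

If this builds with a C11 compiler, the two spellings are the same value, which is consistent with the statement that only PM_QOS_RESUME_LATENCY_NO_CONSTRAINT is new in that hunk.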

There is a bug in the genpd part of the original patch (the
multiplication by NSEC_PER_USEC in dev_update_qos_constraint() should
not be applied to the effective_constraint value), but it doesn't
matter too much now that the problematic commit has been reverted.
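
The unit problem described above can be sketched in isolation. This is not the genpd code: the struct, its field names and both function names are hypothetical, and it assumes the cached effective constraint is already kept in nanoseconds while the raw PM QoS value is in microseconds.

```c
#include <stdint.h>

#define NSEC_PER_USEC 1000LL

/*
 * Hypothetical stand-in for the two places a device constraint can come
 * from: a cached value already in nanoseconds, or a raw PM QoS value in
 * microseconds.
 */
struct constraint_src {
    int     has_cached_ns;            /* nonzero: cached value is valid */
    int64_t effective_constraint_ns;  /* already in nanoseconds */
    int32_t qos_value_us;             /* raw PM QoS value, microseconds */
};

/* Buggy shape: scales unconditionally, so the cached ns value is scaled again. */
static int64_t constraint_ns_buggy(const struct constraint_src *s)
{
    int64_t v = s->has_cached_ns ? s->effective_constraint_ns
                                 : s->qos_value_us;
    return v * NSEC_PER_USEC;
}

/* Fixed shape: only the microsecond value is converted. */
static int64_t constraint_ns_fixed(const struct constraint_src *s)
{
    if (s->has_cached_ns)
        return s->effective_constraint_ns;
    return (int64_t)s->qos_value_us * NSEC_PER_USEC;
}
```

With a cached value of 5000 ns, the buggy variant reports 5000000 ns while the fixed one reports 5000 ns. For an uncached 5 us request, both agree on 5000 ns.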

I'll post a new version of it for testing shortly, but on top of a
genpd governor patch to make it behave consistently.

Thanks,
Rafael

* Re: [PATCH] PM / QoS: Fix default runtime_pm device resume latency
  2017-11-01 20:50                   ` Rafael J. Wysocki
@ 2017-11-01 22:36                     ` Rafael J. Wysocki
  0 siblings, 0 replies; 20+ messages in thread
From: Rafael J. Wysocki @ 2017-11-01 22:36 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tero Kristo, Geert Uytterhoeven, Rafael J. Wysocki, Linux PM,
	Linux Kernel Mailing List, Linux-Renesas, Laurent Pinchart,
	DRI Development

[cut]

>> It seems the default values for pm_qos have changed with the patch, and that
>> breaks genpd at least. I only fixed PM runtime initially, but you could try
>> this diff to fix the genpd part also:
>>
>> diff --git a/include/linux/pm_qos.h b/include/linux/pm_qos.h
>> index d68b056..7c8f643 100644
>> --- a/include/linux/pm_qos.h
>> +++ b/include/linux/pm_qos.h
>> @@ -34,9 +34,9 @@ enum pm_qos_flags_status {
>>  #define PM_QOS_NETWORK_LAT_DEFAULT_VALUE       (2000 * USEC_PER_SEC)
>>  #define PM_QOS_NETWORK_THROUGHPUT_DEFAULT_VALUE        0
>>  #define PM_QOS_MEMORY_BANDWIDTH_DEFAULT_VALUE  0
>> -#define PM_QOS_RESUME_LATENCY_DEFAULT_VALUE    0
>> +#define PM_QOS_RESUME_LATENCY_DEFAULT_VALUE    PM_QOS_LATENCY_ANY
>>  #define PM_QOS_RESUME_LATENCY_NO_CONSTRAINT    PM_QOS_LATENCY_ANY
>> -#define PM_QOS_LATENCY_TOLERANCE_DEFAULT_VALUE 0
>> +#define PM_QOS_LATENCY_TOLERANCE_DEFAULT_VALUE (-1)
>>  #define PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT (-1)
>>
>>  #define PM_QOS_FLAG_NO_POWER_OFF       (1 << 0)
>
> This is the original change in pm_qos.h (up to the GMail-induced
> whitespace breakage):
>
> -#define PM_QOS_DEFAULT_VALUE -1
> +#define PM_QOS_DEFAULT_VALUE (-1)
> +#define PM_QOS_LATENCY_ANY S32_MAX
>
> #define PM_QOS_CPU_DMA_LAT_DEFAULT_VALUE (2000 * USEC_PER_SEC)
> #define PM_QOS_NETWORK_LAT_DEFAULT_VALUE (2000 * USEC_PER_SEC)
> #define PM_QOS_NETWORK_THROUGHPUT_DEFAULT_VALUE 0
> #define PM_QOS_MEMORY_BANDWIDTH_DEFAULT_VALUE 0
> #define PM_QOS_RESUME_LATENCY_DEFAULT_VALUE 0
> +#define PM_QOS_RESUME_LATENCY_NO_CONSTRAINT PM_QOS_LATENCY_ANY

OK, so I should have changed PM_QOS_RESUME_LATENCY_DEFAULT_VALUE to
PM_QOS_LATENCY_ANY too, so that the default is still "no restriction".

> #define PM_QOS_LATENCY_TOLERANCE_DEFAULT_VALUE 0
> #define PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT (-1)
>
> -#define PM_QOS_LATENCY_ANY ((s32)(~(__u32)0 >> 1))
>
> #define PM_QOS_FLAG_NO_POWER_OFF (1 << 0)
> #define PM_QOS_FLAG_REMOTE_WAKEUP (1 << 1)
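
The point about PM_QOS_RESUME_LATENCY_DEFAULT_VALUE can be illustrated with a toy model. It is a sketch under stated assumptions, not the kernel implementation: both functions are hypothetical, the model assumes a device with no requests reads back the default value, and it assumes (as the runtime PM failures reported in this thread suggest) that a resume latency of 0 now means "no latency tolerated" and blocks suspend.

```c
#include <stddef.h>
#include <stdint.h>

#define PM_QOS_LATENCY_ANY INT32_MAX  /* S32_MAX in the kernel header */

/* Toy model: with no requests, the constraint reads back as the default. */
static int32_t effective_latency(const int32_t *req, size_t n,
                                 int32_t default_value)
{
    int32_t min = PM_QOS_LATENCY_ANY;
    size_t i;

    if (n == 0)
        return default_value;
    /* Resume latency requests are aggregated by taking the minimum. */
    for (i = 0; i < n; i++)
        if (req[i] < min)
            min = req[i];
    return min;
}

/* Toy model: 0 means "no resume latency tolerated", so suspend is refused. */
static int suspend_allowed(int32_t latency)
{
    return latency != 0;
}
```

In this model a device without any request is refused suspend when the default is 0 and allowed to suspend when the default is PM_QOS_LATENCY_ANY, which is the "no restriction" behavior the default was meant to keep.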

Thanks,
Rafael

end of thread

Thread overview: 20+ messages
2017-10-30  7:10 [PATCH] PM / QoS: Fix default runtime_pm device resume latency Tero Kristo
2017-10-30 10:19 ` Rafael J. Wysocki
2017-10-30 23:27   ` Rafael J. Wysocki
2017-10-31  7:13     ` Tero Kristo
2017-10-31  8:40       ` Rafael J. Wysocki
2017-10-31 10:18         ` Tero Kristo
2017-10-31 13:09     ` Geert Uytterhoeven
2017-10-31 13:10       ` Geert Uytterhoeven
2017-10-31 13:55         ` Geert Uytterhoeven
2017-10-31 14:04           ` Ulf Hansson
2017-10-31 16:35             ` Rafael J. Wysocki
2017-10-31 15:37           ` Jani Nikula
2017-10-31 16:40             ` Daniel Vetter
2017-10-31 17:12           ` Laurent Pinchart
2017-10-31 17:22           ` Rafael J. Wysocki
2017-10-31 18:07             ` Geert Uytterhoeven
2017-10-31 22:32               ` Rafael J. Wysocki
2017-11-01 10:28                 ` Tero Kristo
2017-11-01 20:50                   ` Rafael J. Wysocki
2017-11-01 22:36                     ` Rafael J. Wysocki
