* [PATCH] drm/lima: Use delayed timer as default in devfreq profile
@ 2021-01-27 10:51 ` Lukasz Luba
  0 siblings, 0 replies; 18+ messages in thread
From: Lukasz Luba @ 2021-01-27 10:51 UTC (permalink / raw)
  To: linux-kernel, airlied, daniel, lima, dri-devel
  Cc: yuq825, christianshewitt, lukasz.luba

Devfreq framework supports 2 modes for monitoring devices.
Use delayed timer as default instead of deferrable timer
in order to monitor the GPU status regardless of CPU idle.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
---
Hi all,

I missed the Lima driver while working on the Panfrost patch that fixes
the issue with the default devfreq framework polling mode. More about
this, and the patch itself, can be found here [1].

Regards,
Lukasz Luba

[1] https://lore.kernel.org/lkml/20210105164111.30122-1-lukasz.luba@arm.com/

 drivers/gpu/drm/lima/lima_devfreq.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/lima/lima_devfreq.c b/drivers/gpu/drm/lima/lima_devfreq.c
index 5686ad4aaf7c..f1c9eb3e71bd 100644
--- a/drivers/gpu/drm/lima/lima_devfreq.c
+++ b/drivers/gpu/drm/lima/lima_devfreq.c
@@ -81,6 +81,7 @@ static int lima_devfreq_get_dev_status(struct device *dev,
 }
 
 static struct devfreq_dev_profile lima_devfreq_profile = {
+	.timer = DEVFREQ_TIMER_DELAYED,
 	.polling_ms = 50, /* ~3 frames */
 	.target = lima_devfreq_target,
 	.get_dev_status = lima_devfreq_get_dev_status,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] drm/lima: Use delayed timer as default in devfreq profile
  2021-01-27 10:51 ` Lukasz Luba
@ 2021-01-30 13:51   ` Qiang Yu
  -1 siblings, 0 replies; 18+ messages in thread
From: Qiang Yu @ 2021-01-30 13:51 UTC (permalink / raw)
  To: Lukasz Luba
  Cc: Linux Kernel Mailing List, David Airlie, Daniel Vetter, lima,
	dri-devel, christianshewitt

Thanks for the patch. But I can't observe any difference in glmark2
with or without this patch.
Maybe you can provide another test that benefits from it.

Considering that it will wake up the CPU more frequently, and that the
user may choose to change this via sysfs,
I'd prefer not to apply it.

Regards,
Qiang

On Wed, Jan 27, 2021 at 6:51 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
>
> Devfreq framework supports 2 modes for monitoring devices.
> Use delayed timer as default instead of deferrable timer
> in order to monitor the GPU status regardless of CPU idle.
>
> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
> ---
> Hi all,
>
> I missed the Lima driver while working on the Panfrost patch that fixes
> the issue with the default devfreq framework polling mode. More about
> this, and the patch itself, can be found here [1].
>
> Regards,
> Lukasz Luba
>
> [1] https://lore.kernel.org/lkml/20210105164111.30122-1-lukasz.luba@arm.com/
>
>  drivers/gpu/drm/lima/lima_devfreq.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/lima/lima_devfreq.c b/drivers/gpu/drm/lima/lima_devfreq.c
> index 5686ad4aaf7c..f1c9eb3e71bd 100644
> --- a/drivers/gpu/drm/lima/lima_devfreq.c
> +++ b/drivers/gpu/drm/lima/lima_devfreq.c
> @@ -81,6 +81,7 @@ static int lima_devfreq_get_dev_status(struct device *dev,
>  }
>
>  static struct devfreq_dev_profile lima_devfreq_profile = {
> +       .timer = DEVFREQ_TIMER_DELAYED,
>         .polling_ms = 50, /* ~3 frames */
>         .target = lima_devfreq_target,
>         .get_dev_status = lima_devfreq_get_dev_status,
> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] drm/lima: Use delayed timer as default in devfreq profile
  2021-01-30 13:51   ` Qiang Yu
@ 2021-02-01  9:53     ` Lukasz Luba
  -1 siblings, 0 replies; 18+ messages in thread
From: Lukasz Luba @ 2021-02-01  9:53 UTC (permalink / raw)
  To: Qiang Yu
  Cc: Linux Kernel Mailing List, David Airlie, Daniel Vetter, lima,
	dri-devel, christianshewitt

Hi Qiang,

On 1/30/21 1:51 PM, Qiang Yu wrote:
> Thanks for the patch. But I can't observe any difference in glmark2
> with or without this patch.
> Maybe you can provide another test that benefits from it.

This is a design problem and it has an impact on the whole system.
There are a few issues. When the device is not checked and there are
long delays between the last check and the current one, the history
is broken. This confuses the devfreq governor and the thermal
governor (Intelligent Power Allocation (IPA)). The thermal governor
works on stale stats and makes stupid decisions, because there are
no new stats (the device was not checked). The same applies to the
devfreq simple_ondemand governor, which tries to work on a very long
period, even 3sec, and predicts the next frequency based on it
(which is broken).

How it should be done: a constant, reliable check is needed, then:
- the period is guaranteed and has a fixed size, e.g. 50ms or 100ms.
- the device status is quite recent, so thermal devfreq cooling
   provides 'fresh' data to the thermal governor

This would prevent odd behavior and solve the broken cases.

> 
> Considering that it will wake up the CPU more frequently, and that the
> user may choose to change this via sysfs,
> I'd prefer not to apply it.

The deferrable timer is the wrong option for a GPU; it makes more
sense for UFS or eMMC. It's also not recommended for NoC buses. I
discovered this some time ago and proposed an option to switch to
the delayed timer. Trust me, it wasn't obvious to find out that this
missing check has those impacts. So other engineers or users might
not know that some problems they face (especially when the device
load is changing) are due to this delayed vs deferrable timer
choice, or that they can change it in sysfs.

Regards,
Lukasz

> 
> Regards,
> Qiang
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] drm/lima: Use delayed timer as default in devfreq profile
  2021-02-01  9:53     ` Lukasz Luba
@ 2021-02-02  1:01       ` Qiang Yu
  -1 siblings, 0 replies; 18+ messages in thread
From: Qiang Yu @ 2021-02-02  1:01 UTC (permalink / raw)
  To: Lukasz Luba
  Cc: Linux Kernel Mailing List, David Airlie, Daniel Vetter, lima,
	dri-devel, Christian Hewitt

Hi Lukasz,

Thanks for the explanation. So the deferrable timer option makes a
mistake only when the GPU goes from idle to busy, for just one
polling period, in this case 50ms, right?
But the delayed timer will wake up the CPU every 50ms even when the
system is idle. Will this cause more power consumption in a case
like phone suspend?

Regards,
Qiang


On Mon, Feb 1, 2021 at 5:53 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
>
> Hi Qiang,
>
> On 1/30/21 1:51 PM, Qiang Yu wrote:
> > Thanks for the patch. But I can't observe any difference in glmark2
> > with or without this patch.
> > Maybe you can provide another test that benefits from it.
>
> This is a design problem and it has an impact on the whole system.
> There are a few issues. When the device is not checked and there are
> long delays between the last check and the current one, the history
> is broken. This confuses the devfreq governor and the thermal
> governor (Intelligent Power Allocation (IPA)). The thermal governor
> works on stale stats and makes stupid decisions, because there are
> no new stats (the device was not checked). The same applies to the
> devfreq simple_ondemand governor, which tries to work on a very long
> period, even 3sec, and predicts the next frequency based on it
> (which is broken).
>
> How it should be done: a constant, reliable check is needed, then:
> - the period is guaranteed and has a fixed size, e.g. 50ms or 100ms.
> - the device status is quite recent, so thermal devfreq cooling
>    provides 'fresh' data to the thermal governor
>
> This would prevent odd behavior and solve the broken cases.
>
> >
> > Considering that it will wake up the CPU more frequently, and that the
> > user may choose to change this via sysfs,
> > I'd prefer not to apply it.
>
> The deferrable timer is the wrong option for a GPU; it makes more
> sense for UFS or eMMC. It's also not recommended for NoC buses. I
> discovered this some time ago and proposed an option to switch to
> the delayed timer. Trust me, it wasn't obvious to find out that this
> missing check has those impacts. So other engineers or users might
> not know that some problems they face (especially when the device
> load is changing) are due to this delayed vs deferrable timer
> choice, or that they can change it in sysfs.
>
> Regards,
> Lukasz
>
> >
> > Regards,
> > Qiang
> >

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] drm/lima: Use delayed timer as default in devfreq profile
  2021-02-02  1:01       ` Qiang Yu
@ 2021-02-02 14:02         ` Lukasz Luba
  -1 siblings, 0 replies; 18+ messages in thread
From: Lukasz Luba @ 2021-02-02 14:02 UTC (permalink / raw)
  To: Qiang Yu
  Cc: Linux Kernel Mailing List, David Airlie, Daniel Vetter, lima,
	dri-devel, Christian Hewitt



On 2/2/21 1:01 AM, Qiang Yu wrote:
> Hi Lukasz,
> 
> Thanks for the explanation. So the deferrable timer option makes a
> mistake only when the GPU goes from idle to busy, for just one
> polling period, in this case 50ms, right?

Not exactly. The driver sets the polling interval to 50ms (in this
case) because it needs a ~3-frame average load (at 60fps). I
discovered quite recently that on systems with 2 or more CPUs, the
devfreq core sometimes does not monitor the devices for seconds at
a time. Therefore, we might end up with quite a big amount of work
that the GPU is doing, but we don't know about it. The devfreq core
didn't check <- the timer didn't fire. Then suddenly the CPU which
had the deferrable timer registered last time wakes up and the
timer triggers a check of our device. We get the stats, but they
might show the load from 1sec, not 50ms. We feed them into the
governor. The governor sees the new load, but it was tested and
configured for 50ms, so it might try to raise the frequency to max.
The GPU work might already be lower, so there is no need for such a
freq. Then the CPU goes idle again, so there is no devfreq core
check for the next e.g. 1sec, but the frequency stays at the max
OPP and we burn power.

So, it's completely unreliable. We might get stuck at the min
frequency and suffer frame drops, or sometimes get stuck at the max
freq and burn more power when there is no such need.

The same goes for the thermal governor, which is confused by these
old stats that cover a long period, longer than 50ms.

Stats from the last e.g. ~1sec tell you nothing about the real
recent GPU workload.
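
The effect of a stale, overlong sampling window can be illustrated with
a small userspace model. This is only a sketch under stated assumptions:
the 90% threshold is modelled on the simple_ondemand default, the helper
names (window_load_pct, wants_max_freq, fill_trace) are hypothetical, and
the workload is invented. It is not the devfreq governor code.

```c
#include <stddef.h>

/* Assumed threshold, modelled on the 90% default of simple_ondemand. */
#define UPTHRESHOLD_PCT 90

/*
 * Percentage of busy samples in trace[start, start + len).
 * Each sample covers one millisecond: 1 = GPU busy, 0 = GPU idle.
 */
static unsigned int window_load_pct(const unsigned char *trace,
				    size_t start, size_t len)
{
	unsigned int busy = 0;
	size_t i;

	for (i = start; i < start + len; i++)
		busy += trace[i];

	return len ? (unsigned int)(busy * 100 / len) : 0;
}

/* Simplified decision: request the maximum frequency above the threshold. */
static int wants_max_freq(unsigned int load_pct)
{
	return load_pct > UPTHRESHOLD_PCT;
}

/* Hypothetical workload: busy for the first busy_ms, idle afterwards. */
static void fill_trace(unsigned char *trace, size_t total_ms, size_t busy_ms)
{
	size_t i;

	for (i = 0; i < total_ms; i++)
		trace[i] = i < busy_ms;
}
```

With a 1000 ms trace that is busy for the first 950 ms, the single 1sec
window reports 95% load and would push the frequency to max, while the
most recent 50 ms window reports 0%, which is what the GPU is doing now.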

> But the delayed timer will wake up the CPU every 50ms even when the
> system is idle. Will this cause more power consumption in a case
> like phone suspend?

No, in the case of phone suspend it won't increase the power consumption.
The device won't be woken up; it will stay in suspend.

Regards,
Lukasz


> 
> Regards,
> Qiang
> 
> 
> On Mon, Feb 1, 2021 at 5:53 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
>>
>> Hi Qiang,
>>
>> On 1/30/21 1:51 PM, Qiang Yu wrote:
>>> Thanks for the patch. But I can't observe any difference in glmark2
>>> with or without this patch.
>>> Maybe you can provide another test that benefits from it.
>>
>> This is a design problem and it has an impact on the whole system.
>> There are a few issues. When the device is not checked and there are
>> long delays between the last check and the current one, the history
>> is broken. This confuses the devfreq governor and the thermal
>> governor (Intelligent Power Allocation (IPA)). The thermal governor
>> works on stale stats and makes stupid decisions, because there are
>> no new stats (the device was not checked). The same applies to the
>> devfreq simple_ondemand governor, which tries to work on a very long
>> period, even 3sec, and predicts the next frequency based on it
>> (which is broken).
>>
>> How it should be done: a constant, reliable check is needed, then:
>> - the period is guaranteed and has a fixed size, e.g. 50ms or 100ms.
>> - the device status is quite recent, so thermal devfreq cooling
>>     provides 'fresh' data to the thermal governor
>>
>> This would prevent odd behavior and solve the broken cases.
>>
>>>
>>> Considering that it will wake up the CPU more frequently, and that the
>>> user may choose to change this via sysfs,
>>> I'd prefer not to apply it.
>>
>> The deferrable timer is the wrong option for a GPU; it makes more
>> sense for UFS or eMMC. It's also not recommended for NoC buses. I
>> discovered this some time ago and proposed an option to switch to
>> the delayed timer. Trust me, it wasn't obvious to find out that this
>> missing check has those impacts. So other engineers or users might
>> not know that some problems they face (especially when the device
>> load is changing) are due to this delayed vs deferrable timer
>> choice, or that they can change it in sysfs.
>>
>> Regards,
>> Lukasz
>>
>>>
>>> Regards,
>>> Qiang
>>>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] drm/lima: Use delayed timer as default in devfreq profile
  2021-02-02 14:02         ` Lukasz Luba
@ 2021-02-03  2:01           ` Qiang Yu
  -1 siblings, 0 replies; 18+ messages in thread
From: Qiang Yu @ 2021-02-03  2:01 UTC (permalink / raw)
  To: Lukasz Luba
  Cc: Linux Kernel Mailing List, David Airlie, Daniel Vetter, lima,
	dri-devel, Christian Hewitt

On Tue, Feb 2, 2021 at 10:02 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
>
>
>
> On 2/2/21 1:01 AM, Qiang Yu wrote:
> > Hi Lukasz,
> >
> > Thanks for the explanation. So the deferrable timer option makes a
> > mistake only when the GPU goes from idle to busy, for just one
> > polling period, in this case 50ms, right?
>
> Not exactly. The driver sets the polling interval to 50ms (in this
> case) because it needs a ~3-frame average load (at 60fps). I
> discovered quite recently that on systems with 2 or more CPUs, the
> devfreq core sometimes does not monitor the devices for seconds at
> a time. Therefore, we might end up with quite a big amount of work
> that the GPU is doing, but we don't know about it. The devfreq core
> didn't check <- the timer didn't fire. Then suddenly the CPU which
> had the deferrable timer registered last time wakes up and the
> timer triggers a check of our device. We get the stats, but they
> might show the load from 1sec, not 50ms. We feed them into the
> governor. The governor sees the new load, but it was tested and
> configured for 50ms, so it might try to raise the frequency to max.
> The GPU work might already be lower, so there is no need for such a
> freq. Then the CPU goes idle again, so there is no devfreq core
> check for the next e.g. 1sec, but the frequency stays at the max
> OPP and we burn power.
>
> So, it's completely unreliable. We might get stuck at the min
> frequency and suffer frame drops, or sometimes get stuck at the max
> freq and burn more power when there is no such need.
>
> The same goes for the thermal governor, which is confused by these
> old stats that cover a long period, longer than 50ms.
>
> Stats from the last e.g. ~1sec tell you nothing about the real
> recent GPU workload.
Oh, right, I missed this case.

>
> > But the delayed timer will wake up the CPU every 50ms even when the
> > system is idle. Will this cause more power consumption in a case
> > like phone suspend?
>
> No, in the case of phone suspend it won't increase the power consumption.
> The device won't be woken up; it will stay in suspend.
I mean that the CPU is woken up frequently by the timer during phone
suspend, not the whole device (like the display).

It seems better to use the deferrable timer when the device is
suspended, to save power, and the delayed timer when the device is in
a working state. The user knows this and can use sysfs to change it.
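
For reference, the runtime switch mentioned here goes through the
devfreq `timer` sysfs attribute, which accepts `deferrable` or
`delayed`. A minimal userspace sketch in C follows. The helper name
set_devfreq_timer is hypothetical, and the caller must supply the full
attribute path, because the device directory under /sys/class/devfreq/
is platform-specific.

```c
#include <stdio.h>
#include <string.h>

/*
 * Write a devfreq timer mode ("delayed" or "deferrable") to a
 * sysfs-style attribute file given by attr_path.
 * Returns 0 on success, -1 on an invalid mode or an I/O error.
 */
static int set_devfreq_timer(const char *attr_path, const char *mode)
{
	FILE *f;
	int ok;

	/* Only the two timer kinds discussed in this thread are accepted. */
	if (strcmp(mode, "delayed") != 0 && strcmp(mode, "deferrable") != 0)
		return -1;

	f = fopen(attr_path, "w");
	if (!f)
		return -1;

	ok = fprintf(f, "%s\n", mode) > 0;

	/* fclose() flushes the buffer, so a rejected write shows up here. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

A call would look like
set_devfreq_timer("/sys/class/devfreq/<device>/timer", "delayed"),
where <device> is a placeholder for the actual devfreq device name and
the process needs write permission on the attribute.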

Setting the delayed timer as the default is reasonable, so the patch is:
Reviewed-by: Qiang Yu <yuq825@gmail.com>

Regards,
Qiang

>
> Regards,
> Lukasz
>
>
> >
> > Regards,
> > Qiang
> >
> >
> > On Mon, Feb 1, 2021 at 5:53 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
> >>
> >> Hi Qiang,
> >>
> >> On 1/30/21 1:51 PM, Qiang Yu wrote:
> >>> Thanks for the patch. But I can't observe any difference in glmark2
> >>> with or without this patch.
> >>> Maybe you can provide another test that benefits from it.
> >>
> >> This is a design problem and it has an impact on the whole system.
> >> There are a few issues. When the device is not checked and there are
> >> long delays between the last check and the current one, the history
> >> is broken. This confuses the devfreq governor and the thermal
> >> governor (Intelligent Power Allocation (IPA)). The thermal governor
> >> works on stale stats and makes stupid decisions, because there are
> >> no new stats (the device was not checked). The same applies to the
> >> devfreq simple_ondemand governor, which tries to work on a very long
> >> period, even 3sec, and predicts the next frequency based on it
> >> (which is broken).
> >>
> >> How it should be done: a constant, reliable check is needed, then:
> >> - the period is guaranteed and has a fixed size, e.g. 50ms or 100ms.
> >> - the device status is quite recent, so thermal devfreq cooling
> >>     provides 'fresh' data to the thermal governor
> >>
> >> This would prevent odd behavior and solve the broken cases.
> >>
> >>>
> >>> Considering that it will wake up the CPU more frequently, and that the
> >>> user may choose to change this via sysfs,
> >>> I'd prefer not to apply it.
> >>
> >> The deferrable timer is the wrong option for a GPU; it makes more
> >> sense for UFS or eMMC. It's also not recommended for NoC buses. I
> >> discovered this some time ago and proposed an option to switch to
> >> the delayed timer. Trust me, it wasn't obvious to find out that this
> >> missing check has those impacts. So other engineers or users might
> >> not know that some problems they face (especially when the device
> >> load is changing) are due to this delayed vs deferrable timer
> >> choice, or that they can change it in sysfs.
> >>
> >> Regards,
> >> Lukasz
> >>
> >>>
> >>> Regards,
> >>> Qiang
> >>>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] drm/lima: Use delayed timer as default in devfreq profile
@ 2021-02-03  2:01           ` Qiang Yu
  0 siblings, 0 replies; 18+ messages in thread
From: Qiang Yu @ 2021-02-03  2:01 UTC (permalink / raw)
  To: Lukasz Luba
  Cc: lima, David Airlie, Christian Hewitt, Linux Kernel Mailing List,
	dri-devel

On Tue, Feb 2, 2021 at 10:02 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
>
>
>
> On 2/2/21 1:01 AM, Qiang Yu wrote:
> > Hi Lukasz,
> >
> > Thanks for the explanation. So the deferred timer option only goes wrong
> > when the GPU goes from idle to busy for just one poll period, in this
> > case 50ms, right?
>
> Not exactly. The driver sets the polling interval to 50ms (in this case)
> because it needs a ~3-frame average load (at 60fps). I discovered the
> issue quite recently: on systems with 2 or more CPUs, the devfreq
> core might not monitor the devices for seconds at a time. Therefore, we
> might end up with quite a big amount of work that the GPU is doing, but we
> don't know about it. The devfreq core didn't check because the timer didn't
> fire. Then suddenly the CPU which had the deferred timer registered last
> time wakes up and the timer triggers a check of our device. We get the stats,
> but they might show load from 1sec, not 50ms. We feed them into the
> governor. The governor sees the new load, but it was tested and configured
> for 50ms, so it might try to raise the frequency to max. The GPU work might
> already be lower, so there is no need for such a frequency. Then the CPU goes
> idle again, so there is no devfreq core check for the next e.g. 1sec, but the
> frequency stays at the max OPP and we burn power.
>
> So, it's completely unreliable. We might get stuck at the min frequency and
> suffer frame drops, or sometimes get stuck at the max frequency and burn more
> power when there is no such need.
>
> It is similar for the thermal governor, which is confused by these old,
> long-period stats covering more than 50ms.
>
> Stats from the last e.g. ~1sec tell you nothing about the real recent GPU
> workload.
Oh, right, I missed this case.

>
> > But the delayed timer will wake up the CPU every 50ms even when the system
> > is idle; will this cause more power consumption in a case like phone suspend?
>
> No, in the case of phone suspend it won't increase the power consumption.
> The device won't be woken up; it will stay in suspend.
I mean the CPU is woken up frequently by the timer during phone suspend,
not the whole device (like the display).

It seems better to have the deferred timer when the device is suspended, for
power saving, and the delayed timer when the device is in a working state.
The user knows this and can use sysfs to change it.

Setting the delayed timer as the default is reasonable, so the patch is:
Reviewed-by: Qiang Yu <yuq825@gmail.com>

Regards,
Qiang

>
> Regards,
> Lukasz
>
>
> >
> > Regards,
> > Qiang
> >
> >
> > On Mon, Feb 1, 2021 at 5:53 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
> >>
> >> Hi Qiang,
> >>
> >> On 1/30/21 1:51 PM, Qiang Yu wrote:
> >>> Thanks for the patch. But I can't observe any difference on glmark2
> >>> with or without this patch.
> >>> Maybe you can provide another test which can benefit from it.
> >>
> >> This is a design problem and has an impact on the whole system.
> >> There are a few issues. When the device is not checked and there are
> >> long delays between the last check and the current one, the history is
> >> broken. It confuses the devfreq governor and the thermal governor
> >> (Intelligent Power Allocation (IPA)). The thermal governor works on stale
> >> stats data and makes stupid decisions, because there are no new stats
> >> (device not checked). The same applies to the devfreq simple_ondemand
> >> governor, where it tries to work on a very long period, even 3sec, and
> >> makes a prediction for the next frequency based on it (which is broken).
> >>
> >> How it should be done: a constant, reliable check is needed, then:
> >> - the period is guaranteed and has a fixed size, e.g. 50ms or 100ms.
> >> - the device status is quite recent, so thermal devfreq cooling provides
> >>     'fresh' data to the thermal governor
> >>
> >> This would prevent odd behavior and solve the broken cases.
> >>
> >>>
> >>> Considering it will wake up the CPU more frequently, and the user may
> >>> choose to change this via sysfs,
> >>> I'd prefer not to apply it.
> >>
> >> The deferred timer is the wrong option for a GPU; for UFS or eMMC it makes
> >> more sense. It's also not recommended for NoC buses. I discovered that
> >> some time ago and proposed an option to switch to the delayed timer.
> >> Trust me, it wasn't obvious to find out that this missing check has
> >> those impacts. So other engineers or users might not know that some
> >> problems they face (especially when the device load is changing) are due
> >> to this delayed vs deferred timer and they will change it in the sysfs.
> >>
> >> Regards,
> >> Lukasz
> >>
> >>>
> >>> Regards,
> >>> Qiang
> >>>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] drm/lima: Use delayed timer as default in devfreq profile
  2021-02-03  2:01           ` Qiang Yu
@ 2021-02-04 13:39             ` Robin Murphy
  -1 siblings, 0 replies; 18+ messages in thread
From: Robin Murphy @ 2021-02-04 13:39 UTC (permalink / raw)
  To: Qiang Yu, Lukasz Luba
  Cc: lima, David Airlie, Christian Hewitt, Linux Kernel Mailing List,
	dri-devel

On 2021-02-03 02:01, Qiang Yu wrote:
> On Tue, Feb 2, 2021 at 10:02 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
>>
>>
>>
>> On 2/2/21 1:01 AM, Qiang Yu wrote:
>>> Hi Lukasz,
>>>
>>> Thanks for the explanation. So the deferred timer option only goes wrong
>>> when the GPU goes from idle to busy for just one poll period, in this
>>> case 50ms, right?
>>
>> Not exactly. The driver sets the polling interval to 50ms (in this case)
>> because it needs a ~3-frame average load (at 60fps). I discovered the
>> issue quite recently: on systems with 2 or more CPUs, the devfreq
>> core might not monitor the devices for seconds at a time. Therefore, we
>> might end up with quite a big amount of work that the GPU is doing, but we
>> don't know about it. The devfreq core didn't check because the timer didn't
>> fire. Then suddenly the CPU which had the deferred timer registered last
>> time wakes up and the timer triggers a check of our device. We get the stats,
>> but they might show load from 1sec, not 50ms. We feed them into the
>> governor. The governor sees the new load, but it was tested and configured
>> for 50ms, so it might try to raise the frequency to max. The GPU work might
>> already be lower, so there is no need for such a frequency. Then the CPU goes
>> idle again, so there is no devfreq core check for the next e.g. 1sec, but the
>> frequency stays at the max OPP and we burn power.
>>
>> So, it's completely unreliable. We might get stuck at the min frequency and
>> suffer frame drops, or sometimes get stuck at the max frequency and burn more
>> power when there is no such need.
>>
>> It is similar for the thermal governor, which is confused by these old,
>> long-period stats covering more than 50ms.
>>
>> Stats from the last e.g. ~1sec tell you nothing about the real recent GPU
>> workload.
> Oh, right, I missed this case.
> 
>>
>>> But the delayed timer will wake up the CPU every 50ms even when the system
>>> is idle; will this cause more power consumption in a case like phone suspend?
>>
>> No, in the case of phone suspend it won't increase the power consumption.
>> The device won't be woken up; it will stay in suspend.
> I mean the CPU is woken up frequently by the timer during phone suspend,
> not the whole device (like the display).
> 
> It seems better to have the deferred timer when the device is suspended, for
> power saving, and the delayed timer when the device is in a working state.
> The user knows this and can use sysfs to change it.

Doesn't devfreq_suspend_device() already cancel any timer work either 
way in that case?

Robin.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] drm/lima: Use delayed timer as default in devfreq profile
  2021-02-04 13:39             ` Robin Murphy
@ 2021-02-04 14:23               ` Lukasz Luba
  -1 siblings, 0 replies; 18+ messages in thread
From: Lukasz Luba @ 2021-02-04 14:23 UTC (permalink / raw)
  To: Robin Murphy, Qiang Yu
  Cc: lima, David Airlie, Christian Hewitt, Linux Kernel Mailing List,
	dri-devel



On 2/4/21 1:39 PM, Robin Murphy wrote:
> On 2021-02-03 02:01, Qiang Yu wrote:
>> On Tue, Feb 2, 2021 at 10:02 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
>>>
>>>
>>>
>>> On 2/2/21 1:01 AM, Qiang Yu wrote:
>>>> Hi Lukasz,
>>>>
>>>> Thanks for the explanation. So the deferred timer option only goes wrong
>>>> when the GPU goes from idle to busy for just one poll period, in this
>>>> case 50ms, right?
>>>
>>> Not exactly. The driver sets the polling interval to 50ms (in this case)
>>> because it needs a ~3-frame average load (at 60fps). I discovered the
>>> issue quite recently: on systems with 2 or more CPUs, the devfreq
>>> core might not monitor the devices for seconds at a time. Therefore, we
>>> might end up with quite a big amount of work that the GPU is doing, but we
>>> don't know about it. The devfreq core didn't check because the timer didn't
>>> fire. Then suddenly the CPU which had the deferred timer registered last
>>> time wakes up and the timer triggers a check of our device. We get the stats,
>>> but they might show load from 1sec, not 50ms. We feed them into the
>>> governor. The governor sees the new load, but it was tested and configured
>>> for 50ms, so it might try to raise the frequency to max. The GPU work might
>>> already be lower, so there is no need for such a frequency. Then the CPU goes
>>> idle again, so there is no devfreq core check for the next e.g. 1sec, but the
>>> frequency stays at the max OPP and we burn power.
>>>
>>> So, it's completely unreliable. We might get stuck at the min frequency and
>>> suffer frame drops, or sometimes get stuck at the max frequency and burn more
>>> power when there is no such need.
>>>
>>> It is similar for the thermal governor, which is confused by these old,
>>> long-period stats covering more than 50ms.
>>>
>>> Stats from the last e.g. ~1sec tell you nothing about the real recent GPU
>>> workload.
>> Oh, right, I missed this case.
>>
>>>
>>>> But the delayed timer will wake up the CPU every 50ms even when the system
>>>> is idle; will this cause more power consumption in a case like phone suspend?
>>>
>>> No, in the case of phone suspend it won't increase the power consumption.
>>> The device won't be woken up; it will stay in suspend.
>> I mean the CPU is woken up frequently by the timer during phone suspend,
>> not the whole device (like the display).
>>
>> It seems better to have the deferred timer when the device is suspended, for
>> power saving, and the delayed timer when the device is in a working state.
>> The user knows this and can use sysfs to change it.
> 
> Doesn't devfreq_suspend_device() already cancel any timer work either 
> way in that case?

Correct, the governor should pause the monitoring mechanism (and timer).

Regards,
Lukasz

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] drm/lima: Use delayed timer as default in devfreq profile
  2021-02-04 14:23               ` Lukasz Luba
@ 2021-02-07 13:11                 ` Qiang Yu
  -1 siblings, 0 replies; 18+ messages in thread
From: Qiang Yu @ 2021-02-07 13:11 UTC (permalink / raw)
  To: Lukasz Luba
  Cc: Robin Murphy, lima, David Airlie, Christian Hewitt,
	Linux Kernel Mailing List, dri-devel

Applied to drm-misc-next.

Regards,
Qiang

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2021-02-07 13:12 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-27 10:51 [PATCH] drm/lima: Use delayed timer as default in devfreq profile Lukasz Luba
2021-01-27 10:51 ` Lukasz Luba
2021-01-30 13:51 ` Qiang Yu
2021-01-30 13:51   ` Qiang Yu
2021-02-01  9:53   ` Lukasz Luba
2021-02-01  9:53     ` Lukasz Luba
2021-02-02  1:01     ` Qiang Yu
2021-02-02  1:01       ` Qiang Yu
2021-02-02 14:02       ` Lukasz Luba
2021-02-02 14:02         ` Lukasz Luba
2021-02-03  2:01         ` Qiang Yu
2021-02-03  2:01           ` Qiang Yu
2021-02-04 13:39           ` Robin Murphy
2021-02-04 13:39             ` Robin Murphy
2021-02-04 14:23             ` Lukasz Luba
2021-02-04 14:23               ` Lukasz Luba
2021-02-07 13:11               ` Qiang Yu
2021-02-07 13:11                 ` Qiang Yu
