* Re: is 'dynamic-power-coefficient' expected to be based on 'real' power measurements?
2020-09-15 20:55 ` Daniel Lezcano
@ 2020-09-15 21:13 ` Matthias Kaehlcke
2020-09-15 21:23 ` Daniel Lezcano
2020-09-15 21:46 ` Doug Anderson
2020-09-16 9:53 ` Lukasz Luba
2 siblings, 1 reply; 18+ messages in thread
From: Matthias Kaehlcke @ 2020-09-15 21:13 UTC (permalink / raw)
To: Daniel Lezcano
Cc: Rajendra Nayak, Lukasz Luba, Rob Herring, DTML, Doug Anderson,
linux-pm, Amit Daniel Kachhap, Viresh Kumar, Javi Merino
On Tue, Sep 15, 2020 at 10:55:52PM +0200, Daniel Lezcano wrote:
> On 15/09/2020 19:58, Matthias Kaehlcke wrote:
> > On Tue, Sep 15, 2020 at 07:50:10PM +0200, Daniel Lezcano wrote:
> >> On 15/09/2020 19:24, Matthias Kaehlcke wrote:
> >>> +Thermal folks
> >>>
> >>> Hi Rajendra,
> >>>
> >>> On Tue, Sep 15, 2020 at 11:14:00AM +0530, Rajendra Nayak wrote:
> >>>> Hi Rob,
> >>>>
> >>>> There has been some discussions on another thread [1] around the DPC (dynamic-power-coefficient) values
> >>>> for CPU's being relative vs absolute (based on real power) and should they be used to derive 'real' power
> >>>> at various OPPs in order to calculate things like 'sustainable-power' for thermal zones.
> >>>> I believe relative values work perfectly fine for scheduling decisions, but with others using this for
> >>>> calculating power values in mW, is there a need to document the property as something that *has* to be
> >>>> based on real power measurements?
> >>>
> >>> Relative values may work for scheduling decisions, but not for thermal
> >>> management with the power allocator, at least not when CPU cooling devices
> >>> are combined with others that specify their power consumption in absolute
> >>> values. Such a configuration should be supported IMO.
> >>
> >> The energy model is used in the cpufreq cooling device and if the
> >> sustainable power is consistent with the relative values then there is
> >> no reason it shouldn't work.
> >
> > Agreed on thermal zones that exclusively use CPUs as cooling devices, but
> > what when you have mixed zones, with CPUs with their pseudo-unit and e.g. a
> > GPU that specifies its power in mW?
>
> Well, if a SoC vendor decides to mix the units, then there is nothing we
> can do.
>
> When specifying the power numbers available for the SoC, they could be
> all scaled against the highest power number.
The GPU was just one example, a device could have heat dissipating components
that are not from the SoC vendor (e.g. WiFi, modem, backlight), and depending
on the design it might not make sense to have separate thermal zones.
> There are so many factors on the hardware, the firmware, the kernel and
> the userspace sides having an impact on the energy efficiency, I don't
> understand why SoC vendors are so shy to share the power numbers...
Nor do I. Someone could just perform measurements to determine DPCs
with the proper scale if Qualcomm refuses to provide them ...
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: is 'dynamic-power-coefficient' expected to be based on 'real' power measurements?
2020-09-15 21:13 ` Matthias Kaehlcke
@ 2020-09-15 21:23 ` Daniel Lezcano
2020-09-15 21:36 ` Matthias Kaehlcke
0 siblings, 1 reply; 18+ messages in thread
From: Daniel Lezcano @ 2020-09-15 21:23 UTC (permalink / raw)
To: Matthias Kaehlcke
Cc: Rajendra Nayak, Lukasz Luba, Rob Herring, DTML, Doug Anderson,
linux-pm, Amit Daniel Kachhap, Viresh Kumar, Javi Merino
On 15/09/2020 23:13, Matthias Kaehlcke wrote:
> On Tue, Sep 15, 2020 at 10:55:52PM +0200, Daniel Lezcano wrote:
>> On 15/09/2020 19:58, Matthias Kaehlcke wrote:
>>> On Tue, Sep 15, 2020 at 07:50:10PM +0200, Daniel Lezcano wrote:
>>>> On 15/09/2020 19:24, Matthias Kaehlcke wrote:
>>>>> +Thermal folks
>>>>>
>>>>> Hi Rajendra,
>>>>>
>>>>> On Tue, Sep 15, 2020 at 11:14:00AM +0530, Rajendra Nayak wrote:
>>>>>> Hi Rob,
>>>>>>
>>>>>> There has been some discussions on another thread [1] around the DPC (dynamic-power-coefficient) values
>>>>>> for CPU's being relative vs absolute (based on real power) and should they be used to derive 'real' power
>>>>>> at various OPPs in order to calculate things like 'sustainable-power' for thermal zones.
>>>>>> I believe relative values work perfectly fine for scheduling decisions, but with others using this for
>>>>>> calculating power values in mW, is there a need to document the property as something that *has* to be
>>>>>> based on real power measurements?
>>>>>
>>>>> Relative values may work for scheduling decisions, but not for thermal
>>>>> management with the power allocator, at least not when CPU cooling devices
>>>>> are combined with others that specify their power consumption in absolute
>>>>> values. Such a configuration should be supported IMO.
>>>>
>>>> The energy model is used in the cpufreq cooling device and if the
>>>> sustainable power is consistent with the relative values then there is
>>>> no reason it shouldn't work.
>>>
>>> Agreed on thermal zones that exclusively use CPUs as cooling devices, but
>>> what when you have mixed zones, with CPUs with their pseudo-unit and e.g. a
>>> GPU that specifies its power in mW?
>>
>> Well, if a SoC vendor decides to mix the units, then there is nothing we
>> can do.
>>
>> When specifying the power numbers available for the SoC, they could be
>> all scaled against the highest power number.
>
> The GPU was just one example, a device could have heat dissipating components
> that are not from the SoC vendor (e.g. WiFi, modem, backlight), and depending
> on the design it might not make sense to have separate thermal zones.
Could you elaborate? I'm not sure I get the point.
>> There are so many factors on the hardware, the firmware, the kernel and
>> the userspace sides having an impact on the energy efficiency, I don't
>> understand why SoC vendors are so shy to share the power numbers...
>
> nor do I, someone could just perform measurements to determine DPCs
> with the proper scale if Qualcomm refuses to provide them ...
>
--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
* Re: is 'dynamic-power-coefficient' expected to be based on 'real' power measurements?
2020-09-15 21:23 ` Daniel Lezcano
@ 2020-09-15 21:36 ` Matthias Kaehlcke
2020-09-16 4:15 ` Rajendra Nayak
0 siblings, 1 reply; 18+ messages in thread
From: Matthias Kaehlcke @ 2020-09-15 21:36 UTC (permalink / raw)
To: Daniel Lezcano
Cc: Rajendra Nayak, Lukasz Luba, Rob Herring, DTML, Doug Anderson,
linux-pm, Amit Daniel Kachhap, Viresh Kumar, Javi Merino
On Tue, Sep 15, 2020 at 11:23:49PM +0200, Daniel Lezcano wrote:
> On 15/09/2020 23:13, Matthias Kaehlcke wrote:
> > On Tue, Sep 15, 2020 at 10:55:52PM +0200, Daniel Lezcano wrote:
> >> On 15/09/2020 19:58, Matthias Kaehlcke wrote:
> >>> On Tue, Sep 15, 2020 at 07:50:10PM +0200, Daniel Lezcano wrote:
> >>>> On 15/09/2020 19:24, Matthias Kaehlcke wrote:
> >>>>> +Thermal folks
> >>>>>
> >>>>> Hi Rajendra,
> >>>>>
> >>>>> On Tue, Sep 15, 2020 at 11:14:00AM +0530, Rajendra Nayak wrote:
> >>>>>> Hi Rob,
> >>>>>>
> >>>>>> There has been some discussions on another thread [1] around the DPC (dynamic-power-coefficient) values
> >>>>>> for CPU's being relative vs absolute (based on real power) and should they be used to derive 'real' power
> >>>>>> at various OPPs in order to calculate things like 'sustainable-power' for thermal zones.
> >>>>>> I believe relative values work perfectly fine for scheduling decisions, but with others using this for
> >>>>>> calculating power values in mW, is there a need to document the property as something that *has* to be
> >>>>>> based on real power measurements?
> >>>>>
> >>>>> Relative values may work for scheduling decisions, but not for thermal
> >>>>> management with the power allocator, at least not when CPU cooling devices
> >>>>> are combined with others that specify their power consumption in absolute
> >>>>> values. Such a configuration should be supported IMO.
> >>>>
> >>>> The energy model is used in the cpufreq cooling device and if the
> >>>> sustainable power is consistent with the relative values then there is
> >>>> no reason it shouldn't work.
> >>>
> >>> Agreed on thermal zones that exclusively use CPUs as cooling devices, but
> >>> what when you have mixed zones, with CPUs with their pseudo-unit and e.g. a
> >>> GPU that specifies its power in mW?
> >>
> >> Well, if a SoC vendor decides to mix the units, then there is nothing we
> >> can do.
> >>
> >> When specifying the power numbers available for the SoC, they could be
> >> all scaled against the highest power number.
> >
> > The GPU was just one example, a device could have heat dissipating components
> > that are not from the SoC vendor (e.g. WiFi, modem, backlight), and depending
> > on the design it might not make sense to have separate thermal zones.
>
> Is it possible to elaborate, I'm not sure to get the point ?
A device could have a thermal zone with the following cooling
devices:
- CPUs with power consumption specified in pmW (pseudo mW)
- A modem from a third-party vendor. The modem can dissipate
significant heat and allows throttling the bandwidth for
cooling. The power consumption of the modem is given in
mW.
These could be crammed together in a small form factor
(e.g. Chromecast or Chromebit), which makes it difficult to
discern with a sensor what exactly is generating the heat,
which is why you have a single thermal zone.
IPA is used as the governor for this zone. It can't make accurate
decisions because one cooling device specifies its power
consumption in pmW and the other in mW.
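The scenario above can be put into numbers with a minimal Python sketch. This is not the kernel's IPA code: it assumes a simplified governor that only throttles when the summed power requests exceed the budget and then splits the budget proportionally. The device names, power figures and the 10x scale factor are hypothetical.

```python
# Simplified illustration only, not the kernel's IPA implementation.
def allocate(budget, requests):
    """Grant each device its request, or a proportional share if over budget."""
    total = sum(requests.values())
    if total <= budget:
        return dict(requests)  # everything appears to fit, no throttling
    return {name: budget * req / total for name, req in requests.items()}

BUDGET_MW = 1500.0  # hypothetical sustainable power of the zone, in real mW

# Both devices report real mW: 3000 mW requested against a 1500 mW budget.
consistent = {"cpu": 2000.0, "modem": 1000.0}

# Same hardware, but the CPU numbers use a pseudo scale that is
# (hypothetically) 10x smaller than real mW.
mixed = {"cpu": 200.0, "modem": 1000.0}

print(allocate(BUDGET_MW, consistent))  # {'cpu': 1000.0, 'modem': 500.0}
print(allocate(BUDGET_MW, mixed))       # {'cpu': 200.0, 'modem': 1000.0}
```

With consistent units the sketch throttles both devices. With mixed units the summed requests look smaller than the budget, so nothing is throttled even though the real draw is twice the budget.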
* Re: is 'dynamic-power-coefficient' expected to be based on 'real' power measurements?
2020-09-15 21:36 ` Matthias Kaehlcke
@ 2020-09-16 4:15 ` Rajendra Nayak
2020-09-16 16:40 ` Matthias Kaehlcke
0 siblings, 1 reply; 18+ messages in thread
From: Rajendra Nayak @ 2020-09-16 4:15 UTC (permalink / raw)
To: Matthias Kaehlcke, Daniel Lezcano
Cc: Lukasz Luba, Rob Herring, DTML, Doug Anderson, linux-pm,
Amit Daniel Kachhap, Viresh Kumar, Javi Merino
On 9/16/2020 3:06 AM, Matthias Kaehlcke wrote:
> On Tue, Sep 15, 2020 at 11:23:49PM +0200, Daniel Lezcano wrote:
>> On 15/09/2020 23:13, Matthias Kaehlcke wrote:
>>> On Tue, Sep 15, 2020 at 10:55:52PM +0200, Daniel Lezcano wrote:
>>>> On 15/09/2020 19:58, Matthias Kaehlcke wrote:
>>>>> On Tue, Sep 15, 2020 at 07:50:10PM +0200, Daniel Lezcano wrote:
>>>>>> On 15/09/2020 19:24, Matthias Kaehlcke wrote:
>>>>>>> +Thermal folks
>>>>>>>
>>>>>>> Hi Rajendra,
>>>>>>>
>>>>>>> On Tue, Sep 15, 2020 at 11:14:00AM +0530, Rajendra Nayak wrote:
>>>>>>>> Hi Rob,
>>>>>>>>
>>>>>>>> There has been some discussions on another thread [1] around the DPC (dynamic-power-coefficient) values
>>>>>>>> for CPU's being relative vs absolute (based on real power) and should they be used to derive 'real' power
>>>>>>>> at various OPPs in order to calculate things like 'sustainable-power' for thermal zones.
>>>>>>>> I believe relative values work perfectly fine for scheduling decisions, but with others using this for
>>>>>>>> calculating power values in mW, is there a need to document the property as something that *has* to be
>>>>>>>> based on real power measurements?
>>>>>>>
>>>>>>> Relative values may work for scheduling decisions, but not for thermal
>>>>>>> management with the power allocator, at least not when CPU cooling devices
>>>>>>> are combined with others that specify their power consumption in absolute
>>>>>>> values. Such a configuration should be supported IMO.
>>>>>>
>>>>>> The energy model is used in the cpufreq cooling device and if the
>>>>>> sustainable power is consistent with the relative values then there is
>>>>>> no reason it shouldn't work.
>>>>>
>>>>> Agreed on thermal zones that exclusively use CPUs as cooling devices, but
>>>>> what when you have mixed zones, with CPUs with their pseudo-unit and e.g. a
>>>>> GPU that specifies its power in mW?
>>>>
>>>> Well, if a SoC vendor decides to mix the units, then there is nothing we
>>>> can do.
>>>>
>>>> When specifying the power numbers available for the SoC, they could be
>>>> all scaled against the highest power number.
>>>
>>> The GPU was just one example, a device could have heat dissipating components
>>> that are not from the SoC vendor (e.g. WiFi, modem, backlight), and depending
>>> on the design it might not make sense to have separate thermal zones.
>>
>> Is it possible to elaborate, I'm not sure to get the point ?
>
> A device could have a thermal zone with the following cooling
> devices:
>
> - CPUs with power consumption specified as pmW (pseudo mW
> - A modem from a third party vendor. The modem can dissipate
> significant heat and allows to throttle the bandwidth for
> cooling. The power consumption of the modem is given in
> mW.
>
> These could be crammed together in a small form factor
> (e.g. ChromeCast or Chromebit) which makes it difficult to
> discern with a sensor what exactly is generating the heat,
> which is why you have a single thermal zone.
>
> IPA is used as governor for this zone, it can't make accurate
> decisions because one cooling device specifies it's power
> consumption in pmW and the other in mW.
Is there a real example upstream for this, or is it a theoretical
problem (which can exist in the future) we are trying to solve?
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
* Re: is 'dynamic-power-coefficient' expected to be based on 'real' power measurements?
2020-09-16 4:15 ` Rajendra Nayak
@ 2020-09-16 16:40 ` Matthias Kaehlcke
0 siblings, 0 replies; 18+ messages in thread
From: Matthias Kaehlcke @ 2020-09-16 16:40 UTC (permalink / raw)
To: Rajendra Nayak
Cc: Daniel Lezcano, Lukasz Luba, Rob Herring, DTML, Doug Anderson,
linux-pm, Amit Daniel Kachhap, Viresh Kumar, Javi Merino
On Wed, Sep 16, 2020 at 09:45:04AM +0530, Rajendra Nayak wrote:
>
> On 9/16/2020 3:06 AM, Matthias Kaehlcke wrote:
> > On Tue, Sep 15, 2020 at 11:23:49PM +0200, Daniel Lezcano wrote:
> > > On 15/09/2020 23:13, Matthias Kaehlcke wrote:
> > > > On Tue, Sep 15, 2020 at 10:55:52PM +0200, Daniel Lezcano wrote:
> > > > > On 15/09/2020 19:58, Matthias Kaehlcke wrote:
> > > > > > On Tue, Sep 15, 2020 at 07:50:10PM +0200, Daniel Lezcano wrote:
> > > > > > > On 15/09/2020 19:24, Matthias Kaehlcke wrote:
> > > > > > > > +Thermal folks
> > > > > > > >
> > > > > > > > Hi Rajendra,
> > > > > > > >
> > > > > > > > On Tue, Sep 15, 2020 at 11:14:00AM +0530, Rajendra Nayak wrote:
> > > > > > > > > Hi Rob,
> > > > > > > > >
> > > > > > > > > There has been some discussions on another thread [1] around the DPC (dynamic-power-coefficient) values
> > > > > > > > > for CPU's being relative vs absolute (based on real power) and should they be used to derive 'real' power
> > > > > > > > > at various OPPs in order to calculate things like 'sustainable-power' for thermal zones.
> > > > > > > > > I believe relative values work perfectly fine for scheduling decisions, but with others using this for
> > > > > > > > > calculating power values in mW, is there a need to document the property as something that *has* to be
> > > > > > > > > based on real power measurements?
> > > > > > > >
> > > > > > > > Relative values may work for scheduling decisions, but not for thermal
> > > > > > > > management with the power allocator, at least not when CPU cooling devices
> > > > > > > > are combined with others that specify their power consumption in absolute
> > > > > > > > values. Such a configuration should be supported IMO.
> > > > > > >
> > > > > > > The energy model is used in the cpufreq cooling device and if the
> > > > > > > sustainable power is consistent with the relative values then there is
> > > > > > > no reason it shouldn't work.
> > > > > >
> > > > > > Agreed on thermal zones that exclusively use CPUs as cooling devices, but
> > > > > > what when you have mixed zones, with CPUs with their pseudo-unit and e.g. a
> > > > > > GPU that specifies its power in mW?
> > > > >
> > > > > Well, if a SoC vendor decides to mix the units, then there is nothing we
> > > > > can do.
> > > > >
> > > > > When specifying the power numbers available for the SoC, they could be
> > > > > all scaled against the highest power number.
> > > >
> > > > The GPU was just one example, a device could have heat dissipating components
> > > > that are not from the SoC vendor (e.g. WiFi, modem, backlight), and depending
> > > > on the design it might not make sense to have separate thermal zones.
> > >
> > > Is it possible to elaborate, I'm not sure to get the point ?
> >
> > A device could have a thermal zone with the following cooling
> > devices:
> >
> > - CPUs with power consumption specified as pmW (pseudo mW
> > - A modem from a third party vendor. The modem can dissipate
> > significant heat and allows to throttle the bandwidth for
> > cooling. The power consumption of the modem is given in
> > mW.
> >
> > These could be crammed together in a small form factor
> > (e.g. ChromeCast or Chromebit) which makes it difficult to
> > discern with a sensor what exactly is generating the heat,
> > which is why you have a single thermal zone.
> >
> > IPA is used as governor for this zone, it can't make accurate
> > decisions because one cooling device specifies it's power
> > consumption in pmW and the other in mW.
>
> Is there a real example upstream for this, or is it a theoretical
> problem (which can exist in the future) we are trying to solve?
Sort of, there is the rk3288-based veyron-mickey, which uses CPUs,
the GPU and ddrfreq as cooling devices in the same zone:
https://chromium.googlesource.com/chromiumos/third_party/kernel/+/refs/heads/chromeos-4.19/arch/arm/boot/dts/rk3288-veyron-mickey.dts#42
The device doesn't use IPA though, so mixed-up units wouldn't matter in this
case.
From a quick grep in arch/arm(64)/boot/dts/, it seems that mixing
cooling devices of different types is not common, though that doesn't
necessarily reflect what is done in custom DTs.
* Re: is 'dynamic-power-coefficient' expected to be based on 'real' power measurements?
2020-09-15 20:55 ` Daniel Lezcano
2020-09-15 21:13 ` Matthias Kaehlcke
@ 2020-09-15 21:46 ` Doug Anderson
2020-09-15 21:51 ` Matthias Kaehlcke
2020-09-16 9:53 ` Lukasz Luba
2 siblings, 1 reply; 18+ messages in thread
From: Doug Anderson @ 2020-09-15 21:46 UTC (permalink / raw)
To: Daniel Lezcano
Cc: Matthias Kaehlcke, Rajendra Nayak, Lukasz Luba, Rob Herring,
DTML, Linux PM, Amit Daniel Kachhap, Viresh Kumar, Javi Merino
Hi,
On Tue, Sep 15, 2020 at 1:55 PM Daniel Lezcano
<daniel.lezcano@linaro.org> wrote:
>
> On 15/09/2020 19:58, Matthias Kaehlcke wrote:
> > On Tue, Sep 15, 2020 at 07:50:10PM +0200, Daniel Lezcano wrote:
> >> On 15/09/2020 19:24, Matthias Kaehlcke wrote:
> >>> +Thermal folks
> >>>
> >>> Hi Rajendra,
> >>>
> >>> On Tue, Sep 15, 2020 at 11:14:00AM +0530, Rajendra Nayak wrote:
> >>>> Hi Rob,
> >>>>
> >>>> There has been some discussions on another thread [1] around the DPC (dynamic-power-coefficient) values
> >>>> for CPU's being relative vs absolute (based on real power) and should they be used to derive 'real' power
> >>>> at various OPPs in order to calculate things like 'sustainable-power' for thermal zones.
> >>>> I believe relative values work perfectly fine for scheduling decisions, but with others using this for
> >>>> calculating power values in mW, is there a need to document the property as something that *has* to be
> >>>> based on real power measurements?
> >>>
> >>> Relative values may work for scheduling decisions, but not for thermal
> >>> management with the power allocator, at least not when CPU cooling devices
> >>> are combined with others that specify their power consumption in absolute
> >>> values. Such a configuration should be supported IMO.
> >>
> >> The energy model is used in the cpufreq cooling device and if the
> >> sustainable power is consistent with the relative values then there is
> >> no reason it shouldn't work.
> >
> > Agreed on thermal zones that exclusively use CPUs as cooling devices, but
> > what when you have mixed zones, with CPUs with their pseudo-unit and e.g. a
> > GPU that specifies its power in mW?
>
> Well, if a SoC vendor decides to mix the units, then there is nothing we
> can do.
I mean, there is something someone could do. They could buy one of
these devices, measure the power (which wouldn't actually be that hard
to do), then submit a patch to adjust all the numbers. ;-)
-Doug
* Re: is 'dynamic-power-coefficient' expected to be based on 'real' power measurements?
2020-09-15 21:46 ` Doug Anderson
@ 2020-09-15 21:51 ` Matthias Kaehlcke
0 siblings, 0 replies; 18+ messages in thread
From: Matthias Kaehlcke @ 2020-09-15 21:51 UTC (permalink / raw)
To: Doug Anderson
Cc: Daniel Lezcano, Rajendra Nayak, Lukasz Luba, Rob Herring, DTML,
Linux PM, Amit Daniel Kachhap, Viresh Kumar, Javi Merino
On Tue, Sep 15, 2020 at 02:46:16PM -0700, Doug Anderson wrote:
> Hi,
>
> On Tue, Sep 15, 2020 at 1:55 PM Daniel Lezcano
> <daniel.lezcano@linaro.org> wrote:
> >
> > On 15/09/2020 19:58, Matthias Kaehlcke wrote:
> > > On Tue, Sep 15, 2020 at 07:50:10PM +0200, Daniel Lezcano wrote:
> > >> On 15/09/2020 19:24, Matthias Kaehlcke wrote:
> > >>> +Thermal folks
> > >>>
> > >>> Hi Rajendra,
> > >>>
> > >>> On Tue, Sep 15, 2020 at 11:14:00AM +0530, Rajendra Nayak wrote:
> > >>>> Hi Rob,
> > >>>>
> > >>>> There has been some discussions on another thread [1] around the DPC (dynamic-power-coefficient) values
> > >>>> for CPU's being relative vs absolute (based on real power) and should they be used to derive 'real' power
> > >>>> at various OPPs in order to calculate things like 'sustainable-power' for thermal zones.
> > >>>> I believe relative values work perfectly fine for scheduling decisions, but with others using this for
> > >>>> calculating power values in mW, is there a need to document the property as something that *has* to be
> > >>>> based on real power measurements?
> > >>>
> > >>> Relative values may work for scheduling decisions, but not for thermal
> > >>> management with the power allocator, at least not when CPU cooling devices
> > >>> are combined with others that specify their power consumption in absolute
> > >>> values. Such a configuration should be supported IMO.
> > >>
> > >> The energy model is used in the cpufreq cooling device and if the
> > >> sustainable power is consistent with the relative values then there is
> > >> no reason it shouldn't work.
> > >
> > > Agreed on thermal zones that exclusively use CPUs as cooling devices, but
> > > what when you have mixed zones, with CPUs with their pseudo-unit and e.g. a
> > > GPU that specifies its power in mW?
> >
> > Well, if a SoC vendor decides to mix the units, then there is nothing we
> > can do.
>
> I mean, there is something someone could do. They could buy one of
> these devices, measure the power (which wouldn't actually be that hard
> to do), then submit a patch to adjust all the numbers. ;-)
In case they look for a recipe:
commit ac60c5e33df4ec2b69c7e3ebbc0ccf1557e7bd5e
Author: Matthias Kaehlcke <mka@chromium.org>
Date: Thu Apr 11 17:01:58 2019 -0700
ARM: dts: rockchip: Add dynamic-power-coefficient for rk3288
The value was determined with the following method:
- take CPUs 1-3 offline
- for each OPP
- set cpufreq min and max freq to OPP freq
- start dhrystone benchmark
- measure CPU power consumption during 10s
- calculate Cx for OPPx
- Cx = (Px - P1) / (Vx²fx - V1²f1) [1]
using the following units: mW / Ghz / V [2]
- C = avg(C2, ..., Cn)
[1] see commit 4daa001a1773 ("arm64: dts: juno: Add cpu
dynamic-power-coefficient information")
[2] https://patchwork.kernel.org/patch/10493615/#22158551
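The arithmetic in the recipe above can be sketched in a few lines of Python. This is only an illustration of the two formulas from the commit message (Cx per OPP, then the average); the function name, the helper and the measurement values are made up, and the power figures are synthetic rather than measured on any real board.

```python
from statistics import mean

def dynamic_power_coefficient(opps):
    """Estimate C from per-OPP measurements.

    opps: list of (freq_ghz, volt_v, power_mw) tuples, lowest OPP first.
    Implements Cx = (Px - P1) / (Vx^2 * fx - V1^2 * f1), C = avg(C2..Cn).
    """
    f1, v1, p1 = opps[0]
    coeffs = [
        (p - p1) / (v * v * f - v1 * v1 * f1)
        for f, v, p in opps[1:]
    ]
    return mean(coeffs)

def synthetic_power_mw(freq_ghz, volt_v):
    """Hypothetical CPU: P = 300 * V^2 * f plus a constant 50 mW offset."""
    return 300.0 * volt_v * volt_v * freq_ghz + 50.0

# Made-up OPP table (GHz, V); power is generated, not measured.
measurements = [
    (f, v, synthetic_power_mw(f, v))
    for f, v in [(0.6, 0.9), (1.2, 1.1), (1.8, 1.35)]
]

print(round(dynamic_power_coefficient(measurements), 3))  # 300.0
```

In this synthetic data the constant 50 mW offset cancels out because every OPP is compared against OPP1. Note that mW/GHz/V^2 is numerically the same as uW/MHz/V^2, which, as far as I know, is the unit the DT binding uses for dynamic-power-coefficient.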
* Re: is 'dynamic-power-coefficient' expected to be based on 'real' power measurements?
2020-09-15 20:55 ` Daniel Lezcano
2020-09-15 21:13 ` Matthias Kaehlcke
2020-09-15 21:46 ` Doug Anderson
@ 2020-09-16 9:53 ` Lukasz Luba
2020-09-16 16:48 ` Matthias Kaehlcke
2 siblings, 1 reply; 18+ messages in thread
From: Lukasz Luba @ 2020-09-16 9:53 UTC (permalink / raw)
To: Daniel Lezcano, Matthias Kaehlcke
Cc: Rajendra Nayak, Rob Herring, DTML, Doug Anderson, linux-pm,
Amit Daniel Kachhap, Viresh Kumar, Javi Merino
On 9/15/20 9:55 PM, Daniel Lezcano wrote:
> On 15/09/2020 19:58, Matthias Kaehlcke wrote:
>> On Tue, Sep 15, 2020 at 07:50:10PM +0200, Daniel Lezcano wrote:
>>> On 15/09/2020 19:24, Matthias Kaehlcke wrote:
>>>> +Thermal folks
>>>>
>>>> Hi Rajendra,
>>>>
>>>> On Tue, Sep 15, 2020 at 11:14:00AM +0530, Rajendra Nayak wrote:
>>>>> Hi Rob,
>>>>>
>>>>> There has been some discussions on another thread [1] around the DPC (dynamic-power-coefficient) values
>>>>> for CPU's being relative vs absolute (based on real power) and should they be used to derive 'real' power
>>>>> at various OPPs in order to calculate things like 'sustainable-power' for thermal zones.
>>>>> I believe relative values work perfectly fine for scheduling decisions, but with others using this for
>>>>> calculating power values in mW, is there a need to document the property as something that *has* to be
>>>>> based on real power measurements?
>>>>
>>>> Relative values may work for scheduling decisions, but not for thermal
>>>> management with the power allocator, at least not when CPU cooling devices
>>>> are combined with others that specify their power consumption in absolute
>>>> values. Such a configuration should be supported IMO.
>>>
>>> The energy model is used in the cpufreq cooling device and if the
>>> sustainable power is consistent with the relative values then there is
>>> no reason it shouldn't work.
>>
>> Agreed on thermal zones that exclusively use CPUs as cooling devices, but
>> what when you have mixed zones, with CPUs with their pseudo-unit and e.g. a
>> GPU that specifies its power in mW?
>
> Well, if a SoC vendor decides to mix the units, then there is nothing we
> can do.
>
> When specifying the power numbers available for the SoC, they could be
> all scaled against the highest power number.
>
> There are so many factors on the hardware, the firmware, the kernel and
> the userspace sides having an impact on the energy efficiency, I don't
> understand why SoC vendors are so shy to share the power numbers...
>
Unfortunately (because it might confuse engineers in cases like
this one), even the SCMI spec DEN0056B [1] has this statement,
which allows firmware to expose 'abstract scale' values:
'4.5.1 Performance domain management protocol background
...The power can be expressed in mW or in an abstract scale. Vendors
are not obliged to reveal power costs if it is undesirable, but a linear
scale is required.'
This is the source of our Energy Model values when we use the SCMI cpufreq
driver [2].
So this might become an issue in the future, when some SoC vendor decides
not to expose real mW, and a phone OEM then takes the SoC and
tries to add some other cooling device to the thermal zone. That new
device is not part of SCMI perf but is a custom one, and it reports real mW.
Daniel, do you think it should be documented somewhere in the kernel
thermal documentation that the firmware might silently populate the EM
with an 'abstract scale'? Then special care should be taken when combining
new cooling devices.
Regards,
Lukasz
[1] https://developer.arm.com/documentation/den0056/b/?lang=en
[2] https://elixir.bootlin.com/linux/latest/source/drivers/cpufreq/scmi-cpufreq.c#L121
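A small Python sketch of why a linear abstract scale is harmless for relative comparisons but not for absolute ones. It is not taken from the kernel's Energy Model code: the OPP table, the 0.25 scale factor, the 500 mW budget and both helper functions are hypothetical.

```python
def energy_cost(opp):
    """Power per unit of frequency, a stand-in for energy per unit of work."""
    freq_mhz, power = opp
    return power / freq_mhz

def rank_by_efficiency(opps):
    """Frequencies ordered from most to least energy efficient."""
    return [freq for freq, _ in sorted(opps, key=energy_cost)]

def highest_opp_within(opps, budget):
    """Highest frequency whose reported power does not exceed the budget."""
    fitting = [opp for opp in opps if opp[1] <= budget]
    return max(fitting, key=lambda opp: opp[0])[0] if fitting else None

# Hypothetical OPP table: (frequency in MHz, power in real mW).
real_mw = [(600, 150.0), (1200, 420.0), (1800, 900.0)]

# The same table as firmware might expose it on a linear abstract scale.
SCALE = 0.25  # hypothetical, unknown to the OS
abstract = [(freq, power * SCALE) for freq, power in real_mw]

# Relative comparisons are unaffected by the linear scale.
print(rank_by_efficiency(real_mw) == rank_by_efficiency(abstract))  # True

# Comparing against a budget given in real mW is not.
BUDGET_MW = 500.0
print(highest_opp_within(real_mw, BUDGET_MW))   # 1200
print(highest_opp_within(abstract, BUDGET_MW))  # 1800
```

The ranking survives because every value is multiplied by the same factor. The budget check fails because the budget is in real mW while the table is not.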
* Re: is 'dynamic-power-coefficient' expected to be based on 'real' power measurements?
2020-09-16 9:53 ` Lukasz Luba
@ 2020-09-16 16:48 ` Matthias Kaehlcke
2020-09-24 6:09 ` Rajendra Nayak
0 siblings, 1 reply; 18+ messages in thread
From: Matthias Kaehlcke @ 2020-09-16 16:48 UTC (permalink / raw)
To: Lukasz Luba
Cc: Daniel Lezcano, Rajendra Nayak, Rob Herring, DTML, Doug Anderson,
linux-pm, Amit Daniel Kachhap, Viresh Kumar, Javi Merino
On Wed, Sep 16, 2020 at 10:53:48AM +0100, Lukasz Luba wrote:
>
>
> On 9/15/20 9:55 PM, Daniel Lezcano wrote:
> > On 15/09/2020 19:58, Matthias Kaehlcke wrote:
> > > On Tue, Sep 15, 2020 at 07:50:10PM +0200, Daniel Lezcano wrote:
> > > > On 15/09/2020 19:24, Matthias Kaehlcke wrote:
> > > > > +Thermal folks
> > > > >
> > > > > Hi Rajendra,
> > > > >
> > > > > On Tue, Sep 15, 2020 at 11:14:00AM +0530, Rajendra Nayak wrote:
> > > > > > Hi Rob,
> > > > > >
> > > > > > There has been some discussions on another thread [1] around the DPC (dynamic-power-coefficient) values
> > > > > > for CPU's being relative vs absolute (based on real power) and should they be used to derive 'real' power
> > > > > > at various OPPs in order to calculate things like 'sustainable-power' for thermal zones.
> > > > > > I believe relative values work perfectly fine for scheduling decisions, but with others using this for
> > > > > > calculating power values in mW, is there a need to document the property as something that *has* to be
> > > > > > based on real power measurements?
> > > > >
> > > > > Relative values may work for scheduling decisions, but not for thermal
> > > > > management with the power allocator, at least not when CPU cooling devices
> > > > > are combined with others that specify their power consumption in absolute
> > > > > values. Such a configuration should be supported IMO.
> > > >
> > > > The energy model is used in the cpufreq cooling device and if the
> > > > sustainable power is consistent with the relative values then there is
> > > > no reason it shouldn't work.
> > >
> > > Agreed on thermal zones that exclusively use CPUs as cooling devices, but
> > > what when you have mixed zones, with CPUs with their pseudo-unit and e.g. a
> > > GPU that specifies its power in mW?
> >
> > Well, if a SoC vendor decides to mix the units, then there is nothing we
> > can do.
> >
> > When specifying the power numbers available for the SoC, they could be
> > all scaled against the highest power number.
> >
> > There are so many factors on the hardware, the firmware, the kernel and
> > the userspace sides having an impact on the energy efficiency, I don't
> > understand why SoC vendors are so shy to share the power numbers...
> >
>
> Unfortunately (because it might confuse engineers in some cases like
> this one), even in the SCMI spec DEN0056B [1] we have this statement
> which allows to expose an 'abstract scale' values from firmware:
> '4.5.1 Performance domain management protocol background
> ...The power can be expressed in mW or in an abstract scale. Vendors
> are not obliged to reveal power costs if it is undesirable, but a linear
> scale is required.'
>
> This is the source of our Energy Model values when we use SCMI cpufreq
> driver [2].
>
> So this might be an issue in the future, when some SoC vendor decides to
> not expose the real mW, but the phone OEM would then take the SoC and
> try to add some other cooling device into the thermal zone. That new
> device is not part of the SCMI perf but some custom and has the real mW.
>
> Do you think Daniel it should be somewhere documented in the kernel
> thermal that the firmware might silently populate EM with 'abstract
> scale'? Then special care should be taken when combining new
> cooling devices.
>
> Regards,
> Lukasz
>
> [1] https://developer.arm.com/documentation/den0056/b/?lang=en
> [2] https://elixir.bootlin.com/linux/latest/source/drivers/cpufreq/scmi-cpufreq.c#L121
If an 'abstract scale' is explicitly allowed, I think it should be documented,
both to avoid confusion and to make engineers aware of the peril of combining
cooling devices of different types in the same thermal zone.
* Re: is 'dynamic-power-coefficient' expected to be based on 'real' power measurements?
2020-09-16 16:48 ` Matthias Kaehlcke
@ 2020-09-24 6:09 ` Rajendra Nayak
2020-09-24 8:21 ` Lukasz Luba
0 siblings, 1 reply; 18+ messages in thread
From: Rajendra Nayak @ 2020-09-24 6:09 UTC (permalink / raw)
To: Matthias Kaehlcke, Lukasz Luba
Cc: Daniel Lezcano, Rob Herring, DTML, Doug Anderson, linux-pm,
Amit Daniel Kachhap, Viresh Kumar, Javi Merino
On 9/16/2020 10:18 PM, Matthias Kaehlcke wrote:
> On Wed, Sep 16, 2020 at 10:53:48AM +0100, Lukasz Luba wrote:
>>
>>
>> On 9/15/20 9:55 PM, Daniel Lezcano wrote:
>>> On 15/09/2020 19:58, Matthias Kaehlcke wrote:
>>>> On Tue, Sep 15, 2020 at 07:50:10PM +0200, Daniel Lezcano wrote:
>>>>> On 15/09/2020 19:24, Matthias Kaehlcke wrote:
>>>>>> +Thermal folks
>>>>>>
>>>>>> Hi Rajendra,
>>>>>>
>>>>>> On Tue, Sep 15, 2020 at 11:14:00AM +0530, Rajendra Nayak wrote:
>>>>>>> Hi Rob,
>>>>>>>
>>>>>>> There has been some discussions on another thread [1] around the DPC (dynamic-power-coefficient) values
>>>>>>> for CPU's being relative vs absolute (based on real power) and should they be used to derive 'real' power
>>>>>>> at various OPPs in order to calculate things like 'sustainable-power' for thermal zones.
>>>>>>> I believe relative values work perfectly fine for scheduling decisions, but with others using this for
>>>>>>> calculating power values in mW, is there a need to document the property as something that *has* to be
>>>>>>> based on real power measurements?
>>>>>>
>>>>>> Relative values may work for scheduling decisions, but not for thermal
>>>>>> management with the power allocator, at least not when CPU cooling devices
>>>>>> are combined with others that specify their power consumption in absolute
>>>>>> values. Such a configuration should be supported IMO.
>>>>>
>>>>> The energy model is used in the cpufreq cooling device and if the
>>>>> sustainable power is consistent with the relative values then there is
>>>>> no reason it shouldn't work.
>>>>
>>>> Agreed on thermal zones that exclusively use CPUs as cooling devices, but
>>>> what when you have mixed zones, with CPUs with their pseudo-unit and e.g. a
>>>> GPU that specifies its power in mW?
>>>
>>> Well, if a SoC vendor decides to mix the units, then there is nothing we
>>> can do.
>>>
>>> When specifying the power numbers available for the SoC, they could be
>>> all scaled against the highest power number.
>>>
>>> There are so many factors on the hardware, the firmware, the kernel and
>>> the userspace sides having an impact on the energy efficiency, I don't
>>> understand why SoC vendors are so shy to share the power numbers...
>>>
>>
>> Unfortunately (because it might confuse engineers in some cases like
>> this one), even in the SCMI spec DEN0056B [1] we have this statement
>> which allows to expose an 'abstract scale' values from firmware:
>> '4.5.1 Performance domain management protocol background
>> ...The power can be expressed in mW or in an abstract scale. Vendors
>> are not obliged to reveal power costs if it is undesirable, but a linear
>> scale is required.'
>>
>> This is the source of our Energy Model values when we use SCMI cpufreq
>> driver [2].
>>
>> So this might be an issue in the future, when some SoC vendor decides to
>> not expose the real mW, but the phone OEM would then take the SoC and
>> try to add some other cooling device into the thermal zone. That new
>> device is not part of the SCMI perf but some custom and has the real mW.
>>
>> Do you think Daniel it should be somewhere documented in the kernel
>> thermal that the firmware might silently populate EM with 'abstract
>> scale'? Then special care should be taken when combining new
>> cooling devices.
>>
>> Regards,
>> Lukasz
>>
>> [1] https://developer.arm.com/documentation/den0056/b/?lang=en
>> [2] https://elixir.bootlin.com/linux/latest/source/drivers/cpufreq/scmi-cpufreq.c#L121
>
> If an 'abstract scale' is explicitly allowed I think it should be documented
> to avoid confusion and make engineers aware of the peril of combining cooling
> devices of different types in the same thermal zone.
Rob, to be consistent, we should perhaps also document in the DT bindings
that an abstract scale is allowed when specifying the DPC values in DT.
If you agree, I can spin a quick patch to update the documentation.

Thanks,
Rajendra
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
* Re: is 'dynamic-power-coefficient' expected to be based on 'real' power measurements?
2020-09-24 6:09 ` Rajendra Nayak
@ 2020-09-24 8:21 ` Lukasz Luba
0 siblings, 0 replies; 18+ messages in thread
From: Lukasz Luba @ 2020-09-24 8:21 UTC (permalink / raw)
To: Rajendra Nayak, Matthias Kaehlcke
Cc: Daniel Lezcano, Rob Herring, DTML, Doug Anderson, linux-pm,
Amit Daniel Kachhap, Viresh Kumar, Javi Merino
On 9/24/20 7:09 AM, Rajendra Nayak wrote:
>
> On 9/16/2020 10:18 PM, Matthias Kaehlcke wrote:
>> On Wed, Sep 16, 2020 at 10:53:48AM +0100, Lukasz Luba wrote:
>>>
>>>
>>> On 9/15/20 9:55 PM, Daniel Lezcano wrote:
>>>> On 15/09/2020 19:58, Matthias Kaehlcke wrote:
>>>>> On Tue, Sep 15, 2020 at 07:50:10PM +0200, Daniel Lezcano wrote:
>>>>>> On 15/09/2020 19:24, Matthias Kaehlcke wrote:
>>>>>>> +Thermal folks
>>>>>>>
>>>>>>> Hi Rajendra,
>>>>>>>
>>>>>>> On Tue, Sep 15, 2020 at 11:14:00AM +0530, Rajendra Nayak wrote:
>>>>>>>> Hi Rob,
>>>>>>>>
>>>>>>>> There has been some discussions on another thread [1] around the
>>>>>>>> DPC (dynamic-power-coefficient) values
>>>>>>>> for CPU's being relative vs absolute (based on real power) and
>>>>>>>> should they be used to derive 'real' power
>>>>>>>> at various OPPs in order to calculate things like
>>>>>>>> 'sustainable-power' for thermal zones.
>>>>>>>> I believe relative values work perfectly fine for scheduling
>>>>>>>> decisions, but with others using this for
>>>>>>>> calculating power values in mW, is there a need to document the
>>>>>>>> property as something that *has* to be
>>>>>>>> based on real power measurements?
>>>>>>>
>>>>>>> Relative values may work for scheduling decisions, but not for
>>>>>>> thermal
>>>>>>> management with the power allocator, at least not when CPU
>>>>>>> cooling devices
>>>>>>> are combined with others that specify their power consumption in
>>>>>>> absolute
>>>>>>> values. Such a configuration should be supported IMO.
>>>>>>
>>>>>> The energy model is used in the cpufreq cooling device and if the
>>>>>> sustainable power is consistent with the relative values then
>>>>>> there is
>>>>>> no reason it shouldn't work.
>>>>>
>>>>> Agreed on thermal zones that exclusively use CPUs as cooling
>>>>> devices, but
>>>>> what when you have mixed zones, with CPUs with their pseudo-unit
>>>>> and e.g. a
>>>>> GPU that specifies its power in mW?
>>>>
>>>> Well, if a SoC vendor decides to mix the units, then there is
>>>> nothing we
>>>> can do.
>>>>
>>>> When specifying the power numbers available for the SoC, they could be
>>>> all scaled against the highest power number.
>>>>
>>>> There are so many factors on the hardware, the firmware, the kernel and
>>>> the userspace sides having an impact on the energy efficiency, I don't
>>>> understand why SoC vendors are so shy to share the power numbers...
>>>>
>>>
>>> Unfortunately (because it might confuse engineers in some cases like
>>> this one), even in the SCMI spec DEN0056B [1] we have this statement
>>> which allows to expose an 'abstract scale' values from firmware:
>>> '4.5.1 Performance domain management protocol background
>>> ...The power can be expressed in mW or in an abstract scale. Vendors
>>> are not obliged to reveal power costs if it is undesirable, but a linear
>>> scale is required.'
>>>
>>> This is the source of our Energy Model values when we use SCMI cpufreq
>>> driver [2].
>>>
>>> So this might be an issue in the future, when some SoC vendor decides to
>>> not expose the real mW, but the phone OEM would then take the SoC and
>>> try to add some other cooling device into the thermal zone. That new
>>> device is not part of the SCMI perf but some custom and has the real mW.
>>>
>>> Do you think Daniel it should be somewhere documented in the kernel
>>> thermal that the firmware might silently populate EM with 'abstract
>>> scale'? Then special care should be taken when combining new
>>> cooling devices.
>>>
>>> Regards,
>>> Lukasz
>>>
>>> [1] https://developer.arm.com/documentation/den0056/b/?lang=en
>>> [2]
>>> https://elixir.bootlin.com/linux/latest/source/drivers/cpufreq/scmi-cpufreq.c#L121
>>>
>>
>> If an 'abstract scale' is explicitly allowed I think it should be
>> documented
>> to avoid confusion and make engineers aware of the peril of combining
>> cooling
>> devices of different types in the same thermal zone.
>
> Rob, we should perhaps also document this as part of the DT bindings
> document
> to be consistent, that an abstract scale is allowed when specifying the DPC
> values in DT.
> if you agree, I can spin a quick patch to update the documentation.
>
The 'dynamic-power-coefficient' property, documented in
Documentation/devicetree/bindings/arm/cpus.yaml, does not need any update,
because it is expected to be in units of 'uW/MHz/V^2' and is used to
calculate dynamic power.
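
To illustrate what those units imply, here is a minimal sketch of the
calculation, assuming the usual form Pdyn = coefficient * V^2 * f with the
voltage in V and the frequency in MHz. The function name, the coefficient
value and the OPP table are hypothetical and only serve the example; this
is not the kernel's implementation.

```python
def dynamic_power_uw(coeff_uw_per_mhz_v2, freq_mhz, volt_mv):
    """Return Pdyn = coeff * V^2 * f in microwatts.

    The voltage is passed in millivolts, so mV^2 has to be converted to V^2
    (divide by 1_000_000). The division is done last to stay in integers.
    """
    return coeff_uw_per_mhz_v2 * freq_mhz * volt_mv * volt_mv // 1_000_000


# Hypothetical values, not taken from any real SoC.
COEFF = 100                                        # uW/MHz/V^2
OPPS = [(600, 700), (1200, 800), (1800, 900)]      # (MHz, mV)

for freq_mhz, volt_mv in OPPS:
    uw = dynamic_power_uw(COEFF, freq_mhz, volt_mv)
    print(f"{freq_mhz} MHz @ {volt_mv} mV -> {uw} uW (~{uw // 1000} mW)")
```

With a coefficient in these units the result comes out in real power units,
which is why a coefficient that is not based on real measurements yields
numbers that only look like uW or mW.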
There are two ways to register an Energy Model for a device:
1. em_dev_register_perf_domain(), where you provide a callback function
   that can feed in an 'abstract scale' (as scmi-cpufreq.c does)
2. dev_pm_opp_of_register_em(), which uses the
   'dynamic-power-coefficient'.
If a developer sees that a platform might face the potential issue of
mixing devices with two different scales in one thermal zone, they should
not use the 2nd registration method, but the 1st API, and provide a
callback with a scale that is consistent across all devices. It is also
very unlikely that a device like a GPU or DSP would not be part of the
SCMI perf domains and would not expose a consistent abstract scale.
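
To make the mixed-scale problem concrete, here is a rough sketch. It
assumes a simplified allocator that splits a power budget in proportion to
each device's requested power; the real power allocator does more than
this. The device names, the power numbers and the 0.1 factor for the
abstract scale are all hypothetical.

```python
def split_budget(budget, requests):
    """Split 'budget' between devices in proportion to their requests."""
    total = sum(requests.values())
    return {name: budget * req / total for name, req in requests.items()}


def rescale(powers, top=1000.0):
    """Scale all values against the highest one, preserving their ratios."""
    highest = max(powers.values())
    return {name: p * top / highest for name, p in powers.items()}


# Both devices in real mW (hypothetical numbers).
real_mw = {"cpu": 2000.0, "gpu": 3000.0}
# CPU reported on an abstract scale (here 0.1 of the real value), GPU in mW.
mixed = {"cpu": 200.0, "gpu": 3000.0}
# Both devices on one abstract scale, scaled against the highest number.
consistent = rescale(real_mw)

for label, requests in (("real mW", real_mw),
                        ("mixed", mixed),
                        ("consistent abstract", consistent)):
    shares = split_budget(1.0, requests)   # budget of 1.0 gives fractions
    print(label, {name: round(s, 4) for name, s in shares.items()})
```

In this sketch the consistent abstract scale gives the CPU the same share
of the budget as real mW would, while the mixed case starves the CPU. The
budget itself (e.g. 'sustainable-power') would of course have to be
expressed on the same scale as the devices.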
I have a patch in our internal review that updates the EAS, EM and IPA
documentation, so that documentation should be updated soon.

Regards,
Lukasz