linux-kernel.vger.kernel.org archive mirror
From: Lukasz Luba <lukasz.luba@arm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	linux-kernel@vger.kernel.org, rafael@kernel.org,
	linux-pm@vger.kernel.org, Dietmar.Eggemann@arm.com,
	"daniel.lezcano@linaro.org" <daniel.lezcano@linaro.org>
Subject: Re: [PATCH 2/2] cpufreq: Update CPU capacity reduction in store_scaling_max_freq()
Date: Tue, 11 Oct 2022 11:25:42 +0100
Message-ID: <5001a099-4596-2d10-2f79-3e39ad507959@arm.com>
In-Reply-To: <Y0UrbBioezoyeez/@hirez.programming.kicks-ass.net>



On 10/11/22 09:38, Peter Zijlstra wrote:
> On Mon, Oct 10, 2022 at 11:46:29AM +0100, Lukasz Luba wrote:
>>
>> +CC Daniel, since I have mentioned a few times DTPM
>>
>> On 10/10/22 11:25, Peter Zijlstra wrote:
>>> On Mon, Oct 10, 2022 at 11:12:06AM +0100, Lukasz Luba wrote:
>>>> BTW, those Android user space max freq requests are not that long,
>>>> mostly due to camera capturing (you can see a few in this file,
>>>> e.g. [1]).
>>>
>>> It does what now ?!? Why is Android using this *at*all* ?
>>
>> It tries to balance the power budget, before bad things happen
>> randomly (throttling different devices w/o a good context what's
>> going on). Please keep in mind that we have ~3 Watts total power
>> budget in a phone, while several devices might be suddenly used:
>> 1. big CPU with max power ~3-3.5 Watts (and we have 2 cores on pixel6)
>> 2. GPU with max power ~6Watts (normally ~1-2Watts when lightly used)
>> 3. ISP (Image Signal Processor) up to ~2Watts
>> 4. DSP also up to 1-2Watts
>>
>> We currently don't have a good mechanism which is aware
>> of the total power/thermal budget and the relations between those
>> devices. Vendors and OEMs run experiments on devices and profile
>> them to work more predictably in those 'important to users' scenarios.
>>
>> AFAIK Daniel Lezcano is trying to help with this new interface
>> for PowerCap: DTPM. It might be used as a new interface for those known
>> scenarios like the camera snapshot. But that interface is on the list
>> that I have also mentioned - it's missing a mechanism to notify the
>> scheduler about capacity reduced by a new user-space scenario.
> 
> DTPM is like IPA but including random devices? Because I thought IPA
> already did lots of this.

DTPM is a kernel interface for the power split, which is decided by the
user space policy. It exposes sysfs to set those scenarios even before
the power/thermal issue occurs (like those Android 'powerhints'). I have
been reviewing it (and advocating for it internally). There is still more
work to do there, and AFAIK it is not yet used by Android.

IPA contains the policy for the power budget split, but misses this
'context' of what is going on and what is about to happen. It has a PID
mechanism to correct itself, but that is not a silver bullet.

Furthermore, IPA has other fundamental issues:
1. You might recall that last year we added the utilization signal
    of the CPU runqueues to IPA. That model still has issues with input
    power estimation, which I have described here [1].
2. CPU frequency sampling issue (we assume a constant frequency over the
    whole period) (also in [1])
3. Power consumption of a CPU at the same frequency varies and depends
    on the workload's instruction mix, e.g. heavy SIMD floating-point code
    for some image filter in a camera app drains more power than code
    for a garbage-collector background thread which traverses a graph
    in memory and has big backend stalls due to the randomness of pointers
    (or a game thread doing collision detection on octrees).
    Our Energy Model doesn't cover such things (yet).
    The issue became more severe for us with the big cores that became
    available last year: a new uArch generation, Cortex-X1. They are able
    to drain 3.5W instantly, while in the Energy Model we have 2.2W for max
    freq. Previous big cores were not such power-hungry CPUs.
    For those, a fair assumption was 1.0W for the EM value and 1.7W for
    peak power in some SIMD code. That 3.5W - 2.2W gap can heat up the SoC
    really quickly and easily use up the free thermal budget. So hints from
    user space are welcome IMO.
4. User space restrictions on cpufreq and devfreq, which are those
    'powerhints' about scenarios that may come soon, are not taken into
    account, due to a missing interface. I mentioned it ~2 years ago
    and sent an RFC example patch for devfreq (I didn't dare to address
    cpufreq at the same time) [2].
5. The thermal-pressure PELT signal converges slowly to the original
    instant signal set by the thermal governor, so capacity_of()
    is delayed in 'observing' the reality of the capped CPUs. This
    matters for those user space scenarios with short hints. I have tried
    to add a mechanism to react faster, since we might already have
    delays in our FW or IPA relative to the original signal. The patch
    didn't make any progress on LKML [3].
6. The leakage. Temperature rising above normal values causes higher
    power drain by the CPU core. Presented at LPC 2022 [4]. This is an
    issue when our GPU or ISP heats up the SoC, and thus the CPUs.
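
To illustrate point 5, here is a minimal sketch of how slowly a PELT-style
geometric average follows a step in the instant signal. It is a simplified
model, not the kernel implementation. The 32 ms half-life with 1 ms periods
is an assumption (the usual PELT constants with no extra thermal decay
shift), and the cap of 300 capacity units as well as the names pelt_track()
and ms_to_reach() are hypothetical, chosen only for this example.

```python
# Simplified model of a PELT-style geometric average; not the kernel code.
HALF_LIFE_MS = 32                    # assumption: PELT half-life, no extra decay shift
Y = 0.5 ** (1.0 / HALF_LIFE_MS)      # per-millisecond decay factor, Y**32 is ~0.5


def pelt_track(instant_pressure, duration_ms, start=0.0):
    """Return the averaged signal after each 1 ms period while the
    instant signal is held constant at `instant_pressure`."""
    avg = start
    trace = []
    for _ in range(duration_ms):
        # Old history decays by Y, the new period contributes (1 - Y).
        avg = avg * Y + instant_pressure * (1.0 - Y)
        trace.append(avg)
    return trace


def ms_to_reach(fraction, instant_pressure=1.0):
    """Count 1 ms periods until the average, starting from zero,
    reaches `fraction` of the instant value."""
    avg, ms = 0.0, 0
    while avg < fraction * instant_pressure:
        avg = avg * Y + instant_pressure * (1.0 - Y)
        ms += 1
    return ms


if __name__ == "__main__":
    # Hypothetical cap: 300 capacity units removed by a max-freq request.
    trace = pelt_track(300.0, 200)
    for t in (10, 32, 64, 128):
        print(f"after {t:3d} ms: averaged pressure = {trace[t - 1]:6.1f} of 300")
    print("ms to reach 95% of the instant value:", ms_to_reach(0.95))
```

In this model the averaged value is only at half of the instant value after
32 ms, so a user space hint that lasts a few tens of milliseconds would be
over before the average gets close to the real cap.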

If you like, I can give you more details on how those different CPUs
(and other devices) behave under power/thermal stress in various
scenarios. I have spent a lot of time in the last ~5 years researching
it.

Regards,
Lukasz

[1] https://lore.kernel.org/linux-pm/20220406220809.22555-1-lukasz.luba@arm.com/
[2] https://lore.kernel.org/lkml/20210126104001.20361-1-lukasz.luba@arm.com/
[3] https://lore.kernel.org/lkml/20220429091245.12423-1-lukasz.luba@arm.com/
[4] https://lpc.events/event/16/contributions/1341/

