From: Lukasz Luba <lukasz.luba@arm.com>
To: Thara Gopinath <thara.gopinath@linaro.org>
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-arm-msm@vger.kernel.org, sudeep.holla@arm.com,
	will@kernel.org, catalin.marinas@arm.com, linux@armlinux.org.uk,
	gregkh@linuxfoundation.org, rafael@kernel.org,
	viresh.kumar@linaro.org, amitk@kernel.org,
	daniel.lezcano@linaro.org, amit.kachhap@gmail.com,
	bjorn.andersson@linaro.org, agross@kernel.org,
	Steev Klimaszewski <steev@kali.org>
Subject: Re: [PATCH v3 4/5] cpufreq: qcom-cpufreq-hw: Use new thermal pressure update function
Date: Mon, 8 Nov 2021 14:12:29 +0000
Message-ID: <02468805-f626-1f61-7f7f-73ed7dfad034@arm.com>
In-Reply-To: <c4a2618f-71ee-b688-6268-08256a8edf10@linaro.org>

Hi Thara,

+CC Steev, who discovered this issue with the boost
frequency

On 11/5/21 7:12 PM, Thara Gopinath wrote:
> Hi Lukasz,
> 
> 
> On 11/3/21 12:10 PM, Lukasz Luba wrote:
>> Thermal pressure provides a new API, which allows to use CPU frequency
>> as an argument. That removes the need of local conversion to capacity.
>> Use this new API and remove old local conversion code.
>>
>> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
>> ---
>>   drivers/cpufreq/qcom-cpufreq-hw.c | 15 +++++----------
>>   1 file changed, 5 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c 
>> b/drivers/cpufreq/qcom-cpufreq-hw.c
>> index 0138b2ec406d..425f351450ad 100644
>> --- a/drivers/cpufreq/qcom-cpufreq-hw.c
>> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
>> @@ -275,10 +275,10 @@ static unsigned int 
>> qcom_lmh_get_throttle_freq(struct qcom_cpufreq_data *data)
>>   static void qcom_lmh_dcvs_notify(struct qcom_cpufreq_data *data)
>>   {
>> -    unsigned long max_capacity, capacity, freq_hz, throttled_freq;
>>       struct cpufreq_policy *policy = data->policy;
>>       int cpu = cpumask_first(policy->cpus);
>>       struct device *dev = get_cpu_device(cpu);
>> +    unsigned long freq_hz, throttled_freq;
>>       struct dev_pm_opp *opp;
>>       unsigned int freq;
>> @@ -295,17 +295,12 @@ static void qcom_lmh_dcvs_notify(struct 
>> qcom_cpufreq_data *data)
>>       throttled_freq = freq_hz / HZ_PER_KHZ;
>> -    /* Update thermal pressure */
>> -
>> -    max_capacity = arch_scale_cpu_capacity(cpu);
>> -    capacity = mult_frac(max_capacity, throttled_freq, 
>> policy->cpuinfo.max_freq);
>> -
>>       /* Don't pass boost capacity to scheduler */
>> -    if (capacity > max_capacity)
>> -        capacity = max_capacity;
> 
> So, I think this should go into the common 
> topology_update_thermal_pressure in lieu of
> 
> +    if (WARN_ON(max_freq < capped_freq))
> +        return;
> 
> This will fix the issue Steev Klimaszewski has been reporting
> https://lore.kernel.org/linux-arm-kernel/3cba148a-7077-7b6b-f131-dc65045aa348@arm.com/ 
> 
> 
> 

Well, I think the issue is broader. Look at the code which
calculates this 'capacity'. It's just a multiplication & division:

max_capacity = arch_scale_cpu_capacity(cpu); // =1024 in our case
capacity = mult_frac(max_capacity, throttled_freq,
		policy->cpuinfo.max_freq);

From the sysfs cpufreq output that Steev reported, we know
that the value of 'policy->cpuinfo.max_freq' is:
/sys/devices/system/cpu/cpu5/cpufreq/cpuinfo_max_freq:2956800

so when we plug the values into the equation we get:
capacity = 1024 * 2956800 / 2956800; // =1024
The 'capacity' will always be <= 1024, so this check will never
be triggered:

/* Don't pass boost capacity to scheduler */
if (capacity > max_capacity)
	capacity = max_capacity;


IIUC your original code, you don't want this boost
frequency to be treated as capacity 1024. The reason is that
the whole capacity machinery in arch_topology.c is calculated based
on the max freq value = 2841600,
so the max capacity 1024 would be pinned to that frequency
(according to Steev's log:
[   22.552273] THERMAL_PRESSURE: max_freq(2841) < capped_freq(2956) for CPUs [4-7] )


Having all this in mind, the multiplication and division in your
original code should have been done as:

capacity = 1024 * 2956800 / 2841600; // = 1065

and then clamped to 1024.
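
To make the two divisions above easy to compare, here is a rough userspace
sketch of the arithmetic. It is not the driver code: mult_frac_ul() is only a
stand-in for the kernel's mult_frac(), and scaled_capacity() and
clamp_capacity() are hypothetical helper names used for illustration.

```c
/*
 * Userspace stand-in for the kernel's mult_frac(): computes
 * x * numer / denom, split into quotient and remainder parts
 * to reduce the risk of overflow in the multiplication.
 */
static unsigned long mult_frac_ul(unsigned long x, unsigned long numer,
				  unsigned long denom)
{
	unsigned long quot = x / denom;
	unsigned long rem = x % denom;

	return quot * numer + (rem * numer) / denom;
}

/*
 * Hypothetical helper: scale max_capacity by throttled_freq relative
 * to a reference maximum frequency (both in the same unit, e.g. kHz).
 */
static unsigned long scaled_capacity(unsigned long max_capacity,
				     unsigned long throttled_freq,
				     unsigned long ref_max_freq)
{
	return mult_frac_ul(max_capacity, throttled_freq, ref_max_freq);
}

/* Hypothetical helper: the "don't pass boost capacity" clamp. */
static unsigned long clamp_capacity(unsigned long capacity,
				    unsigned long max_capacity)
{
	return capacity > max_capacity ? max_capacity : capacity;
}
```

With the values from Steev's report, scaled_capacity(1024, 2956800, 2956800)
gives 1024, so the clamp has nothing to do, while
scaled_capacity(1024, 2956800, 2841600) gives 1065, which clamp_capacity()
then brings back to 1024.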

My change just unveiled this division issue.

With that in mind, I tend to agree that I should not have
relied on the passed boost freq value, and I will try to apply the check
you suggested.
Let me experiment with that...
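
One possible shape of such a check, as a rough userspace sketch only. It
assumes that thermal pressure is expressed as max_capacity minus the capped
capacity, and that a capped frequency at or above max_freq (i.e. a boost
frequency) should simply result in no pressure. The function name
thermal_pressure_sketch() is hypothetical and this is not the code from the
patch set.

```c
/*
 * Hypothetical sketch: derive thermal pressure from a capped frequency.
 * A capped_freq at or above max_freq (a boost frequency) is treated as
 * "no capping", so the pressure is 0 instead of hitting a WARN_ON().
 * Both frequencies must use the same unit (e.g. kHz).
 */
static unsigned long thermal_pressure_sketch(unsigned long max_capacity,
					     unsigned long capped_freq,
					     unsigned long max_freq)
{
	unsigned long capacity;

	if (capped_freq >= max_freq)
		capacity = max_capacity;	/* boost: clamp to full capacity */
	else
		capacity = (max_capacity * capped_freq) / max_freq;

	return max_capacity - capacity;
}
```

For example, thermal_pressure_sketch(1024, 2956800, 2841600) returns 0 for
the boost case from the log, while a cap at half of max_freq,
thermal_pressure_sketch(1024, 1420800, 2841600), returns 512.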

Regards,
Lukasz


