From: Thara Gopinath <thara.gopinath@linaro.org>
To: Dmitry Osipenko <digetx@gmail.com>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Viresh Kumar <viresh.kumar@linaro.org>
Cc: Thierry Reding <thierry.reding@gmail.com>,
	Jonathan Hunter <jonathanh@nvidia.com>,
	Zhang Rui <rui.zhang@intel.com>, Amit Kucheria <amitk@kernel.org>,
	Andreas Westman Dorcsak <hedmoo@yahoo.com>,
	Maxim Schwalm <maxim.schwalm@gmail.com>,
	Svyatoslav Ryhel <clamor95@gmail.com>,
	Ihor Didenko <tailormoon@rambler.ru>,
	Ion Agorria <ion@agorria.com>,
	Matt Merhar <mattmerhar@protonmail.com>,
	Peter Geis <pgwipeout@gmail.com>,
	devicetree@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-tegra@vger.kernel.org, linux-pm@vger.kernel.org
Subject: Re: [PATCH v3 4/7] thermal/drivers/tegra: Add driver for Tegra30 thermal sensor
Date: Wed, 16 Jun 2021 10:36:17 -0400
Message-ID: <14b6344b-3994-7977-6933-a2d2357d23d5@linaro.org>
In-Reply-To: <f06370e0-bfde-87d0-03b4-93c667f81817@gmail.com>



On 6/16/21 6:47 AM, Dmitry Osipenko wrote:
> On 16.06.2021 05:50, Thara Gopinath wrote:
> ...
>>
>> Hi,
>>
>> Thermal pressure lets the scheduler know that the max capacity
>> available for a CPU to schedule tasks is reduced due to a thermal event.
>> So you cannot have separate h/w thermal pressure and s/w thermal
>> pressure. Eventually only one capping is applied at the h/w level, and
>> the frequency corresponding to that capping should be used for thermal
>> pressure.
>>
>> Ideally you should not have both s/w and h/w trying to throttle at
>> the same time. Why is this a scenario, and what prevents you from
>> disabling s/w throttling when h/w throttling is enabled? If there
>> has to be an aggregation for whatever reason, it should be done at the
>> thermal driver level and passed to the scheduler.
> 
> Hello,
> 
> The h/w mitigation is much more reactive than software; at the same
> time it's much less flexible. It should provide additional protection
> in cases where software isn't doing a good job. Ideally the h/w
> mitigation should stay inactive all the time; nevertheless, it should
> be modeled properly by the driver.

Ok. This is kind of the opposite of what I am doing on the Qcom platform
I am working on. There, h/w throttling is the default since, like you
mentioned, it is more reactive, and s/w does only critical trip management.

> 
>>>>
>>>> That is a good question. IMO, the first step would be to call
>>>> cpufreq_update_limits().
>>>
>>> Right
>>>
>>>> [ Cc Thara who implemented the thermal pressure ]
>>>>
>>>> Maybe Thara has an idea about how to aggregate both? There is another
>>>> series floating around with a hardware limiter [1] and the same
>>>> problem.
>>>>
>>>>    [1] https://lkml.org/lkml/2021/6/8/1791
>>>
>>> Thanks, it indeed looks similar.
>>>
>>> I guess the common thermal pressure update code could be moved out into
>>> a new special cpufreq thermal QoS handler (policy->thermal_constraints),
>>> where the handler will select the frequency constraint and set up the
>>> pressure accordingly, so there won't be any races in the code.
>>>
>> It was a conscious decision to keep the thermal pressure update out of
>> the QoS max freq update because there are platforms that don't use the
>> QoS framework. For example, ACPI uses cpufreq_update_policy().
>> But you are right. We now have two platforms applying h/w throttling and
>> cpufreq_cooling applying s/w throttling, so it does make sense to have
>> one API doing all the computation to update thermal pressure. I am not
>> sure how exactly/where exactly this will reside.
> 
> The generic cpufreq_cooling already uses QoS for limiting the CPU
> frequency. It could be okay to use QoS for the OF drivers; this needs a
> closer look.
> 
> We have the case where the CPU frequency is changed by a thermal event,
> and the thermal pressure equation is the same for both the s/w
> cpufreq_cooling and the h/w thermal driver. The pressure is calculated
> based on the QoS cpufreq constraint, which is already aggregated.
> 
> Hence what we may need to do on a thermal event is:
> 
> 1. Update the QoS request
> 2. Update the thermal pressure
> 3. Ensure that updates are not racing

Yes. The first two steps you mentioned are exactly what
cpufreq_cooling.c also does, except for the fact that it is a s/w
mitigation. Now if you have two sources updating the max frequency via
QoS, I think you can do either of the following before calculating
thermal pressure:
1. Read the throttled frequency from h/w, if your h/w supports this feature.
	or
2. Use freq_qos_read_value() to get the aggregated max frequency value.

Either way, only the correct throttled capacity should be passed to the
scheduler.
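
For illustration, here is a minimal, untested sketch of that
computation, modeled on the 5.13-era cpufreq_cooling.c. The helper name
update_thermal_pressure() and its "zero means unknown" convention for
throttled_freq are my own for this example, not existing kernel API:

/*
 * Minimal sketch, assuming a v5.13-era kernel: derive the capped
 * capacity from the throttled frequency and report the difference
 * to the scheduler as thermal pressure, the same way
 * cpufreq_cooling.c does after a cooling state change.
 */
#include <linux/cpufreq.h>
#include <linux/cpumask.h>
#include <linux/kernel.h>
#include <linux/pm_qos.h>
#include <linux/sched/topology.h>

/* Hypothetical helper; throttled_freq == 0 means "not known from h/w". */
static void update_thermal_pressure(struct cpufreq_policy *policy,
				    unsigned int throttled_freq)
{
	const struct cpumask *cpus = policy->related_cpus;
	unsigned long max_capacity, capacity;

	/* Option 2: fall back to the aggregated QoS max frequency. */
	if (!throttled_freq)
		throttled_freq = freq_qos_read_value(&policy->constraints,
						     FREQ_QOS_MAX);

	/* Scale CPU capacity by throttled_freq / cpuinfo.max_freq. */
	max_capacity = arch_scale_cpu_capacity(cpumask_first(cpus));
	capacity = mult_frac(max_capacity, throttled_freq,
			     policy->cpuinfo.max_freq);

	/* Thermal pressure is the capacity lost to the frequency cap. */
	arch_set_thermal_pressure(cpus, max_capacity - capacity);
}

Whether throttled_freq comes from a h/w readback (option 1) or from the
aggregated QoS value (option 2), the same capacity scaling applies
before the pressure is handed to the scheduler.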

-- 
Warm Regards
Thara (She/Her/Hers)
> 
>> So for starters, I think you should replicate the update of thermal
>> pressure in your h/w driver when you know that the h/w is throttling
>> or has throttled the frequency. You can refer to cpufreq_cooling.c
>> to see how it is done.
>>
>> Moving to a common API can be done as a separate patch series.
>>
> 
> Thank you for the clarification and suggestion.
> 


Thread overview: 22+ messages
2021-05-29 17:09 [PATCH v3 0/7] Add driver for NVIDIA Tegra30 SoC Thermal sensor Dmitry Osipenko
2021-05-29 17:09 ` [PATCH v3 1/7] dt-bindings: thermal: Add binding for Tegra30 thermal sensor Dmitry Osipenko
2021-05-29 17:09 ` [PATCH v3 2/7] thermal: thermal_of: Stop zone device before unregistering it Dmitry Osipenko
2021-05-29 17:09 ` [PATCH v3 3/7] thermal/core: Export thermal_cooling_device_stats_update() Dmitry Osipenko
2021-05-29 17:09 ` [PATCH v3 4/7] thermal/drivers/tegra: Add driver for Tegra30 thermal sensor Dmitry Osipenko
2021-05-31 12:51   ` Thierry Reding
2021-06-15 10:03   ` Daniel Lezcano
2021-06-15 10:26     ` Viresh Kumar
2021-06-15 13:01       ` Dmitry Osipenko
2021-06-15 16:18         ` Daniel Lezcano
2021-06-15 19:32           ` Dmitry Osipenko
2021-06-16  2:50             ` Thara Gopinath
2021-06-16 10:47               ` Dmitry Osipenko
2021-06-16 14:36                 ` Thara Gopinath [this message]
2021-06-16  8:03             ` Viresh Kumar
2021-06-16  8:30               ` Vincent Guittot
2021-06-16  8:39                 ` Dmitry Osipenko
2021-06-16  8:51                   ` Vincent Guittot
2021-06-16 10:49                     ` Dmitry Osipenko
2021-05-29 17:09 ` [PATCH v3 5/7] ARM: tegra_defconfig: Enable CONFIG_TEGRA30_TSENSOR Dmitry Osipenko
2021-05-29 17:09 ` [PATCH v3 6/7] ARM: multi_v7_defconfig: " Dmitry Osipenko
2021-05-29 17:09 ` [PATCH v3 7/7] ARM: tegra: Add SoC thermal sensor to Tegra30 device-trees Dmitry Osipenko
