All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lukasz Majewski <l.majewski@samsung.com>
To: Eduardo Valentin <edubezval@gmail.com>
Cc: Zhang Rui <rui.zhang@intel.com>,
	Ezequiel Garcia <ezequiel.garcia@free-electrons.com>,
	Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>,
	Linux PM list <linux-pm@vger.kernel.org>,
	Vincenzo Frascino <vincenzo.frascino@st.com>,
	Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>,
	Lukasz Majewski <l.majewski@majess.pl>,
	Nobuhiro Iwamatsu <iwamatsu@nigauri.org>,
	Mikko Perttunen <mperttunen@nvidia.com>,
	Stephen Warren <swarren@wwwdotorg.org>,
	Thierry Reding <thierry.reding@gmail.com>,
	Alexandre Courbot <gnurou@gmail.com>,
	linux-tegra@vger.kernel.org, linux-kernel@vger.kernel.org,
	Viresh Kumar <viresh.kumar@linaro.org>
Subject: Re: [PATCH v2] thermal:core:fix: Check return code of the ->get_max_state() callback
Date: Mon, 24 Nov 2014 11:38:54 +0100	[thread overview]
Message-ID: <20141124113854.36975fb2@amdc2363> (raw)
In-Reply-To: <20141121180826.GA3331@developer>

Hi Eduardo,

> 
> Lukasz,
> 
> On Tue, Nov 18, 2014 at 11:16:30AM +0100, Lukasz Majewski wrote:
> > The return code from ->get_max_state() callback was not checked
> > during binding cooling device to thermal zone device.
> > 
> > Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
> > ---
> > Changes for v2:
> > - It turned out that patches from 1 to 6 from v1 are not needed,
> > since they either already solve the problem (like imx_thermal.c) or
> > not use cpufreq as a thermal cooling device.
> > - The patch 7 from v1 is also not needed since this patch on error
> > exits this function without using max_state variable.
> > - In thermal driver probe the cpufreq_cooling_register() method
> > presence is crucial to evaluate if the thermal driver needs any
> > actions with -EPROBE_DEFER.
> 
> Have you tried this patch with of-thermal based systems?

Yes. I did try it with Exynos (after the rework). And there weren't
any regressions.

To be precise - do you refer to of_cpufreq_cooling_register() [1] or
cpufreq_cooling_register() [2]?

For the latter [2] - drivers like imx_thermal.c are fully prepared for
-EDEFER_PROBE.

For the former [1] - only cpufreq-dt.c uses it (and Exynos SoC after
the rework).

> 
> The above proposal works if the thermal driver is dealing with loading
> cpu_cooling. But for of-thermal based drivers, the idea is to leave to
> cpufreq code to load it. 

I assume, that you mean case [1]?

> 
> As an example, I am taking the ti-soc-thermal, but we already have
> other of-thermal based drivers. Booting with this patch ti-soc-thermal
> (of-based boot) loads fine, but the cpu_cooling never gets bound to
> the thermal zone.

Could you share the exact SoC/board/_defconfig setup to reproduce this
behavior? I possess Beagle Bone Black, but it doesn't have thermal
support (perhaps because its lack of accuracy).

With my Exynos setup I didn't experience any problems with this patch.

> 
> The thing is that the bind may happen before cpufreq-dt code loads the
> cpufreq driver, and when cpu_cooling is checking what is the max freq,
> by using cpufreq table, it won't be able to do it, as there is no
> table.

As I look into the cpufreq-dt.c driver - in the cpufreq_init()
function, the call to of_cpufreq_cooling_register() is performed just
before cpufreq_table_validate_and_show().
It looks like a mistake in the cpufreq-dt.c code.

> 
> While, without the patch, it will use wrong in the binding, but after
> it gets bound, and cpufreq loads, the max will be used correctly.

Correct. Such _wrong_ behavior was the original motivation to prepare
this patch.

> 
> And in this case, the system still works besides this bug. 

Unfortunately there is also a "window" in which the driver is not
properly configured and can cause system crash, although it is unlikely.


> The
> reasoning is because the max state comes from DT (2) and lower and
> upper wont be equal to THERMAL_NO_LIMIT. Then, the following check
> will use the parameter, instead of max_state:
> 
>         cdev->ops->get_max_state(cdev, &max_state);
> 
> 	/* lower default 0, upper default max_state */
> 	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
> 	upper = upper == THERMAL_NO_LIMIT ?
> 				max_state : upper;
> 
> In summary, introducing this patch, although it fix a problem, will
> introduce regressions, in of-thermal based drivers.

To be more precise - it will affect systems, which use of-thermal.c and
cpufreq-dt.c in the same time, due to wrong ordering in the latter file.

Could you give me a hint about the exact affected system? I've grep'ed
for CPUFREQ_DT in the ./arch/arm/configs with no success.

> 
> I believe, to have this fix, you need to provide a way to have probing
> deferring also in cpu_cooling. That needs also the change in the
> cpufreq driver, as I mentioned in the other thread.

I will think about possible solution and refer to previous discussion. 

> 
> Cheers,
> 
> > ---
> >  drivers/thermal/thermal_core.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/thermal/thermal_core.c
> > b/drivers/thermal/thermal_core.c index 43b9070..8567929 100644
> > --- a/drivers/thermal/thermal_core.c
> > +++ b/drivers/thermal/thermal_core.c
> > @@ -928,7 +928,7 @@ int thermal_zone_bind_cooling_device(struct
> > thermal_zone_device *tz, struct thermal_zone_device *pos1;
> >  	struct thermal_cooling_device *pos2;
> >  	unsigned long max_state;
> > -	int result;
> > +	int result, ret;
> >  
> >  	if (trip >= tz->trips || (trip < 0 && trip !=
> > THERMAL_TRIPS_NONE)) return -EINVAL;
> > @@ -945,7 +945,9 @@ int thermal_zone_bind_cooling_device(struct
> > thermal_zone_device *tz, if (tz != pos1 || cdev != pos2)
> >  		return -EINVAL;
> >  
> > -	cdev->ops->get_max_state(cdev, &max_state);
> > +	ret = cdev->ops->get_max_state(cdev, &max_state);
> > +	if (ret)
> > +		return ret;
> >  
> >  	/* lower default 0, upper default max_state */
> >  	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
> > -- 
> > 2.0.0.rc2
> > 


-- 
Best regards,

Lukasz Majewski

Samsung R&D Institute Poland (SRPOL) | Linux Platform Group

  reply	other threads:[~2014-11-24 10:38 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-24  8:27 [PATCH 0/3] thermal: fix: Various fixes for thermal subsystem Lukasz Majewski
2014-09-24  8:27 ` [PATCH 1/3] thermal: step_wise: fix: Prevent from binary overflow when trend is dropping Lukasz Majewski
2014-10-02 21:13   ` Eduardo Valentin
2014-10-03  7:26     ` Lukasz Majewski
2014-10-09  3:14   ` Zhang Rui
2014-09-24  8:27 ` [PATCH 2/3] thermal: core: fix: Initialize the max_state variable to 0 Lukasz Majewski
2014-10-02 22:26   ` Eduardo Valentin
2014-10-03  8:26     ` Lukasz Majewski
2014-10-06 18:01       ` Eduardo Valentin
2014-09-24  8:27 ` [PATCH 3/3] thermal: core: fix: Check return code of the ->get_max_state() callback Lukasz Majewski
2014-10-02 22:24   ` Eduardo Valentin
2014-10-03 10:40     ` Lukasz Majewski
2014-10-09  3:15       ` Zhang Rui
2014-10-09 16:47         ` Lukasz Majewski
2014-10-02 14:40 ` [PATCH 0/3] thermal: fix: Various fixes for thermal subsystem Lukasz Majewski
     [not found] ` <1411547232-21493-1-git-send-email-l.majewski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
2014-11-13 17:02   ` [PATCH 0/8] thermal:cpu cooling:fix: Provide thermal core fixes with deferred probe for several drivers Lukasz Majewski
2014-11-13 17:02     ` Lukasz Majewski
2014-11-13 17:02     ` [PATCH 1/8] thermal:cpu cooling:armada: Provide deferred probing for armada driver Lukasz Majewski
2014-11-13 17:02     ` [PATCH 2/8] thermal:cpu cooling:kirkwood: Provide deferred probing for kirkwood driver Lukasz Majewski
2014-11-13 17:02     ` [PATCH 3/8] thermal:cpu cooling:rcar: Provide deferred probing for rcar driver Lukasz Majewski
2014-11-13 17:02     ` [PATCH 4/8] thermal:cpu cooling:spear: Provide deferred probing for spear driver Lukasz Majewski
2014-11-13 17:02     ` [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver Lukasz Majewski
2014-11-14 10:47       ` Mikko Perttunen
2014-11-14 11:24         ` Lukasz Majewski
2014-11-17 11:57           ` Thierry Reding
2014-11-17 12:51             ` Lukasz Majewski
2014-11-17 13:18               ` Thierry Reding
     [not found]         ` <5465DDC5.6090301-/1wQRMveznE@public.gmane.org>
2014-11-17 11:43           ` Thierry Reding
2014-11-17 11:43             ` Thierry Reding
2014-11-17 11:50             ` Lukasz Majewski
2014-11-17 12:01               ` Thierry Reding
2014-11-17 12:01                 ` Thierry Reding
2014-11-17 13:02                 ` Lukasz Majewski
2014-11-17 13:02                   ` Lukasz Majewski
2014-11-17 12:51             ` Mikko Perttunen
2014-11-17 12:51               ` Mikko Perttunen
2014-11-17 13:08               ` Thierry Reding
2014-11-17 13:24                 ` Mikko Perttunen
     [not found]       ` <1415898165-27406-6-git-send-email-l.majewski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
2014-11-17 11:40         ` Thierry Reding
2014-11-17 11:40           ` Thierry Reding
2014-11-13 17:02     ` [PATCH 6/8] thermal:cpu cooling:ti: Provide deferred probing for ti drivers Lukasz Majewski
2014-11-20 19:00       ` Eduardo Valentin
2014-11-13 17:02     ` [PATCH 7/8] thermal:core:fix: Initialize the max_state variable to 0 Lukasz Majewski
2014-11-18 10:16     ` [PATCH v2] thermal:core:fix: Check return code of the ->get_max_state() callback Lukasz Majewski
     [not found]       ` <1416305790-27746-1-git-send-email-l.majewski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
2014-11-21 18:08         ` Eduardo Valentin
2014-11-21 18:08           ` Eduardo Valentin
2014-11-24 10:38           ` Lukasz Majewski [this message]
2014-11-24 11:00             ` Viresh Kumar
2014-11-24 11:00               ` Viresh Kumar
2014-11-24 14:24               ` Lukasz Majewski
2014-11-25 10:46                 ` Viresh Kumar
2014-11-24 18:02             ` Eduardo Valentin
2014-11-24 18:02               ` Eduardo Valentin
2014-11-24 20:28               ` Lukasz Majewski
     [not found]     ` <1415898165-27406-1-git-send-email-l.majewski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
2014-11-13 17:02       ` [PATCH 8/8] " Lukasz Majewski
2014-11-13 17:02         ` Lukasz Majewski
2014-11-20 18:54       ` [PATCH 0/8] thermal:cpu cooling:fix: Provide thermal core fixes with deferred probe for several drivers Eduardo Valentin
2014-11-20 18:54         ` Eduardo Valentin
2014-11-21  8:33         ` Lukasz Majewski
2014-11-21  2:47           ` Eduardo Valentin
2014-11-21 16:28             ` Lukasz Majewski
2014-11-21 16:28               ` Lukasz Majewski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20141124113854.36975fb2@amdc2363 \
    --to=l.majewski@samsung.com \
    --cc=b.zolnierkie@samsung.com \
    --cc=edubezval@gmail.com \
    --cc=ezequiel.garcia@free-electrons.com \
    --cc=gnurou@gmail.com \
    --cc=iwamatsu@nigauri.org \
    --cc=kuninori.morimoto.gx@renesas.com \
    --cc=l.majewski@majess.pl \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=linux-tegra@vger.kernel.org \
    --cc=mperttunen@nvidia.com \
    --cc=rui.zhang@intel.com \
    --cc=swarren@wwwdotorg.org \
    --cc=thierry.reding@gmail.com \
    --cc=vincenzo.frascino@st.com \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.