All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lukasz Majewski <l.majewski@majess.pl>
To: Eduardo Valentin <edubezval@gmail.com>
Cc: Lukasz Majewski <l.majewski@samsung.com>,
	Zhang Rui <rui.zhang@intel.com>,
	Ezequiel Garcia <ezequiel.garcia@free-electrons.com>,
	Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>,
	Linux PM list <linux-pm@vger.kernel.org>,
	Vincenzo Frascino <vincenzo.frascino@st.com>,
	Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>,
	Nobuhiro Iwamatsu <iwamatsu@nigauri.org>,
	Mikko Perttunen <mperttunen@nvidia.com>,
	Stephen Warren <swarren@wwwdotorg.org>,
	Thierry Reding <thierry.reding@gmail.com>,
	Alexandre Courbot <gnurou@gmail.com>,
	linux-tegra@vger.kernel.org, linux-kernel@vger.kernel.org,
	Viresh Kumar <viresh.kumar@linaro.org>
Subject: Re: [PATCH v2] thermal:core:fix: Check return code of the ->get_max_state() callback
Date: Mon, 24 Nov 2014 21:28:30 +0100	[thread overview]
Message-ID: <20141124212830.0d7bec69@jawa> (raw)
In-Reply-To: <20141124180222.GA1449@developer>

[-- Attachment #1: Type: text/plain, Size: 6901 bytes --]

On Mon, 24 Nov 2014 14:02:25 -0400
Eduardo Valentin <edubezval@gmail.com> wrote:

> 
> Hello Lukasz,
> 
> On Mon, Nov 24, 2014 at 11:38:54AM +0100, Lukasz Majewski wrote:
> > Hi Eduardo,
> > 
> > > 
> > > Lukasz,
> > > 
> > > On Tue, Nov 18, 2014 at 11:16:30AM +0100, Lukasz Majewski wrote:
> > > > The return code from ->get_max_state() callback was not checked
> > > > during binding cooling device to thermal zone device.
> > > > 
> > > > Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
> > > > ---
> > > > Changes for v2:
> > > > - It turned out that patches from 1 to 6 from v1 are not needed,
> > > > since they either already solve the problem (like
> > > > imx_thermal.c) or not use cpufreq as a thermal cooling device.
> > > > - The patch 7 from v1 is also not needed since this patch on
> > > > error exits this function without using max_state variable.
> > > > - In thermal driver probe the cpufreq_cooling_register() method
> > > > presence is crucial to evaluate if the thermal driver needs any
> > > > actions with -EPROBE_DEFER.
> > > 
> > > Have you tried this patch with of-thermal based systems?
> > 
> > Yes. I did try it with Exynos (after the rework). And there weren't
> > any regressions.
> > 
> > To be precise - do you refer to of_cpufreq_cooling_register() [1] or
> > cpufreq_cooling_register() [2]?
> > 
> 
> [1]
> 
> > For the latter [2] - drivers like imx_thermal.c are fully prepared
> > for -EDEFER_PROBE.
> > 
> > For the former [1] - only cpufreq-dt.c uses it (and Exynos SoC after
> > the rework).
> > 
> > > 
> > > The above proposal works if the thermal driver is dealing with
> > > loading cpu_cooling. But for of-thermal based drivers, the idea
> > > is to leave to cpufreq code to load it. 
> > 
> > I assume, that you mean case [1]?
> > 
> 
> yup
> 
> > > 
> > > As an example, I am taking the ti-soc-thermal, but we already have
> > > other of-thermal based drivers. Booting with this patch
> > > ti-soc-thermal (of-based boot) loads fine, but the cpu_cooling
> > > never gets bound to the thermal zone.
> > 
> > Could you share the exact SoC/board/_defconfig setup to reproduce
> > this behavior? I possess Beagle Bone Black, but it doesn't have
> > thermal support (perhaps because its lack of accuracy).
> >
> 
> Well, it may happen any system a driver with of-thermal + cpufreq-dt.
> 
> One board that is easily available is OMAP4460 panda board (tried
> myself, the problem is there).
> 
> > With my Exynos setup I didn't experience any problems with this
> > patch.
> > 
> > > 
> > > The thing is that the bind may happen before cpufreq-dt code
> > > loads the cpufreq driver, and when cpu_cooling is checking what
> > > is the max freq, by using cpufreq table, it won't be able to do
> > > it, as there is no table.
> > 
> > As I look into the cpufreq-dt.c driver - in the cpufreq_init()
> > function, the call to of_cpufreq_cooling_register() is performed
> > just before cpufreq_table_validate_and_show().
> > It looks like a mistake in the cpufreq-dt.c code.
> > 
> 
> Well, I believe for our case, better would be if the cpu_cooling could
> be done after cpufreq driver registration call.
> 
> 
> > > 
> > > While, without the patch, it will use wrong in the binding, but
> > > after it gets bound, and cpufreq loads, the max will be used
> > > correctly.
> > 
> > Correct. Such _wrong_ behavior was the original motivation to
> > prepare this patch.
> > 
> > > 
> > > And in this case, the system still works besides this bug. 
> > 
> > Unfortunately there is also a "window" in which the driver is not
> > properly configured and can cause system crash, although it is
> > unlikely.
> > 
> 
> Agreed.
> 
> > 
> > > The
> > > reasoning is because the max state comes from DT (2) and lower and
> > > upper wont be equal to THERMAL_NO_LIMIT. Then, the following check
> > > will use the parameter, instead of max_state:
> > > 
> > >         cdev->ops->get_max_state(cdev, &max_state);
> > > 
> > > 	/* lower default 0, upper default max_state */
> > > 	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
> > > 	upper = upper == THERMAL_NO_LIMIT ?
> > > 				max_state : upper;
> > > 
> > > In summary, introducing this patch, although it fix a problem,
> > > will introduce regressions, in of-thermal based drivers.
> > 
> > To be more precise - it will affect systems, which use of-thermal.c
> > and cpufreq-dt.c in the same time, due to wrong ordering in the
> > latter file.
> > 
> 
> Exactly.
> 
> > Could you give me a hint about the exact affected system? I've
> > grep'ed for CPUFREQ_DT in the ./arch/arm/configs with no success.
> > 
> 
> Yeah, the grepping is correct. But well, just because it is not in
> defconfigs does not mean it won't be used. 
> 
> > > 
> > > I believe, to have this fix, you need to provide a way to have
> > > probing deferring also in cpu_cooling. That needs also the change
> > > in the cpufreq driver, as I mentioned in the other thread.
> > 
> > I will think about possible solution and refer to previous
> > discussion. 
> > 
> 
> Good. For your patch, it is still sane to have it. But needs to be
> taken after fixing the ordering between cpufreq-dt and cpu_cooling.

Such fix has been already prepared by Viresh. Could you test if it
works on your Panda Board (unfortunately I don't have one)?
> 
> 
> > > 
> > > Cheers,
> > > 
> > > > ---
> > > >  drivers/thermal/thermal_core.c | 6 ++++--
> > > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/drivers/thermal/thermal_core.c
> > > > b/drivers/thermal/thermal_core.c index 43b9070..8567929 100644
> > > > --- a/drivers/thermal/thermal_core.c
> > > > +++ b/drivers/thermal/thermal_core.c
> > > > @@ -928,7 +928,7 @@ int thermal_zone_bind_cooling_device(struct
> > > > thermal_zone_device *tz, struct thermal_zone_device *pos1;
> > > >  	struct thermal_cooling_device *pos2;
> > > >  	unsigned long max_state;
> > > > -	int result;
> > > > +	int result, ret;
> > > >  
> > > >  	if (trip >= tz->trips || (trip < 0 && trip !=
> > > > THERMAL_TRIPS_NONE)) return -EINVAL;
> > > > @@ -945,7 +945,9 @@ int thermal_zone_bind_cooling_device(struct
> > > > thermal_zone_device *tz, if (tz != pos1 || cdev != pos2)
> > > >  		return -EINVAL;
> > > >  
> > > > -	cdev->ops->get_max_state(cdev, &max_state);
> > > > +	ret = cdev->ops->get_max_state(cdev, &max_state);
> > > > +	if (ret)
> > > > +		return ret;
> > > >  
> > > >  	/* lower default 0, upper default max_state */
> > > >  	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
> > > > -- 
> > > > 2.0.0.rc2
> > > > 
> > 
> > 
> > -- 
> > Best regards,
> > 
> > Lukasz Majewski
> > 
> > Samsung R&D Institute Poland (SRPOL) | Linux Platform Group

Best regards,
Lukasz Majewski

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

  reply	other threads:[~2014-11-24 20:28 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-24  8:27 [PATCH 0/3] thermal: fix: Various fixes for thermal subsystem Lukasz Majewski
2014-09-24  8:27 ` [PATCH 1/3] thermal: step_wise: fix: Prevent from binary overflow when trend is dropping Lukasz Majewski
2014-10-02 21:13   ` Eduardo Valentin
2014-10-03  7:26     ` Lukasz Majewski
2014-10-09  3:14   ` Zhang Rui
2014-09-24  8:27 ` [PATCH 2/3] thermal: core: fix: Initialize the max_state variable to 0 Lukasz Majewski
2014-10-02 22:26   ` Eduardo Valentin
2014-10-03  8:26     ` Lukasz Majewski
2014-10-06 18:01       ` Eduardo Valentin
2014-09-24  8:27 ` [PATCH 3/3] thermal: core: fix: Check return code of the ->get_max_state() callback Lukasz Majewski
2014-10-02 22:24   ` Eduardo Valentin
2014-10-03 10:40     ` Lukasz Majewski
2014-10-09  3:15       ` Zhang Rui
2014-10-09 16:47         ` Lukasz Majewski
2014-10-02 14:40 ` [PATCH 0/3] thermal: fix: Various fixes for thermal subsystem Lukasz Majewski
     [not found] ` <1411547232-21493-1-git-send-email-l.majewski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
2014-11-13 17:02   ` [PATCH 0/8] thermal:cpu cooling:fix: Provide thermal core fixes with deferred probe for several drivers Lukasz Majewski
2014-11-13 17:02     ` Lukasz Majewski
2014-11-13 17:02     ` [PATCH 1/8] thermal:cpu cooling:armada: Provide deferred probing for armada driver Lukasz Majewski
2014-11-13 17:02     ` [PATCH 2/8] thermal:cpu cooling:kirkwood: Provide deferred probing for kirkwood driver Lukasz Majewski
2014-11-13 17:02     ` [PATCH 3/8] thermal:cpu cooling:rcar: Provide deferred probing for rcar driver Lukasz Majewski
2014-11-13 17:02     ` [PATCH 4/8] thermal:cpu cooling:spear: Provide deferred probing for spear driver Lukasz Majewski
2014-11-13 17:02     ` [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver Lukasz Majewski
2014-11-14 10:47       ` Mikko Perttunen
2014-11-14 11:24         ` Lukasz Majewski
2014-11-17 11:57           ` Thierry Reding
2014-11-17 12:51             ` Lukasz Majewski
2014-11-17 13:18               ` Thierry Reding
     [not found]         ` <5465DDC5.6090301-/1wQRMveznE@public.gmane.org>
2014-11-17 11:43           ` Thierry Reding
2014-11-17 11:43             ` Thierry Reding
2014-11-17 11:50             ` Lukasz Majewski
2014-11-17 12:01               ` Thierry Reding
2014-11-17 12:01                 ` Thierry Reding
2014-11-17 13:02                 ` Lukasz Majewski
2014-11-17 13:02                   ` Lukasz Majewski
2014-11-17 12:51             ` Mikko Perttunen
2014-11-17 12:51               ` Mikko Perttunen
2014-11-17 13:08               ` Thierry Reding
2014-11-17 13:24                 ` Mikko Perttunen
     [not found]       ` <1415898165-27406-6-git-send-email-l.majewski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
2014-11-17 11:40         ` Thierry Reding
2014-11-17 11:40           ` Thierry Reding
2014-11-13 17:02     ` [PATCH 6/8] thermal:cpu cooling:ti: Provide deferred probing for ti drivers Lukasz Majewski
2014-11-20 19:00       ` Eduardo Valentin
2014-11-13 17:02     ` [PATCH 7/8] thermal:core:fix: Initialize the max_state variable to 0 Lukasz Majewski
2014-11-18 10:16     ` [PATCH v2] thermal:core:fix: Check return code of the ->get_max_state() callback Lukasz Majewski
     [not found]       ` <1416305790-27746-1-git-send-email-l.majewski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
2014-11-21 18:08         ` Eduardo Valentin
2014-11-21 18:08           ` Eduardo Valentin
2014-11-24 10:38           ` Lukasz Majewski
2014-11-24 11:00             ` Viresh Kumar
2014-11-24 11:00               ` Viresh Kumar
2014-11-24 14:24               ` Lukasz Majewski
2014-11-25 10:46                 ` Viresh Kumar
2014-11-24 18:02             ` Eduardo Valentin
2014-11-24 18:02               ` Eduardo Valentin
2014-11-24 20:28               ` Lukasz Majewski [this message]
     [not found]     ` <1415898165-27406-1-git-send-email-l.majewski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
2014-11-13 17:02       ` [PATCH 8/8] " Lukasz Majewski
2014-11-13 17:02         ` Lukasz Majewski
2014-11-20 18:54       ` [PATCH 0/8] thermal:cpu cooling:fix: Provide thermal core fixes with deferred probe for several drivers Eduardo Valentin
2014-11-20 18:54         ` Eduardo Valentin
2014-11-21  8:33         ` Lukasz Majewski
2014-11-21  2:47           ` Eduardo Valentin
2014-11-21 16:28             ` Lukasz Majewski
2014-11-21 16:28               ` Lukasz Majewski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20141124212830.0d7bec69@jawa \
    --to=l.majewski@majess.pl \
    --cc=b.zolnierkie@samsung.com \
    --cc=edubezval@gmail.com \
    --cc=ezequiel.garcia@free-electrons.com \
    --cc=gnurou@gmail.com \
    --cc=iwamatsu@nigauri.org \
    --cc=kuninori.morimoto.gx@renesas.com \
    --cc=l.majewski@samsung.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=linux-tegra@vger.kernel.org \
    --cc=mperttunen@nvidia.com \
    --cc=rui.zhang@intel.com \
    --cc=swarren@wwwdotorg.org \
    --cc=thierry.reding@gmail.com \
    --cc=vincenzo.frascino@st.com \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.